Hi Lars, You should think of the statestore as a pub/sub system used by the other impala processes (impalads and catalogd). One example topic is the "catalog_update" which is used to communicate catalog changes (e.g. adding/dropping a table, etc) from the catalog to all the coordinators (impalads). Periodically, the statestore will contact the catalog to ask for metadata changes that get disseminated to the cluster (coordinators).
All other user-triggered operations (DDL/DML, INVALIDATE, REFRESH, etc) use more or less the following path. The issuing impalad sends the request to the catalog, the catalog processes it and then responds back to the impalad; this is a synchronous operation. In the background, the statestore performs the task described above to propagate metadata changes to all the other impalad nodes. That's at a high level how impalads, catalogd and statestore interact with each other. Is this what you're interested in or are you looking for more low-level implementation details? Thanks Dimitris On Fri, Dec 8, 2017 at 2:48 AM, Lars Francke <[email protected]> wrote: > Hi, > > I'm trying to understand how the communication between the components > works. > > I understand that an impala daemon subscribes to the statestore. The > statestore seems to have the concept of heartbeats and topics. But I'm not > sure what topics are all about. > > The docs also say that only the statestore communicates with the catalog > service. How does that happen? How is a INVALIDATE/REFRESH statement routed > from a daemon to the catalog service and back? > > I'm sure I'll have follow-up questions but this would already be very > helpful. Thank you! > > Cheers, > Lars >
