There is also this page, which has another paper published by the Impala
team, as well as other related materials:
https://cwiki.apache.org/confluence/display/IMPALA/Impala+Reading+List
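
A toy sketch of the flow described in the quoted reply below: the catalog
server loads file metadata (including block locations), caches it, and the
statestore pushes it to every Impala daemon, which keeps a local copy. All
class and method names here are hypothetical and are not Impala's actual code;
the NameNode is stood in for by a plain dict, and the broadcast is synchronous
for simplicity, whereas the reply says the real propagation happens over time.

```python
from dataclasses import dataclass


@dataclass
class FileMeta:
    path: str
    block_hosts: list  # one list of host names per block (assumed shape)


class Impalad:
    """Hypothetical daemon that only keeps a local metadata cache."""

    def __init__(self):
        self.local_cache = {}

    def apply_update(self, table, files):
        # Store a private copy so later catalog changes do not alias it.
        self.local_cache[table] = list(files)


class Statestore:
    """Toy pub/sub hub: forwards each update to every subscriber."""

    def __init__(self):
        self.subscribers = []

    def subscribe(self, impalad):
        self.subscribers.append(impalad)

    def broadcast(self, table, files):
        for impalad in self.subscribers:
            impalad.apply_update(table, files)


class CatalogServer:
    def __init__(self, namenode, statestore):
        self.namenode = namenode      # stand-in: dict of table -> [FileMeta]
        self.statestore = statestore
        self.cache = {}

    def load_table(self, table):
        files = self.namenode[table]  # "ask the NameNode" for block locations
        self.cache[table] = files     # cache in the catalog server's memory
        self.statestore.broadcast(table, files)


namenode = {"sales": [FileMeta("/warehouse/sales/part-0", [["host1", "host2"]])]}
store = Statestore()
daemons = [Impalad(), Impalad()]
for d in daemons:
    store.subscribe(d)

catalog = CatalogServer(namenode, store)
catalog.load_table("sales")
```

After `load_table`, each daemon can look up block locations in its own
`local_cache` without contacting the catalog server or the NameNode, which is
the point of the design as described in the thread.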


On Wed, Apr 5, 2017 at 7:02 PM, Dimitris Tsirogiannis <
[email protected]> wrote:

> Hi Antoni,
>
> Regarding question 2: the catalog server collects file metadata, including
> block locations, from the HDFS NameNode and caches it in memory. Over time,
> the file metadata is broadcast through the statestore to all the Impala
> servers and stored in their local metadata caches.
>
> Dimitris
>
> On Tue, Apr 4, 2017 at 9:24 PM, Antoni Ivanov <[email protected]> wrote:
>
> > Hi,
> > I've been reading about the design of the catalog service and statestore,
> > mostly from the Impala white paper:
> > http://cidrdb.org/cidr2015/Papers/CIDR15_Paper28.pdf
> > I found it on the Impala Confluence wiki:
> > https://cwiki.apache.org/confluence/display/IMPALA/Impala+Presentations%2C+Papers+and+Blog+Posts
> > It's rather interesting: it gives a fairly detailed (but clear) description
> > of the design of the different components.
> >
> > Are there other sources (except the source code)?
> >
> > Question 2: I've been wondering whether impalad caches file locations
> > itself. They don't seem to be stored in the Hive metastore; only the
> > partition location is there, right?
> >
> >
>
