Hey,

The table itself is just Impala's metadata (which is indeed held in every coordinator's catalog cache); the underlying data, read from HDFS, is stored on as many nodes as the replication factor.
Jeszy

On 17 August 2017 at 16:38, Alexander Shoshin <[email protected]> wrote:
> Hi Petter,
>
> thanks for the advice. The replication factor could be a reason. It equals 3
> at the moment. I will try to increase it to 6 and see if there are any
> changes.
>
> But I am not sure about a positive effect. I have done a data rebalancing,
> and I guess each table should be located on all DataNodes no matter what
> the replication factor is.
>
> Regards,
> Alexander
>
>
> From: Petter von Dolwitz (Hem) [mailto:[email protected]]
> Sent: Thursday, August 17, 2017 4:55 PM
> To: [email protected]
> Subject: Re: Impala daemon memory
>
> Hi,
>
> could it be related to data locality, i.e. if you use HDFS with replication
> factor 3? You could check this by increasing the replication factor to 6 for
> the files you use and seeing if there is a change.
>
> Br,
> Petter
>
> On 17 Aug 2017 at 11:57 AM, "Alexander Shoshin" <[email protected]> wrote:
>
> Hi, team!
>
> I have an issue working with Impala. Maybe you could help me?
>
> My data is stored in Parquet files. I am running queries against the data
> on Impala through JMeter. I have 6 Impala daemons, and JMeter sends queries
> randomly to each of them.
>
> The problem is that only 3 of the 6 Impala daemons use all available
> memory. The other 3 Impala daemons use 3-4 times less memory.
>
> These 3 daemons are different each time, but there are always 3 of them.
> When I tried disabling 2 daemons I saw the same picture: 3 daemons used all
> available memory and 1 daemon did not. All Impala daemons have the same
> mem_limit setting.
>
> Have you ever seen such strange behavior? Why don't all Impala daemons use
> all available memory?
>
> Regards,
> Alexander Shoshin
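For what it's worth, the locality hypothesis in this thread can be illustrated with a toy simulation. This is not Impala's actual scheduler, and the assumption that every block's replicas landed on the same 3 nodes is made up for illustration (it can happen, e.g., when all data is written through a single client node under HDFS's default placement policy). Under that assumption, a scheduler that always assigns a scan to a node holding a replica will keep exactly 3 of the 6 daemons busy:

```python
import random

random.seed(0)  # make the sketch deterministic

NODES = list(range(6))   # six DataNodes, one Impala daemon on each
REPLICATION = 3
NUM_BLOCKS = 1000

# Assumption for illustration: every block's replicas ended up on the
# same 3 nodes (skewed placement). Not a claim about the real cluster.
hot_nodes = random.sample(NODES, REPLICATION)
placement = {b: list(hot_nodes) for b in range(NUM_BLOCKS)}

# A locality-preferring scheduler: each block scan runs on one of the
# nodes that holds a replica of that block.
scan_count = {n: 0 for n in NODES}
for replicas in placement.values():
    scan_count[random.choice(replicas)] += 1

busy = sorted(n for n, c in scan_count.items() if c > 0)
print(busy)  # exactly the 3 replica holders do any scanning
```

Increasing the replication factor to 6, as suggested above, makes every node a replica holder in this model, so the scan (and memory) load spreads across all daemons.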
