Hi Petter, thanks for the advice. The replication factor could be a reason. It is 3 at the moment. I will try increasing it to 6 and see if anything changes.
But I am not sure it will have a positive effect. I have rebalanced the data, and I would expect each table to be spread across all DataNodes regardless of the replication factor.

Regards,
Alexander

From: Petter von Dolwitz (Hem) [mailto:[email protected]]
Sent: Thursday, August 17, 2017 4:55 PM
To: [email protected]
Subject: Re: Impala daemon memory

Hi,

could it be related to data locality, i.e. if you use HDFS with replication factor 3? You could check this by increasing the replication factor to 6 for the files you use and seeing if there is a change.

Br,
Petter

On 17 Aug 2017 at 11:57 AM, "Alexander Shoshin" <[email protected]> wrote:

Hi, team!

I have an issue working with Impala. Maybe you could help me?

My data is stored in Parquet files. I am running queries against the data on Impala through JMeter. I have 6 Impala daemons, and JMeter sends queries randomly to each of them.

The problem is that only 3 of the 6 Impala daemons use all available memory. The other 3 Impala daemons use 3-4 times less memory. These 3 daemons are different each time, but there are always 3 of them. When I disabled 2 daemons I saw the same picture: 3 daemons used all available memory and 1 daemon did not.

All Impala daemons have the same mem_limit setting.

Have you ever seen such strange behavior? Why don't all Impala daemons use all available memory?

Regards,
Alexander Shoshin
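
To make the data-locality question in this thread more concrete, here is a minimal Python sketch of how many nodes would hold at least one local replica for a given replication factor and block count. It is a toy model, not Impala or HDFS code: the function name `nodes_with_local_data` is hypothetical, and it assumes each block's replicas land on distinct nodes chosen uniformly at random, which simplifies the real HDFS placement policy. Whether this explains the memory pattern reported above is not established by the thread.

```python
import random


def nodes_with_local_data(num_nodes, replication, num_blocks, seed=0):
    """Return the set of node ids that hold at least one block replica.

    Assumption: replicas of each block go to `replication` distinct nodes
    picked uniformly at random (a simplification of HDFS placement).
    """
    rng = random.Random(seed)
    holders = set()
    for _ in range(num_blocks):
        # Each block is stored on `replication` different nodes.
        holders.update(rng.sample(range(num_nodes), replication))
    return holders


if __name__ == "__main__":
    # 6 nodes, as in the cluster described in the thread.
    for blocks in (1, 2, 10):
        for rf in (3, 6):
            count = len(nodes_with_local_data(6, rf, blocks))
            print(f"blocks={blocks} replication={rf}: {count} of 6 nodes hold data")
```

In this model a single-block file with replication factor 3 is local to exactly 3 nodes, while a file with many blocks tends to reach every node even at replication factor 3, which is consistent with both Petter's suggestion and Alexander's doubt.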
