Hi Community,

I'm investigating the peak times at which the Impala daemons' memory was
consumed, so that I can distribute my queries more evenly.

While looking into one scenario, I see one query with the stats below:

*** The query has several joins, which is why it takes so long.

Duration: 4.1m
Rows Produced: 30102
Aggregate Peak Memory Usage: 6.5 GiB
Per Node Peak Memory Usage: 6.5 GiB
Total bytes read from HDFS: 727 MiB
Memory Accrual: 86228522168 byte seconds
Pool: default-pool
Query State: FINISHED
Threads: CPU Time: 8m
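
As a rough sanity check on these numbers, here is a small sketch. It assumes that Memory Accrual is the average aggregate memory multiplied by the query duration (that is my understanding of the metric, not something I have confirmed), and the 4.1m duration is already rounded, so the result is only approximate:

```python
# Figures copied from the query stats above.
memory_accrual_byte_seconds = 86_228_522_168
duration_seconds = 4.1 * 60  # "Duration: 4.1m", already rounded

# Assumption: accrual = average aggregate memory * duration,
# so dividing by the duration gives the average aggregate memory.
avg_bytes = memory_accrual_byte_seconds / duration_seconds
avg_mib = avg_bytes / 2**20

peak_mib = 6.5 * 1024  # "Aggregate Peak Memory Usage: 6.5 GiB"

print(f"average aggregate memory ~ {avg_mib:.0f} MiB")
print(f"peak / average ~ {peak_mib / avg_mib:.1f}x")
```

If that assumption holds, the average comes out at roughly 334 MiB, so the 6.5 GiB peak would be a short spike and not the steady state of the query.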

====================

I'm interested in understanding why Aggregate Peak Memory Usage and Per Node
Peak Memory Usage are identical, when the query profile shows that the query
ran several fragments on different nodes.

I also see that every time this query runs (it is a daily scheduled query),
these two parameters have identical values.

What I'm trying to understand:

1) If the query has several fragments, shouldn't the two parameters be
different?

2) Can this scenario happen because the number of bytes read from HDFS is
small, which may cause all the data to be read from a single node?

3) Since the node with the peak memory is the coordinator, and the two
parameters are identical, does that mean the coordinator also executed most
of the query?

4) My thinking is that these two metrics can have the same value only if, at
some point in time, a single node was doing both the coordination and the
execution of the query (which would have to be the coordinator node), and at
that point the query reached its highest aggregate memory consumption.

Is my assumption true?

5) If my previous assumption is true, is there any way to prevent the
coordinator from participating in query execution? (If 2-3 such queries run
at the same time, they will fail.)

6) While looking at the fragments, I see the following on one of them:

PeakMemoryUsage: 141.1 MiB
PerHostPeakMemUsage: 6.5 GiB

Does PeakMemoryUsage refer to the node that ran this specific fragment, and
PerHostPeakMemUsage to the node with the highest peak memory across the
cluster for this specific query?
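
To make the assumption in question 4 concrete, here is a toy sketch. The node names and memory samples are made up for illustration and are not taken from the profile. It also assumes that the aggregate peak is the highest sum across all nodes at one point in time and that the per-node peak is the highest value reached by any single node, which may not be how these metrics are actually computed:

```python
# Hypothetical per-node memory samples (GiB) at four points in time.
# These values are invented for illustration only.
samples = {
    "coordinator": [0.1, 0.1, 6.5, 0.2],
    "node-2":      [0.3, 0.1, 0.0, 0.0],
    "node-3":      [0.2, 0.1, 0.0, 0.0],
}

# Per-node peak: the highest value any single node reached at any time.
per_node_peak = max(max(series) for series in samples.values())

# Aggregate peak: the highest sum across all nodes at a single point in time.
aggregate_peak = max(sum(values) for values in zip(*samples.values()))

print(f"per-node peak:  {per_node_peak} GiB")
print(f"aggregate peak: {aggregate_peak} GiB")
```

In this made-up case the two values coincide only because the other nodes hold no memory at the moment the coordinator peaks, which is the situation I am asking about.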


-- 
Take Care
Fawze Abujaber
