Hi! As Gabor wrote, due to the LIMIT clause a TOP-N node is created, which needs only a limited amount of memory. This can be seen in the plan in the profile: 01:TOP-N [LIMIT=2]
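To illustrate why a TOP-N node needs only bounded memory, here is a hedged sketch (not Impala's actual implementation): for an `ORDER BY col LIMIT n`, it is enough to stream the input and keep a heap of at most n rows, no matter how large the input is.

```python
import heapq
import random

def top_n_smallest(rows, n):
    """Stream rows, keeping only the n smallest seen so far.

    Memory stays O(n) no matter how many rows are scanned, which is
    why a TOP-N node does not need to sort (or spill) the full input.
    """
    heap = []  # max-heap of the current n smallest, via negated keys
    for r in rows:
        if len(heap) < n:
            heapq.heappush(heap, -r)
        elif r < -heap[0]:  # smaller than the current worst keeper
            heapq.heapreplace(heap, -r)
    return sorted(-x for x in heap)

random.seed(0)
stream = (random.randint(0, 10**9) for _ in range(1_000_000))
print(top_n_smallest(stream, 2))  # two smallest values, using only 2 heap slots
```

This is also why disabling the optimisation with disable_outermost_topn forces a full sort: without the LIMIT-aware node, all rows have to be kept (and possibly spilled) until the sort finishes.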
You can disable this optimisation with the query option "set disable_outermost_topn=true;" - running the same query with that option will lead to fully sorting the input data.

On Tue, Jul 19, 2022 at 9:59 AM Gabor Kaszab <gaborkas...@cloudera.com> wrote:

> Hi Sameera,
>
> If a sorter (or a TOP-N node in your case) doesn't fit into memory, then it
> does a partial sort of the data that fits into memory (called a "run", if
> I'm not mistaken) and writes it to disk. After this it loads another subset
> of the data into memory, does a partial sort again, and spills to disk. It
> does this until the node runs out of rows to be processed. Once this
> happens, a merging sort is applied to the partially sorted "runs".
> Specifically, in your query the 6 executor nodes each do these partial
> sorts and then the merging sort to produce the 2 rows per executor node
> required by the LIMIT clause. A finishing step is a MERGING-EXCHANGE that
> receives 2 rows from each of the 6 executors and produces the final result
> of the "ORDER BY LIMIT 2".
> As a result, the more memory you allocate to Impala, the bigger these
> partially sorted "runs" can be and the easier the merging sort becomes.
> Additionally, I haven't checked, but I assume that if you have a LIMIT
> clause Impala doesn't have to spill the whole run to disk, just the part
> that survives the LIMIT clause (2 rows per run in this case), so the fewer
> runs you have, the less disk IO you have to deal with.
>
> Update: I gave this a second thought, and I assume that a TOP-N node (when
> you have a LIMIT clause) doesn't even have to spill to disk at all, as in
> your case it only has to maintain the highest 2 values due to the LIMIT
> clause.
>
> Hope this helps.
>
> On Mon, Jul 18, 2022 at 8:26 PM sameera <sameeranadis...@gmail.com> wrote:
>
>> Hi Team,
>>
>> I'm seeing this behaviour regarding memory utilization in Impala
>> executors.
>> I have a table with 298GB of Parquet data in S3, and I'm trying to scan
>> all partitions and do a Top-N operation to understand the memory usage in
>> the executors.
>>
>> When I set the executor startup option mem_limit to 1GB, the query takes
>> 600s. With a 10GB mem_limit it completes within 200s while using up to
>> 10GB of memory. This could be because Impala decided to set
>> NumScannerThreadsStarted to 1 with the 1GB mem_limit, whereas
>> NumScannerThreadsStarted is 16 with the 10GB mem_limit setting.
>>
>> During both tests I didn't see any intermediate files created in the
>> scratch dirs. My question is: how does Impala manage to complete the
>> entire table sort when memory is limited to 1GB? Please help me
>> understand how it works internally. Any design document would be really
>> helpful.
>>
>> select * from execution_report_s3_lse order by time_sequence limit 1;
>>
>> Impala version 3.4
>> 6 executor node cluster with dedicated coordinators.
>> Node spec - 16 Core, 32 GB memory
>>
>> Thank you,
>> Sameera.
>>
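The execution shape described in the thread (each executor partially sorts memory-sized "runs", merges them, keeps its local LIMIT rows, and a MERGING-EXCHANGE combines the per-executor results) can be sketched roughly as below. This is a simplified model under stated assumptions, not Impala's code: runs live in Python lists rather than scratch files, the 6-way partitioning is arbitrary, and the `run_size` parameter stands in for the per-node memory limit.

```python
import heapq
from itertools import islice

def sort_in_runs(rows, run_size):
    """Partial-sort the input in memory-sized chunks ("runs")."""
    rows = iter(rows)
    while True:
        run = list(islice(rows, run_size))
        if not run:
            return
        yield sorted(run)  # in Impala each run would be spilled to a scratch dir

def executor_top_n(rows, run_size, n):
    """One executor: merging sort over its runs, keep only LIMIT n rows."""
    merged = heapq.merge(*sort_in_runs(rows, run_size))
    return list(islice(merged, n))

def merging_exchange(per_executor_results, n):
    """Coordinator side: merge the n pre-sorted rows from each executor."""
    return list(islice(heapq.merge(*per_executor_results), n))

# 6 executors, each scanning a disjoint slice of the "table"
table = list(range(60))
partitions = [table[i::6] for i in range(6)]
local_results = [executor_top_n(p, run_size=4, n=2) for p in partitions]
print(merging_exchange(local_results, n=2))  # final "ORDER BY ... LIMIT 2" -> [0, 1]
```

Note how each executor ships only its 2 best rows, so the exchange merges 12 rows total regardless of table size; that, plus the bounded TOP-N, is consistent with seeing no scratch files even at a 1GB mem_limit.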