[ 
https://issues.apache.org/jira/browse/IMPALA-7836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16700235#comment-16700235
 ] 

ASF subversion and git services commented on IMPALA-7836:
---------------------------------------------------------

Commit 8872e8bf80d01b8e2ea88432f27eefc7a1a2169d in impala's branch 
refs/heads/branch-3.1.0 from [~arodoni_cloudera]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=8872e8b ]

IMPALA-7836: [DOCS] Document TOPN_BYTES_LIMIT query option

Change-Id: Ib7109c2949ee5137d8b4a748227948b79bd93f52
Reviewed-on: http://gerrit.cloudera.org:8080/11914
Tested-by: Impala Public Jenkins <[email protected]>
Reviewed-by: Tim Armstrong <[email protected]>
(cherry picked from commit 731254b52934c17d953da541df8bc4493beb037a)


> Impala 3.1 Doc: New query option 'topn_bytes_limit' for TopN to Sort 
> conversion
> -------------------------------------------------------------------------------
>
>                 Key: IMPALA-7836
>                 URL: https://issues.apache.org/jira/browse/IMPALA-7836
>             Project: IMPALA
>          Issue Type: Sub-task
>          Components: Docs, Frontend
>    Affects Versions: Impala 2.9.0
>            Reporter: Sahil Takiar
>            Assignee: Alex Rodoni
>            Priority: Major
>              Labels: future_release_doc
>             Fix For: Impala 3.1.0
>
>
> IMPALA-5004 adds a new query-level option called 'topn_bytes_limit' that we 
> should document. The changes in IMPALA-5004 work by estimating the amount of 
> memory required to run a TopN operator. The memory estimate is based on the 
> size of the individual tuples that need to be processed by the TopN operator, 
> as well as the sum of the limit and offset in the query. TopN operators don't 
> spill to disk, so they have to keep their entire working set in memory.
> If the estimated size of the working set of the TopN operator exceeds the 
> threshold of 'topn_bytes_limit', the TopN operator will be replaced with a 
> Sort operator. The Sort operator can spill to disk, but it processes all of 
> the input data (the limit and offset have no effect on how much it sorts). 
> So switching to Sort might incur performance penalties, but it will require 
> less memory.
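
The decision described above can be sketched roughly as follows. This is an
illustrative Python model only, not Impala's planner code: the function names,
the parameter names, and the use of a single average tuple size are all
assumptions made for the example. The description only states that the
estimate depends on the tuple size and on limit + offset, so the simple
product used here is also an assumption.

```python
def estimated_topn_bytes(avg_tuple_bytes: int, limit: int, offset: int = 0) -> int:
    """Rough working-set estimate for a TopN operator.

    Assumption: the estimate is the tuple size multiplied by the number of
    rows the operator must retain (limit + offset).
    """
    return avg_tuple_bytes * (limit + offset)


def choose_operator(avg_tuple_bytes: int, limit: int, offset: int,
                    topn_bytes_limit: int) -> str:
    """Return "SORT" when the estimate exceeds the threshold, else "TOPN".

    Hypothetical helper: it mirrors the prose description, in which TopN is
    replaced by Sort only when the estimate exceeds 'topn_bytes_limit'.
    """
    estimate = estimated_topn_bytes(avg_tuple_bytes, limit, offset)
    if estimate > topn_bytes_limit:
        return "SORT"   # can spill to disk, but sorts all of the input
    return "TOPN"       # stays in memory, keeps only limit + offset rows


if __name__ == "__main__":
    one_mb = 1024 * 1024
    # 100-byte tuples, LIMIT 1000: about 100 KB, under a 1 MB threshold.
    print(choose_operator(100, 1000, 0, one_mb))
    # Same tuples, LIMIT 10,000,000: about 1 GB, over the threshold.
    print(choose_operator(100, 10_000_000, 0, one_mb))
```

In this model a small LIMIT keeps the TopN operator, while a very large LIMIT
(or a large OFFSET) pushes the estimate over the threshold and selects Sort.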



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

