[
https://issues.apache.org/jira/browse/TRAFODION-2259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15544445#comment-15544445
]
ASF GitHub Bot commented on TRAFODION-2259:
-------------------------------------------
GitHub user prashanth-vasudev opened a pull request:
https://github.com/apache/incubator-trafodion/pull/743
[TRAFODION-2259] TopN sort changes.
This includes the first of the changes related to the sort implementation in the executor.
cqd gen_sort_topn_size 'N' forces sort to use TopN.
Subsequent changes in the compiler will be able to push down TopN to sort.
Additional cleanup and error handling will be checked in with subsequent changes,
once the compiler changes are in.
1. Sort starts by maintaining a TopN array of N elements.
2. Read records into the TopN array.
3. Once the TopN array is full, heapify the array into a max heap. The top node of
the heap is always the highest node.
4. Each subsequent record read either gets discarded (if greater than the top node)
or replaces the top node (if less than the top node). If the top node is replaced,
re-balance the heap.
5. Repeat step 4 until the last record is read.
6. Sort the final heap using heap sort.
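The numbered steps above can be sketched roughly as follows. This is an illustrative Python sketch only, not the executor code from this pull request: the names top_n_sorted and sift_down are hypothetical, records are assumed to be directly comparable values sorted in ascending order, and a record equal to the top node is assumed to be discarded (the description does not say which way ties go).

```python
def sift_down(heap, root, end):
    """Restore the max-heap property for the subtree at `root` within heap[:end]."""
    while True:
        child = 2 * root + 1
        if child >= end:
            return
        # Pick the larger of the two children.
        if child + 1 < end and heap[child + 1] > heap[child]:
            child += 1
        if heap[root] >= heap[child]:
            return
        heap[root], heap[child] = heap[child], heap[root]
        root = child


def top_n_sorted(records, n):
    """Return the n smallest records in ascending order, holding at most n at a time."""
    if n <= 0:
        return []
    heap = []
    it = iter(records)

    # Steps 1-2: read records into the TopN array until it is full.
    for rec in it:
        heap.append(rec)
        if len(heap) == n:
            break

    # Step 3: heapify into a max heap, so heap[0] is the highest record kept.
    for i in range(len(heap) // 2 - 1, -1, -1):
        sift_down(heap, i, len(heap))

    # Steps 4-5: discard records that are not smaller than the top node
    # (ties are discarded, an assumption); otherwise replace the top node
    # and re-balance the heap.
    for rec in it:
        if rec < heap[0]:
            heap[0] = rec
            sift_down(heap, 0, len(heap))

    # Step 6: heap sort in place, moving the current maximum to the end each pass.
    for end in range(len(heap) - 1, 0, -1):
        heap[0], heap[end] = heap[end], heap[0]
        sift_down(heap, 0, end)
    return heap


print(top_n_sorted([5, 1, 9, 3, 7, 2, 8], 3))  # prints [1, 2, 3]
```

In this sketch memory use is bounded by N regardless of how many records are read, which is the property that lets a TopN sort avoid overflowing to disk.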
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/prashanth-vasudev/incubator-trafodion TopN
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/incubator-trafodion/pull/743.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #743
----
commit 61cc7b5b26d760839e6515b3e7cc94d8ed5ed879
Author: Prashant Vasudev <[email protected]>
Date: 2016-10-04T05:45:43Z
[TRAFODION-2259] TopN sort changes.
This includes first of changes related to sort implementation in executor.
cqd gen_sort_topn_size 'N' forces sort to use topn.
Subsequent changes in compiler will be able to push down topn to sort.
----
> Sort TopN operator
> ------------------
>
> Key: TRAFODION-2259
> URL: https://issues.apache.org/jira/browse/TRAFODION-2259
> Project: Apache Trafodion
> Issue Type: Improvement
> Components: sql-exe
> Affects Versions: 2.1-incubating
> Reporter: Prashanth Vasudev
> Assignee: Prashanth Vasudev
>
> The sort operator consumes all records before producing sorted records. For
> certain use cases where only the Top N records are required, sort today still
> reads all records into memory and overflows (spills) to disk. This impacts
> performance.
> If TopN is pushed down to sort, only the required memory needs to be allocated, and
> sort holds only the TopN records in memory. Once all the records are read,
> the sorted records in TopN are returned.