[ 
https://issues.apache.org/jira/browse/TAJO-36?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyunsik Choi updated TAJO-36:
-----------------------------

    Labels: gsoc gsoc2013 mentor  (was: gsoc2013 mentor)
    
> Improve ExternalSortExec with N-merge sort and final pass omission
> ------------------------------------------------------------------
>
>                 Key: TAJO-36
>                 URL: https://issues.apache.org/jira/browse/TAJO-36
>             Project: Tajo
>          Issue Type: Improvement
>          Components: physical operator
>            Reporter: Hyunsik Choi
>              Labels: gsoc, gsoc2013, mentor
>
> Background:
> The current ExternalSortExec just uses the binary external merge sort 
> algorithm 
> (http://en.wikipedia.org/wiki/External_sorting#External_merge_sort). In other 
> words, for each pass, ExternalSortExec just merges two files into one sorted 
> file.
> Proposal:
> The goal of this proposal is to improve ExternalSortExec with the following 
> improvements:
> * N-merge sort - we can merge N files though more memory at each pass. It 
> will reduce the number of passes. Consequently, it will reduces considerable 
> I/O overheads.
> * the final pass omission - a physical operator is pipelined by the parent 
> operator. The final pass of the merge sort must also be invoked by the parent 
> physical operator. So, we can omit the final pass of the merge sort.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to