[
https://issues.apache.org/jira/browse/TAJO-36?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13636079#comment-13636079
]
Hyunsik Choi commented on TAJO-36:
----------------------------------
Hi Alexander Sibetheros,
Thank you for your interest! You seem to have great background related to this
project.
You should submit your proposal to GSoC. Then, we will select candidate(s)
based on the strength of their proposal. If you want proposal examples, you can
find them in Jira.
You can find more information about Tajo from http://tajo.incubator.apache.org
and http://wiki.apache.org/tajo. Especially, "GettingStarted"
(http://wiki.apache.org/tajo/GettingStarted) is a good point to start, and
"HowToContribute" would be helpful for you.
Thank you!
> Improve ExternalSortExec with N-merge sort and final pass omission
> ------------------------------------------------------------------
>
> Key: TAJO-36
> URL: https://issues.apache.org/jira/browse/TAJO-36
> Project: Tajo
> Issue Type: Improvement
> Components: physical operator
> Reporter: Hyunsik Choi
> Labels: gsoc, gsoc2013, mentor
>
> Background:
> The current ExternalSortExec just uses the binary external merge sort
> algorithm
> (http://en.wikipedia.org/wiki/External_sorting#External_merge_sort). In other
> words, for each pass, ExternalSortExec just merges two files into one sorted
> file.
> Proposal:
> The goal of this proposal is to improve ExternalSortExec with the following
> improvements:
> * N-merge sort - we can merge N files though more memory at each pass. It
> will reduce the number of passes. Consequently, it will reduces considerable
> I/O overheads.
> * the final pass omission - a physical operator is pipelined by the parent
> operator. The final pass of the merge sort must also be invoked by the parent
> physical operator. So, we can omit the final pass of the merge sort.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira