[ 
https://issues.apache.org/jira/browse/TAJO-36?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13635727#comment-13635727
 ] 

Alexander Sibetheros edited comment on TAJO-36 at 4/18/13 10:32 PM:
--------------------------------------------------------------------

Hello,
I read the above summary of the requested project and it seems rather 
interesting.
Although I don't have any coding history with the Apache project, I just 
finished reading ExternalSortExec.java and the implementation now seems 
rather straightforward, so I believe that with some guidance I should be able 
to handle the task.

I am currently a 4th-year student in Informatics (University of Athens), with a 
strong background in algorithms (especially sorting) and strong C, C++, Java, 
and Python skills. I recently took part in the SIGMOD 2013 programming contest 
(results pending), which required a lot of research into algorithms and fast 
indexing and sorting mechanisms. This summer I have no remaining classes and my 
graduation thesis will begin in October, so I will have plenty of time to write 
code, test thoroughly, and document.
 
                
      was (Author: sib_):
    Hello,
I read the above summary of the requested project and it seems rather 
interesting.
I am a 4th-year student in Informatics (University of Athens), with a strong 
background in algorithms (especially sorting) and strong C, C++, Java, and 
Python skills. I recently took part in the SIGMOD 2013 programming contest 
(results pending), which required a lot of research into algorithms and fast 
indexing and sorting mechanisms. 
I just finished reading ExternalSortExec.java and the implementation now 
seems rather straightforward, so I believe that with some guidance I should be 
able to handle the task. 
                  
> Improve ExternalSortExec with N-merge sort and final pass omission
> ------------------------------------------------------------------
>
>                 Key: TAJO-36
>                 URL: https://issues.apache.org/jira/browse/TAJO-36
>             Project: Tajo
>          Issue Type: Improvement
>          Components: physical operator
>            Reporter: Hyunsik Choi
>              Labels: gsoc, gsoc2013, mentor
>
> Background:
> The current ExternalSortExec just uses the binary external merge sort 
> algorithm 
> (http://en.wikipedia.org/wiki/External_sorting#External_merge_sort). In other 
> words, for each pass, ExternalSortExec just merges two files into one sorted 
> file.
> Proposal:
> The goal of this proposal is to improve ExternalSortExec in the following 
> ways:
> * N-merge sort - by using more memory, we can merge N files in each pass. 
> This reduces the number of passes and, consequently, considerably reduces 
> I/O overhead.
> * Final pass omission - a physical operator is pipelined with its parent 
> operator, so the final pass of the merge sort is invoked by the parent 
> physical operator anyway. We can therefore omit the final pass as a separate 
> step and let the parent consume the merged output directly.
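
The two improvements above can be illustrated with a minimal sketch. It is not
Tajo's ExternalSortExec code: the class NWayMergeSketch, the MergeIterator
type, and the mergePasses helper are hypothetical names, and plain integers
stand in for tuples. MergeIterator merges N sorted runs through a priority
queue and produces one element per next() call, so a parent operator could
pull from the last merge directly instead of waiting for a final sorted file.
mergePasses counts how many passes a given fan-in needs when every pass is
written out.

```java
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.PriorityQueue;

// Hypothetical illustration only; none of these names exist in Tajo.
final class NWayMergeSketch {

    /** Lazily merges N sorted runs; each next() call yields one element. */
    static final class MergeIterator implements Iterator<Integer> {
        // One heap entry per run that still has data: {head value, run index}.
        private final PriorityQueue<int[]> heap =
                new PriorityQueue<>((a, b) -> Integer.compare(a[0], b[0]));
        private final List<Iterator<Integer>> runs;

        MergeIterator(List<Iterator<Integer>> sortedRuns) {
            this.runs = sortedRuns;
            for (int i = 0; i < runs.size(); i++) {
                advance(i); // seed the heap with the head of every run
            }
        }

        // Moves the next element of the given run (if any) into the heap.
        private void advance(int runIndex) {
            Iterator<Integer> run = runs.get(runIndex);
            if (run.hasNext()) {
                heap.add(new int[] {run.next(), runIndex});
            }
        }

        @Override
        public boolean hasNext() {
            return !heap.isEmpty();
        }

        @Override
        public Integer next() {
            int[] smallest = heap.poll();
            if (smallest == null) {
                throw new NoSuchElementException();
            }
            advance(smallest[1]); // refill from the run we just consumed
            return smallest[0];
        }
    }

    /** Passes needed to reduce `runs` sorted runs to one with the given fan-in. */
    static int mergePasses(int runs, int fanIn) {
        if (fanIn < 2) {
            throw new IllegalArgumentException("fanIn must be at least 2");
        }
        int passes = 0;
        while (runs > 1) {
            runs = (runs + fanIn - 1) / fanIn; // ceiling division
            passes++;
        }
        return passes;
    }
}
```

For 64 initial runs, mergePasses(64, 2) gives 6 passes while
mergePasses(64, 8) gives 2. With final pass omission, the last of those passes
would not be materialized at all: the parent would iterate over a
MergeIterator built on the remaining runs.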
