[
https://issues.apache.org/jira/browse/TAJO-983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Hyunsik Choi updated TAJO-983:
------------------------------
Attachment: TAJO-983.140924-Hyunsik.patch
+1 for your latest patch.
I've adjusted the indentation and made a few minor changes to your patch:
* Removed the useLocalFile flag from Fetcher because FileChunk::fromRemote is
logically equivalent to it.
* At line 734 of Task, I added code to skip creating the storeChunk when
getLocalStoredFileChunk returns null. A null storeChunk is a normal state: it
occurs when a range request is out of range.
* Replaced the "interFile_" string literal used in ExternalSortExec with a
named constant. I also added the prefix character '@', which cannot be used in
a table name identifier, to avoid possible duplicate fragment names.
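
To make the last two points concrete, here is a minimal sketch, not the code in
the patch. The class LocalChunkSketch, the simplified FileChunk, the constant
INTER_FILE_PREFIX, and the signature and range check in getLocalStoredFileChunk
are all hypothetical stand-ins; only the null-skip behaviour and the '@' prefix
come from the description above.

```java
import java.util.ArrayList;
import java.util.List;

class LocalChunkSketch {
    // Hypothetical, simplified stand-in for Tajo's FileChunk.
    static final class FileChunk {
        final String path;
        final long offset;
        final long length;

        FileChunk(String path, long offset, long length) {
            this.path = path;
            this.offset = offset;
            this.length = length;
        }
    }

    // Hypothetical constant name; '@' is assumed invalid in table identifiers,
    // so a prefixed fragment name should not collide with a table name.
    static final String INTER_FILE_PREFIX = "@interFile_";

    // Assumed behaviour: returns null when the requested range falls outside the file.
    static FileChunk getLocalStoredFileChunk(String path, long fileLength,
                                             long offset, long length) {
        if (offset < 0 || length <= 0 || offset + length > fileLength) {
            return null;
        }
        return new FileChunk(path, offset, length);
    }

    // Collects chunks for the given {offset, length} ranges, skipping null results.
    static List<FileChunk> collectLocalChunks(String path, long fileLength,
                                              long[][] ranges) {
        List<FileChunk> chunks = new ArrayList<>();
        for (long[] range : ranges) {
            FileChunk storeChunk =
                getLocalStoredFileChunk(path, fileLength, range[0], range[1]);
            if (storeChunk == null) {
                continue; // out-of-range request: a normal state, not an error
            }
            chunks.add(storeChunk);
        }
        return chunks;
    }

    // Builds an intermediate fragment name from the shared constant.
    static String fragmentName(int index) {
        return INTER_FILE_PREFIX + index;
    }
}
```

In this sketch, a request that runs past the end of the file is dropped rather
than treated as a failure, which is the behaviour described for Task.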
If you agree with my additional changes, I'll commit the patch with them
included.
Thank you very much for your contribution!
> Worker should directly read Intermediate data stored in localhost rather than
> fetching
> --------------------------------------------------------------------------------------
>
> Key: TAJO-983
> URL: https://issues.apache.org/jira/browse/TAJO-983
> Project: Tajo
> Issue Type: Bug
> Components: data shuffle
> Reporter: Hyunsik Choi
> Assignee: Mai Hai Thanh
> Attachments: TAJO-983.140820.0.patch.txt, TAJO-983.140822.patch.txt,
> TAJO-983.140825.1.patch.txt, TAJO-983.140902.patch, TAJO-983.140904.patch,
> TAJO-983.140916.patch, TAJO-983.140918.patch, TAJO-983.140922.patch,
> TAJO-983.140923.patch, TAJO-983.140924-Hyunsik.patch
>
>
> Currently, a worker always fetches all intermediate data via Fetcher and then
> stores it in the local file system, even when some of that data is already
> stored in the local file system. This is inefficient: it causes unnecessary
> I/O and extra storage use. We should improve it.