[ 
https://issues.apache.org/jira/browse/HAMA-258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13143820#comment-13143820
 ] 

ChiaHung Lin commented on HAMA-258:
-----------------------------------

The current patch supports contiguous access mode. In addition, our framework 
also needs random access to the underlying data. Both can be achieved by 
expressing the data layout. 

The idea is borrowed from MPI I/O. A file consists of one or more file types, 
each a template/pattern describing the data layout, and a displacement, the 
absolute position at which the first file type begins. A file type consists of 
elementary types, the basic construction unit (e.g. byte), and holes that 
define non-accessible areas. Suppose there are 3 processes (or tasks). The 
first process holds a file type containing 1 elementary type and 5 holes (total 
size is 6 elementary types), with its elementary type sitting at the first 
position (e.g. array[0]). The second holds 2 elementary types and 4 holes, with 
the elementary types at the 2nd and 3rd positions (array[1] and array[2]). The 
third holds 3 elementary types and 3 holes, with the elementary types at the 
4th, 5th, and 6th positions (array[3], array[4], array[5]). Holes occupy the 
positions not taken by elementary types, marking that data as non-accessible 
to the process. 
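
The three-process layout described above can be sketched in Java roughly as 
follows. This is only an illustration, not part of the patch: the class name 
FileType, the method visibleOffsets, and the idea of tiling the pattern a fixed 
number of times are assumptions made for the example, and positions are counted 
in elementary types rather than bytes.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of a file type: a fixed-size pattern of elementary types and holes. */
final class FileType {
    // true = elementary type (accessible), false = hole (non-accessible)
    private final boolean[] accessible;

    FileType(int size, int... elementaryPositions) {
        accessible = new boolean[size];
        for (int p : elementaryPositions) {
            accessible[p] = true;
        }
    }

    /** Positions visible when the pattern is laid out `repeats` times starting at `displacement`. */
    List<Integer> visibleOffsets(int displacement, int repeats) {
        List<Integer> offsets = new ArrayList<>();
        for (int r = 0; r < repeats; r++) {
            for (int i = 0; i < accessible.length; i++) {
                if (accessible[i]) {
                    offsets.add(displacement + r * accessible.length + i);
                }
            }
        }
        return offsets;
    }
}

final class FileTypeDemo {
    public static void main(String[] args) {
        // The three file types from the description, each of total size 6.
        FileType first = new FileType(6, 0);
        FileType second = new FileType(6, 1, 2);
        FileType third = new FileType(6, 3, 4, 5);

        System.out.println(first.visibleOffsets(0, 2));  // [0, 6]
        System.out.println(second.visibleOffsets(0, 2)); // [1, 2, 7, 8]
        System.out.println(third.visibleOffsets(0, 2));  // [3, 4, 5, 9, 10, 11]
    }
}
```

In this sketch the three views do not overlap, and together they cover every 
position of the file. 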

The whole file can thus be expressed as the composition of the file types of 
the three processes/tasks, and each process has a view of the part that it 
wants to access. A contiguous split can therefore be expressed as the number of 
times an elementary type (e.g. byte) repeats (n x byte). For non-contiguous 
access (random access), a process specifies an elementary type and a layout 
(file type) describing the data it wants to access. 
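
To make the difference between the two access modes concrete, here is a minimal 
Java sketch over an in-memory byte array. The class ViewSketch and its methods 
readContiguous and readWithLayout are hypothetical names chosen for this 
example; they are not taken from the patch or from any Hama API, and the 
elementary type is assumed to be a single byte.

```java
import java.util.Arrays;

/** Hypothetical helpers showing contiguous and non-contiguous views of a file. */
final class ViewSketch {

    /** Contiguous split: n repetitions of the elementary type (byte) from displacement. */
    static byte[] readContiguous(byte[] file, int displacement, int n) {
        return Arrays.copyOfRange(file, displacement, displacement + n);
    }

    /**
     * Non-contiguous access: the file type has fileTypeSize slots, and only the
     * slots listed in positions are elementary types; the rest are holes.
     */
    static byte[] readWithLayout(byte[] file, int displacement,
                                 int fileTypeSize, int[] positions) {
        // Only whole file types that fit after the displacement are read.
        int repeats = (file.length - displacement) / fileTypeSize;
        byte[] out = new byte[repeats * positions.length];
        int k = 0;
        for (int r = 0; r < repeats; r++) {
            int base = displacement + r * fileTypeSize;
            for (int p : positions) {
                out[k++] = file[base + p];
            }
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] file = new byte[12];
        for (int i = 0; i < file.length; i++) {
            file[i] = (byte) i; // each byte stores its own position
        }
        // Contiguous split: 4 x byte starting at position 4.
        System.out.println(Arrays.toString(readContiguous(file, 4, 4)));
        // prints [4, 5, 6, 7]

        // Non-contiguous: 2nd and 3rd slot of every 6-slot file type.
        System.out.println(Arrays.toString(readWithLayout(file, 0, 6, new int[] {1, 2})));
        // prints [1, 2, 7, 8]
    }
}
```

The contiguous case needs only a displacement and a repeat count, while the 
non-contiguous case additionally needs the layout. 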

The benefit, according to the MPI I/O report, is that its design favours the 
common usage patterns (90%) that correspond to real-world requirements. 
                
> Design a input and output system
> --------------------------------
>
>                 Key: HAMA-258
>                 URL: https://issues.apache.org/jira/browse/HAMA-258
>             Project: Hama
>          Issue Type: New Feature
>          Components: bsp
>    Affects Versions: 0.3.0
>            Reporter: Edward J. Yoon
>            Assignee: Edward J. Yoon
>             Fix For: 0.4.0
>
>         Attachments: HAMA-258_improved.patch, IONoInput.patch, io_v01.patch, 
> io_v02.patch, io_v03.patch, io_v04.patch
>
>
> This issue will handle the input and output system with data splitter.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
