[
https://issues.apache.org/jira/browse/HAMA-258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13143820#comment-13143820
]
ChiaHung Lin commented on HAMA-258:
-----------------------------------
The current patch supports contiguous access mode. In addition, our
framework needs random access to the underlying data.
Both can be achieved by expressing the data layout.
The idea is borrowed from MPI I/O. A file consists of file type(s), each a
template/pattern describing the data layout, and a displacement, the absolute
position at which the first file type begins. A file type consists of
elementary types, the basic construction unit (e.g. byte), and holes, which
define non-accessible areas. Suppose there are 3 processes (or tasks). The first
process holds a file type containing 1 elementary type and 5 holes (total size
is 6 elementary types), with its elementary type at the first position
(e.g. array[0]). The second holds 2 elementary types and 4 holes, with its
elementary types at the 2nd and 3rd positions (array[1] and array[2]). The
third holds 3 elementary types and 3 holes, with its elementary types at
the 4th, 5th, and 6th positions (array[3], array[4], array[5]). Holes occupy the
positions not taken by elementary types, marking that data as non-accessible.
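The three-process example above can be sketched roughly as follows. This is
only an illustration, not code from the patch: the class FileTypeSketch and
the method visibleOffsets are hypothetical names, and tiling the template
repeatedly from the displacement is an assumption taken from how MPI I/O file
views behave.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of a file type; none of these names come from the Hama patch. */
final class FileTypeSketch {
    private final int extent;       // total slots in one template (elementary types + holes)
    private final int[] accessible; // slots inside the template that hold elementary types

    FileTypeSketch(int extent, int... accessible) {
        for (int slot : accessible) {
            if (slot < 0 || slot >= extent) {
                throw new IllegalArgumentException("slot out of range: " + slot);
            }
        }
        this.extent = extent;
        this.accessible = accessible.clone();
    }

    /**
     * Absolute offsets (in elementary-type units) that are visible when the
     * template is laid down `repeats` times starting at `displacement`.
     * Every other offset is a hole for this process.
     */
    List<Long> visibleOffsets(long displacement, int repeats) {
        List<Long> offsets = new ArrayList<>();
        for (int r = 0; r < repeats; r++) {
            for (int slot : accessible) {
                offsets.add(displacement + (long) r * extent + slot);
            }
        }
        return offsets;
    }

    public static void main(String[] args) {
        // The three processes from the example, each with a template of size 6.
        FileTypeSketch first = new FileTypeSketch(6, 0);
        FileTypeSketch second = new FileTypeSketch(6, 1, 2);
        FileTypeSketch third = new FileTypeSketch(6, 3, 4, 5);
        System.out.println(first.visibleOffsets(0, 2));  // [0, 6]
        System.out.println(second.visibleOffsets(0, 2)); // [1, 2, 7, 8]
        System.out.println(third.visibleOffsets(0, 2));  // [3, 4, 5, 9, 10, 11]
    }
}
```

Taken together the three views cover every offset exactly once, which is what
lets the file be described as the composition of the per-process file types.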
The whole file can thus be expressed as the composition of the three
processes/tasks, and each process has a view of the part that it wants to access.
Therefore, a contiguous split can be expressed by stating how many times
an elementary type (e.g. byte) repeats (n x byte). For non-contiguous
access (random access), a process can specify an elementary type and a layout
(file type) to describe the data it wants to access.
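As a rough illustration of the contiguous case, a split can be reduced to a
displacement plus a repeat count of the elementary type (n x byte), with no
holes. The class ContiguousSplitSketch and the near-even split policy below
are assumptions made for this sketch, not something the patch defines.

```java
/** Hypothetical sketch: a contiguous split as (displacement, n x byte). */
final class ContiguousSplitSketch {

    /**
     * Returns {displacement, count} for one task, where count is the number
     * of bytes repeated back to back from displacement. The file is divided
     * as evenly as possible; the first (fileLength % tasks) tasks get one
     * extra byte.
     */
    static long[] split(long fileLength, int tasks, int taskIndex) {
        if (tasks <= 0 || taskIndex < 0 || taskIndex >= tasks) {
            throw new IllegalArgumentException("invalid task index or task count");
        }
        long base = fileLength / tasks;
        long remainder = fileLength % tasks;
        long count = base + (taskIndex < remainder ? 1 : 0);
        long displacement = taskIndex * base + Math.min(taskIndex, remainder);
        return new long[] {displacement, count};
    }

    public static void main(String[] args) {
        // A 10-byte file shared by 3 tasks.
        for (int task = 0; task < 3; task++) {
            long[] s = split(10, 3, task);
            System.out.println("task " + task + ": displacement=" + s[0] + ", n=" + s[1]);
        }
        // task 0: displacement=0, n=4
        // task 1: displacement=4, n=3
        // task 2: displacement=7, n=3
    }
}
```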
The benefit, according to the MPI I/O report, is that its design favours the
common usage patterns (90%) that correspond to real-world requirements.
> Design a input and output system
> --------------------------------
>
> Key: HAMA-258
> URL: https://issues.apache.org/jira/browse/HAMA-258
> Project: Hama
> Issue Type: New Feature
> Components: bsp
> Affects Versions: 0.3.0
> Reporter: Edward J. Yoon
> Assignee: Edward J. Yoon
> Fix For: 0.4.0
>
> Attachments: HAMA-258_improved.patch, IONoInput.patch, io_v01.patch,
> io_v02.patch, io_v03.patch, io_v04.patch
>
>
> This issue will handle the input and output system with data splitter.