[ 
https://issues.apache.org/jira/browse/GIRAPH-483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13558454#comment-13558454
 ] 

Nitay Joffe commented on GIRAPH-483:
------------------------------------

Sounds good, go ahead and assign it to yourself. I look forward to seeing the
diff :)
                
> InputSplit needs to be Writable
> -------------------------------
>
>                 Key: GIRAPH-483
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-483
>             Project: Giraph
>          Issue Type: Improvement
>            Reporter: Nitay Joffe
>            Priority: Minor
>
> Working on Hive I/O recently, I found this out the hard way...
> We use InputSplit in Giraph in order to make things work easily with Hadoop. 
> However, our usage of the interface is not actually consistent. Specifically, 
> in InputSplitsCallable#getInputSplit we have the following:
>   ((Writable) inputSplit).readFields(inputStream);
> This means our InputSplit has to be Writable. If it's not (as mine wasn't 
> initially when implementing a new input format), things break badly. As a 
> simple start, we should at least put an instanceof check around that cast 
> and emit an informative error message.
> Furthermore, looking deeper into it, I noticed we never actually use the 
> getLength() method in InputSplit, just getLocations(). So really the "right" 
> way to have things IMO is to have our own GiraphInputSplit interface, which 
> extends Writable and has the getLocations() method.
> Doing this is tricky, though, as it will likely break existing I/O formats, so 
> it will require some care...
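
For illustration only, here is a rough sketch of the two ideas in the description: the instanceof check with an informative error, and a GiraphInputSplit interface that extends Writable and exposes getLocations(). It is not the actual Giraph code. The local Writable interface is a stand-in mirroring org.apache.hadoop.io.Writable so the sketch compiles without Hadoop, and HostListSplit, SplitReader and readSplit are hypothetical names. The split is passed as Object here only to avoid depending on Hadoop's InputSplit class.

```java
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

/** Stand-in for org.apache.hadoop.io.Writable (assumption: same two methods). */
interface Writable {
  void write(DataOutput out) throws IOException;
  void readFields(DataInput in) throws IOException;
}

/** Hypothetical interface from the proposal: Writable plus getLocations(). */
interface GiraphInputSplit extends Writable {
  String[] getLocations() throws IOException;
}

/** Hypothetical example split that carries nothing but its host list. */
class HostListSplit implements GiraphInputSplit {
  private String[] hosts = new String[0];

  HostListSplit() { }

  HostListSplit(String... hosts) {
    this.hosts = hosts.clone();
  }

  @Override
  public void write(DataOutput out) throws IOException {
    out.writeInt(hosts.length);
    for (String host : hosts) {
      out.writeUTF(host);
    }
  }

  @Override
  public void readFields(DataInput in) throws IOException {
    int count = in.readInt();
    hosts = new String[count];
    for (int i = 0; i < count; i++) {
      hosts[i] = in.readUTF();
    }
  }

  @Override
  public String[] getLocations() {
    return hosts.clone();
  }
}

/** Hypothetical helper showing the guarded cast. */
final class SplitReader {
  private SplitReader() { }

  static void readSplit(Object inputSplit, DataInput in) throws IOException {
    // Fail early with a clear message instead of a bare ClassCastException.
    if (!(inputSplit instanceof Writable)) {
      throw new IllegalStateException("Input split class "
          + inputSplit.getClass().getName()
          + " must implement Writable so it can be deserialized");
    }
    ((Writable) inputSplit).readFields(in);
  }
}
```

With the interface approach the cast disappears entirely, since every GiraphInputSplit is a Writable by construction. The guarded cast is the smaller change and would not break existing I/O formats.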
