[
https://issues.apache.org/jira/browse/SQOOP-1900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14246173#comment-14246173
]
Jarek Jarcec Cecho commented on SQOOP-1900:
-------------------------------------------
It's a way for the IDF to serialize (and read) data in its internal format,
so that the data doesn't have to be converted to text or to objects that the
rest of the Sqoop framework understands. I believe that from a performance
perspective, those methods are the fastest option, correct? I'm not too
familiar with Spark, but I would assume that even there we will need to
provide a way to transfer data from machine to machine?
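To illustrate the idea, here is a minimal sketch of such a pair of methods. The class name {{CsvRecord}} and its single-string internal form are assumptions for illustration, not the actual Sqoop IDF implementation; the point is only that {{write}}/{{read}} move the internal representation directly over {{DataOutput}}/{{DataInput}}, skipping any conversion step:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical IDF-like record that keeps its data as one CSV text line.
class CsvRecord {
    private String csvText;

    CsvRecord(String csvText) { this.csvText = csvText; }
    CsvRecord() { this(""); }

    String getCsvText() { return csvText; }

    // Serialize the internal representation directly to out.
    public void write(DataOutput out) throws IOException {
        out.writeUTF(csvText);
    }

    // Deserialize from in, re-using this object's storage.
    public void read(DataInput in) throws IOException {
        csvText = in.readUTF();
    }

    public static void main(String[] args) throws IOException {
        CsvRecord original = new CsvRecord("1,'jarcec',true");

        // Simulate a machine-to-machine transfer with a byte buffer.
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        original.write(new DataOutputStream(buffer));

        CsvRecord copy = new CsvRecord();
        copy.read(new DataInputStream(
                new ByteArrayInputStream(buffer.toByteArray())));
        System.out.println(copy.getCsvText());
    }
}
```

The same pattern is what Hadoop's {{Writable}} interface uses for shuffling records between machines, which is presumably why the IDF API mirrors it.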
> IDF API read/write method
> --------------------------
>
> Key: SQOOP-1900
> URL: https://issues.apache.org/jira/browse/SQOOP-1900
> Project: Sqoop
> Issue Type: Sub-task
> Components: sqoop2-framework
> Reporter: Veena Basavaraj
> Fix For: 1.99.5
>
>
> At this point I am not clear what the real use of the following 2 methods
> in the IDF API is. Can anyone explain? I have not seen them used anywhere
> in the code; I might be missing something.
> {code}
> /**
> * Serialize the fields of this object to <code>out</code>.
> *
> * @param out <code>DataOutput</code> to serialize this object into.
> * @throws IOException
> */
> public abstract void write(DataOutput out) throws IOException;
> /**
> * Deserialize the fields of this object from <code>in</code>.
> *
> * <p>For efficiency, implementations should attempt to re-use storage in
> * the existing object where possible.</p>
> *
> * @param in <code>DataInput</code> to deserialize this object from.
> * @throws IOException
> */
> public abstract void read(DataInput in) throws IOException;
> {code}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)