Veena Basavaraj created SQOOP-1989:
--------------------------------------
Summary: IDF optimization to store object array in memory (so
matching is faster)
Key: SQOOP-1989
URL: https://issues.apache.org/jira/browse/SQOOP-1989
Project: Sqoop
Issue Type: Sub-task
Reporter: Veena Basavaraj
According to the IDF API, we never cache or store the CSV text or the object
array representation in memory; we only store the native format, as below:
{code}
/**
 * Get one row of data.
 *
 * @return - One row of data, represented in the internal/native format of
 * the intermediate data format implementation.
 */
public T getData() {
  return data;
}
{code}
But the matcher code in SqoopWritable always calls setObjectData and
getObjectData on every row of the data, which means we exercise these calls no
matter what the native format is:
toIDF.setObjectData(matcher.getMatchingData(fromIDF.getObjectData()));
So should we not store the object array in memory?
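
To make the question concrete, here is a rough sketch of what caching the object array next to the native format might look like. This is not the actual Sqoop IDF code: the class name CachingCsvFormat, the parseCount counter, and the naive comma split/join (no quoting or escaping) are all assumptions made for illustration only.

```java
import java.util.Arrays;

// Hypothetical sketch only; not the real Sqoop IntermediateDataFormat.
// The native format is assumed to be plain CSV text without quoting.
public class CachingCsvFormat {
    private String data;            // native format (CSV text)
    private Object[] cachedObjects; // cached object array; null when stale
    private int parseCount = 0;     // illustration only: counts real parses

    public String getData() {
        return data;
    }

    public void setData(String csv) {
        this.data = csv;
        this.cachedObjects = null;  // native data changed, cache is stale
    }

    // Parses the native format at most once per row and reuses the result.
    public Object[] getObjectData() {
        if (cachedObjects == null && data != null) {
            String[] parts = data.split(",", -1);
            // Copy into a true Object[] so callers may store non-String values.
            cachedObjects = Arrays.copyOf(parts, parts.length, Object[].class);
            parseCount++;
        }
        return cachedObjects;
    }

    // Keeps the object array in memory and updates the native format too.
    public void setObjectData(Object[] objects) {
        this.cachedObjects = objects;
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < objects.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(String.valueOf(objects[i]));
        }
        this.data = sb.toString();
    }

    public int getParseCount() {
        return parseCount;
    }
}
```

With something along these lines, repeated getObjectData calls on the same row would return the same array without re-parsing. The trade-off is extra memory per row, and the cached array is handed out without a defensive copy, so callers could mutate it.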
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)