[ 
https://issues.apache.org/jira/browse/SQOOP-1901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14259152#comment-14259152
 ] 

Veena Basavaraj edited comment on SQOOP-1901 at 12/26/14 5:27 PM:
------------------------------------------------------------------

Summary from the review board discussions.


In the case of CSVIDF, data is the CSV string, so we construct it eagerly
whenever the object array is set: setObjectData(..) builds the csvText
(represented by data) right away. Everything else is lazy, i.e. in the case of
CSVIDF the object array is constructed from the csvText (i.e. data) only when
getObjectData() is called; we do not store the object array.
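
To make the CSVIDF behaviour above concrete, here is a minimal Java sketch. It
is not the actual Sqoop class: the name CsvIdfSketch and the naive comma
joining/splitting (no quoting or escaping, and every field comes back as a
String) are assumptions made purely for illustration.

```java
// Hypothetical, simplified stand-in for the CSV IDF; not the real Sqoop code.
class CsvIdfSketch {
    // The only stored representation: the CSV text.
    private String data;

    void setData(String csvText) { this.data = csvText; }

    String getData() { return data; }

    // Eager: build the CSV text immediately; the array itself is not kept.
    void setObjectData(Object[] fields) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(String.valueOf(fields[i]));
        }
        this.data = sb.toString();
    }

    // Lazy: rebuild an array from the CSV text each time it is requested.
    Object[] getObjectData() {
        if (data == null) {
            return null;
        }
        return data.split(",", -1); // naive split, assumes no embedded commas
    }
}
```

With this sketch, calling setObjectData(new Object[] {1, "a"}) leaves only the
text "1,a" in data, and getObjectData() re-parses that text on each call.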

In the case of JSONIDF, data is the JSONObject (for any other IDF, say AvroIDF,
data is the Avro object), so we keep only the JSON/Avro object in memory and
lazily construct both the CSV text and the object array.

Here are the high-level details of what each method in the JSON IDF will do.

1. Store the source of truth, i.e. the JSONObject, in data.
2. When setData is called, store the JSONObject in data and nothing else;
everything else is lazy.
3. When setObjectData is called, construct the JSONObject from the object
array; do not store the CSVText or the objectArray.
4. When setCsvText is called, likewise construct the JSONObject from it; do not
store the CSVText or the objectArray.
5. When getObjectData is called, convert from the JSONObject to an objectArray.
This is done on demand (i.e. lazily), since we are not sure these methods will
be called in all cases.
6. When getCSVText is called, convert from the JSONObject to CSVText, on demand
as above.
7. When getData is called, return the JSONObject.
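
The method list above could look roughly like the following Java sketch. It is
only an illustration under several assumptions: a LinkedHashMap stands in for
the JSONObject so the example needs no JSON library, the ordered column names
passed to the constructor stand in for a schema, and the CSV handling is a
naive comma split/join. The class name JsonIdfSketch is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; a Map stands in for the JSONObject.
class JsonIdfSketch {
    private final String[] columns;    // assumed schema: ordered column names
    private Map<String, Object> data;  // the only stored state

    JsonIdfSketch(String... columns) { this.columns = columns; }

    // Store the JSON object and nothing else.
    void setData(Map<String, Object> json) { this.data = json; }

    // Return the JSON object as is.
    Map<String, Object> getData() { return data; }

    // Build the JSON object from the array; the array is not stored.
    void setObjectData(Object[] fields) {
        if (fields.length != columns.length) {
            throw new IllegalArgumentException("field count does not match columns");
        }
        Map<String, Object> json = new LinkedHashMap<>();
        for (int i = 0; i < columns.length; i++) {
            json.put(columns[i], fields[i]);
        }
        this.data = json;
    }

    // Build the JSON object from CSV text; the text is not stored.
    void setCSVText(String csv) { setObjectData(csv.split(",", -1)); }

    // On demand: JSON object -> object array.
    Object[] getObjectData() {
        if (data == null) {
            return null;
        }
        Object[] out = new Object[columns.length];
        for (int i = 0; i < columns.length; i++) {
            out[i] = data.get(columns[i]);
        }
        return out;
    }

    // On demand: JSON object -> CSV text.
    String getCSVText() {
        if (data == null) {
            return null;
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < columns.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(String.valueOf(data.get(columns[i])));
        }
        return sb.toString();
    }
}
```

Note that values set through setCSVText come back as Strings in this sketch,
because no type information is applied during the naive split.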









> Supporting DRY code in new IDF implementation JSONIDF
> -----------------------------------------------------
>
>                 Key: SQOOP-1901
>                 URL: https://issues.apache.org/jira/browse/SQOOP-1901
>             Project: Sqoop
>          Issue Type: Sub-task
>          Components: sqoop2-framework
>            Reporter: Veena Basavaraj
>            Assignee: Veena Basavaraj
>             Fix For: 1.99.5
>
>         Attachments: SQOOP-1901-v2.patch
>
>
> As the title suggests, we want to encourage DRY code in the new IDF 
> implementations.
> As the IDF API mandates CSV and object formats for all of its
> sub-implementations, I propose we move the common functionality to the base
> IDF class so that JSON IDF or AvroIDF does not have to repeat this code.
> The only part of the code that needs to be in the subclasses is how they
> handle the conversion between the "T" (generic parameter) and the CSV/object
> representations.
> I saw that http://ingest.tips/2014/12/11/sqoop-1-99-4-release/ mentions
> extending from CSVIDF, and this cannot technically work since we have the
> generic T, which will be different for AvroIDF or JSON IDF.
> Update:
> Also, extending from CSVIDF seems a bit illogical: since the IDF API says
> that it needs CSV and an object array, the functionality for converting
> between the two, i.e. text to object and object to text, should be in the
> base class.
>  
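
The proposal in the description above can be pictured with a short Java sketch.
Everything here is hypothetical and not taken from the patch: the class names
BaseIdfSketch and ListIdfSketch, the helper names, and the naive comma-based
CSV handling are assumptions. The point is only that the text/object
conversion lives once in the base class, while each subclass supplies the
conversion between its own T and the object array.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical base class: shared logic lives here, T varies per subclass.
abstract class BaseIdfSketch<T> {
    protected T data;

    T getData() { return data; }

    void setData(T data) { this.data = data; }

    // Shared: object array -> CSV text (naive, no quoting or escaping).
    protected static String toCsv(Object[] fields) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(String.valueOf(fields[i]));
        }
        return sb.toString();
    }

    // Shared: CSV text -> object array (naive, every field is a String).
    protected static Object[] fromCsv(String csv) {
        return csv.split(",", -1);
    }

    // Subclass-specific: conversion between T and the object array.
    protected abstract T fromObjects(Object[] fields);

    protected abstract Object[] toObjects(T data);

    void setObjectData(Object[] fields) { data = fromObjects(fields); }

    Object[] getObjectData() { return toObjects(data); }

    void setCSVText(String csv) { data = fromObjects(fromCsv(csv)); }

    String getCSVText() { return toCsv(toObjects(data)); }
}

// Example subclass where T is a List<Object>; a JSON or Avro IDF would
// plug in its own native type in the same way.
class ListIdfSketch extends BaseIdfSketch<List<Object>> {
    @Override
    protected List<Object> fromObjects(Object[] fields) {
        return new ArrayList<>(Arrays.asList(fields));
    }

    @Override
    protected Object[] toObjects(List<Object> data) {
        return data.toArray();
    }
}
```

A subclass written this way contains no CSV code at all, which is the
duplication the issue aims to avoid.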



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
