[
https://issues.apache.org/jira/browse/SQOOP-1811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14229162#comment-14229162
]
Jarek Jarcec Cecho commented on SQOOP-1811:
-------------------------------------------
I see. Let me restate the proposal in my own words to make sure that I'm on
the same page. The proposal is that the parent {{IntermediateDataFormat}} class
should hold the {{text}} variable containing the Sqoop CSV-ish format of a
given row. This variable will be private to the parent class and accessible to
the outside world and to child implementations only via the public
{{setCSVTextData()}} and {{getCSVTextData()}} methods. If my understanding is
correct, then I have a couple of concerns:
* Defining the variable {{text}} as final means that we will need to
instantiate a new IDF object for every transferred row, whereas today we use
one IDF instance per extractor/loader instance.
* When using methods {{setData()}} and {{setObjectData()}}, the IDF
implementation is responsible for converting the given data into CSV-ish text
and calling method {{setSqoopCSVString()}}. This means that we will always
convert the data into CSV-ish text, regardless of whether we need it or not.
* When calling methods {{getData()}} and {{getObjectData()}}, the IDF
implementation is responsible for checking whether {{text}} is in sync with the
internal representation, because otherwise we might end up with data
corruption. E.g. calling {{setSqoopCSVString()}} won't alter the internal
representation accordingly, and therefore calls to {{getData()}} and
{{getObjectData()}} have to be guarded.
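
To make the second and third concerns concrete, here is a rough sketch of one way a base class could keep the CSV-ish text and the internal representation in sync while converting only on demand. This is not the actual Sqoop code: {{SketchIdf}}, {{StringArrayIdf}}, {{toCsv()}}, {{fromCsv()}} and the {{conversions}} counter are hypothetical names used for illustration, the accessor names are taken from the proposal, and the comma split ignores the quoting and escaping rules of the real Sqoop CSV format.

```java
// Hypothetical sketch only; not the real IntermediateDataFormat class.
abstract class SketchIdf<T> {
    // Non-final, so a single instance can be reused for every row.
    private String text;        // cached CSV-ish text of the current row
    private T data;             // internal (native) representation of the row
    private boolean textStale;  // data was set after text was last computed
    private boolean dataStale;  // text was set after data was last computed

    public final String getCSVTextData() {
        if (textStale) {        // convert lazily, only when text is requested
            text = toCsv(data);
            textStale = false;
        }
        return text;
    }

    public final void setCSVTextData(String text) {
        this.text = text;
        this.textStale = false;
        this.dataStale = true;  // internal representation is now out of date
    }

    public final T getData() {
        if (dataStale) {        // guard against returning out-of-sync data
            data = fromCsv(text);
            dataStale = false;
        }
        return data;
    }

    public final void setData(T data) {
        this.data = data;
        this.dataStale = false;
        this.textStale = true;  // no eager conversion to CSV-ish text
    }

    // Conversion hooks supplied by each (hypothetical) implementation.
    protected abstract String toCsv(T data);
    protected abstract T fromCsv(String text);
}

// Toy implementation; naive comma handling is an assumption for brevity.
final class StringArrayIdf extends SketchIdf<String[]> {
    int conversions = 0;        // counts conversions, for demonstration only

    @Override
    protected String toCsv(String[] fields) {
        conversions++;
        return String.join(",", fields);
    }

    @Override
    protected String[] fromCsv(String text) {
        conversions++;
        return text.split(",", -1);
    }
}
```

With this shape, calling {{setData()}} alone performs no conversion, and a later {{setCSVTextData()}} marks the internal representation as stale so that the next {{getData()}} re-parses the text instead of returning the old row.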
Let me know if that makes sense.
> IDF API changes
> ---------------
>
> Key: SQOOP-1811
> URL: https://issues.apache.org/jira/browse/SQOOP-1811
> Project: Sqoop
> Issue Type: Sub-task
> Components: sqoop2-framework
> Reporter: Veena Basavaraj
> Fix For: 1.99.5
>
>
> 1. Update the Javadocs for the IDF APIs.
> 2. Make getTextData final and call the pair getCSV and setCSV, so it is
> obvious that we want to enforce the CSV format.
> The following code can move to the base class IntermediateDataFormat and be
> made final, so there is no way to override it and we can enforce that all
> implementations return String instead of the generic T:
> {code}
> // hold the string in IDF base class
> private final String text;
>
> public final String getCSVTextData() {
> return text;
> }
>
> public final void setCSVTextData(String text) {
> this.text = text;
> }
> {code}
> There is code in the CSVIDF implementation that has the rules for CSV parsing;
> it can be pulled out into CSV utils so that the connectors can use it.
> The T in the CSV IDF happens to be String, which is just a coincidence. If I
> write a new IDF implementation, T can be a custom object that encapsulates
> the whole row.
> Third, getData and setData can have custom implementations, so they can be
> overridden to return the generic type T.