[ 
https://issues.apache.org/jira/browse/SQOOP-1804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14257594#comment-14257594
 ] 

Veena Basavaraj commented on SQOOP-1804:
----------------------------------------

Not sure I understand what is limiting scope mean. can you elaborate?There are 
multiple connotations to it, so I would welcome a explicit answer

I have also iterated before, that the output is not a long always, It is 
completely acceptable to store a boolean saying the type of write ( append or 
merge or overwrite) that was done, so the next run the connector knows what to 
do, it can be anything and the point is to not limit this to any connector.

Consider the effort required to add this support later? migrating from counters 
to another table? 

> Respository Structure + API: Storing/Retrieving the From/To state of the 
> incremental read/ write
> ------------------------------------------------------------------------------------------------
>
>                 Key: SQOOP-1804
>                 URL: https://issues.apache.org/jira/browse/SQOOP-1804
>             Project: Sqoop
>          Issue Type: Sub-task
>            Reporter: Veena Basavaraj
>            Assignee: Veena Basavaraj
>             Fix For: 1.99.5
>
>
> Details of this proposal are in the wiki.
> https://cwiki.apache.org/confluence/display/SQOOP/Delta+Fetch+And+Merge+Design#DeltaFetchAndMergeDesign-Wheretostoretheoutputinsqoop?
> Update: The above highlights the pros and cons of each approach. 
> #4 is chosen, since it is less intrusive, more clean and allows U/Edit per 
> value in the output easily.
> Will use this ticket for more detailed discussion on storage options for the 
> output from connectors
> 1. 
> {code}
> // will have FK to submission
>  public static final String QUERY_CREATE_TABLE_SQ_JOB_OUTPUT_SUBMISSION =
>      "CREATE TABLE " + TABLE_SQ_JOB_OUTPUT + " ("
>      + COLUMN_SQ_JOB_OUT_ID + " BIGINT GENERATED ALWAYS AS IDENTITY (START 
> WITH 1, INCREMENT BY 1), "
>      + COLUMN_SQ_JOB_OUT_KEY + " VARCHAR(32), "
>      + COLUMN_SQ_JOB_OUT_VALUE + " LONG VARCHAR,"
>      + COLUMN_SQ_JOB_OUT_TYPE + " VARCHAR(32),"
>      + COLUMN_SQD_ID + " VARCHAR(32)," // FK to the direction table, since 
> this allows to distinguish output from FROM/ TO part of the job
>    + COLUMN_SQRS_SUBMISSION + " BIGINT, "
>    + "CONSTRAINT " + CONSTRAINT_SQRS_SQS + " "
>      + "FOREIGN KEY (" + COLUMN_SQRS_SUBMISSION + ") "
>        + "REFERENCES " + TABLE_SQ_SUBMISSION + "(" + COLUMN_SQS_ID + ") ON 
> DELETE CASCADE "
> {code}
> 2.
> At the code level, we will define  MOutputType, one of the types can be BLOB 
> as well, if a connector decides to store the value as a BLOB
> {code}
> class JobOutput {
> String key;
> Object value;
> MOutputType type;
> }
> {code}
> 3. 
> At the repository API, add a new API to get job output for a particular 
> submission Id and allow updates on values. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to