
Veena Basavaraj (JIRA) Tue, 23 Dec 2014 19:37:30 -0800

    [ 
https://issues.apache.org/jira/browse/SQOOP-1804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14257895#comment-14257895
 ]


Veena Basavaraj edited comment on SQOOP-1804 at 12/24/14 3:37 AM:
------------------------------------------------------------------

[~gwenshap] I am not sure I got the point of declaring it in advance. Where do 
you suggest this declaring in advance will be done? They will add it to the 
context as a key-value pair and use it. 

I am still not sure why supporting a generic way is turning out to be complex, 
according to your comments.

The offline discussion points were these:

1. We described the reasoning for storing STATE / OUTPUT, and decided to call 
it STATE if that conveys the message.

2. Adding a new table was agreed upon.

3. Gwen suggested that we don't have to duplicate the keys/names of the state 
fields in every row per submission, so two tables would normalize the data:

SQ_JOB_STATE_KEY

ID
TYPE
NAME
DIRECTION

----
SQ_JOB_STATE_VALUE
VALUE
SUBMISSION_ID
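
To make the normalization in point 3 concrete, here is a minimal in-memory sketch of how the two tables could relate. It is not Sqoop repository code: the class and method names are hypothetical, and the key reference on the value row (keyId) is an assumption, since the column list above does not show how a value row points back to its key.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model of the two proposed tables (illustration only).
class JobStateSketch {

    // One row of SQ_JOB_STATE_KEY: ID, TYPE, NAME, DIRECTION.
    static final class StateKey {
        final long id;
        final String type;
        final String name;
        final String direction;

        StateKey(long id, String type, String name, String direction) {
            this.id = id;
            this.type = type;
            this.name = name;
            this.direction = direction;
        }
    }

    // One row of SQ_JOB_STATE_VALUE: VALUE, SUBMISSION_ID.
    // keyId is an ASSUMED reference back to SQ_JOB_STATE_KEY.ID.
    static final class StateValue {
        final long keyId;
        final long submissionId;
        final String value;

        StateValue(long keyId, long submissionId, String value) {
            this.keyId = keyId;
            this.submissionId = submissionId;
            this.value = value;
        }
    }

    private final Map<Long, StateKey> keys = new LinkedHashMap<>();
    private final List<StateValue> values = new ArrayList<>();

    // The key name/type/direction is stored once, not once per submission.
    void defineKey(long id, String type, String name, String direction) {
        keys.put(id, new StateKey(id, type, name, direction));
    }

    // Each submission only adds a small value row that points at an existing key.
    void putValue(long keyId, long submissionId, String value) {
        if (!keys.containsKey(keyId)) {
            throw new IllegalArgumentException("Unknown state key id: " + keyId);
        }
        values.add(new StateValue(keyId, submissionId, value));
    }

    // Joins the value rows of one submission with their key names.
    Map<String, String> stateFor(long submissionId) {
        Map<String, String> result = new LinkedHashMap<>();
        for (StateValue v : values) {
            if (v.submissionId == submissionId) {
                result.put(keys.get(v.keyId).name, v.value);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        JobStateSketch repo = new JobStateSketch();
        repo.defineKey(1L, "LONG", "last_value", "FROM");
        repo.putValue(1L, 100L, "42");
        repo.putValue(1L, 101L, "57");
        System.out.println(repo.stateFor(101L));
    }
}
```

In this sketch the key row for "last_value" is written once, and the two submissions each add only a value row.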



was (Author: vybs):
[~gwenshap] not sure I got the point of declaring it in advance, where do you 
suggest this declaring in advance will be done, they will add it to context as a 
key value pair and use it. 

I am still not sure why supporting a generic way is turning out to be complex? 

> Repository Structure + API: Storing/Retrieving the From/To state of the 
> incremental read/ write
> -----------------------------------------------------------------------------------------------
>
>                 Key: SQOOP-1804
>                 URL: https://issues.apache.org/jira/browse/SQOOP-1804
>             Project: Sqoop
>          Issue Type: Sub-task
>            Reporter: Veena Basavaraj
>            Assignee: Veena Basavaraj
>             Fix For: 1.99.5
>
>
> Details of this proposal are in the wiki.
> https://cwiki.apache.org/confluence/display/SQOOP/Delta+Fetch+And+Merge+Design#DeltaFetchAndMergeDesign-Wheretostoretheoutputinsqoop?
> Update: The above highlights the pros and cons of each approach. 
> #4 is chosen, since it is less intrusive, cleaner, and allows updating/editing 
> each value in the output easily.
> Will use this ticket for more detailed discussion on storage options for the 
> output from connectors.
> 1. 
> {code}
> // Will have an FK to the submission table
> public static final String QUERY_CREATE_TABLE_SQ_JOB_OUTPUT_SUBMISSION =
>     "CREATE TABLE " + TABLE_SQ_JOB_OUTPUT + " ("
>     + COLUMN_SQ_JOB_OUT_ID
>     + " BIGINT GENERATED ALWAYS AS IDENTITY (START WITH 1, INCREMENT BY 1), "
>     + COLUMN_SQ_JOB_OUT_KEY + " VARCHAR(32), "
>     + COLUMN_SQ_JOB_OUT_VALUE + " LONG VARCHAR, "
>     + COLUMN_SQ_JOB_OUT_TYPE + " VARCHAR(32), "
>     // FK to the direction table; distinguishes output from the FROM/TO part of the job
>     + COLUMN_SQD_ID + " VARCHAR(32), "
>     + COLUMN_SQRS_SUBMISSION + " BIGINT, "
>     + "CONSTRAINT " + CONSTRAINT_SQRS_SQS + " "
>     + "FOREIGN KEY (" + COLUMN_SQRS_SUBMISSION + ") "
>     + "REFERENCES " + TABLE_SQ_SUBMISSION + "(" + COLUMN_SQS_ID + ") "
>     + "ON DELETE CASCADE)";
> {code}
> 2.
> At the code level, we will define MOutputType. One of the types can be BLOB 
> as well, if a connector decides to store the value as a BLOB.
> {code}
> class JobOutput {
>   String key;
>   Object value;
>   MOutputType type;
> }
> {code}
> 3. 
> At the repository level, add a new API to get the job output for a particular 
> submission ID and to allow updates on values. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Comment Edited] (SQOOP-1804) Repository Structure + API: Storing/Retrieving the From/To state of the incremental read/ write
