[
https://issues.apache.org/jira/browse/SQOOP-1804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14257590#comment-14257590
]
Abraham Elmahrek commented on SQOOP-1804:
-----------------------------------------
[~vybs], my perspective on the goals: debug-ability (is it easy to just look up
in the database when debugging), ease of data migration, simplicity of
querying. With that, I have couple of thoughts:
# Providing the generic solution (storing all input types) will make it hard to
reduce the scope later if we would like to.
# Creating a separate table seems like the best solution if we want to support
generic input types. It's very debuggable, requires a join to query, and data
migrations will differ per repository implementation.
# Re-using counters seems like the way to go if we only support integer types
for now (limit the scope). It's very debuggable, can re-use existing counters
code, and migrating from integer to other types is a bit easier than the other
way around.
[~vybs], I feel limiting the scope is a good idea for future supportability
and, eventually, a better compatibility story.
> Respository Structure + API: Storing/Retrieving the From/To state of the
> incremental read/ write
> ------------------------------------------------------------------------------------------------
>
> Key: SQOOP-1804
> URL: https://issues.apache.org/jira/browse/SQOOP-1804
> Project: Sqoop
> Issue Type: Sub-task
> Reporter: Veena Basavaraj
> Assignee: Veena Basavaraj
> Fix For: 1.99.5
>
>
> Details of this proposal are in the wiki.
> https://cwiki.apache.org/confluence/display/SQOOP/Delta+Fetch+And+Merge+Design#DeltaFetchAndMergeDesign-Wheretostoretheoutputinsqoop?
> Update: The above highlights the pros and cons of each approach.
> #4 is chosen, since it is less intrusive, more clean and allows U/Edit per
> value in the output easily.
> Will use this ticket for more detailed discussion on storage options for the
> output from connectors
> 1.
> {code}
> // will have FK to submission
> public static final String QUERY_CREATE_TABLE_SQ_JOB_OUTPUT_SUBMISSION =
> "CREATE TABLE " + TABLE_SQ_JOB_OUTPUT + " ("
> + COLUMN_SQ_JOB_OUT_ID + " BIGINT GENERATED ALWAYS AS IDENTITY (START
> WITH 1, INCREMENT BY 1), "
> + COLUMN_SQ_JOB_OUT_KEY + " VARCHAR(32), "
> + COLUMN_SQ_JOB_OUT_VALUE + " LONG VARCHAR,"
> + COLUMN_SQ_JOB_OUT_TYPE + " VARCHAR(32),"
> + COLUMN_SQD_ID + " VARCHAR(32)," // FK to the direction table, since
> this allows to distinguish output from FROM/ TO part of the job
> + COLUMN_SQRS_SUBMISSION + " BIGINT, "
> + "CONSTRAINT " + CONSTRAINT_SQRS_SQS + " "
> + "FOREIGN KEY (" + COLUMN_SQRS_SUBMISSION + ") "
> + "REFERENCES " + TABLE_SQ_SUBMISSION + "(" + COLUMN_SQS_ID + ") ON
> DELETE CASCADE "
> {code}
> 2.
> At the code level, we will define MOutputType, one of the types can be BLOB
> as well, if a connector decides to store the value as a BLOB
> {code}
> class JobOutput {
> String key;
> Object value;
> MOutputType type;
> }
> {code}
> 3.
> At the repository API, add a new API to get job output for a particular
> submission Id and allow updates on values.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)