[
https://issues.apache.org/jira/browse/SQOOP-1804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14286852#comment-14286852
]
Veena Basavaraj edited comment on SQOOP-1804 at 1/22/15 2:58 AM:
-----------------------------------------------------------------
[~vinothchandar] As far as implementation is concerned, SQOOP-1803 has the
details. We won't add a new method; we just use the extract and load APIs in
the "Extractor" and "Loader" respectively.
The nice thing about the current MR engine implementation is that the output
committer controls when the job is finally done (success or failure). We will
pass these values to that commit phase and commit them to the Sqoop repository
only then. (This is true for both the Fetch and Merge configs.)
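{code}
class SqoopDestroyerOutputCommitter extends OutputCommitter {
  @Override
  public void setupJob(JobContext jobContext) {
    // Nothing to set up for the job.
  }

  @Override
  public void commitJob(JobContext jobContext) throws IOException {
    super.commitJob(jobContext);
    // The job is finally done here; "true" marks it as successful.
    invokeDestroyerExecutor(jobContext, true);
  }
  // Remaining OutputCommitter methods are omitted from this excerpt.
}
{code}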
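To illustrate the commit-only-on-success idea described above, here is a minimal sketch in plain Java with no Hadoop dependency. The names InMemoryRepository, DeferredOutputCommitter, record, saveJobOutput and abortJob are hypothetical and exist only for this illustration; they are not part of Sqoop or Hadoop, and the real change may look quite different.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the Sqoop repository; not the real Sqoop API.
class InMemoryRepository {
    private final Map<String, String> committed = new LinkedHashMap<>();

    // Copies the given values into the committed state.
    void saveJobOutput(Map<String, String> values) {
        committed.putAll(values);
    }

    Map<String, String> committedValues() {
        return committed;
    }
}

// Hypothetical committer: buffers output values and persists them only on commit.
class DeferredOutputCommitter {
    private final InMemoryRepository repository;
    private final Map<String, String> pending = new LinkedHashMap<>();

    DeferredOutputCommitter(InMemoryRepository repository) {
        this.repository = repository;
    }

    // Called while the job runs, for example from the extract or load step.
    void record(String key, String value) {
        pending.put(key, value);
    }

    // Job succeeded: persist everything that was buffered, then reset the buffer.
    void commitJob() {
        repository.saveJobOutput(pending);
        pending.clear();
    }

    // Job failed: discard the buffer so the repository keeps its last committed state.
    void abortJob() {
        pending.clear();
    }
}
```

With this shape, a failed run never overwrites the state left by the last successful run, which seems to be the property wanted for incremental reads and writes.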
> Repository Structure + API: Storing/Retrieving the From/To state of the
> incremental read/ write
> -----------------------------------------------------------------------------------------------
>
> Key: SQOOP-1804
> URL: https://issues.apache.org/jira/browse/SQOOP-1804
> Project: Sqoop
> Issue Type: Sub-task
> Reporter: Veena Basavaraj
> Assignee: Veena Basavaraj
> Fix For: 1.99.5
>
>
> Details of this proposal are in the wiki.
> https://cwiki.apache.org/confluence/display/SQOOP/Delta+Fetch+And+Merge+Design#DeltaFetchAndMergeDesign-Wheretostoretheoutputinsqoop?
> Update: The wiki page above lists the pros and cons of each approach.
> Approach #4 is chosen because it is less intrusive, cleaner, and makes it
> easy to update/edit each value in the output.
> This ticket will be used for more detailed discussion of the storage options
> for the output from connectors.
> 1.
> {code}
> // Will have an FK to the submission table.
> public static final String QUERY_CREATE_TABLE_SQ_JOB_OUTPUT_SUBMISSION =
>     "CREATE TABLE " + TABLE_SQ_JOB_OUTPUT + " ("
>     + COLUMN_SQ_JOB_OUT_ID + " BIGINT GENERATED ALWAYS AS IDENTITY (START WITH 1, INCREMENT BY 1), "
>     + COLUMN_SQ_JOB_OUT_KEY + " VARCHAR(32), "
>     + COLUMN_SQ_JOB_OUT_VALUE + " LONG VARCHAR, "
>     + COLUMN_SQ_JOB_OUT_TYPE + " VARCHAR(32), "
>     // FK to the direction table; this distinguishes output from the FROM and TO parts of the job.
>     + COLUMN_SQD_ID + " VARCHAR(32), "
>     + COLUMN_SQRS_SUBMISSION + " BIGINT, "
>     + "CONSTRAINT " + CONSTRAINT_SQRS_SQS + " "
>     + "FOREIGN KEY (" + COLUMN_SQRS_SUBMISSION + ") "
>     + "REFERENCES " + TABLE_SQ_SUBMISSION + "(" + COLUMN_SQS_ID + ") ON DELETE CASCADE "
>     + ")";
> {code}
> 2.
> At the code level, we will define MOutputType. One of the types can be BLOB,
> for a connector that decides to store its value as a BLOB.
> {code}
> class JobOutput {
>   String key;
>   Object value;
>   MOutputType type;
> }
> {code}
> 3.
> In the repository API, add a new method to get the job output for a
> particular submission ID and to allow updates to its values.
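As a rough illustration of points 2 and 3, the sketch below models JobOutput and an in-memory lookup keyed by submission ID. Everything except the JobOutput fields and the MOutputType name is an assumption made for this example: the enum members other than BLOB, the JobOutputStore class, and the method names addJobOutput, findJobOutputs and updateJobOutputValue are hypothetical, not the actual Sqoop repository API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// The ticket only mentions BLOB; STRING and LONG are assumed members for illustration.
enum MOutputType { STRING, LONG, BLOB }

class JobOutput {
    final String key;
    Object value;
    final MOutputType type;

    JobOutput(String key, Object value, MOutputType type) {
        this.key = key;
        this.value = value;
        this.type = type;
    }
}

// Hypothetical in-memory store; a real repository would read and write the job output table.
class JobOutputStore {
    private final Map<Long, List<JobOutput>> bySubmission = new HashMap<>();

    void addJobOutput(long submissionId, JobOutput output) {
        bySubmission.computeIfAbsent(submissionId, id -> new ArrayList<>()).add(output);
    }

    // Returns all outputs recorded for one submission (empty if none exist).
    List<JobOutput> findJobOutputs(long submissionId) {
        return bySubmission.getOrDefault(submissionId, new ArrayList<>());
    }

    // Updates a single value in place; returns false if the key is not present.
    boolean updateJobOutputValue(long submissionId, String key, Object newValue) {
        for (JobOutput output : findJobOutputs(submissionId)) {
            if (output.key.equals(key)) {
                output.value = newValue;
                return true;
            }
        }
        return false;
    }
}
```

Keeping one row per key is what would allow a single value to be updated without rewriting the rest of the output, which matches the reason given above for choosing approach #4.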