[
https://issues.apache.org/jira/browse/SQOOP-1168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14277389#comment-14277389
]
Veena Basavaraj commented on SQOOP-1168:
----------------------------------------
For Question 3:
I have explained this before, but I will restate the same information, this
time with an example.
In the case of a JDBC connector (one of our favorite connectors), it is up to
the connector to decide how it asks which rows to read. Here are two ways, as I
stated in my wiki.
{code}
// In FromJobConfiguration

// Option 1: column, operator, and value as separate inputs
@ConfigClass(validators = {
    @Validator(DeltaFetchConfig1.DeltaFetchConfig1Validator.class)})
public class DeltaFetchConfig1 {
  @Input(size = 255) public String column;
  // validate supported operators, can provide a default value, etc.
  @Input public String operator;
  @Input public String value;
}

// or

// Option 2: a single free-form query
@ConfigClass(validators = {
    @Validator(DeltaFetchConfig2.DeltaFetchConfig2Validator.class)})
public class DeltaFetchConfig2 {
  @Input(size = 255) public String deltaFetchQuery;
}
{code}
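To make the first option concrete, here is a minimal sketch of how a connector might turn the column and operator inputs into a WHERE-clause predicate. DeltaPredicateBuilder, its method, and the list of supported operators are hypothetical and not part of the Sqoop API; the value input is assumed to be bound later as a JDBC parameter rather than concatenated into the SQL.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper for illustration only; not part of the Sqoop API.
final class DeltaPredicateBuilder {
    // Assumed set of supported operators; a real connector would define its own.
    private static final Set<String> SUPPORTED_OPERATORS =
        new HashSet<>(Arrays.asList(">", ">=", "<", "<=", "="));

    private DeltaPredicateBuilder() {}

    // Builds a predicate such as "id > ?". The value is left as a placeholder
    // so that it can be bound with PreparedStatement.setString(...) later.
    static String buildPredicate(String column, String operator) {
        if (column == null || !column.matches("[A-Za-z_][A-Za-z0-9_]*")) {
            throw new IllegalArgumentException("Invalid column name: " + column);
        }
        if (!SUPPORTED_OPERATORS.contains(operator)) {
            throw new IllegalArgumentException("Unsupported operator: " + operator);
        }
        return column + " " + operator + " ?";
    }
}
```

With the second option (deltaFetchQuery), the connector would skip this step and use the query supplied by the user directly.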
While processing the records, the connector wants to store state. It can call
that state last_value, row_key_last_processed, current_end_point, or whatever
is relevant to it. Note that the type of the input was a query, while the type
of the state is a long value or maybe a timestamp; it can be anything.
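As a rough sketch of what "storing state" could look like, the following tracks the highest column value seen during a run and exposes it as key/value pairs. DeltaStateTracker and the rows_processed key are hypothetical names for illustration; how the state is actually persisted is up to the framework and is not shown here.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of the Sqoop API.
final class DeltaStateTracker {
    private long lastValue = Long.MIN_VALUE;
    private long rowsSeen = 0;

    // Called once per record with the value of the delta column.
    void observe(long columnValue) {
        rowsSeen++;
        if (columnValue > lastValue) {
            lastValue = columnValue;
        }
    }

    // State the connector would hand back at the end of the run.
    // One config input can yield several state entries.
    Map<String, String> toState() {
        Map<String, String> state = new LinkedHashMap<>();
        if (rowsSeen > 0) {
            state.put("last_value", Long.toString(lastValue));
        }
        state.put("rows_processed", Long.toString(rowsSeen));
        return state;
    }
}
```

The state values here are longs serialized as strings, while the input that produced them may have been a query; the two do not need to share a type.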
So, the next question: what happens when the user wants to do the next run?
They will issue the command start job --j 1; at this point there is no room
for them to reset the config or the state.
But there are cases where we want to provide this ability, so we provide show
config --jid 1 --type from or show state --jid 1 --type from. Then we can also
have update config --jid --type from and display all the configs that they can
edit.
Independently, whether an input is editable or not can be expressed with an
annotation; that is an enhancement we can do as part of the SQOOP-1648 ticket
for sure. The same goes for state.
Next question: do we want to correlate an input and state? I do not think so,
because the relationship is one-to-many: a config input in a connector might
result in multiple state values. But if we want to provide a means to associate
them, we can do it; in the above case, the state key/name prev_value would be
associated with the input query.
So with this, we have not tampered with user inputs; we let users edit them if
they want to.
Lastly, the most important question: how do connectors decide whether to use
the input or the state? I would say it is obvious that connectors have to give
priority to the latest input value (if the override/update flag was yes) over
the previous run's value. But a connector may choose not to do so; it is the
custom logic of the connector at that point. In the submission history, though,
the connector should be able to tell the user what value was used for the query
(another piece of information it can write back so the user can see what was
done). It's that flexible.
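The precedence described here could be sketched as follows. DeltaStartResolver and its parameters are hypothetical, not part of the Sqoop API, and a connector is free to implement a different rule; this only illustrates "the latest input wins when the override flag is set, otherwise the stored state is used".

```java
import java.util.Optional;

// Hypothetical helper for illustration only; not part of the Sqoop API.
final class DeltaStartResolver {
    private DeltaStartResolver() {}

    // configValue:  the value currently in the job config (user input)
    // overrideFlag: true if the user updated the input since the last run
    // storedState:  the value the connector saved at the end of the previous run
    static String resolve(String configValue, boolean overrideFlag,
                          Optional<String> storedState) {
        // Use the input on an explicit override, or on the first run
        // when no state has been stored yet.
        if (overrideFlag || !storedState.isPresent()) {
            return configValue;
        }
        return storedState.get();
    }
}
```

Whatever value this returns is also what the connector would write back to the submission history, so the user can see which value the run actually used.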
https://issues.apache.org/jira/browse/SQOOP-1804 has more details if we want
to have the state variable explicitly exposed by the connector in the config
class as well.
> Sqoop2: Delta Fetch/ Merge ( formerly called Incremental Import )
> -----------------------------------------------------------------
>
> Key: SQOOP-1168
> URL: https://issues.apache.org/jira/browse/SQOOP-1168
> Project: Sqoop
> Issue Type: Bug
> Reporter: Hari Shreedharan
> Assignee: Veena Basavaraj
> Fix For: 1.99.6
>
>
> The formal design wiki is here
> https://cwiki.apache.org/confluence/display/SQOOP/Delta+Fetch+and+Merge+Design