[
https://issues.apache.org/jira/browse/SQOOP-1168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14245944#comment-14245944
]
Veena Basavaraj commented on SQOOP-1168:
----------------------------------------
Yes, [~vinothchandar], "merge" seems like the better word to convey both use
cases, i.e. add and modify in the case of writing.
So DFM (Delta Fetch Merge) it is!
I am convinced at this point that we should not distinguish between
last_modified and primary_key/append the way sqoop1 does, since such semantics
may not be consistent across every data source Sqoop 2 supports.
BTW, I have been thinking hard about this and would appreciate your thoughts.
Connectors point of view
Do we really need connectors to distinguish between the two use cases? At this
point I am convinced that a connector needs to explicitly declare the
directions in which it supports DFM, i.e. a connector can support DF and not
DM, or vice versa. So the initializer seems like the place to declare this.
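A minimal sketch of what such a declaration might look like, just to make the idea concrete. Every name here (`Direction`, `DeltaAware`, `FetchOnlyInitializer`) is hypothetical and made up for illustration; none of it is an existing Sqoop 2 API.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical names throughout; none of these are existing Sqoop 2 APIs.
enum Direction { FROM, TO }

// A connector's initializer could implement this to declare delta support.
interface DeltaAware {
    // Directions in which the connector supports delta transfers:
    // FROM stands for delta fetch (DF), TO stands for delta merge (DM).
    Set<Direction> supportedDeltaDirections();
}

// Example: a connector that can fetch deltas but cannot merge them.
class FetchOnlyInitializer implements DeltaAware {
    @Override
    public Set<Direction> supportedDeltaDirections() {
        return EnumSet.of(Direction.FROM);
    }
}

class DeltaSupportDemo {
    public static void main(String[] args) {
        DeltaAware init = new FetchOnlyInitializer();
        Set<Direction> supported = init.supportedDeltaDirections();
        System.out.println("DF supported: " + supported.contains(Direction.FROM));
        System.out.println("DM supported: " + supported.contains(Direction.TO));
    }
}
```

Returning a set of directions, rather than two separate boolean flags, would let the framework treat "supports DF only", "supports DM only", and "supports both" uniformly.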
Connectors can also expose a bunch of predicate objects or, even better, have a
delta-fetch config and a delta-merge config. I am debating whether the word
"predicate" just adds more confusion and whether I should stick to "config",
since that term is familiar and a predicate is in fact a bunch of key/value
pairs anyway. I am leaning towards configs. With these configs, the connector
can decide how granular it wants to be, and it even gives itself the ability to
distinguish between the so-called "incremental/sequential" fetch and
"delta/random" reads.
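To illustrate the "config is just key/value pairs" point, here is a rough sketch. The `DeltaConfig` class and the key names (`checkColumn`, `lastValue`, `mergeKey`) are assumptions invented for this example; they are not taken from the design wiki or from the Sqoop 2 code.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical config holder; the class and all key names are made up.
class DeltaConfig {
    private final Map<String, String> values = new LinkedHashMap<>();

    // Stores one key/value pair and returns this object for chaining.
    DeltaConfig set(String key, String value) {
        values.put(key, value);
        return this;
    }

    String get(String key) {
        return values.get(key);
    }

    // Read-only view, so callers cannot change the config behind our back.
    Map<String, String> asMap() {
        return Collections.unmodifiableMap(values);
    }
}

class DeltaConfigDemo {
    public static void main(String[] args) {
        // Delta-fetch config: which records to read from the source.
        DeltaConfig fetch = new DeltaConfig()
            .set("checkColumn", "updated_at")
            .set("lastValue", "2014-12-01 00:00:00");

        // Delta-merge config: how to apply records at the target.
        DeltaConfig merge = new DeltaConfig()
            .set("mergeKey", "id");

        System.out.println("delta-fetch config: " + fetch.asMap());
        System.out.println("delta-merge config: " + merge.asMap());
    }
}
```

Since each connector picks its own keys, it could express either a sequential "everything after the last value" fetch or a more selective read without the framework needing to know the difference.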
Usability point of view
1. Does a user really have to know the difference between a full transfer and
a subset transfer of records? To be more precise, should we keep the semantics
of job creation as
A) {code} start job -j 1 {code} and then ask whether it is a delta fetch or
delta merge, and if they say yes, ask for the inputs related to the delta fetch
and delta merge?
B) {code} start delta-fetch-merge-job -j 1 {code}, so we don't have to ask
another question about whether it is a delta or not.
An interesting case: what if the user chooses a connector that does not
support DF or DM? Should we then bail out with an error immediately?
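A rough sketch of the fail-fast option, assuming the framework knows which delta modes a connector supports before the job starts. `DeltaMode`, `UnsupportedDeltaException`, and `DeltaJobValidator` are hypothetical names for illustration only, not existing Sqoop 2 classes.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical types for illustration; not part of the Sqoop 2 code base.
enum DeltaMode { DELTA_FETCH, DELTA_MERGE }

class UnsupportedDeltaException extends RuntimeException {
    UnsupportedDeltaException(String message) {
        super(message);
    }
}

class DeltaJobValidator {
    // Throws before any data moves if a requested mode is not supported.
    static void validate(String connectorName,
                         Set<DeltaMode> supported,
                         Set<DeltaMode> requested) {
        for (DeltaMode mode : requested) {
            if (!supported.contains(mode)) {
                throw new UnsupportedDeltaException(
                    "Connector " + connectorName + " does not support " + mode);
            }
        }
    }
}

class DeltaJobValidatorDemo {
    public static void main(String[] args) {
        // This example connector supports delta fetch only.
        Set<DeltaMode> supported = EnumSet.of(DeltaMode.DELTA_FETCH);

        // A supported request passes silently.
        DeltaJobValidator.validate("example-connector", supported,
            EnumSet.of(DeltaMode.DELTA_FETCH));

        // An unsupported request is rejected up front.
        try {
            DeltaJobValidator.validate("example-connector", supported,
                EnumSet.of(DeltaMode.DELTA_MERGE));
        } catch (UnsupportedDeltaException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

The same check would work for both option A and option B above; only the point at which the user learns about the error differs.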
Did I make sense? Let me know; I appreciate your time.
> Sqoop2: Incremental and Delta updates ( formerly called Incremental Import )
> ----------------------------------------------------------------------------
>
> Key: SQOOP-1168
> URL: https://issues.apache.org/jira/browse/SQOOP-1168
> Project: Sqoop
> Issue Type: Bug
> Reporter: Hari Shreedharan
> Assignee: Veena Basavaraj
> Fix For: 1.99.5
>
>
> The formal design wiki is here
> https://cwiki.apache.org/confluence/display/SQOOP/Incremental+and+Delta+Update+Design