Answers inline. Thanks for the feedback, these are good points. I have tried
to address each of them; let me know if anything is unclear.




Best,
*./Vee*

On Fri, Feb 6, 2015 at 4:57 PM, Gwen Shapira <[email protected]> wrote:

> Hi,
>
> Reviewed the design in the wiki (
> https://cwiki.apache.org/confluence/display/SQOOP/Sqoop+Config+as+Top+Level+Entity
> ).
> Thanks for writing such a detailed plan. I think it's a good idea to allow
> direct editing of configs, and the scope of changes looks right.
>
> A few questions:
>
> 1. In requirements, you mention editing inputs per submission and referred
> to SQOOP-2025. However, SQOOP-2025 discusses storing history, and I'm not
> sure it makes sense for history to be editable (well, except for soviet
> history...). Did you mean to allow editing history? Or are you referring to
> something else?
>

When the design wiki was first written, SQOOP-2025 was still under
discussion. In the future I hope we store the config per submission rather
than overwriting it; at that point we should allow reading by submission ID.
Editing the submission history was never intended. I have corrected the wiki
by splitting this into two separate bullet points:

   - Read the config inputs by Type/SubType and by job/submission (with
   SQOOP-2025 <https://issues.apache.org/jira/browse/SQOOP-2025> we may be
   able to have configs by submissionId).
   - Update the config inputs by Type/SubType for the latest (last)
   submission in the job only. Previous submissions must not be editable;
   they are read-only.
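
To make the read/update rule in the two bullets above concrete, here is a
minimal sketch in Python. It assumes configs are stored per submission, which
is only the hoped-for future behaviour, and every name in it
(JobConfigHistory, ReadOnlySubmissionError, the input names) is hypothetical
and not part of Sqoop:

```python
class ReadOnlySubmissionError(Exception):
    """Raised when an edit targets a submission that is not the latest one."""


class JobConfigHistory:
    """Hypothetical in-memory stand-in for one job's per-submission config inputs."""

    def __init__(self):
        self._by_submission = {}  # submission id -> {input name: value}
        self._order = []          # submission ids, oldest first

    def record_submission(self, submission_id, inputs):
        # Store a copy so later edits never leak into older submissions.
        self._by_submission[submission_id] = dict(inputs)
        self._order.append(submission_id)

    def read(self, submission_id=None):
        # Reading is allowed for any submission; the default is the latest.
        sid = self._order[-1] if submission_id is None else submission_id
        return dict(self._by_submission[sid])

    def update(self, new_inputs, submission_id=None):
        # Only the latest submission may be edited; older ones are read-only.
        latest = self._order[-1]
        if submission_id is not None and submission_id != latest:
            raise ReadOnlySubmissionError(
                f"submission {submission_id} is read-only; latest is {latest}"
            )
        self._by_submission[latest].update(new_inputs)
```

With two recorded submissions, update() changes only the second one, read(1)
still returns the original inputs, and update(..., submission_id=1) is
rejected.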


> 2. CLI commands:
> "show config foo --type JOB --subType from --id 1"
>
> I see a few possible issues here:
> 1. I think users don't see config names, so they won't be able to know
> about "foo"
>
Config objects per type are lists, so being able to edit a single config is
easier: users do not have to go through filling in all the other, unrelated
configs in the list. Users do see the names when they list the configs per
connector. Am I missing something?


> 2. We don't want to use IDs in the CLI (that's an issue across the board,
> so we may leave it here and fix it somewhere else)
>
We have not yet added support for names in any command. There is a ticket
for it, and at that point it makes sense to support names here as well. Every
other command uses the id so far, so I think using it here is consistent.



> 3. Having type and subtype seems a bit confusing.  Actually, since we
> don't allow creating configs directly, users may not view them as "first
> class".
> In their perspective, they created jobs and links and now they get to view
> and edit parts of those jobs and links.
>

This is the point I am trying to fix as well. When creating a job or link,
instead of having to provide the same set of config inputs every time, it is
convenient to give a config name or id. This matters especially for REST
calls: filling in all the config inputs in a JSON structure for the POST is
verbose, error-prone and difficult, when we could just give a config name or
id. So the intention of this ticket is that the user should start seeing
config as a first-class citizen. I will make this statement explicit in the
design wiki if it is not already. Speaking from the experience of using these
APIs in the Hue app, it is unnecessarily hard today.

>
> How about just adding config-type as a filter to the existing job and link
> commands?
>
> "show job --name my_job --config-type from"
> and
> "alter job --name my_job --config-type from"
>
> This seems to also match the REST API better. Although user facing
> commands don't have to match REST API.
> Perhaps others want to chime in here?
>
Answered above, and explained again below.

>
> 3.  In REST API, why are we using subType and not type? Is type already
> used somewhere?
>

I went through a few iterations to work out the best approach here.
As I said above, as more connectors are added in the future we will see
config objects with more inputs, and if we keep extending the config/input
annotations as per https://issues.apache.org/jira/browse/SQOOP-1643, it
might be useful to have configs and inputs as first-class citizens, with both
REST API and command-line support for reading and editing them individually,
without tying them to a JOB or LINK.

Currently there is a top-level enum (MConfigType) with two values, LINK and
JOB. This is what I refer to as the type on the command line. In the REST
API I used it as a resource name, v1/config/LINK?name=?&id=?&subType=,
but if this is confusing we can also pass it as a query parameter:

v1/config?type=LINK&name=?&id=?&subType=
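
To compare the two URL shapes above side by side, here is a small Python
sketch that builds both from the same values. The helper names
(resource_style, query_style) are made up for illustration, and the sample
values (JOB, foo, 1, from) are borrowed from the CLI example quoted earlier;
only the path and parameter names come from the proposal itself:

```python
from urllib.parse import urlencode

BASE = "v1/config"  # path prefix as written in this email; host and port omitted


def resource_style(config_type, name, config_id, sub_type):
    """Type as a path segment: v1/config/<TYPE>?name=..&id=..&subType=.."""
    query = urlencode({"name": name, "id": config_id, "subType": sub_type})
    return f"{BASE}/{config_type}?{query}"


def query_style(config_type, name, config_id, sub_type):
    """Type as a query parameter: v1/config?type=<TYPE>&name=..&id=..&subType=.."""
    query = urlencode(
        {"type": config_type, "name": name, "id": config_id, "subType": sub_type}
    )
    return f"{BASE}?{query}"


print(resource_style("JOB", "foo", 1, "from"))  # v1/config/JOB?name=foo&id=1&subType=from
print(query_style("JOB", "foo", 1, "from"))     # v1/config?type=JOB&name=foo&id=1&subType=from
```

Both carry the same information; the only difference is whether the type is
treated as a resource or as a filter.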


Second, if you look at the changes proposed to MConfigType, it stores the
subTypes as part of the type.

At one point I considered having direction as a parameter for the type JOB,
but direction is not relevant to all configurables, i.e. for the driver
configs "direction" has no meaning. Similarly, for the type LINK there is no
concept of direction.

Hence I went with subType, a second-level hierarchy for distinguishing the
types of configs that are supported in Sqoop.

Alternatives are possible, but we have to bear in mind that configs and
config inputs cannot be associated with jobs and links; they are associated
with connectors.

It is rather the config input values that are associated with jobs and links.


> 4. Repository changes: Yes! I suspect we need those anyway.
>
OK, I assume the current signature is fine with you then.



>
> Gwen
>
>
>
>
