Answers inline. Thanks for the feedback; these are good points, and I have tried to address each one. Let me know if anything is unclear.

Best,
*./Vee*

On Fri, Feb 6, 2015 at 4:57 PM, Gwen Shapira <[email protected]> wrote:

> Hi,
>
> Reviewed the design in the wiki (
> https://cwiki.apache.org/confluence/display/SQOOP/Sqoop+Config+as+Top+Level+Entity
> ).
> Thanks for writing such detailed plan. I think its a good idea allow
> direct editing of configs and the scope of changes look right.
>
> Few questions:
>
> 1. In requirements, you mention editing inputs per submission and referred
> to SQOOP-2025. However, SQOOP-2025 discusses storing history, and I'm not
> sure it makes sense for history to be editable (well, except for soviet
> history...). Did you mean to allow editing history? Or are you referring to
> something else?
>

When the design wiki was first written, SQOOP-2025 was still under discussion. In the future I hope we store the config per submission and do not overwrite it; at that point we should allow reading by submission ID. Editing submission history was never intended. I have corrected the wiki by splitting this into two separate bullet points:

- Read the Config Inputs by Type/SubType and by Job/Submission (since SQOOP-2025 <https://issues.apache.org/jira/browse/SQOOP-2025> we may be able to have configs by submissionId).
- Update the Config Inputs by Type/SubType for the latest/last submission in the job. We should not allow editing previous submissions; they should be read-only.

> 2. CLI commands:
> "show config foo --type JOB --subType from --id 1"
>
> I see few possible issues here:
> 1. I think users don't see config names, so they won't be able to know
> about "foo"
>

Config objects per type are lists, so being able to edit a single config in the list is easier: users don't have to fill in all the other unrelated configs in the list. Users do see the names when they list the configs per connector. Am I missing something?

> 2. We don't want to use IDs, in CLI (thats an issue across the board, so we
> may leave it here and fix somewhere else)
>

We have not yet added support for names in any command. There is a ticket for it, and that is the point at which it makes sense to support names here too. IDs are used by every other command so far, so I think using them here is consistent.

> 3. Having type and subtype seems a bit confusing. Actually, since we
> don't allow creating configs directly, users may not view them as "first
> class".
> In their perspective, they created jobs and links and now they get to view
> and edit parts of those jobs and links.
>

This is the point I am trying to fix as well. When creating a job or link, instead of having to provide the same set of config inputs every time, it is convenient to give a config name/ID. This matters especially for REST calls: filling in all the config inputs in a JSON structure (for the POST) is error-prone, verbose, and difficult when we could just give a config name or ID. So the intention of this ticket is that the user should start seeing config as a first-class citizen. I will make this statement explicit in the design wiki if it is not already. Speaking from the experience of using these APIs in the HUE app, it is unnecessarily hard.

>
> How about just adding config-type as a filter to the existing job and link
> commands?
>
> "show job --name my_job --config-type from"
> and
> "alter job --name my_job --config-type from"
>
> This seems to also match the REST API better. Although user facing
> commands don't have to match REST API.
> Perhaps others want to chime in here?
>

Answered above and explained again below.

>
> 3. In REST API, why are we using subType and not type? Is type already
> used somewhere?
>

I spent a few iterations working out what the best approach is here.
As I said above, as more connectors are added in the future we may see config objects with more inputs. If we keep extending the config/input annotations as per https://issues.apache.org/jira/browse/SQOOP-1643, it might be useful to have configs and inputs as first-class citizens, with both REST APIs and command-line support to read and edit them individually, without having them associated with a JOB/LINK.

Currently there is a top-level enum (MConfigType) with two values, LINK and JOB. This is what I refer to as type on the command line. In the REST API I used it as a resource name, v1/config/LINK?name=?&id=?&subType=, but if this is confusing we can also use v1/config?type=LINK&name=?&id=?&subType=

Second, if you look at the changes proposed to MConfigType, it stores the subTypes as part of the type. At one point I thought, why not have direction as a parameter for the type JOB? But direction is not relevant to all configurables, i.e. for the driver configs "direction" has no meaning. Similarly, for the type LINK there is no concept of direction. Hence I went with subType, where subType is a second-level hierarchy for distinguishing the types of configs that are supported in Sqoop.

Alternatives are possible, but we have to bear in mind that configs/config inputs cannot be associated with jobs and links; they are associated with connectors. It is the config input values that are associated with jobs and links.

> 4. Repository changes: Yes! I suspect we need those anyway.
>

OK, I assume the current signature is fine with you then.

>
> Gwen
>
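
To make the type/subType hierarchy discussed above a bit more concrete, here is a rough Java sketch. It is not the actual MConfigType from the design wiki: the class name ConfigTypeSketch, the subType values (link, from, to, driver), and the configUrl helper are assumptions for illustration only. Only the LINK/JOB values and the v1/config?type=...&id=...&subType=... URL shape come from this thread.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

class ConfigTypeSketch {

    // Hypothetical stand-in for the proposed MConfigType: each top-level
    // type carries the subTypes that are valid for it. The subType names
    // are assumed, not taken from the Sqoop code base.
    enum ConfigType {
        LINK(Collections.singletonList("link")),
        JOB(Arrays.asList("from", "to", "driver"));

        private final List<String> subTypes;

        ConfigType(List<String> subTypes) {
            this.subTypes = subTypes;
        }

        // True if the given subType is valid for this type (case-insensitive).
        boolean supports(String subType) {
            return subTypes.contains(subType.toLowerCase(Locale.ROOT));
        }
    }

    // Builds the query-parameter form of the URL mentioned in this thread.
    // Rejects subTypes that make no sense for the type, e.g. LINK + "from".
    static String configUrl(ConfigType type, long id, String subType) {
        if (!type.supports(subType)) {
            throw new IllegalArgumentException(
                "subType '" + subType + "' is not valid for type " + type);
        }
        return "v1/config?type=" + type + "&id=" + id
            + "&subType=" + subType.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println(configUrl(ConfigType.JOB, 1, "from"));
    }
}
```

The point of keeping the subTypes on the enum is that a direction such as "from" is only accepted where it is meaningful (JOB), while LINK and the driver config never need a direction parameter at all.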
