[ 
https://issues.apache.org/jira/browse/SQOOP-1378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gwen Shapira updated SQOOP-1378:
--------------------------------

    Attachment: SQOOP-1378.0.patch

This patch is untested. It is just meant to give everyone an idea of what my 
refactored solution looks like.

It's fairly extensible (you can add many methods of resolving schemas), but 
currently I'm only adding "by location" (for writing to CSVs) and "by name" 
(for translating between DB tables).
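
To make the two resolution methods concrete, here is a minimal sketch of the 
idea. It is not the code in the attached patch: the class name 
SchemaMatchSketch, the Matcher interface, and the BY_LOCATION / BY_NAME 
constants are all hypothetical, and schemas are reduced to plain lists of 
column names for illustration.

```java
import java.util.Arrays;
import java.util.List;

public class SchemaMatchSketch {

    /**
     * Hypothetical strategy interface (not a Sqoop API): reorders a row laid
     * out according to the "from" columns into the "to" column layout.
     */
    interface Matcher {
        Object[] match(List<String> fromCols, List<String> toCols, Object[] row);
    }

    /** "By location": the i-th source value goes to the i-th target column. */
    static final Matcher BY_LOCATION = (fromCols, toCols, row) -> {
        Object[] out = new Object[toCols.size()];
        for (int i = 0; i < out.length; i++) {
            // Target columns with no source value at that position stay null.
            out[i] = i < row.length ? row[i] : null;
        }
        return out;
    };

    /** "By name": each target column takes the value of the same-named source column. */
    static final Matcher BY_NAME = (fromCols, toCols, row) -> {
        Object[] out = new Object[toCols.size()];
        for (int i = 0; i < out.length; i++) {
            int src = fromCols.indexOf(toCols.get(i));
            // Target columns with no same-named source column stay null.
            out[i] = (src >= 0 && src < row.length) ? row[src] : null;
        }
        return out;
    };

    public static void main(String[] args) {
        List<String> from = Arrays.asList("id", "name");
        List<String> to = Arrays.asList("name", "id");
        Object[] row = {1, "alice"};

        System.out.println(Arrays.toString(BY_LOCATION.match(from, to, row))); // [1, alice]
        System.out.println(Arrays.toString(BY_NAME.match(from, to, row)));     // [alice, 1]
    }
}
```

Adding another way of resolving schemas would then mean adding another 
Matcher implementation, which is the kind of extensibility described above.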

It's also pretty tightly coupled with our use of CSV as the intermediate data 
format. We can change that later, but I don't see it as a priority.

Oh, and I removed a bunch of unused "getSchema" APIs. We no longer have a 
single schema, and since those APIs were unused, I couldn't figure out which 
schema they referred to.

> Sqoop2: From/To: Refactor schema
> --------------------------------
>
>                 Key: SQOOP-1378
>                 URL: https://issues.apache.org/jira/browse/SQOOP-1378
>             Project: Sqoop
>          Issue Type: Sub-task
>            Reporter: Abraham Elmahrek
>            Assignee: Gwen Shapira
>         Attachments: SQOOP-1378.0.patch
>
>
> Relational database systems, hierarchical databases, etc. tend to have a 
> well-defined schema. Key-value DBs, BigTable clones, etc. tend to have weakly 
> defined schemas. In fact, a key-value datastore may not have any kind of 
> schema (other than the fact that it is key-value).
> Schemas seem like they are local to the connector and should not be needed by 
> the framework. Or, there should be a common Schema format that every 
> connector knows how to decipher.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
