[
https://issues.apache.org/jira/browse/SQOOP-1409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Paul Mazak updated SQOOP-1409:
------------------------------
Attachment: SQOOP-1409_2.patch
Finally got around to looking at the tests for this. It's not an Avro-specific
bug. On further inspection, the problem is broader than --update-key x
--update-mode updateonly not working for Avro. An export with --update-key x
and --update-mode updateonly goes through *JdbcUpdateExportJob*, and one with
--update-mode allowinsert goes through *JdbcUpsertExportJob*, which extends
JdbcUpdateExportJob. Because JdbcUpdateExportJob's getMapperClass only handles
Sequence Files and Text, the Update and Upsert functionality is broken for
Avro, Parquet, and HCat with any target database (MySQL, Oracle, SQLServer,
etc). With those formats you are handed the TextExportMapper, which does not
work for them.
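To illustrate the fall-through described above, here is a minimal Java sketch.
It is a simplified, hypothetical model and not the Sqoop source: the FileType
enum, the class name, and the use of plain strings for mapper names are all
assumptions made for illustration. Only TextExportMapper is named in this
issue; SequenceFileExportMapper is an assumed name.

```java
// Simplified, hypothetical model of the dispatch described above.
// This is NOT the Sqoop source; names other than TextExportMapper are assumed.
class UpdateJobMapperSketch {
    enum FileType { SEQUENCE_FILE, TEXT, AVRO, PARQUET, HCAT }

    // Only SequenceFile is special-cased. Every other format, including
    // Avro, Parquet, and HCat, falls through to the text mapper.
    static String getMapperClass(FileType type) {
        if (type == FileType.SEQUENCE_FILE) {
            return "SequenceFileExportMapper";
        }
        return "TextExportMapper";
    }

    public static void main(String[] args) {
        // Print the mapper each format would receive under this model.
        for (FileType type : FileType.values()) {
            System.out.println(type + " -> " + getMapperClass(type));
        }
    }
}
```

In this model an Avro, Parquet, or HCat export silently receives the text
mapper, which is the behavior the patch is meant to remove.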
This second patch is the refactor to have JdbcUpdateExportJob extend
*JdbcExportJob* and rely on its logic in getMapperClass, which handles all the
formats. With the hierarchy straightened out, each subclass can rely on the
parent *JdbcExportJob*'s getMapperClass() and, to stay DRY, its configureDB()
method.
The new patch also includes unit tests in JdbcExportJobTest. They check that
you get the right mapper class for each format no matter how the JdbcExportJob
was constructed - whether as a JdbcUpdateExportJob, JdbcUpsertExportJob, or
JdbcCallExportJob.
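As a rough sketch of the refactored hierarchy and of the property the tests
check, consider the Java below. It is a hypothetical model, not the patch
itself: the short class names stand in for JdbcExportJob and its subclasses,
the FileType enum is invented, and the mapper names (returned as strings) are
assumed, with HCatExportMapper being a pure placeholder.

```java
// Hypothetical model of the refactored hierarchy; NOT the actual patch.
class ExportHierarchySketch {
    enum FileType { SEQUENCE_FILE, TEXT, AVRO, PARQUET, HCAT }

    // Stands in for JdbcExportJob: the one place where the mapper is chosen.
    static class ExportJob {
        String getMapperClass(FileType type) {
            switch (type) {
                case SEQUENCE_FILE: return "SequenceFileExportMapper";
                case AVRO:          return "AvroExportMapper";
                case PARQUET:       return "ParquetExportMapper";
                case HCAT:          return "HCatExportMapper"; // placeholder name
                default:            return "TextExportMapper";
            }
        }
    }

    // Subclasses inherit getMapperClass instead of re-implementing it.
    static class UpdateExportJob extends ExportJob { }
    static class UpsertExportJob extends UpdateExportJob { }
    static class CallExportJob extends ExportJob { }

    // True if every job type picks the same mapper as the base job
    // for every format.
    static boolean allJobsAgree() {
        ExportJob base = new ExportJob();
        ExportJob[] jobs = {
            new UpdateExportJob(), new UpsertExportJob(), new CallExportJob()
        };
        for (ExportJob job : jobs) {
            for (FileType type : FileType.values()) {
                if (!job.getMapperClass(type).equals(base.getMapperClass(type))) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("all job types agree: " + allJobsAgree());
    }
}
```

Because no subclass overrides getMapperClass in this model, adding a new
format to the base job makes it available to update, upsert, and call exports
at once.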
I've tested Avro and Text to SQLServer, but could use some help testing other
combinations. Again, this change is not Avro-specific, but makes it so all
combinations of Export are consistent.
> Update only export from Avro Data Files doesn't work
> ----------------------------------------------------
>
> Key: SQOOP-1409
> URL: https://issues.apache.org/jira/browse/SQOOP-1409
> Project: Sqoop
> Issue Type: Bug
> Components: connectors/generic
> Affects Versions: 1.4.4
> Reporter: Paul Mazak
> Attachments: SQOOP-1409.patch, SQOOP-1409_2.patch
>
>
> When the sqoop export --update-mode updateonly is used (in conjunction with
> --update-key), the only available file formats to export are SequenceFile and
> Text. Avro and HCat are not implemented in JdbcUpdateExportJob.java even
> though they are handled in JdbcExportJob.java.