-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25284/
-----------------------------------------------------------

Review request for Sqoop.


Bugs: SQOOP-1395
    https://issues.apache.org/jira/browse/SQOOP-1395


Repository: sqoop-trunk


Description
-------

If you import a table "users", Sqoop generates an entity class in 
"users.java". The class is compiled, submitted, and used by a MapReduce 
job. If the target file format is Avro or Parquet, an Avro schema is 
generated as well. In Avro terms, the entity class is described as a 
"record", and the name of that record is "users".

For the Parquet file format, we use the Kite SDK to manage Parquet file 
reading and writing with minimal effort. Kite requires an Avro schema and 
requires all data records to be packed into GenericRecord instances. This is 
where the problem arises. Kite reads the schema first and tries to instantiate 
a record based on its name. In this case, Kite tries to instantiate a "users" 
class. Unfortunately, a class with that name already exists: the entity class 
generated from "users.java". This causes the MapReduce job to fail.
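
As a rough illustration of the failure mode, here is a minimal sketch that 
assumes the record is resolved by a plain by-name class load. The exact lookup 
inside Kite is not shown in this review, so RecordNameCollision, 
resolveByName and the nested "users" stand-in class are all hypothetical names 
used only for this example.

```java
import java.util.Optional;

public class RecordNameCollision {

    /** Stand-in for the entity class Sqoop generates for the table "users". */
    public static class users { }

    /**
     * Hypothetical name-based lookup (an assumption, not Kite's actual code):
     * returns the class whose name matches the Avro record name, if any.
     */
    public static Optional<Class<?>> resolveByName(String recordName) {
        try {
            // Nested classes have the binary name Outer$Inner.
            String binaryName = RecordNameCollision.class.getName() + "$" + recordName;
            return Optional.of(Class.forName(binaryName));
        } catch (ClassNotFoundException e) {
            // No class with this name: a generic record could be used instead.
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // The record name equals the entity class name, so the lookup finds it.
        System.out.println(resolveByName("users").isPresent());
        // A record name that matches no class does not collide.
        System.out.println(resolveByName("sqoop_import_users").isPresent());
    }
}
```

With a record name that matches no generated class, the by-name lookup finds 
nothing, which is the situation the patch aims for.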

To solve this problem, I intend to give the entity class and the Avro 
record different names.

The patch will:

    Change the record name in the Avro schema.
    Remove SqoopAvroRecord, as it is no longer required (ClassWriter.java 
    is reverted to its previous state).
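
The idea of keeping the two names apart can be sketched as follows. This is 
not the code from the diff: AvroRecordNaming, the RECORD_PREFIX value and the 
single "id" field are assumptions made for illustration, and the schema is 
built as a plain JSON string so that the example needs no Avro dependency.

```java
public class AvroRecordNaming {

    // Assumed prefix, for illustration only; the patch may pick another name.
    static final String RECORD_PREFIX = "sqoop_import_";

    /** The entity class keeps the table name, e.g. "users" -> users.java. */
    public static String entityClassName(String table) {
        return table;
    }

    /** The Avro record gets a different name, so it matches no entity class. */
    public static String avroRecordName(String table) {
        return RECORD_PREFIX + table;
    }

    /** Builds a minimal Avro record schema (JSON) with one illustrative field. */
    public static String schemaJson(String table) {
        return "{\"type\":\"record\",\"name\":\"" + avroRecordName(table) + "\","
             + "\"fields\":[{\"name\":\"id\",\"type\":[\"null\",\"int\"]}]}";
    }

    public static void main(String[] args) {
        System.out.println(entityClassName("users"));
        System.out.println(schemaJson("users"));
    }
}
```

Because the record name no longer equals the entity class name, a lookup by 
record name has no generated class to pick up.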


Diffs
-----

  src/java/org/apache/sqoop/avro/AvroUtil.java 4b37d58 
  src/java/org/apache/sqoop/lib/SqoopAvroRecord.java 80875d2 
  src/java/org/apache/sqoop/mapreduce/AvroImportMapper.java 6adad79 
  src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java 905ba8c 
  src/java/org/apache/sqoop/mapreduce/ParquetImportMapper.java effbadd 
  src/java/org/apache/sqoop/mapreduce/ParquetJob.java a74432a 
  src/java/org/apache/sqoop/orm/AvroSchemaGenerator.java 806bace 
  src/java/org/apache/sqoop/orm/ClassWriter.java 4f9dedd 
  src/java/org/apache/sqoop/orm/TableClassName.java 88ab622 

Diff: https://reviews.apache.org/r/25284/diff/


Testing
-------

Manually verified the unit tests for the Avro and Parquet file formats.
Manually tested local Parquet and Avro import and export.


Thanks,

Qian Xu
