-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/47108/
-----------------------------------------------------------

Review request for Sqoop.


Repository: sqoop-trunk


Description
-------

With the current implementation of ClassWriter, the generated table ORM classes 
contain a setField method which is built around long if statements (with a 
single branch for every private field). Although this concept works perfectly 
well for small and mid-sized tables (in terms of the number of columns), for 
wide ones (well over 500 columns) it causes a significant performance 
degradation, making export much slower than it should be, as seen in the JIRA 
task. Attached is a proposed solution to avoid this. According to my own 
measurements, this solution is 250x faster than the current one (tested with 
ORM classes for an 800-column table, using 20,000, 100,000, 1 million and 
5 million rows).
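
To make the problem concrete, here is a minimal sketch. It is not taken from 
the attached patch; it only assumes that the per-field if branches compare 
the field name as a string, and it shows one common way to replace such a 
chain with a constant-time lookup. All names (WideRecordSketch, 
setFieldIfChain, setFieldLookup, SETTERS) are hypothetical, and only two 
columns are shown instead of hundreds.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiConsumer;

// Hypothetical stand-in for a generated ORM class (two columns only).
class WideRecordSketch {
    private Integer col0;
    private String col1;

    // If-chain style: each call walks the branches until the name matches,
    // so the cost of one call grows with the number of columns.
    public boolean setFieldIfChain(String name, Object value) {
        if ("col0".equals(name)) {
            this.col0 = (Integer) value;
            return true;
        } else if ("col1".equals(name)) {
            this.col1 = (String) value;
            return true;
        }
        return false; // unknown field name
    }

    // Lookup style (an assumption, not necessarily what the patch does):
    // one setter per field name, registered once per class.
    private static final Map<String, BiConsumer<WideRecordSketch, Object>> SETTERS =
            new HashMap<>();
    static {
        SETTERS.put("col0", (r, v) -> r.col0 = (Integer) v);
        SETTERS.put("col1", (r, v) -> r.col1 = (String) v);
    }

    // A single hash lookup replaces the walk through the branches.
    public boolean setFieldLookup(String name, Object value) {
        BiConsumer<WideRecordSketch, Object> setter = SETTERS.get(name);
        if (setter == null) {
            return false; // unknown field name
        }
        setter.accept(this, value);
        return true;
    }

    public Integer getCol0() { return col0; }
    public String getCol1() { return col1; }
}
```

With N columns, setting every field of one row through the if-chain costs on 
the order of N*N string comparisons, while the lookup variant costs on the 
order of N hash lookups, which is why the difference only becomes visible on 
wide tables.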

Please review it and share your thoughts!


Diffs
-----

  src/java/org/apache/sqoop/orm/ClassWriter.java 23a9c41 
  src/java/org/apache/sqoop/orm/CompilationManager.java ce165e8 
  src/test/com/cloudera/sqoop/orm/TestClassWriter.java 498db73 

Diff: https://reviews.apache.org/r/47108/diff/


Testing
-------

The existing unit test case has been extended with only one test method, which 
simulates the "insertion" of 20,000 rows (calling all 800 setters 20,000 
times with random values), but I have also tested with 100,000, 1 million and 
5 million rows in my local environment. The results showed that this solution 
is at least 250x faster.

Any additional testing ideas from the community are more than welcome.


Thanks,

Attila Szabo
