-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/47108/
-----------------------------------------------------------
Review request for Sqoop.


Repository: sqoop-trunk


Description
-------

With the current implementation of ClassWriter, the generated table ORM classes contain a setField method built around a long chain of if statements (one branch for every private field). Although this approach works well for small and mid-sized tables (in terms of column count), for wide ones (well over 500 columns) it causes significant performance degradation, making export much slower than it should be, as seen in the JIRA task.

Attached is a proposed solution that avoids this. According to my own measurements, it is 250x faster than the current implementation (tested with 800-field-wide table ORMs and 20,000, 100,000, 1M, and 5M rows).

Please review it and share your thoughts!


Diffs
-----

  src/java/org/apache/sqoop/orm/ClassWriter.java 23a9c41
  src/java/org/apache/sqoop/orm/CompilationManager.java ce165e8
  src/test/com/cloudera/sqoop/orm/TestClassWriter.java 498db73

Diff: https://reviews.apache.org/r/47108/diff/


Testing
-------

The existing unit test case has been extended with a single test method that simulates the "insertion" of 20,000 rows (calling all 800 setters 20,000 times with random values). I have also tested with 100,000, 1M, and 5M rows in my local environment. The results showed that this solution is at least 250x faster.

Any additional ideas for testing are more than welcome from the community.


Thanks,

Attila Szabo
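
For illustration, the cost difference described above can be sketched as follows. This is not the code from the attached diff: the class WideRecord, its two fields, and both method names are hypothetical, and Java 8 lambdas are used only for brevity. The sketch contrasts an if-chain setField, whose cost grows with the number of columns, with one possible alternative, a name-to-setter lookup table built once per class.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiConsumer;

/** Hypothetical stand-in for a generated ORM record (only two columns shown). */
public class WideRecord {
    private Integer id;
    private String name;

    // If-chain style: one string comparison per column, so setting the
    // last of N columns costs up to N comparisons on every call.
    public void setFieldChained(String field, Object value) {
        if ("id".equals(field)) {
            this.id = (Integer) value;
        } else if ("name".equals(field)) {
            this.name = (String) value;
        } else {
            throw new RuntimeException("No such field: " + field);
        }
    }

    // Lookup-table style (an assumption, not necessarily what the patch does):
    // the table is built once, then each call is a single hash lookup.
    private static final Map<String, BiConsumer<WideRecord, Object>> SETTERS =
            new HashMap<>();
    static {
        SETTERS.put("id", (r, v) -> r.id = (Integer) v);
        SETTERS.put("name", (r, v) -> r.name = (String) v);
    }

    public void setFieldMapped(String field, Object value) {
        BiConsumer<WideRecord, Object> setter = SETTERS.get(field);
        if (setter == null) {
            throw new RuntimeException("No such field: " + field);
        }
        setter.accept(this, value);
    }

    public Integer getId() { return id; }
    public String getName() { return name; }
}
```

With 800 columns, the chained version performs on the order of 800 string comparisons per call in the worst case, while the table version performs one lookup regardless of table width.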
