Sandish Kumar HN has posted comments on this change. Change subject: kudu client tools for hadoop and spark import/export(csv,parquet,avro) ......................................................................
Patch Set 6: (24 comments) http://gerrit.cloudera.org:8080/#/c/7421/3/java/kudu-client-tools/src/main/java/org/apache/kudu/mapreduce/tools/ExportCsvMapper.java File java/kudu-client-tools/src/main/java/org/apache/kudu/mapreduce/tools/ExportCsvMapper.java: Line 69: StringBuilder buf = new StringBuilder(); > I'd still advocate removing this javadoc section, it adds nothing and it's Done http://gerrit.cloudera.org:8080/#/c/7421/3/java/kudu-client-tools/src/main/java/org/apache/kudu/mapreduce/tools/ImportParquet.java File java/kudu-client-tools/src/main/java/org/apache/kudu/mapreduce/tools/ImportParquet.java: Line 56: > I just noticed, you shouldn't use this Log that comes from Parquet. It's be Done Line 97: // Pre-flight checks of input parquet schema and table schema. > nit: missing space after //m Done Line 99: if (schema.containsField(columnSchema.getName())) { > Why do you not also check the type? Done Line 101: .equals(getTypeName(columnSchema.getType()))) { > Having System.exits in the code isn't good, ideally this case would be test Done Line 101: .equals(getTypeName(columnSchema.getType()))) { > I'd suggestion throwing an exception and catching it in run() below. Done Line 101: .equals(getTypeName(columnSchema.getType()))) { > what should I use instead of system.exit(0). just LOG.error("") enough?? Done Line 104: } > Well Kudu doesn't support Parquet's TIMESTAMP, it's not about a recommendat Done http://gerrit.cloudera.org:8080/#/c/7421/3/java/kudu-client-tools/src/test/java/org/apache/kudu/mapreduce/tools/ITExportCsv.java File java/kudu-client-tools/src/test/java/org/apache/kudu/mapreduce/tools/ITExportCsv.java: Line 17: > nit: missing a blank line. Done Line 68: > nit: end comments with a period. Done http://gerrit.cloudera.org:8080/#/c/7421/3/java/kudu-client-tools/src/test/java/org/apache/kudu/mapreduce/tools/ITImportParquet.java File java/kudu-client-tools/src/test/java/org/apache/kudu/mapreduce/tools/ITImportParquet.java: Line 17: > nit: missing blank line. Done Line 50: import org.apache.kudu.mapreduce.HadoopTestingUtility; > I'd suggest having a separate test that specifically verifies the pre-fligh Done Line 50: import org.apache.kudu.mapreduce.HadoopTestingUtility; > separate test-case for pre-flight checks? and what should be the expected r Done Line 50: import org.apache.kudu.mapreduce.HadoopTestingUtility; > The goal is to test all the major code paths, right now your patch isn't te Done Line 107: } > nit: long line Done Line 111: > nit: long line Done Line 115: > openTable isn't a cheap call, do it only once. Done Line 116: KuduTable openTable = openTable(TABLE_NAME); > Use scanTableToStrings and verify all the rows instead. Better for type con Done Line 130: + "required boolean column4_b; " > nit: long line Done Line 133: SimpleGroupFactory f = new SimpleGroupFactory(schema); > Those lines are all too long, also you could probably refactor this? Done http://gerrit.cloudera.org:8080/#/c/7421/3/java/kudu-client/src/test/java/org/apache/kudu/client/BaseKuduTest.java File java/kudu-client/src/test/java/org/apache/kudu/client/BaseKuduTest.java: PS3, Line 204: > What strings? Done http://gerrit.cloudera.org:8080/#/c/7421/3/java/kudu-spark-tools/src/main/scala/org/apache/kudu/spark/tools/ImportExportFiles.scala File java/kudu-spark-tools/src/main/scala/org/apache/kudu/spark/tools/ImportExportFiles.scala: Line 119: kc.upsertRows(df, args.tableName) > Forgot to remove? Done http://gerrit.cloudera.org:8080/#/c/7421/3/java/kudu-spark-tools/src/test/scala/org/apache/kudu/spark/tools/TestImportExportFiles.scala File java/kudu-spark-tools/src/test/scala/org/apache/kudu/spark/tools/TestImportExportFiles.scala: Line 17: > nit: add a blank line. Done Line 66: "--inferschema=true"), sc) > ? Done -- To view, visit http://gerrit.cloudera.org:8080/7421 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: If462af948651f3869b444e82151c3559fde19142 Gerrit-PatchSet: 6 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Sandish Kumar HN <[email protected]> Gerrit-Reviewer: Jean-Daniel Cryans <[email protected]> Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Sandish Kumar HN <[email protected]> Gerrit-HasComments: Yes
