Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/15176 )
Change subject: KUDU-3049: [spark] Automatic handling of schema drift ...................................................................... Patch Set 5: Code-Review+1 (2 comments) Looks this good to me, I just noticed that a few cases are not covered by the tests in Scala sqlContext. Not sure whether it's critical, but anyways wanted to point at that. http://gerrit.cloudera.org:8080/#/c/15176/5/java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/DefaultSourceTest.scala File java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/DefaultSourceTest.scala: http://gerrit.cloudera.org:8080/#/c/15176/5/java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/DefaultSourceTest.scala@572 PS5, Line 572: assertEquals(3, afterDf.schema.fields.length) paranoid nit: I didn't notice it in the first review round, but there isn't a precondition on the number of the columns in the table before attempting to insert the row with the new column. Of course, it's highly unlikely that the test table's structure ever changes to add 'val2' column, but to make things consistent in such unlikely case, maybe consider adding assertEquals(2, df.schema.fields.length) before line 569? http://gerrit.cloudera.org:8080/#/c/15176/5/java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/DefaultSourceTest.scala@576 PS5, Line 576: Do you think it's worth adding a couple more scenarios: * try to have the scenario similar to insert a row with not-yet-existing column where handleSchemaDrift is not set (i.e. testing the backward compatibility of the schema drift change)? * a scenario when one actor adds a new column with one type, while another actor (i.e. different sqlContext) tries to insert a row with new column of the same name, but of different type right after that? -- To view, visit http://gerrit.cloudera.org:8080/15176 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib1edebb293d6ae79c26a0ecb9ce7755308f667f4 Gerrit-Change-Number: 15176 Gerrit-PatchSet: 5 Gerrit-Owner: Grant Henke <granthe...@apache.org> Gerrit-Reviewer: Adar Dembo <a...@cloudera.com> Gerrit-Reviewer: Alexey Serbin <aser...@cloudera.com> Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com> Gerrit-Reviewer: Bankim Bhavsar <ban...@cloudera.com> Gerrit-Reviewer: Grant Henke <granthe...@apache.org> Gerrit-Reviewer: Hao Hao <hao....@cloudera.com> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Comment-Date: Tue, 11 Feb 2020 21:04:30 +0000 Gerrit-HasComments: Yes