Grant Henke has posted comments on this change. ( http://gerrit.cloudera.org:8080/15176 )
Change subject: KUDU-3049: [spark] Automatic handling of schema drift
......................................................................

Patch Set 5:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/15176/5/java/kudu-client/src/test/java/org/apache/kudu/client/TestAlterTable.java
File java/kudu-client/src/test/java/org/apache/kudu/client/TestAlterTable.java:

http://gerrit.cloudera.org:8080/#/c/15176/5/java/kudu-client/src/test/java/org/apache/kudu/client/TestAlterTable.java@162
PS5, Line 162: NonRecoverableException thrown = Assert.assertThrows(NonRecoverableException.class, new ThrowingRunnable() {
> Too long here and L554.
Done

http://gerrit.cloudera.org:8080/#/c/15176/5/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduContext.scala
File java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduContext.scala:

http://gerrit.cloudera.org:8080/#/c/15176/5/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduContext.scala@357
PS5, Line 357: log.info(s"column already exists in table '$tableName' while handling schema drift")
> Would be nice to add a test to cover this. Might be something we can't guar
I have a hard time invoking this within a test scenario. It's also difficult to validate it occurred other than parsing logs. It's especially hard because we call syncClient.openTable right before we alter the table. If it helps I added the call to syncClient.alterTable twice here and validated the correct logging behavior.

http://gerrit.cloudera.org:8080/#/c/15176/5/java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/DefaultSourceTest.scala
File java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/DefaultSourceTest.scala:

http://gerrit.cloudera.org:8080/#/c/15176/5/java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/DefaultSourceTest.scala@572
PS5, Line 572: assertEquals(3, afterDf.schema.fields.length)
> paranoid nit: I didn't notice it in the first review round, but there isn't
Done

http://gerrit.cloudera.org:8080/#/c/15176/5/java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/DefaultSourceTest.scala@576
PS5, Line 576:
> Do you think it's worth adding a couple more scenarios:
I added a step to the test to validate the `handleSchemaDrift = false` handling. For the wrong type handling, that is a behavior/test in its own right regardless of drift. I added a test case for that.

--
To view, visit http://gerrit.cloudera.org:8080/15176
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib1edebb293d6ae79c26a0ecb9ce7755308f667f4
Gerrit-Change-Number: 15176
Gerrit-PatchSet: 5
Gerrit-Owner: Grant Henke <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: Bankim Bhavsar <[email protected]>
Gerrit-Reviewer: Grant Henke <[email protected]>
Gerrit-Reviewer: Hao Hao <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Tue, 11 Feb 2020 22:45:11 +0000
Gerrit-HasComments: Yes
