[GitHub] [hudi] nsivabalan commented on a change in pull request #2653: [WIP] [HUDI 1615] Fixing null schema in bulk_insert row writer path

GitBox Thu, 11 Mar 2021 13:49:23 -0800


nsivabalan commented on a change in pull request #2653:
URL: https://github.com/apache/hudi/pull/2653#discussion_r592746307




##########
File path: hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestCOWDataSource.scala
##########
@@ -86,6 +85,43 @@ class TestCOWDataSource extends HoodieClientTestBase {
     assertTrue(HoodieDataSourceHelpers.hasNewCommits(fs, basePath, "000"))
   }
 
+  /**
+   * Test for https://issues.apache.org/jira/browse/HUDI-1615. Null Schema in BulkInsert row writer flow.
+   * This was reported by customer when archival kicks in as the schema in commit metadata is not set for bulk_insert
+   * row writer flow.
+   * In this test, we trigger a round of bulk_inserts and set archive related configs to be minimal. So, after 5 rounds,
+   * archival should kick in and 2 commits should be archived. If schema is valid, no exception will be thrown. If not,
+   * NPE will be thrown. Hence we don't have any explicit assertions.
+   */
+  @Test
+  def testArchivalIssue(): Unit = {
+    var structType : StructType = null
+    for (i <- 1 to 5) {

Review comment:
       This test takes about 2 minutes to run on my Mac. I'm not sure how else we can
test this fix; any pointers are appreciated.
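
For reference, the "minimal archive related configs" described in the test's doc comment could be collected in one place, roughly as sketched below. This is only an illustration, not the code in the PR: the object and method names (`ArchivalTestOptions`, `minimalArchivalOpts`) are hypothetical, the numeric values are placeholders to be chosen by the test author, and the two `require` checks reflect an assumption about the ordering Hudi expects between these settings. Only the config keys themselves are Hudi's.

```scala
// Hypothetical helper (not part of the PR): gathers the write options that
// keep the active timeline short so archival triggers after a few commits.
object ArchivalTestOptions {

  // minCommits / maxCommits / cleanerRetained are placeholder parameters;
  // the values a real test should use are not stated in the review thread.
  def minimalArchivalOpts(minCommits: Int,
                          maxCommits: Int,
                          cleanerRetained: Int): Map[String, String] = {
    // Assumed ordering constraint: cleaner retention < min commits < max commits.
    require(cleanerRetained < minCommits,
      "hoodie.cleaner.commits.retained should be below hoodie.keep.min.commits")
    require(minCommits < maxCommits,
      "hoodie.keep.min.commits should be below hoodie.keep.max.commits")

    Map(
      // Use the bulk_insert operation through the row writer path under test.
      "hoodie.datasource.write.operation" -> "bulk_insert",
      "hoodie.datasource.write.row.writer.enable" -> "true",
      // Archival settings: a small window so archival kicks in early.
      "hoodie.keep.min.commits" -> minCommits.toString,
      "hoodie.keep.max.commits" -> maxCommits.toString,
      "hoodie.cleaner.commits.retained" -> cleanerRetained.toString
    )
  }
}
```

The resulting map could be passed to the writer with `.options(...)` alongside the test's existing common options. Whether smaller values would actually shorten the run time is untested here.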




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]
