[GitHub] [hudi] xiarixiaoyao commented on pull request #2722: [HUDI-1722]hive beeline/spark-sql query specified field on mor table occur NPE
xiarixiaoyao commented on pull request #2722: URL: https://github.com/apache/hudi/pull/2722#issuecomment-838529283 @vinothchandar, I have rebased this PR; please check it. Thanks. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
xiarixiaoyao commented on pull request #2722: URL: https://github.com/apache/hudi/pull/2722#issuecomment-837732845 @vinothchandar i will rebase this pr, thanks
xiarixiaoyao commented on pull request #2722: URL: https://github.com/apache/hudi/pull/2722#issuecomment-823714627 @lw309637554 @nsivabalan thanks for your review. I will try to fix testHoodieRealtimeCombineHoodieInputFormat in another PR, since it has nothing to do with this problem.
xiarixiaoyao commented on pull request #2722: URL: https://github.com/apache/hudi/pull/2722#issuecomment-822129270 @lw309637554 thanks for your review. I left comments answering your questions. Another question: TestHoodieCombineHiveInputFormat.testHoodieRealtimeCombineHoodieInputFormat is disabled by default. I have checked that test function and found that it has some problems. Could I fix those problems and enable TestHoodieCombineHiveInputFormat.testHoodieRealtimeCombineHoodieInputFormat?
xiarixiaoyao commented on pull request #2722: URL: https://github.com/apache/hudi/pull/2722#issuecomment-817071591 @garyli1019 a unit test has been added, pls review again, thanks
xiarixiaoyao commented on pull request #2722: URL: https://github.com/apache/hudi/pull/2722#issuecomment-811705289 thanks @garyli1019 . ok, i will try to add unit test
xiarixiaoyao commented on pull request #2722: URL: https://github.com/apache/hudi/pull/2722#issuecomment-806722454 @garyli1019 could you pls help me review this pr, thanks
xiarixiaoyao commented on pull request #2722: URL: https://github.com/apache/hudi/pull/2722#issuecomment-806721657

Test steps:

Before patch:

Step 1:

    val df = spark.range(0, 10).toDF("keyid")
      .withColumn("col3", expr("keyid"))
      .withColumn("p", lit(0))
      .withColumn("p1", lit(0))
      .withColumn("p2", lit(7))
      .withColumn("a1", lit(Array[String]("sb1", "rz")))
      .withColumn("a2", lit(Array[String]("sb1", "rz")))
    // create hoodie table hive_14b
    merge(df, 4, "default", "hive_14b", DataSourceWriteOptions.MOR_TABLE_TYPE_OPT_VAL, op = "bulk_insert")

Note: bulk_insert will produce 4 files in the hoodie table.

Step 2:

    val df = spark.range(9, 12).toDF("keyid")
      .withColumn("col3", expr("keyid"))
      .withColumn("p", lit(0))
      .withColumn("p1", lit(0))
      .withColumn("p2", lit(7))
      .withColumn("a1", lit(Array[String]("sb1", "rz")))
      .withColumn("a2", lit(Array[String]("sb1", "rz")))
    // upsert table
    merge(df, 4, "default", "hive_14b", DataSourceWriteOptions.MOR_TABLE_TYPE_OPT_VAL, op = "upsert")

Now we have four base files and one log file in the hoodie table.

Step 3: in spark-sql/beeline:

    select count(col3) from hive_14b_rt;

The query then fails:

    2021-03-25 20:23:14,014 | INFO | AsyncDispatcher event handler | Diagnostics report from attempt_1615883368881_0038_m_00_0: Error: java.lang.NullPointerException
        at org.apache.hudi.hadoop.realtime.RealtimeCompactedRecordReader.next(RealtimeCompactedRecordReader.java:101)
        at org.apache.hudi.hadoop.realtime.RealtimeCompactedRecordReader.next(RealtimeCompactedRecordReader.java:43)
        at org.apache.hudi.hadoop.realtime.HoodieRealtimeRecordReader.next(HoodieRealtimeRecordReader.java:79)
        at org.apache.hudi.hadoop.realtime.HoodieRealtimeRecordReader.next(HoodieRealtimeRecordReader.java:36)
        at org.apache.hudi.hadoop.realtime.RealtimeCompactedRecordReader.next(RealtimeCompactedRecordReader.java:92)
        at org.apache.hudi.hadoop.realtime.RealtimeCompactedRecordReader.next(RealtimeCompactedRecordReader.java:43)
        at org.apache.hudi.hadoop.realtime.HoodieRealtimeRecordReader.next(HoodieRealtimeRecordReader.java:79)
        at org.apache.hudi.hadoop.realtime.HoodieCombineRealtimeRecordReader.next(HoodieCombineRealtimeRecordReader.java:68)
        at org.apache.hudi.hadoop.realtime.HoodieCombineRealtimeRecordReader.next(HoodieCombineRealtimeRecordReader.java:77)
        at org.apache.hudi.hadoop.realtime.HoodieCombineRealtimeRecordReader.next(HoodieCombineRealtimeRecordReader.java:42)
        at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:205)
        at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:191)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52)
        at org.apache.hadoop.hive.ql.exec.mr.ExecMapRunner.run(ExecMapRunner.java:37)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:465)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:349)
        at org.apache.hadoop.mapred.YarnChild$1.run(YarnChild.java:183)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1761)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:177)

After patch, in spark-sql/hive-beeline:

    select count(col3) from hive_14b_rt;
    +------+
    | _c0  |
    +------+
    | 12   |
    +------+
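The reproduction above calls a `merge` helper whose body is not shown in this thread. The following is only a sketch of what such a helper might look like, so that the steps can be tried in a spark-shell with the Hudi bundle on the classpath. Everything beyond the call signature is an assumption: the meaning of the second argument (taken here as shuffle parallelism), the record key, precombine and partition columns, the base path, and the Hive sync settings, which will differ per environment. The option keys are the string forms of Hudi's datasource write configs.

```scala
import org.apache.spark.sql.{DataFrame, SaveMode}

// Hypothetical stand-in for the `merge` helper used in the test steps.
// Column choices and basePath are guesses, not taken from the PR.
def merge(df: DataFrame,
          parallelism: Int,                   // assumed: shuffle parallelism
          db: String,
          tableName: String,
          tableType: String,                  // e.g. "MERGE_ON_READ"
          op: String = "upsert",
          recordKey: String = "keyid",        // assumption
          precombine: String = "col3",        // assumption
          partitionCols: String = "p,p1,p2",  // assumption
          basePath: String = "/tmp/hudi"      // assumption
         ): Unit = {
  df.write.format("hudi")
    .option("hoodie.table.name", tableName)
    .option("hoodie.datasource.write.table.type", tableType)
    .option("hoodie.datasource.write.operation", op)
    .option("hoodie.datasource.write.recordkey.field", recordKey)
    .option("hoodie.datasource.write.precombine.field", precombine)
    .option("hoodie.datasource.write.partitionpath.field", partitionCols)
    // Several partition columns need a key generator that handles them.
    .option("hoodie.datasource.write.keygenerator.class",
      "org.apache.hudi.keygen.ComplexKeyGenerator")
    .option("hoodie.bulkinsert.shuffle.parallelism", parallelism.toString)
    .option("hoodie.upsert.shuffle.parallelism", parallelism.toString)
    // Hive sync registers the table in Hive; for a MOR table it also
    // registers the realtime view queried as <table>_rt.
    .option("hoodie.datasource.hive_sync.enable", "true")
    .option("hoodie.datasource.hive_sync.database", db)
    .option("hoodie.datasource.hive_sync.table", tableName)
    .option("hoodie.datasource.hive_sync.partition_fields", partitionCols)
    .option("hoodie.datasource.hive_sync.partition_extractor_class",
      "org.apache.hudi.hive.MultiPartKeysValueExtractor")
    .mode(SaveMode.Append)
    .save(s"$basePath/$tableName")
}
```

With a helper of this shape, the step 1 and step 2 snippets also need `import org.apache.spark.sql.functions.{expr, lit}` and `import org.apache.hudi.DataSourceWriteOptions` in scope.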