[GitHub] [hudi] xiarixiaoyao commented on pull request #2722: [HUDI-1722]hive beeline/spark-sql query specified field on mor table occur NPE

2021-05-11 Thread GitBox


xiarixiaoyao commented on pull request #2722:
URL: https://github.com/apache/hudi/pull/2722#issuecomment-838529283


   @vinothchandar, I have rebased this PR. Please check it, thanks.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hudi] xiarixiaoyao commented on pull request #2722: [HUDI-1722]hive beeline/spark-sql query specified field on mor table occur NPE

2021-05-10 Thread GitBox


xiarixiaoyao commented on pull request #2722:
URL: https://github.com/apache/hudi/pull/2722#issuecomment-837732845


   @vinothchandar I will rebase this PR, thanks.






[GitHub] [hudi] xiarixiaoyao commented on pull request #2722: [HUDI-1722]hive beeline/spark-sql query specified field on mor table occur NPE

2021-04-20 Thread GitBox


xiarixiaoyao commented on pull request #2722:
URL: https://github.com/apache/hudi/pull/2722#issuecomment-823714627


   @lw309637554 @nsivabalan thanks for your review. I will look at
   testHoodieRealtimeCombineHoodieInputFormat in another PR, since
   it has nothing to do with this problem.






[GitHub] [hudi] xiarixiaoyao commented on pull request #2722: [HUDI-1722]hive beeline/spark-sql query specified field on mor table occur NPE

2021-04-18 Thread GitBox


xiarixiaoyao commented on pull request #2722:
URL: https://github.com/apache/hudi/pull/2722#issuecomment-822129270


   @lw309637554 thanks for your review. I left comments on your questions.
   
   Another question:
   TestHoodieCombineHiveInputFormat.testHoodieRealtimeCombineHoodieInputFormat is
   disabled by default. I have checked that test function and found that it has
   some problems. Could I fix those problems and enable
   TestHoodieCombineHiveInputFormat.testHoodieRealtimeCombineHoodieInputFormat?






[GitHub] [hudi] xiarixiaoyao commented on pull request #2722: [HUDI-1722]hive beeline/spark-sql query specified field on mor table occur NPE

2021-04-09 Thread GitBox


xiarixiaoyao commented on pull request #2722:
URL: https://github.com/apache/hudi/pull/2722#issuecomment-817071591


   @garyli1019 a unit test has been added. Please review again, thanks.






[GitHub] [hudi] xiarixiaoyao commented on pull request #2722: [HUDI-1722]hive beeline/spark-sql query specified field on mor table occur NPE

2021-04-01 Thread GitBox


xiarixiaoyao commented on pull request #2722:
URL: https://github.com/apache/hudi/pull/2722#issuecomment-811705289


   Thanks @garyli1019. OK, I will try to add a unit test.






[GitHub] [hudi] xiarixiaoyao commented on pull request #2722: [HUDI-1722]hive beeline/spark-sql query specified field on mor table occur NPE

2021-03-25 Thread GitBox


xiarixiaoyao commented on pull request #2722:
URL: https://github.com/apache/hudi/pull/2722#issuecomment-806722454


   @garyli1019 could you please help review this PR? Thanks.






[GitHub] [hudi] xiarixiaoyao commented on pull request #2722: [HUDI-1722]hive beeline/spark-sql query specified field on mor table occur NPE

2021-03-25 Thread GitBox


xiarixiaoyao commented on pull request #2722:
URL: https://github.com/apache/hudi/pull/2722#issuecomment-806721657


   Test steps
   Before the patch:
   Step 1:
   
   import org.apache.hudi.DataSourceWriteOptions
   import org.apache.spark.sql.functions.{expr, lit}
   
   // Ten rows with keyid 0..9; col3 mirrors keyid
   val df = spark.range(0, 10).toDF("keyid")
     .withColumn("col3", expr("keyid"))
     .withColumn("p", lit(0))
     .withColumn("p1", lit(0))
     .withColumn("p2", lit(7))
     .withColumn("a1", lit(Array[String]("sb1", "rz")))
     .withColumn("a2", lit(Array[String]("sb1", "rz")))
   
   // Create the hoodie table hive_14b (merge is a test helper not shown in this comment)
   merge(df, 4, "default", "hive_14b",
     DataSourceWriteOptions.MOR_TABLE_TYPE_OPT_VAL, op = "bulk_insert")
   
   Note: bulk_insert produces 4 files in the hoodie table.
   
   Step 2:
   
   // Three rows with keyid 9..11
   val df = spark.range(9, 12).toDF("keyid")
     .withColumn("col3", expr("keyid"))
     .withColumn("p", lit(0))
     .withColumn("p1", lit(0))
     .withColumn("p2", lit(7))
     .withColumn("a1", lit(Array[String]("sb1", "rz")))
     .withColumn("a2", lit(Array[String]("sb1", "rz")))
   
   // Upsert into the table
   merge(df, 4, "default", "hive_14b",
     DataSourceWriteOptions.MOR_TABLE_TYPE_OPT_VAL, op = "upsert")
   
   Now we have four base files and one log file in the hoodie table.
   
   Step 3:
   
   In spark-sql/beeline:
   
   select count(col3) from hive_14b_rt;
   
   The query then fails:
   2021-03-25 20:23:14,014 | INFO  | AsyncDispatcher event handler | Diagnostics report from attempt_1615883368881_0038_m_00_0: Error: java.lang.NullPointerException
       at org.apache.hudi.hadoop.realtime.RealtimeCompactedRecordReader.next(RealtimeCompactedRecordReader.java:101)
       at org.apache.hudi.hadoop.realtime.RealtimeCompactedRecordReader.next(RealtimeCompactedRecordReader.java:43)
       at org.apache.hudi.hadoop.realtime.HoodieRealtimeRecordReader.next(HoodieRealtimeRecordReader.java:79)
       at org.apache.hudi.hadoop.realtime.HoodieRealtimeRecordReader.next(HoodieRealtimeRecordReader.java:36)
       at org.apache.hudi.hadoop.realtime.RealtimeCompactedRecordReader.next(RealtimeCompactedRecordReader.java:92)
       at org.apache.hudi.hadoop.realtime.RealtimeCompactedRecordReader.next(RealtimeCompactedRecordReader.java:43)
       at org.apache.hudi.hadoop.realtime.HoodieRealtimeRecordReader.next(HoodieRealtimeRecordReader.java:79)
       at org.apache.hudi.hadoop.realtime.HoodieCombineRealtimeRecordReader.next(HoodieCombineRealtimeRecordReader.java:68)
       at org.apache.hudi.hadoop.realtime.HoodieCombineRealtimeRecordReader.next(HoodieCombineRealtimeRecordReader.java:77)
       at org.apache.hudi.hadoop.realtime.HoodieCombineRealtimeRecordReader.next(HoodieCombineRealtimeRecordReader.java:42)
       at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:205)
       at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:191)
       at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52)
       at org.apache.hadoop.hive.ql.exec.mr.ExecMapRunner.run(ExecMapRunner.java:37)
       at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:465)
       at org.apache.hadoop.mapred.MapTask.run(MapTask.java:349)
       at org.apache.hadoop.mapred.YarnChild$1.run(YarnChild.java:183)
       at java.security.AccessController.doPrivileged(Native Method)
       at javax.security.auth.Subject.doAs(Subject.java:422)
       at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1761)
       at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:177)
   
   
   After the patch:
   In spark-sql/hive-beeline:
   
   select count(col3) from hive_14b_rt;
   +------+
   | _c0  |
   +------+
   | 12   |
   +------+
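
   Why 12 is the expected count: the first write covers keyid 0..9 and the
   upsert covers keyid 9..11, so keyid 9 is updated rather than duplicated.
   The sketch below reproduces that arithmetic with plain Scala collections
   only (no Spark or Hudi). It assumes keyid is the record key used by the
   merge helper, which this comment does not show; UpsertCountSketch and
   upsert are hypothetical names for illustration.

```scala
// Illustration only: models upsert-by-key with an in-memory Map.
// Assumption: keyid is the record key, so a later row with the same
// keyid replaces the earlier one instead of adding a new row.
object UpsertCountSketch {
  // Merge a batch of (keyid, col3) rows into the table, last write wins.
  def upsert(table: Map[Long, Long], batch: Seq[(Long, Long)]): Map[Long, Long] =
    table ++ batch

  def main(args: Array[String]): Unit = {
    val step1 = (0L until 10L).map(k => k -> k) // keyid 0..9, col3 = keyid
    val step2 = (9L until 12L).map(k => k -> k) // keyid 9..11, col3 = keyid

    val afterInsert = upsert(Map.empty, step1)
    val afterUpsert = upsert(afterInsert, step2)

    // Every col3 value is non-null, so count(col3) equals the number of keys.
    println(afterUpsert.size) // prints 12
  }
}
```

   Key 9 appears in both batches, which is why the result is 12 and not 13.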
   
   

