MyLanPangzi opened a new issue #2813:
URL: https://github.com/apache/hudi/issues/2813


   **Describe the problem you faced**
   
   Flink writes to a MOR table, but Hive aggregate queries cannot read the newest data.
   
   **To Reproduce**
   
   Steps to reproduce the behavior:
   
   1. Flink writes to a MOR table.
   2. Create a Hive external table using `org.apache.hudi.hadoop.realtime.HoodieParquetRealtimeInputFormat`:
   ```sql
   CREATE EXTERNAL TABLE dwd_sale_sale_detail_rt
   (
       `_hoodie_commit_time`    STRING,
       `_hoodie_commit_seqno`   STRING,
       `_hoodie_record_key`     STRING,
       `_hoodie_partition_path` STRING,
       `_hoodie_file_name`      STRING,
       shopid                   STRING,
       -- DECIMAL(1,2) as originally posted is invalid in Hive (scale > precision);
       -- assuming a shape like DECIMAL(10,2) here
       salevalue                DECIMAL(10, 2)
   ) PARTITIONED BY (`sdt` STRING)
       ROW FORMAT SERDE
           'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
       STORED AS INPUTFORMAT
           -- realtime input format, per step 2 and the stacktrace below
           -- (the DDL as originally posted listed HoodieParquetInputFormat)
           'org.apache.hudi.hadoop.realtime.HoodieParquetRealtimeInputFormat'
       OUTPUTFORMAT
           'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
       LOCATION
           'hdfs://nameservice1/user/xiebo/hudi/dwd/dwd_sale_sale_detail_rt';
   ```
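   Step 3 does not include the failing query. An aggregate of roughly this shape (column and partition names taken from the DDL above; the partition value is illustrative) forces a MapReduce job and goes through the record-reader path shown in the stacktrace:
   ```sql
   -- Illustrative aggregate: any query that launches an MR job
   -- (unlike a plain SELECT *) exercises the failing split/record-reader path.
   SELECT shopid, SUM(salevalue) AS total_sale
   FROM dwd_sale_sale_detail_rt
   WHERE sdt = '20210413'
   GROUP BY shopid;
   ```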
   3. Run an aggregate query from the Hive shell; it fails with the error in the stacktrace below.
   
   **Expected behavior**
   
   Hive should query the MOR table correctly and return the aggregate result.
   
   **Environment Description**
   
   * Hudi version : 0.9.0
   
   * Spark version :
   
   * Hive version : 1.1 (CDH 5.6.12)
   
   * Hadoop version : 2.6 (CDH 5.6.12)
   
   * Storage (HDFS/S3/GCS..) : hdfs
   
   * Running on Docker? (yes/no) : no
   
   
   **Additional context**
   
   
   **Stacktrace**
   
   
   ```
   2021-04-13 17:05:45,815 INFO [IPC Server handler 6 on 46363] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1594105654926_12624744_m_000012_0 is : 0.0
   2021-04-13 17:05:45,818 FATAL [IPC Server handler 8 on 46363] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: attempt_1594105654926_12624744_m_000012_0 - exited : java.io.IOException: java.lang.reflect.InvocationTargetException
   	at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
   	at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
   	at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:267)
   	at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.<init>(HadoopShimsSecure.java:213)
   	at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getRecordReader(HadoopShimsSecure.java:334)
   	at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:734)
   	at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.<init>(MapTask.java:169)
   	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:438)
   	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
   	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
   	at java.security.AccessController.doPrivileged(Native Method)
   	at javax.security.auth.Subject.doAs(Subject.java:422)
   	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1924)
   	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
   Caused by: java.lang.reflect.InvocationTargetException
   	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
   	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
   	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
   	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
   	at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:253)
   	... 11 more
   Caused by: java.lang.IllegalArgumentException: HoodieRealtimeRecordReader can only work on RealtimeSplit and not with hdfs://nameservice1/user/hudi/dwd/dwd_sale_sale_detail_rt/20210413/ab5a8ff3-4647-46ae-ba13-7b6eb7914516_8-10-0_20210413170058.parquet:0+57883047
   	at org.apache.hudi.common.util.ValidationUtils.checkArgument(ValidationUtils.java:40)
   	at org.apache.hudi.hadoop.realtime.HoodieParquetRealtimeInputFormat.getRecordReader(HoodieParquetRealtimeInputFormat.java:117)
   	at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.<init>(CombineHiveRecordReader.java:68)
   ```
   
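   The stacktrace shows Hive's default `CombineHiveInputFormat` merging file splits into plain `CombineFileSplit`s, which `HoodieRealtimeRecordReader` rejects because a realtime read needs a `RealtimeSplit` (base parquet plus log files). A likely workaround, per Hudi's Hive query documentation, is to point the session at Hudi's combine-aware input format (this assumes the `hudi-hadoop-mr` bundle is on the Hive aux classpath):
   ```sql
   -- Hedged sketch of the session settings for MR aggregate queries on a MOR _rt table:
   -- use Hudi's combine-aware input format so combined splits stay realtime-aware.
   SET hive.input.format = org.apache.hudi.hadoop.hive.HoodieCombineHiveInputFormat;
   
   -- Alternatively, disable split combining entirely:
   -- SET hive.input.format = org.apache.hadoop.hive.ql.io.HiveInputFormat;
   ```
   Either setting should let the aggregate query in step 3 run instead of failing in `HoodieParquetRealtimeInputFormat.getRecordReader`.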

