leosanqing opened a new issue, #8872: URL: https://github.com/apache/hudi/issues/8872
**_Tips before filing an issue_** - Have you gone through our [FAQs](https://hudi.apache.org/learn/faq/)? YES

**Describe the problem you faced**

Hello, when I execute `select count(1) from table_rt` to count the rows of the table, the query fails. I used `bulk_insert` to insert the data, so all 8 records are in parquet format.

```sql
select count(1) from t12_ro; -- OK, result is 8
select count(1) from t12_rt; -- fails with the error below
```

```
Diagnostic Messages for this Task:
Error: java.io.IOException: java.lang.reflect.InvocationTargetException
    at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
    at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
    at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:271)
    at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.<init>(HadoopShimsSecure.java:217)
    at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getRecordReader(HadoopShimsSecure.java:345)
    at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:719)
    at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.<init>(MapTask.java:176)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:445)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:350)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:174)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1762)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:168)
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:257)
    ... 11 more
Caused by: java.lang.IllegalArgumentException: HoodieRealtimeRecordReader can only work on RealtimeSplit and not with hdfs://bigdata01:9000/hudi_test/t12/par1/b0cdf43d-8f7b-486b-bc8f-d4bf09d1c5dc-0_1-2-0_20230528142248215.parquet:0+434511
    at org.apache.hudi.common.util.ValidationUtils.checkArgument(ValidationUtils.java:40)
    at org.apache.hudi.hadoop.realtime.HoodieParquetRealtimeInputFormat.getRecordReader(HoodieParquetRealtimeInputFormat.java:65)
    at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.<init>(CombineHiveRecordReader.java:99)
    ... 16 more
```
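For context, the first stack trace shows Hive's default `CombineHiveInputFormat` handing `HoodieParquetRealtimeInputFormat` a plain parquet file split instead of a `RealtimeSplit`. A minimal workaround sketch, based on the input-format guidance in the Hudi docs and untested against this environment, is to disable split combining for the session:

```sql
-- Sketch only: switch to the non-combining input format so the
-- realtime reader receives RealtimeSplits rather than merged FileSplits.
set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
select count(1) from t12_rt;
```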
When I set `hive.input.format = org.apache.hudi.hadoop.hive.HoodieCombineHiveInputFormat`, the query still fails, but with a different error:

```
URL: http://bigdata01:18088/taskdetails.jsp?jobid=job_1685068536659_0019&tipid=task_1685068536659_0019_m_000000
-----
Diagnostic Messages for this Task:
Error: java.lang.ClassCastException: org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch cannot be cast to org.apache.hadoop.io.ArrayWritable
    at org.apache.hudi.hadoop.realtime.RealtimeCompactedRecordReader.createValue(RealtimeCompactedRecordReader.java:213)
    at org.apache.hudi.hadoop.realtime.RealtimeCompactedRecordReader.createValue(RealtimeCompactedRecordReader.java:54)
    at org.apache.hudi.hadoop.realtime.HoodieRealtimeRecordReader.createValue(HoodieRealtimeRecordReader.java:89)
    at org.apache.hudi.hadoop.realtime.HoodieRealtimeRecordReader.createValue(HoodieRealtimeRecordReader.java:36)
    at org.apache.hudi.hadoop.realtime.RealtimeCompactedRecordReader.createValue(RealtimeCompactedRecordReader.java:213)
    at org.apache.hudi.hadoop.realtime.RealtimeCompactedRecordReader.createValue(RealtimeCompactedRecordReader.java:54)
    at org.apache.hudi.hadoop.realtime.HoodieRealtimeRecordReader.createValue(HoodieRealtimeRecordReader.java:89)
    at org.apache.hudi.hadoop.realtime.HoodieCombineRealtimeRecordReader.createValue(HoodieCombineRealtimeRecordReader.java:88)
    at org.apache.hudi.hadoop.realtime.HoodieCombineRealtimeRecordReader.createValue(HoodieCombineRealtimeRecordReader.java:41)
    at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.createValue(MapTask.java:187)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
    at org.apache.hadoop.hive.ql.exec.mr.ExecMapRunner.run(ExecMapRunner.java:37)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:466)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:350)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:174)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1762)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:168)
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1  Reduce: 1   HDFS Read: 0 HDFS Write: 0 FAIL
```

**To Reproduce**

Steps to reproduce the behavior:

1. Create a MOR table, using `bulk_insert` (not necessary)
2. Execute `select count(1) from table_rt`
3. Set `hive.input.format = org.apache.hudi.hadoop.hive.HoodieCombineHiveInputFormat`
4. Execute `select count(1) from table_rt` again

**Expected behavior**

The count is returned correctly for the rt table.

**Environment Description**

* Hudi version : 0.13
* Spark version : xxx
* Hive version : 3.2.x
* Hadoop version : 3.2.x
* Storage (HDFS/S3/GCS..) : hdfs
* Running on Docker? (yes/no) : no
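As a closing note, the second stack trace points at Hive's vectorized reader: it passes a `VectorizedRowBatch` where the Hudi realtime record reader expects an `ArrayWritable`. A hedged follow-up sketch, using standard Hive session properties whose effect on this exact setup is unverified, is to disable vectorized execution before retrying the combined input format:

```sql
-- Sketch only: turn off vectorization so map tasks exchange ArrayWritable rows.
set hive.vectorized.execution.enabled=false;
set hive.vectorized.execution.reduce.enabled=false;
set hive.input.format=org.apache.hudi.hadoop.hive.HoodieCombineHiveInputFormat;
select count(1) from t12_rt;
```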
