bigdata-spec opened a new issue, #9829:
URL: https://github.com/apache/hudi/issues/9829

   
   **Environment Description**
   
   * Hudi version : 0.13.1
   
   * Spark version : 3.3.2
   
   * Hive version : 2.1.0-cdh6.3.2
   
   * Hadoop version : 3.0.0-cdh6.3.2
   
   * Storage (HDFS/S3/GCS..) : hdfs
   
   * Running on Docker? (yes/no) : no
   
   
   Hi, I get an error when querying a Hudi table from Trino. The error occurs when I run:
   ```
   select count(*),dt
   from  zone_dw.dws_xx_refresh_hi group by dt
   ```
   
   
   
   ```
2023-10-07 15:13:42.013 ERROR io.trino.spi.TrinoException: Error occurs when executing flatMap
     at io.trino.plugin.hive.BackgroundHiveSplitLoader$HiveSplitLoaderTask.process(BackgroundHiveSplitLoader.java:276)
     at io.trino.plugin.hive.util.ResumableTasks$1.run(ResumableTasks.java:38)
     at io.trino.$gen.Trino_360____20230925_110711_2.run(Unknown Source)
     at io.airlift.concurrent.BoundedExecutor.drainQueue(BoundedExecutor.java:80)
     at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
     at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
     at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: org.apache.hudi.exception.HoodieException: Error occurs when executing flatMap
     at org.apache.hudi.common.function.FunctionWrapper.lambda$throwingFlatMapWrapper$1(FunctionWrapper.java:50)
     at java.base/java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:271)
     at java.base/java.util.Spliterators$ArraySpliterator.forEachRemaining(Spliterators.java:948)
     at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
     at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
     at java.base/java.util.stream.ReduceOps$ReduceTask.doLeaf(ReduceOps.java:952)
     at java.base/java.util.stream.ReduceOps$ReduceTask.doLeaf(ReduceOps.java:926)
     at java.base/java.util.stream.AbstractTask.compute(AbstractTask.java:327)
     at java.base/java.util.concurrent.CountedCompleter.exec(CountedCompleter.java:746)
     at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290)
     at java.base/java.util.concurrent.ForkJoinTask.doInvoke(ForkJoinTask.java:408)
     at java.base/java.util.concurrent.ForkJoinTask.invoke(ForkJoinTask.java:736)
     at java.base/java.util.stream.ReduceOps$ReduceOp.evaluateParallel(ReduceOps.java:919)
     at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:233)
     at java.base/java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:578)
     at org.apache.hudi.common.engine.HoodieLocalEngineContext.flatMap(HoodieLocalEngineContext.java:118)
     at org.apache.hudi.metadata.FileSystemBackedTableMetadata.getPartitionPathWithPathPrefix(FileSystemBackedTableMetadata.java:109)
     at org.apache.hudi.metadata.FileSystemBackedTableMetadata.lambda$getPartitionPathWithPathPrefixes$0(FileSystemBackedTableMetadata.java:91)
     at java.base/java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:271)
     at java.base/java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1655)
     at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
     at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
     at java.base/java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:913)
     at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
     at java.base/java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:578)
     at org.apache.hudi.metadata.FileSystemBackedTableMetadata.getPartitionPathWithPathPrefixes(FileSystemBackedTableMetadata.java:95)
     at org.apache.hudi.BaseHoodieTableFileIndex.listPartitionPaths(BaseHoodieTableFileIndex.java:281)
     at org.apache.hudi.BaseHoodieTableFileIndex.getAllQueryPartitionPaths(BaseHoodieTableFileIndex.java:206)
     at org.apache.hudi.BaseHoodieTableFileIndex.doRefresh(BaseHoodieTableFileIndex.java:383)
     at org.apache.hudi.BaseHoodieTableFileIndex.<init>(BaseHoodieTableFileIndex.java:159)
     at org.apache.hudi.hadoop.HiveHoodieTableFileIndex.<init>(HiveHoodieTableFileIndex.java:52)
     at org.apache.hudi.hadoop.HoodieCopyOnWriteTableInputFormat.listStatusForSnapshotMode(HoodieCopyOnWriteTableInputFormat.java:235)
     at org.apache.hudi.hadoop.HoodieCopyOnWriteTableInputFormat.listStatus(HoodieCopyOnWriteTableInputFormat.java:142)
     at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:325)
     at org.apache.hudi.hadoop.HoodieParquetInputFormatBase.getSplits(HoodieParquetInputFormatBase.java:68)
     at io.trino.plugin.hive.BackgroundHiveSplitLoader.lambda$loadPartition$3(BackgroundHiveSplitLoader.java:485)
     at io.trino.plugin.hive.authentication.NoHdfsAuthentication.doAs(NoHdfsAuthentication.java:25)
     at io.trino.plugin.hive.HdfsEnvironment.doAs(HdfsEnvironment.java:98)
     at io.trino.plugin.hive.BackgroundHiveSplitLoader.loadPartition(BackgroundHiveSplitLoader.java:485)
     at io.trino.plugin.hive.BackgroundHiveSplitLoader.loadSplits(BackgroundHiveSplitLoader.java:345)
     at io.trino.plugin.hive.BackgroundHiveSplitLoader$HiveSplitLoaderTask.process(BackgroundHiveSplitLoader.java:269)
     ... 6 more
Caused by: java.io.FileNotFoundException: File hdfs://nameservice1/user/hive/warehouse/zone_dw.db/dws_xxx_refresh_hi/dt=20230906/hh=23 does not exist.
     at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:1058)
     at org.apache.hadoop.hdfs.DistributedFileSystem.access$1000(DistributedFileSystem.java:131)
     at org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1118)
     at org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1115)
     at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
     at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:1125)
     at org.apache.hadoop.fs.FilterFileSystem.listStatus(FilterFileSystem.java:270)
     at org.apache.hudi.metadata.FileSystemBackedTableMetadata.lambda$getPartitionPathWithPathPrefix$f0540b37$1(FileSystemBackedTableMetadata.java:111)
     at org.apache.hudi.common.function.FunctionWrapper.lambda$throwingFlatMapWrapper$1(FunctionWrapper.java:48)
     ... 46 more
   ```
   The path `/user/hive/warehouse/zone_dw.db/dws_xxx_refresh_hi/dt=20230906/hh=23` indeed does not exist on HDFS, but the same SQL runs successfully in both Hive and Spark.
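   For what it's worth, one plausible reading of the trace: `FileSystemBackedTableMetadata` discovers partitions by listing directories directly on HDFS (i.e. the listing is file-system based rather than served from Hudi's metadata table), so a single partition directory that has been removed from the file system but is still reached during the prefix walk aborts the whole query, while an engine that resolves partitions through the Hive metastore may simply never touch the missing path. A minimal, self-contained sketch of that behavioral difference (plain Python on a local filesystem, not Hudi code; all names below are made up for illustration):

   ```python
   import os
   import tempfile

   def list_partitions_strict(partition_dirs):
       """Fail-fast listing: one missing directory aborts the whole scan,
       analogous to the FileNotFoundException bubbling out of the
       file-system-backed partition listing in the stack trace."""
       files = []
       for d in partition_dirs:
           files.extend(os.listdir(d))  # raises FileNotFoundError if d is gone
       return files

   def list_partitions_lenient(partition_dirs):
       """Tolerant listing: partitions whose directory no longer exists
       are silently skipped, so the query still succeeds."""
       files = []
       for d in partition_dirs:
           try:
               files.extend(os.listdir(d))
           except FileNotFoundError:
               continue  # stale partition reference; ignore it
       return files

   root = tempfile.mkdtemp()
   present = os.path.join(root, "dt=20230906", "hh=22")
   missing = os.path.join(root, "dt=20230906", "hh=23")  # never created
   os.makedirs(present)
   open(os.path.join(present, "part-0.parquet"), "w").close()

   print(list_partitions_lenient([present, missing]))  # ['part-0.parquet']
   try:
       list_partitions_strict([present, missing])
   except FileNotFoundError as e:
       print("strict listing failed:", e)
   ```

   This is only an illustration of the failure mode, not a claim about which behavior Hudi should adopt here; it may also be worth checking why the `hh=23` directory disappeared (e.g. deleted outside of Hudi) while the table's listing still reaches it.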
   

