[ 
https://issues.apache.org/jira/browse/HUDI-4223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ethan Guo updated HUDI-4223:
----------------------------
    Description: 
Loading the metadata table in Spark shell with the following code throws a 
NullPointerException from getLogRecordScanner:
{code:java}
spark.read.format("hudi").load("s3a://<base_path>/.hoodie/metadata/").show  
{code}
 
{code:java}
Caused by: java.lang.NullPointerException
   at org.apache.hudi.metadata.HoodieBackedTableMetadata.getLogRecordScanner(HoodieBackedTableMetadata.java:484)
   at org.apache.hudi.HoodieMergeOnReadRDD$.scanLog(HoodieMergeOnReadRDD.scala:342)
   at org.apache.hudi.HoodieMergeOnReadRDD$LogFileIterator.<init>(HoodieMergeOnReadRDD.scala:173)
   at org.apache.hudi.HoodieMergeOnReadRDD$RecordMergingFileIterator.<init>(HoodieMergeOnReadRDD.scala:252)
   at org.apache.hudi.HoodieMergeOnReadRDD.compute(HoodieMergeOnReadRDD.scala:101)
   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
   at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
   at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
   at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
   at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
   at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
   at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
   at org.apache.spark.scheduler.Task.run(Task.scala:131)
   at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:506)
   at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1462)
   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:509)
   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
   at java.lang.Thread.run(Thread.java:748)
{code}

  was:
When loading the metadata table in Spark shell using the following code, it 
throws NullPointerException from getLogRecordScanner
{code:java}
Caused by: java.lang.NullPointerException
   at org.apache.hudi.metadata.HoodieBackedTableMetadata.getLogRecordScanner(HoodieBackedTableMetadata.java:484)
   at org.apache.hudi.HoodieMergeOnReadRDD$.scanLog(HoodieMergeOnReadRDD.scala:342)
   at org.apache.hudi.HoodieMergeOnReadRDD$LogFileIterator.<init>(HoodieMergeOnReadRDD.scala:173)
   at org.apache.hudi.HoodieMergeOnReadRDD$RecordMergingFileIterator.<init>(HoodieMergeOnReadRDD.scala:252)
   at org.apache.hudi.HoodieMergeOnReadRDD.compute(HoodieMergeOnReadRDD.scala:101)
   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
   at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
   at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
   at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
   at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
   at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
   at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
   at org.apache.spark.scheduler.Task.run(Task.scala:131)
   at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:506)
   at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1462)
   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:509)
   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
   at java.lang.Thread.run(Thread.java:748)
{code}


> Reading metadata table throws NullPointerException from getLogRecordScanner
> ---------------------------------------------------------------------------
>
>                 Key: HUDI-4223
>                 URL: https://issues.apache.org/jira/browse/HUDI-4223
>             Project: Apache Hudi
>          Issue Type: Bug
>          Components: metadata
>    Affects Versions: 0.11.0
>            Reporter: Ethan Guo
>            Assignee: Ethan Guo
>            Priority: Blocker
>              Labels: pull-request-available
>             Fix For: 0.11.1
>
>
> Loading the metadata table in Spark shell with the following code throws a 
> NullPointerException from getLogRecordScanner:
> {code:java}
> spark.read.format("hudi").load("s3a://<base_path>/.hoodie/metadata/").show  
> {code}
>  
> {code:java}
> Caused by: java.lang.NullPointerException
>    at org.apache.hudi.metadata.HoodieBackedTableMetadata.getLogRecordScanner(HoodieBackedTableMetadata.java:484)
>    at org.apache.hudi.HoodieMergeOnReadRDD$.scanLog(HoodieMergeOnReadRDD.scala:342)
>    at org.apache.hudi.HoodieMergeOnReadRDD$LogFileIterator.<init>(HoodieMergeOnReadRDD.scala:173)
>    at org.apache.hudi.HoodieMergeOnReadRDD$RecordMergingFileIterator.<init>(HoodieMergeOnReadRDD.scala:252)
>    at org.apache.hudi.HoodieMergeOnReadRDD.compute(HoodieMergeOnReadRDD.scala:101)
>    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
>    at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
>    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
>    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
>    at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
>    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
>    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
>    at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
>    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
>    at org.apache.spark.scheduler.Task.run(Task.scala:131)
>    at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:506)
>    at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1462)
>    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:509)
>    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>    at java.lang.Thread.run(Thread.java:748)
> {code}



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
