cb149 edited a comment on issue #3984:
URL: https://github.com/apache/hudi/issues/3984#issuecomment-1008878036
> Hi @cb149 @nsivabalan @xushiyan, I have found the problem: you just need to
> set `hoodie.file.index.enable=false` for it to work.
>
> ```
> val tripsSnapshotDF = spark.read.format("hudi")
>   .option("hoodie.file.index.enable", "false")
>   .load(basePath)
> ```
Hi @XuQianJin-Stars, that solves the problem but degrades performance
severely, since it takes a very long time before the stage becomes visible in
Spark. E.g., as a workaround I am using `....where("_partition like
'year=2021/month=6/%'").count` (depending on which column contains the
partition path), which takes about 5 seconds in total, while setting
`hoodie.file.index.enable=false` takes multiple minutes.
Also, using this setting with the example from the quickstart guide returns 0
if you run `tripsSnapshotDF.count`.
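
For anyone who wants to compare the two approaches on their own table, here is a minimal sketch. It is not taken from the issue: `HudiCountComparison`, `timed`, and the two command-line arguments (table base path and the name of the column holding the partition path) are hypothetical names introduced for illustration, and the `year=2021/month=6/%` pattern only makes sense if your partition paths look like mine. Only standard Spark calls and the `hoodie.file.index.enable` option from the quoted comment are used.

```scala
import org.apache.spark.sql.SparkSession

object HudiCountComparison {
  // Hypothetical helper: evaluates `body` once and returns its result
  // together with the elapsed wall-clock time in milliseconds.
  def timed[T](body: => T): (T, Long) = {
    val start = System.nanoTime()
    val result = body
    (result, (System.nanoTime() - start) / 1000000L)
  }

  def main(args: Array[String]): Unit = {
    // Assumptions: args(0) is the Hudi table base path, args(1) is the
    // column that contains the partition path (e.g. "_partition").
    val basePath = args(0)
    val partitionColumn = args(1)

    val spark = SparkSession.builder()
      .appName("hudi-count-comparison")
      .getOrCreate()

    // Workaround: keep the file index enabled and filter on the partition path.
    val (filteredCount, filteredMs) = timed {
      spark.read.format("hudi")
        .load(basePath)
        .where(s"$partitionColumn like 'year=2021/month=6/%'")
        .count()
    }
    println(s"partition filter: count=$filteredCount in $filteredMs ms")

    // Suggested fix: disable the file index and apply the same filter.
    val (noIndexCount, noIndexMs) = timed {
      spark.read.format("hudi")
        .option("hoodie.file.index.enable", "false")
        .load(basePath)
        .where(s"$partitionColumn like 'year=2021/month=6/%'")
        .count()
    }
    println(s"file index disabled: count=$noIndexCount in $noIndexMs ms")

    spark.stop()
  }
}
```

Both reads apply the same filter, so the counts should be directly comparable; a mismatch between them, or a large gap in the timings, would be worth reporting alongside the Hudi and Spark versions in use.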