XuQianJin-Stars commented on issue #3984:
URL: https://github.com/apache/hudi/issues/3984#issuecomment-1013615169
> > Hi @cb149 @nsivabalan @xushiyan, I have found the problem: you just need to set `hoodie.file.index.enable=false` for it to work.
> > ```
> > val tripsSnapshotDF = spark.read.format("hudi")
> >   .option("hoodie.file.index.enable", "false")
> >   .load(basePath)
> > ```
>
> Hi @XuQianJin-Stars, that solves the problem but degrades performance severely, since it takes a very long time before the stage becomes visible in Spark.
>
> E.g., as a workaround I am using `....where("_partition like 'year=2021/month=6/%'").count` (depending on which column contains the partition path), which takes about 5 seconds in total, while using `hoodie.file.index.enable=false` takes multiple minutes.

Regarding this, we will solve the problem completely in three steps:
[HUDI-3200](https://issues.apache.org/jira/browse/HUDI-3200), [HUDI-3201](https://issues.apache.org/jira/browse/HUDI-3201), and [HUDI-3202](https://issues.apache.org/jira/browse/HUDI-3202).
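
For reference, the two read paths compared in the quoted comments could look roughly like the sketch below. This is only an illustration under assumptions: the object and helper names (`HudiReadSketch`, `partitionPrefixFilter`, `readWithoutFileIndex`, `countPartition`) are hypothetical, the part of the workaround elided as `....` is assumed to be a plain `spark.read.format("hudi").load(basePath)`, and the column holding the partition path (`_partition` in the quote) may differ per table.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object HudiReadSketch {

  // Hypothetical helper: builds the partition-prefix predicate from the quoted workaround.
  // `column` is whichever column contains the partition path ("_partition" in the quote).
  def partitionPrefixFilter(column: String, year: Int, month: Int): String =
    s"$column like 'year=$year/month=$month/%'"

  // Read path 1: Hudi file index disabled, as in the first quoted comment.
  // Reported in the thread to work, but to take multiple minutes.
  def readWithoutFileIndex(spark: SparkSession, basePath: String): DataFrame =
    spark.read.format("hudi")
      .option("hoodie.file.index.enable", "false")
      .load(basePath)

  // Read path 2: assumed shape of the workaround, i.e. a default read
  // followed by a filter on the partition-path column, then a count.
  def countPartition(spark: SparkSession, basePath: String,
                     column: String, year: Int, month: Int): Long =
    spark.read.format("hudi")
      .load(basePath)
      .where(partitionPrefixFilter(column, year, month))
      .count()
}
```

With these definitions, `HudiReadSketch.countPartition(spark, basePath, "_partition", 2021, 6)` would correspond to the filtered count described in the quote, given an existing `spark` session and `basePath`.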