Yanjia Gary Li created HUDI-597:
-----------------------------------
Summary: Enable incremental pulling from defined partitions
Key: HUDI-597
URL: https://issues.apache.org/jira/browse/HUDI-597
Project: Apache Hudi (incubating)
Issue Type: New Feature
Reporter: Yanjia Gary Li
Assignee: Yanjia Gary Li
For the use case where I only need the incremental changes from certain
partitions, I currently have to run the incremental pull against the entire
dataset first and then filter the result in Spark.
If we could pass the partition folders directly as part of the input path, the
query could run faster by loading only the relevant parquet files.
Example:
{code:java}
// Proposed usage: pass a partition glob alongside the base path
spark.read.format("org.apache.hudi")
  .option(DataSourceReadOptions.VIEW_TYPE_OPT_KEY, DataSourceReadOptions.VIEW_TYPE_INCREMENTAL_OPT_VAL)
  .option(DataSourceReadOptions.BEGIN_INSTANTTIME_OPT_KEY, "000")
  .load(path, "year=2020/*/*/*")
{code}
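To make the intended selection concrete outside of Spark, here is a minimal sketch of which files a glob like the one above would pick. It is a toy model, not Hudi or Spark code: it uses the JDK's own glob matcher, the names PartitionGlobSketch and selectByGlob are hypothetical, and the month=/day= folder layout is an assumption (only year=2020 appears in the example above).

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: none of these names are Hudi or Spark APIs.
final class PartitionGlobSketch {

    // Returns the relative file paths that match the given partition glob.
    // In JDK glob syntax, '*' matches within one path component only.
    static List<String> selectByGlob(List<String> relativePaths, String glob) {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + glob);
        List<String> selected = new ArrayList<>();
        for (String p : relativePaths) {
            if (matcher.matches(Paths.get(p))) {
                selected.add(p);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        // Assumed layout: <year>/<month>/<day>/<file>, relative to the base path.
        List<String> files = Arrays.asList(
            "year=2019/month=12/day=31/a.parquet",
            "year=2020/month=01/day=01/b.parquet",
            "year=2020/month=02/day=15/c.parquet");

        // Only files under year=2020 are selected; the 2019 file is never touched.
        for (String f : selectByGlob(files, "year=2020/*/*/*")) {
            System.out.println(f);
        }
    }
}
```

The point of the sketch is only that the partition filter is applied to paths before any file is read, instead of reading every partition and filtering rows afterwards.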