[
https://issues.apache.org/jira/browse/HUDI-110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17111795#comment-17111795
]
Yanjia Gary Li commented on HUDI-110:
-------------------------------------
IIUC, this ticket is trying to extract the partition info from the folder
structure when querying through Spark. Please let me know if I am wrong.
I made a PR with an example. This feature is actually supported already.
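To illustrate what "extracting the partition info from the folder structure" means, here is a minimal sketch. It assumes Hive-style `key=value` directory names, which is the layout Spark's partition discovery recognizes. The function name `parse_hive_style_partitions` is hypothetical and is not part of Hudi or Spark.

```python
from typing import Dict


def parse_hive_style_partitions(partition_path: str) -> Dict[str, str]:
    """Collect key=value directory segments from a relative partition path.

    Hypothetical helper for illustration only; segments without '=' are ignored.
    """
    partitions: Dict[str, str] = {}
    for segment in partition_path.strip("/").split("/"):
        key, sep, value = segment.partition("=")
        if sep and key:  # keep only well-formed key=value segments
            partitions[key] = value
    return partitions


if __name__ == "__main__":
    print(parse_hive_style_partitions("year=2020/month=05/day=20"))
```

A path without `key=value` segments (for example `2020/05/20`) yields an empty mapping in this sketch, which is why a different extractor is needed for that layout.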
> Better defaults for Partition extractor for Spark DataSource and DeltaStreamer
> ------------------------------------------------------------------------------
>
> Key: HUDI-110
> URL: https://issues.apache.org/jira/browse/HUDI-110
> Project: Apache Hudi (incubating)
> Issue Type: Improvement
> Components: DeltaStreamer, Spark Integration, Usability
> Reporter: Balaji Varadarajan
> Assignee: Yanjia Gary Li
> Priority: Minor
> Labels: bug-bash-0.6.0, pull-request-available
>
> Currently, SlashEncodedDayPartitionValueExtractor is the default partition
> extractor. This is not a common format outside Uber.
>
> Also, Spark DataSource provides a partitionBy clause, which has not been
> integrated into the Hudi DataSource. We need to investigate how we can
> leverage the partitionBy clause for partitioning.
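For readers unfamiliar with the slash-encoded day format mentioned in the description, the sketch below shows roughly what such an extractor has to do. It assumes the partition path has the form `yyyy/MM/dd` and that the desired partition value is a single `yyyy-MM-dd` string. This is an illustration in Python, not Hudi's actual implementation, and `extract_slash_encoded_day` is a hypothetical name.

```python
from datetime import date
from typing import List


def extract_slash_encoded_day(partition_path: str) -> List[str]:
    """Turn a 'yyyy/MM/dd' partition path into one 'yyyy-MM-dd' value.

    Hypothetical helper for illustration only.
    """
    parts = partition_path.strip("/").split("/")
    if len(parts) != 3:
        raise ValueError(f"expected yyyy/MM/dd, got: {partition_path!r}")
    year, month, day = (int(p) for p in parts)
    # date() validates the calendar day and raises ValueError if it is invalid
    return [date(year, month, day).isoformat()]


if __name__ == "__main__":
    print(extract_slash_encoded_day("2020/05/20"))
```

Because the sketch rejects any path that does not have exactly three segments, it also shows why this default is a poor fit for tables partitioned in some other way.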