[ https://issues.apache.org/jira/browse/HUDI-55?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17237775#comment-17237775 ]
Nishith Agarwal commented on HUDI-55:
-------------------------------------
Blurbs from the Slack channel:
```
I have a requirement to compact datalake but need bucketing on top of
compaction so that during query time, only the files relevant to the "id" in
query would be scanned. Is that supported in Hudi? If not, is it possible to
extend Hudi to support it?

Hello Team - we have a need for bucketing our datasets (primarily to keep the
parquet file size optimized for faster read). We see that Hudi doesn't support
bucketing now. Are there any plans to support bucketing in the future?

Following up on the email "Bucketing in Hudi", we would like to schedule a
meeting to understand and estimate the code changes needed to achieve
bucketing in Hudi. The high level requirements are as detailed in email but we
could chat further in the meeting to get into specifics. When would be the
earliest we could have this discussion?
```
> Investigate support for bucketed tables ala Hive #74
> ----------------------------------------------------
>
> Key: HUDI-55
> URL: https://issues.apache.org/jira/browse/HUDI-55
> Project: Apache Hudi
> Issue Type: New Feature
> Components: Hive Integration
> Reporter: Vinoth Chandar
> Priority: Major
>
> https://github.com/uber/hudi/issues/74
--
This message was sent by Atlassian Jira
(v8.3.4#803005)