[
https://issues.apache.org/jira/browse/HUDI-184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16976791#comment-16976791
]
Vinoth Chandar commented on HUDI-184:
-------------------------------------
[~yanghua] I've been thinking a bit about what we could do to unlock progress with
Flink. There are two open items right now:
1) Index
2) Size estimations
I wonder if we can start with a simpler model first, i.e. use
joins for the index and see if we can make the functionality work correctly without
`WorkloadProfile`.
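To make the "joins for index" idea concrete, here is a minimal sketch in plain Python. It is not Hudi or Flink code. It assumes the index lookup can be modeled as a left outer join between incoming records and a mapping from record key to the file that already holds that key. The names `IncomingRecord`, `TaggedRecord`, `tag_by_join`, and the `file_id` field are hypothetical and chosen only for illustration.

```python
from typing import Dict, Iterable, List, NamedTuple, Optional


class IncomingRecord(NamedTuple):
    """A record arriving in the current batch (hypothetical shape)."""
    record_key: str
    payload: dict


class TaggedRecord(NamedTuple):
    """An incoming record tagged with its existing location, if any."""
    record_key: str
    payload: dict
    file_id: Optional[str]  # None means the key was not found, so it is an insert


def tag_by_join(
    incoming: Iterable[IncomingRecord],
    existing_locations: Dict[str, str],
) -> List[TaggedRecord]:
    """Left outer join of incoming records against (record_key -> file_id).

    Assumption: the existing key-to-file mapping fits in a dict here; in a
    real engine this would be a distributed join over two datasets.
    """
    return [
        TaggedRecord(r.record_key, r.payload, existing_locations.get(r.record_key))
        for r in incoming
    ]


def split_updates_and_inserts(tagged: Iterable[TaggedRecord]):
    """Separate records that hit an existing file from brand-new keys."""
    updates, inserts = [], []
    for t in tagged:
        (updates if t.file_id is not None else inserts).append(t)
    return updates, inserts


# Example usage with made-up keys and file IDs.
batch = [IncomingRecord("k1", {"v": 1}), IncomingRecord("k2", {"v": 2})]
known = {"k1": "file-001"}
updates, inserts = split_updates_and_inserts(tag_by_join(batch, known))
```

In this sketch the join result alone decides update versus insert, so no size estimation is involved in the tagging step.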
Also, there are probably some bigger questions to answer first. E.g., if we are
targeting the streaming APIs, then what's the execution model? In Spark
Streaming, we commit after each micro-batch. When do we commit for Flink
writing?
> Integrate Hudi with Apache Flink
> --------------------------------
>
> Key: HUDI-184
> URL: https://issues.apache.org/jira/browse/HUDI-184
> Project: Apache Hudi (incubating)
> Issue Type: New Feature
> Components: Write Client
> Reporter: vinoyang
> Assignee: vinoyang
> Priority: Major
>
> Apache Flink is a popular stream processing engine.
> Integrating Hudi with Flink would be valuable work.
> The discussion mailing thread is here:
> [https://lists.apache.org/api/source.lua/1533de2d4cd4243fa9e8f8bf057ffd02f2ac0bec7c7539d8f72166ea@%3Cdev.hudi.apache.org%3E]
--
This message was sent by Atlassian Jira
(v8.3.4#803005)