[ 
https://issues.apache.org/jira/browse/HUDI-184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16976791#comment-16976791
 ] 

Vinoth Chandar commented on HUDI-184:
-------------------------------------

[~yanghua] I've been thinking a bit about what we could do to unlock progress 
with Flink. There are two open issues right now:

1) Index

2) Size estimations

I wonder if we can start with a simpler model first, i.e. use joins for the 
index, and see if we can make the functionality work correctly without 
`WorkloadProfile`.
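
To make the join idea concrete, here is a minimal sketch in plain Python. It is 
not Hudi or Flink code: `tag_locations`, the `(record_key, payload)` record 
shape, and the `(record_key, file_id)` pairs are all hypothetical stand-ins. It 
only assumes that index lookup can be expressed as a left outer join of the 
incoming records against the keys already stored in the table.

```python
from typing import Dict, Iterable, List, Optional, Tuple

# Hypothetical stand-ins: a record is (record_key, payload); the "index" is
# the collection of (record_key, file_id) pairs already stored in the table.
Record = Tuple[str, str]
KeyLocation = Tuple[str, str]


def tag_locations(
    incoming: Iterable[Record],
    existing: Iterable[KeyLocation],
) -> List[Tuple[str, str, Optional[str]]]:
    """Left outer join of incoming records against existing key locations.

    Returns (record_key, payload, file_id). file_id is None for keys that
    are not in the table yet (inserts); otherwise the record is an update
    to the file identified by file_id.
    """
    # Build side of a hash join: record_key -> file_id.
    location_by_key: Dict[str, str] = dict(existing)
    # Probe side: every incoming record is kept, matched or not.
    return [
        (key, payload, location_by_key.get(key))
        for key, payload in incoming
    ]


if __name__ == "__main__":
    existing = [("k1", "file-a"), ("k2", "file-b")]
    incoming = [("k2", "v2-new"), ("k3", "v3")]
    for row in tag_locations(incoming, existing):
        print(row)
```

In an engine, the same shape would presumably be a keyed join between the 
incoming stream and the stored keys, so no separate index structure would be 
needed for a first version.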

Also, there are probably some bigger questions to answer first. For example, if 
we are targeting the streaming APIs, then what is the execution model? In Spark 
Streaming, we commit after each micro-batch. When do we commit for Flink 
writing? 

 

> Integrate Hudi with Apache Flink
> --------------------------------
>
>                 Key: HUDI-184
>                 URL: https://issues.apache.org/jira/browse/HUDI-184
>             Project: Apache Hudi (incubating)
>          Issue Type: New Feature
>          Components: Write Client
>            Reporter: vinoyang
>            Assignee: vinoyang
>            Priority: Major
>
> Apache Flink is a popular stream processing engine.
> Integrating Hudi with Flink would be valuable work.
> The discussion mailing thread is here: 
> [https://lists.apache.org/api/source.lua/1533de2d4cd4243fa9e8f8bf057ffd02f2ac0bec7c7539d8f72166ea@%3Cdev.hudi.apache.org%3E]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
