[jira] [Commented] (HUDI-288) Add support for ingesting multiple kafka streams in a single DeltaStreamer deployment

Vinoth Chandar (Jira) Mon, 02 Dec 2019 10:53:42 -0800


    [ 
https://issues.apache.org/jira/browse/HUDI-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16986264#comment-16986264
 ]


Vinoth Chandar commented on HUDI-288:
-------------------------------------

> So keeping <topic name> in the target path looks a bit skeptical to me 
> because a topic name might not necessarily include table name

I agree. I was only suggesting a sane default; we should let the user override 
it as needed through a TableConfig-like mechanism. Otherwise, defaulting to 
table_name = topic_name seems acceptable to me. At Uber, at least, this was very 
useful for auto-creating Hudi datasets from newly added Kafka topics, for example.
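
To make the default concrete, here is a minimal sketch of "table_name = 
topic_name unless the user overrides it". It is written in Python only for 
brevity; resolve_table_name, resolve_target_path and the overrides mapping are 
hypothetical names for illustration, not existing Hudi or DeltaStreamer APIs.

```python
from typing import Mapping, Optional


def resolve_table_name(topic: str,
                       overrides: Optional[Mapping[str, str]] = None) -> str:
    """Return the user-supplied table name for a topic, else the topic name.

    `overrides` is a hypothetical topic -> table_name mapping standing in for
    a TableConfig-like mechanism.
    """
    if overrides and topic in overrides:
        return overrides[topic]
    # Default: table_name = topic_name.
    return topic


def resolve_target_path(base_path: str, topic: str,
                        overrides: Optional[Mapping[str, str]] = None) -> str:
    """Build the target path for a topic under a common base path."""
    table_name = resolve_table_name(topic, overrides)
    return f"{base_path.rstrip('/')}/{table_name}"
```

With this shape, a newly added topic needs no configuration at all, while a 
topic whose name does not match the desired table name needs exactly one 
override entry.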

> then it will keep on running for the first table itself and will never pick 
> up the next table

Yes, you need a thread per DeltaSync instance. Supporting continuous mode 
would be good for k8s deployments, where cluster setup and teardown are costly 
affairs. Continuous mode also solves the problem of managing compactions for MOR. 
For COW, running without continuous mode could be sufficient. We can phase this 
in slowly as well.
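
A rough sketch of the thread-per-instance idea, assuming each table's sync can 
be modelled as a callable that performs one round of ingestion. It is in Python 
for brevity and does not use any Hudi API; run_continuous, sync_fns and 
poll_interval_s are hypothetical names.

```python
import threading
from typing import Callable, Dict, List


def run_continuous(sync_fns: Dict[str, Callable[[], None]],
                   stop: threading.Event,
                   poll_interval_s: float = 1.0) -> List[threading.Thread]:
    """Start one thread per table; each loops over its own sync until stopped.

    `sync_fns` maps a table name to a callable that runs a single sync round
    (a stand-in for one DeltaSync instance).
    """
    def loop(sync_once: Callable[[], None]) -> None:
        while not stop.is_set():
            sync_once()
            # Sleep between rounds, but wake up immediately on shutdown.
            stop.wait(poll_interval_s)

    threads = [
        threading.Thread(target=loop, args=(fn,), name=f"sync-{table}",
                         daemon=True)
        for table, fn in sync_fns.items()
    ]
    for t in threads:
        t.start()
    return threads
```

Because every table has its own loop, a long-running first table no longer 
prevents the next table from being picked up. Setting the stop event and 
joining the returned threads shuts everything down.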

 

So, who's going to drive this? :) We should also give this tool a cool name :D 

> Add support for ingesting multiple kafka streams in a single DeltaStreamer 
> deployment
> -------------------------------------------------------------------------------------
>
>                 Key: HUDI-288
>                 URL: https://issues.apache.org/jira/browse/HUDI-288
>             Project: Apache Hudi (incubating)
>          Issue Type: Improvement
>          Components: deltastreamer
>            Reporter: Vinoth Chandar
>            Assignee: leesf
>            Priority: Major
>
> https://lists.apache.org/thread.html/3a69934657c48b1c0d85cba223d69cb18e18cd8aaa4817c9fd72cef6@<dev.hudi.apache.org>
>  has all the context



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
