[jira] [Commented] (HUDI-288) Add support for ingesting multiple kafka streams in a single DeltaStreamer deployment

leesf (Jira) Mon, 18 Nov 2019 04:31:15 -0800


    [ 
https://issues.apache.org/jira/browse/HUDI-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16976500#comment-16976500
 ]


leesf commented on HUDI-288:
----------------------------

[~vinoth] Sorry for the late feedback. After a closer look at the code paths, I 
prefer the second solution: write a new tool that wraps the current 
DeltaStreamer, uses the Kafka topic regex to identify all topics that need 
to be ingested, and creates one DeltaStreamer per topic within a SINGLE 
Spark application. This is easier to implement than the first solution.
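
A rough sketch of the topic-selection step, only to make the idea concrete. 
The class and method names (TopicRegexSelector, selectTopics) are 
hypothetical and not part of Hudi. It assumes the list of available topic 
names has already been fetched from Kafka, and it uses full-match regex 
semantics, which is my assumption about how the pattern should be applied.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.TreeSet;
import java.util.regex.Pattern;

// Hypothetical helper, not an existing Hudi class.
final class TopicRegexSelector {

    private TopicRegexSelector() {
    }

    /**
     * Returns the available topics whose full name matches topicRegex,
     * de-duplicated and sorted so the result is deterministic.
     */
    static List<String> selectTopics(Collection<String> availableTopics, String topicRegex) {
        Pattern pattern = Pattern.compile(topicRegex);
        TreeSet<String> selected = new TreeSet<>();
        for (String topic : availableTopics) {
            // matches() requires the whole topic name to match the pattern.
            if (pattern.matcher(topic).matches()) {
                selected.add(topic);
            }
        }
        return new ArrayList<>(selected);
    }
}
```

The wrapper would then loop over the returned list and start one 
DeltaStreamer per topic inside the same Spark application. How each 
DeltaStreamer is constructed is left out here on purpose.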

Two questions. First, if some topics that need to be ingested do not match the 
regex pattern, should we also allow users to list topics explicitly? 
Second, in the current data flow, the relationship between a Kafka topic and 
_targetBasePath_ is one-to-one. Should we allow users to specify multiple 
_targetBasePath_ values when consuming many topics? The same question applies 
to the _targetTableName_ config in Hive.
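
To illustrate one possible answer to the second question, here is a minimal 
sketch that keeps the one-to-one relationship by resolving a separate base 
path and table name per topic. Everything in it is an assumption for 
discussion: the names PerTopicTargets, Target and resolve are hypothetical, 
and so are the defaults (root path plus topic name for the base path, and the 
topic name with non-alphanumeric characters replaced by underscores for the 
table name).

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not an existing Hudi class.
final class PerTopicTargets {

    /** Base path and Hive table name for one topic. */
    static final class Target {
        final String basePath;
        final String tableName;

        Target(String basePath, String tableName) {
            this.basePath = basePath;
            this.tableName = tableName;
        }
    }

    private PerTopicTargets() {
    }

    /**
     * Maps every topic to its own Target. A user-supplied override wins;
     * otherwise a default is derived from rootPath and the topic name.
     */
    static Map<String, Target> resolve(List<String> topics,
                                       String rootPath,
                                       Map<String, Target> overrides) {
        Map<String, Target> resolved = new LinkedHashMap<>();
        for (String topic : topics) {
            Target override = overrides.get(topic);
            if (override != null) {
                resolved.put(topic, override);
            } else {
                // Assumed default: keep table names to letters, digits, underscore.
                String tableName = topic.replaceAll("[^A-Za-z0-9_]", "_");
                resolved.put(topic, new Target(rootPath + "/" + topic, tableName));
            }
        }
        return resolved;
    }
}
```

With this shape users only configure the topics that need a non-default 
location, and each per-topic DeltaStreamer still sees exactly one 
targetBasePath and one targetTableName.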

> Add support for ingesting multiple kafka streams in a single DeltaStreamer 
> deployment
> -------------------------------------------------------------------------------------
>
>                 Key: HUDI-288
>                 URL: https://issues.apache.org/jira/browse/HUDI-288
>             Project: Apache Hudi (incubating)
>          Issue Type: Improvement
>          Components: deltastreamer
>            Reporter: Vinoth Chandar
>            Assignee: leesf
>            Priority: Major
>
> https://lists.apache.org/thread.html/3a69934657c48b1c0d85cba223d69cb18e18cd8aaa4817c9fd72cef6@<dev.hudi.apache.org>
>  has all the context



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
