[ 
https://issues.apache.org/jira/browse/HUDI-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16977695#comment-16977695
 ] 

Vinoth Chandar commented on HUDI-288:
-------------------------------------

I think we can stick to the same whitelist/blacklist that Kafka itself uses? 

 

> should we allow users to specify multi targetBasePath while consuming many 
> topics, I think only one targetBasePath is simpler but does it make sense? 

We need to do this. IIUC, even now, we can specify multiple topics as the source, 
but they get written as a single Hudi dataset. Here, we want to ingest Kafka 
topics as separate Hudi datasets, i.e. a 1-1 mapping between a Kafka topic and a 
Hudi dataset. I think the tool can take a `--base-path-prefix` and place each 
Hudi dataset under `<base_path_prefix>/<topic_name>`. We could also allow 
topic-level overrides as needed for DeltaStreamer/Hudi properties. Our 
DFSPropertiesConfiguration class already supports includes as well. 
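
To make the layout concrete, here is a rough sketch of the topic-to-dataset mapping with topic-level overrides. It is only an illustration: `TopicTableLayout` and its methods are hypothetical names and not existing Hudi classes, and the property values are made up.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;

// Hypothetical helper (not part of Hudi): maps each Kafka topic to its own
// dataset path and merges topic-level property overrides over common ones.
final class TopicTableLayout {
    private final String basePathPrefix;
    private final Properties commonProps;
    private final Map<String, Properties> topicOverrides;

    TopicTableLayout(String basePathPrefix,
                     Properties commonProps,
                     Map<String, Properties> topicOverrides) {
        // Drop trailing slashes so the joined path has exactly one separator.
        this.basePathPrefix = basePathPrefix.replaceAll("/+$", "");
        this.commonProps = commonProps;
        this.topicOverrides = topicOverrides;
    }

    // <base_path_prefix>/<topic_name>
    String basePathFor(String topic) {
        return basePathPrefix + "/" + topic;
    }

    // Common properties first, then the topic's overrides (if any) on top.
    Properties propsFor(String topic) {
        Properties merged = new Properties();
        merged.putAll(commonProps);
        Properties overrides = topicOverrides.get(topic);
        if (overrides != null) {
            merged.putAll(overrides);
        }
        return merged;
    }

    // One dataset path per topic, in the order the topics were given.
    Map<String, String> basePaths(List<String> topics) {
        Map<String, String> paths = new LinkedHashMap<>();
        for (String topic : topics) {
            paths.put(topic, basePathFor(topic));
        }
        return paths;
    }
}
```

With a prefix of `/data/hudi/` and topics `orders` and `users`, this would place the datasets under `/data/hudi/orders` and `/data/hudi/users`, and a record key override set only for `orders` would leave `users` on the common value.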

 

Are you targeting this for the next release, 0.5.1? Or do you think we can pick 
up some things already labelled for that release?

> Add support for ingesting multiple kafka streams in a single DeltaStreamer 
> deployment
> -------------------------------------------------------------------------------------
>
>                 Key: HUDI-288
>                 URL: https://issues.apache.org/jira/browse/HUDI-288
>             Project: Apache Hudi (incubating)
>          Issue Type: Improvement
>          Components: deltastreamer
>            Reporter: Vinoth Chandar
>            Assignee: leesf
>            Priority: Major
>
> https://lists.apache.org/thread.html/3a69934657c48b1c0d85cba223d69cb18e18cd8aaa4817c9fd72cef6@<dev.hudi.apache.org>
>  has all the context



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
