Hi Gurudatt,

With a minimal code change, you can subscribe to multiple Kafka topics via the KafkaOffsetGen.java class. I feel the bigger problem in this case is going to be managing multiple target schemas, because we register the ParquetWriter with a single target schema at a time. I would also like to know if we have a workaround for such a case.
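To illustrate the "minimal code change" idea, here is a rough sketch (not Hudi code — `parseTopics` and the comma-separated config value are hypothetical): a single config property could list all ~100 topics, and the resulting list handed to the Kafka consumer. Kafka's consumer API also accepts a regex via KafkaConsumer.subscribe(Pattern), which would pick up every topic sharing a prefix.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class MultiTopicConfig {

    // Hypothetical helper: split one config value (e.g. a
    // "source.kafka.topics" style property) into a topic list
    // suitable for consumer.subscribe(Collection<String>).
    static List<String> parseTopics(String configValue) {
        List<String> topics = new ArrayList<>();
        for (String t : configValue.split(",")) {
            String trimmed = t.trim();
            if (!trimmed.isEmpty()) {
                topics.add(trimmed);
            }
        }
        return topics;
    }

    public static void main(String[] args) {
        // One property listing several CDC topics instead of one job per table
        List<String> topics = parseTopics("db.users, db.orders,db.payments");
        System.out.println(topics);

        // Alternative: Kafka pattern subscription matches topics by prefix,
        // so newly created "db.*" topics are picked up automatically.
        Pattern dbTopics = Pattern.compile("db\\..*");
        System.out.println(dbTopics.matcher("db.users").matches());
    }
}
```

This only addresses topic subscription; as noted above, each topic would still need its own target schema on the write path.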
On Tue, Oct 1, 2019 at 12:33 PM Gurudatt Kulkarni <[email protected]> wrote:

> Hi All,
>
> I have a use case where I need to pull multiple tables (say close to 100)
> into Hadoop. Do we need to schedule 100 Hudi jobs to pull these tables? Can
> there be a workaround where there is one Hudi Application pulling from
> multiple Kafka topics? This will avoid creating multiple SparkSessions and
> avoid the memory overhead that comes with it.
>
> Regards,
> Gurudatt
