Hi Gurudatt,

With a minimal code change, you can subscribe to multiple Kafka topics via the KafkaOffsetGen.java class. I feel the bigger problem in this case is going to be managing multiple target schemas, because we register the ParquetWriter with a single target schema at a time. I would also like to know if we have a workaround for such a case.
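To illustrate the "minimal code change" idea, here is a rough sketch (not Hudi code — `parseTopics` and the comma-separated config value are hypothetical): a single config property could list all ~100 topics, and the resulting list handed to the Kafka consumer. Kafka's consumer API also accepts a regex via KafkaConsumer.subscribe(Pattern), which would pick up every topic sharing a prefix.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class MultiTopicConfig {

    // Hypothetical helper: split one config value (e.g. a
    // "source.kafka.topics" style property) into a topic list
    // suitable for consumer.subscribe(Collection<String>).
    static List<String> parseTopics(String configValue) {
        List<String> topics = new ArrayList<>();
        for (String t : configValue.split(",")) {
            String trimmed = t.trim();
            if (!trimmed.isEmpty()) {
                topics.add(trimmed);
            }
        }
        return topics;
    }

    public static void main(String[] args) {
        // One property listing several CDC topics instead of one job per table
        List<String> topics = parseTopics("db.users, db.orders,db.payments");
        System.out.println(topics);

        // Alternative: Kafka pattern subscription matches topics by prefix,
        // so newly created "db.*" topics are picked up automatically.
        Pattern dbTopics = Pattern.compile("db\\..*");
        System.out.println(dbTopics.matcher("db.users").matches());
    }
}
```

This only addresses topic subscription; as noted above, each topic would still need its own target schema on the write path.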
On Tue, Oct 1, 2019 at 12:33 PM Gurudatt Kulkarni <[email protected]> wrote:

> Hi All,
>
> I have a use case where I need to pull multiple tables (say close to 100)
> into Hadoop. Do we need to schedule 100 Hudi jobs to pull these tables? Can
> there be a workaround where there is one Hudi Application pulling from
> multiple Kafka topics? This will avoid creating multiple SparkSessions and
> avoid the memory overhead that comes with it.
>
> Regards,
> Gurudatt
