[GitHub] [hudi] bvaradar commented on issue #2355: [SUPPORT] Best method to have Hudi process single stream from multiple source tables

GitBox Mon, 21 Dec 2020 13:04:31 -0800


bvaradar commented on issue #2355:
URL: https://github.com/apache/hudi/issues/2355#issuecomment-749195029



   @WTa-hash : This sounds like a general Spark Streaming question. Is it 
possible to create separate threads within your foreachBatch, after you have 
grouped the dataframes, and write concurrently to the separate Hudi tables?
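
A rough sketch of what the threaded variant could look like in PySpark is below. It is only an illustration under several assumptions: the `source_table` column, the `SINKS` routing table, the table names, and the paths are all hypothetical, and the Hudi options are cut down to the table name (a real job also needs record key, precombine field, and so on). Spark accepts jobs submitted from multiple threads of the same application, so each thread here writes one filtered slice of the micro-batch.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical routing: value of a `source_table` column -> (Hudi table name, base path).
SINKS = {
    "orders": ("orders_hudi", "s3://my-bucket/hudi/orders"),
    "customers": ("customers_hudi", "s3://my-bucket/hudi/customers"),
}


def write_to_hudi(df, table_name, base_path):
    # Minimal options only; record key, precombine field, etc. are omitted here.
    (df.write.format("hudi")
       .option("hoodie.table.name", table_name)
       .mode("append")
       .save(base_path))


def process_batch(batch_df, batch_id, writer=write_to_hudi, sinks=SINKS):
    # Cache the micro-batch so it is not recomputed once per sink table.
    batch_df.persist()
    try:
        with ThreadPoolExecutor(max_workers=len(sinks)) as pool:
            futures = [
                pool.submit(writer, batch_df.filter(f"source_table = '{src}'"), name, path)
                for src, (name, path) in sinks.items()
            ]
            for future in futures:
                # Re-raise any write failure so the whole micro-batch fails.
                future.result()
    finally:
        batch_df.unpersist()
```

It would be wired in with something like `stream_df.writeStream.foreachBatch(process_batch).option("checkpointLocation", ...).start()`. Note that the micro-batch still completes only when the slowest table write finishes, so the tables remain coupled at batch boundaries.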
   
   Another option is one Spark stream for each sink table, all reading from 
the same Kinesis stream. Each Spark stream has to filter for the records 
destined for its own sink table. This will be inefficient w.r.t. network I/O, 
since every stream reads the full Kinesis stream, but it decouples each 
table's writes from the writes to the other tables.
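
The stream-per-table option could be sketched roughly as follows, again in PySpark. The helper `start_stream_per_table`, the `read_source` callable (which is assumed to return a fresh streaming DataFrame over the Kinesis stream, using whichever Kinesis connector you have), the `source_table` column, and the `sinks` layout are all hypothetical names for illustration. Hudi options are again reduced to the table name.

```python
def start_stream_per_table(read_source, sinks, checkpoint_root):
    """Start one independent streaming query per sink table.

    read_source: hypothetical callable returning a streaming DataFrame.
    sinks: dict of source_table value -> (Hudi table name, base path).
    """
    queries = []
    for src, (table_name, base_path) in sinks.items():
        query = (
            read_source()                       # every query reads the full stream
            .filter(f"source_table = '{src}'")  # keep only this sink's records
            .writeStream.format("hudi")
            .option("hoodie.table.name", table_name)
            # Each query needs its own checkpoint location.
            .option("checkpointLocation", f"{checkpoint_root}/{table_name}")
            .outputMode("append")
            .start(base_path)
        )
        queries.append(query)
    return queries
```

Each returned query progresses on its own, so a slow write to one table no longer holds back the others. The cost is the one mentioned above: the Kinesis stream is read once per sink table.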


