[GitHub] [hudi] rajgowtham24 commented on issue #2086: [SUPPORT] Hive Sync Not Working through run_sync_tool.sh

GitBox Wed, 16 Sep 2020 03:22:45 -0700


rajgowtham24 commented on issue #2086:
URL: https://github.com/apache/hudi/issues/2086#issuecomment-693315044



   @umehrot2 Currently we read real-time files from an inbound bucket and 
write them into a landing bucket.
   Since the data will be consumed by multiple reporting tools, we thought of 
having two different buckets: one for writes (using Hudi) and one for reads 
(using reporting tools).
   
   With this approach, instead of writing the data twice, once into each 
bucket, we thought of moving the underlying Hudi table data files into the 
reporting bucket and running hive_sync on top of the Hudi table data files 
there. If both tables are in sync after the Hive sync on the reporting bucket, 
we plan to use this approach for our ingestion pattern.
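
   To make the idea concrete, here is a rough sketch of the two steps, not 
something we have validated. It only builds the command lines and prints them. 
The bucket paths, database, table, JDBC URL, credentials and partition field 
are all placeholders, and it assumes the files are copied with `aws s3 sync` 
(rather than moved) and that `run_sync_tool.sh` is then pointed at the copied 
base path.

```python
import shlex


def build_copy_command(src_base_path, dst_base_path):
    """Command that copies the whole table prefix to the reporting bucket.

    The copy includes the .hoodie metadata folder, which Hive sync reads
    from the base path it is given.
    """
    return ["aws", "s3", "sync", src_base_path, dst_base_path]


def build_hive_sync_command(base_path, database, table, jdbc_url,
                            user, password, partition_field):
    """Command line for run_sync_tool.sh against the given base path."""
    return [
        "./run_sync_tool.sh",
        "--jdbc-url", jdbc_url,
        "--user", user,
        "--pass", password,
        "--base-path", base_path,
        "--database", database,
        "--table", table,
        "--partitioned-by", partition_field,
    ]


if __name__ == "__main__":
    # All values below are hypothetical placeholders.
    landing = "s3://landing-bucket/hudi/my_table"
    reporting = "s3://reporting-bucket/hudi/my_table"

    copy_cmd = build_copy_command(landing, reporting)
    sync_cmd = build_hive_sync_command(
        base_path=reporting,
        database="reporting_db",
        table="my_table",
        jdbc_url="jdbc:hive2://localhost:10000",
        user="hive",
        password="hive",
        partition_field="dt",
    )

    # Print the commands instead of running them, so they can be reviewed.
    print(shlex.join(copy_cmd))
    print(shlex.join(sync_cmd))
```

   The commands are printed rather than executed so they can be checked 
before anything touches the buckets or the metastore.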


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]
