lw309637554 commented on pull request #2263:
URL: https://github.com/apache/hudi/pull/2263#issuecomment-748632649


   > > This is great, thanks @satishkotha !
   > > I have completed a first pass. Don't have major concerns. Maybe we can 
work through these initial comments, as I complete the remainder.
   > > On follow ups
   > > 
   > > * IIUC inline clustering should work as-is from datasource/deltastreamer 
paths with this change, by passing necessary configs. We should create two JIRAs, 
one each for support for async clustering via datasource and deltastreamer?
   > > * Can you share how much testing on a production environment has been 
done for this?
   > 
   > @vinothchandar
   > 
   > * Looks like @lw309637554 created 
https://issues.apache.org/jira/browse/HUDI-1399 for async clustering and 
there's some good discussion there.
   > * I did some basic testing in a staging environment, mostly with inline 
clustering. I have another PR for test suite changes to validate async 
clustering via calling writeClient APIs. I hope to get more production-scale 
tests over the next week.
   
   @satishkotha I have added inline clustering unit tests for the Spark datasource 
and deltastreamer in my local branch. Once this PR is merged, I can open a pull 
request. Then, based on the unit tests, I will land the independent clustering 
Spark job, followed by async clustering via datasource and via deltastreamer. cc 
@vinothchandar


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

