nsivabalan commented on issue #7734:
URL: https://github.com/apache/hudi/issues/7734#issuecomment-1404081333

   @VitoMakarevich: Can we sync up on the general Slack channel in the Apache 
Hudi workspace? I would like to get more clarity on the scenario. 
   If my understanding is right, there are two Hudi tables in play here: 
TableA and TableB. TableB is populated by running a snapshot query on TableA 
and applying a filter on top (from what you described, you are not leveraging 
incremental queries; curious to understand why, though). In this pipeline, 
you are seeing an uptick in GET and HEAD calls with 0.12.1 compared to 
0.11.0 (without any metadata table). Do you happen to have separate dashboards 
for requests to TableA vs TableB? 
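   
   For reference, here is a minimal sketch of the read options an incremental 
query on TableA would use instead of a snapshot query. The helper name 
`incremental_read_options` and the instant time in the usage comment are 
hypothetical and only for illustration; the option keys are the Hudi 
datasource read options as I understand them, so please check them against 
the docs for your version. 

```python
def incremental_read_options(begin_instant, end_instant=None):
    """Build Hudi datasource read options for an incremental query.

    begin_instant / end_instant are commit instant times as strings
    (hypothetical values here; take real ones from the table's timeline).
    """
    opts = {
        # Switch the read from the default snapshot view to incremental.
        "hoodie.datasource.query.type": "incremental",
        # Only changes committed after this instant are returned.
        "hoodie.datasource.read.begin.instanttime": begin_instant,
    }
    if end_instant is not None:
        # Optional upper bound on the commit range.
        opts["hoodie.datasource.read.end.instanttime"] = end_instant
    return opts


# Usage sketch, assuming an existing SparkSession named `spark`:
# df = (spark.read.format("hudi")
#       .options(**incremental_read_options("20230101000000"))
#       .load("<TableA base path>"))
```

   With this, the filter for TableB would run only over records changed since 
the given instant rather than over the full snapshot of TableA. 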
   
   You also have commits and clean going on, and you are not sure which one 
is playing a part here. Can you disable clean for a few commits and check 
whether you see a similar trend? 
   
   The commit of interest has 4GB of data ingested: 55k records (~80k record 
size), of which 38k are updates and the rest are inserts, i.e. ~70% updates. 
   
   Let us know how we can sync up via Slack to investigate this further. 
   

