kbuci commented on issue #17844:
URL: https://github.com/apache/hudi/issues/17844#issuecomment-3820542082

   Thanks for the discussion
   
   @vinothchandar 
   
   > empty clean for now, without any additional format changes or complexity 
(stop gap)
   
   As requested, here is a PR for supporting an empty clean plan: https://github.com/apache/hudi/pull/11605/changes . Surya created this back in HUDI 0.x, but I can rebase it onto the latest master if that helps.
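
   To make the idea concrete, here is a minimal sketch of what I mean by an "empty clean" (the interfaces and method names below are hypothetical stand-ins, not the actual code in that PR):

```java
import java.util.List;
import java.util.Map;

/**
 * Illustrative sketch only, not the PR's code: even when the planner finds
 * nothing to delete, still write a completed clean instant carrying the usual
 * metadata (earliest commit to retain, etc.), so archival is not left waiting
 * for a "real" clean to show up on the timeline.
 */
public class EmptyCleanSketch {

  // Hypothetical planner output: partition -> file paths to delete.
  interface CleanPlanner {
    Map<String, List<String>> planFilesToClean();
    String resolveEarliestInstantToRetain();
  }

  // Hypothetical executor hooks; the real change wires this kind of logic
  // through CleanPlanActionExecutor / CleanActionExecutor.
  interface CleanExecutor {
    void executeClean(Map<String, List<String>> filesToDelete, String earliestInstantToRetain);
    void completeEmptyClean(String earliestInstantToRetain);
  }

  static void runClean(CleanPlanner planner, CleanExecutor executor) {
    Map<String, List<String>> filesToDelete = planner.planFilesToClean();
    String ectr = planner.resolveEarliestInstantToRetain();
    if (filesToDelete.isEmpty()) {
      // No files to delete, but still complete a clean so the latest ECTR is
      // recorded on the timeline.
      executor.completeEmptyClean(ectr);
    } else {
      executor.executeClean(filesToDelete, ectr);
    }
  }
}
```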
   
   > To ponder; if we can maintain the table service state also in the MDT..

   For our use cases we want to support being able to use "restore" to restore a dataset to an earlier instant on the timeline. I believe this approach would satisfy that requirement, since the MDT would also be restored to said instant. But thinking out loud:
   - Even if this lets us do away with the empty clean, we would still need some "operation" that doesn't actually clean anything but writes a new ECTR to the MDT, plus some configuration/logic to determine who (clean? archival?) attempts this operation and how often. So even though this saves us from having to block archival on the ECTR, it looks to me like we would still have to orchestrate some operation at a regular cadence, the same way we do for the empty clean (see the rough sketch after this list).
   - In our org's 0.14 HUDI workloads, we sometimes have to delete and re-create the MDT to mitigate bugs/issues that emerge (since re-creating the MDT is much cheaper for us than re-creating the entire dataset/data files). If we want to keep this flexibility/mitigation option available in 1.2 without forcing our workloads to do a full-scan clean after such an MDT re-bootstrap event, we might want to consider also duplicating the latest ECTR somewhere outside the proposed MDT partition (such as some `.hoodie` metafile).
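
   As a rough sketch of the orchestration concern in the first bullet (every name below is hypothetical; I'm not aware of any such API today), something would still have to decide, on some cadence, when to refresh the ECTR even though nothing is being cleaned:

```java
import java.util.Optional;

/**
 * Hypothetical sketch of a "no-op" ECTR refresh: nothing is deleted, we only
 * record a newer earliest-commit-to-retain in the MDT so archival can keep
 * moving. Something (clean? archival? a new service?) still has to run this
 * on a regular cadence.
 */
public class EctrRefreshSketch {

  interface Timeline {
    Optional<String> latestRecordedEctr();          // last ECTR written to the proposed MDT partition
    int completedCommitsSince(String instantTime);  // commits that landed after that instant
  }

  interface EctrWriter {
    void writeEctrToMetadataTable(String earliestInstantToRetain);
  }

  /** Refresh the ECTR if at least maxCommitsBetweenRefreshes commits landed since the last one. */
  static void maybeRefreshEctr(Timeline timeline, EctrWriter writer,
                               String currentEctr, int maxCommitsBetweenRefreshes) {
    boolean stale = timeline.latestRecordedEctr()
        .map(last -> timeline.completedCommitsSince(last) >= maxCommitsBetweenRefreshes)
        .orElse(true);
    if (stale) {
      writer.writeEctrToMetadataTable(currentEctr);
    }
  }
}
```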
   
   > Orthogonally, love to see some numbers at which we see these issues, i.e. number of files and such..
   
   Sure, let me try to find some logs/numbers on our end. But I should mention upfront that, in addition to using HUDI 0.14, we:
   - Do not use the timeline server. This is because back in 0.10 we noticed issues (memory pressure, IIRC) for workloads that process multiple datasets in a single "ingestion" Spark job, and we have not yet internally validated whether this is still an issue on 0.14/1.x.
   - Use the in-memory filesystem view. We don't use the spillable filesystem view due to a serialization-related correctness issue we noticed: https://github.com/apache/hudi/issues/17957 (see the config sketch below).
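
   For completeness, the setup above corresponds to writer configs roughly like the following (key names as I recall them from our 0.14 deployment; please double-check them against your Hudi version):

```java
import java.util.Properties;

// Roughly the configs behind the setup described above. The key names are
// from memory (0.14) and may differ across versions.
public class WriterConfigSketch {
  public static Properties cleanRelatedProps() {
    Properties props = new Properties();
    props.setProperty("hoodie.metadata.enable", "true");         // we do use the MDT
    props.setProperty("hoodie.embed.timeline.server", "false");  // no embedded timeline server
    props.setProperty("hoodie.filesystem.view.type", "MEMORY");  // in-memory FS view, not SPILLABLE_DISK
    return props;
  }
}
```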
   
   
   
   @danny0405 
   > in this commit, the fs view request from the clean planner goes through a set of new stateless APIs (that basically do not cache the file groups on the fs view), so as to mitigate the memory pressure on the driver.
   
   Thanks for sharing this context. We actually backported some recent clean-related changes like https://github.com/apache/hudi/pull/10928 to our 0.14 build to try to mitigate these issues. I have not done any proper profiling, but my initial hunch is that:
   - With the `shouldUseBatchLookup` option in CleanPlanActionExecutor, partitions from every batch keep getting added to the FS view without "unloading" partitions from prior batches, so driver memory usage continually increases. And then when the view is copied over to the Spark tasks (which look up all files to clean in a given partition), that adds more memory usage as well (see the sketch below).
   (Note that, as mentioned above, although we use the MDT we do not use the timeline server or the spillable FS view.)
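
   To make that hunch concrete, the pattern I suspect looks roughly like the sketch below (illustrative only; the interface and method names are hypothetical stand-ins, not the actual CleanPlanActionExecutor code):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/**
 * Hypothetical illustration of the suspected memory growth: with batch lookup
 * enabled, each batch of partitions gets loaded into the same in-memory FS
 * view and prior batches are never evicted, so driver memory grows with the
 * total number of partitions; the view is then also shipped to Spark tasks.
 */
public class BatchLookupMemorySketch {

  interface FileSystemViewLike {
    void loadPartitions(List<String> partitions);    // caches file groups for these partitions
    void unloadPartitions(List<String> partitions);  // hypothetical eviction hook
    Map<String, List<String>> filesEligibleForClean(String partition);
  }

  static void planWithBatches(FileSystemViewLike fsView, List<List<String>> partitionBatches) {
    List<String> previousBatch = new ArrayList<>();
    for (List<String> batch : partitionBatches) {
      // Suspected behavior: every batch's file groups get cached in the view ...
      fsView.loadPartitions(batch);
      // ... but nothing evicts the batches already processed. A possible
      // mitigation (hypothetical hook) would be something like:
      // fsView.unloadPartitions(previousBatch);
      for (String partition : batch) {
        fsView.filesEligibleForClean(partition);  // plan deletions for this partition
      }
      previousBatch = batch;
    }
  }
}
```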
   

