geruh opened a new pull request, #14341:
URL: https://github.com/apache/iceberg/pull/14341

   At AWS we have needed repair-manifests functionality to fix some corrupted 
Iceberg tables, and have been relying on a workaround based on the original PR. 
I reached out to @amogh-jahagirdar about getting this merged, and he let me 
pick up the work on `RepairManifests` from #10784, which was originally started 
by @tabmatfournier in #10445.
   
   You can check out the original discussions in #10784. 
   
   In this PR, I've updated `RepairManifests` so that the result returns counts 
instead of full paths for duplicated, recovered, and removed files. This avoids 
extremely large lists in the result when a large portion of a table's manifests 
is affected, for instance when S3 retention policies or deleted partition paths 
have removed files from storage. In most cases, I think it is safer to report 
per-operation counts than to list every file. The updated result still 
includes the new set of manifest lists, so users can inspect the changes if 
needed.
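
   To illustrate the shape of a count-based result, here is a minimal sketch. 
It is not the actual API in this PR: the class `RepairResultSketch`, the type 
`RepairSummary`, its field names, and the `summarize` helper are all 
hypothetical, and the example paths are made up. It only shows the idea of 
collapsing per-file path lists into counts while keeping the new manifest 
lists available for inspection.

```java
import java.util.List;

public class RepairResultSketch {

  // Hypothetical result type: counts per operation instead of full file paths.
  static final class RepairSummary {
    final long duplicateFiles;
    final long recoveredFiles;
    final long removedFiles;
    // Assumed to be paths of the newly written manifest lists.
    final List<String> manifestLists;

    RepairSummary(long duplicateFiles, long recoveredFiles, long removedFiles,
                  List<String> manifestLists) {
      this.duplicateFiles = duplicateFiles;
      this.recoveredFiles = recoveredFiles;
      this.removedFiles = removedFiles;
      this.manifestLists = List.copyOf(manifestLists);
    }
  }

  // Hypothetical helper: reduce the per-file lists to their sizes.
  static RepairSummary summarize(List<String> duplicated, List<String> recovered,
                                 List<String> removed, List<String> manifestLists) {
    return new RepairSummary(
        duplicated.size(), recovered.size(), removed.size(), manifestLists);
  }

  public static void main(String[] args) {
    RepairSummary summary = summarize(
        List.of("s3://bucket/data/a.parquet"),
        List.of(),
        List.of("s3://bucket/data/b.parquet", "s3://bucket/data/c.parquet"),
        List.of("s3://bucket/metadata/snap-1.avro"));
    System.out.println("duplicates=" + summary.duplicateFiles
        + " recovered=" + summary.recoveredFiles
        + " removed=" + summary.removedFiles
        + " manifestLists=" + summary.manifestLists);
  }
}
```

   With a result like this, the size of the summary stays constant no matter 
how many files were affected; only the manifest-list portion grows.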


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

