A number of discussions took place at ApacheCon. In the spirit of the Apache Way, I am taking the conversation online to share it with the larger community and to capture requirements. Credit to Owen, who started this discussion.
There are a number of scenarios where users want to partially rewrite file blocks, and it would make sense to create a file system API to make these operations efficient.

1. Apache Iceberg or other evolvable table formats. These table formats need to update the table schema. The underlying files are rewritten, but only a subset of blocks changes. It would be much more efficient if a new file could be composed from some of the existing file blocks.

2. GDPR compliance ("the right to erasure"). Files must be rewritten to remove a person's data on request. Again, composing the new file from existing blocks would be efficient because only a small set of file blocks is updated.

3. In-place erasure coding conversion. I had a proposal to support atomically rewriting replicated files into erasure-coded files. This can be the building block for auto-tiering.

Thoughts? What would be a good FS interface to support these requirements?

For Ozone folks, Ritesh opened a jira: HDDS-7297 <https://issues.apache.org/jira/browse/HDDS-7297>, but I figured a larger conversation should happen so that we can take other FS implementations into consideration.

Thanks,
Weichiu