[
https://issues.apache.org/jira/browse/HDDS-7350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17632420#comment-17632420
]
Stephen O'Donnell commented on HDDS-7350:
-----------------------------------------
{quote}
In other words, if compression and/or encryption is enabled, each chunk will
contain compressed and/or encrypted data. The processing stream is created per
each served key, not for each chunk.
For example, we want to upload a 1GB file into Ozone, the transparent
compression reduces the total file size to 800MB, which is further split into
4MB chunks and transferred to the server.
{quote}
I think this approach will limit the usefulness of the feature, as it will
prevent someone from being able to seek within the file. E.g., if I have a 1GB
file and I want to seek to 500MB, how can I get there without reading all of
the data until I reach 500MB? With a non-compressed file, we can go to a block,
then a chunk offset, and then seek within the chunk. You should look at how
ZFS does transparent compression - it allows seeking within a file just as if
it was not compressed. I've got a feeling we will need to have a compression
stream at the chunk level.
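
To illustrate the seek argument, here is a minimal sketch of chunk-level
compression. It is not Ozone code: the function names, the use of zlib, and the
rule that every chunk holds exactly 4MB of uncompressed data are all
assumptions made for the example. Because each chunk is compressed on its own,
an uncompressed offset maps directly to a chunk index, and only the chunks that
overlap the requested range need to be decompressed.

```python
import zlib

# Assumption: each chunk covers a fixed 4MB range of *uncompressed* data.
CHUNK_SIZE = 4 * 1024 * 1024


def compress_per_chunk(data, chunk_size=CHUNK_SIZE):
    """Compress each fixed-size slice of the uncompressed data independently."""
    return [
        zlib.compress(data[i:i + chunk_size])
        for i in range(0, len(data), chunk_size)
    ]


def read_at(chunks, offset, length, chunk_size=CHUNK_SIZE):
    """Read `length` bytes starting at an uncompressed `offset`.

    Only the chunks overlapping the requested range are decompressed.
    """
    out = bytearray()
    index = offset // chunk_size   # chunk that contains the offset
    skip = offset % chunk_size     # position inside that chunk
    while length > 0 and index < len(chunks):
        plain = zlib.decompress(chunks[index])
        piece = plain[skip:skip + length]
        out += piece
        length -= len(piece)
        index += 1
        skip = 0                   # later chunks are read from their start
    return bytes(out)
```

Under these assumptions, seeking to 500MB only touches chunk 500MB // 4MB = 125.
With a single compression stream per key, the same read would have to
decompress everything from the start of the key. The trade-off is that the
stored chunks become variable in size, so their compressed lengths would have
to be tracked somewhere.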
We also need to think through how compression would work with EC. Perhaps it
does not need to be implemented in the first phase, but we need to think it out
so there is not some limitation of the design which would make it very
difficult to add later.
> Ozone Transparent Data Compression Support
> ------------------------------------------
>
> Key: HDDS-7350
> URL: https://issues.apache.org/jira/browse/HDDS-7350
> Project: Apache Ozone
> Issue Type: New Feature
> Reporter: Kirill Sizov
> Assignee: Kirill Sizov
> Priority: Major
> Attachments: compression_ozone - 2022.10.1.pdf,
> compression_ozone-2022.10.2.pdf, compression_ozone-2022.11.1.pdf,
> compression_ozone-2022.11.2.pdf
>
>
> Currently Ozone stores uncompressed data, which in the case of text or a
> similar format may benefit from being compressed. This may save a significant
> amount of space and hence money.
> See the attached document for the design.