On 2020-07-10 at 19:54 +0100, Nikolaus Rath wrote:
> On Jul 10 2020, Daniel Jagszent <[email protected]> wrote:
> > > Ah yes, compression and probably encryption will indeed preclude
> > > any sort of partial block caching. An implementation will have to
> > > be limited to plain uncompressed blocks, which is okay for my
> > > use-case though (borg provides its own encryption and compression
> > > anyway). [...]
> > Compression and encryption are integral parts of S3QL and I would
> > argue that disabling them is only an edge case.
> 
> If I were to write S3QL from scratch, I would probably not support
> this at all, right. However, since the feature is present, I think we
> ought to consider it fully supported ("edge case" makes it sound as
> if this isn't the case).
> 
> 
> > I might be wrong but I think Nikolaus (maintainer of S3QL) will not
> > accept such a huge change into S3QL that is only beneficial for an
> > edge case.
> 
> Never say never, but the bar is certainly high here. I think there
> are more promising avenues to explore, e.g. storing the
> compressed/uncompressed offset mapping to make partial retrieval
> work for all cases.

Hmm, I'm not sure how that's supposed to work.

AFAICS, s3ql uses "solid compression", meaning that the entire block is
compressed at once. It is generally impossible to extract a specific
range of uncompressed data without decompressing the whole stream.[1]
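To make the "solid compression" point concrete, here is a minimal sketch
(assuming zlib, one of the codecs S3QL supports; `read_range` is a
hypothetical helper, not anything from the S3QL codebase): the only way
to serve a read at some uncompressed offset is to decompress and discard
everything before it.

```python
import zlib

data = bytes(range(256)) * 4096          # one 1 MiB "block"
blob = zlib.compress(data)               # solid: compressed as one stream

def read_range(blob: bytes, offset: int, length: int) -> bytes:
    """Return uncompressed bytes [offset, offset+length).

    A solid zlib stream has no seek points, so every byte before
    `offset` must be decompressed and thrown away first.
    """
    d = zlib.decompressobj()
    out = bytearray()
    for i in range(0, len(blob), 4096):
        out += d.decompress(blob[i:i + 4096])
        if len(out) >= offset + length:
            break                        # we can stop early, but never skip
    return bytes(out[offset:offset + length])

assert read_range(blob, 500_000, 16) == data[500_000:500_016]
```

Note that we can stop decompressing once the requested range is covered,
but the work before the offset can never be avoided; a read near the end
of a block costs nearly a full decompression.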

Encryption does not pose this kind of existential problem: AES is used
in CTR mode, which in principle permits random-access decryption. But
the crypto library in use, python-cryptography, doesn't seem to permit
this sort of trickery.
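For contrast, here is why CTR mode allows random access in principle:
keystream block i depends only on the key and the counter value i, so
any ciphertext block can be decrypted in isolation. A toy sketch, using
a SHA-256-based PRF as a stand-in for AES; all names here are made up
and come from neither S3QL nor python-cryptography:

```python
import hashlib

BLOCK = 32  # toy block size (SHA-256 digest length)

def keystream_block(key: bytes, counter: int) -> bytes:
    # Toy PRF standing in for AES: keystream block i depends only
    # on (key, i), which is what makes random access possible.
    return hashlib.sha256(key + counter.to_bytes(16, 'big')).digest()

def ctr_xor(key: bytes, data: bytes, start_block: int = 0) -> bytes:
    """XOR `data` with the keystream, starting at block `start_block`.

    Encryption and decryption are the same operation; passing a
    nonzero start_block lets us begin mid-stream.
    """
    out = bytearray()
    for i in range(0, len(data), BLOCK):
        ks = keystream_block(key, start_block + i // BLOCK)
        chunk = data[i:i + BLOCK]
        out += bytes(a ^ b for a, b in zip(chunk, ks))
    return bytes(out)

key = b'k' * 16
pt = bytes(range(256))           # 8 toy blocks
ct = ctr_xor(key, pt)
# Decrypt blocks 2.. without touching any earlier ciphertext:
assert ctr_xor(key, ct[64:], start_block=2) == pt[64:]
```

The real obstacle is thus not the cipher but the API: to do this with
AES-CTR one must be able to start a decryption context at an arbitrary
counter value.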

[1]: This could be solved by converting the compression layer into a
block-based one, but that would naturally break compatibility (i.e. we
would have to introduce a new set of compression algorithms, that is,
yet another corner case) and would require either compromising on the
block size, introducing complex indirection (such as storing a
compressed-to-uncompressed offset map along with the object itself), or
blowing the metadata completely out of proportion (recording an offset
mapping for every 128 KiB of data). Regardless of the implementation
plan, this would also compromise compression efficiency. Completely not
worth it, IMO.
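For illustration, a hypothetical block-based scheme of the kind
described above, with an offset map stored alongside the object. The
names, the 128 KiB chunk size, and the use of zlib are all assumptions
for the sketch, not part of the S3QL format:

```python
import zlib

CHUNK = 128 * 1024  # hypothetical chunk size

def compress_chunked(data: bytes):
    """Compress each CHUNK independently and record where each
    compressed chunk starts, so any chunk can be decoded alone."""
    parts, offsets, pos = [], [], 0
    for i in range(0, len(data), CHUNK):
        c = zlib.compress(data[i:i + CHUNK])
        offsets.append(pos)
        parts.append(c)
        pos += len(c)
    return b''.join(parts), offsets

def read_range(blob: bytes, offsets: list, offset: int, length: int) -> bytes:
    """Decompress only the chunks overlapping [offset, offset+length)."""
    first = offset // CHUNK
    last = (offset + length - 1) // CHUNK
    out = bytearray()
    for i in range(first, last + 1):
        start = offsets[i]
        stop = offsets[i + 1] if i + 1 < len(offsets) else len(blob)
        out += zlib.decompress(blob[start:stop])
    skip = offset - first * CHUNK
    return bytes(out[skip:skip + length])

data = b'hello world ' * 20_000            # ~240 KiB, two chunks
blob, offsets = compress_chunked(data)
assert read_range(blob, offsets, 200_000, 50) == data[200_000:200_050]
```

The efficiency cost predicted above shows up directly here: each chunk
is compressed without reference to its neighbours, so the combined blob
is larger than a solid stream over the same data, and the offset map is
extra metadata that has to live somewhere.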

-- 
Ivan Shapovalov / intelfx /
