On 2020-07-11 at 12:13 +0100, Nikolaus Rath wrote:
> On Jul 11 2020, Ivan Shapovalov <[email protected]> wrote:
> > On 2020-07-10 at 19:54 +0100, Nikolaus Rath wrote:
> > > On Jul 10 2020, Daniel Jagszent <[email protected]> wrote:
> > > > > Ah yes, compression and probably encryption will indeed preclude
> > > > > any sort of partial block caching. An implementation will have to
> > > > > be limited to plain uncompressed blocks, which is okay for my
> > > > > use-case though (borg provides its own encryption and
> > > > > compression anyway). [...]
> > > >
> > > > Compression and encryption are integral parts of S3QL and I would
> > > > argue that disabling them is only an edge case.
> > >
> > > If I were to write S3QL from scratch, I would probably not support
> > > this at all, right. However, since the feature is present, I think we
> > > ought to consider it fully supported ("edge case" makes it sound as
> > > if this isn't the case).
> > >
> > > > I might be wrong but I think Nikolaus (maintainer of S3QL) will not
> > > > accept such a huge change into S3QL that is only beneficial for an
> > > > edge case.
> > >
> > > Never say never, but the bar is certainly high here. I think there are
> > > more promising avenues to explore - e.g. storing the
> > > compressed/uncompressed offset mapping to make partial retrieval work
> > > for all cases.
> >
> > Hmm, I'm not sure how that's supposed to work.
> >
> > AFAICS, s3ql uses "solid compression", meaning that the entire block is
> > compressed at once. It is generally impossible to extract a specific
> > range of uncompressed data without decompressing the whole stream.[1]
>
> At least bzip2 always works in blocks; IIRC blocks are at most 900 kB
> (at the highest compression setting). I wouldn't be surprised if the
> same holds for LZMA.
True, I forgot that bzip2 is inherently block-based. Not sure about LZMA
or gzip, but there is still a significant obstacle: how would you
extract this information from the compression libraries?

> We could track the size of each compressed block, and store it as part
> of the metadata of the object (so it doesn't blow up the SQLite table).
>
> > Encryption does not pose this kind of existential problem — AES is
> > used in CTR mode, which theoretically permits random-access
> > decryption — but the crypto library in use, python-cryptography,
> > doesn't seem to permit this sort of trickery.
>
> Worst case you can feed X bytes of garbage into the decrypter and then
> start with the partial block - with CTR you should get the right
> output.

Yes, that could probably work. Still feels like a grand hack.

-- 
Ivan Shapovalov / intelfx /
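For illustration, the "track the size of each compressed block" idea could look roughly like the sketch below: the block is compressed in fixed-size chunks, and the list of compressed chunk lengths is what would go into the object metadata. All names here (`SUB_BLOCK`, `compress_block`, `read_range`) are hypothetical, not S3QL internals, and the 64 KiB chunk size is an arbitrary example.

```python
import bz2

SUB_BLOCK = 64 * 1024  # uncompressed chunk size; illustrative choice, not S3QL's


def compress_block(data: bytes):
    """Compress `data` in SUB_BLOCK-sized chunks.

    Returns the concatenated compressed bytes plus the list of compressed
    chunk lengths -- the part that would be stored as object metadata.
    """
    parts = [bz2.compress(data[i:i + SUB_BLOCK])
             for i in range(0, len(data), SUB_BLOCK)]
    return b"".join(parts), [len(p) for p in parts]


def read_range(blob: bytes, sizes: list, start: int, length: int) -> bytes:
    """Decompress only the chunks overlapping [start, start + length)."""
    first = start // SUB_BLOCK
    last = (start + length - 1) // SUB_BLOCK
    out, off = [], 0
    for i, size in enumerate(sizes):
        if first <= i <= last:
            out.append(bz2.decompress(blob[off:off + size]))
        off += size
    data = b"".join(out)
    skip = start - first * SUB_BLOCK
    return data[skip:skip + length]
```

The trade-off is that per-chunk compression loses some ratio compared to solid compression, in exchange for being able to fetch and decompress only the chunks a read actually touches.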
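The "feed garbage into the decrypter" trick can be sketched with python-cryptography's CTR mode. In CTR, decryption just XORs the ciphertext with a keystream, so feeding `offset` dummy bytes advances the keystream to the right position before the real partial ciphertext goes in. Key and nonce handling here is purely illustrative, not how S3QL manages keys:

```python
import os
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

key = os.urandom(32)    # AES-256 key (example only)
nonce = os.urandom(16)  # full 16-byte initial counter block
plaintext = os.urandom(100_000)

encryptor = Cipher(algorithms.AES(key), modes.CTR(nonce)).encryptor()
ciphertext = encryptor.update(plaintext) + encryptor.finalize()

# Random-access decryption: advance the keystream past `offset` bytes by
# feeding that many garbage bytes, then decrypt the real tail.
offset = 12_345
decryptor = Cipher(algorithms.AES(key), modes.CTR(nonce)).decryptor()
decryptor.update(b"\x00" * offset)  # output is raw keystream; discard it
tail = decryptor.update(ciphertext[offset:]) + decryptor.finalize()

assert tail == plaintext[offset:]
```

A cleaner variant, avoiding the throwaway work, would be to add `offset // 16` to the 16-byte counter block and start a fresh decryptor at the nearest AES block boundary, discarding only the first `offset % 16` output bytes.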