clintropolis commented on PR #18982: URL: https://github.com/apache/druid/pull/18982#issuecomment-3845684399
Quite a lot of the segment already uses lz4 by default, so I'm curious how effective running the whole thing through lz4 again would be. Did you do any experiments to compare against just not 'externally' compressing at all in deep storage?

S3 support for honoring `druid.storage.zip` was added in #18544, and we've been using it in some of our clusters alongside the virtual storage functionality introduced in #18176, since not having to decompress segments to load them speeds things up quite a lot. It costs some extra space, but has been worth it for our use case.

Also, the new 'V10' segment format introduced in #18880 was built around some future ideas I have about improving the virtual storage functionality: first download only the V10 metadata, then partially download only the parts of the segment that are needed to take part in a query. That will require the segment not be 'externally' compressed in deep storage (columns inside the segment can obviously still be compressed).

Fwiw, I'm not necessarily opposed to making this 'external' segment compression more configurable, but it is certainly a bit tedious with the current interfaces, since it needs to be handled separately by each segment pusher/puller implementation.
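
For the question about running already-compressed data through a compressor a second time, a rough sanity check could look like the sketch below. It is only an illustration under stated assumptions: it uses Python's standard-library `zlib` as a stand-in for lz4 (which is not in the standard library), and the `column_like` payload is synthetic, hypothetical data rather than a real Druid segment. Real numbers would have to come from actual segments.

```python
import random
import zlib


def recompression_ratios(payload: bytes) -> tuple[float, float]:
    """Return (first-pass, second-pass) ratios, each as output size / input size."""
    once = zlib.compress(payload)   # analogous to per-column compression
    twice = zlib.compress(once)     # analogous to 'external' compression on top
    return len(once) / len(payload), len(twice) / len(once)


# Hypothetical stand-in for column data: a small vocabulary in random order,
# so the first pass has plenty of redundancy to remove.
rng = random.Random(0)
vocab = [b"US", b"DE", b"FR", b"JP", b"android", b"ios", b"web", b"desktop"]
column_like = b",".join(rng.choice(vocab) for _ in range(50_000))

first, second = recompression_ratios(column_like)
print(f"first pass ratio:  {first:.3f}")
print(f"second pass ratio: {second:.3f}")
```

A second-pass ratio close to 1.0 would suggest the outer compression mostly costs CPU on push and pull without saving much space. Only the compressed portions of a segment behave like this, so it is not a substitute for measuring real segments.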
