Hi

I'm looking for advice on a "to bundle or not to bundle" question for a PR I'm working on, which enables reading and writing with all of the compression codecs standardised for Parquet.  That amounts to adding support for LZO, LZ4, Brotli and Zstandard.

Apart from some minor code changes in Drill itself, users will obviously also need an implementation of each codec, and we don't currently bundle all of them.  Where native codec libraries are involved I guess platform specifics become a consideration, but let's gloss over that for now.

In the case of LZO I believe that a GPL license applies, so I don't think it can ever be bundled (but we can still enable it and provide instructions for users to add it to their installations themselves).  In the case of Brotli there is an Apache-licensed implementation that we can bundle if we don't mind adding a 750 KB JAR file.

So my question is: should I bundle all of the codecs that I can, making things work out of the box but adding to the size of the distributable?  Or should I put in documentation and error messages that instruct users to get the codecs themselves instead?
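
As a rough illustration of the second option, a check along the following lines could detect a missing codec and tell the user what to do about it.  This is only a sketch under assumptions: the CodecAvailability class, its method names and the message wording are all hypothetical, and the codec class name passed in is a placeholder rather than the real class of any codec library or anything taken from the PR.

```java
import java.util.Optional;

// Hypothetical helper, not part of Drill or Parquet.
final class CodecAvailability {

    /** Returns true if the named class can be loaded from the current classpath. */
    static boolean isOnClasspath(String className) {
        try {
            // initialize=false: only check that the class can be found,
            // without running its static initialisers.
            Class.forName(className, false, CodecAvailability.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /**
     * Returns an instructive message if the codec implementation is missing,
     * or an empty Optional if the class is available.
     */
    static Optional<String> missingCodecMessage(String codecName, String className) {
        if (isOnClasspath(className)) {
            return Optional.empty();
        }
        return Optional.of(String.format(
            "Parquet codec %s is not available: class %s was not found on the classpath. "
                + "Add a JAR that provides it to your installation and restart.",
            codecName, className));
    }
}
```

A caller would run this check before opening a file that uses the codec and surface the message as a user-facing error instead of a bare ClassNotFoundException.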

Thanks
James
