(1) does not seem to be an issue because the library is Apache licensed.
(2) and (3) are the big issues, because it requires a huge "provided" uber
jar that essentially leaks Hadoop classes into the core SDK [1], so it is
definitely concerning.

During the PR that added ZStandard support we discussed creating some sort
of registrar for compression algorithms [2], but we decided not to go ahead
because we could achieve that for the zstd case via the optional
dependencies of commons-compress. Maybe it is time to reconsider whether
such a mechanism is worthwhile, for example for users who do not mind the
Hadoop leakage in exchange for being able to use LZO.
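
To make the registrar idea a bit more concrete, here is a minimal sketch of
what it could look like. Everything in it is hypothetical: the names
CompressionRegistry and CompressionProvider and their methods do not exist in
Beam, and this is not what was proposed in [2]. It only assumes the standard
java.util.ServiceLoader mechanism, so that the core SDK carries no LZO or
Hadoop dependency and an optional module contributes the provider.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;
import java.util.ServiceLoader;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical registry of compression providers; not an existing Beam class. */
final class CompressionRegistry {

  /** Hypothetical interface that an optional module (e.g. an LZO one) would implement. */
  public interface CompressionProvider {
    /** Name under which the algorithm is registered, e.g. "lzo". */
    String name();

    /** Wraps a raw stream in a decompressing stream. */
    InputStream wrapInput(InputStream raw) throws IOException;
  }

  private static final Map<String, CompressionProvider> PROVIDERS = new ConcurrentHashMap<>();

  private CompressionRegistry() {}

  /** Registers a provider explicitly; a later registration with the same name replaces it. */
  static void register(CompressionProvider provider) {
    PROVIDERS.put(key(provider.name()), provider);
  }

  /**
   * Registers every provider declared on the classpath through a
   * META-INF/services entry and returns how many were found.
   */
  static int loadFromClasspath() {
    int found = 0;
    for (CompressionProvider provider : ServiceLoader.load(CompressionProvider.class)) {
      register(provider);
      found++;
    }
    return found;
  }

  /** Looks up a provider by name, ignoring case; empty if no module supplied one. */
  static Optional<CompressionProvider> lookup(String name) {
    return Optional.ofNullable(PROVIDERS.get(key(name)));
  }

  private static String key(String name) {
    return name.toLowerCase(Locale.ROOT);
  }
}
```

With something like this, only users who add the optional LZO module to their
classpath would pull in aircompressor and its Hadoop jar; everyone else would
get an empty lookup for "lzo" and could fail with a clear error message.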

Refs.
[1] https://mvnrepository.com/artifact/io.airlift/aircompressor/0.16
[2] https://issues.apache.org/jira/browse/BEAM-6422




On Tue, Dec 3, 2019 at 7:01 PM Robert Bradshaw <[email protected]> wrote:

> Is there a way to wrap this up as an optional dependency with multiple
> possible providers, if there's no good library satisfying all of the
> conditions (in particular (1))?
>
> On Tue, Dec 3, 2019 at 9:47 AM Luke Cwik <[email protected]> wrote:
> >
> > I was hoping that someone in the community would provide some
> > alternatives since there are quite a few implementations.
> >
> > On Tue, Dec 3, 2019 at 8:20 AM Amogh Tiwari <[email protected]> wrote:
> >>
> >> Hi Luke,
> >>
> >> I agree with your thoughts and observations. But airlift:aircompressor
> >> is the only implementation of LZO in pure Java. That straight away solves
> >> #5.
> >> The other implementations that I found either have licensing issues
> >> (since LZO itself is licensed under the GNU GPL) or are implemented using
> >> .c, .h and JNI (which again makes them dependent on the OS). Please refer
> >> to these: twitter/hadoop-lzo and shevek/lzo-java.
> >> These were the main reasons why we based this on airlift:aircompressor.
> >>
> >> Thanks and Regards,
> >> Amogh
> >>
> >>
> >>
> >> On Tue, Dec 3, 2019 at 2:59 AM Luke Cwik <[email protected]> wrote:
> >>>
> >>> I took a look. My biggest concern is finding a good LZO
> >>> implementation. I am looking for one that preferably:
> >>> 1) Has an Apache license
> >>> 2) Has zero transitive dependencies
> >>> 3) Is small
> >>> 4) Is performant
> >>> 5) Is native Java or supports execution on the three main OSs
> >>> (Windows, Linux, Mac)
> >>>
> >>> In your PR you suggested using io.airlift:aircompressor:0.16, which
> >>> doesn't meet item #2, and its transitive dependency fails #3.
> >>>
> >>> On Mon, Dec 2, 2019 at 12:16 PM Amogh Tiwari <[email protected]> wrote:
> >>>>
> >>>> Hi,
> >>>> I have filed a PR for an extension that will enable Apache Beam to
> >>>> work with LZO/LZOP compression. Please refer.
> >>>> I would love it if someone can take this up and review it.
> >>>> Please feel free to share your thoughts/suggestions.
> >>>> Regards,
> >>>> Amogh
>
