@Ismaël: that is a separate topic. I can show you part of it privately, but
sadly I can't share it publicly at the moment :(. Long story short for Avro:
yes, the stack is badly outdated and has known issues, so it should be
upgraded quickly or removed.

On the Beam dependencies: I understand Avro is commonly used, but so are
other serializations once you take into account companies which may have
chosen Parquet or even JSON/XML (not a joke). For them it means relying on a
stack which is uncontrolled and inconsistent (keep in mind it forces Beam to
ship two unrelated Jackson versions and exposes both in the IDE, which is a
bad user experience when it comes to completion), and which blocks
deployment because of security scans.

In other words: I can see a coupling in terms of code, but I really think it
is wrong to enforce it as much as it is today. The fallback chain can
probably be handled with an SPI, or even reflection (for detection only),
but the hard dependency should be removed.
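
As a rough illustration of that idea (a sketch only: DefaultCoderProvider,
CoderDetection and the format names are hypothetical, not existing Beam
APIs), the core could look up providers through java.util.ServiceLoader and
use reflection purely to detect whether Avro is on the classpath, without
ever compiling against it:

```java
import java.util.Optional;
import java.util.ServiceLoader;

/** Hypothetical SPI: an extension module would implement and register this. */
interface DefaultCoderProvider {
    /** Name of the serialization format this provider handles, e.g. "avro". */
    String format();
}

final class CoderDetection {
    private CoderDetection() {}

    /** Detection only: checks the classpath without initializing the class. */
    static boolean isPresent(String className) {
        try {
            Class.forName(className, false, CoderDetection.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /** First provider registered under META-INF/services, if any. */
    static Optional<DefaultCoderProvider> firstProvider() {
        for (DefaultCoderProvider provider : ServiceLoader.load(DefaultCoderProvider.class)) {
            return Optional.of(provider);
        }
        return Optional.empty();
    }

    /** Fallback chain: SPI first, then classpath detection, then a built-in default. */
    static String resolveFormat() {
        return firstProvider()
            .map(DefaultCoderProvider::format)
            .orElseGet(() -> isPresent("org.apache.avro.Schema") ? "avro" : "java-serialization");
    }

    public static void main(String[] args) {
        System.out.println(resolveFormat());
    }
}
```

With something along these lines, the core has no compile-time reference to
Avro: an Avro extension would register its provider in
META-INF/services, and users who never add it would simply get the default.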

More generally, the Beam SDK core shouldn't rely on external dependencies at
all IMHO, since it is the core of the project and should be integrable into
any environment without imposing any debt or dependencies.


Romain Manni-Bucau
@rmannibucau <https://twitter.com/rmannibucau> |  Blog
<https://rmannibucau.metawerx.net/> | Old Blog
<http://rmannibucau.wordpress.com> | Github <https://github.com/rmannibucau> |
LinkedIn <https://www.linkedin.com/in/rmannibucau>

2017-12-19 11:28 GMT+01:00 Ismaël Mejía <[email protected]>:

> There are two important points raised here that we are missing in the
> discussion:
>
> 1. We don’t have any kind of security validation for Beam and its
> dependencies. This is important and it would be really nice if we
> could get this improved and some sort of automation on this. However I
> doubt there is a free service for this. Does the ASF have something for
> this? Or could some of the companies involved in the project help us
> set this up?
>
> Romain, can you share the security report with the Beam community so
> we can be aware of this and other security issues at least for the
> moment?
>
> 2. The issues raised on Avro security should also be raised into the
> Avro community. Beam is using the latest Avro release so in part Beam
> cannot be blamed for dependency negligence since we follow the latest
> Avro release.
>
> Are the issues about Avro itself or about the Jackson version? If it is
> the latter, I don’t think things will move quickly:
> https://issues.apache.org/jira/browse/AVRO-1126
>
>
> On Tue, Dec 19, 2017 at 11:23 AM, Jean-Baptiste Onofré <[email protected]>
> wrote:
> > Agree, it's what I meant by "core transforms".
> >
> > Regards
> > JB
> >
> > On 12/19/2017 11:18 AM, Reuven Lax wrote:
> >>
> >> Keep in mind that today Avro is one of the most common coders used for
> >> user data types, not just for file IO. The reason for this is that it's
> >> the easiest way to get a coder for a user's POJO - you simply annotate
> >> the POJO with @DefaultCoder(AvroCoder.class), and it works. This is the
> >> coder used for all internal shuffles (e.g. GroupByKey).
> >>
> >> I would argue that most users don't really care about Avro for this use
> >> case, what they really want is a way of saying "make this POJO work" and
> >> Avro is the only way we give them. This was part of my argument in the
> >> schema docs. However the status quo is that they use Avro here.
> >>
> >> Reuven
> >>
> >> On Tue, Dec 19, 2017 at 1:32 AM, Jean-Baptiste Onofré <[email protected]
> >> <mailto:[email protected]>> wrote:
> >>
> >>     Hi Romain,
> >>
> >>     it sounds good to me. I think any format should be packaged as an
> >>     extension.
> >>
> >>     The only point is that some core transforms expect specific format,
> >>     so, it means that users will have to remember to add the avro
> >>     extension to use some transforms (or the transforms could be an
> >>     extension as well). I have to check the transforms working like this.
> >>
> >>     Regards
> >>     JB
> >>
> >>     On 12/19/2017 10:26 AM, Romain Manni-Bucau wrote:
> >>
> >>         Hi guys,
> >>
> >>         while checking security issues of the project I'm responsible
> >>         for (which integrates Beam), I realized the Java SDK core module
> >>         depends on Avro. From a security point of view it is a blocker
> >>         because of the legacy Avro brings (Jackson from Codehaus etc.),
> >>         but all that can be fixed. However, I would like to take this
> >>         opportunity to open the topic of Avro in the core dependencies.
> >>
> >>         From my point of view it doesn't make much sense, because it is
> >>         just one of the serializations you can use with the file IO, and
> >>         it is highly improbable that all the potential formats will be
> >>         imported into the core. Since it is a very local usage and not a
> >>         core feature, I think it should be extracted - we can discuss
> >>         extracting the actual transforms from the core in another
> >>         thread; it would make a lot of sense IMHO but is not the current
> >>         topic.
> >>
> >>         Therefore I'd like to propose extracting the Avro format - like
> >>         the others - into an extension and removing it as a hard
> >>         requirement of the core, to bring more consistency and
> >>         modularity to Beam.
> >>
> >>         Wdyt?
> >>
> >>         Romain Manni-Bucau
> >>         @rmannibucau <https://twitter.com/rmannibucau> | Blog
> >>         <https://rmannibucau.metawerx.net/> | Old Blog
> >>         <http://rmannibucau.wordpress.com> |
> >>         Github <https://github.com/rmannibucau> | LinkedIn
> >>         <https://www.linkedin.com/in/rmannibucau>
> >>
> >>
> >>     --
> >>     Jean-Baptiste Onofré
> >>     [email protected] <mailto:[email protected]>
> >>     http://blog.nanthrax.net
> >>     Talend - http://www.talend.com
> >>
> >>
> >
> > --
> > Jean-Baptiste Onofré
> > [email protected]
> > http://blog.nanthrax.net
> > Talend - http://www.talend.com
>
