Agree, it's what I meant by "core transforms".

Regards
JB

On 12/19/2017 11:18 AM, Reuven Lax wrote:
Keep in mind that today Avro is one of the most common coders used for user data types, not just for file IO. The reason is that it's the easiest way to get a coder for a user's POJO: you simply annotate the POJO with @DefaultCoder(AvroCoder.class), and it works. That coder is then used for all internal shuffles (e.g. GroupByKey).
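
To illustrate the annotation-driven pattern described above, here is a minimal, self-contained sketch. It deliberately does not use Beam: `Coder`, `DefaultCoder`, `UserEventCoder`, `UserEvent` and `coderFor` are local stand-ins invented for illustration, with `UserEventCoder` playing the role that AvroCoder would play. Beam's real coder registry is not reproduced here; the sketch only shows the general idea of resolving a coder from a class-level annotation.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.nio.charset.StandardCharsets;

class CoderLookupSketch {

  /** Hypothetical stand-in for a coder: turns a value into bytes and back. */
  interface Coder<T> {
    byte[] encode(T value);
    T decode(byte[] bytes);
  }

  /** Local stand-in annotation, NOT org.apache.beam.sdk.coders.DefaultCoder. */
  @Retention(RetentionPolicy.RUNTIME)
  @Target(ElementType.TYPE)
  @SuppressWarnings("rawtypes")
  @interface DefaultCoder {
    Class<? extends Coder> value();
  }

  /** Toy coder playing the role AvroCoder would play in the real SDK. */
  static class UserEventCoder implements Coder<UserEvent> {
    @Override
    public byte[] encode(UserEvent e) {
      return (e.userId + "\n" + e.clicks).getBytes(StandardCharsets.UTF_8);
    }

    @Override
    public UserEvent decode(byte[] bytes) {
      String s = new String(bytes, StandardCharsets.UTF_8);
      int sep = s.lastIndexOf('\n');
      return new UserEvent(s.substring(0, sep), Integer.parseInt(s.substring(sep + 1)));
    }
  }

  /** The user's POJO: one annotation is all it takes to "make it work". */
  @DefaultCoder(UserEventCoder.class)
  static class UserEvent {
    String userId;
    int clicks;

    UserEvent(String userId, int clicks) {
      this.userId = userId;
      this.clicks = clicks;
    }
  }

  /** Reads the annotation on the type and instantiates the coder it names. */
  static <T> Coder<T> coderFor(Class<T> type) {
    DefaultCoder annotation = type.getAnnotation(DefaultCoder.class);
    if (annotation == null) {
      throw new IllegalArgumentException("No @DefaultCoder on " + type.getName());
    }
    try {
      @SuppressWarnings("unchecked")
      Coder<T> coder = (Coder<T>) annotation.value().getDeclaredConstructor().newInstance();
      return coder;
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException("Cannot instantiate coder for " + type.getName(), e);
    }
  }

  public static void main(String[] args) {
    Coder<UserEvent> coder = coderFor(UserEvent.class);
    // Round trip, as a shuffle would do between workers.
    UserEvent copy = coder.decode(coder.encode(new UserEvent("alice", 3)));
    System.out.println(copy.userId + " " + copy.clicks);
  }
}
```

The point of the sketch is that the user never picks a wire format explicitly; the annotation makes that choice, which is why the format ends up as a dependency of whatever module performs the lookup.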

I would argue that most users don't really care about Avro for this use case; what they really want is a way of saying "make this POJO work", and Avro is the only way we give them. This was part of my argument in the schema docs. However, the status quo is that they use Avro here.

Reuven

On Tue, Dec 19, 2017 at 1:32 AM, Jean-Baptiste Onofré <[email protected]> wrote:

    Hi Romain,

    It sounds good to me. I think any format should be packaged as an extension.

    The only point is that some core transforms expect a specific format, so
    users will have to remember to add the Avro extension to use those
    transforms (or the transforms could be moved to an extension as well). I
    have to check which transforms work like this.

    Regards
    JB

    On 12/19/2017 10:26 AM, Romain Manni-Bucau wrote:

        Hi guys,

        While checking security issues in the project I'm responsible for (which
        integrates Beam), I realized that the Java SDK core module depends on
        Avro. From a security point of view it is a blocker because of the legacy
        dependencies Avro brings (Jackson from Codehaus, etc.), but all of that
        can be fixed. However, I would like to take this opportunity to open the
        topic of Avro in the core dependencies.

        From my point of view it doesn't make much sense, because Avro is just
        one of the serialization formats you can use with the file IO, and it is
        highly unlikely that all the potential formats will be imported into the
        core. Since it is a very local usage and not a core feature, I think it
        should be extracted. We can discuss extracting the actual transforms from
        the core in another thread; it would make a lot of sense IMHO, but it is
        not the current topic.

        Therefore I'd like to propose extracting the Avro format, like the
        others, into an extension and removing it as a hard requirement of the
        core, to bring more consistency and modularity to Beam.

        Wdyt?

        Romain Manni-Bucau
        @rmannibucau <https://twitter.com/rmannibucau> | Blog
        <https://rmannibucau.metawerx.net/> | Old Blog
        <http://rmannibucau.wordpress.com> | Github
        <https://github.com/rmannibucau> | LinkedIn
        <https://www.linkedin.com/in/rmannibucau>


    --
    Jean-Baptiste Onofré
    [email protected]
    http://blog.nanthrax.net
    Talend - http://www.talend.com



