Agreed. I think having a generic notion of schemas and rows would subsume most usage of Avro today (with the exception of IOs).

On Wed, Dec 20, 2017 at 6:49 PM, Lukasz Cwik <[email protected]> wrote:
> I believe that the progress towards creating a Row type within Apache Beam
> will subsume the responsibilities that Avro provides today. Once the Row
> type exists, I agree with Romain that Avro should move out to be an
> extension but would wait for the next Apache Beam major version release
> before doing it because of backwards compatibility.
>
> On Tue, Dec 19, 2017 at 2:53 AM, Jean-Baptiste Onofré <[email protected]>
> wrote:
>
>> Hi Ismaël,
>>
>> 1. We can always use the license and version Maven plugins to check that.
>> However, it's the responsibility of the PMC to enforce
>> security/CVE/legal points.
>>
>> 2. Agree, CVE (via [email protected]) should be raised to Avro.
>>
>> Take a look at https://www.apache.org/security/
>>
>> Regards
>> JB
>>
>>
>> On 12/19/2017 11:28 AM, Ismaël Mejía wrote:
>>
>>> There are two important points raised here that we are missing in the
>>> discussion:
>>>
>>> 1. We don’t have any kind of security validation for Beam and its
>>> dependencies. This is important and it would be really nice if we
>>> could get this improved and have some sort of automation for it. However I
>>> doubt there is a free service for this. Does the ASF have something for
>>> this? Or could some of the companies involved in the project help us
>>> set this up?
>>>
>>> Romain, can you share the security report with the Beam community so
>>> we can be aware of this and other security issues at least for the
>>> moment?
>>>
>>> 2. The issues raised on Avro security should also be raised with the
>>> Avro community. Beam is using the latest Avro release so in part Beam
>>> cannot be blamed for dependency negligence since we follow the latest
>>> Avro release.
>>>
>>> Are the issues about Avro itself or about the Jackson version? Because
>>> if it is the latter, I don’t think things will move quickly:
>>> https://issues.apache.org/jira/browse/AVRO-1126
>>>
>>>
>>> On Tue, Dec 19, 2017 at 11:23 AM, Jean-Baptiste Onofré <[email protected]>
>>> wrote:
>>>
>>>> Agree, it's what I meant by "core transforms".
>>>>
>>>> Regards
>>>> JB
>>>>
>>>> On 12/19/2017 11:18 AM, Reuven Lax wrote:
>>>>
>>>>>
>>>>> Keep in mind that today Avro is one of the most common coders used for
>>>>> user data types, not just for file IO. The reason for this is that it's
>>>>> the easiest way to get a coder for a user's POJO - you simply annotate
>>>>> the POJO with @DefaultCoder(AvroCoder.class), and it works. This is the
>>>>> coder used for all internal shuffles (e.g. GroupByKey).
>>>>>
>>>>> I would argue that most users don't really care about Avro for this use
>>>>> case, what they really want is a way of saying "make this POJO work"
>>>>> and Avro is the only way we give them. This was part of my argument in
>>>>> the schema docs. However the status quo is that they use Avro here.
>>>>>
>>>>> Reuven
>>>>>
>>>>> On Tue, Dec 19, 2017 at 1:32 AM, Jean-Baptiste Onofré <[email protected]>
>>>>> wrote:
>>>>>
>>>>>     Hi Romain,
>>>>>
>>>>>     it sounds good to me. I think any format should be packaged as an
>>>>>     extension.
>>>>>
>>>>>     The only point is that some core transforms expect a specific
>>>>>     format, so it means that users will have to remember to add the
>>>>>     avro extension to use some transforms (or the transforms could be
>>>>>     an extension as well). I have to check which transforms work like
>>>>>     this.
>>>>>
>>>>>     Regards
>>>>>     JB
>>>>>
>>>>>     On 12/19/2017 10:26 AM, Romain Manni-Bucau wrote:
>>>>>
>>>>>         Hi guys,
>>>>>
>>>>>         checking security issues of the project I'm responsible for
>>>>>         (which integrates beam) I realized the java sdk core module
>>>>>         depends on avro. From a security point of view it is a blocker
>>>>>         because of the legacy avro brings (jackson from codehaus etc)
>>>>>         but all that can be fixed. However I would like to take this
>>>>>         opportunity to open the topic of avro in the core dependencies.
>>>>>
>>>>>         From my point of view it doesn't make much sense because it is
>>>>>         just one of the serialization formats you can use with the file
>>>>>         IO and it is highly improbable that all the potential formats
>>>>>         will be imported in the core. Since it is a very local usage
>>>>>         and not a core feature I think it should be extracted - we can
>>>>>         discuss extracting the actual transforms from the core in
>>>>>         another thread, it would make a lot of sense IMHO but is not
>>>>>         the current topic.
>>>>>
>>>>>         Therefore I'd like to propose to extract the avro format - like
>>>>>         others - into an extension and remove it as a hard requirement
>>>>>         of the core to bring more consistency and modularity to beam.
>>>>>
>>>>>         Wdyt?
>>>>>
>>>>>         Romain Manni-Bucau
>>>>>         @rmannibucau <https://twitter.com/rmannibucau> | Blog
>>>>>         <https://rmannibucau.metawerx.net/> | Old Blog
>>>>>         <http://rmannibucau.wordpress.com> | Github
>>>>>         <https://github.com/rmannibucau> | LinkedIn
>>>>>         <https://www.linkedin.com/in/rmannibucau>
>>>>>
>>>>>
>>>>>     --
>>>>>     Jean-Baptiste Onofré
>>>>>     [email protected]
>>>>>     http://blog.nanthrax.net
>>>>>     Talend - http://www.talend.com
>>>>>
>>>>>
>>>>>
>>>> --
>>>> Jean-Baptiste Onofré
>>>> [email protected]
>>>> http://blog.nanthrax.net
>>>> Talend - http://www.talend.com
>>>>
>>>
>> --
>> Jean-Baptiste Onofré
>> [email protected]
>> http://blog.nanthrax.net
>> Talend - http://www.talend.com
>>
>
