Re: Next major milestone: first stable release
Yes, fully agree. As far as I understood/know, BEAM-59 is targeted for Beam 1.0 (it's what we discussed with Pei and Davor).

Regards
JB

On 03/01/2017 11:39 AM, Ismaël Mejía wrote:

Also joining a bit late, I agree with Amit, HDFS improvements are a really good thing to have before the stable release. I will also add the IOChannelFactory refactorings to support things like Read.from("hdfs://") aka BEAM-59.

In the worst case, particular IOs can still be marked as experimental to show users that they can still evolve, even after the first 'stable' release. The part that we have to pay more attention to is not breaking the core SDK.

And the question about Data Locality (BEAM-673) is where I am afraid that we can have some breaking changes, because there is no way for the IOs (Source/Sink) to send 'a hint' to the runner about Data Locality (please correct me if I am wrong). Even if it is not supported in the first stable release by any runner, this would be a really great thing to have, and I think this is a good moment to do it, to avoid breaking any IO/runner signature because of new methods.

What do the others think?

Ismaël

On Tue, Feb 28, 2017 at 6:29 PM, Amit Sela wrote:

Joining in just a bit late, I'll be quick and say that IMHO the SDK is mature enough and so my only point to add is *HDFS support*.

I think that in terms of adoption we have to support HDFS as a "first-class citizen" via the FileSystem API, and provide data locality (batch) on top of it - it serves not only HDFS, but other eco-system IOs such as HBase.

From my experience with talking to people and companies, most are running batch in production with some streaming POC or even production use, but batch still takes most of production work. If we give them the same production results, with the Beam API, we can on-board them faster and make it easier for them to adopt streaming as well.

Thanks,
Amit

On Tue, Feb 28, 2017 at 7:12 PM Davor Bonaci wrote:

Alright -- sounds like we have a consensus to proceed with the first stable release after 0.6.0, targeting end of March / early April.

I'll kick off separate threads for specific decisions we need to make.

On Thu, Feb 23, 2017 at 6:07 AM, Aljoscha Krettek wrote:

I think we're ready for this! The public APIs are in very good shape, especially now that we have the new DoFn, user facing state and timers and splittable DoFn. Not all Runners support the more advanced features but we can work on this after a stable release and there are enough runners that support a large part of the features.

Best,
Aljoscha

On Thu, 23 Feb 2017 at 06:15 Kenneth Knowles wrote:

On Wed, Feb 22, 2017 at 5:35 PM, Chamikara Jayalath <chamik...@apache.org> wrote:

I think, this point applies to Python SDK as well (though as you mentioned, API hiding in Python is a mere convention (prefix with underscore) not enforced). We already have a mechanism for marking APIs as deprecated which might be useful here:
https://github.com/apache/beam/blob/master/sdks/python/apache_beam/utils/annotations.py

- Cham

Perhaps an explicit @public annotation would fit. I could imagine easily generating a spec to check against from such annotations, though tooling is secondary to documentation.

Kenn

--
Jean-Baptiste Onofré
jbono...@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com
Re: Next major milestone: first stable release
Alright -- sounds like we have a consensus to proceed with the first stable release after 0.6.0, targeting end of March / early April.

I'll kick off separate threads for specific decisions we need to make.

On Thu, Feb 23, 2017 at 6:07 AM, Aljoscha Krettek wrote:

> I think we're ready for this! The public APIs are in very good shape,
> especially now that we have the new DoFn, user facing state and timers and
> splittable DoFn. Not all Runners support the more advanced features but we
> can work on this after a stable release and there are enough runners that
> support a large part of the features.
>
> Best,
> Aljoscha
>
> On Thu, 23 Feb 2017 at 06:15 Kenneth Knowles wrote:
>
> > On Wed, Feb 22, 2017 at 5:35 PM, Chamikara Jayalath
> > <chamik...@apache.org> wrote:
> >
> > > I think, this point applies to Python SDK as well (though as you
> > > mentioned, API hiding in Python is a mere convention (prefix with
> > > underscore) not enforced). We already have a mechanism for marking
> > > APIs as deprecated which might be useful here:
> > > https://github.com/apache/beam/blob/master/sdks/python/apache_beam/utils/annotations.py
> > >
> > > - Cham
> >
> > Perhaps an explicit @public annotation would fit. I could imagine easily
> > generating a spec to check against from such annotations, though tooling
> > is secondary to documentation.
> >
> > Kenn
Re: Next major milestone: first stable release
On Wed, Feb 22, 2017 at 5:35 PM, Chamikara Jayalath wrote:

> I think, this point applies to Python SDK as well (though as you mentioned,
> API hiding in Python is a mere convention (prefix with underscore) not
> enforced). We already have a mechanism for marking APIs as deprecated which
> might be useful here:
> https://github.com/apache/beam/blob/master/sdks/python/apache_beam/utils/annotations.py
>
> - Cham

Perhaps an explicit @public annotation would fit. I could imagine easily generating a spec to check against from such annotations, though tooling is secondary to documentation.

Kenn
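
To make the suggestion above concrete, here is a minimal sketch of what an explicit public marker plus a generated spec could look like in Python. Everything in it is an assumption for illustration only: the `public` decorator, the `api_spec` and `breaking_changes` helpers, and the `read_lines` function are hypothetical names and are not part of the Beam SDK or of the annotations module linked above.

```python
import inspect

# Hypothetical registry: qualified name -> signature string.
_PUBLIC_REGISTRY = {}


def public(obj):
    """Mark a function as part of the supported API (hypothetical decorator)."""
    name = f"{obj.__module__}.{obj.__qualname__}"
    _PUBLIC_REGISTRY[name] = str(inspect.signature(obj))
    return obj


def api_spec():
    """Return a sorted snapshot of every marked name and its signature."""
    return sorted(_PUBLIC_REGISTRY.items())


def breaking_changes(old_spec, new_spec):
    """List names that were removed or whose signature changed."""
    new = dict(new_spec)
    return [name for name, sig in old_spec if new.get(name) != sig]


# Illustrative function only; it stands in for a real user-facing API.
@public
def read_lines(path, skip_header=False):
    return []
```

A release check could store the output of `api_spec()` for the previous release and compare it against the current one with `breaking_changes`. This only catches removed names and changed signatures; behavioral changes would still need documentation and tests.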
Re: Next major milestone: first stable release
Great to see Beam API becoming stable and Python SDK becoming a part of it.

On Wed, Feb 22, 2017 at 2:57 PM Kenneth Knowles wrote:

This is pretty exciting. I'll add thoughts inline.

On Tue, Feb 21, 2017 at 6:31 PM, Davor Bonaci wrote:

> * Support for all major operating systems.

Do we have a testing plan?

> * No backward-incompatible API changes within a given major version for the
> user-facing APIs across the project.

I think we should use a tool like japicmp to catch errors here for Java. I've never used such a tool in Python; does anyone know of anything?

> * Exception: internally-facing APIs, such as APIs between components.

Let's annotate these, too. We'll need to have an annotation for tooling anyhow. I would even be comfortable having this extend @Deprecated, since we should often have a goal of getting rid of them :-) and it will cause a visible warning in an IDE.

And I would add "* Exception: X Y Z reflection or reflection-like things" because:

Java: reflection is often not considered, since it breaks information hiding. But downcasts and instanceof are a gray area, which depends on the context. We might want to explicitly document where these are supported in a backwards-compatible way and otherwise say that they are not, etc.

Python: mechanically-enforced information hiding is rare and unidiomatic so there's a lot more that is *technically* public while conceptually private. I'll defer to contributors to the Python SDK about where they draw the line but it would be nice to be explicit.

I think, this point applies to Python SDK as well (though as you mentioned, API hiding in Python is a mere convention (prefix with underscore) not enforced). We already have a mechanism for marking APIs as deprecated which might be useful here:
https://github.com/apache/beam/blob/master/sdks/python/apache_beam/utils/annotations.py

- Cham

Kenn
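
For readers unfamiliar with the idea of marking APIs as deprecated in Python, the following is a generic sketch of how such a marker is commonly written with the standard library. It is not the contents of the Beam module linked above; the `deprecated` decorator, its `since` and `replacement` parameters, and the `old_read` function are hypothetical names chosen for illustration.

```python
import functools
import warnings


def deprecated(since, replacement=None):
    """Hypothetical decorator: warn whenever the wrapped function is called."""
    def decorator(func):
        message = f"{func.__name__} is deprecated since {since}."
        if replacement:
            message += f" Use {replacement} instead."

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # stacklevel=2 attributes the warning to the caller's line.
            warnings.warn(message, DeprecationWarning, stacklevel=2)
            return func(*args, **kwargs)

        return wrapper
    return decorator


# Illustrative function only; version and replacement name are made up.
@deprecated(since="0.6.0", replacement="new_read")
def old_read(path):
    return f"read:{path}"
```

The function keeps working, so existing callers are not broken within a major version, while the warning gives them notice before the API is removed in a later one.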
Re: Next major milestone: first stable release
Hi Davor,

Fully agree. It would be great to have a first 1.0.0 "stable" release in the near future.

Thanks!
Regards
JB

On 02/22/2017 03:31 AM, Davor Bonaci wrote:

Graduating from incubation was our single, unifying goal for the past year. With graduation now behind us, I think it is worth looking ahead towards the next major milestone: the first stable release.

I think the first stable release is the logical next step. It enables the growth of our user community by providing necessary guarantees and confidence to deploy Beam into production. It is our message to the world: “Beam is ready for prime time”.

With that, I’d like to start a discussion about what the stable release really means. For me, it is two equally important things:

* Production quality: it “just works”.
* Commitment to the API compatibility for the user-facing APIs.

Production quality is sometimes hard to define, but includes the following:

* No (known) major bugs.
* Polished user experience.
* Good documentation.
* Support for all major operating systems.
* Dependencies hidden from the callers (shading, API surface tests).
* Etc.

On the other hand, the API compatibility aspect includes:

* Proper use of semantic versioning [1]: major.minor.patch.
* No backward-incompatible API changes within a given major version for the user-facing APIs across the project.
  * Exception: APIs marked as experimental.
  * Exception: internally-facing APIs, such as APIs between components.
* Any and all work can still proceed; we just need to be careful to do it in a compatible way, at the worst, by introducing a new API and deprecating the old one.

Time-wise, I think we are not far away from this goal. We do have a compelling offering. Our APIs are already fairly stable. We just need a little bit of effort across the project to polish the experience and do those last few changes we always wanted.

With that, I’d suggest to target:

* One more pre-release in late February/early March.
* The first stable release around the end of March.

I think it is worth noting that we’ll never get to perfection, and we’ll never be able to finish “everything”. All that work, however, can still proceed after the first stable release (just with a little extra overhead).

I’d love to hear everyone’s thoughts on this topic. It involves the future project direction -- I’d like to invite everyone to participate!

If we have a consensus, I’d like to start making progress on this effort rather quickly. Perhaps we can jointly coordinate a project-wide effort to polish the last few things and reach the first stable release.

Thanks!

Davor

[1] http://semver.org/

--
Jean-Baptiste Onofré
jbono...@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com
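
As an illustration of the major.minor.patch commitment discussed in this thread, here is a small sketch of how a user might reason about whether an upgrade is expected to be backward compatible under semantic versioning. The helper names `parse_version` and `is_compatible_upgrade` are hypothetical, and the sketch ignores pre-release and build suffixes. It also assumes the semver rule that 0.y.z versions carry no compatibility promise, which is why a stable 1.0.0 matters.

```python
def parse_version(text):
    """Split 'major.minor.patch' into a tuple of three integers."""
    major, minor, patch = (int(part) for part in text.split("."))
    return major, minor, patch


def is_compatible_upgrade(current, candidate):
    """Return True if moving from current to candidate should not break users.

    Same major version and not a downgrade. For major version 0, semver
    makes no guarantee, so only the identical version counts as compatible.
    """
    cur = parse_version(current)
    cand = parse_version(candidate)
    if cur[0] == 0:
        return cand == cur
    return cand[0] == cur[0] and cand >= cur
```

Under these assumptions, moving from 1.0.0 to 1.2.0 counts as compatible, while 1.2.0 to 2.0.0 and 0.5.0 to 0.6.0 do not. APIs marked as experimental or internal would fall outside this guarantee, as noted in the list of exceptions above.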