Re: Next major milestone: first stable release

2017-03-01 Thread Jean-Baptiste Onofré

Yes, fully agree.

As far as I know, BEAM-59 is targeted for Beam 1.0 (that is what
we discussed with Pei and Davor).


Regards
JB

On 03/01/2017 11:39 AM, Ismaël Mejía wrote:

Also joining a bit late. I agree with Amit: HDFS improvements are a really
good thing to have before the stable release. I would also add the
IOChannelFactory refactorings to support things like Read.from("hdfs://"),
aka BEAM-59.

In the worst case, particular IOs can still be marked as experimental to
show users that they can still evolve, even after the first ‘stable’
release. The part where we have to pay more attention is not breaking the
core SDK. The question about Data Locality (BEAM-673) is where I am
afraid we could have some breaking changes, because there is currently no
way for the IOs (Source/Sink) to send ‘a hint’ to the runner about Data
Locality (please correct me if I am wrong). Even if no runner supports it
in the first stable release, this would be a really great thing to have,
and I think this is a good moment to do it, to avoid breaking any
IO/runner signature later because of new methods.
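
To make the signature concern concrete, here is a minimal sketch in Python. Every name in it (Source, preferred_locations, pick_worker) is hypothetical and is not Beam's actual Source/Sink API. It only illustrates the idea that an optional hint with a "no preference" default can be added without breaking existing IOs or runners:

```python
from abc import ABC, abstractmethod
from typing import Iterable, List


class Source(ABC):
    """Hypothetical, simplified source interface (not Beam's actual API)."""

    @abstractmethod
    def read(self) -> Iterable[str]:
        """Yield the records of this source."""

    def preferred_locations(self) -> List[str]:
        """Optional hint: hosts where reading this source is cheapest.

        The default means "no preference", so sources written before the
        hint existed keep working unchanged.
        """
        return []


class InMemorySource(Source):
    """A source that predates the hint and does not override it."""

    def __init__(self, records):
        self._records = list(records)

    def read(self):
        return iter(self._records)


class BlockSource(InMemorySource):
    """A source that knows which hosts hold its data."""

    def __init__(self, records, hosts):
        super().__init__(records)
        self._hosts = list(hosts)

    def preferred_locations(self):
        return list(self._hosts)


def pick_worker(source: Source, workers: List[str]) -> str:
    """Toy runner logic: honor the hint if possible, else use any worker."""
    for host in source.preferred_locations():
        if host in workers:
            return host
    return workers[0]
```

A runner that ignores the hint entirely would still be correct under this sketch, which matches the point that no runner needs to support it in the first stable release.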

What do the others think?
Ismaël



On Tue, Feb 28, 2017 at 6:29 PM, Amit Sela  wrote:


Joining in just a bit late, I'll be quick and say that IMHO the SDK is
mature enough, so my only point to add is *HDFS support*.
I think that in terms of adoption we have to support HDFS as a "first-class
citizen" via the FileSystem API, and provide data locality (batch) on top
of it. This serves not only HDFS but also other ecosystem IOs such as HBase.
In my experience talking to people and companies, most are running
batch in production with some streaming POC or even production use, but
batch still accounts for most production work. If we give them the same
production results with the Beam API, we can on-board them faster and make
it easier for them to adopt streaming as well.

Thanks,
Amit

On Tue, Feb 28, 2017 at 7:12 PM Davor Bonaci  wrote:


Alright -- sounds like we have a consensus to proceed with the first stable
release after 0.6.0, targeting end of March / early April. I'll kick off
separate threads for specific decisions we need to make.

On Thu, Feb 23, 2017 at 6:07 AM, Aljoscha Krettek 
wrote:


I think we're ready for this! The public APIs are in very good shape,
especially now that we have the new DoFn, user facing state and timers and
splittable DoFn. Not all Runners support the more advanced features but we
can work on this after a stable release and there are enough runners that
support a large part of the features.

Best,
Aljoscha

On Thu, 23 Feb 2017 at 06:15 Kenneth Knowles 
wrote:


On Wed, Feb 22, 2017 at 5:35 PM, Chamikara Jayalath <chamik...@apache.org>
wrote:


I think this point applies to the Python SDK as well (though, as you
mentioned, API hiding in Python is a mere convention (prefix with
underscore), not enforced). We already have a mechanism for marking APIs as
deprecated, which might be useful here:
https://github.com/apache/beam/blob/master/sdks/python/apache_beam/utils/annotations.py

- Cham



Perhaps an explicit @public annotation would fit. I could imagine easily
generating a spec to check against from such annotations, though tooling is
secondary to documentation.

Kenn

--
Jean-Baptiste Onofré
jbono...@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com


Re: Next major milestone: first stable release

2017-02-28 Thread Davor Bonaci
Alright -- sounds like we have a consensus to proceed with the first stable
release after 0.6.0, targeting end of March / early April. I'll kick off
separate threads for specific decisions we need to make.

On Thu, Feb 23, 2017 at 6:07 AM, Aljoscha Krettek 
wrote:

> I think we're ready for this! The public APIs are in very good shape,
> especially now that we have the new DoFn, user facing state and timers and
> splittable DoFn. Not all Runners support the more advanced features but we
> can work on this after a stable release and there are enough runners that
> support a large part of the features.
>
> Best,
> Aljoscha
>
> On Thu, 23 Feb 2017 at 06:15 Kenneth Knowles 
> wrote:
>
> > On Wed, Feb 22, 2017 at 5:35 PM, Chamikara Jayalath <chamik...@apache.org>
> > wrote:
> > >
> > > I think this point applies to the Python SDK as well (though, as you
> > > mentioned, API hiding in Python is a mere convention (prefix with
> > > underscore), not enforced). We already have a mechanism for marking APIs
> > > as deprecated, which might be useful here:
> > > https://github.com/apache/beam/blob/master/sdks/python/apache_beam/utils/annotations.py
> > >
> > > - Cham
> > >
> >
> > Perhaps an explicit @public annotation would fit. I could imagine easily
> > generating a spec to check against from such annotations, though tooling
> > is secondary to documentation.
> >
> > Kenn
> >
>


Re: Next major milestone: first stable release

2017-02-22 Thread Kenneth Knowles
On Wed, Feb 22, 2017 at 5:35 PM, Chamikara Jayalath 
wrote:
>
> I think this point applies to the Python SDK as well (though, as you
> mentioned, API hiding in Python is a mere convention (prefix with
> underscore), not enforced). We already have a mechanism for marking APIs as
> deprecated, which might be useful here:
> https://github.com/apache/beam/blob/master/sdks/python/apache_beam/utils/annotations.py
>
> - Cham
>

Perhaps an explicit @public annotation would fit. I could imagine easily
generating a spec to check against from such annotations, though tooling is
secondary to documentation.
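
As a rough illustration of that idea, the sketch below shows one way an explicit marker could feed a generated spec in Python. The `public` decorator, the `_PUBLIC_API` registry, and `generate_spec` are all hypothetical names invented for this example; nothing here exists in the Beam Python SDK.

```python
import inspect
from typing import Callable, Dict, List

# Registry of everything explicitly marked public, keyed by qualified name.
_PUBLIC_API: Dict[str, Callable] = {}


def public(func):
    """Hypothetical marker: record func as part of the supported API."""
    _PUBLIC_API[f"{func.__module__}.{func.__qualname__}"] = func
    return func


def generate_spec() -> List[str]:
    """Return one line per marked function: qualified name plus signature."""
    return sorted(
        f"{name}{inspect.signature(func)}" for name, func in _PUBLIC_API.items()
    )


@public
def read_lines(path, limit=None):
    """Example of a function that is part of the supported surface."""


def scratch_helper(path):
    """Not marked, so it never appears in the spec."""
```

The resulting list could be written to a file and checked into the repository, so that a later build fails when the generated spec no longer matches. That workflow is an assumption about how such a spec might be used, not something the thread settles.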

Kenn


Re: Next major milestone: first stable release

2017-02-22 Thread Chamikara Jayalath
Great to see the Beam API becoming stable and the Python SDK becoming a part of it.

On Wed, Feb 22, 2017 at 2:57 PM Kenneth Knowles 
wrote:

This is pretty exciting. I'll add thoughts inline.

On Tue, Feb 21, 2017 at 6:31 PM, Davor Bonaci  wrote:

> * Support for all major operating systems.
>

Do we have a testing plan?

> * No backward-incompatible API changes within a given major version for the
> user-facing APIs across the project.
>

I think we should use a tool like japicmp to catch errors here for Java.
I've never used such a tool in Python; does anyone know of anything?
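
For the Python question, a very small check along these lines can be assembled from the standard library alone. The sketch below is an assumption-laden illustration, not an existing tool: `api_surface` and `breaking_changes` are hypothetical names, only module-level functions are considered, and names starting with an underscore are treated as private by convention.

```python
import inspect
import types
from typing import Dict, List


def api_surface(module: types.ModuleType) -> Dict[str, str]:
    """Map each public function name in module to its signature string."""
    surface = {}
    for name, obj in vars(module).items():
        if name.startswith("_") or not inspect.isfunction(obj):
            continue
        surface[name] = str(inspect.signature(obj))
    return surface


def breaking_changes(old: Dict[str, str], new: Dict[str, str]) -> List[str]:
    """List removals and signature changes; pure additions are ignored."""
    problems = []
    for name, sig in sorted(old.items()):
        if name not in new:
            problems.append(f"removed: {name}{sig}")
        elif new[name] != sig:
            problems.append(f"changed: {name}{sig} -> {name}{new[name]}")
    return problems


# Two throwaway modules standing in for an old and a new release.
old_release = types.ModuleType("old_release")
exec("def read(path): pass\ndef write(path): pass\ndef _helper(): pass",
     old_release.__dict__)
new_release = types.ModuleType("new_release")
exec("def read(path, limit): pass\ndef copy(src, dst): pass",
     new_release.__dict__)

report = breaking_changes(api_surface(old_release), api_surface(new_release))
```

The comparison is deliberately conservative: any signature difference is reported, including compatible ones such as a new optional parameter, so a real check would need a finer rule than string equality.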


> * Exception: internally-facing APIs, such as APIs between components.
>

Let's annotate these, too. We'll need to have an annotation for tooling
anyhow. I would even be comfortable having this extend @Deprecated, since
we should often have a goal of getting rid of them :-) and it will cause a
visible warning in an IDE.

And I would add "* Exception: X Y Z reflection or reflection-like things"
because:

Java: reflection is often not considered, since it breaks information
hiding. But downcasts and instanceof are a gray area, which depends on the
context. We might want to explicitly document where these are supported in
a backwards-compatible way and otherwise say that they are not, etc.

Python: mechanically-enforced information hiding is rare and unidiomatic so
there's a lot more that is *technically* public while conceptually private.
I'll defer to contributors to the Python SDK about where they draw the line
but it would be nice to be explicit.


I think this point applies to the Python SDK as well (though, as you
mentioned, API hiding in Python is a mere convention (prefix with
underscore), not enforced). We already have a mechanism for marking APIs as
deprecated, which might be useful here:
https://github.com/apache/beam/blob/master/sdks/python/apache_beam/utils/annotations.py
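
For readers unfamiliar with the pattern, a deprecation marker in Python is usually a decorator that emits a warning when the function is called. The sketch below is a generic illustration built on the standard `warnings` module; it is not the code in annotations.py, and the `deprecated` name and its `since`/`replacement` parameters are assumptions made for this example.

```python
import functools
import warnings


def deprecated(since, replacement=None):
    """Hypothetical decorator factory that warns when the function is called."""
    def decorator(func):
        message = f"{func.__name__} is deprecated since {since}."
        if replacement:
            message += f" Use {replacement} instead."

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # stacklevel=2 attributes the warning to the caller's line.
            warnings.warn(message, DeprecationWarning, stacklevel=2)
            return func(*args, **kwargs)

        return wrapper
    return decorator


@deprecated(since="0.6.0", replacement="new_read")
def old_read(value):
    return value * 2
```

The same shape would work for an "internal" or "experimental" marker by swapping the message and the warning category.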

- Cham



Kenn


Re: Next major milestone: first stable release

2017-02-21 Thread Jean-Baptiste Onofré

Hi Davor,

Fully agree. It would be great to have a first 1.0.0 "stable" release in
the near future.


Thanks!
Regards
JB

On 02/22/2017 03:31 AM, Davor Bonaci wrote:

Graduating from incubation was our single, unifying goal for the past year.
With graduation now behind us, I think it is worth looking ahead towards
the next major milestone: the first stable release.

I think the first stable release is the logical next step. It enables the
growth of our user community by providing necessary guarantees and
confidence to deploy Beam into production. It is our message to the world:
“Beam is ready for prime time”.

With that, I’d like to start a discussion of what the stable release really
means. For me, it is two equally important things:
* Production quality: it “just works”.
* Commitment to the API compatibility for the user-facing APIs.

Production quality is sometimes hard to define, but includes the following:
* No (known) major bugs.
* Polished user experience.
* Good documentation.
* Support for all major operating systems.
* Dependencies hidden from the callers (shading, API surface tests).
* Etc.

On the other hand, the API compatibility aspect includes:
* Proper use of semantic versioning [1]: major.minor.patch.
* No backward-incompatible API changes within a given major version for the
  user-facing APIs across the project.
  * Exception: APIs marked as experimental.
  * Exception: internally-facing APIs, such as APIs between components.
* Any and all work can still proceed; we just need to be careful to do it
  in a compatible way, at the worst, by introducing a new API and deprecating
  the old one.
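
The versioning rule in the list above can be written down as a small check. This is a minimal sketch in Python with hypothetical names (`Version`, `is_safe_upgrade`); it handles only plain major.minor.patch strings and ignores pre-release and build suffixes. It also follows the semantic versioning rule that 0.y.z releases carry no compatibility guarantee.

```python
from typing import NamedTuple


class Version(NamedTuple):
    major: int
    minor: int
    patch: int

    @classmethod
    def parse(cls, text: str) -> "Version":
        """Parse a plain 'major.minor.patch' string."""
        major, minor, patch = (int(part) for part in text.split("."))
        return cls(major, minor, patch)


def is_safe_upgrade(current: str, candidate: str) -> bool:
    """True if moving from current to candidate should not break user code.

    Assumes the project honors semantic versioning: within one major
    version (1 or higher), newer releases stay backward compatible.
    """
    cur, cand = Version.parse(current), Version.parse(candidate)
    if cur.major == 0:
        return False  # 0.y.z: anything may change between releases
    return cand.major == cur.major and cand >= cur
```

Under this rule, moving from 1.2.0 to 1.3.1 is expected to be safe, while 1.2.0 to 2.0.0 and 0.5.0 to 0.6.0 are not.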

Time-wise, I think we are not far away from this goal. We do have a
compelling offering. Our APIs are already fairly stable. We just need a
little bit of effort across the project to polish the experience and do
those last few changes we always wanted. With that, I’d suggest targeting:
* One more pre-release in late February/early March.
* The first stable release around the end of March.

I think it is worth noting that we’ll never get to perfection, and we’ll
never be able to finish “everything”. All that work, however, can still
proceed after the first stable release (just with a little extra overhead).

I’d love to hear everyone’s thoughts on this topic. It involves the future
project direction -- I’d like to invite everyone to participate!

If we have a consensus, I’d like to start making progress on this effort
rather quickly. Perhaps we can jointly coordinate a project-wide effort to
polish the last few things and reach the first stable release.

Thanks!

Davor

[1] http://semver.org/



--
Jean-Baptiste Onofré
jbono...@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com