Re: SNAPSHOTS have not been updated since february

2019-04-22 Thread Yifan Zou
Infra is reviewing the security of our new Jenkins agents. In the
meantime, I assigned the publishing job to the 'ubuntu'-labelled nodes as
a workaround. The snapshots were successfully published in
beam_Release_NightlySnapshot#421. I'll move the job assignment back once
our nodes are fully set up.

Thanks

Yifan


On Fri, Apr 19, 2019 at 2:05 AM Ismaël Mejía  wrote:

> Thanks everyone for the quick answer and thanks Yifan for taking care.
>
> On Thu, Apr 18, 2019 at 7:15 PM Yifan Zou  wrote:
> >
> > The original build nodes were updated on Jan 24 and the Nexus credentials
> were removed from the filesystem because they are not supposed to be on
> external build nodes (nodes Infra does not own). We now need to set up the
> role account on the new Beam JNLP nodes. I am still in contact with Infra to
> bring the snapshots back.
> >
> > Yifan
> >
> > On Thu, Apr 18, 2019 at 10:09 AM Lukasz Cwik  wrote:
> >>
> >> The permissions issue is that the credentials needed to publish to the
> Maven repository are only deployed on machines managed by Apache Infra. Now
> that the machines have been given back to each project to manage, Yifan was
> investigating some other way to get the permissions onto the machines.
> >>
> >> On Thu, Apr 18, 2019 at 10:06 AM Boyuan Zhang 
> wrote:
> >>>
> >>> There is a test target,
> https://builds.apache.org/job/beam_Release_NightlySnapshot/, in Beam
> that builds and pushes snapshots to Maven every day. The current failure is
> that the Jenkins machine cannot publish artifacts to Maven owing to a
> permission issue. I think +Yifan Zou is actively working on it.
> >>>
> >>> On Thu, Apr 18, 2019 at 9:44 AM Ismaël Mejía 
> wrote:
> 
>  And is there a way we can detect SNAPSHOTS not being published daily in
>  the future?
> 
>  On Thu, Apr 18, 2019 at 6:37 PM Ismaël Mejía 
> wrote:
>  >
>  > Any progress on this?
>  >
>  > On Wed, Mar 27, 2019 at 5:38 AM Daniel Oliveira <
> danolive...@google.com> wrote:
>  > >
>  > > I made a bug for this specific issue (artifacts not publishing to
> the Apache Maven repo): https://issues.apache.org/jira/browse/BEAM-6919
>  > >
>  > > While I was gathering info for the bug report I also noticed
> +Yifan Zou has an experimental PR testing a fix:
> https://github.com/apache/beam/pull/8148
>  > >
>  > > On Tue, Mar 26, 2019 at 11:42 AM Boyuan Zhang 
> wrote:
>  > >>
>  > >> +Daniel Oliveira
>  > >>
>  > >> On Tue, Mar 26, 2019 at 9:57 AM Boyuan Zhang 
> wrote:
>  > >>>
>  > >>> Sorry for the typo. Ideally, the snapshot publish is
> independent from postrelease_snapshot.
>  > >>>
>  > >>> On Tue, Mar 26, 2019 at 9:55 AM Boyuan Zhang <
> boyu...@google.com> wrote:
>  > 
>  >  Hey,
>  > 
>  >  I'm trying to publish the artifacts by commenting "Run Gradle
> Publish" in my PR, but there are several errors saying "cannot write
> artifacts into dir". Does anyone have any idea why? Ideally, the snapshot publish
> is dependent from postrelease_snapshot. The publish task is to build and
> publish artifacts and the postrelease_snapshot is to verify whether the
> snapshot works.
>  > 
>  >  On Tue, Mar 26, 2019 at 8:45 AM Ahmet Altay 
> wrote:
>  > >
>  > > I believe this is related to
> https://issues.apache.org/jira/browse/BEAM-6840 and +Boyuan Zhang has a
> fix in progress https://github.com/apache/beam/pull/8132
>  > >
>  > > On Tue, Mar 26, 2019 at 7:09 AM Ismaël Mejía <
> ieme...@gmail.com> wrote:
>  > >>
>  > >> I was trying to validate a fix on the Spark runner and
> realized that
>  > >> Beam SNAPSHOTS have not been updated since February 24 !
>  > >>
>  > >>
> https://repository.apache.org/content/repositories/snapshots/org/apache/beam/beam-sdks-java-core/2.12.0-SNAPSHOT/
>  > >>
>  > >> Can somebody please take a look at why this has not been
> updated?
>  > >>
>  > >> Thanks,
>  > >> Ismaël
>
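
On the question raised in this thread about detecting missed nightly
snapshots: a minimal sketch of one possible check, assuming the snapshot
directory linked above also serves a standard maven-metadata.xml whose
<versioning>/<lastUpdated> element holds a UTC timestamp in yyyyMMddHHmmss
form. The function names and the two-day threshold are hypothetical, and
nothing here is part of the Beam build.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta, timezone


def last_updated(metadata_xml: str) -> datetime:
    """Return the versioning/lastUpdated timestamp of a maven-metadata.xml document."""
    root = ET.fromstring(metadata_xml)
    stamp = root.findtext("versioning/lastUpdated")
    if stamp is None:
        raise ValueError("maven-metadata.xml has no versioning/lastUpdated element")
    # Maven writes this field as yyyyMMddHHmmss in UTC.
    return datetime.strptime(stamp.strip(), "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)


def is_stale(metadata_xml: str, now: datetime,
             max_age: timedelta = timedelta(days=2)) -> bool:
    """True when the newest snapshot is older than max_age (hypothetical threshold)."""
    return now - last_updated(metadata_xml) > max_age
```

A scheduled job could download the metadata file (for example with
urllib.request.urlopen), call is_stale with the current UTC time, and send
an alert when it returns True.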


Beam Summit at ApacheCon

2019-04-22 Thread Austin Bennett
Beam Summit will be at ApacheCon this year -- please consider submitting!

Beam Summit takes place on 11 and 12 September 2019. There are other tracks
at ApacheCon on these and other dates too.

https://www.apachecon.com/acna19/cfp.html


Re: Integration of python/portable runner tests for Samza runner

2019-04-22 Thread Ankur Goenka
Hi Daniel,

We use flinkCompatibilityMatrix [1] to check Flink compatibility with
Python. It is the Python equivalent of the Java validatesRunner tests for
portable runners.
I think we can reuse it for the Samza portable runner with minor refactoring.

[1]
https://github.com/apache/beam/blob/bdb1a713a120a887e71e85c77879dc4446a58541/sdks/python/build.gradle#L305

On Mon, Apr 22, 2019 at 3:21 PM Daniel Chen  wrote:

> Hi everyone,
>
> I'm working on improving the validation of the Python portable Samza
> runner. For Java, we have the Gradle task (:validatesRunner) that runs the
> runner validation tests.
> I am looking for pointers on how to similarly integrate/enable the
> portability and Python tests for the Samza runner.
>
> Any help will be greatly appreciated.
>
> Thanks,
> Daniel
>


Re: Artifact staging in cross-language pipelines

2019-04-22 Thread Thomas Weise
One more suggestion:

It would be nice to be able to select the environment for the external
transforms. For example, I would like to be able to use EMBEDDED for Flink.
That's implicit for sources which are runner native unbounded read
translations, but it should also be possible for writes. That would then be
similar to how pipelines are packaged and run with the "legacy" runner.

Thomas


On Mon, Apr 22, 2019 at 1:18 PM Ankur Goenka  wrote:

> Great discussion!
> I have a few points around the structure of proto but that is less
> important as it can evolve.
> However, I think that artifact compatibility is another important aspect
> to look at.
> Example: TransformA uses Guava 1.6><1.7, TransformB uses 1.8><1.9 and
> TransformC uses 1.6><1.8. As the SDK provides the environment for each
> transform, it cannot simply say EnvironmentJava for both TransformA and
> TransformB as the dependencies are not compatible.
> We should have separate environments associated with TransformA and
> TransformB in this case.
>
> To support this case, we need 2 things.
> 1: Granular metadata about the dependency including type.
> 2: Complete list of the transforms to be expanded.
>
> Elaboration:
> The compatibility check can be done in a crude way if we provide all the
> metadata about the dependency to the expansion service.
> Also, the expansion service should expand all the applicable transforms in
> a single call so that it knows about incompatibilities and can create
> separate environments for these transforms. So in the above example, the
> expansion service will associate EnvA with TransformA, EnvB with TransformB
> and EnvA with TransformC. This will of course require changes to the
> expansion service proto, but giving all the information to the expansion
> service will make it support more cases and make it a bit more future-proof.
>
>
> On Mon, Apr 22, 2019 at 10:16 AM Maximilian Michels 
> wrote:
>
>> Thanks for the summary Cham. All makes sense. I agree that we want to
>> keep the option to manually specify artifacts.
>>
>> > There are few unanswered questions though.
>> > (1) In what form will a transform author specify dependencies ? For
>> example, URL to a Maven repo, URL to a local file, blob ?
>>
>> Going forward, we probably want to support multiple ways. For now, we
>> could stick with a URL-based approach with support for different file
>> systems. In the future a list of packages to retrieve from Maven/PyPi
>> would be useful.
>>
>> We can ask the user for (type, metadata). For Maven it can be something like
> (MAVEN, {groupId:com.google.guava, artifactId: guava, version: 19}) or
> (FILE, file://myfile)
> To begin with, we can support only a few types like File and can add more
> types in future.
>
>> > (2) How will dependencies be included in the expansion response proto ?
>> String (URL), bytes (blob) ?
>>
>> I'd go for a list of Protobuf strings first but the format would have to
>> evolve for other dependency types.
>>
>> Here also (type, payload) should suffice. We can have an interpreter for
> each type to translate the payload.
>
>> > (3) How will we manage/share transitive dependencies required at
>> runtime ?
>>
>> I'd say transitive dependencies have to be included in the list. In case
>> of fat jars, they are reduced to a single jar.
>>
> Makes sense.
>
>>
>> > (4) How will dependencies be staged for various runner/SDK combinations
>> ? (for example, portable runner/Flink, Dataflow runner)
>>
>> Staging should be no different than it is now, i.e. go through Beam's
>> artifact staging service. As long as the protocol is stable, there could
>> also be different implementations.
>>
> Makes sense.
>
>>
>> -Max
>>
>> On 20.04.19 03:08, Chamikara Jayalath wrote:
>> > OK, sounds like this is a good path forward then.
>> >
>> > * When starting up the expansion service, the user (who starts up the
>> > service) provides the dependencies necessary to expand transforms. We will
>> > later add support for adding new transforms to an already running
>> > expansion service.
>> > * As a part of transform configuration, the transform author has the
>> > option of providing a list of dependencies that will be needed to run the
>> > transform.
>> > * These dependencies will be sent back to the pipeline SDK as a part of
>> > the expansion response and the pipeline SDK will stage these resources.
>> > * The pipeline author has the option of specifying the dependencies using
>> > a pipeline option (for example, https://github.com/apache/beam/pull/8340).
>> >
>> > I think the last option is important to (1) make existing transforms easily
>> > available for cross-language usage without additional configuration and
>> > (2) allow pipeline authors to override dependency versions specified in
>> > the transform configuration (for example, to apply security patches)
>> > without updating the expansion service.
>> >
>> > There are few unanswered questions though.
>> > (1) In what form will a transform author specify dependencies ? For
>> > example, URL to a Maven 

Integration of python/portable runner tests for Samza runner

2019-04-22 Thread Daniel Chen
Hi everyone,

I'm working on improving the validation of the Python portable Samza
runner. For Java, we have the Gradle task (:validatesRunner) that runs the
runner validation tests.
I am looking for pointers on how to similarly integrate/enable the
portability and Python tests for the Samza runner.

Any help will be greatly appreciated.

Thanks,
Daniel


Re: Projects Can Apply Individually for Google Season of Docs

2019-04-22 Thread Pablo Estrada
Hello all,
thanks to everyone for your participation. I have submitted the application
on behalf of Beam, and requested one technical writer. Let's see how it
goes : )
Best
-P.

On Wed, Apr 17, 2019 at 10:09 PM Ahmet Altay  wrote:

> Thanks Aizhamal, I completed the forms.
>
> On Wed, Apr 17, 2019 at 6:46 PM Aizhamal Nurmamat kyzy <
> aizha...@google.com> wrote:
>
>> Hi everyone,
>>
>> Here are a few updates on our application for Season of Docs:
>>
>> 1. Pablo and I have created the following document [1] with some of the
>> project ideas shared in the mailing list. If you have more ideas, please
>> add them to the doc and provide a description. If you also want to be a
>> mentor for the proposed ideas, please add your name to the table.
>>
>> 2. To submit our application, we need to publish our project ideas list.
>> For this we have opened Jira tickets with the “gsod2019” label [2]. We should
>> maybe also think about adding a small blog post to the Beam website that
>> contains all the ideas in one place [3]? Please let me know what you think
>> about this.
>>
>> 3. By next week Tuesday (Application Deadline):
>>
>>    - +pabl...@apache.org, please complete the org application form [4]
>>    - @Ahmet Altay, please complete the alternative administrator form [5]
>>    - @pabl...@apache.org, @Ahmet Altay, and all other contributors that
>>      want to participate as mentors, please complete the mentor
>>      registration form [6]
>>
>>
>> Thank you,
>>
>> Aizhamal
>>
>>
>> [1]
>> https://docs.google.com/document/d/1FNf-BjB4Q7PDdqygPboLr7CyIeo6JAkrt0RBgs2I4dE/edit#
>>
>> [2]
>> https://issues.apache.org/jira/browse/BEAM-7104?jql=project%20%3D%20BEAM%20AND%20status%20%3D%20Open%20AND%20labels%20%3D%20gsod2019
>>
>> [3] https://beam.apache.org/blog/
>>
>> [4]
>> https://docs.google.com/forms/d/e/1FAIpQLScrEq5yKmadgn7LEPC8nN811-6DNmYvus5uXv_JY5BX7CH-Bg/viewform
>>
>> [5]
>> https://docs.google.com/forms/d/e/1FAIpQLSc5ZsBzqfsib-epktZp8bYxL_hO4RhT_Zz8AY6zXDHB79ue9g/viewform
>> [6]
>> https://docs.google.com/forms/d/e/1FAIpQLSe-JjGvaKKGWZOXxrorONhB8qN3mjPrB9ZVkcsntR73Cv_K7g/viewform
>>
>> On Wed, Apr 10, 2019 at 2:57 PM Pablo Estrada  wrote:
>>
>>> I'd be happy to be a mentor for this to help add getting started
>>> documentation for Python on Flink. I'd want to focus on the reviews and
>>> less on the administration - so I'm willing to be a secondary administrator
>>> if that's necessary to move forward, but I'd love it if someone would help
>>> administer.
>>> FWIW, neither the administrator nor any other mentor has to be a
>>> committer.
>>>
>>> Anyone willing to be primary administrator and also a mentor?
>>>
>>> Thanks
>>> -P.
>>>
>>> On Fri, Apr 5, 2019 at 9:40 AM Kenneth Knowles  wrote:
>>>
 Yes, this is great. Thanks for noticing the call and pushing ahead on
 this, Aizhamal!

 I would also like to see the runner comparison revamp at
 https://issues.apache.org/jira/browse/BEAM-2888 which would help users
 really understand what they can and cannot do in plain terms.

 Kenn

 On Fri, Apr 5, 2019 at 9:30 AM Ahmet Altay  wrote:

> Thank you Aizhamal for volunteering. I am happy to help as an
> administrator.
>
> cc: +Rose Nguyen  +Melissa Pashniak
>  in case they will be interested in mentorship
> and/or administration.
>
>
>
>
> On Fri, Apr 5, 2019 at 9:16 AM Thomas Weise  wrote:
>
>> This is great. Beam documentation needs work in several areas, Python
>> SDK, portability and SQL come to mind right away :)
>>
>>
>> On Thu, Apr 4, 2019 at 4:21 PM Aizhamal Nurmamat kyzy <
>> aizha...@google.com> wrote:
>>
>>> Hello everyone,
>>>
>>> As the ASF announced that each project can apply for Season of Docs
>>> individually, I would like to volunteer to be one of the administrators
>>> for the program. Is this fine for everyone in the community? If so, I will
>>> start working on the application on behalf of Beam this week, and I will
>>> send updates on this thread with progress.
>>>
>>> The program requires two administrators, so any volunteers would be
>>> appreciated. I’m happy to take on the administrative load, and partner
>>> with committers or PMC members. We will also need at least two mentors for
>>> the program, to onboard tech writers to the project and work with them
>>> closely during the 3-month period. Please express your interest in the
>>> thread :)
>>>
>>> If you have some ideas to work on for Season of Docs, please let me
>>> know directly, or file a JIRA issue, and add the "gsod" and "gsod2019"
>>> labels to it. It will help us to gather ideas and put them together in
>>> the application.
>>>
>>> Thanks everybody,
>>> Aizhamal
>>>
>>>
>>> On Wed, Apr 3, 2019 at 1:55 PM  wrote:
>>>
 Hi All


Re: [ANNOUNCE] New committer announcement: Yifan Zou

2019-04-22 Thread Melissa Pashniak
Congrats Yifan!


On Mon, Apr 22, 2019 at 1:33 PM Ankur Goenka  wrote:

> Congratulations Yifan!
>
> On Mon, Apr 22, 2019 at 11:42 AM Thomas Weise  wrote:
>
>> Congrats Yifan!
>>
>>
>> On Mon, Apr 22, 2019 at 10:02 AM Maximilian Michels 
>> wrote:
>>
>>> Congrats! Great work.
>>>
>>> -Max
>>>
>>> On 22.04.19 19:00, Rui Wang wrote:
>>> > Congratulations! Thanks for your contribution!!
>>> >
>>> > -Rui
>>> >
>>> > On Mon, Apr 22, 2019 at 9:57 AM Ruoyun Huang wrote:
>>> >
>>> > Congratulations, Yifan!
>>> >
>>> > On Mon, Apr 22, 2019 at 9:48 AM Boyuan Zhang wrote:
>>> >
>>> > Congratulations, Yifan~
>>> >
>>> > On Mon, Apr 22, 2019 at 9:29 AM Connell O'Callaghan
>>> > <conne...@google.com> wrote:
>>> >
>>> > Well done Yifan!!!
>>> >
>>> > Thank you for sharing Kenn!!!
>>> >
>>> > On Mon, Apr 22, 2019 at 9:00 AM Ahmet Altay
>>> > <al...@google.com> wrote:
>>> >
>>> > Congratulations, Yifan!
>>> >
>>> > On Mon, Apr 22, 2019 at 8:46 AM Tim Robertson wrote:
>>> >
>>> > Congratulations Yifan!
>>> >
>>> > On Mon, Apr 22, 2019 at 5:39 PM Cyrus Maden
>>> > <cma...@google.com> wrote:
>>> >
>>> > Congratulations Yifan!!
>>> >
>>> > On Mon, Apr 22, 2019 at 11:26 AM Kenneth Knowles
>>> > <k...@apache.org> wrote:
>>> >
>>> > Hi all,
>>> >
>>> > Please join me and the rest of the Beam PMC
>>> > in welcoming a new committer: Yifan Zou.
>>> >
>>> > Yifan has been contributing to Beam since
>>> > early 2018. He has proposed 70+ pull
>>> > requests, adding dependency checking and
>>> > improving test infrastructure. But something
>>> > the numbers cannot show adequately is the
>>> > huge effort Yifan has put into working with
>>> > infra and keeping our Jenkins executors healthy.
>>> >
>>> > In consideration of Yifan's contributions,
>>> > the Beam PMC trusts Yifan with the
>>> > responsibilities of a Beam committer [1].
>>> >
>>> > Thank you, Yifan, for your contributions.
>>> >
>>> > Kenn
>>> >
>>> > [1]
>>> > https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
>>> >
>>> >
>>> >
>>> >
>>> > --
>>> > 
>>> > Ruoyun  Huang
>>> >
>>>
>>


Re: [ANNOUNCE] New committer announcement: Yifan Zou

2019-04-22 Thread Ankur Goenka
Congratulations Yifan!

On Mon, Apr 22, 2019 at 11:42 AM Thomas Weise  wrote:

> Congrats Yifan!
>
>
> On Mon, Apr 22, 2019 at 10:02 AM Maximilian Michels 
> wrote:
>
>> Congrats! Great work.
>>
>> -Max
>>
>> On 22.04.19 19:00, Rui Wang wrote:
>> > Congratulations! Thanks for your contribution!!
>> >
>> > -Rui
>> >
>> > On Mon, Apr 22, 2019 at 9:57 AM Ruoyun Huang wrote:
>> >
>> > Congratulations, Yifan!
>> >
>> > On Mon, Apr 22, 2019 at 9:48 AM Boyuan Zhang wrote:
>> >
>> > Congratulations, Yifan~
>> >
>> > On Mon, Apr 22, 2019 at 9:29 AM Connell O'Callaghan
>> > <conne...@google.com> wrote:
>> >
>> > Well done Yifan!!!
>> >
>> > Thank you for sharing Kenn!!!
>> >
>> > On Mon, Apr 22, 2019 at 9:00 AM Ahmet Altay
>> > <al...@google.com> wrote:
>> >
>> > Congratulations, Yifan!
>> >
>> > On Mon, Apr 22, 2019 at 8:46 AM Tim Robertson wrote:
>> >
>> > Congratulations Yifan!
>> >
>> > On Mon, Apr 22, 2019 at 5:39 PM Cyrus Maden
>> > <cma...@google.com> wrote:
>> >
>> > Congratulations Yifan!!
>> >
>> > On Mon, Apr 22, 2019 at 11:26 AM Kenneth Knowles
>> > <k...@apache.org> wrote:
>> >
>> > Hi all,
>> >
>> > Please join me and the rest of the Beam PMC
>> > in welcoming a new committer: Yifan Zou.
>> >
>> > Yifan has been contributing to Beam since
>> > early 2018. He has proposed 70+ pull
>> > requests, adding dependency checking and
>> > improving test infrastructure. But something
>> > the numbers cannot show adequately is the
>> > huge effort Yifan has put into working with
>> > infra and keeping our Jenkins executors healthy.
>> >
>> > In consideration of Yifan's contributions,
>> > the Beam PMC trusts Yifan with the
>> > responsibilities of a Beam committer [1].
>> >
>> > Thank you, Yifan, for your contributions.
>> >
>> > Kenn
>> >
>> > [1]
>> > https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
>> >
>> >
>> >
>> >
>> > --
>> > 
>> > Ruoyun  Huang
>> >
>>
>


Re: Artifact staging in cross-language pipelines

2019-04-22 Thread Ankur Goenka
Great discussion!
I have a few points around the structure of proto but that is less
important as it can evolve.
However, I think that artifact compatibility is another important aspect to
look at.
Example: TransformA uses Guava 1.6><1.7, TransformB uses 1.8><1.9 and
TransformC uses 1.6><1.8. As the SDK provides the environment for each
transform, it cannot simply say EnvironmentJava for both TransformA and
TransformB as the dependencies are not compatible.
We should have separate environments associated with TransformA and
TransformB in this case.

To support this case, we need 2 things.
1: Granular metadata about the dependency including type.
2: Complete list of the transforms to be expanded.

Elaboration:
The compatibility check can be done in a crude way if we provide all the
metadata about the dependency to the expansion service.
Also, the expansion service should expand all the applicable transforms in
a single call so that it knows about incompatibilities and can create
separate environments for these transforms. So in the above example, the
expansion service will associate EnvA with TransformA, EnvB with TransformB
and EnvA with TransformC. This will of course require changes to the
expansion service proto, but giving all the information to the expansion
service will make it support more cases and make it a bit more future-proof.
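
The environment assignment described above can be sketched as a greedy
grouping over version ranges. This is only an illustration under stated
assumptions: "1.6><1.7" is read as an inclusive range from 1.6 to 1.7,
versions are modelled as integer tuples, and the names intersect and
assign_environments are hypothetical, not part of any Beam or expansion
service API.

```python
from typing import Dict, List, Optional, Tuple

Version = Tuple[int, ...]
Range = Tuple[Version, Version]  # inclusive (low, high) bounds


def intersect(a: Range, b: Range) -> Optional[Range]:
    """Return the overlap of two inclusive ranges, or None if they are disjoint."""
    low, high = max(a[0], b[0]), min(a[1], b[1])
    return (low, high) if low <= high else None


def assign_environments(requirements: Dict[str, Range]) -> Dict[str, str]:
    """Place each transform in the first environment whose range still overlaps its own."""
    env_ranges: List[Range] = []
    assignment: Dict[str, str] = {}
    for transform, wanted in requirements.items():
        for index, env_range in enumerate(env_ranges):
            common = intersect(env_range, wanted)
            if common is not None:
                # Narrow the environment to versions acceptable to all its members.
                env_ranges[index] = common
                assignment[transform] = "Env" + chr(ord("A") + index)
                break
        else:
            # No compatible environment exists yet, so open a new one.
            env_ranges.append(wanted)
            assignment[transform] = "Env" + chr(ord("A") + len(env_ranges) - 1)
    return assignment
```

With the ranges from the example, TransformA and TransformC share EnvA and
TransformB gets EnvB. A greedy pass depends on the order of the transforms
and does not always produce the smallest number of environments, which is
one reason the expansion service would need to see all transforms in one
call.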


On Mon, Apr 22, 2019 at 10:16 AM Maximilian Michels  wrote:

> Thanks for the summary Cham. All makes sense. I agree that we want to
> keep the option to manually specify artifacts.
>
> > There are few unanswered questions though.
> > (1) In what form will a transform author specify dependencies ? For
> example, URL to a Maven repo, URL to a local file, blob ?
>
> Going forward, we probably want to support multiple ways. For now, we
> could stick with a URL-based approach with support for different file
> systems. In the future a list of packages to retrieve from Maven/PyPi
> would be useful.
>
> We can ask the user for (type, metadata). For Maven it can be something like
(MAVEN, {groupId:com.google.guava, artifactId: guava, version: 19}) or
(FILE, file://myfile)
To begin with, we can support only a few types like File and can add more
types in future.

> > (2) How will dependencies be included in the expansion response proto ?
> String (URL), bytes (blob) ?
>
> I'd go for a list of Protobuf strings first but the format would have to
> evolve for other dependency types.
>
> Here also (type, payload) should suffice. We can have an interpreter for each
type to translate the payload.
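
A minimal sketch of the (type, payload) idea with one interpreter per type.
Everything here is an assumption made for illustration: the Dependency
class, the INTERPRETERS registry, the "url" payload key and the choice to
resolve Maven coordinates against Maven Central using the standard
repository layout are hypothetical and are not the expansion service proto.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Assumed base URL; any repository using the standard Maven layout would work.
MAVEN_CENTRAL = "https://repo.maven.apache.org/maven2"


@dataclass(frozen=True)
class Dependency:
    type: str                # e.g. "MAVEN" or "FILE"
    payload: Dict[str, str]  # type-specific metadata


def _maven_url(payload: Dict[str, str]) -> str:
    # Standard layout: group path / artifactId / version / artifactId-version.jar
    group_path = payload["groupId"].replace(".", "/")
    artifact, version = payload["artifactId"], payload["version"]
    return f"{MAVEN_CENTRAL}/{group_path}/{artifact}/{version}/{artifact}-{version}.jar"


def _file_url(payload: Dict[str, str]) -> str:
    # A FILE dependency already carries its location.
    return payload["url"]


# One interpreter per dependency type; new types register themselves here.
INTERPRETERS: Dict[str, Callable[[Dict[str, str]], str]] = {
    "MAVEN": _maven_url,
    "FILE": _file_url,
}


def resolve(dep: Dependency) -> str:
    """Translate a dependency into a URL that a staging step could fetch."""
    try:
        interpreter = INTERPRETERS[dep.type]
    except KeyError:
        raise ValueError(f"unsupported dependency type: {dep.type}") from None
    return interpreter(dep.payload)
```

Keeping the payload opaque to everything except its interpreter is what lets
new dependency types be added later without changing the envelope.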

> > (3) How will we manage/share transitive dependencies required at runtime
> ?
>
> I'd say transitive dependencies have to be included in the list. In case
> of fat jars, they are reduced to a single jar.
>
Makes sense.

>
> > (4) How will dependencies be staged for various runner/SDK combinations
> ? (for example, portable runner/Flink, Dataflow runner)
>
> Staging should be no different than it is now, i.e. go through Beam's
> artifact staging service. As long as the protocol is stable, there could
> also be different implementations.
>
Makes sense.

>
> -Max
>
> On 20.04.19 03:08, Chamikara Jayalath wrote:
> > OK, sounds like this is a good path forward then.
> >
> > * When starting up the expansion service, the user (who starts up the
> > service) provides the dependencies necessary to expand transforms. We will
> > later add support for adding new transforms to an already running
> > expansion service.
> > * As a part of transform configuration, the transform author has the option
> > of providing a list of dependencies that will be needed to run the
> > transform.
> > * These dependencies will be sent back to the pipeline SDK as a part of
> > the expansion response and the pipeline SDK will stage these resources.
> > * The pipeline author has the option of specifying the dependencies using a
> > pipeline option (for example, https://github.com/apache/beam/pull/8340).
> >
> > I think the last option is important to (1) make existing transforms easily
> > available for cross-language usage without additional configuration and
> > (2) allow pipeline authors to override dependency versions specified in
> > the transform configuration (for example, to apply security patches)
> > without updating the expansion service.
> >
> > There are few unanswered questions though.
> > (1) In what form will a transform author specify dependencies ? For
> > example, URL to a Maven repo, URL to a local file, blob ?
> > (2) How will dependencies be included in the expansion response proto ?
> > String (URL), bytes (blob) ?
> > (3) How will we manage/share transitive dependencies required at runtime
> ?
> > (4) How will dependencies be staged for various runner/SDK combinations
> > ? (for example, portable runner/Flink, Dataflow runner)
> >
> > Thanks,
> > Cham
> >
> > On Fri, Apr 19, 2019 at 4:49 AM Maximilian Michels  > > wrote:
> >
> > Thank you for your replies.
> >
> > I did not suggest that the Expansion Service does the staging, but 

Re: [VOTE] Release 2.12.0, release candidate #4

2019-04-22 Thread Andrew Pilloud
I signed the wheel files and updated the build process to not require
giving Travis Apache credentials. (You should probably change your password
if you haven't already.)

Andrew

On Mon, Apr 22, 2019 at 12:18 PM Ahmet Altay  wrote:

> +1 (binding)
>
> Verified the python 2 wheel files with quick start examples.
>
> On Mon, Apr 22, 2019 at 11:26 AM Ahmet Altay  wrote:
>
>> I built the wheel files. They are in the usual place along with other
>> python artifacts. I will test them a bit and update here. Could someone
>> else please try the wheel files as well?
>>
>> Andrew, could you sign and hash the wheel files?
>>
>> On Mon, Apr 22, 2019 at 10:11 AM Ahmet Altay  wrote:
>>
>>> I verified
>>> - signatures and hashes.
>>>  - python streaming quickstart guide
>>>
>>> I would like to verify the wheel files before voting. Please let us know
>>> when they are ready. Also, if you need help with building wheel files I can
>>> help/build.
>>>
>>> Ahmet
>>>
>>> On Mon, Apr 22, 2019 at 3:33 AM Maximilian Michels 
>>> wrote:
>>>
 +1 (binding)

 Found a minor bug while testing, but not a blocker:
 https://jira.apache.org/jira/browse/BEAM-7128

 Thanks,
 Max

 On 20.04.19 23:02, Pablo Estrada wrote:
 > +1
 > Ran SQL postcommit, and Dataflow Portability Java validatesrunner
 tests.
 >
 > -P.
 >
 > On Wed, Apr 17, 2019 at 1:38 AM Jean-Baptiste Onofré wrote:
 >
 > +1 (binding)
 >
 > Quickly checked with beam-samples.
 >
 > Regards
 > JB
 >
 > On 16/04/2019 00:50, Andrew Pilloud wrote:
 >  > Hi everyone,
 >  >
 >  > Please review and vote on the release candidate #4 for the
 version
 >  > 2.12.0, as follows:
 >  >
 >  > [ ] +1, Approve the release
 >  > [ ] -1, Do not approve the release (please provide specific
 comments)
 >  >
 >  > The complete staging area is available for your review, which
 > includes:
 >  > * JIRA release notes [1],
 >  > * the official Apache source release to be deployed to
 >  > dist.apache.org [2], which is signed with the key with
 >  > fingerprint 9E7CEC0661EFD610B632C610AE8FE17F9F8AE3D4 [3],
 >  > * all artifacts to be deployed to the Maven Central Repository
 [4],
 >  > * source code tag "v2.12.0-RC4" [5],
 >  > * website pull request listing the release [6], publishing the
 API
 >  > reference manual [7], and the blog post [8].
 >  > * Java artifacts were built with Gradle/5.2.1 and
 OpenJDK/Oracle JDK
 >  > 1.8.0_181.
 >  > * Python artifacts are deployed along with the source release
 >  > to dist.apache.org [2].
 >  > * Validation sheet with a tab for 2.12.0 release to help with
 > validation
 >  > [9].
 >  >
 >  > The vote will be open for at least 72 hours. It is adopted by
 > majority
 >  > approval, with at least 3 PMC affirmative votes.
 >  >
 >  > Thanks,
 >  > Andrew
 >  >
 >  > [1]
 >
 https://jira.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&version=12344944
 >  > [2] https://dist.apache.org/repos/dist/dev/beam/2.12.0/
 >  > [3] https://dist.apache.org/repos/dist/release/beam/KEYS
 >  > [4]
 >
 https://repository.apache.org/content/repositories/orgapachebeam-1068/
 >  > [5] https://github.com/apache/beam/tree/v2.12.0-RC4
 >  > [6] https://github.com/apache/beam/pull/8215
 >  > [7] https://github.com/apache/beam-site/pull/588
 >  > [8] https://github.com/apache/beam/pull/8314
 >  > [9]
 >
 https://docs.google.com/spreadsheets/d/1qk-N5vjXvbcEk68GjbkSZTR8AGqyNUM-oLFo_ZXBpJw/edit#gid=1007316984
 >
 > --
 > Jean-Baptiste Onofré
 > jbono...@apache.org 
 > http://blog.nanthrax.net
 > Talend - http://www.talend.com
 >

>>>


Re: [VOTE] Release 2.12.0, release candidate #4

2019-04-22 Thread Ahmet Altay
+1 (binding)

Verified the python 2 wheel files with quick start examples.

On Mon, Apr 22, 2019 at 11:26 AM Ahmet Altay  wrote:

> I built the wheel files. They are in the usual place along with other
> python artifacts. I will test them a bit and update here. Could someone
> else please try the wheel files as well?
>
> Andrew, could you sign and hash the wheel files?
>
> On Mon, Apr 22, 2019 at 10:11 AM Ahmet Altay  wrote:
>
>> I verified
>> - signatures and hashes.
>>  - python streaming quickstart guide
>>
>> I would like to verify the wheel files before voting. Please let us know
>> when they are ready. Also, if you need help with building wheel files I can
>> help/build.
>>
>> Ahmet
>>
>> On Mon, Apr 22, 2019 at 3:33 AM Maximilian Michels 
>> wrote:
>>
>>> +1 (binding)
>>>
>>> Found a minor bug while testing, but not a blocker:
>>> https://jira.apache.org/jira/browse/BEAM-7128
>>>
>>> Thanks,
>>> Max
>>>
>>> On 20.04.19 23:02, Pablo Estrada wrote:
>>> > +1
>>> > Ran SQL postcommit, and Dataflow Portability Java validatesrunner
>>> tests.
>>> >
>>> > -P.
>>> >
>>> > On Wed, Apr 17, 2019 at 1:38 AM Jean-Baptiste Onofré wrote:
>>> >
>>> > +1 (binding)
>>> >
>>> > Quickly checked with beam-samples.
>>> >
>>> > Regards
>>> > JB
>>> >
>>> > On 16/04/2019 00:50, Andrew Pilloud wrote:
>>> >  > Hi everyone,
>>> >  >
>>> >  > Please review and vote on the release candidate #4 for the
>>> version
>>> >  > 2.12.0, as follows:
>>> >  >
>>> >  > [ ] +1, Approve the release
>>> >  > [ ] -1, Do not approve the release (please provide specific
>>> comments)
>>> >  >
>>> >  > The complete staging area is available for your review, which
>>> > includes:
>>> >  > * JIRA release notes [1],
>>> >  > * the official Apache source release to be deployed to
>>> >  > dist.apache.org [2], which is signed with the key with
>>> >  > fingerprint 9E7CEC0661EFD610B632C610AE8FE17F9F8AE3D4 [3],
>>> >  > * all artifacts to be deployed to the Maven Central Repository
>>> [4],
>>> >  > * source code tag "v2.12.0-RC4" [5],
>>> >  > * website pull request listing the release [6], publishing the
>>> API
>>> >  > reference manual [7], and the blog post [8].
>>> >  > * Java artifacts were built with Gradle/5.2.1 and
>>> OpenJDK/Oracle JDK
>>> >  > 1.8.0_181.
>>> >  > * Python artifacts are deployed along with the source release
>>> >  > to dist.apache.org [2].
>>> >  > * Validation sheet with a tab for 2.12.0 release to help with
>>> > validation
>>> >  > [9].
>>> >  >
>>> >  > The vote will be open for at least 72 hours. It is adopted by
>>> > majority
>>> >  > approval, with at least 3 PMC affirmative votes.
>>> >  >
>>> >  > Thanks,
>>> >  > Andrew
>>> >  >
>>> >  > [1]
>>> >
>>> https://jira.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&version=12344944
>>> >  > [2] https://dist.apache.org/repos/dist/dev/beam/2.12.0/
>>> >  > [3] https://dist.apache.org/repos/dist/release/beam/KEYS
>>> >  > [4]
>>> >
>>> https://repository.apache.org/content/repositories/orgapachebeam-1068/
>>> >  > [5] https://github.com/apache/beam/tree/v2.12.0-RC4
>>> >  > [6] https://github.com/apache/beam/pull/8215
>>> >  > [7] https://github.com/apache/beam-site/pull/588
>>> >  > [8] https://github.com/apache/beam/pull/8314
>>> >  > [9]
>>> >
>>> https://docs.google.com/spreadsheets/d/1qk-N5vjXvbcEk68GjbkSZTR8AGqyNUM-oLFo_ZXBpJw/edit#gid=1007316984
>>> >
>>> > --
>>> > Jean-Baptiste Onofré
>>> > jbono...@apache.org 
>>> > http://blog.nanthrax.net
>>> > Talend - http://www.talend.com
>>> >
>>>
>>


Meetups in London, Stockholm + recordings

2019-04-22 Thread Matthias Baetens
Hi everyone,

Happy to announce 2 meetups for the next week(s):
- *London* is having its second meetup of the year on *01/05/2019 in
Level39, Canary Wharf*:
https://www.meetup.com/London-Apache-Beam-Meetup/events/260526077 with
talks from Lyft, Datatonic and Google.
- *Stockholm* is having its second meetup in its lifetime on *06/05/2019 at
Spotify*:
https://www.meetup.com/Apache-Beam-Stockholm/events/260634514 with talks
from Spotify, EQT and Google.

The recording of the 6th meetup in London at Revolut is up on YouTube now.

Last but not least: start following the Beam Summit handle on Twitter
for the latest information on the Beam Summit in Berlin (19-20 June 2019)
and get your ticket on Eventbrite!

Best regards,
Matthias


Re: [ANNOUNCE] New committer announcement: Yifan Zou

2019-04-22 Thread Thomas Weise
Congrats Yifan!


On Mon, Apr 22, 2019 at 10:02 AM Maximilian Michels  wrote:

> Congrats! Great work.
>
> -Max
>
> On 22.04.19 19:00, Rui Wang wrote:
> > Congratulations! Thanks for your contribution!!
> >
> > -Rui
> >
> > On Mon, Apr 22, 2019 at 9:57 AM Ruoyun Huang wrote:
> >
> > Congratulations, Yifan!
> >
> > On Mon, Apr 22, 2019 at 9:48 AM Boyuan Zhang wrote:
> >
> > Congratulations, Yifan~
> >
> > On Mon, Apr 22, 2019 at 9:29 AM Connell O'Callaghan wrote:
> >
> > Well done Yifan!!!
> >
> > Thank you for sharing Kenn!!!
> >
> > On Mon, Apr 22, 2019 at 9:00 AM Ahmet Altay wrote:
> >
> > Congratulations, Yifan!
> >
> > On Mon, Apr 22, 2019 at 8:46 AM Tim Robertson wrote:
> >
> > Congratulations Yifan!
> >
> > On Mon, Apr 22, 2019 at 5:39 PM Cyrus Maden wrote:
> >
> > Congratulations Yifan!!
> >
> > On Mon, Apr 22, 2019 at 11:26 AM Kenneth Knowles wrote:
> >
> > Hi all,
> >
> > Please join me and the rest of the Beam PMC
> > in welcoming a new committer: Yifan Zou.
> >
> > Yifan has been contributing to Beam since
> > early 2018. He has proposed 70+ pull
> > requests, adding dependency checking and
> > improving test infrastructure. But something
> > the numbers cannot show adequately is the
> > huge effort Yifan has put into working with
> > infra and keeping our Jenkins executors
> healthy.
> >
> > In consideration of Yifan's contributions,
> > the Beam PMC trusts Yifan with the
> > responsibilities of a Beam committer [1].
> >
> > Thank you, Yifan, for your contributions.
> >
> > Kenn
> >
> > [1] https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
> >
> >
> >
> > --
> > 
> > Ruoyun  Huang
> >
>


Re: [ANNOUNCE] New committer announcement: Yifan Zou

2019-04-22 Thread Udi Meiri
Congrats Yifan!

On Mon, Apr 22, 2019 at 11:04 AM Valentyn Tymofieiev 
wrote:

> Congratulations, Yifan! Thanks a lot for your continued contributions to
> Beam.
>
> On Mon, Apr 22, 2019 at 10:24 AM Robin Qiu  wrote:
>
>> Congratulations Yifan!
>>
>> On Mon, Apr 22, 2019 at 10:17 AM Chamikara Jayalath 
>> wrote:
>>
>>> Congrats Yifan!
>>>
>>> On Mon, Apr 22, 2019 at 10:02 AM Maximilian Michels 
>>> wrote:
>>>
 Congrats! Great work.

 -Max

 On 22.04.19 19:00, Rui Wang wrote:
 > Congratulations! Thanks for your contribution!!
 >
 > -Rui
 >
 > On Mon, Apr 22, 2019 at 9:57 AM Ruoyun Huang >>> > > wrote:
 >
 > Congratulations, Yifan!
 >
 > On Mon, Apr 22, 2019 at 9:48 AM Boyuan Zhang >>> > > wrote:
 >
 > Congratulations, Yifan~
 >
 > On Mon, Apr 22, 2019 at 9:29 AM Connell O'Callaghan
 > mailto:conne...@google.com>> wrote:
 >
 > Well done Yifan!!!
 >
 > Thank you for sharing Kenn!!!
 >
 > On Mon, Apr 22, 2019 at 9:00 AM Ahmet Altay
 > mailto:al...@google.com>> wrote:
 >
 > Congratulations, Yifan!
 >
 > On Mon, Apr 22, 2019 at 8:46 AM Tim Robertson
 > >>> > > wrote:
 >
 > Congratulations Yifan!
 >
 > On Mon, Apr 22, 2019 at 5:39 PM Cyrus Maden
 > mailto:cma...@google.com>>
 wrote:
 >
 > Congratulations Yifan!!
 >
 > On Mon, Apr 22, 2019 at 11:26 AM Kenneth
 Knowles
 > mailto:k...@apache.org>>
 wrote:
 >
 > Hi all,
 >
 > Please join me and the rest of the
 Beam PMC
 > in welcoming a new committer: Yifan Zou.
 >
 > Yifan has been contributing to Beam since
 > early 2018. He has proposed 70+ pull
 > requests, adding dependency checking and
 > improving test infrastructure. But
 something
 > the numbers cannot show adequately is the
 > huge effort Yifan has put into working
 with
 > infra and keeping our Jenkins executors
 healthy.
 >
 > In consideration of Yifan's contributions,
 > the Beam PMC trusts Yifan with the
 > responsibilities of a Beam committer [1].
 >
 > Thank you, Yifan, for your contributions.
 >
 > Kenn
 >
 > [1]
 >
 https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
 > <
 https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
 >
 >
 >
 >
 > --
 > 
 > Ruoyun  Huang
 >

>>>




Re: [VOTE] Release 2.12.0, release candidate #4

2019-04-22 Thread Ahmet Altay
I built the wheel files. They are in the usual place along with other
python artifacts. I will test them a bit and update here. Could someone
else please try the wheel files as well?

Andrew, could you sign and hash the wheel files?

On Mon, Apr 22, 2019 at 10:11 AM Ahmet Altay  wrote:

> I verified
> - signatures and hashes.
>  - python streaming quickstart guide
>
> I would like to verify the wheel files before voting. Please let us know
> when they are ready. Also, if you need help with building wheel files I can
> help/build.
>
> Ahmet
>
> On Mon, Apr 22, 2019 at 3:33 AM Maximilian Michels  wrote:
>
>> +1 (binding)
>>
>> Found a minor bug while testing, but not a blocker:
>> https://jira.apache.org/jira/browse/BEAM-7128
>>
>> Thanks,
>> Max
>>
>> On 20.04.19 23:02, Pablo Estrada wrote:
>> > +1
>> > Ran SQL postcommit, and Dataflow Portability Java validatesrunner tests.
>> >
>> > -P.
>> >
>> > On Wed, Apr 17, 2019 at 1:38 AM Jean-Baptiste Onofré > > > wrote:
>> >
>> > +1 (binding)
>> >
>> > Quickly checked with beam-samples.
>> >
>> > Regards
>> > JB
>> >
>> > On 16/04/2019 00:50, Andrew Pilloud wrote:
>> >  > Hi everyone,
>> >  >
>> >  > Please review and vote on the release candidate #4 for the
>> version
>> >  > 2.12.0, as follows:
>> >  >
>> >  > [ ] +1, Approve the release
>> >  > [ ] -1, Do not approve the release (please provide specific
>> comments)
>> >  >
>> >  > The complete staging area is available for your review, which
>> > includes:
>> >  > * JIRA release notes [1],
>> >  > * the official Apache source release to be deployed to
>> > dist.apache.org 
>> >  >  [2], which is signed with the key with
>> >  > fingerprint 9E7CEC0661EFD610B632C610AE8FE17F9F8AE3D4 [3],
>> >  > * all artifacts to be deployed to the Maven Central Repository
>> [4],
>> >  > * source code tag "v2.12.0-RC4" [5],
>> >  > * website pull request listing the release [6], publishing the
>> API
>> >  > reference manual [7], and the blog post [8].
>> >  > * Java artifacts were built with Gradle/5.2.1 and OpenJDK/Oracle
>> JDK
>> >  > 1.8.0_181.
>> >  > * Python artifacts are deployed along with the source release to
>> the
>> >  > dist.apache.org  > >
>> > [2].
>> >  > * Validation sheet with a tab for 2.12.0 release to help with
>> > validation
>> >  > [9].
>> >  >
>> >  > The vote will be open for at least 72 hours. It is adopted by
>> > majority
>> >  > approval, with at least 3 PMC affirmative votes.
>> >  >
>> >  > Thanks,
>> >  > Andrew
>> >  >
>> >  > [1]
>> >
>> https://jira.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&version=12344944
>> >  > [2] https://dist.apache.org/repos/dist/dev/beam/2.12.0/
>> >  > [3] https://dist.apache.org/repos/dist/release/beam/KEYS
>> >  > [4]
>> >
>> https://repository.apache.org/content/repositories/orgapachebeam-1068/
>> >  > [5] https://github.com/apache/beam/tree/v2.12.0-RC4
>> >  > [6] https://github.com/apache/beam/pull/8215
>> >  > [7] https://github.com/apache/beam-site/pull/588
>> >  > [8] https://github.com/apache/beam/pull/8314
>> >  > [9]
>> >
>> https://docs.google.com/spreadsheets/d/1qk-N5vjXvbcEk68GjbkSZTR8AGqyNUM-oLFo_ZXBpJw/edit#gid=1007316984
>> >
>> > --
>> > Jean-Baptiste Onofré
>> > jbono...@apache.org 
>> > http://blog.nanthrax.net
>> > Talend - http://www.talend.com
>> >
>>
>


Re: [ANNOUNCE] New committer announcement: Yifan Zou

2019-04-22 Thread Valentyn Tymofieiev
Congratulations, Yifan! Thanks a lot for your continued contributions to
Beam.

On Mon, Apr 22, 2019 at 10:24 AM Robin Qiu  wrote:

> Congratulations Yifan!
>
> On Mon, Apr 22, 2019 at 10:17 AM Chamikara Jayalath 
> wrote:
>
>> Congrats Yifan!
>>
>> On Mon, Apr 22, 2019 at 10:02 AM Maximilian Michels 
>> wrote:
>>
>>> Congrats! Great work.
>>>
>>> -Max
>>>
>>> On 22.04.19 19:00, Rui Wang wrote:
>>> > Congratulations! Thanks for your contribution!!
>>> >
>>> > -Rui
>>> >
>>> > On Mon, Apr 22, 2019 at 9:57 AM Ruoyun Huang >> > > wrote:
>>> >
>>> > Congratulations, Yifan!
>>> >
>>> > On Mon, Apr 22, 2019 at 9:48 AM Boyuan Zhang >> > > wrote:
>>> >
>>> > Congratulations, Yifan~
>>> >
>>> > On Mon, Apr 22, 2019 at 9:29 AM Connell O'Callaghan
>>> > mailto:conne...@google.com>> wrote:
>>> >
>>> > Well done Yifan!!!
>>> >
>>> > Thank you for sharing Kenn!!!
>>> >
>>> > On Mon, Apr 22, 2019 at 9:00 AM Ahmet Altay
>>> > mailto:al...@google.com>> wrote:
>>> >
>>> > Congratulations, Yifan!
>>> >
>>> > On Mon, Apr 22, 2019 at 8:46 AM Tim Robertson
>>> > >> > > wrote:
>>> >
>>> > Congratulations Yifan!
>>> >
>>> > On Mon, Apr 22, 2019 at 5:39 PM Cyrus Maden
>>> > mailto:cma...@google.com>>
>>> wrote:
>>> >
>>> > Congratulations Yifan!!
>>> >
>>> > On Mon, Apr 22, 2019 at 11:26 AM Kenneth
>>> Knowles
>>> > mailto:k...@apache.org>>
>>> wrote:
>>> >
>>> > Hi all,
>>> >
>>> > Please join me and the rest of the Beam PMC
>>> > in welcoming a new committer: Yifan Zou.
>>> >
>>> > Yifan has been contributing to Beam since
>>> > early 2018. He has proposed 70+ pull
>>> > requests, adding dependency checking and
>>> > improving test infrastructure. But
>>> something
>>> > the numbers cannot show adequately is the
>>> > huge effort Yifan has put into working with
>>> > infra and keeping our Jenkins executors
>>> healthy.
>>> >
>>> > In consideration of Yifan's contributions,
>>> > the Beam PMC trusts Yifan with the
>>> > responsibilities of a Beam committer [1].
>>> >
>>> > Thank you, Yifan, for your contributions.
>>> >
>>> > Kenn
>>> >
>>> > [1]
>>> >
>>> https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
>>> > <
>>> https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
>>> >
>>> >
>>> >
>>> >
>>> > --
>>> > 
>>> > Ruoyun  Huang
>>> >
>>>
>>


Re: Go SDK status

2019-04-22 Thread Maximilian Michels
Nice summary, Robert! I really like the transparency on the state of the 
Go SDK and how it's being used.


It would be great to see the streaming mode improve because only then will
we have a full-blown SDK. It looks like we will need a few more resources
on the SDK to bring it up to par with Python.


I agree that cross-language transforms would be the most sensible path 
to solving the IO problem. The state of the Python SDK does not differ 
much in this regard because it also suffers from a lack of IO. I think 
you have seen the recent discussions about how to configure 
cross-language transforms. For the Python side we have Java's 
GenerateSequence and KafkaIO working with the portable Flink Runner.


Unfortunately, I'm not a Gopher yet but I'd be happy to exchange ideas 
or go into more detail about the cross-language capabilities.


Cheers,
Max

On 18.04.19 15:13, Thomas Weise wrote:

Hi Robert,

Thanks a bunch for providing this comprehensive update. This is exactly 
the kind of perspective I was looking for, even when overall it means 
that for potential users of the Go SDK it is even sooner than what I 
might have hoped for.


For more context, my interest was primarily on the streaming side. From 
the list of missing features you listed, State + Timers + Triggers would 
probably be highest priority. Unfortunately I won't be able to 
contribute to the Go SDK anytime soon, so this is mostly fyi in case 
anyone else does.


On improving the IOs, I think it would make a lot of sense to focus on 
the cross-language route. There has been some work lately to make 
existing Beam Java IOs available on the Flink runner (Max would be able 
to share more details on that).


Thanks!
Thomas


On Wed, Apr 17, 2019 at 9:56 PM Robert Burke wrote:


Oh dang. Thanks for mentioning that! Here's an open copy of the
versioning thoughts doc, though there shouldn't be any surprises
from the points I mentioned above.


https://docs.google.com/document/d/1ZjP30zNLWTu_WzkWbgY8F_ZXlA_OWAobAD9PuohJxPg/edit#heading=h.drpipq762xi7

On Wed, 17 Apr 2019 at 21:20, Nathan Fisher wrote:

Hi Robert,

Great summary on the current state of play. FYI the referenced G
doc doesn't appear to be accessible to people outside the org by default.

Great to hear the Go SDK is still getting love. I last looked at
it in September-October of last year.

Cheers,
Nathan

On Wed, 17 Apr 2019 at 20:27, Lukasz Cwik wrote:

Thanks for the in-depth summary.

On Mon, Apr 15, 2019 at 4:19 PM Robert Burke wrote:

Hi Thomas! I'm so glad you asked!

The status of the Go SDK is complicated, so this email
can't be brief. There are several dimensions to
consider: as a Go Open Source Project, User Libraries
and Experience, and Beam Features.

I'm going to be updating the roadmap later this month
when I have a spare moment.

*tl;dr;*
I would *love* help in improving the Go SDK, especially
around interactions with Java/Python/Flink. Java and I
do not have a good working relationship for operational
purposes, and the last time I used Python, I had to
re-image my machine. There's lots to do, but shouting
out tasks to the void is rarely as productive as it is
cathartic. If there's an offer to help, and a preference
for/experience with  something to work on, I'm willing
to find something useful to get started on for you.

(Note: The following are simply my opinion as someone
who works with the project weekly as a Go programmer,
and should not be treated as demands or gospel. I just
don't have anyone to talk about Go SDK issues with, and
my previous discussions have largely seemed to fall on
uninterested ears.)

*The SDK can be considered Alpha when all of the
following are true:*
* The SDK is tested by the Beam project on a ULR and on
Flink as well as Dataflow.
* The IOs have received some love to ensure they can
scale (either through SDF or reshuffles), and be
portable to different environments (eg. using the Go
Cloud Development Kit (CDK) libraries).
    * Cross-Language IO support would also be acceptable.
* The SDK is using Go Modules for dependency management,
marking it as version 0.Minor (where Minor should
probably track the mainline Beam minor version for now).

*We can move to 

Re: [ANNOUNCE] New committer announcement: Yifan Zou

2019-04-22 Thread Alan Myrvold
Congrats, Yifan!!!

On Mon, Apr 22, 2019 at 10:24 AM Robin Qiu  wrote:

> Congratulations Yifan!
>
> On Mon, Apr 22, 2019 at 10:17 AM Chamikara Jayalath 
> wrote:
>
>> Congrats Yifan!
>>
>> On Mon, Apr 22, 2019 at 10:02 AM Maximilian Michels 
>> wrote:
>>
>>> Congrats! Great work.
>>>
>>> -Max
>>>
>>> On 22.04.19 19:00, Rui Wang wrote:
>>> > Congratulations! Thanks for your contribution!!
>>> >
>>> > -Rui
>>> >
>>> > On Mon, Apr 22, 2019 at 9:57 AM Ruoyun Huang >> > > wrote:
>>> >
>>> > Congratulations, Yifan!
>>> >
>>> > On Mon, Apr 22, 2019 at 9:48 AM Boyuan Zhang >> > > wrote:
>>> >
>>> > Congratulations, Yifan~
>>> >
>>> > On Mon, Apr 22, 2019 at 9:29 AM Connell O'Callaghan
>>> > mailto:conne...@google.com>> wrote:
>>> >
>>> > Well done Yifan!!!
>>> >
>>> > Thank you for sharing Kenn!!!
>>> >
>>> > On Mon, Apr 22, 2019 at 9:00 AM Ahmet Altay
>>> > mailto:al...@google.com>> wrote:
>>> >
>>> > Congratulations, Yifan!
>>> >
>>> > On Mon, Apr 22, 2019 at 8:46 AM Tim Robertson
>>> > >> > > wrote:
>>> >
>>> > Congratulations Yifan!
>>> >
>>> > On Mon, Apr 22, 2019 at 5:39 PM Cyrus Maden
>>> > mailto:cma...@google.com>>
>>> wrote:
>>> >
>>> > Congratulations Yifan!!
>>> >
>>> > On Mon, Apr 22, 2019 at 11:26 AM Kenneth
>>> Knowles
>>> > mailto:k...@apache.org>>
>>> wrote:
>>> >
>>> > Hi all,
>>> >
>>> > Please join me and the rest of the Beam PMC
>>> > in welcoming a new committer: Yifan Zou.
>>> >
>>> > Yifan has been contributing to Beam since
>>> > early 2018. He has proposed 70+ pull
>>> > requests, adding dependency checking and
>>> > improving test infrastructure. But
>>> something
>>> > the numbers cannot show adequately is the
>>> > huge effort Yifan has put into working with
>>> > infra and keeping our Jenkins executors
>>> healthy.
>>> >
>>> > In consideration of Yifan's contributions,
>>> > the Beam PMC trusts Yifan with the
>>> > responsibilities of a Beam committer [1].
>>> >
>>> > Thank you, Yifan, for your contributions.
>>> >
>>> > Kenn
>>> >
>>> > [1]
>>> >
>>> https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
>>> > <
>>> https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
>>> >
>>> >
>>> >
>>> >
>>> > --
>>> > 
>>> > Ruoyun  Huang
>>> >
>>>
>>


Re: AvroUtils converting generic record to Beam Row causes class cast exception

2019-04-22 Thread Rui Wang
I see. I created this PR [1] to ask for feedback from a reviewer who knows
Avro in Beam better.

-Rui


[1]: https://github.com/apache/beam/pull/8376
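
The class cast problem described in the quoted messages below comes down to a
value that may arrive either as a raw Long or as an already converted
date-time object. A minimal sketch of a defensive normalization step, assuming
only the JDK: java.time.Instant is used here as a stand-in for
org.joda.time.DateTime so the example has no external dependencies, and the
class and method names are hypothetical, not part of AvroUtils.

```java
import java.time.Instant;

/** Sketch only: normalizes a timestamp-millis field value to epoch millis. */
final class TimestampMillisSketch {

  /**
   * Returns epoch milliseconds whether or not a logical-type conversion
   * has already turned the raw long into a date-time object.
   * java.time.Instant stands in for org.joda.time.DateTime (assumption).
   */
  static long toEpochMillis(Object value) {
    if (value instanceof Long) {
      // No conversion was registered: the value is still the raw long.
      return (Long) value;
    }
    if (value instanceof Instant) {
      // A conversion already ran: unwrap instead of casting to Long.
      return ((Instant) value).toEpochMilli();
    }
    String type = (value == null) ? "null" : value.getClass().getName();
    throw new IllegalArgumentException("Unsupported timestamp-millis value: " + type);
  }

  public static void main(String[] args) {
    Object raw = 1555900000000L;
    Object converted = Instant.ofEpochMilli(1555900000000L);
    // A blind (Long) cast on 'converted' would throw ClassCastException.
    System.out.println(toEpochMillis(raw) == toEpochMillis(converted));
  }
}
```

With Joda-Time on the classpath, the second branch would presumably check for
a Joda-Time type and call getMillis() instead; that variant is not shown here.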


On Sun, Apr 21, 2019 at 11:19 PM Vishwas Bm  wrote:

> Hi Rui,
>
> I checked the AvroUtils code. There is a static initializer block that
> registers Avro timestamp conversion functions for the logical type
> timestamp-millis.
>
> *// Code Snippet below*
> static {
>   // This works around a bug in the Avro library (AVRO-1891) around
>   // SpecificRecord's handling of DateTime types.
>   SpecificData.get().addLogicalTypeConversion(new TimeConversions.TimestampConversion());
>   GenericData.get().addLogicalTypeConversion(new TimeConversions.TimestampConversion());
> }
>
> Because of this, when deserializing a generic record from Kafka using
> KafkaAvroDeserializer, the long value produced at the producer end gets
> converted to joda-time during deserialization.
>
> Next, when we try to convert this genericRecord to a Row in the
> AvroUtils.toBeamRowStrict function, we again try to convert the value
> received to joda-time.
> But an exception is thrown because the value is cast to Long.
>
> *// Code Snippet Below:*
> else if (logicalType instanceof LogicalTypes.TimestampMillis) {
>   // Class cast exception is thrown here, as we are typecasting from
>   // JodaTime to Long.
>   return convertDateTimeStrict((Long) value, fieldType);
> }
>
> private static Object convertDateTimeStrict(Long value, Schema.FieldType fieldType) {
>   checkTypeName(fieldType.getTypeName(), TypeName.DATETIME, "dateTime");
>   return new Instant(value); // Creates a JodaTime instance here.
> }
>
>
> *Thanks & Regards,*
>
> *Vishwas *
>
>
>
> On Tue, Apr 16, 2019 at 9:18 AM Rui Wang  wrote:
>
>> I didn't find code in `AvroUtils.toBeamRowStrict` that converts long to
>> Joda time. `AvroUtils.toBeamRowStrict` retrieves objects from
>> GenericRecord, and tries to cast objects based on their types (and
>> cast(object) to long for "timestamp-millis"). see [1].
>>
>> So in order to use `AvroUtils.toBeamRowStrict`, the generated
>> GenericRecord should have long for "timestamp-millis".
>>
>> The schema you pasted looks right. Not sure why generated class is Joda
>> time (is it controlled by some flags?). But at least you could write a
>> small function to do schema conversion for your need.
>>
>> [1]
>> https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/utils/AvroUtils.java#L672
>>
>>
>> Rui
>>
>>
>> On Mon, Apr 15, 2019 at 7:11 PM Vishwas Bm  wrote:
>>
>>> Hi Rui,
>>>
>>> I agree that by converting it to long, there will be no error.
>>> But KafkaIO is giving a GenericRecord with an attribute of type
>>> JodaTime. Now I convert it to long. Then AvroUtils.toBeamRowStrict
>>> converts it to JodaTime again.
>>>
>>> I used the avro tools 1.8.2 jar, for the below schema and I see that the
>>> generated class has a JodaTime attribute.
>>>
>>> {
>>>   "name": "timeOfRelease",
>>>   "type": {
>>>     "type": "long",
>>>     "logicalType": "timestamp-millis",
>>>     "connect.version": 1,
>>>     "connect.name": "org.apache.kafka.connect.data.Timestamp"
>>>   }
>>> }
>>>
>>> *Attribute type in generated class:*
>>> private org.joda.time.DateTime timeOfRelease;
>>>
>>>
>>> So not sure why this type casting is required.
>>>
>>>
>>> *Thanks & Regards,*
>>>
>>> *Vishwas *
>>>
>>>
>>> On Tue, Apr 16, 2019 at 12:56 AM Rui Wang  wrote:
>>>
 From reading the code, it seems that the logical type
 "timestamp-millis" expects millis as Long values.

 So if you can convert joda-time to millis before calling
 "AvroUtils.toBeamRowStrict(genericRecord, this.beamSchema)", your exception
 will be gone.

 -Rui


 On Mon, Apr 15, 2019 at 10:28 AM Lukasz Cwik  wrote:

> +dev 
>
> On Sun, Apr 14, 2019 at 10:29 PM Vishwas Bm 
> wrote:
>
>> Hi,
>>
>> Below is my pipeline:
>>
>> KafkaSource (KafkaIO.read) --> Pardo ---> BeamSql
>> ---> KafkaSink(KafkaIO.write)
>>
>>
>> The avro schema of the topic has a field of logical type
>> timestamp-millis.  KafkaIO.read transform is creating a
>> KafkaRecord, where this field is being converted to
>> joda-time.
>>
>> In my Pardo transform, I am trying to use the AvroUtils class methods
>> to convert the generic record to Beam Row and getting below class cast
>> exception for the joda-time attribute.
>>
>>  AvroUtils.toBeamRowStrict(genericRecord, this.beamSchema)
>>
>> Caused by: java.lang.ClassCastException: org.joda.time.DateTime
>> cannot be cast to java.lang.Long
>> at
>> 

Re: [ANNOUNCE] New committer announcement: Yifan Zou

2019-04-22 Thread Robin Qiu
Congratulations Yifan!

On Mon, Apr 22, 2019 at 10:17 AM Chamikara Jayalath 
wrote:

> Congrats Yifan!
>
> On Mon, Apr 22, 2019 at 10:02 AM Maximilian Michels 
> wrote:
>
>> Congrats! Great work.
>>
>> -Max
>>
>> On 22.04.19 19:00, Rui Wang wrote:
>> > Congratulations! Thanks for your contribution!!
>> >
>> > -Rui
>> >
>> > On Mon, Apr 22, 2019 at 9:57 AM Ruoyun Huang > > > wrote:
>> >
>> > Congratulations, Yifan!
>> >
>> > On Mon, Apr 22, 2019 at 9:48 AM Boyuan Zhang > > > wrote:
>> >
>> > Congratulations, Yifan~
>> >
>> > On Mon, Apr 22, 2019 at 9:29 AM Connell O'Callaghan
>> > mailto:conne...@google.com>> wrote:
>> >
>> > Well done Yifan!!!
>> >
>> > Thank you for sharing Kenn!!!
>> >
>> > On Mon, Apr 22, 2019 at 9:00 AM Ahmet Altay
>> > mailto:al...@google.com>> wrote:
>> >
>> > Congratulations, Yifan!
>> >
>> > On Mon, Apr 22, 2019 at 8:46 AM Tim Robertson
>> > > > > wrote:
>> >
>> > Congratulations Yifan!
>> >
>> > On Mon, Apr 22, 2019 at 5:39 PM Cyrus Maden
>> > mailto:cma...@google.com>>
>> wrote:
>> >
>> > Congratulations Yifan!!
>> >
>> > On Mon, Apr 22, 2019 at 11:26 AM Kenneth Knowles
>> > mailto:k...@apache.org>>
>> wrote:
>> >
>> > Hi all,
>> >
>> > Please join me and the rest of the Beam PMC
>> > in welcoming a new committer: Yifan Zou.
>> >
>> > Yifan has been contributing to Beam since
>> > early 2018. He has proposed 70+ pull
>> > requests, adding dependency checking and
>> > improving test infrastructure. But something
>> > the numbers cannot show adequately is the
>> > huge effort Yifan has put into working with
>> > infra and keeping our Jenkins executors
>> healthy.
>> >
>> > In consideration of Yifan's contributions,
>> > the Beam PMC trusts Yifan with the
>> > responsibilities of a Beam committer [1].
>> >
>> > Thank you, Yifan, for your contributions.
>> >
>> > Kenn
>> >
>> > [1]
>> >
>> https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
>> > <
>> https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
>> >
>> >
>> >
>> >
>> > --
>> > 
>> > Ruoyun  Huang
>> >
>>
>


Re: Hazelcast Jet Runner

2019-04-22 Thread Maximilian Michels

Hi Jozsef,

If the Runner supports the complete set of ValidatesRunner tests and the
Nexmark suite, it is already in a very good state. Like Kenn already
suggested, we can definitely add it to the capability matrix then.


Thanks,
Max

On 19.04.19 22:52, Kenneth Knowles wrote:
The ValidatesRunner tests are the best source we have for knowing the 
capabilities of a runner. Are there instructions for running the tests?


Assuming we can check it out, then just open a PR to the website with 
the current capabilities and caveats. Since it is a big deal and could 
use lots of eyes, I would share the PR link on this thread.


Kenn

On Thu, Apr 18, 2019 at 11:53 AM Jozsef Bartok wrote:


Hi. We at Hazelcast Jet have been working for a while now to
implement a Java Beam Runner (non-portable) based on Hazelcast Jet
(https://jet.hazelcast.org/). The process is still ongoing
(https://github.com/hazelcast/hazelcast-jet-beam-runner), but we are
aiming for a fully functional, reliable Runner which can proudly
join the Capability Matrix. For that purpose I would like to ask
what’s your process of validating runners? We are already running
the @ValidatesRunner tests and the Nexmark test suite, but beyond
that what other steps do we need to take to get our Runner to the
level it needs to be at?



Re: [VOTE] Release 2.12.0, release candidate #4

2019-04-22 Thread Ahmet Altay
I verified
- signatures and hashes.
 - python streaming quickstart guide
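
For anyone repeating the hash part of this check by hand, here is a minimal
sketch of comparing an artifact's digest against a published checksum. It
assumes the release publishes SHA-512 checksums as hex strings; the class and
method names are hypothetical, and signature (GPG) verification is not
covered.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Sketch only: compares data against a published hex checksum. */
final class ChecksumSketch {

  /** Returns the lowercase hex SHA-512 digest of the given bytes. */
  static String sha512Hex(byte[] data) {
    try {
      MessageDigest digest = MessageDigest.getInstance("SHA-512");
      byte[] hash = digest.digest(data);
      StringBuilder hex = new StringBuilder(hash.length * 2);
      for (byte b : hash) {
        hex.append(String.format("%02x", b));
      }
      return hex.toString();
    } catch (NoSuchAlgorithmException e) {
      // SHA-512 is required on every Java platform, so this is unexpected.
      throw new IllegalStateException(e);
    }
  }

  /** True if the data hashes to the published value (case-insensitive). */
  static boolean matches(byte[] data, String publishedHex) {
    return sha512Hex(data).equalsIgnoreCase(publishedHex.trim());
  }

  public static void main(String[] args) {
    // In practice the bytes would be read from the downloaded artifact.
    byte[] artifact = "example artifact bytes".getBytes(StandardCharsets.UTF_8);
    String published = sha512Hex(artifact).toUpperCase() + "\n";
    System.out.println(matches(artifact, published));
  }
}
```

The comparison trims whitespace and ignores case because checksum files often
differ in those details; that is a design choice of this sketch, not a
statement about how the Beam checksum files are formatted.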

I would like to verify the wheel files before voting. Please let us know
when they are ready. Also, if you need help with building wheel files I can
help/build.

Ahmet

On Mon, Apr 22, 2019 at 3:33 AM Maximilian Michels  wrote:

> +1 (binding)
>
> Found a minor bug while testing, but not a blocker:
> https://jira.apache.org/jira/browse/BEAM-7128
>
> Thanks,
> Max
>
> On 20.04.19 23:02, Pablo Estrada wrote:
> > +1
> > Ran SQL postcommit, and Dataflow Portability Java validatesrunner tests.
> >
> > -P.
> >
> > On Wed, Apr 17, 2019 at 1:38 AM Jean-Baptiste Onofré wrote:
> >
> > +1 (binding)
> >
> > Quickly checked with beam-samples.
> >
> > Regards
> > JB
> >
> > On 16/04/2019 00:50, Andrew Pilloud wrote:
> >  > Hi everyone,
> >  >
> >  > Please review and vote on the release candidate #4 for the version
> >  > 2.12.0, as follows:
> >  >
> >  > [ ] +1, Approve the release
> >  > [ ] -1, Do not approve the release (please provide specific
> comments)
> >  >
> >  > The complete staging area is available for your review, which
> > includes:
> >  > * JIRA release notes [1],
> >  > * the official Apache source release to be deployed to
> > dist.apache.org 
> >  >  [2], which is signed with the key with
> >  > fingerprint 9E7CEC0661EFD610B632C610AE8FE17F9F8AE3D4 [3],
> >  > * all artifacts to be deployed to the Maven Central Repository
> [4],
> >  > * source code tag "v2.12.0-RC4" [5],
> >  > * website pull request listing the release [6], publishing the API
> >  > reference manual [7], and the blog post [8].
> >  > * Java artifacts were built with Gradle/5.2.1 and OpenJDK/Oracle
> JDK
> >  > 1.8.0_181.
> >  > * Python artifacts are deployed along with the source release to
> the
> >  > dist.apache.org  
> > [2].
> >  > * Validation sheet with a tab for 2.12.0 release to help with
> > validation
> >  > [9].
> >  >
> >  > The vote will be open for at least 72 hours. It is adopted by
> > majority
> >  > approval, with at least 3 PMC affirmative votes.
> >  >
> >  > Thanks,
> >  > Andrew
> >  >
> >  > [1]
> >
> https://jira.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&version=12344944
> >  > [2] https://dist.apache.org/repos/dist/dev/beam/2.12.0/
> >  > [3] https://dist.apache.org/repos/dist/release/beam/KEYS
> >  > [4]
> >
> https://repository.apache.org/content/repositories/orgapachebeam-1068/
> >  > [5] https://github.com/apache/beam/tree/v2.12.0-RC4
> >  > [6] https://github.com/apache/beam/pull/8215
> >  > [7] https://github.com/apache/beam-site/pull/588
> >  > [8] https://github.com/apache/beam/pull/8314
> >  > [9]
> >
> https://docs.google.com/spreadsheets/d/1qk-N5vjXvbcEk68GjbkSZTR8AGqyNUM-oLFo_ZXBpJw/edit#gid=1007316984
> >
> > --
> > Jean-Baptiste Onofré
> > jbono...@apache.org 
> > http://blog.nanthrax.net
> > Talend - http://www.talend.com
> >
>


Re: Artifact staging in cross-language pipelines

2019-04-22 Thread Maximilian Michels
Thanks for the summary Cham. All makes sense. I agree that we want to 
keep the option to manually specify artifacts.



There are a few unanswered questions though.
(1) In what form will a transform author specify dependencies ? For example, 
URL to a Maven repo, URL to a local file, blob ?


Going forward, we probably want to support multiple ways. For now, we 
could stick with a URL-based approach with support for different file 
systems. In the future a list of packages to retrieve from Maven/PyPi 
would be useful.



(2) How will dependencies be included in the expansion response proto ? String 
(URL), bytes (blob) ?


I'd go for a list of Protobuf strings first but the format would have to 
evolve for other dependency types.


(3) How will we manage/share transitive dependencies required at runtime ? 


I'd say transitive dependencies have to be included in the list. In case 
of fat jars, they are reduced to a single jar.



(4) How will dependencies be staged for various runner/SDK combinations ? (for 
example, portable runner/Flink, Dataflow runner)


Staging should be no different than it is now, i.e. go through Beam's 
artifact staging service. As long as the protocol is stable, there could 
also be different implementations.
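
The answers above, combined with the pipeline-option override Cham describes
in the quoted message below, can be sketched as a simple merge. This is a
sketch under stated assumptions, not the Beam implementation: dependencies
are modelled as a map from an artifact name to a URL string, and the class
name, method name, artifact names and URLs are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch only: merges transform-declared artifacts with user overrides. */
final class ArtifactResolutionSketch {

  /**
   * Starts from the dependencies returned in the expansion response and
   * lets entries supplied by the pipeline author replace them by name.
   */
  static Map<String, String> resolve(
      Map<String, String> fromExpansionResponse,
      Map<String, String> fromPipelineOptions) {
    Map<String, String> resolved = new LinkedHashMap<>(fromExpansionResponse);
    // putAll overwrites the URL for names present in both maps.
    resolved.putAll(fromPipelineOptions);
    return resolved;
  }

  public static void main(String[] args) {
    Map<String, String> declared = new LinkedHashMap<>();
    declared.put("kafka-io", "https://example.org/kafka-io-1.0.jar");
    declared.put("kafka-clients", "https://example.org/kafka-clients-1.0.jar");

    Map<String, String> overrides = new LinkedHashMap<>();
    // For example, a patched version chosen by the pipeline author.
    overrides.put("kafka-clients", "https://example.org/kafka-clients-1.0.1.jar");

    // The resulting list is what the pipeline SDK would hand to staging.
    resolve(declared, overrides).forEach((name, url) ->
        System.out.println(name + " -> " + url));
  }
}
```

Keeping the override on the pipeline side means a version can be changed
without restarting or updating the expansion service, which is the point made
in the quoted summary.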


-Max

On 20.04.19 03:08, Chamikara Jayalath wrote:

OK, sounds like this is a good path forward then.

* When starting up the expansion service, the user (who starts up the
service) provides the dependencies necessary to expand transforms. We will
later add support for adding new transforms to an already running
expansion service.
* As a part of transform configuration, the transform author has the option
of providing a list of dependencies that will be needed to run the
transform.
* These dependencies will be sent back to the pipeline SDK as a part of
the expansion response, and the pipeline SDK will stage these resources.
* The pipeline author has the option of specifying the dependencies using a
pipeline option (for example, https://github.com/apache/beam/pull/8340).


I think the last option is important to (1) make existing transforms easily
available for cross-language usage without additional configuration and (2)
allow pipeline authors to override dependency versions specified in
the transform configuration (for example, to apply security patches)
without updating the expansion service.


There are a few unanswered questions though.
(1) In what form will a transform author specify dependencies ? For 
example, URL to a Maven repo, URL to a local file, blob ?
(2) How will dependencies be included in the expansion response proto ? 
String (URL), bytes (blob) ?

(3) How will we manage/share transitive dependencies required at runtime ?
(4) How will dependencies be staged for various runner/SDK combinations 
? (for example, portable runner/Flink, Dataflow runner)


Thanks,
Cham

On Fri, Apr 19, 2019 at 4:49 AM Maximilian Michels wrote:


Thank you for your replies.

I did not suggest that the Expansion Service does the staging, but it
would return the required resources (e.g. jars) for the external
transform's runtime environment. The client then has to take care of
staging the resources.

The Expansion Service itself also needs resources to do the expansion. I
assumed those to be provided when starting the expansion service. I
consider it less important but we could also provide a way to add new
transforms to the Expansion Service after startup.

Good point on Docker vs externally provided environments. For the PR [1]
it will suffice then to add Kafka to the container dependencies. The
"--jar_package" pipeline option is ok for now but I'd like to see work
towards staging resources for external transforms via information
returned by the Expansion Service. That avoids users having to take care
of including the correct jars in their pipeline options.

These issues are related and we could discuss them in separate threads:

* Auto-discovery of Expansion Service and its external transforms
* Credentials required during expansion / runtime

Thanks,
Max

[1] https://github.com/apache/beam/pull/8322


On 19.04.19 07:35, Thomas Weise wrote:
 > Good discussion :)
 >
 > Initially the expansion service was considered a user responsibility,
 > but I think that isn't necessarily the case. I can also see the
 > expansion service provided as part of the infrastructure and the user
 > not wanting to deal with it at all. For example, users may want to write
 > Python transforms and use external IOs, without being concerned how
 > these IOs are provided. Under such scenario it would be good if:
 >
 > * Expansion service(s) can be auto-discovered via the job service endpoint
 > * Available external transforms can be discovered via the expansion
 > service(s)
 > * 

Re: [ANNOUNCE] New committer announcement: Yifan Zou

2019-04-22 Thread Chamikara Jayalath
Congrats Yifan!

On Mon, Apr 22, 2019 at 10:02 AM Maximilian Michels  wrote:

> Congrats! Great work.
>
> -Max
>
> On 22.04.19 19:00, Rui Wang wrote:
> > Congratulations! Thanks for your contribution!!
> >
> > -Rui
> >
> > On Mon, Apr 22, 2019 at 9:57 AM Ruoyun Huang wrote:
> >
> > Congratulations, Yifan!
> >
> > On Mon, Apr 22, 2019 at 9:48 AM Boyuan Zhang wrote:
> >
> > Congratulations, Yifan~
> >
> > On Mon, Apr 22, 2019 at 9:29 AM Connell O'Callaghan wrote:
> >
> > Well done Yifan!!!
> >
> > Thank you for sharing Kenn!!!
> >
> > On Mon, Apr 22, 2019 at 9:00 AM Ahmet Altay wrote:
> >
> > Congratulations, Yifan!
> >
> > On Mon, Apr 22, 2019 at 8:46 AM Tim Robertson wrote:
> >
> > Congratulations Yifan!
> >
> > On Mon, Apr 22, 2019 at 5:39 PM Cyrus Maden wrote:
> >
> > Congratulations Yifan!!
> >
> > On Mon, Apr 22, 2019 at 11:26 AM Kenneth Knowles wrote:
> >
> > Hi all,
> >
> > Please join me and the rest of the Beam PMC
> > in welcoming a new committer: Yifan Zou.
> >
> > Yifan has been contributing to Beam since
> > early 2018. He has proposed 70+ pull
> > requests, adding dependency checking and
> > improving test infrastructure. But something
> > the numbers cannot show adequately is the
> > huge effort Yifan has put into working with
> > infra and keeping our Jenkins executors healthy.
> >
> > In consideration of Yifan's contributions,
> > the Beam PMC trusts Yifan with the
> > responsibilities of a Beam committer [1].
> >
> > Thank you, Yifan, for your contributions.
> >
> > Kenn
> >
> > [1] https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
> >
> >
> >
> > --
> > 
> > Ruoyun  Huang
> >
>


Re: [ANNOUNCE] New committer announcement: Yifan Zou

2019-04-22 Thread Maximilian Michels

Congrats! Great work.

-Max

On 22.04.19 19:00, Rui Wang wrote:

Congratulations! Thanks for your contribution!!

-Rui

On Mon, Apr 22, 2019 at 9:57 AM Ruoyun Huang wrote:

Congratulations, Yifan!

On Mon, Apr 22, 2019 at 9:48 AM Boyuan Zhang wrote:

Congratulations, Yifan~

On Mon, Apr 22, 2019 at 9:29 AM Connell O'Callaghan wrote:

Well done Yifan!!!

Thank you for sharing Kenn!!!

On Mon, Apr 22, 2019 at 9:00 AM Ahmet Altay wrote:

Congratulations, Yifan!

On Mon, Apr 22, 2019 at 8:46 AM Tim Robertson wrote:

Congratulations Yifan!

On Mon, Apr 22, 2019 at 5:39 PM Cyrus Maden wrote:

Congratulations Yifan!!

On Mon, Apr 22, 2019 at 11:26 AM Kenneth Knowles wrote:

Hi all,

Please join me and the rest of the Beam PMC
in welcoming a new committer: Yifan Zou.

Yifan has been contributing to Beam since
early 2018. He has proposed 70+ pull
requests, adding dependency checking and
improving test infrastructure. But something
the numbers cannot show adequately is the
huge effort Yifan has put into working with
infra and keeping our Jenkins executors healthy.

In consideration of Yifan's contributions,
the Beam PMC trusts Yifan with the
responsibilities of a Beam committer [1].

Thank you, Yifan, for your contributions.

Kenn

[1]

https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer





-- 


Ruoyun  Huang



Re: [ANNOUNCE] New committer announcement: Yifan Zou

2019-04-22 Thread Ruoyun Huang
Congratulations, Yifan!

On Mon, Apr 22, 2019 at 9:48 AM Boyuan Zhang  wrote:

> Congratulations, Yifan~
>
> On Mon, Apr 22, 2019 at 9:29 AM Connell O'Callaghan 
> wrote:
>
>> Well done Yifan!!!
>>
>> Thank you for sharing Kenn!!!
>>
>> On Mon, Apr 22, 2019 at 9:00 AM Ahmet Altay  wrote:
>>
>>> Congratulations, Yifan!
>>>
>>> On Mon, Apr 22, 2019 at 8:46 AM Tim Robertson 
>>> wrote:
>>>
 Congratulations Yifan!

 On Mon, Apr 22, 2019 at 5:39 PM Cyrus Maden  wrote:

> Congratulations Yifan!!
>
> On Mon, Apr 22, 2019 at 11:26 AM Kenneth Knowles 
> wrote:
>
>> Hi all,
>>
>> Please join me and the rest of the Beam PMC in welcoming a new
>> committer: Yifan Zou.
>>
>> Yifan has been contributing to Beam since early 2018. He has
>> proposed 70+ pull requests, adding dependency checking and improving test
>> infrastructure. But something the numbers cannot show adequately is the
>> huge effort Yifan has put into working with infra and keeping our Jenkins
>> executors healthy.
>>
>> In consideration of Yifan's contributions, the Beam PMC trusts Yifan
>> with the responsibilities of a Beam committer [1].
>>
>> Thank you, Yifan, for your contributions.
>>
>> Kenn
>>
>> [1] https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
>>
>

-- 

Ruoyun  Huang


Re: [ANNOUNCE] New committer announcement: Yifan Zou

2019-04-22 Thread Boyuan Zhang
Congratulations, Yifan~

On Mon, Apr 22, 2019 at 9:29 AM Connell O'Callaghan 
wrote:

> Well done Yifan!!!
>
> Thank you for sharing Kenn!!!
>
> On Mon, Apr 22, 2019 at 9:00 AM Ahmet Altay  wrote:
>
>> Congratulations, Yifan!
>>
>> On Mon, Apr 22, 2019 at 8:46 AM Tim Robertson 
>> wrote:
>>
>>> Congratulations Yifan!
>>>
>>> On Mon, Apr 22, 2019 at 5:39 PM Cyrus Maden  wrote:
>>>
 Congratulations Yifan!!

 On Mon, Apr 22, 2019 at 11:26 AM Kenneth Knowles 
 wrote:

> Hi all,
>
> Please join me and the rest of the Beam PMC in welcoming a new
> committer: Yifan Zou.
>
> Yifan has been contributing to Beam since early 2018. He has proposed
> 70+ pull requests, adding dependency checking and improving test
> infrastructure. But something the numbers cannot show adequately is the
> huge effort Yifan has put into working with infra and keeping our Jenkins
> executors healthy.
>
> In consideration of Yifan's contributions, the Beam PMC trusts Yifan
> with the responsibilities of a Beam committer [1].
>
> Thank you, Yifan, for your contributions.
>
> Kenn
>
> [1] https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
>



Re: [ANNOUNCE] New committer announcement: Yifan Zou

2019-04-22 Thread Connell O'Callaghan
Well done Yifan!!!

Thank you for sharing Kenn!!!

On Mon, Apr 22, 2019 at 9:00 AM Ahmet Altay  wrote:

> Congratulations, Yifan!
>
> On Mon, Apr 22, 2019 at 8:46 AM Tim Robertson 
> wrote:
>
>> Congratulations Yifan!
>>
>> On Mon, Apr 22, 2019 at 5:39 PM Cyrus Maden  wrote:
>>
>>> Congratulations Yifan!!
>>>
>>> On Mon, Apr 22, 2019 at 11:26 AM Kenneth Knowles 
>>> wrote:
>>>
 Hi all,

 Please join me and the rest of the Beam PMC in welcoming a new
 committer: Yifan Zou.

 Yifan has been contributing to Beam since early 2018. He has proposed
 70+ pull requests, adding dependency checking and improving test
 infrastructure. But something the numbers cannot show adequately is the
 huge effort Yifan has put into working with infra and keeping our Jenkins
 executors healthy.

 In consideration of Yifan's contributions, the Beam PMC trusts Yifan
 with the responsibilities of a Beam committer [1].

 Thank you, Yifan, for your contributions.

 Kenn

 [1] https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer

>>>


Re: [ANNOUNCE] New committer announcement: Yifan Zou

2019-04-22 Thread Ahmet Altay
Congratulations, Yifan!

On Mon, Apr 22, 2019 at 8:46 AM Tim Robertson 
wrote:

> Congratulations Yifan!
>
> On Mon, Apr 22, 2019 at 5:39 PM Cyrus Maden  wrote:
>
>> Congratulations Yifan!!
>>
>> On Mon, Apr 22, 2019 at 11:26 AM Kenneth Knowles  wrote:
>>
>>> Hi all,
>>>
>>> Please join me and the rest of the Beam PMC in welcoming a new committer:
>>> Yifan Zou.
>>>
>>> Yifan has been contributing to Beam since early 2018. He has proposed
>>> 70+ pull requests, adding dependency checking and improving test
>>> infrastructure. But something the numbers cannot show adequately is the
>>> huge effort Yifan has put into working with infra and keeping our Jenkins
>>> executors healthy.
>>>
>>> In consideration of Yifan's contributions, the Beam PMC trusts Yifan with
>>> the responsibilities of a Beam committer [1].
>>>
>>> Thank you, Yifan, for your contributions.
>>>
>>> Kenn
>>>
>>> [1] https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
>>>
>>


Re: [ANNOUNCE] New committer announcement: Yifan Zou

2019-04-22 Thread Tim Robertson
Congratulations Yifan!

On Mon, Apr 22, 2019 at 5:39 PM Cyrus Maden  wrote:

> Congratulations Yifan!!
>
> On Mon, Apr 22, 2019 at 11:26 AM Kenneth Knowles  wrote:
>
>> Hi all,
>>
>> Please join me and the rest of the Beam PMC in welcoming a new committer:
>> Yifan Zou.
>>
>> Yifan has been contributing to Beam since early 2018. He has proposed
>> 70+ pull requests, adding dependency checking and improving test
>> infrastructure. But something the numbers cannot show adequately is the
>> huge effort Yifan has put into working with infra and keeping our Jenkins
>> executors healthy.
>>
>> In consideration of Yifan's contributions, the Beam PMC trusts Yifan with
>> the responsibilities of a Beam committer [1].
>>
>> Thank you, Yifan, for your contributions.
>>
>> Kenn
>>
>> [1] https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
>>
>


Re: [ANNOUNCE] New committer announcement: Yifan Zou

2019-04-22 Thread Cyrus Maden
Congratulations Yifan!!

On Mon, Apr 22, 2019 at 11:26 AM Kenneth Knowles  wrote:

> Hi all,
>
> Please join me and the rest of the Beam PMC in welcoming a new committer:
> Yifan Zou.
>
> Yifan has been contributing to Beam since early 2018. He has proposed 70+
> pull requests, adding dependency checking and improving test
> infrastructure. But something the numbers cannot show adequately is the
> huge effort Yifan has put into working with infra and keeping our Jenkins
> executors healthy.
>
> In consideration of Yifan's contributions, the Beam PMC trusts Yifan with
> the responsibilities of a Beam committer [1].
>
> Thank you, Yifan, for your contributions.
>
> Kenn
>
> [1] https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
>


Re: [ANNOUNCE] New committer announcement: Yifan Zou

2019-04-22 Thread Aizhamal Nurmamat kyzy
Congratulations, Yifan! Thank you for all your work on Beam 

On Mon, Apr 22, 2019 at 08:26 Kenneth Knowles  wrote:

> Hi all,
>
> Please join me and the rest of the Beam PMC in welcoming a new committer:
> Yifan Zou.
>
> Yifan has been contributing to Beam since early 2018. He has proposed 70+
> pull requests, adding dependency checking and improving test
> infrastructure. But something the numbers cannot show adequately is the
> huge effort Yifan has put into working with infra and keeping our Jenkins
> executors healthy.
>
> In consideration of Yifan's contributions, the Beam PMC trusts Yifan with
> the responsibilities of a Beam committer [1].
>
> Thank you, Yifan, for your contributions.
>
> Kenn
>
> [1] https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
>


[ANNOUNCE] New committer announcement: Yifan Zou

2019-04-22 Thread Kenneth Knowles
Hi all,

Please join me and the rest of the Beam PMC in welcoming a new committer:
Yifan Zou.

Yifan has been contributing to Beam since early 2018. He has proposed 70+
pull requests, adding dependency checking and improving test
infrastructure. But something the numbers cannot show adequately is the
huge effort Yifan has put into working with infra and keeping our Jenkins
executors healthy.

In consideration of Yifan's contributions, the Beam PMC trusts Yifan with
the responsibilities of a Beam committer [1].

Thank you, Yifan, for your contributions.

Kenn

[1] https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer


Re: [DISCUSS] Turn `WindowedValue` into `T` in the FnDataService and BeamFnDataClient interface definition

2019-04-22 Thread Kenneth Knowles
Makes sense to me. I don't see any problem.

Kenn

On Mon, Apr 22, 2019 at 12:08 AM jincheng sun 
wrote:

> Hi Kenn,
>
> Thanks for your reply, and for explaining the design of WindowedValue
> clearly!
>
> At present, the definitions of `FnDataService` and `BeamFnDataClient` in
> the Data Plane are very clear and universal, e.g. send(...)/receive(...).
> If they are only used within the Beam project, they are already very good,
> because `WindowedValue` is a very basic data structure in Beam: both the
> Runner and the SDK harness define the WindowedValue data structure.
>
> The reason I want to change the interface parameter from
> `WindowedValue` to T is that I want to make the `Data Plane`
> interface into a class library that can be used by other projects (such as
> Apache Flink), so that other projects can have their own `FnDataService`
> implementation. However, the definition of `WindowedValue` does not apply
> to all projects. For example, Apache Flink has a similar definition of its
> own: Flink streaming has StreamRecord. If we change `WindowedValue` to T,
> then other projects' implementations do not need to wrap values in
> WindowedValue, and the interface becomes more concise. Furthermore, in
> some cases we only need a bare T, such as for the Apache Flink DataSet
> operator.
>
> So, I agree with your understanding: I don't expect `WindowedValueXXX`
> in the FnDataService interface; I hope to just use a `T`.
>
> Do you see any problem if we change the interface parameter from
> `WindowedValue` to T?
>
> Thanks,
> Jincheng
>
> On Sat, Apr 20, 2019 at 2:38 AM Kenneth Knowles wrote:
>
>> WindowedValue has always been an interface, not a concrete
>> representation:
>> https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/util/WindowedValue.java
>> .
>> It is an abstract class because we started in Java 7 where you could not
>> have default methods, and just due to legacy style concerns. It is not just
>> discussed, but implemented, that there are WindowedValue implementations
>> with fewer allocations.
>> At the coder level, it was also always intended to have multiple
>> encodings. We already do have separate encodings based on whether there is
>> 1 window or multiple windows. The coder for a particular kind of
>> WindowedValue should decide this. Before the Fn API none of this had to be
>> standardized, because the runner could just choose whatever it wants. Now
>> we have to standardize any encodings that runners and harnesses both need
>> to know. There should be many, and adding more should be just a matter of
>> standardization, no new design.
>>
>> None of this should be user-facing or in the runner API / pipeline graph
>> - that is critical to making it flexible on the backend between the runner
>> & SDK harness.
>>
>> If I understand it, from our offline discussion, you are interested in
>> the case where you issue a ProcessBundleRequest to the SDK harness and none
>> of the primitives in the subgraph will ever observe the metadata. So you
>> want to not even have a tiny
>> "WindowedValueWithNoMetadata". Is that accurate?
>>
>> Kenn
>>
>> On Fri, Apr 19, 2019 at 10:17 AM jincheng sun 
>> wrote:
>>
>>> Thank you! And have a nice weekend!
>>>
>>>
>>> On Sat, Apr 20, 2019 at 1:14 AM Lukasz Cwik wrote:
>>>
 I have added you as a contributor.

 On Fri, Apr 19, 2019 at 9:56 AM jincheng sun 
 wrote:

> Hi Lukasz,
>
> Thanks for your affirmation and for providing more context. :)
>
> Would you please give me the contributor permission?  My JIRA ID is
> sunjincheng121.
>
> I would like to create/assign tickets for this work.
>
> Thanks,
> Jincheng
>
> On Sat, Apr 20, 2019 at 12:26 AM Lukasz Cwik wrote:
>
>> Since I don't think this is a contentious change.
>>
>> On Fri, Apr 19, 2019 at 9:25 AM Lukasz Cwik  wrote:
>>
>>> Yes, using T makes sense.
>>>
>>> The WindowedValue was meant to be a context object in the SDK
>>> harness that propagates various information about the current element.
>>> We have discussed in the past about:
>>> * making optimizations which would pass around less of the context
>>> information if we know that the DoFns don't need it (for example, all
>>> the values share the same window).
>>> * versioning the encoding separately from the WindowedValue context
>>> object (see recent discussion about element timestamp precision [1])
>>> * the runner may want its own representation of a context object
>>> that makes sense for it which isn't a WindowedValue necessarily.
>>>
>>> Feel free to cut a JIRA about this and start working on a change
>>> towards this.
>>>
>>> 1:
>>> 

Beam Dependency Check Report (2019-04-22)

2019-04-22 Thread Apache Jenkins Server

High Priority Dependency Updates Of Beam Python SDK:

Dependency Name | Current Version | Latest Version | Release Date Of the Current Used Version | Release Date Of The Latest Release | JIRA Issue
future | 0.16.0 | 0.17.1 | 2016-10-27 | 2018-12-10 | BEAM-5968
google-cloud-bigquery | 1.6.1 | 1.11.2 | 2019-01-21 | 2019-04-08 | BEAM-5537
oauth2client | 3.0.0 | 4.1.3 | 2018-12-10 | 2018-12-10 | BEAM-6089

High Priority Dependency Updates Of Beam Java SDK:

Dependency Name | Current Version | Latest Version | Release Date Of the Current Used Version | Release Date Of The Latest Release | JIRA Issue
com.rabbitmq:amqp-client | 4.9.3 | 5.7.0 | 2019-01-18 | 2019-04-05 | BEAM-5895
com.google.auto.service:auto-service | 1.0-rc2 | 1.0-rc5 | 2014-10-25 | 2019-03-25 | BEAM-5541
com.github.ben-manes.versions:com.github.ben-manes.versions.gradle.plugin | 0.17.0 | 0.21.0 | 2019-02-11 | 2019-03-04 | BEAM-6645
org.conscrypt:conscrypt-openjdk | 1.1.3 | 2.1.0 | 2018-06-04 | 2019-04-03 | BEAM-5748
org.elasticsearch:elasticsearch | 6.4.0 | 7.0.0 | 2018-08-18 | 2019-04-06 | BEAM-6090
org.elasticsearch:elasticsearch-hadoop | 5.0.0 | 7.0.0 | 2016-10-26 | 2019-04-06 | BEAM-5551
org.elasticsearch.client:elasticsearch-rest-client | 6.4.0 | 7.0.0 | 2018-08-18 | 2019-04-06 | BEAM-6091
com.google.errorprone:error_prone_annotations | 2.1.2 | 2.3.3 | 2017-10-19 | 2019-02-22 | BEAM-6741
org.elasticsearch.test:framework | 6.4.0 | 7.0.0 | 2018-08-18 | 2019-04-06 | BEAM-6092
com.google.auth:google-auth-library-credentials | 0.12.0 | 0.15.0 | 2018-11-14 | 2019-03-27 | BEAM-6478
io.grpc:grpc-auth | 1.17.1 | 1.20.0 | 2018-12-07 | 2019-04-10 | BEAM-5896
io.grpc:grpc-context | 1.13.1 | 1.20.0 | 2018-06-21 | 2019-04-10 | BEAM-5897
io.grpc:grpc-core | 1.17.1 | 1.20.0 | 2018-12-07 | 2019-04-10 | BEAM-5898
io.grpc:grpc-netty | 1.17.1 | 1.20.0 | 2018-12-07 | 2019-04-10 | BEAM-5899
io.grpc:grpc-protobuf | 1.13.1 | 1.20.0 | 2018-06-21 | 2019-04-10 | BEAM-5900
io.grpc:grpc-stub | 1.17.1 | 1.20.0 | 2018-12-07 | 2019-04-10 | BEAM-5901
io.grpc:grpc-testing | 1.13.1 | 1.20.0 | 2018-06-21 | 2019-04-10 | BEAM-5902
com.google.code.gson:gson | 2.7 | 2.8.5 | 2016-06-14 | 2018-05-22 | BEAM-5558
com.google.guava:guava | 20.0 | 27.1-jre | 2016-10-28 | 2019-03-08 | BEAM-5559
org.apache.hbase:hbase-common | 1.2.6 | 2.1.4 | 2017-05-29 | 2019-03-20 | BEAM-5560
org.apache.hbase:hbase-hadoop-compat | 1.2.6 | 2.1.4 | 2017-05-29 | 2019-03-20 | BEAM-5561
org.apache.hbase:hbase-hadoop2-compat | 1.2.6 | 2.1.4 | 2017-05-29 | 2019-03-20 | BEAM-5562
org.apache.hbase:hbase-server | 1.2.6 | 2.1.4 | 2017-05-29 | 2019-03-20 | BEAM-5563
org.apache.hbase:hbase-shaded-client | 1.2.6 | 2.1.4 | 2017-05-29 | 2019-03-20 | BEAM-5564
org.apache.hive:hive-cli | 2.1.0 | 3.1.1 | 2016-06-17 | 2018-10-24 | BEAM-5566
org.apache.hive:hive-common | 2.1.0 | 3.1.1 | 2016-06-17 | 2018-10-24 | BEAM-5567
org.apache.hive:hive-exec | 2.1.0 | 3.1.1 | 2016-06-17 | 2018-10-24 | BEAM-5568
org.apache.hive.hcatalog:hive-hcatalog-core | 2.1.0 | 3.1.1 | 2016-06-17 | 2018-10-24 | BEAM-5569
net.java.dev.javacc:javacc | 4.0 | 7.0.4 | 2006-03-17 | 2018-09-17 | BEAM-5570
javax.servlet:javax.servlet-api | 3.1.0 | 4.0.1 | 2013-04-25 | 2018-04-20 | BEAM-5750
org.eclipse.jetty:jetty-server | 9.2.10.v20150310 | 9.4.17.v20190418 | 2015-03-10 | 2019-04-18 | BEAM-5752
org.eclipse.jetty:jetty-servlet | 9.2.10.v20150310 | 9.4.17.v20190418 | 2015-03-10 | 2019-04-18 | BEAM-5753
net.java.dev.jna:jna | 4.1.0 | 5.3.0 | 2014-03-06 | 2019-04-20 | BEAM-5573
junit:junit | 4.13-beta-1 | 4.13-beta-2 | 2018-11-25 | 2019-02-02 | BEAM-6127
com.esotericsoftware:kryo | 4.0.2 | 5.0.0-RC4 | 2018-03-20 | 2019-04-14 | BEAM-5809
com.esotericsoftware.kryo:kryo | 2.21 | 2.24.0 | 2013-02-27 | 2014-05-04 | BEAM-5574
io.netty:netty-tcnative-boringssl-static

Re: [VOTE] Release 2.12.0, release candidate #4

2019-04-22 Thread Maximilian Michels

+1 (binding)

Found a minor bug while testing, but not a blocker: 
https://jira.apache.org/jira/browse/BEAM-7128


Thanks,
Max

On 20.04.19 23:02, Pablo Estrada wrote:

+1
Ran SQL postcommit, and Dataflow Portability Java validatesrunner tests.

-P.

On Wed, Apr 17, 2019 at 1:38 AM Jean-Baptiste Onofré wrote:


+1 (binding)

Quickly checked with beam-samples.

Regards
JB

On 16/04/2019 00:50, Andrew Pilloud wrote:
 > Hi everyone,
 >
 > Please review and vote on the release candidate #4 for the version
 > 2.12.0, as follows:
 >
 > [ ] +1, Approve the release
 > [ ] -1, Do not approve the release (please provide specific comments)
 >
 > The complete staging area is available for your review, which includes:
 > * JIRA release notes [1],
 > * the official Apache source release to be deployed to dist.apache.org
 > [2], which is signed with the key with fingerprint
 > 9E7CEC0661EFD610B632C610AE8FE17F9F8AE3D4 [3],
 > * all artifacts to be deployed to the Maven Central Repository [4],
 > * source code tag "v2.12.0-RC4" [5],
 > * website pull request listing the release [6], publishing the API
 > reference manual [7], and the blog post [8].
 > * Java artifacts were built with Gradle/5.2.1 and OpenJDK/Oracle JDK
 > 1.8.0_181.
 > * Python artifacts are deployed along with the source release to
 > dist.apache.org [2].
 > * Validation sheet with a tab for 2.12.0 release to help with validation
 > [9].
 >
 > The vote will be open for at least 72 hours. It is adopted by majority
 > approval, with at least 3 PMC affirmative votes.
 >
 > Thanks,
 > Andrew
 >
 > [1] https://jira.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527=12344944
 > [2] https://dist.apache.org/repos/dist/dev/beam/2.12.0/
 > [3] https://dist.apache.org/repos/dist/release/beam/KEYS
 > [4] https://repository.apache.org/content/repositories/orgapachebeam-1068/
 > [5] https://github.com/apache/beam/tree/v2.12.0-RC4
 > [6] https://github.com/apache/beam/pull/8215
 > [7] https://github.com/apache/beam-site/pull/588
 > [8] https://github.com/apache/beam/pull/8314
 > [9] https://docs.google.com/spreadsheets/d/1qk-N5vjXvbcEk68GjbkSZTR8AGqyNUM-oLFo_ZXBpJw/edit#gid=1007316984

-- 
Jean-Baptiste Onofré

jbono...@apache.org 
http://blog.nanthrax.net
Talend - http://www.talend.com



Re: [DISCUSS] Turn `WindowedValue` into `T` in the FnDataService and BeamFnDataClient interface definition

2019-04-22 Thread jincheng sun
Hi Kenn,

Thanks for your reply, and for explaining the design of WindowedValue clearly!

At present, the definitions of `FnDataService` and `BeamFnDataClient` in
the Data Plane are very clear and universal, e.g. send(...)/receive(...).
If they are only used within the Beam project, they are already very good,
because `WindowedValue` is a very basic data structure in Beam: both the
Runner and the SDK harness define the WindowedValue data structure.

The reason I want to change the interface parameter from `WindowedValue`
to T is that I want to make the `Data Plane` interface into a class
library that can be used by other projects (such as Apache Flink), so that
other projects can have their own `FnDataService` implementation. However,
the definition of `WindowedValue` does not apply to all projects. For
example, Apache Flink has a similar definition of its own: Flink streaming
has StreamRecord. If we change `WindowedValue` to T, then other projects'
implementations do not need to wrap values in WindowedValue, and the
interface becomes more concise. Furthermore, in some cases we only need a
bare T, such as for the Apache Flink DataSet operator.

So, I agree with your understanding: I don't expect `WindowedValueXXX`
in the FnDataService interface; I hope to just use a `T`.

Do you see any problem if we change the interface parameter from
`WindowedValue` to T?
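
To illustrate what I mean, here is a rough sketch of a sender interface that is generic in T. All of the names below (`DataSender`, `InMemorySender`, `WindowedElement`, `StreamElement`) are hypothetical stand-ins written for this mail; they are not the actual `FnDataService`, `BeamFnDataClient`, `WindowedValue`, or Flink `StreamRecord` types, and the real interfaces have more parameters than this.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: a data-plane sender that is generic in T. */
final class GenericDataPlaneSketch {

  /** Stand-in for a Beam-style element: a value plus windowing metadata. */
  static final class WindowedElement<V> {
    final V value;
    final long timestampMillis;

    WindowedElement(V value, long timestampMillis) {
      this.value = value;
      this.timestampMillis = timestampMillis;
    }
  }

  /** Stand-in for a record type owned by another project. */
  static final class StreamElement<V> {
    final V value;

    StreamElement(V value) {
      this.value = value;
    }
  }

  /** The interface only knows about T; it never mentions a windowed wrapper. */
  interface DataSender<T> {
    void send(T element);
  }

  /** Trivial in-memory implementation that records what was sent. */
  static final class InMemorySender<T> implements DataSender<T> {
    final List<T> sent = new ArrayList<>();

    @Override
    public void send(T element) {
      sent.add(element);
    }
  }

  public static void main(String[] args) {
    // Beam-style usage: T is the windowed wrapper.
    InMemorySender<WindowedElement<String>> beamStyle = new InMemorySender<>();
    beamStyle.send(new WindowedElement<>("a", 0L));

    // Another project plugs in its own record type without wrapping it.
    InMemorySender<StreamElement<String>> otherProject = new InMemorySender<>();
    otherProject.send(new StreamElement<>("b"));

    // Batch-style usage: T is just the bare value.
    InMemorySender<String> bare = new InMemorySender<>();
    bare.send("c");

    System.out.println(beamStyle.sent.size() + otherProject.sent.size() + bare.sent.size());
  }
}
```

The point of the sketch is only that the choice of wrapper moves from the interface to the caller; how coders and encodings are selected for each T is a separate question.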

Thanks,
Jincheng

On Sat, Apr 20, 2019 at 2:38 AM Kenneth Knowles wrote:

> WindowedValue has always been an interface, not a concrete representation:
> https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/util/WindowedValue.java
> .
> It is an abstract class because we started in Java 7 where you could not
> have default methods, and just due to legacy style concerns. It is not just
> discussed, but implemented, that there are WindowedValue implementations
> with fewer allocations.
> At the coder level, it was also always intended to have multiple
> encodings. We already do have separate encodings based on whether there is
> 1 window or multiple windows. The coder for a particular kind of
> WindowedValue should decide this. Before the Fn API none of this had to be
> standardized, because the runner could just choose whatever it wants. Now
> we have to standardize any encodings that runners and harnesses both need
> to know. There should be many, and adding more should be just a matter of
> standardization, no new design.
>
> None of this should be user-facing or in the runner API / pipeline graph -
> that is critical to making it flexible on the backend between the runner &
> SDK harness.
>
> If I understand it, from our offline discussion, you are interested in the
> case where you issue a ProcessBundleRequest to the SDK harness and none of
> the primitives in the subgraph will ever observe the metadata. So you want
> to not even have a tiny
> "WindowedValueWithNoMetadata". Is that accurate?
>
> Kenn
>
> On Fri, Apr 19, 2019 at 10:17 AM jincheng sun 
> wrote:
>
>> Thank you! And have a nice weekend!
>>
>>
>> On Sat, Apr 20, 2019 at 1:14 AM Lukasz Cwik wrote:
>>
>>> I have added you as a contributor.
>>>
>>> On Fri, Apr 19, 2019 at 9:56 AM jincheng sun 
>>> wrote:
>>>
 Hi Lukasz,

 Thanks for your affirmation and for providing more context. :)

 Would you please give me the contributor permission?  My JIRA ID is
 sunjincheng121.

 I would like to create/assign tickets for this work.

 Thanks,
 Jincheng

 On Sat, Apr 20, 2019 at 12:26 AM Lukasz Cwik wrote:

> Since I don't think this is a contentious change.
>
> On Fri, Apr 19, 2019 at 9:25 AM Lukasz Cwik  wrote:
>
>> Yes, using T makes sense.
>>
>> The WindowedValue was meant to be a context object in the SDK harness
>> that propagates various information about the current element. We have
>> discussed in the past about:
>> * making optimizations which would pass around less of the context
>> information if we know that the DoFns don't need it (for example, all the
>> values share the same window).
>> * versioning the encoding separately from the WindowedValue context
>> object (see recent discussion about element timestamp precision [1])
>> * the runner may want its own representation of a context object that
>> makes sense for it which isn't a WindowedValue necessarily.
>>
>> Feel free to cut a JIRA about this and start working on a change
>> towards this.
>>
>> 1:
>> https://lists.apache.org/thread.html/221b06e81bba335d0ea8d770212cc7ee047dba65bec7978368a51473@%3Cdev.beam.apache.org%3E
>>
>> On Fri, Apr 19, 2019 at 3:18 AM jincheng sun <
>> sunjincheng...@gmail.com> wrote:
>>
>>> Hi Beam devs,
>>>
>>> I read some of the docs about `Communicating over the Fn API` in
>>> Beam. I feel that Beam 

Re: AvroUtils converting generic record to Beam Row causes class cast exception

2019-04-22 Thread Vishwas Bm
Hi Rui,

I checked the AvroUtils code. There is a static initializer block that
registers Avro timestamp conversion functions for the logical type
timestamp-millis.

*// Code snippet below*
static {
  // This works around a bug in the Avro library (AVRO-1891) around
  // SpecificRecord's handling of DateTime types.
  SpecificData.get().addLogicalTypeConversion(new TimeConversions.TimestampConversion());
  GenericData.get().addLogicalTypeConversion(new TimeConversions.TimestampConversion());
}

Because of this, when deserializing a generic record from Kafka using
KafkaAvroDeserializer, the long value produced at the producer end gets
converted to joda-time during deserialization.

Next, when we try to convert this GenericRecord to a Row as part of the
AvroUtils.toBeamRowStrict function, we again try to convert the value
received to joda-time.
But an exception is thrown because the value is cast to Long.

*// Code snippet below:*
else if (logicalType instanceof LogicalTypes.TimestampMillis) {
  // Class cast exception is thrown here, as we are casting from JodaTime to Long
  return convertDateTimeStrict((Long) value, fieldType);
}

private static Object convertDateTimeStrict(Long value, Schema.FieldType fieldType) {
  checkTypeName(fieldType.getTypeName(), TypeName.DATETIME, "dateTime");
  return new Instant(value); // Creates a JodaTime instance here
}
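
One way to avoid the cast failure would be to normalize the value before it reaches the `(Long)` cast. The sketch below is only an illustration of that idea and not the fix applied in Beam: the class and method names are hypothetical, and `java.time.Instant` stands in for `org.joda.time.DateTime` so that the example runs without extra libraries (with Joda-Time the equivalent call would be `getMillis()`).

```java
import java.time.Instant;

/** Hypothetical sketch: accept either raw epoch millis or an already-converted instant. */
final class TimestampMillisNormalizer {

  /**
   * Returns epoch milliseconds for a "timestamp-millis" field value, whether
   * the deserializer left it as a Long or already converted it to an instant.
   */
  static long toEpochMillis(Object value) {
    if (value instanceof Long) {
      return (Long) value; // raw Avro representation
    }
    if (value instanceof Instant) {
      return ((Instant) value).toEpochMilli(); // a conversion was already applied upstream
    }
    throw new IllegalArgumentException(
        "Unsupported timestamp-millis value: "
            + (value == null ? "null" : value.getClass().getName()));
  }

  public static void main(String[] args) {
    long millis = 1555286400000L;
    System.out.println(toEpochMillis(millis));
    System.out.println(toEpochMillis(Instant.ofEpochMilli(millis)));
  }
}
```

A check like this could run either in the ParDo before calling AvroUtils.toBeamRowStrict or inside the strict conversion itself; which place is appropriate is for BEAM-7073 to decide.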


*Thanks & Regards,*

*Vishwas *



On Tue, Apr 16, 2019 at 9:18 AM Rui Wang  wrote:

> I didn't find code in `AvroUtils.toBeamRowStrict` that converts long to
> Joda time. `AvroUtils.toBeamRowStrict` retrieves objects from
> GenericRecord, and tries to cast objects based on their types (and
> cast(object) to long for "timestamp-millis"). see [1].
>
> So in order to use `AvroUtils.toBeamRowStrict`, the generated
> GenericRecord should have long for "timestamp-millis".
>
> The schema you pasted looks right. Not sure why generated class is Joda
> time (is it controlled by some flags?). But at least you could write a
> small function to do schema conversion for your need.
>
> [1]
> https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/utils/AvroUtils.java#L672
>
>
> Rui
>
>
> On Mon, Apr 15, 2019 at 7:11 PM Vishwas Bm  wrote:
>
>> Hi Rui,
>>
>> I agree that by converting it to long, there will be no error.
>> But KafkaIO is giving a GenericRecord with an attribute of type
>> JodaTime. If I convert it to long, AvroUtils.toBeamRowStrict then
>> converts it back to JodaTime.
>>
>> I used the avro tools 1.8.2 jar, for the below schema and I see that the
>> generated class has a JodaTime attribute.
>>
>> {
>>   "name": "timeOfRelease",
>>   "type": {
>>     "type": "long",
>>     "logicalType": "timestamp-millis",
>>     "connect.version": 1,
>>     "connect.name": "org.apache.kafka.connect.data.Timestamp"
>>   }
>> }
>>
>> *Attribute type in generated class:*
>> private org.joda.time.DateTime timeOfRelease;
>>
>>
>> So not sure why this type casting is required.
>>
>>
>> *Thanks & Regards,*
>>
>> *Vishwas *
>>
>>
>> On Tue, Apr 16, 2019 at 12:56 AM Rui Wang  wrote:
>>
>>> From reading the code, it seems that, as the logical type
>>> "timestamp-millis" implies, it expects millis as Long values under this
>>> logical type.
>>>
>>> So if you can convert joda-time to millis before calling
>>> "AvroUtils.toBeamRowStrict(genericRecord, this.beamSchema)", your exception
>>> will be gone.
>>>
>>> -Rui
>>>
>>>
>>> On Mon, Apr 15, 2019 at 10:28 AM Lukasz Cwik  wrote:
>>>
 +dev 

 On Sun, Apr 14, 2019 at 10:29 PM Vishwas Bm 
 wrote:

> Hi,
>
> Below is my pipeline:
>
> KafkaSource (KafkaIO.read) --> Pardo ---> BeamSql
> ---> KafkaSink(KafkaIO.write)
>
>
> The avro schema of the topic has a field of logical type
> timestamp-millis.  KafkaIO.read transform is creating a
> KafkaRecord, where this field is being converted to
> joda-time.
>
> In my Pardo transform, I am trying to use the AvroUtils class methods
> to convert the generic record to Beam Row and getting below class cast
> exception for the joda-time attribute.
>
>  AvroUtils.toBeamRowStrict(genericRecord, this.beamSchema)
>
> Caused by: java.lang.ClassCastException: org.joda.time.DateTime cannot
> be cast to java.lang.Long
> at
> org.apache.beam.sdk.schemas.utils.AvroUtils.convertAvroFieldStrict(AvroUtils.java:664)
> at
> org.apache.beam.sdk.schemas.utils.AvroUtils.toBeamRowStrict(AvroUtils.java:217)
>
> I have opened a jira https://issues.apache.org/jira/browse/BEAM-7073
> for this
>
>
>
> *Thanks & Regards,*
>
> *Vishwas *
>
>