Re: Precommits broken?

2018-06-13 Thread Scott Wegner
Indeed, I was going to send out an email about pre-commit filtering, but
we've already found some kinks and may need to revert it.

The change was submitted in PR#5611 [1] and enables Jenkins triggering to
only run pre-commits based on modified files. However, Udi noticed that
this also prevents manually running pre-commits on a PR with trigger
phrases when your PR changes don't match the pre-commit include path [2].
This was blocking 2.5.0 release validation, so I have a PR out to revert
the change [3].

I did some investigation and this is a deficiency in the Jenkins plugin
used to trigger jobs on pull requests. I've filed a bug [4] and submitted a
PR [5], but there's no guarantee that it'll get accepted or when it will be
available.

Question for others: we were hoping to enable pre-commit triggering as an
optimization to decrease testing wait time and limit the impact of test
flakiness [6]. But this bug in the plugin means we'd lose the ability to
manually trigger pre-commits which aren't automatically run. One workaround
would be to run the tests locally instead of on Jenkins, though that's
clearly less desirable. Is this a blocker?
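
A minimal sketch of the two behaviors in question, assuming a job is configured with a list of include-path regexes and one trigger phrase. This is not the Jenkins plugin's code: the function names, the include patterns, and the trigger phrase are hypothetical and chosen only for illustration. The first function models the behavior we want (a trigger phrase always runs the job); the second models the behavior described above, where the include path also gates manual triggers.

```python
import re


def matches_include_paths(changed_files, include_patterns):
    """Return True if any changed file matches any include regex."""
    return any(
        re.match(pattern, path)
        for pattern in include_patterns
        for path in changed_files
    )


def should_run_intended(changed_files, include_patterns,
                        comment=None, trigger_phrase=None):
    """Desired behavior: a trigger phrase in a PR comment always runs the job."""
    if comment and trigger_phrase and trigger_phrase.lower() in comment.lower():
        return True
    return matches_include_paths(changed_files, include_patterns)


def should_run_observed(changed_files, include_patterns,
                        comment=None, trigger_phrase=None):
    """Behavior described in this thread: the path filter is applied even to
    manual triggers, so a PR with no matching files can never run the job."""
    return matches_include_paths(changed_files, include_patterns)


# Hypothetical job configuration and PR contents (not taken from the Beam repo).
PYTHON_INCLUDES = [r"^sdks/python/.*$", r"^model/.*$"]
java_only_pr = ["sdks/java/core/src/main/java/Foo.java"]
```

With a Java-only PR and a comment containing the trigger phrase, `should_run_intended` returns True while `should_run_observed` returns False, which is the gap that blocked release validation.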

Should we:
(a) Keep pre-commit triggering enabled for now and hope the upstream patch
gets accepted, or
(b) Revert the pre-commit change and wait for the patch

Thoughts?

[1] https://github.com/apache/beam/pull/5611
[2] https://github.com/apache/beam/pull/5607#issuecomment-397080770
[3] https://github.com/apache/beam/pull/5638
[4] https://github.com/jenkinsci/ghprb-plugin/issues/678
[5] https://github.com/jenkinsci/ghprb-plugin/pull/680
[6]
https://docs.google.com/document/d/1lfbMhdIyDzIaBTgc9OUByhSwR94kfOzS_ozwKWTVl5U/edit#bookmark=id.6j8bwxnbp7fr


On Wed, Jun 13, 2018 at 10:03 PM Rui Wang  wrote:

> Precommit filter is a really coool optimization!
>
> -Rui
>
> On Wed, Jun 13, 2018 at 5:21 PM Andrew Pilloud 
> wrote:
>
>> Ah, so this is intended and I didn't break anything? Cool! Sorry for the
>> false alarm, looks like a great build optimization!
>>
>> Andrew
>>
>> On Wed, Jun 13, 2018 at 5:06 PM Yifan Zou  wrote:
>>
>>> Probably due to the precommit filter applied in #5611
>>> ?
>>>
>>> On Wed, Jun 13, 2018 at 5:02 PM Andrew Pilloud 
>>> wrote:
>>>
>>>> Looks like statuses got posted between me writing this email and
>>>> sending it. Still wondering why the python and go jobs appear to be
>>>> missing?
>>>>
>>>> Andrew
>>>>
>>>> On Wed, Jun 13, 2018 at 5:00 PM Andrew Pilloud
>>>> wrote:
>>>>
>>>>> Recent PRs don't appear to be running all the precommits, and success
>>>>> status isn't being pushed to PRs. Anyone know what is going on?
>>>>>
>>>>> See:
>>>>> https://github.com/apache/beam/pull/5592
>>>>> https://github.com/apache/beam/pull/5622
>>>>>
>>>>> Andrew
>>>>>
>>>>>


Re: [CANCEL][VOTE] Apache Beam, version 2.5.0, release candidate #1

2018-06-13 Thread Jean-Baptiste Onofré
It looks good to me, I'm merging and moving forward.

Regards
JB

On 14/06/2018 00:45, Pablo Estrada wrote:
> Sent out https://github.com/apache/beam/pull/5640 to ignore the flaky
> test. As JB is the release manager, I'll let him make the call on what to
> do about it.
> Best
> -P.
> 
> On Wed, Jun 13, 2018 at 3:34 PM Ahmet Altay wrote:
> 
> I would vote for second option, not a release blocker and disable
> the test in the release branch. My reasoning is:
> - ReferenceRunner is not yet the official alternative to existing
> direct runners.
> - It is bad to have flaky tests on the release branch, and we would
> not get good signal during validation.
> 
> On Wed, Jun 13, 2018 at 3:14 PM, Pablo Estrada wrote:
> 
> Hello all,
> cherrypicks for the release branch seem to be going well, but
> thanks to them we were able to surface a flaky test in the
> release branch. JIRA is
> filed: https://issues.apache.org/jira/projects/BEAM/issues/BEAM-4558
> 
> Given that test issue, I see the following options:
> - Consider that this test is not a release blocker. Go ahead
> with RC2 after cherrypicks are brought in, or
> - Consider that this test is not a release blocker, so we
> disable it before cutting RC2.
> - Consider this test a release blocker, and triage the bug for
> fixing.
> 
> What do you think?
> 
> Best
> -P.
> 
> On Wed, Jun 13, 2018 at 9:54 AM Pablo Estrada wrote:
> 
> Precommits for
> PR https://github.com/apache/beam/pull/5609 are now passing.
> For now I've simply set failOnWarning to false to cherrypick
> into the release, and fix in master later on.
> Best
> -P.
> 
> On Wed, Jun 13, 2018 at 9:08 AM Scott Wegner wrote:
> 
> From my understanding, the @SuppressFBWarnings usage is
> in a dependency (ByteBuddy) rather than directly in our
> code; so we're not able to modify the usage.
> 
> Pablo, feel free to disable failOnWarning for the
> sdks-java-core project temporarily. This isn't a major
> regression since we've only recently made the change to
> enable it [1]. We can work separately on figuring out
> how to resolve the warnings.
> 
> [1] https://github.com/apache/beam/pull/5319 
> 
> On Tue, Jun 12, 2018 at 11:57 PM Tim Robertson wrote:
> 
> Hi Pablo,
> 
> I'm afraid I couldn't find one either... there is an
> issue about it [1] which is old so it doesn't look
> likely to be resolved either.
> 
> If you have time (sorry I am a bit busy) could you
> please verify the version does work if you install
> that version locally? I know the maven version of
> that [2] but not sure on the gradle equivalent. If
> we know it works, we can then find a repository that
> fits ok with Apache/Beam policy.
> 
> Alternatively, we could consider using a fully
> qualified reference (i.e.
> @edu.umd.cs.findbugs.annotations.SuppressWarnings)
> to the deprecated version and leave the dependency
> at the 1.3.9-1. I believe our general direction is
> to remove findbugs when errorprone covers all
> aspects so I *expect* this should be considered
> reasonable.
> 
> I hope this helps,
> Tim
> 
> [1] 
> https://github.com/stephenc/findbugs-annotations/issues/4
> [2] 
> https://maven.apache.org/guides/mini/guide-3rd-party-jars-local.html
> 
> On Wed, Jun 13, 2018 at 8:39 AM, Pablo Estrada wrote:
> 
> Hi Tim,
> you're right. Thanks for pointing that out.
> There's just one problem that I'm running into
> now: The 3.0.1-1 version does not seem to be
> available in Maven Central[1]. Looking at the
> website, I am not quite sure if there's another
> repository where they do stage the newer
> versions?[2]
> 
> -P
> 
> [1] 
> 

Re: Precommits broken?

2018-06-13 Thread Rui Wang
Precommit filter is a really coool optimization!

-Rui

On Wed, Jun 13, 2018 at 5:21 PM Andrew Pilloud  wrote:

> Ah, so this is intended and I didn't break anything? Cool! Sorry for the
> false alarm, looks like a great build optimization!
>
> Andrew
>
> On Wed, Jun 13, 2018 at 5:06 PM Yifan Zou  wrote:
>
>> Probably due to the precommit filter applied in #5611
>> ?
>>
>> On Wed, Jun 13, 2018 at 5:02 PM Andrew Pilloud 
>> wrote:
>>
>>> Looks like statuses got posted between me writing this email and sending
>>> it. Still wondering why the python and go jobs appear to be missing?
>>>
>>> Andrew
>>>
>>> On Wed, Jun 13, 2018 at 5:00 PM Andrew Pilloud 
>>> wrote:
>>>
>>>> Recent PRs don't appear to be running all the precommits, and success
>>>> status isn't being pushed to PRs. Anyone know what is going on?
>>>>
>>>> See:
>>>> https://github.com/apache/beam/pull/5592
>>>> https://github.com/apache/beam/pull/5622
>>>>
>>>> Andrew




Re: [CANCEL][VOTE] Apache Beam, version 2.5.0, release candidate #1

2018-06-13 Thread Eugene Kirpichov
FWIW I have a fix to the flaky test in
https://github.com/apache/beam/pull/5585 (open)

On Wed, Jun 13, 2018 at 5:26 PM Udi Meiri  wrote:

> +1 to ignoring flaky test.
>
> FYI there's a fourth cherrypick: https://github.com/apache/beam/pull/5624
>
> On Wed, Jun 13, 2018 at 3:45 PM Pablo Estrada  wrote:
>
>> Sent out https://github.com/apache/beam/pull/5640 to ignore the flaky
>> test. As JB is the release manager, I'll let him make the call on what to do
>> about it.
>> Best
>> -P.
>>
>> On Wed, Jun 13, 2018 at 3:34 PM Ahmet Altay  wrote:
>>
>>> I would vote for second option, not a release blocker and disable the
>>> test in the release branch. My reasoning is:
>>> - ReferenceRunner is not yet the official alternative to existing direct
>>> runners.
>>> - It is bad to have flaky tests on the release branch, and we would not
>>> get good signal during validation.
>>>
>>> On Wed, Jun 13, 2018 at 3:14 PM, Pablo Estrada 
>>> wrote:
>>>
 Hello all,
 cherrypicks for the release branch seem to be going well, but thanks to
 them we were able to surface a flaky test in the release branch. JIRA is
 filed: https://issues.apache.org/jira/projects/BEAM/issues/BEAM-4558

 Given that test issue, I see the following options:
 - Consider that this test is not a release blocker. Go ahead with RC2
 after cherrypicks are brought in, or
 - Consider that this test is not a release blocker, so we disable it
 before cutting RC2.
 - Consider this test a release blocker, and triage the bug for fixing.

 What do you think?

 Best
 -P.

 On Wed, Jun 13, 2018 at 9:54 AM Pablo Estrada 
 wrote:

> Precommits for PR https://github.com/apache/beam/pull/5609 are now
> passing. For now I've simply set failOnWarning to false to cherrypick into
> the release, and fix in master later on.
> Best
> -P.
>
> On Wed, Jun 13, 2018 at 9:08 AM Scott Wegner 
> wrote:
>
>> From my understanding, the @SuppressFBWarnings usage is in a
>> dependency (ByteBuddy) rather than directly in our code; so we're not 
>> able
>> to modify the usage.
>>
>> Pablo, feel free to disable failOnWarning for the sdks-java-core
>> project temporarily. This isn't a major regression since we've only
>> recently made the change to enable it [1]. We can work separately on
>> figuring out how to resolve the warnings.
>>
>> [1] https://github.com/apache/beam/pull/5319
>>
>> On Tue, Jun 12, 2018 at 11:57 PM Tim Robertson <
>> timrobertson...@gmail.com> wrote:
>>
>>> Hi Pablo,
>>>
>>> I'm afraid I couldn't find one either... there is an issue about it
>>> [1] which is old so it doesn't look likely to be resolved either.
>>>
>>> If you have time (sorry I am a bit busy) could you please verify the
>>> version does work if you install that version locally? I know the maven
>>> version of that [2] but not sure on the gradle equivalent. If we know it
>>> works, we can then find a repository that fits ok with Apache/Beam 
>>> policy.
>>>
>>> Alternatively, we could consider using a fully qualified reference
>>> (i.e. @edu.umd.cs.findbugs.annotations.SuppressWarnings) to the 
>>> deprecated
>>> version and leave the dependency at the 1.3.9-1. I believe our general
>>> direction is to remove findbugs when errorprone covers all aspects so I
>>> *expect* this should be considered reasonable.
>>>
>>> I hope this helps,
>>> Tim
>>>
>>> [1] https://github.com/stephenc/findbugs-annotations/issues/4
>>> [2]
>>> https://maven.apache.org/guides/mini/guide-3rd-party-jars-local.html
>>>
>>> On Wed, Jun 13, 2018 at 8:39 AM, Pablo Estrada 
>>> wrote:
>>>
 Hi Tim,
 you're right. Thanks for pointing that out. There's just one
 problem that I'm running into now: The 3.0.1-1 version does not seem 
 to be
 available in Maven Central[1]. Looking at the website, I am not quite 
 sure
 if there's another repository where they do stage the newer 
 versions?[2]

 -P

 [1]
 https://repo.maven.apache.org/maven2/com/github/stephenc/findbugs/findbugs-annotations
 /
 [2] http://stephenc.github.io/findbugs-annotations/

 On Tue, Jun 12, 2018 at 11:10 PM Tim Robertson <
 timrobertson...@gmail.com> wrote:

> Hi Pablo,
>
> I took only a quick look.
>
> "- The JAR from the non-LGPL findbugs does not contain the
> SuppressFBWarnings annotation"
>
> Unless I misunderstand you it looks like SuppressFBWarnings was
> added in Stephen's version in this commit [1] which was
> introduced in version 2.0.3-1 -  I've checked is in the 3.0.1-1 build 
> [2]
> I notice in your commits [1] you've been exploring 

Re: Proposing interactive beam runner

2018-06-13 Thread Ahmet Altay
Thank you Sindy.

I like the demo; it looks great. This would be interesting to a lot of
users. What are your plans for moving this forward? What kind of input
are you looking for?

Ahmet

On Wed, Jun 13, 2018 at 2:32 PM, Eugene Kirpichov 
wrote:

> This is awesome, thanks Sindy! I hope that the questions related to
> portability will get resolved in a way that will allow to reuse some of the
> work for other interactive Beam experiences, including SQL as Andrew says,
> and providing a REPL e.g. for users of Scala or other JVM-based languages.
>
> +Neville Li  Do I remember correctly that you guys
> had some sort of interactivity going in Scio but were looking forward to
> Beam developing a native solution?
>
> On Wed, Jun 13, 2018 at 2:22 PM Sindy Li  wrote:
>
>> Thanks, Andrew!
>>
>> Here is a link to the demo on YouTube for people interested:
>> https://www.youtube.com/watch?v=c5CjA1e3Cqw&feature=youtu.be
>>
>> On Wed, Jun 13, 2018 at 1:23 PM, Andrew Pilloud 
>> wrote:
>>
>>> This sounds really interesting, thanks for sharing! We've just begun to
>>> explore making Beam SQL interactive. The Interactive Runner you've proposed
>>> sounds like it would solve a bunch of the problems SQL faces as well. SQL
>>> is written in Java right now, so we can't immediately reuse any code.
>>>
>>> Andrew
>>>
>>> On Wed, Jun 13, 2018 at 11:48 AM Sindy Li  wrote:
>>>
>>>> Resending after subscribing to dev list.
>>>>
>>>> -- Forwarded message --
>>>> From: Sindy Li
>>>> Date: Fri, Jun 8, 2018 at 5:57 PM
>>>> Subject: Proposing interactive beam runner
>>>> To: dev@beam.apache.org
>>>> Cc: Harsh Vardhan, Chamikara Jayalath <chamik...@google.com>,
>>>> Anand Iyer, Robert Bradshaw
>>>>
>>>>
>>>> Hello,
>>>>
>>>> We were exploring ways to provide an interactive notebook experience
>>>> for writing Beam Python pipelines. The design doc provides an
>>>> overview/vision of what we would like to achieve. The pull request
>>>> provides a prototype for the same. The document also provides demo
>>>> screenshots and instructions for running a demo in Jupyter. Please
>>>> take a look. We believe this would be a useful addition to Beam.
>>>>
>>>> Thanks!




>>


Re: [CANCEL][VOTE] Apache Beam, version 2.5.0, release candidate #1

2018-06-13 Thread Udi Meiri
+1 to ignoring flaky test.

FYI there's a fourth cherrypick: https://github.com/apache/beam/pull/5624

On Wed, Jun 13, 2018 at 3:45 PM Pablo Estrada  wrote:

> Sent out https://github.com/apache/beam/pull/5640 to ignore the flaky
> test. As JB is the release manager, I'll let him make the call on what to do
> about it.
> Best
> -P.
>
> On Wed, Jun 13, 2018 at 3:34 PM Ahmet Altay  wrote:
>
>> I would vote for second option, not a release blocker and disable the
>> test in the release branch. My reasoning is:
>> - ReferenceRunner is not yet the official alternative to existing direct
>> runners.
>> - It is bad to have flaky tests on the release branch, and we would not
>> get good signal during validation.
>>
>> On Wed, Jun 13, 2018 at 3:14 PM, Pablo Estrada 
>> wrote:
>>
>>> Hello all,
>>> cherrypicks for the release branch seem to be going well, but thanks to
>>> them we were able to surface a flaky test in the release branch. JIRA is
>>> filed: https://issues.apache.org/jira/projects/BEAM/issues/BEAM-4558
>>>
>>> Given that test issue, I see the following options:
>>> - Consider that this test is not a release blocker. Go ahead with RC2
>>> after cherrypicks are brought in, or
>>> - Consider that this test is not a release blocker, so we disable it
>>> before cutting RC2.
>>> - Consider this test a release blocker, and triage the bug for fixing.
>>>
>>> What do you think?
>>>
>>> Best
>>> -P.
>>>
>>> On Wed, Jun 13, 2018 at 9:54 AM Pablo Estrada 
>>> wrote:
>>>
 Precommits for PR https://github.com/apache/beam/pull/5609 are now
 passing. For now I've simply set failOnWarning to false to cherrypick into
 the release, and fix in master later on.
 Best
 -P.

 On Wed, Jun 13, 2018 at 9:08 AM Scott Wegner 
 wrote:

> From my understanding, the @SuppressFBWarnings usage is in a
> dependency (ByteBuddy) rather than directly in our code; so we're not able
> to modify the usage.
>
> Pablo, feel free to disable failOnWarning for the sdks-java-core
> project temporarily. This isn't a major regression since we've only
> recently made the change to enable it [1]. We can work separately on
> figuring out how to resolve the warnings.
>
> [1] https://github.com/apache/beam/pull/5319
>
> On Tue, Jun 12, 2018 at 11:57 PM Tim Robertson <
> timrobertson...@gmail.com> wrote:
>
>> Hi Pablo,
>>
>> I'm afraid I couldn't find one either... there is an issue about it
>> [1] which is old so it doesn't look likely to be resolved either.
>>
>> If you have time (sorry I am a bit busy) could you please verify the
>> version does work if you install that version locally? I know the maven
>> version of that [2] but not sure on the gradle equivalent. If we know it
>> works, we can then find a repository that fits ok with Apache/Beam 
>> policy.
>>
>> Alternatively, we could consider using a fully qualified reference
>> (i.e. @edu.umd.cs.findbugs.annotations.SuppressWarnings) to the 
>> deprecated
>> version and leave the dependency at the 1.3.9-1. I believe our general
>> direction is to remove findbugs when errorprone covers all aspects so I
>> *expect* this should be considered reasonable.
>>
>> I hope this helps,
>> Tim
>>
>> [1] https://github.com/stephenc/findbugs-annotations/issues/4
>> [2]
>> https://maven.apache.org/guides/mini/guide-3rd-party-jars-local.html
>>
>> On Wed, Jun 13, 2018 at 8:39 AM, Pablo Estrada 
>> wrote:
>>
>>> Hi Tim,
>>> you're right. Thanks for pointing that out. There's just one problem
>>> that I'm running into now: The 3.0.1-1 version does not seem to be
>>> available in Maven Central[1]. Looking at the website, I am not quite 
>>> sure
>>> if there's another repository where they do stage the newer versions?[2]
>>>
>>> -P
>>>
>>> [1]
>>> https://repo.maven.apache.org/maven2/com/github/stephenc/findbugs/findbugs-annotations
>>> /
>>> [2] http://stephenc.github.io/findbugs-annotations/
>>>
>>> On Tue, Jun 12, 2018 at 11:10 PM Tim Robertson <
>>> timrobertson...@gmail.com> wrote:
>>>
 Hi Pablo,

 I took only a quick look.

 "- The JAR from the non-LGPL findbugs does not contain the
 SuppressFBWarnings annotation"

 Unless I misunderstand you it looks like SuppressFBWarnings was
 added in Stephen's version in this commit [1] which was introduced
 in version 2.0.3-1 -  I've checked is in the 3.0.1-1 build [2]
 I notice in your commits [1] you've been exploring version 3.0.0
 already though... what happens when you use 3.0.1-1? It sounds like the
 wrong version is coming in rather than the annotation being missing.

 Thanks,
 Tim

 [1]
 

Re: Precommits broken?

2018-06-13 Thread Andrew Pilloud
Ah, so this is intended and I didn't break anything? Cool! Sorry for the
false alarm, looks like a great build optimization!

Andrew

On Wed, Jun 13, 2018 at 5:06 PM Yifan Zou  wrote:

> Probably due to the precommit filter applied in #5611
> ?
>
> On Wed, Jun 13, 2018 at 5:02 PM Andrew Pilloud 
> wrote:
>
>> Looks like statuses got posted between me writing this email and sending
>> it. Still wondering why the python and go jobs appear to be missing?
>>
>> Andrew
>>
>> On Wed, Jun 13, 2018 at 5:00 PM Andrew Pilloud 
>> wrote:
>>
>>> Recent PRs don't appear to be running all the precommits, and success
>>> status isn't being pushed to PRs. Anyone know what is going on?
>>>
>>> See:
>>> https://github.com/apache/beam/pull/5592
>>> https://github.com/apache/beam/pull/5622
>>>
>>> Andrew
>>>
>>>


Re: Proposal: keeping post-commit tests green

2018-06-13 Thread Ahmet Altay
On Wed, Jun 13, 2018 at 3:52 PM, Mikhail Gryzykhin 
wrote:

> Hi Ahmet,
>
> I've checked on tests status and most of other tests are green 98% of the
> time. So I feel that we do not need any explicit actions for those tests.
>

Is it going to be a one-time action to fix existing flaky tests? Or is it
about a process to detect flaky tests in general? If it is the former,
Java-only makes sense to me.


>
> However java tests seem to have most of the problems. So I moved it to
> requirements explicitly.
>
> I do not bring in fixing failing tests as those should not require any
> specific process.
>
> --Mikhail
>
> Have feedback ?
>
>
>


Re: Precommits broken?

2018-06-13 Thread Yifan Zou
Probably due to the precommit filter applied in #5611
?

On Wed, Jun 13, 2018 at 5:02 PM Andrew Pilloud  wrote:

> Looks like statuses got posted between me writing this email and sending
> it. Still wondering why the python and go jobs appear to be missing?
>
> Andrew
>
> On Wed, Jun 13, 2018 at 5:00 PM Andrew Pilloud 
> wrote:
>
>> Recent PRs don't appear to be running all the precommits, and success
>> status isn't being pushed to PRs. Anyone know what is going on?
>>
>> See:
>> https://github.com/apache/beam/pull/5592
>> https://github.com/apache/beam/pull/5622
>>
>> Andrew
>>
>>


Re: Building and visualizing the Beam SQL graph

2018-06-13 Thread Anton Kedin
From the visualization perspective, I really loved the interactive runner
demo where it shows the graph:
https://www.youtube.com/watch?v=c5CjA1e3Cqw&t=27s

On Wed, Jun 13, 2018 at 4:36 PM Kenneth Knowles  wrote:

> Another thing to consider is that we might return something like a
> "SqlPCollection" that is the PCollection plus additional metadata that
> is useful to the shell / enumerable converter (such as if the PCollection
> has a known finite size due to LIMIT, even if it is "unbounded", and the
> shell can return control to the user once it receives enough rows). After
> your proposed change this will be much more natural to do, so that's
> another point in favor of the refactor.
>
> Kenn
>
> On Wed, Jun 13, 2018 at 10:22 AM Andrew Pilloud 
> wrote:
>
>> One of my goals is to make the graph easier to read and map back to the
>> SQL EXPLAIN output. The way the graph is currently built (`toPTransform` vs
>> `toPCollection`) does make a big difference in that graph. I think it is
>> also important to have a common function to do the apply with consistent
>> naming. I think that will greatly help with ease of understanding. It
>> sounds like what we really want is this in the BeamRelNode interface:
>>
>> PInput buildPInput(Pipeline pipeline);
>> PTransform> buildPTransform();
>>
>> default PCollection toPCollection(Pipeline pipeline) {
>> return buildPInput(pipeline).apply(getStageName(), buildPTransform());
>> }
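
A toy model of the difference discussed in this thread, written in Python and not using Beam at all. It only tracks the step names a graph UI would show. The `RelNode` class and both functions are hypothetical stand-ins, assuming that a composite transform's name is prefixed onto everything applied inside it. `nested_steps` models the current approach, where each node's transform applies its inputs' transforms internally; `flat_steps` models the proposed approach, where inputs are built first and each node's own transform is applied at the top level.

```python
class RelNode:
    """Hypothetical stand-in for a relational node with zero or more inputs."""

    def __init__(self, name, inputs=()):
        self.name = name
        self.inputs = list(inputs)


def nested_steps(node, prefix=""):
    """Step names when each node's transform contains its inputs' transforms."""
    path = f"{prefix}/{node.name}" if prefix else node.name
    steps = []
    for child in node.inputs:
        # The child is applied inside this node, so it inherits this node's path.
        steps.extend(nested_steps(child, path))
    steps.append(path)
    return steps


def flat_steps(node):
    """Step names when inputs are built first and each node is applied at top level."""
    steps = []
    for child in node.inputs:
        steps.extend(flat_steps(child))
    steps.append(node.name)
    return steps


# A small plan: Project <- Filter <- Scan
plan = RelNode("Project", [RelNode("Filter", [RelNode("Scan")])])
```

For this plan, the nested form yields names such as `Project/Filter/Scan`, with the source buried two levels deep, while the flat form yields `Scan`, `Filter`, `Project` in data-flow order, which is closer to what EXPLAIN prints.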
>>
>> Andrew
>>
>> On Mon, Jun 11, 2018 at 2:27 PM Mingmin Xu  wrote:
>>
>>> EXPLAIN shows the execution plan from the SQL perspective only. After
>>> converting to a Beam composite PTransform, there are more steps
>>> underneath, and each Runner reorganizes the Beam PTransforms again, which
>>> makes the final pipeline hard to read. In the SQL module itself, I don't
>>> see any difference between
>>> `toPTransform` and `toPCollection`. We could have an easy-to-understand
>>> step name when converting RelNodes, but Runners show the graph to
>>> developers.
>>>
>>> Mingmin
>>>
>>> On Mon, Jun 11, 2018 at 2:06 PM, Andrew Pilloud 
>>> wrote:
>>>
 That sounds correct. And because each rel node might have a different
 input there isn't a standard interface (like PTransform<
 PCollection, PCollection> toPTransform());

 Andrew

 On Mon, Jun 11, 2018 at 1:31 PM Kenneth Knowles  wrote:

> Agree with that. It will be kind of tricky to generalize. I think
> there are some criteria in this case that might apply in other cases:
>
> 1. Each rel node (or construct of a DSL) should have a PTransform for
> how it computes its result from its inputs.
> 2. The inputs to that PTransform should actually be the inputs to the
> rel node!
>
> So I tried to improve #1 but I probably made #2 worse.
>
> Kenn
>
> On Mon, Jun 11, 2018 at 12:53 PM Anton Kedin  wrote:
>
>> Not answering the original question, but doesn't "explain" satisfy
>> the SQL use case?
>>
>> Going forward we probably want to solve this in a more general way.
>> We have at least 3 ways to represent the pipeline:
>>  - how runner executes it;
>>  - what it looks like when constructed;
>>  - what the user was describing in DSL;
>> And there will probably be more, if extra layers are built on top of
>> DSLs.
>>
>> If possible, we probably should be able to map any level of
>> abstraction to any other to better understand and debug the pipelines.
>>
>>
>> On Mon, Jun 11, 2018 at 12:17 PM Kenneth Knowles 
>> wrote:
>>
>>> In other words, revert
>>> https://github.com/apache/beam/pull/4705/files, at least in spirit?
>>> I agree :-)
>>>
>>> Kenn
>>>
>>> On Mon, Jun 11, 2018 at 11:39 AM Andrew Pilloud 
>>> wrote:
>>>
 We are currently converting the Calcite Rel tree to Beam by
 recursively building a tree of nested PTransforms. This results in a 
 weird
 nested graph in the dataflow UI where each node contains its inputs 
 nested
 inside of it. I'm going to change the internal data structure for
 converting the tree from a PTransform to a PCollection, which will 
 result
 in a more accurate representation of the tree structure being built and
 should simplify the code as well. This will not change the public 
 interface
 to SQL, which will remain a PTransform. Any thoughts or objections?

 I was also wondering if there are tools for visualizing the Beam
 graph aside from the dataflow runner UI. What other tools exist?

 Andrew

>>>
>>>
>>>
>>> --
>>> 
>>> Mingmin
>>>
>>


Re: Precommits broken?

2018-06-13 Thread Andrew Pilloud
Looks like statuses got posted between me writing this email and sending
it. Still wondering why the python and go jobs appear to be missing?

Andrew

On Wed, Jun 13, 2018 at 5:00 PM Andrew Pilloud  wrote:

> Recent PRs don't appear to be running all the precommits, and success
> status isn't being pushed to PRs. Anyone know what is going on?
>
> See:
> https://github.com/apache/beam/pull/5592
> https://github.com/apache/beam/pull/5622
>
> Andrew
>
>


Precommits broken?

2018-06-13 Thread Andrew Pilloud
Recent PRs don't appear to be running all the precommits, and success
status isn't being pushed to PRs. Anyone know what is going on?

See:
https://github.com/apache/beam/pull/5592
https://github.com/apache/beam/pull/5622

Andrew


Re: SQL Filter Pushdowns in Apache Beam SQL

2018-06-13 Thread Kenneth Knowles
This has come up in a couple of in-person conversations. Pushing filtering
and projection into connectors is something we intend to do. Calcite's
optimizer is designed to support this; we just don't have it set up.

Your use case sounds like one that might test the limits of that, since the
JDBC read would occur before windowing or setting it up as a side input.
I'd be curious what a Beam pipeline to do this without SQL would look like.
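
The two strategies being contrasted can be sketched without Beam, using an in-memory SQLite table as a stand-in for the JDBC reference data. Everything here is an illustrative assumption: the table layout, the function names, and the idea of treating one window's events as a plain list. The first function reads the whole table and filters during the join; the second collects the distinct keys first and pushes them into the query.

```python
import sqlite3


def make_reference_db(n=1000):
    """Build a hypothetical reference table with n rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE ref (key TEXT PRIMARY KEY, label TEXT)")
    conn.executemany(
        "INSERT INTO ref VALUES (?, ?)",
        [(f"k{i}", f"label-{i}") for i in range(n)],
    )
    return conn


def join_read_all(conn, events):
    """Read every reference row, then join in the pipeline."""
    rows = conn.execute("SELECT key, label FROM ref").fetchall()
    lookup = dict(rows)
    joined = [(k, v, lookup[k]) for k, v in events if k in lookup]
    return joined, len(rows)


def join_pushdown(conn, events):
    """Push the window's distinct keys down into the source query."""
    keys = sorted({k for k, _ in events})
    if not keys:
        return [], 0
    placeholders = ",".join("?" for _ in keys)
    rows = conn.execute(
        f"SELECT key, label FROM ref WHERE key IN ({placeholders})", keys
    ).fetchall()
    lookup = dict(rows)
    joined = [(k, v, lookup[k]) for k, v in events if k in lookup]
    return joined, len(rows)


# One window of (key, value) events; "missing" has no reference row.
window_events = [("k1", 10), ("k2", 20), ("k1", 30), ("missing", 5)]
```

Both functions return the same joined rows; the second element of each result is the number of rows fetched from the source, which is where the two strategies differ. The pushdown variant needs the window's keys before it can issue the read, which is the ordering problem mentioned above.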

Kenn

On Wed, Jun 13, 2018 at 8:47 AM Lukasz Cwik  wrote:

> It is currently the latter, where all the data is read and then filtered
> within the pipeline. Note that this doesn't mean that all the data is
> loaded into memory as the way that the join is done is dependent on the
> Runner that is powering the pipeline.
>
> Kenn had shared this doc[1] which is starting to look at integrating
> Runners and IO into the SQL shell and attempting to start defining a way to
> map properties from SQL onto the IO connector but it seems natural that the
> filter would get pushed down to the IO connector as well. Please take a
> look and feel free to comment.
>
> 1:
> https://docs.google.com/document/d/1ZFVlnldrIYhUgOfxIT2JcmTFFSWTl4HwAnQsnwiNL1g/edit#heading=h.4zubkdp87wok
>
> On Wed, Jun 13, 2018 at 7:39 AM Harshvardhan Agrawal <
> harshvardhan.ag...@gmail.com> wrote:
>
>> Hi,
>>
>> We are currently playing with Apache Beam’s SQL extension on top of
>> Flink. One of the features that we were interested is the SQL Predicate
>> Pushdown feature that Spark provides. Does Beam support that?
>>
>> For eg:
>> I have an unbounded dataset that I want to join with some static
>> reference data stored in a database. Will beam perform the logic of
>> figuring out all the unique keys in the window and push it down to the jdbc
>> source or will it bring all the data from the jdbc source into memory and
>> then perform the join?
>>
>> Thanks,
>> Harsh
>> --
>> Regards,
>> Harshvardhan
>>
>


Re: Apache Beam June Newsletter

2018-06-13 Thread Pablo Estrada
Thanks Gris! Lots of interesting things.
Best
-P.

On Wed, Jun 13, 2018 at 4:40 PM Griselda Cuevas  wrote:

> Hi Beam Community!
>
> Here [1] is the June Edition of our Apache Beam Newsletter. This edition
> was curated
> by our community of contributors, committers and PMCs. Generally, it
> contains the work done in the previous month (May in this case) and what's
> planned for the future.
>
> We hope to provide visibility to what's going on in the community, so if
> you have questions, feel free to ask in this thread.
>
> Cheers,
> Gris
>
> [1]
> https://docs.google.com/document/d/1BwRhOu-uDd3SLB_Om_Beke5RoGKos4hj7Ljh7zM2YIo/edit?ts=5b17fb92#
>
-- 
Got feedback? go/pabloem-feedback


Re: Beam Dependency Check Report (2018-06-13)

2018-06-13 Thread Yifan Zou
Thanks Kenn. These are all direct dependencies.

On Wed, Jun 13, 2018 at 4:40 PM Yifan Zou  wrote:

> Thanks everyone for the feedback!
>
> We will embed some text to explain the details of the report and guide
> people on what to do with it. Cham and I are preparing to start
> updating those dependencies, group them, and find owners for them where
> possible. Also, we will try to automate JIRA filing and assign the issues
> to owners to review/upgrade dependencies.
>
> Regards.
>
> Yifan Zou
>
> On Wed, Jun 13, 2018 at 10:45 AM Chamikara Jayalath 
> wrote:
>
>>
>> Thanks Yifan.
>>
>> On Wed, Jun 13, 2018 at 10:21 AM Ahmet Altay  wrote:
>>
>>> Thanks Yifan, this is great!
>>>
>>> My unsolicited feedback:
>>> - Could it warn against dependencies that did not get updates for a long
>>> time? For python there were examples of a dependency being abandoned by its
>>> own developers and it took us a while to figure it out and switch to
>>> maintained one. (Currently googledatastore is a dependency like that.)
>>>
>>> I will second Scott's question: what should I do with this
>>> report? Is it possible to add a link to quickly create a JIRA issue for a
>>> given dependency? Or is it possible to link to already open issues for the
>>> identified dependencies?
>>>
>>
>> We ultimately want to create JIRAs automatically and assign them to
>> owners if defined according to the policy document. We can also add a link
>> to the JIRA to the report. But given the number of outdated dependencies
>> listed here, looks like we'll have to bootstrap the process by manually
>> updating many of these dependencies.
>>
>> - Cham
>>
>>
>>> Ahmet
>>>
>>> On Wed, Jun 13, 2018 at 10:13 AM, Scott Wegner 
>>> wrote:
>>>
> >>>> Nifty. Here's some unsolicited feedback:
> >>>>
> >>>> * The report gives a nice view of the data and leaves it as an exercise
> >>>> to the reader to do the math on each row (v0.25.0 to v1.3.0 = 1 major
> >>>> version behind, 2017-06-26 to 2018-06-08 = 1 year behind). I would find the
> >>>> report more digestible if these details were already included.
> >>>> * The next question after reading this report is "What should I do with
> >>>> this?" I recommend embedding details or links to answer that. For example:
> >>>>
> >>>>   "High Priority Dependency Updates are defined as XYZ. In Beam, we
> >>>> make a best-effort attempt at keeping all dependencies up-to-date according
> >>>> to ABC. In the future, issues will be filed and tracked for these
> >>>> automatically, but in the meantime you can search for existing issues or
> >>>> open a new one. Read more about our dependency update policy."
> >>>>
> >>>> On Wed, Jun 13, 2018 at 9:02 AM Pablo Estrada
> >>>> wrote:

> Ahh very nice... thanks Yifan & Cham!
> Lots of old dependencies eh... very interesting.
> Best
> -P.
>
> On Wed, Jun 13, 2018 at 7:45 AM Yifan Zou  wrote:
>
>> Hi,
>>
>>
>> I want to follow up and explain this email.
>>
>>
>> This is a sample email that reports the results of Beam SDK
>> dependency check, which was proposed here
>> .
> >> The goal is to find updates for all dependencies of the Beam Python and
> >> Java SDKs and prioritize them. The job will be triggered automatically in
> >> Jenkins once a week and will generate a report. The report lists the
> >> high-priority updates based on the following criteria:
>>
>>
>> The dependency update is high priority if:
>>
>> 1. It has major versions update available;
>>
>>   e.g. org.assertj:assertj-core 2.5.0 -> 3.10.0
>>
>>  2. or, it is over 3 minor versions behind the latest version;
>>
>>   e.g. org.tukaani:xz 1.5 -> 1.8
>>
>> 3. or, the current version is behind the later version for over 180
>> days.
>>
>>   e.g. com.google.auto.service:auto-service 2014-10-24 ->
>> 2017-12-11
>>
>>
> >> This job helps Beam contributors determine which dependencies are far
> >> behind the latest released version. The next step would be automating the
> >> filing of JIRA bugs for dependency updates, grouping dependencies, and
> >> identifying owners to take care of the upgrades, following Chamikara's
> >> proposal.
>>
>>
>> For more readings:
>>
>> [Proposal] Beam dependency check automation
>> 
>>  by Yifan Zou
>>
>> [Proposal] Beam dependency update policy
>> 
>>  by *Chamikara Jayalath*
>>
>> Thank you.
>>
>> Yifan Zou
>>
>> On Wed, Jun 13, 2018 at 7:41 AM Apache Jenkins Server <
>> jenk...@builds.apache.org> wrote:
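
Scott's first point in the quoted thread (leaving the per-row math to the reader) could be handled by the report generator itself. The following is only a minimal Python sketch, assuming purely numeric dotted versions with at least a major and minor part, and ISO dates as they appear in the sample report. `describe_gap` is a hypothetical helper, not part of the actual Jenkins job.

```python
from datetime import date


def describe_gap(current_version, latest_version, current_date, latest_date):
    """Summarize how far behind a dependency is (hypothetical helper)."""
    # Assumption: versions look like "0.25.0" or "1.5" (numeric parts only).
    cur = [int(p) for p in current_version.split(".")]
    new = [int(p) for p in latest_version.split(".")]
    major_gap = new[0] - cur[0]
    if major_gap > 0:
        version_part = f"{major_gap} major version(s) behind"
    else:
        version_part = f"{new[1] - cur[1]} minor version(s) behind"
    days = (date.fromisoformat(latest_date) - date.fromisoformat(current_date)).days
    return f"{version_part}, {days} days behind"


# Row taken from the sample report: google-cloud-bigquery 0.25.0 -> 1.3.0
print(describe_gap("0.25.0", "1.3.0", "2017-06-26", "2018-06-08"))
# prints: 1 major version(s) behind, 347 days behind
```

Versions with qualifiers (for example "1.3.9-1") would need a real version parser; this sketch deliberately ignores them.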

Apache Beam June Newsletter

2018-06-13 Thread Griselda Cuevas
Hi Beam Community!

Here [1] is the June Edition of our Apache Beam Newsletter. This edition was
curated by our community of contributors, committers and PMCs. Generally, it
contains the work done in the previous month (May in this case) and what's
planned for the future.

We hope to provide visibility to what's going on in the community, so if
you have questions, feel free to ask in this thread.

Cheers,
Gris

[1]
https://docs.google.com/document/d/1BwRhOu-uDd3SLB_Om_Beke5RoGKos4hj7Ljh7zM2YIo/edit?ts=5b17fb92#


Re: Beam Dependency Check Report (2018-06-13)

2018-06-13 Thread Yifan Zou
Thanks, everyone, for the feedback!

We will embed some text to explain the details of the report and guide
people on what to do with it at this point. Cham and I are preparing to
start updating those dependencies, grouping them, and finding owners for
them where possible. We will also try to automate JIRA filing and assign
the issues to owners so they can review and upgrade the dependencies.

Regards.

Yifan Zou

On Wed, Jun 13, 2018 at 10:45 AM Chamikara Jayalath 
wrote:

>
> Thanks Yifan.
>
> On Wed, Jun 13, 2018 at 10:21 AM Ahmet Altay  wrote:
>
>> Thanks Yifan, this is great!
>>
>> My unsolicited feedback:
>> - Could it warn against dependencies that did not get updates for a long
>> time? For Python there were examples of a dependency being abandoned by its
>> own developers and it took us a while to figure it out and switch to a
>> maintained one. (Currently googledatastore is a dependency like that.)
>>
>> I will second Scott's question: what should I do with this report?
>> Is it possible to add a link to quickly create a JIRA issue for a given
>> dependency? Or is it possible to link to already open issues for the
>> identified dependencies?
>>
>
> We ultimately want to create JIRAs automatically and assign them to owners
> if defined according to the policy document. We can also add a link to the
> JIRA to the report. But given the number of outdated dependencies listed
> here, looks like we'll have to bootstrap the process by manually updating
> many of these dependencies.
>
> - Cham
>
>
>> Ahmet
>>
>> On Wed, Jun 13, 2018 at 10:13 AM, Scott Wegner 
>> wrote:
>>
>>> Nifty. Here's some unsolicited feedback:
>>>
>>> * The report gives a nice view of the data and leaves it as an exercise
>>> to the reader to do the math on each row (v0.25.0 to v1.3.0 = 1 major
>>> version behind, 2017-06-26 to 2018-06-08 = 1 year behind). I would find the
>>> report more digestible if these details were already included.
>>> * The next question after reading this report is "What should I do with
>>> this?" I recommend embedding details or links to answer that. For example:
>>>
>>>   "High Priority Dependency Updates are defined as XYZ. In Beam, we make
>>> a best-effort attempt at keeping all dependencies up-to-date according to
>>> ABC. In the future, issues will be filed and tracked for these
>>> automatically, but in the meantime you can search for existing issues or
>>> open a new one. Read more about our dependency update policy."
>>>
>>> On Wed, Jun 13, 2018 at 9:02 AM Pablo Estrada 
>>> wrote:
>>>
>>>> Ahh very nice... thanks Yifan & Cham!
>>>> Lots of old dependencies eh... very interesting.
>>>> Best
>>>> -P.
>>>>
>>>> On Wed, Jun 13, 2018 at 7:45 AM Yifan Zou wrote:

> Hi,
>
>
> I want to follow up and explain this email.
>
>
> This is a sample email that reports the results of Beam SDK dependency
> check, which was proposed here
> .
> The goal is to find updates for all dependencies of the Beam Python and
> Java SDKs and prioritize them. The job will be triggered automatically in
> Jenkins once a week and will generate a report. The report lists the
> high-priority updates based on the following criteria:
>
>
> The dependency update is high priority if:
>
> 1. It has major versions update available;
>
>   e.g. org.assertj:assertj-core 2.5.0 -> 3.10.0
>
>  2. or, it is over 3 minor versions behind the latest version;
>
>   e.g. org.tukaani:xz 1.5 -> 1.8
>
> 3. or, the current version is behind the later version for over 180
> days.
>
>   e.g. com.google.auto.service:auto-service 2014-10-24 ->
> 2017-12-11
>
>
> This job helps Beam contributors determine which dependencies are far
> behind the latest released version. The next step would be automating the
> filing of JIRA bugs for dependency updates, grouping dependencies, and
> identifying owners to take care of the upgrades, following Chamikara's
> proposal.
>
>
> For more readings:
>
> [Proposal] Beam dependency check automation
> 
>  by Yifan Zou
>
> [Proposal] Beam dependency update policy
> 
>  by *Chamikara Jayalath*
>
> Thank you.
>
> Yifan Zou
>
> On Wed, Jun 13, 2018 at 7:41 AM Apache Jenkins Server <
> jenk...@builds.apache.org> wrote:
>
>> High Priority Dependency Updates Of Beam Python SDK:
>> *Dependency Name* *Current Version* *Later Version* *Current Version
>> Release Date* *Later Version Release Date*
>> google-cloud-bigquery 0.25.0 1.3.0 2017-06-26 2018-06-08
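
The three high-priority criteria quoted in this message can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code of the actual Jenkins job: versions are assumed to be numeric `major.minor[.patch]` strings, `is_high_priority` is a hypothetical name, and the minor-version rule is read as "3 or more" so that the `org.tukaani:xz 1.5 -> 1.8` example from the email qualifies.

```python
from datetime import date

MINOR_GAP_THRESHOLD = 3  # assumption: ">= 3" so that the 1.5 -> 1.8 example counts
MAX_AGE_DAYS = 180       # from criterion 3 in the email


def is_high_priority(current_version, latest_version, current_date, latest_date):
    """Apply the three criteria from the email (illustrative sketch only)."""
    cur_major, cur_minor = (int(p) for p in current_version.split(".")[:2])
    new_major, new_minor = (int(p) for p in latest_version.split(".")[:2])
    # Criterion 1: a major version update is available.
    if new_major > cur_major:
        return True
    # Criterion 2: the current version is several minor versions behind.
    if new_major == cur_major and new_minor - cur_minor >= MINOR_GAP_THRESHOLD:
        return True
    # Criterion 3: the latest release is more than 180 days newer.
    age = date.fromisoformat(latest_date) - date.fromisoformat(current_date)
    return age.days > MAX_AGE_DAYS


# The google-cloud-bigquery row above qualifies through criterion 1.
print(is_high_priority("0.25.0", "1.3.0", "2017-06-26", "2018-06-08"))  # True
```

Checking the rules in this order means a dependency is reported once, even if it satisfies several criteria at the same time.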

Re: Beam Dependency Check Report (2018-06-13)

2018-06-13 Thread Kenneth Knowles
Wow, this is extremely readable and actionable. Are these all direct
dependencies, or also transitive?

Kenn

On Wed, Jun 13, 2018 at 10:45 AM Chamikara Jayalath 
wrote:

>
> Thanks Yifan.
>
> On Wed, Jun 13, 2018 at 10:21 AM Ahmet Altay  wrote:
>
>> Thanks Yifan, this is great!
>>
>> My unsolicited feedback:
>> - Could it warn against dependencies that did not get updates for a long
>> time? For Python there were examples of a dependency being abandoned by its
>> own developers and it took us a while to figure it out and switch to a
>> maintained one. (Currently googledatastore is a dependency like that.)
>>
>> I will second Scott's question: what should I do with this report?
>> Is it possible to add a link to quickly create a JIRA issue for a given
>> dependency? Or is it possible to link to already open issues for the
>> identified dependencies?
>>
>
> We ultimately want to create JIRAs automatically and assign them to owners
> if defined according to the policy document. We can also add a link to the
> JIRA to the report. But given the number of outdated dependencies listed
> here, looks like we'll have to bootstrap the process by manually updating
> many of these dependencies.
>
> - Cham
>
>
>> Ahmet
>>
>> On Wed, Jun 13, 2018 at 10:13 AM, Scott Wegner 
>> wrote:
>>
>>> Nifty. Here's some unsolicited feedback:
>>>
>>> * The report gives a nice view of the data and leaves it as an exercise
>>> to the reader to do the math on each row (v0.25.0 to v1.3.0 = 1 major
>>> version behind, 2017-06-26 to 2018-06-08 = 1 year behind). I would find the
>>> report more digestible if these details were already included.
>>> * The next question after reading this report is "What should I do with
>>> this?" I recommend embedding details or links to answer that. For example:
>>>
>>>   "High Priority Dependency Updates are defined as XYZ. In Beam, we make
>>> a best-effort attempt at keeping all dependencies up-to-date according to
>>> ABC. In the future, issues will be filed and tracked for these
>>> automatically, but in the meantime you can search for existing issues or
>>> open a new one. Read more about our dependency update policy."
>>>
>>> On Wed, Jun 13, 2018 at 9:02 AM Pablo Estrada 
>>> wrote:
>>>
>>>> Ahh very nice... thanks Yifan & Cham!
>>>> Lots of old dependencies eh... very interesting.
>>>> Best
>>>> -P.
>>>>
>>>> On Wed, Jun 13, 2018 at 7:45 AM Yifan Zou wrote:

> Hi,
>
>
> I want to follow up and explain this email.
>
>
> This is a sample email that reports the results of Beam SDK dependency
> check, which was proposed here
> .
> The goal is to find updates for all dependencies of the Beam Python and
> Java SDKs and prioritize them. The job will be triggered automatically in
> Jenkins once a week and will generate a report. The report lists the
> high-priority updates based on the following criteria:
>
>
> The dependency update is high priority if:
>
> 1. It has major versions update available;
>
>   e.g. org.assertj:assertj-core 2.5.0 -> 3.10.0
>
>  2. or, it is over 3 minor versions behind the latest version;
>
>   e.g. org.tukaani:xz 1.5 -> 1.8
>
> 3. or, the current version is behind the later version for over 180
> days.
>
>   e.g. com.google.auto.service:auto-service 2014-10-24 ->
> 2017-12-11
>
>
> This job helps Beam contributors determine which dependencies are far
> behind the latest released version. The next step would be automating the
> filing of JIRA bugs for dependency updates, grouping dependencies, and
> identifying owners to take care of the upgrades, following Chamikara's
> proposal.
>
>
> For more readings:
>
> [Proposal] Beam dependency check automation
> 
>  by Yifan Zou
>
> [Proposal] Beam dependency update policy
> 
>  by *Chamikara Jayalath*
>
> Thank you.
>
> Yifan Zou
>
> On Wed, Jun 13, 2018 at 7:41 AM Apache Jenkins Server <
> jenk...@builds.apache.org> wrote:
>
>> High Priority Dependency Updates Of Beam Python SDK:
>> *Dependency Name* *Current Version* *Later Version* *Current Version
>> Release Date* *Later Version Release Date*
>> google-cloud-bigquery 0.25.0 1.3.0 2017-06-26 2018-06-08
>> httplib2 0.9.2 0.11.3 2015-09-28 2018-03-30 High Priority Dependency
>> Updates Of Beam Java SDK:
>> *Dependency Name* *Current Version* *Later Version* *Current Version
>> Release Date* *Later Version Release Date*
>> org.assertj:assertj-core 

Re: Building and visualizing the Beam SQL graph

2018-06-13 Thread Kenneth Knowles
Another thing to consider is that we might return something like a
"SqlPCollection" that is the PCollection plus additional metadata that
is useful to the shell / enumerable converter (such as if the PCollection
has a known finite size due to LIMIT, even if it is "unbounded", and the
shell can return control to the user once it receives enough rows). After
your proposed change this will be much more natural to do, so that's
another point in favor of the refactor.

Kenn

On Wed, Jun 13, 2018 at 10:22 AM Andrew Pilloud  wrote:

> One of my goals is to make the graph easier to read and map back to the
> SQL EXPLAIN output. The way the graph is currently built (`toPTransform` vs
> `toPCollection`) does make a big difference in that graph. I think it is
> also important to have a common function to do the apply with consistent
> naming. I think that will greatly help with ease of understanding. It
> sounds like what we really want is this in the BeamRelNode interface:
>
> PInput buildPInput(Pipeline pipeline);
> PTransform<PInput, PCollection<Row>> buildPTransform();
>
> default PCollection<Row> toPCollection(Pipeline pipeline) {
>   return buildPInput(pipeline).apply(getStageName(), buildPTransform());
> }
>
> Andrew
>
> On Mon, Jun 11, 2018 at 2:27 PM Mingmin Xu  wrote:
>
> >> EXPLAIN shows the execution plan from the SQL perspective only. After
> >> conversion to a Beam composite PTransform there are more steps underneath,
> >> and each runner reorganizes the Beam PTransforms again, which makes the
> >> final pipeline hard to read. In the SQL module itself, I don't see any
> >> difference between `toPTransform` and `toPCollection`. We could use
> >> easy-to-understand step names when converting RelNodes, but it is the
> >> runners that show the graph to developers.
>>
>> Mingmin
>>
>> On Mon, Jun 11, 2018 at 2:06 PM, Andrew Pilloud 
>> wrote:
>>
> >>> That sounds correct. And because each rel node might have a different
> >>> input there isn't a standard interface (like PTransform<PCollection<Row>,
> >>> PCollection<Row>> toPTransform());
>>>
>>> Andrew
>>>
>>> On Mon, Jun 11, 2018 at 1:31 PM Kenneth Knowles  wrote:
>>>
> >>>> Agree with that. It will be kind of tricky to generalize. I think there
> >>>> are some criteria in this case that might apply in other cases:
> >>>>
> >>>> 1. Each rel node (or construct of a DSL) should have a PTransform for
> >>>> how it computes its result from its inputs.
> >>>> 2. The inputs to that PTransform should actually be the inputs to the
> >>>> rel node!
> >>>>
> >>>> So I tried to improve #1 but I probably made #2 worse.
> >>>>
> >>>> Kenn
> >>>>
> >>>> On Mon, Jun 11, 2018 at 12:53 PM Anton Kedin wrote:

> Not answering the original question, but doesn't "explain" satisfy the
> SQL use case?
>
> Going forward we probably want to solve this in a more general way. We
> have at least 3 ways to represent the pipeline:
>  - how runner executes it;
>  - what it looks like when constructed;
>  - what the user was describing in DSL;
> And there will probably be more, if extra layers are built on top of
> DSLs.
>
> If possible, we probably should be able to map any level of
> abstraction to any other to better understand and debug the pipelines.
>
>
> On Mon, Jun 11, 2018 at 12:17 PM Kenneth Knowles 
> wrote:
>
>> In other words, revert https://github.com/apache/beam/pull/4705/files,
>> at least in spirit? I agree :-)
>>
>> Kenn
>>
>> On Mon, Jun 11, 2018 at 11:39 AM Andrew Pilloud 
>> wrote:
>>
> >>> We are currently converting the Calcite Rel tree to Beam by
> >>> recursively building a tree of nested PTransforms. This results in a
> >>> weird nested graph in the Dataflow UI where each node contains its
> >>> inputs nested inside of it. I'm going to change the internal data
> >>> structure for converting the tree from a PTransform to a PCollection,
> >>> which will result in a more accurate representation of the tree
> >>> structure being built and should simplify the code as well. This will
> >>> not change the public interface to SQL, which will remain a PTransform.
> >>> Any thoughts or objections?
>>>
>>> I was also wondering if there are tools for visualizing the Beam
>>> graph aside from the dataflow runner UI. What other tools exist?
>>>
>>> Andrew
>>>
>>
>>
>>
>> --
>> 
>> Mingmin
>>
>
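
The difference discussed in this thread, nested transforms versus inputs built as sibling steps, can be illustrated with a toy Python sketch. It does not use Beam or Calcite: `RelNode`, `nested_steps`, and `flat_steps` are hypothetical names invented for this illustration, and the step names only mimic how a runner UI might label the graph.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RelNode:
    """Toy stand-in for a rel node (hypothetical, not a Beam or Calcite class)."""
    name: str
    inputs: List["RelNode"] = field(default_factory=list)


def nested_steps(node, prefix=""):
    """Each node wraps its inputs, so input steps are named inside the parent's scope."""
    path = f"{prefix}/{node.name}" if prefix else node.name
    steps = []
    for child in node.inputs:
        steps.extend(nested_steps(child, path))
    steps.append(path)
    return steps


def flat_steps(node):
    """Inputs are built first as siblings; the node then applies only its own step."""
    steps = []
    for child in node.inputs:
        steps.extend(flat_steps(child))
    steps.append(node.name)
    return steps


plan = RelNode("Project", [RelNode("Filter", [RelNode("Scan")])])
print(nested_steps(plan))  # ['Project/Filter/Scan', 'Project/Filter', 'Project']
print(flat_steps(plan))    # ['Scan', 'Filter', 'Project']
```

The flat naming lines up one-to-one with the nodes of the plan, which is the property the thread wants for mapping the graph back to EXPLAIN output.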


Re: Proposal: keeping post-commit tests green

2018-06-13 Thread Mikhail Gryzykhin
Hi Ahmet,

I've checked the test status, and most of the other tests are green 98% of
the time, so I feel that we do not need any explicit action for those tests.

The Java tests, however, seem to have most of the problems, so I moved them
into the requirements explicitly.

I did not include fixing failing tests, as that should not require any
specific process.

--Mikhail

Have feedback ?


On Wed, Jun 13, 2018 at 3:49 PM Ahmet Altay  wrote:

>
>
> On Wed, Jun 13, 2018 at 3:45 PM, Mikhail Gryzykhin 
> wrote:
>
>> Hello everybody,
>>
>> Thanks everyone. I didn't receive any more feedback on the design
>> proposal document [1] and I believe we've reached consensus. I've added
>> implementation tasks in JIRA (BEAM-4559 [2])  and will start coding soon.
>> As a recap, the high-level plan is:
>>
>>
>>- Split existing post-commit tests jobs to automatically and manually
>>triggered
>>- Add tracking by JIRA bugs for failing test job
>>- Create document describing post-commit failures handling policies
>>- Add tests status badge to PR template
>>- Create dashboard for post-commit tests
>>- Detect and fix flaky java tests (if any)
>>
>>
> Why is this limited to flaky java tests?
>
>
>>
>> [1]
>> https://docs.google.com/document/d/1sczGwnCvdHiboVajGVdnZL0rfnr7ViXXAebBAf_uQME
>> [2] https://issues.apache.org/jira/browse/BEAM-4559
>>
>> --Mikhail
>>
>>
>>
>


Re: Proposal: keeping post-commit tests green

2018-06-13 Thread Ahmet Altay
On Wed, Jun 13, 2018 at 3:45 PM, Mikhail Gryzykhin 
wrote:

> Hello everybody,
>
> Thanks everyone. I didn't receive any more feedback on the design proposal
> document [1] and I believe we've reached consensus. I've added
> implementation tasks in JIRA (BEAM-4559 [2])  and will start coding soon.
> As a recap, the high-level plan is:
>
>
>- Split existing post-commit tests jobs to automatically and manually
>triggered
>- Add tracking by JIRA bugs for failing test job
>- Create document describing post-commit failures handling policies
>- Add tests status badge to PR template
>- Create dashboard for post-commit tests
>- Detect and fix flaky java tests (if any)
>
>
Why is this limited to flaky java tests?


>
> [1] https://docs.google.com/document/d/1sczGwnCvdHiboVajGVdnZL0rfnr7ViXXAebBAf_uQME
> [2] https://issues.apache.org/jira/browse/BEAM-4559
>
> --Mikhail
>
>
>


Re: [CANCEL][VOTE] Apache Beam, version 2.5.0, release candidate #1

2018-06-13 Thread Pablo Estrada
Sent out https://github.com/apache/beam/pull/5640 to ignore the flaky test.
As JB is the release manager, I'll let him make the call on what to do about
it.
Best
-P.

On Wed, Jun 13, 2018 at 3:34 PM Ahmet Altay  wrote:

> I would vote for the second option: not a release blocker, and disable the
> test in the release branch. My reasoning is:
> - ReferenceRunner is not yet the official alternative to existing direct
> runners.
> - It is bad to have flaky tests on the release branch, and we would not
> get good signal during validation.
>
> On Wed, Jun 13, 2018 at 3:14 PM, Pablo Estrada  wrote:
>
>> Hello all,
>> cherrypicks for the release branch seem to be going well, but thanks to
>> them we were able to surface a flaky test in the release branch. JIRA is
>> filed: https://issues.apache.org/jira/projects/BEAM/issues/BEAM-4558
>>
>> Given that test issue, I see the following options:
>> - Consider that this test is not a release blocker. Go ahead with RC2
>> after cherrypicks are brought in, or
>> - Consider that this test is not a release blocker, so we disable it
>> before cutting RC2.
>> - Consider this test a release blocker, and triage the bug for fixing.
>>
>> What do you think?
>>
>> Best
>> -P.
>>
>> On Wed, Jun 13, 2018 at 9:54 AM Pablo Estrada  wrote:
>>
>>> Precommits for PR https://github.com/apache/beam/pull/5609 are now
>>> passing. For now I've simply set failOnWarning to false to cherrypick into
>>> the release, and fix in master later on.
>>> Best
>>> -P.
>>>
>>> On Wed, Jun 13, 2018 at 9:08 AM Scott Wegner  wrote:
>>>
> >>>> From my understanding, the @SuppressFBWarnings usage is in a dependency
> >>>> (ByteBuddy) rather than directly in our code; so we're not able to modify
> >>>> the usage.
> >>>>
> >>>> Pablo, feel free to disable failOnWarning for the sdks-java-core
> >>>> project temporarily. This isn't a major regression since we've only
> >>>> recently made the change to enable it [1]. We can work separately on
> >>>> figuring out how to resolve the warnings.
> >>>>
> >>>> [1] https://github.com/apache/beam/pull/5319
> >>>>
> >>>> On Tue, Jun 12, 2018 at 11:57 PM Tim Robertson <
> >>>> timrobertson...@gmail.com> wrote:

> Hi Pablo,
>
> I'm afraid I couldn't find one either... there is an issue about it
> [1] which is old so it doesn't look likely to be resolved either.
>
> If you have time (sorry I am a bit busy) could you please verify the
> version does work if you install that version locally? I know the maven
> version of that [2] but not sure on the gradle equivalent. If we know it
> works, we can then find a repository that fits ok with Apache/Beam policy.
>
> Alternatively, we could consider using a fully qualified reference
> (i.e. @edu.umd.cs.findbugs.annotations.SuppressWarnings) to the deprecated
> version and leave the dependency at the 1.3.9-1. I believe our general
> direction is to remove findbugs when errorprone covers all aspects so I
> *expect* this should be considered reasonable.
>
> I hope this helps,
> Tim
>
> [1] https://github.com/stephenc/findbugs-annotations/issues/4
> [2]
> https://maven.apache.org/guides/mini/guide-3rd-party-jars-local.html
>
> On Wed, Jun 13, 2018 at 8:39 AM, Pablo Estrada 
> wrote:
>
>> Hi Tim,
>> you're right. Thanks for pointing that out. There's just one problem
>> that I'm running into now: The 3.0.1-1 version does not seem to be
> >> available in Maven Central [1]. Looking at the website, I am not quite
> >> sure if there's another repository where they do stage the newer
> >> versions? [2]
>>
>> -P
>>
>> [1]
> >> https://repo.maven.apache.org/maven2/com/github/stephenc/findbugs/findbugs-annotations/
>> [2] http://stephenc.github.io/findbugs-annotations/
>>
>> On Tue, Jun 12, 2018 at 11:10 PM Tim Robertson <
>> timrobertson...@gmail.com> wrote:
>>
>>> Hi Pablo,
>>>
>>> I took only a quick look.
>>>
>>> "- The JAR from the non-LGPL findbugs does not contain the
>>> SuppressFBWarnings annotation"
>>>
> >>> Unless I misunderstand you it looks like SuppressFBWarnings was
> >>> added in Stephen's version in this commit [1] which was introduced
> >>> in version 2.0.3-1 - I've checked it is in the 3.0.1-1 build [2].
> >>> I notice in your commits [3] you've been exploring version 3.0.0
>>> already though... what happens when you use 3.0.1-1? It sounds like the
>>> wrong version is coming in rather than the annotation being missing.
>>>
>>> Thanks,
>>> Tim
>>>
>>> [1]
>>> https://github.com/stephenc/findbugs-annotations/commits/master/src/main/java/edu/umd/cs/findbugs/annotations/SuppressWarnings.java
>>> [2] https://github.com/stephenc/findbugs-annotations/releases
>>> [3]
>>> https://github.com/apache/beam/pull/5609/commits/32c7df706e970557f154ff6bc521b2e00f9d09ab
>>>
>>>
>>>
>>>
>>>
>>>
>>>

Re: Proposal: keeping post-commit tests green

2018-06-13 Thread Mikhail Gryzykhin
Hello everybody,

Thanks everyone. I didn't receive any more feedback on the design proposal
document [1] and I believe we've reached consensus. I've added
implementation tasks in JIRA (BEAM-4559 [2])  and will start coding soon.
As a recap, the high-level plan is:


   - Split existing post-commit test jobs into automatically and manually
   triggered ones
   - Track failing test jobs with JIRA bugs
   - Create a document describing policies for handling post-commit failures
   - Add a test status badge to the PR template
   - Create a dashboard for post-commit tests
   - Detect and fix flaky Java tests (if any)


[1]
https://docs.google.com/document/d/1sczGwnCvdHiboVajGVdnZL0rfnr7ViXXAebBAf_uQME
[2] https://issues.apache.org/jira/browse/BEAM-4559

--Mikhail


On Wed, Jun 6, 2018 at 1:12 PM Mikhail Gryzykhin  wrote:

> Hello everyone,
>
> Most of the comments on my last draft addressed technical details of
> automation implementation of specific processes proposed. No major process
> changes were suggested.
>
> If you have not yet, please review this document.
>
> Highlights from last change:
> * Bumped splitting test jobs after Kenneth's comment.
> * No-commit in case of too many open JIRA tickets (metric was there,
> action was missing)
> * No-commit in case of too old JIRA ticket (metric was there, action was
> missing)
> * Closed comments that are addressed in document.
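
The two no-commit conditions in the list above (too many open JIRA tickets, or a ticket that is too old) amount to a simple gate. The sketch below is only an illustration: both thresholds and the function name `commits_blocked` are hypothetical, since the actual values and automation are defined in the proposal document rather than in this thread.

```python
from datetime import date

# Hypothetical thresholds; the proposal document defines the real values.
MAX_OPEN_TICKETS = 5
MAX_TICKET_AGE_DAYS = 14


def commits_blocked(ticket_open_dates, today):
    """Return True when either no-commit condition holds.

    ticket_open_dates: filing dates of still-open post-commit failure tickets.
    """
    if len(ticket_open_dates) > MAX_OPEN_TICKETS:
        return True
    return any((today - opened).days > MAX_TICKET_AGE_DAYS
               for opened in ticket_open_dates)


today = date(2018, 6, 13)
print(commits_blocked([date(2018, 6, 10)], today))  # False: one recent ticket
print(commits_blocked([date(2018, 5, 1)], today))   # True: ticket older than 14 days
```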
>
> This document already has two LGTMs from Scott Wegner and Thomas Weise.
> If no major comments come in, I'll treat this document as complete and
> start working on implementing the work items defined in it.
>
> Thank you,
> --Mikhail
>
>
> On Tue, Jun 5, 2018 at 7:38 PM Thomas Weise  wrote:
>
>> Thanks for taking this initiative. As the number of contributors grows,
>> so does the cost of broken builds. I'm also in favor of locking master
>> merges until related issues are fixed (short term pain for long term
>> gain). It would penalize a few for the benefit of many.
>>
>> On that note, recently we also had a fair share of pre-commit build
>> issues, with a few making their way to master. These include instances
>> unrelated to build tooling, such as compile error or packaging. I don't
>> think we should run PR merges over the red light and suggest it is
>> necessary to step up the gatekeeper responsibility committers have.
>>
>> Thanks,
>> Thomas
>>
>>
>> On Tue, Jun 5, 2018 at 10:56 AM, Scott Wegner  wrote:
>>
>>> I've taken another pass over the doc, and it looks good to me. Thanks
>>> for driving this effort!
>>>
>>> On Mon, Jun 4, 2018 at 9:08 AM Mikhail Gryzykhin 
>>> wrote:
>>>
> >>>> Hello everyone,
> >>>>
> >>>> I have addressed comments on the proposal doc and updated it
> >>>> accordingly. I have also added a section on metrics that we want to track for
> >>>> pre-commit tests and contents for dashboard.
> >>>>
> >>>> Please, take a second look at the document.
> >>>>
> >>>> Highlights:
> >>>> * Sections that I feel require more discussion are marked with *[More
> >>>> opinions wanted]*
> >>>> ** I've kept original comments open for this iteration. Please, close
> >>>> them if you feel those resolved, or elaborate more on the topic.*
> >>>> * Added information on metrics to track
> >>>> * Moved “Split test jobs into automatically and manually triggered” to
> >>>> “Other ideas to consider”
> >>>> * Prioritized automated JIRA ticket creation over manual
> >>>> * Prioritized roll-back first policy
> >>>> * Added process for enforcing proposed policies.
> >>>>
> >>>> --Mikhail
> >>>>
> >>>> Have feedback ?
> >>>>
> >>>>
> >>>> On Tue, May 22, 2018 at 10:11 AM Scott Wegner
> >>>> wrote:

> Thanks for the thoughtful proposal Mikhail. I've left some comments in
> the doc.
>
> I encourage others to take a look: the proposal adds some strong
> policies about dealing with post-commit failures (rollback policy, locking
> master). Currently our post-commits are frequently red, and we're missing
> out on a valuable quality signal. I'm in favor of such policies to help get
> the test signals back to a healthy state.
>
> On Mon, May 21, 2018 at 2:48 PM Mikhail Gryzykhin 
> wrote:
>
>> Hi Everyone,
>>
>> I've updated design doc according to comments.
>>
>> https://docs.google.com/document/d/1sczGwnCvdHiboVajGVdnZL0rfnr7ViXXAebBAf_uQME
>>
> >> In general, the ideas proposed seem to be appreciated. Still, some
> >> sections require more discussion.
>>
>> Changes highlight:
>> * Added roll-back first policy to best practices. This includes
>> process on how to handle roll-back.
>> * Marked topics that I'd like to have more input on. [cyan color]
>>
>> --Mikhail
>>
>> Have feedback ?
>>
>>
>> On Fri, May 18, 2018 at 10:56 AM Andrew Pilloud 
>> wrote:
>>
>>> Blocking commits to master on test flaps seems critical here. The
>>> test flaps won't get the attention they deserve as long as people are 

Re: [CANCEL][VOTE] Apache Beam, version 2.5.0, release candidate #1

2018-06-13 Thread Ahmet Altay
I would vote for the second option: not a release blocker, and disable the
test in the release branch. My reasoning is:
- ReferenceRunner is not yet the official alternative to existing direct
runners.
- It is bad to have flaky tests on the release branch, and we would not get
a good signal during validation.

On Wed, Jun 13, 2018 at 3:14 PM, Pablo Estrada  wrote:

> Hello all,
> cherrypicks for the release branch seem to be going well, but thanks to
> them we were able to surface a flaky test in the release branch. JIRA is
> filed: https://issues.apache.org/jira/projects/BEAM/issues/BEAM-4558
>
> Given that test issue, I see the following options:
> - Consider that this test is not a release blocker. Go ahead with RC2
> after cherrypicks are brought in, or
> - Consider that this test is not a release blocker, so we disable it
> before cutting RC2.
> - Consider this test a release blocker, and triage the bug for fixing.
>
> What do you think?
>
> Best
> -P.
>
> On Wed, Jun 13, 2018 at 9:54 AM Pablo Estrada  wrote:
>
>> Precommits for PR https://github.com/apache/beam/pull/5609 are now
>> passing. For now I've simply set failOnWarning to false to cherrypick into
>> the release, and fix in master later on.
>> Best
>> -P.
>>
>> On Wed, Jun 13, 2018 at 9:08 AM Scott Wegner  wrote:
>>
>>> From my understanding, the @SuppressFBWarnings usage is in a dependency
>>> (ByteBuddy) rather than directly in our code; so we're not able to modify
>>> the usage.
>>>
>>> Pablo, feel free to disable failOnWarning for the sdks-java-core project
>>> temporarily. This isn't a major regression since we've only recently made
>>> the change to enable it [1]. We can work separately on figuring out how to
>>> resolve the warnings.
>>>
>>> [1] https://github.com/apache/beam/pull/5319
>>>
>>> On Tue, Jun 12, 2018 at 11:57 PM Tim Robertson <
>>> timrobertson...@gmail.com> wrote:
>>>
>>>> Hi Pablo,
>>>>
>>>> I'm afraid I couldn't find one either... there is an issue about it [1]
>>>> which is old so it doesn't look likely to be resolved either.
>>>>
>>>> If you have time (sorry I am a bit busy) could you please verify the
>>>> version does work if you install that version locally? I know the maven
>>>> version of that [2] but not sure on the gradle equivalent. If we know it
>>>> works, we can then find a repository that fits ok with Apache/Beam policy.
>>>>
>>>> Alternatively, we could consider using a fully qualified reference
>>>> (i.e. @edu.umd.cs.findbugs.annotations.SuppressWarnings) to the
>>>> deprecated version and leave the dependency at the 1.3.9-1. I believe our
>>>> general direction is to remove findbugs when errorprone covers all aspects
>>>> so I *expect* this should be considered reasonable.
>>>>
>>>> I hope this helps,
>>>> Tim
>>>>
>>>> [1] https://github.com/stephenc/findbugs-annotations/issues/4
>>>> [2] https://maven.apache.org/guides/mini/guide-3rd-party-jars-local.html
>>>>
>>>> On Wed, Jun 13, 2018 at 8:39 AM, Pablo Estrada
>>>> wrote:

> Hi Tim,
> you're right. Thanks for pointing that out. There's just one problem
> that I'm running into now: The 3.0.1-1 version does not seem to be
> available in Maven Central[1]. Looking at the website, I am not quite sure
> if there's another repository where they do stage the newer versions?[2]
>
> -P
>
> [1] https://repo.maven.apache.org/maven2/com/github/stephenc/findbugs/findbugs-annotations/
> [2] http://stephenc.github.io/findbugs-annotations/
>
> On Tue, Jun 12, 2018 at 11:10 PM Tim Robertson <
> timrobertson...@gmail.com> wrote:
>
>> Hi Pablo,
>>
>> I took only a quick look.
>>
>> "- The JAR from the non-LGPL findbugs does not contain the
>> SuppressFBWarnings annotation"
>>
>> Unless I misunderstand you it looks like SuppressFBWarnings was added
>> in Stephen's version in this commit [1] which was introduced in
>> version 2.0.3-1 - I've checked it is in the 3.0.1-1 build [2].
>> I notice in your commits [3] you've been exploring version 3.0.0
>> already though... what happens when you use 3.0.1-1? It sounds like the
>> wrong version is coming in rather than the annotation being missing.
>>
>> Thanks,
>> Tim
>>
>> [1] https://github.com/stephenc/findbugs-annotations/commits/master/src/main/java/edu/umd/cs/findbugs/annotations/SuppressWarnings.java
>> [2] https://github.com/stephenc/findbugs-annotations/releases
>> [3] https://github.com/apache/beam/pull/5609/commits/32c7df706e970557f154ff6bc521b2e00f9d09ab
>>
>>
>>
>>
>>
>>
>>
>> On Wed, Jun 13, 2018 at 2:37 AM, Pablo Estrada 
>> wrote:
>>
>>> Hi all,
>>> I'll humbly declare that after wrestling with the build to stop
>>> depending on the wrong findbugs_annotations, I feel somewhat lost. The
>>> issue is actually quite small:
>>>
>>> - The JAR from the non-LGPL findbugs does not contain the

Re: [CANCEL][VOTE] Apache Beam, version 2.5.0, release candidate #1

2018-06-13 Thread Pablo Estrada
Hello all,
cherrypicks for the release branch seem to be going well, but thanks to
them we were able to surface a flaky test in the release branch. JIRA is
filed: https://issues.apache.org/jira/projects/BEAM/issues/BEAM-4558

Given that test issue, I see the following options:
- Consider that this test is not a release blocker, and go ahead with RC2
once the cherrypicks are brought in, or
- Consider that this test is not a release blocker, but disable it before
cutting RC2, or
- Consider this test a release blocker, and triage the bug for fixing.

What do you think?

Best
-P.

On Wed, Jun 13, 2018 at 9:54 AM Pablo Estrada  wrote:

> Precommits for PR https://github.com/apache/beam/pull/5609 are now
> passing. For now I've simply set failOnWarning to false to cherrypick into
> the release, and fix in master later on.
> Best
> -P.
>
> On Wed, Jun 13, 2018 at 9:08 AM Scott Wegner  wrote:
>
>> From my understanding, the @SuppressFBWarnings usage is in a dependency
>> (ByteBuddy) rather than directly in our code; so we're not able to modify
>> the usage.
>>
>> Pablo, feel free to disable failOnWarning for the sdks-java-core project
>> temporarily. This isn't a major regression since we've only recently made
>> the change to enable it [1]. We can work separately on figuring out how to
>> resolve the warnings.
>>
>> [1] https://github.com/apache/beam/pull/5319
>>
>> On Tue, Jun 12, 2018 at 11:57 PM Tim Robertson 
>> wrote:
>>
>>> Hi Pablo,
>>>
>>> I'm afraid I couldn't find one either... there is an issue about it [1]
>>> which is old so it doesn't look likely to be resolved either.
>>>
>>> If you have time (sorry I am a bit busy) could you please verify the
>>> version does work if you install that version locally? I know the maven
>>> version of that [2] but not sure on the gradle equivalent. If we know it
>>> works, we can then find a repository that fits ok with Apache/Beam policy.
>>>
>>> Alternatively, we could consider using a fully qualified reference (i.e.
>>> @edu.umd.cs.findbugs.annotations.SuppressWarnings) to the deprecated
>>> version and leave the dependency at the 1.3.9-1. I believe our general
>>> direction is to remove findbugs when errorprone covers all aspects so I
>>> *expect* this should be considered reasonable.
>>>
>>> I hope this helps,
>>> Tim
>>>
>>> [1] https://github.com/stephenc/findbugs-annotations/issues/4
>>> [2] https://maven.apache.org/guides/mini/guide-3rd-party-jars-local.html
>>>
>>> On Wed, Jun 13, 2018 at 8:39 AM, Pablo Estrada 
>>> wrote:
>>>
 Hi Tim,
 you're right. Thanks for pointing that out. There's just one problem
 that I'm running into now: The 3.0.1-1 version does not seem to be
 available in Maven Central[1]. Looking at the website, I am not quite sure
 if there's another repository where they do stage the newer versions?[2]

 -P

 [1]
 https://repo.maven.apache.org/maven2/com/github/stephenc/findbugs/findbugs-annotations/
 [2] http://stephenc.github.io/findbugs-annotations/

 On Tue, Jun 12, 2018 at 11:10 PM Tim Robertson <
 timrobertson...@gmail.com> wrote:

> Hi Pablo,
>
> I took only a quick look.
>
> "- The JAR from the non-LGPL findbugs does not contain the
> SuppressFBWarnings annotation"
>
> Unless I misunderstand you it looks like SuppressFBWarnings was added
> in Stephen's version in this commit [1] which was introduced in
> version 2.0.3-1 - I've checked it is in the 3.0.1-1 build [2]
> I notice in your commits [3] you've been exploring version 3.0.0
> already though... what happens when you use 3.0.1-1? It sounds like the
> wrong version is coming in rather than the annotation being missing.
>
> Thanks,
> Tim
>
> [1]
> https://github.com/stephenc/findbugs-annotations/commits/master/src/main/java/edu/umd/cs/findbugs/annotations/SuppressWarnings.java
> [2] https://github.com/stephenc/findbugs-annotations/releases
> [3]
> https://github.com/apache/beam/pull/5609/commits/32c7df706e970557f154ff6bc521b2e00f9d09ab
>
>
>
>
>
>
>
> On Wed, Jun 13, 2018 at 2:37 AM, Pablo Estrada 
> wrote:
>
>> Hi all,
>> I'll humbly declare that after wrestling with the build to stop
>> depending on the wrong findbugs_annotations, I feel somewhat lost. The
>> issue is actually quite small:
>>
>> - The JAR from the non-LGPL findbugs does not contain the
>> SuppressFBWarnings annotation. This means that when building, ByteBuddy
>> produces a few warnings (nothing critical).
>> - The easiest way to avoid this failure is to call
>> applyJavaNature(failOnWarning: false), but this would be bad, since we
>> want to keep a high standard for tasks like ErrorProne and FindBugs itself.
>> - So I find myself lost: How do we suppress trivial warnings coming
>> from missing annotations, and honor warnings coming from other plugins?
>>
>> Any 

Re: [CANCEL][VOTE] Apache Beam, version 2.5.0, release candidate #1

2018-06-13 Thread Boyuan Zhang
Hey all,

Currently we have 3 PRs that are supposed to be cherrypicked into RC2:
Pablo:  https://github.com/apache/beam/pull/5609 (merged)
Udi: https://github.com/apache/beam/pull/5607 (open)
Charles:  https://github.com/apache/beam/pull/5636 (open)

Boyuan

On Wed, Jun 13, 2018 at 9:54 AM Pablo Estrada  wrote:

> Precommits for PR https://github.com/apache/beam/pull/5609 are now
> passing. For now I've simply set failOnWarning to false to cherrypick into
> the release, and fix in master later on.
> Best
> -P.
>
> On Wed, Jun 13, 2018 at 9:08 AM Scott Wegner  wrote:
>
>> From my understanding, the @SuppressFBWarnings usage is in a dependency
>> (ByteBuddy) rather than directly in our code; so we're not able to modify
>> the usage.
>>
>> Pablo, feel free to disable failOnWarning for the sdks-java-core project
>> temporarily. This isn't a major regression since we've only recently made
>> the change to enable it [1]. We can work separately on figuring out how to
>> resolve the warnings.
>>
>> [1] https://github.com/apache/beam/pull/5319
>>
>> On Tue, Jun 12, 2018 at 11:57 PM Tim Robertson 
>> wrote:
>>
>>> Hi Pablo,
>>>
>>> I'm afraid I couldn't find one either... there is an issue about it [1]
>>> which is old so it doesn't look likely to be resolved either.
>>>
>>> If you have time (sorry I am a bit busy) could you please verify the
>>> version does work if you install that version locally? I know the maven
>>> version of that [2] but not sure on the gradle equivalent. If we know it
>>> works, we can then find a repository that fits ok with Apache/Beam policy.
>>>
>>> Alternatively, we could consider using a fully qualified reference (i.e.
>>> @edu.umd.cs.findbugs.annotations.SuppressWarnings) to the deprecated
>>> version and leave the dependency at the 1.3.9-1. I believe our general
>>> direction is to remove findbugs when errorprone covers all aspects so I
>>> *expect* this should be considered reasonable.
>>>
>>> I hope this helps,
>>> Tim
>>>
>>> [1] https://github.com/stephenc/findbugs-annotations/issues/4
>>> [2] https://maven.apache.org/guides/mini/guide-3rd-party-jars-local.html
>>>
>>> On Wed, Jun 13, 2018 at 8:39 AM, Pablo Estrada 
>>> wrote:
>>>
 Hi Tim,
 you're right. Thanks for pointing that out. There's just one problem
 that I'm running into now: The 3.0.1-1 version does not seem to be
 available in Maven Central[1]. Looking at the website, I am not quite sure
 if there's another repository where they do stage the newer versions?[2]

 -P

 [1]
 https://repo.maven.apache.org/maven2/com/github/stephenc/findbugs/findbugs-annotations/
 [2] http://stephenc.github.io/findbugs-annotations/

 On Tue, Jun 12, 2018 at 11:10 PM Tim Robertson <
 timrobertson...@gmail.com> wrote:

> Hi Pablo,
>
> I took only a quick look.
>
> "- The JAR from the non-LGPL findbugs does not contain the
> SuppressFBWarnings annotation"
>
> Unless I misunderstand you it looks like SuppressFBWarnings was added
> in Stephen's version in this commit [1] which was introduced in
> version 2.0.3-1 - I've checked it is in the 3.0.1-1 build [2]
> I notice in your commits [3] you've been exploring version 3.0.0
> already though... what happens when you use 3.0.1-1? It sounds like the
> wrong version is coming in rather than the annotation being missing.
>
> Thanks,
> Tim
>
> [1]
> https://github.com/stephenc/findbugs-annotations/commits/master/src/main/java/edu/umd/cs/findbugs/annotations/SuppressWarnings.java
> [2] https://github.com/stephenc/findbugs-annotations/releases
> [3]
> https://github.com/apache/beam/pull/5609/commits/32c7df706e970557f154ff6bc521b2e00f9d09ab
>
>
>
>
>
>
>
> On Wed, Jun 13, 2018 at 2:37 AM, Pablo Estrada 
> wrote:
>
>> Hi all,
>> I'll humbly declare that after wrestling with the build to stop
>> depending on the wrong findbugs_annotations, I feel somewhat lost. The
>> issue is actually quite small:
>>
>> - The JAR from the non-LGPL findbugs does not contain the
>> SuppressFBWarnings annotation. This means that when building, ByteBuddy
>> produces a few warnings (nothing critical).
>> - The easiest way to avoid this failure is to call
>> applyJavaNature(failOnWarning: false), but this would be bad, since we
>> want to keep a high standard for tasks like ErrorProne and FindBugs itself.
>> - So I find myself lost: How do we suppress trivial warnings coming
>> from missing annotations, and honor warnings coming from other plugins?
>>
>> Any help / a PR from someone more capable would be appreciated.
>> Best
>> -P.
>>
>> On Tue, Jun 12, 2018 at 3:02 PM Ismaël Mejía 
>> wrote:
>>
>>> Yes, ok I was not aware it was already being addressed, nice.
>>> On Tue, Jun 12, 2018 at 11:56 PM Ahmet Altay 
>>> wrote:
>>> >
>>> > 

Re: Proposing interactive beam runner

2018-06-13 Thread Eugene Kirpichov
This is awesome, thanks Sindy! I hope that the questions related to
portability will get resolved in a way that allows reusing some of the
work for other interactive Beam experiences, including SQL as Andrew says,
and providing a REPL, e.g. for users of Scala or other JVM-based languages.

+Neville Li: do I remember correctly that you guys had
some sort of interactivity going in Scio but were looking forward to Beam
developing a native solution?

On Wed, Jun 13, 2018 at 2:22 PM Sindy Li  wrote:

> Thanks, Andrew!
>
> Here is a link to the demo on YouTube for people interested:
> https://www.youtube.com/watch?v=c5CjA1e3Cqw=youtu.be
>
> On Wed, Jun 13, 2018 at 1:23 PM, Andrew Pilloud 
> wrote:
>
>> This sounds really interesting, thanks for sharing! We've just begun to
>> explore making Beam SQL interactive. The Interactive Runner you've proposed
>> sounds like it would solve a bunch of the problems SQL faces as well. SQL
>> is written in Java right now, so we can't immediately reuse any code.
>>
>> Andrew
>>
>> On Wed, Jun 13, 2018 at 11:48 AM Sindy Li  wrote:
>>
>>> Resending after subscribing to dev list.
>>>
>>> -- Forwarded message --
>>> From: Sindy Li 
>>> Date: Fri, Jun 8, 2018 at 5:57 PM
>>> Subject: Proposing interactive beam runner
>>> To: dev@beam.apache.org
>>> Cc: Harsh Vardhan , Chamikara Jayalath <
>>> chamik...@google.com>, Anand Iyer , Robert Bradshaw <
>>> rober...@google.com>
>>>
>>>
>>> Hello,
>>>
>>> We were exploring ways to provide an interactive notebook experience for
>>> writing Beam Python pipelines. The design doc
>>> 
>>>  provides
>>> an overview/vision of what we would like to achieve. Pull request
>>>  provides a prototype for the
>>> same. The document also provides demo screen shots and instructions for
>>> running a demo in Jupyter. Please take a look. We believe this would be a
>>> useful addition to Beam.
>>>
>>> Thanks!
>>>
>>>
>>>
>>>
>


Re: Proposing interactive beam runner

2018-06-13 Thread Sindy Li
Thanks, Andrew!

Here is a link to the demo on YouTube for people interested:
https://www.youtube.com/watch?v=c5CjA1e3Cqw=youtu.be

On Wed, Jun 13, 2018 at 1:23 PM, Andrew Pilloud  wrote:

> This sounds really interesting, thanks for sharing! We've just begun to
> explore making Beam SQL interactive. The Interactive Runner you've proposed
> sounds like it would solve a bunch of the problems SQL faces as well. SQL
> is written in Java right now, so we can't immediately reuse any code.
>
> Andrew
>
> On Wed, Jun 13, 2018 at 11:48 AM Sindy Li  wrote:
>
>> Resending after subscribing to dev list.
>>
>> -- Forwarded message --
>> From: Sindy Li 
>> Date: Fri, Jun 8, 2018 at 5:57 PM
>> Subject: Proposing interactive beam runner
>> To: dev@beam.apache.org
>> Cc: Harsh Vardhan , Chamikara Jayalath <
>> chamik...@google.com>, Anand Iyer , Robert Bradshaw <
>> rober...@google.com>
>>
>>
>> Hello,
>>
>> We were exploring ways to provide an interactive notebook experience for
>> writing Beam Python pipelines. The design doc
>> 
>>  provides
>> an overview/vision of what we would like to achieve. Pull request
>>  provides a prototype for the
>> same. The document also provides demo screen shots and instructions for
>> running a demo in Jupyter. Please take a look. We believe this would be a
>> useful addition to Beam.
>>
>> Thanks!
>>
>>
>>
>>


Re: Proposing interactive beam runner

2018-06-13 Thread Andrew Pilloud
This sounds really interesting, thanks for sharing! We've just begun to
explore making Beam SQL interactive. The Interactive Runner you've proposed
sounds like it would solve a bunch of the problems SQL faces as well. SQL
is written in Java right now, so we can't immediately reuse any code.

Andrew

On Wed, Jun 13, 2018 at 11:48 AM Sindy Li  wrote:

> Resending after subscribing to dev list.
>
> -- Forwarded message --
> From: Sindy Li 
> Date: Fri, Jun 8, 2018 at 5:57 PM
> Subject: Proposing interactive beam runner
> To: dev@beam.apache.org
> Cc: Harsh Vardhan , Chamikara Jayalath <
> chamik...@google.com>, Anand Iyer , Robert Bradshaw <
> rober...@google.com>
>
>
> Hello,
>
> We were exploring ways to provide an interactive notebook experience for
> writing Beam Python pipelines. The design doc
> 
>  provides
> an overview/vision of what we would like to achieve. Pull request
>  provides a prototype for the
> same. The document also provides demo screen shots and instructions for
> running a demo in Jupyter. Please take a look. We believe this would be a
> useful addition to Beam.
>
> Thanks!
>
>
>
>


Fwd: Proposing interactive beam runner

2018-06-13 Thread Sindy Li
Resending after subscribing to dev list.

-- Forwarded message --
From: Sindy Li 
Date: Fri, Jun 8, 2018 at 5:57 PM
Subject: Proposing interactive beam runner
To: dev@beam.apache.org
Cc: Harsh Vardhan , Chamikara Jayalath <
chamik...@google.com>, Anand Iyer , Robert Bradshaw <
rober...@google.com>


Hello,

We were exploring ways to provide an interactive notebook experience for
writing Beam Python pipelines. The design doc provides an overview/vision
of what we would like to achieve. The pull request provides a prototype for
the same. The document also provides demo screenshots and instructions for
running a demo in Jupyter. Please take a look. We believe this would be a
useful addition to Beam.

Thanks!


Re: Beam breaks when it isn't loaded via the Thread Context Class Loader

2018-06-13 Thread Romain Manni-Bucau
If you have a javaagent you can; otherwise you can't, but Beam can proxy all
the instances it sees, which would be enough as long as the transforms don't
create their own classloaders without reusing the TCCL. On the deserialization
side it is easy, since it is Beam land: it "just" needs to find or
create the right classloader, and when the last instance using this
classloader disappears, it destroys it.
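
To make the "find or create, then destroy with the last user" idea concrete, here is a minimal sketch of a reference-counted classloader registry. Everything in it is an assumption made for illustration: the class name SharedLoaders, the string id used as the lookup key, and the use of URLClassLoader are hypothetical and are not taken from Beam or from the Talend component-runtime.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical registry: one shared classloader per id, closed with its last user. */
final class SharedLoaders {
  private static final class Entry {
    final URLClassLoader loader;
    int users; // number of live instances currently relying on this loader

    Entry(URLClassLoader loader) {
      this.loader = loader;
    }
  }

  private final Map<String, Entry> entries = new HashMap<>();

  /** Finds the loader registered under id, or creates it, and counts one more user. */
  synchronized ClassLoader acquire(String id, URL[] classpath) {
    Entry entry = entries.computeIfAbsent(
        id, key -> new Entry(new URLClassLoader(classpath, SharedLoaders.class.getClassLoader())));
    entry.users++;
    return entry.loader;
  }

  /** Drops one user; the loader is closed once nobody uses it any more. */
  synchronized void release(String id) {
    Entry entry = entries.get(id);
    if (entry == null) {
      throw new IllegalStateException("No classloader registered for " + id);
    }
    if (--entry.users == 0) {
      entries.remove(id);
      try {
        entry.loader.close();
      } catch (IOException e) {
        throw new UncheckedIOException(e);
      }
    }
  }

  /** Number of classloaders currently alive in the registry. */
  synchronized int size() {
    return entries.size();
  }
}
```

In this sketch a deserialized instance would call acquire() when it is created and release() in its teardown, so the count follows the instances rather than the threads.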

Romain Manni-Bucau
@rmannibucau | Blog | Old Blog | Github | LinkedIn | Book



Le mer. 13 juin 2018 à 20:32, Lukasz Cwik  a écrit :

> I'm assuming that you have control over the application environment. Would
> it be possible to replace the ObjectInputStream that the JVM provides with
> your own version that uses the thread context class loader and manage the
> classloader per thread depending on what "application" owns that thread?
> (You would need to add a bunch of logic to correctly associate each
> created thread/thread factory with the correct application but most
> application containers already need to do this).
>
> On Wed, Jun 13, 2018 at 11:12 AM Romain Manni-Bucau 
> wrote:
>
>> (answered inline)
>>
>>
>> Le mer. 13 juin 2018 à 18:42, Lukasz Cwik  a écrit :
>>
>>> Thanks for the example Romain.
>>>
>>> I took a look through it and was wondering whether it is only the root
>>> objects in the deserialization tree that need to implement
>>> SerializableService?
>>> Do lots of things need to implement SerializableService typically?
>>>
>>
>> Yes, only the one entering the (de)serialization process must handle that.
>>
>>
>>> What do you do with types that you don't control (for example do you
>>> create wrapper types)?
>>>
>>
>> Like beam classes? ;)
>>
>> You can instrument their bytecode like in
>> https://github.com/Talend/component-runtime/blob/master/component-runtime-beam/src/main/java/org/talend/sdk/component/runtime/beam/transformer/BeamIOTransformer.java#L208
>> (sorry, I don't use ByteBuddy but ASM directly). This is quite easy to do as
>> soon as you have a classloader for these classes or - if you reuse the JVM
>> classloader - a javaagent or equivalent.
>>
>> If you don't want/can't change the bytecode then you manipulate a proxy
>> instead of the real instance (a wrapper):
>> https://github.com/Talend/component-runtime/blob/master/component-runtime-manager/src/main/java/org/talend/sdk/component/runtime/manager/asm/ProxyGenerator.java#L118
>> .
>>
>>
>>>
>>> On Wed, Jun 6, 2018 at 9:56 PM Romain Manni-Bucau 
>>> wrote:
>>>
 Not sure the example is atomic enough but in
 https://github.com/Talend/component-runtime/blob/master/component-runtime-manager/src/main/java/org/talend/sdk/component/runtime/manager/finder/StandaloneContainerFinder.java#L40
 the "instance()" is a singleton used by all the runtime of the framework.

 Deserialization happens in
 https://github.com/Talend/component-runtime/blob/master/component-runtime-impl/src/main/java/org/talend/sdk/component/runtime/serialization/SerializableService.java#L26
 and serialization is about creating this object in a write replace. Then
 the runtime is switching its classloader (runner for beam) as in
 https://github.com/Talend/component-runtime/blob/master/component-runtime-impl/src/main/java/org/talend/sdk/component/runtime/base/LifecycleImpl.java#L60
 asap and resets it once done to not break its environment for reused jvms
 case.

 If we take the case of an IO, the io would lazily create its defined
 classloader from its spec and use some reference counting logic to destroy
 it when needed in its teardown for instance. The io then does the
 classloader switch in its callbacks (setup/teardown/process/bundle hooks
 etc).


 Le mer. 6 juin 2018 23:33, Lukasz Cwik  a écrit :

> Romain, can you point to an example of a global singleton registry
> that does this right for class loading (it may allow people to work
> towards such an effort)?
>
> On Tue, Jun 5, 2018 at 10:06 PM Romain Manni-Bucau <
> rmannibu...@gmail.com> wrote:
>
>> It is actually very localised in runner code where beam should reset
>> the classloader when the deserialization happens and then the runner owns
>> the classloader all the way in evaluators.
>>
>> If IO change the classloader they must indeed handle it too and patch
>> the deserialization too.
>>
>> Here again (we mentioned it multiple times in other threads) beam
>> misses a global singleton registry where you can register a "service" to
>> look it up based on a serialization configuration and a lifecycle
>> allowing to close the classloader in all instances without hacks.
>>
>>
>> 

Re: Beam breaks when it isn't loaded via the Thread Context Class Loader

2018-06-13 Thread Lukasz Cwik
I'm assuming that you have control over the application environment. Would
it be possible to replace the ObjectInputStream that the JVM provides with
your own version that uses the thread context class loader and manage the
classloader per thread depending on what "application" owns that thread?
(You would need to add a bunch of logic to correctly associate each created
thread/thread factory with the correct application but most application
containers already need to do this).
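
For reference, the stream side of this suggestion can be sketched as below, assuming the application is able to use its own ObjectInputStream subclass at the points where it deserializes. TcclObjectInputStream and roundTrip are hypothetical names, not Beam APIs, and the per-thread bookkeeping that decides which classloader is the TCCL is left out.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;

/** Hypothetical stream that resolves classes through the thread context class loader. */
class TcclObjectInputStream extends ObjectInputStream {
  TcclObjectInputStream(InputStream in) throws IOException {
    super(in);
  }

  @Override
  protected Class<?> resolveClass(ObjectStreamClass desc)
      throws IOException, ClassNotFoundException {
    ClassLoader tccl = Thread.currentThread().getContextClassLoader();
    if (tccl == null) {
      return super.resolveClass(desc);
    }
    try {
      // initialize=false: only locate the class, do not run static initializers here.
      return Class.forName(desc.getName(), false, tccl);
    } catch (ClassNotFoundException e) {
      // Fall back to the default lookup, which also covers primitive types.
      return super.resolveClass(desc);
    }
  }

  /** Serializes a value and reads it back through the TCCL-aware stream. */
  static Object roundTrip(Object value) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
        out.writeObject(value);
      }
      try (ObjectInputStream in =
          new TcclObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
        return in.readObject();
      }
    } catch (IOException | ClassNotFoundException e) {
      throw new IllegalStateException("Round trip failed", e);
    }
  }
}
```

The container would still have to set the right TCCL on each thread before any read happens; the subclass only makes the stream honor it.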

On Wed, Jun 13, 2018 at 11:12 AM Romain Manni-Bucau 
wrote:

> (answered inline)
>
>
> Le mer. 13 juin 2018 à 18:42, Lukasz Cwik  a écrit :
>
>> Thanks for the example Romain.
>>
>> I took a look through it and was wondering whether it is only the root
>> objects in the deserialization tree that need to implement
>> SerializableService?
>> Do lots of things need to implement SerializableService typically?
>>
>
> Yes, only the one entering the (de)serialization process must handle that.
>
>
>> What do you do with types that you don't control (for example do you
>> create wrapper types)?
>>
>
> Like beam classes? ;)
>
> You can instrument their bytecode like in
> https://github.com/Talend/component-runtime/blob/master/component-runtime-beam/src/main/java/org/talend/sdk/component/runtime/beam/transformer/BeamIOTransformer.java#L208
> (sorry, I don't use ByteBuddy but ASM directly). This is quite easy to do as
> soon as you have a classloader for these classes or - if you reuse the JVM
> classloader - a javaagent or equivalent.
>
> If you don't want/can't change the bytecode then you manipulate a proxy
> instead of the real instance (a wrapper):
> https://github.com/Talend/component-runtime/blob/master/component-runtime-manager/src/main/java/org/talend/sdk/component/runtime/manager/asm/ProxyGenerator.java#L118
> .
>
>
>>
>> On Wed, Jun 6, 2018 at 9:56 PM Romain Manni-Bucau 
>> wrote:
>>
>>> Not sure the example is atomic enough but in
>>> https://github.com/Talend/component-runtime/blob/master/component-runtime-manager/src/main/java/org/talend/sdk/component/runtime/manager/finder/StandaloneContainerFinder.java#L40
>>> the "instance()" is a singleton used by all the runtime of the framework.
>>>
>>> Deserialization happens in
>>> https://github.com/Talend/component-runtime/blob/master/component-runtime-impl/src/main/java/org/talend/sdk/component/runtime/serialization/SerializableService.java#L26
>>> and serialization is about creating this object in a write replace. Then
>>> the runtime is switching its classloader (runner for beam) as in
>>> https://github.com/Talend/component-runtime/blob/master/component-runtime-impl/src/main/java/org/talend/sdk/component/runtime/base/LifecycleImpl.java#L60
>>> asap and resets it once done to not break its environment for reused jvms
>>> case.
>>>
>>> If we take the case of an IO, the io would lazily create its defined
>>> classloader from its spec and use some reference counting logic to destroy
>>> it when needed in its teardown for instance. The io then does the
>>> classloader switch in its callbacks (setup/teardown/process/bundle hooks
>>> etc).
>>>
>>>
>>> Le mer. 6 juin 2018 23:33, Lukasz Cwik  a écrit :
>>>
 Romain, can you point to an example of a global singleton registry that
 does this right for class loading (it may allow people to work towards such
 an effort)?

 On Tue, Jun 5, 2018 at 10:06 PM Romain Manni-Bucau <
 rmannibu...@gmail.com> wrote:

> It is actually very localised in runner code where beam should reset
> the classloader when the deserialization happens and then the runner owns
> the classloader all the way in evaluators.
>
> If IO change the classloader they must indeed handle it too and patch
> the deserialization too.
>
> Here again (we mentioned it multiple times in other threads) beam
> misses a global singleton registry where you can register a "service" to
> look it up based on a serialization configuration and a lifecycle allowing
> to close the classloader in all instances without hacks.
>
>
> Le mar. 5 juin 2018 23:50, Kenneth Knowles  a écrit :
>
>> Perhaps we can also adopt a practice of making our own APIs
>> explicitly pass a Classloader when appropriate so we only have to set
>> this when we are entering code that does not have good hygiene. It might
>> actually be nice to have a lightweight static analysis to forbid bad
>> methods in our code.
>>
>> Kenn
>>
>> On Mon, Jun 4, 2018 at 3:43 PM Lukasz Cwik  wrote:
>>
>>> I totally agree, but there are so many Java APIs (including ours)
>>> that messed this up so everyone lives with the same hack.
>>>
>>> On Mon, Jun 4, 2018 at 3:41 PM Andrew Pilloud 
>>> wrote:
>>>
 It seems like a terribly fragile way to pass arguments but my
 tests pass when I wrap the JDBC path into Beam pipeline execution with
 that pattern.


Re: Beam breaks when it isn't loaded via the Thread Context Class Loader

2018-06-13 Thread Romain Manni-Bucau
(answered inline)


Le mer. 13 juin 2018 à 18:42, Lukasz Cwik  a écrit :

> Thanks for the example Romain.
>
> I took a look through it and was wondering whether it is only the root
> objects in the deserialization tree that need to implement
> SerializableService?
> Do lots of things need to implement SerializableService typically?
>

Yes, only the one entering the (de)serialization process must handle that.


> What do you do with types that you don't control (for example do you
> create wrapper types)?
>

Like beam classes? ;)

You can instrument their bytecode like in
https://github.com/Talend/component-runtime/blob/master/component-runtime-beam/src/main/java/org/talend/sdk/component/runtime/beam/transformer/BeamIOTransformer.java#L208
(sorry, I don't use ByteBuddy but ASM directly). This is quite easy to do as
soon as you have a classloader for these classes or - if you reuse the JVM
classloader - a javaagent or equivalent.

If you don't want/can't change the bytecode then you manipulate a proxy
instead of the real instance (a wrapper):
https://github.com/Talend/component-runtime/blob/master/component-runtime-manager/src/main/java/org/talend/sdk/component/runtime/manager/asm/ProxyGenerator.java#L118
.
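
As a rough illustration of the wrapper approach (this is not the linked ProxyGenerator, which generates classes with ASM), a plain JDK dynamic proxy can switch the TCCL around each call and restore it afterwards. ContextLoaderProxy and wrap are hypothetical names, and the sketch only works for types exposed through an interface that is visible from the given classloader.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;

/** Hypothetical helper: runs every call on the delegate with a chosen TCCL. */
final class ContextLoaderProxy {
  private ContextLoaderProxy() {}

  /**
   * Wraps delegate behind the interface api. Assumes api is visible from loader,
   * because the proxy class is defined in that loader.
   */
  static <T> T wrap(Class<T> api, T delegate, ClassLoader loader) {
    Object proxy = Proxy.newProxyInstance(
        loader,
        new Class<?>[] {api},
        (self, method, args) -> {
          Thread thread = Thread.currentThread();
          ClassLoader previous = thread.getContextClassLoader();
          thread.setContextClassLoader(loader);
          try {
            return method.invoke(delegate, args);
          } catch (InvocationTargetException e) {
            // Rethrow what the delegate threw, not the reflection wrapper.
            throw e.getCause();
          } finally {
            // Always restore the caller's loader so reused threads stay clean.
            thread.setContextClassLoader(previous);
          }
        });
    return api.cast(proxy);
  }
}
```

Bytecode instrumentation avoids the interface requirement, which is presumably why it is the first option mentioned above; the proxy is the fallback when the classes cannot be changed.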


>
> On Wed, Jun 6, 2018 at 9:56 PM Romain Manni-Bucau 
> wrote:
>
>> Not sure the example is atomic enough but in
>> https://github.com/Talend/component-runtime/blob/master/component-runtime-manager/src/main/java/org/talend/sdk/component/runtime/manager/finder/StandaloneContainerFinder.java#L40
>> the "instance()" is a singleton used by all the runtime of the framework.
>>
>> Deserialization happens in
>> https://github.com/Talend/component-runtime/blob/master/component-runtime-impl/src/main/java/org/talend/sdk/component/runtime/serialization/SerializableService.java#L26
>> and serialization is about creating this object in a write replace. Then
>> the runtime is switching its classloader (runner for beam) as in
>> https://github.com/Talend/component-runtime/blob/master/component-runtime-impl/src/main/java/org/talend/sdk/component/runtime/base/LifecycleImpl.java#L60
>> asap and resets it once done to not break its environment for reused jvms
>> case.
>>
>> If we take the case of an IO, the io would lazily create its defined
>> classloader from its spec and use some reference counting logic to destroy
>> it when needed in its teardown for instance. The io then does the
>> classloader switch in its callbacks (setup/teardown/process/bundle hooks
>> etc).
>>
>>
>> Le mer. 6 juin 2018 23:33, Lukasz Cwik  a écrit :
>>
>>> Romain, can you point to an example of a global singleton registry that
>>> does this right for class loading (it may allow people to work towards such
>>> an effort)?
>>>
>>> On Tue, Jun 5, 2018 at 10:06 PM Romain Manni-Bucau <
>>> rmannibu...@gmail.com> wrote:
>>>
 It is actually very localised in runner code where beam should reset
 the classloader when the deserialization happens and then the runner owns
 the classloader all the way in evaluators.

 If IO change the classloader they must indeed handle it too and patch
 the deserialization too.

 Here again (we mentioned it multiple times in other threads) beam
 misses a global singleton registry where you can register a "service" to
 look it up based on a serialization configuration and a lifecycle allowing
 to close the classloader in all instances without hacks.


 Le mar. 5 juin 2018 23:50, Kenneth Knowles  a écrit :

> Perhaps we can also adopt a practice of making our own APIs explicitly
> pass a Classloader when appropriate so we only have to set this when we
> are entering code that does not have good hygiene. It might actually be
> nice to have a lightweight static analysis to forbid bad methods in our
> code.
>
> Kenn
>
> On Mon, Jun 4, 2018 at 3:43 PM Lukasz Cwik  wrote:
>
>> I totally agree, but there are so many Java APIs (including ours)
>> that messed this up so everyone lives with the same hack.
>>
>> On Mon, Jun 4, 2018 at 3:41 PM Andrew Pilloud 
>> wrote:
>>
>>> It seems like a terribly fragile way to pass arguments but my tests
>>> pass when I wrap the JDBC path into Beam pipeline execution with that
>>> pattern.
>>>
>>> Thanks!
>>>
>>> Andrew
>>>
>>> On Mon, Jun 4, 2018 at 3:20 PM Lukasz Cwik  wrote:
>>>
 It is a common mistake for APIs to not include a way to specify
 which class loader to use when doing something like deserializing an
 instance of a class via the ObjectInputStream. This common issue also
 affects Apache Beam (SerializableCoder, PipelineOptionsFactory, ...)
 and the way that typical Java APIs have gotten around this is to use the
 thread context class loader (TCCL) as the way to plumb this additional
 attribute through. So Apache Beam is meant to in all places 

Re: [FYI] New Apache Beam Swag Store!

2018-06-13 Thread Griselda Cuevas
Thanks All!

To close the loop on the suggestions, I'll order more t-shirts in black so
we have some options.

G

On Wed, 13 Jun 2018 at 08:39, Ismaël Mejía  wrote:

> Great ! Thanks Gris and Matthias for putting this in place.
> Hope to get that hoodie soon. As a suggestion, more colors too, and
> eventually a t-shirt just with the big B logo.
> On Mon, Jun 11, 2018 at 6:50 PM Mikhail Gryzykhin 
> wrote:
> >
> > That's nice!
> >
> > More colors are appreciated :)
> >
> > --Mikhail
> >
> >
> > On Sun, Jun 10, 2018 at 8:20 PM Kenneth Knowles  wrote:
> >>
> >> Sweet! Agree with Raghu :-)
> >>
> >> Kenn
> >>
> >> On Sun, Jun 10, 2018 at 6:06 AM Matthias Baetens <
> baetensmatth...@gmail.com> wrote:
> >>>
> >>> Great news, big thanks for all the work, Gris! Looking forward to
> people wearing this around the globe ;)
> >>>
> >>> On Sat, 9 Jun 2018 at 01:28 Ankur Goenka  wrote:
> 
>  Awesome!
> 
> 
>  On Fri, Jun 8, 2018 at 4:24 PM Pablo Estrada 
> wrote:
> >
> > Nice : D
> >
> > On Fri, Jun 8, 2018, 3:43 PM Raghu Angadi 
> wrote:
> >>
> >> Woo-hoo! This is terrific.
> >>
> >> If we are increasing color choices I would like black or
> charcoal... Beam logo would really pop on a dark background.
> >>
> >> On Fri, Jun 8, 2018 at 3:32 PM Griselda Cuevas 
> wrote:
> >>>
> >>> Hi Beam Community,
> >>>
> >>> I just want to share with you the exciting news about our brand
> new Apache Beam Swag Store!
> >>>
> >>> You can find it here: https://store-beam.myshopify.com/
> >>>
> >>> How does it work?
> >>>
> >>> You can just select the items you want and check out. Our vendor
> ships anywhere in the world and can normally deliver swag
> within 1 week. Each company or user will need to pay for their own swag.
> >>> If you are hosting an event or representing Beam at one, reach out
> to me or the beam-events-meetups Slack channel. I'll be happy to review
> your event and see if we can sponsor the swag. We'll have codes for these
> occasions thanks to Google, who has sponsored an initial inventory.
> >>>
> >>> If you have feedback, ideas on new swag, questions or suggestions,
> reach out to me and/or Matthias Baetens.
> >>>
> >>> Happy Friday!
> >>> G
> >>>
> >>>
> > --
> > Got feedback? go/pabloem-feedback
> 
> >>>
> >>> --
> >>>
>


Re: Beam Dependency Check Report (2018-06-13)

2018-06-13 Thread Chamikara Jayalath
Thanks Yifan.

On Wed, Jun 13, 2018 at 10:21 AM Ahmet Altay  wrote:

> Thanks Yifan, this is great!
>
> My unsolicited feedback:
> - Could it warn about dependencies that have not been updated for a long
> time? For Python there were examples of a dependency being abandoned by its
> own developers, and it took us a while to figure it out and switch to a
> maintained one. (Currently googledatastore is a dependency like that.)
>
> I will second Scott's question: what should I do with this report?
> Is it possible to add a link to quickly create a JIRA issue for a given
> dependency? Or is it possible to link to already open issues for the
> identified dependencies?
>

We ultimately want to create JIRAs automatically and assign them to owners
if defined according to the policy document. We can also add a link to the
JIRA to the report. But given the number of outdated dependencies listed
here, looks like we'll have to bootstrap the process by manually updating
many of these dependencies.

- Cham


> Ahmet
>
> On Wed, Jun 13, 2018 at 10:13 AM, Scott Wegner  wrote:
>
>> Nifty. Here's some unsolicited feedback:
>>
>> * The report gives a nice view of the data and leaves it as an exercise
>> to the reader to do the math on each row (v0.25.0 to v1.3.0 = 1 major
>> version behind, 2017-06-26 to 2018-06-08 = 1 year behind). I would find the
>> report more digestible if these details were already included.
>> * The next question after reading this report is "What should I do with
>> this?" I recommend embedding details or links to answer that. For example:
>>
>>   "High Priority Dependency Updates are defined as XYZ. In Beam, we make
>> a best-effort attempt at keeping all dependencies up-to-date according to
>> ABC. In the future, issues will be filed and tracked for these
>> automatically, but in the meantime you can search for existing issues or
>> open a new one . Read more about our dependency update policy ."
>>
>> On Wed, Jun 13, 2018 at 9:02 AM Pablo Estrada  wrote:
>>
>>> Ahh very nice... thanks Yifan & Cham!
>>> Lots of old dependencies eh... very interesting.
>>> Best
>>> -P.
>>>
>>> On Wed, Jun 13, 2018 at 7:45 AM Yifan Zou  wrote:
>>>
 Hi,


 I want to follow up and explain this email.


 This is a sample email that reports the results of the Beam SDK dependency
 check, which was proposed here
 .
 The goal is to find updates for all Beam Python & Java SDK dependencies
 and prioritize them. The job will be triggered automatically in Jenkins once
 a week and generate a report. The report lists the high-priority updates
 based on the following criteria:


 The dependency update is high priority if:

 1. It has a major version update available;

   e.g. org.assertj:assertj-core 2.5.0 -> 3.10.0

  2. or, it is over 3 minor versions behind the latest version;

   e.g. org.tukaani:xz 1.5 -> 1.8

 3. or, the current version is behind the later version for over 180
 days.

   e.g. com.google.auto.service:auto-service 2014-10-24 ->
 2017-12-11


 This job helps Beam contributors determine which dependencies are
 far behind the latest released version. The next step would be to automate
 filing JIRA bugs for dependency updates, group dependencies, and identify
 owners to take care of the upgrades, following Chamikara's proposal
 
 .


 For further reading:

 [Proposal] Beam dependency check automation
 
  by Yifan Zou

 [Proposal] Beam dependency update policy
 
  by *Chamikara Jayalath*

 Thank you.

 Yifan Zou

 On Wed, Jun 13, 2018 at 7:41 AM Apache Jenkins Server <
 jenk...@builds.apache.org> wrote:

> High Priority Dependency Updates Of Beam Python SDK:
> *Dependency Name* | *Current Version* | *Later Version* | *Current Version Release Date* | *Later Version Release Date*
> google-cloud-bigquery | 0.25.0 | 1.3.0 | 2017-06-26 | 2018-06-08
> httplib2 | 0.9.2 | 0.11.3 | 2015-09-28 | 2018-03-30
>
> High Priority Dependency Updates Of Beam Java SDK:
> *Dependency Name* | *Current Version* | *Later Version* | *Current Version Release Date* | *Later Version Release Date*
> org.assertj:assertj-core | 2.5.0 | 3.10.0 | 2016-07-03 | 2018-05-11
> com.google.auto.service:auto-service | 1.0-rc2 | 1.0-rc4 | 2014-10-24 | 2017-12-11
> biz.aQute:bndlib | 1.43.0 | 2.0.0.20130123-133441 | 2011-04-01 | 2013-02-27
> org.apache.cassandra:cassandra-all | 3.9 | 3.11.2 | 2016-09-26 | 2018-02-14
> commons-cli:commons-cli | 1.2 | 1.4 | 2009-03-19

Re: Building and visualizing the Beam SQL graph

2018-06-13 Thread Andrew Pilloud
One of my goals is to make the graph easier to read and map back to the SQL
EXPLAIN output. The way the graph is currently built (`toPTransform` vs
`toPCollection`) does make a big difference in that graph. I think it is
also important to have a common function to do the apply with consistent
naming. I think that will greatly help with ease of understanding. It
sounds like what we really want is this in the BeamRelNode interface:

PInput buildPInput(Pipeline pipeline);
PTransform> buildPTransform();

default PCollection toPCollection(Pipeline pipeline) {
return buildPInput(pipeline).apply(getStageName(), buildPTransform());
}

Andrew
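
The shape of that proposal can be sketched without any Beam dependency. The
following is a minimal illustration only: `Node`, `buildInput`,
`buildTransform` and `toCollection` are hypothetical stand-ins for
BeamRelNode, buildPInput, buildPTransform and toPCollection, and plain lists
stand in for PCollections. It shows how one shared default method applies
each node as a single named stage on inputs that were built first as
siblings, rather than nesting the inputs inside the node.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

public class RelNodeSketch {

  /** Hypothetical stand-in for a rel node; this is not the real Beam interface. */
  interface Node {
    String stageName();

    List<Node> inputs();

    // Analogous to buildPTransform(): how this node computes its output from its inputs.
    Function<List<List<String>>, List<String>> buildTransform();

    // Analogous to buildPInput(pipeline): build every input first, as separate stages.
    default List<List<String>> buildInput(List<String> stages) {
      List<List<String>> built = new ArrayList<>();
      for (Node input : inputs()) {
        built.add(input.toCollection(stages));
      }
      return built;
    }

    // Analogous to toPCollection(pipeline): the one common place that applies
    // the transform, always under the node's stage name.
    default List<String> toCollection(List<String> stages) {
      List<List<String>> input = buildInput(stages);
      stages.add(stageName());
      return buildTransform().apply(input);
    }
  }

  static Node node(
      String name, List<Node> inputs, Function<List<List<String>>, List<String>> transform) {
    return new Node() {
      @Override public String stageName() { return name; }
      @Override public List<Node> inputs() { return inputs; }
      @Override public Function<List<List<String>>, List<String>> buildTransform() {
        return transform;
      }
    };
  }

  /** Builds a two-node tree (scan, then filter) and records the stage names in order. */
  static List<String> demo(List<String> stages) {
    Node scan = node("Scan_1", Collections.emptyList(), in -> Arrays.asList("a", "bb", "ccc"));
    Node filter =
        node(
            "Filter_2",
            Collections.singletonList(scan),
            in -> in.get(0).stream().filter(s -> s.length() > 1).collect(Collectors.toList()));
    return filter.toCollection(stages);
  }

  public static void main(String[] args) {
    List<String> stages = new ArrayList<>();
    System.out.println(demo(stages) + " via " + stages);
  }
}
```

Because every node goes through the same default method, the recorded stage
list is flat and in dependency order, which is the property that should make
the graph easier to map back to EXPLAIN output.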

On Mon, Jun 11, 2018 at 2:27 PM Mingmin Xu  wrote:

> EXPLAIN shows the execution plan from the SQL perspective only. After converting
> to a Beam composite PTransform, there are more steps underneath, and each Runner
> reorganizes the Beam PTransforms again, which makes the final pipeline hard to read.
> In the SQL module itself, I don't see any difference between `toPTransform` and
> `toPCollection`. We could have an easy-to-understand step name when
> converting RelNodes, but it is the Runners that show the graph to developers.
>
> Mingmin
>
> On Mon, Jun 11, 2018 at 2:06 PM, Andrew Pilloud 
> wrote:
>
>> That sounds correct. And because each rel node might have a different
>> input there isn't a standard interface (like PTransform,
>> PCollection> toPTransform());
>>
>> Andrew
>>
>> On Mon, Jun 11, 2018 at 1:31 PM Kenneth Knowles  wrote:
>>
>>> Agree with that. It will be kind of tricky to generalize. I think there
>>> are some criteria in this case that might apply in other cases:
>>>
>>> 1. Each rel node (or construct of a DSL) should have a PTransform for
>>> how it computes its result from its inputs.
>>> 2. The inputs to that PTransform should actually be the inputs to the
>>> rel node!
>>>
>>> So I tried to improve #1 but I probably made #2 worse.
>>>
>>> Kenn
>>>
>>> On Mon, Jun 11, 2018 at 12:53 PM Anton Kedin  wrote:
>>>
 Not answering the original question, but doesn't "explain" satisfy the
 SQL use case?

 Going forward we probably want to solve this in a more general way. We
 have at least 3 ways to represent the pipeline:
  - how runner executes it;
  - what it looks like when constructed;
  - what the user was describing in DSL;
 And there will probably be more, if extra layers are built on top of
 DSLs.

 If possible, we probably should be able to map any level of abstraction
 to any other to better understand and debug the pipelines.


 On Mon, Jun 11, 2018 at 12:17 PM Kenneth Knowles 
 wrote:

> In other words, revert https://github.com/apache/beam/pull/4705/files,
> at least in spirit? I agree :-)
>
> Kenn
>
> On Mon, Jun 11, 2018 at 11:39 AM Andrew Pilloud 
> wrote:
>
>> We are currently converting the Calcite Rel tree to Beam by
>> recursively building a tree of nested PTransforms. This results in a
>> weird nested graph in the Dataflow UI where each node contains its inputs
>> nested inside of it. I'm going to change the internal data structure for
>> converting the tree from a PTransform to a PCollection, which will result
>> in a more accurate representation of the tree structure being built and
>> should simplify the code as well. This will not change the public
>> interface to SQL, which will remain a PTransform. Any thoughts or objections?
>>
>> I was also wondering if there are tools for visualizing the Beam
>> graph aside from the dataflow runner UI. What other tools exist?
>>
>> Andrew
>>
>
>
>
> --
> 
> Mingmin
>


Re: Beam Dependency Check Report (2018-06-13)

2018-06-13 Thread Ahmet Altay
Thanks Yifan, this is great!

My unsolicited feedback:
- Could it warn about dependencies that have not been updated for a long
time? For Python there were examples of a dependency being abandoned by its
own developers, and it took us a while to figure it out and switch to a
maintained one. (Currently googledatastore is a dependency like that.)

I will second Scott's question: what should I do with this report?
Is it possible to add a link to quickly create a JIRA issue for a given
dependency? Or is it possible to link to already open issues for the
identified dependencies?

Ahmet

On Wed, Jun 13, 2018 at 10:13 AM, Scott Wegner  wrote:

> Nifty. Here's some unsolicited feedback:
>
> * The report gives a nice view of the data and leaves it as an exercise to
> the reader to do the math on each row (v0.25.0 to v1.3.0 = 1 major version
> behind, 2017-06-26 to 2018-06-08 = 1 year behind). I would find the report
> more digestible if these details were already included.
> * The next question after reading this report is "What should I do with
> this?" I recommend embedding details or links to answer that. For example:
>
>   "High Priority Dependency Updates are defined as XYZ. In Beam, we make a
> best-effort attempt at keeping all dependencies up-to-date according to
> ABC. In the future, issues will be filed and tracked for these
> automatically, but in the meantime you can search for existing issues or
> open a new one . Read more about our dependency update policy ."
>
> On Wed, Jun 13, 2018 at 9:02 AM Pablo Estrada  wrote:
>
>> Ahh very nice... thanks Yifan & Cham!
>> Lots of old dependencies eh... very interesting.
>> Best
>> -P.
>>
>> On Wed, Jun 13, 2018 at 7:45 AM Yifan Zou  wrote:
>>
>>> Hi,
>>>
>>>
>>> I want to follow up and explain this email.
>>>
>>>
>>> This is a sample email that reports the results of the Beam SDK dependency
>>> check, which was proposed here
>>> .
>>> The goal is to find updates for all Beam Python & Java SDK dependencies
>>> and prioritize them. The job will be triggered automatically in Jenkins once
>>> a week and generate a report. The report lists the high-priority updates
>>> based on the following criteria:
>>>
>>>
>>> The dependency update is high priority if:
>>>
>>> 1. It has a major version update available;
>>>
>>>   e.g. org.assertj:assertj-core 2.5.0 -> 3.10.0
>>>
>>>  2. or, it is over 3 minor versions behind the latest version;
>>>
>>>   e.g. org.tukaani:xz 1.5 -> 1.8
>>>
>>> 3. or, the current version is behind the later version for over 180 days.
>>>
>>>
>>>   e.g. com.google.auto.service:auto-service 2014-10-24 -> 2017-12-11
>>>
>>>
>>> This job helps Beam contributors determine which dependencies are
>>> far behind the latest released version. The next step would be to automate
>>> filing JIRA bugs for dependency updates, group dependencies, and identify
>>> owners to take care of the upgrades, following Chamikara's proposal
>>> 
>>> .
>>>
>>>
>>> For further reading:
>>>
>>> [Proposal] Beam dependency check automation
>>> 
>>>  by Yifan Zou
>>>
>>> [Proposal] Beam dependency update policy
>>> 
>>>  by *Chamikara Jayalath*
>>>
>>> Thank you.
>>>
>>> Yifan Zou
>>>
>>> On Wed, Jun 13, 2018 at 7:41 AM Apache Jenkins Server <
>>> jenk...@builds.apache.org> wrote:
>>>
 High Priority Dependency Updates Of Beam Python SDK:
 *Dependency Name* | *Current Version* | *Later Version* | *Current Version Release Date* | *Later Version Release Date*
 google-cloud-bigquery | 0.25.0 | 1.3.0 | 2017-06-26 | 2018-06-08
 httplib2 | 0.9.2 | 0.11.3 | 2015-09-28 | 2018-03-30

 High Priority Dependency Updates Of Beam Java SDK:
 *Dependency Name* | *Current Version* | *Later Version* | *Current Version Release Date* | *Later Version Release Date*
 org.assertj:assertj-core | 2.5.0 | 3.10.0 | 2016-07-03 | 2018-05-11
 com.google.auto.service:auto-service | 1.0-rc2 | 1.0-rc4 | 2014-10-24 | 2017-12-11
 biz.aQute:bndlib | 1.43.0 | 2.0.0.20130123-133441 | 2011-04-01 | 2013-02-27
 org.apache.cassandra:cassandra-all | 3.9 | 3.11.2 | 2016-09-26 | 2018-02-14
 commons-cli:commons-cli | 1.2 | 1.4 | 2009-03-19 | 2017-03-09
 commons-codec:commons-codec | 1.9 | 1.11 | 2013-12-20 | 2017-10-17
 org.apache.commons:commons-dbcp2 | 2.1.1 | 2.3.0 | 2015-08-02 | 2018-05-08
 com.typesafe:config | 1.3.0 | 1.3.3 | 2015-05-08 | 2018-02-21
 de.flapdoodle.embed:de.flapdoodle.embed.mongo | 1.50.1 | 2.0.3 | 2015-12-11 | 2018-02-14
 de.flapdoodle.embed:de.flapdoodle.embed.process | 1.50.1 | 2.0.3 | 2015-12-11 | 2018-02-14
 org.apache.derby:derby | 10.12.1.1 | 10.14.2.0 | 2015-10-10 | 2018-05-03
 org.apache.derby:derbyclient | 10.12.1.1 | 10.14.2.0 | 2015-10-10

Re: Beam Dependency Check Report (2018-06-13)

2018-06-13 Thread Scott Wegner
Nifty. Here's some unsolicited feedback:

* The report gives a nice view of the data and leaves it as an exercise to
the reader to do the math on each row (v0.25.0 to v1.3.0 = 1 major version
behind, 2017-06-26 to 2018-06-08 = 1 year behind). I would find the report
more digestible if these details were already included.
* The next question after reading this report is "What should I do with
this?" I recommend embedding details or links to answer that. For example:

  "High Priority Dependency Updates are defined as XYZ. In Beam, we make a
best-effort attempt at keeping all dependencies up-to-date according to
ABC. In the future, issues will be filed and tracked for these
automatically, but in the meantime you can search for existing issues or
open a new one . Read more about our dependency update policy ."
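
As a rough illustration of the per-row math suggested above, here is a
minimal Java sketch that derives "versions behind" and "days behind" from a
report row and applies the three high-priority criteria quoted below. It is
not the actual dependency-check job: the class and method names are
hypothetical, the version parsing is deliberately lenient (it reads only the
leading digits of the first two dot-separated components), and treating
"over 3 minor versions" as 3 or more is an assumption based on the
org.tukaani:xz 1.5 -> 1.8 example.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;

public class DependencyGapSketch {

  /** Leading digits of the index-th dot-separated component, e.g. ("1.0-rc2", 1) gives 0. */
  static int component(String version, int index) {
    String[] parts = version.split("\\.");
    if (index >= parts.length) {
      return 0;
    }
    String digits = parts[index].replaceFirst("^(\\d*).*$", "$1");
    return digits.isEmpty() ? 0 : Integer.parseInt(digits);
  }

  static int majorsBehind(String current, String latest) {
    return component(latest, 0) - component(current, 0);
  }

  /** Minor gap, only meaningful when both versions share the same major version. */
  static int minorsBehind(String current, String latest) {
    if (component(latest, 0) != component(current, 0)) {
      return 0;
    }
    return component(latest, 1) - component(current, 1);
  }

  /** Days between the two release dates, given as ISO dates such as 2017-06-26. */
  static long daysBehind(String currentDate, String latestDate) {
    return ChronoUnit.DAYS.between(LocalDate.parse(currentDate), LocalDate.parse(latestDate));
  }

  /** Assumed reading of the three criteria: major update, 3+ minors behind, or over 180 days. */
  static boolean isHighPriority(
      String current, String latest, String currentDate, String latestDate) {
    return majorsBehind(current, latest) >= 1
        || minorsBehind(current, latest) >= 3
        || daysBehind(currentDate, latestDate) > 180;
  }

  static String describe(
      String name, String current, String latest, String currentDate, String latestDate) {
    return name
        + ": "
        + majorsBehind(current, latest)
        + " major, "
        + minorsBehind(current, latest)
        + " minor, "
        + daysBehind(currentDate, latestDate)
        + " days behind";
  }

  public static void main(String[] args) {
    System.out.println(
        describe("google-cloud-bigquery", "0.25.0", "1.3.0", "2017-06-26", "2018-06-08"));
  }
}
```

Adding columns like these to the report would remove the need for readers to
compare versions and dates by eye.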

On Wed, Jun 13, 2018 at 9:02 AM Pablo Estrada  wrote:

> Ahh very nice... thanks Yifan & Cham!
> Lots of old dependencies eh... very interesting.
> Best
> -P.
>
> On Wed, Jun 13, 2018 at 7:45 AM Yifan Zou  wrote:
>
>> Hi,
>>
>>
>> I want to follow up and explain this email.
>>
>>
>> This is a sample email that reports the results of the Beam SDK dependency
>> check, which was proposed here
>> .
>> The goal is to find updates for all Beam Python & Java SDK dependencies
>> and prioritize them. The job will be triggered automatically in Jenkins once
>> a week and generate a report. The report lists the high-priority updates
>> based on the following criteria:
>>
>>
>> The dependency update is high priority if:
>>
>> 1. It has a major version update available;
>>
>>   e.g. org.assertj:assertj-core 2.5.0 -> 3.10.0
>>
>>  2. or, it is over 3 minor versions behind the latest version;
>>
>>   e.g. org.tukaani:xz 1.5 -> 1.8
>>
>> 3. or, the current version is behind the later version for over 180 days.
>>
>>
>>   e.g. com.google.auto.service:auto-service 2014-10-24 -> 2017-12-11
>>
>>
>> This job helps Beam contributors determine which dependencies are far
>> behind the latest released version. The next step would be to automate
>> filing JIRA bugs for dependency updates, group dependencies, and identify
>> owners to take care of the upgrades, following Chamikara's proposal
>> 
>> .
>>
>>
>> For further reading:
>>
>> [Proposal] Beam dependency check automation
>> 
>>  by Yifan Zou
>>
>> [Proposal] Beam dependency update policy
>> 
>>  by *Chamikara Jayalath*
>>
>> Thank you.
>>
>> Yifan Zou
>>
>> On Wed, Jun 13, 2018 at 7:41 AM Apache Jenkins Server <
>> jenk...@builds.apache.org> wrote:
>>
>>> High Priority Dependency Updates Of Beam Python SDK:
>>> *Dependency Name* | *Current Version* | *Later Version* | *Current Version Release Date* | *Later Version Release Date*
>>> google-cloud-bigquery | 0.25.0 | 1.3.0 | 2017-06-26 | 2018-06-08
>>> httplib2 | 0.9.2 | 0.11.3 | 2015-09-28 | 2018-03-30
>>>
>>> High Priority Dependency Updates Of Beam Java SDK:
>>> *Dependency Name* | *Current Version* | *Later Version* | *Current Version Release Date* | *Later Version Release Date*
>>> org.assertj:assertj-core | 2.5.0 | 3.10.0 | 2016-07-03 | 2018-05-11
>>> com.google.auto.service:auto-service | 1.0-rc2 | 1.0-rc4 | 2014-10-24 | 2017-12-11
>>> biz.aQute:bndlib | 1.43.0 | 2.0.0.20130123-133441 | 2011-04-01 | 2013-02-27
>>> org.apache.cassandra:cassandra-all | 3.9 | 3.11.2 | 2016-09-26 | 2018-02-14
>>> commons-cli:commons-cli | 1.2 | 1.4 | 2009-03-19 | 2017-03-09
>>> commons-codec:commons-codec | 1.9 | 1.11 | 2013-12-20 | 2017-10-17
>>> org.apache.commons:commons-dbcp2 | 2.1.1 | 2.3.0 | 2015-08-02 | 2018-05-08
>>> com.typesafe:config | 1.3.0 | 1.3.3 | 2015-05-08 | 2018-02-21
>>> de.flapdoodle.embed:de.flapdoodle.embed.mongo | 1.50.1 | 2.0.3 | 2015-12-11 | 2018-02-14
>>> de.flapdoodle.embed:de.flapdoodle.embed.process | 1.50.1 | 2.0.3 | 2015-12-11 | 2018-02-14
>>> org.apache.derby:derby | 10.12.1.1 | 10.14.2.0 | 2015-10-10 | 2018-05-03
>>> org.apache.derby:derbyclient | 10.12.1.1 | 10.14.2.0 | 2015-10-10 | 2018-05-03
>>> org.apache.derby:derbynet | 10.12.1.1 | 10.14.2.0 | 2015-10-10 | 2018-05-03
>>> org.elasticsearch:elasticsearch | 5.6.3 | 6.2.4 | 2017-10-06 | 2018-04-12
>>> org.elasticsearch:elasticsearch-hadoop | 5.0.0 | 6.2.4 | 2016-10-26 | 2018-04-12
>>> org.elasticsearch.client:elasticsearch-rest-client | 5.6.3 | 6.2.4 | 2017-10-06 | 2018-04-12
>>> com.alibaba:fastjson | 1.2.12 | 1.2.47 | 2016-05-21 | 2018-03-15
>>> org.elasticsearch.test:framework | 5.6.3 | 6.2.4 | 2017-10-06 | 2018-04-12
>>> org.freemarker:freemarker | 2.3.25-incubating | 2.3.28 | 2016-06-14 | 2018-03-30
>>> org.codehaus.groovy:groovy-all | 2.4.13 | 3.0.0-alpha-2 | 2017-11-22 | 2018-04-16
>>> org.apache.hbase:hbase-common | 1.2.6 | 2.0.0.3.0.0.3-2 | 2017-05-29 | 2018-05-31
>>> org.apache.hbase:hbase-hadoop-compat | 1.2.6 | 2.0.0.3.0.0.3-2 | 2017-05-29
>>> 

[Call for Speakers] Deep Learning in Production Meetup, Boston Area on June 26th

2018-06-13 Thread Griselda Cuevas
Hi Beam Community,


Eila Arich-Landkof (from OrielResearch) and I are co-hosting the next
edition of the Deep Learning in Production Meetup
 on June 26th at the
Google Office in Cambridge, Massachusetts.


*We are looking for speakers who would like to share their Beam use case.*


We are expecting an audience of 150 people, so this is a great visibility
opportunity and also a fantastic way to network and get to meet people from
the Beam community.


If you are in the area, come join us!


Cheers,

Eila & Gris


Re: Beam SQL Pipeline Options

2018-06-13 Thread Andrew Pilloud
I've turned this into a PR, more discussion going on over there:
https://github.com/apache/beam/pull/5592

Andrew
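
For readers who have not opened the proposal, one possible way to turn SET
statements into pipeline options can be sketched as follows. This is an
assumption-laden illustration, not the design in the PR: the class, the
regular expression and the accepted grammar are all hypothetical. It only
relies on the fact that Beam pipeline options are conventionally passed as
"--name=value" arguments (for example to PipelineOptionsFactory.fromArgs).

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SetStatementSketch {

  // Hypothetical grammar: SET <name> = <value> with an optional trailing semicolon.
  private static final Pattern SET =
      Pattern.compile("(?i)^\\s*SET\\s+(\\w+)\\s*=\\s*(.+?)\\s*;?\\s*$");

  /** Records one SET statement in the options map; a later SET overrides an earlier one. */
  static void applySet(String statement, Map<String, String> options) {
    Matcher matcher = SET.matcher(statement);
    if (!matcher.matches()) {
      throw new IllegalArgumentException("Not a SET statement: " + statement);
    }
    options.put(matcher.group(1), matcher.group(2));
  }

  /** Converts the collected options into "--name=value" style arguments. */
  static String[] toArgs(Map<String, String> options) {
    return options.entrySet().stream()
        .map(entry -> "--" + entry.getKey() + "=" + entry.getValue())
        .toArray(String[]::new);
  }

  public static void main(String[] args) {
    Map<String, String> options = new LinkedHashMap<>();
    applySet("SET runner=DataflowRunner", options);
    for (String arg : toArgs(options)) {
      System.out.println(arg);
    }
  }
}
```

Keeping the options in an ordered map until the pipeline is actually built
would let a shell user issue several SET statements before running a query.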

On Wed, Jun 6, 2018 at 9:46 PM Kenneth Knowles  wrote:

> This is a nice short design discussion doc, and perhaps a cooler piece of
> news hidden in the paragraph :-)
>
> Kenn
>
> On Wed, Jun 6, 2018 at 9:24 AM Andrew Pilloud  wrote:
>
>> We are just about to the point of having a working pure SQL workflow for
>> Beam! One of the last things that remains is how to configure Pipeline
>> Options via a SQL shell. I have written up a proposal to use the set
>> statement, for example "SET runner=DataflowRunner". I'm looking for
>> feedback, particularly on what will make for the best user experience.
>> Please take a look and comment:
>>
>>
>> https://docs.google.com/document/d/1UTsSBuruJRfGnVOS9eXbQI6NauCD4WnSAPgA_Y0zjdk/edit?usp=sharing
>>
>> Andrew
>>
>


Re: [CANCEL][VOTE] Apache Beam, version 2.5.0, release candidate #1

2018-06-13 Thread Pablo Estrada
Precommits for PR https://github.com/apache/beam/pull/5609 are now passing.
For now I've simply set failOnWarning to false so the change can be
cherry-picked into the release; we can fix it properly in master later on.
Best
-P.

On Wed, Jun 13, 2018 at 9:08 AM Scott Wegner  wrote:

> From my understanding, the @SuppressFBWarnings usage is in a dependency
> (ByteBuddy) rather than directly in our code; so we're not able to modify
> the usage.
>
> Pablo, feel free to disable failOnWarning for the sdks-java-core project
> temporarily. This isn't a major regression since we've only recently made
> the change to enable it [1]. We can work separately on figuring out how to
> resolve the warnings.
>
> [1] https://github.com/apache/beam/pull/5319
>
> On Tue, Jun 12, 2018 at 11:57 PM Tim Robertson 
> wrote:
>
>> Hi Pablo,
>>
>> I'm afraid I couldn't find one either... there is an issue about it [1]
>> which is old so it doesn't look likely to be resolved either.
>>
>> If you have time (sorry I am a bit busy) could you please verify the
>> version does work if you install that version locally? I know the maven
>> version of that [2] but not sure on the gradle equivalent. If we know it
>> works, we can then find a repository that fits ok with Apache/Beam policy.
>>
>> Alternatively, we could consider using a fully qualified reference (i.e.
>> @edu.umd.cs.findbugs.annotations.SuppressWarnings) to the deprecated
>> version and leave the dependency at the 1.3.9-1. I believe our general
>> direction is to remove findbugs when errorprone covers all aspects so I
>> *expect* this should be considered reasonable.
>>
>> I hope this helps,
>> Tim
>>
>> [1] https://github.com/stephenc/findbugs-annotations/issues/4
>> [2] https://maven.apache.org/guides/mini/guide-3rd-party-jars-local.html
>>
>> On Wed, Jun 13, 2018 at 8:39 AM, Pablo Estrada 
>> wrote:
>>
>>> Hi Tim,
>>> you're right. Thanks for pointing that out. There's just one problem
>>> that I'm running into now: The 3.0.1-1 version does not seem to be
>>> available in Maven Central[1]. Looking at the website, I am not quite sure
>>> if there's another repository where they do stage the newer versions?[2]
>>>
>>> -P
>>>
>>> [1]
>>> https://repo.maven.apache.org/maven2/com/github/stephenc/findbugs/findbugs-annotations
>>> /
>>> [2] http://stephenc.github.io/findbugs-annotations/
>>>
>>> On Tue, Jun 12, 2018 at 11:10 PM Tim Robertson <
>>> timrobertson...@gmail.com> wrote:
>>>
 Hi Pablo,

 I took only a quick look.

 "- The JAR from the non-LGPL findbugs does not contain the
 SuppressFBWarnings annotation"

 Unless I misunderstand you, it looks like SuppressFBWarnings was added
 in Stephen's version in this commit [1], which was introduced in
 version 2.0.3-1. I've checked that it is in the 3.0.1-1 build [2].
 I notice in your commits [3] you've been exploring version 3.0.0
 already though... what happens when you use 3.0.1-1? It sounds like the
 wrong version is coming in rather than the annotation being missing.

 Thanks,
 Tim

 [1]
 https://github.com/stephenc/findbugs-annotations/commits/master/src/main/java/edu/umd/cs/findbugs/annotations/SuppressWarnings.java
 [2] https://github.com/stephenc/findbugs-annotations/releases
 [3]
 https://github.com/apache/beam/pull/5609/commits/32c7df706e970557f154ff6bc521b2e00f9d09ab







 On Wed, Jun 13, 2018 at 2:37 AM, Pablo Estrada 
 wrote:

> Hi all,
> I'll humbly declare that after wrestling with the build to stop
> depending on the wrong findbugs_annotations, I feel somewhat lost. The
> issue is actually quite small:
>
> - The JAR from the non-LGPL findbugs does not contain the
> SuppressFBWarnings annotation. This means that when building, ByteBuddy
> produces a few warnings (nothing critical).
> - The easiest way to avoid this failure is to call
> applyJavaNature(failOnWarning: false), but this would be bad, since we
> want to keep a high standard for tasks like ErrorProne and FindBugs itself.
> - So I find myself lost: How do we suppress trivial warnings coming
> from missing annotations, and honor warnings coming from other plugins?
>
> Any help / a PR from someone more capable would be appreciated.
> Best
> -P.
>
> On Tue, Jun 12, 2018 at 3:02 PM Ismaël Mejía 
> wrote:
>
>> Yes, ok I was not aware it was already being addressed, nice.
>> On Tue, Jun 12, 2018 at 11:56 PM Ahmet Altay 
>> wrote:
>> >
>> > Ismaël,
>> >
>> > I believe Pablo's https://github.com/apache/beam/pull/5609 is
>> fixing the issue by changing the findbugs back to
>> "com.github.stephenc.findbugs". Is this what you are referring to?
>> >
>> > Ahmet
>> >
>> > On Tue, Jun 12, 2018 at 2:51 PM, Boyuan Zhang 
>> wrote:
>> >>
>> >> Hey JB,
>> >>
>> >> I added some instructions about how to create python wheels in
>> 

Re: Beam breaks when it isn't loaded via the Thread Context Class Loader

2018-06-13 Thread Lukasz Cwik
Thanks for the example Romain.

I took a look through it and was wondering whether it is only the root
objects in the deserialization tree that need to implement
SerializableService?
Do lots of things need to implement SerializableService typically?
What do you do with types that you don't control (for example do you create
wrapper types)?

On Wed, Jun 6, 2018 at 9:56 PM Romain Manni-Bucau 
wrote:

> Not sure the example is atomic enough but in
> https://github.com/Talend/component-runtime/blob/master/component-runtime-manager/src/main/java/org/talend/sdk/component/runtime/manager/finder/StandaloneContainerFinder.java#L40
> the "instance()" is a singleton used by all the runtime of the framework.
>
> Deserialization happens in
> https://github.com/Talend/component-runtime/blob/master/component-runtime-impl/src/main/java/org/talend/sdk/component/runtime/serialization/SerializableService.java#L26
> and serialization is about creating this object in a write replace. Then
> the runtime is switching its classloader (runner for beam) as in
> https://github.com/Talend/component-runtime/blob/master/component-runtime-impl/src/main/java/org/talend/sdk/component/runtime/base/LifecycleImpl.java#L60
> asap and resets it once done to not break its environment for reused jvms
> case.
>
> If we take the case of an IO, the IO would lazily create its defined
> classloader from its spec and use some reference counting logic to destroy
> it when needed, in its teardown for instance. The IO then does the
> classloader switch in its callbacks (setup/teardown/process/bundle hooks
> etc).
>
>
> Le mer. 6 juin 2018 23:33, Lukasz Cwik  a écrit :
>
>> Romain, can you point to an example of a global singleton registry that
>> does this right for class loading (it may allow people to work towards such
>> an effort)?
>>
>> On Tue, Jun 5, 2018 at 10:06 PM Romain Manni-Bucau 
>> wrote:
>>
>>> It is actually very localised in runner code where beam should reset the
>>> classloader when the deserialization happens and then the runner owns the
>>> classloader all the way in evaluators.
>>>
>>> If IOs change the classloader they must indeed handle it too and patch
>>> the deserialization too.
>>>
>>> Here again (we mentioned it multiple times in other threads) Beam
>>> misses a global singleton registry where you can register a "service" to
>>> look it up based on a serialization configuration and a lifecycle allowing
>>> to close the classloader in all instances without hacks.
>>>
>>>
>>> Le mar. 5 juin 2018 23:50, Kenneth Knowles  a écrit :
>>>
 Perhaps we can also adopt a practice of making our own APIs explicitly
 pass a Classloader when appropriate so we only have to set this when we are
 entering code that does not have good hygiene. It might actually be nice to
 have a lightweight static analysis to forbid bad methods in our code.

 Kenn

 On Mon, Jun 4, 2018 at 3:43 PM Lukasz Cwik  wrote:

> I totally agree, but there are so many Java APIs (including ours) that
> messed this up so everyone lives with the same hack.
>
> On Mon, Jun 4, 2018 at 3:41 PM Andrew Pilloud 
> wrote:
>
>> It seems like a terribly fragile way to pass arguments but my tests
>> pass when I wrap the JDBC path into Beam pipeline execution with that
>> pattern.
>>
>> Thanks!
>>
>> Andrew
>>
>> On Mon, Jun 4, 2018 at 3:20 PM Lukasz Cwik  wrote:
>>
>>> It is a common mistake for APIs to not include a way to specify
>>> which class loader to use when doing something like deserializing an
>>> instance of a class via the ObjectInputStream. This common issue also
>>> affects Apache Beam (SerializableCoder, PipelineOptionsFactory, ...) and
>>> the way that typical Java APIs have gotten around this is to use the
>>> thread context class loader (TCCL) as the way to plumb this additional
>>> attribute through. So Apache Beam is meant to honor the TCCL in all
>>> places if it has been set, as most Java libraries (not all) do the same
>>> hack.
>>>
>>> In most environments the TCCL is not set and we are working with a
>>> single class loader. It turns out that in more complicated environments
>>> (like when loading a JDBC driver, or JNDI, or an application server, ...)
>>> this usually doesn't work without each caller knowing what class loading
>>> context they should be in. A common workaround for most scenarios is to
>>> always set the TCCL to the current class's class loader, like so, before
>>> invoking any APIs that do class loading, so you don't propagate the TCCL
>>> of the caller along, since they may have set it for some other reason:
>>>
>>> ClassLoader originalClassLoader =
>>>     Thread.currentThread().getContextClassLoader();
>>> try {
>>>   Thread.currentThread().setContextClassLoader(getClass().getClassLoader());
>>>   // call some API that uses 
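
The save, set and restore pattern described above can be wrapped in a small
helper. This is a minimal sketch rather than anything from Beam itself: the
class and method names are hypothetical, and the loader is passed in
explicitly instead of being taken from getClass().getClassLoader(). The only
library calls used are Thread.getContextClassLoader() and
Thread.setContextClassLoader().

```java
import java.util.function.Supplier;

public class TcclSketch {

  /**
   * Runs body with the given loader installed as the thread context class loader,
   * then restores whatever loader was there before, even if body throws.
   */
  static <T> T callWithContextClassLoader(ClassLoader loader, Supplier<T> body) {
    Thread current = Thread.currentThread();
    ClassLoader original = current.getContextClassLoader();
    try {
      current.setContextClassLoader(loader);
      return body.get();
    } finally {
      current.setContextClassLoader(original);
    }
  }

  public static void main(String[] args) {
    ClassLoader own = TcclSketch.class.getClassLoader();
    // Inside the call the TCCL is this class's loader, regardless of what the caller set.
    boolean swapped =
        callWithContextClassLoader(
            own, () -> Thread.currentThread().getContextClassLoader() == own);
    System.out.println("TCCL swapped inside call: " + swapped);
  }
}
```

Restoring in a finally block matters because the thread may be reused by a
caller that set the TCCL for its own reasons.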

Re: [CANCEL][VOTE] Apache Beam, version 2.5.0, release candidate #1

2018-06-13 Thread Scott Wegner
From my understanding, the @SuppressFBWarnings usage is in a dependency
(ByteBuddy) rather than directly in our code; so we're not able to modify
the usage.

Pablo, feel free to disable failOnWarning for the sdks-java-core project
temporarily. This isn't a major regression since we've only recently made
the change to enable it [1]. We can work separately on figuring out how to
resolve the warnings.

[1] https://github.com/apache/beam/pull/5319

On Tue, Jun 12, 2018 at 11:57 PM Tim Robertson 
wrote:

> Hi Pablo,
>
> I'm afraid I couldn't find one either... there is an issue about it [1]
> which is old so it doesn't look likely to be resolved either.
>
> If you have time (sorry I am a bit busy) could you please verify the
> version does work if you install that version locally? I know the maven
> version of that [2] but not sure on the gradle equivalent. If we know it
> works, we can then find a repository that fits ok with Apache/Beam policy.
>
> Alternatively, we could consider using a fully qualified reference (i.e.
> @edu.umd.cs.findbugs.annotations.SuppressWarnings) to the deprecated
> version and leave the dependency at the 1.3.9-1. I believe our general
> direction is to remove findbugs when errorprone covers all aspects so I
> *expect* this should be considered reasonable.
>
> I hope this helps,
> Tim
>
> [1] https://github.com/stephenc/findbugs-annotations/issues/4
> [2] https://maven.apache.org/guides/mini/guide-3rd-party-jars-local.html
>
> On Wed, Jun 13, 2018 at 8:39 AM, Pablo Estrada  wrote:
>
>> Hi Tim,
>> you're right. Thanks for pointing that out. There's just one problem that
>> I'm running into now: The 3.0.1-1 version does not seem to be available in
>> Maven Central[1]. Looking at the website, I am not quite sure if there's
>> another repository where they do stage the newer versions?[2]
>>
>> -P
>>
>> [1]
>> https://repo.maven.apache.org/maven2/com/github/stephenc/findbugs/findbugs-annotations
>> /
>> [2] http://stephenc.github.io/findbugs-annotations/
>>
>> On Tue, Jun 12, 2018 at 11:10 PM Tim Robertson 
>> wrote:
>>
>>> Hi Pablo,
>>>
>>> I took only a quick look.
>>>
>>> "- The JAR from the non-LGPL findbugs does not contain the
>>> SuppressFBWarnings annotation"
>>>
>>> Unless I misunderstand you, it looks like SuppressFBWarnings was added in
>>> Stephen's version in this commit [1], which was introduced in version
>>> 2.0.3-1. I've checked that it is in the 3.0.1-1 build [2].
>>> I notice in your commits [3] you've been exploring version 3.0.0 already
>>> though... what happens when you use 3.0.1-1? It sounds like the wrong
>>> version is coming in rather than the annotation being missing.
>>>
>>> Thanks,
>>> Tim
>>>
>>> [1]
>>> https://github.com/stephenc/findbugs-annotations/commits/master/src/main/java/edu/umd/cs/findbugs/annotations/SuppressWarnings.java
>>> [2] https://github.com/stephenc/findbugs-annotations/releases
>>> [3]
>>> https://github.com/apache/beam/pull/5609/commits/32c7df706e970557f154ff6bc521b2e00f9d09ab
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> On Wed, Jun 13, 2018 at 2:37 AM, Pablo Estrada 
>>> wrote:
>>>
 Hi all,
 I'll humbly declare that after wrestling with the build to stop
 depending on the wrong findbugs_annotations, I feel somewhat lost. The
 issue is actually quite small:

 - The JAR from the non-LGPL findbugs does not contain the
 SuppressFBWarnings annotation. This means that when building, ByteBuddy
 produces a few warnings (nothing critical).
 - The easiest way to avoid this failure is to call
 applyJavaNature(failOnWarning: false), but this would be bad, since we want
 to keep a high standard for tasks like ErrorProne and FindBugs itself.
 - So I find myself lost: How do we suppress trivial warnings coming
 from missing annotations, and honor warnings coming from other plugins?

 Any help / a PR from someone more capable would be appreciated.
 Best
 -P.

 On Tue, Jun 12, 2018 at 3:02 PM Ismaël Mejía  wrote:

> Yes, ok I was not aware it was already being addressed, nice.
> On Tue, Jun 12, 2018 at 11:56 PM Ahmet Altay  wrote:
> >
> > Ismaël,
> >
> > I believe Pablo's https://github.com/apache/beam/pull/5609 is
> fixing the issue by changing the findbugs back to
> "com.github.stephenc.findbugs". Is this what you are referring to?
> >
> > Ahmet
> >
> > On Tue, Jun 12, 2018 at 2:51 PM, Boyuan Zhang 
> wrote:
> >>
> >> Hey JB,
> >>
> >> I added some instructions about how to create python wheels in this
> PR: https://github.com/apache/beam-site/pull/467 . Hope it would be
> helpful.
> >>
> >> Boyuan
> >>
> >
>
 --
 Got feedback? go/pabloem-feedback
 

>>>
>>> --
>> Got feedback? go/pabloem-feedback
>> 
>>
>
>


Re: Beam Dependency Check Report (2018-06-13)

2018-06-13 Thread Pablo Estrada
Ahh very nice... thanks Yifan & Cham!
Lots of old dependencies eh... very interesting.
Best
-P.

On Wed, Jun 13, 2018 at 7:45 AM Yifan Zou  wrote:

> Hi,
>
>
> I want to follow up and explain this email.
>
>
> This is a sample email that reports the results of Beam SDK dependency
> check, which was proposed here
> .
> The goal is to find updates for all Beam Python & Java SDK dependencies
> and prioritize them. The job will be triggered automatically in Jenkins
> once a week and generate a report. The report lists the high-priority
> updates based on the following criteria:
>
>
> The dependency update is high priority if:
>
> 1. It has a major version update available;
>
>   e.g. org.assertj:assertj-core 2.5.0 -> 3.10.0
>
>  2. or, it is over 3 minor versions behind the latest version;
>
>   e.g. org.tukaani:xz 1.5 -> 1.8
>
> 3. or, the current version was released more than 180 days before the later version.
>
>
>   e.g. com.google.auto.service:auto-service 2014-10-24 -> 2017-12-11
>
>
> This job helps Beam contributors determine which dependencies are far
> behind the latest released version. The next steps would be to automate
> filing JIRA bugs for dependency updates, group dependencies, and identify
> owners to take care of the upgrades, following Chamikara's proposal
> 
> .
>
>
> Further reading:
>
> [Proposal] Beam dependency check automation
> 
>  by Yifan Zou
>
> [Proposal] Beam dependency update policy
> 
>  by *Chamikara Jayalath*
>
> Thank you.
>
> Yifan Zou
>
> On Wed, Jun 13, 2018 at 7:41 AM Apache Jenkins Server <
> jenk...@builds.apache.org> wrote:
>
>> High Priority Dependency Updates Of Beam Python SDK:
>> *Dependency Name* *Current Version* *Later Version* *Current Version
>> Release Date* *Later Version Release Date*
>> google-cloud-bigquery 0.25.0 1.3.0 2017-06-26 2018-06-08
>> httplib2 0.9.2 0.11.3 2015-09-28 2018-03-30
>>
>> High Priority Dependency Updates Of Beam Java SDK:
>> *Dependency Name* *Current Version* *Later Version* *Current Version
>> Release Date* *Later Version Release Date*
>> org.assertj:assertj-core 2.5.0 3.10.0 2016-07-03 2018-05-11
>> com.google.auto.service:auto-service 1.0-rc2 1.0-rc4 2014-10-24
>> 2017-12-11
>> biz.aQute:bndlib 1.43.0 2.0.0.20130123-133441 2011-04-01 2013-02-27
>> org.apache.cassandra:cassandra-all 3.9 3.11.2 2016-09-26 2018-02-14
>> commons-cli:commons-cli 1.2 1.4 2009-03-19 2017-03-09
>> commons-codec:commons-codec 1.9 1.11 2013-12-20 2017-10-17
>> org.apache.commons:commons-dbcp2 2.1.1 2.3.0 2015-08-02 2018-05-08
>> com.typesafe:config 1.3.0 1.3.3 2015-05-08 2018-02-21
>> de.flapdoodle.embed:de.flapdoodle.embed.mongo 1.50.1 2.0.3 2015-12-11
>> 2018-02-14
>> de.flapdoodle.embed:de.flapdoodle.embed.process 1.50.1 2.0.3 2015-12-11
>> 2018-02-14
>> org.apache.derby:derby 10.12.1.1 10.14.2.0 2015-10-10 2018-05-03
>> org.apache.derby:derbyclient 10.12.1.1 10.14.2.0 2015-10-10 2018-05-03
>> org.apache.derby:derbynet 10.12.1.1 10.14.2.0 2015-10-10 2018-05-03
>> org.elasticsearch:elasticsearch 5.6.3 6.2.4 2017-10-06 2018-04-12
>> org.elasticsearch:elasticsearch-hadoop 5.0.0 6.2.4 2016-10-26 2018-04-12
>> org.elasticsearch.client:elasticsearch-rest-client 5.6.3 6.2.4 2017-10-06
>> 2018-04-12
>> com.alibaba:fastjson 1.2.12 1.2.47 2016-05-21 2018-03-15
>> org.elasticsearch.test:framework 5.6.3 6.2.4 2017-10-06 2018-04-12
>> org.freemarker:freemarker 2.3.25-incubating 2.3.28 2016-06-14 2018-03-30
>> org.codehaus.groovy:groovy-all 2.4.13 3.0.0-alpha-2 2017-11-22 2018-04-16
>> org.apache.hbase:hbase-common 1.2.6 2.0.0.3.0.0.3-2 2017-05-29 2018-05-31
>> org.apache.hbase:hbase-hadoop-compat 1.2.6 2.0.0.3.0.0.3-2 2017-05-29
>> 2018-05-31
>> org.apache.hbase:hbase-hadoop2-compat 1.2.6 2.0.0.3.0.0.3-2 2017-05-29
>> 2018-05-31
>> org.apache.hbase:hbase-server 1.2.6 2.0.0.3.0.0.3-2 2017-05-29 2018-05-31
>> org.apache.hbase:hbase-shaded-client 1.2.6 2.0.0.3.0.0.3-2 2017-05-29
>> 2018-05-31
>> org.apache.hbase:hbase-shaded-server 1.2.6 2.0.0-alpha2 2017-05-29
>> 2018-05-31
>> org.apache.hive:hive-cli 2.1.0 3.0.0.3.0.0.3-2 2016-06-16 2018-05-21
>> org.apache.hive:hive-common 2.1.0 3.0.0.3.0.0.3-2 2016-06-16 2018-05-21
>> org.apache.hive:hive-exec 2.1.0 3.0.0.3.0.0.3-2 2016-06-16 2018-05-21
>> org.apache.hive.hcatalog:hive-hcatalog-core 2.1.0 3.0.0.3.0.0.3-2
>> 2016-06-16 2018-05-21
>> org.apache.httpcomponents:httpasyncclient 4.1.2 4.1.3 2016-06-18
>> 2017-02-05
>> org.apache.httpcomponents:httpclient 4.5.2 4.5.5 2016-02-21 2018-01-18
>> org.apache.httpcomponents:httpcore 4.4.5 4.4.9 2016-06-08 2018-01-11
>> net.java.dev.javacc:javacc 4.0 7.0.3 2018-06-08 2017-11-06
>> jline:jline 2.14.6 3.0.0.M1 

Re: [FYI] New Apache Beam Swag Store!

2018-06-13 Thread Ismaël Mejía
Great ! Thanks Gris and Matthias for putting this in place.
Hope to get that hoodie soon. As a suggestion, more colors too, and
eventually a t-shirt just with the big B logo.
On Mon, Jun 11, 2018 at 6:50 PM Mikhail Gryzykhin  wrote:
>
> That's nice!
>
> More colors are appreciated :)
>
> --Mikhail
>
>
> On Sun, Jun 10, 2018 at 8:20 PM Kenneth Knowles  wrote:
>>
>> Sweet! Agree with Raghu :-)
>>
>> Kenn
>>
>> On Sun, Jun 10, 2018 at 6:06 AM Matthias Baetens  
>> wrote:
>>>
>>> Great news, big thanks for all the work, Gris! Looking forward to people 
>>> wearing this around the globe ;)
>>>
>>> On Sat, 9 Jun 2018 at 01:28 Ankur Goenka  wrote:

 Awesome!


 On Fri, Jun 8, 2018 at 4:24 PM Pablo Estrada  wrote:
>
> Nice : D
>
> On Fri, Jun 8, 2018, 3:43 PM Raghu Angadi  wrote:
>>
>> Woo-hoo! This is terrific.
>>
>> If we are increasing color choices I would like black or charcoal... 
>> Beam logo would really pop on a dark background.
>>
>> On Fri, Jun 8, 2018 at 3:32 PM Griselda Cuevas  wrote:
>>>
>>> Hi Beam Community,
>>>
>>> I just want to share with you the exciting news about our brand new 
>>> Apache Beam Swag Store!
>>>
>>> You can find it here: https://store-beam.myshopify.com/
>>>
>>> How does it work?
>>>
>>> You can just select the items you want and check out. Our vendor ships 
>>> to anywhere in the world and can normally deliver swag 
>>> within 1 week. Each company or user will need to pay for their own swag.
>>> If you are hosting an event or representing Beam at one, reach out to 
>>> me or the beam-events-meetups Slack channel; I'll be happy to review 
>>> your event and see if we can sponsor the swag. We'll have codes for 
>>> these occasions thanks to Google, who has sponsored an initial inventory.
>>>
>>> If you have feedback, ideas on new swag, questions or suggestions, 
>>> reach out to me and/or Matthias Baetens.
>>>
>>> Happy Friday!
>>> G
>>>
>>>
> --
> Got feedback? go/pabloem-feedback
>>>
>>> --
>>>


Re: Beam Dependency Check Report (2018-06-13)

2018-06-13 Thread Yifan Zou
Hi,


I want to follow up and explain this email.


This is a sample email that reports the results of Beam SDK dependency
check, which was proposed here
.
The goal is to find updates for all Beam Python & Java SDK dependencies
and prioritize them. The job will be triggered automatically in Jenkins
once a week and generate a report. The report lists the high-priority
updates based on the following criteria:


The dependency update is high priority if:

1. It has a major version update available;

  e.g. org.assertj:assertj-core 2.5.0 -> 3.10.0

 2. or, it is over 3 minor versions behind the latest version;

  e.g. org.tukaani:xz 1.5 -> 1.8

3. or, the current version was released more than 180 days before the later version.

  e.g. com.google.auto.service:auto-service 2014-10-24 -> 2017-12-11
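
The three criteria above can be sketched roughly as follows. This is only an illustration, not the code the Jenkins job runs: the class and method names are made up, versions are reduced to their leading numeric major.minor parts, and "over 3 minor versions behind" is read as a gap of at least 3 so that the xz 1.5 -> 1.8 example qualifies.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;

// Hypothetical helper, not part of Beam: applies the three report criteria.
public class DependencyPriority {

    // Extracts the leading "major.minor" numbers of a version string.
    // Missing or non-numeric parts (e.g. "rc2") are treated as 0.
    public static int[] majorMinor(String version) {
        String[] parts = version.split("[.\\-]");
        int[] result = new int[2];
        for (int i = 0; i < 2 && i < parts.length; i++) {
            try {
                result[i] = Integer.parseInt(parts[i]);
            } catch (NumberFormatException e) {
                result[i] = 0;
            }
        }
        return result;
    }

    public static boolean isHighPriority(
            String current, String later, LocalDate currentDate, LocalDate laterDate) {
        int[] cur = majorMinor(current);
        int[] lat = majorMinor(later);
        // Criterion 1: a newer major version exists.
        if (lat[0] > cur[0]) {
            return true;
        }
        // Criterion 2 (assumed threshold): same major, at least 3 minor versions behind.
        if (lat[0] == cur[0] && lat[1] - cur[1] >= 3) {
            return true;
        }
        // Criterion 3: the two releases are more than 180 days apart.
        return ChronoUnit.DAYS.between(currentDate, laterDate) > 180;
    }

    public static void main(String[] args) {
        // The auto-service example: same major.minor, but releases years apart.
        System.out.println(isHighPriority(
                "1.0-rc2", "1.0-rc4",
                LocalDate.of(2014, 10, 24), LocalDate.of(2017, 12, 11)));
    }
}
```

Pre-release suffixes such as -alpha or -rc are ignored in this sketch, so a real check would probably need a proper version comparator.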


This job helps Beam contributors determine which dependencies are far
behind the latest released version. The next steps would be to automate
filing JIRA bugs for dependency updates, group dependencies, and identify
owners to take care of the upgrades, following Chamikara's proposal

.


Further reading:

[Proposal] Beam dependency check automation

 by Yifan Zou

[Proposal] Beam dependency update policy

 by *Chamikara Jayalath*

Thank you.

Yifan Zou

On Wed, Jun 13, 2018 at 7:41 AM Apache Jenkins Server <
jenk...@builds.apache.org> wrote:

> High Priority Dependency Updates Of Beam Python SDK:
> *Dependency Name* *Current Version* *Later Version* *Current Version
> Release Date* *Later Version Release Date*
> google-cloud-bigquery 0.25.0 1.3.0 2017-06-26 2018-06-08
> httplib2 0.9.2 0.11.3 2015-09-28 2018-03-30
>
> High Priority Dependency Updates Of Beam Java SDK:
> *Dependency Name* *Current Version* *Later Version* *Current Version
> Release Date* *Later Version Release Date*
> org.assertj:assertj-core 2.5.0 3.10.0 2016-07-03 2018-05-11
> com.google.auto.service:auto-service 1.0-rc2 1.0-rc4 2014-10-24 2017-12-11
> biz.aQute:bndlib 1.43.0 2.0.0.20130123-133441 2011-04-01 2013-02-27
> org.apache.cassandra:cassandra-all 3.9 3.11.2 2016-09-26 2018-02-14
> commons-cli:commons-cli 1.2 1.4 2009-03-19 2017-03-09
> commons-codec:commons-codec 1.9 1.11 2013-12-20 2017-10-17
> org.apache.commons:commons-dbcp2 2.1.1 2.3.0 2015-08-02 2018-05-08
> com.typesafe:config 1.3.0 1.3.3 2015-05-08 2018-02-21
> de.flapdoodle.embed:de.flapdoodle.embed.mongo 1.50.1 2.0.3 2015-12-11
> 2018-02-14
> de.flapdoodle.embed:de.flapdoodle.embed.process 1.50.1 2.0.3 2015-12-11
> 2018-02-14
> org.apache.derby:derby 10.12.1.1 10.14.2.0 2015-10-10 2018-05-03
> org.apache.derby:derbyclient 10.12.1.1 10.14.2.0 2015-10-10 2018-05-03
> org.apache.derby:derbynet 10.12.1.1 10.14.2.0 2015-10-10 2018-05-03
> org.elasticsearch:elasticsearch 5.6.3 6.2.4 2017-10-06 2018-04-12
> org.elasticsearch:elasticsearch-hadoop 5.0.0 6.2.4 2016-10-26 2018-04-12
> org.elasticsearch.client:elasticsearch-rest-client 5.6.3 6.2.4 2017-10-06
> 2018-04-12
> com.alibaba:fastjson 1.2.12 1.2.47 2016-05-21 2018-03-15
> org.elasticsearch.test:framework 5.6.3 6.2.4 2017-10-06 2018-04-12
> org.freemarker:freemarker 2.3.25-incubating 2.3.28 2016-06-14 2018-03-30
> org.codehaus.groovy:groovy-all 2.4.13 3.0.0-alpha-2 2017-11-22 2018-04-16
> org.apache.hbase:hbase-common 1.2.6 2.0.0.3.0.0.3-2 2017-05-29 2018-05-31
> org.apache.hbase:hbase-hadoop-compat 1.2.6 2.0.0.3.0.0.3-2 2017-05-29
> 2018-05-31
> org.apache.hbase:hbase-hadoop2-compat 1.2.6 2.0.0.3.0.0.3-2 2017-05-29
> 2018-05-31
> org.apache.hbase:hbase-server 1.2.6 2.0.0.3.0.0.3-2 2017-05-29 2018-05-31
> org.apache.hbase:hbase-shaded-client 1.2.6 2.0.0.3.0.0.3-2 2017-05-29
> 2018-05-31
> org.apache.hbase:hbase-shaded-server 1.2.6 2.0.0-alpha2 2017-05-29
> 2018-05-31
> org.apache.hive:hive-cli 2.1.0 3.0.0.3.0.0.3-2 2016-06-16 2018-05-21
> org.apache.hive:hive-common 2.1.0 3.0.0.3.0.0.3-2 2016-06-16 2018-05-21
> org.apache.hive:hive-exec 2.1.0 3.0.0.3.0.0.3-2 2016-06-16 2018-05-21
> org.apache.hive.hcatalog:hive-hcatalog-core 2.1.0 3.0.0.3.0.0.3-2
> 2016-06-16 2018-05-21
> org.apache.httpcomponents:httpasyncclient 4.1.2 4.1.3 2016-06-18
> 2017-02-05
> org.apache.httpcomponents:httpclient 4.5.2 4.5.5 2016-02-21 2018-01-18
> org.apache.httpcomponents:httpcore 4.4.5 4.4.9 2016-06-08 2018-01-11
> net.java.dev.javacc:javacc 4.0 7.0.3 2018-06-08 2017-11-06
> jline:jline 2.14.6 3.0.0.M1 2018-03-26 2018-06-08
> net.java.dev.jna:jna 4.1.0 4.5.1 2014-03-06 2017-12-27
> com.esotericsoftware.kryo:kryo 2.21 2.24.0 2013-02-27 2014-05-04
> io.dropwizard.metrics:metrics-core 3.1.2 4.1.0-rc2 2015-04-25 2018-05-03
> org.mongodb:mongo-java-driver 3.2.2 3.8.0-beta3 2016-02-15 2018-05-29
> 

Beam Dependency Check Report (2018-06-13)

2018-06-13 Thread Apache Jenkins Server

High Priority Dependency Updates Of Beam Python SDK:


*Dependency Name* *Current Version* *Later Version* *Current Version Release Date* *Later Version Release Date*
google-cloud-bigquery 0.25.0 1.3.0 2017-06-26 2018-06-08
httplib2 0.9.2 0.11.3 2015-09-28 2018-03-30

High Priority Dependency Updates Of Beam Java SDK:


*Dependency Name* *Current Version* *Later Version* *Current Version Release Date* *Later Version Release Date*
org.assertj:assertj-core 2.5.0 3.10.0 2016-07-03 2018-05-11
com.google.auto.service:auto-service 1.0-rc2 1.0-rc4 2014-10-24 2017-12-11
biz.aQute:bndlib 1.43.0 2.0.0.20130123-133441 2011-04-01 2013-02-27
org.apache.cassandra:cassandra-all 3.9 3.11.2 2016-09-26 2018-02-14
commons-cli:commons-cli 1.2 1.4 2009-03-19 2017-03-09
commons-codec:commons-codec 1.9 1.11 2013-12-20 2017-10-17
org.apache.commons:commons-dbcp2 2.1.1 2.3.0 2015-08-02 2018-05-08
com.typesafe:config 1.3.0 1.3.3 2015-05-08 2018-02-21
de.flapdoodle.embed:de.flapdoodle.embed.mongo 1.50.1 2.0.3 2015-12-11 2018-02-14
de.flapdoodle.embed:de.flapdoodle.embed.process 1.50.1 2.0.3 2015-12-11 2018-02-14
org.apache.derby:derby 10.12.1.1 10.14.2.0 2015-10-10 2018-05-03
org.apache.derby:derbyclient 10.12.1.1 10.14.2.0 2015-10-10 2018-05-03
org.apache.derby:derbynet 10.12.1.1 10.14.2.0 2015-10-10 2018-05-03
org.elasticsearch:elasticsearch 5.6.3 6.2.4 2017-10-06 2018-04-12
org.elasticsearch:elasticsearch-hadoop 5.0.0 6.2.4 2016-10-26 2018-04-12
org.elasticsearch.client:elasticsearch-rest-client 5.6.3 6.2.4 2017-10-06 2018-04-12
com.alibaba:fastjson 1.2.12 1.2.47 2016-05-21 2018-03-15
org.elasticsearch.test:framework 5.6.3 6.2.4 2017-10-06 2018-04-12
org.freemarker:freemarker 2.3.25-incubating 2.3.28 2016-06-14 2018-03-30
org.codehaus.groovy:groovy-all 2.4.13 3.0.0-alpha-2 2017-11-22 2018-04-16
org.apache.hbase:hbase-common 1.2.6 2.0.0.3.0.0.3-2 2017-05-29 2018-05-31
org.apache.hbase:hbase-hadoop-compat 1.2.6 2.0.0.3.0.0.3-2 2017-05-29 2018-05-31
org.apache.hbase:hbase-hadoop2-compat 1.2.6 2.0.0.3.0.0.3-2 2017-05-29 2018-05-31
org.apache.hbase:hbase-server 1.2.6 2.0.0.3.0.0.3-2 2017-05-29 2018-05-31
org.apache.hbase:hbase-shaded-client 1.2.6 2.0.0.3.0.0.3-2 2017-05-29 2018-05-31
org.apache.hbase:hbase-shaded-server 1.2.6 2.0.0-alpha2 2017-05-29 2018-05-31
org.apache.hive:hive-cli 2.1.0 3.0.0.3.0.0.3-2 2016-06-16 2018-05-21
org.apache.hive:hive-common 2.1.0 3.0.0.3.0.0.3-2 2016-06-16 2018-05-21
org.apache.hive:hive-exec 2.1.0 3.0.0.3.0.0.3-2 2016-06-16 2018-05-21
org.apache.hive.hcatalog:hive-hcatalog-core 2.1.0 3.0.0.3.0.0.3-2 2016-06-16 2018-05-21
org.apache.httpcomponents:httpasyncclient 4.1.2 4.1.3 2016-06-18 2017-02-05
org.apache.httpcomponents:httpclient 4.5.2 4.5.5 2016-02-21 2018-01-18
org.apache.httpcomponents:httpcore 4.4.5 4.4.9 2016-06-08 2018-01-11
net.java.dev.javacc:javacc 4.0 7.0.3 2018-06-08 2017-11-06
jline:jline 2.14.6 3.0.0.M1 2018-03-26 2018-06-08
net.java.dev.jna:jna 4.1.0 4.5.1 2014-03-06 2017-12-27
com.esotericsoftware.kryo:kryo 2.21 2.24.0 2013-02-27 2014-05-04
io.dropwizard.metrics:metrics-core 3.1.2 4.1.0-rc2 2015-04-25 2018-05-03
org.mongodb:mongo-java-driver 3.2.2 3.8.0-beta3 2016-02-15 2018-05-29
io.netty:netty-all 4.1.17.Final 5.0.0.Alpha2 2017-11-08 2018-06-06
io.grpc:protoc-gen-grpc-java 1.2.0 1.12.0 2017-03-15 2018-05-07
org.apache.qpid:proton-j 0.13.1 0.27.1 2016-07-01 2018-04-25
com.carrotsearch.randomizedtesting:randomizedtesting-runner 2.5.0 2.6.3 2017-01-23 2018-06-11
org.scala-lang:scala-library 2.11.8 2.13.0-M4 2017-03-08 2018-05-14
org.slf4j:slf4j-api 1.7.25 1.8.0-beta2 2017-03-16 2018-03-21
org.slf4j:slf4j-jdk14 1.7.25 1.8.0-beta2 2017-03-16 2018-03-21
org.apache.solr:solr-core 5.5.4 7.3.1 2017-10-20 2018-05-17
org.apache.solr:solr-solrj 5.5.4 7.3.1 2017-10-20 2018-05-17
org.apache.solr:solr-test-framework 5.5.4 7.3.1 2017-10-20 2018-05-17
org.springframework:spring-expression 4.3.5.RELEASE 5.0.7.RELEASE 2017-01-25 2018-06-12
sqlline:sqlline 1.3.0 1.4.0 2017-05-30 2018-05-30
com.clearspring.analytics:stream 2.9.5 2.9.6 2016-08-10 2018-01-10
org.elasticsearch.client:transport 5.0.0 6.2.4 2016-10-25 2018-04-12
org.elasticsearch.plugin:transport-netty4-client 5.6.3 6.2.4 2017-11-06 2018-04-12
org.tukaani:xz 1.5 1.8 2014-03-08 2018-01-04



[BigQuery] TableRowJsonCoder question

2018-06-13 Thread Etienne Chauchot
Hi all,

While playing with BigQueryIO I noticed something. 

When we create a TableRow (e.g. in a row function in BigQueryIO) using new 
TableRow().set(), a long value, for example, gets boxed into a Long. But when 
the row is encoded using TableRowJsonCoder and then re-read, the value might be 
decoded as an Integer if it fits into an Integer. This causes assertion 
failures in tests that write and then read. 
What I did for now is downcast the long to an int at TableRow creation, to 
force it to be boxed into an Integer (the test value fits into an Integer).

Is there a way to fix it in TableRowJsonCoder or a better workaround?
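
One possible workaround on the test side, as a minimal sketch: compare integral values by their numeric value instead of by boxed type. The class and method names below are made up, and the Integer is only a stand-in for whatever a JSON round trip hands back; the snippet does not call TableRowJsonCoder or BigQueryIO.

```java
import java.util.Objects;

// Hypothetical test helper, not part of Beam.
public class NumericFieldCompare {

    // True for the boxed integral types that can safely be widened to long.
    public static boolean isIntegral(Object o) {
        return o instanceof Long || o instanceof Integer
                || o instanceof Short || o instanceof Byte;
    }

    // Compares integral values numerically; everything else uses equals().
    public static boolean sameValue(Object expected, Object actual) {
        if (isIntegral(expected) && isIntegral(actual)) {
            return ((Number) expected).longValue() == ((Number) actual).longValue();
        }
        return Objects.equals(expected, actual);
    }

    public static void main(String[] args) {
        Object written = 42L;                // boxed as Long when the row is built
        Object reread = Integer.valueOf(42); // stand-in for the re-read value
        System.out.println(written.equals(reread));     // false: a Long never equals an Integer
        System.out.println(sameValue(written, reread)); // true
    }
}
```

This keeps the long in the TableRow as-is, at the cost of a custom comparison in the assertions.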

Etienne

Re: [CANCEL][VOTE] Apache Beam, version 2.5.0, release candidate #1

2018-06-13 Thread Tim Robertson
Hi Pablo,

I'm afraid I couldn't find one either... there is an issue about it [1]
which is old so it doesn't look likely to be resolved either.

If you have time (sorry, I am a bit busy), could you please verify that the
version works if you install it locally? I know the Maven way to do
that [2] but am not sure of the Gradle equivalent. If we know it
works, we can then find a repository that fits with Apache/Beam policy.

Alternatively, we could consider using a fully qualified reference (i.e.
@edu.umd.cs.findbugs.annotations.SuppressWarnings) to the deprecated
version and leave the dependency at 1.3.9-1. I believe our general
direction is to remove findbugs when errorprone covers all aspects so I
*expect* this should be considered reasonable.

I hope this helps,
Tim

[1] https://github.com/stephenc/findbugs-annotations/issues/4
[2] https://maven.apache.org/guides/mini/guide-3rd-party-jars-local.html

On Wed, Jun 13, 2018 at 8:39 AM, Pablo Estrada  wrote:

> Hi Tim,
> you're right. Thanks for pointing that out. There's just one problem that
> I'm running into now: The 3.0.1-1 version does not seem to be available in
> Maven Central[1]. Looking at the website, I am not quite sure if there's
> another repository where they do stage the newer versions?[2]
>
> -P
>
> [1] https://repo.maven.apache.org/maven2/com/github/stephenc/findbugs/findbugs-annotations/
> [2] http://stephenc.github.io/findbugs-annotations/
>
> On Tue, Jun 12, 2018 at 11:10 PM Tim Robertson 
> wrote:
>
>> Hi Pablo,
>>
>> I took only a quick look.
>>
>> "- The JAR from the non-LGPL findbugs does not contain the
>> SuppressFBWarnings annotation"
>>
>> Unless I misunderstand you it looks like SuppressFBWarnings was added in
>> Stephen's version in this commit [1] which was introduced in version
>> 2.0.3-1 - I've checked that it is in the 3.0.1-1 build [2].
>> I notice in your commits [3] you've been exploring version 3.0.0 already
>> though... what happens when you use 3.0.1-1? It sounds like the wrong
>> version is coming in rather than the annotation being missing.
>>
>> Thanks,
>> Tim
>>
>> [1] https://github.com/stephenc/findbugs-annotations/commits/master/src/main/java/edu/umd/cs/findbugs/annotations/SuppressWarnings.java
>> [2] https://github.com/stephenc/findbugs-annotations/releases
>> [3] https://github.com/apache/beam/pull/5609/commits/32c7df706e970557f154ff6bc521b2e00f9d09ab
>>
>>
>>
>>
>>
>>
>>
>> On Wed, Jun 13, 2018 at 2:37 AM, Pablo Estrada 
>> wrote:
>>
>>> Hi all,
>>> I'll humbly declare that after wrestling with the build to stop depending
>>> on the wrong findbugs_annotations, I feel somewhat lost. The issue is
>>> actually quite small:
>>>
>>> - The JAR from the non-LGPL findbugs does not contain the
>>> SuppressFBWarnings annotation. This means that when building, ByteBuddy
>>> produces a few warnings (nothing critical).
>>> - The easiest way to avoid this failure is to call
>>> applyJavaNature(failOnWarning: false), but this would be bad, since we want
>>> to keep a high standard for tasks like ErrorProne and FindBugs itself.
>>> - So I find myself lost: How do we suppress trivial warnings coming from
>>> missing annotations, and honor warnings coming from other plugins?
>>>
>>> Any help / a PR from someone more capable would be appreciated.
>>> Best
>>> -P.
>>>
>>> On Tue, Jun 12, 2018 at 3:02 PM Ismaël Mejía  wrote:
>>>
 Yes, ok I was not aware it was already being addressed, nice.
 On Tue, Jun 12, 2018 at 11:56 PM Ahmet Altay  wrote:
 >
 > Ismaël,
 >
 > I believe Pablo's https://github.com/apache/beam/pull/5609 is fixing
 the issue by changing the findbugs back to "com.github.stephenc.findbugs".
 Is this what you are referring to?
 >
 > Ahmet
 >
 > On Tue, Jun 12, 2018 at 2:51 PM, Boyuan Zhang 
 wrote:
 >>
 >> Hey JB,
 >>
 >> I added some instructions about how to create python wheels in this
 PR: https://github.com/apache/beam-site/pull/467 . Hope it would be
 helpful.
 >>
 >> Boyuan
 >>
 >

>>> --
>>> Got feedback? go/pabloem-feedback
>>> 
>>>
>>
>> --
> Got feedback? go/pabloem-feedback
>


Re: [CANCEL][VOTE] Apache Beam, version 2.5.0, release candidate #1

2018-06-13 Thread Pablo Estrada
Hi Tim,
you're right. Thanks for pointing that out. There's just one problem that
I'm running into now: The 3.0.1-1 version does not seem to be available in
Maven Central[1]. Looking at the website, I am not quite sure if there's
another repository where they do stage the newer versions?[2]

-P

[1]
https://repo.maven.apache.org/maven2/com/github/stephenc/findbugs/findbugs-annotations/
[2] http://stephenc.github.io/findbugs-annotations/

On Tue, Jun 12, 2018 at 11:10 PM Tim Robertson 
wrote:

> Hi Pablo,
>
> I took only a quick look.
>
> "- The JAR from the non-LGPL findbugs does not contain the
> SuppressFBWarnings annotation"
>
> Unless I misunderstand you it looks like SuppressFBWarnings was added in
> Stephen's version in this commit [1] which was introduced in version
> 2.0.3-1 - I've checked that it is in the 3.0.1-1 build [2].
> I notice in your commits [3] you've been exploring version 3.0.0 already
> though... what happens when you use 3.0.1-1? It sounds like the wrong
> version is coming in rather than the annotation being missing.
>
> Thanks,
> Tim
>
> [1]
> https://github.com/stephenc/findbugs-annotations/commits/master/src/main/java/edu/umd/cs/findbugs/annotations/SuppressWarnings.java
> [2] https://github.com/stephenc/findbugs-annotations/releases
> [3]
> https://github.com/apache/beam/pull/5609/commits/32c7df706e970557f154ff6bc521b2e00f9d09ab
>
>
>
>
>
>
>
> On Wed, Jun 13, 2018 at 2:37 AM, Pablo Estrada  wrote:
>
>> Hi all,
>> I'll humbly declare that after wrestling with the build to stop depending
>> on the wrong findbugs_annotations, I feel somewhat lost. The issue is
>> actually quite small:
>>
>> - The JAR from the non-LGPL findbugs does not contain the
>> SuppressFBWarnings annotation. This means that when building, ByteBuddy
>> produces a few warnings (nothing critical).
>> - The easiest way to avoid this failure is to call
>> applyJavaNature(failOnWarning: false), but this would be bad, since we want
>> to keep a high standard for tasks like ErrorProne and FindBugs itself.
>> - So I find myself lost: How do we suppress trivial warnings coming from
>> missing annotations, and honor warnings coming from other plugins?
>>
>> Any help / a PR from someone more capable would be appreciated.
>> Best
>> -P.
>>
>> On Tue, Jun 12, 2018 at 3:02 PM Ismaël Mejía  wrote:
>>
>>> Yes, ok I was not aware it was already being addressed, nice.
>>> On Tue, Jun 12, 2018 at 11:56 PM Ahmet Altay  wrote:
>>> >
>>> > Ismaël,
>>> >
>>> > I believe Pablo's https://github.com/apache/beam/pull/5609 is fixing
>>> the issue by changing the findbugs back to "com.github.stephenc.findbugs".
>>> Is this what you are referring to?
>>> >
>>> > Ahmet
>>> >
>>> > On Tue, Jun 12, 2018 at 2:51 PM, Boyuan Zhang 
>>> wrote:
>>> >>
>>> >> Hey JB,
>>> >>
>>> >> I added some instructions about how to create python wheels in this
>>> PR: https://github.com/apache/beam-site/pull/467 . Hope it would be
>>> helpful.
>>> >>
>>> >> Boyuan
>>> >>
>>> >
>>>
>> --
>> Got feedback? go/pabloem-feedback
>> 
>>
>
> --
Got feedback? go/pabloem-feedback


Re: [CANCEL][VOTE] Apache Beam, version 2.5.0, release candidate #1

2018-06-13 Thread Tim Robertson
Hi Pablo,

I took only a quick look.

"- The JAR from the non-LGPL findbugs does not contain the
SuppressFBWarnings annotation"

Unless I misunderstand you it looks like SuppressFBWarnings was added in
Stephen's version in this commit [1] which was introduced in version
2.0.3-1 - I've checked that it is in the 3.0.1-1 build [2].
I notice in your commits [3] you've been exploring version 3.0.0 already
though... what happens when you use 3.0.1-1? It sounds like the wrong
version is coming in rather than the annotation being missing.

Thanks,
Tim

[1]
https://github.com/stephenc/findbugs-annotations/commits/master/src/main/java/edu/umd/cs/findbugs/annotations/SuppressWarnings.java
[2] https://github.com/stephenc/findbugs-annotations/releases
[3]
https://github.com/apache/beam/pull/5609/commits/32c7df706e970557f154ff6bc521b2e00f9d09ab







On Wed, Jun 13, 2018 at 2:37 AM, Pablo Estrada  wrote:

> Hi all,
> I'll humbly declare that after wrestling with the build to stop depending
> on the wrong findbugs_annotations, I feel somewhat lost. The issue is
> actually quite small:
>
> - The JAR from the non-LGPL findbugs does not contain the
> SuppressFBWarnings annotation. This means that when building, ByteBuddy
> produces a few warnings (nothing critical).
> - The easiest way to avoid this failure is to call
> applyJavaNature(failOnWarning: false), but this would be bad, since we want
> to keep a high standard for tasks like ErrorProne and FindBugs itself.
> - So I find myself lost: How do we suppress trivial warnings coming from
> missing annotations, and honor warnings coming from other plugins?
>
> Any help / a PR from someone more capable would be appreciated.
> Best
> -P.
>
> On Tue, Jun 12, 2018 at 3:02 PM Ismaël Mejía  wrote:
>
>> Yes, ok I was not aware it was already being addressed, nice.
>> On Tue, Jun 12, 2018 at 11:56 PM Ahmet Altay  wrote:
>> >
>> > Ismaël,
>> >
>> > I believe Pablo's https://github.com/apache/beam/pull/5609 is fixing
>> the issue by changing the findbugs back to "com.github.stephenc.findbugs".
>> Is this what you are referring to?
>> >
>> > Ahmet
>> >
>> > On Tue, Jun 12, 2018 at 2:51 PM, Boyuan Zhang 
>> wrote:
>> >>
>> >> Hey JB,
>> >>
>> >> I added some instructions about how to create python wheels in this
>> PR: https://github.com/apache/beam-site/pull/467 . Hope it would be
>> helpful.
>> >>
>> >> Boyuan
>> >>
>> >
>>
> --
> Got feedback? go/pabloem-feedback
>