Re: "retest this please" no longer working on the beam site repo

2018-06-27 Thread Scott Wegner
JB, did you ever hear back from INFRA? This came up again on another PR:
https://github.com/apache/beam/pull/5800#issuecomment-400866581

On Thu, Jun 21, 2018 at 10:39 AM Lukasz Cwik  wrote:

> I have found that it is significantly delayed, takes 10's of mins before
> Jenkins recognizes the PR phrase and (re-)schedules the job.
>
> On Thu, Jun 21, 2018 at 10:09 AM Jean-Baptiste Onofré 
> wrote:
>
>> I have the same issue on other Apache projects like Karaf and its
>> subproject.
>>
>> Let me ping INFRA.
>>
>> Regards
>> JB
>>
>> On 21/06/2018 18:59, Reuven Lax wrote:
>> > Does anyone know why this functionality isn't working?
>> >
>> > Reuven
>>
>> --
>> Jean-Baptiste Onofré
>> jbono...@apache.org
>> http://blog.nanthrax.net
>> Talend - http://www.talend.com
>>
>


Re: [DISCUSS] Automation for Java code formatting

2018-06-27 Thread Kenneth Knowles
So, there are a few modes of checkstyle failure that can be induced by this:

 - lines get too long because of wrap & indent logic
 - a lot of our HTML is actually already broken and doesn't have  tags
where it should; spotless adds a blank line which makes checkstyle notice
the errors

I think that autoformat outweighs these. I propose that we comment out
these checkstyle rules, turn on autoformat, then gradually restore the
rules. I have adjusted my pull request accordingly.

Kenn

On Wed, Jun 27, 2018 at 5:05 PM Robert Burke  wrote:

> +1!
> Given this is the recommended default for Go projects (with it's own gofmt
> tool) I'm for similar tooling for this in other languages.
>
> I suspect we can add Gradle command to run gofmt over the go code too for
> a consistent hook to run for beam code. It would save folks from needing to
> set up their IDEs for every language independently, any of which they may
> not be used to the idiomatic conventions.
>
> On Wed, Jun 27, 2018, 3:32 PM Kenneth Knowles  wrote:
>
>> OK, I opened https://github.com/apache/beam/pull/5797. This thread is
>> still less than a day, so there's no commitment yet. There's no big hurry -
>> I can always discard the autoformat and rerun it. I'm sure there will
>> always be conflicts so I will just have to rerun it and merge at some quiet
>> period for the repo. I will see about pinging existing PRs.
>>
>> On Wed, Jun 27, 2018 at 11:31 AM Daniel Oliveira 
>> wrote:
>>
>>> +1 I'll throw in my support for auto-formatting, especially if the
>>> entire project is auto-formatted in advance.
>>>
>>> On Wed, Jun 27, 2018 at 10:53 AM Huygaa Batsaikhan 
>>> wrote:
>>>
 +1. Global auto-formatting is cool!

 On Wed, Jun 27, 2018 at 10:17 AM Kenneth Knowles 
 wrote:

> I just mean that because of how the tool works. But I guess if there
> were discretion then two different people could end up with autoformatting
> that disagrees, so again you get lines in the PR diff that aren't real
> changes.
>
> Kenn
>
> On Wed, Jun 27, 2018 at 10:16 AM Raghu Angadi 
> wrote:
>
>> On Wed, Jun 27, 2018 at 10:13 AM Kenneth Knowles 
>> wrote:
>>
>>> Nope! No discretion allowed :-)
>>>
>>
>> +1. Fair enough!
>>
>>
>>>
>>> On Wed, Jun 27, 2018 at 9:57 AM Raghu Angadi 
>>> wrote:
>>>
 +1.

 Wondering if it can be configured to reformat only what we care
 most about (2 space indentation etc), allowing some discretion on the
 edges. An example of inconsistent formatting that ends up in my code:
 ---
 anObject.someLongMethodName(arg_number_1,

  arg_number_2);
 --- vs ---
 anObject.anotherMethodName(
   arg_number_1,
   arg_number_2
 );


 On Wed, Jun 27, 2018 at 9:41 AM Lukasz Cwik 
 wrote:

> It wasn't clear to me that the intent was to autoformat all the
> code from the proposal initially. If thats the case, then the delta is
> quite small typically.
>
> Also, it would be easier if we recommended to users to run run
> "./gradlew spotlessApply" which will run spotless on all modules.
>
> On Wed, Jun 27, 2018 at 9:31 AM Kenneth Knowles 
> wrote:
>
>> Luke: the proposal here solves exactly what you are talking about.
>>
>> The problem you describe happens when the PR author uses
>> autoformat but the baseline is not already autoformatted. What I am
>> proposing is to make sure the baseline is already autoformatted, so 
>> PRs
>> never have extraneous formatting changes.
>>
>> Rafael: the default setting on GitHub is "allow edits by
>> maintainers" so actually a committer can run spotless on behalf of a
>> contributor and push the fixup. I have done this. It also lets a
>> committer fix up a good PR and merge it even if the contributor is, 
>> say,
>> asleep.
>>
>> Kenn
>>
>> On Wed, Jun 27, 2018 at 9:24 AM Rafael Fernandez <
>> rfern...@google.com> wrote:
>>
>>> Luke: Anything that helps contributors and reviewers work better
>>> together - +1! :D
>>>
>>>
>>>
>>> On Wed, Jun 27, 2018 at 9:04 AM Lukasz Cwik 
>>> wrote:
>>>
 If spotless is run against a PR that is already well formatted
 its a non-issue as the formatting changes are usually related to 
 the change
 but I have reviewed a few PRs that have 100s of lines of 
 formatting change
 which really obfuscates the work.
 Instead of asking contributors to run spotless, can we have a
 cron job run it across the project like once a day/week/... and 
 cut a PR?

Re: [DISCUSS] Automation for Java code formatting

2018-06-27 Thread Robert Burke
+1!
Given this is the recommended default for Go projects (with it's own gofmt
tool) I'm for similar tooling for this in other languages.

I suspect we can add Gradle command to run gofmt over the go code too for a
consistent hook to run for beam code. It would save folks from needing to
set up their IDEs for every language independently, any of which they may
not be used to the idiomatic conventions.

On Wed, Jun 27, 2018, 3:32 PM Kenneth Knowles  wrote:

> OK, I opened https://github.com/apache/beam/pull/5797. This thread is
> still less than a day, so there's no commitment yet. There's no big hurry -
> I can always discard the autoformat and rerun it. I'm sure there will
> always be conflicts so I will just have to rerun it and merge at some quiet
> period for the repo. I will see about pinging existing PRs.
>
> On Wed, Jun 27, 2018 at 11:31 AM Daniel Oliveira 
> wrote:
>
>> +1 I'll throw in my support for auto-formatting, especially if the entire
>> project is auto-formatted in advance.
>>
>> On Wed, Jun 27, 2018 at 10:53 AM Huygaa Batsaikhan 
>> wrote:
>>
>>> +1. Global auto-formatting is cool!
>>>
>>> On Wed, Jun 27, 2018 at 10:17 AM Kenneth Knowles  wrote:
>>>
 I just mean that because of how the tool works. But I guess if there
 were discretion then two different people could end up with autoformatting
 that disagrees, so again you get lines in the PR diff that aren't real
 changes.

 Kenn

 On Wed, Jun 27, 2018 at 10:16 AM Raghu Angadi 
 wrote:

> On Wed, Jun 27, 2018 at 10:13 AM Kenneth Knowles 
> wrote:
>
>> Nope! No discretion allowed :-)
>>
>
> +1. Fair enough!
>
>
>>
>> On Wed, Jun 27, 2018 at 9:57 AM Raghu Angadi 
>> wrote:
>>
>>> +1.
>>>
>>> Wondering if it can be configured to reformat only what we care most
>>> about (2 space indentation etc), allowing some discretion on the edges. 
>>> An
>>> example of inconsistent formatting that ends up in my code:
>>> ---
>>> anObject.someLongMethodName(arg_number_1,
>>>arg_number_2);
>>> --- vs ---
>>> anObject.anotherMethodName(
>>>   arg_number_1,
>>>   arg_number_2
>>> );
>>>
>>>
>>> On Wed, Jun 27, 2018 at 9:41 AM Lukasz Cwik 
>>> wrote:
>>>
 It wasn't clear to me that the intent was to autoformat all the
 code from the proposal initially. If thats the case, then the delta is
 quite small typically.

 Also, it would be easier if we recommended to users to run run
 "./gradlew spotlessApply" which will run spotless on all modules.

 On Wed, Jun 27, 2018 at 9:31 AM Kenneth Knowles 
 wrote:

> Luke: the proposal here solves exactly what you are talking about.
>
> The problem you describe happens when the PR author uses
> autoformat but the baseline is not already autoformatted. What I am
> proposing is to make sure the baseline is already autoformatted, so 
> PRs
> never have extraneous formatting changes.
>
> Rafael: the default setting on GitHub is "allow edits by
> maintainers" so actually a committer can run spotless on behalf of a
> contributor and push the fixup. I have done this. It also lets a
> committer fix up a good PR and merge it even if the contributor is, 
> say,
> asleep.
>
> Kenn
>
> On Wed, Jun 27, 2018 at 9:24 AM Rafael Fernandez <
> rfern...@google.com> wrote:
>
>> Luke: Anything that helps contributors and reviewers work better
>> together - +1! :D
>>
>>
>>
>> On Wed, Jun 27, 2018 at 9:04 AM Lukasz Cwik 
>> wrote:
>>
>>> If spotless is run against a PR that is already well formatted
>>> its a non-issue as the formatting changes are usually related to 
>>> the change
>>> but I have reviewed a few PRs that have 100s of lines of formatting 
>>> change
>>> which really obfuscates the work.
>>> Instead of asking contributors to run spotless, can we have a
>>> cron job run it across the project like once a day/week/... and cut 
>>> a PR?
>>>
>>> On Wed, Jun 27, 2018 at 8:07 AM Kenneth Knowles 
>>> wrote:
>>>
 Good points, Dan. Checkstyle will still run, but just focused
 on the things that go beyond format.

 Kenn

 On Wed, Jun 27, 2018 at 8:03 AM Etienne Chauchot <
 echauc...@apache.org> wrote:

> +1 !
> It's my custom to avoid reformatting to spare meaningless diff
> burden to the reviewer. Now it will be over, thanks.
>
> Etienne
>
> 

Re: Filtered Pre-commit triggering is BACK!

2018-06-27 Thread Ahmet Altay
Two of my PRs with python changes did not trigger any pre-commits. Could it
be related to this change?

https://github.com/apache/beam/pull/5768
https://github.com/apache/beam/pull/5800

Ahmet



On Tue, Jun 26, 2018 at 2:30 PM, Andrew Pilloud  wrote:

> Awesome! This will save so much time running tests.
>
> On Tue, Jun 26, 2018 at 2:29 PM Yifan Zou  wrote:
>
>> Thanks Scott! It's nice to have this feature.
>>
>> On Tue, Jun 26, 2018 at 2:24 PM Pablo Estrada  wrote:
>>
>>> This is great. Reducing load on infrastructure should help Beam scale
>>> into a larger project : ) - Thanks Scott!
>>>
>>> On Tue, Jun 26, 2018 at 2:21 PM Scott Wegner  wrote:
>>>
 By popular demand [1], filtered pre-commit triggering is now
 re-enabled. Now when submitting pull request, only pre-commit tests for
 your affected files will run: if you change just the Go SDK, you'll no
 longer need to run Java pre-commits.

 Last time we introduced this change it regressed our ability to run
 other pre-commits via trigger phrase ("Run Java PreCommit"); this is now
 fixed by splitting the Jenkins jobs by trigger condition [2].

 Enjoy!


 [1] https://lists.apache.org/thread.html/f98af933ce1ffbf8ae85161dea032a
 ab4577a600c749f46b1d85c226@%3Cdev.beam.apache.org%3E
 [2] https://github.com/apache/beam/pull/5757

>>> --
>>> Got feedback? go/pabloem-feedback
>>> 
>>>
>>


Re: [DISCUSS] Automation for Java code formatting

2018-06-27 Thread Jean-Baptiste Onofré
+1

I already used it in SQL module and it's great. It makes sense to apply on all 
modules.

Thanks !
Regards
JB

Le 27 juin 2018 à 12:15, à 12:15, Kenneth Knowles  a écrit:
>Hi all,
>
>I like readable code, but I don't like formatting it myself. And I
>_really_
>don't like discussing in code review. "Spotless" [1] can enforce - and
>automatically apply - automatic formatting for Java, Groovy, and some
>others.
>
>This is not about style or wanting a particular layout. This is about
>automation, contributor experience, and streamlining review
>
>- Contributor experience: MUCH better than checkstyle: error message
>just
>says "run ./gradlew :beam-your-module:spotlessApply" instead of telling
>them to go in and manually edit.
>
>- Automation: You want to use autoformat so you don't have to format
>code
>by hand. But if you autoformat a file that was in some other format,
>then
>you touch a bunch of unrelated lines. If the file is already
>autoformatted,
>it is much better.
>
> - Review: Never talk about code formatting ever again. A PR also needs
>baseline to already be autoformatted or formatting will make it unclear
>which lines are really changed.
>
>This is already available via applyJavaNature(enableSpotless: true) and
>it
>is turned on for SQL and our buildSrc gradle plugins. It is very nice.
>There is a JIRA [2] to turn it on for the hold code base. Personally, I
>think (a) every module could make a different choice if the main
>contributors feel strongly and (b) it is objectively better to always
>autoformat :-)
>
>WDYT? If we do it, it is trivial to add it module-at-a-time or
>globally. If
>someone conflicts with a massive autoformat commit, they can just keep
>their changes and autoformat them and it is done.
>
>Kenn
>
>[1] https://github.com/diffplug/spotless/tree/master/plugin-gradle
>[2] https://issues.apache.org/jira/browse/BEAM-4394


Re: [DISCUSS] Automation for Java code formatting

2018-06-27 Thread Kenneth Knowles
OK, I opened https://github.com/apache/beam/pull/5797. This thread is still
less than a day, so there's no commitment yet. There's no big hurry - I can
always discard the autoformat and rerun it. I'm sure there will always be
conflicts so I will just have to rerun it and merge at some quiet period
for the repo. I will see about pinging existing PRs.

On Wed, Jun 27, 2018 at 11:31 AM Daniel Oliveira 
wrote:

> +1 I'll throw in my support for auto-formatting, especially if the entire
> project is auto-formatted in advance.
>
> On Wed, Jun 27, 2018 at 10:53 AM Huygaa Batsaikhan 
> wrote:
>
>> +1. Global auto-formatting is cool!
>>
>> On Wed, Jun 27, 2018 at 10:17 AM Kenneth Knowles  wrote:
>>
>>> I just mean that because of how the tool works. But I guess if there
>>> were discretion then two different people could end up with autoformatting
>>> that disagrees, so again you get lines in the PR diff that aren't real
>>> changes.
>>>
>>> Kenn
>>>
>>> On Wed, Jun 27, 2018 at 10:16 AM Raghu Angadi 
>>> wrote:
>>>
 On Wed, Jun 27, 2018 at 10:13 AM Kenneth Knowles 
 wrote:

> Nope! No discretion allowed :-)
>

 +1. Fair enough!


>
> On Wed, Jun 27, 2018 at 9:57 AM Raghu Angadi 
> wrote:
>
>> +1.
>>
>> Wondering if it can be configured to reformat only what we care most
>> about (2 space indentation etc), allowing some discretion on the edges. 
>> An
>> example of inconsistent formatting that ends up in my code:
>> ---
>> anObject.someLongMethodName(arg_number_1,
>>arg_number_2);
>> --- vs ---
>> anObject.anotherMethodName(
>>   arg_number_1,
>>   arg_number_2
>> );
>>
>>
>> On Wed, Jun 27, 2018 at 9:41 AM Lukasz Cwik  wrote:
>>
>>> It wasn't clear to me that the intent was to autoformat all the code
>>> from the proposal initially. If thats the case, then the delta is quite
>>> small typically.
>>>
>>> Also, it would be easier if we recommended to users to run run
>>> "./gradlew spotlessApply" which will run spotless on all modules.
>>>
>>> On Wed, Jun 27, 2018 at 9:31 AM Kenneth Knowles 
>>> wrote:
>>>
 Luke: the proposal here solves exactly what you are talking about.

 The problem you describe happens when the PR author uses autoformat
 but the baseline is not already autoformatted. What I am proposing is 
 to
 make sure the baseline is already autoformatted, so PRs never have
 extraneous formatting changes.

 Rafael: the default setting on GitHub is "allow edits by
 maintainers" so actually a committer can run spotless on behalf of a
 contributor and push the fixup. I have done this. It also lets a
 committer fix up a good PR and merge it even if the contributor is, 
 say,
 asleep.

 Kenn

 On Wed, Jun 27, 2018 at 9:24 AM Rafael Fernandez <
 rfern...@google.com> wrote:

> Luke: Anything that helps contributors and reviewers work better
> together - +1! :D
>
>
>
> On Wed, Jun 27, 2018 at 9:04 AM Lukasz Cwik 
> wrote:
>
>> If spotless is run against a PR that is already well formatted
>> its a non-issue as the formatting changes are usually related to the 
>> change
>> but I have reviewed a few PRs that have 100s of lines of formatting 
>> change
>> which really obfuscates the work.
>> Instead of asking contributors to run spotless, can we have a
>> cron job run it across the project like once a day/week/... and cut 
>> a PR?
>>
>> On Wed, Jun 27, 2018 at 8:07 AM Kenneth Knowles 
>> wrote:
>>
>>> Good points, Dan. Checkstyle will still run, but just focused on
>>> the things that go beyond format.
>>>
>>> Kenn
>>>
>>> On Wed, Jun 27, 2018 at 8:03 AM Etienne Chauchot <
>>> echauc...@apache.org> wrote:
>>>
 +1 !
 It's my custom to avoid reformatting to spare meaningless diff
 burden to the reviewer. Now it will be over, thanks.

 Etienne

 Le mardi 26 juin 2018 à 21:15 -0700, Kenneth Knowles a écrit :

 Hi all,

 I like readable code, but I don't like formatting it myself.
 And I _really_ don't like discussing in code review. "Spotless" 
 [1] can
 enforce - and automatically apply - automatic formatting for Java, 
 Groovy,
 and some others.

 This is not about style or wanting a particular layout. This is
 about automation, contributor experience, and streamlining review


Re: [Design Proposal] Improving Beam code review

2018-06-27 Thread Robert Bradshaw
Thanks for writing this up! I especially like the idea of
automatically assigning code reviewers, e.g. via
https://help.github.com/articles/about-codeowners/
On Wed, Jun 27, 2018 at 11:10 AM Scott Wegner  wrote:
>
> Thanks for putting together this proposal Huygaa. Overall looks good to me; I 
> added some comments in the doc.
>
> On Tue, Jun 26, 2018 at 7:44 PM Kenneth Knowles  wrote:
>>
>> Does Kubernetes keep up with their backlog? We were hovering around 100 
>> before our recent addition of committers & stalebot, and now around 80. I 
>> can imagine their 1000 open PRs might be an OK steady state; they have some 
>> 6 month and 2 month PRs but the overall distribution might be sort of like 
>> ours. Is the data in a table somewhere? Couple other reference points: Spark 
>> has ~500, Flink ~400, Storm ~150, Rust ~150.
>>
>> Kenn
>>
>>
>> On Tue, Jun 26, 2018 at 6:35 PM Rafael Fernandez  wrote:
>>>
>>> I did a quick pass on the doc and left minor comments, thanks! I have some 
>>> feedback and thoughts:
>>>
>>> For metrics and tools, there ought to be mature OSS projects out there we 
>>> can learn from. I believe Kubernetes has a very healthy practice, it'd be 
>>> ideal to learn from them. +Griselda Cuevas can connect you (and people 
>>> working on this).
>>> I really like the idea of a style guide (which can evolve) for the various 
>>> areas - presumably Java, Python, Go, etc. have their own. The reason I like 
>>> it is because reviews become easier -- the reviewer will have an easier 
>>> time working with the contributor to make sure together they can introduce 
>>> great code that is consistent with the codebase (so they can focus on 
>>> functionality and scale discussions, not style, which is published).
>>> I think setting review expectations is hard. Many of us in the community 
>>> have various degrees of time devoted to development - some of us are paid 
>>> to work on Beam full time, others part time, others are gifting their time 
>>> and talent. I find inspiration in the Apache Code of Conduct [1] to instead 
>>> empower people to communicate clearly. A company or a developer may choose 
>>> to say "This is what you can expect from me", and say, opt-in to email 
>>> reminders and such. And when something is time sensitive, we should trust 
>>> reviewers to be Apache-y and do a micro version of "Step down consderately" 
>>> -- "I can't commit to reviewing this by Friday, I suggest another person.", 
>>> for example.
>>>
>>> I think at the end of the day we all need to eliminate guesswork and 
>>> promote the healthiest communication we can so we can all continue to grow 
>>> the project as fast as we want.
>>>
>>> r
>>>
>>> [1] https://www.apache.org/foundation/policies/conduct.html
>>>
>>> On Tue, Jun 26, 2018 at 5:48 PM Huygaa Batsaikhan  wrote:

 Reuven, that's great. In this thread, we can continue discussing the usage 
 of review tools, dashboards, and metrics.

 On Tue, Jun 26, 2018 at 5:27 PM Reuven Lax  wrote:
>
> So I suggested a while ago that we create a code-review guidelines doc, 
> and in fact I was coincidentally just now drafting up a proposal doc. 
> I'll share my proposal doc with the dev list soon.
>
> On Tue, Jun 26, 2018 at 5:18 PM Huygaa Batsaikhan  
> wrote:
>>
>> Hi, I've been looking into ways to improve Beam's code review process 
>> based on previous discussions on dev list and summits, and I would like 
>> to propose improvement ideas. Please take a look at: 
>> https://s.apache.org/beam-code-review.
>>
>> Main proposals suggested in the doc are:
>>
>> Create a code review guideline document.
>> Build/setup code review tools and dashboards for Beam.
>> Collect metrics to monitor Beam's code review health.
>>
>> Feel free to add comments in the doc. I am looking for all sorts of 
>> suggestions including existing code review guidelines, potential code 
>> review tools etc.
>>
>> Thanks so much,
>> Huygaa


Re: Beam Dependency Ownership

2018-06-27 Thread Chamikara Jayalath
It's mentioned under "Dependency declarations may identify owners that are
responsible for upgrading respective dependencies". Feel free to update if
you think more details should be added to it. I think it'll be easier if we
transfer data in spreadsheet to comments close to dependency declarations
instead of maintaining the spreadsheet (after we collect the data).
Otherwise we'll have to put an extra effort to make sure that the
spreadsheet, BeamModulePlugin, and Python setup.py are in sync. We can
decide on the exact format of the comment to make sure that automated tool
can easily parse the comment.

- Cham

On Wed, Jun 27, 2018 at 1:45 PM Yifan Zou  wrote:

> Thanks Scott, I will supplement the missing packages to the spreadsheet.
> And, we expect this being kept up to date along with the Beam project
> growth. Shall we mention this in the Dependency Guide page
> , @Chamikara Jayalath
> ?
>
> On Wed, Jun 27, 2018 at 11:17 AM Scott Wegner  wrote:
>
>> Thanks for kicking off this process Yifan-- I'll add my name to some
>> dependencies I'm familiar with.
>>
>> Do you expect this to be a one-time process, or will we maintain the
>> owners over time? If we will maintain this list, it would be easier to keep
>> it up-to-date if it was closer to the code. i.e. perhaps each dependency
>> registration in the Gradle BeamModulePlugin [1] should include a list of
>> owners.
>>
>> [1]
>> https://github.com/apache/beam/blob/master/buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy#L325
>>
>> On Wed, Jun 27, 2018 at 8:52 AM Yifan Zou  wrote:
>>
>>> Hi all,
>>>
>>> We now have the automated detections for Beam dependency updates and
>>> sending a weekly report to dev mailing list. In order to address the
>>> updates in time, we want to find owners for all dependencies of Beam, and
>>> finally, Jira bugs will be automatically created and assigned to the owners
>>> if actions need to be taken. We also welcome non-owners to upgrade
>>> dependency packages, but only owners will receive the Jira tickets.
>>>
>>> Please review the spreadsheet Beam SDK Dependency Ownership
>>> 
>>>  and
>>> sign off if you are familiar with any Beam dependencies and willing to
>>> take in charge of them. It is definitely fine that a single package have
>>> multiple owners. The more owners we have, the more helps we will get to
>>> keep Beam dependencies in a healthy state.
>>>
>>> Thank you :)
>>>
>>> Regards.
>>> Yifan
>>>
>>>
>>> https://docs.google.com/spreadsheets/d/12NN3vPqFTBQtXBc0fg4sFIb9c_mgst0IDePB_0Ui8kE/edit?ts=5b32bec1#gid=0
>>>
>>


Re: Beam Dependency Ownership

2018-06-27 Thread Yifan Zou
Thanks Scott, I will supplement the missing packages to the spreadsheet.
And, we expect this being kept up to date along with the Beam project
growth. Shall we mention this in the Dependency Guide page
, @Chamikara Jayalath
?

On Wed, Jun 27, 2018 at 11:17 AM Scott Wegner  wrote:

> Thanks for kicking off this process Yifan-- I'll add my name to some
> dependencies I'm familiar with.
>
> Do you expect this to be a one-time process, or will we maintain the
> owners over time? If we will maintain this list, it would be easier to keep
> it up-to-date if it was closer to the code. i.e. perhaps each dependency
> registration in the Gradle BeamModulePlugin [1] should include a list of
> owners.
>
> [1]
> https://github.com/apache/beam/blob/master/buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy#L325
>
> On Wed, Jun 27, 2018 at 8:52 AM Yifan Zou  wrote:
>
>> Hi all,
>>
>> We now have the automated detections for Beam dependency updates and
>> sending a weekly report to dev mailing list. In order to address the
>> updates in time, we want to find owners for all dependencies of Beam, and
>> finally, Jira bugs will be automatically created and assigned to the owners
>> if actions need to be taken. We also welcome non-owners to upgrade
>> dependency packages, but only owners will receive the Jira tickets.
>>
>> Please review the spreadsheet Beam SDK Dependency Ownership
>> 
>>  and
>> sign off if you are familiar with any Beam dependencies and willing to
>> take in charge of them. It is definitely fine that a single package have
>> multiple owners. The more owners we have, the more helps we will get to
>> keep Beam dependencies in a healthy state.
>>
>> Thank you :)
>>
>> Regards.
>> Yifan
>>
>>
>> https://docs.google.com/spreadsheets/d/12NN3vPqFTBQtXBc0fg4sFIb9c_mgst0IDePB_0Ui8kE/edit?ts=5b32bec1#gid=0
>>
>


Re: Using user developped source in streamline python

2018-06-27 Thread Ismaël Mejía
It seems like a nice opportunity also to contribute them to the project in
case you are able to work on them.
Don't hesitate to contact us or ask for help if needed.


On Wed, Jun 27, 2018 at 4:45 PM Ahmet Altay  wrote:

> Hi Sébastien,
>
> Currently there is no work in progress for including the write transforms
> for the locations you listed. You could develop your own version if
> interested. Please see WriteToBigquery transform [1] for reference.
>
> Ahmet
>
> [1]
> https://github.com/apache/beam/blob/375bd3a6a53ba3ba7c965278dcb322875e1b4dca/sdks/python/apache_beam/io/gcp/bigquery.py#L1287
>
> On Wed, Jun 27, 2018 at 2:48 AM, Sebastien Morand <
> sebastien.mor...@veolia.com> wrote:
>
>> Hi,
>>
>> Thanks for your answer.
>>
>> Ok looking forward and ready to test alpha on this. Because we have
>> actually some use cases to send data to CloudSQL or Spanner or BigTable or
>> Firestore. As far as I read the documentation, there is no native support
>> for them and so we have already implemented a custom source support.
>>
>> So is there any work in progress for any of the 4 sinks above for native
>> support in the python SDK?
>>
>> Regards,
>>
>> *Sébastien MORAND*
>> Team Lead Solution Architect
>> Technology & Operations / Digital Factory
>> Veolia - Group Information Systems & Technology (IS)
>> Cell.: +33 6 12 03 41 15 / Direct: +33 1 85 57 71 08
>> Bureau 0144C (Ouest)
>> 30, rue Madeleine-Vionnet
>> 
>>  -
>> 93300
>> Aubervilliers, France
>> 
>> *www.veolia.com *
>> 
>> 
>> 
>> 
>> 
>>
>>
>> On Thu, 21 Jun 2018 at 16:53, Lukasz Cwik  wrote:
>>
>>> +dev@beam.apache.org
>>>
>>> Python streaming custom source support will be available via
>>> SplittableDoFn. It is actively being worked on by a few contributors but to
>>> my knowledge there is no roadmap yet for having support for this for
>>> Dataflow.
>>>
>>>
>>> On Thu, Jun 21, 2018 at 1:19 AM Sebastien Morand <
>>> sebastien.mor...@veolia.com> wrote:
>>>
 Hi,

 We need to setup streaming dataflow in python using developed source
 (in the opposite of native source) for Cloud SQL integration and Firestore
 integration.

 This is currently not supported as far as I understood the
 documentation, can you confirm? Any roadmap on the topic?

 Thanks by advance,
 Regards,

 *Sébastien MORAND*
 Team Lead Solution Architect
 Technology & Operations / Digital Factory
 Veolia - Group Information Systems & Technology (IS)
 Cell.: +33 6 12 03 41 15 / Direct: +33 1 85 57 71 08
 Bureau 0144C (Ouest)
 30, rue Madeleine-Vionnet
 
  -
 93300
 Aubervilliers, France
 
 *www.veolia.com *
 
 
 
 
 



 
 This e-mail transmission (message and any attached files) may contain
 information that is proprietary, privileged and/or confidential to Veolia
 Environnement and/or its affiliates and is intended exclusively for the
 person(s) to whom it is addressed. If you are not the intended recipient,
 please notify the sender by return e-mail and delete all copies of this
 e-mail, including all attachments. Unless expressly authorized, any use,
 disclosure, publication, retransmission or dissemination of this e-mail
 and/or of its attachments is strictly prohibited.

 Ce message electronique et ses fichiers attaches sont strictement
 confidentiels et peuvent contenir des elements dont Veolia Environnement
 et/ou l'une de ses entites affiliees sont proprietaires. Ils sont donc
 destines a l'usage de leurs seuls destinataires. Si vous avez recu ce
 message par erreur, merci de le retourner a son emetteur et de le detruire
 ainsi que toutes les pieces attachees. L'utilisation, la divulgation, la
 publication, la distribution, ou la reproduction 

Re: Beam Dependency Ownership

2018-06-27 Thread Chamikara Jayalath
Thanks Yifan. I added myself to some of the dependencies I'm involved with
upgrading.
In case anyone miss, there's a second tab for Python SDK :).


On Wed, Jun 27, 2018 at 11:17 AM Scott Wegner  wrote:

> Thanks for kicking off this process Yifan-- I'll add my name to some
> dependencies I'm familiar with.
>
> Do you expect this to be a one-time process, or will we maintain the
> owners over time? If we will maintain this list, it would be easier to keep
> it up-to-date if it was closer to the code. i.e. perhaps each dependency
> registration in the Gradle BeamModulePlugin [1] should include a list of
> owners.
>

We want to maintain these long term and owners can be added and removed as
needed. I agree that maintaining owners closer to the dependency
declarations will be easier (pls see
https://beam.apache.org/contribute/dependencies/ for bit more details on
this).

- Cham


>
> [1]
> https://github.com/apache/beam/blob/master/buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy#L325
>
> On Wed, Jun 27, 2018 at 8:52 AM Yifan Zou  wrote:
>
>> Hi all,
>>
>> We now have the automated detections for Beam dependency updates and
>> sending a weekly report to dev mailing list. In order to address the
>> updates in time, we want to find owners for all dependencies of Beam, and
>> finally, Jira bugs will be automatically created and assigned to the owners
>> if actions need to be taken. We also welcome non-owners to upgrade
>> dependency packages, but only owners will receive the Jira tickets.
>>
>> Please review the spreadsheet Beam SDK Dependency Ownership
>> 
>>  and
>> sign off if you are familiar with any Beam dependencies and willing to
>> take in charge of them. It is definitely fine that a single package have
>> multiple owners. The more owners we have, the more helps we will get to
>> keep Beam dependencies in a healthy state.
>>
>> Thank you :)
>>
>> Regards.
>> Yifan
>>
>>
>> https://docs.google.com/spreadsheets/d/12NN3vPqFTBQtXBc0fg4sFIb9c_mgst0IDePB_0Ui8kE/edit?ts=5b32bec1#gid=0
>>
>


Re: Unbounded source translation for portable pipelines

2018-06-27 Thread Lukasz Cwik
It would be great to have the ValidatesRunner suite of tests start
executing against Flink/ULR as it will make sure things don't break and are
reproducible.

On Wed, Jun 27, 2018 at 12:34 PM Eugene Kirpichov 
wrote:

> Hi!
>
> Those instructions are not current and I think should be discarded as they
> referred to a particular effort that is over - +Ankur Goenka
>  is, I believe, working on the remaining finishing
> touches for running from a clean clone of Beam master and documenting how
> to do that; could you help Thomas so we can start looking at what the
> streaming runner is missing?
>
> We'll need to document this in a more prominent place. When we get to a
> state where we can run Python WordCount from master, we'll need to document
> it somewhere on the main portability page and/or the getting started guide;
> when we can run something more serious, e.g. Tensorflow pipelines, that
> will be worth a Beam blog post and worth documenting in the TFX
> documentation.
>
> On Wed, Jun 27, 2018 at 5:35 AM Thomas Weise  wrote:
>
>> Hi Eugene,
>>
>> The basic streaming translation is already in place from the prototype,
>> though I have not verified it on the master branch yet.
>>
>> Are the user instructions for the portable Flink runner at
>> https://s.apache.org/beam-portability-team-doc current?
>>
>> (I don't have a dependency on SDF since we are going to use custom native
>> Flink sources/sinks at this time.)
>>
>> Thanks,
>> Thomas
>>
>>
>> On Tue, Jun 26, 2018 at 2:13 AM Eugene Kirpichov 
>> wrote:
>>
>>> Hi!
>>>
>>> Wanted to let you know that I've just merged the PR that adds
>>> checkpointable SDF support to the portable reference runner (ULR) and the
>>> Java SDK harness:
>>>
>>> https://github.com/apache/beam/pull/5566
>>>
>>> So now we have a reference implementation of SDF support in a portable
>>> runner, and a reference implementation of SDF support in a portable SDK
>>> harness.
>>> From here on, we need to replicate this support in other portable
>>> runners and other harnesses. The obvious targets are Flink and Python
>>> respectively.
>>>
>>> Chamikara was going to work on the Python harness. +Thomas Weise
>>>  Would you be interested in the Flink portable
>>> streaming runner side? It is of course blocked by having the rest of that
>>> runner working in streaming mode though (the batch mode is practically done
>>> - will send you a separate note about the status of that).
>>>
>>> On Fri, Mar 23, 2018 at 12:20 PM Eugene Kirpichov 
>>> wrote:
>>>
 Luke is right - unbounded sources should go through SDF. I am currently
 working on adding such support to Fn API.
 The relevant document is s.apache.org/beam-breaking-fusion (note: it
 focuses on a much more general case, but also considers in detail the
 specific case of running unbounded sources on Fn API), and the first
 related PR is https://github.com/apache/beam/pull/4743 .

 Ways you can help speed up this effort:
 - Make necessary changes to Apex runner per se to support regular SDFs
 in streaming (without portability). They will likely largely carry over to
 portable world. I recall that the Apex runner had some level of support of
 SDFs, but didn't pass the ValidatesRunner tests yet.
 - (general to Beam, not Apex-related per se) Implement the translation
 of Read.from(UnboundedSource) via impulse, which will require implementing
 an SDF that reads from a given UnboundedSource (taking the UnboundedSource
 as an element). This should be fairly straightforward and will allow all
 portable runners to take advantage of existing UnboundedSource's.


 On Fri, Mar 23, 2018 at 3:08 PM Lukasz Cwik  wrote:

> Using impulse is a precursor for both bounded and unbounded SDF.
>
> This JIRA represents the work that would be to add support for
> unbounded SDF using portability APIs:
> https://issues.apache.org/jira/browse/BEAM-2939
>
>
> On Fri, Mar 23, 2018 at 11:46 AM Thomas Weise  wrote:
>
>> So for streaming, we will need the Impulse translation for bounded
>> input, identical with batch, and then in addition to that support for 
>> SDF?
>>
>> Any pointers what's involved in adding the SDF support? Is it runner
>> specific? Does the ULR cover it?
>>
>>
>> On Fri, Mar 23, 2018 at 11:26 AM, Lukasz Cwik 
>> wrote:
>>
>>> All "sources" in portability will use splittable DoFns for execution.
>>>
>>> Specifically, runners will need to be able to checkpoint unbounded
>>> sources to get a minimum viable pipeline working.
>>> For bounded pipelines, a DoFn can read the contents of a bounded
>>> source.
>>>
>>>
>>> On Fri, Mar 23, 2018 at 10:52 AM Thomas Weise 
>>> wrote:
>>>
 Hi,

 I'm looking at the portable pipeline translation for streaming. I
 understand that for batch pipelines, it is sufficient to translate 

Re: Unbounded source translation for portable pipelines

2018-06-27 Thread Eugene Kirpichov
Hi!

Those instructions are not current and I think should be discarded as they
referred to a particular effort that is over - +Ankur Goenka
 is, I believe, working on the remaining finishing
touches for running from a clean clone of Beam master and documenting how
to do that; could you help Thomas so we can start looking at what the
streaming runner is missing?

We'll need to document this in a more prominent place. When we get to a
state where we can run Python WordCount from master, we'll need to document
it somewhere on the main portability page and/or the getting started guide;
when we can run something more serious, e.g. Tensorflow pipelines, that
will be worth a Beam blog post and worth documenting in the TFX
documentation.

On Wed, Jun 27, 2018 at 5:35 AM Thomas Weise  wrote:

> Hi Eugene,
>
> The basic streaming translation is already in place from the prototype,
> though I have not verified it on the master branch yet.
>
> Are the user instructions for the portable Flink runner at
> https://s.apache.org/beam-portability-team-doc current?
>
> (I don't have a dependency on SDF since we are going to use custom native
> Flink sources/sinks at this time.)
>
> Thanks,
> Thomas
>
>
> On Tue, Jun 26, 2018 at 2:13 AM Eugene Kirpichov 
> wrote:
>
>> Hi!
>>
>> Wanted to let you know that I've just merged the PR that adds
>> checkpointable SDF support to the portable reference runner (ULR) and the
>> Java SDK harness:
>>
>> https://github.com/apache/beam/pull/5566
>>
>> So now we have a reference implementation of SDF support in a portable
>> runner, and a reference implementation of SDF support in a portable SDK
>> harness.
>> From here on, we need to replicate this support in other portable runners
>> and other harnesses. The obvious targets are Flink and Python respectively.
>>
>> Chamikara was going to work on the Python harness. +Thomas Weise
>>  Would you be interested in the Flink portable streaming
>> runner side? It is of course blocked by having the rest of that runner
>> working in streaming mode though (the batch mode is practically done - will
>> send you a separate note about the status of that).
>>
>> On Fri, Mar 23, 2018 at 12:20 PM Eugene Kirpichov 
>> wrote:
>>
>>> Luke is right - unbounded sources should go through SDF. I am currently
>>> working on adding such support to Fn API.
>>> The relevant document is s.apache.org/beam-breaking-fusion (note: it
>>> focuses on a much more general case, but also considers in detail the
>>> specific case of running unbounded sources on Fn API), and the first
>>> related PR is https://github.com/apache/beam/pull/4743 .
>>>
>>> Ways you can help speed up this effort:
>>> - Make necessary changes to Apex runner per se to support regular SDFs
>>> in streaming (without portability). They will likely largely carry over to
>>> portable world. I recall that the Apex runner had some level of support of
>>> SDFs, but didn't pass the ValidatesRunner tests yet.
>>> - (general to Beam, not Apex-related per se) Implement the translation
>>> of Read.from(UnboundedSource) via impulse, which will require implementing
>>> an SDF that reads from a given UnboundedSource (taking the UnboundedSource
>>> as an element). This should be fairly straightforward and will allow all
>>> portable runners to take advantage of existing UnboundedSource's.
>>>
>>>
>>> On Fri, Mar 23, 2018 at 3:08 PM Lukasz Cwik  wrote:
>>>
 Using impulse is a precursor for both bounded and unbounded SDF.

 This JIRA represents the work that would be to add support for
 unbounded SDF using portability APIs:
 https://issues.apache.org/jira/browse/BEAM-2939


 On Fri, Mar 23, 2018 at 11:46 AM Thomas Weise  wrote:

> So for streaming, we will need the Impulse translation for bounded
> input, identical with batch, and then in addition to that support for SDF?
>
> Any pointers what's involved in adding the SDF support? Is it runner
> specific? Does the ULR cover it?
>
>
> On Fri, Mar 23, 2018 at 11:26 AM, Lukasz Cwik 
> wrote:
>
>> All "sources" in portability will use splittable DoFns for execution.
>>
>> Specifically, runners will need to be able to checkpoint unbounded
>> sources to get a minimum viable pipeline working.
>> For bounded pipelines, a DoFn can read the contents of a bounded
>> source.
>>
>>
>> On Fri, Mar 23, 2018 at 10:52 AM Thomas Weise  wrote:
>>
>>> Hi,
>>>
>>> I'm looking at the portable pipeline translation for streaming. I
>>> understand that for batch pipelines, it is sufficient to translate 
>>> Impulse.
>>>
>>> What is the intended path to support unbounded sources?
>>>
>>> The goal here is to get a minimum translation working that will
>>> allow streaming wordcount execution.
>>>
>>> Thanks,
>>> Thomas
>>>
>>>
>


Re: ErrorProne and -Werror enabled for all Java projects

2018-06-27 Thread Eugene Kirpichov
This is awesome, thanks to everybody involved! It's so good to have
./gradlew compileJava compileTestJava not produce heaps of warnings like it
used to.

On Wed, Jun 27, 2018 at 9:52 AM Andrew Pilloud  wrote:

> Looking at the diff I think you can replace "Default Setting" with "Only
> Setting". This is Awesome! Thanks guys!
>
> Andrew
>
> On Wed, Jun 27, 2018 at 9:50 AM Kenneth Knowles  wrote:
>
>> Awesome! Can we remove the ability to disable it? :-) :-) :-) or anyhow
>> make it more obscure, not like an expected top-level config choice.
>>
>> Kenn
>>
>> On Wed, Jun 27, 2018 at 9:45 AM Tim  wrote:
>>
>>> Thanks also to you Scott
>>>
>>> Tim
>>>
>>> On 27 Jun 2018, at 18:39, Scott Wegner  wrote:
>>>
>>> Six weeks ago [1] we began an effort to improve the quality of the Java
>>> codebase via ErrorProne static analysis, and promoting compiler warnings to
>>> errors. As of today, all of our Java projects have been migrated and this
>>> is now the default setting for Beam [2].
>>>
>>> This was a community effort. The cleanup spanned 48 JIRA issues [3] and
>>> 46 pull requests [4]. I want to give a big thanks to everyone who helped
>>> out: Ismaël Mejía, Tim Robertson, Cade Markegard, and Teng Peng.
>>>
>>> Thanks!
>>>
>>> [1]
>>> https://lists.apache.org/thread.html/cdc729b6349f952d8db78bae99fff74b06b60918cbe09344e075ba35@%3Cdev.beam.apache.org%3E
>>> 
>>> [2] https://github.com/apache/beam/pull/5773
>>> [3]
>>> https://issues.apache.org/jira/issues/?jql=project%20%3D%20BEAM%20AND%20labels%20%3D%20errorprone
>>>
>>> [4]
>>> https://github.com/apache/beam/pulls?utf8=%E2%9C%93=is%3Apr+errorprone+merged%3A%3E%3D2018-05-16+
>>>
>>>
>>>


Re: [DISCUSS] Automation for Java code formatting

2018-06-27 Thread Daniel Oliveira
+1 I'll throw in my support for auto-formatting, especially if the entire
project is auto-formatted in advance.

On Wed, Jun 27, 2018 at 10:53 AM Huygaa Batsaikhan 
wrote:

> +1. Global auto-formatting is cool!
>
> On Wed, Jun 27, 2018 at 10:17 AM Kenneth Knowles  wrote:
>
>> I just mean that because of how the tool works. But I guess if there were
>> discretion then two different people could end up with autoformatting that
>> disagrees, so again you get lines in the PR diff that aren't real changes.
>>
>> Kenn
>>
>> On Wed, Jun 27, 2018 at 10:16 AM Raghu Angadi  wrote:
>>
>>> On Wed, Jun 27, 2018 at 10:13 AM Kenneth Knowles  wrote:
>>>
 Nope! No discretion allowed :-)

>>>
>>> +1. Fair enough!
>>>
>>>

 On Wed, Jun 27, 2018 at 9:57 AM Raghu Angadi 
 wrote:

> +1.
>
> Wondering if it can be configured to reformat only what we care most
> about (2 space indentation etc), allowing some discretion on the edges. An
> example of inconsistent formatting that ends up in my code:
> ---
> anObject.someLongMethodName(arg_number_1,
>arg_number_2);
> --- vs ---
> anObject.anotherMethodName(
>   arg_number_1,
>   arg_number_2
> );
>
>
> On Wed, Jun 27, 2018 at 9:41 AM Lukasz Cwik  wrote:
>
>> It wasn't clear to me that the intent was to autoformat all the code
>> from the proposal initially. If thats the case, then the delta is quite
>> small typically.
>>
>> Also, it would be easier if we recommended to users to run run
>> "./gradlew spotlessApply" which will run spotless on all modules.
>>
>> On Wed, Jun 27, 2018 at 9:31 AM Kenneth Knowles 
>> wrote:
>>
>>> Luke: the proposal here solves exactly what you are talking about.
>>>
>>> The problem you describe happens when the PR author uses autoformat
>>> but the baseline is not already autoformatted. What I am proposing is to
>>> make sure the baseline is already autoformatted, so PRs never have
>>> extraneous formatting changes.
>>>
>>> Rafael: the default setting on GitHub is "allow edits by
>>> maintainers" so actually a committer can run spotless on behalf of a
>>> contributor and push the fixup. I have done this. It also lets a
>>> committer fix up a good PR and merge it even if the contributor is, say,
>>> asleep.
>>>
>>> Kenn
>>>
>>> On Wed, Jun 27, 2018 at 9:24 AM Rafael Fernandez <
>>> rfern...@google.com> wrote:
>>>
 Luke: Anything that helps contributors and reviewers work better
 together - +1! :D



 On Wed, Jun 27, 2018 at 9:04 AM Lukasz Cwik 
 wrote:

> If spotless is run against a PR that is already well formatted its
> a non-issue as the formatting changes are usually related to the 
> change but
> I have reviewed a few PRs that have 100s of lines of formatting change
> which really obfuscates the work.
> Instead of asking contributors to run spotless, can we have a cron
> job run it across the project like once a day/week/... and cut a PR?
>
> On Wed, Jun 27, 2018 at 8:07 AM Kenneth Knowles 
> wrote:
>
>> Good points, Dan. Checkstyle will still run, but just focused on
>> the things that go beyond format.
>>
>> Kenn
>>
>> On Wed, Jun 27, 2018 at 8:03 AM Etienne Chauchot <
>> echauc...@apache.org> wrote:
>>
>>> +1 !
>>> It's my custom to avoid reformatting to spare meaningless diff
>>> burden to the reviewer. Now it will be over, thanks.
>>>
>>> Etienne
>>>
>>> Le mardi 26 juin 2018 à 21:15 -0700, Kenneth Knowles a écrit :
>>>
>>> Hi all,
>>>
>>> I like readable code, but I don't like formatting it myself. And
>>> I _really_ don't like discussing in code review. "Spotless" [1] can 
>>> enforce
>>> - and automatically apply - automatic formatting for Java, Groovy, 
>>> and some
>>> others.
>>>
>>> This is not about style or wanting a particular layout. This is
>>> about automation, contributor experience, and streamlining review
>>>
>>>  - Contributor experience: MUCH better than checkstyle: error
>>> message just says "run ./gradlew :beam-your-module:spotlessApply" 
>>> instead
>>> of telling them to go in and manually edit.
>>>
>>>  - Automation: You want to use autoformat so you don't have to
>>> format code by hand. But if you autoformat a file that was in some 
>>> other
>>> format, then you touch a bunch of unrelated lines. If the file is 
>>> already
>>> autoformatted, it is much better.
>>>
>>>  - Review: Never talk 

Re: Jenkins Groovy files naming convention

2018-06-27 Thread Scott Wegner
The Jenkins source files are naming according to the Jenkins jobs they
produce, which are named with Snake_Case. However this doesn't seem like a
requirement, and I'm all in favor of keeping things and idiomatic as
possible. So +1 to CamelCase.groovy

FYI, the official Groovy style guide is here:
http://groovy-lang.org/style-guide.html#_classes_as_first_class_citizens,
although it doesn't explicitly touch on naming conventions.

On Wed, Jun 27, 2018 at 9:25 AM Kenneth Knowles  wrote:

> SGTM. The underscores are more-or-less substituting for directories. Maybe
> we can just put them in directories?
>
> On Wed, Jun 27, 2018 at 9:18 AM Lukasz Cwik  wrote:
>
>> I don't really have a strong preference.
>>
>> On Wed, Jun 27, 2018 at 9:13 AM Łukasz Gajowy 
>> wrote:
>>
>>> Hi all,
>>>
>>> I think we should change the naming convention that we have in
>>> jenkins .groovy files. AFAIK, groovy is CamelCase, and we use snake_case
>>> names there. I suppose this is because we wanted to reflect jenkins job
>>> names (do we need this?)
>>>
>>> IMO, the convention should be CamelCase for all .groovy files (both
>>> actual job files and helper class files).
>>>
>>> WDYT?
>>>
>>> Łukasz
>>>
>>


Re: [DISCUSS] Automation for Java code formatting

2018-06-27 Thread Kenneth Knowles
I just mean that because of how the tool works. But I guess if there were
discretion then two different people could end up with autoformatting that
disagrees, so again you get lines in the PR diff that aren't real changes.

Kenn

On Wed, Jun 27, 2018 at 10:16 AM Raghu Angadi  wrote:

> On Wed, Jun 27, 2018 at 10:13 AM Kenneth Knowles  wrote:
>
>> Nope! No discretion allowed :-)
>>
>
> +1. Fair enough!
>
>
>>
>> On Wed, Jun 27, 2018 at 9:57 AM Raghu Angadi  wrote:
>>
>>> +1.
>>>
>>> Wondering if it can be configured to reformat only what we care most
>>> about (2 space indentation etc), allowing some discretion on the edges. An
>>> example of inconsistent formatting that ends up in my code:
>>> ---
>>> anObject.someLongMethodName(arg_number_1,
>>>arg_number_2);
>>> --- vs ---
>>> anObject.anotherMethodName(
>>>   arg_number_1,
>>>   arg_number_2
>>> );
>>>
>>>
>>> On Wed, Jun 27, 2018 at 9:41 AM Lukasz Cwik  wrote:
>>>
 It wasn't clear to me that the intent was to autoformat all the code
 from the proposal initially. If thats the case, then the delta is quite
 small typically.

 Also, it would be easier if we recommended to users to run run
 "./gradlew spotlessApply" which will run spotless on all modules.

 On Wed, Jun 27, 2018 at 9:31 AM Kenneth Knowles  wrote:

> Luke: the proposal here solves exactly what you are talking about.
>
> The problem you describe happens when the PR author uses autoformat
> but the baseline is not already autoformatted. What I am proposing is to
> make sure the baseline is already autoformatted, so PRs never have
> extraneous formatting changes.
>
> Rafael: the default setting on GitHub is "allow edits by maintainers"
> so actually a committer can run spotless on behalf of a contributor and
> push the fixup. I have done this. It also lets a committer fix up a
> good PR and merge it even if the contributor is, say, asleep.
>
> Kenn
>
> On Wed, Jun 27, 2018 at 9:24 AM Rafael Fernandez 
> wrote:
>
>> Luke: Anything that helps contributors and reviewers work better
>> together - +1! :D
>>
>>
>>
>> On Wed, Jun 27, 2018 at 9:04 AM Lukasz Cwik  wrote:
>>
>>> If spotless is run against a PR that is already well formatted its a
>>> non-issue as the formatting changes are usually related to the change 
>>> but I
>>> have reviewed a few PRs that have 100s of lines of formatting change 
>>> which
>>> really obfuscates the work.
>>> Instead of asking contributors to run spotless, can we have a cron
>>> job run it across the project like once a day/week/... and cut a PR?
>>>
>>> On Wed, Jun 27, 2018 at 8:07 AM Kenneth Knowles 
>>> wrote:
>>>
 Good points, Dan. Checkstyle will still run, but just focused on
 the things that go beyond format.

 Kenn

 On Wed, Jun 27, 2018 at 8:03 AM Etienne Chauchot <
 echauc...@apache.org> wrote:

> +1 !
> It's my custom to avoid reformatting to spare meaningless diff
> burden to the reviewer. Now it will be over, thanks.
>
> Etienne
>
> Le mardi 26 juin 2018 à 21:15 -0700, Kenneth Knowles a écrit :
>
> Hi all,
>
> I like readable code, but I don't like formatting it myself. And I
> _really_ don't like discussing in code review. "Spotless" [1] can 
> enforce -
> and automatically apply - automatic formatting for Java, Groovy, and 
> some
> others.
>
> This is not about style or wanting a particular layout. This is
> about automation, contributor experience, and streamlining review
>
>  - Contributor experience: MUCH better than checkstyle: error
> message just says "run ./gradlew :beam-your-module:spotlessApply" 
> instead
> of telling them to go in and manually edit.
>
>  - Automation: You want to use autoformat so you don't have to
> format code by hand. But if you autoformat a file that was in some 
> other
> format, then you touch a bunch of unrelated lines. If the file is 
> already
> autoformatted, it is much better.
>
>  - Review: Never talk about code formatting ever again. A PR also
> needs baseline to already be autoformatted or formatting will make it
> unclear which lines are really changed.
>
> This is already available via applyJavaNature(enableSpotless:
> true) and it is turned on for SQL and our buildSrc gradle plugins. It 
> is
> very nice. There is a JIRA [2] to turn it on for the hold code base.
> Personally, I think (a) every module could make a different choice if 
> the
> main contributors feel strongly 

Re: [DISCUSS] Remove findbugs from sdks/java

2018-06-27 Thread Kenneth Knowles
Last time I checked, errorprone didn't enforce nullness typing. Does it
now? IMO that's the only P0 in our static analysis toolchain.

Kenn

On Wed, Jun 27, 2018 at 9:52 AM Scott Wegner  wrote:

> FYI, now that the ErrorProne migration is complete [1] it's a good time to
> discuss whether FindBugs should be removed or not.
>
> I haven't compared the set of checks we are doing with FindBugs vs
> ErrorProne. If ErrorProne can replace the checks from FindBugs then I'm in
> favor of getting rid of it.
>
> One of the main gripes with FindBugs is that is not well maintained.
> SpotBugs is a new fork of FindBugs which is more active; if we do decide
> there's value in FindBugs perhaps we should at least move to the newer
> fork: https://spotbugs.github.io/
>
> [1]
> https://lists.apache.org/thread.html/0ae24dc4e88e6c38407b74ad4d093985a525f3d3eea094e054ed4cfa@%3Cdev.beam.apache.org%3E
>
> On Wed, May 30, 2018 at 9:42 AM Kenneth Knowles  wrote:
>
>> Awesome!
>>
>> In the meantime I've tried out Gradle + Checker and unfortunately
>> compilation hung. It could be due to any subset of Gradle, Checker,
>> Errorprone. I would not expect a performance problem, since Checker is
>> "pluggable type systems" and type checking is a very fast sort of analysis.
>> Also I didn't immediately find the configuration to ignore generated code.
>> I didn't have a lot of time so I didn't dig further.
>>
>> Kenn
>>
>> On Wed, May 30, 2018 at 9:28 AM Pablo Estrada  wrote:
>>
>>> Thank you guys : D
>>>
>>> On Wed, May 30, 2018 at 9:20 AM Scott Wegner  wrote:
>>>
 Sorry to revive an old thread, but I wanted to give a shout-out and say
 thank you to Ismaël and Tim who have been quickly chipping away at the
 ErrorProne backlog. We started with 47 ErrorProne JIRA's [1], and in two
 weeks we're down to just 17 [2]. Thanks!

 [1] 
 *https://issues.apache.org/jira/issues/?jql=project%20%3D%20BEAM%20AND%20labels%20%3D%20errorprone
 
  *
 [2]
 https://issues.apache.org/jira/issues/?jql=project%20%3D%20BEAM%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22)%20AND%20labels%20%3D%20errorprone


 On Fri, May 18, 2018 at 7:12 AM Ismaël Mejía  wrote:

> As part of the error-prone effort Tim has been also cleaning other
> static
> analysis warnings as reported by IntelliJ's Inspect -> Analyze code. I
> think this is a good moment to grok some of those too e.g. scoping,
> unused
> variables, redundancies, etc. So I hope the others taking part this
> work
> try to tackle a chunk of those as well.
>
> Extra note. Of course IntelliJ's code analysis should be judged a bit,
> there are always fake positives or undesirable changes.
>
> On Fri, May 18, 2018 at 7:56 AM Jean-Baptiste Onofré 
> wrote:
>
> > Thanks Tim.
>
> > I think we will be able to remove findbugs after some run/check using
> ErrorProne and see the gaps.
>
> > Regards
> > JB
> > Le 18 mai 2018, à 07:49, Tim Robertson  a
> écrit:
>
> >> Thank you all.
>
> >> I think this is clear.  Removing findbugs can happen at a future
> point.
>
> >> @Scott - I've been working through the java IO error prone issues
> (some
> already merged, some with open PRs now) so will take those IO Jiras. I
> will
> enable failOnWarning, address dependency issues for findbugs and
> tackle the
> error prone warnings.
>
>
> >> On Fri, May 18, 2018 at 1:07 AM, Scott Wegner 
> wrote:
>
> >>> +0.02173913
>
> >>> I'm happy to replace FindBugs with ErrorProne, but we need to first
> upgrade ErrorProne analyzer warnings to errors. Currently the codebase
> is
> full of warning spam, and there's no enforcement preventing future
> violations from being added.
>
> >>> I've done the work for enforcing ErrorProne analysis on
> java-sdk-core
> [1], and I've sharded out the rest of the Java components in JIRA
> issues
> [2] (45 total).  Fixing the issues is relatively straightforward, and
> I've
> tried to provide enough guidance to make them as starter tasks
> (example:
> [3]). Teng Peng has already started on Spark [4] (thanks!)
>
> >>> [1] https://github.com/apache/beam/pull/5319
> >>> [2]
>
> https://issues.apache.org/jira/issues/?jql=project%20%3D%20BEAM%20AND%20status%20%3D%20Open%20AND%20labels%20%3D%20errorprone
> >>> [3] https://issues.apache.org/jira/browse/BEAM-4347
> >>> [4] https://issues.apache.org/jira/browse/BEAM-4318
>
> >>> On Thu, May 17, 2018 at 2:00 PM Ismaël Mejía 
> wrote:
>
>  +0.7 also. Findbugs support for more recent versions of Java is
> lacking and
>  the maintenance seems frozen in time.
>
>  As a possible plan b can we identify the missing 

Re: [DISCUSS] Automation for Java code formatting

2018-06-27 Thread Raghu Angadi
On Wed, Jun 27, 2018 at 10:13 AM Kenneth Knowles  wrote:

> Nope! No discretion allowed :-)
>

+1. Fair enough!


>
> On Wed, Jun 27, 2018 at 9:57 AM Raghu Angadi  wrote:
>
>> +1.
>>
>> Wondering if it can be configured to reformat only what we care most
>> about (2 space indentation etc), allowing some discretion on the edges. An
>> example of inconsistent formatting that ends up in my code:
>> ---
>> anObject.someLongMethodName(arg_number_1,
>>arg_number_2);
>> --- vs ---
>> anObject.anotherMethodName(
>>   arg_number_1,
>>   arg_number_2
>> );
>>
>>
>> On Wed, Jun 27, 2018 at 9:41 AM Lukasz Cwik  wrote:
>>
>>> It wasn't clear to me that the intent was to autoformat all the code
>>> from the proposal initially. If thats the case, then the delta is quite
>>> small typically.
>>>
>>> Also, it would be easier if we recommended to users to run run
>>> "./gradlew spotlessApply" which will run spotless on all modules.
>>>
>>> On Wed, Jun 27, 2018 at 9:31 AM Kenneth Knowles  wrote:
>>>
 Luke: the proposal here solves exactly what you are talking about.

 The problem you describe happens when the PR author uses autoformat but
 the baseline is not already autoformatted. What I am proposing is to make
 sure the baseline is already autoformatted, so PRs never have extraneous
 formatting changes.

 Rafael: the default setting on GitHub is "allow edits by maintainers"
 so actually a committer can run spotless on behalf of a contributor and
 push the fixup. I have done this. It also lets a committer fix up a
 good PR and merge it even if the contributor is, say, asleep.

 Kenn

 On Wed, Jun 27, 2018 at 9:24 AM Rafael Fernandez 
 wrote:

> Luke: Anything that helps contributors and reviewers work better
> together - +1! :D
>
>
>
> On Wed, Jun 27, 2018 at 9:04 AM Lukasz Cwik  wrote:
>
>> If spotless is run against a PR that is already well formatted its a
>> non-issue as the formatting changes are usually related to the change 
>> but I
>> have reviewed a few PRs that have 100s of lines of formatting change 
>> which
>> really obfuscates the work.
>> Instead of asking contributors to run spotless, can we have a cron
>> job run it across the project like once a day/week/... and cut a PR?
>>
>> On Wed, Jun 27, 2018 at 8:07 AM Kenneth Knowles 
>> wrote:
>>
>>> Good points, Dan. Checkstyle will still run, but just focused on the
>>> things that go beyond format.
>>>
>>> Kenn
>>>
>>> On Wed, Jun 27, 2018 at 8:03 AM Etienne Chauchot <
>>> echauc...@apache.org> wrote:
>>>
 +1 !
 It's my custom to avoid reformatting to spare meaningless diff
 burden to the reviewer. Now it will be over, thanks.

 Etienne

 Le mardi 26 juin 2018 à 21:15 -0700, Kenneth Knowles a écrit :

 Hi all,

 I like readable code, but I don't like formatting it myself. And I
 _really_ don't like discussing in code review. "Spotless" [1] can 
 enforce -
 and automatically apply - automatic formatting for Java, Groovy, and 
 some
 others.

 This is not about style or wanting a particular layout. This is
 about automation, contributor experience, and streamlining review

  - Contributor experience: MUCH better than checkstyle: error
 message just says "run ./gradlew :beam-your-module:spotlessApply" 
 instead
 of telling them to go in and manually edit.

  - Automation: You want to use autoformat so you don't have to
 format code by hand. But if you autoformat a file that was in some 
 other
 format, then you touch a bunch of unrelated lines. If the file is 
 already
 autoformatted, it is much better.

  - Review: Never talk about code formatting ever again. A PR also
 needs baseline to already be autoformatted or formatting will make it
 unclear which lines are really changed.

 This is already available via applyJavaNature(enableSpotless: true)
 and it is turned on for SQL and our buildSrc gradle plugins. It is very
 nice. There is a JIRA [2] to turn it on for the hold code base. 
 Personally,
 I think (a) every module could make a different choice if the main
 contributors feel strongly and (b) it is objectively better to always
 autoformat :-)

 WDYT? If we do it, it is trivial to add it module-at-a-time or
 globally. If someone conflicts with a massive autoformat commit, they 
 can
 just keep their changes and autoformat them and it is done.

 Kenn

 [1] https://github.com/diffplug/spotless/tree/master/plugin-gradle

Re: [DISCUSS] Automation for Java code formatting

2018-06-27 Thread Kenneth Knowles
Nope! No discretion allowed :-)

On Wed, Jun 27, 2018 at 9:57 AM Raghu Angadi  wrote:

> +1.
>
> Wondering if it can be configured to reformat only what we care most about
> (2 space indentation etc), allowing some discretion on the edges. An
> example of inconsistent formatting that ends up in my code:
> ---
> anObject.someLongMethodName(arg_number_1,
>arg_number_2);
> --- vs ---
> anObject.anotherMethodName(
>   arg_number_1,
>   arg_number_2
> );
>
>
> On Wed, Jun 27, 2018 at 9:41 AM Lukasz Cwik  wrote:
>
>> It wasn't clear to me that the intent was to autoformat all the code from
>> the proposal initially. If thats the case, then the delta is quite small
>> typically.
>>
>> Also, it would be easier if we recommended to users to run run
>> "./gradlew spotlessApply" which will run spotless on all modules.
>>
>> On Wed, Jun 27, 2018 at 9:31 AM Kenneth Knowles  wrote:
>>
>>> Luke: the proposal here solves exactly what you are talking about.
>>>
>>> The problem you describe happens when the PR author uses autoformat but
>>> the baseline is not already autoformatted. What I am proposing is to make
>>> sure the baseline is already autoformatted, so PRs never have extraneous
>>> formatting changes.
>>>
>>> Rafael: the default setting on GitHub is "allow edits by maintainers" so
>>> actually a committer can run spotless on behalf of a contributor and push
>>> the fixup. I have done this. It also lets a committer fix up a good PR
>>> and merge it even if the contributor is, say, asleep.
>>>
>>> Kenn
>>>
>>> On Wed, Jun 27, 2018 at 9:24 AM Rafael Fernandez 
>>> wrote:
>>>
 Luke: Anything that helps contributors and reviewers work better
 together - +1! :D



 On Wed, Jun 27, 2018 at 9:04 AM Lukasz Cwik  wrote:

> If spotless is run against a PR that is already well formatted its a
> non-issue as the formatting changes are usually related to the change but 
> I
> have reviewed a few PRs that have 100s of lines of formatting change which
> really obfuscates the work.
> Instead of asking contributors to run spotless, can we have a cron job
> run it across the project like once a day/week/... and cut a PR?
>
> On Wed, Jun 27, 2018 at 8:07 AM Kenneth Knowles 
> wrote:
>
>> Good points, Dan. Checkstyle will still run, but just focused on the
>> things that go beyond format.
>>
>> Kenn
>>
>> On Wed, Jun 27, 2018 at 8:03 AM Etienne Chauchot <
>> echauc...@apache.org> wrote:
>>
>>> +1 !
>>> It's my custom to avoid reformatting to spare meaningless diff
>>> burden to the reviewer. Now it will be over, thanks.
>>>
>>> Etienne
>>>
>>> Le mardi 26 juin 2018 à 21:15 -0700, Kenneth Knowles a écrit :
>>>
>>> Hi all,
>>>
>>> I like readable code, but I don't like formatting it myself. And I
>>> _really_ don't like discussing in code review. "Spotless" [1] can 
>>> enforce -
>>> and automatically apply - automatic formatting for Java, Groovy, and 
>>> some
>>> others.
>>>
>>> This is not about style or wanting a particular layout. This is
>>> about automation, contributor experience, and streamlining review
>>>
>>>  - Contributor experience: MUCH better than checkstyle: error
>>> message just says "run ./gradlew :beam-your-module:spotlessApply" 
>>> instead
>>> of telling them to go in and manually edit.
>>>
>>>  - Automation: You want to use autoformat so you don't have to
>>> format code by hand. But if you autoformat a file that was in some other
>>> format, then you touch a bunch of unrelated lines. If the file is 
>>> already
>>> autoformatted, it is much better.
>>>
>>>  - Review: Never talk about code formatting ever again. A PR also
>>> needs baseline to already be autoformatted or formatting will make it
>>> unclear which lines are really changed.
>>>
>>> This is already available via applyJavaNature(enableSpotless: true)
>>> and it is turned on for SQL and our buildSrc gradle plugins. It is very
>>> nice. There is a JIRA [2] to turn it on for the hold code base. 
>>> Personally,
>>> I think (a) every module could make a different choice if the main
>>> contributors feel strongly and (b) it is objectively better to always
>>> autoformat :-)
>>>
>>> WDYT? If we do it, it is trivial to add it module-at-a-time or
>>> globally. If someone conflicts with a massive autoformat commit, they 
>>> can
>>> just keep their changes and autoformat them and it is done.
>>>
>>> Kenn
>>>
>>> [1] https://github.com/diffplug/spotless/tree/master/plugin-gradle
>>> [2] https://issues.apache.org/jira/browse/BEAM-4394
>>>
>>>


Re: [DISCUSS] Automation for Java code formatting

2018-06-27 Thread Raghu Angadi
+1.

Wondering if it can be configured to reformat only what we care most about
(2 space indentation etc), allowing some discretion on the edges. An
example of inconsistent formatting that ends up in my code:
---
anObject.someLongMethodName(arg_number_1,
   arg_number_2);
--- vs ---
anObject.anotherMethodName(
  arg_number_1,
  arg_number_2
);


On Wed, Jun 27, 2018 at 9:41 AM Lukasz Cwik  wrote:

> It wasn't clear to me that the intent was to autoformat all the code from
> the proposal initially. If thats the case, then the delta is quite small
> typically.
>
> Also, it would be easier if we recommended to users to run run "./gradlew 
> spotlessApply"
> which will run spotless on all modules.
>
> On Wed, Jun 27, 2018 at 9:31 AM Kenneth Knowles  wrote:
>
>> Luke: the proposal here solves exactly what you are talking about.
>>
>> The problem you describe happens when the PR author uses autoformat but
>> the baseline is not already autoformatted. What I am proposing is to make
>> sure the baseline is already autoformatted, so PRs never have extraneous
>> formatting changes.
>>
>> Rafael: the default setting on GitHub is "allow edits by maintainers" so
>> actually a committer can run spotless on behalf of a contributor and push
>> the fixup. I have done this. It also lets a committer fix up a good PR
>> and merge it even if the contributor is, say, asleep.
>>
>> Kenn
>>
>> On Wed, Jun 27, 2018 at 9:24 AM Rafael Fernandez 
>> wrote:
>>
>>> Luke: Anything that helps contributors and reviewers work better
>>> together - +1! :D
>>>
>>>
>>>
>>> On Wed, Jun 27, 2018 at 9:04 AM Lukasz Cwik  wrote:
>>>
 If spotless is run against a PR that is already well formatted its a
 non-issue as the formatting changes are usually related to the change but I
 have reviewed a few PRs that have 100s of lines of formatting change which
 really obfuscates the work.
 Instead of asking contributors to run spotless, can we have a cron job
 run it across the project like once a day/week/... and cut a PR?

 On Wed, Jun 27, 2018 at 8:07 AM Kenneth Knowles  wrote:

> Good points, Dan. Checkstyle will still run, but just focused on the
> things that go beyond format.
>
> Kenn
>
> On Wed, Jun 27, 2018 at 8:03 AM Etienne Chauchot 
> wrote:
>
>> +1 !
>> It's my custom to avoid reformatting to spare meaningless diff burden
>> to the reviewer. Now it will be over, thanks.
>>
>> Etienne
>>
>> Le mardi 26 juin 2018 à 21:15 -0700, Kenneth Knowles a écrit :
>>
>> Hi all,
>>
>> I like readable code, but I don't like formatting it myself. And I
>> _really_ don't like discussing in code review. "Spotless" [1] can 
>> enforce -
>> and automatically apply - automatic formatting for Java, Groovy, and some
>> others.
>>
>> This is not about style or wanting a particular layout. This is about
>> automation, contributor experience, and streamlining review
>>
>>  - Contributor experience: MUCH better than checkstyle: error message
>> just says "run ./gradlew :beam-your-module:spotlessApply" instead of
>> telling them to go in and manually edit.
>>
>>  - Automation: You want to use autoformat so you don't have to format
>> code by hand. But if you autoformat a file that was in some other format,
>> then you touch a bunch of unrelated lines. If the file is already
>> autoformatted, it is much better.
>>
>>  - Review: Never talk about code formatting ever again. A PR also
>> needs baseline to already be autoformatted or formatting will make it
>> unclear which lines are really changed.
>>
>> This is already available via applyJavaNature(enableSpotless: true)
>> and it is turned on for SQL and our buildSrc gradle plugins. It is very
>> nice. There is a JIRA [2] to turn it on for the hold code base. 
>> Personally,
>> I think (a) every module could make a different choice if the main
>> contributors feel strongly and (b) it is objectively better to always
>> autoformat :-)
>>
>> WDYT? If we do it, it is trivial to add it module-at-a-time or
>> globally. If someone conflicts with a massive autoformat commit, they can
>> just keep their changes and autoformat them and it is done.
>>
>> Kenn
>>
>> [1] https://github.com/diffplug/spotless/tree/master/plugin-gradle
>> [2] https://issues.apache.org/jira/browse/BEAM-4394
>>
>>


Re: ErrorProne and -Werror enabled for all Java projects

2018-06-27 Thread Andrew Pilloud
Looking at the diff I think you can replace "Default Setting" with "Only
Setting". This is Awesome! Thanks guys!

Andrew

On Wed, Jun 27, 2018 at 9:50 AM Kenneth Knowles  wrote:

> Awesome! Can we remove the ability to disable it? :-) :-) :-) or anyhow
> make it more obscure, not like an expected top-level config choice.
>
> Kenn
>
> On Wed, Jun 27, 2018 at 9:45 AM Tim  wrote:
>
>> Thanks also to you Scott
>>
>> Tim
>>
>> On 27 Jun 2018, at 18:39, Scott Wegner  wrote:
>>
>> Six weeks ago [1] we began an effort to improve the quality of the Java
>> codebase via ErrorProne static analysis, and promoting compiler warnings to
>> errors. As of today, all of our Java projects have been migrated and this
>> is now the default setting for Beam [2].
>>
>> This was a community effort. The cleanup spanned 48 JIRA issues [3] and
>> 46 pull requests [4]. I want to give a big thanks to everyone who helped
>> out: Ismaël Mejía, Tim Robertson, Cade Markegard, and Teng Peng.
>>
>> Thanks!
>>
>> [1]
>> https://lists.apache.org/thread.html/cdc729b6349f952d8db78bae99fff74b06b60918cbe09344e075ba35@%3Cdev.beam.apache.org%3E
>> 
>> [2] https://github.com/apache/beam/pull/5773
>> [3]
>> https://issues.apache.org/jira/issues/?jql=project%20%3D%20BEAM%20AND%20labels%20%3D%20errorprone
>>
>> [4]
>> https://github.com/apache/beam/pulls?utf8=%E2%9C%93=is%3Apr+errorprone+merged%3A%3E%3D2018-05-16+
>>
>>
>>


Re: [DISCUSS] Remove findbugs from sdks/java

2018-06-27 Thread Scott Wegner
FYI, now that the ErrorProne migration is complete [1] it's a good time to
discuss whether FindBugs should be removed or not.

I haven't compared the set of checks we are doing with FindBugs vs
ErrorProne. If ErrorProne can replace the checks from FindBugs then I'm in
favor of getting rid of it.

One of the main gripes with FindBugs is that is not well maintained.
SpotBugs is a new fork of FindBugs which is more active; if we do decide
there's value in FindBugs perhaps we should at least move to the newer
fork: https://spotbugs.github.io/

[1]
https://lists.apache.org/thread.html/0ae24dc4e88e6c38407b74ad4d093985a525f3d3eea094e054ed4cfa@%3Cdev.beam.apache.org%3E

On Wed, May 30, 2018 at 9:42 AM Kenneth Knowles  wrote:

> Awesome!
>
> In the meantime I've tried out Gradle + Checker and unfortunately
> compilation hung. It could be due to any subset of Gradle, Checker,
> Errorprone. I would not expect a performance problem, since Checker is
> "pluggable type systems" and type checking is a very fast sort of analysis.
> Also I didn't immediately find the configuration to ignore generated code.
> I didn't have a lot of time so I didn't dig further.
>
> Kenn
>
> On Wed, May 30, 2018 at 9:28 AM Pablo Estrada  wrote:
>
>> Thank you guys : D
>>
>> On Wed, May 30, 2018 at 9:20 AM Scott Wegner  wrote:
>>
>>> Sorry to revive an old thread, but I wanted to give a shout-out and say
>>> thank you to Ismaël and Tim who have been quickly chipping away at the
>>> ErrorProne backlog. We started with 47 ErrorProne JIRA's [1], and in two
>>> weeks we're down to just 17 [2]. Thanks!
>>>
>>> [1] 
>>> *https://issues.apache.org/jira/issues/?jql=project%20%3D%20BEAM%20AND%20labels%20%3D%20errorprone
>>> 
>>>  *
>>> [2]
>>> https://issues.apache.org/jira/issues/?jql=project%20%3D%20BEAM%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22)%20AND%20labels%20%3D%20errorprone
>>>
>>>
>>> On Fri, May 18, 2018 at 7:12 AM Ismaël Mejía  wrote:
>>>
 As part of the error-prone effort Tim has been also cleaning other
 static
 analysis warnings as reported by IntelliJ's Inspect -> Analyze code. I
 think this is a good moment to grok some of those too e.g. scoping,
 unused
 variables, redundancies, etc. So I hope the others taking part this work
 try to tackle a chunk of those as well.

 Extra note. Of course IntelliJ's code analysis should be judged a bit,
 there are always fake positives or undesirable changes.

 On Fri, May 18, 2018 at 7:56 AM Jean-Baptiste Onofré 
 wrote:

 > Thanks Tim.

 > I think we will be able to remove findbugs after some run/check using
 ErrorProne and see the gaps.

 > Regards
 > JB
 > Le 18 mai 2018, à 07:49, Tim Robertson  a
 écrit:

 >> Thank you all.

 >> I think this is clear.  Removing findbugs can happen at a future
 point.

 >> @Scott - I've been working through the java IO error prone issues
 (some
 already merged, some with open PRs now) so will take those IO Jiras. I
 will
 enable failOnWarning, address dependency issues for findbugs and tackle
 the
 error prone warnings.


 >> On Fri, May 18, 2018 at 1:07 AM, Scott Wegner 
 wrote:

 >>> +0.02173913

 >>> I'm happy to replace FindBugs with ErrorProne, but we need to first
 upgrade ErrorProne analyzer warnings to errors. Currently the codebase
 is
 full of warning spam, and there's no enforcement preventing future
 violations from being added.

 >>> I've done the work for enforcing ErrorProne analysis on
 java-sdk-core
 [1], and I've sharded out the rest of the Java components in JIRA issues
 [2] (45 total).  Fixing the issues is relatively straightforward, and
 I've
 tried to provide enough guidance to make them as starter tasks (example:
 [3]). Teng Peng has already started on Spark [4] (thanks!)

 >>> [1] https://github.com/apache/beam/pull/5319
 >>> [2]

 https://issues.apache.org/jira/issues/?jql=project%20%3D%20BEAM%20AND%20status%20%3D%20Open%20AND%20labels%20%3D%20errorprone
 >>> [3] https://issues.apache.org/jira/browse/BEAM-4347
 >>> [4] https://issues.apache.org/jira/browse/BEAM-4318

 >>> On Thu, May 17, 2018 at 2:00 PM Ismaël Mejía 
 wrote:

  +0.7 also. Findbugs support for more recent versions of Java is
 lacking and
  the maintenance seems frozen in time.

  As a possible plan b can we identify the missing important
 validations
 to
  identify how much we lose and if it is considerable, maybe we can
 create a
  minimal configuration for those, and eventually migrate from
 findbugs
 to
  spotbugs (https://github.com/spotbugs/spotbugs/) that seems at
 least
 to be
  maintained and the most 

Re: ErrorProne and -Werror enabled for all Java projects

2018-06-27 Thread Kenneth Knowles
Awesome! Can we remove the ability to disable it? :-) :-) :-) or anyhow
make it more obscure, not like an expected top-level config choice.

Kenn

On Wed, Jun 27, 2018 at 9:45 AM Tim  wrote:

> Thanks also to you Scott
>
> Tim
>
> On 27 Jun 2018, at 18:39, Scott Wegner  wrote:
>
> Six weeks ago [1] we began an effort to improve the quality of the Java
> codebase via ErrorProne static analysis, and promoting compiler warnings to
> errors. As of today, all of our Java projects have been migrated and this
> is now the default setting for Beam [2].
>
> This was a community effort. The cleanup spanned 48 JIRA issues [3] and 46
> pull requests [4]. I want to give a big thanks to everyone who helped out:
> Ismaël Mejía, Tim Robertson, Cade Markegard, and Teng Peng.
>
> Thanks!
>
> [1]
> https://lists.apache.org/thread.html/cdc729b6349f952d8db78bae99fff74b06b60918cbe09344e075ba35@%3Cdev.beam.apache.org%3E
> 
> [2] https://github.com/apache/beam/pull/5773
> [3]
> https://issues.apache.org/jira/issues/?jql=project%20%3D%20BEAM%20AND%20labels%20%3D%20errorprone
>
> [4]
> https://github.com/apache/beam/pulls?utf8=%E2%9C%93=is%3Apr+errorprone+merged%3A%3E%3D2018-05-16+
>
>
>


Re: ErrorProne and -Werror enabled for all Java projects

2018-06-27 Thread Tim
Thanks also to you Scott

Tim

> On 27 Jun 2018, at 18:39, Scott Wegner  wrote:
> 
> Six weeks ago [1] we began an effort to improve the quality of the Java 
> codebase via ErrorProne static analysis, and promoting compiler warnings to 
> errors. As of today, all of our Java projects have been migrated and this is 
> now the default setting for Beam [2].
> 
> This was a community effort. The cleanup spanned 48 JIRA issues [3] and 46 
> pull requests [4]. I want to give a big thanks to everyone who helped out: 
> Ismaël Mejía, Tim Robertson, Cade Markegard, and Teng Peng. 
> 
> Thanks!
> 
> [1] 
> https://lists.apache.org/thread.html/cdc729b6349f952d8db78bae99fff74b06b60918cbe09344e075ba35@%3Cdev.beam.apache.org%3E
> [2] https://github.com/apache/beam/pull/5773 
> [3] 
> https://issues.apache.org/jira/issues/?jql=project%20%3D%20BEAM%20AND%20labels%20%3D%20errorprone
>  
> [4] 
> https://github.com/apache/beam/pulls?utf8=%E2%9C%93=is%3Apr+errorprone+merged%3A%3E%3D2018-05-16+
>  


Re: Jenkins emails

2018-06-27 Thread Andrew Pilloud
The failure emails are so frequent that they've just become noise to me.
I'm +1 for not mailing individuals. I would also be in favor of sending
them to commits@ instead of dev@.

Andrew

On Wed, Jun 27, 2018 at 8:38 AM Kenneth Knowles  wrote:

> I think it is just a matter of consensus. I have opened
> https://github.com/apache/beam/pull/5784 to make it off, and then we can
> go through the individual job definitions to remove the parameter entirely.
>
> Two more reasons to remove this email config:
>
>  - as job count scales up, email is just not sensible for managing their
> status
>  - many/most changes are not relevant to the particular test being run
>
> Kenn
>
> On Wed, Jun 27, 2018 at 8:26 AM Stefano Baghino 
> wrote:
>
>> Thanks for the explanation, Kenn, I appreciate. Indeed my commits were on
>> the website, so probably that's it.
>>
>> I presume not, but can I do something to facilitate the process?
>>
>> Best,
>> Stefano
>>
>> Sent from Blue 
>> On Jun 27, 2018, at 17:23, Kenneth Knowles  wrote:
>>>
>>> First let me say what I think the cause is:
>>>
>>>  - we have most jobs set to "email people who broke the build"
>>>  - this scrapes what Jenkins thinks all the changes since last time are,
>>> and for every registered address, sends email
>>>  - when we create a new job with these settings, *all* changes are
>>> considered new, and jobs usually start broken, so everyone gets email until
>>> the first passing run
>>>  - also we bulk merged the web site into the main repo, so *all* website
>>> commits were also considered for all existing jobs
>>>
>>> We have had some other issues where Jenkins is emailing people with *no*
>>> commits at all in any repo. Presumably due to an issue with looking up
>>> registered users.
>>>
>>> I think we should just turn off "email people who broke the build". We
>>> have other channels to monitor the status, and we don't need this spam or
>>> administrative work.
>>>
>>> Kenn
>>>
>>> On Wed, Jun 27, 2018 at 8:10 AM < stef...@baghino.me> wrote:
>>>
 Hi all,

 I've contributed a couple of lines of code a couple of years back. Ever
 since a week, I started receiving email notification from Jenkins.

 How do I unsubscribe from those? I don't have an account on
 builds.apache.org.

 Best,
 Stefano

>>>


Re: [DISCUSS] Automation for Java code formatting

2018-06-27 Thread Lukasz Cwik
It wasn't clear to me that the intent was to autoformat all the code from
the proposal initially. If thats the case, then the delta is quite small
typically.

Also, it would be easier if we recommended to users to run run
"./gradlew spotlessApply"
which will run spotless on all modules.

On Wed, Jun 27, 2018 at 9:31 AM Kenneth Knowles  wrote:

> Luke: the proposal here solves exactly what you are talking about.
>
> The problem you describe happens when the PR author uses autoformat but
> the baseline is not already autoformatted. What I am proposing is to make
> sure the baseline is already autoformatted, so PRs never have extraneous
> formatting changes.
>
> Rafael: the default setting on GitHub is "allow edits by maintainers" so
> actually a committer can run spotless on behalf of a contributor and push
> the fixup. I have done this. It also lets a committer fix up a good PR
> and merge it even if the contributor is, say, asleep.
>
> Kenn
>
> On Wed, Jun 27, 2018 at 9:24 AM Rafael Fernandez 
> wrote:
>
>> Luke: Anything that helps contributors and reviewers work better together
>> - +1! :D
>>
>>
>>
>> On Wed, Jun 27, 2018 at 9:04 AM Lukasz Cwik  wrote:
>>
>>> If spotless is run against a PR that is already well formatted its a
>>> non-issue as the formatting changes are usually related to the change but I
>>> have reviewed a few PRs that have 100s of lines of formatting change which
>>> really obfuscates the work.
>>> Instead of asking contributors to run spotless, can we have a cron job
>>> run it across the project like once a day/week/... and cut a PR?
>>>
>>> On Wed, Jun 27, 2018 at 8:07 AM Kenneth Knowles  wrote:
>>>
 Good points, Dan. Checkstyle will still run, but just focused on the
 things that go beyond format.

 Kenn

 On Wed, Jun 27, 2018 at 8:03 AM Etienne Chauchot 
 wrote:

> +1 !
> It's my custom to avoid reformatting to spare meaningless diff burden
> to the reviewer. Now it will be over, thanks.
>
> Etienne
>
> Le mardi 26 juin 2018 à 21:15 -0700, Kenneth Knowles a écrit :
>
> Hi all,
>
> I like readable code, but I don't like formatting it myself. And I
> _really_ don't like discussing in code review. "Spotless" [1] can enforce 
> -
> and automatically apply - automatic formatting for Java, Groovy, and some
> others.
>
> This is not about style or wanting a particular layout. This is about
> automation, contributor experience, and streamlining review
>
>  - Contributor experience: MUCH better than checkstyle: error message
> just says "run ./gradlew :beam-your-module:spotlessApply" instead of
> telling them to go in and manually edit.
>
>  - Automation: You want to use autoformat so you don't have to format
> code by hand. But if you autoformat a file that was in some other format,
> then you touch a bunch of unrelated lines. If the file is already
> autoformatted, it is much better.
>
>  - Review: Never talk about code formatting ever again. A PR also
> needs baseline to already be autoformatted or formatting will make it
> unclear which lines are really changed.
>
> This is already available via applyJavaNature(enableSpotless: true)
> and it is turned on for SQL and our buildSrc gradle plugins. It is very
> nice. There is a JIRA [2] to turn it on for the hold code base. 
> Personally,
> I think (a) every module could make a different choice if the main
> contributors feel strongly and (b) it is objectively better to always
> autoformat :-)
>
> WDYT? If we do it, it is trivial to add it module-at-a-time or
> globally. If someone conflicts with a massive autoformat commit, they can
> just keep their changes and autoformat them and it is done.
>
> Kenn
>
> [1] https://github.com/diffplug/spotless/tree/master/plugin-gradle
> [2] https://issues.apache.org/jira/browse/BEAM-4394
>
>


ErrorProne and -Werror enabled for all Java projects

2018-06-27 Thread Scott Wegner
Six weeks ago [1] we began an effort to improve the quality of the Java
codebase via ErrorProne static analysis, and promoting compiler warnings to
errors. As of today, all of our Java projects have been migrated and this
is now the default setting for Beam [2].

This was a community effort. The cleanup spanned 48 JIRA issues [3] and 46
pull requests [4]. I want to give a big thanks to everyone who helped out:
Ismaël Mejía, Tim Robertson, Cade Markegard, and Teng Peng.

Thanks!

[1]
https://lists.apache.org/thread.html/cdc729b6349f952d8db78bae99fff74b06b60918cbe09344e075ba35@%3Cdev.beam.apache.org%3E

[2] https://github.com/apache/beam/pull/5773
[3]
https://issues.apache.org/jira/issues/?jql=project%20%3D%20BEAM%20AND%20labels%20%3D%20errorprone

[4]
https://github.com/apache/beam/pulls?utf8=%E2%9C%93=is%3Apr+errorprone+merged%3A%3E%3D2018-05-16+


Re: [DISCUSS] Automation for Java code formatting

2018-06-27 Thread Rafael Fernandez
On Wed, Jun 27, 2018 at 9:31 AM Kenneth Knowles  wrote:

> Luke: the proposal here solves exactly what you are talking about.
>
> The problem you describe happens when the PR author uses autoformat but
> the baseline is not already autoformatted. What I am proposing is to make
> sure the baseline is already autoformatted, so PRs never have extraneous
> formatting changes.
>
> Rafael: the default setting on GitHub is "allow edits by maintainers" so
> actually a committer can run spotless on behalf of a contributor and push
> the fixup. I have done this. It also lets a committer fix up a good PR
> and merge it even if the contributor is, say, asleep.
>

​This is a great practice, review the technical part, use the tool to
address the mechanical part. ​



>
> Kenn
>
> On Wed, Jun 27, 2018 at 9:24 AM Rafael Fernandez 
> wrote:
>
>> Luke: Anything that helps contributors and reviewers work better together
>> - +1! :D
>>
>>
>>
>> On Wed, Jun 27, 2018 at 9:04 AM Lukasz Cwik  wrote:
>>
>>> If spotless is run against a PR that is already well formatted its a
>>> non-issue as the formatting changes are usually related to the change but I
>>> have reviewed a few PRs that have 100s of lines of formatting change which
>>> really obfuscates the work.
>>> Instead of asking contributors to run spotless, can we have a cron job
>>> run it across the project like once a day/week/... and cut a PR?
>>>
>>> On Wed, Jun 27, 2018 at 8:07 AM Kenneth Knowles  wrote:
>>>
 Good points, Dan. Checkstyle will still run, but just focused on the
 things that go beyond format.

 Kenn

 On Wed, Jun 27, 2018 at 8:03 AM Etienne Chauchot 
 wrote:

> +1 !
> It's my custom to avoid reformatting to spare meaningless diff burden
> to the reviewer. Now it will be over, thanks.
>
> Etienne
>
> Le mardi 26 juin 2018 à 21:15 -0700, Kenneth Knowles a écrit :
>
> Hi all,
>
> I like readable code, but I don't like formatting it myself. And I
> _really_ don't like discussing in code review. "Spotless" [1] can enforce 
> -
> and automatically apply - automatic formatting for Java, Groovy, and some
> others.
>
> This is not about style or wanting a particular layout. This is about
> automation, contributor experience, and streamlining review
>
>  - Contributor experience: MUCH better than checkstyle: error message
> just says "run ./gradlew :beam-your-module:spotlessApply" instead of
> telling them to go in and manually edit.
>
>  - Automation: You want to use autoformat so you don't have to format
> code by hand. But if you autoformat a file that was in some other format,
> then you touch a bunch of unrelated lines. If the file is already
> autoformatted, it is much better.
>
>  - Review: Never talk about code formatting ever again. A PR also
> needs baseline to already be autoformatted or formatting will make it
> unclear which lines are really changed.
>
> This is already available via applyJavaNature(enableSpotless: true)
> and it is turned on for SQL and our buildSrc gradle plugins. It is very
> nice. There is a JIRA [2] to turn it on for the hold code base. 
> Personally,
> I think (a) every module could make a different choice if the main
> contributors feel strongly and (b) it is objectively better to always
> autoformat :-)
>
> WDYT? If we do it, it is trivial to add it module-at-a-time or
> globally. If someone conflicts with a massive autoformat commit, they can
> just keep their changes and autoformat them and it is done.
>
> Kenn
>
> [1] https://github.com/diffplug/spotless/tree/master/plugin-gradle
> [2] https://issues.apache.org/jira/browse/BEAM-4394
>
>


smime.p7s
Description: S/MIME Cryptographic Signature


Re: [DISCUSS] Automation for Java code formatting

2018-06-27 Thread Kenneth Knowles
Luke: the proposal here solves exactly what you are talking about.

The problem you describe happens when the PR author uses autoformat but the
baseline is not already autoformatted. What I am proposing is to make sure
the baseline is already autoformatted, so PRs never have extraneous
formatting changes.

Rafael: the default setting on GitHub is "allow edits by maintainers" so
actually a committer can run spotless on behalf of a contributor and push
the fixup. I have done this. It also lets a committer fix up a good PR and
merge it even if the contributor is, say, asleep.

Kenn

On Wed, Jun 27, 2018 at 9:24 AM Rafael Fernandez 
wrote:

> Luke: Anything that helps contributors and reviewers work better together
> - +1! :D
>
>
>
> On Wed, Jun 27, 2018 at 9:04 AM Lukasz Cwik  wrote:
>
>> If spotless is run against a PR that is already well formatted its a
>> non-issue as the formatting changes are usually related to the change but I
>> have reviewed a few PRs that have 100s of lines of formatting change which
>> really obfuscates the work.
>> Instead of asking contributors to run spotless, can we have a cron job
>> run it across the project like once a day/week/... and cut a PR?
>>
>> On Wed, Jun 27, 2018 at 8:07 AM Kenneth Knowles  wrote:
>>
>>> Good points, Dan. Checkstyle will still run, but just focused on the
>>> things that go beyond format.
>>>
>>> Kenn
>>>
>>> On Wed, Jun 27, 2018 at 8:03 AM Etienne Chauchot 
>>> wrote:
>>>
 +1 !
 It's my custom to avoid reformatting to spare meaningless diff burden
 to the reviewer. Now it will be over, thanks.

 Etienne

 Le mardi 26 juin 2018 à 21:15 -0700, Kenneth Knowles a écrit :

 Hi all,

 I like readable code, but I don't like formatting it myself. And I
 _really_ don't like discussing in code review. "Spotless" [1] can enforce -
 and automatically apply - automatic formatting for Java, Groovy, and some
 others.

 This is not about style or wanting a particular layout. This is about
 automation, contributor experience, and streamlining review

  - Contributor experience: MUCH better than checkstyle: error message
 just says "run ./gradlew :beam-your-module:spotlessApply" instead of
 telling them to go in and manually edit.

  - Automation: You want to use autoformat so you don't have to format
 code by hand. But if you autoformat a file that was in some other format,
 then you touch a bunch of unrelated lines. If the file is already
 autoformatted, it is much better.

  - Review: Never talk about code formatting ever again. A PR also needs
 baseline to already be autoformatted or formatting will make it unclear
 which lines are really changed.

 This is already available via applyJavaNature(enableSpotless: true) and
 it is turned on for SQL and our buildSrc gradle plugins. It is very nice.
 There is a JIRA [2] to turn it on for the hold code base. Personally, I
 think (a) every module could make a different choice if the main
 contributors feel strongly and (b) it is objectively better to always
 autoformat :-)

 WDYT? If we do it, it is trivial to add it module-at-a-time or
 globally. If someone conflicts with a massive autoformat commit, they can
 just keep their changes and autoformat them and it is done.

 Kenn

 [1] https://github.com/diffplug/spotless/tree/master/plugin-gradle
 [2] https://issues.apache.org/jira/browse/BEAM-4394




Re: [DISCUSS] Automation for Java code formatting

2018-06-27 Thread Andrew Pilloud
I agree that PRs with formatting changes are impossible to review, but I
think that is exactly what this is trying to avoid. If this is run nightly,
some contributors will still autoformat their PRs and have a bunch of
unrelated changes in the diff. If the code is always well formatted (and
that is enforced by a precommit), then you will never have a PR with
unrelated formatting issues.

Andrew

On Wed, Jun 27, 2018 at 9:04 AM Lukasz Cwik  wrote:

> If spotless is run against a PR that is already well formatted its a
> non-issue as the formatting changes are usually related to the change but I
> have reviewed a few PRs that have 100s of lines of formatting change which
> really obfuscates the work.
> Instead of asking contributors to run spotless, can we have a cron job run
> it across the project like once a day/week/... and cut a PR?
>
> On Wed, Jun 27, 2018 at 8:07 AM Kenneth Knowles  wrote:
>
>> Good points, Dan. Checkstyle will still run, but just focused on the
>> things that go beyond format.
>>
>> Kenn
>>
>> On Wed, Jun 27, 2018 at 8:03 AM Etienne Chauchot 
>> wrote:
>>
>>> +1 !
>>> It's my custom to avoid reformatting to spare meaningless diff burden to
>>> the reviewer. Now it will be over, thanks.
>>>
>>> Etienne
>>>
>>> Le mardi 26 juin 2018 à 21:15 -0700, Kenneth Knowles a écrit :
>>>
>>> Hi all,
>>>
>>> I like readable code, but I don't like formatting it myself. And I
>>> _really_ don't like discussing in code review. "Spotless" [1] can enforce -
>>> and automatically apply - automatic formatting for Java, Groovy, and some
>>> others.
>>>
>>> This is not about style or wanting a particular layout. This is about
>>> automation, contributor experience, and streamlining review
>>>
>>>  - Contributor experience: MUCH better than checkstyle: error message
>>> just says "run ./gradlew :beam-your-module:spotlessApply" instead of
>>> telling them to go in and manually edit.
>>>
>>>  - Automation: You want to use autoformat so you don't have to format
>>> code by hand. But if you autoformat a file that was in some other format,
>>> then you touch a bunch of unrelated lines. If the file is already
>>> autoformatted, it is much better.
>>>
>>>  - Review: Never talk about code formatting ever again. A PR also needs
>>> baseline to already be autoformatted or formatting will make it unclear
>>> which lines are really changed.
>>>
>>> This is already available via applyJavaNature(enableSpotless: true) and
>>> it is turned on for SQL and our buildSrc gradle plugins. It is very nice.
>>> There is a JIRA [2] to turn it on for the hold code base. Personally, I
>>> think (a) every module could make a different choice if the main
>>> contributors feel strongly and (b) it is objectively better to always
>>> autoformat :-)
>>>
>>> WDYT? If we do it, it is trivial to add it module-at-a-time or globally.
>>> If someone conflicts with a massive autoformat commit, they can just keep
>>> their changes and autoformat them and it is done.
>>>
>>> Kenn
>>>
>>> [1] https://github.com/diffplug/spotless/tree/master/plugin-gradle
>>> [2] https://issues.apache.org/jira/browse/BEAM-4394
>>>
>>>


Re: Jenkins Groovy files naming convention

2018-06-27 Thread Kenneth Knowles
SGTM. The underscores are more-or-less substituting for directories. Maybe
we can just put them in directories?

On Wed, Jun 27, 2018 at 9:18 AM Lukasz Cwik  wrote:

> I don't really have a strong preference.
>
> On Wed, Jun 27, 2018 at 9:13 AM Łukasz Gajowy 
> wrote:
>
>> Hi all,
>>
>> I think we should change the naming convention that we have in
>> jenkins .groovy files. AFAIK, groovy is CamelCase, and we use snake_case
>> names there. I suppose this is because we wanted to reflect jenkins job
>> names (do we need this?)
>>
>> IMO, the convention should be CamelCase for all .groovy files (both
>> actual job files and helper class files).
>>
>> WDYT?
>>
>> Łukasz
>>
>


Re: Jenkins Groovy files naming convention

2018-06-27 Thread Rafael Fernandez
+1 CamelCase. Feels "normal" to me in Groovy.

On Wed, Jun 27, 2018 at 9:18 AM Lukasz Cwik  wrote:

> I don't really have a strong preference.
>
> On Wed, Jun 27, 2018 at 9:13 AM Łukasz Gajowy 
> wrote:
>
>> Hi all,
>>
>> I think we should change the naming convention that we have in
>> jenkins .groovy files. AFAIK, groovy is CamelCase, and we use snake_case
>> names there. I suppose this is because we wanted to reflect jenkins job
>> names (do we need this?)
>>
>> IMO, the convention should be CamelCase for all .groovy files (both
>> actual job files and helper class files).
>>
>> WDYT?
>>
>> Łukasz
>>
>


smime.p7s
Description: S/MIME Cryptographic Signature


Re: [DISCUSS] Automation for Java code formatting

2018-06-27 Thread Rafael Fernandez
Luke: Anything that helps contributors and reviewers work better together -
+1! :D



On Wed, Jun 27, 2018 at 9:04 AM Lukasz Cwik  wrote:

> If spotless is run against a PR that is already well formatted its a
> non-issue as the formatting changes are usually related to the change but I
> have reviewed a few PRs that have 100s of lines of formatting change which
> really obfuscates the work.
> Instead of asking contributors to run spotless, can we have a cron job run
> it across the project like once a day/week/... and cut a PR?
>
> On Wed, Jun 27, 2018 at 8:07 AM Kenneth Knowles  wrote:
>
>> Good points, Dan. Checkstyle will still run, but just focused on the
>> things that go beyond format.
>>
>> Kenn
>>
>> On Wed, Jun 27, 2018 at 8:03 AM Etienne Chauchot 
>> wrote:
>>
>>> +1 !
>>> It's my custom to avoid reformatting to spare meaningless diff burden to
>>> the reviewer. Now it will be over, thanks.
>>>
>>> Etienne
>>>
>>> Le mardi 26 juin 2018 à 21:15 -0700, Kenneth Knowles a écrit :
>>>
>>> Hi all,
>>>
>>> I like readable code, but I don't like formatting it myself. And I
>>> _really_ don't like discussing in code review. "Spotless" [1] can enforce -
>>> and automatically apply - automatic formatting for Java, Groovy, and some
>>> others.
>>>
>>> This is not about style or wanting a particular layout. This is about
>>> automation, contributor experience, and streamlining review
>>>
>>>  - Contributor experience: MUCH better than checkstyle: error message
>>> just says "run ./gradlew :beam-your-module:spotlessApply" instead of
>>> telling them to go in and manually edit.
>>>
>>>  - Automation: You want to use autoformat so you don't have to format
>>> code by hand. But if you autoformat a file that was in some other format,
>>> then you touch a bunch of unrelated lines. If the file is already
>>> autoformatted, it is much better.
>>>
>>>  - Review: Never talk about code formatting ever again. A PR also needs
>>> baseline to already be autoformatted or formatting will make it unclear
>>> which lines are really changed.
>>>
>>> This is already available via applyJavaNature(enableSpotless: true) and
>>> it is turned on for SQL and our buildSrc gradle plugins. It is very nice.
>>> There is a JIRA [2] to turn it on for the hold code base. Personally, I
>>> think (a) every module could make a different choice if the main
>>> contributors feel strongly and (b) it is objectively better to always
>>> autoformat :-)
>>>
>>> WDYT? If we do it, it is trivial to add it module-at-a-time or globally.
>>> If someone conflicts with a massive autoformat commit, they can just keep
>>> their changes and autoformat them and it is done.
>>>
>>> Kenn
>>>
>>> [1] https://github.com/diffplug/spotless/tree/master/plugin-gradle
>>> [2] https://issues.apache.org/jira/browse/BEAM-4394
>>>
>>>


smime.p7s
Description: S/MIME Cryptographic Signature


Re: Jenkins Groovy files naming convention

2018-06-27 Thread Lukasz Cwik
I don't really have a strong preference.

On Wed, Jun 27, 2018 at 9:13 AM Łukasz Gajowy 
wrote:

> Hi all,
>
> I think we should change the naming convention that we have in
> jenkins .groovy files. AFAIK, groovy is CamelCase, and we use snake_case
> names there. I suppose this is because we wanted to reflect jenkins job
> names (do we need this?)
>
> IMO, the convention should be CamelCase for all .groovy files (both actual
> job files and helper class files).
>
> WDYT?
>
> Łukasz
>


Jenkins Groovy files naming convention

2018-06-27 Thread Łukasz Gajowy
Hi all,

I think we should change the naming convention that we have in
jenkins .groovy files. AFAIK, groovy is CamelCase, and we use snake_case
names there. I suppose this is because we wanted to reflect jenkins job
names (do we need this?)

IMO, the convention should be CamelCase for all .groovy files (both actual
job files and helper class files).

WDYT?

Łukasz


Re: [DISCUSS] Automation for Java code formatting

2018-06-27 Thread Lukasz Cwik
If spotless is run against a PR that is already well formatted its a
non-issue as the formatting changes are usually related to the change but I
have reviewed a few PRs that have 100s of lines of formatting change which
really obfuscates the work.
Instead of asking contributors to run spotless, can we have a cron job run
it across the project like once a day/week/... and cut a PR?

On Wed, Jun 27, 2018 at 8:07 AM Kenneth Knowles  wrote:

> Good points, Dan. Checkstyle will still run, but just focused on the
> things that go beyond format.
>
> Kenn
>
> On Wed, Jun 27, 2018 at 8:03 AM Etienne Chauchot 
> wrote:
>
>> +1 !
>> It's my custom to avoid reformatting to spare meaningless diff burden to
>> the reviewer. Now it will be over, thanks.
>>
>> Etienne
>>
>> Le mardi 26 juin 2018 à 21:15 -0700, Kenneth Knowles a écrit :
>>
>> Hi all,
>>
>> I like readable code, but I don't like formatting it myself. And I
>> _really_ don't like discussing in code review. "Spotless" [1] can enforce -
>> and automatically apply - automatic formatting for Java, Groovy, and some
>> others.
>>
>> This is not about style or wanting a particular layout. This is about
>> automation, contributor experience, and streamlining review
>>
>>  - Contributor experience: MUCH better than checkstyle: error message
>> just says "run ./gradlew :beam-your-module:spotlessApply" instead of
>> telling them to go in and manually edit.
>>
>>  - Automation: You want to use autoformat so you don't have to format
>> code by hand. But if you autoformat a file that was in some other format,
>> then you touch a bunch of unrelated lines. If the file is already
>> autoformatted, it is much better.
>>
>>  - Review: Never talk about code formatting ever again. A PR also needs
>> baseline to already be autoformatted or formatting will make it unclear
>> which lines are really changed.
>>
>> This is already available via applyJavaNature(enableSpotless: true) and
>> it is turned on for SQL and our buildSrc gradle plugins. It is very nice.
>> There is a JIRA [2] to turn it on for the hold code base. Personally, I
>> think (a) every module could make a different choice if the main
>> contributors feel strongly and (b) it is objectively better to always
>> autoformat :-)
>>
>> WDYT? If we do it, it is trivial to add it module-at-a-time or globally.
>> If someone conflicts with a massive autoformat commit, they can just keep
>> their changes and autoformat them and it is done.
>>
>> Kenn
>>
>> [1] https://github.com/diffplug/spotless/tree/master/plugin-gradle
>> [2] https://issues.apache.org/jira/browse/BEAM-4394
>>
>>


Beam Dependency Ownership

2018-06-27 Thread Yifan Zou
Hi all,

We now have the automated detections for Beam dependency updates and
sending a weekly report to dev mailing list. In order to address the
updates in time, we want to find owners for all dependencies of Beam, and
finally, Jira bugs will be automatically created and assigned to the owners
if actions need to be taken. We also welcome non-owners to upgrade
dependency packages, but only owners will receive the Jira tickets.

Please review the spreadsheet Beam SDK Dependency Ownership

and
sign off if you are familiar with any Beam dependencies and willing to take
in charge of them. It is definitely fine that a single package have
multiple owners. The more owners we have, the more helps we will get to
keep Beam dependencies in a healthy state.

Thank you :)

Regards.
Yifan

https://docs.google.com/spreadsheets/d/12NN3vPqFTBQtXBc0fg4sFIb9c_mgst0IDePB_0Ui8kE/edit?ts=5b32bec1#gid=0


Re: Jenkins emails

2018-06-27 Thread Kenneth Knowles
I think it is just a matter of consensus. I have opened
https://github.com/apache/beam/pull/5784 to make it off, and then we can go
through the individual job definitions to remove the parameter entirely.

Two more reasons to remove this email config:

 - as job count scales up, email is just not sensible for managing their
status
 - many/most changes are not relevant to the particular test being run

Kenn

On Wed, Jun 27, 2018 at 8:26 AM Stefano Baghino  wrote:

> Thanks for the explanation, Kenn, I appreciate. Indeed my commits were on
> the website, so probably that's it.
>
> I presume not, but can I do something to facilitate the process?
>
> Best,
> Stefano
>
> Sent from Blue 
> On Jun 27, 2018, at 17:23, Kenneth Knowles  wrote:
>>
>> First let me say what I think the cause is:
>>
>>  - we have most jobs set to "email people who broke the build"
>>  - this scrapes what Jenkins thinks all the changes since last time are,
>> and for every registered address, sends email
>>  - when we create a new job with these settings, *all* changes are
>> considered new, and jobs usually start broken, so everyone gets email until
>> the first passing run
>>  - also we bulk merged the web site into the main repo, so *all* website
>> commits were also considered for all existing jobs
>>
>> We have had some other issues where Jenkins is emailing people with *no*
>> commits at all in any repo. Presumably due to an issue with looking up
>> registered users.
>>
>> I think we should just turn off "email people who broke the build". We
>> have other channels to monitor the status, and we don't need this spam or
>> administrative work.
>>
>> Kenn
>>
>> On Wed, Jun 27, 2018 at 8:10 AM < stef...@baghino.me> wrote:
>>
>>> Hi all,
>>>
>>> I've contributed a couple of lines of code a couple of years back. Ever
>>> since a week, I started receiving email notification from Jenkins.
>>>
>>> How do I unsubscribe from those? I don't have an account on
>>> builds.apache.org.
>>>
>>> Best,
>>> Stefano
>>>
>>


Re: Jenkins emails

2018-06-27 Thread Stefano Baghino
Thanks for the explanation, Kenn, I appreciate. Indeed my commits were on the 
website, so probably that's it.

I presume not, but can I do something to facilitate the process?

Best,
Stefano

⁣Sent from Blue ​

On Jun 27, 2018, 17:23, at 17:23, Kenneth Knowles  wrote:
>First let me say what I think the cause is:
>
> - we have most jobs set to "email people who broke the build"
>- this scrapes what Jenkins thinks all the changes since last time are,
>and for every registered address, sends email
> - when we create a new job with these settings, *all* changes are
>considered new, and jobs usually start broken, so everyone gets email
>until
>the first passing run
>- also we bulk merged the web site into the main repo, so *all* website
>commits were also considered for all existing jobs
>
>We have had some other issues where Jenkins is emailing people with
>*no*
>commits at all in any repo. Presumably due to an issue with looking up
>registered users.
>
>I think we should just turn off "email people who broke the build". We
>have
>other channels to monitor the status, and we don't need this spam or
>administrative work.
>
>Kenn
>
>On Wed, Jun 27, 2018 at 8:10 AM  wrote:
>
>> Hi all,
>>
>> I've contributed a couple of lines of code a couple of years back.
>Ever
>> since a week, I started receiving email notification from Jenkins.
>>
>> How do I unsubscribe from those? I don't have an account on
>> builds.apache.org.
>>
>> Best,
>> Stefano
>>


Re: Jenkins emails

2018-06-27 Thread Kenneth Knowles
First let me say what I think the cause is:

 - we have most jobs set to "email people who broke the build"
 - this scrapes what Jenkins thinks all the changes since last time are,
and for every registered address, sends email
 - when we create a new job with these settings, *all* changes are
considered new, and jobs usually start broken, so everyone gets email until
the first passing run
 - also we bulk merged the web site into the main repo, so *all* website
commits were also considered for all existing jobs

We have had some other issues where Jenkins is emailing people with *no*
commits at all in any repo. Presumably due to an issue with looking up
registered users.

I think we should just turn off "email people who broke the build". We have
other channels to monitor the status, and we don't need this spam or
administrative work.

Kenn

On Wed, Jun 27, 2018 at 8:10 AM  wrote:

> Hi all,
>
> I've contributed a couple of lines of code a couple of years back. Ever
> since a week, I started receiving email notification from Jenkins.
>
> How do I unsubscribe from those? I don't have an account on
> builds.apache.org.
>
> Best,
> Stefano
>


Jenkins emails

2018-06-27 Thread stefano

Hi all,

I've contributed a couple of lines of code a couple of years back. Ever 
since a week, I started receiving email notification from Jenkins.


How do I unsubscribe from those? I don't have an account on 
builds.apache.org.


Best,
Stefano


Jenkins build is back to normal : beam_SeedJob_Standalone #1219

2018-06-27 Thread Apache Jenkins Server
See 




Re: [DISCUSS] Automation for Java code formatting

2018-06-27 Thread Kenneth Knowles
Good points, Dan. Checkstyle will still run, but just focused on the things
that go beyond format.

Kenn

On Wed, Jun 27, 2018 at 8:03 AM Etienne Chauchot 
wrote:

> +1 !
> It's my custom to avoid reformatting to spare meaningless diff burden to
> the reviewer. Now it will be over, thanks.
>
> Etienne
>
> Le mardi 26 juin 2018 à 21:15 -0700, Kenneth Knowles a écrit :
>
> Hi all,
>
> I like readable code, but I don't like formatting it myself. And I
> _really_ don't like discussing in code review. "Spotless" [1] can enforce -
> and automatically apply - automatic formatting for Java, Groovy, and some
> others.
>
> This is not about style or wanting a particular layout. This is about
> automation, contributor experience, and streamlining review
>
>  - Contributor experience: MUCH better than checkstyle: error message just
> says "run ./gradlew :beam-your-module:spotlessApply" instead of telling
> them to go in and manually edit.
>
>  - Automation: You want to use autoformat so you don't have to format code
> by hand. But if you autoformat a file that was in some other format, then
> you touch a bunch of unrelated lines. If the file is already autoformatted,
> it is much better.
>
>  - Review: Never talk about code formatting ever again. A PR also needs
> baseline to already be autoformatted or formatting will make it unclear
> which lines are really changed.
>
> This is already available via applyJavaNature(enableSpotless: true) and it
> is turned on for SQL and our buildSrc gradle plugins. It is very nice.
> There is a JIRA [2] to turn it on for the hold code base. Personally, I
> think (a) every module could make a different choice if the main
> contributors feel strongly and (b) it is objectively better to always
> autoformat :-)
>
> WDYT? If we do it, it is trivial to add it module-at-a-time or globally.
> If someone conflicts with a massive autoformat commit, they can just keep
> their changes and autoformat them and it is done.
>
> Kenn
>
> [1] https://github.com/diffplug/spotless/tree/master/plugin-gradle
> [2] https://issues.apache.org/jira/browse/BEAM-4394
>
>


Re: [DISCUSS] Automation for Java code formatting

2018-06-27 Thread Etienne Chauchot
+1 !It's my custom to avoid reformatting to spare meaningless diff burden to 
the reviewer. Now it will be over, thanks.
Etienne 
Le mardi 26 juin 2018 à 21:15 -0700, Kenneth Knowles a écrit :
> Hi all,
> I like readable code, but I don't like formatting it myself. And I _really_ 
> don't like discussing in code review.
> "Spotless" [1] can enforce - and automatically apply - automatic formatting 
> for Java, Groovy, and some others.
> 
> This is not about style or wanting a particular layout. This is about 
> automation, contributor experience, and
> streamlining review
> 
>  - Contributor experience: MUCH better than checkstyle: error message just 
> says "run ./gradlew :beam-your-
> module:spotlessApply" instead of telling them to go in and manually edit.
>  - Automation: You want to use autoformat so you don't have to format code by 
> hand. But if you autoformat a file that
> was in some other format, then you touch a bunch of unrelated lines. If the 
> file is already autoformatted, it is much
> better.
> 
>  - Review: Never talk about code formatting ever again. A PR also needs 
> baseline to already be autoformatted or
> formatting will make it unclear which lines are really changed.
> 
> This is already available via applyJavaNature(enableSpotless: true) and it is 
> turned on for SQL and our buildSrc
> gradle plugins. It is very nice. There is a JIRA [2] to turn it on for the 
> hold code base. Personally, I think (a)
> every module could make a different choice if the main contributors feel 
> strongly and (b) it is objectively better to
> always autoformat :-)
> 
> WDYT? If we do it, it is trivial to add it module-at-a-time or globally. If 
> someone conflicts with a massive
> autoformat commit, they can just keep their changes and autoformat them and 
> it is done.
> 
> Kenn
> 
> [1] https://github.com/diffplug/spotless/tree/master/plugin-gradle
> [2] https://issues.apache.org/jira/browse/BEAM-4394
> 
> 

Re: Going on leave for a bit

2018-06-27 Thread Etienne Chauchot
Congrats Kenn !
Spend some good time with your family.
Etienne
Le lundi 25 juin 2018 à 22:42 -0700, Kenneth Knowles a écrit :
> Hi friends,
> I think I did not mention on dev@ at the time, but my child #2 arrived March 
> 14 (Pi day!) and I took some weeks off.
> Starting ~July 4 I will be taking a more significant absence, until ~October 
> 1, trying my best to be totally offline.
> 
> JFYI so that you know why JIRAs and PRs are not being addressed. I am also 
> unassigning my JIRAs so that I am not
> holding any mutexes, and I will close PRs so they don't get stale.
> 
> Any questions or pressing issues, I will be online this week and a little bit 
> next week.
> 
> Kenn

Jenkins build is back to normal : beam_SeedJob #2093

2018-06-27 Thread Apache Jenkins Server
See 



Re: Using user developped source in streamline python

2018-06-27 Thread Ahmet Altay
Hi Sébastien,

Currently there is no work in progress for including the write transforms
for the locations you listed. You could develop your own version if
interested. Please see WriteToBigquery transform [1] for reference.

Ahmet

[1]
https://github.com/apache/beam/blob/375bd3a6a53ba3ba7c965278dcb322875e1b4dca/sdks/python/apache_beam/io/gcp/bigquery.py#L1287

On Wed, Jun 27, 2018 at 2:48 AM, Sebastien Morand <
sebastien.mor...@veolia.com> wrote:

> Hi,
>
> Thanks for your answer.
>
> Ok looking forward and ready to test alpha on this. Because we have
> actually some use cases to send data to CloudSQL or Spanner or BigTable or
> Firestore. As far as I read the documentation, there is no native support
> for them and so we have already implemented a custom source support.
>
> So is there any work in progress for any of the 4 sinks above for native
> support in the python SDK?
>
> Regards,
>
> *Sébastien MORAND*
> Team Lead Solution Architect
> Technology & Operations / Digital Factory
> Veolia - Group Information Systems & Technology (IS)
> Cell.: +33 6 12 03 41 15 / Direct: +33 1 85 57 71 08
> Bureau 0144C (Ouest)
> 30, rue Madeleine-Vionnet
> 
>  -
> 93300
> Aubervilliers, France
> 
> *www.veolia.com *
> 
> 
> 
> 
> 
>
>
> On Thu, 21 Jun 2018 at 16:53, Lukasz Cwik  wrote:
>
>> +dev@beam.apache.org
>>
>> Python streaming custom source support will be available via
>> SplittableDoFn. It is actively being worked on by a few contributors but to
>> my knowledge there is no roadmap yet for having support for this for
>> Dataflow.
>>
>>
>> On Thu, Jun 21, 2018 at 1:19 AM Sebastien Morand <
>> sebastien.mor...@veolia.com> wrote:
>>
>>> Hi,
>>>
>>> We need to setup streaming dataflow in python using developed source (in
>>> the opposite of native source) for Cloud SQL integration and Firestore
>>> integration.
>>>
>>> This is currently not supported as far as I understood the
>>> documentation, can you confirm? Any roadmap on the topic?
>>>
>>> Thanks by advance,
>>> Regards,
>>>
>>> *Sébastien MORAND*
>>> Team Lead Solution Architect
>>> Technology & Operations / Digital Factory
>>> Veolia - Group Information Systems & Technology (IS)
>>> Cell.: +33 6 12 03 41 15 / Direct: +33 1 85 57 71 08
>>> Bureau 0144C (Ouest)
>>> 30, rue Madeleine-Vionnet
>>> 
>>>  -
>>> 93300
>>> Aubervilliers, France
>>> 
>>> *www.veolia.com *
>>> 
>>> 
>>> 
>>> 
>>> 
>>>
>>>
>>> 
>>> 
>>> This e-mail transmission (message and any attached files) may contain
>>> information that is proprietary, privileged and/or confidential to Veolia
>>> Environnement and/or its affiliates and is intended exclusively for the
>>> person(s) to whom it is addressed. If you are not the intended recipient,
>>> please notify the sender by return e-mail and delete all copies of this
>>> e-mail, including all attachments. Unless expressly authorized, any use,
>>> disclosure, publication, retransmission or dissemination of this e-mail
>>> and/or of its attachments is strictly prohibited.
>>>
>>> Ce message electronique et ses fichiers attaches sont strictement
>>> confidentiels et peuvent contenir des elements dont Veolia Environnement
>>> et/ou l'une de ses entites affiliees sont proprietaires. Ils sont donc
>>> destines a l'usage de leurs seuls destinataires. Si vous avez recu ce
>>> message par erreur, merci de le retourner a son emetteur et de le detruire
>>> ainsi que toutes les pieces attachees. L'utilisation, la divulgation, la
>>> publication, la distribution, ou la reproduction non expressement
>>> autorisees de ce message et de ses pieces attachees sont interdites.
>>> 
>>> 
>>>
>>
>
> 
> 
> This e-mail transmission (message and 

Re: [DISCUSS] Automation for Java code formatting

2018-06-27 Thread Daniel Kulp
I’m +1 to the idea of using something to handle the auto-formatting  of code.   

HOWEVER, I want to make it clear that this will likely NOT remove checkstyle 
from the builds as there are things that checkstyle checks that aren’t part of 
this.   Variable names, type names, javadoc things, etc…. Are things that 
checkstyle currently handles that are not solved by this.   Checkstyle can also 
be used to enforce some other coding issues.  For example: checkstyle can mark 
imports of java.util.logging as a problem to link this to another thread on the 
list.  :)

That said, the checkstyle rules for whitespace and paren padding and brace 
placement and such can all be turned off.They likely will have to be turned 
off.   Last time I attempted to use spotless, I couldn’t get a set of 
checkstyle rules that would be a 100% match for what spotless generates.  

Dan

> On Jun 27, 2018, at 3:42 PM, Ismaël Mejía  wrote:
> 
> ++1
> 
> - This solves the big mismatch of autoformatting using eclipse's google java 
> style and the google-java-format plugin that makes a common source of 
> unneeded diffs part of the reviews.
> - The spotless plugin makes this trivial to do and uniform.
> 
> We need to update the instructions on formatting and we need to remove 
> checkstyle and the Eclipse codestyle template from the repo too. So probably 
> worth a JIRA/PR for this too. My pre​ference is to do it all at once because 
> the tool helps to do it quickly and we will have the pain only once and it 
> will be fixed forever. Only thing that bugs me is that we will break probably 
> most of the Java open Pull Requests. From a quicklook there are around 15 PRs 
> from non active contributors that could end up staling if not rebased, so 
> probably worth a message in these PRs remembering to rebase and linking the 
> new autoformatting docs.
> 

-- 
Daniel Kulp
dk...@apache.org  - http://dankulp.com/blog 

Talend Community Coder - http://coders.talend.com 


Build failed in Jenkins: beam_SeedJob #2092

2018-06-27 Thread Apache Jenkins Server
See 

--
GitHub pull request #4976 of commit c1d933b616b2067743a9245f11515a7c51b1be69, 
no merge conflicts.
Setting status of c1d933b616b2067743a9245f11515a7c51b1be69 to PENDING with url 
https://builds.apache.org/job/beam_SeedJob/2092/ and message: 'Build started 
sha1 is merged.'
Using context: Jenkins: Seed Job
[EnvInject] - Loading node environment variables.
Building remotely on beam12 (beam) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/* 
 > +refs/pull/4976/*:refs/remotes/origin/pr/4976/*
 > git rev-parse refs/remotes/origin/pr/4976/merge^{commit} # timeout=10
 > git rev-parse refs/remotes/origin/origin/pr/4976/merge^{commit} # timeout=10
Checking out Revision 6b6703aeb5ddd9fd5dcedfd9afff40f764234b1c 
(refs/remotes/origin/pr/4976/merge)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 6b6703aeb5ddd9fd5dcedfd9afff40f764234b1c
Commit message: "Merge c1d933b616b2067743a9245f11515a7c51b1be69 into 
f0ccbf1ab6cc9f42d193bc48d636d3382a825cdb"
First time build. Skipping changelog.
Cleaning workspace
 > git rev-parse --verify HEAD # timeout=10
Resetting working tree
 > git reset --hard # timeout=10
 > git clean -fdx # timeout=10
Processing DSL script job_00_seed.groovy
Processing DSL script job_Dependency_Check.groovy
Processing DSL script job_Inventory.groovy
Processing DSL script job_PerformanceTests_Dataflow.groovy
Processing DSL script job_PerformanceTests_FileBasedIO_IT.groovy
Processing DSL script job_PerformanceTests_FileBasedIO_IT_HDFS.groovy
Processing DSL script job_PerformanceTests_HadoopInputFormat.groovy
Processing DSL script job_PerformanceTests_JDBC.groovy
Processing DSL script job_PerformanceTests_MongoDBIO_IT.groovy
Processing DSL script job_PerformanceTests_Python.groovy
Processing DSL script job_PerformanceTests_Spark.groovy
Processing DSL script job_PostCommit_Go_GradleBuild.groovy
Processing DSL script job_PostCommit_Java_GradleBuild.groovy
Processing DSL script job_PostCommit_Java_Nexmark_Direct.groovy
ERROR: (job_PostCommit_Java_Nexmark_Direct.groovy, line 30) No signature of 
method: static common_job_properties.setPostCommit() is applicable for argument 
types: (javaposse.jobdsl.dsl.jobs.FreeStyleJob) values: 
[javaposse.jobdsl.dsl.jobs.FreeStyleJob@7d7b37e8]
Possible solutions: setPreCommit(java.lang.Object, java.lang.String)



Re: [DISCUSS] Automation for Java code formatting

2018-06-27 Thread Ismaël Mejía
++1

- This solves the big mismatch of autoformatting using eclipse's google
java style and the google-java-format plugin that makes a common source of
unneeded diffs part of the reviews.
- The spotless plugin makes this trivial to do and uniform.

We need to update the instructions on formatting and we need to remove
checkstyle and the Eclipse codestyle template from the repo too. So
probably worth a JIRA/PR for this too. My pre​ference is to do it all at
once because the tool helps to do it quickly and we will have the pain only
once and it will be fixed forever. Only thing that bugs me is that we will
break probably most of the Java open Pull Requests. From a quicklook there
are around 15 PRs from non active contributors that could end up staling if
not rebased, so probably worth a message in these PRs remembering to rebase
and linking the new autoformatting docs.


Jenkins build is back to normal : beam_SeedJob #2091

2018-06-27 Thread Apache Jenkins Server
See 




Re: Unbounded source translation for portable pipelines

2018-06-27 Thread Thomas Weise
Hi Eugene,

The basic streaming translation is already in place from the prototype,
though I have not verified it on the master branch yet.

Are the user instructions for the portable Flink runner at
https://s.apache.org/beam-portability-team-doc current?

(I don't have a dependency on SDF since we are going to use custom native
Flink sources/sinks at this time.)

Thanks,
Thomas


On Tue, Jun 26, 2018 at 2:13 AM Eugene Kirpichov 
wrote:

> Hi!
>
> Wanted to let you know that I've just merged the PR that adds
> checkpointable SDF support to the portable reference runner (ULR) and the
> Java SDK harness:
>
> https://github.com/apache/beam/pull/5566
>
> So now we have a reference implementation of SDF support in a portable
> runner, and a reference implementation of SDF support in a portable SDK
> harness.
> From here on, we need to replicate this support in other portable runners
> and other harnesses. The obvious targets are Flink and Python respectively.
>
> Chamikara was going to work on the Python harness. +Thomas Weise
>  Would you be interested in the Flink portable streaming
> runner side? It is of course blocked by having the rest of that runner
> working in streaming mode though (the batch mode is practically done - will
> send you a separate note about the status of that).
>
> On Fri, Mar 23, 2018 at 12:20 PM Eugene Kirpichov 
> wrote:
>
>> Luke is right - unbounded sources should go through SDF. I am currently
>> working on adding such support to Fn API.
>> The relevant document is s.apache.org/beam-breaking-fusion (note: it
>> focuses on a much more general case, but also considers in detail the
>> specific case of running unbounded sources on Fn API), and the first
>> related PR is https://github.com/apache/beam/pull/4743 .
>>
>> Ways you can help speed up this effort:
>> - Make necessary changes to Apex runner per se to support regular SDFs in
>> streaming (without portability). They will likely largely carry over to
>> portable world. I recall that the Apex runner had some level of support of
>> SDFs, but didn't pass the ValidatesRunner tests yet.
>> - (general to Beam, not Apex-related per se) Implement the translation of
>> Read.from(UnboundedSource) via impulse, which will require implementing an
>> SDF that reads from a given UnboundedSource (taking the UnboundedSource as
>> an element). This should be fairly straightforward and will allow all
>> portable runners to take advantage of existing UnboundedSource's.
>>
>>
>> On Fri, Mar 23, 2018 at 3:08 PM Lukasz Cwik  wrote:
>>
>>> Using impulse is a precursor for both bounded and unbounded SDF.
>>>
>>> This JIRA represents the work that would be to add support for unbounded
>>> SDF using portability APIs:
>>> https://issues.apache.org/jira/browse/BEAM-2939
>>>
>>>
>>> On Fri, Mar 23, 2018 at 11:46 AM Thomas Weise  wrote:
>>>
 So for streaming, we will need the Impulse translation for bounded
 input, identical with batch, and then in addition to that support for SDF?

 Any pointers what's involved in adding the SDF support? Is it runner
 specific? Does the ULR cover it?


 On Fri, Mar 23, 2018 at 11:26 AM, Lukasz Cwik  wrote:

> All "sources" in portability will use splittable DoFns for execution.
>
> Specifically, runners will need to be able to checkpoint unbounded
> sources to get a minimum viable pipeline working.
> For bounded pipelines, a DoFn can read the contents of a bounded
> source.
>
>
> On Fri, Mar 23, 2018 at 10:52 AM Thomas Weise  wrote:
>
>> Hi,
>>
>> I'm looking at the portable pipeline translation for streaming. I
>> understand that for batch pipelines, it is sufficient to translate 
>> Impulse.
>>
>> What is the intended path to support unbounded sources?
>>
>> The goal here is to get a minimum translation working that will allow
>> streaming wordcount execution.
>>
>> Thanks,
>> Thomas
>>
>>



Re: [DISCUSS] Automation for Java code formatting

2018-06-27 Thread Łukasz Gajowy
+1 to automating this with Gradle and never have problems with
formatting/forgetting about it again! :)



śr., 27 cze 2018 o 07:40 Tim Robertson 
napisał(a):

> ++1
>
>
> On Wed, Jun 27, 2018 at 7:36 AM, Ahmet Altay  wrote:
>
>> +1
>>
>> This is great idea. Does anyone know a similar tool for python? I believe
>> go already has this as part of its tools with go fmt.
>>
>>
>> On Tue, Jun 26, 2018 at 9:55 PM, Ankur Goenka  wrote:
>>
>>> +1
>>>
>>> Intellij can help but still formatting is an additional thing to keep in
>>> mind. Enabling auto formatting at gradle level will remove this additional
>>> thing to keep in mind.
>>>
>>> On Tue, Jun 26, 2018 at 9:49 PM Eugene Kirpichov 
>>> wrote:
>>>
 +1!

 In some cases the temptation to format code manually can be quite
 strong, but the ease of just re-running the formatter after any change
 (especially after global changes like class/method renames) overweighs it.
 I lost count of the times when I wasted a precommit because some line
 became >100 characters after a refactoring. I especially love that there's
 a gradle task that does this for you - I used to manually run
 google-java-format-diff.

 On Tue, Jun 26, 2018 at 9:38 PM Rafael Fernandez 
 wrote:

> +1! Remove guesswork :D
>
>
>
> On Tue, Jun 26, 2018 at 9:15 PM Kenneth Knowles 
> wrote:
>
>> Hi all,
>>
>> I like readable code, but I don't like formatting it myself. And I
>> _really_ don't like discussing in code review. "Spotless" [1] can 
>> enforce -
>> and automatically apply - automatic formatting for Java, Groovy, and some
>> others.
>>
>> This is not about style or wanting a particular layout. This is about
>> automation, contributor experience, and streamlining review
>>
>>  - Contributor experience: MUCH better than checkstyle: error message
>> just says "run ./gradlew :beam-your-module:spotlessApply" instead of
>> telling them to go in and manually edit.
>>
>>  - Automation: You want to use autoformat so you don't have to format
>> code by hand. But if you autoformat a file that was in some other format,
>> then you touch a bunch of unrelated lines. If the file is already
>> autoformatted, it is much better.
>>
>>  - Review: Never talk about code formatting ever again. A PR also
>> needs baseline to already be autoformatted or formatting will make it
>> unclear which lines are really changed.
>>
>> This is already available via applyJavaNature(enableSpotless: true)
>> and it is turned on for SQL and our buildSrc gradle plugins. It is very
>> nice. There is a JIRA [2] to turn it on for the hold code base. 
>> Personally,
>> I think (a) every module could make a different choice if the main
>> contributors feel strongly and (b) it is objectively better to always
>> autoformat :-)
>>
>> WDYT? If we do it, it is trivial to add it module-at-a-time or
>> globally. If someone conflicts with a massive autoformat commit, they can
>> just keep their changes and autoformat them and it is done.
>>
>> Kenn
>>
>> [1] https://github.com/diffplug/spotless/tree/master/plugin-gradle
>> [2] https://issues.apache.org/jira/browse/BEAM-4394
>>
>>
>>
>


Build failed in Jenkins: beam_SeedJob_Standalone #1218

2018-06-27 Thread Apache Jenkins Server
See 


Changes:

[xiliu] [BEAM-4640] Samza runner postcommit ValidatesRunner job

[xiliu] Remove unused previous names

--
Started by timer
[EnvInject] - Loading node environment variables.
Building remotely on beam2 (beam) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/* 
 > +refs/pull/${ghprbPullId}/*:refs/remotes/origin/pr/${ghprbPullId}/*
 > git rev-parse origin/master^{commit} # timeout=10
Checking out Revision 3232a293696b898967aec600d57105216f2c1bfb (origin/master)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 3232a293696b898967aec600d57105216f2c1bfb
Commit message: "Merge pull request #5758: [BEAM-4640] Samza runner postcommit 
ValidatesRunner job"
 > git rev-list --no-walk 45aa4ce7e9cc12a694e82e24d8793fe15c616b5e # timeout=10
Cleaning workspace
 > git rev-parse --verify HEAD # timeout=10
Resetting working tree
 > git reset --hard # timeout=10
 > git clean -fdx # timeout=10
Processing DSL script job_00_seed.groovy
Processing DSL script job_Dependency_Check.groovy
Processing DSL script job_Inventory.groovy
Processing DSL script job_PerformanceTests_Dataflow.groovy
Processing DSL script job_PerformanceTests_FileBasedIO_IT.groovy
Processing DSL script job_PerformanceTests_FileBasedIO_IT_HDFS.groovy
Processing DSL script job_PerformanceTests_HadoopInputFormat.groovy
Processing DSL script job_PerformanceTests_JDBC.groovy
Processing DSL script job_PerformanceTests_MongoDBIO_IT.groovy
Processing DSL script job_PerformanceTests_Python.groovy
Processing DSL script job_PerformanceTests_Spark.groovy
Processing DSL script job_PostCommit_Go_GradleBuild.groovy
Processing DSL script job_PostCommit_Java_GradleBuild.groovy
Processing DSL script job_PostCommit_Java_ValidatesRunner_Apex.groovy
Processing DSL script job_PostCommit_Java_ValidatesRunner_Dataflow.groovy
Processing DSL script job_PostCommit_Java_ValidatesRunner_Flink.groovy
Processing DSL script job_PostCommit_Java_ValidatesRunner_Gearpump.groovy
Processing DSL script job_PostCommit_Java_ValidatesRunner_Samza.groovy
ERROR: startup failed:
job_PostCommit_Java_ValidatesRunner_Samza.groovy: 20: unable to resolve class 
JobBuilder
 @ line 20, column 1.
   import JobBuilder
   ^

1 error

Not sending mail to unregistered user xi...@xiliu-ld1.linkedin.biz


Re: Using user developped source in streamline python

2018-06-27 Thread Sebastien Morand
Hi,

Thanks for your answer.

Ok looking forward and ready to test alpha on this. Because we have
actually some use cases to send data to CloudSQL or Spanner or BigTable or
Firestore. As far as I read the documentation, there is no native support
for them and so we have already implemented a custom source support.

So is there any work in progress for any of the 4 sinks above for native
support in the python SDK?

Regards,

*Sébastien MORAND*
Team Lead Solution Architect
Technology & Operations / Digital Factory
Veolia - Group Information Systems & Technology (IS)
Cell.: +33 6 12 03 41 15 / Direct: +33 1 85 57 71 08
Bureau 0144C (Ouest)
30, rue Madeleine-Vionnet - 93300 Aubervilliers, France
*www.veolia.com *







On Thu, 21 Jun 2018 at 16:53, Lukasz Cwik  wrote:

> +dev@beam.apache.org
>
> Python streaming custom source support will be available via
> SplittableDoFn. It is actively being worked on by a few contributors but to
> my knowledge there is no roadmap yet for having support for this for
> Dataflow.
>
>
> On Thu, Jun 21, 2018 at 1:19 AM Sebastien Morand <
> sebastien.mor...@veolia.com> wrote:
>
>> Hi,
>>
>> We need to setup streaming dataflow in python using developed source (in
>> the opposite of native source) for Cloud SQL integration and Firestore
>> integration.
>>
>> This is currently not supported as far as I understood the documentation,
>> can you confirm? Any roadmap on the topic?
>>
>> Thanks by advance,
>> Regards,
>>
>> *Sébastien MORAND*
>> Team Lead Solution Architect
>> Technology & Operations / Digital Factory
>> Veolia - Group Information Systems & Technology (IS)
>> Cell.: +33 6 12 03 41 15 / Direct: +33 1 85 57 71 08
>> Bureau 0144C (Ouest)
>> 30, rue Madeleine-Vionnet - 93300 Aubervilliers, France
>> *www.veolia.com *
>> 
>> 
>> 
>> 
>> 
>>
>>
>>
>> 
>> This e-mail transmission (message and any attached files) may contain
>> information that is proprietary, privileged and/or confidential to Veolia
>> Environnement and/or its affiliates and is intended exclusively for the
>> person(s) to whom it is addressed. If you are not the intended recipient,
>> please notify the sender by return e-mail and delete all copies of this
>> e-mail, including all attachments. Unless expressly authorized, any use,
>> disclosure, publication, retransmission or dissemination of this e-mail
>> and/or of its attachments is strictly prohibited.
>>
>> Ce message electronique et ses fichiers attaches sont strictement
>> confidentiels et peuvent contenir des elements dont Veolia Environnement
>> et/ou l'une de ses entites affiliees sont proprietaires. Ils sont donc
>> destines a l'usage de leurs seuls destinataires. Si vous avez recu ce
>> message par erreur, merci de le retourner a son emetteur et de le detruire
>> ainsi que toutes les pieces attachees. L'utilisation, la divulgation, la
>> publication, la distribution, ou la reproduction non expressement
>> autorisees de ce message et de ses pieces attachees sont interdites.
>>
>> 
>>
>

-- 



This
 
e-mail transmission (message and any attached files) may contain 
information that is proprietary, privileged and/or confidential to Veolia 
Environnement and/or its affiliates and is intended exclusively for the 
person(s) to whom it is addressed. If you are not the intended recipient, 
please notify the sender by return e-mail and delete all copies of this 
e-mail, including all attachments. Unless expressly authorized, any use, 
disclosure, publication, retransmission or dissemination of this e-mail 
and/or of its attachments is strictly prohibited. 


Ce message 
electronique et ses fichiers attaches sont strictement confidentiels et 
peuvent contenir des elements dont Veolia Environnement et/ou l'une de ses 
entites affiliees sont proprietaires. Ils sont donc destines a l'usage de 
leurs seuls destinataires. Si vous avez recu ce message par erreur, merci 
de le retourner a son emetteur et de le detruire ainsi que toutes les 
pieces attachees. L'utilisation, la divulgation, la publication, la 
distribution, ou la reproduction non expressement autorisees de ce message 
et de ses pieces attachees sont interdites.





Re: Going on leave for a bit

2018-06-27 Thread Łukasz Gajowy
Congrats and enjoy! :)

Best regards,
Łukasz Gajowy

wt., 26 cze 2018 o 20:33 Mikhail Gryzykhin  napisał(a):

> Enjoy time off.
>
> --Mikhail
>
> Have feedback ?
>
>
> On Tue, Jun 26, 2018 at 11:24 AM Xinyu Liu  wrote:
>
>> Congrats! Enjoy the time without sleep.
>>
>> Thanks,
>> Xinyu
>>
>> On Tue, Jun 26, 2018 at 10:12 AM, Griselda Cuevas 
>> wrote:
>>
>>> Enjoy the time off Kenn!
>>>
>>>
>>> On Tue, 26 Jun 2018 at 12:14, Kai Jiang  wrote:
>>>
 Congrats! Enjoy your family time.

 Best,
 Kai

 On Tue, Jun 26, 2018, 09:11 Alan Myrvold  wrote:

> Congrats, Kenn and have a great break.
>
> On Tue, Jun 26, 2018 at 9:10 AM Alexey Romanenko <
> aromanenko@gmail.com> wrote:
>
>> Best wishes to you and your family, Kenneth!
>>
>> Alexey
>>
>> On 26 Jun 2018, at 15:45, Rafael Fernandez 
>> wrote:
>>
>> Have a great time, Kenn!
>>
>> On Tue, Jun 26, 2018 at 5:49 AM Ismaël Mejía 
>> wrote:
>>
>>> Enjoy your family time.
>>>
>>> Best wishes,
>>> Ismael
>>>
>>>
>>> On Tue, Jun 26, 2018 at 12:13 PM Pei HE  wrote:
>>>
 (A late) Congrats for the newborn!
 --
 Pei

 On Tue, Jun 26, 2018 at 1:42 PM, Kenneth Knowles 
 wrote:
 > Hi friends,
 >
 > I think I did not mention on dev@ at the time, but my child #2
 arrived March
 > 14 (Pi day!) and I took some weeks off. Starting ~July 4 I will
 be taking a
 > more significant absence, until ~October 1, trying my best to be
 totally
 > offline.
 >
 > JFYI so that you know why JIRAs and PRs are not being addressed.
 I am also
 > unassigning my JIRAs so that I am not holding any mutexes, and I
 will close
 > PRs so they don't get stale.
 >
 > Any questions or pressing issues, I will be online this week and
 a little
 > bit next week.
 >
 > Kenn

>>>
>>
>>


Build failed in Jenkins: beam_SeedJob #2090

2018-06-27 Thread Apache Jenkins Server
See 


Changes:

[xiliu] [BEAM-4640] Samza runner postcommit ValidatesRunner job

[xiliu] Remove unused previous names

[amaliujia] replace byte[] with String as Map key.

[scott] Update Findbugs annotations dependency to version with

[scott] Add checkstyle validation to ban usage of @SuppressFBWarnings.

--
Started by timer
[EnvInject] - Loading node environment variables.
Building remotely on beam12 (beam) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/* 
 > +refs/pull/${ghprbPullId}/*:refs/remotes/origin/pr/${ghprbPullId}/*
 > git rev-parse origin/master^{commit} # timeout=10
Checking out Revision 3232a293696b898967aec600d57105216f2c1bfb (origin/master)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 3232a293696b898967aec600d57105216f2c1bfb
Commit message: "Merge pull request #5758: [BEAM-4640] Samza runner postcommit 
ValidatesRunner job"
 > git rev-list --no-walk 5c4c81d82d044b25a29ba7c3743cad36662bb37c # timeout=10
Cleaning workspace
 > git rev-parse --verify HEAD # timeout=10
Resetting working tree
 > git reset --hard # timeout=10
 > git clean -fdx # timeout=10
Processing DSL script job_00_seed.groovy
Processing DSL script job_Dependency_Check.groovy
Processing DSL script job_Inventory.groovy
Processing DSL script job_PerformanceTests_Dataflow.groovy
Processing DSL script job_PerformanceTests_FileBasedIO_IT.groovy
Processing DSL script job_PerformanceTests_FileBasedIO_IT_HDFS.groovy
Processing DSL script job_PerformanceTests_HadoopInputFormat.groovy
Processing DSL script job_PerformanceTests_JDBC.groovy
Processing DSL script job_PerformanceTests_MongoDBIO_IT.groovy
Processing DSL script job_PerformanceTests_Python.groovy
Processing DSL script job_PerformanceTests_Spark.groovy
Processing DSL script job_PostCommit_Go_GradleBuild.groovy
Processing DSL script job_PostCommit_Java_GradleBuild.groovy
Processing DSL script job_PostCommit_Java_ValidatesRunner_Apex.groovy
Processing DSL script job_PostCommit_Java_ValidatesRunner_Dataflow.groovy
Processing DSL script job_PostCommit_Java_ValidatesRunner_Flink.groovy
Processing DSL script job_PostCommit_Java_ValidatesRunner_Gearpump.groovy
Processing DSL script job_PostCommit_Java_ValidatesRunner_Samza.groovy
ERROR: startup failed:
job_PostCommit_Java_ValidatesRunner_Samza.groovy: 20: unable to resolve class 
JobBuilder
 @ line 20, column 1.
   import JobBuilder
   ^

1 error

Not sending mail to unregistered user xi...@xiliu-ld1.linkedin.biz
Not sending mail to unregistered user amaliu...@163.com


Re: Auto closing stale PRs label

2018-06-27 Thread Alan Myrvold
Glad to see it is working. The requests currently marked stale can be found
with https://github.com/apache/beam/labels/stale

On Tue, Jun 26, 2018 at 9:34 PM Rafael Fernandez 
wrote:

> Neat! Thanks for showing me where the options are.
>
> On Tue, Jun 26, 2018 at 7:24 PM Kenneth Knowles  wrote:
>
>> That's actually already how it works. We can configure how long it waits
>> after the message. Currently it is set for 60 day to stale and then 7 days
>> to close. You can see the options we've set up here; there may be more:
>> https://github.com/apache/beam/blob/master/.github/stale.yml
>>
>> Kenn
>>
>> On Tue, Jun 26, 2018 at 6:42 PM Rafael Fernandez 
>> wrote:
>>
>>> The new label makes sense to me, but Ismael: I want to make sure your
>>> concern is fully addressed. I see your point about making sure we are not
>>> shutting the door on a small fix that perhaps went unatended for benign
>>> reasons. Perhaps a step before closure is feasble? something like getting a
>>> nice message in the PR, "Ahoy! This PR hasn't moved in [X time]. If you're
>>> still working on it, can you comment? Otherwise, our highly sophisticated
>>> AI will declutter and close it in [Y days]".
>>>
>>> Thoughts?
>>>
>>>
>>> On Mon, Jun 25, 2018 at 8:23 AM Kenneth Knowles  wrote:
>>>
 Totally agree.

 By the way, these seem to be default labels for issue tracking. So I
 got rid of the ones that don't seem to make sense. Any committer can hack
 them I think. I just left "stale" for this purpose and "help wanted" since
 that makes sense on a PR. But probably we don't need any since we don't
 have a plan for them.

 Kenn

 On Mon, Jun 25, 2018 at 8:12 AM Ismaël Mejía  wrote:

> Thanks Kenn, much better.
>
> Yes closing stale PRs is worth, but our ultimate goal should be to get
> contributions in so we should keep in mind and try when it is worth to
> rescue fixes that can be lost  because of minor review issues or
> contributor inactivity.
>
> On Mon, Jun 25, 2018 at 4:23 PM Kenneth Knowles 
> wrote:
>
>> It is configured by just a file so alteration is very transparent. I
>> agree with your point about the label. I made a new one for it. Here:
>> https://github.com/apache/beam/pull/5750
>>
>> So far I have been satisfied that it close many _very_ stale PRs. I
>> have been watching it and didn't see any that seemed wrong.
>>
>> Kenn
>>
>>
>> On Mon, Jun 25, 2018 at 12:52 AM Ismaël Mejía 
>> wrote:
>>
>>> I saw some PRs auto closed recently and I was wondering if we could
>>> adjust the  label that is added to the autoclosed PRs, currently it
>>> is
>>> 'wontfix' but this label sends a fake (and negative) message. Can we
>>> parametrize the bot to put something closer to the intention like
>>> 'autoclosed'?
>>>
>>> Who can take care of this?
>>> Any other opinion/suggestion after these first days of the stale bot?
>>>
>>> I have the impression that the time between the staleness warning and
>>> the close is relatively short, of course PRs can be reopened but we
>>> (committers) should pay attention that a PR that is marked as stale
>>> is
>>> not stale because of unfinished reviews.
>>>
>>