Re: [ANNOUNCE] New committer announcement: Mark Liu

2019-03-25 Thread Jason Kuster
Wonderful, congrats Mark!

On Mon, Mar 25, 2019 at 11:30 AM Alan Myrvold  wrote:

> congratulations, Mark!!!
>
> On Mon, Mar 25, 2019 at 10:05 AM Ruoyun Huang  wrote:
>
>> Congratulations Mark!
>>
>> On Mon, Mar 25, 2019 at 9:31 AM Udi Meiri  wrote:
>>
>>> Congrats Mark!
>>>
>>> On Mon, Mar 25, 2019 at 9:24 AM Ahmet Altay  wrote:
>>>
>>>> Congratulations, Mark! 
>>>>
>>>> On Mon, Mar 25, 2019 at 7:24 AM Tim Robertson <
>>>> timrobertson...@gmail.com> wrote:
>>>>
>>>>> Congratulations Mark!
>>>>>
>>>>>
>>>>> On Mon, Mar 25, 2019 at 3:18 PM Michael Luckey 
>>>>> wrote:
>>>>>
>>>>>> Nice! Congratulations, Mark.
>>>>>>
>>>>>> On Mon, Mar 25, 2019 at 2:42 PM Katarzyna Kucharczyk <
>>>>>> ka.kucharc...@gmail.com> wrote:
>>>>>>
>>>>>>> Congratulations, Mark! 
>>>>>>>
>>>>>>> On Mon, Mar 25, 2019 at 11:24 AM Gleb Kanterov 
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Congratulations!
>>>>>>>>
>>>>>>>> On Mon, Mar 25, 2019 at 10:23 AM Łukasz Gajowy 
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Congrats! :)
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Mon, Mar 25, 2019 at 08:11 Aizhamal Nurmamat kyzy <
>>>>>>>>> aizha...@google.com> wrote:
>>>>>>>>>
>>>>>>>>>> Congratulations, Mark!
>>>>>>>>>>
>>>>>>>>>> On Sun, Mar 24, 2019 at 23:18 Pablo Estrada 
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Yeaah  Mark! : ) Congrats : D
>>>>>>>>>>>
>>>>>>>>>>> On Sun, Mar 24, 2019 at 10:32 PM Yifan Zou 
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Congratulations Mark!
>>>>>>>>>>>>
>>>>>>>>>>>> On Sun, Mar 24, 2019 at 10:25 PM Connell O'Callaghan <
>>>>>>>>>>>> conne...@google.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Well done congratulations Mark!!!
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Sun, Mar 24, 2019 at 10:17 PM Robert Burke <
>>>>>>>>>>>>> rob...@frantil.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Congratulations Mark! 
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Sun, Mar 24, 2019, 10:08 PM Valentyn Tymofieiev <
>>>>>>>>>>>>>> valen...@google.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Congratulations, Mark!
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks for your contributions, in particular for your
>>>>>>>>>>>>>>> efforts to parallelize test execution for Python SDK and
>>>>>>>>>>>>>>> increase the speed of Python precommit checks.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Sun, Mar 24, 2019 at 9:40 PM Kenneth Knowles <
>>>>>>>>>>>>>>> k...@apache.org> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Please join me and the rest of the Beam PMC in welcoming a
>>>>>>>>>>>>>>>> new committer: Mark Liu.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Mark has been contributing to Beam since late 2016! He has
>>>>>>>>>>>>>>>> proposed 100+ pull requests. Mark was instrumental in
>>>>>>>>>>>>>>>> expanding test and infrastructure coverage, especially for
>>>>>>>>>>>>>>>> Python. In consideration of Mark's contributions, the Beam PMC
>>>>>>>>>>>>>>>> trusts Mark with the responsibilities of a Beam committer [1].
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thank you, Mark, for your contributions.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> [1] https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> --
>>>>>>>>>>
>>>>>>>>>> *Aizhamal Nurmamat kyzy*
>>>>>>>>>>
>>>>>>>>>> Open Source Program Manager
>>>>>>>>>>
>>>>>>>>>> 646-355-9740 Mobile
>>>>>>>>>>
>>>>>>>>>> 601 North 34th Street, Seattle, WA 98103
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Cheers,
>>>>>>>> Gleb
>>>>>>>>
>>>>>>>
>>
>> --
>> 
>> Ruoyun  Huang
>>
>>

-- 
---
Jason Kuster
Apache Beam / Google Cloud Dataflow

See something? Say something. go/jasonkuster-feedback


Re: Add code quality checks to pre-commits.

2019-01-07 Thread Jason Kuster
tests and coverage reporting
>> > together properly. Last thing: How is "technical debt" measured? I'm
>> > skeptical of quantitative measures for qualitative notions.
>> >
>> > Kenn
>> >
>> > On Thu, Jan 3, 2019 at 1:58 PM Heejong Lee <heej...@google.com
>> > <mailto:heej...@google.com>> wrote:
>> >
>> > I don't have any experience of using SonarQube but Coverity
>> > worked well for me. Looks like it already has beam repo:
>> > https://scan.coverity.com/projects/11881
>> >
>> > On Thu, Jan 3, 2019 at 1:27 PM Reuven Lax <re...@google.com
>> > <mailto:re...@google.com>> wrote:
>> >
>> > checkstyle and findbugs are already run as precommit checks,
>> > are they not?
>> >
>> > On Thu, Jan 3, 2019 at 7:19 PM Mikhail Gryzykhin
>> > <mig...@google.com <mailto:mig...@google.com>> wrote:
>> >
>> > Hi everyone,
>> >
>> > In our current builds we (can) run multiple code quality
>> > check tools such as checkstyle, findbugs, and code test
>> > coverage via Cobertura. However, we do not utilize many
>> > of those signals.
>> >
>> > I suggest adding requirements to code based on those
>> > tools. Specifically, I suggest adding pre-commit checks
>> > that will require PRs to conform to some quality checks.
>> >
>> > We can see a good example of thresholds to add in the Apache
>> > SonarQube provided default quality gate config
>> > <https://builds.apache.org/analysis/quality_gates/show/1>:
>> > 80% test coverage on new code,
>> > 5% technical debt on new code,
>> > No bugs/Vulnerabilities added.
>> >
>> > As another part of this proposal, I want to suggest the
>> > use of SonarQube for tracking code statistics and as an
>> > agent for enforcing code quality thresholds. It is an
>> > Apache-provided tool that has integration with Jenkins
>> > or Gradle via plugins.
>> >
>> > I believe some reporting to SonarQube was configured for
>> > mvn builds of some Beam sub-projects, but was lost
>> > during the migration to Gradle.
>> >
>> > I was looking for other options, but so far found only
>> > general configs for Gradle builds that will fail the build if
>> > code coverage for the project is too low. Such an approach will
>> > force us to backfill tests for all existing code, which
>> > can be tedious and demands learning all the legacy code
>> > that might not be part of current work.
>> >
>> > I suggest we discuss and come to a conclusion on two
>> > points in this thread:
>> > 1. Do we want to add code quality checks to our
>> > pre-commit jobs and require them to pass before a PR is
>> > merged?
>> >
>> > Suggested: Add the code quality checks listed above at
>> > first, adjust them as we see fit in the future.
>> >
>> > 2. What tools do we want to utilize for analyzing code
>> > quality?
>> >
>> > Under discussion. Suggested: SonarQube, but it will
>> > depend on the functionality level we want to achieve.
>> >
>> >
>> > Regards,
>> > --Mikhail
>> >
>> >
>> >
>> > --
>> >
>> >
>> >
>> >
>> > Got feedback? tinyurl.com/swegner-feedback
>> > <https://tinyurl.com/swegner-feedback>
>> >
>>
>

-- 
---
Jason Kuster
Apache Beam / Google Cloud Dataflow

See something? Say something. go/jasonkuster-feedback
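
The three thresholds quoted in this thread from the default SonarQube quality gate (80% test coverage on new code, 5% technical debt on new code, no bugs or vulnerabilities added) could be checked roughly as in the sketch below. It is only an illustration under assumptions: `NewCodeMetrics` and `quality_gate_failures` are hypothetical names, not SonarQube's API, and the 5% figure is read here as a ceiling on the technical-debt ratio of new code.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class NewCodeMetrics:
    """Hypothetical per-PR metrics; not a SonarQube type."""
    coverage_pct: float       # test coverage on new code, 0-100
    debt_ratio_pct: float     # technical-debt ratio on new code, 0-100
    new_bugs: int             # bugs introduced by the change
    new_vulnerabilities: int  # vulnerabilities introduced by the change


def quality_gate_failures(m: NewCodeMetrics) -> List[str]:
    """Return the list of violated thresholds; an empty list means the gate passes."""
    failures = []
    if m.coverage_pct < 80.0:
        failures.append("coverage on new code below 80%")
    if m.debt_ratio_pct > 5.0:
        failures.append("technical debt on new code above 5%")
    if m.new_bugs > 0 or m.new_vulnerabilities > 0:
        failures.append("new bugs or vulnerabilities added")
    return failures
```

Because only metrics for new code are inspected, a gate of this shape would not force a backfill of tests for legacy code, which is the concern raised about project-wide coverage thresholds.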


Re: [RFC] I made a new tabbed Beam view in Jenkins

2018-12-18 Thread Jason Kuster
Oh, a good fact! Thanks for the info. :)

On Tue, Dec 18, 2018 at 1:48 PM Alan Myrvold  wrote:

> The _Cron variants of pre-commits are run post-commit, to make it easier
> to tell if the pre-commit is flaky or broken.
>
> On Tue, Dec 18, 2018 at 1:40 PM Jason Kuster 
> wrote:
>
>> This looks great! (also it looks like some precommits snuck into the
>> postcommit view)
>>
>> On Tue, Dec 18, 2018 at 1:25 PM Alan Myrvold  wrote:
>>
>>> It does look much better!
>>>
>>> On Tue, Dec 18, 2018 at 1:10 PM Ahmet Altay  wrote:
>>>
>>>> I like this version, it looks cleaner than the current combined view.
>>>>
>>>> On Tue, Dec 18, 2018 at 12:53 PM Scott Wegner 
>>>> wrote:
>>>>
>>>>> Very cool. I also didn't realize we had control over the Jenkins
>>>>> "views".
>>>>>
>>>>> We currently lack a decent dashboard to monitor the build health
>>>>> across Beam jenkins jobs and triage failures; this is a step in the right
>>>>> direction.
>>>>>
>>>>> I haven't played with Jenkins views before, but it appears they can be
>>>>> managed via the Job DSL similar to our job definitions [1]:
>>>>>
>>>>> > The DSL execution engine exposes several methods to create Jenkins
>>>>> > jobs, views, folders and config files. [..]
>>>>>
>>>>> It would be cool to integrate this into our job config in such a way
>>>>> that we could automatically keep the views up-to-date as jobs are added or
>>>>> renamed.
>>>>>
>>>>> [1] https://github.com/jenkinsci/job-dsl-plugin/wiki/Job-DSL-Commands
>>>>>
>>>>> On Tue, Dec 18, 2018 at 12:35 PM Anton Kedin  wrote:
>>>>>
>>>>>> This is really helpful, didn't realize it was possible. Categories
>>>>>> and contents look reasonable. I think something like this definitely
>>>>>> should be the top-level Beam view.
>>>>>>
>>>>>> Regards,
>>>>>> Anton
>>>>>>
>>>>>> On Tue, Dec 18, 2018 at 12:05 PM Kenneth Knowles 
>>>>>> wrote:
>>>>>>
>>>>>>> Hi all,
>>>>>>>
>>>>>>> I made a new view to split Beam builds into tabs:
>>>>>>> https://builds.apache.org/view/A-D/view/Beam%20Nested/
>>>>>>>
>>>>>>>  - PostCommit tab includes PostCommit and "PreCommit_.*_Cron"
>>>>>>> because these are actually post-commit jobs; it is a feature not a bug.
>>>>>>>  - PreCommit tab includes jobs that have no meaningful history
>>>>>>> because they are just against PRs, commits, phrase triggering
>>>>>>>  - Inventory self-explanatory
>>>>>>>  - PerformanceTests self-explanatory
>>>>>>>  - All; I didn't want to keep making categories but just send this
>>>>>>> for feedback
>>>>>>>
>>>>>>> WDYT about making this the top-level Beam view? (vs
>>>>>>> https://builds.apache.org/view/A-D/view/Beam/)
>>>>>>>
>>>>>>> After that, maybe we could clean the categories so they fit into the
>>>>>>> tabs more easily with fewer regexes (to make sure things don't get
>>>>>>> missed). I have read also that if you use / instead of _ as a
>>>>>>> separator in a name then Jenkins will display jobs as nested in
>>>>>>> folders automatically. Not sure it actually results in a better view;
>>>>>>> haven't tried it.
>>>>>>>
>>>>>>> Kenn
>>>>>>>
>>>>>>
>>>>>
>>>>> --
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> Got feedback? tinyurl.com/swegner-feedback
>>>>>
>>>>
>>
>> --
>> ---
>> Jason Kuster
>> Apache Beam / Google Cloud Dataflow
>>
>> See something? Say something. go/jasonkuster-feedback
>> <https://goto.google.com/jasonkuster-feedback>
>>
>

-- 
---
Jason Kuster
Apache Beam / Google Cloud Dataflow

See something? Say something. go/jasonkuster-feedback
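
The tab rules described in this thread, including the point that "PreCommit_.*_Cron" jobs belong under PostCommit, can be sketched as an ordered list of regexes. This is a rough illustration, not the actual Jenkins view configuration: `TAB_PATTERNS` and `tab_for` are hypothetical names, the patterns other than "PreCommit_.*_Cron" are assumptions, and picking the first matching tab is a simplification (in Jenkins a job can appear in several views, and the "All" tab holds every job).

```python
import re

# Order matters: the PostCommit rule must be tried before the plain
# PreCommit rule so that PreCommit_*_Cron jobs land in PostCommit.
TAB_PATTERNS = [
    ("PostCommit", re.compile(r"PostCommit|PreCommit_.*_Cron")),
    ("PreCommit", re.compile(r"PreCommit")),
    ("Inventory", re.compile(r"Inventory")),
    ("PerformanceTests", re.compile(r"PerformanceTests")),
]


def tab_for(job_name: str) -> str:
    """Return the first tab whose pattern matches, falling back to 'All'."""
    for tab, pattern in TAB_PATTERNS:
        if pattern.search(job_name):
            return tab
    return "All"
```

With job names shaped like `beam_PreCommit_Java_Cron` (an illustrative name), the Cron variant resolves to PostCommit while `beam_PreCommit_Java_Commit` stays under PreCommit, which matches the "feature not a bug" behavior described above.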


Re: [RFC] I made a new tabbed Beam view in Jenkins

2018-12-18 Thread Jason Kuster
This looks great! (also it looks like some precommits snuck into the
postcommit view)

On Tue, Dec 18, 2018 at 1:25 PM Alan Myrvold  wrote:

> It does look much better!
>
> On Tue, Dec 18, 2018 at 1:10 PM Ahmet Altay  wrote:
>
>> I like this version, it looks cleaner than the current combined view.
>>
>> On Tue, Dec 18, 2018 at 12:53 PM Scott Wegner  wrote:
>>
>>> Very cool. I also didn't realize we had control over the Jenkins "views".
>>>
>>> We currently lack a decent dashboard to monitor the build health across
>>> Beam jenkins jobs and triage failures; this is a step in the right
>>> direction.
>>>
>>> I haven't played with Jenkins views before, but it appears they can be
>>> managed via the Job DSL similar to our job definitions [1]:
>>>
>>> > The DSL execution engine exposes several methods to create Jenkins
>>> > jobs, views, folders and config files. [..]
>>>
>>> It would be cool to integrate this into our job config in such a way
>>> that we could automatically keep the views up-to-date as jobs are added or
>>> renamed.
>>>
>>> [1] https://github.com/jenkinsci/job-dsl-plugin/wiki/Job-DSL-Commands
>>>
>>> On Tue, Dec 18, 2018 at 12:35 PM Anton Kedin  wrote:
>>>
>>>> This is really helpful, didn't realize it was possible. Categories and
>>>> contents look reasonable. I think something like this definitely should be
>>>> the top-level Beam view.
>>>>
>>>> Regards,
>>>> Anton
>>>>
>>>> On Tue, Dec 18, 2018 at 12:05 PM Kenneth Knowles 
>>>> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> I made a new view to split Beam builds into tabs:
>>>>> https://builds.apache.org/view/A-D/view/Beam%20Nested/
>>>>>
>>>>>  - PostCommit tab includes PostCommit and "PreCommit_.*_Cron" because
>>>>> these are actually post-commit jobs; it is a feature not a bug.
>>>>>  - PreCommit tab includes jobs that have no meaningful history because
>>>>> they are just against PRs, commits, phrase triggering
>>>>>  - Inventory self-explanatory
>>>>>  - PerformanceTests self-explanatory
>>>>>  - All; I didn't want to keep making categories but just send this for
>>>>> feedback
>>>>>
>>>>> WDYT about making this the top-level Beam view? (vs
>>>>> https://builds.apache.org/view/A-D/view/Beam/)
>>>>>
>>>>> After that, maybe we could clean the categories so they fit into the
>>>>> tabs more easily with fewer regexes (to make sure things don't get
>>>>> missed). I have read also that if you use / instead of _ as a separator
>>>>> in a name then Jenkins will display jobs as nested in folders
>>>>> automatically. Not sure it actually results in a better view; haven't
>>>>> tried it.
>>>>>
>>>>> Kenn
>>>>>
>>>>
>>>
>>> --
>>>
>>>
>>>
>>>
>>> Got feedback? tinyurl.com/swegner-feedback
>>>
>>

-- 
---
Jason Kuster
Apache Beam / Google Cloud Dataflow

See something? Say something. go/jasonkuster-feedback


Re: [GitHub] aaltay commented on issue #1: Add .gitmodules file

2018-08-14 Thread Jason Kuster
It's probably an INFRA ticket to change the mailing list preferences.

On Tue, Aug 14, 2018 at 8:47 AM Boyuan Zhang  wrote:

> Hey Jason,
>
> I'm going to figure out this problem today. Do you have idea what should
> we do or where should we start?
>
> Boyuan
>
> On Tue, Aug 14, 2018 at 1:15 AM Jason Kuster 
> wrote:
>
>> Looks like these are getting sent to dev. Is there a per-project config
>> we need to tweak or a ticket we need to file in order to get these sent to
>> commits@ instead?
>> On Mon, Aug 13, 2018 at 6:17 PM GitBox  wrote:
>>
>>> aaltay commented on issue #1: Add .gitmodules file
>>> URL: https://github.com/apache/beam-wheels/pull/1#issuecomment-412720273
>>>
>>>
>>>LGTM
>>>
>>> 
>>> This is an automated message from the Apache Git Service.
>>> To respond to the message, please log on GitHub and use the
>>> URL above to go to the specific comment.
>>>
>>> For queries about this service, please contact Infrastructure at:
>>> us...@infra.apache.org
>>>
>>>
>>> With regards,
>>> Apache Git Services
>>>
>>
>>
>> --
>> ---
>> Jason Kuster
>> Apache Beam / Google Cloud Dataflow
>>
>> See something? Say something. go/jasonkuster-feedback
>> <https://goto.google.com/jasonkuster-feedback>
>>
>

-- 
---
Jason Kuster
Apache Beam / Google Cloud Dataflow

See something? Say something. go/jasonkuster-feedback


Re: [GitHub] aaltay commented on issue #1: Add .gitmodules file

2018-08-14 Thread Jason Kuster
Looks like these are getting sent to dev. Is there a per-project config we
need to tweak or a ticket we need to file in order to get these sent to
commits@ instead?
On Mon, Aug 13, 2018 at 6:17 PM GitBox  wrote:

> aaltay commented on issue #1: Add .gitmodules file
> URL: https://github.com/apache/beam-wheels/pull/1#issuecomment-412720273
>
>
>LGTM
>
> 
> This is an automated message from the Apache Git Service.
> To respond to the message, please log on GitHub and use the
> URL above to go to the specific comment.
>
> For queries about this service, please contact Infrastructure at:
> us...@infra.apache.org
>
>
> With regards,
> Apache Git Services
>


-- 
---
Jason Kuster
Apache Beam / Google Cloud Dataflow

See something? Say something. go/jasonkuster-feedback


FYI: Jenkins Upgrade

2018-07-23 Thread Jason Kuster
Hi all,

As an FYI: the Jenkins master got an upgrade to a much beefier machine over
the weekend. Per Infra (on bui...@apache.org):

> Jenkins has been migrated to a brand new 48 core machine with 128GB RAM
> and 3.5TB of NVMe storage. Initial testing looks good, and builds are
> proceeding as expected. We hope this will provide a significant
> improvement in user experience on the front-end, and increase the
> throughput of builds.
>
> Please note that the 3.5TB of NVMe is less than what the previous jenkins
> master had. It is critical that all projects aggressively trim their
> retained builds in order for this shared resource to continue to function
> optimally.

Anecdotally, the Jenkins UI feels much faster to me at this point, which is
exciting!
If anyone sees issues with e.g. scripts, access, builds queueing, or other
Jenkins-related things please ping on this thread and we can put together a
set of escalations to Infra.

As I recall we are not one of the projects causing problems with retained
builds (I think we generally keep artifacts for <1mo) but it's worth noting
the paragraph about storage space on Jenkins. If storage constraints give
Infra problems they may implement a global retention window which would
affect us at that point.

Best,

Jason

-- 
---
Jason Kuster
Apache Beam / Google Cloud Dataflow

See something? Say something. go/jasonkuster-feedback


Re: Automatically create JIRA tickets for failing post-commit tests

2018-07-13 Thread Jason Kuster
>>>>>>> issues
>>>>>>> are automatically created without a plan for triage.
>>>>>>>
>>>>>>> Andrew
>>>>>>>
>>>>>>> On Wed, Jul 11, 2018 at 4:12 PM Rui Wang  wrote:
>>>>>>>
>>>>>>>> Maybe this is also a good thread to start the discussion that if we
>>>>>>>> want to enforce postcommit test for every PR.
>>>>>>>>
>>>>>>>> Can we afford the cost of longer waiting time to catch potential
>>>>>>>> bugs?
>>>>>>>>
>>>>>>>> -Rui
>>>>>>>>
>>>>>>>> On Wed, Jul 11, 2018 at 4:04 PM Mikhail Gryzykhin <
>>>>>>>> mig...@google.com> wrote:
>>>>>>>>
>>>>>>>>> That's a valid point.
>>>>>>>>>
>>>>>>>>> Unfortunately, the JiraTestResultReporter plugin does not have
>>>>>>>>> features to dynamically assign owners. Additionally, I don't think it
>>>>>>>>> is always easy to find the proper owner for post-commit tests at first
>>>>>>>>> glance, since they usually cover a broad spectrum of issues.
>>>>>>>>>
>>>>>>>>> My assumption is that we need someone to triage new issues.
>>>>>>>>>
>>>>>>>>> Ideally, any contributor who sees a failing test should check
>>>>>>>>> unassigned tickets and either do triage, or assign them to someone
>>>>>>>>> who can. I strongly encourage this approach.
>>>>>>>>>
>>>>>>>>> We have a couple of other ready-made options to consider:
>>>>>>>>> 1. We can configure a JIRA component owner who would be assigned to
>>>>>>>>> created tickets.
>>>>>>>>> 2. The JiraTestResultReporter plugin can assign tickets to a specific
>>>>>>>>> user. This is configured per Jenkins job. We can utilize this if
>>>>>>>>> someone volunteers.
>>>>>>>>> 3. Dynamic assignment will most likely require a custom solution.
>>>>>>>>>
>>>>>>>>> --Mikhail
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Wed, Jul 11, 2018 at 3:34 PM Andrew Pilloud <
>>>>>>>>> apill...@google.com> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Mikhail,
>>>>>>>>>>
>>>>>>>>>> I like the proposal! Hopefully this can replace the constant
>>>>>>>>>> stream of build failure emails. I noticed one detail seems to be
>>>>>>>>>> missing: How will new issues be assigned to the proper owner? Will
>>>>>>>>>> the tool do this automatically or will we need someone to triage new
>>>>>>>>>> issues?
>>>>>>>>>>
>>>>>>>>>> Andrew
>>>>>>>>>>
>>>>>>>>>> On Wed, Jul 11, 2018 at 3:07 PM Mikhail Gryzykhin <
>>>>>>>>>> mig...@google.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi everyone,
>>>>>>>>>>>
>>>>>>>>>>> I want to add automatic JIRA ticket creation for failing
>>>>>>>>>>> post-commit tests.
>>>>>>>>>>>
>>>>>>>>>>> I wrote up a design proposal doc with more details on this:
>>>>>>>>>>>
>>>>>>>>>>> https://docs.google.com/document/d/1kpsTy0sYJkLqlZvkPalDkqzBbpu-Wug0z-oWIVPo6UI
>>>>>>>>>>>
>>>>>>>>>>> Quick summary:
>>>>>>>>>>> I suggest utilizing the JiraTestResultReporter plugin.
>>>>>>>>>>> Since this plugin is not installed on our Jenkins yet, we have
>>>>>>>>>>> to ask the Infra team to add it.
>>>>>>>>>>>
>>>>>>>>>>> Please, comment if this approach sounds good to you.
>>>>>>>>>>>
>>>>>>>>>>> Best regards,
>>>>>>>>>>> --Mikhail
>>>>>>>>>>>
>>>>>>>>>>>

-- 
---
Jason Kuster
Apache Beam / Google Cloud Dataflow

See something? Say something. go/jasonkuster-feedback


Re: [ANN] Apache Beam 2.5.0 has been released!

2018-07-03 Thread Jason Kuster
Excellent news, thank you JB and everyone else who helped out with the
release!

On Tue, Jul 3, 2018 at 12:00 AM Etienne Chauchot 
wrote:

> Nice!
>
> Thanks for all the work you did on the release JB !
>
> Etienne
>
> On Sunday, July 1, 2018 at 06:27 +0200, Jean-Baptiste Onofré wrote:
>
> The Apache Beam team is pleased to announce the release of version 2.5.0!
>
> You can download the release here:
>
>   https://beam.apache.org/get-started/downloads/
>
> This release includes the following major new features & improvements:
> - Go SDK support
> - new ParquetIO
> - Build migrated to Gradle (including for the release)
> - Improvements on Nexmark, such as Kafka support
> - Improvements on Beam SQL DSL
> - Improvements on Portability
> - New metrics support pushing generic to all runners
>
> You can take a look at the following blog post and Release Notes for
> details:
>
> https://beam.apache.org/blog/2018/06/26/beam-2.5.0.html
>
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&version=12342847
>
> Enjoy !!
>
> --
> JB on behalf of The Apache Beam team
>
>

-- 
---
Jason Kuster
Apache Beam / Google Cloud Dataflow

See something? Say something. go/jasonkuster-feedback


Re: Is something wrong with our Jenkins? Waiting 2+ hours for tests to start running.

2018-06-29 Thread Jason Kuster
There's been a fairly long-standing discussion regarding merging
ValidatesRunner tests in the same class together into one pipeline; this
would give us more breathing room in terms of Dataflow job quota and may
allow us to run multiple ValidatesRunner suites at the same time.

On Fri, Jun 29, 2018 at 3:49 PM Lukasz Cwik  wrote:

> I believe it was increased so we could run batch and streaming tests in
> parallel but not enough to run multiple Dataflow VR runs in parallel.
>
> On Fri, Jun 29, 2018 at 3:43 PM Reuven Lax  wrote:
>
>> Didn't we recently have the quota increased?
>>
>> On Fri, Jun 29, 2018, 3:40 PM Lukasz Cwik  wrote:
>>
>>> Dataflow VR only runs one at a time due to Dataflow job quota capacity.
>>>
>>> On Fri, Jun 29, 2018 at 3:39 PM Andrew Pilloud 
>>> wrote:
>>>
>>>> Looks like it is only running 1 instance of Dataflow ValidatesRunner at
>>>> a time, and the job takes 2+ hours. The queue is a little backed up.
>>>> https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Gradle_PR/
>>>>
>>>> Andrew
>>>>
>>>> On Fri, Jun 29, 2018 at 3:36 PM Lukasz Cwik  wrote:
>>>>
>>>>> I think the jobs aren't being scheduled or the status is failing to be
>>>>> sent back to Github.
>>>>>
>>>>>
>>>>>
>>>>> On Fri, Jun 29, 2018 at 3:22 PM Reuven Lax  wrote:
>>>>>
>>>>>> Well it's now moving on 2.5 hours, and the tests still haven't even
>>>>>> started to execute. Something seems very wrong.
>>>>>>
>>>>>> On Fri, Jun 29, 2018 at 3:14 PM Andrew Pilloud 
>>>>>> wrote:
>>>>>>
>>>>>>> I don't know what is going on, but I saw the same high latency in
>>>>>>> jobs right after filtered pre-commit was turned on for the first time.
>>>>>>> (I don't have anything to suggest it is related, just a memorable time
>>>>>>> it happened.) I'm also noticing the Jenkins UI is a bit laggy.
>>>>>>>
>>>>>>> Andrew
>>>>>>>
>>>>>>> On Fri, Jun 29, 2018 at 3:09 PM Reuven Lax  wrote:
>>>>>>>
>>>>>>>> In addition to the delay in the trigger phrases, I'm seeing
>>>>>>>> multi-hour (!) delays in running tests.
>>>>>>>>
>>>>>>>> For example: https://github.com/apache/beam/pull/5545
>>>>>>>>
>>>>>>>> I triggered the Dataflow ValidatesRunner tests. The check showed up
>>>>>>>> as yellow with no "Details" link, which generally means it had not yet
>>>>>>>> been scheduled on Jenkins. It stayed like this for two hours, never
>>>>>>>> even starting to run. Apparently there were a ton of idle Jenkins
>>>>>>>> executors at the time, so it's not that all our Jenkins executors were
>>>>>>>> busy. As of now, it still has not started running.
>>>>>>>>
>>>>>>>> Does anybody have any idea what's going on here?
>>>>>>>>
>>>>>>>> Reuven
>>>>>>>>
>>>>>>>

-- 
---
Jason Kuster
Apache Beam / Google Cloud Dataflow

See something? Say something. go/jasonkuster-feedback


Re: Precommits broken?

2018-06-14 Thread Jason Kuster
Having submitted a patch to the ghprb-plugin repo before, I think that
regretfully option (b) is probably the right decision here, given that the
upstream patch is unlikely to be accepted, merged, and released, and the
plugin then updated by Infra, in under a week.

On Wed, Jun 13, 2018 at 10:42 PM Scott Wegner  wrote:

> Indeed, I was going to send out an email about pre-commit filtering, but
> we've already found some kinks and may need to revert it.
>
> The change was submitted in PR#5611 [1] and enables Jenkins triggering to
> only run pre-commits based on modified files. However, Udi noticed that
> this also prevents manually running pre-commits on a PR with trigger
> phrases when your PR changes don't match the pre-commit include path [2].
> This was blocking 2.5.0 release validation, so I have a PR out to revert
> the change [3].
>
> I did some investigation and this is a deficiency in the Jenkins plugin
> used to trigger jobs on pull requests. I've filed a bug [4] and submitted a
> PR [5], but there's no guarantee that it'll get accepted or when it will be
> available.
>
> Question for others: we were hoping to enable pre-commit triggering as an
> optimization to decrease testing wait time and limit the impact of test
> flakiness [6]. But this bug in the plugin means we'd lose the ability to
> manually trigger pre-commits which aren't automatically run. One workaround
> would be to run the tests locally instead of on Jenkins, though that's
> clearly less desirable. Is this a blocker?
>
> Should we:
> (a) Keep pre-commit triggering enabled for now and hope the upstream patch
> gets accepted, or
> (b) Revert the pre-commit change and wait for the patch
>
> Thoughts?
>
> [1] https://github.com/apache/beam/pull/5611
> [2] https://github.com/apache/beam/pull/5607#issuecomment-397080770
> [3] https://github.com/apache/beam/pull/5638
> [4] https://github.com/jenkinsci/ghprb-plugin/issues/678
> [5] https://github.com/jenkinsci/ghprb-plugin/pull/680
> [6]
> https://docs.google.com/document/d/1lfbMhdIyDzIaBTgc9OUByhSwR94kfOzS_ozwKWTVl5U/edit#bookmark=id.6j8bwxnbp7fr
>
>
> On Wed, Jun 13, 2018 at 10:03 PM Rui Wang  wrote:
>
>> Precommit filter is a really coool optimization!
>>
>> -Rui
>>
>> On Wed, Jun 13, 2018 at 5:21 PM Andrew Pilloud 
>> wrote:
>>
>>> Ah, so this is intended and I didn't break anything? Cool! Sorry for the
>>> false alarm, looks like a great build optimization!
>>>
>>> Andrew
>>>
>>> On Wed, Jun 13, 2018 at 5:06 PM Yifan Zou  wrote:
>>>
>>>> Probably due to the precommit filter applied in #5611
>>>> <https://github.com/apache/beam/pull/5611>?
>>>>
>>>> On Wed, Jun 13, 2018 at 5:02 PM Andrew Pilloud 
>>>> wrote:
>>>>
>>>>> Looks like statuses got posted between me writing this email and
>>>>> sending it. Still wondering why the python and go jobs appear to be
>>>>> missing?
>>>>>
>>>>> Andrew
>>>>>
>>>>> On Wed, Jun 13, 2018 at 5:00 PM Andrew Pilloud 
>>>>> wrote:
>>>>>
>>>>>> Recent PRs don't appear to be running all the precommits, and success
>>>>>> status isn't being pushed to PRs. Anyone know what is going on?
>>>>>>
>>>>>> See:
>>>>>> https://github.com/apache/beam/pull/5592
>>>>>> https://github.com/apache/beam/pull/5622
>>>>>>
>>>>>> Andrew
>>>>>>
>>>>>>

-- 
---
Jason Kuster
Apache Beam / Google Cloud Dataflow

See something? Say something. go/jasonkuster-feedback
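
The behavior this thread is asking for (run a pre-commit automatically only when the changed files match its include path, but still allow a manual run via a trigger phrase) can be sketched as below. This is a minimal illustration under assumptions, not the ghprb-plugin's implementation: `should_run`, the glob `sdks/python/*`, and the phrase "Run Python PreCommit" are hypothetical examples.

```python
from fnmatch import fnmatchcase
from typing import Iterable, Optional


def should_run(changed_files: Iterable[str],
               include_globs: Iterable[str],
               comment: Optional[str] = None,
               trigger_phrase: Optional[str] = None) -> bool:
    """Decide whether a pre-commit job should run for a pull request."""
    # A matching trigger phrase in a PR comment always wins, even when
    # no changed file falls under the job's include path.
    if comment and trigger_phrase and trigger_phrase.lower() in comment.lower():
        return True
    # Otherwise run only if some changed file matches some include glob.
    # Note that fnmatch-style '*' also matches '/' characters.
    globs = list(include_globs)
    return any(fnmatchcase(path, glob) for path in changed_files for glob in globs)
```

The deficiency described in the thread corresponds to skipping the first check, so a PR touching only unrelated paths could never trigger the job by phrase.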


Re: Design Proposal: Beam-Site Automation Reliability

2018-06-07 Thread Jason Kuster
Sounds good; I'm really excited about these changes Scott. Thanks for
taking this on!

On Tue, Jun 5, 2018 at 4:00 PM Scott Wegner  wrote:

> Thanks everyone; I've responded to feedback in the doc [1] and I believe
> we've reached consensus. I've added implementation tasks in JIRA
> under BEAM-4493 [2] and will start coding soon. As a recap, the high-level
> plan is:
>
> * Migrate website source code to the main apache/beam repository
> * Discontinue checking-in generated HTML during the PR workflow
> * Align to the existing apache/beam PR process (code review policy,
> precommits, generic Git merge)
> * Filter pre-commit jobs to only run when necessary
> * Add a post-commit Jenkins job to push generated HTML to a separate
> publishing branch
>
> [1] https://s.apache.org/beam-site-automation
> [2] https://issues.apache.org/jira/browse/BEAM-4493
>
> On Fri, Jun 1, 2018 at 10:33 AM Scott Wegner  wrote:
>
>> Pre-commit filtering has come up on previous discussions as well and is
>> an obvious improvement. I've opened BEAM-4445 [1] for this and assigned it
>> to myself.
>>
>> [1] https://issues.apache.org/jira/browse/BEAM-4445
>>
>> On Fri, Jun 1, 2018 at 10:01 AM Kenneth Knowles  wrote:
>>
>>> +1
>>>
>>> Can we separate precommit filtering and get it set up independent from
>>> this? I think there's a lot of good directions to go once it is the norm.
>>>
>>> On Thu, May 31, 2018 at 9:25 PM Thomas Weise  wrote:
>>>
>>>> Very nice, enthusiastic +1
>>>>
>>>> On Thu, May 31, 2018 at 3:24 PM, Scott Wegner 
>>>> wrote:
>>>>
>>>>> Thanks to everyone who reviewed the doc. I put together a plan based
>>>>> on the initial feedback to improve website automation reliability. At a
>>>>> glance, I am proposing to:
>>>>>
>>>>> * Migrate website source code to the main apache/beam repository
>>>>> * Discontinue checking-in generated HTML during the PR workflow
>>>>> * Align to the existing apache/beam PR process (code review policy,
>>>>> precommits, generic Git merge)
>>>>> * Filter pre-commit jobs to only run when necessary
>>>>> * Add a post-commit Jenkins job to push generated HTML to a separate
>>>>> publishing branch
>>>>>
>>>>> Please take another look at the doc, specifically the new section
>>>>> entitled "Proposed Solution":
>>>>> https://s.apache.org/beam-site-automation
>>>>> I'd like to gather feedback by Monday June 4, and if there is
>>>>> consensus move forward with the implementation.
>>>>>
>>>>> Thanks,
>>>>> Scott
>>>>>
>>>>>
>>>>> Got feedback? tinyurl.com/swegner-feedback
>>>>>
>>>>> On Tue, May 29, 2018 at 4:32 PM Scott Wegner 
>>>>> wrote:
>>>>>
>>>>>> I've been looking into the beam-site merge automation reliability,
>>>>>> and I'd like to get some early feedback on ideas for improvement. Please
>>>>>> take a look at https://s.apache.org/beam-site-automation:
>>>>>>
>>>>>> > Apache Beam's website is maintained via the beam-site Git
>>>>>> > repository, with a set of automation that manages the workflow from
>>>>>> > merging a pull request to publishing. The automation is centralized
>>>>>> > in a tool called Mergebot, which was built for Beam and donated to
>>>>>> > the ASF. However, the automation has been somewhat unreliable, and
>>>>>> > when there are issues, very few individuals have the necessary
>>>>>> > permissions and expertise to resolve them. Overall, the reliability
>>>>>> > of Beam-site automation is impeding productivity for Beam-site
>>>>>> > development.
>>>>>>
>>>>>> At this point I'm seeking feedback on a few possible solutions:
>>>>>>
>>>>>> 1. Invest in improvements to Mergebot reliability. Make stability
>>>>>> tweaks for various failure modes, distribute Mergebot expertise and
>>>>>> operations permissions to more committers.
>>>>>> 2. Deprecate Mergebot and revert to manual process. With the current
>>>>>> unreliability, some committers choose to forego merge automation anyway.
>>>>>> 3. Generate HTML only during publishing. This seems to be newly
>>>>>> supported by the Apache GitPubSub workflow. This would eliminate most or
>>>>>> all of the automation that Mergebot is responsible for.
>>>>>>
>>>>>> Feel free to add comments in the doc.
>>>>>>
>>>>>> Thanks,
>>>>>> Scott
>>>>>>
>>>>>>
>>>>>>
>>>>>> Got feedback? tinyurl.com/swegner-feedback
>>>>>>
>>>>>
>>>>

-- 
---
Jason Kuster
Apache Beam / Google Cloud Dataflow

See something? Say something. go/jasonkuster-feedback


Re: [VOTE] Code Review Process

2018-06-01 Thread Jason Kuster
+1

On Fri, Jun 1, 2018 at 11:36 AM Ankur Goenka  wrote:

> +1
>
> On Fri, Jun 1, 2018 at 11:28 AM Charles Chen  wrote:
>
>> +1
>>
>> On Fri, Jun 1, 2018 at 11:20 AM Valentyn Tymofieiev 
>> wrote:
>>
>>> +1
>>>
>>> On Fri, Jun 1, 2018 at 10:40 AM, Ahmet Altay  wrote:
>>>
>>>> +1
>>>>
>>>> On Fri, Jun 1, 2018 at 10:37 AM, Kenneth Knowles 
>>>> wrote:
>>>>
>>>>> +1
>>>>>
>>>>> On Fri, Jun 1, 2018 at 10:25 AM Thomas Groh  wrote:
>>>>>
>>>>>> As we seem to largely have consensus in "Reducing Committer Load for
>>>>>> Code Reviews"[1], this is a vote to change the Beam policy on Code
>>>>>> Reviews to require that
>>>>>>
>>>>>> (1) At least one committer is involved with the code review, as
>>>>>> either a reviewer or as the author
>>>>>> (2) A contributor has approved the change
>>>>>>
>>>>>> prior to merging any change.
>>>>>>
>>>>>> This changes our policy from its current requirement that at least
>>>>>> one committer *who is not the author* has approved the change prior to
>>>>>> merging. We believe that changing this process will improve code review
>>>>>> throughput, reduce committer load, and engage more of the community in
>>>>>> the code review process.
>>>>>>
>>>>>> Please vote:
>>>>>> [ ] +1: Accept the above proposal to change the Beam code
>>>>>> review/merge policy
>>>>>> [ ] -1: Leave the Code Review policy unchanged
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Thomas
>>>>>>
>>>>>> [1]
>>>>>> https://lists.apache.org/thread.html/7c1fde3884fbefacc252b6d4b434f9a9c2cf024f381654aa3e47df18@%3Cdev.beam.apache.org%3E
>>>>>>
>>>>>
>>>>
>>>

-- 
---
Jason Kuster
Apache Beam / Google Cloud Dataflow

See something? Say something. go/jasonkuster-feedback


Re: [VOTE] Use probot/stale to automatically manage stale pull requests

2018-06-01 Thread Jason Kuster
+1 (non-binding): automating policy ensures it is applied fairly and evenly
and lessens the load on project maintainers; hearty agreement.

On Fri, Jun 1, 2018 at 9:25 AM Alan Myrvold  wrote:

> +1 (non-binding) I updated the pull request to be 60 days (instead of 90)
> to match the contribute policy.
>
> On Fri, Jun 1, 2018 at 9:21 AM Kenneth Knowles  wrote:
>
>> Hi all,
>>
>> Following the discussion, please vote on the move to activate
>> probot/stale [3] to notify authors of stale PRs per current policy and
>> then close them after a 7 day grace period.
>>
>> For more details, see:
>>
>>  - our stale PR policy [1]
>>  - the discussion thread [2]
>>  - Probot stale [3]
>>  - BEAM ticket summarizing discussion [4]
>>  - INFRA ticket to activate probot/stale [5]
>>  - Example PR that would activate it [6]
>>
>> Please vote:
>> [ ] +1, Approve that we activate probot/stale
>> [ ] -1, Do not approve (please provide specific comments)
>>
>> Kenn
>>
>> [1] https://beam.apache.org/contribute/#stale-pull-requests
>> [2]
>> https://lists.apache.org/thread.html/bda552ea7073ca165aaf47034610afafe22d589e386525023d33609e@%3Cdev.beam.apache.org%3E
>> [3] https://github.com/probot/stale
>> [4] https://issues.apache.org/jira/browse/BEAM-4423
>> [5] https://issues.apache.org/jira/browse/INFRA-16589
>> [6] https://github.com/apache/beam/pull/5532
>>
>

-- 
---
Jason Kuster
Apache Beam / Google Cloud Dataflow

See something? Say something. go/jasonkuster-feedback


Re: [VOTE] Go SDK

2018-05-21 Thread Jason Kuster
+1! So excited to have gotten to this point -- congrats to all. I've been
excited to do some reviews of the Go SDK since becoming a committer; really
happy about this.

On Mon, May 21, 2018 at 6:03 PM Henning Rohde <hero...@google.com> wrote:

> Hi everyone,
>
> Now that the remaining issues have been resolved as discussed, I'd like to
> propose a formal vote on accepting the Go SDK into master. The main
> practical difference is that the Go SDK would be part of the Apache Beam
> release going forward.
>
> Highlights of the Go SDK:
>  * Go user experience with natively-typed DoFns with (simulated) generic
> types
>  * Covers most of the Beam model: ParDo, GBK, CoGBK, Flatten, Combine,
> Windowing, ..
>  * Includes several IO connectors: Datastore, BigQuery, PubSub,
> extensible textio.
>  * Supports the portability framework for both batch and streaming,
> notably the upcoming portable Flink runner
>  * Supports a direct runner for small batch workloads and testing.
>  * Includes pre-commit tests and post-commit integration tests.
>
> And last but not least
>  *  includes contributions from several independent users and developers,
> notably an IO connector for Datastore!
>
> Website: https://beam.apache.org/documentation/sdks/go/
> Code: https://github.com/apache/beam/tree/master/sdks/go
> Design: https://s.apache.org/beam-go-sdk-design-rfc
>
> Please vote:
> [ ] +1, Approve that the Go SDK becomes an official part of Beam
> [ ] -1, Do not approve (please provide specific comments)
>
> Thanks,
>  The Gophers of Apache Beam
>
>
>

-- 
---
Jason Kuster
Apache Beam / Google Cloud Dataflow

See something? Say something. go/jasonkuster-feedback


Re: I'm back and ready to help grow our community!

2018-05-17 Thread Jason Kuster
Wonderful Gris; warmest congratulations on the milestone and glad to have
you back. :D

On Thu, May 17, 2018 at 2:36 PM Kenneth Knowles <k...@google.com> wrote:

> Congratulations!!
>
> On Thu, May 17, 2018 at 2:21 PM Griselda Cuevas <g...@google.com> wrote:
>
>> Hi Everyone,
>>
>>
>> I was absent from the mailing list, Slack channel and our Beam community
>> for the past six weeks because I took a leave to focus on
>> finishing my Master's degree, which I finally did on May 15th.
>>
>>
>> I graduated as a Master of Engineering in Operations Research with a
>> concentration in Data Science from UC Berkeley. I'm glad to be part of this
>> community and I'd like to share this accomplishment with you so I'm adding
>> two pictures of that day :)
>>
>>
>> Given that I've seen so many new folks around, I'd like to use this
>> opportunity to re-introduce myself. I'm Gris Cuevas and I work at Google.
>> Now that I'm back, I'll continue to work on supporting our community in two
>> main streams: Contribution Experience & Events, Meetups, and Conferences.
>>
>>
>> It's good to be back and I look forward to collaborating with you.
>>
>>
>> Cheers,
>>
>> Gris
>>
>

-- 
---
Jason Kuster
Apache Beam / Google Cloud Dataflow

See something? Say something. go/jasonkuster-feedback


Re: Apache Beam - jenkins question

2018-04-27 Thread Jason Kuster
Thanks for the heads-up regarding the permissions. At this point I need
more information about the credentials we want to use -- Kamil, can you
provide more info? What is the purpose of the credentials you want to use
here?

On Fri, Apr 27, 2018 at 3:50 PM Davor Bonaci <da...@apache.org> wrote:

> Jason, you should now have all the permissions needed. (You should,
> however, evaluate whether this is a good place for it. Executors
> themselves, for example, might be an alternative.)
>
> On Fri, Apr 27, 2018 at 7:42 PM, Jason Kuster <jasonkus...@google.com>
> wrote:
>
>> See
>> https://github.com/apache/beam/blob/master/.test-infra/jenkins/common_job_properties.groovy#L119
>> for an example of this being done in practice to add the coveralls repo
>> token as an environment variable.
>>
>> On Fri, Apr 27, 2018 at 12:41 PM Jason Kuster <jasonkus...@google.com>
>> wrote:
>>
>>> Hi Kamil, Davor,
>>>
>>> I think what you want is the Jenkins secrets feature (see
>>> https://support.cloudbees.com/hc/en-us/articles/203802500-Injecting-Secrets-into-Jenkins-Build-Jobs).
>>> Davor, I believe you are the only one with enough karma on Jenkins to
>>> access the credentials UI; once the credential is created in Jenkins it
>>> should be able to be set as an environment variable through the Jenkins job
>>> configuration (groovy files in $BEAM_ROOT/.test-infra/jenkins). Hope
>>> this helps.
>>>
>>> Jason
>>>
>>> On Thu, Apr 26, 2018 at 8:43 PM Davor Bonaci <da...@apache.org> wrote:
>>>
>>>> Hi Kamil --
>>>> Thanks for reaching out.
>>>>
>>>> This is a great question for the dev@ mailing list. You may want to
>>>> share a little bit more why you need, how long, frequency of updates to the
>>>> secret, etc. for the community to be aware how things work.
>>>>
>>>> Hopefully others on the mailing list can help you by manually putting
>>>> the necessary secret into the cloud settings related to the executors.
>>>>
>>>> Davor
>>>>
>>>> -- Forwarded message --
>>>> From: Kamil Szewczyk <szewi...@gmail.com>
>>>> Date: Tue, Apr 24, 2018 at 12:21 PM
>>>> Subject: Apache Beam - jenkins question
>>>> To: da...@apache.org
>>>>
>>>>
>>>> Dear Davor
>>>>
>>>> I sent you a message on asf slack, wasn't sure how can I reach you.
>>>>
>>>> Anyway are you able to add secret (environment variable) to jenkins. ??
>>>> Or point me to a person that would be able to do that ?
>>>>
>>>> Kind Regards
>>>> Kamil Szewczyk
>>>>
>>>>
>>>
>>> --
>>> ---
>>> Jason Kuster
>>> Apache Beam / Google Cloud Dataflow
>>>
>>> See something? Say something. go/jasonkuster-feedback
>>> <https://goto.google.com/jasonkuster-feedback>
>>>
>>
>>
>> --
>> ---
>> Jason Kuster
>> Apache Beam / Google Cloud Dataflow
>>
>> See something? Say something. go/jasonkuster-feedback
>> <https://goto.google.com/jasonkuster-feedback>
>>
>
>

-- 
---
Jason Kuster
Apache Beam / Google Cloud Dataflow

See something? Say something. go/jasonkuster-feedback


Re: Apache Beam - jenkins question

2018-04-27 Thread Jason Kuster
See
https://github.com/apache/beam/blob/master/.test-infra/jenkins/common_job_properties.groovy#L119
for an example of this being done in practice to add the coveralls repo
token as an environment variable.

On Fri, Apr 27, 2018 at 12:41 PM Jason Kuster <jasonkus...@google.com>
wrote:

> Hi Kamil, Davor,
>
> I think what you want is the Jenkins secrets feature (see
> https://support.cloudbees.com/hc/en-us/articles/203802500-Injecting-Secrets-into-Jenkins-Build-Jobs).
> Davor, I believe you are the only one with enough karma on Jenkins to
> access the credentials UI; once the credential is created in Jenkins it
> should be able to be set as an environment variable through the Jenkins job
> configuration (groovy files in $BEAM_ROOT/.test-infra/jenkins). Hope
> this helps.
>
> Jason
>
> On Thu, Apr 26, 2018 at 8:43 PM Davor Bonaci <da...@apache.org> wrote:
>
>> Hi Kamil --
>> Thanks for reaching out.
>>
>> This is a great question for the dev@ mailing list. You may want to
>> share a little bit more why you need, how long, frequency of updates to the
>> secret, etc. for the community to be aware how things work.
>>
>> Hopefully others on the mailing list can help you by manually putting the
>> necessary secret into the cloud settings related to the executors.
>>
>> Davor
>>
>> -- Forwarded message --
>> From: Kamil Szewczyk <szewi...@gmail.com>
>> Date: Tue, Apr 24, 2018 at 12:21 PM
>> Subject: Apache Beam - jenkins question
>> To: da...@apache.org
>>
>>
>> Dear Davor
>>
>> I sent you a message on asf slack, wasn't sure how can I reach you.
>>
>> Anyway are you able to add secret (environment variable) to jenkins. ??
>> Or point me to a person that would be able to do that ?
>>
>> Kind Regards
>> Kamil Szewczyk
>>
>>
>
> --
> ---
> Jason Kuster
> Apache Beam / Google Cloud Dataflow
>
> See something? Say something. go/jasonkuster-feedback
>


-- 
---
Jason Kuster
Apache Beam / Google Cloud Dataflow

See something? Say something. go/jasonkuster-feedback


Re: Apache Beam - jenkins question

2018-04-27 Thread Jason Kuster
Hi Kamil, Davor,

I think what you want is the Jenkins secrets feature (see
https://support.cloudbees.com/hc/en-us/articles/203802500-Injecting-Secrets-into-Jenkins-Build-Jobs).
Davor, I believe you are the only one with enough karma on Jenkins to
access the credentials UI; once the credential is created in Jenkins it
should be able to be set as an environment variable through the Jenkins job
configuration (groovy files in $BEAM_ROOT/.test-infra/jenkins). Hope
this helps.

Jason

On Thu, Apr 26, 2018 at 8:43 PM Davor Bonaci <da...@apache.org> wrote:

> Hi Kamil --
> Thanks for reaching out.
>
> This is a great question for the dev@ mailing list. You may want to share
> a little bit more why you need, how long, frequency of updates to the
> secret, etc. for the community to be aware how things work.
>
> Hopefully others on the mailing list can help you by manually putting the
> necessary secret into the cloud settings related to the executors.
>
> Davor
>
> -- Forwarded message --
> From: Kamil Szewczyk <szewi...@gmail.com>
> Date: Tue, Apr 24, 2018 at 12:21 PM
> Subject: Apache Beam - jenkins question
> To: da...@apache.org
>
>
> Dear Davor
>
> I sent you a message on asf slack, wasn't sure how can I reach you.
>
> Anyway are you able to add secret (environment variable) to jenkins. ??
> Or point me to a person that would be able to do that ?
>
> Kind Regards
> Kamil Szewczyk
>
>

-- 
---
Jason Kuster
Apache Beam / Google Cloud Dataflow

See something? Say something. go/jasonkuster-feedback


Re: [Go SDK] Proposal: Set up a Vanity Import Path

2018-04-19 Thread Jason Kuster
+1, this will be very convenient for the Go community.

On Thu, Apr 19, 2018 at 2:21 PM Henning Rohde <hero...@google.com> wrote:

> +1 Great proposal!
>
> On Wed, Apr 18, 2018 at 1:37 PM Ismaël Mejía <ieme...@gmail.com> wrote:
>
>> Well it is not really a rule for a proposal but notice that there are
>> people in different time zones or people that for different reasons
>> cannot answer immediately, so a longer period could give them a chance
>> to voice their opinions.
>>
>> On Wed, Apr 18, 2018 at 10:23 PM, Robert Burke <rob...@frantil.com>
>> wrote:
>> > That's good to know!
>> > I had heard of that specific rule, but I didn't realize it pertained to
>> > filing of a JIRA issue (when related to a proposal) as well.
>> > Thank you.
>> >
>> > On Wed, 18 Apr 2018 at 13:08 Ismaël Mejía <ieme...@gmail.com> wrote:
>> >>
>> >> +1 Nice idea and proposal.
>> >>
>> >> This was not a vote thread but for the future it is a good idea to allow
>> >> a longer time window before reaching consensus.
>> >> Notice that a formal vote allows at least 72h for participants to voice
>> >> their opinion before concluding something.
>> >>
>> >> https://www.apache.org/foundation/voting.html
>> >>
>> >>
>> >>
>> >>
>> >> On Wed, Apr 18, 2018 at 6:29 PM, Robert Burke <rob...@frantil.com>
>> wrote:
>> >> > This seems like enough consensus to file the JIRA, so
>> >> > https://issues.apache.org/jira/browse/BEAM-4115 has now been
>> created.
>> >> >
>> >> > I'll get to work on the PRs shortly.
>> >> >
>> >> > Cheers,
>> >> > Robert Burke
>> >> >
>> >> > On Wed, 18 Apr 2018 at 03:52 Jean-Baptiste Onofré <j...@nanthrax.net>
>> >> > wrote:
>> >> >>
>> >> >> +1
>> >> >>
>> >> >> Agree
>> >> >> Regards
>> >> >> JB
>> >> >> Le 18 avr. 2018, à 14:51, Aljoscha Krettek <aljos...@apache.org> a
>> >> >> écrit:
>> >> >>>
>> >> >>> +1 this sounds super reasonable
>> >> >>>
>> >> >>>
>> >> >>> On 17. Apr 2018, at 20:11, Kenneth Knowles <k...@google.com> wrote:
>> >> >>>
>> >> >>> This seems like a valuable layer of indirection to establish. The
>> >> >>> mechanisms are pretty esoteric, but I trust Gophers to know the
>> best
>> >> >>> way to
>> >> >>> do it. Commented just a smidgin on the doc.
>> >> >>>
>> >> >>> Kenn
>> >> >>>
>> >> >>> On Mon, Apr 16, 2018 at 4:57 PM Robert Burke <rob...@frantil.com>
>> >> >>> wrote:
>> >> >>>>
>> >> >>>> Hi All!
>> >> >>>> While the Go SDK is still experimental, that doesn't mean it
>> >> >>>> shouldn't
>> >> >>>> be future proofed.
>> >> >>>>
>> >> >>>> Go has the ability to specify custom import paths for a prefix of
>> >> >>>> packages. This has benefits of avoiding generic GitHub paths, and
>> >> >>>> avoids
>> >> >>>> breaking users in the event of infrastructure events such as
>> moving
>> >> >>>> off of
>> >> >>>> GitHub, or even splitting the repo into per language components.
>> >> >>>>
>> >> >>>> Currently users need to import paths like:
>> >> >>>>
>> >> >>>> import "github.com/apache/beam/sdks/go/pkg/beam/io/textio"
>> >> >>>>
>> >> >>>> to get at SDK packages. If we implement this proposal, they would
>> >> >>>> look
>> >> >>>> like:
>> >> >>>>
>> >> >>>> import "beam.apache.org/sdks/go/pkg/beam/io/textio"
>> >> >>>>
>> >> >>>> which are a bit shorter, a bit more stable, and a bit nicer, with
>> the
>> >> >>>> benefits outlined above.
>> >> >>>>
>> >> >>>> I wrote a doc with details which is at
>> >> >>>> https://s.apache.org/go-beam-vanity-import
>> >> >>>> (Thank you, Thomas, for short-linking it for me.)
>> >> >>>>
>> >> >>>> The doc should answer most of your questions, but please let me
>> know
>> >> >>>> if
>> >> >>>> you have others either here, or in a doc comment.
>> >> >>>>
>> >> >>>> If there's consensus to do so, it would be better if it's done sooner
>> >> >>>> rather than after folks begin depending on it. We wouldn't want to
>> >> >>>> have
>> >> >>>> fragmented examples.
>> >> >>>>
>> >> >>>> Robert Burke
>> >> >>>> (One of the Gopher Googlers who have been quietly lurking on the
>> >> >>>> list,
>> >> >>>> and submitting the occasional PR for the Go SDK. I look forward to
>> >> >>>> working
>> >> >>>> with you all!)
>> >> >>>
>> >> >>>
>> >> >
>>
>

-- 
---
Jason Kuster
Apache Beam / Google Cloud Dataflow

See something? Say something. go/jasonkuster-feedback
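
For readers following the vanity import thread above, here is a minimal sketch of the mechanism the proposal relies on. When the go tool resolves a custom import path, it requests the page at that path with ?go-get=1 and looks for a go-import meta tag naming the import prefix, the version control system, and the repository root. The prefix-to-repository mapping, the function names, and the handler below are assumptions for illustration only; the linked design doc, not this sketch, describes what Beam actually set up.

```go
package main

import (
	"fmt"
	"net/http"
)

// goImportMeta renders the <meta> tag the go tool looks for when it fetches
// https://<import-path>?go-get=1. The tag maps an import prefix to a version
// control system and the root of the repository that holds the code.
func goImportMeta(prefix, vcs, repoRoot string) string {
	return fmt.Sprintf(`<meta name="go-import" content="%s %s %s">`, prefix, vcs, repoRoot)
}

// vanityHandler (hypothetical helper) answers discovery requests with a tiny
// HTML page carrying the go-import tag for the given prefix.
func vanityHandler(prefix, vcs, repoRoot string) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "text/html; charset=utf-8")
		fmt.Fprintf(w, "<!DOCTYPE html><html><head>%s</head><body></body></html>",
			goImportMeta(prefix, vcs, repoRoot))
	})
}

func main() {
	// Assumed mapping, for illustration: the whole beam.apache.org prefix
	// points at the GitHub repository root, so that
	// beam.apache.org/sdks/go/pkg/beam/io/textio would resolve to the
	// sdks/go/pkg/beam/io/textio directory inside that repository.
	fmt.Println(goImportMeta("beam.apache.org", "git", "https://github.com/apache/beam"))
}
```

The indirection is the point of the design: if the code ever moves off GitHub, only the repository root in the tag changes, and user import statements stay the same.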


Node Errors ("Backing Channel... is Disconnected.")

2018-04-16 Thread Jason Kuster
Hi all,

We've been seeing some instances of nodes failing with messages like
"Backing channel... is disconnected." In talking to Infra, it appears that
this is due to both out-of-memory errors on the VMs killing the Jenkins
process, as well as potentially some Puppet issues on the Infra side.

I've opened https://issues.apache.org/jira/browse/INFRA-16380 to track the
infra-side issues; the out of memory errors are more concerning. Does
anyone know if the recent move to Gradle would have caused memory usage to
increase? We may be able to bump machine RAM, or limit parallelism somehow
to mitigate the issue. In the meantime, if you see this issue, feel free to
add the failing build as a comment to the bug.

Best,

Jason

-- 
-------
Jason Kuster
Apache Beam / Google Cloud Dataflow

See something? Say something. go/jasonkuster-feedback


Re: Inconsistent jenkins workers

2018-04-16 Thread Jason Kuster
To provide slightly more context here, we do have some ability to change
what's on the build machines (they're puppetized and the configurations are
available at https://github.com/apache/infrastructure-puppet), but the
image that they're running is Ubuntu 14.04, and some latest versions of
things are just not available. The machines are owned by Google and managed
with the much-appreciated help of the Apache Infrastructure team.

Making a new image for the slaves has come up a couple of times, and a
sufficiently motivated person could have some impact there by taking on
that work, but it seems to me that given Jenkins has first-class support
for running builds inside of Docker containers, that's probably what we
want for long-term sustainability. I've chatted with Infra a bit about this
issue; they said that they'll have 18.04 support ready in a couple of
weeks, so in that time frame we could start looking at upgrading the image
and getting Dockerized builds going. Yifan, are you planning on owning this
area?

On Mon, Apr 16, 2018 at 4:02 PM Yifan Zou <yifan...@google.com> wrote:

> Those machines are managed by apache-infra, so we are not able to
> install/update tools on them. We plan to have a new instance group for beam
> Jenkins since we are required to update OS to the latest Ubuntu. Having
> fresh, supported dependencies installed on the new machines could also get rid
> of restrictions on python tests. But for now, I can hardly tell when we
> could have new Jenkins VMs since the latest OS image is not available yet.
>
> Yifan Zou
>
>
> On Mon, Apr 16, 2018 at 3:10 PM Robert Bradshaw <rober...@google.com>
> wrote:
>
>> Thanks.
>>
>> In the short term, I could try to limit precommits to these two machines
>> following that example, but presumably that would mean longer queues. Who
>> owns these machines? Could we just wipe them and install fresh, modern,
>> consistent OS/environments on them? (The container story seems like a great
>> long-term solution, especially for local reproducibility, but probably not
>> as easy...)
>>
>>
>> On Mon, Apr 16, 2018 at 2:33 PM Yifan Zou <yifan...@google.com> wrote:
>>
>>> The Jenkins worker configuration is a pain point of beam build and
>>> tests, and it is indeed difficult to debug. Originally, python tests such
>>> as beam_PostCommit_Python_Verify only run on one worker due to BEAM-1817
>>> <https://issues.apache.org/jira/projects/BEAM/issues/BEAM-3395?filter=allissues>.
>>> We probably need to do the same thing for
>>> beam_PreCommit_Python_GradleBuild in short term.
>>> In order to solve this problem, we did research and experiments on
>>> running Jenkins tests within a container and organized a short
>>> documentation. It is being reviewed within Engprod team and will be shared
>>> for wider review shortly.
>>>
>>> Yifan Zou
>>>
>>> On Mon, Apr 16, 2018 at 1:10 PM Robert Bradshaw <rober...@google.com>
>>> wrote:
>>>
>>>> I've been trying to debug why beam_PreCommit_Python_GradleBuild seems
>>>> to be
>>>> failing so often, and it looks like the beam-sdks-python:setupVirtualenv
>>>> task succeeds on beam2 and beam6, but always fails on beam1, beam3,
>>>> beam4,
>>>> and beam8. (I didn't see any runs on beam5 or beam7, I vaguely seem to
>>>> remember beam5 being blacklisted...) I can't reproduce the failure
>>>> locally
>>>> and the remote logs (e.g.
>>>>
>>>> https://builds.apache.org/job/beam_PreCommit_Python_GradleBuild/471/console
>>>> ) don't seem to be very enlightening either. This leads to a couple of
>>>> questions:
>>>>
>>>> * How are our jenkins beam workers configured, and why aren't they the
>>>> same?
>>>> * How does one go about debugging failures like this?
>>>>
>>>> Before too much effort is invested, how far are we from using
>>>> containers to
>>>> manage the build environments?
>>>>
>>>

-- 
---
Jason Kuster
Apache Beam / Google Cloud Dataflow

See something? Say something. go/jasonkuster-feedback


Re: Rebuilding beam before running Performance Test - question.

2018-04-16 Thread Jason Kuster
I think that should be fine; I believe it was that way originally
because we already had things set up for building in Jenkins, but it
was a while ago. Given the performance tests run infrequently enough, I
don't think an increase in runtime will be a big deal.

On Mon, Apr 16, 2018 at 7:48 AM Łukasz Gajowy <lukasz.gaj...@gmail.com>
wrote:

> Hi all,
>
> currently performance tests running on Jenkins all have "beam_prebuilt"
> [1] Perfkit's flag set to true, which means that PerfKit does not rebuild
> the code before invoking the Performance Test. This makes things faster but
> error prone - we observed Performance Tests failures several times due to
> the fact that something was not built on time.
>
> We should rebuild Beam in every testing job to avoid errors (only "bare"
> build, without tests and checkstyle). This will make the tests last longer
> (about 7 minutes per each test, as my experiments have shown). Probably it
> will be faster on Gradle (didn't test it yet). There are 12 tests now with
> "beam_PerformanceTests_JDBC" as the longest lasting (total of 15 minutes).
>
> It may be just a "formality question", but does anyone object that we
> rebuild beam like this in each Performance Tests Jenkins job?
>
> [1]
> https://beam.apache.org/documentation/io/testing/#implementing-integration-tests
>
> Best regards,
> Łukasz
>


-- 
---
Jason Kuster
Apache Beam / Google Cloud Dataflow

See something? Say something. go/jasonkuster-feedback


Re: Performance tests status and anomaly detection proposal

2018-04-16 Thread Jason Kuster
>>>>> and
>>>>> created PR <https://github.com/apache/beam/pull/5003> that allows running
>>>>> them via Gradle.
>>>>>
>>>>> All of those tests are executed on daily basis on Apache
>>>>> Jenkins <https://builds.apache.org/> and their results are published to
>>>>> individual BigQuery tables. There is also a dashboard on which tests
>>>>> results may be viewed and compared:
>>>>> https://apache-beam-testing.appspot.com/explore?dashboard=5755685136498688
>>>>>
>>>>> As we have some amount of tests already, we're currently working on a tool
>>>>> that will analyze the results and search for anomalies, so devs are
>>>>> notified if degraded performance is observed. You can find proposal
>>>>> document here:
>>>>> https://docs.google.com/document/d/1Cb7XVmqe__nA_WCrriAifL-3WCzbZzV4Am5W_SkQLeA
>>>>>
>>>>> We welcome you to share your thoughts on performance tests in general as well
>>>>> as proposed solution for anomaly detection.
>>>>>
>>>>> Best,
>>>>> Dariusz Aniszewski
>>>>>
>>>>>
>>>> --
>> Got feedback? go/pabloem-feedback
>> <https://goto.google.com/pabloem-feedback>
>>
>
>

-- 
---
Jason Kuster
Apache Beam / Google Cloud Dataflow

See something? Say something. go/jasonkuster-feedback


Re: Beam7 Outage

2018-04-12 Thread Jason Kuster
I can ask Infra if there's an issue with that machine as well, but if it's
still accessible at all it's probably not the same issue; jenkins couldn't
get to beam7 at all.

On Thu, Apr 12, 2018 at 12:11 PM Ismaël Mejía <ieme...@gmail.com> wrote:

> beam5 has been failing in the last week. Almost all builds there break.
>
> On Thu, Apr 12, 2018, 6:41 PM Jason Kuster <jasonkus...@google.com> wrote:
>
>> Hi all,
>>
>> The Jenkins Beam7 executor gave up the ghost some time in the last couple
>> of days. I've been on the line with Infra yesterday and today getting it
>> fixed, and it looks like it should be back up in a few hours. I'll ping
>> this thread again when I have confirmation. Thanks for your patience.
>>
>> Best,
>>
>> Jason
>>
>> --
>> ---
>> Jason Kuster
>> Apache Beam / Google Cloud Dataflow
>>
>> See something? Say something. go/jasonkuster-feedback
>>
>

-- 
---
Jason Kuster
Apache Beam / Google Cloud Dataflow

See something? Say something. go/jasonkuster-feedback


Re: Jenkins slowness

2018-03-09 Thread Jason Kuster
Jenkins executors are probably less beefy than your development machine and
also have two slots, thus are likely running two builds at once, causing
extra slowness.


On Fri, Mar 9, 2018 at 10:18 AM Udi Meiri <eh...@google.com> wrote:

> Hi,
>
> Does anybody know why Jenkins hosts take so long to run? For example,
> beam1 was running beam_PostCommit_Python_Verify and I saw this time for
> running "tox -e py27":
>
> Ran 1535 tests in 403.860s
>
> on my workstation I got:
>
> Ran 1535 tests in 160.242s
>
> Is there any way to troubleshoot this? Each run takes about an hour to
> run, not including time waiting in the queue.
>


-- 
---
Jason Kuster
Apache Beam / Google Cloud Dataflow

See something? Say something. go/jasonkuster-feedback


Re: Gradle status

2018-03-07 Thread Jason Kuster
>>>>>>>> ignore
>>>>>>>> that criteria or are you suggesting we change that criteria (if yes, 
>>>>>>>> how)?
>>>>>>>>
>>>>>>>>
>>>>>>>> No, no. My goal is just to quit this state.
>>>>>>>>
>>>>>>>> Let's draft a plan:
>>>>>>>>
>>>>>>>> 1. 2.4 is released - I assume it is done with mvn here
>>>>>>>> 2. We drop all poms and jenkins mvn config
>>>>>>>> 3. We fix all build issues, if any (let's say in a week)
>>>>>>>> 4. PRs can need updates but no more mvn merges
>>>>>>>>
>>>>>>>> April is gradle month :)
>>>>>>>>
>>>>>>>> Wdyt?
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Mar 7, 2018 at 10:39 AM, Romain Manni-Bucau <
>>>>>>>> rmannibu...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Le 7 mars 2018 17:34, "Lukasz Cwik" <lc...@google.com> a écrit :
>>>>>>>>>
>>>>>>>>> Thanks for bringing this up Romain but I believe your data points
>>>>>>>>> on pass rates are only partially correct.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Sure sure, it is mainly about my own PR which is a very small % of
>>>>>>>>> the whole project ;).
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> In the past week the Java Gradle precommit passed 46.34% of the
>>>>>>>>> time compared to the Java Maven precommit which passed 46.15% of the
>>>>>>>>> time. When I looked at these numbers in mid January they were around
>>>>>>>>> 37% so there has been some improvement. Regardless of the build tool it
>>>>>>>>> seems that our pass rates aren't stellar for the Java build and are
>>>>>>>>> causing the community to not follow best practices (wait for precommits
>>>>>>>>> to be green before merging). I know that on the website we used the
>>>>>>>>> mergebot to ensure that things passed before they were merged, should we
>>>>>>>>> institute this on the master branch or are there any other ideas?
>>>>>>>>>
>>>>>>>>> As a side note we had achieved the goals we set out to not need to
>>>>>>>>> maintain the Maven precommit and have authored the first PR to drop 
>>>>>>>>> the
>>>>>>>>> Maven precommit:  https://github.com/apache/beam/pull/4814
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Well, I'd be for a strong switch otherwise PR will keep using
>>>>>>>>> maven, jenkins will not test the code and at the end we fail to 
>>>>>>>>> deliver
>>>>>>>>> something consistent. So whatever tool is selected I'm tempted to say 
>>>>>>>>> drop
>>>>>>>>> other build files and jenkins hooks references.
>>>>>>>>>
>>>>>>>>> What about doing it after 2.4 vote?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Wed, Mar 7, 2018 at 2:24 AM, Romain Manni-Bucau <
>>>>>>>>> rmannibu...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Up,
>>>>>>>>>>
>>>>>>>>>> We discussed to have a strong switch to gradle or rollback to
>>>>>>>>>> maven around april to not be blocked by the build tool. I noticed 
>>>>>>>>>> gradle
>>>>>>>>>> build rarely passes on PR and kind of blurry our vision - not sure 
>>>>>>>>>> why
>>>>>>>>>> exactly. Also, PR don't always contain the gradle updates - generally
>>>>>>>>>> dependencies+plugins are added in pom.xml AFAIK, so it seems the 
>>>>>>>>>> adoption
>>>>>>>>>> is very slow to not say rejected.
>>>>>>>>>>
>>>>>>>>>> What do we do about that? When do we drop the double build
>>>>>>>>>> maintenance - whatever is picked?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Romain Manni-Bucau
>>>>>>>>>> @rmannibucau <https://twitter.com/rmannibucau> |  Blog
>>>>>>>>>> <https://rmannibucau.metawerx.net/> | Old Blog
>>>>>>>>>> <http://rmannibucau.wordpress.com> | Github
>>>>>>>>>> <https://github.com/rmannibucau> | LinkedIn
>>>>>>>>>> <https://www.linkedin.com/in/rmannibucau> | Book
>>>>>>>>>> <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>>>>>>>>>>
>>>>>>>>>> 2018-01-12 6:30 GMT+01:00 Romain Manni-Bucau <
>>>>>>>>>> rmannibu...@gmail.com>:
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Le 11 janv. 2018 23:13, "Kenneth Knowles" <k...@google.com> a
>>>>>>>>>>> écrit :
>>>>>>>>>>>
>>>>>>>>>>> On Thu, Jan 11, 2018 at 8:43 AM, Romain Manni-Bucau <
>>>>>>>>>>> rmannibu...@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> 2. gradle build doesn't use the same output directory as
>>>>>>>>>>>> maven so it is not really smooth to have both and have to maintain 
>>>>>>>>>>>> both
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> I also have an opinion on this. It is useful and reasonable to
>>>>>>>>>>> be able to build even when the source is on a read-only filesystem.
>>>>>>>>>>> Maven's defaults are undesirable and require workarounds. We shouldn't
>>>>>>>>>>> mimic the behavior, but actually should set gradle up to build to a
>>>>>>>>>>> directory outside the source tree always, if we can.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Hmm, that is something you can do with maven as well, so I'm not
>>>>>>>>>>> sure I get it.
>>>>>>>>>>>
>>>>>>>>>>> Also note the thread is no longer about the technical points but
>>>>>>>>>>> more about source maintenance and consistency.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Kenn
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>
>>

-- 
---
Jason Kuster
Apache Beam / Google Cloud Dataflow


Re: A personal update

2017-12-13 Thread Jason Kuster
Glad to have you back. :)

On Wed, Dec 13, 2017 at 8:32 AM, Eugene Kirpichov <kirpic...@google.com>
wrote:

> Happy to see you return, and thank you again for all you've done so far!
>
> On Wed, Dec 13, 2017 at 10:24 AM Aljoscha Krettek <aljos...@apache.org>
> wrote:
>
>> Welcome back! :-)
>>
>> > On 13. Dec 2017, at 15:42, Ismaël Mejía <ieme...@gmail.com> wrote:
>> >
>> > Hello Davor, great to know you are going to continue contributing to
>> > the project. Welcome back and best of wishes for this new phase !
>> >
>> > On Wed, Dec 13, 2017 at 3:12 PM, Kenneth Knowles <k...@google.com>
>> wrote:
>> >> Great to have you back!
>> >>
>> >> On Tue, Dec 12, 2017 at 11:20 PM, Robert Bradshaw <rober...@google.com
>> >
>> >> wrote:
>> >>>
>> >>> Great to hear from you again, and really happy you're sticking around!
>> >>>
>> >>> - Robert
>> >>>
>> >>>
>> >>> On Tue, Dec 12, 2017 at 10:47 PM, Ahmet Altay <al...@google.com>
>> wrote:
>> >>>> Welcome back! Looking forward to your contributions.
>> >>>>
>> >>>> Ahmet
>> >>>>
>> >>>> On Tue, Dec 12, 2017 at 10:05 PM, Jesse Anderson
>> >>>> <je...@bigdatainstitute.io>
>> >>>> wrote:
>> >>>>>
>> >>>>> Congrats!
>> >>>>>
>> >>>>>
>> >>>>> On Wed, Dec 13, 2017, 5:54 AM Jean-Baptiste Onofré <j...@nanthrax.net
>> >
>> >>>>> wrote:
>> >>>>>>
>> >>>>>> Hi Davor,
>> >>>>>>
>> >>>>>> welcome back !!
>> >>>>>>
>> >>>>>> It's really great to see you back active in the Beam community. We
>> >>>>>> really
>> >>>>>> need you !
>> >>>>>>
>> >>>>>> I'm so happy !
>> >>>>>>
>> >>>>>> Regards
>> >>>>>> JB
>> >>>>>>
>> >>>>>> On 12/13/2017 05:51 AM, Davor Bonaci wrote:
>> >>>>>>> My dear friends,
>> >>>>>>> As many of you have noticed, I’ve been visibly absent from the
>> >>>>>>> project
>> >>>>>>> for a
>> >>>>>>> little while. During this time, a great number of you kept
>> reaching
>> >>>>>>> out, and for
>> >>>>>>> that I’m deeply humbled and grateful to each and every one of you.
>> >>>>>>>
>> >>>>>>> I needed some time for personal reflection, which led to a
>> >>>>>>> transition
>> >>>>>>> in my
>> >>>>>>> professional life. As things have settled, I’m happy to again be
>> >>>>>>> working among
>> >>>>>>> all of you, as we propel this project forward. I plan to be active
>> >>>>>>> in
>> >>>>>>> the
>> >>>>>>> future, but perhaps not quite full-time as I was before.
>> >>>>>>>
>> >>>>>>> In the near term, I’m working on getting the report to the Board
>> >>>>>>> completed, as
>> >>>>>>> well as framing the discussion about the project state and vision
>> >>>>>>> going
>> >>>>>>> forwards. Additionally, I’ll make sure that we foster healthy
>> >>>>>>> community
>> >>>>>>> culture
>> >>>>>>> and operate in the Apache Way.
>> >>>>>>>
>> >>>>>>> For those who are curious, I’m happy to say that I’m starting a
>> >>>>>>> company
>> >>>>>>> building
>> >>>>>>> products related to Beam, along with several other members of this
>> >>>>>>> community and
>> >>>>>>> authors of this technology. I’ll share more on this next year, but
>> >>>>>>> until then if
>> >>>>>>> you have a data processing problem or an Apache Beam question, I’d
>> >>>>>>> love
>> >>>>>>> to hear
>> >>>>>>> from you ;-).
>> >>>>>>>
>> >>>>>>> Thanks -- and so happy to be back!
>> >>>>>>>
>> >>>>>>> Davor
>> >>>>>>
>> >>>>>> --
>> >>>>>> Jean-Baptiste Onofré
>> >>>>>> jbono...@apache.org
>> >>>>>> http://blog.nanthrax.net
>> >>>>>> Talend - http://www.talend.com
>> >>>>
>> >>>>
>> >>
>> >>
>>
>>


-- 
---
Jason Kuster
Apache Beam / Google Cloud Dataflow


Re: [VOTE] Use Gradle for Apache Beam developmental processes

2017-11-28 Thread Jason Kuster
+1

From the perspective of Beam's infrastructure, I've found that Gradle
provides us a good amount more flexibility to do the kinds of builds we
want. Additionally, the shorter run times (while not the only factor here)
will allow us to stretch our finite executor resources further, leading to
fewer instances where people are waiting for other builds to finish for
their presubmits to start.
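
As a rough illustration of the run-time point, the median and average figures from the summary quoted further down in this thread can be compared directly. This is only a sketch: the numbers are copied from that summary, while the dictionary layout and the function name are made up for the example.

```python
# Build durations in minutes, copied from the Gradle/Maven summary quoted
# in this thread. The dict layout and names are illustrative assumptions.
BUILD_MINUTES = {
    "gradle": {"median": 45.78, "average": 52.19},
    "maven": {"median": 87.93, "average": 109.10},
}


def speedup(stat: str) -> float:
    """Return how many times longer the Maven figure is than the Gradle one."""
    return BUILD_MINUTES["maven"][stat] / BUILD_MINUTES["gradle"][stat]


if __name__ == "__main__":
    for stat in ("median", "average"):
        print(f"{stat}: Maven takes {speedup(stat):.2f}x as long as Gradle")
```

On these figures Maven builds take roughly twice as long, which is where the executor savings would come from.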

On Tue, Nov 28, 2017 at 10:22 AM, Chamikara Jayalath <chamik...@google.com>
wrote:

> +1
>
> And thanks Luke for clearly mentioning the migration process. Let's make
> sure that all major use cases of Maven are properly addressed before
> removing Maven support.
>
> Thanks,
> Cham
>
>
> On Tue, Nov 28, 2017 at 10:09 AM Wesley Tanaka <wtanaka+b...@wtanaka.com>
> wrote:
>
>> +1
>>
>>
>> On 11/28/2017 07:55 AM, Lukasz Cwik wrote:
>>
>> This is a procedural vote for migrating to use Gradle for all our
>> development related processes (building, testing, and releasing). A
>> majority vote will signal that:
>> * Gradle build files will be supported and maintained alongside any
>> remaining Maven files.
>> * Once Gradle is able to replace Maven in a specific process (or portion
>> thereof), Maven will no longer be maintained for said process (or portion
>> thereof) and will be removed.
>>
>> +1 I support the process change
>> 0 I am indifferent to the process change
>> -1 I would like to remain with our current processes
>>
>> 
>> 
>>
>> Below is a summary of information contained in the discussion thread
>> comparing Gradle and Maven:
>> https://lists.apache.org/thread.html/225dddcfc78f39bbb296a0d2bbef1caf37e17677c7e5573f0b6fe253@%3Cdev.beam.apache.org%3E
>>
>> Gradle (mins)
>> min: 25.04
>> max: 160.14
>> median: 45.78
>> average: 52.19
>> stdev: 30.80
>>
>> Maven (mins)
>> min: 56.86
>> max: 216.55 (actually > 240 mins because this data does not include
>> timeouts)
>> median: 87.93
>> average: 109.10
>> stdev: 48.01
>>
>> Maven
>> Java Support: Mature
>> Python Support: None (via mvn exec plugin)
>> Go Support: Rudimentary (via mvn plugin)
>> Protobuf Support: Rudimentary (via mvn plugin)
>> Docker Support: Rudimentary (via mvn plugin)
>> ASF Release Automation: Mature
>> Jenkins Support: Mature
>> Configuration Language: XML
>> Multiple Java Versions: Yes
>> Static Analysis Tools: Some
>> ASF Release Audit Tool (RAT): Rudimentary (plugin complete and
>> longstanding but poor)
>> IntelliJ Integration: Mature
>> Eclipse Integration: Mature
>> Extensibility: Mature (updated per JB from discuss thread)
>> Number of GitHub Projects Using It: 146k
>> Continuous build daemon: None
>> Incremental build support: None (note that this is not the same as
>> incremental compile support offered by the compiler plugin)
>> Intra-module dependencies: Rudimentary (requires the use of many profiles
>> to get per runner dependencies)
>>
>> Gradle
>> Java Support: Mature
>> Python Support: Rudimentary (pygradle, lacks pypi support)
>> Go Support: Rudimentary (gogradle plugin)
>> Protobuf Support: Rudimentary (via protobuf plugin)
>> Docker Support: Rudimentary (via docker plugin)
>> ASF Release Automation: ?
>> Jenkins Support: Mature
>> Configuration Language: Groovy
>> Multiple Java Versions: Yes
>> Static Analysis Tools: Some
>> ASF Release Audit Tool (RAT): Rudimentary (plugin just calls Apache Maven
>> ANT plugin)
>> IntelliJ Integration: Mature
>> Eclipse Integration: Mature
>> Extensibility: Mature
>> Number of GitHub Projects Using It: 122k
>> Continuous build daemon: Mature
>> Incremental build support: Mature
>> Intra-module dependencies: Mature (via configurations)
>>
>>
>> --
>> Wesley Tanaka
>> https://wtanaka.com/
>>
>>


-- 
---
Jason Kuster
Apache Beam / Google Cloud Dataflow


Re: beam-site issues with Jenkins and MergeBot

2017-08-18 Thread Jason Kuster
I'll take a look this afternoon Eugene, thanks! I'll also send a PR to
update MergeBot documentation to make testing and deployment information
clearer.

On Thu, Aug 17, 2017 at 3:58 PM, Eugene Kirpichov <kirpic...@google.com>
wrote:

> Hi Jason,
>
> Mergebot seems to be generally working, thanks! However I found another
> issue with it: it fails to add new files when regenerating the website. I
> ran into this in https://github.com/apache/beam-site/pull/292: mergebot
> generated commit b548e9ba
> <https://github.com/apache/beam-site/commit/b548e9ba51b29c1447bf4c68326d2602933d75d1>
>  which
> failed to add new files and I had to add them manually in 44d3769f
> <https://github.com/apache/beam-site/commit/44d3769f57c4e912417d64b7a96a46e26f1e27c0>
> .
>
> I sent a PR to fix this https://github.com/jasonkuster/merge-bot/pull/24 
> though
> I don't know how to test or deploy it.
>
> On Thu, Aug 10, 2017 at 3:32 PM Jason Kuster <jasonkus...@google.com>
> wrote:
>
>> Investigating mergebot outage currently. Apologies for the downtime.
>>
>> On Wed, Aug 9, 2017 at 9:55 PM, Eugene Kirpichov <kirpic...@google.com>
>> wrote:
>>
>>> Indeed beam-site is at https://gitbox.apache.org/repos/asf/beam-site.git
>>> <https://gitbox.apache.org/repos/asf?p=beam-site.git;a=summary> now.
>>>
>>> However, Mergebot appears to still be not working.
>>> https://github.com/apache/beam-site/pull/283 fixes the dead link and it
>>> passes the Jenkins precommit tests, but my "@asfgit merge" appears to have
>>> done nothing. I'm gonna have to merge things manually for now.
>>>
>>> +Jason Kuster <jasonkus...@google.com> Any ideas on why Mergebot is not
>>> working?
>>>
>>> On Wed, Aug 9, 2017 at 9:40 PM Jean-Baptiste Onofré <j...@nanthrax.net>
>>> wrote:
>>>
>>>> Beam site is no longer on git-wip-us; it moved to gitbox, AFAIR.
>>>>
>>>> Regards
>>>> JB
>>>>
>>>> On 08/09/2017 10:08 PM, Eugene Kirpichov wrote:
>>>> > Hello,
>>>> >
>>>> > I've been trying to merge a PR
>>>> > https://github.com/apache/beam-site/pull/278
>>>> > and ran into the following issues:
>>>> >
>>>> > 1) When I do "git fetch --all" on beam-site, I get an error "fatal:
>>>> > repository 'https://git-wip-us.apache.org/repos/asf/beam-site.git/'
>>>> > not found". Has the git address of the apache repo changed? Is it no
>>>> > longer valid because we have MergeBot?
>>>> >
>>>> > 2) Precommit tests are failing nearly 100% of the time.
>>>> > If you look at build history on
>>>> > https://builds.apache.org/job/beam_PreCommit_Website_Test/ - 9 out of
>>>> > the last 10 builds failed.
>>>> > Failures I saw:
>>>> >
>>>> > 7 times:
>>>> > + gpg --keyserver hkp://keys.gnupg.net --recv-keys
>>>> > 409B6B1796C275462A1703113804BB82D39DC0E3
>>>> > gpg: requesting key D39DC0E3 from hkp server keys.gnupg.net
>>>> > ?: keys.gnupg.net: Cannot assign requested address
>>>> >
>>>> > 2 times:
>>>> > - ./content/subdir/contribute/testing/index.html
>>>> > *  External link https://builds.apache.org/view/Beam/ failed: 404 No error
>>>> >
>>>> > The second failure seems legit - https://builds.apache.org/view/Beam/
>>>> > is actually 404 right now (I'll send a separate email about this)
>>>> >
>>>> > The gnupg failure is not legit - I'm able to run the same command
>>>> > myself with no issues.
>>>> >
>>>> > 3) Presumably because of this, I'm not able to merge my PR with the
>>>> > "@asfgit merge" command - I suppose it requires a successful test run.
>>>> > It would be nice if it posted a comment saying why it refuses to merge.
>>>> >
>>>>
>>>> --
>>>> Jean-Baptiste Onofré
>>>> jbono...@apache.org
>>>> http://blog.nanthrax.net
>>>> Talend - http://www.talend.com
>>>>
>>>
>>
>>
>> --
>> ---
>> Jason Kuster
>> Apache Beam / Google Cloud Dataflow
>>
>


-- 
---
Jason Kuster
Apache Beam / Google Cloud Dataflow


Re: [ANNOUNCEMENT] New PMC members, August 2017 edition!

2017-08-11 Thread Jason Kuster
Congratulations to both. :)

On Fri, Aug 11, 2017 at 10:56 AM, Amit Sela <amitsel...@gmail.com> wrote:

> Congratulations, well deserved!
>
> On Fri, Aug 11, 2017 at 1:53 PM Jesse Anderson <je...@bigdatainstitute.io>
> wrote:
>
> > Welcome!
> >
> > On Fri, Aug 11, 2017, 10:43 AM Ted Yu <yuzhih...@gmail.com> wrote:
> >
> > > Congratulations to Ahmet and Aviem.
> > >
> > > On Fri, Aug 11, 2017 at 10:40 AM, Davor Bonaci <da...@apache.org>
> wrote:
> > >
> > > > Please join me and the rest of Beam PMC in welcoming the following
> > > > committers as our newest PMC members. They have significantly
> > contributed
> > > > to the project in different ways, and we look forward to many more
> > > > contributions in the future.
> > > >
> > > > * Ahmet Altay
> > > > Beyond significant work to drive the Python SDK to the master branch,
> > > Ahmet
> > > > has worked project-wide, driving releases, improving processes and
> > > testing,
> > > > and growing the community.
> > > >
> > > > * Aviem Zur
> > > > Beyond significant work in the Spark runner, Aviem has worked to
> > improve
> > > > how the project operates, leading discussions on inclusiveness and
> > > > openness.
> > > >
> > > > Congratulations to both! Welcome!
> > > >
> > > > Davor
> > > >
> > >
> > --
> > Thanks,
> >
> > Jesse
> >
>



-- 
---
Jason Kuster
Apache Beam / Google Cloud Dataflow


Re: [ANNOUNCEMENT] New committers, August 2017 edition!

2017-08-11 Thread Jason Kuster
Congrats to all, many thanks for the great contributions.

On Fri, Aug 11, 2017 at 10:46 AM, Ahmet Altay <al...@google.com.invalid>
wrote:

> Congratulations to all of you. Well deserved and thank you for your
> contributions.
>
> On Fri, Aug 11, 2017 at 10:43 AM, tarush grover <tarushappt...@gmail.com>
> wrote:
>
> > Congratulations!!
> >
> > Regards,
> > Tarush
> >
> > On Fri, 11 Aug 2017 at 11:11 PM, Davor Bonaci <da...@apache.org> wrote:
> >
> > > Please join me and the rest of Beam PMC in welcoming the following
> > > contributors as our newest committers. They have significantly
> > contributed
> > > to the project in different ways, and we look forward to many more
> > > contributions in the future.
> > >
> > > * Reuven Lax
> > > Reuven has been with the project since the very beginning, contributing
> > > mostly to the core SDK and the GCP IO connectors. He accumulated 52
> > commits
> > > (19,824 ++ / 12,039 --). Most recently, Reuven re-wrote several IO
> > > connectors that significantly expanded their functionality.
> Additionally,
> > > Reuven authored important new design documents relating to update and
> > > snapshot functionality.
> > >
> > > * Jingsong Lee
> > > Jingsong has been contributing to Apache Beam since the beginning of
> the
> > > year, particularly to the Flink runner. He has accumulated 34 commits
> > > (11,214 ++ / 6,314 --) of deep, fundamental changes that significantly
> > > improved the quality of the runner. Additionally, Jingsong has
> > contributed
> > > to the project in other ways too -- reviewing contributions, and
> > > participating in discussions on the mailing list, design documents, and
> > > JIRA issue tracker.
> > >
> > > * Mingmin Xu
> > > Mingmin started the SQL DSL effort, and has driven it to the point of
> > > merging to the master branch. In this effort, he extended the project
> to
> > > the significant new user community.
> > >
> > > * Mingming (James) Xu
> > > James joined the SQL DSL effort, contributing some of the trickier
> parts,
> > > such as the Join functionality. Additionally, he's consistently shown
> > > himself to be an insightful code reviewer, significantly impacting the
> > > project’s code quality and ensuring the success of the new major
> > component.
> > >
> > > * Manu Zhang
> > > Manu initiated and developed a runner for the Apache Gearpump
> > (incubating)
> > > engine, and has driven it to the point of merging to the master branch.
> > In
> > > this effort, he accumulated 65 commits (7,812 ++ / 4,882 --) and
> extended
> > > the project to the new user community.
> > >
> > > Congratulations to all five! Welcome!
> > >
> > > Davor
> > >
> >
>



-- 
---
Jason Kuster
Apache Beam / Google Cloud Dataflow


Re: beam-site issues with Jenkins and MergeBot

2017-08-10 Thread Jason Kuster
Investigating mergebot outage currently. Apologies for the downtime.

On Wed, Aug 9, 2017 at 9:55 PM, Eugene Kirpichov <kirpic...@google.com>
wrote:

> Indeed beam-site is at https://gitbox.apache.org/repos/asf/beam-site.git
> <https://gitbox.apache.org/repos/asf?p=beam-site.git;a=summary> now.
>
> However, Mergebot appears to still be not working.
> https://github.com/apache/beam-site/pull/283 fixes the dead link and it
> passes the Jenkins precommit tests, but my "@asfgit merge" appears to have
> done nothing. I'm gonna have to merge things manually for now.
>
> +Jason Kuster <jasonkus...@google.com> Any ideas on why Mergebot is not
> working?
>
> On Wed, Aug 9, 2017 at 9:40 PM Jean-Baptiste Onofré <j...@nanthrax.net>
> wrote:
>
>> Beam site is no longer on git-wip-us; it moved to gitbox, AFAIR.
>>
>> Regards
>> JB
>>
>> On 08/09/2017 10:08 PM, Eugene Kirpichov wrote:
>> > Hello,
>> >
>> > I've been trying to merge a PR
>> > https://github.com/apache/beam-site/pull/278
>> > and ran into the following issues:
>> >
>> > 1) When I do "git fetch --all" on beam-site, I get an error "fatal:
>> > repository 'https://git-wip-us.apache.org/repos/asf/beam-site.git/' not
>> > found". Has the git address of the apache repo changed? Is it no longer
>> > valid because we have MergeBot?
>> >
>> > 2) Precommit tests are failing nearly 100% of the time.
>> > If you look at build history on
>> > https://builds.apache.org/job/beam_PreCommit_Website_Test/ - 9 out of
>> > the last 10 builds failed.
>> > Failures I saw:
>> >
>> > 7 times:
>> > + gpg --keyserver hkp://keys.gnupg.net --recv-keys
>> > 409B6B1796C275462A1703113804BB82D39DC0E3
>> > gpg: requesting key D39DC0E3 from hkp server keys.gnupg.net
>> > ?: keys.gnupg.net: Cannot assign requested address
>> >
>> > 2 times:
>> > - ./content/subdir/contribute/testing/index.html
>> > *  External link https://builds.apache.org/view/Beam/ failed: 404 No error
>> >
>> > The second failure seems legit - https://builds.apache.org/view/Beam/
>> > is actually 404 right now (I'll send a separate email about this)
>> >
>> > The gnupg failure is not legit - I'm able to run the same command myself
>> > with no issues.
>> >
>> > 3) Presumably because of this, I'm not able to merge my PR with the
>> > "@asfgit merge" command - I suppose it requires a successful test run.
>> > It would be nice if it posted a comment saying why it refuses to merge.
>> >
>>
>> --
>> Jean-Baptiste Onofré
>> jbono...@apache.org
>> http://blog.nanthrax.net
>> Talend - http://www.talend.com
>>
>


-- 
---
Jason Kuster
Apache Beam / Google Cloud Dataflow


Re: is it ok to have a discussion without subscribing to the list

2017-08-08 Thread Jason Kuster
Hi Derek,

If you aren't subscribed to the list then people have to manually add you
back into the to: line in order for you to receive replies (I always do
anyway). Subscribing (and unsubscribing) to the list is fairly
straightforward, so that's what I would suggest.

Best,

Jason

On Mon, Aug 7, 2017 at 4:20 PM, derek <denc...@gmail.com> wrote:

> is it ok to have a discussion without subscribing to the list?
>



-- 
---
Jason Kuster
Apache Beam / Google Cloud Dataflow


Groovy Expertise?

2017-07-14 Thread Jason Kuster
Hey all,

If anyone has experience working with Groovy, I would love for someone to
take a look at https://github.com/apache/beam/pull/3545, which does a
decent amount of work with the Jenkins Job DSL in preparation for some
precommit changes.

Thanks,

Jason

-- 
---
Jason Kuster
Apache Beam / Google Cloud Dataflow


Re: MergeBot is here!

2017-07-14 Thread Jason Kuster
Hey folks, thanks for your thoughts!

I have some responses for you. :)

Regarding squashing, it looks like there are three things at play --
forgive me if I've misunderstood.

1. Merge commits exactly as they are in the PR.
2. Squash all commits down to the first commit
3. Automatically squash fixup! and squash! commits but leave things as they
are otherwise.

Is the prevailing sentiment to enable all three of these, or just two?
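
To make option 3 a bit more concrete, here is a rough Python sketch of how fixup! and squash! commits could be grouped under the commit they refer to before squashing. It is not MergeBot's actual code: the function name is made up, and it matches targets by exact subject only, which is a simplification of what `git rebase --autosquash` does.

```python
# Sketch only: groups "fixup! <subject>" / "squash! <subject>" commits under
# the commit they refer to. Hypothetical helper, not MergeBot's actual code.
from typing import List, Tuple

PREFIXES = ("fixup! ", "squash! ")


def plan_autosquash(subjects: List[str]) -> List[Tuple[str, List[str]]]:
    """Group fixup!/squash! commits under their target commit.

    `subjects` is the PR's commit subjects, oldest first. Returns a list of
    (target_subject, [fixup_subjects]) pairs in the original target order.
    A fixup whose target is not found is kept as its own entry.
    """
    plan: List[Tuple[str, List[str]]] = []
    for subject in subjects:
        target = None
        for prefix in PREFIXES:
            if subject.startswith(prefix):
                target = subject[len(prefix):]
                break
        if target is None:
            # Ordinary commit: it stays where it is.
            plan.append((subject, []))
            continue
        # Attach to the most recent earlier commit with a matching subject.
        for kept, fixups in reversed(plan):
            if kept == target:
                fixups.append(subject)
                break
        else:
            # No matching target: leave the commit untouched.
            plan.append((subject, []))
    return plan
```

A PR with no fixup! or squash! commits comes back unchanged, which is the "leave things as they are otherwise" half of option 3.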

To address Ismaël's comment about gaining insight into what MergeBot is
doing, I have two suggestions. a) MergeBot can comment back the list of
commands that it took in the case that it failed. b) MergeBot can comment a
link to the merge log for the PR in question. We already capture this and
put it somewhere internet-accessible, it's just not particularly
discoverable. This way people could watch the actual STDOUT as MergeBot is
operating, or after it fails to merge a PR. Would that help?

Also for the curious, I'm tracking all the future MergeBot work I can think
of in the doc linked below (I may wind up filing tickets for some of this
stuff at some point, but am more likely to track via github issues on the
mergebot repository for now). Comments welcome. :)

https://docs.google.com/document/d/13D1nUgTeonyvNtRi4bJM-Vyj9YOCVHZT7QA6EOauKT4/edit

On Wed, Jul 12, 2017 at 1:04 PM, Kenneth Knowles <k...@google.com.invalid>
wrote:

> On Wed, Jul 12, 2017 at 12:08 PM, Robert Bradshaw <
> rober...@google.com.invalid> wrote:
>
> > On Tue, Jul 11, 2017 at 7:14 PM, Kenneth Knowles <k...@google.com.invalid
> >
> > wrote:
> >
> > > The thing is that "fixup! " indicates that this fixup
> > > should be reordered and applied to the referenced commit. Squashing in
> > > order is not correct. I think the bot reordering to squash is not a good
> > > idea.
> > I don't see why reordering in this case is a bad thing (if it applies
> > cleanly--one could even automatically check that the patches commute).
> >
> > > So maybe I wasn't clear about the options I want. I want both of:
> > >
> > > (1) The bot merges the commits exactly as they are (for the
> > > git-knowledgable)
> > > (2) The bot squashes all the commits in order (for casual contributors)
> > >
> > > Way simpler than anything interactive and with no reordering by the bot.
> > > The rest of my thoughts were just ways to further avoid messing this up.
> >
> > Yeah, these are the most common, and I highly agree we should make it
> > hard to accidentally merge fixup commits. I would like to support the
> > (common) case of an advanced user having fixup commits in the review,
> > but being able to merge without waiting for her to manually squash
> > them after the LGTM.
> >
>
> OK, yea, I think it is fair to allow the bot to try to reorder if it goes
> cleanly. So sounds good to me.
>
> Kenn
>



-- 
---
Jason Kuster
Apache Beam / Google Cloud Dataflow


Jenkins Upgrade Sunday

2017-07-11 Thread Jason Kuster
Hi all,

An FYI that Infra is updating Jenkins over the weekend. One consequence of
this is that MavenJob type builds will need to use Java 8 for execution.
We've already been using Java 8 by default for executing our builds, so
this shouldn't be an issue for us, except potentially in the cross-JDK
test. More info can be found in the below link.

https://lists.apache.org/thread.html/1185e30193ff764870e9a7f58818e0d6a9af3a50b6e3ebd62476244c@%3Cbuilds.apache.org%3E

Best,

Jason

-- 
---
Jason Kuster
Apache Beam / Google Cloud Dataflow


Re: MergeBot is here!

2017-07-10 Thread Jason Kuster
(quick update re #2 above): ~4 minutes after I reopened the ticket, it's
fixed.
https://github.com/apache/infrastructure-puppet/commit/709944291da5e8aea711cb8578f0594deb45e222
updates the website to the correct address. Infra is once again the best.

On Mon, Jul 10, 2017 at 12:38 PM, Jason Kuster <jasonkus...@google.com>
wrote:

> Glad to hear everyone's pretty happy about it! Have a couple answers for
> your questions.
>
> Ted: I believe the MFA stuff (two-factor auth on github) is necessary for
> getting the additional features on GitHub (reviewer, etc), but may not be
> necessary for MergeBot. I'll check in with Infra and get back to you.
>
> Ismaël: Great questions! Answered below.
>
> 1. The code will likely be transitioned over to an Infra-controlled
> repository, but for now is under my account:
> https://github.com/jasonkuster/merge-bot. It's written in Python, so Python
> aficionados especially feel free to take a look, kick the tires, and open
> PRs.
>
> 2. Glad to hear mergebot worked for you. :) The website not showing
> appears to be an issue with transitioning to GitBox; it seems a reference
> may have not been updated. Thanks for the report! I've reopened
> https://issues.apache.org/jira/browse/INFRA-14405 to track.
>
> 3. I'd love to chat about this more! It's totally possible to have
> mergebot pause and show the status of the repository before it does the
> final push, but given that mergebot is merging PRs serially I don't want to
> have someone forget to click "ok" and block other people's PRs. One other
> option would be to allow the person requesting the merge to say something
> like "@asfgit merge squash" or "@asfgit merge nosquash", parametrizing the
> merge request. Thoughts?
>
> On Mon, Jul 10, 2017 at 10:52 AM, Mark Liu <mark...@google.com.invalid>
> wrote:
>
>> +1 Awesome work!
>>
>> Thank you Jason!!!
>>
>> Mark
>>
>> On Mon, Jul 10, 2017 at 10:05 AM, Robert Bradshaw <
>> rober...@google.com.invalid> wrote:
>>
>> > +1, this is great! I'll second Ismaël's list requests, especially 1 and
>> 3.
>> >
>> > On Mon, Jul 10, 2017 at 2:09 AM, Ismaël Mejía <ieme...@gmail.com>
>> wrote:
>> > > Excellent!, Automation of such repetitive (and error-prone) tasks is
>> > > strongly welcomed.
>> > >
>> > > Thanks for making this happen Jason!
>> > >
>> > > Some comments:
>> > >
>> > > 1. I suppose the code of mergebot is now part of Apache Infra, no? Do
>> > > you know exactly where the code is hosted? And what is the procedure
>> > > in case somebody wants to improve it or change something in the
>> > > future? I suppose other projects can/would benefit of this.
>> > >
>> > > 2. I configured and used the mergebot with success, however the
>> > > website does not reflect the changes of the PR I 'merged', I suppose
>> > > there are still some things we have to fix, because the changes are
>> > > not there.
>> > > (The PR I am talking about is
>> > > https://github.com/apache/beam-site/pull/264)
>> > >
>> > > 3. Other thing I noticed is that the mergebot didn’t squash the
>> > > commits (this probably makes sense) and I didn’t realize this to do it
>> > > before because there is not a preview of the state of the actions that
>> > > the mergebot is going to do, can this eventually be improved? (I don’t
>> > > know if this makes sense because this will add an extra validation
>> > > step and we must trust robots anyway :P).
>> > >
>> > > This new issue is something that reviewers/committers must remember,
>> > > and talking about this we need to update this in the contribution
>> > > guide to include the configuration/use of the mergebot instructions.
>> > >
>> > > Thanks again Jason and the other who made this possible, this is
>> great!
>> > > Ismaël
>> > >
>> > > ps. I’m eager to see this included too for the beam project.
>> > >
>> > > On Sat, Jul 8, 2017 at 7:28 AM, tarush grover <
>> tarushappt...@gmail.com>
>> > wrote:
>> > >> This is really good!!
>> > >>
>> > >> Regards,
>> > >> Tarush
>> > >>
>> > >> On Sat, 8 Jul 2017 at 10:20 AM, Jean-Baptiste Onofré <
>> j...@nanthrax.net>
>> > >> wrote:
>> > >>
>> > >>> That's awesome !
>> > >>>

Re: MergeBot is here!

2017-07-10 Thread Jason Kuster
Glad to hear everyone's pretty happy about it! Have a couple answers for
your questions.

Ted: I believe the MFA stuff (two-factor auth on github) is necessary for
getting the additional features on GitHub (reviewer, etc), but may not be
necessary for MergeBot. I'll check in with Infra and get back to you.

Ismaël: Great questions! Answered below.

1. The code will likely be transitioned over to an Infra-controlled
repository, but for now is under my account:
https://github.com/jasonkuster/merge-bot. It's written in Python, so Python
aficionados especially feel free to take a look, kick the tires, and open
PRs.

2. Glad to hear mergebot worked for you. :) The website not showing appears
to be an issue with transitioning to GitBox; it seems a reference may have
not been updated. Thanks for the report! I've reopened
https://issues.apache.org/jira/browse/INFRA-14405 to track.

3. I'd love to chat about this more! It's totally possible to have mergebot
pause and show the status of the repository before it does the final push,
but given that mergebot is merging PRs serially I don't want to have
someone forget to click "ok" and block other people's PRs. One other option
would be to allow the person requesting the merge to say something like
"@asfgit merge squash" or "@asfgit merge nosquash", parametrizing the merge
request. Thoughts?
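
For the sake of discussion, a parametrized command could be parsed along these lines. This is a hypothetical sketch, not what MergeBot does today: the "squash" and "nosquash" keywords come from the suggestion above, while the function name and the choice of "nosquash" as the default are assumptions.

```python
# Hypothetical parser for a parametrized merge comment. The keywords come
# from the suggestion in this email; the default and names are assumptions.
from typing import Optional

TRIGGER = "@asfgit merge"
MODES = ("squash", "nosquash")


def parse_merge_command(comment: str, default: str = "nosquash") -> Optional[str]:
    """Return "squash" or "nosquash" if the comment requests a merge, else None."""
    text = comment.strip().lower()
    if not text.startswith(TRIGGER):
        return None
    rest = text[len(TRIGGER):]
    if rest and not rest[0].isspace():
        # Something like "@asfgit merged": not the merge command.
        return None
    args = rest.split()
    if not args:
        return default
    if args[0] in MODES:
        return args[0]
    return None  # unknown option: refuse rather than guess
```

Refusing unknown options keeps a typo from silently triggering the default behavior.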

On Mon, Jul 10, 2017 at 10:52 AM, Mark Liu <mark...@google.com.invalid>
wrote:

> +1 Awesome work!
>
> Thank you Jason!!!
>
> Mark
>
> On Mon, Jul 10, 2017 at 10:05 AM, Robert Bradshaw <
> rober...@google.com.invalid> wrote:
>
> > +1, this is great! I'll second Ismaël's list requests, especially 1 and
> 3.
> >
> > On Mon, Jul 10, 2017 at 2:09 AM, Ismaël Mejía <ieme...@gmail.com> wrote:
> > > Excellent!, Automation of such repetitive (and error-prone) tasks is
> > > strongly welcomed.
> > >
> > > Thanks for making this happen Jason!
> > >
> > > Some comments:
> > >
> > > 1. I suppose the code of mergebot is now part of Apache Infra, no? Do
> > > you know exactly where the code is hosted? And what is the procedure
> > > in case somebody wants to improve it or change something in the
> > > future? I suppose other projects can/would benefit of this.
> > >
> > > 2. I configured and used the mergebot with success, however the
> > > website does not reflect the changes of the PR I 'merged', I suppose
> > > there are still some things we have to fix, because the changes are
> > > not there.
> > > > (The PR I am talking about is
> > > > https://github.com/apache/beam-site/pull/264)
> > >
> > > 3. Other thing I noticed is that the mergebot didn’t squash the
> > > commits (this probably makes sense) and I didn’t realize this to do it
> > > before because there is not a preview of the state of the actions that
> > > the mergebot is going to do, can this eventually be improved? (I don’t
> > > know if this makes sense because this will add an extra validation
> > > step and we must trust robots anyway :P).
> > >
> > > This new issue is something that reviewers/committers must remember,
> > > and talking about this we need to update this in the contribution
> > > guide to include the configuration/use of the mergebot instructions.
> > >
> > > Thanks again Jason and the other who made this possible, this is great!
> > > Ismaël
> > >
> > > ps. I’m eager to see this included too for the beam project.
> > >
> > > On Sat, Jul 8, 2017 at 7:28 AM, tarush grover <tarushappt...@gmail.com
> >
> > wrote:
> > >> This is really good!!
> > >>
> > >> Regards,
> > >> Tarush
> > >>
> > >> On Sat, 8 Jul 2017 at 10:20 AM, Jean-Baptiste Onofré <j...@nanthrax.net
> >
> > >> wrote:
> > >>
> > >>> That's awesome !
> > >>>
> > >>> Thanks Jason !
> > >>>
> > >>> Regards
> > >>> JB
> > >>>
> > >>> On 07/07/2017 10:21 PM, Jason Kuster wrote:
> > >>> > Hi Beam Community,
> > >>> >
> > >>> > Early on in the project, we had a number of discussions about
> > creating an
> > >>> > automated tool for merging pull requests. I’m happy to announce
> that
> > >>> we’ve
> > >>> > developed such a tool and it is ready for experimental usage in
> Beam!
> > >>> >
> > >>> > The tool, MergeBot, works in conjunction with ASF’s existing 

MergeBot is here!

2017-07-07 Thread Jason Kuster
Hi Beam Community,

Early on in the project, we had a number of discussions about creating an
automated tool for merging pull requests. I’m happy to announce that we’ve
developed such a tool and it is ready for experimental usage in Beam!

The tool, MergeBot, works in conjunction with ASF’s existing GitBox tool,
providing numerous benefits:
* Automating the merge process -- instead of many manual steps with
multiple Git remotes, merging is as simple as commenting a specific command
in GitHub.
* Automatic verification of each pull request against the latest master
code before merge.
* Merge queue enforces an ordering of pull requests, which ensures that
pull requests that have bad interactions don’t get merged at the same time.
* GitBox-enabled features such as reviewers, assignees, and labels.
* Enabling enhanced use of tools like reviewable.io.
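
To make the merge-queue bullet concrete, here is a minimal sketch in Python.
It is not MergeBot's actual code: the function name, the list-of-strings model
of master, and the verify callback are all assumptions made for illustration.
It only shows the idea of verifying each pull request against the latest
master, one at a time and in order.

```python
from collections import deque
from typing import Callable, Deque, List


def process_merge_queue(
    master: List[str],
    queue: Deque[str],
    verify: Callable[[List[str]], bool],
) -> List[str]:
    """Merge queued pull requests in order; return the ones that were rejected.

    Each candidate is verified against the *current* master, which already
    includes everything merged earlier in this run, so two pull requests that
    only break in combination cannot both land.
    """
    rejected = []
    while queue:
        pr = queue.popleft()
        if verify(master + [pr]):
            master.append(pr)
        else:
            rejected.append(pr)
    return rejected


# Hypothetical check: "rename-foo" and "call-foo" each pass alone,
# but fail when both are present.
def verify(state: List[str]) -> bool:
    return not {"rename-foo", "call-foo"} <= set(state)


master = ["base"]
rejected = process_merge_queue(
    master, deque(["rename-foo", "call-foo", "fix-docs"]), verify
)
print(master)    # ['base', 'rename-foo', 'fix-docs']
print(rejected)  # ['call-foo']
```

Because "call-foo" is checked against a master that already contains
"rename-foo", the bad interaction is caught before the merge rather than after.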

If you are a committer, the first step is to link your Apache and GitHub
accounts at http://gitbox.apache.org/setup. Once the accounts are linked,
you should have immediate access to new GitHub features like labels,
assignees, etc., as well as the ability to merge pull requests by simply
commenting “@asfgit merge” on the pull request. MergeBot will communicate
its status back to you via the same mechanism used already by Jenkins.

This functionality is currently enabled for the “beam-site” repository only.
In this phase, we’d like to gather feedback and improve the user experience
-- so please comment back early and often. Once we are happy with the
experience, we’ll deploy it on the main Beam repository, and recommend it
for wider adoption.

I’d like to give a huge thank you to the Apache Infrastructure team,
especially Daniel Pono Takamori, Daniel Gruno, and Chris Thistlethwaite who
were instrumental in bringing this project to fruition. Additionally, this
could not have happened without the extensive work Davor put in to keep
things moving along. Thank you Davor.

Looking forward to hearing your comments and feedback. Thanks.

Jason

-- 
---
Jason Kuster
Apache Beam / Google Cloud Dataflow


Re: Making it easier to run IO ITs

2017-07-07 Thread Jason Kuster
This looks great Stephen -- this will make the process of running IO ITs
much easier. I'll take a look at the doc and leave comments if I have any.
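
In the meantime, for anyone hand-editing the long command line quoted below,
here is a rough Python sketch that assembles it instead of copy-pasting. The
helper name and its parameters are my own invention, not part of Beam or pkb;
it assumes the options are passed as a JSON-style list of "--name=value"
strings, as in the commands in this thread.

```python
import json
import shlex


def build_mvn_command(module, profiles, options, extra_properties=None):
    """Return a shell-safe `mvn verify` command line as one string.

    module: value for -pl, e.g. "sdks/java/io/jdbc".
    profiles: profile ids, each emitted as -P<id>.
    options: dict of pipeline option name -> value.
    extra_properties: dict of additional -D<key>=<value> properties.
    """
    args = ["mvn", "verify", "-pl", module]
    args += ["-P" + profile for profile in profiles]
    for key, value in (extra_properties or {}).items():
        args.append("-D{}={}".format(key, value))
    # Render the pipeline options as a compact JSON list of strings.
    option_list = ["--{}={}".format(name, value) for name, value in options.items()]
    rendered = json.dumps(option_list, separators=(",", ":"))
    args.append("-DintegrationTestPipelineOptions=" + rendered)
    # shlex.quote protects the brackets and double quotes from the shell.
    return " ".join(shlex.quote(arg) for arg in args)


print(build_mvn_command(
    "sdks/java/io/jdbc",
    ["io-it-suite"],
    {"tempRoot": "my-temp-root"},
    {"pkbLocation": "path-to-pkb.py"},
))
```

Keeping the options in a dict means a changed server name or bucket is a
one-line edit, and the quoting is handled in one place.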

On Wed, Jul 5, 2017 at 5:02 PM, Kenneth Knowles <k...@google.com.invalid>
wrote:

> I have some extra nerdy maven-fu (pom-fu?) to suggest: use -D instead of -P
> for a little more flexibility.
>
> You can't have one profile activate another, but you _can_ activate two
> profiles with the same property. [1]
>
> This doesn't work: mvn -P profile1
>
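>   <profiles>
>     <profile>
>       <id>profile1</id>
>       <properties>
>         <profile2>true</profile2>
>       </properties>
>     </profile>
>
>     <profile>
>       <id>profile2</id>
>       <activation>
>         <property>
>           <name>profile2</name>
>         </property>
>       </activation>
>     </profile>
>   </profiles>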
>
> (actually it doesn't work for two reasons - see below about [2])
>
> This does work: mvn -D shared-semantic-property
>
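>   <profiles>
>     <profile>
>       <id>profile1</id>
>       <activation>
>         <property>
>           <name>shared-semantic-property</name>
>         </property>
>       </activation>
>     </profile>
>
>     <profile>
>       <id>profile2</id>
>       <activation>
>         <property>
>           <name>shared-semantic-property</name>
>         </property>
>       </activation>
>     </profile>
>   </profiles>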
>
>
> Unfortunately, this is only true for _system_ properties, not those set up
> in a <properties> block. [2]
>
> This doesn't work: mvn -D profile1
>
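>   <properties>
>     <profile2>${profile1}</profile2>
>   </properties>
>
>   <profiles>
>     <profile>
>       <id>profile1</id>
>       <activation>
>         <property>
>           <name>profile1</name>
>         </property>
>       </activation>
>     </profile>
>
>     <profile>
>       <id>profile2</id>
>       <activation>
>         <property>
>           <name>profile2</name>
>         </property>
>       </activation>
>     </profile>
>   </profiles>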
>
> So at that point I think you are stuck. But you can use the basic -D trick
> to do things like split the various alterations into separate profiles,
> which might end up having different boolean combinations with other -D
> options.
>
> Kenn
>
> [1] https://stackoverflow.com/questions/943411/can-i-make-one-maven-profile-activate-another
> [2] https://stackoverflow.com/questions/5676231/how-to-activate-profile-by-means-of-maven-property
>
> On Wed, Jul 5, 2017 at 3:11 PM, Stephen Sisk <s...@google.com.invalid>
> wrote:
>
> > hey all,
> >
> > I wanted to share an early draft of what it'll be like to invoke mvn for
> > the IO integration tests in the future when we have the integration with
> > kubernetes going.
> >
> > I'm really excited about these changes - working on the IO ITs, I have to
> > run them frequently, and the command lines to run them can be quite a
> bear.
> > For example:
> >
> > mvn -e verify -Dit.test=org.apache.beam.sdk.io.jdbc.JdbcIOIT
> > -DskipITs=false -pl sdks/java/io/jdbc -Pio-it -Pdataflow-runner
> > -DintegrationTestPipelineOptions=["--project=[project]","--
> > gcpTempLocation=gs://[bucket]/staging","--postgresUsername=
> > postgres","--postgresPassword=uuinkks","--postgresDatabaseName=postgres"
> > ,"--postgresSsl=False","--postgresServerName=[1.2.3.4]",
> > "--runner=TestDataflowRunner","--defaultWorkerLogLevel=INFO"]
> >
> > Also, in order to run this, I first need to have created an instance of
> > this datastore in kubernetes and then copied the parameter and
> inevitably I
> > mis-copy something in there or something changes, so it doesn't work
> > correctly and I have to go back in and edit it.
> >
> > So that's a pain.
> >
> > To invoke the IO ITs in the future, it'll be a command like this:
> >   mvn verify -Pio-it-suite -pl sdks/java/io/jdbc \
> >   -DpkbLocation="path-to-pkb.py" \
> >   -DintegrationTestPipelineOptions='["--tempRoot=my-temp-root"]'
> > (or at least, that's what I'm proposing :)
> >
> > This will run the jdbc integration tests, spinning up the data store for
> > that integration test in your kubernetes cluster.
> >
> > This is all enabled by a combination of adding new profiles in maven for
> > each IO and changes to the beam benchmarks in pkb (perfkitbenchmarker) to
> > control kubernetes. Jason has already done a lot of work to get pkb
> working
> > to run our regular benchmarks, and I'm excited to continue that work for
> IO
> > ITs. We use pkb to control kubernetes and capture our benchmark times.
> This
> > means you'll need to install pkb if you'd like to use this nicer
> > experience, however, devs will never have to use pkb if they don't want
> to,
> > nor is making changes in pkb required when you want to add a new IO IT.
> You
> > can always spin up the data store yourself, and invoke the integration
> test
> > directly.
> >
> > Drafts of these changes can be seen at [0] and [1] - however, I don't
> > expect most folks will care about these changes other than "how do I
> invoke
> > this?", so let me know if you have comments about how this is invoked.
> >
> > S
> >
> > [0] pom changes hooking up the call to pkb -
> > https://github.com/ssisk/beam/commit/eec7cb5b71330761e71850e8e6f65f34249641b0
> > [1] pkb changes enabling kubernetes spin up -
> > https://github.com/ssisk/PerfKitBenchmarker/commits/kubernetes_create
> > (last 2 changes)
> >
>
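
The -D versus -P rule in Kenn's message above can be modeled in a few lines of
Python. This is a toy model, not Maven's real resolution logic: it ignores
activeByDefault, property values, and every other activation type, and the
function and argument names are made up for illustration. It assumes, as the
message says, that only properties given on the command line count, not ones
declared inside the pom.

```python
def active_profiles(activation_property, system_properties=(), explicit=()):
    """Return the sorted ids of profiles that would be active.

    activation_property: dict of profile id -> name of the property that
        activates it (None if the profile has no property activation).
    system_properties: property names passed with -D on the command line.
    explicit: profile ids passed with -P.
    Properties set inside a pom are deliberately not consulted here.
    """
    props = set(system_properties)
    chosen = set(explicit)
    return sorted(
        profile_id
        for profile_id, trigger in activation_property.items()
        if profile_id in chosen or (trigger is not None and trigger in props)
    )


pom = {
    "profile1": "shared-semantic-property",
    "profile2": "shared-semantic-property",
}

# mvn -D shared-semantic-property: both profiles switch on together.
print(active_profiles(pom, system_properties=["shared-semantic-property"]))
# ['profile1', 'profile2']

# mvn -P profile1: only the named profile is active; nothing chains.
print(active_profiles(pom, explicit=["profile1"]))
# ['profile1']
```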



-- 
---
Jason Kuster
Apache Beam / Google Cloud Dataflow


Re: Jenkins Executor Issue

2017-07-05 Thread Jason Kuster
This has been resolved, although Infra has not yet determined a root cause
for the issue. If anyone sees this recurring please reply to this thread or
to the JIRA.

On Fri, Jun 30, 2017 at 12:34 PM, Jean-Baptiste Onofré <j...@nanthrax.net>
wrote:

> Thanks Jason.
>
> Regards
> JB
>
>
> On 06/30/2017 08:19 PM, Jason Kuster wrote:
>
>> Hi all,
>>
>> There appears to be an issue ongoing where our Jenkins jobs are not being
>> scheduled onto our executors. I've filed
>> https://issues.apache.org/jira/browse/INFRA-14476 to track this issue,
>> and
>> will try to communicate back here with relevant status updates.
>>
>> Best,
>>
>> Jason
>>
>>
> --
> Jean-Baptiste Onofré
> jbono...@apache.org
> http://blog.nanthrax.net
> Talend - http://www.talend.com
>



-- 
---
Jason Kuster
Apache Beam / Google Cloud Dataflow


Jenkins Executor Issue

2017-06-30 Thread Jason Kuster
Hi all,

There appears to be an issue ongoing where our Jenkins jobs are not being
scheduled onto our executors. I've filed
https://issues.apache.org/jira/browse/INFRA-14476 to track this issue, and
will try to communicate back here with relevant status updates.

Best,

Jason

-- 
---
Jason Kuster
Apache Beam / Google Cloud Dataflow


Re: First stable release completed!

2017-05-17 Thread Jason Kuster
Fantastic work everyone! I'm really excited to see what we've accomplished,
and the future for Beam looks bright.

On Wed, May 17, 2017 at 2:00 PM, Mark Liu <mark...@google.com.invalid>
wrote:

> Congratulations!
>
> On Wed, May 17, 2017 at 1:25 PM, Ismaël Mejía <ieme...@gmail.com> wrote:
>
> > Amazing milestone, congrats everyone!
> >
> > On Wed, May 17, 2017 at 7:54 PM, Reuven Lax <re...@google.com.invalid>
> > wrote:
> > > Sweet!
> > >
> > > On Wed, May 17, 2017 at 4:28 AM, Davor Bonaci <da...@apache.org>
> wrote:
> > >
> > >> The first stable release is now complete!
> > >>
> > >> Release artifacts are available through various repositories,
> including
> > >> dist.apache.org, Maven Central, and PyPI. The website is updated, and
> > >> announcements are published.
> > >>
> > >> Apache Software Foundation press release:
> > >> http://globenewswire.com/news-release/2017/05/17/986839/0/en/The-Apache-Software-Foundation-Announces-Apache-Beam-v2-0-0.html
> > >>
> > >> Beam blog:
> > >> https://beam.apache.org/blog/2017/05/17/beam-first-stable-release.html
> > >>
> > >> Congratulations to everyone -- this is a really big milestone for the
> > >> project, and I'm proud to be a part of this great community.
> > >>
> > >> Davor
> > >>
> >
>



-- 
---
Jason Kuster
Apache Beam / Google Cloud Dataflow


Re: [VOTE] First stable release: release candidate #4

2017-05-13 Thread Jason Kuster
+1 -- validated several common cases on Spark on YARN and on the Dataflow runner.

Exciting milestone for the project!

On May 12, 2017 21:48, "Davor Bonaci"  wrote:

Hi everyone --
After going through several release candidates, setting and validating
acceptance criteria, running a hackathon, and polishing the release, now is
the time to vote!

Please review and vote on the release candidate #4 for the version 2.0.0,
as follows:
[ ] +1, Approve the release
[ ] -1, Do not approve the release (please provide specific comments)

The complete staging area is available for review, which includes:
* JIRA release notes [1],
* the official Apache source release to be deployed to dist.apache.org [2],
which is signed with the key with fingerprint 8F0D334F [3],
* all artifacts to be deployed to the Maven Central Repository [4],
* source code tag "v2.0.0-RC4" [5],
* website pull request listing the release and publishing the API reference
manual [6],
* Python artifacts are deployed along with the source release to the
dist.apache.org [2].

Jenkins suites:
* https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/11439/
* https://builds.apache.org/job/beam_PostCommit_Java_MavenInstall/3801/
* https://builds.apache.org/job/beam_PostCommit_Python_Verify/2216/
* https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/1461/
* https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/3123/
* https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/2808/
* https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/2060/

The vote will be open for at least 72 hours. It is adopted by majority
approval of qualified votes, with at least 3 PMC affirmative votes.
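
As a rough illustration of the closing rule above, here is a small Python
check. It is only a sketch: it assumes "qualified votes" means binding PMC
votes, which this email does not spell out, and the function and parameter
names are invented for the example.

```python
def vote_passes(binding_plus_ones, binding_minus_ones, hours_open,
                min_plus_ones=3, min_hours=72):
    """Return True if the vote could be closed as approved.

    Assumes only binding (PMC) votes are counted, that approval needs more
    +1 than -1 among them, at least `min_plus_ones` binding +1 votes, and
    that the vote has been open for at least `min_hours`.
    """
    if hours_open < min_hours:
        return False
    if binding_plus_ones < min_plus_ones:
        return False
    return binding_plus_ones > binding_minus_ones


print(vote_passes(3, 0, 72))   # True
print(vote_passes(2, 0, 100))  # False: fewer than 3 binding +1 votes
print(vote_passes(5, 1, 48))   # False: not yet open for 72 hours
```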

Thanks!

Davor

[1] https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&version=12339746
[2] https://dist.apache.org/repos/dist/dev/beam/2.0.0-RC4/
[3] https://dist.apache.org/repos/dist/release/beam/KEYS
[4] https://repository.apache.org/content/repositories/orgapachebeam-1017/
[5] https://github.com/apache/beam/tree/v2.0.0-RC4
[6] https://github.com/apache/beam-site/pull/231
[7] https://lists.apache.org/thread.html/981c2f13c0daf29876059b14dbe97e75bcc9e40d3ac38e33a6ecf3f9@%3Cdev.beam.apache.org%3E
[8] https://issues.apache.org/jira/issues/?jql=project%20%3D%20BEAM%20AND%20fixVersion%20%3D%202.0.0%20AND%20resolution%20%3D%20Unresolved%20ORDER%20BY%20component%20ASC%2C%20summary%20ASC%2C%20assignee%20ASC%2C%20due%20ASC%2C%20priority%20DESC%2C%20created%20ASC


Re: First stable release: Acceptance criteria

2017-05-11 Thread Jason Kuster
Just validated a decently-sized wordcount on a YARN cluster successfully.

On Thu, May 11, 2017 at 3:51 PM, Kenneth Knowles <k...@google.com.invalid>
wrote:

> I gave the archetype-based quickstart a try on as many runners and
> configurations as I could manage today, mostly embedded and YARN.
>
> There are some issues (filed and added to the doc) that may have to do with
> my setup, but may not. I'd prefer the runner maintainers / system experts
> try these on their more realistic setups.
>
> On Thu, May 11, 2017 at 3:28 PM, Thomas Groh <tg...@google.com.invalid>
> wrote:
>
> > I'm making sure the direct runner plays nice in a variety of scenarios
> > (primarily the game examples, at the moment. Been a couple of hours and
> > still going strong in streaming)
> >
> > On Thu, May 11, 2017 at 3:09 PM, Dan Halperin
> <dhalp...@google.com.invalid
> > >
> > wrote:
> >
> > > I'm focusing on:
> > >
> > > * user reported bugs (Avro, TextIO, MongoDb)
> > > * the actual Apache Release criteria (licensing, dependencies, etc.)
> > >
> > > On Thu, May 11, 2017 at 3:04 PM, Lukasz Cwik <lc...@google.com.invalid
> >
> > > wrote:
> > >
> > > > I have been trying out various Python scenarios on Windows.
> > > >
> > > > On Thu, May 11, 2017 at 3:01 PM, Jason Kuster <
> > > > jasonkus...@google.com.invalid> wrote:
> > > >
> > > > > I'll try to get wordcount running against a Spark cluster.
> > > > >
> > > > > On Wed, May 10, 2017 at 10:32 PM, Davor Bonaci <da...@apache.org>
> > > wrote:
> > > > >
> > > > > > Just a quick remainder to consider to consider contributing here.
> > > > > >
> > > > > > We are now at 6 criteria -- thanks!
> > > > > >
> > > > > > On Tue, May 9, 2017 at 2:29 AM, Aljoscha Krettek <
> > > aljos...@apache.org>
> > > > > > wrote:
> > > > > >
> > > > > > > Thanks for starting this document!
> > > > > > >
> > > > > > > I added a criterion and also verified it on the current RC.
> > > > > > >
> > > > > > > Best,
> > > > > > > Aljoscha
> > > > > > >
> > > > > > > > On 8. May 2017, at 22:48, Davor Bonaci <da...@apache.org>
> > wrote:
> > > > > > > >
> > > > > > > > Based on the process previously discussed [1], I've seeded
> the
> > > > > > acceptance
> > > > > > > > criteria document [2].
> > > > > > > >
> > > > > > > > Please consider contributing to this effort by:
> > > > > > > > * proposing additional acceptance criteria, and/or
> > > > > > > > * supporting criteria proposed by others, and/or
> > > > > > > > * validating a criteria.
> > > > > > > >
> > > > > > > > Please note that acceptance criteria shouldn't been too deep
> or
> > > too
> > > > > > broad
> > > > > > > > -- those are covered by automated tests and hackathon we had
> > > > earlier.
> > > > > > > This
> > > > > > > > should be "sanity-check"-type of criteria: simple,
> > surface-level
> > > > > > things.
> > > > > > > >
> > > > > > > > If you discover issues while validating a criteria, please:
> > > > > > > > * file a new JIRA issue, tag it as Fix Versions: “2.0.0”, and
> > > > > > > > * post on the dev@ mailing list on the thread about that
> > > specific
> > > > > > > release
> > > > > > > > candidate.
> > > > > > > >
> > > > > > > > Thanks!
> > > > > > > >
> > > > > > > > Davor
> > > > > > > >
> > > > > > > > [1]
> > > > > > > > https://lists.apache.org/thread.html/
> > > > 37caa5a94cec1405638410857f489d
> > > > > > > 7cf7fa12bbe3c36e9925b2d6e2@%3Cdev.beam.apache.org%3E
> > > > > > > > [2]
> > > > > > > > https://docs.google.com/document/d/
> > > 1XwojJ4Mj3wSlnBO1YlBs51P8kuGyg
> > > > > > > YRj2lrNrqmAUvo/
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > ---
> > > > > Jason Kuster
> > > > > Apache Beam / Google Cloud Dataflow
> > > > >
> > > >
> > >
> >
>



-- 
---
Jason Kuster
Apache Beam / Google Cloud Dataflow


Re: First stable release: Acceptance criteria

2017-05-11 Thread Jason Kuster
on Spark, doh

On Thu, May 11, 2017 at 5:45 PM, Jason Kuster <jasonkus...@google.com>
wrote:

> Just validated a decently-sized wordcount on a YARN cluster successfully.
>
> On Thu, May 11, 2017 at 3:51 PM, Kenneth Knowles <k...@google.com.invalid>
> wrote:
>
>> I gave the archetype-based quickstart a try on as many runners and
>> configurations as I could manage today, mostly embedded and YARN.
>>
>> There are some issues (filed and added to the doc) that may have to do
>> with
>> my setup, but may not. I'd prefer the runner maintainers / system experts
>> try these on their more realistic setups.
>>
>> On Thu, May 11, 2017 at 3:28 PM, Thomas Groh <tg...@google.com.invalid>
>> wrote:
>>
>> > I'm making sure the direct runner plays nice in a variety of scenarios
>> > (primarily the game examples, at the moment. Been a couple of hours and
>> > still going strong in streaming)
>> >
>> > On Thu, May 11, 2017 at 3:09 PM, Dan Halperin
>> <dhalp...@google.com.invalid
>> > >
>> > wrote:
>> >
>> > > I'm focusing on:
>> > >
>> > > * user reported bugs (Avro, TextIO, MongoDb)
>> > > * the actual Apache Release criteria (licensing, dependencies, etc.)
>> > >
>> > > On Thu, May 11, 2017 at 3:04 PM, Lukasz Cwik <lc...@google.com.invalid
>> >
>> > > wrote:
>> > >
>> > > > I have been trying out various Python scenarios on Windows.
>> > > >
>> > > > On Thu, May 11, 2017 at 3:01 PM, Jason Kuster <
>> > > > jasonkus...@google.com.invalid> wrote:
>> > > >
>> > > > > I'll try to get wordcount running against a Spark cluster.
>> > > > >
>> > > > > On Wed, May 10, 2017 at 10:32 PM, Davor Bonaci <da...@apache.org>
>> > > wrote:
>> > > > >
>> > > > > > Just a quick remainder to consider to consider contributing
>> here.
>> > > > > >
>> > > > > > We are now at 6 criteria -- thanks!
>> > > > > >
>> > > > > > On Tue, May 9, 2017 at 2:29 AM, Aljoscha Krettek <
>> > > aljos...@apache.org>
>> > > > > > wrote:
>> > > > > >
>> > > > > > > Thanks for starting this document!
>> > > > > > >
>> > > > > > > I added a criterion and also verified it on the current RC.
>> > > > > > >
>> > > > > > > Best,
>> > > > > > > Aljoscha
>> > > > > > >
>> > > > > > > > On 8. May 2017, at 22:48, Davor Bonaci <da...@apache.org>
>> > wrote:
>> > > > > > > >
>> > > > > > > > Based on the process previously discussed [1], I've seeded
>> the
>> > > > > > acceptance
>> > > > > > > > criteria document [2].
>> > > > > > > >
>> > > > > > > > Please consider contributing to this effort by:
>> > > > > > > > * proposing additional acceptance criteria, and/or
>> > > > > > > > * supporting criteria proposed by others, and/or
>> > > > > > > > * validating a criteria.
>> > > > > > > >
>> > > > > > > > Please note that acceptance criteria shouldn't been too
>> deep or
>> > > too
>> > > > > > broad
>> > > > > > > > -- those are covered by automated tests and hackathon we had
>> > > > earlier.
>> > > > > > > This
>> > > > > > > > should be "sanity-check"-type of criteria: simple,
>> > surface-level
>> > > > > > things.
>> > > > > > > >
>> > > > > > > > If you discover issues while validating a criteria, please:
>> > > > > > > > * file a new JIRA issue, tag it as Fix Versions: “2.0.0”,
>> and
>> > > > > > > > * post on the dev@ mailing list on the thread about that
>> > > specific
>> > > > > > > release
>> > > > > > > > candidate.
>> > > > > > > >
>> > > > > > > > Thanks!
>> > > > > > > >
>> > > > > > > > Davor
>> > > > > > > >
>> > > > > > > > [1]
>> > > > > > > > https://lists.apache.org/thread.html/
>> > > > 37caa5a94cec1405638410857f489d
>> > > > > > > 7cf7fa12bbe3c36e9925b2d6e2@%3Cdev.beam.apache.org%3E
>> > > > > > > > [2]
>> > > > > > > > https://docs.google.com/document/d/
>> > > 1XwojJ4Mj3wSlnBO1YlBs51P8kuGyg
>> > > > > > > YRj2lrNrqmAUvo/
>> > > > > > >
>> > > > > > >
>> > > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > > --
>> > > > > ---
>> > > > > Jason Kuster
>> > > > > Apache Beam / Google Cloud Dataflow
>> > > > >
>> > > >
>> > >
>> >
>>
>
>
>
> --
> ---
> Jason Kuster
> Apache Beam / Google Cloud Dataflow
>



-- 
---
Jason Kuster
Apache Beam / Google Cloud Dataflow


Re: First stable release: Acceptance criteria

2017-05-11 Thread Jason Kuster
I'll try to get wordcount running against a Spark cluster.

On Wed, May 10, 2017 at 10:32 PM, Davor Bonaci <da...@apache.org> wrote:

> Just a quick reminder to consider contributing here.
>
> We are now at 6 criteria -- thanks!
>
> On Tue, May 9, 2017 at 2:29 AM, Aljoscha Krettek <aljos...@apache.org>
> wrote:
>
> > Thanks for starting this document!
> >
> > I added a criterion and also verified it on the current RC.
> >
> > Best,
> > Aljoscha
> >
> > > On 8. May 2017, at 22:48, Davor Bonaci <da...@apache.org> wrote:
> > >
> > > Based on the process previously discussed [1], I've seeded the
> acceptance
> > > criteria document [2].
> > >
> > > Please consider contributing to this effort by:
> > > * proposing additional acceptance criteria, and/or
> > > * supporting criteria proposed by others, and/or
> > > * validating a criterion.
> > >
> > > Please note that acceptance criteria shouldn't be too deep or too
> broad
> > > -- those are covered by automated tests and hackathon we had earlier.
> > This
> > > should be "sanity-check"-type of criteria: simple, surface-level
> things.
> > >
> > > If you discover issues while validating a criterion, please:
> > > * file a new JIRA issue, tag it as Fix Versions: “2.0.0”, and
> > > * post on the dev@ mailing list on the thread about that specific
> > release
> > > candidate.
> > >
> > > Thanks!
> > >
> > > Davor
> > >
> > > [1]
> > > https://lists.apache.org/thread.html/37caa5a94cec1405638410857f489d7cf7fa12bbe3c36e9925b2d6e2@%3Cdev.beam.apache.org%3E
> > > [2]
> > > https://docs.google.com/document/d/1XwojJ4Mj3wSlnBO1YlBs51P8kuGygYRj2lrNrqmAUvo/
> >
> >
>



-- 
---
Jason Kuster
Apache Beam / Google Cloud Dataflow


Re: Slack Invites

2017-05-04 Thread Jason Kuster
Gitter does look like it solves many of the challenges we've been facing
and looks like it's had wide adoption in some open-source communities. I
haven't found any other official Apache projects using it, so we could be
the vanguard here.

On Thu, May 4, 2017 at 9:45 AM, Eric Anderson <eric...@google.com.invalid>
wrote:

> While on the topic: Have we considered alternatives like Gitter
> <http://gitter.im>? I'm not very familiar with Slack or Gitter, but Gitter
> advertises being a little friendlier to open invites.
>
> On Thu, May 4, 2017 at 9:38 AM Dan Halperin <dhalp...@google.com.invalid>
> wrote:
>
> > My understanding is that if you use something like that plugin, and they
> > detect it, Slack will ban you from new invites entirely or otherwise
> punish
> > you. They want this friction for free projects so that there's pressure
> to
> > pay.
> >
> > On Thu, May 4, 2017 at 9:18 AM, Jesse Anderson <
> je...@bigdatainstitute.io>
> > wrote:
> >
> > > Is it possible to change how Slack invites are handled? This might
> encourage
> > > our community contributions.
> > >
> > > Right now, people have to email in (causing extra dev@/user@ emails).
> I
> > > did
> > > a quick search and found this <https://github.com/rauchg/slackin> so
> > > people
> > > can invite themselves.
> > >
> > > Thanks,
> > >
> > > Jesse
> > > --
> > > Thanks,
> > >
> > > Jesse
> > >
> >
>



-- 
---
Jason Kuster
Apache Beam / Google Cloud Dataflow


Re: An Update on Jenkins

2017-04-26 Thread Jason Kuster
Yup, not. Update as of this morning: Infra has contacted Github support
about the throttling issue; they're hoping that support will be able to
point to the issue. They are continuing to see rolling problems as the
result of some certificate upgrades they performed over the weekend.

In terms of the hackathon today, several of the committers with Jenkins
access have graciously agreed to manually trigger builds via the Jenkins UI
if anyone needs a PR validated. Reach out on the dev list or on slack and
someone will help you out. :)

Best,

Jason

On Wed, Apr 26, 2017 at 9:18 AM, Dan Halperin <dhalp...@google.com.invalid>
wrote:

> > If not, feel free to reply to this thread
>
> ... not. :) :(
>
> On Tue, Apr 25, 2017 at 9:58 PM, Jean-Baptiste Onofré <j...@nanthrax.net>
> wrote:
>
> > Thanks for the update !
> >
> > Regards
> > JB
> >
> > On Apr 26, 2017, 05:51, at 05:51, Jason Kuster <jasonkus...@google.com.
> INVALID>
> > wrote:
> > >Hey folks,
> > >
> > >There have been a couple of different issues over the last couple of
> > >days
> > >related to some necessary updates Infra has been working on. We've
> > >tracked
> > >down the last couple of issues, and the latest one seems to be that
> > >we're
> > >being hit by the rate limiter as a result of everything starting back
> > >up
> > >again. They expect that waiting a couple of hours should solve the
> > >problem,
> > >so hopefully by tomorrow things will be back to normal. If not, feel
> > >free
> > >to reply to this thread, and I'll try to keep things up to date with
> > >status.
> > >
> > >Best,
> > >
> > >Jason
> > >
> > >--
> > >---
> > >Jason Kuster
> > >Apache Beam / Google Cloud Dataflow
> >
>



-- 
---
Jason Kuster
Apache Beam / Google Cloud Dataflow


An Update on Jenkins

2017-04-25 Thread Jason Kuster
Hey folks,

There have been a couple of different issues over the last couple of days
related to some necessary updates Infra has been working on. We've tracked
down the last couple of issues, and the latest one seems to be that we're
being hit by the rate limiter as a result of everything starting back up
again. They expect that waiting a couple of hours should solve the problem,
so hopefully by tomorrow things will be back to normal. If not, feel free
to reply to this thread, and I'll try to keep things up to date with status.

Best,

Jason

-- 
---
Jason Kuster
Apache Beam / Google Cloud Dataflow


Re: Build failed in Jenkins: beam_SeedJob #214

2017-04-18 Thread Jason Kuster
Yup -- it looks like we're going to need reapproval when we change our
jobs[1].

[1]
https://github.com/jenkinsci/job-dsl-plugin/wiki/Script-Security#script-approval

On Tue, Apr 18, 2017 at 10:20 AM, Ted Yu <yuzhih...@gmail.com> wrote:

> Thanks Jason for the effort.
> Looks like we hit this:
>
> ERROR: script not yet approved for use
>
>
> On Tue, Apr 18, 2017 at 10:16 AM, Jason Kuster <
> jasonkus...@google.com.invalid> wrote:
>
> > I'm looking into this currently as well; that's one of the mitigations
> I'm
> > considering too but I'm giving the evaluate thing a try[1][2] (once it
> > starts running -- executors are full currently).
> >
> > [1] https://builds.apache.org/view/Beam/job/beam_SeedJob/215/
> > [2] https://github.com/apache/beam/pull/2578
> >
> > On Tue, Apr 18, 2017 at 10:12 AM, Ted Yu <yuzhih...@gmail.com> wrote:
> >
> > > To unblock the builds, how about embedding functions used by respective
> > > scripts in the scripts themselves ?
> > >
> > > e.g. buildPerformanceTest is only used by the following scripts:
> > >
> > > .test-infra/jenkins/job_beam_PerformanceTests_Dataflow.groovy:
> > >  common_job_properties.buildPerformanceTest(delegate, argMap)
> > > .test-infra/jenkins/job_beam_PerformanceTests_JDBC.groovy:
> > >  common_job_properties.buildPerformanceTest(delegate, argMap)
> > > .test-infra/jenkins/job_beam_PerformanceTests_Spark.groovy:
> > >  common_job_properties.buildPerformanceTest(delegate, argMap)
> > >
> > > On Tue, Apr 18, 2017 at 10:05 AM, Davor Bonaci <da...@apache.org>
> wrote:
> > >
> > > > Not so simple, unfortunately [1]. Ideas welcome ;-)
> > > >
> > > > Davor
> > > >
> > > > [1]
> > > > https://github.com/jenkinsci/job-dsl-plugin/wiki/Migration#
> > > > migrating-to-160
> > > >
> > > > On Tue, Apr 18, 2017 at 9:57 AM, Ted Yu <yuzhih...@gmail.com> wrote:
> > > >
> > > > > I wonder if we should adopt the suggestion here (involving
> evaluate):
> > > > > http://stackoverflow.com/questions/9136328/including-a-
> > > > > groovy-script-in-another-groovy
> > > > >
> > > > > On Tue, Apr 18, 2017 at 9:45 AM, Apache Jenkins Server <
> > > > > jenk...@builds.apache.org> wrote:
> > > > >
> > > > > > See <https://builds.apache.org/job/beam_SeedJob/214/display/
> > > > > > redirect?page=changes>
> > > > > >
> > > > > > Changes:
> > > > > >
> > > > > > [jbonofre] [BEAM-59] Register standard FileSystems wherever we
> > > register
> > > > > >
> > > > > > [iemejia] Enable flink dependency enforcement and make
> dependencies
> > > > > > explicit
> > > > > >
> > > > > > [iemejia] Fix Javadoc warnings on Flink Runner
> > > > > >
> > > > > > [iemejia] Remove flink-annotations dependency
> > > > > >
> > > > > > [iemejia] [BEAM-1993] Remove special unbounded Flink source/sink
> > > > > >
> > > > > > [tgroh] Translate PTransforms to and from Runner API Protos
> > > > > >
> > > > > > [altay] Clean up DirectRunner Clock and TransformResult
> > > > > >
> > > > > > [altay] Remove overloading of __call__ in DirectRunner
> > > > > >
> > > > > > --
> > > > > > [...truncated 202.75 KB...]
> > > > > >  x [deleted] (none) -> origin/pr/902/merge
> > > > > >  x [deleted] (none) -> origin/pr/903/head
> > > > > >  x [deleted] (none) -> origin/pr/903/merge
> > > > > >  x [deleted] (none) -> origin/pr/904/head
> > > > > >  x [deleted] (none) -> origin/pr/904/merge
> > > > > >  x [deleted] (none) -> origin/pr/905/head
> > > > > >  x [deleted] (none) -> origin/pr/905/merge
> > > > > >  x [deleted] (none) -> origin/pr/906/head
> > > > > >  x [deleted] (none) -> origin/pr/906/merge
> > > > > >  x [deleted] (none) -> origin/pr/907/head
> > > > > >  x [deleted] (none) -> origin/pr/907/merge
> >

Re: Build failed in Jenkins: beam_SeedJob #214

2017-04-18 Thread Jason Kuster
>  x [deleted] (none) -> origin/pr/984/merge
> > > >  x [deleted] (none) -> origin/pr/985/head
> > > >  x [deleted] (none) -> origin/pr/985/merge
> > > >  x [deleted] (none) -> origin/pr/986/head
> > > >  x [deleted] (none) -> origin/pr/986/merge
> > > >  x [deleted] (none) -> origin/pr/987/head
> > > >  x [deleted] (none) -> origin/pr/988/head
> > > >  x [deleted] (none) -> origin/pr/988/merge
> > > >  x [deleted] (none) -> origin/pr/989/head
> > > >  x [deleted] (none) -> origin/pr/989/merge
> > > >  x [deleted] (none) -> origin/pr/99/head
> > > >  x [deleted] (none) -> origin/pr/99/merge
> > > >  x [deleted] (none) -> origin/pr/990/head
> > > >  x [deleted] (none) -> origin/pr/990/merge
> > > >  x [deleted] (none) -> origin/pr/991/head
> > > >  x [deleted] (none) -> origin/pr/991/merge
> > > >  x [deleted] (none) -> origin/pr/992/head
> > > >  x [deleted] (none) -> origin/pr/992/merge
> > > >  x [deleted] (none) -> origin/pr/993/head
> > > >  x [deleted] (none) -> origin/pr/993/merge
> > > >  x [deleted] (none) -> origin/pr/994/head
> > > >  x [deleted] (none) -> origin/pr/994/merge
> > > >  x [deleted] (none) -> origin/pr/995/head
> > > >  x [deleted] (none) -> origin/pr/995/merge
> > > >  x [deleted] (none) -> origin/pr/996/head
> > > >  x [deleted] (none) -> origin/pr/996/merge
> > > >  x [deleted] (none) -> origin/pr/997/head
> > > >  x [deleted] (none) -> origin/pr/997/merge
> > > >  x [deleted] (none) -> origin/pr/998/head
> > > >  x [deleted] (none) -> origin/pr/999/head
> > > >  x [deleted] (none) -> origin/pr/999/merge
> > > > error: RPC failed; result=18, HTTP code = 200
> > > > fatal: The remote end hung up unexpectedly
> > > >
> > > > at org.jenkinsci.plugins.gitclient.CliGitAPIImpl.
> > > > launchCommandIn(CliGitAPIImpl.java:1799)
> > > > at org.jenkinsci.plugins.gitclient.CliGitAPIImpl.
> > > > launchCommandWithCredentials(CliGitAPIImpl.java:1525)
> > > > at org.jenkinsci.plugins.gitclient.CliGitAPIImpl.
> > > > access$300(CliGitAPIImpl.java:65)
> > > > at org.jenkinsci.plugins.gitclient.CliGitAPIImpl$1.
> > > > execute(CliGitAPIImpl.java:316)
> > > > at org.jenkinsci.plugins.gitclient.RemoteGitImpl$
> > > > CommandInvocationHandler$1.call(RemoteGitImpl.java:153)
> > > > at org.jenkinsci.plugins.gitclient.RemoteGitImpl$
> > > > CommandInvocationHandler$1.call(RemoteGitImpl.java:146)
> > > > at hudson.remoting.UserRequest.perform(UserRequest.java:153)
> > > > at hudson.remoting.UserRequest.perform(UserRequest.java:50)
> > > > at hudson.remoting.Request$2.run(Request.java:336)
> > > > at hudson.remoting.InterceptingExecutorService$1.call(
> > > > InterceptingExecutorService.java:68)
> > > > at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> > > > at java.util.concurrent.ThreadPoolExecutor.runWorker(
> > > > ThreadPoolExecutor.java:1145)
> > > > at java.util.concurrent.ThreadPoolExecutor$Worker.run(
> > > > ThreadPoolExecutor.java:615)
> > > > at java.lang.Thread.run(Thread.java:745)
> > > > at ..remote call to beam4(Native Method)
> > > > at hudson.remoting.Channel.attachCallSiteStackTrace(
> > > > Channel.java:1537)
> > > > at hudson.remoting.UserResponse.
> retrieve(UserRequest.java:253)
> > > > at hudson.remoting.Channel.call(Channel.java:822)
> > > > at org.jenkinsci.plugins.gitclient.RemoteGitImpl$
> > > > CommandInvocationHandler.execute(RemoteGitImpl.java:146)
> > > > at sun.reflect.GeneratedMethodAccessor862.invoke(Unknown
> > Source)
> > > > at sun.reflect.DelegatingMethodAccessorImpl.invoke(
> > > > DelegatingMethodAccessorImpl.java:43)
> > > > at java.lang.reflect.Method.invoke(Method.java:498)
> > > > at org.jenkinsci.plugins.gitclient.RemoteGitImpl$
> > > > CommandInvocationHandler.invoke(RemoteGitImpl.java:132)
> > > > at com.sun.proxy.$Proxy103.execute(Unknown Source)
> > > > at hudson.plugins.git.GitSCM.fetchFrom(GitSCM.java:804)
> > > > ... 11 more
> > > > ERROR: null
> > > > Retrying after 10 seconds
> > > >  > git rev-parse --is-inside-work-tree # timeout=10
> > > > Fetching changes from the remote Git repository
> > > >  > git config remote.origin.url https://github.com/apache/beam.git #
> > > > timeout=10
> > > > Pruning obsolete local branches
> > > > Fetching upstream changes from https://github.com/apache/beam.git
> > > >  > git --version # timeout=10
> > > >  > git fetch --tags --progress https://github.com/apache/beam.git
> > > > +refs/heads/*:refs/remotes/origin/* +refs/pull/*:refs/remotes/
> > > origin/pr/*
> > > > --prune
> > > >  > git rev-parse origin/master^{commit} # timeout=10
> > > > Checking out Revision 034bcb4e2a2ee59c1a9bf16690547765064327e2
> > > > (origin/master)
> > > >  > git config core.sparsecheckout # timeout=10
> > > >  > git checkout -f 034bcb4e2a2ee59c1a9bf16690547765064327e2
> > > >  > git rev-list c52ce7c4bd952d943bccb8acff53b36b40c35428 #
> timeout=10
> > > > Cleaning workspace
> > > >  > git rev-parse --verify HEAD # timeout=10
> > > > Resetting working tree
> > > >  > git reset --hard # timeout=10
> > > >  > git clean -fdx # timeout=10
> > > > [EnvInject] - Executing scripts and injecting environment variables
> > after
> > > > the SCM step.
> > > > [EnvInject] - Injecting as environment variables the properties
> content
> > > > SPARK_LOCAL_IP=127.0.0.1
> > > >
> > > > [EnvInject] - Variables injected successfully.
> > > > Processing DSL script job_beam_PerformanceTests_Dataflow.groovy
> > > > ERROR: startup failed:
> > > > <https://builds.apache.org/job/beam_SeedJob/ws/.test-
> > > > infra/jenkins/job_beam_PerformanceTests_Dataflow.groovy>: 19: unable
> > to
> > > > resolve class common_job_properties
> > > >  @ line 19, column 1.
> > > >import common_job_properties
> > > >^
> > > >
> > > > 1 error
> > > >
> > > >
> > >
> >
>



-- 
---
Jason Kuster
Apache Beam / Google Cloud Dataflow


First IO IT Running!

2017-03-21 Thread Jason Kuster
Hi all,

Exciting news! As of yesterday, we have checked in the Jenkins
configuration for our first continuously running IO Integration Test! You
can check it out in Jenkins here[1]. We’re also publishing results to a
database, and we’ve turned up a basic dashboarding system where you can see
the results here[2]. Caveat: there are only two runs so far, and we’ll still
be tweaking the underlying system, so don’t panic that we’re up and to the
right currently. ;)

This is the first test running continuously on top of the performance / IO
testing infrastructure described in this doc[3].  Initial support for Beam
is now present in PerfKit Benchmarker; given what they had already, it was
easiest to add support for Dataflow and Java. We need your help to add
additional support! The doc lists a number of JIRA issues to build out
support for other systems. I’m happy to work with people to help them
understand what is necessary for these tasks; just send an email to the
list if you need help and I’ll help you move forwards.

Looking forward to it!

Jason

[1] https://builds.apache.org/job/beam_PerformanceTests_JDBC/
[2]
https://apache-beam-testing.appspot.com/explore?dashboard=5714163003293696
[3]
https://docs.google.com/document/d/1PsjGPSN6FuorEEPrKEP3u3m16tyOzph5FnL2DhaRDz0/edit?ts=58a78e73

-- 
---
Jason Kuster
Apache Beam / Google Cloud Dataflow


Re: [ANNOUNCEMENT] New committers, March 2017 edition!

2017-03-17 Thread Jason Kuster
Congratulations to the new committers!

On Fri, Mar 17, 2017 at 2:16 PM, Kenneth Knowles <k...@google.com.invalid>
wrote:

> Congrats all!
>
> On Fri, Mar 17, 2017 at 2:13 PM, Davor Bonaci <da...@apache.org> wrote:
>
> > Please join me and the rest of Beam PMC in welcoming the following
> > contributors as our newest committers. They have significantly
> contributed
> > to the project in different ways, and we look forward to many more
> > contributions in the future.
> >
> > * Chamikara Jayalath
> > Chamikara has been contributing to Beam since inception, and previously
> to
> > Google Cloud Dataflow, accumulating a total of 51 commits (8,301 ++ /
> 3,892
> > --) since February 2016 [1]. He contributed broadly to the project, but
> > most significantly to the Python SDK, building the IO framework in this
> SDK
> > [2], [3].
> >
> > * Eugene Kirpichov
> > Eugene has been contributing to Beam since inception, and previously to
> > Google Cloud Dataflow, accumulating a total of 95 commits (22,122 ++ /
> > 18,407 --) since February 2016 [1]. In recent months, he’s been driving
> the
> > Splittable DoFn effort [4]. A true expert on the IO subsystem, Eugene has
> > reviewed nearly every IO contributed to Beam. Finally, Eugene contributed
> > the Beam Style Guide, and is championing it across the project.
> >
> > * Ismaël Mejia
> > Ismaël has been contributing to Beam since mid-2016, accumulating a total
> > of 35 commits (3,137 ++ / 1,328 --) [1]. He authored the HBaseIO
> connector,
> > helped on the Spark runner, and contributed in other areas as well,
> > including cross-project collaboration with Apache Zeppelin. Ismaël
> reported
> > 24 Jira issues.
> >
> > * Aviem Zur
> > Aviem has been contributing to Beam since early fall, accumulating a
> total
> > of 49 commits (6,471 ++ / 3,185 --) [1]. He reported 43 Jira issues, and
> > resolved ~30 issues. Aviem improved the stability of the Spark runner a
> > lot, and introduced support for metrics. Finally, Aviem is championing
> > dependency management across the project.
> >
> > Congratulations to all four! Welcome!
> >
> > Davor
> >
> > [1]
> > https://github.com/apache/beam/graphs/contributors?from=
> > 2016-02-01=2017-03-17=c
> > [2]
> > https://github.com/apache/beam/blob/v0.6.0/sdks/python/
> > apache_beam/io/iobase.py#L70
> > [3]
> > https://github.com/apache/beam/blob/v0.6.0/sdks/python/
> > apache_beam/io/iobase.py#L561
> > [4] https://s.apache.org/splittable-do-fn
> >
>



-- 
---
Jason Kuster
Apache Beam / Google Cloud Dataflow


Re: Jenkins build became unstable: beam_SeedJob #165

2017-03-14 Thread Jason Kuster
Manual build, ignore.

On Tue, Mar 14, 2017 at 10:37 PM, Apache Jenkins Server <
jenk...@builds.apache.org> wrote:

> See <https://builds.apache.org/job/beam_SeedJob/165/display/redirect>
>
>


-- 
---
Jason Kuster
Apache Beam / Google Cloud Dataflow


Re: Performance Testing Next Steps

2017-02-22 Thread Jason Kuster
Hey all, just wanted to pop this up again for people -- if anyone has
thoughts on performance testing please feel welcome to chime in. :)

On Fri, Feb 17, 2017 at 4:03 PM, Jason Kuster <jasonkus...@google.com>
wrote:

> Hi all,
>
> I've written up a doc on next steps for getting performance testing up and
> running for Beam. I'd love to hear from people -- there's a fair amount of
> work encapsulated in here, but the end result is that we have a performance
> testing system which we can use for benchmarking all aspects of Beam, which
> would be really exciting. Looking forward to your thoughts.
>
> https://docs.google.com/document/d/1PsjGPSN6FuorEEPrKEP3u3m16tyOz
> ph5FnL2DhaRDz0/edit?ts=58a78e73
>
> Best,
>
> Jason
>
> --
> ---
> Jason Kuster
> Apache Beam / Google Cloud Dataflow
>



-- 
---
Jason Kuster
Apache Beam / Google Cloud Dataflow


Performance Testing Next Steps

2017-02-17 Thread Jason Kuster
Hi all,

I've written up a doc on next steps for getting performance testing up and
running for Beam. I'd love to hear from people -- there's a fair amount of
work encapsulated in here, but the end result is that we have a performance
testing system which we can use for benchmarking all aspects of Beam, which
would be really exciting. Looking forward to your thoughts.

https://docs.google.com/document/d/1PsjGPSN6FuorEEPrKEP3u3m16tyOzph5FnL2DhaRDz0/edit?ts=58a78e73

Best,

Jason

-- 
---
Jason Kuster
Apache Beam / Google Cloud Dataflow


Re: Build failed in Jenkins: beam_SeedJob #115

2017-02-16 Thread Jason Kuster
at hudson.plugins.git.GitSCM.fetchFrom(GitSCM.java:799)
> at hudson.plugins.git.GitSCM.retrieveChanges(GitSCM.java:1055)
> at hudson.plugins.git.GitSCM.checkout(GitSCM.java:1086)
> at hudson.scm.SCM.checkout(SCM.java:485)
> at hudson.model.AbstractProject.checkout(AbstractProject.java:
> 1269)
> at hudson.model.AbstractBuild$AbstractBuildExecution.
> defaultCheckout(AbstractBuild.java:604)
> at jenkins.scm.SCMCheckoutStrategy.checkout(
> SCMCheckoutStrategy.java:86)
> at hudson.model.AbstractBuild$AbstractBuildExecution.run(
> AbstractBuild.java:529)
> at hudson.model.Run.execute(Run.java:1741)
> at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
> at hudson.model.ResourceController.execute(
> ResourceController.java:98)
> at hudson.model.Executor.run(Executor.java:410)
> Caused by: hudson.plugins.git.GitException: Command "git -c
> core.askpass=true fetch --tags --progress https://github.com/apache/${
> repositoryName}.git +refs/heads/*:refs/remotes/origin/*
> +refs/pull/*:refs/remotes/origin/pr/*" returned status code 128:
> stdout:
> stderr: fatal: unable to access 'https://github.com/apache/${
> repositoryName}.git/': The requested URL returned error: 400
>
> at org.jenkinsci.plugins.gitclient.CliGitAPIImpl.
> launchCommandIn(CliGitAPIImpl.java:1723)
> at org.jenkinsci.plugins.gitclient.CliGitAPIImpl.
> launchCommandWithCredentials(CliGitAPIImpl.java:1459)
> at org.jenkinsci.plugins.gitclient.CliGitAPIImpl.
> access$300(CliGitAPIImpl.java:63)
> at org.jenkinsci.plugins.gitclient.CliGitAPIImpl$1.
> execute(CliGitAPIImpl.java:314)
> at org.jenkinsci.plugins.gitclient.RemoteGitImpl$
> CommandInvocationHandler$1.call(RemoteGitImpl.java:152)
> at org.jenkinsci.plugins.gitclient.RemoteGitImpl$
> CommandInvocationHandler$1.call(RemoteGitImpl.java:145)
> at hudson.remoting.UserRequest.perform(UserRequest.java:153)
> at hudson.remoting.UserRequest.perform(UserRequest.java:50)
> at hudson.remoting.Request$2.run(Request.java:332)
> at hudson.remoting.InterceptingExecutorService$1.call(
> InterceptingExecutorService.java:68)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(
> ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(
> ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> at ..remote call to beam1(Native Method)
> at hudson.remoting.Channel.attachCallSiteStackTrace(
> Channel.java:1416)
> at hudson.remoting.UserResponse.retrieve(UserRequest.java:253)
> at hudson.remoting.Channel.call(Channel.java:781)
> at org.jenkinsci.plugins.gitclient.RemoteGitImpl$
> CommandInvocationHandler.execute(RemoteGitImpl.java:145)
> at sun.reflect.GeneratedMethodAccessor637.invoke(Unknown Source)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(
> DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.jenkinsci.plugins.gitclient.RemoteGitImpl$
> CommandInvocationHandler.invoke(RemoteGitImpl.java:131)
> at com.sun.proxy.$Proxy93.execute(Unknown Source)
> at hudson.plugins.git.GitSCM.fetchFrom(GitSCM.java:797)
> ... 11 more
> ERROR: null
>



-- 
---
Jason Kuster
Apache Beam / Google Cloud Dataflow


Re: Build failed in Jenkins: beam_SeedJob #113

2017-02-16 Thread Jason Kuster
Manual build, safe to ignore.

On Thu, Feb 16, 2017 at 12:10 PM, Apache Jenkins Server <
jenk...@builds.apache.org> wrote:

> See <https://builds.apache.org/job/beam_SeedJob/113/>
>
> --
> GitHub pull request #2020 of commit 509b4d61b5b7505d71b5f7d19a541e217d04d407,
> no merge conflicts.
> Setting status of 509b4d61b5b7505d71b5f7d19a541e217d04d407 to PENDING
> with url https://builds.apache.org/job/beam_SeedJob/113/ and message:
> 'Build started sha1 is merged.'
> Using context: Jenkins: Seed Job
> [EnvInject] - Loading node environment variables.
> Building remotely on beam3 (beam) in workspace <https://builds.apache.org/
> job/beam_SeedJob/ws/>
>  > git rev-parse --is-inside-work-tree # timeout=10
> Fetching changes from the remote Git repository
>  > git config remote.origin.url https://github.com/apache/beam.git #
> timeout=10
> Fetching upstream changes from https://github.com/apache/beam.git
>  > git --version # timeout=10
>  > git -c core.askpass=true fetch --tags --progress
> https://github.com/apache/beam.git +refs/heads/*:refs/remotes/origin/*
> +refs/pull/*:refs/remotes/origin/pr/*
>  > git rev-parse refs/remotes/origin/pr/2020/merge^{commit} # timeout=10
>  > git rev-parse refs/remotes/origin/origin/pr/2020/merge^{commit} #
> timeout=10
> Checking out Revision 04a945a46f0b04a65947ec92df327265c71aa785
> (refs/remotes/origin/pr/2020/merge)
>  > git config core.sparsecheckout # timeout=10
>  > git checkout -f 04a945a46f0b04a65947ec92df327265c71aa785
> First time build. Skipping changelog.
> Cleaning workspace
>  > git rev-parse --verify HEAD # timeout=10
> Resetting working tree
>  > git reset --hard # timeout=10
>  > git clean -fdx # timeout=10
> [EnvInject] - Executing scripts and injecting environment variables after
> the SCM step.
> [EnvInject] - Injecting as environment variables the properties content
> SPARK_LOCAL_IP=127.0.0.1
>
> [EnvInject] - Variables injected successfully.
> Processing DSL script job_beam_PostCommit_Java_MavenInstall.groovy
> ERROR: (common_job_properties.groovy, line 33) No signature of method:
> static common_job_properties.setTopLevelJobProperties() is applicable for
> argument types: (javaposse.jobdsl.dsl.jobs.MavenJob, java.lang.String,
> java.lang.String, java.lang.Integer) values: [javaposse.jobdsl.dsl.jobs.
> MavenJob@feaad40, beam, master, 100]
>



-- 
---
Jason Kuster
Apache Beam / Google Cloud Dataflow


Re: Jenkins build became unstable: beam_PostCommit_Java_RunnableOnService_Apex #363

2017-01-31 Thread Jason Kuster
This seems like it could be a legitimate flake.

Expected: <1970-01-01T00:09:59.999Z>
 but: was <2017-02-01T01:38:42.261Z>


Anyone with more knowledge about the apex runner have any ideas?

On Tue, Jan 31, 2017 at 5:48 PM, Apache Jenkins Server <
jenk...@builds.apache.org> wrote:

> See <https://builds.apache.org/job/beam_PostCommit_Java_
> RunnableOnService_Apex/363/changes>
>
>


-- 
---
Jason Kuster
Apache Beam / Google Cloud Dataflow


Re: [ANNOUNCEMENT] New committers, January 2017 edition!

2017-01-26 Thread Jason Kuster
Congrats all! Very exciting. :)

On Thu, Jan 26, 2017 at 4:48 PM, Jesse Anderson <je...@smokinghand.com>
wrote:

> Welcome!
>
> On Thu, Jan 26, 2017, 7:27 PM Davor Bonaci <da...@apache.org> wrote:
>
> > Please join me and the rest of Beam PMC in welcoming the following
> > contributors as our newest committers. They have significantly
> contributed
> > to the project in different ways, and we look forward to many more
> > contributions in the future.
> >
> > * Stas Levin
> > Stas has contributed across the breadth of the project, from the Spark
> > runner to the core pieces and Java SDK. Looking at code contributions
> > alone, he authored 43 commits and reported 25 issues. Stas is very active
> > on the mailing lists too, contributing to good discussions and proposing
> > improvements to the Beam model.
> >
> > * Ahmet Altay
> > Ahmet is a major contributor to the Python SDK, both in terms of design
> and
> > code contribution. Looking at code contributions alone, he authored 98
> > commits and reviewed dozens of pull requests. With Python SDK’s imminent
> > merge to the master branch, Ahmet contributed towards establishing a new
> > major component in Beam.
> >
> > * Pei He
> > Pei has been contributing to Beam since its inception, accumulating a
> total
> > of 118 commits since February. He has made several major contributions,
> > most recently by redesigning IOChannelFactory / FileSystem APIs (in
> > progress), which would extend Beam’s portability to many additional file
> > systems and cloud providers.
> >
> > Congratulations to all three! Welcome!
> >
> > Davor
> >
>



-- 
---
Jason Kuster
Apache Beam (Incubating) / Google Cloud Dataflow


Re: Better developer instructions for using Maven?

2017-01-25 Thread Jason Kuster
> >>>>> others to look at it. I think this should be our criteria (i.e. what
> >>>>> will a new but maven-savvy user run before pushing their code).
> >>>>>
> >>>>> As long as the pre-commit hooks still check everything I'm ok with
> >>>>>>
> >>>>> making
> >>>>
> >>>>> the default a little more lightweight.
> >>>>>>
> >>>>>
> >>>>> The fact that our pre-commit hooks take a long time to run does
> change
> >>>>> things. Nothing more annoying than seeing that your PR failed 3 hours
> >>>>> later because you had some trailing whitespace...
> >>>>>
> >>>>> On Thu, 5 Jan 2017 at 21:49 Lukasz Cwik <lc...@google.com.invalid>
> >>>>>>
> >>>>> wrote:
> >>>>>
> >>>>>>
> >>>>>> I was hoping that the default mvn verify would be the slow build
> >>>>>>>
> >>>>>> and a
> >>>
> >>>> profile could be enabled that would skip checks to make things
> >>>>>>>
> >>>>>> faster
> >>>
> >>>> for
> >>>>>
> >>>>>> regular contributors. This way a person doesn't need to have
> >>>>>>>
> >>>>>> detailed
> >>>
> >>>> knowledge of all our profiles and what they do (typically mvn
> >>>>>>>
> >>>>>> verify)
> >>>
> >>>> will
> >>>>>
> >>>>>> do the right thing most of the time.
> >>>>>>>
> >>>>>>> On Thu, Jan 5, 2017 at 9:30 AM, Dan Halperin
> >>>>>>>
> >>>>>> <dhalp...@google.com.invalid>
> >>>>>
> >>>>>> wrote:
> >>>>>>>
> >>>>>>> On Thu, Jan 5, 2017 at 9:28 AM, Jesse Anderson <
> >>>>>>>>
> >>>>>>> je...@smokinghand.com
> >>>>
> >>>>>
> >>>>>> wrote:
> >>>>>>>>
> >>>>>>>> @dan are you saying that mvn verify isn't doing checkstyle
> >>>>>>>>>
> >>>>>>>> anymore?
> >>>>
> >>>>>
> >>>>>>>>
> >>>>>>>> `mvn verify` alone should not be running checkstyle, if modules
> >>>>>>>>
> >>>>>>> are
> >>>
> >>>> configured correctly.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Some of
> >>>>>>>>> the checkstyles are still running for a few modules. Also, the
> >>>>>>>>>
> >>>>>>>> contribution
> >>>>>>>>
> >>>>>>>>> docs will need to change.
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Yes. The PR includes discussion of these other needed changes,
> >>>>>>>> unfortunately one PR can't change two repositories.
> >>>>>>>>
> >>>>>>>> Please continue the discussion on the PR, then I will summarize it
> >>>>>>>>
> >>>>>>> back
> >>>>>
> >>>>>> into the dev thread.
> >>>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>> Dan
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> They say to run mvn verify before commits.
> >>>>>>>>>
> >>>>>>>>> On Thu, Jan 5, 2017 at 9:25 AM Dan Halperin
> >>>>>>>>>
> >>>>>>>> <dhalp...@google.com.invalid
> >>>>>>>
> >>>>>>>>
> >>>>>>>>> wrote:
> >>>>>>>>>
> >>>>>>>>> Several folks seem to have been confused after BEAM-246, where
> >>>>>>>>>>
> >>>>>>>>> we
> >>>>
> >>>>> moved
> >>>>>>>
> >>>>>>>> the
> >>>>>>>>>
> >>>>>>>>>> "slow things" into the release profile. I've started a
> >>>>>>>>>>
> >>>>>>>>> discussion
> >>>>
> >>>>> with
> >>>>>>>
> >>>>>>>> https://github.com/apache/beam/pull/1740 to see if there are
> >>>>>>>>>>
> >>>>>>>>> things
> >>>>>
> >>>>>> we
> >>>>>>>
> >>>>>>>> can
> >>>>>>>>>
> >>>>>>>>>> do to fill these gaps.
> >>>>>>>>>>
> >>>>>>>>>> Would love folks to chime in with opinions.
> >>>>>>>>>>
> >>>>>>>>>> Dan
> >>>>>>>>>>
> >>>>>>>>>> On Wed, Jan 4, 2017 at 1:34 PM, Jesse Anderson <
> >>>>>>>>>>
> >>>>>>>>> je...@smokinghand.com>
> >>>>>>>
> >>>>>>>> wrote:
> >>>>>>>>>>
> >>>>>>>>>> @Eugene, yes that failed on the checkstyle.
> >>>>>>>>>>>
> >>>>>>>>>>> On Wed, Jan 4, 2017 at 1:27 PM Eugene Kirpichov
> >>>>>>>>>>> <kirpic...@google.com.invalid> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>> Try just -Prelease.
> >>>>>>>>>>>> On Wed, Jan 4, 2017 at 1:21 PM Jesse Anderson <
> >>>>>>>>>>>>
> >>>>>>>>>>> je...@smokinghand.com
> >>>>>>>>
> >>>>>>>>>
> >>>>>>>>>> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>> Fails because I don't have a secret key.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Wed, Jan 4, 2017 at 1:03 PM Jean-Baptiste Onofré <
> >>>>>>>>>>>>>
> >>>>>>>>>>>> j...@nanthrax.net
> >>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Hi Jesse,
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Could you try the same with:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> mvn verify -Prelease,apache-release
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> ?
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Regards
> >>>>>>>>>>>>>> JB
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> On 01/04/2017 09:53 PM, Jesse Anderson wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> For some reason, running "mvn verify" isn't running
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>> checkstyle
> >>>>>>>>
> >>>>>>>>> on
> >>>>>>>>>
> >>>>>>>>>> everything. I had checkstyle errors in
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>> beam-sdks-java-core
> >>>>>
> >>>>>> that
> >>>>>>>>
> >>>>>>>>> weren't
> >>>>>>>>>>>>
> >>>>>>>>>>>>> being found.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> I thought this was due to the extra parameters. I
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>> reran
> >>>>
> >>>>> with
> >>>>>>>
> >>>>>>>> the
> >>>>>>>>>
> >>>>>>>>>> plain
> >>>>>>>>>>>>
> >>>>>>>>>>>>> "mvn
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> verify" and it still didn't find them. From the
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>> output,
> >>>>
> >>>>> it
> >>>>>
> >>>>>> doesn't
> >>>>>>>>>>
> >>>>>>>>>>> look
> >>>>>>>>>>>>
> >>>>>>>>>>>>> like they're being run at all.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Jesse
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>> --
> >>>>>>>>>>>>>> Jean-Baptiste Onofré
> >>>>>>>>>>>>>> jbono...@apache.org
> >>>>>>>>>>>>>> http://blog.nanthrax.net
> >>>>>>>>>>>>>> Talend - http://www.talend.com
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>
> >>>>
> >>>
> >>
> > --
> > Jean-Baptiste Onofré
> > jbono...@apache.org
> > http://blog.nanthrax.net
> > Talend - http://www.talend.com
> >
>



-- 
---
Jason Kuster
Apache Beam (Incubating) / Google Cloud Dataflow


Re: Build failed in Jenkins: beam_PostCommit_Java_RunnableOnService_Dataflow #2046

2017-01-17 Thread Jason Kuster
sults
> 2017-01-18T07:12:26.996 [INFO]
> 2017-01-18T07:12:26.996 [INFO] --- maven-dependency-plugin:2.10:analyze-only
> (default) @ beam-runners-google-cloud-dataflow-java ---
> 2017-01-18T07:12:27.054 [INFO] No dependency problems found
> [JENKINS] Archiving disabled
> [JENKINS] Archiving disabled
> [JENKINS] Archiving disabled
> [JENKINS] Archiving disabled
> [JENKINS] Archiving disabled
> [JENKINS] Archiving disabled
> [JENKINS] Archiving disabled
> 2017-01-18T07:12:29.809 [INFO] --
> --
> 2017-01-18T07:12:29.809 [INFO] Reactor Summary:
> 2017-01-18T07:12:29.810 [INFO]
> 2017-01-18T07:12:29.810 [INFO] Apache Beam :: Parent
> .. SUCCESS [ 21.063 s]
> 2017-01-18T07:12:29.810 [INFO] Apache Beam :: SDKs
>  SUCCESS [  0.519 s]
> 2017-01-18T07:12:29.810 [INFO] Apache Beam :: SDKs :: Java
>  SUCCESS [  0.454 s]
> 2017-01-18T07:12:29.810 [INFO] Apache Beam :: SDKs :: Java :: Core
>  SUCCESS [02:09 min]
> 2017-01-18T07:12:29.810 [INFO] Apache Beam :: Runners
> . SUCCESS [  0.600 s]
> 2017-01-18T07:12:29.810 [INFO] Apache Beam :: Runners :: Google Cloud
> Dataflow  SUCCESS [  01:08 h]
> 2017-01-18T07:12:29.810 [INFO] --
> --
> 2017-01-18T07:12:29.810 [INFO] BUILD SUCCESS
> 2017-01-18T07:12:29.810 [INFO] --
> --
> 2017-01-18T07:12:29.810 [INFO] Total time: 01:11 h
> 2017-01-18T07:12:29.810 [INFO] Finished at: 2017-01-18T07:12:29+00:00
> 2017-01-18T07:12:30.215 [INFO] Final Memory: 90M/3226M
> 2017-01-18T07:12:30.215 [INFO] --
> --
> Waiting for Jenkins to finish collecting data
> channel stopped
>



-- 
---
Jason Kuster
Apache Beam (Incubating) / Google Cloud Dataflow


Some Thoughts on IO Integration Tests

2017-01-10 Thread Jason Kuster
Hi all,

Following up on some of the discussions already on-list, I wanted to
solicit some more feedback about some implementation details regarding the
IO Integration Tests.

As it currently stands, we mostly have IO ITs for GCP-based IO, which our
GCP-based Jenkins executors handle natively, but as our integration test
coverage begins to expand, we're going to run into several of the problems
relevant to what Steven is doing with hosting data stores for use by ITs. I
wanted to get people's feedback about how to handle passing credentials to
the ITs. We have a couple options, motivated by some goals.

Goals:

* Has to work in Apache Beam CI environment.
* Has to run on dev machines (w/o access to beam CI environment).
* Only one way of passing datastore config.
* An individual IT fails fast if it is run without valid config.
* IO performance tests will have a validation component (this implies we
need to run the IO ITs, not just the IO IT pipelines).
* Devs working on an individual IO transform can run Integration & perf
tests without recreating the data stores every time
* Devs working on a runner's IO can run all the IO integration & perf
tests. They may have to recreate the data stores every time (or possibly
have a manual config that helps with this.) It's okay if this world is a
bit awkward, it just needs to be possible.


Option 1: IO Configuration File

The first option is to read all credentials from some file stored on disk.
We can define a location for an (xml, json, yaml, etc) file which we can
read in each IT to find the credentials that IT needs. This method has a
couple of upsides and a couple of downsides.

* Upsides
  * Passing credentials to ITs, and adding new credentials, is relatively
    easy.
  * Individual users can spin up their own data stores, put the
    credentials in the file, run their ITs and have things just work.
* Downsides
  * Relying on a file, especially a file not checked in to the repository
    (to prevent people from accidentally sharing credentials to their data
    store, etc) is fragile and can lead to some confusing failure cases.
  * ITs are supposed to be self-contained; using a file on disk makes
    things like running them in CI harder.
  * It seems like datastore location, username, and password are things
    that are a better fit for the IT PipelineOptions anyway.
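
To make Option 1 and the fail-fast goal a bit more concrete, here is a
minimal sketch in Python (not Beam code). The JSON layout, the
load_store_config helper, and the key names are all hypothetical, picked
only for illustration; a real IT would do the equivalent in its setup step.

```python
import json
import os
import tempfile

# Hypothetical key names that every data store entry must provide.
REQUIRED_KEYS = ("ip", "username", "password")


def load_store_config(path, store_name):
    """Return the config for one data store, raising early if it is unusable."""
    if not os.path.exists(path):
        raise FileNotFoundError(f"IO config file not found: {path}")
    with open(path) as f:
        all_stores = json.load(f)
    if store_name not in all_stores:
        raise KeyError(f"no entry for data store {store_name!r} in {path}")
    config = all_stores[store_name]
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise ValueError(f"{store_name!r} is missing keys: {missing}")
    return config


# Write a throwaway config file so the example runs end to end.
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as tmp:
    json.dump(
        {"MyStore": {"ip": "1.2.3.4", "username": "beamuser", "password": "hunter2"}},
        tmp,
    )
    config_path = tmp.name

my_store = load_store_config(config_path, "MyStore")
os.remove(config_path)
```

The point of raising in the helper is that a missing or incomplete file
shows up as one clear error at the start of the IT rather than as a
confusing connection failure later on.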


Option 2: TestPipelineOptions

Another option is to specify them as general PipelineOptions on
TestPipelineOptions and then to build the specific IT's options from there.
For example, say we have MyStoreIT1, MyStoreIT2 and MyStoreIT3. We could
specify inside of TestPipelineOptions some options like "MyStoreIP",
"MyStoreUsername", and "MyStorePassword", and then the command for invoking
them would look like (omitting some irrelevant things):

mvn clean verify -DskipITs=false -DbeamTestPipelineOptions='[...,
"--MyStoreIP=1.2.3.4", "--MyStoreUsername=beamuser",
"--MyStorePassword=hunter2"]'

* Upsides
  * Test is self-contained -- no dependency on an external file and all
    relevant things can be specified on the command line; easier for users
    and CI.
  * Passing credentials to ITs via pipelineoptions feels better.
* Downsides
  * Harder to pass different credentials to one specific IT; e.g. I want
    MyStoreIT1 and 2 to run against 1.2.3.4, but MyStoreIT3 to run against
    9.8.7.6.
  * Investing in this pattern means a proliferation of
    TestPipelineOptions. Potentially bad, especially for a CI suite running
    a large number of ITs -- size of command line args may become
    unmanageable with 5+ data stores.
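
For a feel of what Option 2 hands to a test, here is a small Python sketch
that turns a list shaped like the one in the command above (a JSON array of
"--name=value" strings) into a lookup table. This is only an illustration,
not how Beam itself parses options; parse_test_pipeline_options is a
hypothetical helper.

```python
import json


def parse_test_pipeline_options(raw):
    """Parse a JSON array of "--name=value" strings into a dict."""
    options = {}
    for arg in json.loads(raw):
        if not arg.startswith("--") or "=" not in arg:
            raise ValueError(f"expected --name=value, got {arg!r}")
        # Split on the first "=" only, so values may themselves contain "=".
        name, value = arg[2:].split("=", 1)
        options[name] = value
    return options


raw = '["--MyStoreIP=1.2.3.4", "--MyStoreUsername=beamuser", "--MyStorePassword=hunter2"]'
opts = parse_test_pipeline_options(raw)
```

Every data store adds another three or so entries to that one list, which
is where the concern about command line size comes from.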


Option 3: Individual IT Options

The last option I can think of is to specify the options directly on the
IT's options, e.g. MyStoreIT1Options, and set defaults which work well for
CI. This means that CI could run an entire suite of ITs without specifying
any arguments and trusting that the ITs' defaults will work, but means an
individual developer is potentially able to run only one IT at a time,
since it will be impossible to override all options from the command line.
* Upsides
  * Test is still self-contained, and even more so -- possible to specify
    args targeted at one IT in particular.
  * Args are specified right where they're used; way smaller chance of
    confusion or mistakes.
  * Easiest for CI -- as long as defaults for data store auth and
    location are correct from the perspective of the Jenkins executor, it
    can essentially just turn all ITs on and run them as is.
* Downsides
  * Hardest for individual developers to run an entire suite of ITs --
    since defaults are configured for running in CI environment, they will
    likely fail when running on the user's machine, resulting in annoyance
    for the user.
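
As a rough picture of Option 3, the Python sketch below models one IT's
options as a small class whose defaults suit CI, with a developer
overriding a single field locally. The class name mirrors the
MyStoreIT1Options example above; the field names and default values are
made up for illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class MyStoreIT1Options:
    # Hypothetical defaults chosen to work from the CI executor.
    my_store_ip: str = "10.0.0.1"
    my_store_username: str = "ci-user"


# CI runs with no arguments and relies on the defaults.
ci_options = MyStoreIT1Options()

# A developer overrides only what differs on their machine.
dev_options = replace(ci_options, my_store_ip="127.0.0.1")
```

With one such class per IT, a developer running the whole suite would need
one override per class, which is the downside described above.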


If anyone has thoughts on these, please let me know.

Best,

Jason

-- 
---
Jason Kuster
Apache Beam (Incubating) / Google Cloud Dataflow


Re: Jenkins build became unstable: beam_PostCommit_Java_RunnableOnService_Apex #119

2016-12-27 Thread Jason Kuster
Error is

Expected: <1970-01-01T00:00:00.000Z>
 but: was <2016-12-27T17:53:23.629Z>


2016-12-27T17:53:23.629Z is the time the test was run. Looks like it
could be an actual issue?


On Tue, Dec 27, 2016 at 11:11 AM, Jason Kuster <jasonkus...@google.com>
wrote:

> Sorry, this is for #117. Moving there, investigating #119.
>
> On Tue, Dec 27, 2016 at 11:10 AM, Jason Kuster <jasonkus...@google.com>
> wrote:
>
>> Pipeline succeeded when it should have failed (a PAssert with
>> containsInAnyOrder on (1, 2, 3, 4) and (2, 1, 3, 4, 7)). Is there a known
>> issue for this?
>>
>> On Tue, Dec 27, 2016 at 10:06 AM, Apache Jenkins Server <
>> jenk...@builds.apache.org> wrote:
>>
>>> See <https://builds.apache.org/job/beam_PostCommit_Java_Runnable
>>> OnService_Apex/119/changes>
>>>
>>>
>>
>>
>> --
>> ---
>> Jason Kuster
>> Apache Beam (Incubating) / Google Cloud Dataflow
>>
>
>
>
> --
> ---
> Jason Kuster
> Apache Beam (Incubating) / Google Cloud Dataflow
>



-- 
---
Jason Kuster
Apache Beam (Incubating) / Google Cloud Dataflow