Re: Connection leaks with PostgreSQL instance

2019-01-16 Thread Jonathan Perron
Hi Kenneth, Thank you for your reply. I found out that the leak was somehow coming from my individual DoFns. I replaced all the connections with a connection pool and I haven't seen connection leaks since. I will keep monitoring the pipeline state and if I see new leaks, I will investigate
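
To illustrate the connection-pool fix described in this message, here is a minimal sketch using only the Python standard library. It is not the poster's code: `ConnectionPool`, `FakeConnection` and `process_element` are hypothetical names, and a real pipeline would pass a factory for its actual database driver instead of the stand-in. The point it shows is that a connection borrowed through a context manager is always handed back, even when processing an element raises, which is what prevents the leak.

```python
import queue
from contextlib import contextmanager


class ConnectionPool:
    """Fixed-size pool that lends out connections and always takes them back."""

    def __init__(self, connect, size):
        # `connect` is any zero-argument factory returning a connection object.
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(connect())

    @contextmanager
    def connection(self, timeout=None):
        # Blocks while every connection is in use, so the total number of
        # open connections never exceeds `size`.
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            # Runs on success and on error, so nothing leaks.
            self._idle.put(conn)

    def idle_count(self):
        return self._idle.qsize()


class FakeConnection:
    """Stand-in for a real database connection (assumption, illustration only)."""

    def __init__(self):
        self.queries = []

    def execute(self, sql):
        self.queries.append(sql)
        return len(self.queries)


def process_element(pool, element):
    # What a per-element processing method might do: borrow, use, return.
    with pool.connection() as conn:
        return conn.execute("SELECT 1")
```

Creating the pool once per worker and borrowing from it inside each processing call keeps the connection count bounded regardless of how many elements are processed.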

Re: [PROPOSAL] Prepare Beam 2.10.0 release

2019-01-16 Thread Kenneth Knowles
Quick update on this. There are three remaining issues: - https://issues.apache.org/jira/browse/BEAM-6407: A DirectRunner self-check was broken from 2.8.0 to 2.9.0 - PR looks good modulo our infra flakes - https://issues.apache.org/jira/browse/BEAM-6354: PAssert + DirectRunner + Unbounded data

Re: [spark runner based on dataset POC] your opinion

2019-01-16 Thread Kenneth Knowles
Cool! I don't quite understand the issue in "bytes serialization to comply to spark dataset schemas to store windowedValues". Can you say a little more? Kenn On Tue, Jan 15, 2019 at 8:54 AM Etienne Chauchot wrote: > Hi guys, > regarding the new (made from scratch) spark runner POC based on

Re: gradle clean causes long-running python installs

2019-01-16 Thread Kenneth Knowles
Filed https://issues.apache.org/jira/browse/BEAM-6459 to record the conclusion. Doesn't require Beam knowledge so I labeled "starter". Kenn On Wed, Jan 16, 2019 at 12:14 AM Michael Luckey wrote: > This seems to be on purpose [1] > > AFAIU setup is done to be able to call into setup.py clean.

[PROPOSAL] decrease the number of threads for BigQuery streaming insertAll

2019-01-16 Thread Heejong Lee
Hi, I want to suggest the change[1] of the thread pool type in BigQuery streaming insert for Java SDK (BEAM-6443). When we insert small data into BigQuery very fast by using BigQueryIO.write, it generates lots of rate limit exceeded errors in a log file. It's mainly because the number of threads
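
The general idea behind this proposal, capping how many insert requests run at once by using a bounded thread pool, can be sketched as follows. This is a Python stand-in written for illustration, not the Java SDK change tracked in BEAM-6443; `ConcurrencyProbe`, `run_inserts` and the simulated latency are assumptions. The probe records the peak number of simultaneous calls so the effect of the cap is observable.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class ConcurrencyProbe:
    """Hypothetical stand-in for an insert call; records peak concurrency."""

    def __init__(self):
        self._lock = threading.Lock()
        self._active = 0
        self.peak = 0

    def insert(self, row):
        with self._lock:
            self._active += 1
            self.peak = max(self.peak, self._active)
        time.sleep(0.01)  # simulated request latency (assumption)
        with self._lock:
            self._active -= 1
        return row


def run_inserts(rows, max_workers):
    """Run every insert through a pool with at most `max_workers` threads."""
    probe = ConcurrencyProbe()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() preserves input order; extra rows wait in the pool's queue.
        results = list(pool.map(probe.insert, rows))
    return results, probe.peak
```

With a fixed `max_workers`, bursts of small writes queue up instead of each starting its own request thread, which is one way to keep the request rate under a service's limit.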

Re: Proposal: Portability SDKHarness Docker Image Release with Beam Version Release.

2019-01-16 Thread Ruoyun Huang
+1 This would be a great thing to have. On Wed, Jan 16, 2019 at 6:11 PM Ankur Goenka wrote: > gcr.io seems to be a good option. Since we don't need the hosting > server name in the image name, it is easy to change later. > > The Docker container for Apache Flink is named "flink" and they

Re: Add all tests to release validation

2019-01-16 Thread Kenneth Knowles
Good points. I wasn't tuned in to those nuances of how the jobs are run. I think we *could* cause a postcommit job to run against exactly that commit hash instead of origin/master, but I won't advocate for that. My suggestion of the "find a green commit" approach is a holdover from continuously

Re: Enforce javadoc comments in public methods?

2019-01-16 Thread Ruoyun Huang
Hi, everyone, To make sure we move forward to a clean state where we catch violations in any new PR, we created this change: https://github.com/apache/beam/pull/7532 This PR makes checkstyle report an error on missing javadocs. For existing violations, we explicitly added them as suppression

Re: Proposal: Portability SDKHarness Docker Image Release with Beam Version Release.

2019-01-16 Thread Ankur Goenka
gcr.io seems to be a good option. Since we don't need the hosting server name in the image name, it is easy to change later. The Docker container for Apache Flink is named "flink" and they have different tags for different releases and configurations (https://hub.docker.com/_/flink). We can

Re: Proposal: Portability SDKHarness Docker Image Release with Beam Version Release.

2019-01-16 Thread Ahmet Altay
For snapshots, we could use gcr.io. Permissions would not be a problem since Jenkins is already correctly set up. The cost will be covered under the apache-beam-testing project. And since this is only for snapshots, it will be only for temporary artifacts, not for release artifacts. On Wed, Jan 16, 2019

Re: Proposal: Portability SDKHarness Docker Image Release with Beam Version Release.

2019-01-16 Thread Valentyn Tymofieiev
+1, releasing containers is a useful process that we need to build in Beam and it is required for FnApi users. Among other reasons, having officially-released Beam SDK harness container images will make it easier for users to do simple customizations to container images, as they will be able to

Re: Proposal: Portability SDKHarness Docker Image Release with Beam Version Release.

2019-01-16 Thread Ankur Goenka
On Wed, Jan 16, 2019 at 5:37 PM Ahmet Altay wrote: > > > On Wed, Jan 16, 2019 at 5:28 PM Ankur Goenka wrote: > >> - Could we start from snapshots first and then do it for releases? >> +1, releasing snapshots first makes sense to me. >> - For snapshots, do we need to clean old containers after a

Re: Proposal: Portability SDKHarness Docker Image Release with Beam Version Release.

2019-01-16 Thread Kenneth Knowles
Sounds good. I set up the bintray stuff a while ago but got stuck on perms to have Jenkins upload the snapshot, and the release was not really relevant. Kenn On Wed, Jan 16, 2019 at 5:37 PM Ahmet Altay wrote: > > > On Wed, Jan 16, 2019 at 5:28 PM Ankur Goenka wrote: > >> - Could we start from

Re: Proposal: Portability SDKHarness Docker Image Release with Beam Version Release.

2019-01-16 Thread Ahmet Altay
On Wed, Jan 16, 2019 at 5:28 PM Ankur Goenka wrote: > - Could we start from snapshots first and then do it for releases? > +1, releasing snapshots first makes sense to me. > - For snapshots, do we need to clean old containers after a while? > Otherwise I guess we will accumulate lots of

Re: Proposal: Portability SDKHarness Docker Image Release with Beam Version Release.

2019-01-16 Thread Ankur Goenka
- Could we start from snapshots first and then do it for releases? +1, releasing snapshots first makes sense to me. - For snapshots, do we need to clean old containers after a while? Otherwise I guess we will accumulate lots of containers. For snapshots we can maintain a single snapshot image from

Re: Proposal: Portability SDKHarness Docker Image Release with Beam Version Release.

2019-01-16 Thread Ahmet Altay
This sounds like a good idea. Some questions: - Could we start from snapshots first and then do it for releases? - For snapshots, do we need to clean old containers after a while? Otherwise I guess we will accumulate lots of containers. - Do we also need additional code changes for snapshots and

Proposal: Portability SDKHarness Docker Image Release with Beam Version Release.

2019-01-16 Thread Ankur Goenka
Hi All, As portability/FnApi is taking shape and is compatible with ULR and Flink, I wanted to discuss the release plan for SDKHarness Docker images. Of course users can create their own images, but it will be useful to have a default image available out of the box. Pre-built images are a must

Re: Our jenkins beam1 server is down

2019-01-16 Thread Yifan Zou
Yes, beam14 is offline as well. We're on it. On Wed, Jan 16, 2019 at 4:11 PM Ruoyun Huang wrote: > With another try, succeeding on beam10. > > Thanks for the fix. > > On Wed, Jan 16, 2019 at 3:53 PM Ruoyun Huang wrote: > >> Just did a rerun, got error saying "*10:12:21* ERROR: beam14 is

Re: TestDirectRunner for Java?

2019-01-16 Thread Kenneth Knowles
Yea, I guess I mean I have no objections to make DirectRunner grab and run TestPipelineOptions.getOnSuccessMatcher. But my counter-proposal is meant to read as: remove onSuccessMatcher entirely and only use waitUntilFinish and then assert on the results. I suspect the onSuccessMatcher came from

Re: Our jenkins beam1 server is down

2019-01-16 Thread Ruoyun Huang
With another try, succeeding on beam10. Thanks for the fix. On Wed, Jan 16, 2019 at 3:53 PM Ruoyun Huang wrote: > Just did a rerun, got error saying "*10:12:21* ERROR: beam14 is offline; > cannot locate JDK 1.8 (latest)". > > Beam1 is not the only one broken? > > On Wed, Jan 16, 2019 at 3:45

Re: TestDirectRunner for Java?

2019-01-16 Thread Udi Meiri
Existing one? I'm not sure what you mean. There's already TestPipelineOptions.setOnSuccessMatcher(), but that's silently ignored on runners like DirectRunner that don't support it. As I see it the differences are: onSuccessMatcher - runs after the pipeline has completed successfully, can be

Re: Our jenkins beam1 server is down

2019-01-16 Thread Ruoyun Huang
Just did a rerun, got error saying "*10:12:21* ERROR: beam14 is offline; cannot locate JDK 1.8 (latest)". Beam1 is not the only one broken? On Wed, Jan 16, 2019 at 3:45 PM Yifan Zou wrote: > The beam1 was still accepting jobs and breaking them after reset this > morning. We temporarily

Re: [Go SDK] User Defined Coders

2019-01-16 Thread Robert Burke
I've updated the design doc with a section on schemas. Interestingly, the lack of Generics in Go ends up being very handy. No incompatibility when converting between a concrete type and its Schema equivalent.

Re: Python Flink tests failing on Jenkins

2019-01-16 Thread Ankur Goenka
The problem may be caused by the long task name. I have seen this happen in post-commits and hence shortened the name to "beam_PostCommit_Python_VR_Flink" (mind the VR instead of validates runner). Created a new PR to address this: https://github.com/apache/beam/pull/7539 On Wed, Jan 16, 2019 at

Re: Our jenkins beam1 server is down

2019-01-16 Thread Yifan Zou
The VM instance was reset and Infra is trying to repuppetize it. https://issues.apache.org/jira/browse/INFRA-17672 was created to track this issue. On Wed, Jan 16, 2019 at 10:51 AM Mark Liu wrote: > Thank you, Yifan! > > Looks like the following precommits are affected according to my PR: > >

Re: Watch for growth error

2019-01-16 Thread Vilhelm von Ehrenheim
Ok, thanks! On Wed, 16 Jan 2019, 17:47 Kenneth Knowles wrote: > Hi Vilhelm, > > You've hit https://issues.apache.org/jira/browse/BEAM-6352. We are > treating this as a blocker for 2.10.0. > > Kenn > > On Wed, Jan 16, 2019 at 8:44 AM Vilhelm von Ehrenheim < > vonehrenh...@gmail.com> wrote: > >> Hi! I am

Re: Our jenkins beam1 server is down

2019-01-16 Thread Mark Liu
Thank you, Yifan! Looks like the following precommits are affected according to my PR: Java_Examples_Dataflow, Portable_Python, Website_Stage_GCS On Wed, Jan 16, 2019 at 9:25 AM Yifan Zou wrote: > I am looking into it. > > On Wed, Jan 16, 2019 at 8:18 AM Ismaël Mejía wrote: > >> Can somebody PTAL.

Re: Our jenkins beam1 server is down

2019-01-16 Thread Yifan Zou
I am looking into it. On Wed, Jan 16, 2019 at 8:18 AM Ismaël Mejía wrote: > Can somebody PTAL. Sadly the poor jenkins shuffling algorithm is > sending most builds to it so there are issues to validate some PRs. >

Re: Add all tests to release validation

2019-01-16 Thread Scott Wegner
I like the idea of using test greenness to choose a release commit. There are a couple of challenges with our current setup: 1) Post-commits don't run at every commit. The Jenkins jobs are configured to run on pushes to master, but (at least some jobs) are serialized to run a single Jenkins job

Re: Watch for growth error

2019-01-16 Thread Kenneth Knowles
Hi Vilhelm, You've hit https://issues.apache.org/jira/browse/BEAM-6352. We are treating this as a blocker for 2.10.0. Kenn On Wed, Jan 16, 2019 at 8:44 AM Vilhelm von Ehrenheim < vonehrenh...@gmail.com> wrote: > Hi! I am trying to get a watch transform that always reads the whole file if > it

Watch for growth error

2019-01-16 Thread Vilhelm von Ehrenheim
Hi! I am trying to get a watch transform that always reads the whole file if it was changed at all. I can get this working in Beam 2.8 but get the following error when using 2.9: java.lang.IllegalArgumentException: org.apache.beam.sdk.transforms.Watch$WatchGrowthFn, @ProcessElement

Our jenkins beam1 server is down

2019-01-16 Thread Ismaël Mejía
Can somebody PTAL. Sadly the poor jenkins shuffling algorithm is sending most builds to it so there are issues to validate some PRs.

Re: [PROPOSAL] Prepare Beam 2.10.0 release

2019-01-16 Thread Kenneth Knowles
Thanks, Ismaël! On Wed, Jan 16, 2019 at 2:13 AM Ismaël Mejía wrote: > Ok since there were not many issues I did the 'update' for the > misplaced issues to version 2.10. We are good to go. New resolved > issues in master must now go into 2.11.0 > > On Wed, Jan 16, 2019 at 10:38 AM Ismaël Mejía

Re: GSOC - Summer of Code, on Beam?

2019-01-16 Thread Ismaël Mejía
A little reminder for interested parties: GSoC 2019 Org applications are now open! The deadline is February 6 at 20:00 UTC. For mentors, remember that apart from the standard process you must apply by filling in the Apache spreadsheet. On Fri, Dec 14, 2018 at 6:44 PM Kenneth Knowles wrote: > > I put

Re: [PROPOSAL] Prepare Beam 2.10.0 release

2019-01-16 Thread Ismaël Mejía
Ok since there were not many issues I did the 'update' for the misplaced issues to version 2.10. We are good to go. New resolved issues in master must now go into 2.11.0 On Wed, Jan 16, 2019 at 10:38 AM Ismaël Mejía wrote: > > This means that the tickets resolved and marked for 2.11 since

Re: Python Flink tests failing on Jenkins

2019-01-16 Thread Michael Luckey
similar to https://jira.apache.org/jira/browse/BEAM-4256 ? On Wed, Jan 16, 2019 at 10:40 AM Robert Bradshaw wrote: > I created https://github.com/apache/beam/pull/7533 to roll this back. > > I am a bit at a loss as to why this would be failing in PreCommit but > pass in PostCommit. Especially

Re: Python Flink tests failing on Jenkins

2019-01-16 Thread Robert Bradshaw
I created https://github.com/apache/beam/pull/7533 to roll this back. I am a bit at a loss as to why this would be failing in PreCommit but pass in PostCommit. Especially as the task ":beam-sdks-python:setupVirtualenv" runs fine in the other precommits. Any insights here would be appreciated. On

Re: [PROPOSAL] Prepare Beam 2.10.0 release

2019-01-16 Thread Ismaël Mejía
This means that the tickets resolved and marked for 2.11 since January 2 should be reviewed and retargeted to version 2.10. So this is a call for action for committers who have merged fixes after the cut to update the tickets if required. Ismaël On Tue, Jan 15, 2019 at 9:22 PM Kenneth Knowles

Re: gradle clean causes long-running python installs

2019-01-16 Thread Michael Luckey
This seems to be on purpose [1] AFAIU setup is done to be able to call into setup.py clean. We probably should work around that. [1] https://github.com/apache/beam/blob/master/buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy#L1600-L1610 On Wed, Jan 16, 2019 at 7:01 AM