Re: [PROPOSAL] Prepare Beam 2.9.0 release

2018-11-16 Thread Ahmet Altay
On Fri, Nov 16, 2018 at 8:16 PM, Thomas Weise wrote: > > > On Fri, Nov 16, 2018 at 8:02 PM Ahmet Altay wrote: > >> >> >> On Fri, Nov 16, 2018 at 6:25 PM, Thomas Weise wrote: >> >>> >>> >>> On Thu, Nov 15, 2018 at 10:53 PM Charles Chen wrote: >>> +1 Note that we need to

Re: [PROPOSAL] Prepare Beam 2.9.0 release

2018-11-16 Thread Thomas Weise
On Fri, Nov 16, 2018 at 8:02 PM Ahmet Altay wrote: > > > On Fri, Nov 16, 2018 at 6:25 PM, Thomas Weise wrote: > >> >> >> On Thu, Nov 15, 2018 at 10:53 PM Charles Chen wrote: >> >>> +1 >>> >>> Note that we need to temporarily revert >>> https://github.com/apache/beam/pull/6683 before the

Re: [DISCUSS] Reverting commits on green post-commit status

2018-11-16 Thread Ahmet Altay
It sounds like we are in agreement that addressing issues sooner is better. I think reverting is in general the less stressful option because it allows a solution to be developed in parallel. Even with that, it is not the only option we have and based on the severity and the complexity of the

Re: [PROPOSAL] Prepare Beam 2.9.0 release

2018-11-16 Thread Ahmet Altay
On Fri, Nov 16, 2018 at 6:25 PM, Thomas Weise wrote: > > > On Thu, Nov 15, 2018 at 10:53 PM Charles Chen wrote: > >> +1 >> >> Note that we need to temporarily revert >> https://github.com/apache/beam/pull/6683 before the release branch cut per the discussion at >>

Re: [DISCUSS] Reverting commits on green post-commit status

2018-11-16 Thread Thomas Weise
On Fri, Nov 16, 2018 at 7:39 PM Ahmet Altay wrote: > Thank you for bringing this discussion back to the mailing list. > > On Fri, Nov 16, 2018 at 6:49 PM, Thomas Weise wrote: > >> We have observed instances of changes being reverted in master that have >> been authored following the contributor

Re: [DISCUSS] Reverting commits on green post-commit status

2018-11-16 Thread Ahmet Altay
Thank you for bringing this discussion back to the mailing list. On Fri, Nov 16, 2018 at 6:49 PM, Thomas Weise wrote: > We have observed instances of changes being reverted in master that have > been authored following the contributor guidelines and pass all tests (post > commit). While we

Portable wordcount on Flink runner broken

2018-11-16 Thread Thomas Weise
For the last few days, the steps under https://beam.apache.org/roadmap/portability/#python-on-flink have been broken. The Gradle task hangs because the job server isn't able to launch the Docker container. ./gradlew :beam-sdks-python:portableWordCount -PjobEndpoint=localhost:8099 [CHAIN MapPartition

[DISCUSS] Reverting commits on green post-commit status

2018-11-16 Thread Thomas Weise
We have observed instances of changes being reverted in master that have been authored following the contributor guidelines and pass all tests (post commit). While we generally seem to have quite a bit of revert action happening [1], this thread is about those instances that are outside of our

Re: Bigquery streaming TableRow size limit

2018-11-16 Thread Reuven Lax
This sounds a bit more specific, so I wouldn't add this to BigQueryIO yet. On Thu, Nov 15, 2018 at 6:58 PM Wout Scheepers < wout.scheep...@vente-exclusive.com> wrote: > Thanks for your thoughts. > > Also, I’m doing something similar when streaming data into partitioned > tables. > > From [1]: >

Re: [PROPOSAL] Prepare Beam 2.9.0 release

2018-11-16 Thread Thomas Weise
On Thu, Nov 15, 2018 at 10:53 PM Charles Chen wrote: > +1 > > Note that we need to temporarily revert > https://github.com/apache/beam/pull/6683 before the release branch cut > per the discussion at >

Re: Delete permissions on cwiki

2018-11-16 Thread Thomas Weise
cwiki permissions are a neglected/undefined area. Currently they are set to give PMC full access and everyone else Add / Delete Own. (I deleted the page) Thanks, Thomas On Fri, Nov 16, 2018 at 1:55 PM Scott Wegner wrote: > It seems that cwiki permissions are locked down for deletes. I

Re: Need help regarding memory leak issue

2018-11-16 Thread Udi Meiri
If you're working with Dataflow, it supports this flag: https://github.com/apache/beam/blob/75e9f645c7bec940b87b93f416823b020e4c5f69/sdks/python/apache_beam/options/pipeline_options.py#L602 which uses guppy for heap profiling. On Fri, Nov 16, 2018 at 3:08 PM Ruoyun Huang wrote: > Even though the

Re: Need help regarding memory leak issue

2018-11-16 Thread Ruoyun Huang
Even though the algorithm works on your batch system, did you verify anything that rules out the possibility that the underlying ML package is causing the memory leak? If not, maybe replace your prediction with a dummy function which does not load any model at all, and always just gives the

Delete permissions on cwiki

2018-11-16 Thread Scott Wegner
It seems that cwiki permissions are locked down for deletes. I noticed I also don't have permission. I get the error: "Error! You do not have permission to delete the pages." Thomas, do you know how delete permissions are managed? Is this intentionally locked down? -- Forwarded message

Re: [VOTE] Mark 2.7.0 branch as a long term support (LTS) branch

2018-11-16 Thread Ahmet Altay
On Thu, Nov 15, 2018 at 6:11 PM, Ahmet Altay wrote: > Thank you all for voting. This vote was open for 7 days, let's wrap it up. > There were 8 +1, 1 +0, and no -1 votes. > > - I added a new version 2.7.1 for tracking anything that could be used for > tracking whatever we would like to consider

Re: Contributor status change

2018-11-16 Thread Kenneth Knowles
Welcome! I have added you to the "Contributors" role. Kenn On Fri, Nov 16, 2018 at 11:46 AM Adrian Witas wrote: > > Hi, my name is Adrian Witas. I am interested in contributing GO SDK to > the Apache Beam SDK. I'd like to be added as a Jira contributor so that > I can assign issues to

Contributor status change

2018-11-16 Thread Adrian Witas
Hi, my name is Adrian Witas. I am interested in contributing GO SDK to the Apache Beam SDK. I'd like to be added as a Jira contributor so that I can assign issues to myself. My ASF Jira Username is witas. Here is my contribution so far: https://issues.apache.org/jira/browse/BEAM-5729

[DISCUSS] Could we deprecate BeamSqlCli?

2018-11-16 Thread Rui Wang
Hi, BeamSqlCli is a wrapper of BeamSqlEnv and it is experimental. This wrapper seems redundant to me: one can directly use BeamSqlEnv because BeamSqlEnv accepts TableProvider. I searched the codebase and didn't see strong evidence that BeamSqlCli can do something special. Is there any use case in

Re: :beam-sdks-java-io-hadoop-input-format:test task issues

2018-11-16 Thread Alex Amato
Would this be correct? I want to run it as part of presubmit. I updated libjffi-jni as well, with apt-get. Still encountering the same issue. https://gradle.com/s/wlaql5cxpb3lk ./gradlew :javaPreCommit --stacktrace --scan -Dcom.datastax.driver.USE_NATIVE_CLOCK=false On Mon, Nov 5, 2018 at 9:37

Re: [VOTE] Release Vendored gRPC 1.13.1 and Guava 20.0, release candidate #1

2018-11-16 Thread Kenneth Knowles
I notice in the vendored Guava jar there is: META-INF/maven/com.google.guava/guava/pom.xml META-INF/maven/com.google.guava/guava/pom.properties Are these expected? If not, are they benign? I haven't found any documentation for what these contents actually mean or do. There are many more in the

Re: [VOTE] Release Vendored gRPC 1.13.1 and Guava 20.0, release candidate #1

2018-11-16 Thread Thomas Weise
It would be nice to have a build task that allows creating the source artifacts locally, if we cannot publish them. +1 for the release On Fri, Nov 16, 2018 at 7:48 AM Lukasz Cwik wrote: > I have been relying on IntelliJ's ability to decompile the class > files; it's not as good as the

Re: Python profiling

2018-11-16 Thread Ahmet Altay
On Fri, Nov 16, 2018 at 10:12 AM, Thomas Weise wrote: > Since it is for users, it should eventually go to the web site. > > How about a new section under: > https://beam.apache.org/documentation/sdks/python/ > > "Troubleshooting and Tuning" ? > That is a good idea. > > > On Fri, Nov 16, 2018

Re: Python profiling

2018-11-16 Thread Thomas Weise
Since it is for users, it should eventually go to the web site. How about a new section under: https://beam.apache.org/documentation/sdks/python/ "Troubleshooting and Tuning" ? On Fri, Nov 16, 2018 at 10:08 AM Ahmet Altay wrote: > > > On Fri, Nov 16, 2018 at 2:12 AM, Robert Bradshaw >

Re: Python profiling

2018-11-16 Thread Ahmet Altay
On Fri, Nov 16, 2018 at 2:12 AM, Robert Bradshaw wrote: > One needs to ensure that gprof2dot is importable (i.e. installed via pip > into your Python environment). > > As for specifying the FnApiRunner via the runner argument, --runner can > take fully qualified names (if it's not in the short

Re: Wiki edit access

2018-11-16 Thread Lukasz Cwik
I tried finding your account on cwiki.apache.org but was unable to. What is your user id on cwiki.apache.org? On Thu, Nov 15, 2018 at 7:51 AM Wout Scheepers < wout.scheep...@vente-exclusive.com> wrote: > Can anyone give me edit access for the wiki? > > > > Thanks, > > Wout >

Re: A new Beam Runner on Apache Nemo

2018-11-16 Thread Kenneth Knowles
Hi Wonook, Very cool! I see it here: https://github.com/apache/incubator-nemo/tree/master/compiler/frontend/beam/src/main/java/org/apache/nemo/compiler/frontend/beam Some more details on what Max said about running the ValidatesRunner tests: - if you are planning to contribute the runner to

Re: [VOTE] Release Vendored gRPC 1.13.1 and Guava 20.0, release candidate #1

2018-11-16 Thread Lukasz Cwik
I have been relying on IntelliJ's ability to decompile the class files; it's not as good as the original source, for sure. On Fri, Nov 16, 2018 at 3:26 AM Maximilian Michels wrote: > +1 > > We decided not to publish source files for now. The main reason is > possible legal issues with

Re: Contributor permission for Beam Jira tickets

2018-11-16 Thread Ismaël Mejía
Hello Fabien, welcome to Beam. You now have permission to self-assign JIRA tickets. Enjoy! On Fri, Nov 16, 2018 at 1:59 PM Fabien Rousseau wrote: > > Hi, I'm using BEAM and Cassandra. I'd like to be granted contributor > permission for Beam JIRA tickets. My user id is frousseau. > > Thanks. >

Contributor permission for Beam Jira tickets

2018-11-16 Thread Fabien Rousseau
Hi, I'm using BEAM and Cassandra. I'd like to be granted contributor permission for Beam JIRA tickets. My user id is frousseau. Thanks. Fabien

Re: Flink operator max parallelism and rescalable jobs

2018-11-16 Thread Jozef Vilcek
Hey Max, thanks for the pointer to UnboundedSourceWrapper. I have created BEAM-6077 and will try to come up with a patch. On Fri, Nov 16, 2018 at 12:41 PM Maximilian Michels wrote: > Hi Jozef, > > The main blocker for rescaling Beam pipelines on Flink was the use of > Key Group state. This

Re: Flink operator max parallelism and rescalable jobs

2018-11-16 Thread Maximilian Michels
Hi Jozef, The main blocker for rescaling Beam pipelines on Flink was the use of Key Group state. This splits each operator state additionally into N partitions, such that N * P = MAX_PARALLELISM, where P is the parallelism of the operator. This has largely been done. However, it is not
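A minimal sketch of the arithmetic in the message above, taking the stated relation N * P = MAX_PARALLELISM at face value. The function name is hypothetical, and this is not Flink's or Beam's actual key-group assignment code; a real runner may distribute partitions differently.

```python
def key_group_partitions(max_parallelism: int, parallelism: int) -> int:
    """Return N such that N * parallelism == max_parallelism.

    Hypothetical helper for illustration only. It raises ValueError when
    the inputs cannot satisfy the relation exactly.
    """
    if parallelism <= 0:
        raise ValueError("parallelism must be positive")
    if max_parallelism < parallelism:
        raise ValueError("max_parallelism must be >= parallelism")
    partitions, remainder = divmod(max_parallelism, parallelism)
    if remainder != 0:
        raise ValueError("max_parallelism is not a multiple of parallelism")
    return partitions


if __name__ == "__main__":
    # With MAX_PARALLELISM = 128, rescaling from P = 4 to P = 8 halves N.
    for p in (4, 8):
        print(p, key_group_partitions(128, p))
```

Under this reading, rescaling changes P while MAX_PARALLELISM stays fixed, so the number of state partitions per operator instance changes accordingly.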

Re: [PROPOSAL] Prepare Beam 2.9.0 release

2018-11-16 Thread Łukasz Gajowy
+1 Thanks, Łukasz On Fri, Nov 16, 2018, 12:00 Maximilian Michels wrote: > +1 for starting the release process in time > > > Note that we need to temporarily revert > https://github.com/apache/beam/pull/6683 before the release branch cut > per the discussion > > +1 > > Thanks, > Max > > On

Re: [VOTE] Release Vendored gRPC 1.13.1 and Guava 20.0, release candidate #1

2018-11-16 Thread Maximilian Michels
+1 We decided not to publish source files for now. The main reason is possible legal issues with publishing relocated source code. On 16.11.18 05:24, Thomas Weise wrote: Thanks for driving this. Did we reach a conclusion regarding publishing relocated source artifacts? Debugging would be

Re: A new Beam Runner on Apache Nemo

2018-11-16 Thread Maximilian Michels
Hi Wonook, First of all, welcome to the Beam community! It is great to see another Runner emerging. If you're planning to contribute your Runner to Beam, you should verify the compatibility with the ValidatesRunner integration tests. Then open a PR with documentation, a Runner page, and

Re: [PROPOSAL] Prepare Beam 2.9.0 release

2018-11-16 Thread Maximilian Michels
+1 for starting the release process in time > Note that we need to temporarily revert https://github.com/apache/beam/pull/6683 before the release branch cut per the discussion +1 Thanks, Max On 16.11.18 07:53, Charles Chen wrote: +1 Note that we need to temporarily revert

Re: Python profiling

2018-11-16 Thread Robert Bradshaw
One needs to ensure that gprof2dot is importable (i.e. installed via pip into your Python environment). As for specifying the FnApiRunner via the runner argument, --runner can take fully qualified names (if it's not in the short list of known runners). However, the FnApiRunner is the DirectRunner
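One way to check the "importable" condition mentioned above before running a profiled pipeline is sketched below. The helper name is hypothetical and is not part of Beam; it only uses the standard library and is meant for top-level module names such as gprof2dot.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if a top-level module can be found in this environment.

    Hypothetical helper: it looks the module up without importing it.
    """
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    if is_importable("gprof2dot"):
        print("gprof2dot is importable")
    else:
        print("gprof2dot is not importable; try: pip install gprof2dot")
```

Running the check with the same interpreter that launches the pipeline matters, since a package installed into a different Python environment would not be found.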

Flink operator max parallelism and rescalable jobs

2018-11-16 Thread Jozef Vilcek
Hi, I want to collect some feedback on rescaling streaming Beam pipelines on the Flink runner. Flink seems to be able to re-scale jobs, which in Beam terms means changing the parallelism in Beam. However, one has to make sure that state can rescale as well to the predefined MAX parallelism. Max