Re: Links to Java API docs in Beam Website documentation (Was: Version Beam Website Documentation)

2019-12-10 Thread Kenneth Knowles
+1 to site.release_latest. We do have a dead link checker in the website tests. Does it not catch moved classes, etc.? On Tue, Dec 10, 2019 at 1:49 PM Pablo Estrada wrote: > +1 to rely on expanding {{site.release_latest}}. > > On Tue, Dec 10, 2019 at 12:05 PM Brian Hulette > wrote: > >> I was

Re: Is org.apache.beam.sdk.transforms.FlattenTest.testFlattenMultipleCoders supposed to be supported ?

2019-12-10 Thread Kenneth Knowles
It is a good point. Nullable(VarLong) and VarLong are two different types, with least upper bound that is Nullable(VarLong). BigEndianLong and VarLong are two different types, with no least upper bound in the "coders" type system. Yet we understand that the values they encode are equal. I do not
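
A toy sketch of the "least upper bound" idea in the message above, written in Python. The (base, nullable) tuple encoding and the name `least_upper_bound` are assumptions made purely for illustration; they are not Beam APIs and do not describe how Beam resolves coders.

```python
from typing import Optional, Tuple

# Hypothetical encoding: a coder type is a (base wire format, nullable) pair.
Coder = Tuple[str, bool]

VAR_LONG: Coder = ("VarLong", False)
NULLABLE_VAR_LONG: Coder = ("VarLong", True)
BIG_ENDIAN_LONG: Coder = ("BigEndianLong", False)


def least_upper_bound(a: Coder, b: Coder) -> Optional[Coder]:
    """Return the smallest coder type covering both inputs, or None."""
    base_a, nullable_a = a
    base_b, nullable_b = b
    if base_a != base_b:
        # Different wire formats: no common supertype in this toy lattice,
        # even though the decoded values could compare equal.
        return None
    # Same wire format: the bound is nullable if either side is nullable.
    return (base_a, nullable_a or nullable_b)
```

In this sketch, VarLong and Nullable(VarLong) meet at Nullable(VarLong), while BigEndianLong and VarLong have no bound, which mirrors the distinction drawn in the message.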

Re: Hello!

2019-12-10 Thread Kenneth Knowles
Hi Paweł, I went to add you and it seems someone else has already done so. Welcome! Kenn On Mon, Dec 9, 2019 at 11:53 PM Paweł Pasterz wrote: > My name is Paweł and I am a developer from Warsaw/Poland. I'd like to > contribute to beam project therefore I am kindly asking for JIRA >

Re: Cython unit test suites running without Cythonized sources

2019-12-10 Thread Udi Meiri
Sorry, I didn't realize you already had a solution for the shadowing issue and BEAM-8572. On Tue, Dec 10, 2019 at 6:21 PM Chad Dombrova wrote: > Hi Udi, I know you're aware of my PR, but I really encourage you > to look into pep517 and pep518. They

Re: Cython unit test suites running without Cythonized sources

2019-12-10 Thread Chad Dombrova
Hi Udi, I know you're aware of my PR, but I really encourage you to look into pep517 and pep518. They are the new solution for all of this -- declaring build dependencies and creating isolated out-of-source builds e.g. using tox. Another thing I added

Re: request for access: pypi and dockerhub

2019-12-10 Thread Udi Meiri
Thank you both! On Tue, Dec 10, 2019 at 1:58 PM Ahmet Altay wrote: > Udi, added 'udim' as a maintainer to pypi. > Pablo, I updated you to be an owner. You should have privileges now. > > Ahmet > > On Tue, Dec 10, 2019 at 12:01 PM Pablo Estrada wrote: > >> I've added you as a maintainer in

Re: Cython unit test suites running without Cythonized sources

2019-12-10 Thread Udi Meiri
To follow up, since I'm trying to run cython-based tests using pytest: - tox does in fact correctly install apache-beam with cythonized modules in its virtualenv. - Since our tests are under apache_beam/, local sources shadow those in the installed apache_beam package. - The original issue I
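
One way to check whether an import resolved to a compiled extension or to a plain .py source (the shadowing situation described above) is sketched below. The helper name `is_extension_module` is hypothetical; it uses only the standard library and is an illustration, not the approach taken in the thread.

```python
import importlib.machinery
import importlib.util


def is_extension_module(module_name: str) -> bool:
    """Return True if module_name resolves to a compiled extension file."""
    spec = importlib.util.find_spec(module_name)
    if spec is None or spec.origin is None:
        # Module not found, or it has no file-backed origin.
        return False
    # EXTENSION_SUFFIXES lists the platform's suffixes for compiled modules
    # (for example ".so" on Linux or ".pyd" on Windows).
    suffixes = tuple(importlib.machinery.EXTENSION_SUFFIXES)
    return spec.origin.endswith(suffixes)
```

Running this from the directory where the tests execute, with the name of a module expected to be cythonized, would show whether the local source tree or the installed package is being picked up.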

Re: Pipeline parameters for running jobs in a cluster

2019-12-10 Thread Ankur Goenka
Hi Matthew, For 1: Beam does not compute the right configuration for the pipeline, so it's recommended to tune it manually, as is done in regular Spark jobs. For 2: The recommendation is the same as that for a regular Spark job. Thanks, Ankur On Tue, Dec 10, 2019 at 2:46 PM Matthew K. wrote: >

Pipeline parameters for running jobs in a cluster

2019-12-10 Thread Matthew K.
Hi, To run a Beam job on a Spark cluster with some number of nodes running: 1. Is it recommended to set pipeline parameters --num_workers, --max_num_workers, --autoscaling_algorithms, --worker_machine_type, etc., or will Beam (Spark) figure that out? 2. If that is recommended to set

Re: request for access: pypi and dockerhub

2019-12-10 Thread Ahmet Altay
Udi, added 'udim' as a maintainer to pypi. Pablo, I updated you to be an owner. You should have privileges now. Ahmet On Tue, Dec 10, 2019 at 12:01 PM Pablo Estrada wrote: > I've added you as a maintainer in docker hub. > I don't have privileges to add you in pypi. +Ahmet Altay > can you? >

Re: Links to Java API docs in Beam Website documentation (Was: Version Beam Website Documentation)

2019-12-10 Thread Pablo Estrada
+1 to rely on expanding {{site.release_latest}}. On Tue, Dec 10, 2019 at 12:05 PM Brian Hulette wrote: > I was thinking about this recently as well. I requested we add a link to > the java API docs in a website change [1]. I searched around a bit to look > for precedent on how to do this, but I

[Proposal] Slowly Changing Dimensions and Distributed Map Side Inputs (in Dataflow)

2019-12-10 Thread Mikhail Gryzykhin
"Good news, everyone-" ―Farnsworth Hi everyone, Recently, I was looking into relaxing limitations on side inputs in Dataflow runner. As part of it, I came up with design proposal for standardizing slowly changing dimensions use case in Beam and relevant changes to add support for distributed map

Links to Java API docs in Beam Website documentation (Was: Version Beam Website Documentation)

2019-12-10 Thread Brian Hulette
I was thinking about this recently as well. I requested we add a link to the java API docs in a website change [1]. I searched around a bit to look for precedent on how to do this, but I found three different methods: - Links to a specific version (e.g.

Re: request for access: pypi and dockerhub

2019-12-10 Thread Pablo Estrada
I've added you as a maintainer in docker hub. I don't have privileges to add you in pypi. +Ahmet Altay can you? -P. On Mon, Dec 9, 2019 at 4:34 PM Udi Meiri wrote: > Hi, > > I'm following the release guide, and it says I need > access to a

Re: Python: pytest migration update

2019-12-10 Thread Udi Meiri
On Mon, Dec 9, 2019 at 9:33 PM Kenneth Knowles wrote: > > > On Mon, Dec 9, 2019 at 6:34 PM Udi Meiri wrote: > >> Valentyn, the speedup is due to parallelization. >> >> On Mon, Dec 9, 2019 at 6:12 PM Chad Dombrova wrote: >> >>> >>> On Mon, Dec 9, 2019 at 5:36 PM Udi Meiri wrote: >>> I

Re: OnTimerContext timestamp weird behavior?

2019-12-10 Thread Jan Lukavský
Hi Marek, that is because you set the timer to be relative. The baseline time will then be set to the current watermark, which, without being initialized, is equal to BoundedWindow.TIMESTAMP_MIN_VALUE, which is -9223372036854775 (in millis). Adding 10000 (10 seconds) to this yields the result you
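
The arithmetic in this reply can be checked with a small Python sketch. The names below are hypothetical, and deriving the minimum from a signed 64-bit microsecond range is an assumption used here only because it reproduces the millisecond value quoted above; the 10-second offset is taken from the message.

```python
# Smallest signed 64-bit value, treated here as a count of microseconds.
LONG_MIN = -(2 ** 63)

# Microseconds to milliseconds, truncating toward zero.
TIMESTAMP_MIN_MILLIS = -((-LONG_MIN) // 1000)


def relative_timer_target(baseline_millis: int, offset_millis: int) -> int:
    """Firing time of a relative timer: baseline plus offset, in millis."""
    return baseline_millis + offset_millis


# An uninitialized watermark as the baseline, plus a 10-second offset.
target = relative_timer_target(TIMESTAMP_MIN_MILLIS, 10_000)
```

The resulting target still lies only seconds away from the minimum timestamp, which is consistent with the timer reporting a date near year -290308.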

OnTimerContext timestamp weird behavior?

2019-12-10 Thread marek-simunek
Hi, a PROCESSING_TIME or EVENT_TIME timer reports, on the first trigger of the timer function: OnTimerContext context.timestamp()=-290308-12-21T19:59:05.225Z PROCESSING_TIME has a known unresolved issue [1], but I didn't expect that EVENT_TIME would report a weird timestamp in the @OnTimer function in first

Re: [UPDATE] Preparing for Beam 2.17.0 release

2019-12-10 Thread Ismaël Mejía
Mikhail, it seems the generated pom for the SQL module is broken in 2.17.0. We will probably need to create a new RC. For context, see https://issues.apache.org/jira/browse/BEAM-8858 and, still to be confirmed, https://issues.apache.org/jira/browse/BEAM-8917 Cherry-pick PRs coming soon. On Fri, Dec 6, 2019 at 7:04 PM

Re: Quota limitation for Java tests

2019-12-10 Thread Łukasz Gajowy
Of course, fixing https://issues.apache.org/jira/browse/BEAM-8939 is also crucial to avoid resource exhaustion, but I didn't have time to do this. Anyone, feel free to resolve it. Thanks! On Tue, Dec 10, 2019 at 16:25 Łukasz Gajowy wrote: > https://github.com/apache/beam/pull/10342 - pr that

Re: Quota limitation for Java tests

2019-12-10 Thread Łukasz Gajowy
https://github.com/apache/beam/pull/10342 is a PR that skips the tests listed above; looking for reviewers. Thanks! On Tue, Dec 10, 2019 at 13:30 Łukasz Gajowy wrote: > What I invoked in the apache-beam-testing project: > > gcloud dataflow jobs list --created-before=-P5H --status=active >

Re: [DISCUSS] BIP reloaded

2019-12-10 Thread Łukasz Gajowy
+1 for formalizing the process, enhancing it and documenting clearly. I noticed that Apache Airflow has a cool way of both creating AIPs and keeping track of all of them. There is a "Create new AIP" button on

Re: Hello!

2019-12-10 Thread Ismaël Mejía
Hello Pawel, you have now been added to JIRA. You can now create or select the issues you want to work on. Welcome! Ismaël On Tue, Dec 10, 2019 at 9:33 AM Paweł Pasterz wrote: > Of course, forgot about my JIRA nickname: pawel.pasterz > > On 2019/12/10 07:53:24, Paweł Pasterz wrote: > > My name is

Re: Quota limitation for Java tests

2019-12-10 Thread Łukasz Gajowy
What I invoked in the apache-beam-testing project: gcloud dataflow jobs list --created-before=-P5H --status=active --format="value(JOB_ID)" --region=us-central | xargs gcloud dataflow jobs cancel On Tue, Dec 10, 2019 at 13:28 Łukasz Gajowy wrote: > Hi Kirill, > > We (along with Michał and Kamil)

Re: [DISCUSS] BIP reloaded

2019-12-10 Thread jincheng sun
Thanks for bringing up this discussion, Jan! +1 for clearly defining BIP for Beam. And I think it would be nice to initialize a concept document for BIP. Just a reminder: the document may contain: - How many kinds of improvement there are in Beam. - What kind of improvement requires creating a BIP. - What should

Is org.apache.beam.sdk.transforms.FlattenTest.testFlattenMultipleCoders supposed to be supported ?

2019-12-10 Thread Etienne Chauchot
Hi all, I have a question about the testFlattenMultipleCoders test: This test uses 2 collections: 1. long and null data encoded using NullableCoder(BigEndianLongCoder) 2. long data encoded using VarLongCoder It then flattens the 2 collections and sets the coder of the resulting collection

Re: Hello!

2019-12-10 Thread Paweł Pasterz
Of course, forgot about my JIRA nickname: pawel.pasterz On 2019/12/10 07:53:24, Paweł Pasterz wrote: > My name is Paweł and I am a developer from Warsaw/Poland. I'd like to > contribute to beam project therefore I am kindly asking for JIRA > permissions. > > Regards > Paweł