[PROPOSAL] Make PBegin and PDone public in the Python SDK

2020-07-13 Thread Udi Meiri
Details: One item of interest that came up during the implementation of BEAM-10258 [1] is how to treat PTransforms that act like sources or sinks. These transforms have no input PCollection or no output PCollection, respectively. Internally, we use PBegin and PDone to denote this (e.g., [2]). IIUC, PBegin

Re:

2020-07-13 Thread Ahmet Altay
Welcome! On Fri, Jul 10, 2020 at 12:57 PM Tyson Hamilton wrote: > Welcome! > > On Fri, Jul 10, 2020, 10:38 AM Rui Wang wrote: > >> Welcome! >> >> >> -Rui >> >> On Fri, Jul 10, 2020 at 10:33 AM Kenneth Knowles wrote: >> >>> Welcome to dev@ ! >>> >>> On Fri, Jul 10, 2020 at 2:14 AM Maximilian

Re: Composable DoFn IOs Connection Reuse

2020-07-13 Thread Siyuan Chen
Thanks a lot for the discussions and comments! My conclusions for the proposal (https://s.apache.org/sharded-group-into-batches) are as follows:
- Keep the existing API for GroupIntoBatches as is. No exposure of shard id.
- Enable runner determined sharding as a default (and only) option for

Re: Contributor permission for Beam Jira tickets

2020-07-13 Thread Pablo Estrada
I've added you as a contributor, Siyuan! Welcome! :D On Mon, Jul 13, 2020 at 3:08 PM Siyuan Chen wrote: > Hi, > > This is Siyuan from Google. I am working on scalability improvements for > Dataflow runner. Can someone grant me the contributor permission for Beam > Jira tickets? My Jira username

Contributor permission for Beam Jira tickets

2020-07-13 Thread Siyuan Chen
Hi, This is Siyuan from Google. I am working on scalability improvements for Dataflow runner. Can someone grant me the contributor permission for Beam Jira tickets? My Jira username is sychen. Thanks in advance! -- Best regards, Siyuan

Re: Finer-grained test runs?

2020-07-13 Thread Kenneth Knowles
Some context links for the benefit of the thread & archive:
Beam issue mentioning a Jenkins plugin that caches on the Jenkins master: https://issues.apache.org/jira/browse/BEAM-4400
Beam's request to infra: https://issues.apache.org/jira/browse/INFRA-16630
Denied and reasoning on prior request:

Re: Finer-grained test runs?

2020-07-13 Thread Kenneth Knowles
Having thought this over a bit, I think there are a few goals and they are interfering with each other.
1. Clear signal for module / test suite health. This is a post-commit concern. Post-commit jobs already all run as cronjobs with no dependency-driven stuff.
2. Making precommit test signal stay

Re: Python SDK ReadFromKafka: Timeout expired while fetching topic metadata

2020-07-13 Thread Kamil Wasilewski
I'd like to bump this thread up since I get the same error when trying to read from Kafka in Python SDK: *java.lang.UnsupportedOperationException: The ActiveBundle does not have a registered bundle checkpoint handler.* Can someone familiar with cross-language and Flink verify the problem? I use

Beam Dependency Check Report (2020-07-13)

2020-07-13 Thread Apache Jenkins Server
High Priority Dependency Updates Of Beam Python SDK:
Dependency Name | Current Version | Latest Version | Release Date Of The Current Used Version | Release Date Of The Latest Release | JIRA Issue
cachetools | 3.1.1 | 4.1.1 | 2019-12-23