Re: Aliasing Pub/Sub Lite IO in external repo

2021-06-17 Thread Tomo Suzuki
Hi Daniel, (You helped me apply some change to this strange setup a few months back. Thank you for working on rectifying the situation.) I like that idea overall. Question 1: How are you going to approach testing/CI? The pull requests in the java-pubsublite repo do not trigger Beam repo's CI.

Re: [VOTE] Release 2.30.0, release candidate #1

2021-06-17 Thread Andrew Pilloud
For the website I filed https://issues.apache.org/jira/browse/BEAM-12507 For the jars I filed https://issues.apache.org/jira/browse/BEAM-12508 Andrew On Mon, Jun 14, 2021 at 5:16 PM Justin Mclean wrote: > Hi there, > > I'm not on your PMC but I took a look at your release and noticed >

Aliasing Pub/Sub Lite IO in external repo

2021-06-17 Thread Daniel Collins
Hello beam developers, I'm the primary author of the Pub/Sub Lite I/O, and I'd like to get some feedback on a change to the model for hosting this I/O in beam. Our team has been frustrated by the fact that we have no way to release features or fixes for bugs to customers on time scales shorter

Flaky test issue report (29)

2021-06-17 Thread Beam Jira Bot
This is your daily summary of Beam's current flaky tests (https://issues.apache.org/jira/issues/?jql=project%20%3D%20BEAM%20AND%20statusCategory%20!%3D%20Done%20AND%20labels%20%3D%20flake) These are P1 issues because they have a major negative impact on the community and make it hard to

P1 issues report (41)

2021-06-17 Thread Beam Jira Bot
This is your daily summary of Beam's current P1 issues, not including flaky tests (https://issues.apache.org/jira/issues/?jql=project%20%3D%20BEAM%20AND%20statusCategory%20!%3D%20Done%20AND%20priority%20%3D%20P1%20AND%20(labels%20is%20EMPTY%20OR%20labels%20!%3D%20flake). See

Re: FileIO with custom sharding function

2021-06-17 Thread Robert Bradshaw
Sharding is typically a physical rather than logical property of the pipeline, and I'm not convinced it makes sense to add it to Beam in general. One can already use keys and GBK/Stateful DoFns if some kind of logical grouping is needed, and adding constraints like this can prevent opportunities

Re: Java precomit failing, (though no test are failing)

2021-06-17 Thread Alex Amato
Hmm, perhaps it only happens sometimes. The other half of the time I "Run Java Precommit" on this PR I hit this different failure: The connection is not obvious to me, if it's related to my PR. https://github.com/apache/beam/pull/14804 I only added some Precondition checks. But I don't see those

Re: FileIO with custom sharding function

2021-06-17 Thread je . ik
Alright, but what is worth emphasizing is that we talk about batch workloads. The typical scenario is that the output is committed once the job finishes (e.g., by atomic rename of directory). Jan On 17. 6. 2021 at 17:59, Reuven Lax wrote: Yes - the problem is that Beam makes no guarantees of

Re: [PROPOSAL] Stable URL for "current" API Documentation

2021-06-17 Thread Cristian Constantinescu
Big +1 here. In the past few days I've replaced the 2.*.0 part of the Google-found javadoc URL with 2.29.0 more times than I could count. I should have made a pipeline with a session window to count those replacements though :P On Thu, Jun 17, 2021 at 12:18 PM Robert Bradshaw wrote: > This

Re: [PROPOSAL] Stable URL for "current" API Documentation

2021-06-17 Thread Robert Bradshaw
This makes a lot of sense to me. On Thu, Jun 17, 2021 at 9:03 AM Brian Hulette wrote: > > Hi everyone, > You may have noticed that our API Documentation could really use some SEO. > It's possible to search for Beam APIs (e.g. "beam dataframe read_csv" [1] or > "beam java ParquetIO" [2]) and

[PROPOSAL] Stable URL for "current" API Documentation

2021-06-17 Thread Brian Hulette
Hi everyone, You may have noticed that our API Documentation could really use some SEO. It's possible to search for Beam APIs (e.g. "beam dataframe read_csv" [1] or "beam java ParquetIO" [2]) and you will be directed to some documentation, but it almost always takes you to an old version. I think

Re: FileIO with custom sharding function

2021-06-17 Thread Reuven Lax
Yes - the problem is that Beam makes no guarantees of determinism anywhere in the system. User DoFns might be non-deterministic, and have no way to know (we've discussed providing users with an @IsDeterministic annotation, however empirically users often think their functions are deterministic when

Re: [EXTERNAL] [EXTERNAL]

2021-06-17 Thread Alexey Romanenko
> On 15 Jun 2021, at 22:59, Raphael Sanamyan > wrote: > > Hello, > >> Is it somehow related to this work [1]? > > > No, this work adds the ability to return values from a sql insert query. > There are no improvements to work with row and schema in it. > >> Not sure that I got it. Could

Re: [Proposal] Go SDK Exits Experimental

2021-06-17 Thread Robert Burke
Yup! My immediate plan is to work on incorporating the Go SDK fully into the Beam Programming Guide. I've audited the guide, and am beginning to add missing content and filling in the Go specific gaps. This will be tied to improving the Go Doc with more Go specific user documentation that isn't

CometD IO Connector for Beam

2021-06-17 Thread ZAFFALON Mattia - NTTDATA
Hi, I'm reaching out to ask you a couple of questions regarding the existence, and eventually the state, of a connector for CometD (https://docs.cometd.org/), especially when used as publish/subscribe mechanism. You may have already heard about CometD, but, just to provide some context to

Re: [Proposal] Go SDK Exits Experimental

2021-06-17 Thread Ismaël Mejía
Oops, forgot to write one question. Will this come with revamped website instructions/doc for golang too? On Thu, Jun 17, 2021 at 3:21 PM Ismaël Mejía wrote: > > Huge +1 > > This is definitely something many people have asked about, so it is > great to see it finally happening. > > On Wed, Jun

Re: [Proposal] Go SDK Exits Experimental

2021-06-17 Thread Ismaël Mejía
Huge +1 This is definitely something many people have asked about, so it is great to see it finally happening. On Wed, Jun 16, 2021 at 7:56 PM Kenneth Knowles wrote: > > +1 awesome > > On Wed, Jun 16, 2021 at 10:33 AM Robert Burke wrote: >> >> Sounds reasonable to me. I agree. We'll aim to get

Re: FileIO with custom sharding function

2021-06-17 Thread Jan Lukavský
Correct, by the external shuffle service I pretty much meant "offloading the contents of a shuffle phase out of the system". Looks like that is what Spark's checkpoint does as well. On the other hand (if I understand the concept correctly), that implies some performance penalty - the data