Re: RedisIO refactoring

2019-05-20 Thread trsell
Hi, I created a custom RedisIO for similar reasons, we wanted more operation types, but also, we wanted to be able to set expiry differently per key. https://github.com/gojek/feast/blob/master/ingestion/src/main/java/feast/store/serving/redis/RedisCustomIO.java We ended up with a PTransform

beam_PreCommit_Python_PVR_Flink_Commit most perma-red

2019-05-20 Thread Udi Meiri
FYI, I opened an issue here: https://issues.apache.org/jira/browse/BEAM-7378 Please triage if you know how these tests work. Thanks!

Re: [Discuss] Ideas for Apache Beam presence in social media

2019-05-20 Thread Austin Bennett
Is PMC definitely in charge of this (approving, communication channel, etc)? There could even be a more concrete pull-request-like function even for things like tweets (to minimize cut/paste operations)? I remember a bit of a mechanism having been proposed some time ago (in another

[Discussion] A tweak to existing large iterable protocol?

2019-05-20 Thread Ruoyun Huang
Hi, Folks, We propose to make a tweak to the existing fnapi Large Iterable (result from GBK) protocol. Would like to see what everyone thinks. *To clarify a few terms used:* *[large iterable]* A list of elements that is too expensive to hold entirely in memory; to store a single element is

Re: [Discuss] Ideas for Apache Beam presence in social media

2019-05-20 Thread Robert Burke
+1 As a twitter user, I like this idea. On Mon, 20 May 2019 at 15:18, Aizhamal Nurmamat kyzy wrote: > Hello everyone, > > What does the community think of making Apache Beam’s social media > presence more active and more community driven? > > The Slack and StackOverflow for Apache Beam offer

Re: [VOTE] Remove deprecated Java Reference Runner code from repository.

2019-05-20 Thread Daniel Oliveira
Pablo has merged the PR in and assigned a tag to the commit to make the ULR code easy to find in the future (java-ulr-removal). The Java ULR is officially removed! On Fri, May 17, 2019 at 4:59 PM Daniel Oliveira wrote: > It's been 72 hours

Re: Definition of Unified model (WAS: Semantics of PCollection.isBounded)

2019-05-20 Thread Reza Rokni
Hi, If I have understood the use case correctly, your output is an ordered counter of state changes. One approach which might be worth exploring is outlined below, haven't had a chance to test it so could be missing pieces or be plain old wrong (will try and come up with a test example later

Re: Definition of Unified model (WAS: Semantics of PCollection.isBounded)

2019-05-20 Thread Kenneth Knowles
Thanks for the nice small example of a calculation that depends on order. You are right that many state machines have this property. I agree w/ you and Luke that it is convenient for batch processing to sort by event timestamp before running a stateful ParDo. In streaming you could also implement

Beam Dependency Check Report (2019-05-20)

2019-05-20 Thread Apache Jenkins Server
ERROR: File 'src/build/dependencyUpdates/beam-dependency-check-report.html' does not exist

[Discuss] Ideas for Apache Beam presence in social media

2019-05-20 Thread Aizhamal Nurmamat kyzy
Hello everyone, What does the community think of making Apache Beam’s social media presence more active and more community driven? The Slack and StackOverflow for Apache Beam offer pretty nice support, but we still could utilize Twitter & LinkedIn better to share more interesting Beam news. For

Re: Definition of Unified model (WAS: Semantics of PCollection.isBounded)

2019-05-20 Thread Jan Lukavský
Yes, the problem will probably arise mostly when your keys are not well distributed (or you have too few keys). I'm really not sure if a pure GBK with a trigger can solve this - it might help to have a data-driven trigger. There would still be some doubts, though. The main question is still here - people

Re: Definition of Unified model (WAS: Semantics of PCollection.isBounded)

2019-05-20 Thread Lukasz Cwik
It is read all per key and window and not just read all (this still won't scale with hot keys in the global window). The GBK preceding the StatefulParDo will guarantee that you are processing all the values for a specific key and window at any given time. Is there a specific window/trigger that is

Re: Definition of Unified model (WAS: Semantics of PCollection.isBounded)

2019-05-20 Thread Jan Lukavský
Hi Lukasz, > Today, if you must have a strict order, you must guarantee that your StatefulParDo implements the necessary "buffering & sorting" into state. Yes, no problem with that. But this whole discussion started, because *this doesn't work on batch*. You simply cannot first read

Re: Definition of Unified model (WAS: Semantics of PCollection.isBounded)

2019-05-20 Thread Lukasz Cwik
On Mon, May 20, 2019 at 8:24 AM Jan Lukavský wrote: > This discussion brings many really interesting questions for me. :-) > > > I don't see batch vs. streaming as part of the model. One can have > microbatch, or even a runner that alternates between different modes. > > Although I understand

Fwd: FW: Travel Assistance for ApacheCon NA Las Vegas 2019 now open.

2019-05-20 Thread Aizhamal Nurmamat kyzy
If it helps anyone. -- Forwarded message - From: Christofer Dutz Date: Fri, May 17, 2019 at 12:30 AM Subject: FW: Travel Assistance for ApacheCon NA Las Vegas 2019 now open. To: d...@training.apache.org Hi all, I just wanted to let you know that we finally managed to open the

Re: Proposal: Add permanent url to community metrics dashboard

2019-05-20 Thread Mikhail Gryzykhin
@Ahmet Altay Thank you for the comment. The point on search engines is really good. If that happens we can look into configuring robots.txt to tell search engines to ignore the whole domain. The link is a redirect to a static IP. So it is still confusing. Having a domain name will allow for getting SSL

Re: Dealing with incompatible changes in build system on LTS releases

2019-05-20 Thread Lukasz Cwik
On Fri, May 17, 2019 at 4:44 AM Michael Luckey wrote: > > > On Wed, May 15, 2019 at 8:56 PM Kenneth Knowles wrote: > >> >> >> On Wed, May 15, 2019 at 11:21 AM Lukasz Cwik wrote: >> >>> >>> >>> *From: *Michael Luckey >>> *Date: *Tue, May 14, 2019 at 11:42 PM >>> *To: * >>> >>> Hi,

Re: Proposal: Add permanent url to community metrics dashboard

2019-05-20 Thread Ahmet Altay
Hi Mikhail, Thank you for your work on this. I have some comments: - There is already a short link (https://s.apache.org/beam-community-metrics). Would a link from the contributing to Beam page (if there is not one already) be sufficient? People can bookmark the short link if they need to quickly

Beam Dependency Check Report (2019-05-20)

2019-05-20 Thread Apache Jenkins Server
High Priority Dependency Updates Of Beam Python SDK: Dependency Name Current Version Latest Version Release Date Of the Current Used Version Release Date Of The Latest Release JIRA Issue future 0.16.0 0.17.1 2016-10-27

Re: [BEAM-6673] pre-commit checks failing due to test failure in unrelated Docker module

2019-05-20 Thread charith . ellawala
That was quick! Thanks Maximilian and Alexey. On 2019/05/20 15:47:10, Alexey Romanenko wrote: > Charith, > > Yes, it was caused by changing the scope of the "kafka-clients" dependency. Now it > should work fine (I ran the “JavaPortabilityApi” test on your PR and it’s green), > thanks to Maximilian

Re: [BEAM-6673] pre-commit checks failing due to test failure in unrelated Docker module

2019-05-20 Thread Alexey Romanenko
Charith, Yes, it was caused by changing the scope of the "kafka-clients" dependency. Now it should work fine (I ran the “JavaPortabilityApi” test on your PR and it’s green), thanks to Maximilian Michels for the quick fix. Sorry for the inconvenience. > On 20 May 2019, at 16:24, Maximilian Michels wrote: > >

Re: RedisIO refactoring

2019-05-20 Thread Ismaël Mejía
Hello Varun, This is an excellent idea because Redis already supports byte arrays as both keys and values. A more generic approach makes total sense. So worth a JIRA / PR. About the compatibility concerns, RedisIO is tagged as @Experimental which means we can still evolve its API. Currently we

Re: Definition of Unified model (WAS: Semantics of PCollection.isBounded)

2019-05-20 Thread Jan Lukavský
This discussion brings many really interesting questions for me. :-) > I don't see batch vs. streaming as part of the model. One can have microbatch, or even a runner that alternates between different modes. Although I understand the motivation of this statement, this project's name is "Apache

Re: Definition of Unified model (WAS: Semantics of PCollection.isBounded)

2019-05-20 Thread Robert Bradshaw
On Mon, May 20, 2019 at 1:19 PM Jan Lukavský wrote: > > Hi Robert, > > yes, I think you rephrased my point - although no *explicit* guarantees > of ordering are given in either mode, there is *implicit* ordering in > streaming case that is due to nature of the processing - the difference >

Re: [BEAM-6673] pre-commit checks failing due to test failure in unrelated Docker module

2019-05-20 Thread Maximilian Michels
I've opened a PR: https://github.com/apache/beam/pull/8625 Due to the provided scope, we have to add it as an explicit dependency to the container task. Thanks, Max On 20.05.19 15:16, Robert Bradshaw wrote: I created https://issues.apache.org/jira/browse/BEAM-7367 On Mon, May 20, 2019 at

Re: [BEAM-6673] pre-commit checks failing due to test failure in unrelated Docker module

2019-05-20 Thread charith . ellawala
Thank you. On 2019/05/20 13:16:06, Robert Bradshaw wrote: > I created https://issues.apache.org/jira/browse/BEAM-7367 > > On Mon, May 20, 2019 at 3:11 PM Michael Luckey wrote: > > > > This is most likely caused by Merge of > > https://issues.apache.org/jira/browse/BEAM-7349, which was done

Re: [BEAM-6673] pre-commit checks failing due to test failure in unrelated Docker module

2019-05-20 Thread Robert Bradshaw
I created https://issues.apache.org/jira/browse/BEAM-7367 On Mon, May 20, 2019 at 3:11 PM Michael Luckey wrote: > > This is most likely caused by Merge of > https://issues.apache.org/jira/browse/BEAM-7349, which was done lately. > > Best, > > michel > > On Mon, May 20, 2019 at 2:49 PM Charith

Re: [BEAM-6673] pre-commit checks failing due to test failure in unrelated Docker module

2019-05-20 Thread Michael Luckey
This is most likely caused by Merge of https://issues.apache.org/jira/browse/BEAM-7349, which was done lately. Best, michel On Mon, May 20, 2019 at 2:49 PM Charith Ellawala wrote: > Hello, > > I am trying to create a PR for BEAM-6673 which adds schema support for > BigQuery reads

Re: [BEAM-6673] pre-commit checks failing due to test failure in unrelated Docker module

2019-05-20 Thread Robert Bradshaw
I have no idea about this failure, but it sounds like you've done due diligence looking into it at this point and it makes sense to ask some reviewers to take a look at your code which can happen in parallel to figuring out the root cause of this kafka issue before it finally gets submitted. On

Re: Contributor permissions for Beam JIRA

2019-05-20 Thread Maximilian Michels
You should have the contributor permission now. Cheers, Max On 20.05.19 11:59, Kamil Wasilewski wrote: Here's my username: kamilwu Kamil On Mon, May 20, 2019 at 11:47 AM Maximilian Michels wrote: Hi Kamil, That sounds great. Could you send me your JIRA

[BEAM-6673] pre-commit checks failing due to test failure in unrelated Docker module

2019-05-20 Thread Charith Ellawala
Hello, I am trying to create a PR for BEAM-6673 which adds schema support for BigQuery reads (https://github.com/apache/beam/pull/8620). However, one of the pre-commit tests is failing in the (unrelated) Docker module: *Task :sdks:java:container:docker* FAILED ADD failed: stat

Beam Dependency Check Report (2019-05-20)

2019-05-20 Thread Apache Jenkins Server
ERROR: File 'src/build/dependencyUpdates/beam-dependency-check-report.html' does not exist

Re: Definition of Unified model (WAS: Semantics of PCollection.isBounded)

2019-05-20 Thread Jan Lukavský
On 5/20/19 1:39 PM, Reuven Lax wrote: On Mon, May 20, 2019 at 4:19 AM Jan Lukavský wrote: Hi Robert, yes, I think you rephrased my point - although no *explicit* guarantees of ordering are given in either mode, there is *implicit* ordering in

Re: Definition of Unified model (WAS: Semantics of PCollection.isBounded)

2019-05-20 Thread Reuven Lax
On Mon, May 20, 2019 at 4:19 AM Jan Lukavský wrote: > Hi Robert, > > yes, I think you rephrased my point - although no *explicit* guarantees > of ordering are given in either mode, there is *implicit* ordering in > streaming case that is due to nature of the processing - the difference > between

Re: Definition of Unified model (WAS: Semantics of PCollection.isBounded)

2019-05-20 Thread Jan Lukavský
Hi Robert, yes, I think you rephrased my point - although no *explicit* guarantees of ordering are given in either mode, there is *implicit* ordering in streaming case that is due to nature of the processing - the difference between watermark and timestamp of elements flowing through the

Re: Definition of Unified model (WAS: Semantics of PCollection.isBounded)

2019-05-20 Thread Robert Bradshaw
On Fri, May 17, 2019 at 4:48 PM Jan Lukavský wrote: > > Hi Reuven, > > > How so? AFAIK stateful DoFns work just fine in batch runners. > > Stateful ParDo works in batch as far as the logic inside the state works for > absolutely unbounded out-of-orderness of elements. That basically >

Re: Contributor permissions for Beam JIRA

2019-05-20 Thread Kamil Wasilewski
Here's my username: kamilwu Kamil On Mon, May 20, 2019 at 11:47 AM Maximilian Michels wrote: > Hi Kamil, > > That sounds great. Could you send me your JIRA username? I couldn't find > your account on JIRA. > > Thanks, > Max > > On 20.05.19 11:27, Kamil Wasilewski wrote: > > Hi, > > I am Kamil

Re: Contributor permissions for Beam JIRA

2019-05-20 Thread Maximilian Michels
Hi Kamil, That sounds great. Could you send me your JIRA username? I couldn't find your account on JIRA. Thanks, Max On 20.05.19 11:27, Kamil Wasilewski wrote: Hi, I am Kamil Wasilewski and I would like to start making improvements to the Beam Python SDK. I would like to assign JIRA issues

Contributor permissions for Beam JIRA

2019-05-20 Thread Kamil Wasilewski
Hi, I am Kamil Wasilewski and I would like to start making improvements to the Beam Python SDK. I would like to assign JIRA issues to myself. Can someone mark me as a contributor? Thanks, Kamil

Re: FlinkRunner CheckPoint Failed - Couldn't materialized/TypeSerialization

2019-05-20 Thread Maximilian Michels
Hi, Since you get AbstractMethodError, there is likely a version mismatch with Beam/Flink. Could you please provide: - Beam version used - Flink Runner artifact used - Flink version used Thanks, Max On 20.05.19 01:16, cm...@godaddy.com wrote: Hello Beam Dev, I am having a hard time to