Re: Custom Watermark Instance being created multiple times for KafkaIO

2019-05-22 Thread rahul patwari
Hi Lukasz, There was a bug in my code. When the topic is idle, I indeed get watermark as (now - maxDelay). I have a few questions: I have created a static int variable in my watermark class and incremented the variable inside the constructor. I ran the pipeline in SparkRunner for approximately 3

Re: Contributing Beam Kata (Java & Python)

2019-05-22 Thread hsuryawirawan
Btw the Beam Kata courses have been reviewed and endorsed by JetBrains on Stepik. They are now featured courses, which adds more credibility when students browse the available courses. JetBrains is also interested in coming up with promotional activities for the courses. Will inform here

[VOTE] Release 2.13.0, release candidate #1

2019-05-22 Thread Ankur Goenka
Hi everyone, Please review and vote on the release candidate #1 for the version 2.13.0, as follows: [ ] +1, Approve the release [ ] -1, Do not approve the release (please provide specific comments) The complete staging area is available for your review, which includes: * JIRA release notes [1],

Re: Jenkins not triggering test runs? Be careful with merges.

2019-05-22 Thread Lukasz Cwik
I just noticed this as well on my PR. On Wed, May 22, 2019 at 12:26 PM Jan Lukavský wrote: > Looks like something is wrong with triggers on github side. Triggers are > not triggered on other projects, too. > Jan > -- Original e-mail -- > From: Valentyn Tymofieiev > To: dev >

Re: Jenkins not triggering test runs? Be careful with merges.

2019-05-22 Thread Jan Lukavský
Looks like something is wrong with triggers on github side. Triggers are not triggered on other projects, too. Jan -- Original e-mail -- From: Valentyn Tymofieiev To: dev Date: 22. 5. 2019 21:23:19 Subject: Jenkins not triggering test runs? Be careful with merges. " Is

Re: Dataflow runner with Apache Beam 2.12

2019-05-22 Thread Lukasz Cwik
Have you tried following the "Troubleshooting your pipeline" guide [1]? Have you tried to reach out to Google Cloud support with an example job id? 1: https://cloud.google.com/dataflow/docs/guides/troubleshooting-your-pipeline On Wed, May 22, 2019 at 12:16 PM pasquale.bon...@gmail.com <

Jenkins not triggering test runs? Be careful with merges.

2019-05-22 Thread Valentyn Tymofieiev
Is something happening with Jenkins infra? On a recent PR of mine precommit tests have not been triggered for at least 10 min, and are still not triggered. Another PR was merged before tests on a new revision of PR started to run, while passing test signal actually referred to an earlier

Dataflow runner with Apache Beam 2.12

2019-05-22 Thread pasquale . bonito
Hi all, I would like to use the streaming engine in my project to see if performance improves. In order to do that, I upgraded my project dependencies from google-cloud-dataflow-java-sdk-all 2.5.0 to beam-sdks-java-core 2.12.0. I also added a dependency on beam-runners-google-cloud-dataflow-java.

Re: Custom Watermark Instance being created multiple times for KafkaIO

2019-05-22 Thread Lukasz Cwik
On Wed, May 22, 2019 at 11:17 AM rahul patwari wrote: > will watermark also get checkpointed by default along with the offset of > the partition? > > We have found a limitation for CustomTimestampPolicyWithLimitedDelay. Consider > this scenario: > If we are processing a stream of events from

Re: Custom Watermark Instance being created multiple times for KafkaIO

2019-05-22 Thread rahul patwari
will watermark also get checkpointed by default along with the offset of the partition? We have found a limitation for CustomTimestampPolicyWithLimitedDelay. Consider this scenario: If we are processing a stream of events from Kafka with event timestamps older than the current processing time(say

Re: Environments for External Transforms

2019-05-22 Thread Lukasz Cwik
2(c) can also be "hacked" inside an SDK as an explicit environment override by the "user" where the expansion service isn't involved and the user/SDK manipulates the expansion service response. As Chamikara pointed out, I believe the response from the expansion service should be "safe" instead of

Re: Environments for External Transforms

2019-05-22 Thread Chamikara Jayalath
On Wed, May 22, 2019 at 9:17 AM Maximilian Michels wrote: > Hi, > > Robert and I were discussing the subject of user-specified > environments for external transforms [1]. We couldn't decide whether > users should have direct control over the environment when they use an > external transform

Re: RedisIO refactoring

2019-05-22 Thread Alexey Romanenko
On 21 May 2019, at 22:06, Ismaël Mejía wrote: > > After a quick review of the code now I think I understand why it was modeled > as KV in the first place, the library that RedisIO uses > (Jedis) only supports 'mget' operation on Strings, so the first issue would > be to find a way to do the

Environments for External Transforms

2019-05-22 Thread Maximilian Michels
Hi, Robert and I were discussing the subject of user-specified environments for external transforms [1]. We couldn't decide whether users should have direct control over the environment when they use an external transform in their pipeline. In my mind, it is quite natural that the

Re: Proposal: Add permanent url to community metrics dashboard

2019-05-22 Thread Kenneth Knowles
I suggest asking infra about the best way to proceed, so that we don't vote on something that doesn't work for them. This might be something handy to spin up easily for any Apache project using similar tools. Kenn On Tue, May 21, 2019 at 1:02 PM Mikhail Gryzykhin wrote: > Current

Re: Better naming for runner specific options

2019-05-22 Thread Maximilian Michels
+1 On 22.05.19 04:28, Reza Rokni wrote: Hi, Coming back to this, is the general consensus that this can be addressed via https://issues.apache.org/jira/browse/BEAM-6531 in Beam 3.0? Cheers Reza On Tue, 7 May 2019 at 23:15, Valentyn Tymofieiev > wrote: I

Re: Please add me to BEAM JIRA

2019-05-22 Thread Lukasz Cwik
Welcome. I have added you as a contributor and assigned BEAM-7101 to you. On Tue, May 21, 2019 at 8:49 PM E. J. Arens wrote: > Hi, > > As instructed by Valentyn Tymofieiev at > https://issues.apache.org/jira/browse/BEAM-7101 > > Could you please add me to Beam JIRA, so that you can assign the

Re: [Discussion] A tweak to existing large iterable protocol?

2019-05-22 Thread Robert Bradshaw
On Tue, May 21, 2019 at 9:33 PM Lukasz Cwik wrote: > I don't think the runner needs to know the size of the iterable ahead of > time for encoding, the binary format could be something like: > first page | token for first page | token for second page > The only difference between the current

Re: Definition of Unified model

2019-05-22 Thread Maximilian Michels
Someone from Flink might correct me if I'm wrong, but that's my current understanding. In essence your description of how exactly-once works in Flink is correct. The general assumption in Flink is that pipelines must be deterministic and thus produce idempotent writes in the case of