Re: [RESULT] [VOTE] Release 2.33.0, release candidate #1

2021-09-29 Thread Claire McGinty
Thanks Udi! I just tagged you in a cherrypick PR.

- Claire

On Wed, Sep 29, 2021 at 3:39 PM Udi Meiri  wrote:

> Yes, we can include this fix in RC2. I haven't started preparing it yet.
> Could someone prepare a cherrypick?
>
> On Wed, Sep 29, 2021 at 10:10 AM Ryan Skraba  wrote:
>
>> Just to follow up -- the cherry-pick of
>> https://github.com/apache/beam/pull/15616 changes the default value of
>> a configuration option that appears for the first time in 2.33.0.
>>
>> I think it's a strong argument for making the change now, before
>> unwary developers start using the wrong default.  I understand that
>> it's *extremely* late in the release cycle!
>>
>> All my best, Ryan
>>
>> On Wed, Sep 29, 2021 at 6:50 PM Alexey Romanenko
>>  wrote:
>> >
>> > Is it still possible to cherry-pick this fix [1][2] since it’s a recent
>> > regression that touches pipelines in production?
>> >
>> > [1] https://github.com/apache/beam/pull/15616
>> > [2] https://issues.apache.org/jira/browse/BEAM-12628
>> >
>> > On 28 Sep 2021, at 03:31, Udi Meiri  wrote:
>> >
>> > I spoke too soon. We will be doing an rc2
>> >
>> > On Mon, Sep 27, 2021 at 1:29 PM Udi Meiri  wrote:
>> >>
>> >> I'm happy to announce that we have unanimously approved this release.
>> >>
>> >> There are 8 approving votes, 4 of which are binding:
>> >> * Ahmet Altay
>> >> * Alexey Romanenko
>> >> * Robert Bradshaw
>> >> * Chamikara Jayalath
>> >>
>> >> There are no disapproving votes.
>> >>
>> >> Thanks everyone!
>> >>
>> >
>>
>


Flaky test issue report (33)

2021-09-29 Thread Beam Jira Bot
This is your daily summary of Beam's current flaky tests 
(https://issues.apache.org/jira/issues/?jql=project%20%3D%20BEAM%20AND%20statusCategory%20!%3D%20Done%20AND%20labels%20%3D%20flake)

These are P1 issues because they have a major negative impact on the community 
and make it hard to determine the quality of the software.

https://issues.apache.org/jira/browse/BEAM-12928: beam_PostCommit_Python36 
- CrossLanguageSpannerIOTest - flakey failing (created 2021-09-21)
https://issues.apache.org/jira/browse/BEAM-12912: 
:runners:direct-java:runMobileGamingJavaDirect flakey failure in 
PostRelease_NightlySnapshot (created 2021-09-17)
https://issues.apache.org/jira/browse/BEAM-12908: 
[beam_PostCommit_Java_DataflowV1] [TestName] 
org.apache.beam.sdk.io.gcp.pubsublite.ReadWriteIT Failing (created 2021-09-16)
https://issues.apache.org/jira/browse/BEAM-12861: 
apache_beam.ml.gcp.recommendations_ai_test_it.RecommendationAIIT.test_create_catalog_item
  is flaky (created 2021-09-09)
https://issues.apache.org/jira/browse/BEAM-12859: 
org.apache.beam.runners.dataflow.worker.fn.logging.BeamFnLoggingServiceTest.testMultipleClientsFailingIsHandledGracefullyByServer
 is flaky (created 2021-09-08)
https://issues.apache.org/jira/browse/BEAM-12809: 
testTwoTimersSettingEachOtherWithCreateAsInputBounded flaky (created 2021-08-26)
https://issues.apache.org/jira/browse/BEAM-12766: Already Exists: Dataset 
apache-beam-testing:python_bq_file_loads_NNN (created 2021-08-16)
https://issues.apache.org/jira/browse/BEAM-12694: DICOMIoIntegrationTest 
flaky due to store ID (Python PreCommit) (created 2021-07-30)
https://issues.apache.org/jira/browse/BEAM-12540: 
beam_PostRelease_NightlySnapshot - Task 
:runners:direct-java:runMobileGamingJavaDirect FAILED (created 2021-06-25)
https://issues.apache.org/jira/browse/BEAM-12515: Python PreCommit flaking 
in PipelineOptionsTest.test_display_data (created 2021-06-18)
https://issues.apache.org/jira/browse/BEAM-12322: Python precommit flaky: 
Failed to read inputs in the data plane (created 2021-05-10)
https://issues.apache.org/jira/browse/BEAM-12320: 
PubsubTableProviderIT.testSQLSelectsArrayAttributes[0] failing in SQL 
PostCommit (created 2021-05-10)
https://issues.apache.org/jira/browse/BEAM-12291: 
org.apache.beam.runners.flink.ReadSourcePortableTest.testExecution[streaming: 
false] is flaky (created 2021-05-05)
https://issues.apache.org/jira/browse/BEAM-12200: 
SamzaStoreStateInternalsTest is flaky (created 2021-04-20)
https://issues.apache.org/jira/browse/BEAM-12163: Python GHA PreCommits 
flake with grpc.FutureTimeoutError on SDK harness startup (created 2021-04-13)
https://issues.apache.org/jira/browse/BEAM-12061: beam_PostCommit_SQL 
failing on KafkaTableProviderIT.testFakeNested (created 2021-03-27)
https://issues.apache.org/jira/browse/BEAM-11837: Java build flakes: 
"Memory constraints are impeding performance" (created 2021-02-18)
https://issues.apache.org/jira/browse/BEAM-11661: hdfsIntegrationTest 
flake: network not found (py38 postcommit) (created 2021-01-19)
https://issues.apache.org/jira/browse/BEAM-11645: beam_PostCommit_XVR_Flink 
failing (created 2021-01-15)
https://issues.apache.org/jira/browse/BEAM-11641: Bigquery Read tests are 
flaky on Flink runner in Python PostCommit suites (created 2021-01-15)
https://issues.apache.org/jira/browse/BEAM-11541: 
testTeardownCalledAfterExceptionInProcessElement flakes on direct runner. 
(created 2020-12-30)
https://issues.apache.org/jira/browse/BEAM-10955: Flink Java Runner test 
flake: Could not find Flink job (FlinkJobNotFoundException) (created 2020-09-23)
https://issues.apache.org/jira/browse/BEAM-10866: 
PortableRunnerTestWithSubprocesses.test_register_finalizations flaky on macOS 
(created 2020-09-09)
https://issues.apache.org/jira/browse/BEAM-10485: Failure / flake: 
ElasticsearchIOTest > testWriteWithIndexFn (created 2020-07-14)
https://issues.apache.org/jira/browse/BEAM-9649: 
beam_python_mongoio_load_test started failing due to mismatched results 
(created 2020-03-31)
https://issues.apache.org/jira/browse/BEAM-8453: Failure in 
org.apache.beam.sdk.io.jms.JmsIOTest.testCheckpointMarkSafety (created 
2019-10-21)
https://issues.apache.org/jira/browse/BEAM-8101: Flakes in 
ParDoLifecycleTest.testTeardownCalledAfterExceptionInStartBundleStateful for 
Direct, Spark, Flink (created 2019-08-27)
https://issues.apache.org/jira/browse/BEAM-8035: 
WatchTest.testMultiplePollsWithManyResults flake: Outputs must be in timestamp 
order (sickbayed) (created 2019-08-22)
https://issues.apache.org/jira/browse/BEAM-7827: 
MetricsTest$AttemptedMetricTests.testAllAttemptedMetrics is flaky on 
DirectRunner (created 2019-07-26)
https://issues.apache.org/jira/browse/BEAM-7752: Java Validates 
DirectRunner: testTeardownCalledAfterExceptionInFinishBundleStateful flaky 
(created 2019-07-16)

Re: Will Apache Beam adopt a Pandas-like syntax to program in Python?

2021-09-29 Thread Brian Hulette
Hi David,

Yes! Apache Beam now has a DataFrame API [1], which provides similar
functionality. It graduated from experimental status in Beam 2.32.0 [2]. You
can see some example pipelines that use it here [3].

Brian

[1] https://beam.apache.org/documentation/dsls/dataframes/overview/
[2] https://beam.apache.org/blog/beam-2.32.0/
[3]
https://github.com/apache/beam/tree/master/sdks/python/apache_beam/examples/dataframe

On Wed, Sep 29, 2021 at 12:14 PM David Ciudad Gomez <
david.ciudad.go...@gmail.com> wrote:

> Hi,
>
> Apache Spark is adopting a new Pandas-like syntax (
> https://github.com/databricks/koalas) for programming in Python. Will
> Apache Beam adopt a similar syntax in the future?
>
> Thanks and best regards.
>
> David Ciudad
>


Re: SBE Beam Schema

2021-09-29 Thread Kenneth Knowles
Thanks for the super clear summary. Makes sense!

On Tue, Sep 28, 2021 at 9:57 AM Zachary Houfek  wrote:

> Hi, everyone,
>
> This is a follow-up to my original proposal for adding SBE message support
> in Beam. A couple things came out in that review:
>
> 1. A coder-focused design for working with SBE types directly can cause a
> lot of confusion.
> 2. Since the primary use case will be IO support, it would be best to
> focus on `PayloadSerializer` support. In other words, Beam schemas should
> be the primary focus, not future work.
>
> Based on that, I wrote up another doc for how to map SBE schemas to Beam
> schemas. I tried looking over some existing schemas (mostly proto ones) to
> make sure I wasn't doing anything  too weird, but I would love some
> feedback:
>
>  SBE Schema in Beam
> 
>
> Thanks,
> Zach
>
> --
>
> Zachary Houfek
>
> Software Engineer
>
> DataPLS PLAT
>
> zhou...@google.com
>


Will Apache Beam adopt a Pandas-like syntax to program in Python?

2021-09-29 Thread David Ciudad Gomez
Hi,

Apache Spark is adopting a new Pandas-like syntax (
https://github.com/databricks/koalas) for programming in Python. Will
Apache Beam adopt a similar syntax in the future?

Thanks and best regards.

David Ciudad


Contributor permission for Beam Jira tickets

2021-09-29 Thread David Prieto Rivera
Hi,

My name is David Prieto. I work at Europcar Mobility Group as a Data
Engineer. We are currently using Beam for both batch and streaming
pipelines, and I would like to be added as a contributor to the project so
that I can assign issues to myself.

I created this issue a few days ago:
https://issues.apache.org/jira/browse/BEAM-12950 and I would like to assign
it to myself.

My Jira Id is: davidpr

Thank you,

David


Re: Multi Environment Support

2021-09-29 Thread Ke Wu
Thanks for the advice.

Here is some more background:

We are building a feature called “split deployment” so that we can isolate
the framework/platform core from user code and dependencies. This addresses a
couple of operational challenges, such as dependency conflicts and
alert/exception triaging.

With Beam’s portability framework, the runner and SDK worker processes
naturally decouple Beam core from user UDFs (DoFns), which is awesome! On top
of this, we could further distinguish DoFns that the end user authors from
DoFns that the platform provides. We would therefore like these DoFns to be
executed in different environments, even when they are in the same language,
e.g. Java.

I am therefore looking for approaches and recommendations on the proper way
to do that.
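
To make the goal concrete, here is a toy sketch in plain Python. It does not
use Beam's APIs or the pipeline proto; the transform names, the environment
ids ("platform_java", "user_java") and both helper functions are hypothetical.
It only illustrates the idea from the quoted replies below: pick an
environment per transform, and let transforms whose environments differ end up
in separate stages.

```python
from itertools import groupby


def assign_environments(transform_names, platform_transforms,
                        platform_env="platform_java", user_env="user_java"):
    """Pairs each transform name with a (hypothetical) environment id."""
    return [
        (name, platform_env if name in platform_transforms else user_env)
        for name in transform_names
    ]


def split_into_stages(assigned):
    """Groups consecutive transforms that share an environment.

    A change of environment starts a new stage, so transforms with
    different environments never share a stage in this toy model.
    """
    return [
        (env, [name for name, _ in group])
        for env, group in groupby(assigned, key=lambda pair: pair[1])
    ]


# A linear toy pipeline; real pipelines are graphs, not lists.
pipeline = ["ReadEvents", "PlatformValidate", "UserEnrich", "UserScore",
            "PlatformWrite"]
platform = {"ReadEvents", "PlatformValidate", "PlatformWrite"}

stages = split_into_stages(assign_environments(pipeline, platform))
for env, names in stages:
    print(env, names)
```

In this model the two user-authored transforms form their own stage between
the platform stages, which is the isolation we are after.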

Let me know your thoughts, any feedback/advice is welcome. 

Best,
Ke

> On Sep 27, 2021, at 11:56 AM, Luke Cwik  wrote:
> 
> Resource hints have a limited use case and might fit your need.
> You could also try to use the expansion service XLang route to bring in a 
> different Java environment.
> Finally, you could modify the pipeline proto that is generated directly to 
> choose which environment is used for which PTransform.
> 
> Can you provide additional details as to why you would want to have two 
> separate java environments (e.g. incompatible versions of libraries)?
> 
> On Wed, Sep 22, 2021 at 3:41 PM Ke Wu wrote:
> Thanks Luke for the reply, do you know what is the preferred way to configure 
> a PTransform to be executed in a different environment from another 
> PTransform when both are in the same SDK, e.g. Java ?
> 
> Best,
> Ke
> 
>> On Sep 21, 2021, at 9:48 PM, Luke Cwik wrote:
>> 
>> Environments that aren't exactly the same are already in separate 
>> ExecutableStages. The GreedyPCollectionFuser ensures that today[1].
>> 
>> Workarounds like getOnlyEnvironmentId would need to be removed. It may also 
>> be effectively dead-code.
>> 
>> 1: 
>> https://github.com/apache/beam/blob/ebf2aacf37b97fc85b167271f184f61f5b06ddc3/runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/graph/GreedyPCollectionFusers.java#L144
>>  
>> 
>> On Tue, Sep 21, 2021 at 1:45 PM Ke Wu wrote:
>> Hello All,
>> 
>> We have a use case where, in a Java portable pipeline, we would like to have
>> multiple environments set up so that some executable stages run in one
>> environment while other executable stages run in another environment.
>> A couple of questions on this:
>>
>> 1. Is this currently supported? I noticed a TODO in [1] which suggests it is
>> a feature pending support.
>> 2. If we did support it, what would be the ideal mechanism to distinguish
>> ParDos/ExecutableStages to be executed in different environments? Is it
>> through ResourceHints?
>> 
>> 
>> Best,
>> Ke 
>> 
>> 
>> [1] 
>> https://github.com/apache/beam/blob/master/runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/SdkComponents.java#L344
>>  
>> 
>>  
> 



P1 issues report (46)

2021-09-29 Thread Beam Jira Bot
This is your daily summary of Beam's current P1 issues, not including flaky 
tests 
(https://issues.apache.org/jira/issues/?jql=project%20%3D%20BEAM%20AND%20statusCategory%20!%3D%20Done%20AND%20priority%20%3D%20P1%20AND%20(labels%20is%20EMPTY%20OR%20labels%20!%3D%20flake)).

See https://beam.apache.org/contribute/jira-priorities/#p1-critical for the 
meaning and expectations around P1 issues.

https://issues.apache.org/jira/browse/BEAM-12950: Missing events when using 
Python WriteToFiles in streaming pipeline (created 2021-09-24)
https://issues.apache.org/jira/browse/BEAM-12867: Either Create or 
DirectRunner fails to produce all elements to the following transform (created 
2021-09-09)
https://issues.apache.org/jira/browse/BEAM-12843: (Broken Pipe induced) 
Bricked Dataflow Pipeline  (created 2021-09-06)
https://issues.apache.org/jira/browse/BEAM-12807: Java creates an incorrect 
pipeline proto when core-construction-java jar is not in the CLASSPATH (created 
2021-08-26)
https://issues.apache.org/jira/browse/BEAM-12794: 
PortableRunnerTestWithExternalEnv.test_pardo_timers flaky (created 2021-08-24)
https://issues.apache.org/jira/browse/BEAM-12792: Beam worker only installs 
--extra_package once (created 2021-08-24)
https://issues.apache.org/jira/browse/BEAM-12781: SDFBoundedSourceReader 
behaves much slower compared with the original behavior of BoundedSource 
(created 2021-08-20)
https://issues.apache.org/jira/browse/BEAM-12766: Already Exists: Dataset 
apache-beam-testing:python_bq_file_loads_NNN (created 2021-08-16)
https://issues.apache.org/jira/browse/BEAM-12632: ElasticsearchIO: Enabling 
both User/Pass auth and SSL overwrites User/Pass (created 2021-07-16)
https://issues.apache.org/jira/browse/BEAM-12628: AvroCoder changed 
underlying String class for SpecificRecords (created 2021-07-16)
https://issues.apache.org/jira/browse/BEAM-12607: Copy Code Snippet copies 
html tags (created 2021-07-13)
https://issues.apache.org/jira/browse/BEAM-12540: 
beam_PostRelease_NightlySnapshot - Task 
:runners:direct-java:runMobileGamingJavaDirect FAILED (created 2021-06-25)
https://issues.apache.org/jira/browse/BEAM-12525: SDF BoundedSource seems 
to execute significantly slower than 'normal' BoundedSource (created 2021-06-22)
https://issues.apache.org/jira/browse/BEAM-12505: codecov/patch has poor 
behavior (created 2021-06-17)
https://issues.apache.org/jira/browse/BEAM-12500: Dataflow SocketException 
(SSLException) error while trying to send message from Cloud Pub/Sub to 
BigQuery (created 2021-06-16)
https://issues.apache.org/jira/browse/BEAM-12484: JdbcIO date conversion is 
sensitive to OS (created 2021-06-14)
https://issues.apache.org/jira/browse/BEAM-12467: 
java.io.InvalidClassException With Flink Kafka (created 2021-06-09)
https://issues.apache.org/jira/browse/BEAM-12380: Go SDK Kafka IO Transform 
implemented via XLang (created 2021-05-21)
https://issues.apache.org/jira/browse/BEAM-12310: 
beam_PostCommit_Java_DataflowV2 failing (created 2021-05-07)
https://issues.apache.org/jira/browse/BEAM-12279: Implement 
destination-dependent sharding in FileIO.writeDynamic (created 2021-05-04)
https://issues.apache.org/jira/browse/BEAM-12256: 
PubsubIO.readAvroGenericRecord creates SchemaCoder that fails to decode some 
Avro logical types (created 2021-04-29)
https://issues.apache.org/jira/browse/BEAM-11959: Python Beam SDK Harness 
hangs when installing pip packages (created 2021-03-11)
https://issues.apache.org/jira/browse/BEAM-11906: No trigger early 
repeatedly for session windows (created 2021-03-01)
https://issues.apache.org/jira/browse/BEAM-11875: XmlIO.Read does not 
handle XML encoding per spec (created 2021-02-26)
https://issues.apache.org/jira/browse/BEAM-11828: JmsIO is not 
acknowledging messages correctly (created 2021-02-17)
https://issues.apache.org/jira/browse/BEAM-11755: Cross-language 
consistency (RequiresStableInputs) is quietly broken (at least on portable 
flink runner) (created 2021-02-05)
https://issues.apache.org/jira/browse/BEAM-11578: `dataflow_metrics` 
(python) fails with TypeError (when int overflowing?) (created 2021-01-06)
https://issues.apache.org/jira/browse/BEAM-11148: Kafka 
commitOffsetsInFinalize OOM on Flink (created 2020-10-28)
https://issues.apache.org/jira/browse/BEAM-11017: Timer with dataflow 
runner can be set multiple times (dataflow runner) (created 2020-10-05)
https://issues.apache.org/jira/browse/BEAM-10670: Make non-portable 
Splittable DoFn the only option when executing Java "Read" transforms (created 
2020-08-10)
https://issues.apache.org/jira/browse/BEAM-10617: python 
CombineGlobally().with_fanout() cause duplicate combine results for sliding 
windows (created 2020-07-31)
https://issues.apache.org/jira/browse/BEAM-10569: SpannerIO tests don't 
actually assert anything. (created 2020-07-23)

P0 (outage) report

2021-09-29 Thread Beam Jira Bot
This is your daily summary of Beam's current outages. See 
https://beam.apache.org/contribute/jira-priorities/#p0-outage for the meaning 
and expectations around P0 issues.

BEAM-12959: Dataflow error in CombinePerKey operation 
(https://issues.apache.org/jira/browse/BEAM-12959)


Re: [RESULT] [VOTE] Release 2.33.0, release candidate #1

2021-09-29 Thread Ryan Skraba
Just to follow up -- the cherry-pick of
https://github.com/apache/beam/pull/15616 changes the default value of
a configuration option that appears for the first time in 2.33.0.

I think it's a strong argument for making the change now, before
unwary developers start using the wrong default.  I understand that
it's *extremely* late in the release cycle!

All my best, Ryan

On Wed, Sep 29, 2021 at 6:50 PM Alexey Romanenko
 wrote:
>
> Is it still possible to cherry-pick this fix [1][2] since it’s a recent 
> regression that touches pipelines in production?
>
> [1] https://github.com/apache/beam/pull/15616
> [2] https://issues.apache.org/jira/browse/BEAM-12628
>
> On 28 Sep 2021, at 03:31, Udi Meiri  wrote:
>
> I spoke too soon. We will be doing an rc2
>
> On Mon, Sep 27, 2021 at 1:29 PM Udi Meiri  wrote:
>>
>> I'm happy to announce that we have unanimously approved this release.
>>
>> There are 8 approving votes, 4 of which are binding:
>> * Ahmet Altay
>> * Alexey Romanenko
>> * Robert Bradshaw
>> * Chamikara Jayalath
>>
>> There are no disapproving votes.
>>
>> Thanks everyone!
>>
>


Re: [RESULT] [VOTE] Release 2.33.0, release candidate #1

2021-09-29 Thread Alexey Romanenko
Is it still possible to cherry-pick this fix [1][2] since it’s a recent 
regression that touches pipelines in production? 

[1] https://github.com/apache/beam/pull/15616
[2] https://issues.apache.org/jira/browse/BEAM-12628

> On 28 Sep 2021, at 03:31, Udi Meiri  wrote:
> 
> I spoke too soon. We will be doing an rc2
> 
> On Mon, Sep 27, 2021 at 1:29 PM Udi Meiri wrote:
> I'm happy to announce that we have unanimously approved this release.
> 
> There are 8 approving votes, 4 of which are binding:
> * Ahmet Altay
> * Alexey Romanenko
> * Robert Bradshaw
> * Chamikara Jayalath
> 
> There are no disapproving votes.
> 
> Thanks everyone!
> 



Re: [BULK] Re: [EXTERNAL] [BULK] Re: Jira Permission Request

2021-09-29 Thread Alexander Zhuravlev
Thank you!

Alexander Zhuravlev
Flutter Developer, Akvelon
Skype: hello030998
Telegram: @ftredmist
Phone: +79128760565


From: Alexey Romanenko
Sent: September 29, 2021, 16:29:11
To: dev@beam.apache.org
Subject: Re: [BULK] Re: [EXTERNAL] [BULK] Re: Jira Permission Request

Thanks, it’s ok now. I added you as a contributor.

Welcome!

—
Alexey

On 29 Sep 2021, at 14:19, Alexander Zhuravlev
<alexander.zhurav...@akvelon.com> wrote:

I'm sorry, signed up 5 mins ago, can you try again?


Alexander Zhuravlev
Flutter Developer, Akvelon
Skype: hello030998
Telegram: @ftredmist
Phone: +79128760565


From: Alexey Romanenko <aromanenko@gmail.com>
Sent: September 29, 2021, 16:08:22
To: dev@beam.apache.org
Subject: [BULK] Re: [EXTERNAL] [BULK] Re: Jira Permission Request

Hmm, I can’t find it. Did you already sign up at Apache Jira before?



On 29 Sep 2021, at 13:25, Alexander Zhuravlev
<alexander.zhurav...@akvelon.com> wrote:

Jira name - Alexander Zhuravlev


Alexander Zhuravlev
Flutter Developer, Akvelon
Skype: hello030998
Telegram: @ftredmist
Phone: +79128760565


From: Alexey Romanenko <aromanenko@gmail.com>
Sent: September 29, 2021, 14:59:58
To: dev@beam.apache.org
Subject: [EXTERNAL] [BULK] Re: Jira Permission Request

Hi Alexander,

What is your Jira ID/name?

—
Alexey

On 29 Sep 2021, at 12:28, Alexander Zhuravlev
<alexander.zhurav...@akvelon.com> wrote:

Hello,
My name is Alexander. I work as an engineer in Akvelon. I'll work at Beam 
Playground project and contribute to Apache Beam. Can someone add me as a 
contributor to Beam’s Jira issue tracker? I would like to create/assign tickets 
for my work.


Alexander Zhuravlev
Flutter Developer, Akvelon
Skype: hello030998
Telegram: @ftredmist
Phone: +79128760565



Re: [BULK] Re: [EXTERNAL] [BULK] Re: Jira Permission Request

2021-09-29 Thread Alexey Romanenko
Thanks, it’s ok now. I added you as a contributor.

Welcome!

—
Alexey

> On 29 Sep 2021, at 14:19, Alexander Zhuravlev 
>  wrote:
> 
> I'm sorry, signed up 5 mins ago, can you try again?
> 
> Alexander Zhuravlev
> Flutter Developer, Akvelon
> Skype: hello030998
> Telegram: @ftredmist
> Phone: +79128760565
> From: Alexey Romanenko
> Sent: September 29, 2021, 16:08:22
> To: dev@beam.apache.org
> Subject: [BULK] Re: [EXTERNAL] [BULK] Re: Jira Permission Request
>
> Hmm, I can’t find it. Did you already sign up at Apache Jira before?
> 
> 
> 
>> On 29 Sep 2021, at 13:25, Alexander Zhuravlev
>> <alexander.zhurav...@akvelon.com> wrote:
>> 
>> Jira name - Alexander Zhuravlev
>> 
>> Alexander Zhuravlev
>> Flutter Developer, Akvelon
>> Skype: hello030998
>> Telegram: @ftredmist
>> Phone: +79128760565
>> From: Alexey Romanenko
>> Sent: September 29, 2021, 14:59:58
>> To: dev@beam.apache.org
>> Subject: [EXTERNAL] [BULK] Re: Jira Permission Request
>>  
>> Hi Alexander,
>> 
>> What is your Jira ID/name? 
>> 
>> —
>> Alexey
>> 
>>> On 29 Sep 2021, at 12:28, Alexander Zhuravlev
>>> <alexander.zhurav...@akvelon.com> wrote:
>>> 
>>> Hello,
>>> My name is Alexander. I work as an engineer in Akvelon. I'll work at Beam 
>>> Playground project and contribute to Apache Beam. Can someone add me as a 
>>> contributor to Beam’s Jira issue tracker? I would like to create/assign 
>>> tickets for my work.
>>> 
>>> Alexander Zhuravlev
>>> Flutter Developer, Akvelon
>>> Skype: hello030998
>>> Telegram: @ftredmist
>>> Phone: +79128760565



Re: [BULK] Re: [EXTERNAL] [BULK] Re: Jira Permission Request

2021-09-29 Thread Alexander Zhuravlev
I'm sorry, signed up 5 mins ago, can you try again?

Alexander Zhuravlev
Flutter Developer, Akvelon
Skype: hello030998
Telegram: @ftredmist
Phone: +79128760565


From: Alexey Romanenko
Sent: September 29, 2021, 16:08:22
To: dev@beam.apache.org
Subject: [BULK] Re: [EXTERNAL] [BULK] Re: Jira Permission Request

Hmm, I can’t find it. Did you already sign up at Apache Jira before?

On 29 Sep 2021, at 13:25, Alexander Zhuravlev
<alexander.zhurav...@akvelon.com> wrote:

Jira name - Alexander Zhuravlev


Alexander Zhuravlev
Flutter Developer, Akvelon
Skype: hello030998
Telegram: @ftredmist
Phone: +79128760565


From: Alexey Romanenko <aromanenko@gmail.com>
Sent: September 29, 2021, 14:59:58
To: dev@beam.apache.org
Subject: [EXTERNAL] [BULK] Re: Jira Permission Request

Hi Alexander,

What is your Jira ID/name?

—
Alexey

On 29 Sep 2021, at 12:28, Alexander Zhuravlev
<alexander.zhurav...@akvelon.com> wrote:

Hello,
My name is Alexander. I work as an engineer in Akvelon. I'll work at Beam 
Playground project and contribute to Apache Beam. Can someone add me as a 
contributor to Beam’s Jira issue tracker? I would like to create/assign tickets 
for my work.


Alexander Zhuravlev
Flutter Developer, Akvelon
Skype: hello030998
Telegram: @ftredmist
Phone: +79128760565



Re: [EXTERNAL] [BULK] Re: Jira Permission Request

2021-09-29 Thread Alexey Romanenko
Hmm, I can’t find it. Did you already sign up at Apache Jira before?



> On 29 Sep 2021, at 13:25, Alexander Zhuravlev 
>  wrote:
> 
> Jira name - Alexander Zhuravlev
> 
> Alexander Zhuravlev
> Flutter Developer, Akvelon
> Skype: hello030998
> Telegram: @ftredmist
> Phone: +79128760565
> From: Alexey Romanenko
> Sent: September 29, 2021, 14:59:58
> To: dev@beam.apache.org
> Subject: [EXTERNAL] [BULK] Re: Jira Permission Request
>  
> Hi Alexander,
> 
> What is your Jira ID/name? 
> 
> —
> Alexey
> 
>> On 29 Sep 2021, at 12:28, Alexander Zhuravlev
>> <alexander.zhurav...@akvelon.com> wrote:
>> 
>> Hello,
>> My name is Alexander. I work as an engineer in Akvelon. I'll work at Beam 
>> Playground project and contribute to Apache Beam. Can someone add me as a 
>> contributor to Beam’s Jira issue tracker? I would like to create/assign 
>> tickets for my work.
>> 
>> Alexander Zhuravlev
>> Flutter Developer, Akvelon
>> Skype: hello030998
>> Telegram: @ftredmist
>> Phone: +79128760565



Re: [EXTERNAL] [BULK] Re: Jira Permission Request

2021-09-29 Thread Alexander Zhuravlev
Jira name - Alexander Zhuravlev

Alexander Zhuravlev
Flutter Developer, Akvelon
Skype: hello030998
Telegram: @ftredmist
Phone: +79128760565


From: Alexey Romanenko
Sent: September 29, 2021, 14:59:58
To: dev@beam.apache.org
Subject: [EXTERNAL] [BULK] Re: Jira Permission Request

Hi Alexander,

What is your Jira ID/name?

—
Alexey

On 29 Sep 2021, at 12:28, Alexander Zhuravlev
<alexander.zhurav...@akvelon.com> wrote:

Hello,
My name is Alexander. I work as an engineer in Akvelon. I'll work at Beam 
Playground project and contribute to Apache Beam. Can someone add me as a 
contributor to Beam’s Jira issue tracker? I would like to create/assign tickets 
for my work.


Alexander Zhuravlev
Flutter Developer, Akvelon
Skype: hello030998
Telegram: @ftredmist
Phone: +79128760565



Re: Jira Permission Request

2021-09-29 Thread Alexey Romanenko
Hi Alexander,

What is your Jira ID/name? 

—
Alexey

> On 29 Sep 2021, at 12:28, Alexander Zhuravlev 
>  wrote:
> 
> Hello,
> My name is Alexander. I work as an engineer in Akvelon. I'll work at Beam 
> Playground project and contribute to Apache Beam. Can someone add me as a 
> contributor to Beam’s Jira issue tracker? I would like to create/assign 
> tickets for my work.
> 
> Alexander Zhuravlev
> Flutter Developer, Akvelon
> Skype: hello030998
> Telegram: @ftredmist
> Phone: +79128760565



Jira Permission Request

2021-09-29 Thread Alexander Zhuravlev
Hello,
My name is Alexander. I work as an engineer at Akvelon. I'll be working on the
Beam Playground project and contributing to Apache Beam. Can someone add me as
a contributor to Beam’s Jira issue tracker? I would like to create and assign
tickets for my work.

Alexander Zhuravlev
Flutter Developer, Akvelon
Skype: hello030998
Telegram: @ftredmist
Phone: +79128760565


Re: Documenting per-key delivery semantics for various runners

2021-09-29 Thread Jan Lukavský

Hi Pablo,

thanks for the motivating examples. I understand the motivation now, but
one question comes to mind: do we not care about the case where another
(grouping) PTransform which _changes the key_ sits between the emitting
PTransform and the consuming PTransform? Through the change of key, two
elements originally emitted with the same key can be moved to two
different keys and then back to the same one, which would obviously
violate any ordering defined on the original emitting transform. I'm
aware that the per-key definition describes the two PTransforms as being
directly connected. I'm just asking whether we would want to solve this
for the more general case.


In the original design document for @RequiresTimeSortedInput [1], there
was a mention (not implemented) of a "User supplied sorting criterion",
which I believe is exactly what, for instance, a Kafka offset offers, and
what is actually described by the per-key delivery semantics. The key
point is that each element is (somehow) assigned a sequential ID, which
then defines the order as seen by a per-key authoritative _observer_ (for
instance, the source emitting transform; in the case of Kafka that is
the leader for a partition, and so on). If this per-key sequential ID is
carried along with the element, then the order can be reconstructed at any
downstream stage.
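
A minimal sketch of that last point in plain Python (this is not Beam code).
It assumes, hypothetically, that the source stamps each element with a per-key
sequence ID that starts at 0 and has no gaps; `PerKeyReorderer` and `reorder`
are made-up names used only for illustration.

```python
import heapq
from collections import defaultdict


class PerKeyReorderer:
    """Releases the elements of each key in sequence-ID order.

    Assumption: per-key sequence IDs start at 0 and increase by 1,
    e.g. derived from something like a Kafka partition offset.
    """

    def __init__(self):
        self._next_seq = defaultdict(int)   # key -> next expected sequence ID
        self._pending = defaultdict(list)   # key -> min-heap of (seq, value)

    def add(self, key, seq, value):
        """Buffers one element and returns the values that are now in order."""
        heap = self._pending[key]
        heapq.heappush(heap, (seq, value))
        ready = []
        # Drain the heap while its smallest ID is the one we are waiting for.
        while heap and heap[0][0] == self._next_seq[key]:
            _, ready_value = heapq.heappop(heap)
            ready.append(ready_value)
            self._next_seq[key] += 1
        return ready


def reorder(elements):
    """Feeds (key, seq, value) triples through the reorderer, per key."""
    reorderer = PerKeyReorderer()
    released = defaultdict(list)
    for key, seq, value in elements:
        released[key].extend(reorderer.add(key, seq, value))
    return dict(released)


# Elements arrive out of order for key "a" (e.g. after a key change and back).
arrived = [("a", 1, "a1"), ("b", 0, "b0"), ("a", 0, "a0"),
           ("a", 2, "a2"), ("b", 1, "b1")]
print(reorder(arrived))
```

The buffer can grow without bound if an ID never arrives, so a real
implementation would also need some timeout or watermark-based cleanup.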


I'm +1 to splitting the documentation and validation / implementation 
parts, that sounds good to me.


 Jan

[1] 
https://docs.google.com/document/d/1ObLVUFsf1NcG8ZuIZE4aVy2RYKx2FfyMhkZYWPnI9-c/edit?usp=sharing



On 9/27/21 11:56 PM, Pablo Estrada wrote:

Hello all,
thanks for your comments.

All runners that I've tested have these semantics for streaming. See 
the PR[1] with the cap matrix changes.


I agree it makes sense to add a pipeline check for this. I think I 
would like to receive comments / agreement on the definition and the 
changes to the documentation - and then follow up with the pipeline 
check. Is that reasonable for everyone?


I have added a section of motivating use cases to the document, Jan. 
Let me know if those make sense.


[1] https://github.com/apache/beam/pull/15378 



Thanks
-P

On Sat, Sep 25, 2021 at 1:06 PM Jan Lukavský wrote:


+1 to adding a Pipeline requirement for this. If business logic relies
on a specific feature that a runner might or might not have, then the
Pipeline should be rejected on runners that do not support it. Do we
have a list of runners that have or lack these semantics? Just for
clarification (sorry for my ignorance if this has already been
described): do we have a description of the use cases that drive this
effort?

Thanks,

  Jan

On 9/24/21 10:58 PM, Robert Bradshaw wrote:
> Thanks for writing this up. Rather than just documenting it, should we
> have a way of asserting/requesting it (like time sorted inputs) such
> that a pipeline author that needs to rely on this property can be
> rejected on runners that don't provide it?
>
> On Fri, Sep 24, 2021 at 12:25 PM Kenneth Knowles <k...@apache.org> wrote:
>> Took a look. I definitely agree that something like this is useful,
>> and well-motivated by the use cases you raise.
>>
>> Kenn
>>
>> On Thu, Sep 23, 2021 at 4:30 PM Pablo Estrada <pabl...@google.com> wrote:
>>> Hi all,
>>> I've been spending some time thinking about CDC use cases on Beam.
>>> One valuable piece to enable these use cases is to define how Beam
>>> deals with ordering of elements in streaming pipelines.
>>> With that in mind, I wrote a document[1] that proposes a definition
>>> of the ordering semantics supported by most Beam runners, and a pull
>>> request [2] with ValidatesRunner tests and documentation updates.
>>>
>>> Would you please review these, add your comments and thoughts, and
>>> let me know if they make sense?
>>>
>>> Thanks!
>>> -P.
>>>
>>> [1] https://docs.google.com/document/d/1_7WRJznXlOtWuVaHl_dpy8OZcx_M8BUmeWVA4G0-wEc/edit#
>>> [2] https://github.com/apache/beam/pull/15378