Re: Credentials Rotation Failure on IO-Datastores cluster

2023-10-31 Thread Svetak Sundhar via dev
I took a quick look -- the error is the following:

*22:17:26* ERROR: (gcloud.container.clusters.update) ResponseError:
code=400, message=Operation
operation-1698804621818-e9c8fe33-d4a2-44cd-86aa-9c4e09dea259 is
currently upgrading cluster io-datastores. Please wait and try again
once it is done.

This is different from the last time this error happened
(https://lists.apache.org/thread/xw2hx8yycpfmhf64w0vyt96r0d8zwnyg).


I noticed that node pool pool-1 was still updating when this error was
reported, so I think the rotation should succeed now.


Should we retrigger the seed job manually?



Svetak Sundhar

  Data Engineer
svetaksund...@google.com



On Tue, Oct 31, 2023 at 10:17 PM Apache Jenkins Server <
jenk...@builds.apache.org> wrote:

> Something went wrong during the automatic credentials rotation for
> IO-Datastores Cluster, performed at Wed Nov 01 00:52:45 UTC 2023. It may be
> necessary to check the state of the cluster certificates. For further
> details refer to the following links:
>  * Failing job:
> https://ci-beam.apache.org/job/Rotate%20IO-Datastores%20Cluster%20Credentials/
>  * Job configuration:
> https://github.com/apache/beam/blob/master/.test-infra/jenkins/job_IODatastoresCredentialsRotation.groovy
>  * Cluster URL:
> https://pantheon.corp.google.com/kubernetes/clusters/details/us-central1-a/io-datastores/details?mods=dataflow_dev&project=apache-beam-testing


Credentials Rotation Failure on IO-Datastores cluster

2023-10-31 Thread Apache Jenkins Server
Something went wrong during the automatic credentials rotation for 
IO-Datastores Cluster, performed at Wed Nov 01 00:52:45 UTC 2023. It may be 
necessary to check the state of the cluster certificates. For further details 
refer to the following links:
 * Failing job: 
https://ci-beam.apache.org/job/Rotate%20IO-Datastores%20Cluster%20Credentials/ 
 * Job configuration: 
https://github.com/apache/beam/blob/master/.test-infra/jenkins/job_IODatastoresCredentialsRotation.groovy
 * Cluster URL: 
https://pantheon.corp.google.com/kubernetes/clusters/details/us-central1-a/io-datastores/details?mods=dataflow_dev&project=apache-beam-testing

Credentials Rotation Failure on Metrics cluster (2023-11-01)

2023-10-31 Thread gacti...@beam.apache.org
Something went wrong during the automatic credentials rotation for Metrics 
Cluster, performed at 2023-11-01. It may be necessary to check the state of the 
cluster certificates. For further details refer to the following links:
 * Failing job: 
https://github.com/apache/beam/actions/workflows/beam_MetricsCredentialsRotation.yml
 * Job configuration: 
https://github.com/apache/beam/blob/master/.github/workflows/beam_MetricsCredentialsRotation.yml
 * Cluster URL: 
https://pantheon.corp.google.com/kubernetes/clusters/details/us-central1-a/metrics/details?mods=dataflow_dev&project=apache-beam-testing


Re: Processing time watermarks in KinesisIO

2023-10-31 Thread Robert Bradshaw via dev
On Tue, Oct 31, 2023 at 10:28 AM Jan Lukavský  wrote:
>
> On 10/31/23 17:44, Robert Bradshaw via dev wrote:
> > There are really two cases that make sense:
> >
> > (1) We read the event timestamps from the kafka records themselves and
> > have some external knowledge that guarantees (or at least provides a
> > very good heuristic) about what the timestamps of unread messages
> > could be in the future to set the watermark. This could possibly
> > involve knowing that the timestamps in a partition are monotonically
> > increasing, or somehow have bounded skew.
> +1
> >
> > (2) We use processing time as both the watermark and for setting the
> > event timestamp on produced messages. From this point on we can safely
> > reason about the event time.
> This is where I have some doubts. We can reason about event time, but it
> is not stable upon Pipeline restarts (if there is any downstream
> processing that depends on event time and is not shielded by
> @RequiresStableInput, it might give different results on restarts).

That is a fair point, but I don't think we can guarantee that we have
a timestamp embedded in the record. (Or is there some stable Kafka
metadata we could use here? I'm not that familiar with what Kafka
guarantees.) We could require it to be opt-in given the caveats.

> Is
> there any specific case where we would not use option 1)? Do we have to
> provide the alternative 2), given that users can implement it themselves
> (we would need to allow users to specify a custom timestamp function, but
> that should be done in all cases)?

The tricky bit is how the user specifies the watermark, unless they
can guarantee the custom timestamps are monotonically ordered (at
least within a partition).
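For concreteness, this is roughly what the user-facing side could look like with
the existing aws2 KinesisIO (a rough sketch from memory: the exact method names,
the stream name, the payload format, and the parseEventTime helper are
assumptions, not tested code):

  import java.nio.charset.StandardCharsets;
  import org.apache.beam.sdk.io.aws2.kinesis.KinesisIO;
  import org.apache.beam.sdk.io.aws2.kinesis.KinesisRecord;
  import org.apache.beam.sdk.io.aws2.kinesis.WatermarkParameters;
  import org.apache.beam.sdk.io.aws2.kinesis.WatermarkPolicyFactory;
  import org.apache.beam.sdk.transforms.SerializableFunction;
  import org.joda.time.Duration;
  import org.joda.time.Instant;

  public class CustomEventTimeWatermarkExample {
    public static KinesisIO.Read buildRead() {
      // Hypothetical timestamp function; only safe if the user can guarantee
      // the embedded event times are (roughly) monotonic within each shard.
      SerializableFunction<KinesisRecord, Instant> parseEventTime =
          record ->
              Instant.parse(new String(record.getDataAsBytes(), StandardCharsets.UTF_8));

      // The watermark is derived from the user-supplied timestamps; the idle
      // threshold lets it advance when a shard stops receiving records.
      WatermarkParameters params =
          WatermarkParameters.create()
              .withTimestampFn(parseEventTime)
              .withWatermarkIdleDurationThreshold(Duration.standardMinutes(1));

      // Note: this only controls the watermark. The element timestamps that
      // KinesisIO assigns are still arrival-time based, which is exactly the
      // decoupling discussed in this thread.
      return KinesisIO.read()
          .withStreamName("my-stream")
          .withCustomWatermarkPolicy(
              WatermarkPolicyFactory.withCustomWatermarkPolicy(params));
    }
  }

That gives users a way to express option 1) today; what it does not give them is
consistent element timestamps, which is the part that would need a change in the
IO itself.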

> > The current state seems a bit broken if I understand correctly.
> +1
> >
> > On Tue, Oct 31, 2023 at 1:16 AM Jan Lukavský  wrote:
> >> I think that instead of deprecating it and creating a new version, we could 
> >> leverage the proposed update compatibility flag for this [1]. I still have 
> >> some doubts whether the processing-time watermarking (and event-time 
> >> assignment) makes sense. Do we have a valid use-case for that? This is 
> >> actually the removed SYNCHRONIZED_PROCESSING_TIME time domain, which is 
> >> problematic - restarts of Pipelines cause timestamps to change and hence 
> >> make *every* DoFn potentially non-deterministic, which would be an 
> >> unexpected side-effect. This makes me wonder if we should remove this 
> >> policy altogether (deprecate it or use the update compatibility flag, so that 
> >> the policy throws an exception in the new version).
> >>
> >> The crucial point would be to find a use-case where it is actually helpful 
> >> to use such a policy.
> >> Any ideas?
> >>
> >>   Jan
> >>
> >> [1] https://lists.apache.org/thread/29r3zv04n4ooq68zzvpw6zm1185n59m2
> >>
> >> On 10/27/23 18:33, Alexey Romanenko wrote:
> >>
> >> Ahh, ok, I see.
> >>
> >> Yes, it looks like a bug. So, I'd propose to deprecate the old "processing 
> >> time" watermark policy, which we can remove later, and create a new fixed 
> >> one.
> >>
> >> PS: It’s recommended to use 
> >> "org.apache.beam.sdk.io.aws2.kinesis.KinesisIO" instead of the deprecated 
> >> "org.apache.beam.sdk.io.kinesis.KinesisIO" one.
> >>
> >> —
> >> Alexey
> >>
> >> On 27 Oct 2023, at 17:42, Jan Lukavský  wrote:
> >>
> >> No, I'm referring to this [1] policy which has unexpected (and hardly 
> >> avoidable on the user-code side) data loss issues. The problem is that 
> >> assigning timestamps to elements and watermarks is completely decoupled 
> >> and unrelated, which I'd say is a bug.
> >>
> >>   Jan
> >>
> >> [1] 
> >> https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/io/kinesis/KinesisIO.Read.html#withProcessingTimeWatermarkPolicy--
> >>
> >> On 10/27/23 16:51, Alexey Romanenko wrote:
> >>
> >> Why not just create a custom watermark policy for that? Or do you mean to 
> >> make it the default policy?
> >>
> >> —
> >> Alexey
> >>
> >> On 27 Oct 2023, at 10:25, Jan Lukavský  wrote:
> >>
> >>
> >> Hi,
> >>
> >> when discussing [1] we found out that the issue is actually caused 
> >> by processing-time watermarks in KinesisIO. Enabling this watermark 
> >> outputs watermarks based on the current processing time, _but event timestamps 
> >> are derived from the ingestion timestamp_. This can cause unbounded lateness 
> >> when processing backlog. I think this setup is error-prone and will likely 
> >> cause data loss due to dropped elements. This can be solved in two ways:
> >>
> >>   a) deprecate processing time watermarks, or
> >>
> >>   b) modify KinesisIO's watermark policy so that it assigns event 
> >> timestamps as well (the processing-time watermark policy would have to 
> >> derive event timestamps from processing time).
> >>
> >> I'd prefer option b), but it might be a breaking change. Moreover, I'm not 
> >> sure I understand the purpose of the processing-time watermark policy; it 
> >> might be essentially ill-defined from the beginning, so it might really be 
> >> better to remove it completely. There is also a related issue [2].
> >>
> >> Any thoughts on this?
> >>
> >>   Jan
> >>
> >> [1] https://github.com/apache/beam/issues/25975
> >>
> >> [2] https://github.com/apache/beam/issues/28760

Re: Processing time watermarks in KinesisIO

2023-10-31 Thread Jan Lukavský

On 10/31/23 17:44, Robert Bradshaw via dev wrote:

There are really two cases that make sense:

(1) We read the event timestamps from the kafka records themselves and
have some external knowledge that guarantees (or at least provides a
very good heuristic) about what the timestamps of unread messages
could be in the future to set the watermark. This could possibly
involve knowing that the timestamps in a partition are monotonically
increasing, or somehow have bounded skew.

+1


(2) We use processing time as both the watermark and for setting the
event timestamp on produced messages. From this point on we can safely
reason about the event time.
This is where I have some doubts. We can reason about event time, but it 
is not stable upon Pipeline restarts (if there is any downstream 
processing that depends on event time and is not shielded by 
@RequiresStableInput, it might give different results on restarts). Is 
there any specific case where we would not use option 1)? Do we have to 
provide the alternative 2), given that users can implement it themselves 
(we would need to allow users to specify a custom timestamp function, but 
that should be done in all cases)?
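To make that concern concrete, here is a hypothetical downstream fragment over
the KinesisIO output (the records collection and the one-minute windows are made
up for illustration). The window an element lands in is decided by its event
timestamp, so re-stamping replayed records with new processing-time timestamps
after a restart silently changes the result unless the step is shielded as
described above:

  import org.apache.beam.sdk.io.aws2.kinesis.KinesisRecord;
  import org.apache.beam.sdk.transforms.Count;
  import org.apache.beam.sdk.transforms.windowing.FixedWindows;
  import org.apache.beam.sdk.transforms.windowing.Window;
  import org.apache.beam.sdk.values.PCollection;
  import org.joda.time.Duration;

  public class EventTimeDependentStep {
    // Counts records per one-minute event-time window. If the element
    // timestamps were derived from processing time at read time, the same
    // records can land in different windows after a pipeline restart.
    static PCollection<Long> perMinuteCounts(PCollection<KinesisRecord> records) {
      return records
          .apply(Window.<KinesisRecord>into(FixedWindows.of(Duration.standardMinutes(1))))
          .apply(Count.<KinesisRecord>globally().withoutDefaults());
    }
  }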


The current state seems a bit broken if I understand correctly.

+1


On Tue, Oct 31, 2023 at 1:16 AM Jan Lukavský  wrote:

I think that instead of deprecating it and creating a new version, we could leverage 
the proposed update compatibility flag for this [1]. I still have some doubts 
whether the processing-time watermarking (and event-time assignment) makes sense. Do 
we have a valid use-case for that? This is actually the removed 
SYNCHRONIZED_PROCESSING_TIME time domain, which is problematic - restarts of 
Pipelines cause timestamps to change and hence make *every* DoFn potentially 
non-deterministic, which would be an unexpected side-effect. This makes me wonder 
if we should remove this policy altogether (deprecate it or use the update 
compatibility flag, so that the policy throws an exception in the new version).

The crucial point would be to find a use-case where it is actually helpful to 
use such a policy.
Any ideas?

  Jan

[1] https://lists.apache.org/thread/29r3zv04n4ooq68zzvpw6zm1185n59m2

On 10/27/23 18:33, Alexey Romanenko wrote:

Ahh, ok, I see.

Yes, it looks like a bug. So, I'd propose to deprecate the old "processing 
time" watermark policy, which we can remove later, and create a new fixed one.

PS: It’s recommended to use "org.apache.beam.sdk.io.aws2.kinesis.KinesisIO" 
instead of the deprecated "org.apache.beam.sdk.io.kinesis.KinesisIO" one.

—
Alexey

On 27 Oct 2023, at 17:42, Jan Lukavský  wrote:

No, I'm referring to this [1] policy which has unexpected (and hardly avoidable 
on the user-code side) data loss issues. The problem is that assigning 
timestamps to elements and watermarks is completely decoupled and unrelated, 
which I'd say is a bug.

  Jan

[1] 
https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/io/kinesis/KinesisIO.Read.html#withProcessingTimeWatermarkPolicy--

On 10/27/23 16:51, Alexey Romanenko wrote:

Why not just create a custom watermark policy for that? Or do you mean to make 
it the default policy?

—
Alexey

On 27 Oct 2023, at 10:25, Jan Lukavský  wrote:


Hi,

when discussing [1] we found out that the issue is actually caused by 
processing-time watermarks in KinesisIO. Enabling this watermark outputs 
watermarks based on the current processing time, _but event timestamps are derived 
from the ingestion timestamp_. This can cause unbounded lateness when processing 
backlog. I think this setup is error-prone and will likely cause data loss due 
to dropped elements. This can be solved in two ways:

  a) deprecate processing time watermarks, or

  b) modify KinesisIO's watermark policy so that it assigns event timestamps as 
well (the processing-time watermark policy would have to derive event 
timestamps from processing time).

I'd prefer option b), but it might be a breaking change. Moreover, I'm not sure 
I understand the purpose of the processing-time watermark policy; it might be 
essentially ill-defined from the beginning, so it might really be better to 
remove it completely. There is also a related issue [2].

Any thoughts on this?

  Jan

[1] https://github.com/apache/beam/issues/25975

[2] https://github.com/apache/beam/issues/28760
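For anyone hitting this today, the combination that at least keeps the watermark
consistent with the timestamps KinesisIO already assigns (the approximate arrival
time) seems to be the arrival-time policy. A rough sketch, with the stream name
made up and the method names recalled from memory rather than verified:

  import org.apache.beam.sdk.io.aws2.kinesis.KinesisIO;
  import org.joda.time.Duration;

  // The watermark follows the approximate arrival timestamps of the records,
  // which is also what KinesisIO uses for the element timestamps, so a backlog
  // is read as old data instead of being dropped as late. The idle-duration
  // threshold lets the watermark advance while shards are quiet.
  KinesisIO.Read read =
      KinesisIO.read()
          .withStreamName("my-stream")
          .withArrivalTimeWatermarkPolicy(Duration.standardMinutes(1));

That sidesteps the decoupling described above, but it is only a workaround;
option b) would still need a change inside the IO so that the chosen watermark
policy and the assigned timestamps can never diverge.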





Re: Processing time watermarks in KinesisIO

2023-10-31 Thread Robert Bradshaw via dev
There are really two cases that make sense:

(1) We read the event timestamps from the kafka records themselves and
have some external knowledge that guarantees (or at least provides a
very good heuristic) about what the timestamps of unread messages
could be in the future to set the watermark. This could possibly
involve knowing that the timestamps in a partition are monotonically
increasing, or somehow have bounded skew.

(2) We use processing time as both the watermark and for setting the
event timestamp on produced messages. From this point on we can safely
reason about the event time.

The current state seems a bit broken if I understand correctly.

On Tue, Oct 31, 2023 at 1:16 AM Jan Lukavský  wrote:
>
> I think that instead of deprecating it and creating a new version, we could 
> leverage the proposed update compatibility flag for this [1]. I still have 
> some doubts whether the processing-time watermarking (and event-time assignment) 
> makes sense. Do we have a valid use-case for that? This is actually the 
> removed SYNCHRONIZED_PROCESSING_TIME time domain, which is problematic - 
> restarts of Pipelines cause timestamps to change and hence make *every* 
> DoFn potentially non-deterministic, which would be an unexpected side-effect. 
> This makes me wonder if we should remove this policy altogether (deprecate it or 
> use the update compatibility flag, so that the policy throws an exception in the 
> new version).
>
> The crucial point would be to find a use-case where it is actually helpful to 
> use such a policy.
> Any ideas?
>
>  Jan
>
> [1] https://lists.apache.org/thread/29r3zv04n4ooq68zzvpw6zm1185n59m2
>
> On 10/27/23 18:33, Alexey Romanenko wrote:
>
> Ahh, ok, I see.
>
> Yes, it looks like a bug. So, I'd propose to deprecate the old "processing 
> time" watermark policy, which we can remove later, and create a new fixed one.
>
> PS: It’s recommended to use "org.apache.beam.sdk.io.aws2.kinesis.KinesisIO" 
> instead of the deprecated "org.apache.beam.sdk.io.kinesis.KinesisIO" one.
>
> —
> Alexey
>
> On 27 Oct 2023, at 17:42, Jan Lukavský  wrote:
>
> No, I'm referring to this [1] policy which has unexpected (and hardly 
> avoidable on the user-code side) data loss issues. The problem is that 
> assigning timestamps to elements and watermarks is completely decoupled and 
> unrelated, which I'd say is a bug.
>
>  Jan
>
> [1] 
> https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/io/kinesis/KinesisIO.Read.html#withProcessingTimeWatermarkPolicy--
>
> On 10/27/23 16:51, Alexey Romanenko wrote:
>
> Why not just create a custom watermark policy for that? Or do you mean to 
> make it the default policy?
>
> —
> Alexey
>
> On 27 Oct 2023, at 10:25, Jan Lukavský  wrote:
>
>
> Hi,
>
> when discussing [1] we found out that the issue is actually caused by 
> processing-time watermarks in KinesisIO. Enabling this watermark outputs 
> watermarks based on the current processing time, _but event timestamps are 
> derived from the ingestion timestamp_. This can cause unbounded lateness when 
> processing backlog. I think this setup is error-prone and will likely cause 
> data loss due to dropped elements. This can be solved in two ways:
>
>  a) deprecate processing time watermarks, or
>
>  b) modify KinesisIO's watermark policy so that it assigns event timestamps 
> as well (the processing-time watermark policy would have to derive event 
> timestamps from processing time).
>
> I'd prefer option b), but it might be a breaking change. Moreover, I'm not 
> sure I understand the purpose of the processing-time watermark policy; it 
> might be essentially ill-defined from the beginning, so it might really be 
> better to remove it completely. There is also a related issue [2].
>
> Any thoughts on this?
>
>  Jan
>
> [1] https://github.com/apache/beam/issues/25975
>
> [2] https://github.com/apache/beam/issues/28760
>
>
>


Beam High Priority Issue Report (47)

2023-10-31 Thread beamactions
This is your daily summary of Beam's current high priority issues that may need 
attention.

See https://beam.apache.org/contribute/issue-priorities for the meaning and 
expectations around issue priorities.

Unassigned P1 Issues:

https://github.com/apache/beam/issues/29099 [Bug]: FnAPI Java SDK Harness 
doesn't update user counters in OnTimer callback functions
https://github.com/apache/beam/issues/29076 [Failing Test]: Python ARM 
PostCommit failing after #28385
https://github.com/apache/beam/issues/29022 [Failing Test]: Python Github 
actions tests are failing due to update of pip 
https://github.com/apache/beam/issues/28760 [Bug]: EFO Kinesis IO reader 
provided by apache beam does not pick the event time for watermarking
https://github.com/apache/beam/issues/28715 [Bug]: Python WriteToBigtable get 
stuck for large jobs due to client dead lock
https://github.com/apache/beam/issues/28703 [Failing Test]: Building a wheel 
for integration tests sometimes times out
https://github.com/apache/beam/issues/28383 [Failing Test]: 
org.apache.beam.runners.dataflow.worker.StreamingDataflowWorkerTest.testMaxThreadMetric
https://github.com/apache/beam/issues/28339 Fix failing 
"beam_PostCommit_XVR_GoUsingJava_Dataflow" job
https://github.com/apache/beam/issues/28326 Bug: 
apache_beam.io.gcp.pubsublite.ReadFromPubSubLite not working
https://github.com/apache/beam/issues/28142 [Bug]: [Go SDK] Memory seems to be 
leaking on 2.49.0 with Dataflow
https://github.com/apache/beam/issues/27892 [Bug]: ignoreUnknownValues not 
working when using CreateDisposition.CREATE_IF_NEEDED 
https://github.com/apache/beam/issues/27648 [Bug]: Python SDFs (e.g. 
PeriodicImpulse) running in Flink and polling using tracker.defer_remainder 
have checkpoint size growing indefinitely 
https://github.com/apache/beam/issues/27616 [Bug]: Unable to use 
applyRowMutations() in bigquery IO apache beam java
https://github.com/apache/beam/issues/27486 [Bug]: Read from datastore with 
inequality filters
https://github.com/apache/beam/issues/27314 [Failing Test]: 
bigquery.StorageApiSinkCreateIfNeededIT.testCreateManyTables[1]
https://github.com/apache/beam/issues/27238 [Bug]: Window trigger has lag when 
using Kafka and GroupByKey on Dataflow Runner
https://github.com/apache/beam/issues/26981 [Bug]: Getting an error related to 
SchemaCoder after upgrading to 2.48
https://github.com/apache/beam/issues/26911 [Bug]: UNNEST ARRAY with a nested 
ROW (described below)
https://github.com/apache/beam/issues/26343 [Bug]: 
apache_beam.io.gcp.bigquery_read_it_test.ReadAllBQTests.test_read_queries is 
flaky
https://github.com/apache/beam/issues/26329 [Bug]: BigQuerySourceBase does not 
propagate a Coder to AvroSource
https://github.com/apache/beam/issues/26041 [Bug]: Unable to create 
exactly-once Flink pipeline with stream source and file sink
https://github.com/apache/beam/issues/24776 [Bug]: Race condition in Python SDK 
Harness ProcessBundleProgress
https://github.com/apache/beam/issues/24389 [Failing Test]: 
HadoopFormatIOElasticTest.classMethod ExceptionInInitializerError 
ContainerFetchException
https://github.com/apache/beam/issues/24313 [Flaky]: 
apache_beam/runners/portability/portable_runner_test.py::PortableRunnerTestWithSubprocesses::test_pardo_state_with_custom_key_coder
https://github.com/apache/beam/issues/23944  beam_PreCommit_Python_Cron 
regularily failing - test_pardo_large_input flaky
https://github.com/apache/beam/issues/23709 [Flake]: Spark batch flakes in 
ParDoLifecycleTest.testTeardownCalledAfterExceptionInProcessElement and 
ParDoLifecycleTest.testTeardownCalledAfterExceptionInStartBundle
https://github.com/apache/beam/issues/23525 [Bug]: Default PubsubMessage coder 
will drop message id and orderingKey
https://github.com/apache/beam/issues/22913 [Bug]: 
beam_PostCommit_Java_ValidatesRunner_Flink is flakes in 
org.apache.beam.sdk.transforms.GroupByKeyTest$BasicTests.testAfterProcessingTimeContinuationTriggerUsingState
https://github.com/apache/beam/issues/22605 [Bug]: Beam Python failure for 
dataflow_exercise_metrics_pipeline_test.ExerciseMetricsPipelineTest.test_metrics_it
https://github.com/apache/beam/issues/21714 
PulsarIOTest.testReadFromSimpleTopic is very flaky
https://github.com/apache/beam/issues/21706 Flaky timeout in github Python unit 
test action 
StatefulDoFnOnDirectRunnerTest.test_dynamic_timer_clear_then_set_timer
https://github.com/apache/beam/issues/21643 FnRunnerTest with non-trivial 
(order 1000 elements) numpy input flakes in non-cython environment
https://github.com/apache/beam/issues/21476 WriteToBigQuery Dynamic table 
destinations returns wrong tableId
https://github.com/apache/beam/issues/21469 beam_PostCommit_XVR_Flink flaky: 
Connection refused
https://github.com/apache/beam/issues/21424 Java VR (Dataflow, V2, Streaming) 
failing: ParDoTest$TimestampTests/OnWindowExpirationTests
https://github.com/apache/beam/issues/21262 Python AfterAny, AfterAll do not 
follow spec
https://github.com/apache/beam/issues/

Re: Processing time watermarks in KinesisIO

2023-10-31 Thread Jan Lukavský
I think that instead of deprecating it and creating a new version, we could 
leverage the proposed update compatibility flag for this [1]. I still 
have some doubts whether the processing-time watermarking (and event-time 
assignment) makes sense. Do we have a valid use-case for that? This is 
actually the removed SYNCHRONIZED_PROCESSING_TIME time domain, which is 
problematic - restarts of Pipelines cause timestamps to change and 
hence make *every* DoFn potentially non-deterministic, which would be an 
unexpected side-effect. This makes me wonder if we should remove this 
policy altogether (deprecate it or use the update compatibility flag, so 
that the policy throws an exception in the new version).


The crucial point would be to find a use-case where it is actually 
helpful to use such a policy.

Any ideas?

 Jan

[1] https://lists.apache.org/thread/29r3zv04n4ooq68zzvpw6zm1185n59m2

On 10/27/23 18:33, Alexey Romanenko wrote:

Ahh, ok, I see.

Yes, it looks like a bug. So, I'd propose to deprecate the old 
"processing time" watermark policy, which we can remove later, and 
create a new fixed one.

PS: It’s recommended to use 
"org.apache.beam.sdk.io.aws2.kinesis.KinesisIO" instead of the 
deprecated "org.apache.beam.sdk.io.kinesis.KinesisIO" one.


—
Alexey


On 27 Oct 2023, at 17:42, Jan Lukavský  wrote:

No, I'm referring to this [1] policy which has unexpected (and hardly 
avoidable on the user-code side) data loss issues. The problem is 
that assigning timestamps to elements and watermarks is completely 
decoupled and unrelated, which I'd say is a bug.


 Jan

[1] 
https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/io/kinesis/KinesisIO.Read.html#withProcessingTimeWatermarkPolicy--


On 10/27/23 16:51, Alexey Romanenko wrote:
Why not just create a custom watermark policy for that? Or do you 
mean to make it the default policy?


—
Alexey


On 27 Oct 2023, at 10:25, Jan Lukavský  wrote:


Hi,

when discussing [1] we found out that the issue is actually 
caused by processing-time watermarks in KinesisIO. Enabling this 
watermark outputs watermarks based on the current processing time, _but 
event timestamps are derived from the ingestion timestamp_. This can 
cause unbounded lateness when processing backlog. I think this 
setup is error-prone and will likely cause data loss due to dropped 
elements. This can be solved in two ways:

 a) deprecate processing time watermarks, or

 b) modify KinesisIO's watermark policy so that it assigns event 
timestamps as well (the processing-time watermark policy would have 
to derive event timestamps from processing time).

I'd prefer option b), but it might be a breaking change. Moreover, 
I'm not sure I understand the purpose of the processing-time 
watermark policy; it might be essentially ill-defined from the 
beginning, so it might really be better to remove it completely. 
There is also a related issue [2].


Any thoughts on this?

 Jan

[1] https://github.com/apache/beam/issues/25975

[2] https://github.com/apache/beam/issues/28760