[
https://issues.apache.org/jira/browse/FLINK-30562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17654507#comment-17654507
]
Biao Geng commented on FLINK-30562:
-----------------------------------
Hi [~Jamalarm], thanks for the report. I tried to reproduce your problem with
Flink 1.16.0 on a standalone cluster on my computer (my demo, which uses event
time, can be found
[here|https://github.com/bgeng777/ververica-cep-demo/tree/FLINK-30562]). I
agree that the stdout differs when parallelism is set to 2 compared with 1,
but the result itself seems to be correct.
In my demo, when parallelism is 2, the stdout for the matches is (the 1> prefix is the index of the subtask that printed the line):
{quote}1> 3,3,3
1> 2,2,2{quote}
When parallelism is 1:
{quote}3,3,3
2,2,2{quote}
Does this match what you see in your experiments?
> Patterns are not emitted with parallelism >1 since 1.15.x+
> ----------------------------------------------------------
>
> Key: FLINK-30562
> URL: https://issues.apache.org/jira/browse/FLINK-30562
> Project: Flink
> Issue Type: Bug
> Components: Library / CEP
> Affects Versions: 1.16.0, 1.15.3
> Environment: Problem observed in:
> Production:
> Dockerised Flink cluster running in AWS Fargate, sourced from AWS Kinesis and
> sink to AWS SQS
> Local:
> Completely local MiniCluster based test with no external sinks or sources
> Reporter: Thomas Wozniakowski
> Priority: Major
>
> (Apologies for the speculative and somewhat vague ticket, but I wanted to
> raise this while I am investigating to see if anyone has suggestions to help
> me narrow down the problem.)
> We are encountering an issue where our streaming Flink job has stopped
> working correctly since Flink 1.15.3. This problem is also present on Flink
> 1.16.0. The Keyed CEP operators that our job uses are no longer emitting
> Patterns reliably, but critically *this is only happening when parallelism is
> set to a value greater than 1*.
> Our local build tests were previously set up using in-JVM `MiniCluster`
> instances, or dockerised Flink clusters all set with a parallelism of 1, so
> this problem was not caught and it caused an outage when we upgraded the
> cluster version in production.
> Observing the job using the Flink console in production, I can see that
> events are *arriving* into the Keyed CEP operators, but no Pattern events are
> being emitted out of any of the operators. Furthermore, all the reported
> Watermark values are zero, though I don't know if that is a red herring, as
> Watermark reporting seems to have changed since 1.14.x.
> I am currently attempting to create a stripped down version of our streaming
> job to demonstrate the problem, but this is quite tricky to set up. In the
> meantime I would appreciate any hints that could point me in the right
> direction.
> I have isolated the problem to the Keyed CEP operator by removing our real
> sinks and sources from the failing test. I am still seeing the erroneous
> behaviour when setting up a job as:
> # Events are read from a list using `env.fromCollection( ... )`
> # CEP operator processes events
> # Output is captured in another list for assertions
> My best guess at the moment is something to do with Watermark emission? There
> seem to have been changes related to watermark alignment; perhaps these have
> caused some kind of regression in the CEP library? To reiterate, *this
> problem only occurs with parallelism of 2 or more. Setting the parallelism to
> 1 immediately fixes the issue.*
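
For readers following the watermark guess in the report, here is a minimal toy model in plain Java (not Flink code, and not a diagnosis of FLINK-30562) of how an operator with several input channels derives its event-time clock: it takes the minimum watermark over its inputs, so a single channel that never advances holds the whole operator back. The class `WatermarkMinSketch` and its method names are hypothetical and exist only for illustration. `Long.MIN_VALUE` stands in for "no watermark received yet".

```java
import java.util.Arrays;

/** Toy model only: mimics how a multi-input operator combines watermarks. */
final class WatermarkMinSketch {

    /** Placeholder for a channel that has not reported any watermark yet. */
    public static final long NO_WATERMARK = Long.MIN_VALUE;

    /** The operator's event-time clock is the minimum over its input channels. */
    public static long combinedWatermark(long[] channelWatermarks) {
        return Arrays.stream(channelWatermarks).min().orElse(NO_WATERMARK);
    }

    /** An event-time timer fires once the watermark reaches its timestamp. */
    public static boolean timerFires(long timerTimestamp, long watermark) {
        return watermark >= timerTimestamp;
    }

    public static void main(String[] args) {
        // One input channel at watermark 5000: a timer at 3000 can fire.
        long single = combinedWatermark(new long[] {5_000L});
        System.out.println(single + " fires=" + timerFires(3_000L, single));

        // Two input channels, one of which never advanced: the clock stays
        // at NO_WATERMARK and the same timer does not fire.
        long stalled = combinedWatermark(new long[] {5_000L, NO_WATERMARK});
        System.out.println(stalled + " fires=" + timerFires(3_000L, stalled));
    }
}
```

If something like this is involved, comparing the watermark each CEP subtask receives at parallelism 1 and at parallelism 2 may help narrow it down. That is only a suggestion for investigation, not a confirmed cause.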