[jira] [Commented] (FLINK-27031) ChangelogRescalingITCase.test failed due to IllegalStateException

2022-05-19 Thread Anton Kalashnikov (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17539477#comment-17539477
 ] 

Anton Kalashnikov commented on FLINK-27031:
---

As far as I can see, we set up Virtual Channels inside 
`StreamTaskNetworkInputFactory#create` when the `InflightDataRescalingDescriptor` 
is not equal to `NO_RESCALE`. That happens only if the subtask receives a 
`JobManagerTaskRestore` (see 
`TaskStateManagerImpl#getInputRescalingDescriptor`). So if the subtask doesn't 
receive a `JobManagerTaskRestore` but does receive old state that should be 
filtered out, we get the error described in this ticket. 
As I understand it, this is only possible when the subtask has no state of its 
own to restore, but its upstream task has output channel state. According to 
the current logic (see the last `for` loop in 
`StateAssignmentOperation#assignStates`), we assign state based only on the 
state of the current subtask and ignore its upstream's state, which is what 
leads to the described problem. 
So I think we can expand the condition for assigning state to include a check 
for upstream output channel state (which I have done in 
https://github.com/apache/flink/pull/19702). Alternatively, we could think of 
another fix that somehow notifies the subtask that it needs to create the 
Virtual Channels. 
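
To make the proposed condition concrete, here is a minimal sketch of the idea. It is a simplified model only: the class and method names (`RestoreConditionSketch`, `receivesRestoreCurrent`, `receivesRestoreProposed`) are hypothetical and do not reproduce the code in `StateAssignmentOperation` or in the pull request.

```java
/**
 * Hypothetical, simplified model of the decision "does this subtask get a
 * restore descriptor?". Not Flink code; names are made up for illustration.
 */
final class RestoreConditionSketch {

    /** Current logic as described above: only the subtask's own state counts. */
    static boolean receivesRestoreCurrent(
            boolean hasOwnState, boolean upstreamHasOutputChannelState) {
        return hasOwnState;
    }

    /** Expanded condition: upstream output channel state also triggers a restore. */
    static boolean receivesRestoreProposed(
            boolean hasOwnState, boolean upstreamHasOutputChannelState) {
        return hasOwnState || upstreamHasOutputChannelState;
    }

    public static void main(String[] args) {
        // Problem case: the subtask has no state, but its upstream has in-flight data.
        System.out.println(receivesRestoreCurrent(false, true));  // false
        System.out.println(receivesRestoreProposed(false, true)); // true
    }
}
```

In the problem case the current condition yields no restore, so the rescaling descriptor stays at `NO_RESCALE` and no Virtual Channels are created; the expanded condition covers that case.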

> ChangelogRescalingITCase.test failed due to IllegalStateException
> -
>
> Key: FLINK-27031
> URL: https://issues.apache.org/jira/browse/FLINK-27031
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Network, Runtime / State Backends
>Affects Versions: 1.15.0, 1.16.0
>Reporter: Matthias Pohl
>Assignee: Anton Kalashnikov
>Priority: Critical
>  Labels: pull-request-available
>
> [This 
> build|https://dev.azure.com/mapohl/flink/_build/results?buildId=923&view=logs&j=cc649950-03e9-5fae-8326-2f1ad744b536&t=a9a20597-291c-5240-9913-a731d46d6dd1&l=12961]
>  failed in {{ChangelogRescalingITCase.test}}:
> {code}
> Apr 01 20:26:53 Caused by: java.lang.IllegalArgumentException: Key group 94 
> is not in KeyGroupRange{startKeyGroup=96, endKeyGroup=127}. Unless you're 
> directly using low level state access APIs, this is most likely caused by 
> non-deterministic shuffle key (hashCode and equals implementation).
> Apr 01 20:26:53   at 
> org.apache.flink.runtime.state.KeyGroupRangeOffsets.newIllegalKeyGroupException(KeyGroupRangeOffsets.java:37)
> Apr 01 20:26:53   at 
> org.apache.flink.runtime.state.heap.StateTable.getMapForKeyGroup(StateTable.java:305)
> Apr 01 20:26:53   at 
> org.apache.flink.runtime.state.heap.StateTable.get(StateTable.java:261)
> Apr 01 20:26:53   at 
> org.apache.flink.runtime.state.heap.StateTable.get(StateTable.java:143)
> Apr 01 20:26:53   at 
> org.apache.flink.runtime.state.heap.HeapListState.add(HeapListState.java:94)
> Apr 01 20:26:53   at 
> org.apache.flink.state.changelog.ChangelogListState.add(ChangelogListState.java:78)
> Apr 01 20:26:53   at 
> org.apache.flink.streaming.runtime.operators.windowing.WindowOperator.processElement(WindowOperator.java:404)
> Apr 01 20:26:53   at 
> org.apache.flink.streaming.runtime.tasks.ChainingOutput.pushToOperator(ChainingOutput.java:99)
> Apr 01 20:26:53   at 
> org.apache.flink.streaming.runtime.tasks.ChainingOutput.collect(ChainingOutput.java:80)
> Apr 01 20:26:53   at 
> org.apache.flink.streaming.runtime.tasks.ChainingOutput.collect(ChainingOutput.java:39)
> Apr 01 20:26:53   at 
> org.apache.flink.streaming.api.operators.CountingOutput.collect(CountingOutput.java:56)
> Apr 01 20:26:53   at 
> org.apache.flink.streaming.api.operators.CountingOutput.collect(CountingOutput.java:29)
> Apr 01 20:26:53   at 
> org.apache.flink.streaming.api.operators.StreamMap.processElement(StreamMap.java:38)
> Apr 01 20:26:53   at 
> org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitRecord(OneInputStreamTask.java:233)
> Apr 01 20:26:53   at 
> org.apache.flink.streaming.runtime.io.AbstractStreamTaskNetworkInput.processElement(AbstractStreamTaskNetworkInput.java:134)
> Apr 01 20:26:53   at 
> org.apache.flink.streaming.runtime.io.AbstractStreamTaskNetworkInput.emitNext(AbstractStreamTaskNetworkInput.java:105)
> Apr 01 20:26:53   at 
> org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:65)
> Apr 01 20:26:53   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:531)
> Apr 01 20:26:53   at 
> org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:227)
> Apr 01 20:26:53   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:841)
> Apr 01 20:26:5

[jira] [Commented] (FLINK-27031) ChangelogRescalingITCase.test failed due to IllegalStateException

2022-04-11 Thread Yun Gao (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17520361#comment-17520361
 ] 

Yun Gao commented on FLINK-27031:
-

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=34494&view=logs&j=39d5b1d5-3b41-54dc-6458-1e2ddd1cdcf3&t=0c010d0c-3dec-5bf1-d408-7b18988b1b2b&l=5380

> ChangelogRescalingITCase.test failed due to IllegalStateException
> -
>
> Key: FLINK-27031
> URL: https://issues.apache.org/jira/browse/FLINK-27031
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Network, Runtime / State Backends
>Affects Versions: 1.15.0, 1.16.0
>Reporter: Matthias Pohl
>Assignee: Anton Kalashnikov
>Priority: Critical
>
> [This 
> build|https://dev.azure.com/mapohl/flink/_build/results?buildId=923&view=logs&j=cc649950-03e9-5fae-8326-2f1ad744b536&t=a9a20597-291c-5240-9913-a731d46d6dd1&l=12961]
>  failed in {{ChangelogRescalingITCase.test}}:
> {code}
> Apr 01 20:26:53 Caused by: java.lang.IllegalArgumentException: Key group 94 
> is not in KeyGroupRange{startKeyGroup=96, endKeyGroup=127}. Unless you're 
> directly using low level state access APIs, this is most likely caused by 
> non-deterministic shuffle key (hashCode and equals implementation).
> Apr 01 20:26:53   at 
> org.apache.flink.runtime.state.KeyGroupRangeOffsets.newIllegalKeyGroupException(KeyGroupRangeOffsets.java:37)
> Apr 01 20:26:53   at 
> org.apache.flink.runtime.state.heap.StateTable.getMapForKeyGroup(StateTable.java:305)
> Apr 01 20:26:53   at 
> org.apache.flink.runtime.state.heap.StateTable.get(StateTable.java:261)
> Apr 01 20:26:53   at 
> org.apache.flink.runtime.state.heap.StateTable.get(StateTable.java:143)
> Apr 01 20:26:53   at 
> org.apache.flink.runtime.state.heap.HeapListState.add(HeapListState.java:94)
> Apr 01 20:26:53   at 
> org.apache.flink.state.changelog.ChangelogListState.add(ChangelogListState.java:78)
> Apr 01 20:26:53   at 
> org.apache.flink.streaming.runtime.operators.windowing.WindowOperator.processElement(WindowOperator.java:404)
> Apr 01 20:26:53   at 
> org.apache.flink.streaming.runtime.tasks.ChainingOutput.pushToOperator(ChainingOutput.java:99)
> Apr 01 20:26:53   at 
> org.apache.flink.streaming.runtime.tasks.ChainingOutput.collect(ChainingOutput.java:80)
> Apr 01 20:26:53   at 
> org.apache.flink.streaming.runtime.tasks.ChainingOutput.collect(ChainingOutput.java:39)
> Apr 01 20:26:53   at 
> org.apache.flink.streaming.api.operators.CountingOutput.collect(CountingOutput.java:56)
> Apr 01 20:26:53   at 
> org.apache.flink.streaming.api.operators.CountingOutput.collect(CountingOutput.java:29)
> Apr 01 20:26:53   at 
> org.apache.flink.streaming.api.operators.StreamMap.processElement(StreamMap.java:38)
> Apr 01 20:26:53   at 
> org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitRecord(OneInputStreamTask.java:233)
> Apr 01 20:26:53   at 
> org.apache.flink.streaming.runtime.io.AbstractStreamTaskNetworkInput.processElement(AbstractStreamTaskNetworkInput.java:134)
> Apr 01 20:26:53   at 
> org.apache.flink.streaming.runtime.io.AbstractStreamTaskNetworkInput.emitNext(AbstractStreamTaskNetworkInput.java:105)
> Apr 01 20:26:53   at 
> org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:65)
> Apr 01 20:26:53   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:531)
> Apr 01 20:26:53   at 
> org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:227)
> Apr 01 20:26:53   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:841)
> Apr 01 20:26:53   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:767)
> Apr 01 20:26:53   at 
> org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:948)
> Apr 01 20:26:53   at 
> org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:927)
> Apr 01 20:26:53   at 
> org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:741)
> Apr 01 20:26:53   at 
> org.apache.flink.runtime.taskmanager.Task.run(Task.java:563)
> Apr 01 20:26:53   at java.lang.Thread.run(Thread.java:748)
> {code}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-27031) ChangelogRescalingITCase.test failed due to IllegalStateException

2022-04-08 Thread Roman Khachatryan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17519687#comment-17519687
 ] 

Roman Khachatryan commented on FLINK-27031:
---

[https://dev.azure.com/khachatryanroman/flink/_build/results?buildId=1548&view=logs&j=cc649950-03e9-5fae-8326-2f1ad744b536&t=a9a20597-291c-5240-9913-a731d46d6dd1&l=41041]

> ChangelogRescalingITCase.test failed due to IllegalStateException
> -
>
> Key: FLINK-27031
> URL: https://issues.apache.org/jira/browse/FLINK-27031
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Network, Runtime / State Backends
>Affects Versions: 1.15.0, 1.16.0
>Reporter: Matthias Pohl
>Assignee: Anton Kalashnikov
>Priority: Critical
>





[jira] [Commented] (FLINK-27031) ChangelogRescalingITCase.test failed due to IllegalStateException

2022-04-04 Thread Roman Khachatryan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17516867#comment-17516867
 ] 

Roman Khachatryan commented on FLINK-27031:
---

The test fails reproducibly when rescaling from a DoP of 1 to 3 or higher, in 
about 6 runs out of 10.  ACCUMULATE_TIME_MILLIS doesn't affect it and can be zero.

Furthermore, it only fails when both Changelog and Unaligned checkpoints are 
enabled.

While debugging, I can see that:
- the problematic record comes from ResultSubpartition state
- according to its key, it should be assigned to subtask 1 or 2
- it is produced by upstream 0 and consumed by downstream 0 (the downstream 
should filter it out)
- however, downstream 0 doesn't set up the Virtual Channels that could filter 
the record; when it does, the test passes

I didn't investigate further why the Virtual Channels aren't being set up.

From the above, the failure seems to be caused by Unaligned checkpoints rather 
than changelog.
[~pnowojski] could you please take a look?
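
For readers unfamiliar with how a key group maps to a subtask, the following sketch works through the numbers from the error message (key group 94, range 96 to 127). It is a standalone re-implementation of the arithmetic as I understand it from Flink's `KeyGroupRangeAssignment`, not a copy of the Flink source. The class name `KeyGroupSketch` is hypothetical, and the maximum parallelism of 128 and parallelism of 4 are assumptions chosen because they reproduce the range in the error message; the ticket does not state them.

```java
/**
 * Hypothetical sketch of key-group-to-subtask arithmetic.
 * Assumed values: maxParallelism = 128, parallelism = 4.
 */
final class KeyGroupSketch {

    /** Index of the subtask that owns the given key group. */
    static int operatorIndexForKeyGroup(int maxParallelism, int parallelism, int keyGroup) {
        return keyGroup * parallelism / maxParallelism;
    }

    /** Inclusive {start, end} key group range owned by the given subtask. */
    static int[] keyGroupRangeForOperatorIndex(
            int maxParallelism, int parallelism, int operatorIndex) {
        int start = (operatorIndex * maxParallelism + parallelism - 1) / parallelism;
        int end = ((operatorIndex + 1) * maxParallelism - 1) / parallelism;
        return new int[] {start, end};
    }

    public static void main(String[] args) {
        int maxParallelism = 128; // assumption
        int parallelism = 4;      // assumption

        // Key group 94 belongs to subtask 2 under these assumptions.
        System.out.println(operatorIndexForKeyGroup(maxParallelism, parallelism, 94)); // 2

        // Subtask 3 owns key groups 96..127, the range named in the exception.
        int[] range = keyGroupRangeForOperatorIndex(maxParallelism, parallelism, 3);
        System.out.println(range[0] + ".." + range[1]); // 96..127
    }
}
```

Under these assumptions the record for key group 94 reached a subtask that does not own it, which is consistent with a record that was not filtered on the receiving side.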

> ChangelogRescalingITCase.test failed due to IllegalStateException
> -
>
> Key: FLINK-27031
> URL: https://issues.apache.org/jira/browse/FLINK-27031
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Network, Runtime / State Backends
>Affects Versions: 1.15.0, 1.16.0
>Reporter: Matthias Pohl
>Assignee: Roman Khachatryan
>Priority: Critical
>

[jira] [Commented] (FLINK-27031) ChangelogRescalingITCase.test failed due to IllegalStateException

2022-04-04 Thread Roman Khachatryan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17516661#comment-17516661
 ] 

Roman Khachatryan commented on FLINK-27031:
---

Sure [~mapohl], I'll take a look.

> ChangelogRescalingITCase.test failed due to IllegalStateException
> -
>
> Key: FLINK-27031
> URL: https://issues.apache.org/jira/browse/FLINK-27031
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / State Backends
>Affects Versions: 1.16.0
>Reporter: Matthias Pohl
>Assignee: Roman Khachatryan
>Priority: Critical
>  Labels: test-stability
>





[jira] [Commented] (FLINK-27031) ChangelogRescalingITCase.test failed due to IllegalStateException

2022-04-03 Thread Matthias Pohl (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17516478#comment-17516478
 ] 

Matthias Pohl commented on FLINK-27031:
---

[~roman], could you have a look?

> ChangelogRescalingITCase.test failed due to IllegalStateException
> -
>
> Key: FLINK-27031
> URL: https://issues.apache.org/jira/browse/FLINK-27031
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / State Backends
>Affects Versions: 1.16.0
>Reporter: Matthias Pohl
>Priority: Critical
>  Labels: test-stability
>


