[jira] [Updated] (FLINK-14843) Streaming bucketing end-to-end test can fail with Output hash mismatch

2020-01-14 Thread Yu Li (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu Li updated FLINK-14843:
--
Priority: Major  (was: Critical)

Thanks for the investigation [~banmoy]. Downgrading the priority here; we will 
track FLINK-14505 instead.

> Streaming bucketing end-to-end test can fail with Output hash mismatch
> --
>
> Key: FLINK-14843
> URL: https://issues.apache.org/jira/browse/FLINK-14843
> Project: Flink
> Issue Type: Bug
> Components: Connectors / FileSystem, Tests
> Affects Versions: 1.10.0
> Environment: rev: dcc1330375826b779e4902176bb2473704dabb11
> Reporter: Gary Yao
> Assignee: PengFei Li
> Priority: Major
> Labels: pull-request-available, test-stability
> Fix For: 1.10.0
>
> Attachments: complete_result, 
> flink-gary-standalonesession-0-gyao-desktop.log, 
> flink-gary-taskexecutor-0-gyao-desktop.log, 
> flink-gary-taskexecutor-1-gyao-desktop.log, 
> flink-gary-taskexecutor-2-gyao-desktop.log, 
> flink-gary-taskexecutor-3-gyao-desktop.log, 
> flink-gary-taskexecutor-4-gyao-desktop.log, 
> flink-gary-taskexecutor-5-gyao-desktop.log, 
> flink-gary-taskexecutor-6-gyao-desktop.log
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> *Description*
> Streaming bucketing end-to-end test ({{test_streaming_bucketing.sh}}) can 
> fail with Output hash mismatch.
> {noformat}
> Number of running task managers has reached 4.
> Job (e0b7a86e4d4111f3947baa3d004e083a) is running.
> Waiting until all values have been produced
> Truncating buckets
> Number of produced values 26930/6
> Truncating buckets
> Number of produced values 30890/6
> Truncating buckets
> Number of produced values 37340/6
> Truncating buckets
> Number of produced values 41290/6
> Truncating buckets
> Number of produced values 46710/6
> Truncating buckets
> Number of produced values 52120/6
> Truncating buckets
> Number of produced values 57110/6
> Truncating buckets
> Number of produced values 62530/6
> Cancelling job e0b7a86e4d4111f3947baa3d004e083a.
> Cancelled job e0b7a86e4d4111f3947baa3d004e083a.
> Waiting for job (e0b7a86e4d4111f3947baa3d004e083a) to reach terminal state 
> CANCELED ...
> Job (e0b7a86e4d4111f3947baa3d004e083a) reached terminal state CANCELED
> Job e0b7a86e4d4111f3947baa3d004e083a was cancelled, time to verify
> FAIL Bucketing Sink: Output hash mismatch.  Got 
> 9e00429abfb30eea4f459eb812b470ad, expected 01aba5ff77a0ef5e5cf6a727c248bdc3.
> head hexdump of actual:
> 000   (   2   ,   1   0   ,   0   ,   S   o   m   e   p   a   y
> 010   l   o   a   d   .   .   .   )  \n   (   2   ,   1   0   ,   1
> 020   ,   S   o   m   e   p   a   y   l   o   a   d   .   .   .
> 030   )  \n   (   2   ,   1   0   ,   2   ,   S   o   m   e   p
> 040   a   y   l   o   a   d   .   .   .   )  \n   (   2   ,   1   0
> 050   ,   3   ,   S   o   m   e   p   a   y   l   o   a   d   .
> 060   .   .   )  \n   (   2   ,   1   0   ,   4   ,   S   o   m   e
> 070   p   a   y   l   o   a   d   .   .   .   )  \n   (   2   ,
> 080   1   0   ,   5   ,   S   o   m   e   p   a   y   l   o   a
> 090   d   .   .   .   )  \n   (   2   ,   1   0   ,   6   ,   S   o
> 0a0   m   e   p   a   y   l   o   a   d   .   .   .   )  \n   (
> 0b0   2   ,   1   0   ,   7   ,   S   o   m   e   p   a   y   l
> 0c0   o   a   d   .   .   .   )  \n   (   2   ,   1   0   ,   8   ,
> 0d0   S   o   m   e   p   a   y   l   o   a   d   .   .   .   )
> 0e0  \n   (   2   ,   1   0   ,   9   ,   S   o   m   e   p   a
> 0f0   y   l   o   a   d   .   .   .   )  \n
> 0fa
> Stopping taskexecutor daemon (pid: 55164) on host gyao-desktop.
> Stopping standalonesession daemon (pid: 51073) on host gyao-desktop.
> Stopping taskexecutor daemon (pid: 51504) on host gyao-desktop.
> Skipping taskexecutor daemon (pid: 52034), because it is not running anymore 
> on gyao-desktop.
> Skipping taskexecutor daemon (pid: 52472), because it is not running anymore 
> on gyao-desktop.
> Skipping taskexecutor daemon (pid: 52916), because it is not running anymore 
> on gyao-desktop.
> Stopping taskexecutor daemon (pid: 54121) on host gyao-desktop.
> Stopping taskexecutor daemon (pid: 54726) on host gyao-desktop.
> [FAIL] Test script contains errors.
> Checking of logs skipped.
> [FAIL] 'flink-end-to-end-tests/test-scripts/test_streaming_bucketing.sh' 
> failed after 2 minutes and 3 seconds! Test exited with exit code 1
> {noformat}
> *How to reproduce*
> Comment out the delay of 10s after the 1st TM is restarted to provoke the 
> issue:
> {code:bash}
> echo "Restarting 1 TM"
> $FLINK_DIR/bin/taskmanager.sh start
> wait_for_number_of_running_tms 4
> 

[jira] [Updated] (FLINK-14843) Streaming bucketing end-to-end test can fail with Output hash mismatch

2020-01-14 Thread Yu Li (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu Li updated FLINK-14843:
--
Priority: Critical  (was: Major)

Upgrading to Critical since it reproduces in the release-1.10 nightly build. I 
suggest fixing it as soon as possible if the given PR looks good [~banmoy] 
[~gjy]. Thanks.


[jira] [Updated] (FLINK-14843) Streaming bucketing end-to-end test can fail with Output hash mismatch

2020-01-06 Thread Gary Yao (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Yao updated FLINK-14843:
-
Issue Type: Bug  (was: Test)


[jira] [Updated] (FLINK-14843) Streaming bucketing end-to-end test can fail with Output hash mismatch

2019-12-26 Thread PengFei Li (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

PengFei Li updated FLINK-14843:
---
Issue Type: Test  (was: Bug)


[jira] [Updated] (FLINK-14843) Streaming bucketing end-to-end test can fail with Output hash mismatch

2019-12-26 Thread PengFei Li (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

PengFei Li updated FLINK-14843:
---
Priority: Major  (was: Critical)


[jira] [Updated] (FLINK-14843) Streaming bucketing end-to-end test can fail with Output hash mismatch

2019-12-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-14843:
---
Labels: pull-request-available test-stability  (was: test-stability)


[jira] [Updated] (FLINK-14843) Streaming bucketing end-to-end test can fail with Output hash mismatch

2019-12-10 Thread Gary Yao (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Yao updated FLINK-14843:
-
Fix Version/s: 1.10.0

> Streaming bucketing end-to-end test can fail with Output hash mismatch
> --
>
> Key: FLINK-14843
> URL: https://issues.apache.org/jira/browse/FLINK-14843
> Project: Flink
> Issue Type: Bug
> Components: Connectors / FileSystem, Tests
> Affects Versions: 1.10.0
> Environment: rev: dcc1330375826b779e4902176bb2473704dabb11
> Reporter: Gary Yao
> Priority: Critical
> Labels: test-stability
> Fix For: 1.10.0
>
> Attachments: complete_result, 
> flink-gary-standalonesession-0-gyao-desktop.log, 
> flink-gary-taskexecutor-0-gyao-desktop.log, 
> flink-gary-taskexecutor-1-gyao-desktop.log, 
> flink-gary-taskexecutor-2-gyao-desktop.log, 
> flink-gary-taskexecutor-3-gyao-desktop.log, 
> flink-gary-taskexecutor-4-gyao-desktop.log, 
> flink-gary-taskexecutor-5-gyao-desktop.log, 
> flink-gary-taskexecutor-6-gyao-desktop.log
>
>
> *Description*
> Streaming bucketing end-to-end test ({{test_streaming_bucketing.sh}}) can 
> fail with Output hash mismatch.
> {noformat}
> Number of running task managers has reached 4.
> Job (e0b7a86e4d4111f3947baa3d004e083a) is running.
> Waiting until all values have been produced
> Truncating buckets
> Number of produced values 26930/6
> Truncating buckets
> Number of produced values 30890/6
> Truncating buckets
> Number of produced values 37340/6
> Truncating buckets
> Number of produced values 41290/6
> Truncating buckets
> Number of produced values 46710/6
> Truncating buckets
> Number of produced values 52120/6
> Truncating buckets
> Number of produced values 57110/6
> Truncating buckets
> Number of produced values 62530/6
> Cancelling job e0b7a86e4d4111f3947baa3d004e083a.
> Cancelled job e0b7a86e4d4111f3947baa3d004e083a.
> Waiting for job (e0b7a86e4d4111f3947baa3d004e083a) to reach terminal state 
> CANCELED ...
> Job (e0b7a86e4d4111f3947baa3d004e083a) reached terminal state CANCELED
> Job e0b7a86e4d4111f3947baa3d004e083a was cancelled, time to verify
> FAIL Bucketing Sink: Output hash mismatch.  Got 
> 9e00429abfb30eea4f459eb812b470ad, expected 01aba5ff77a0ef5e5cf6a727c248bdc3.
> head hexdump of actual:
> 000   (   2   ,   1   0   ,   0   ,   S   o   m   e   p   a   y
> 010   l   o   a   d   .   .   .   )  \n   (   2   ,   1   0   ,   1
> 020   ,   S   o   m   e   p   a   y   l   o   a   d   .   .   .
> 030   )  \n   (   2   ,   1   0   ,   2   ,   S   o   m   e   p
> 040   a   y   l   o   a   d   .   .   .   )  \n   (   2   ,   1   0
> 050   ,   3   ,   S   o   m   e   p   a   y   l   o   a   d   .
> 060   .   .   )  \n   (   2   ,   1   0   ,   4   ,   S   o   m   e
> 070   p   a   y   l   o   a   d   .   .   .   )  \n   (   2   ,
> 080   1   0   ,   5   ,   S   o   m   e   p   a   y   l   o   a
> 090   d   .   .   .   )  \n   (   2   ,   1   0   ,   6   ,   S   o
> 0a0   m   e   p   a   y   l   o   a   d   .   .   .   )  \n   (
> 0b0   2   ,   1   0   ,   7   ,   S   o   m   e   p   a   y   l
> 0c0   o   a   d   .   .   .   )  \n   (   2   ,   1   0   ,   8   ,
> 0d0   S   o   m   e   p   a   y   l   o   a   d   .   .   .   )
> 0e0  \n   (   2   ,   1   0   ,   9   ,   S   o   m   e   p   a
> 0f0   y   l   o   a   d   .   .   .   )  \n
> 0fa
> Stopping taskexecutor daemon (pid: 55164) on host gyao-desktop.
> Stopping standalonesession daemon (pid: 51073) on host gyao-desktop.
> Stopping taskexecutor daemon (pid: 51504) on host gyao-desktop.
> Skipping taskexecutor daemon (pid: 52034), because it is not running anymore 
> on gyao-desktop.
> Skipping taskexecutor daemon (pid: 52472), because it is not running anymore 
> on gyao-desktop.
> Skipping taskexecutor daemon (pid: 52916), because it is not running anymore 
> on gyao-desktop.
> Stopping taskexecutor daemon (pid: 54121) on host gyao-desktop.
> Stopping taskexecutor daemon (pid: 54726) on host gyao-desktop.
> [FAIL] Test script contains errors.
> Checking of logs skipped.
> [FAIL] 'flink-end-to-end-tests/test-scripts/test_streaming_bucketing.sh' 
> failed after 2 minutes and 3 seconds! Test exited with exit code 1
> {noformat}
> *How to reproduce*
> Comment out the delay of 10s after the 1st TM is restarted to provoke the 
> issue:
> {code:bash}
> echo "Restarting 1 TM"
> $FLINK_DIR/bin/taskmanager.sh start
> wait_for_number_of_running_tms 4
> #sleep 10
> echo "Killing 2 TMs"
> kill_random_taskmanager
> kill_random_taskmanager
> wait_for_number_of_running_tms 2
> {code}
> Command to run the test:
> {noformat}
> FLINK_DIR=build-target/ 

[jira] [Updated] (FLINK-14843) Streaming bucketing end-to-end test can fail with Output hash mismatch

2019-11-18 Thread Gary Yao (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Yao updated FLINK-14843:
-
Description: 
*Description*
Streaming bucketing end-to-end test ({{test_streaming_bucketing.sh}}) can fail 
with Output hash mismatch.

{noformat}
Number of running task managers has reached 4.
Job (e0b7a86e4d4111f3947baa3d004e083a) is running.
Waiting until all values have been produced
Truncating buckets
Number of produced values 26930/6
Truncating buckets
Number of produced values 30890/6
Truncating buckets
Number of produced values 37340/6
Truncating buckets
Number of produced values 41290/6
Truncating buckets
Number of produced values 46710/6
Truncating buckets
Number of produced values 52120/6
Truncating buckets
Number of produced values 57110/6
Truncating buckets
Number of produced values 62530/6
Cancelling job e0b7a86e4d4111f3947baa3d004e083a.
Cancelled job e0b7a86e4d4111f3947baa3d004e083a.
Waiting for job (e0b7a86e4d4111f3947baa3d004e083a) to reach terminal state 
CANCELED ...
Job (e0b7a86e4d4111f3947baa3d004e083a) reached terminal state CANCELED
Job e0b7a86e4d4111f3947baa3d004e083a was cancelled, time to verify
FAIL Bucketing Sink: Output hash mismatch.  Got 
9e00429abfb30eea4f459eb812b470ad, expected 01aba5ff77a0ef5e5cf6a727c248bdc3.
head hexdump of actual:
000   (   2   ,   1   0   ,   0   ,   S   o   m   e   p   a   y
010   l   o   a   d   .   .   .   )  \n   (   2   ,   1   0   ,   1
020   ,   S   o   m   e   p   a   y   l   o   a   d   .   .   .
030   )  \n   (   2   ,   1   0   ,   2   ,   S   o   m   e   p
040   a   y   l   o   a   d   .   .   .   )  \n   (   2   ,   1   0
050   ,   3   ,   S   o   m   e   p   a   y   l   o   a   d   .
060   .   .   )  \n   (   2   ,   1   0   ,   4   ,   S   o   m   e
070   p   a   y   l   o   a   d   .   .   .   )  \n   (   2   ,
080   1   0   ,   5   ,   S   o   m   e   p   a   y   l   o   a
090   d   .   .   .   )  \n   (   2   ,   1   0   ,   6   ,   S   o
0a0   m   e   p   a   y   l   o   a   d   .   .   .   )  \n   (
0b0   2   ,   1   0   ,   7   ,   S   o   m   e   p   a   y   l
0c0   o   a   d   .   .   .   )  \n   (   2   ,   1   0   ,   8   ,
0d0   S   o   m   e   p   a   y   l   o   a   d   .   .   .   )
0e0  \n   (   2   ,   1   0   ,   9   ,   S   o   m   e   p   a
0f0   y   l   o   a   d   .   .   .   )  \n
0fa
Stopping taskexecutor daemon (pid: 55164) on host gyao-desktop.
Stopping standalonesession daemon (pid: 51073) on host gyao-desktop.
Stopping taskexecutor daemon (pid: 51504) on host gyao-desktop.
Skipping taskexecutor daemon (pid: 52034), because it is not running anymore on 
gyao-desktop.
Skipping taskexecutor daemon (pid: 52472), because it is not running anymore on 
gyao-desktop.
Skipping taskexecutor daemon (pid: 52916), because it is not running anymore on 
gyao-desktop.
Stopping taskexecutor daemon (pid: 54121) on host gyao-desktop.
Stopping taskexecutor daemon (pid: 54726) on host gyao-desktop.
[FAIL] Test script contains errors.
Checking of logs skipped.

[FAIL] 'flink-end-to-end-tests/test-scripts/test_streaming_bucketing.sh' failed 
after 2 minutes and 3 seconds! Test exited with exit code 1
{noformat}
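
For readers unfamiliar with this kind of check, here is a minimal sketch of how an order-independent output hash could be computed and compared. It assumes that the check sorts all output lines and hashes the result with md5sum, which is consistent with the 32-digit hashes in the log but is not confirmed here. The function names order_independent_hash and check_output_hash are hypothetical and are not taken from the Flink test scripts. Under that assumption, a mismatch like the one logged above would point to missing or duplicated records rather than to a different write order.

```shell
# Hypothetical helper (assumption for illustration, not the real test code).
# Usage: order_independent_hash FILE...
order_independent_hash() {
  # Sort with a fixed locale so the result does not depend on the order in
  # which the part files were written or listed, then hash the sorted stream.
  LC_ALL=C sort "$@" | md5sum | awk '{print $1}'
}

# Hypothetical helper: compares the computed hash with an expected value.
# Usage: check_output_hash EXPECTED_HASH FILE...
check_output_hash() {
  local expected="$1"
  shift
  local actual
  actual=$(order_independent_hash "$@")
  if [ "$actual" != "$expected" ]; then
    echo "FAIL: Output hash mismatch.  Got $actual, expected $expected."
    return 1
  fi
  echo "pass: Output hash matches."
}
```

Because the lines are sorted before hashing, two runs that write the same records into different part files, or in a different order, produce the same hash.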


*How to reproduce*
Comment out the delay of 10s after the 1st TM is restarted to provoke the issue:

{code:bash}
echo "Restarting 1 TM"
$FLINK_DIR/bin/taskmanager.sh start
wait_for_number_of_running_tms 4

#sleep 10

echo "Killing 2 TMs"
kill_random_taskmanager
kill_random_taskmanager
wait_for_number_of_running_tms 2
{code}

Command to run the test:
{noformat}
FLINK_DIR=build-target/ flink-end-to-end-tests/run-single-test.sh skip 
flink-end-to-end-tests/test-scripts/test_streaming_bucketing.sh
{noformat}
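
Since the failure is intermittent, it may help to run the command above several times and count the failures. The following is a small sketch of such a loop. The helper name count_failures is hypothetical and is not part of the Flink test scripts.

```shell
# Hypothetical helper: runs a command a number of times and prints how many
# of the runs exited with a non-zero code.
# Usage: count_failures RUNS COMMAND [ARGS...]
count_failures() {
  local runs="$1"
  shift
  local failures=0
  local i=1
  while [ "$i" -le "$runs" ]; do
    # Discard the command's output; only the exit code matters here.
    if ! "$@" > /dev/null 2>&1; then
      failures=$((failures + 1))
    fi
    i=$((i + 1))
  done
  echo "$failures"
}
```

For example, after exporting FLINK_DIR=build-target/, calling count_failures 10 flink-end-to-end-tests/run-single-test.sh skip flink-end-to-end-tests/test-scripts/test_streaming_bucketing.sh would print the number of failed runs out of ten.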




  was:
*Description*
Streaming bucketing end-to-end test ({{test_streaming_bucketing.sh}}) can fail 
with Output hash mismatch.

{noformat}
Number of running task managers has reached 4.
Job (67212178694f8b2a9bc9d9572567a53f) is running.
Waiting until all values have been produced
Truncating buckets
Number of produced values 26325/6
Truncating buckets
Number of produced values 31315/6
Truncating buckets
Number of produced values 36735/6
Truncating buckets
Number of produced values 40705/6
Truncating buckets
Number of produced values 46125/6
Truncating buckets
Number of produced values 51135/6
Truncating buckets
Number of produced values 56555/6
Truncating buckets
Number of produced values 61935/6
Cancelling job 67212178694f8b2a9bc9d9572567a53f.
Cancelled job 67212178694f8b2a9bc9d9572567a53f.
Waiting for job (67212178694f8b2a9bc9d9572567a53f) to reach terminal state 
CANCELED ...
Job (67212178694f8b2a9bc9d9572567a53f) reached terminal state CANCELED
Job 67212178694f8b2a9bc9d9572567a53f was cancelled, time to verify
FAIL Bucketing Sink: 

[jira] [Updated] (FLINK-14843) Streaming bucketing end-to-end test can fail with Output hash mismatch

2019-11-18 Thread Gary Yao (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Yao updated FLINK-14843:
-
Attachment: flink-gary-standalonesession-0-gyao-desktop.log
flink-gary-taskexecutor-0-gyao-desktop.log
flink-gary-taskexecutor-1-gyao-desktop.log
flink-gary-taskexecutor-2-gyao-desktop.log
flink-gary-taskexecutor-3-gyao-desktop.log
flink-gary-taskexecutor-4-gyao-desktop.log
flink-gary-taskexecutor-5-gyao-desktop.log
flink-gary-taskexecutor-6-gyao-desktop.log
complete_result

> Streaming bucketing end-to-end test can fail with Output hash mismatch
> --
>
> Key: FLINK-14843
> URL: https://issues.apache.org/jira/browse/FLINK-14843
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / FileSystem, Tests
>Affects Versions: 1.10.0
> Environment: rev: dcc1330375826b779e4902176bb2473704dabb11
>Reporter: Gary Yao
>Priority: Critical
>  Labels: test-stability
> Attachments: complete_result, 
> flink-gary-standalonesession-0-gyao-desktop.log, 
> flink-gary-taskexecutor-0-gyao-desktop.log, 
> flink-gary-taskexecutor-1-gyao-desktop.log, 
> flink-gary-taskexecutor-2-gyao-desktop.log, 
> flink-gary-taskexecutor-3-gyao-desktop.log, 
> flink-gary-taskexecutor-4-gyao-desktop.log, 
> flink-gary-taskexecutor-5-gyao-desktop.log, 
> flink-gary-taskexecutor-6-gyao-desktop.log
>
>
> *Description*
> Streaming bucketing end-to-end test ({{test_streaming_bucketing.sh}}) can 
> fail with Output hash mismatch.
> {noformat}
> Number of running task managers has reached 4.
> Job (67212178694f8b2a9bc9d9572567a53f) is running.
> Waiting until all values have been produced
> Truncating buckets
> Number of produced values 26325/6
> Truncating buckets
> Number of produced values 31315/6
> Truncating buckets
> Number of produced values 36735/6
> Truncating buckets
> Number of produced values 40705/6
> Truncating buckets
> Number of produced values 46125/6
> Truncating buckets
> Number of produced values 51135/6
> Truncating buckets
> Number of produced values 56555/6
> Truncating buckets
> Number of produced values 61935/6
> Cancelling job 67212178694f8b2a9bc9d9572567a53f.
> Cancelled job 67212178694f8b2a9bc9d9572567a53f.
> Waiting for job (67212178694f8b2a9bc9d9572567a53f) to reach terminal state 
> CANCELED ...
> Job (67212178694f8b2a9bc9d9572567a53f) reached terminal state CANCELED
> Job 67212178694f8b2a9bc9d9572567a53f was cancelled, time to verify
> FAIL Bucketing Sink: Output hash mismatch.  Got 
> 4e2d1859e41184a38e5bc95090fe9941, expected 01aba5ff77a0ef5e5cf6a727c248bdc3.
> head hexdump of actual:
> 000   (   2   ,   1   0   ,   0   ,   S   o   m   e   p   a   y
> 010   l   o   a   d   .   .   .   )  \n   (   2   ,   1   0   ,   1
> 020   ,   S   o   m   e   p   a   y   l   o   a   d   .   .   .
> 030   )  \n   (   2   ,   1   0   ,   2   ,   S   o   m   e   p
> 040   a   y   l   o   a   d   .   .   .   )  \n   (   2   ,   1   0
> 050   ,   3   ,   S   o   m   e   p   a   y   l   o   a   d   .
> 060   .   .   )  \n   (   2   ,   1   0   ,   4   ,   S   o   m   e
> 070   p   a   y   l   o   a   d   .   .   .   )  \n   (   2   ,
> 080   1   0   ,   5   ,   S   o   m   e   p   a   y   l   o   a
> 090   d   .   .   .   )  \n   (   2   ,   1   0   ,   6   ,   S   o
> 0a0   m   e   p   a   y   l   o   a   d   .   .   .   )  \n   (
> 0b0   2   ,   1   0   ,   7   ,   S   o   m   e   p   a   y   l
> 0c0   o   a   d   .   .   .   )  \n   (   2   ,   1   0   ,   8   ,
> 0d0   S   o   m   e   p   a   y   l   o   a   d   .   .   .   )
> 0e0  \n   (   2   ,   1   0   ,   9   ,   S   o   m   e   p   a
> 0f0   y   l   o   a   d   .   .   .   )  \n
> 0fa
> Stopping taskexecutor daemon (pid: 654547) on host gyao-desktop.
> Stopping standalonesession daemon (pid: 650368) on host gyao-desktop.
> Stopping taskexecutor daemon (pid: 650812) on host gyao-desktop.
> Skipping taskexecutor daemon (pid: 651347), because it is not running anymore 
> on gyao-desktop.
> Skipping taskexecutor daemon (pid: 651795), because it is not running anymore 
> on gyao-desktop.
> Skipping taskexecutor daemon (pid: 652249), because it is not running anymore 
> on gyao-desktop.
> Stopping taskexecutor daemon (pid: 653481) on host gyao-desktop.
> Stopping taskexecutor daemon (pid: 654099) on host gyao-desktop.
> [FAIL] Test script contains errors.
> Checking of logs skipped.
> [FAIL] 'flink-end-to-end-tests/test-scripts/test_streaming_bucketing.sh' 
> failed after 2 minutes and 3 seconds! Test exited with exit 

[jira] [Updated] (FLINK-14843) Streaming bucketing end-to-end test can fail with Output hash mismatch

2019-11-18 Thread Gary Yao (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Yao updated FLINK-14843:
-
Labels: test-stability  (was: )

> Streaming bucketing end-to-end test can fail with Output hash mismatch
> --
>
> Key: FLINK-14843
> URL: https://issues.apache.org/jira/browse/FLINK-14843
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / FileSystem, Tests
>Affects Versions: 1.10.0
> Environment: rev: dcc1330375826b779e4902176bb2473704dabb11
>Reporter: Gary Yao
>Priority: Critical
>  Labels: test-stability
>
> *Description*
> Streaming bucketing end-to-end test ({{test_streaming_bucketing.sh}}) can 
> fail with Output hash mismatch.
> {noformat}
> Number of running task managers has reached 4.
> Job (67212178694f8b2a9bc9d9572567a53f) is running.
> Waiting until all values have been produced
> Truncating buckets
> Number of produced values 26325/6
> Truncating buckets
> Number of produced values 31315/6
> Truncating buckets
> Number of produced values 36735/6
> Truncating buckets
> Number of produced values 40705/6
> Truncating buckets
> Number of produced values 46125/6
> Truncating buckets
> Number of produced values 51135/6
> Truncating buckets
> Number of produced values 56555/6
> Truncating buckets
> Number of produced values 61935/6
> Cancelling job 67212178694f8b2a9bc9d9572567a53f.
> Cancelled job 67212178694f8b2a9bc9d9572567a53f.
> Waiting for job (67212178694f8b2a9bc9d9572567a53f) to reach terminal state 
> CANCELED ...
> Job (67212178694f8b2a9bc9d9572567a53f) reached terminal state CANCELED
> Job 67212178694f8b2a9bc9d9572567a53f was cancelled, time to verify
> FAIL Bucketing Sink: Output hash mismatch.  Got 
> 4e2d1859e41184a38e5bc95090fe9941, expected 01aba5ff77a0ef5e5cf6a727c248bdc3.
> head hexdump of actual:
> 000   (   2   ,   1   0   ,   0   ,   S   o   m   e   p   a   y
> 010   l   o   a   d   .   .   .   )  \n   (   2   ,   1   0   ,   1
> 020   ,   S   o   m   e   p   a   y   l   o   a   d   .   .   .
> 030   )  \n   (   2   ,   1   0   ,   2   ,   S   o   m   e   p
> 040   a   y   l   o   a   d   .   .   .   )  \n   (   2   ,   1   0
> 050   ,   3   ,   S   o   m   e   p   a   y   l   o   a   d   .
> 060   .   .   )  \n   (   2   ,   1   0   ,   4   ,   S   o   m   e
> 070   p   a   y   l   o   a   d   .   .   .   )  \n   (   2   ,
> 080   1   0   ,   5   ,   S   o   m   e   p   a   y   l   o   a
> 090   d   .   .   .   )  \n   (   2   ,   1   0   ,   6   ,   S   o
> 0a0   m   e   p   a   y   l   o   a   d   .   .   .   )  \n   (
> 0b0   2   ,   1   0   ,   7   ,   S   o   m   e   p   a   y   l
> 0c0   o   a   d   .   .   .   )  \n   (   2   ,   1   0   ,   8   ,
> 0d0   S   o   m   e   p   a   y   l   o   a   d   .   .   .   )
> 0e0  \n   (   2   ,   1   0   ,   9   ,   S   o   m   e   p   a
> 0f0   y   l   o   a   d   .   .   .   )  \n
> 0fa
> Stopping taskexecutor daemon (pid: 654547) on host gyao-desktop.
> Stopping standalonesession daemon (pid: 650368) on host gyao-desktop.
> Stopping taskexecutor daemon (pid: 650812) on host gyao-desktop.
> Skipping taskexecutor daemon (pid: 651347), because it is not running anymore 
> on gyao-desktop.
> Skipping taskexecutor daemon (pid: 651795), because it is not running anymore 
> on gyao-desktop.
> Skipping taskexecutor daemon (pid: 652249), because it is not running anymore 
> on gyao-desktop.
> Stopping taskexecutor daemon (pid: 653481) on host gyao-desktop.
> Stopping taskexecutor daemon (pid: 654099) on host gyao-desktop.
> [FAIL] Test script contains errors.
> Checking of logs skipped.
> [FAIL] 'flink-end-to-end-tests/test-scripts/test_streaming_bucketing.sh' 
> failed after 2 minutes and 3 seconds! Test exited with exit code 1
> {noformat}
> *How to reproduce*
> Comment out the delay of 10s after the 1st TM is restarted to provoke the 
> issue:
> {code:bash}
> echo "Restarting 1 TM"
> $FLINK_DIR/bin/taskmanager.sh start
> wait_for_number_of_running_tms 4
> #sleep 10
> echo "Killing 2 TMs"
> kill_random_taskmanager
> kill_random_taskmanager
> wait_for_number_of_running_tms 2
> {code}
> Command to run the test:
> {noformat}
> FLINK_DIR=build-target/ flink-end-to-end-tests/run-single-test.sh skip 
> flink-end-to-end-tests/test-scripts/test_streaming_bucketing.sh
> {noformat}


