[ https://issues.apache.org/jira/browse/BEAM-14407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17549970#comment-17549970 ]
Danny McCormick commented on BEAM-14407:
----------------------------------------

This issue has been migrated to https://github.com/apache/beam/issues/21540

> Jenkins worker sometimes crashes while running Python Flink pipeline
> --------------------------------------------------------------------
>
>                 Key: BEAM-14407
>                 URL: https://issues.apache.org/jira/browse/BEAM-14407
>             Project: Beam
>          Issue Type: Bug
>          Components: test-failures
>            Reporter: Valentyn Tymofieiev
>            Priority: P2
>              Labels: flake
>
> Example failure from
> [https://ci-beam.apache.org/job/beam_PostCommit_Python37/5184/]
> {noformat}
> >>> RUNNING integration tests with pipeline options: --runner=FlinkRunner --project=apache-beam-testing --environment_type=LOOPBACK --temp_location=gs://temp-storage-for-end-to-end-tests/temp-it --flink_job_server_jar=/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python37/src/runners/flink/1.14/job-server/build/libs/beam-runners-flink-1.14-job-server-2.39.0-SNAPSHOT.jar
> 4216 >>> pytest options: apache_beam/io/gcp/bigquery_read_it_test.py apache_beam/io/external/xlang_jdbcio_it_test.py apache_beam/io/external/xlang_kafkaio_it_test.py apache_beam/io/external/xlang_kinesisio_it_test.py apache_beam/io/external/xlang_debeziumio_it_test.py --log-cli-level=INFO
> ...
> 15:27:18 INFO apache_beam.utils.subprocess_server:subprocess_server.py:116 Starting service with ['java' '-jar' '/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python37/src/runners/flink/1.14/job-server/build/libs/beam-runners-flink-1.14-job-server-2.39.0-SNAPSHOT.jar' '--flink-master' '[auto]' '--artifacts-dir' '/tmp/beam-temp34uahjm8/artifactsfzc4uc4c' '--job-port' '56343' '--artifact-port' '0' '--expansion-port' '0']
> 15:27:18 INFO apache_beam.utils.subprocess_server:subprocess_server.py:125 b'May 03, 2022 1:27:20 PM software.amazon.awssdk.regions.internal.util.EC2MetadataUtils getItems'
> 15:27:20 INFO apache_beam.utils.subprocess_server:subprocess_server.py:125 b'WARNING: Unable to retrieve the requested metadata.'
> 15:27:20 INFO apache_beam.utils.subprocess_server:subprocess_server.py:125 b'May 03, 2022 1:27:20 PM org.apache.beam.sdk.io.aws2.s3.DefaultS3ClientBuilderFactory createBuilder'
> 15:27:20 INFO apache_beam.utils.subprocess_server:subprocess_server.py:125 b"INFO: The AWS S3 Beam extension was included in this build, but the awsRegion flag was not specified. If you don't plan to use S3, then ignore this message."
> 15:27:20 INFO apache_beam.utils.subprocess_server:subprocess_server.py:125 b'May 03, 2022 1:27:21 PM org.apache.beam.runners.jobsubmission.JobServerDriver createArtifactStagingService'
> 15:27:21 INFO apache_beam.utils.subprocess_server:subprocess_server.py:125 b'INFO: ArtifactStagingService started on localhost:36631'
> 15:27:21 INFO apache_beam.utils.subprocess_server:subprocess_server.py:125 b'May 03, 2022 1:27:21 PM org.apache.beam.runners.jobsubmission.JobServerDriver createExpansionService'
> 15:27:21 INFO apache_beam.utils.subprocess_server:subprocess_server.py:125 b'INFO: Java ExpansionService started on localhost:35729'
> 15:27:21 INFO apache_beam.utils.subprocess_server:subprocess_server.py:125 b'May 03, 2022 1:27:21 PM org.apache.beam.runners.jobsubmission.JobServerDriver createJobServer'
> 15:27:21 INFO apache_beam.utils.subprocess_server:subprocess_server.py:125 b'INFO: JobService started on localhost:56343'
> 15:27:21 INFO apache_beam.utils.subprocess_server:subprocess_server.py:125 b'May 03, 2022 1:27:21 PM org.apache.beam.runners.jobsubmission.JobServerDriver run'
> 15:27:21 INFO apache_beam.utils.subprocess_server:subprocess_server.py:125 b'INFO: Job server now running, terminate with Ctrl+C'
> 15:27:21 FATAL: command execution failed
> 15:27:21 java.io.IOException: Backing channel 'apache-beam-jenkins-10' is disconnected.
> 15:27:21 at hudson.remoting.RemoteInvocationHandler.channelOrFail(RemoteInvocationHandler.java:216)
> ...
> 4318 FATAL: command execution failed
> 4319 java.io.IOException: Backing channel 'apache-beam-jenkins-10' is disconnected.
> 4320 at hudson.remoting.RemoteInvocationHandler.channelOrFail(RemoteInvocationHandler.java:216)
> 4321 at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:286)
> {noformat}
> Perhaps a random crash or worker got overloaded.
> Other suites running at the same time:
> beam_BiqQueryIO_Streaming_Performance_Test_Java #3729
> beam_LoadTests_Java_CoGBK_Dataflow_V2_Streaming_Java17 #134
> beam_LoadTests_Python_GBK_Dataflow_Batch #1060
> also crashed, but at that moment those tests had launched Dataflow jobs and were streaming log output. Only the beam_PostCommit_Python37 suite appeared to be running something intensive on the worker.
> Filing to see how frequently this happens.

--
This message was sent by Atlassian Jira
(v8.20.7#820007)