[
https://issues.apache.org/jira/browse/BEAM-14407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Valentyn Tymofieiev updated BEAM-14407:
---------------------------------------
Status: Open (was: Triage Needed)
> Jenkins worker sometimes crashes while running postcommits
> ----------------------------------------------------------
>
> Key: BEAM-14407
> URL: https://issues.apache.org/jira/browse/BEAM-14407
> Project: Beam
> Issue Type: Bug
> Components: test-failures
> Reporter: Valentyn Tymofieiev
> Priority: P2
> Labels: flake
>
> Example failure from
> https://ci-beam.apache.org/job/beam_PostCommit_Python37/5184/
> ```
> >>> RUNNING integration tests with pipeline options: --runner=FlinkRunner
> --project=apache-beam-testing --environment_type=LOOPBACK
> --temp_location=gs://temp-storage-for-end-to-end-tests/temp-it
> --flink_job_server_jar=/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python37/src/runners/flink/1.14/job-server/build/libs/beam-runners-flink-1.14-job-server-2.39.0-SNAPSHOT.jar
> >>> pytest options: apache_beam/io/gcp/bigquery_read_it_test.py
> apache_beam/io/external/xlang_jdbcio_it_test.py
> apache_beam/io/external/xlang_kafkaio_it_test.py
> apache_beam/io/external/xlang_kinesisio_it_test.py
> apache_beam/io/external/xlang_debeziumio_it_test.py --log-cli-level=INFO
> ...
> 15:27:18 INFO
> apache_beam.utils.subprocess_server:subprocess_server.py:116 Starting service
> with ['java' '-jar'
> '/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python37/src/runners/flink/1.14/job-server/build/libs/beam-runners-flink-1.14-job-server-2.39.0-SNAPSHOT.jar'
> '--flink-master' '[auto]' '--artifacts-dir'
> '/tmp/beam-temp34uahjm8/artifactsfzc4uc4c' '--job-port' '56343'
> '--artifact-port' '0' '--expansion-port' '0']
> 15:27:18 INFO
> apache_beam.utils.subprocess_server:subprocess_server.py:125 b'May 03, 2022
> 1:27:20 PM software.amazon.awssdk.regions.internal.util.EC2MetadataUtils
> getItems'
> 15:27:20 INFO
> apache_beam.utils.subprocess_server:subprocess_server.py:125 b'WARNING:
> Unable to retrieve the requested metadata.'
> 15:27:20 INFO
> apache_beam.utils.subprocess_server:subprocess_server.py:125 b'May 03, 2022
> 1:27:20 PM org.apache.beam.sdk.io.aws2.s3.DefaultS3ClientBuilderFactory
> createBuilder'
> 15:27:20 INFO
> apache_beam.utils.subprocess_server:subprocess_server.py:125 b"INFO: The AWS
> S3 Beam extension was included in this build, but the awsRegion flag was not
> specified. If you don't plan to use S3, then ignore this message."
> 15:27:20 INFO
> apache_beam.utils.subprocess_server:subprocess_server.py:125 b'May 03, 2022
> 1:27:21 PM org.apache.beam.runners.jobsubmission.JobServerDriver
> createArtifactStagingService'
> 15:27:21 INFO
> apache_beam.utils.subprocess_server:subprocess_server.py:125 b'INFO:
> ArtifactStagingService started on localhost:36631'
> 15:27:21 INFO
> apache_beam.utils.subprocess_server:subprocess_server.py:125 b'May 03, 2022
> 1:27:21 PM org.apache.beam.runners.jobsubmission.JobServerDriver
> createExpansionService'
> 15:27:21 INFO
> apache_beam.utils.subprocess_server:subprocess_server.py:125 b'INFO: Java
> ExpansionService started on localhost:35729'
> 15:27:21 INFO
> apache_beam.utils.subprocess_server:subprocess_server.py:125 b'May 03, 2022
> 1:27:21 PM org.apache.beam.runners.jobsubmission.JobServerDriver
> createJobServer'
> 15:27:21 INFO
> apache_beam.utils.subprocess_server:subprocess_server.py:125 b'INFO:
> JobService started on localhost:56343'
> 15:27:21 INFO
> apache_beam.utils.subprocess_server:subprocess_server.py:125 b'May 03, 2022
> 1:27:21 PM org.apache.beam.runners.jobsubmission.JobServerDriver run'
> 15:27:21 INFO
> apache_beam.utils.subprocess_server:subprocess_server.py:125 b'INFO: Job
> server now running, terminate with Ctrl+C'
> 15:27:21 FATAL: command execution failed
> 15:27:21 java.io.IOException: Backing channel 'apache-beam-jenkins-10' is
> disconnected.
> 15:27:21 at
> hudson.remoting.RemoteInvocationHandler.channelOrFail(RemoteInvocationHandler.java:216)
> ...
> FATAL: command execution failed
> java.io.IOException: Backing channel 'apache-beam-jenkins-10' is
> disconnected.
> at
> hudson.remoting.RemoteInvocationHandler.channelOrFail(RemoteInvocationHandler.java:216)
> at
> hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:286)
> ```
> This may have been a random crash, or the worker may have been overloaded.
> Other suites running at the same time:
> beam_BiqQueryIO_Streaming_Performance_Test_Java #3729
> beam_LoadTests_Java_CoGBK_Dataflow_V2_Streaming_Java17 #134
> beam_LoadTests_Python_GBK_Dataflow_Batch #1060
> also crashed, but at that moment those tests had already launched Dataflow
> jobs and were streaming log output. Only the beam_PostCommit_Python37 suite
> appeared to be running something intensive on the worker.
> Filing this issue to track how frequently this happens.
--
This message was sent by Atlassian Jira
(v8.20.7#820007)