[ 
https://issues.apache.org/jira/browse/BEAM-14407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17549970#comment-17549970
 ] 

Danny McCormick commented on BEAM-14407:
----------------------------------------

This issue has been migrated to https://github.com/apache/beam/issues/21540

> Jenkins worker sometimes crashes while running Python Flink pipeline
> --------------------------------------------------------------------
>
>                 Key: BEAM-14407
>                 URL: https://issues.apache.org/jira/browse/BEAM-14407
>             Project: Beam
>          Issue Type: Bug
>          Components: test-failures
>            Reporter: Valentyn Tymofieiev
>            Priority: P2
>              Labels: flake
>
> Example failure from 
> [https://ci-beam.apache.org/job/beam_PostCommit_Python37/5184/]
> {noformat}
> >>> RUNNING integration tests with pipeline options: --runner=FlinkRunner 
> --project=apache-beam-testing --environment_type=LOOPBACK 
> --temp_location=gs://temp-storage-for-end-to-end-tests/temp-it 
> --flink_job_server_jar=/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python37/src/runners/flink/1.14/job-server/build/libs/beam-runners-flink-1.14-job-server-2.39.0-SNAPSHOT.jar
> 4216 >>>   pytest options: apache_beam/io/gcp/bigquery_read_it_test.py 
> apache_beam/io/external/xlang_jdbcio_it_test.py 
> apache_beam/io/external/xlang_kafkaio_it_test.py 
> apache_beam/io/external/xlang_kinesisio_it_test.py 
> apache_beam/io/external/xlang_debeziumio_it_test.py --log-cli-level=INFO
> ...
> 15:27:18 INFO     
> apache_beam.utils.subprocess_server:subprocess_server.py:116 Starting service 
> with ['java' '-jar' 
> '/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python37/src/runners/flink/1.14/job-server/build/libs/beam-runners-flink-1.14-job-server-2.39.0-SNAPSHOT.jar'
>  '--flink-master' '[auto]' '--artifacts-dir' 
> '/tmp/beam-temp34uahjm8/artifactsfzc4uc4c' '--job-port' '56343' 
> '--artifact-port' '0' '--expansion-port' '0']
> 15:27:18 INFO     
> apache_beam.utils.subprocess_server:subprocess_server.py:125 b'May 03, 2022 
> 1:27:20 PM software.amazon.awssdk.regions.internal.util.EC2MetadataUtils 
> getItems'
> 15:27:20 INFO     
> apache_beam.utils.subprocess_server:subprocess_server.py:125 b'WARNING: 
> Unable to retrieve the requested metadata.'
> 15:27:20 INFO     
> apache_beam.utils.subprocess_server:subprocess_server.py:125 b'May 03, 2022 
> 1:27:20 PM org.apache.beam.sdk.io.aws2.s3.DefaultS3ClientBuilderFactory 
> createBuilder'
> 15:27:20 INFO     
> apache_beam.utils.subprocess_server:subprocess_server.py:125 b"INFO: The AWS 
> S3 Beam extension was included in this build, but the awsRegion flag was not 
> specified. If you don't plan to use S3, then ignore this message."
> 15:27:20 INFO     
> apache_beam.utils.subprocess_server:subprocess_server.py:125 b'May 03, 2022 
> 1:27:21 PM org.apache.beam.runners.jobsubmission.JobServerDriver 
> createArtifactStagingService'
> 15:27:21 INFO     
> apache_beam.utils.subprocess_server:subprocess_server.py:125 b'INFO: 
> ArtifactStagingService started on localhost:36631'
> 15:27:21 INFO     
> apache_beam.utils.subprocess_server:subprocess_server.py:125 b'May 03, 2022 
> 1:27:21 PM org.apache.beam.runners.jobsubmission.JobServerDriver 
> createExpansionService'
> 15:27:21 INFO     
> apache_beam.utils.subprocess_server:subprocess_server.py:125 b'INFO: Java 
> ExpansionService started on localhost:35729'
> 15:27:21 INFO     
> apache_beam.utils.subprocess_server:subprocess_server.py:125 b'May 03, 2022 
> 1:27:21 PM org.apache.beam.runners.jobsubmission.JobServerDriver 
> createJobServer'
> 15:27:21 INFO     
> apache_beam.utils.subprocess_server:subprocess_server.py:125 b'INFO: 
> JobService started on localhost:56343'
> 15:27:21 INFO     
> apache_beam.utils.subprocess_server:subprocess_server.py:125 b'May 03, 2022 
> 1:27:21 PM org.apache.beam.runners.jobsubmission.JobServerDriver run'
> 15:27:21 INFO     
> apache_beam.utils.subprocess_server:subprocess_server.py:125 b'INFO: Job 
> server now running, terminate with Ctrl+C'
> 15:27:21 FATAL: command execution failed
> 15:27:21 java.io.IOException: Backing channel 'apache-beam-jenkins-10' is 
> disconnected.
> 15:27:21     at 
> hudson.remoting.RemoteInvocationHandler.channelOrFail(RemoteInvocationHandler.java:216)
> ...
> 4318 FATAL: command execution failed
> 4319 java.io.IOException: Backing channel 'apache-beam-jenkins-10' is 
> disconnected.
> 4320   at 
> hudson.remoting.RemoteInvocationHandler.channelOrFail(RemoteInvocationHandler.java:216)
> 4321   at 
> hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:286)
>  {noformat}
> Perhaps this was a random crash, or the worker got overloaded. Other suites 
> running at the same time:
> beam_BiqQueryIO_Streaming_Performance_Test_Java #3729
> beam_LoadTests_Java_CoGBK_Dataflow_V2_Streaming_Java17 #134
> beam_LoadTests_Python_GBK_Dataflow_Batch #1060
> These also crashed, but at that moment they had already launched Dataflow 
> jobs and were only streaming log output. Only the beam_PostCommit_Python37 
> suite appeared to be running something intensive on the worker.
> Filing to see how frequently this happens.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
