zentol opened a new pull request #7062: [FLINK-10825][tests] Increase 
request-backoff for high-parallelism e2e test
URL: https://github.com/apache/flink/pull/7062
 
 
   ## What is the purpose of the change
   
   This PR stabilizes the high-parallelism iterations e2e test.
   
   When a task starts running, it requests data (partitions) from other tasks. 
If a request times out, it is retried with a backoff until the maximum 
backoff (`taskmanager.network.request-backoff.max`) is reached.
   Once the maximum is reached, a `PartitionNotFoundException` is thrown, which 
fails the job, as reported in the JIRA issue.
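
The retry behaviour described above can be sketched as follows. This is a minimal simulation, not Flink's implementation: it assumes the backoff doubles on each retry and is capped at the maximum, and the names `backoff_schedule`, `request_partition`, and `PartitionNotFound` are hypothetical stand-ins introduced only for illustration.

```python
class PartitionNotFound(Exception):
    """Hypothetical stand-in for Flink's PartitionNotFoundException."""


def backoff_schedule(initial_ms, max_ms):
    """Yield the wait before each retry: start at initial_ms, double each
    time (an assumption), and finish with one wait at the max_ms cap."""
    if initial_ms <= 0 or max_ms <= 0:
        raise ValueError("backoff values must be positive")
    current = initial_ms
    while current < max_ms:
        yield current
        current = min(current * 2, max_ms)
    yield max_ms  # last wait at the cap; a failure after this is fatal


def request_partition(is_available, initial_ms, max_ms):
    """Simulate a partition request.

    is_available(elapsed_ms) reports whether the producing task is
    deployed after elapsed_ms. Returns the elapsed time at which the
    request succeeded, or raises once the backoff is exhausted.
    """
    elapsed = 0
    if is_available(elapsed):
        return elapsed
    for wait in backoff_schedule(initial_ms, max_ms):
        elapsed += wait  # simulated sleep, no real waiting
        if is_available(elapsed):
            return elapsed
    raise PartitionNotFound(f"gave up after {elapsed} ms")
```

For example, a producer that becomes available after 1 second is found on the fourth retry, while one that needs 30 seconds is never found if the cap is 10 seconds.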
   
   If a job is not fully deployed within the time it takes a single task to 
reach the maximum backoff, this exception is quite likely to occur.
   
   This PR bumps the maximum backoff to 60 seconds, which should give the job 
more time to fully deploy.
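
To get a rough feel for how much extra deployment time the higher cap buys, the total retry window can be estimated as below. This is a back-of-the-envelope sketch under stated assumptions: the backoff is assumed to double on each retry, and the initial backoff of 100 ms and the previous maximum of 10000 ms are assumed values, not figures taken from this PR. The helper `retry_window_ms` is hypothetical.

```python
def retry_window_ms(initial_ms, max_ms):
    """Sum all waits of a doubling backoff that is capped at max_ms.

    The result approximates how long a task keeps retrying before it
    gives up, ignoring the duration of each individual request.
    """
    if initial_ms <= 0 or max_ms <= 0:
        raise ValueError("backoff values must be positive")
    total = 0
    current = initial_ms
    while current < max_ms:
        total += current
        current = min(current * 2, max_ms)
    return total + max_ms  # include the final wait at the cap


# Assumed values: 100 ms initial backoff, 10 s old cap, 60 s new cap.
old_window = retry_window_ms(100, 10_000)
new_window = retry_window_ms(100, 60_000)
print(f"old cap: {old_window / 1000:.1f} s, new cap: {new_window / 1000:.1f} s")
```

Under these assumed values the window grows from about 22.7 seconds to about 162.3 seconds, which is the extra slack the remaining tasks get to finish deploying.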
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
