Github user a-roberts commented on the issue:

    https://github.com/apache/spark/pull/14961
  
    [info] - using external shuffle service *** FAILED *** (1 minute)
    [info]   java.util.concurrent.TimeoutException: Can't find 2 executors before 60000 milliseconds elapsed
    
    60 seconds really is an eternity, and I can't reproduce this on my local
setup. I suspect a deadlock introduced by the upgrade, which would need some
proper debugging (again, if only I could reproduce it on my test systems, where
I have access to tools like gdb, healthcenter, and the servicing APIs we use
here). My systems have between two and eight cores, and I know this farm has a
lot more available, so it could be that having more cores increases the chance
of thread contention.
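
If the hang can only be reproduced on the Jenkins farm, one option is to have the JVM report on itself when the wait times out. The following is a minimal Java sketch, not Spark's code: the class `TimeoutDiagnostics` and the methods `waitFor` and `threadReport` are hypothetical names, and only the `java.lang.management` calls are standard JDK API. It only sees threads in the JVM that runs it, so it would not show what separate executor processes are doing.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.function.BooleanSupplier;

// Hypothetical helper, for illustration only; not part of Spark.
public class TimeoutDiagnostics {

    /** Polls the condition until it holds or timeoutMs elapses; returns whether it held. */
    public static boolean waitFor(BooleanSupplier condition, long timeoutMs, long pollMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (System.nanoTime() - deadline < 0) {
            if (condition.getAsBoolean()) {
                return true;
            }
            try {
                Thread.sleep(pollMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return false;
            }
        }
        return condition.getAsBoolean(); // one last check at the deadline
    }

    /** Reports deadlocked threads if the JVM detects any, otherwise all live threads. */
    public static String threadReport() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        boolean monitors = mx.isObjectMonitorUsageSupported();
        boolean synchronizers = mx.isSynchronizerUsageSupported();

        long[] deadlocked = mx.findDeadlockedThreads(); // null when no deadlock is found
        ThreadInfo[] infos = (deadlocked != null)
                ? mx.getThreadInfo(deadlocked, monitors, synchronizers)
                : mx.dumpAllThreads(monitors, synchronizers);

        StringBuilder sb = new StringBuilder(deadlocked != null
                ? "Deadlocked threads:\n"
                : "No deadlock detected; all threads:\n");
        for (ThreadInfo info : infos) {
            if (info != null) {
                sb.append(info); // ThreadInfo.toString() prints only a limited number of frames
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Stand-in condition that never holds, to exercise the timeout path.
        boolean found = waitFor(() -> false, 200, 50);
        if (!found) {
            System.err.println(threadReport());
        }
    }
}
```

A report of "No deadlock detected" would not rule out contention or a lost wakeup; it only means the JVM found no cycle of threads blocked on each other's locks.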
    
    I had a look at other pull requests being tested, and this test typically
completes in a few seconds on a good run:

    using external shuffle service (3 seconds, 822 milliseconds)
    https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3258/consoleText

    using external shuffle service (4 seconds, 543 milliseconds)
    https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3233/consoleText

