Github user dragos commented on the pull request:

    https://github.com/apache/spark/pull/10993#issuecomment-179251806
  
    I'm having trouble running this with dynamic allocation. Did you test it 
in that scenario?
    
    I'm seeing disconnects from the driver, leading to errors like this:
    
    ```
    6/02/03 15:03:29 WARN TaskSetManager: Lost task 3.2 in stage 4.0 (TID 4015, 
10.0.1.205): java.io.FileNotFoundException: 
/tmp/blockmgr-f008b463-1d87-406b-b879-bae73c915907/27/shuffle_2_3_0.data.607ce66e-b528-4fc8-97e2-5028fc7b8e99
 (No such file or directory)
    ```
    
    In the Shuffle Service logs I see:
    
    ```
    16/02/03 14:58:32 DEBUG MesosExternalShuffleBlockHandler: Received 
registration request from app 1521e408-d8fe-416d-898b-3801e73a8293-0119 (remote 
address /10.0.1.47:52808).
    16/02/03 14:58:34 INFO ExternalShuffleBlockResolver: Registered executor 
AppExecId{appId=1521e408-d8fe-416d-898b-3801e73a8293-0119, execId=4} with 
ExecutorShuffleInfo{localDirs=[/tmp/blockmgr-248a584a-89b7-461a-8d8d-3363bd0f1a1b],
 subDirsPerLocalDir=64, shuffleManager=sort}
    16/02/03 14:58:34 WARN MesosExternalShuffleBlockHandler: Unknown 
/10.0.1.208:42483 disconnected.
    16/02/03 14:58:43 INFO ExternalShuffleBlockResolver: Registered executor 
AppExecId{appId=1521e408-d8fe-416d-898b-3801e73a8293-0119, execId=2} with 
ExecutorShuffleInfo{localDirs=[/tmp/blockmgr-d9865194-5c38-46ae-bce7-de5605cbb4f6],
 subDirsPerLocalDir=64, shuffleManager=sort}
    16/02/03 14:58:43 WARN MesosExternalShuffleBlockHandler: Unknown 
/10.0.1.208:42498 disconnected.
    16/02/03 14:58:43 INFO ExternalShuffleBlockResolver: Registered executor 
AppExecId{appId=1521e408-d8fe-416d-898b-3801e73a8293-0119, execId=0} with 
ExecutorShuffleInfo{localDirs=[/tmp/blockmgr-b8350cfd-fa2e-4a29-92c2-a88f1bec17ca],
 subDirsPerLocalDir=64, shuffleManager=sort}
    16/02/03 14:58:43 WARN MesosExternalShuffleBlockHandler: Unknown 
/10.0.1.208:42499 disconnected.
    16/02/03 14:59:20 WARN MesosExternalShuffleBlockHandler: Unknown 
/10.0.1.208:42509 disconnected.
    16/02/03 14:59:20 WARN MesosExternalShuffleBlockHandler: Unknown 
/10.0.1.205:35465 disconnected.
    16/02/03 14:59:20 WARN MesosExternalShuffleBlockHandler: Unknown 
/10.0.1.205:35462 disconnected.
    16/02/03 15:00:09 INFO ExternalShuffleBlockResolver: Registered executor 
AppExecId{appId=1521e408-d8fe-416d-898b-3801e73a8293-0119, execId=7} with 
ExecutorShuffleInfo{localDirs=[/tmp/blockmgr-19a734ac-496a-4b7d-b304-acf16f4b5a78],
 subDirsPerLocalDir=64, shuffleManager=sort}
    16/02/03 15:00:09 WARN MesosExternalShuffleBlockHandler: Unknown 
/10.0.1.208:42522 disconnected.
    16/02/03 15:00:32 INFO MesosExternalShuffleBlockHandler: Application 
1521e408-d8fe-416d-898b-3801e73a8293-0119 disconnected (address was 
/10.0.1.47:52808).
    16/02/03 15:00:32 INFO ExternalShuffleBlockResolver: Application 
1521e408-d8fe-416d-898b-3801e73a8293-0119 removed, cleanupLocalDirs = true
    16/02/03 15:00:32 INFO ExternalShuffleBlockResolver: Cleaning up executor 
AppExecId{appId=1521e408-d8fe-416d-898b-3801e73a8293-0119, execId=4}'s 1 local 
dirs
    16/02/03 15:00:32 INFO ExternalShuffleBlockResolver: Cleaning up executor 
AppExecId{appId=1521e408-d8fe-416d-898b-3801e73a8293-0119, execId=2}'s 1 local 
dirs
    16/02/03 15:00:32 INFO ExternalShuffleBlockResolver: Cleaning up executor 
AppExecId{appId=1521e408-d8fe-416d-898b-3801e73a8293-0119, execId=0}'s 1 local 
dirs
    16/02/03 15:00:32 INFO ExternalShuffleBlockResolver: Cleaning up executor 
AppExecId{appId=1521e408-d8fe-416d-898b-3801e73a8293-0119, execId=7}'s 1 local 
dirs
    ```
    
    I am not sure if it's related to this PR.


