Github user GenTang commented on the pull request:
https://github.com/apache/spark/pull/3986#issuecomment-69473316
However, I ran into a really strange error a moment ago.
I launched a cluster containing 1 master and 1 slave with the script.
add_tag on the master succeeded after two tries, and add_tag on the slave
succeeded without any error. However, EC2 then threw an
`InvalidInstanceID.NotFound` error for the slave node at:
```
for i in cluster_instances:
    i.update()
```
in the wait_for_cluster_state function. It seems the instance information
had not yet propagated far enough for the update action, even though it had
propagated far enough for the add_tag action to succeed. I tried several
times and it happened only once; I am not sure why. Since
wait_for_cluster_state is used by the `launch` and `start` actions (which
need more than a minute to reach the `ssh-ready` state) and by `destroy`
(which takes about 1 second to reach the `terminated` state), a workaround
might be to wait a bit longer before each update attempt by making the
following change:
```
while True:
    time.sleep(5 * num_attempts + 1)
```
at line 724
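A more targeted alternative to a blanket longer sleep might be to retry the update call itself while EC2 reports the instance as not found, since the error is transient eventual-consistency behavior. This is only a sketch, not the actual spark-ec2 code: the helper name `update_with_retry` and the `is_not_found` predicate are hypothetical, and the caller would supply a check matching boto's `InvalidInstanceID.NotFound` error code.

```python
import time


def update_with_retry(instance, is_not_found, max_attempts=5, base_wait=5):
    """Retry instance.update() while EC2 has not yet propagated the
    instance metadata (eventual consistency after RunInstances).

    `is_not_found` decides whether an exception is the transient
    InvalidInstanceID.NotFound error; anything else is re-raised
    immediately, as is the last failure once attempts run out.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return instance.update()
        except Exception as e:
            if not is_not_found(e) or attempt == max_attempts:
                raise
            # Same linearly growing wait as the change suggested above.
            time.sleep(base_wait * attempt + 1)
```

This keeps the fast path fast (`destroy` reaches `terminated` in about a second, so the first attempt usually succeeds) while still tolerating the rare slow propagation seen during `launch`.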