Github user sachingoel0101 commented on the pull request:

    https://github.com/apache/flink/pull/1214#issuecomment-145972139
  
    I agree. We need proper handling of eager execution, and a clear statement 
of why exactly it shouldn't be allowed. "The client exits" is not a very good 
explanation from the user's viewpoint, and it might end up breaking programs.
    Further, the `collect` call is used in other API methods as well, for 
example in Gelly. Detached mode in the current version will cause all such 
code to fail.
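    
    To make the concern concrete, here is a minimal toy sketch in Python. It is 
not Flink code: `ToyClient`, `DetachedModeError` and `program` are hypothetical 
names made up for illustration, and the sketch assumes that an eager call needs 
the client to stay attached in order to receive its result.

```python
# Toy model only: these classes are hypothetical and are not Flink APIs.

class DetachedModeError(RuntimeError):
    """Raised when a program needs a result after the client has gone away."""


class ToyClient:
    def __init__(self, detached):
        self.detached = detached

    def execute(self, job):
        # Lazy submission: the job runs, but in detached mode nothing is
        # shipped back to the caller.
        result = job()
        return None if self.detached else result

    def collect(self, job):
        # Eager call: the caller needs the result right away, which cannot
        # work if the client is assumed to have exited after submission.
        if self.detached:
            raise DetachedModeError(
                "collect() needs an attached client to receive the result"
            )
        return job()


def program(client):
    # A user program that calls collect() indirectly, the way a library
    # method might, and then uses the result to build the next job.
    count = client.collect(lambda: len([1, 2, 3]))
    return client.execute(lambda: count * 2)
```

    In this model, `program(ToyClient(detached=False))` runs to completion, 
while the same program with `detached=True` fails at the internal `collect` 
call, even though the user never called it directly.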
    
    One question: What exactly is the significance of a detached submission and 
what does the user expect when they submit the job in detached mode? From what 
I understand, it should mean that the user should be able to shut down the 
machine on which the client is present. They should be able to monitor the 
output for however long they want, then stop and resume at a later point. I'm 
drawing the analogy from how the `screen` program works on Linux systems. 
    Based on this understanding, the following organization would make sense:
                                          +----------------------------------+
                                          |  ExecutorClient (invokes main)   |
                                          |  [Output available on web        |
    User (command line / web client)  ->  |   if detached mode. Otherwise    |
    [Actor system receives updates        |   sent to user client]           |
    if not in detached mode, otherwise    |  [Compute job graph and          |
    shuts down after submitting]          |   submit to Job Manager]         |  ->  Task Managers
                                          |                                  |
                                          |           Job Manager            |
                                          +----------------------------------+
                                                Job Manager actor system
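    
    The output routing in this organization can be sketched as a toy model in 
Python. Again, none of this is Flink code: `ToyExecutorClient`, `submit`, `view` 
and `main_program` are hypothetical names, and the sketch only assumes what is 
described above, namely that main() runs next to the job manager and that 
output is retained for later viewing in detached mode, or sent back to the 
user's client otherwise.

```python
# Toy model only: hypothetical names, not Flink classes or APIs.

class ToyExecutorClient:
    """Stands in for the proposed ExecutorClient living next to the job manager."""

    def __init__(self):
        # job id -> output lines retained for later viewing (the "web" side)
        self.retained_output = {}

    def submit(self, job_id, main, detached, user_sink=None):
        # main() is invoked here, not on the user's machine.
        lines = list(main())
        if detached:
            # The user's client has already shut down: keep output for later.
            self.retained_output[job_id] = lines
        else:
            # Attached: send every line back to the user's client.
            for line in lines:
                user_sink(line)

    def view(self, job_id):
        # Like re-attaching to a `screen` session: read whatever was retained.
        return list(self.retained_output.get(job_id, []))


def main_program():
    # Hypothetical user program producing two lines of output.
    yield "step 1 done"
    yield "step 2 done"
```

    With `detached=True` the user can call `view` at any later time, which is 
the `screen`-like behaviour I have in mind. The sketch does not model a failure 
of the process hosting the executor.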
    
    However, if the executor program is in the job manager actor system, what 
happens in case of a failure? Can we recover the state of the main thread, its 
memory space etc. on a different leader?
    
    I hope that added something to a discussion that is now bound to happen on 
detached mode. :)



