potiuk commented on issue #24466:
URL: https://github.com/apache/airflow/issues/24466#issuecomment-1317083295

   > @potiuk Thanks for your response.
   > 
   > The weird thing about the described issue was that the airflow task (and 
DAG) did not terminate once the process was killed on the GCP machine. 
Shouldn't the task also be terminated in case the SSH connection gets closed?
   
   It depends. TCP connections work in such a way that if the client does not 
send anything on the connection, a firewall closing the connection might leave 
the client unaware that the connection has been broken. This is one reason why 
keep-alive is needed: it makes sure such connections get detected and closed. 
The state machine for TCP connections and the packets sent to close/shut down 
connections are pretty complex, and there are mechanisms in place which 
eventually shut down such open connections after kernel-configured timeouts, 
but if you are not sending data over TCP to "ping" the other side, there are 
scenarios where either side might not realise that the connection has been 
closed. 
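   
   As a minimal sketch of what "keep-alive" means at the socket level, the snippet below enables TCP keep-alive probes with Python's standard `socket` module. The helper name `enable_keepalive` and the timing values are illustrative assumptions, not something taken from Airflow or from this issue, and the fine-grained tuning options are only available on some platforms (hence the `hasattr` guards).

```python
import socket


def enable_keepalive(sock, idle=60, interval=10, count=5):
    """Turn on TCP keep-alive probes for an existing socket.

    `idle`, `interval` and `count` are illustrative defaults (assumptions).
    """
    # Ask the kernel to send probes on an otherwise idle connection.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    # Per-socket tuning is platform-specific, so apply it only where the
    # constants are defined (for example on Linux).
    if hasattr(socket, "TCP_KEEPIDLE"):
        # Seconds of idleness before the first probe is sent.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)
    if hasattr(socket, "TCP_KEEPINTVL"):
        # Seconds between successive probes.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)
    if hasattr(socket, "TCP_KEEPCNT"):
        # Unanswered probes before the kernel declares the connection dead.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, count)
    return sock


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    enable_keepalive(s)
    print(s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0)
    s.close()
```

   Without such probes, an idle socket relies on the kernel's default timeouts, which are typically much longer than a firewall's idle timeout.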
   
   The thing is, a TCP connection is not a physical "link" that can be broken, 
as you might imagine it. It's just an agreement between client and server that 
if a packet is sent over the network and the source/destination addresses and 
port numbers match, then the kernel routes that packet to the right process 
that "keeps" the right socket open. But if, suddenly, someone in between starts 
dropping all the packets and there is no keep-alive, neither of the parties 
might realise that the link has been broken. So it is really a question of 
"how" the firewall breaks the connection. If it signals both ends that the 
connection has been broken (by sending the TCP shutdown/close packet sequence 
to both parties), then they will get a "broken pipe" error. But if the firewall 
simply stops forwarding packets, then you get a "hanging connection".
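    
   To illustrate the difference between the two cases, here is a small sketch using only Python's standard library. The helper `probe` is hypothetical and the labels it returns are my own: a peer that closes the connection properly is visible to the reader as end-of-stream (or as an error), whereas a peer that has simply gone quiet looks exactly like a connection whose packets are being dropped, so the only thing the reader can observe is a timeout.

```python
import socket


def probe(sock, timeout=0.2):
    """Report what a reader can observe on `sock` within `timeout` seconds."""
    sock.settimeout(timeout)
    try:
        # MSG_PEEK looks at pending data without consuming it.
        data = sock.recv(1, socket.MSG_PEEK)
    except socket.timeout:
        # Nothing arrived: the peer may be idle, or packets may be dropped
        # somewhere in between. The reader cannot tell these apart.
        return "silent"
    except OSError:
        # The connection was torn down with an error (e.g. a reset).
        return "closed"
    # An empty read means the peer closed the connection in an orderly way.
    return "data" if data else "closed"


if __name__ == "__main__":
    a, b = socket.socketpair()
    print(probe(a))  # peer is open but sends nothing
    b.close()
    print(probe(a))  # peer has closed its end
    a.close()
```

   A timeout alone never proves the connection is dead, which is why keep-alive probes or an application-level "ping" are needed to force the question.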
    
   > Either way, I just re-structured my code, running the task in GCP in a 
background process and using sensors to check for the termination criteria. 
Felt bad to keep the SSH connection open for the whole time in the first place.
   
   Good idea. 
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

