snjypl commented on issue #21087:
URL: https://github.com/apache/airflow/issues/21087#issuecomment-1121052643

   @ecerulm  
   as i understand there are two aspects to this issue:
   
   **1.) preventing 410 error from happening:**
   
   To prevent it we need to use the BOOKMARK feature. it was specifically 
introduced for this purpose.  the prevention should be done by `kubernetes 
python client` in the  stream loop  [here](
   
https://github.com/kubernetes-client/python/blob/34ed13f9c729eea0b7581887ae0a1b0018664c5a/kubernetes/base/watch/watch.py#L162
   )
   now because of a bug in the kubernetes python client, the BOOKMARK events 
were not used correctly. the following pull requests seeks to fix it. 
https://github.com/kubernetes-client/python/pull/1796
   
   it is because of this bug in kubernetes python client that you were getting 
410 error even with `allow_watch_bookmark` enabled. 
   
   
   **2.) handling the error once it happens.** 
   
   now with or without `allow_watch_bookmarks` enabled , in case 410 error 
occurs the client is expected to restart the watch with resource_version ="0" 
or None [ None is preferred unless you have a need to trace from the beginning] 
   i understand your pull request seeks to correctly address the `handling 410 
part` of the issue. 
   
   my view is, there is no need airflow to keep track of the 
last_resource_version or even have ResoureVersion singleton. it is not airflow 
job to do the actual 'watching' of resource. 
   
   if Kubernetes client throws 410 error for a particular resourceVersion, then 
even if you restart the watch process with that resourceVersion you are bound 
to get the same error. 
   
   i have opened a WIP pull request 
https://github.com/kubernetes-client/python/pull/1796  for a better discussion 
of this issue. 
   
   @ecerulm  
   
   i went through your PR,  and i can see that you are addressing the 'handle 
410' part of the issue. 
   my suggestion would be that, we can catch the `ApiException` and call `_run` 
with resource_version="0" if `e.status == 410`. 
   @cansjt  has raised the concern about relying on the status for this 
particular error, but i believe it is safe, 410 is raised only for this 
particular error. 
   
   
   
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to