Re: [I] KubernetesPodOperator duplicating logs when interrupted [airflow]
eladkal closed issue #39236: KubernetesPodOperator duplicating logs when interrupted URL: https://github.com/apache/airflow/issues/39236 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [I] KubernetesPodOperator duplicating logs when interrupted [airflow]
fdemiane commented on issue #39236: URL: https://github.com/apache/airflow/issues/39236#issuecomment-2132400038 I opened a pull request, but I am not really sure if this is the correct way to go, as this is a rare occurrence, and logs might get polluted (space consumed is minimal, but still). What do you think? (CC: @eladkal)
Re: [I] KubernetesPodOperator duplicating logs when interrupted [airflow]
fdemiane commented on issue #39236: URL: https://github.com/apache/airflow/issues/39236#issuecomment-2132387258

If we look at the logs, the duplicated lines all fall within one second. If we look at the code [here](https://github.com/apache/airflow/blob/providers-cncf-kubernetes/7.13.0/airflow/providers/cncf/kubernetes/utils/pod_manager.py#L424), we see that `read_pod_logs` takes `since_seconds`, which is expressed in whole seconds, and passes it to [_client.read_namespaced_pod_log](https://github.com/apache/airflow/blob/providers-cncf-kubernetes/7.13.0/airflow/providers/cncf/kubernetes/utils/pod_manager.py#L645) (docs [here](https://github.com/kubernetes-client/python/blob/master/kubernetes/docs/CoreV1Api.md#read_namespaced_pod_log)), which does not support a finer-grained time representation. Looking at the [Kubernetes API reference](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.30/), it does not seem to support a finer-grained time representation either.

`kubectl` seems to support passing a `since_time` (the `--since-time` option), which takes a timestamp that supports milliseconds, as seen [here](https://kubernetes.io/docs/reference/kubectl/generated/kubectl_logs/#options). Doing a little search, I found this old issue [here](https://github.com/kubernetes-client/python/issues/1351).

The **optimal** fix for this issue is to provide a way to pass a `since_time` in the Kubernetes client (out of scope for Airflow), then make the necessary code changes in the KPO.

A **quick win** would be to add a warning message that logs within one second might get duplicated (maybe [here](airflow/providers/cncf/kubernetes/utils/pod_manager.py)?).
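To make the one-second granularity described in the comment above concrete, here is a minimal sketch in plain Python. It does not call the Kubernetes client and is not Airflow's implementation: the helper names `since_seconds_for_resume` and `drop_already_seen`, the sample log lines, and the timestamps are all hypothetical. It assumes each log line carries its own sub-second timestamp, and it shows how a resumed read with a whole-second window returns lines that were already seen, and how filtering on the last seen timestamp could drop them.

```python
import math
from datetime import datetime, timedelta, timezone


def since_seconds_for_resume(last_seen: datetime, now: datetime) -> int:
    """Smallest whole-second window that still covers everything after last_seen."""
    return max(1, math.ceil((now - last_seen).total_seconds()))


def drop_already_seen(lines, last_seen):
    """Keep only (timestamp, message) pairs strictly newer than last_seen."""
    return [(ts, msg) for ts, msg in lines if ts > last_seen]


# Hypothetical log lines, all emitted within the same wall-clock second.
t0 = datetime(2024, 5, 27, 12, 0, 0, tzinfo=timezone.utc)
pod_log = [
    (t0 + timedelta(milliseconds=100), "step 1"),
    (t0 + timedelta(milliseconds=400), "step 2"),
    (t0 + timedelta(milliseconds=900), "step 3"),
]

# Assume the first read was interrupted right after "step 2".
last_seen = pod_log[1][0]
now = t0 + timedelta(milliseconds=950)

# The resume window cannot be shorter than one second.
window = since_seconds_for_resume(last_seen, now)

# Simulate a resumed read: everything from the last `window` seconds comes back,
# including "step 1" and "step 2", which were already written to the task log.
refetched = [(ts, msg) for ts, msg in pod_log if ts >= now - timedelta(seconds=window)]

# Filtering on the last seen timestamp leaves only the genuinely new line.
new_only = drop_already_seen(refetched, last_seen)
```

One design caveat of this sketch: the strict `>` comparison would also drop a distinct line that happens to share the exact timestamp of the last seen line, so it trades possible duplicates for possible (rare) omissions.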
Re: [I] KubernetesPodOperator duplicating logs when interrupted [airflow]
eladkal commented on issue #39236: URL: https://github.com/apache/airflow/issues/39236#issuecomment-2132123404 Some work around this was done in https://github.com/apache/airflow/issues/33498. cc @fdemiane, maybe you will have time to take a look?
Re: [I] KubernetesPodOperator duplicating logs when interrupted [airflow]
gbonazzoli commented on issue #39236: URL: https://github.com/apache/airflow/issues/39236#issuecomment-2080427612

@raphaelauv with version 8.1.1 the problem is still present. It seems that it now always gets "_Pod docker-java-w2ade41b log read interrupted but container base still running_".

Airflow's version:

```bash
airflow@airflow-test-worker-6cb8744f69-sw7xg:/opt/airflow$ airflow version
2.9.0
airflow@airflow-test-worker-6cb8744f69-sw7xg:/opt/airflow$ pip list | grep kub
apache-airflow-providers-cncf-kubernetes 8.1.1
kubernetes                               29.0.0
kubernetes_asyncio                       29.0.0
```
Re: [I] KubernetesPodOperator duplicating logs when interrupted [airflow]
tirkarthi commented on issue #39236: URL: https://github.com/apache/airflow/issues/39236#issuecomment-2075830089 Related https://github.com/apache/airflow/issues/33498
Re: [I] KubernetesPodOperator duplicating logs when interrupted [airflow]
raphaelauv commented on issue #39236: URL: https://github.com/apache/airflow/issues/39236#issuecomment-2075176634 Could you try the latest version, 8.1.1, of `apache-airflow-providers-cncf-kubernetes`?
Re: [I] KubernetesPodOperator duplicating logs when interrupted [airflow]
boring-cyborg[bot] commented on issue #39236: URL: https://github.com/apache/airflow/issues/39236#issuecomment-2075062319 Thanks for opening your first issue here! Be sure to follow the issue template! If you are willing to raise PR to address this issue please do so, no need to wait for approval.
[I] KubernetesPodOperator duplicating logs when interrupted [airflow]
Nikita-Sobolev opened a new issue, #39236: URL: https://github.com/apache/airflow/issues/39236

### Apache Airflow version

Other Airflow 2 version (please specify below)

### If "Other Airflow 2 version" selected, which one?

2.8.1

### What happened?

The KubernetesPodOperator writes a task's logs twice when the log read is interrupted (`log read interrupted but container base still running`). This happens randomly on different DAGs and on different runs of the same DAG. I assume it is somehow connected to https://github.com/apache/airflow/issues/35019

### What you think should happen instead?

Logs should not be duplicated.

### How to reproduce

Run a KubernetesPodOperator on a cloud AKS cluster.

### Operating System

Ubuntu 22.04

### Versions of Apache Airflow Providers

apache-airflow==2.8.1
apache-airflow-providers-amazon==8.16.0
apache-airflow-providers-celery==3.5.1
apache-airflow-providers-cncf-kubernetes==7.13.0
apache-airflow-providers-common-io==1.2.0
apache-airflow-providers-common-sql==1.10.0
apache-airflow-providers-docker==3.9.1
apache-airflow-providers-elasticsearch==5.3.1
apache-airflow-providers-ftp==3.7.0
apache-airflow-providers-google==10.13.1
apache-airflow-providers-grpc==3.4.1
apache-airflow-providers-hashicorp==3.6.1
apache-airflow-providers-http==4.8.0
apache-airflow-providers-imap==3.5.0
apache-airflow-providers-microsoft-azure==8.5.1
apache-airflow-providers-mysql==5.5.1
apache-airflow-providers-odbc==4.4.0
apache-airflow-providers-openlineage==1.4.0
apache-airflow-providers-postgres==5.10.0
apache-airflow-providers-redis==3.6.0
apache-airflow-providers-sendgrid==3.4.0
apache-airflow-providers-sftp==4.8.1
apache-airflow-providers-slack==8.5.1
apache-airflow-providers-snowflake==5.2.1
apache-airflow-providers-sqlite==3.7.0
apache-airflow-providers-ssh==3.10.0
google-cloud-orchestration-airflow==1.10.0

### Deployment

Official Apache Airflow Helm Chart

### Deployment details

_No response_

### Anything else?

![Untitled](https://github.com/apache/airflow/assets/59029283/f558cb03-75c0-4b25-ac8a-ff9f2945ece5)

### Are you willing to submit PR?

- [ ] Yes I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)