jaruji opened a new issue, #37459:
URL: https://github.com/apache/airflow/issues/37459
### Official Helm Chart version
1.12.0 (latest released)
### Apache Airflow version
2.8.1
### Kubernetes Version
1.26.6
### Helm Chart configuration
```
config:
  webserver:
    expose_config: 'True'
  logging:
    remote_logging: 'True'
    remote_base_log_folder: wasb-airflow/logs
    remote_wasb_log_container: airflow
    remote_log_conn_id: wasb_default
images:
  airflow:
    # define custom airflow image here (with PyPI packages installed)
    repository: org.azurecr.io/internal-airflow
    # CHANGE THIS when updating
    tag: "0.2.0"
executor: KubernetesExecutor
fernetKeySecretName: airflow-fernet-secret
webserverSecretKeySecretName: airflow-webserver-secret
createUserJob:
  useHelmHooks: false
  applyCustomEnv: false
migrateDatabaseJob:
  enabled: true
  useHelmHooks: false
  applyCustomEnv: false
  jobAnnotations:
    "argocd.argoproj.io/hook": Sync
useStandardNaming: true
dags:
  gitSync:
    enabled: true
    repo: [email protected]:ORG/custom-airflow.git
    branch: master
    subPath: "dags"
    sshKeySecret: airflow-ssh-secret
ingress:
  web:
    enabled: true
    annotations:
      cert-manager.io/cluster-issuer: "letsencrypt"
    # The path for the web Ingress
    path: "/"
    # The pathType for the above path (used only with Kubernetes v1.19 and above)
    pathType: "ImplementationSpecific"
    # The hostnames or hosts configuration for the web Ingress
    # Set in the Argo CD application YAML
    hosts: []
    # # The hostname for the web Ingress (can be templated)
    # - name: ""
    #   # configs for web Ingress TLS
    #   tls:
    #     # Enable TLS termination for the web Ingress
    #     enabled: false
    #     # the name of a pre-created Secret containing a TLS private key and certificate
    #     secretName: ""
    # The Ingress Class for the web Ingress (used only with Kubernetes v1.19 and above)
    ingressClassName: "nginx"
```
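If it helps with triage, the `[logging]` values under `config:` can be sanity-checked from inside a pod. Below is a minimal sketch (just my own check, using the option names from the values file above); run it inside a webserver or worker pod, e.g. via `kubectl exec`:
```
# Minimal sanity-check sketch: confirm the chart's `config:` section
# actually reached the pods as Airflow configuration.
from airflow.configuration import conf

print(conf.getboolean("logging", "remote_logging"))      # expect: True
print(conf.get("logging", "remote_base_log_folder"))     # expect: wasb-airflow/logs
print(conf.get("logging", "remote_wasb_log_container"))  # expect: airflow
print(conf.get("logging", "remote_log_conn_id"))         # expect: wasb_default
```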
### Docker Image customizations
```
# Use the specified Apache Airflow image as a base
FROM apache/airflow:2.8.1

# Install dependencies required for building pymssql
USER root
RUN apt-get update && apt-get install -y \
    freetds-dev \
    build-essential \
    && rm -rf /var/lib/apt/lists/*

# Install system libraries (browser dependencies for `playwright install` below)
RUN apt-get update -y \
    && apt-get install -y \
    libglib2.0-0 \
    libnss3 \
    libnspr4 \
    libdbus-1-3 \
    libatk1.0-0 \
    libatk-bridge2.0-0 \
    libcups2 \
    libdrm2 \
    libxkbcommon0 \
    libatspi2.0-0 \
    libxcomposite1 \
    libxdamage1 \
    libxext6 \
    libxfixes3 \
    libxrandr2 \
    libgbm1 \
    libpango-1.0-0 \
    libcairo2 \
    libasound2 \
    && rm -rf /var/lib/apt/lists/*

# Copy the requirements file into the container
COPY requirements.txt /
COPY .env /

# Switch back to the airflow user
USER airflow

# Install the requirements, including Apache Airflow
RUN pip install --no-cache-dir "apache-airflow==${AIRFLOW_VERSION}" -r /requirements.txt
RUN pip install python-dotenv

# Install the Azure provider for Airflow, needed for remote logging to Azure Blob Storage
RUN pip install apache-airflow-providers-microsoft-azure
RUN playwright install
```
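To rule out a missing provider, the Azure provider and its log handler can be import-checked in the built image; a quick sketch (run with the image's Python interpreter):
```
# Quick check that the Azure provider and its WASB log handler are present
# in the custom image (e.g. `docker run --rm org.azurecr.io/internal-airflow:0.2.0 python ...`).
from airflow.providers.microsoft.azure.hooks.wasb import WasbHook
from airflow.providers.microsoft.azure.log.wasb_task_handler import WasbTaskHandler

print(WasbHook.__name__, WasbTaskHandler.__name__)  # both should import without error
```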
### What happened
When I define the connection manually in the webserver UI (I add a `wasb` connection authenticated with the Azure Blob Storage connection string), DAG runs always fail to upload their logs remotely, reporting that the provided container does not exist. The error I get:
```
[2024-02-15T17:26:13.076+0000] {wasb_task_handler.py:238} ERROR - Could not write logs to wasb-airflow/logs/dag_id=internal_dag/run_id=manual__2024-02-15T17:25:55.528727+00:00/task_id=read_product_feed/attempt=1.log
Traceback (most recent call last):
  File "/home/airflow/.local/lib/python3.8/site-packages/airflow/providers/microsoft/azure/log/wasb_task_handler.py", line 236, in wasb_write
    self.hook.load_string(log, self.wasb_container, remote_log_location, overwrite=True)
  File "/home/airflow/.local/lib/python3.8/site-packages/airflow/providers/microsoft/azure/hooks/wasb.py", line 373, in load_string
    self.upload(
  File "/home/airflow/.local/lib/python3.8/site-packages/airflow/providers/microsoft/azure/hooks/wasb.py", line 431, in upload
    return blob_client.upload_blob(data, blob_type, length=length, **kwargs)
  File "/home/airflow/.local/lib/python3.8/site-packages/azure/core/tracing/decorator.py", line 78, in wrapper_use_tracer
    return func(*args, **kwargs)
  File "/home/airflow/.local/lib/python3.8/site-packages/azure/storage/blob/_blob_client.py", line 765, in upload_blob
    return upload_block_blob(**options)
  File "/home/airflow/.local/lib/python3.8/site-packages/azure/storage/blob/_upload_helpers.py", line 195, in upload_block_blob
    process_storage_error(error)
  File "/home/airflow/.local/lib/python3.8/site-packages/azure/storage/blob/_shared/response_handlers.py", line 184, in process_storage_error
    exec("raise error from None")  # pylint: disable=exec-used  # nosec
  File "<string>", line 1, in <module>
  File "/home/airflow/.local/lib/python3.8/site-packages/azure/storage/blob/_upload_helpers.py", line 105, in upload_block_blob
    response = client.upload(
  File "/home/airflow/.local/lib/python3.8/site-packages/azure/core/tracing/decorator.py", line 78, in wrapper_use_tracer
    return func(*args, **kwargs)
  File "/home/airflow/.local/lib/python3.8/site-packages/azure/storage/blob/_generated/operations/_block_blob_operations.py", line 864, in upload
    map_error(status_code=response.status_code, response=response, error_map=error_map)
  File "/home/airflow/.local/lib/python3.8/site-packages/azure/core/exceptions.py", line 164, in map_error
    raise error
azure.core.exceptions.ResourceNotFoundError: The specified container does not exist.
RequestId:b1e6ba42-b01e-005c-1f34-60c86e000000
Time:2024-02-15T17:26:13.0720393Z
ErrorCode:ContainerNotFound
Content: <?xml version="1.0" encoding="utf-8"?><Error><Code>ContainerNotFound</Code><Message>The specified container does not exist.
RequestId:b1e6ba42-b01e-005c-1f34-60c86e000000
Time:2024-02-15T17:26:13.0720393Z</Message></Error>
```
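The failing call can also be reproduced outside the log handler. A minimal repro sketch, assuming the `wasb_default` connection from my setup (the blob name is a placeholder); run inside a task pod:
```
# Repro sketch: make the same call as wasb_task_handler.py line 236 directly,
# to separate the connection/container question from the log handler itself.
from airflow.providers.microsoft.azure.hooks.wasb import WasbHook

hook = WasbHook(wasb_conn_id="wasb_default")
hook.load_string(
    "connectivity test",
    container_name="airflow",                # remote_wasb_log_container from my values
    blob_name="logs/connectivity-test.log",  # placeholder blob name
    overwrite=True,
)
```
If this raises the same `ResourceNotFoundError`, the handler itself is not at fault and the problem sits in the connection or container resolution.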
### What you think should happen instead
The logs should be uploaded to the configured location in Blob Storage using the configured Azure Blob connection.
### How to reproduce
Deploy Airflow to a Kubernetes cluster using the official Helm chart, configure remote logging to Azure Blob Storage as above, and use the Kubernetes executor. I authenticate with the Azure Blob Storage connection string.
### Anything else
This problem occurs every time a log upload is initiated. I have checked multiple times that the `airflow` container exists in the storage account, and it does. It's also possible that I'm overlooking something obvious. I was following the docs at:
https://airflow.apache.org/docs/apache-airflow-providers-microsoft-azure/stable/logging/index.html
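The container check can also be done outside Airflow with the plain Azure SDK; a sketch (the connection string is a placeholder for the one used by `wasb_default`):
```
# Sketch: list containers with the raw Azure SDK, bypassing Airflow entirely,
# using the same connection string as the wasb_default connection.
from azure.storage.blob import BlobServiceClient

conn_str = "<storage-account-connection-string>"  # placeholder
client = BlobServiceClient.from_connection_string(conn_str)
print([c.name for c in client.list_containers()])  # "airflow" should be listed
```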
### Are you willing to submit PR?
- [ ] Yes I am willing to submit a PR!
### Code of Conduct
- [X] I agree to follow this project's [Code of
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)