Coqueiro opened a new issue #13076:
URL: https://github.com/apache/airflow/issues/13076
**Apache Airflow version**: 1.10.12 with Python 3.7
**Kubernetes version**:
```
Client Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.5",
GitCommit:"e6503f8d8f769ace2f338794c914a96fc335df0f", GitTreeState:"clean",
BuildDate:"2020-06-27T00:38:11Z", GoVersion:"go1.14.4", Compiler:"gc",
Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.8",
GitCommit:"9f2892aab98fe339f3bd70e3c470144299398ace", GitTreeState:"clean",
BuildDate:"2020-08-13T16:04:18Z", GoVersion:"go1.13.15", Compiler:"gc",
Platform:"linux/amd64"}
```
I'm running an Airflow cluster with the CeleryExecutor inside a Kubernetes
cluster, after installing the `cncf.kubernetes` backport package. I have already
tested the Spark Operator inside the cluster and can run `SparkApplications`
smoothly by applying them with `kubectl`. Now I'm trying to have an Airflow DAG
execute them, but I ran into this error:
```
[2020-12-14 22:18:10,609] {taskinstance.py:1150} ERROR - Invalid kube-config file. No configuration found.
Traceback (most recent call last):
  File "/home/airflow/.local/lib/python3.7/site-packages/airflow/models/taskinstance.py", line 984, in _run_raw_task
    result = task_copy.execute(context=context)
  File "/home/airflow/.local/lib/python3.7/site-packages/airflow/providers/cncf/kubernetes/operators/spark_kubernetes.py", line 67, in execute
    namespace=self.namespace,
  File "/home/airflow/.local/lib/python3.7/site-packages/airflow/providers/cncf/kubernetes/hooks/kubernetes.py", line 127, in create_custom_object
    api = client.CustomObjectsApi(self.api_client)
  File "/home/airflow/.local/lib/python3.7/site-packages/cached_property.py", line 35, in __get__
    value = obj.__dict__[self.func.__name__] = self.func(obj)
  File "/home/airflow/.local/lib/python3.7/site-packages/airflow/providers/cncf/kubernetes/hooks/kubernetes.py", line 108, in api_client
    return self.get_conn()
  File "/home/airflow/.local/lib/python3.7/site-packages/airflow/providers/cncf/kubernetes/hooks/kubernetes.py", line 102, in get_conn
    config.load_kube_config(client_configuration=self.client_configuration)
  File "/home/airflow/.local/lib/python3.7/site-packages/kubernetes/config/kube_config.py", line 739, in load_kube_config
    persist_config=persist_config)
  File "/home/airflow/.local/lib/python3.7/site-packages/kubernetes/config/kube_config.py", line 701, in _get_kube_config_loader_for_yaml_file
    'Invalid kube-config file. '
kubernetes.config.config_exception.ConfigException: Invalid kube-config file. No configuration found.
```
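From the traceback, `KubernetesHook.get_conn` ends up calling `config.load_kube_config()`, which looks for a kubeconfig file on disk; inside a pod there is usually no `~/.kube/config`, hence the `ConfigException`. As a rough sketch (my assumption about the hook's decision logic, not the actual provider source), the config loader seems to be chosen roughly like this:

```python
# Simplified, hypothetical sketch of how the hook might pick its config
# loader based on the connection extras; for illustration only.
def choose_config_loader(extras: dict) -> str:
    if extras.get("extra__kubernetes__in_cluster"):
        # Use the pod's service-account credentials.
        return "load_incluster_config"
    # Fallback: read ~/.kube/config from disk. Inside a pod that file is
    # typically absent, raising "Invalid kube-config file. No configuration
    # found." as seen in the traceback.
    return "load_kube_config"

print(choose_config_loader({}))                                       # load_kube_config
print(choose_config_loader({"extra__kubernetes__in_cluster": True}))  # load_incluster_config
```

So unless the connection actually carries the `in_cluster` extra, the hook falls through to the kubeconfig path that fails here.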
Here's my DAG code:
```
import os
from datetime import datetime, timedelta
from airflow import DAG
from airflow.providers.cncf.kubernetes.operators.spark_kubernetes import SparkKubernetesOperator
from airflow.operators.dummy_operator import DummyOperator

DAG_ID = "spark_operator_test"
DESCRIPTION = ""
SCHEDULE_INTERVAL = "0 10 * * *"

default_args = {
    "owner": "airflow",
    "depends_on_past": False,
    "start_date": datetime(2020, 11, 18),
    "retries": 1,
    "retry_delay": timedelta(minutes=5),
    "provide_context": True,
}

with DAG(
    dag_id=DAG_ID,
    schedule_interval=SCHEDULE_INTERVAL,
    description=DESCRIPTION,
    default_args=default_args,
    catchup=False,
    concurrency=1,
) as dag:
    end_dag = DummyOperator(task_id='end_dag')

    t1 = SparkKubernetesOperator(
        task_id='spark_operator_execute',
        namespace="analytics-airflow",
        application_file="resources/spark/spark-operator-test.yaml",
        kubernetes_conn_id="kubernetes_default",
        do_xcom_push=True,
    )

    t1 >> end_dag
```
I tried creating a connection with the following extra:
```
{"extra__kubernetes__in_cluster":True}
```
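One thing worth checking (my assumption, not something confirmed from the provider source): Airflow parses a connection's Extra field as JSON, and Python's `True` is not a valid JSON literal, so the extra above may never be parsed at all. A quick check:

```python
import json

# Python's True is not a JSON literal, so this Extra fails to parse:
try:
    json.loads('{"extra__kubernetes__in_cluster":True}')
    parsed = True
except json.JSONDecodeError:
    parsed = False
print(parsed)  # False

# Lower-case true is valid JSON and parses cleanly:
extras = json.loads('{"extra__kubernetes__in_cluster": true}')
print(extras["extra__kubernetes__in_cluster"])  # True
```

If that's the cause, writing the extra as `{"extra__kubernetes__in_cluster": true}` would at least let the hook see the flag.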
But that didn't work either. How can I configure my Airflow deployment so that
a task using the `SparkKubernetesOperator` can create a `SparkApplication`
within the same cluster?
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]