aru-trackunit commented on issue #36838:
URL: https://github.com/apache/airflow/issues/36838#issuecomment-1914632249
@Joffreybvn Just retested with `Databricks provider 6.1.0` on `airflow
2.8.1`, and the issue still persists, now with a different stacktrace:
```
[2024-01-29, 13:41:39 CET] {taskinstance.py:1956} INFO - Dependencies all
met for dep_context=non-requeueable deps ti=<TaskInstance: task_1.count
manual__2024-01-29T13:41:35+01:00 [queued]>
[2024-01-29, 13:41:39 CET] {taskinstance.py:1956} INFO - Dependencies all
met for dep_context=requeueable deps ti=<TaskInstance: task_1.count
manual__2024-01-29T13:41:35+01:00 [queued]>
[2024-01-29, 13:41:39 CET] {taskinstance.py:2170} INFO - Starting attempt 1
of 1
[2024-01-29, 13:41:39 CET] {taskinstance.py:2191} INFO - Executing
<Task(DatabricksSqlOperator): count> on 2024-01-29 12:41:35+00:00
[2024-01-29, 13:41:39 CET] {standard_task_runner.py:60} INFO - Started
process 118 to run task
[2024-01-29, 13:41:39 CET] {standard_task_runner.py:87} INFO - Running:
['airflow', 'tasks', 'run', 'task_1', 'count',
'manual__2024-01-29T13:41:35+01:00', '--job-id', '5', '--raw', '--subdir',
'DAGS_FOLDER/dag-wn-equipment.py', '--cfg-path', '/tmp/tmpp4t8f52h']
[2024-01-29, 13:41:39 CET] {standard_task_runner.py:88} INFO - Job 5:
Subtask count
[2024-01-29, 13:41:39 CET] {task_command.py:423} INFO - Running
<TaskInstance: task_1.count manual__2024-01-29T13:41:35+01:00 [running]> on
host 33e1fb1e4ed5
[2024-01-29, 13:41:39 CET] {taskinstance.py:2480} INFO - Exporting env vars:
AIRFLOW_CTX_DAG_OWNER='team_analytics' AIRFLOW_CTX_DAG_ID='task_1'
AIRFLOW_CTX_TASK_ID='count'
AIRFLOW_CTX_EXECUTION_DATE='2024-01-29T12:41:35+00:00'
AIRFLOW_CTX_TRY_NUMBER='1'
AIRFLOW_CTX_DAG_RUN_ID='manual__2024-01-29T13:41:35+01:00'
[2024-01-29, 13:41:39 CET] {sql.py:276} INFO - Executing: SELECT count(*)
FROM catalog.schema.test_table;
[2024-01-29, 13:41:39 CET] {base.py:83} INFO - Using connection ID
'tu-databricks-sp' for task execution.
[2024-01-29, 13:41:39 CET] {databricks_base.py:514} INFO - Using Service
Principal Token.
[2024-01-29, 13:41:39 CET] {databricks_base.py:223} INFO - Existing Service
Principal token is expired, or going to expire soon. Refreshing...
[2024-01-29, 13:41:39 CET] {databricks_base.py:514} INFO - Using Service
Principal Token.
[2024-01-29, 13:41:40 CET] {client.py:200} INFO - Successfully opened
session 01eebea3-bfcf-14ed-8b50-a14cc9a61a35
[2024-01-29, 13:41:40 CET] {sql.py:450} INFO - Running statement: SELECT
count(*) FROM catalog.schema.test_table, parameters: None
[2024-01-29, 13:41:52 CET] {taskinstance.py:2698} ERROR - Task failed with
exception
Traceback (most recent call last):
File
"/home/airflow/.local/lib/python3.10/site-packages/airflow/models/taskinstance.py",
line 433, in _execute_task
result = execute_callable(context=context, **execute_callable_kwargs)
File
"/home/airflow/.local/lib/python3.10/site-packages/airflow/providers/common/sql/operators/sql.py",
line 282, in execute
output = hook.run(
File
"/home/airflow/.local/lib/python3.10/site-packages/airflow/providers/databricks/hooks/databricks_sql.py",
line 254, in run
raw_result = handler(cur)
File
"/home/airflow/.local/lib/python3.10/site-packages/airflow/providers/common/sql/hooks/sql.py",
line 91, in fetch_all_handler
return cursor.fetchall()
File
"/home/airflow/.local/lib/python3.10/site-packages/databricks/sql/client.py",
line 670, in fetchall
return self.active_result_set.fetchall()
File
"/home/airflow/.local/lib/python3.10/site-packages/databricks/sql/client.py",
line 944, in fetchall
return self._convert_arrow_table(self.fetchall_arrow())
File
"/home/airflow/.local/lib/python3.10/site-packages/databricks/sql/client.py",
line 884, in _convert_arrow_table
res = df.to_numpy(na_value=None)
File
"/home/airflow/.local/lib/python3.10/site-packages/pandas/core/frame.py", line
1981, in to_numpy
result = self._mgr.as_array(dtype=dtype, copy=copy, na_value=na_value)
File
"/home/airflow/.local/lib/python3.10/site-packages/pandas/core/internals/managers.py",
line 1702, in as_array
arr[isna(arr)] = na_value
TypeError: int() argument must be a string, a bytes-like object or a real
number, not 'NoneType'
[2024-01-29, 13:41:52 CET] {taskinstance.py:1138} INFO - Marking task as
FAILED. dag_id=task_1, task_id=count, execution_date=20240129T124135,
start_date=20240129T124139, end_date=20240129T124152
[2024-01-29, 13:41:52 CET] {standard_task_runner.py:107} ERROR - Failed to
execute job 5 for task count (int() argument must be a string, a bytes-like
object or a real number, not 'NoneType'; 118)
```
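
The final `TypeError` in the trace appears to originate below pandas, in numpy itself: when assigning a scalar through a boolean mask, numpy casts the fill value to the array's dtype before applying the mask, so `None` cannot be written into an integer array even when the mask selects no elements. This is exactly the shape of `arr[isna(arr)] = na_value` with `na_value=None` on an integer block. A minimal sketch reproducing it outside Airflow/Databricks (my assumption about the root cause, not a confirmed diagnosis):

```python
import numpy as np

# pandas' as_array path effectively does: arr[isna(arr)] = na_value.
# With an integer-dtype result and na_value=None, numpy tries to cast
# None to int64 before honoring the (here all-False) mask, raising
# "int() argument must be a string, a bytes-like object or a real
# number, not 'NoneType'".
arr = np.array([1, 2, 3])
mask = np.zeros(3, dtype=bool)  # selects nothing, yet still fails
try:
    arr[mask] = None
    raised = False
except TypeError:
    raised = True
```

If this is indeed the mechanism, the failure would be triggered by any integer result column going through `_convert_arrow_table` / `to_numpy(na_value=None)`, regardless of whether the data actually contains NULLs.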
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]