aru-trackunit commented on issue #36838:
URL: https://github.com/apache/airflow/issues/36838#issuecomment-1914632249

   @Joffreybvn Just retested with `Databricks provider 6.1.0` on `airflow 2.8.1`, and the issue still persists, with a different stacktrace:
   
   ```
   [2024-01-29, 13:41:39 CET] {taskinstance.py:1956} INFO - Dependencies all met for dep_context=non-requeueable deps ti=<TaskInstance: task_1.count manual__2024-01-29T13:41:35+01:00 [queued]>
   [2024-01-29, 13:41:39 CET] {taskinstance.py:1956} INFO - Dependencies all met for dep_context=requeueable deps ti=<TaskInstance: task_1.count manual__2024-01-29T13:41:35+01:00 [queued]>
   [2024-01-29, 13:41:39 CET] {taskinstance.py:2170} INFO - Starting attempt 1 of 1
   [2024-01-29, 13:41:39 CET] {taskinstance.py:2191} INFO - Executing <Task(DatabricksSqlOperator): count> on 2024-01-29 12:41:35+00:00
   [2024-01-29, 13:41:39 CET] {standard_task_runner.py:60} INFO - Started process 118 to run task
   [2024-01-29, 13:41:39 CET] {standard_task_runner.py:87} INFO - Running: ['airflow', 'tasks', 'run', 'task_1', 'count', 'manual__2024-01-29T13:41:35+01:00', '--job-id', '5', '--raw', '--subdir', 'DAGS_FOLDER/dag-wn-equipment.py', '--cfg-path', '/tmp/tmpp4t8f52h']
   [2024-01-29, 13:41:39 CET] {standard_task_runner.py:88} INFO - Job 5: Subtask count
   [2024-01-29, 13:41:39 CET] {task_command.py:423} INFO - Running <TaskInstance: task_1.count manual__2024-01-29T13:41:35+01:00 [running]> on host 33e1fb1e4ed5
   [2024-01-29, 13:41:39 CET] {taskinstance.py:2480} INFO - Exporting env vars: AIRFLOW_CTX_DAG_OWNER='team_analytics' AIRFLOW_CTX_DAG_ID='task_1' AIRFLOW_CTX_TASK_ID='count' AIRFLOW_CTX_EXECUTION_DATE='2024-01-29T12:41:35+00:00' AIRFLOW_CTX_TRY_NUMBER='1' AIRFLOW_CTX_DAG_RUN_ID='manual__2024-01-29T13:41:35+01:00'
   [2024-01-29, 13:41:39 CET] {sql.py:276} INFO - Executing: SELECT count(*) FROM catalog.schema.test_table;
   [2024-01-29, 13:41:39 CET] {base.py:83} INFO - Using connection ID 'tu-databricks-sp' for task execution.
   [2024-01-29, 13:41:39 CET] {databricks_base.py:514} INFO - Using Service Principal Token.
   [2024-01-29, 13:41:39 CET] {databricks_base.py:223} INFO - Existing Service Principal token is expired, or going to expire soon. Refreshing...
   [2024-01-29, 13:41:39 CET] {databricks_base.py:514} INFO - Using Service Principal Token.
   [2024-01-29, 13:41:40 CET] {client.py:200} INFO - Successfully opened session 01eebea3-bfcf-14ed-8b50-a14cc9a61a35
   [2024-01-29, 13:41:40 CET] {sql.py:450} INFO - Running statement: SELECT count(*) FROM catalog.schema.test_table, parameters: None
   [2024-01-29, 13:41:52 CET] {taskinstance.py:2698} ERROR - Task failed with exception
   Traceback (most recent call last):
     File "/home/airflow/.local/lib/python3.10/site-packages/airflow/models/taskinstance.py", line 433, in _execute_task
       result = execute_callable(context=context, **execute_callable_kwargs)
     File "/home/airflow/.local/lib/python3.10/site-packages/airflow/providers/common/sql/operators/sql.py", line 282, in execute
       output = hook.run(
     File "/home/airflow/.local/lib/python3.10/site-packages/airflow/providers/databricks/hooks/databricks_sql.py", line 254, in run
       raw_result = handler(cur)
     File "/home/airflow/.local/lib/python3.10/site-packages/airflow/providers/common/sql/hooks/sql.py", line 91, in fetch_all_handler
       return cursor.fetchall()
     File "/home/airflow/.local/lib/python3.10/site-packages/databricks/sql/client.py", line 670, in fetchall
       return self.active_result_set.fetchall()
     File "/home/airflow/.local/lib/python3.10/site-packages/databricks/sql/client.py", line 944, in fetchall
       return self._convert_arrow_table(self.fetchall_arrow())
     File "/home/airflow/.local/lib/python3.10/site-packages/databricks/sql/client.py", line 884, in _convert_arrow_table
       res = df.to_numpy(na_value=None)
     File "/home/airflow/.local/lib/python3.10/site-packages/pandas/core/frame.py", line 1981, in to_numpy
       result = self._mgr.as_array(dtype=dtype, copy=copy, na_value=na_value)
     File "/home/airflow/.local/lib/python3.10/site-packages/pandas/core/internals/managers.py", line 1702, in as_array
       arr[isna(arr)] = na_value
   TypeError: int() argument must be a string, a bytes-like object or a real number, not 'NoneType'
   [2024-01-29, 13:41:52 CET] {taskinstance.py:1138} INFO - Marking task as FAILED. dag_id=task_1, task_id=count, execution_date=20240129T124135, start_date=20240129T124139, end_date=20240129T124152
   [2024-01-29, 13:41:52 CET] {standard_task_runner.py:107} ERROR - Failed to execute job 5 for task count (int() argument must be a string, a bytes-like object or a real number, not 'NoneType'; 118)
   ```
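   For anyone triaging: the bottom of the trace shows `databricks-sql-connector` calling `df.to_numpy(na_value=None)` on the result frame, after which pandas runs `arr[isna(arr)] = None` over the backing array. NumPy refuses to cast `None` into an integer array even when the NA mask selects nothing, which is why an all-integer result like the single row from `SELECT count(*)` fails despite containing no NULLs. A minimal sketch of the failure outside Airflow (my own repro, not from the trace; the `cnt` frame is a stand-in for the query result, assuming pandas 2.x as in the stacktrace):

   ```python
   import numpy as np
   import pandas as pd

   # Root cause: NumPy converts the assigned value to the target dtype
   # before applying the boolean mask, so assigning None into an int64
   # array raises even when the mask is all-False (no NULLs in the data).
   arr = np.array([42], dtype="int64")
   mask = np.array([False])
   try:
       arr[mask] = None
   except TypeError as exc:
       print(f"numpy: {exc}")

   # One layer up: a stand-in for the one-row int64 frame that
   # _convert_arrow_table builds from the SELECT count(*) result.
   df = pd.DataFrame({"cnt": [42]})
   try:
       df.to_numpy(na_value=None)  # the call made in client.py:884
   except TypeError as exc:
       print(f"pandas: {exc}")
   ```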

