Fahad-Alam-Jamal opened a new pull request, #59151:
URL: https://github.com/apache/airflow/pull/59151
## Fix DatabricksSqlOperator XCom pickle serialization
closes: #59103
### Description
This PR fixes the issue where `DatabricksSqlOperator` fails with
`_pickle.PicklingError: Can't pickle <class
'airflow.providers.databricks.hooks.databricks_sql.Row'>` when XCom push is
enabled (`do_xcom_push=True`).
### Root Cause
The Databricks SQL connector returns `databricks.sql.types.Row` objects.
Their classes are created dynamically at runtime, so `pickle` cannot look
them up by name and serialization fails. XCom requires all return values to
be picklable for storage in the Airflow metadata database. With the default
`fetch_all_handler`, these unpicklable `Row` objects were returned directly,
without conversion.
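
The failure mode can be reproduced without Databricks at all. The snippet
below is a minimal sketch, assuming the connector builds its row class at call
time in roughly the way `make_row` does; `make_row` is a hypothetical helper
for illustration, not part of the connector or of this PR.

```python
import pickle
from collections import namedtuple


def make_row(fields, values):
    # Hypothetical helper: the class is created inside the function, so it is
    # never bound to a module-level name that pickle could import.
    row_type = namedtuple("Row", fields)
    return row_type(*values)


row = make_row(["id", "name"], [1, "alice"])

try:
    pickle.dumps(row)
    pickled = True
except (pickle.PicklingError, AttributeError) as exc:
    # pickle stores classes by reference (module + name) and cannot find "Row".
    pickled = False
    print(f"pickling failed: {exc}")
```

Because pickle serializes a class by its import path rather than by value, any
row whose class exists only as a local variable hits this error.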
### Solution
Introduced a new `PicklableRow` wrapper class in `DatabricksSqlHook` that:
- Wraps unpicklable Row objects and makes them picklable via a custom
`__reduce__` method
- Maintains full backward compatibility by delegating to an internal
namedtuple
- Supports all namedtuple interface operations: `_fields`, `_asdict()`,
iteration, and attribute access
- Properly handles field name renaming for invalid Python identifiers (e.g.,
`count(1)` → `_0`)
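
For readers unfamiliar with `__reduce__`, here is a minimal sketch of how a
wrapper with the properties listed above could be written. It is an
illustration under assumptions, not the code from this PR: the constructor
signature `(fields, values)` and the `_original_fields` attribute are
hypothetical.

```python
import pickle
from collections import namedtuple


class PicklableRow:
    """Illustrative wrapper; a hypothetical sketch, not the PR's implementation."""

    def __init__(self, fields, values):
        self._original_fields = tuple(fields)
        # rename=True swaps invalid identifiers for positional names (_0, _1, ...).
        row_type = namedtuple("Row", self._original_fields, rename=True)
        self._row = row_type(*values)

    def __reduce__(self):
        # Rebuild from plain tuples so pickle never touches the dynamic class.
        return (PicklableRow, (self._original_fields, tuple(self._row)))

    @property
    def _fields(self):
        return self._row._fields

    def _asdict(self):
        # Keys are the original column names, not the renamed identifiers.
        return dict(zip(self._original_fields, self._row))

    def __getattr__(self, name):
        # Called only when normal lookup fails; the guard avoids recursion
        # if the instance attributes have not been set yet.
        if name in ("_row", "_original_fields"):
            raise AttributeError(name)
        return getattr(self._row, name)

    def __iter__(self):
        return iter(self._row)

    def __len__(self):
        return len(self._row)

    def __getitem__(self, index):
        return self._row[index]


row = PicklableRow(["count(1)", "name"], [42, "alice"])
restored = pickle.loads(pickle.dumps(row))
```

In this sketch `restored._fields` holds the renamed identifiers while
`restored._asdict()` keeps the original column names, which matches the
behavior described in the Testing section.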
### Changes
- **Hook**: Modified `DatabricksSqlHook.run()` to always convert `Row` objects
to `PicklableRow`, even when no handler is provided
- **Hook**: Updated `_make_common_data_structure()` to use `PicklableRow`
instead of dynamic namedtuples
- **Tests**: Added `test_xcom_pickle_results_with_row_objects()` to verify
pickle serialization works correctly
- **Backward Compatibility**: All 35 existing unit tests pass, confirming no
breaking changes
### Testing
- ✅ All 35 unit tests pass, including the new pickle test
- ✅ Verified `pickle.dumps()` and `pickle.loads()` work correctly on converted
`Row` objects
- ✅ Confirmed `_fields` attribute returns properly renamed field names
- ✅ Verified `_asdict()` method returns dictionaries with original field
names