dosubot[bot] commented on issue #28779:
URL: https://github.com/apache/superset/issues/28779#issuecomment-2139441941
Hey there, @ZuzannaSadowska! I'm here to help you out with any bugs,
questions, or contributions you have for the repository. Let's squash those
bugs together! 🐞
The issue you're encountering with UUID columns showing up as NULL in
Superset when using Trino and Iceberg is likely due to the Trino SQLAlchemy
dialect not recognizing the UUID data type. This is confirmed by the warning
message you provided: `Did not recognize type 'uuid'`.
### Workaround
Superset already contains a helper that converts UUID objects to strings:
the `convert_uuids` function in `superset/models/helpers.py`. Per its
docstring it exists so that objects can be passed to `yaml.safe_dump`, so it
may not be applied to query results, but it shows the conversion involved:
```python
import uuid
from typing import Any


def convert_uuids(obj: Any) -> Any:
    """
    Convert UUID objects to str so we can use yaml.safe_dump
    """
    if isinstance(obj, uuid.UUID):
        return str(obj)

    # Recurse into containers so nested UUIDs are converted too.
    if isinstance(obj, list):
        return [convert_uuids(el) for el in obj]

    if isinstance(obj, dict):
        return {k: convert_uuids(v) for k, v in obj.items()}

    return obj
```
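As a quick illustration of what the helper does, the sketch below applies a local copy of `convert_uuids` to a nested structure and serializes the result with the standard-library `json` module (used here instead of YAML only to keep the example dependency-free). The `row` payload is made up for the example.

```python
import json
import uuid
from typing import Any


def convert_uuids(obj: Any) -> Any:
    """Local copy of the helper quoted above, for a standalone demo."""
    if isinstance(obj, uuid.UUID):
        return str(obj)
    if isinstance(obj, list):
        return [convert_uuids(el) for el in obj]
    if isinstance(obj, dict):
        return {k: convert_uuids(v) for k, v in obj.items()}
    return obj


# Hypothetical row containing UUIDs at several nesting levels.
row = {
    "id": uuid.UUID("a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11"),
    "tags": [uuid.UUID(int=1), "plain"],
}

converted = convert_uuids(row)
# json.dumps would raise TypeError on the raw row, since UUID is not serializable.
serialized = json.dumps(converted)
print(serialized)
```

Non-UUID values such as `"plain"` pass through unchanged, so the helper is safe to apply to mixed data.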
### Solution
To address this issue, you can modify the Trino SQLAlchemy dialect to
recognize the UUID data type. This involves updating the `datatype.py` file in
the Trino SQLAlchemy dialect to include a mapping for the UUID type. Here is an
example of how you can add this mapping:
1. Locate the `datatype.py` file in the Trino SQLAlchemy dialect.
2. Add a mapping for the UUID type:
```python
import uuid

from sqlalchemy.types import CHAR


class UUID(CHAR):
    """Store UUIDs as 36-character strings and return uuid.UUID objects."""

    def __init__(self):
        super().__init__(length=36)

    def bind_processor(self, dialect):
        def process(value):
            if value is None:
                return value
            elif not isinstance(value, uuid.UUID):
                # Validate and normalize string input.
                return str(uuid.UUID(value))
            else:
                return str(value)

        return process

    def result_processor(self, dialect, coltype):
        def process(value):
            if value is None:
                return value
            # The driver may already hand back a uuid.UUID; pass it through.
            if isinstance(value, uuid.UUID):
                return value
            return uuid.UUID(value)

        return process


# Add the UUID type to the type map
type_map = {
    'uuid': UUID,
    # other type mappings...
}
```
3. Ensure that the dialect's type map (shown here as `type_map`; the actual name may differ in your installed version) includes the UUID type.
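To try the conversion logic without SQLAlchemy or a database, the sketch below restates the two processors as plain functions. The names `to_db` and `from_db` are hypothetical and exist only for this illustration; `from_db` also tolerates a value that is already a `uuid.UUID`, on the assumption that some drivers return one.

```python
import uuid
from typing import Optional, Union

UUIDLike = Union[str, uuid.UUID, None]


def to_db(value: UUIDLike) -> Optional[str]:
    """Roughly what bind_processor does: normalize to a canonical string."""
    if value is None:
        return None
    if not isinstance(value, uuid.UUID):
        value = uuid.UUID(value)  # raises ValueError for malformed input
    return str(value)


def from_db(value: UUIDLike) -> Optional[uuid.UUID]:
    """Roughly what result_processor does: return a uuid.UUID object."""
    if value is None:
        return None
    return value if isinstance(value, uuid.UUID) else uuid.UUID(value)


raw = "A0EEBC99-9C0B-4EF8-BB6D-6BB9BD380A11"
stored = to_db(raw)       # canonical lowercase form
loaded = from_db(stored)  # back to a uuid.UUID
print(stored, loaded == uuid.UUID(raw))
```

Routing string input through `uuid.UUID` first means malformed values fail early instead of being written as arbitrary text.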
### Configuration
Additionally, you can configure Superset to handle custom data type mappings
in the `superset_config.py` file. This allows you to override the default
behavior for specific databases like Trino. Here is an example configuration:
```python
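# Import paths may differ between Superset versions; verify against yours.
from superset.db_engine_specs.trino import TrinoEngineSpec
from superset.utils.core import ColumnSpec, GenericDataType

# UUID is the custom SQLAlchemy type defined in the Solution section above.


class CustomTrinoEngineSpec(TrinoEngineSpec):
    @classmethod
    def get_column_spec(cls, native_type, *args, **kwargs):
        # native_type may be None, so guard before calling .lower().
        if native_type and native_type.lower() == 'uuid':
            return ColumnSpec(
                sqla_type=UUID(),
                generic_type=GenericDataType.STRING,
                is_dttm=False,
            )
        # Forward any extra arguments the base method accepts.
        return super().get_column_spec(native_type, *args, **kwargs)


SQLALCHEMY_CUSTOM_ENGINE_SPECS = {
    'trino': CustomTrinoEngineSpec,
}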
```
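This configuration is intended to make Superset recognize the UUID data type
and treat it as a string. Before relying on it, confirm that your Superset
version actually reads the `SQLALCHEMY_CUSTOM_ENGINE_SPECS` setting.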
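The override pattern itself can be tried in isolation. The following is a minimal sketch using stand-in classes (`FakeColumnSpec`, `BaseSpec`, and `CustomSpec` are hypothetical and are not Superset's own classes); it only demonstrates the case-insensitive match on `uuid` and the fallback to the parent class for everything else.

```python
from typing import NamedTuple, Optional


class FakeColumnSpec(NamedTuple):
    """Hypothetical stand-in for Superset's ColumnSpec."""
    sqla_type: str
    generic_type: str
    is_dttm: bool


class BaseSpec:
    """Hypothetical stand-in for the parent engine spec."""

    @classmethod
    def get_column_spec(cls, native_type: Optional[str]) -> Optional[FakeColumnSpec]:
        known = {"varchar": FakeColumnSpec("String", "STRING", False)}
        return known.get((native_type or "").lower())


class CustomSpec(BaseSpec):
    @classmethod
    def get_column_spec(cls, native_type: Optional[str]) -> Optional[FakeColumnSpec]:
        # Handle uuid here, case-insensitively; defer everything else.
        if native_type and native_type.lower() == "uuid":
            return FakeColumnSpec("UUID", "STRING", False)
        return super().get_column_spec(native_type)


print(CustomSpec.get_column_spec("UUID"))      # handled by the override
print(CustomSpec.get_column_spec("varchar"))   # falls back to the parent
print(CustomSpec.get_column_spec("geometry"))  # unknown to both
```

Because the override defers to `super()` for every other type, existing mappings keep working and only `uuid` is handled differently.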
By implementing these changes, you should be able to resolve the issue of
UUID columns showing up as NULL in Superset when using Trino and Iceberg
<sup>[[1]](https://github.com/apache/superset/blob/master/superset/models/helpers.py)</sup><sup>[[2]](https://github.com/apache/superset/blob/master/superset/db_engine_specs/trino.py)</sup>.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]