paleolimbot commented on PR #73:
URL: https://github.com/apache/datafusion-site/pull/73#issuecomment-2996949961
Example one!
```python
from uuid import UUID

import datafusion
import pyarrow as pa


@datafusion.udf([pa.string()], pa.uuid(), "stable")
def uuid_from_string(uuid_string):
    return pa.array((UUID(s).bytes for s in uuid_string.to_pylist()), pa.uuid())


@datafusion.udf([pa.uuid()], pa.string(), "stable")
def uuid_to_string(uuid):
    return pa.array(str(s) for s in uuid.to_pylist())


@datafusion.udf([pa.uuid()], pa.int64(), "stable")
def uuid_version(uuid):
    return pa.array(s.version for s in uuid.to_pylist())


def main():
    ctx = datafusion.SessionContext()

    batch = pa.record_batch({"idx": pa.array(range(100))})
    tab = (
        ctx.create_dataframe([[batch]])
        .with_column("uuid_string", datafusion.functions.uuid())
        .with_column("uuid", uuid_from_string(datafusion.col("uuid_string")))
        .with_column("uuid_string2", uuid_to_string(datafusion.col("uuid")))
        .with_column("uuid_version", uuid_version(datafusion.col("uuid")))
    )

    #> AttributeError("'bytes' object has no attribute 'version'"), since
    #> metadata doesn't make it through
    print(tab)


if __name__ == "__main__":
    main()
```
This currently fails because the metadata doesn't make it through to the UDFs (I installed datafusion-python/main), so `uuid_version` receives plain `bytes` values rather than `UUID` objects. I can take a look at that if there isn't already a PR in the works.