Github user thunterdb commented on the issue:
https://github.com/apache/spark/pull/19439
@hhbyyh regarding the data representation, one could indeed encode each
of the representations with the proper array type information. This
brings some additional complexity for the complex UDFs though, because they
need to select the proper field, and the target implementations in C++ or
TensorFlow can already cast the field to the proper type. I suggest we keep
bytes[] for now and see if there is a need for a more refined
representation.
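
To illustrate the point about casting on the consumer side, here is a minimal sketch in plain Python (not Spark's API). The function name `decode_pixels` and the type tags `"u8"` and `"f32"` are hypothetical, chosen only for this example; it assumes the raw buffer uses the machine's native byte order.

```python
import struct


def decode_pixels(data: bytes, element_type: str) -> list:
    """Reinterpret a raw byte buffer as a flat list of typed values.

    `element_type` is a hypothetical tag for this sketch:
    "u8" for unsigned bytes, "f32" for native-order 32-bit floats.
    """
    if element_type == "u8":
        # Each byte is already one value in the range 0..255.
        return list(data)
    if element_type == "f32":
        # memoryview.cast reinterprets the same bytes as 4-byte floats.
        return list(memoryview(data).cast("f"))
    raise ValueError(f"unsupported element type: {element_type}")


# The stored field stays a plain byte buffer; only the consumer decides
# how to interpret it.
raw_floats = struct.pack("3f", 1.0, 2.5, -4.0)
floats = decode_pixels(raw_floats, "f32")
small_ints = decode_pixels(bytes([0, 128, 255]), "u8")
```

With a single byte field, a UDF never has to pick among several typed fields; the cost is that the element type must be carried alongside the bytes.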
For the `origin` field, @dakirsa or @imatiach-msft should have more
context.