GitHub user marmbrus commented on the issue:
https://github.com/apache/spark/pull/15918
I agree that the only change in behavior is that things that used to throw
an error will now succeed. If done right (I haven't looked deeply at the PR
itself yet), no case that currently works should change.
It is maybe slightly odd to mix serialization types, but that's kind of
already happening today if you use the `kryo` serializer: you are taking
kryo-encoded data and putting it as a binary value into a Tungsten row. The
change here makes it possible to do the same in cases where the incompatible
object is nested within a compatible object. Currently you are forced into
all-or-nothing (i.e. even if only a single field is incompatible, you must
treat the whole object as an opaque binary blob).
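
For concreteness, here is a minimal sketch of today's all-or-nothing
behavior; the `Record` class and the choice of `java.util.Locale` as the
encoder-unsupported field are illustrative, not taken from the PR:

```scala
import java.util.Locale
import org.apache.spark.sql.{Encoders, SparkSession}

// Locale has no built-in encoder, so even though `id` and `name` are
// perfectly encodable on their own, the only option today is to
// kryo-encode the entire Record.
case class Record(id: Long, name: String, locale: Locale)

val spark = SparkSession.builder().master("local[*]").getOrCreate()

val ds = spark.createDataset(Seq(Record(1L, "us", Locale.US)))(Encoders.kryo[Record])
ds.printSchema()
// root
//  |-- value: binary (nullable = true)
//
// With this change, `id` and `name` could keep native columns and only
// `locale` would become a kryo-encoded binary field.
```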
The one possible compatibility concern I can see is: if in the future we
add support for a previously unsupported type, the schema will change from
`BinaryType` to something else. However, given that there are very few
operations you can do on `BinaryType`, and this format is not persisted or
guaranteed to be compatible across Spark versions, this actually seems okay.
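
To illustrate why so little user code could depend on the binary
representation, continuing the sketch above (byte length and equality are
roughly the extent of SQL support for `BinaryType`):

```scala
import org.apache.spark.sql.functions.{col, length}

// About all SQL offers on a binary column: its byte length and equality.
// Anything richer requires collecting and deserializing manually, so a
// future schema change away from BinaryType would break very little.
ds.toDF().select(length(col("value"))).show()
```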
Thoughts?