GitHub user marmbrus commented on the issue:
https://github.com/apache/spark/pull/15918
I agree that the only change in behavior is that things that used to throw
an error will now succeed. If done right (I haven't looked deeply at the PR
itself yet), no case that currently works should change.
It is maybe slightly odd to mix serialization types, but that's kind of
already happening today if you use the `kryo` serializer: you are taking
kryo-encoded data and putting it as a binary value into a Tungsten row. The
change here makes it possible to do the same in cases where the incompatible
object is nested within a compatible object. Currently you are forced into
all-or-nothing (i.e. even if only a single field is incompatible, you must
treat the whole object as an opaque binary blob).
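
For concreteness, here is a minimal sketch of today's all-or-nothing
behavior; the `Record` class and the choice of `java.util.Locale` as the
encoder-unsupported field are illustrative, not taken from the PR:

```scala
import java.util.Locale
import org.apache.spark.sql.{Encoders, SparkSession}

// Locale has no built-in encoder, so even though `id` and `name` are
// perfectly encodable on their own, the only option today is to
// kryo-encode the entire Record.
case class Record(id: Long, name: String, locale: Locale)

val spark = SparkSession.builder().master("local[*]").getOrCreate()

val ds = spark.createDataset(Seq(Record(1L, "us", Locale.US)))(Encoders.kryo[Record])
ds.printSchema()
// root
//  |-- value: binary (nullable = true)
//
// With this change, `id` and `name` could keep native columns and only
// `locale` would become a kryo-encoded binary field.
```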
The one possible compatibility concern I can see is: if in the future we
add support for a previously unsupported type, the schema will change from
`BinaryType` to something else. However, given that there are very few
operations you can do on `BinaryType`, and this format is not persisted or
guaranteed to be compatible across Spark versions, this actually seems okay.
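
To illustrate why so little user code could depend on the binary
representation, continuing the sketch above (byte length and equality are
roughly the extent of SQL support for `BinaryType`):

```scala
import org.apache.spark.sql.functions.{col, length}

// About all SQL offers on a binary column: its byte length and equality.
// Anything richer requires collecting and deserializing manually, so a
// future schema change away from BinaryType would break very little.
ds.toDF().select(length(col("value"))).show()
```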
Thoughts?