[ 
https://issues.apache.org/jira/browse/SPARK-8786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14645697#comment-14645697
 ] 

James Aley commented on SPARK-8786:
-----------------------------------

Will this change mean that joins, distinct, equality checks, etc. on binary 
columns will work?

We serialise UUIDs as binary in our analytics system. We have to attach them 
to every event, and there are billions of events, so the space saving over a 
string is significant. Right now, our SQL queries and analysis jobs use a UDF 
we've written to convert them back into UUID strings on the fly so that we 
can compare them.
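
For illustration only, a conversion of that kind could look roughly like the
sketch below. The names `UuidBytes` and `bytesToUuidString` and the 16-byte
big-endian layout are assumptions made for the example, not the actual UDF.

```scala
import java.nio.ByteBuffer
import java.util.UUID

object UuidBytes {
  // Hypothetical helper: decode a 16-byte big-endian value into the
  // canonical UUID string form.
  def bytesToUuidString(bytes: Array[Byte]): String = {
    require(bytes.length == 16, "a UUID is exactly 16 bytes")
    // ByteBuffer reads big-endian by default.
    val buf = ByteBuffer.wrap(bytes)
    val mostSigBits = buf.getLong()   // first 8 bytes
    val leastSigBits = buf.getLong()  // last 8 bytes
    new UUID(mostSigBits, leastSigBits).toString
  }
}
```

Running a function like this over every row is the per-record cost that
comparing the binary values directly would avoid.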

Would I be right in thinking that once this issue is solved, we can skip that 
transformation and should therefore see some speed-up?

> Create a wrapper for BinaryType
> -------------------------------
>
>                 Key: SPARK-8786
>                 URL: https://issues.apache.org/jira/browse/SPARK-8786
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>            Reporter: Davies Liu
>
> The hashCode() and equals() of Array[Byte] do not check the bytes, so we 
> should create a wrapper (internally) that does.
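
The problem described in the issue can be reproduced outside Spark. On the
JVM, an Array[Byte] inherits identity-based equals() and hashCode(), so two
arrays with the same contents are treated as different keys. The sketch below
shows one way a content-based wrapper might look; `BinaryKey` and
`BinaryKeyDemo` are hypothetical names for illustration and are not Spark's
internal implementation.

```scala
import java.util.Arrays

// Hypothetical wrapper: compares and hashes by byte content
// instead of by array identity.
final class BinaryKey(val bytes: Array[Byte]) {
  override def equals(other: Any): Boolean = other match {
    case that: BinaryKey => Arrays.equals(bytes, that.bytes)
    case _               => false
  }

  override def hashCode: Int = Arrays.hashCode(bytes)
}

object BinaryKeyDemo {
  def main(args: Array[String]): Unit = {
    val a = Array[Byte](1, 2, 3)
    val b = Array[Byte](1, 2, 3)

    // Raw arrays: same contents, but equals() compares references.
    println(a == b)                                      // false

    // Wrapped arrays: equality and hashing follow the contents.
    println(new BinaryKey(a) == new BinaryKey(b))        // true
    println(Set(new BinaryKey(a), new BinaryKey(b)).size) // 1
  }
}
```

Hash-based operations such as joins, distinct and grouping rely on equals()
and hashCode() agreeing for equal values, which is why both methods are
overridden together here.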



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
