[ https://issues.apache.org/jira/browse/SPARK-21344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16078825#comment-16078825 ]

Dongjoon Hyun commented on SPARK-21344:
---------------------------------------

Hi, [~shubhamc].
Thank you for reporting this. I just changed the priority, since `Blocker` is reserved for committers.

> BinaryType comparison does signed byte array comparison
> -------------------------------------------------------
>
>                 Key: SPARK-21344
>                 URL: https://issues.apache.org/jira/browse/SPARK-21344
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.1.1
>            Reporter: Shubham Chopra
>
> BinaryType used by Spark SQL defines ordering using signed byte comparisons. 
> This can lead to unexpected behavior. Consider the following code snippet 
> that shows this error:
> {code}
> case class TestRecord(col0: Array[Byte])
>
> def convertToBytes(i: Long): Array[Byte] = {
>   val bb = java.nio.ByteBuffer.allocate(8)
>   bb.putLong(i)
>   bb.array
> }
>
> def test = {
>   val sql = spark.sqlContext
>   import sql.implicits._
>   val timestamp = 1498772083037L
>   val data = (timestamp to timestamp + 1000L).map(i => TestRecord(convertToBytes(i)))
>   val testDF = sc.parallelize(data).toDF
>   val filter1 = testDF.filter(col("col0") >= convertToBytes(timestamp) && col("col0") < convertToBytes(timestamp + 50L))
>   val filter2 = testDF.filter(col("col0") >= convertToBytes(timestamp + 50L) && col("col0") < convertToBytes(timestamp + 100L))
>   val filter3 = testDF.filter(col("col0") >= convertToBytes(timestamp) && col("col0") < convertToBytes(timestamp + 100L))
>   // These assertions fail because BinaryType compares the byte arrays as signed values.
>   assert(filter1.count == 50)
>   assert(filter2.count == 50)
>   assert(filter3.count == 100)
> }
> {code}
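(Editor's note, not part of the original report.) The core of the issue, signed versus unsigned lexicographic byte comparison, can be sketched outside Spark. The class and helper names below are made up for illustration; the snippet shows how a signed comparator mis-orders the big-endian encodings of the longs 127 and 128, because the trailing byte 0x80 reads as -128 when treated as a signed byte:

```java
import java.nio.ByteBuffer;

public class SignedVsUnsigned {
    // Big-endian 8-byte encoding, same as ByteBuffer.putLong in the report.
    static byte[] toBytes(long i) {
        return ByteBuffer.allocate(8).putLong(i).array();
    }

    // Lexicographic comparison treating each byte as SIGNED (the buggy ordering).
    static int signedCompare(byte[] a, byte[] b) {
        for (int i = 0; i < Math.min(a.length, b.length); i++) {
            if (a[i] != b[i]) return a[i] - b[i]; // signed byte values, -128..127
        }
        return a.length - b.length;
    }

    // Lexicographic comparison treating each byte as UNSIGNED (the expected ordering).
    static int unsignedCompare(byte[] a, byte[] b) {
        for (int i = 0; i < Math.min(a.length, b.length); i++) {
            int x = a[i] & 0xFF, y = b[i] & 0xFF; // unsigned byte values, 0..255
            if (x != y) return x - y;
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        byte[] lo = toBytes(127L); // last byte 0x7F
        byte[] hi = toBytes(128L); // last byte 0x80, i.e. -128 as a signed byte
        // 127 < 128, so a correct comparator must report lo < hi.
        System.out.println(signedCompare(lo, hi) < 0);   // false: signed sees 0x7F > 0x80
        System.out.println(unsignedCompare(lo, hi) < 0); // true
    }
}
```

On Java 9 and later, `java.util.Arrays.compareUnsigned(byte[], byte[])` provides this unsigned lexicographic ordering directly.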



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
