GitHub user rednaxelafx opened a pull request:
https://github.com/apache/spark/pull/21643
[SPARK-24659][SQL] GenericArrayData.equals should respect element type differences
## What changes were proposed in this pull request?
Fix `GenericArrayData.equals`, so that it respects the actual types of the
elements.
For example, an instance that represents an `array<int>` and another instance
that represents an `array<long>` should be considered incompatible, so `equals`
should return false when comparing them.
`GenericArrayData` doesn't keep any schema information itself; it relies on the
Java objects held in its `array` field to carry their own types. The most
straightforward way to respect those types is therefore to call `equals` on the
elements instead of using Scala's `==` operator, which compares boxed numbers by
value across numeric types and so can have undesirable semantics here:
```
new java.lang.Integer(123) == new java.lang.Long(123L) // true in Scala
new java.lang.Integer(123).equals(new java.lang.Long(123L)) // false in Scala
```
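As a minimal sketch of the idea (not the actual patch), the class below compares two element arrays with `equals` rather than `==`. `SimpleArrayData` and `ArrayEqualsSketch` are hypothetical names used only for illustration; the real `GenericArrayData` handles additional cases that are omitted here.

```
// Hypothetical stand-in for GenericArrayData; NOT Spark's actual class.
final class SimpleArrayData(val array: Array[Any]) {
  override def equals(other: Any): Boolean = other match {
    case that: SimpleArrayData =>
      array.length == that.array.length &&
        array.indices.forall { i =>
          val a = array(i)
          val b = that.array(i)
          if (a == null) b == null
          // equals is type-sensitive: a boxed Integer never equals a boxed Long
          else b != null && a.equals(b)
        }
    case _ => false
  }

  // Keep hashCode consistent with the element-wise equals above.
  override def hashCode(): Int =
    java.util.Arrays.hashCode(array.map(_.asInstanceOf[AnyRef]))
}

object ArrayEqualsSketch {
  def main(args: Array[String]): Unit = {
    val ints  = new SimpleArrayData(Array[Any](1, 2, 3))
    val longs = new SimpleArrayData(Array[Any](1L, 2L, 3L))
    println(ints == longs)                                      // false
    println(ints == new SimpleArrayData(Array[Any](1, 2, 3)))   // true
  }
}
```

With `==` on the elements instead of `equals`, the first comparison would report the `int` and `long` arrays as equal, which is the behavior this change is meant to avoid.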
## How was this patch tested?
Added a unit test in `ComplexDataSuite`.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/rednaxelafx/apache-spark fix-genericarraydata-equals
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/21643.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #21643
----
commit d91b44accbe40b9879cda259912e3ca38759d716
Author: Kris Mok <kris.mok@...>
Date: 2018-06-26T11:02:25Z
SPARK-24659: GenericArrayData.equals should respect element type differences
----