Github user robbinspg commented on the pull request:
https://github.com/apache/spark/pull/10628#issuecomment-209832195
@nongli I'm just about there with a solution for Big Endian platforms and
will be using https://issues.apache.org/jira/browse/SPARK-14151 for the changes.
I have one question:
It is clear from the tests using Parquet that the byte array passed into
putIntsLittleEndian is in little-endian order. It is also the case that the
byte arrays passed into putFloats and putDoubles hold their values in
little-endian order. Byte-reversing the floats/doubles on a big-endian platform
enables all of those tests to pass.
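Roughly the shape of what I have in mind on the read path (a sketch only, using java.nio.ByteBuffer rather than the actual OffHeapColumnVector/Platform code; the class and method names here are made up):

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class LittleEndianDoubleDecoder {
  // Sketch only: decode a little-endian byte array of doubles regardless of
  // the JVM's native byte order. Parameter names mirror the putDoubles
  // signature (byte[] src, int srcIndex, int count).
  public static double[] decodeLittleEndianDoubles(byte[] src, int srcIndex, int count) {
    double[] out = new double[count];
    ByteBuffer buf = ByteBuffer.wrap(src, srcIndex, count * 8)
                               .order(ByteOrder.LITTLE_ENDIAN);
    for (int i = 0; i < count; i++) {
      out[i] = buf.getDouble();  // ByteBuffer swaps bytes when the native order is big endian
    }
    return out;
  }
}
```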
However, in OffHeapColumnVector's putDoubles(int rowId, int count, byte[] src,
int srcIndex), if I assume the input is little endian, the
org.apache.spark.sql.execution.vectorized.ColumnarBatchSuite "Double APIs" test
fails. This is because the test passes in a byte array of doubles that are in
platform-native byte order (created with Platform.putDouble).
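A small standalone illustration of that mismatch (assumes org.apache.spark.unsafe.Platform is on the classpath; the class name is made up): Platform.putDouble writes in the JVM's native byte order, so on a big-endian platform the resulting byte[] is not little endian, and a reader that assumes LE decodes garbage.

```java
import java.nio.ByteOrder;
import org.apache.spark.unsafe.Platform;

public class EndianMismatchSketch {
  public static void main(String[] args) {
    byte[] buf = new byte[8];
    Platform.putDouble(buf, Platform.BYTE_ARRAY_OFFSET, 1.0d);
    // 1.0 has the bit pattern 0x3FF0000000000000: in little-endian layout
    // byte 7 holds 0x3F, in big-endian layout byte 0 does.
    System.out.println("native order: " + ByteOrder.nativeOrder());
    System.out.println("first byte:   " + Integer.toHexString(buf[0] & 0xFF));
    System.out.println("last byte:    " + Integer.toHexString(buf[7] & 0xFF));
  }
}
```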
My question is: are the byte arrays always little endian? That seems to be
true for the Parquet sources. If so, I can modify the testcase
'org.apache.spark.sql.execution.vectorized.ColumnarBatchSuite.Double APIs' to
force the test data into little-endian order.
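Something along these lines is what I'd change the test data generation to (a Java sketch of the idea only; the actual suite is Scala, and doublesAsLittleEndianBytes is a made-up helper name):

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class LittleEndianTestData {
  // Build the byte[] in explicit little-endian order instead of relying on
  // Platform.putDouble's native order, so the test data matches what an
  // LE-assuming putDoubles expects on every platform.
  public static byte[] doublesAsLittleEndianBytes(double... values) {
    ByteBuffer buf = ByteBuffer.allocate(values.length * 8)
                               .order(ByteOrder.LITTLE_ENDIAN);
    for (double v : values) {
      buf.putDouble(v);
    }
    return buf.array();
  }
}
```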