[
https://issues.apache.org/jira/browse/ARROW-788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15961843#comment-15961843
]
Wes McKinney commented on ARROW-788:
------------------------------------
I'm looking at the serialization code. A couple of ideas:
* The amount of padding in the metadata header depends on the starting byte
offset. The {{WriteTensor}} code does not guarantee that it writes a multiple
of 8 bytes, but we could fix this.
* Is it possible that any of your arrays are not contiguous? For example, see what you had before at
https://github.com/ray-project/ray/pull/436/files#diff-17aeecc6d41bcd220496c0d5211cf58fL80
We aren't checking in {{WriteTensor}} whether the data is contiguous, but we
probably should (ARROW-794).
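To make the two ideas concrete, here is a rough Python sketch of what each check could look like. It is not the Arrow {{WriteTensor}} implementation: the helper names ({{padding_for}}, {{write_padded}}, {{require_contiguous}}) are hypothetical, the 8-byte alignment is assumed from the remark above, and it uses only the standard library rather than the Arrow or NumPy APIs.

```python
import io

ALIGNMENT = 8  # assumption: taken from the "multiple of 8 bytes" remark above


def padding_for(offset, alignment=ALIGNMENT):
    """Return how many bytes advance `offset` to the next multiple of `alignment`."""
    return (-offset) % alignment


def write_padded(stream, payload, alignment=ALIGNMENT):
    """Write `payload`, then explicit zero bytes until the stream offset is aligned.

    The padding depends on the offset at which the write started, because it
    is computed from the stream position after the payload has been written.
    """
    stream.write(payload)
    pad = padding_for(stream.tell(), alignment)
    stream.write(b"\x00" * pad)  # zeros, never leftover buffer contents
    return len(payload) + pad


def require_contiguous(buf):
    """Return a memoryview of `buf`, rejecting strided (non-contiguous) data."""
    view = memoryview(buf)
    if not view.c_contiguous:
        raise ValueError("buffer is not C-contiguous; copy it before writing")
    return view


if __name__ == "__main__":
    out = io.BytesIO()
    write_padded(out, b"hello")
    print(out.getvalue())  # payload followed by zero padding up to 8 bytes
```

Writing the padding as explicit zero bytes is one way to keep the output identical across runs, since no uninitialized memory can end up in the file. Whether the nondeterminism reported here actually comes from padding or from non-contiguous data is still an open question.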
> Possible nondeterminism in Tensor serialization code
> ----------------------------------------------------
>
> Key: ARROW-788
> URL: https://issues.apache.org/jira/browse/ARROW-788
> Project: Apache Arrow
> Issue Type: Improvement
> Components: C++, Python
> Reporter: Philipp Moritz
> Priority: Minor
>
> The Ray nondeterminism tests are failing on
> https://github.com/ray-project/ray/pull/436 (moving to Arrow's Tensor
> serialization code).
> This might mean that there is some nondeterminism (like uninitialized memory)
> in the IPC file written by the Arrow Tensor serializer. I'm investigating it
> now, please let me know if you have an idea what the problem could be.