[ https://issues.apache.org/jira/browse/ARROW-2308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Philipp Moritz reassigned ARROW-2308:
-------------------------------------

    Assignee: Robert Nishihara

> Serialized tensor data should be 64-byte aligned.
> -------------------------------------------------
>
>                 Key: ARROW-2308
>                 URL: https://issues.apache.org/jira/browse/ARROW-2308
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: Python
>            Reporter: Robert Nishihara
>            Assignee: Robert Nishihara
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 0.10.0
>
>
> See [https://github.com/ray-project/ray/issues/1658] for an example of this
> issue. Non-aligned data can trigger a copy when fed into TensorFlow and
> things like that.
> {code}
> import pyarrow as pa
> import numpy as np
> x = np.zeros(10)
> y = pa.deserialize(pa.serialize(x).to_buffer())
> x.ctypes.data % 64  # 0 (it starts out aligned)
> y.ctypes.data % 64  # 48 (it is no longer aligned)
> {code}
> It should be possible to fix this by calling something like
> {{RETURN_NOT_OK(AlignStreamPosition(dst));}} before writing the array data.
> Note that we already do this before writing the tensor header, but the tensor
> header is not necessarily a multiple of 64 bytes, so the subsequent data can
> be unaligned.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
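
To illustrate the proposed fix, here is a rough sketch in pure Python of why aligning only before the header is not enough. It is not Arrow's implementation: {{padding_needed}}, {{align_stream_position}} and {{write_tensor}} are hypothetical helpers, and the 112-byte header size is an assumption picked only so that the unaligned offset comes out as 48, matching the report.

```python
import io

ALIGNMENT = 64  # bytes; the alignment requested in this issue


def padding_needed(position, alignment=ALIGNMENT):
    """Return how many bytes must be added to reach the next multiple of alignment."""
    return -position % alignment


def align_stream_position(stream, alignment=ALIGNMENT):
    """Hypothetical stand-in for AlignStreamPosition: pad the stream with zero bytes."""
    stream.write(b"\x00" * padding_needed(stream.tell(), alignment))


def write_tensor(stream, header, data, align_before_data):
    """Write header then data; return the offset at which the data starts."""
    align_stream_position(stream)      # alignment before the header (already done)
    stream.write(header)
    if align_before_data:
        align_stream_position(stream)  # the extra call proposed in this issue
    data_offset = stream.tell()
    stream.write(data)
    return data_offset


header = b"h" * 112  # ASSUMED header size, chosen only so that 112 % 64 == 48
data = bytes(80)     # 10 float64 zeros, as in the report

unaligned = write_tensor(io.BytesIO(), header, data, align_before_data=False)
aligned = write_tensor(io.BytesIO(), header, data, align_before_data=True)
print(unaligned % 64, aligned % 64)  # 48 0
```

Note that an aligned offset within the stream only yields an aligned pointer after deserialization if the buffer holding the stream is itself allocated on a 64-byte boundary.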