Robert Nishihara created ARROW-2308:

             Summary: Serialized tensor data should be 64-byte aligned.
                 Key: ARROW-2308
             Project: Apache Arrow
          Issue Type: Improvement
          Components: Python
            Reporter: Robert Nishihara

See [] for an example of this issue. Non-aligned data can trigger a copy when 
it is passed to TensorFlow or similar libraries.

import pyarrow as pa
import numpy as np

x = np.zeros(10)
y = pa.deserialize(pa.serialize(x).to_buffer())

x.ctypes.data % 64  # 0 (it starts out aligned)
y.ctypes.data % 64  # 48 (it is no longer aligned)

It should be possible to fix this by calling something like 
{{RETURN_NOT_OK(AlignStreamPosition(dst));}} before writing the array data. 
Note that we already do this before writing the tensor header, but the tensor 
header is not necessarily a multiple of 64 bytes, so the subsequent data can be 
unaligned.
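
The padding idea can be sketched in pure Python. This is only an illustration of the arithmetic, not Arrow's implementation: {{padding_for}} and {{write_aligned}} are hypothetical helpers, and the 80-byte header is an arbitrary size chosen because it is not a multiple of 64.

```python
import io

ALIGNMENT = 64  # bytes; the alignment requested in this issue


def padding_for(position, alignment=ALIGNMENT):
    """Return how many pad bytes advance `position` to the next multiple of `alignment`."""
    return (-position) % alignment


def write_aligned(stream, header, body, alignment=ALIGNMENT):
    """Hypothetical writer: pad before the header AND before the body.

    Returns the stream offset at which the body starts.
    """
    # Align the stream position before the header (already done today, per the issue).
    stream.write(b"\x00" * padding_for(stream.tell(), alignment))
    stream.write(header)
    # Align again before the data: the header length need not be a multiple of 64.
    stream.write(b"\x00" * padding_for(stream.tell(), alignment))
    body_offset = stream.tell()
    stream.write(body)
    return body_offset


stream = io.BytesIO()
# Arbitrary sizes, for illustration only.
offset = write_aligned(stream, header=b"h" * 80, body=b"d" * 80)
print(offset % ALIGNMENT)
```

An aligned offset within the stream only yields an aligned memory address if the start of the underlying buffer is itself 64-byte aligned.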
