[ https://issues.apache.org/jira/browse/ARROW-2476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16443930#comment-16443930 ]
Antoine Pitrou commented on ARROW-2476:
---------------------------------------

The offsets array is part of the data and needs to be interchangeable, so an implementation cannot use an array of int64_t if the spec says it's an array of int32_t.

{quote}As a sidenote the length of the offsets buffer can't be represented in a 32-bit signed integer (for an array with maximum (2^31 - 1) number of elements), it requires a 64-bit signed integer
{quote}

You seem to be confusing length in elements and size in bytes here. The spec says the number of elements is limited to 2**31, not the size in bytes.

(The 32-bit limitation doesn't look like a good idea to me, but I may be missing some context.)

> [Python/Question] Maximum length of an Array created from ndarray
> -----------------------------------------------------------------
>
>                 Key: ARROW-2476
>                 URL: https://issues.apache.org/jira/browse/ARROW-2476
>             Project: Apache Arrow
>          Issue Type: Improvement
>            Reporter: Krisztian Szucs
>            Priority: Minor
>
> So the format
> [describes|https://github.com/apache/arrow/blob/master/format/Layout.md#array-lengths]
> that an array's max length is 2^31 - 1, however the following python snippet
> creates a 2**32 length arrow array:
> {code:python}
> import numpy as np
> import pyarrow as pa
>
> a = np.ones((2**32,), dtype='int8')
> A = pa.Array.from_pandas(a)
> type(A)
> {code}
> {code}pyarrow.lib.Int8Array{code}
> Based on the layout specification I'd expect a ChunkedArray of three Int8Arrays
> with lengths:
> [2^31 - 1, 2^31 - 1, 2] or should it raise an exception?
> If this is the expected behavior, is there any documentation for it?

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
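
The distinction drawn in the comment between length in elements and size in bytes, and the chunk lengths the reporter expected, can be checked with plain integer arithmetic. The following is only an illustrative sketch: `buffer_size_bytes` and `chunk_lengths` are hypothetical helper names, not part of pyarrow, and the 4-byte item width is an assumption standing in for an int32_t buffer.

```python
# Hypothetical helpers for illustration only; none of this is pyarrow API.

INT32_MAX = 2**31 - 1  # largest value a signed 32-bit integer can hold


def buffer_size_bytes(num_elements, item_width_bytes):
    """Size in bytes of a buffer holding fixed-width items."""
    return num_elements * item_width_bytes


def chunk_lengths(total, max_len=INT32_MAX):
    """Split `total` elements into chunk lengths of at most `max_len` each."""
    full, rest = divmod(total, max_len)
    return [max_len] * full + ([rest] if rest else [])


# Assumption: 4-byte (int32_t) items. The element count fits in a signed
# 32-bit integer even though the byte size of the buffer does not.
n = INT32_MAX
print(n <= INT32_MAX)                        # element count fits
print(buffer_size_bytes(n, 4) <= INT32_MAX)  # byte size does not

# Chunk lengths for the 2**32-element array from the issue description.
print(chunk_lengths(2**32))
```

Under these assumptions, the split of 2**32 elements gives two chunks of 2^31 - 1 elements plus one chunk of 2, which matches the lengths listed in the issue description.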