[ 
https://issues.apache.org/jira/browse/ARROW-2476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16443930#comment-16443930
 ] 

Antoine Pitrou commented on ARROW-2476:
---------------------------------------

The offsets array is part of the data and needs to be interchangeable, so an 
implementation cannot use an array of int64_t if the spec says it's an array of 
int32_t.
{quote}As a side note, the length of the offsets buffer can't be represented in a 
32-bit signed integer (for an array with the maximum number of elements, 
2^31 - 1); it requires a 64-bit signed integer
{quote}
You seem to be confusing length in elements and size in bytes here. The spec 
limits the number of elements to 2**31 - 1, not the size in bytes.
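
As a rough illustration of the difference between a length in elements and a 
size in bytes, the sketch below computes both for an int32 offsets buffer. The 
helper name offsets_buffer_stats is hypothetical, and the layout it assumes 
(one offset per element plus one final end offset) is an assumption, not 
something stated in this thread.

```python
INT32_MAX = 2**31 - 1  # largest value a signed 32-bit integer can hold
OFFSET_WIDTH = 4       # bytes per int32 offset


def offsets_buffer_stats(n_elements):
    """Return (entry count, size in bytes) of an int32 offsets buffer.

    Assumption: the buffer holds one offset per element plus a final
    end offset, so it has n_elements + 1 entries.
    """
    entries = n_elements + 1
    return entries, entries * OFFSET_WIDTH


# Small case: the entry count and the byte size are different quantities.
print(offsets_buffer_stats(1000))       # (1001, 4004)

# Largest array length allowed by a 32-bit signed length field.
entries, size_in_bytes = offsets_buffer_stats(INT32_MAX)
print(entries)                          # 2147483648
print(size_in_bytes)                    # 8589934592
```

The two numbers differ by a factor of four, so any limit needs to say which 
one it applies to.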

(the 32-bit limitation doesn't look like a good idea to me, but I may be 
missing some context)

> [Python/Question] Maximum length of an Array created from ndarray
> -----------------------------------------------------------------
>
>                 Key: ARROW-2476
>                 URL: https://issues.apache.org/jira/browse/ARROW-2476
>             Project: Apache Arrow
>          Issue Type: Improvement
>            Reporter: Krisztian Szucs
>            Priority: Minor
>
> So the format 
> [describes|https://github.com/apache/arrow/blob/master/format/Layout.md#array-lengths]
>  that the maximum array length is 2^31 - 1; however, the following Python 
> snippet creates an Arrow array of length 2**32:
> {code:python}
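> import numpy as np
> import pyarrow as pa
> # 2**32 one-byte elements, so this allocates about 4 GiB
> a = np.ones((2**32,), dtype='int8')
> A = pa.Array.from_pandas(a)
> type(A)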
> {code}
> {code}pyarrow.lib.Int8Array{code}
> Based on the layout specification, I'd expect a ChunkedArray of three 
> Int8Arrays with lengths [2^31 - 1, 2^31 - 1, 2]. Or should it raise an 
> exception?
> If the current behavior is expected, is there any documentation for it?
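
The chunk lengths proposed in the issue can be checked with a short sketch. 
The helper chunk_lengths is hypothetical and is not part of pyarrow; it only 
does the arithmetic of splitting a total length into pieces no longer than 
2^31 - 1 and does not allocate any arrays.

```python
def chunk_lengths(total_length, max_chunk_length=2**31 - 1):
    """Split total_length into chunk lengths of at most max_chunk_length."""
    if total_length < 0 or max_chunk_length <= 0:
        raise ValueError("total_length must be >= 0 and max_chunk_length > 0")
    # Number of full-size chunks, plus whatever is left over.
    full_chunks, remainder = divmod(total_length, max_chunk_length)
    lengths = [max_chunk_length] * full_chunks
    if remainder:
        lengths.append(remainder)
    return lengths


print(chunk_lengths(2**32))  # [2147483647, 2147483647, 2]
```

For a total length of 2**32 this gives the three lengths listed in the issue.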



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
