[ https://issues.apache.org/jira/browse/ARROW-10172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Artem KOZHEVNIKOV updated ARROW-10172:
--------------------------------------
Affects Version/s: 2.0.0
> [Python] pyarrow.concat_arrays segfaults if a resulting StringArray's
> capacity overflows
> ----------------------------------------------------------------------------------------
>
> Key: ARROW-10172
> URL: https://issues.apache.org/jira/browse/ARROW-10172
> Project: Apache Arrow
> Issue Type: Bug
> Components: Python
> Affects Versions: 1.0.1, 2.0.0
> Reporter: Artem KOZHEVNIKOV
> Priority: Major
>
> I'm sorry if this was already reported, but there is an overflow issue when
> concatenating large string arrays:
> {code:python}
> In [1]: import pyarrow as pa
> In [2]: str_array = pa.array(['a' * 128] * 10**8)
> In [3]: large_array = pa.concat_arrays([str_array] * 50)
> Segmentation fault (core dumped)
> {code}
> I suppose this should be handled by upcasting the result to large_string.
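
For context, a rough sketch of the arithmetic behind the overflow. A string array stores its offsets as signed 32-bit integers, so the character data of a single array cannot exceed 2**31 - 1 bytes, while large_string uses 64-bit offsets. The helper names below are hypothetical and only illustrate the size check; they are not part of pyarrow.

```python
# Largest offset a 32-bit StringArray can represent (signed int32).
INT32_MAX = 2**31 - 1


def total_string_bytes(value_nbytes, n_values, n_copies):
    """Total bytes of character data after concatenating n_copies arrays.

    Hypothetical helper: assumes every value has the same byte length.
    """
    return value_nbytes * n_values * n_copies


def fits_in_string_array(total_bytes):
    """True if the data can be addressed with 32-bit offsets."""
    return total_bytes <= INT32_MAX


# Numbers taken from the reproduction in the report:
# 128-byte values, 10**8 values per array, 50 arrays concatenated.
total = total_string_bytes(128, 10**8, 50)
print(total)                        # 640000000000
print(fits_in_string_array(total))  # False
```

By the same arithmetic, one `str_array` alone holds 128 * 10**8 bytes (about 12.8 GB), which is already past the 32-bit limit. A possible workaround, assuming the installed pyarrow version supports casting string to large_string, would be to cast each input with `arr.cast(pa.large_string())` before calling `pa.concat_arrays`.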