Sten Larsson created ARROW-18405:
------------------------------------
Summary: [Ruby] Raw table converter rebuilds chunked arrays
Key: ARROW-18405
URL: https://issues.apache.org/jira/browse/ARROW-18405
Project: Apache Arrow
Issue Type: Bug
Components: Ruby
Affects Versions: 10.0.0
Reporter: Sten Larsson
Consider the following Ruby script:
{code:ruby}
require 'arrow'
data = Arrow::ChunkedArray.new([Arrow::Int64Array.new([1])])
table = Arrow::Table.new('column' => data)
puts table['column'].data_type
{code}
This prints "int64" with red-arrow 9.0.0 but "uint8" with red-arrow 10.0.0.
From my understanding, it is due to this commit:
[https://github.com/apache/arrow/commit/913d9c0a9a1a4398ed5f56d713d586770b4f702c#diff-f7f19bbc3945ea30ba06d851705f2d58f7666507bb101c4e151014ca398bd635R42]
The old version would not call ArrayBuilder.build on a ChunkedArray, but the
new version does. This is a problem for us, because we need the column to stay
int64.
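For illustration, the change from int64 to uint8 is consistent with the rebuild inferring the narrowest integer type from the raw values instead of keeping the declared type. The following is a minimal sketch of that idea in plain Ruby. `narrowest_integer_type` is a hypothetical helper, not part of red-arrow, and the inference rule itself is an assumption about the behavior, not a copy of the library's code.

```ruby
# Hypothetical model of value-based type inference; NOT red-arrow's actual code.
# Assumption: rebuilding from raw values picks the narrowest integer type that fits.
def narrowest_integer_type(values)
  raise ArgumentError, "values must not be empty" if values.empty?

  min, max = values.minmax
  if min >= 0
    # Only non-negative values: try unsigned widths from narrow to wide.
    [8, 16, 32, 64].each do |bits|
      return "uint#{bits}" if max < 2**bits
    end
  else
    # At least one negative value: try signed widths from narrow to wide.
    [8, 16, 32, 64].each do |bits|
      limit = 2**(bits - 1)
      return "int#{bits}" if min >= -limit && max < limit
    end
  end
  raise ArgumentError, "values do not fit in 64 bits"
end

puts narrowest_integer_type([1])        # prints "uint8"
puts narrowest_integer_type([-1, 300])  # prints "int16"
```

Under this assumed rule, a column holding only the value 1 loses its int64 type as soon as it is rebuilt from its values, which matches what the script above reports on 10.0.0.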
A workaround is to pass a schema and a list of arrays instead, which bypasses
the raw table converter:
{code:ruby}
table = Arrow::Table.new([{name: 'column', type: 'int64'}], [data])
{code}