maleadt opened a new issue, #506:
URL: https://github.com/apache/arrow-julia/issues/506
I'm using Arrow v2.7.2 with DataFrames v1.6.1 on Julia 1.10, and am running
into an issue that seems to stem from Arrow.jl deserializing my
`Vector{Vector{T}}` columns as `Vector{SubArray{...}}`:
```julia
julia> using Arrow, DataFrames
julia> df = DataFrame(foo=Vector{Int}[]);
julia> push!(df, [[1,2,3]])
1×1 DataFrame
 Row │ foo
     │ Array…
─────┼───────────
   1 │ [1, 2, 3]
julia> Arrow.write("/tmp/test.arrow", df)
"/tmp/test.arrow"
julia> df2 = copy(DataFrame(Arrow.Table("/tmp/test.arrow")));
julia> typeof(df2.foo)
Vector{SubArray{Int64, 1, Primitive{Int64, Vector{Int64}},
Tuple{UnitRange{Int64}}, true}} (alias for Array{SubArray{Int64, 1,
Arrow.Primitive{Int64, Array{Int64, 1}}, Tuple{UnitRange{Int64}}, true}, 1})
```
This breaks certain `push!` calls on the DataFrame. I haven't been able to
reproduce the failure in isolation, but it looks as follows:
```
MethodError: Cannot `convert` an object of type Vector{Int64} to an object
of type SubArray{Int64, 1, Arrow.Primitive{Int64, Vector{Int64}},
Tuple{UnitRange{Int64}}, true}
Stacktrace:
[1] push!(a::Vector{SubArray{Int64, 1, Arrow.Primitive{Int64,
Vector{Int64}}, Tuple{UnitRange{Int64}}, true}}, item::Vector{Int64})
@ Base ./array.jl:1118
[2] _row_inserter!(df::DataFrame, loc::Int64, row::Tuple{String,
Vector{Int64}, Int64, Int64, Int64, Int64, Int64, Int64, Int64, Int64, String,
Bool, Bool, Bool, Vector{Int64}, Vector{Int64}, Vector{Int64}, String, String,
Float64}, mode::Val{:push}, promote::Bool)
@ DataFrames
~/.julia/packages/DataFrames/58MUJ/src/dataframe/insertion.jl:663
[3] push!(df::DataFrame, row::Tuple{String, Vector{Int64}, Int64, Int64,
Int64, Int64, Int64, Int64, Int64, Int64, String, Bool, Bool, Bool,
Vector{Int64}, Vector{Int64}, Vector{Int64}, String, String, Float64})
@ DataFrames
~/.julia/packages/DataFrames/58MUJ/src/dataframe/insertion.jl:457
```
It's possible I'm doing something wrong; I'm a first-time Arrow.jl user.