DrChainsaw commented on issue #435: URL: https://github.com/apache/arrow-julia/issues/435#issuecomment-1535160812

I managed to produce a file which triggers the problem: [test.zip](https://github.com/apache/arrow-julia/files/11399998/test.zip)

```julia
julia> Arrow.Table("c:/temp\\arrowtest\\test/test.arrow")
ERROR: TaskFailedException

    nested task error: ArgumentError: invalid Array dimensions
    Stacktrace:
      [1] Array
        @ .\boot.jl:477 [inlined]
      [2] uncompress(ptr::Ptr{UInt8}, buffer::Arrow.Flatbuf.Buffer, compression::Arrow.Flatbuf.BodyCompression)
        @ Arrow \.julia\dev\Arrow\src\table.jl:529
      [3] buildbitmap(batch::Arrow.Batch, rb::Arrow.Flatbuf.RecordBatch, nodeidx::Int64, bufferidx::Int64)
        @ Arrow \.julia\dev\Arrow\src\table.jl:512
      [4] build(f::Arrow.Flatbuf.Field, #unused#::Arrow.Flatbuf.Int, batch::Arrow.Batch, rb::Arrow.Flatbuf.RecordBatch, de::Dict{Int64, Arrow.DictEncoding}, nodeidx::Int64, bufferidx::Int64, convert::Bool)
        @ Arrow \.julia\dev\Arrow\src\table.jl:683
      [5] build(field::Arrow.Flatbuf.Field, batch::Arrow.Batch, rb::Arrow.Flatbuf.RecordBatch, de::Dict{Int64, Arrow.DictEncoding}, nodeidx::Int64, bufferidx::Int64, convert::Bool)
        @ Arrow \.julia\dev\Arrow\src\table.jl:498
      [6] iterate(x::Arrow.VectorIterator, ::Tuple{Int64, Int64, Int64})
        @ Arrow \.julia\dev\Arrow\src\table.jl:474
      [7] iterate
        @ \.julia\packages\Arrow\rYdxZ\src\table.jl:471 [inlined]
      [8] copyto!(dest::Vector{Any}, src::Arrow.VectorIterator)
        @ Base .\abstractarray.jl:946
      [9] _collect
        @ .\array.jl:713 [inlined]
     [10] collect
        @ .\array.jl:707 [inlined]
     [11] macro expansion
        @ \.julia\packages\Arrow\rYdxZ\src\table.jl:376 [inlined]
     [12] (::Arrow.var"#108#114"{Bool, Channel{Any}, WorkerUtilities.OrderedSynchronizer, Dict{Int64, Arrow.DictEncoding}, Arrow.Batch, Int64})()
        @ Arrow .\threadingconstructs.jl:341
Stacktrace:
 [1] sync_end(c::Channel{Any})
   @ Base .\task.jl:445
 [2] macro expansion
   @ .\task.jl:477 [inlined]
 [3] Arrow.Table(blobs::Vector{Arrow.ArrowBlob}; convert::Bool)
   @ Arrow \.julia\dev\Arrow\src\table.jl:321
 [4] Table
   @ \.julia\packages\Arrow\rYdxZ\src\table.jl:295 [inlined]
 [5] #Table#98
   @ \.julia\packages\Arrow\rYdxZ\src\table.jl:290 [inlined]
 [6] Table
   @ \.julia\packages\Arrow\rYdxZ\src\table.jl:290 [inlined]
 [7] Arrow.Table(input::String)
   @ Arrow \.julia\dev\Arrow\src\table.jl:290
 [8] top-level scope
   @ REPL[27]:1
```

With #436

```julia
julia> Arrow.Table("c:/temp\\arrowtest\\test/test.arrow") |> DataFrame
102×15 DataFrame
 Row │ isA    intkey  primitiveIntkey  doublekey  booleanKey  numberkey  primitiveNumberkey  stringkey  objectkey                  arrayKey     NrofSamples  Max      Min      Sum      SqrSum
     │ Int32  Int32   Int32            Float64    Bool        Float64    Float64             String     String                     String       Int32        Float64  Float64  Float64  Float64
─────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
   1 │     0       1                2        3.0        true        4.0                 5.0  6          StringObject{string='7'}   [I@4dd6fd0a            2    100.0     10.0    110.0  10100.0
   2 │     0      10               20       30.0       false       40.0                50.0  60         StringObject{string='70'}  [I@bb9e6dc             1    100.0    100.0    100.0  10000.0
   3 │     0      11               20       30.0       false       40.0                50.0  60         StringObject{string='70'}  [I@bb9e6dc             1    100.0    100.0    100.0  10000.0
   4 │     0      12               20       30.0       false       40.0                50.0  60         StringObject{string='70'}  [I@bb9e6dc             1    100.0    100.0    100.0  10000.0
   5 │     0      13               20       30.0       false       40.0                50.0  60         StringObject{string='70'}  [I@bb9e6dc             1    100.0    100.0    100.0  10000.0
   6 │     0      14               20       30.0       false       40.0                50.0  60         StringObject{string='70'}  [I@bb9e6dc             1    100.0    100.0    100.0  10000.0
   7 │     0      15               20       30.0       false       40.0                50.0  60         StringObject{string='70'}  [I@bb9e6dc             1    100.0    100.0    100.0  10000.0
   8 │     0      16               20       30.0       false       40.0                50.0  60         StringObject{string='70'}  [I@bb9e6dc             1    100.0    100.0    100.0  10000.0
   9 │     0      17               20       30.0       false       40.0                50.0  60         StringObject{string='70'}  [I@bb9e6dc             1    100.0    100.0    100.0  10000.0
  10 │     0      18               20       30.0       false       40.0                50.0  60         StringObject{string='70'}  [I@bb9e6dc             1    100.0    100.0    100.0  10000.0
  11 │     0      19               20       30.0       false       40.0                50.0  60         StringObject{string='70'}  [I@bb9e6dc             1    100.0    100.0    100.0  10000.0
  12 │     0      20               20       30.0       false       40.0                50.0  60         StringObject{string='70'}  [I@bb9e6dc             1    100.0    100.0    100.0  10000.0
  13 │     0      21               20       30.0       false       40.0                50.0  60         StringObject{string='70'}  [I@bb9e6dc             1    100.0    100.0    100.0  10000.0
  14 │     0      22               20       30.0       false       40.0                50.0  60         StringObject{string='70'}  [I@bb9e6dc             1    100.0    100.0    100.0  10000.0
```

Loads with pyarrow ootb:

```julia
julia> pywith(pyarrow.ipc.open_file("c:/temp\\arrowtest\\test/test.arrow")) do reader
           reader.read_pandas()
       end
Python DataFrame:
     isA  intkey  primitiveIntkey  doublekey  booleanKey  ...  NrofSamples    Max    Min    Sum   SqrSum
0      0       1                2        3.0        True  ...            2  100.0   10.0  110.0  10100.0
1      0      10               20       30.0       False  ...            1  100.0  100.0  100.0  10000.0
2      0      11               20       30.0       False  ...            1  100.0  100.0  100.0  10000.0
3      0      12               20       30.0       False  ...            1  100.0  100.0  100.0  10000.0
4      0      13               20       30.0       False  ...            1  100.0  100.0  100.0  10000.0
..   ...     ...              ...        ...         ...  ...          ...    ...    ...    ...      ...
97     0     106               20       30.0       False  ...            1  100.0  100.0  100.0  10000.0
98     0     107               20       30.0       False  ...            1  100.0  100.0  100.0  10000.0
99     0     108               20       30.0       False  ...            1  100.0  100.0  100.0  10000.0
100    0     109               20       30.0       False  ...            1  100.0  100.0  100.0  10000.0
101    1      10               20       30.0       False  ...            1  100.0  100.0  100.0  10000.0

[102 rows x 15 columns]
```
