Moelf opened a new issue, #413:
URL: https://github.com/apache/arrow-julia/issues/413

   Premise:
   1. The input file is a large, uncompressed Arrow file.
   2. We build a mask and take a `view()` over the `Mmap`-ed table.
   3. We use `Arrow.write()` to write the filtered table to disk.
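
   For reference, the three steps above can be sketched on a small stand-in file. This is only an illustration under assumptions: the generated input file, the column names `x` and `y`, and the threshold used for the mask are hypothetical and are not taken from the real `nanoAOD_nocomp.feather` input.

   ```julia
   using Arrow, DataFrames

   # Hypothetical small stand-in for the large, uncompressed input file.
   inpath = joinpath(mktempdir(), "in.feather")
   Arrow.write(inpath, (x = collect(1:100), y = rand(100)))

   # 1. Open the Arrow file (memory-mapped) and wrap it without copying columns.
   df = DataFrame(Arrow.Table(inpath); copycols=false)

   # 2. Build a Bool mask and take a view of the selected rows.
   mask = df.x .> 50
   filtered = view(df, mask, :)

   # 3. Write the filtered view back to disk.
   outpath = joinpath(dirname(inpath), "out.feather")
   Arrow.write(outpath, filtered)
   ```

   On the real input, step 3 is where the memory use shows up.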
   
   
   The write seems to use more and more memory as the number of rows selected by the `mask` grows.
   
   I know this doesn't work correctly because, if I set a memory limit first, the write fails once the view gets larger:
   ```bash
   > ulimit -Sv 8000000
   ```
   
   ```julia
   julia> using Arrow, DataFrames
   
   julia> const df = @time DataFrame(Arrow.Table("./nanoAOD_nocomp.feather"); copycols=false);
     2.661651 seconds (4.84 M allocations: 321.110 MiB, 4.39% gc time, 100.86% compilation time)
   
   julia> Arrow.write("/tmp/out.feather", @view df[1:1*10^4, :]);
   
   julia> Arrow.write("/tmp/out.feather", @view df[1:2*10^4, :]);
   ERROR: Internal error: encountered unexpected error in runtime:
   OutOfMemoryError()
   unknown function (ip: 0x7face089fc99)
   unknown function (ip: 0x7face08935b5)
   jl_gc_alloc at /home/akako/Documents/github/dotFiles/homedir/.julia/juliaup/julia-1.9.0-rc1+0.x64.linux.gnu/bin/../lib/julia/libjulia-internal.so.1 (unknown line)
   unknown function (ip: 0x7face0868873)
   unknown function (ip: 0x7face086998b)
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
