JoaoAparicio opened a new issue, #496:
URL: https://github.com/apache/arrow-julia/issues/496

   I've noticed that this allocates and I'm surprised.
   
   ``` 
   using Arrow
   using Arrow: ArrowTypes  # ArrowTypes comes in as a dependency of Arrow

   struct IntWrapper
       data::Int64
   end
   
   const INTWRAPPER_NAME = Symbol("JuliaLang.IntWrapper")
   ArrowTypes.ArrowKind(::Type{IntWrapper}) = ArrowTypes.PrimitiveKind()
   ArrowTypes.ArrowType(::Type{IntWrapper}) = Int64
   ArrowTypes.toarrow(x::IntWrapper) = x.data
   ArrowTypes.arrowname(::Type{IntWrapper}) = INTWRAPPER_NAME
   ArrowTypes.JuliaType(::Val{INTWRAPPER_NAME}, ::Type{Int64}) = IntWrapper
   ArrowTypes.fromarrow(::Type{IntWrapper}, x::Int64) = reinterpret(IntWrapper, x)
   
   x = [IntWrapper(1) for _ in 1:8_000_000];
   @time Arrow.write("/tmp/temp.arrow", (x=x,))
   ```
   
   I get (after running it once to compile):
   ```
    0.401526 seconds (8.00 M allocations: 184.254 MiB, 7.14% gc time)
   ```
   Basically one allocation per element of the vector.
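
   Part of why this is surprising: the wrapper should have the same memory layout as a plain `Int64`. The sketch below is only a Base-Julia check of that assumption (it does not call Arrow at all, and the 4-element vector is just for illustration). It shows that `IntWrapper` is a plain-bits type of 8 bytes and that a vector of wrappers can be viewed as `Int64` without touching each element.

   ```julia
   # Same definition as in the report above.
   struct IntWrapper
       data::Int64
   end

   # Small illustrative vector (the report uses 8_000_000 elements).
   x = [IntWrapper(i) for i in 1:4]

   # An immutable struct with a single Int64 field is stored inline.
   println(isbitstype(IntWrapper))  # true
   println(sizeof(IntWrapper))      # 8

   # Reinterpret the same memory as Int64; no per-element conversion is needed.
   ints = reinterpret(Int64, x)
   println(collect(ints))           # [1, 2, 3, 4]
   ```

   So the per-element allocation does not seem to be forced by the data layout itself. Where it actually comes from inside `Arrow.write` is not something this check can tell.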
   
   Compare this with the cost of saving just ints without the wrapper:
   
   ```
   x = ones(Int,8_000_000);
   @time Arrow.write("/tmp/temp.arrow", (x=x,))
   0.056106 seconds (140 allocations: 11.461 KiB)
   ```
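
   To put the two timings side by side, here is a rough back-of-the-envelope calculation using only the numbers printed above. The allocation count is displayed as "8.00 M", so it is rounded and the results are approximate: roughly 24 bytes per allocation and about a 7x slowdown.

   ```julia
   # Figures copied from the two @time outputs in this report.
   wrapped_seconds = 0.401526
   wrapped_allocs  = 8.00e6            # printed as "8.00 M", so rounded
   wrapped_bytes   = 184.254 * 2^20    # MiB converted to bytes
   plain_seconds   = 0.056106
   plain_allocs    = 140
   n_elements      = 8_000_000

   bytes_per_alloc = wrapped_bytes / wrapped_allocs    # average size of one allocation
   allocs_per_elem = wrapped_allocs / n_elements       # allocations per vector element
   slowdown        = wrapped_seconds / plain_seconds   # wrapped vs. plain Int write

   println("bytes per allocation: ", round(bytes_per_alloc; digits=1))
   println("allocations per element: ", allocs_per_elem)
   println("slowdown: ", round(slowdown; digits=1), "x")
   ```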
   
   Am I doing something wrong?
   
   I've reproduced this for:
   ```
   julia 1.10.0 + Arrow 2.7.0
   julia 1.9.0 + Arrow 2.6.2
   julia 1.8.5 + Arrow 2.6.2
   julia 1.8.5 + Arrow 2.5.0
   ```
   
   Fresh env:
   ```
   ]activate --temp
   ]add Arrow
   ```
   
   ```
   julia> versioninfo()
   Julia Version 1.10.0
   Commit 3120989f39b (2023-12-25 18:01 UTC)
   Build Info:
     Official https://julialang.org/ release
   Platform Info:
     OS: Linux (x86_64-linux-gnu)
     CPU: 48 × Intel(R) Xeon(R) Gold 6136 CPU @ 3.00GHz
     WORD_SIZE: 64
     LIBM: libopenlibm
     LLVM: libLLVM-15.0.7 (ORCJIT, skylake-avx512)
     Threads: 1 on 48 virtual cores
   Environment:
     JULIA_EDITOR = code
     JULIA_NUM_THREADS = 4
   ```
   
   Manifest
   ```
   st -m
   Status `/tmp/jl_mwcRrJ/Manifest.toml`
     [69666777] Arrow v2.7.0
     [31f734f8] ArrowTypes v2.3.0
     [c3b6d118] BitIntegers v0.3.1
     [5ba52731] CodecLz4 v0.4.1
     [6b39b394] CodecZstd v0.8.2
     [34da2185] Compat v4.12.0
     [f0e56b4a] ConcurrentUtilities v2.3.0
     [9a962f9c] DataAPI v1.15.0
     [e2d170a0] DataValueInterfaces v1.0.0
     [4e289a0a] EnumX v1.0.4
     [e2ba6199] ExprTools v0.1.10
     [842dd82b] InlineStrings v1.4.0
     [82899510] IteratorInterfaceExtensions v1.0.0
     [692b3bcd] JLLWrappers v1.5.0
     [e6f89c97] LoggingExtras v1.0.3
     [78c3b35d] Mocking v0.7.7
     [bac558e1] OrderedCollections v1.6.3
     [69de0a69] Parsers v2.8.1
     [2dfb63ee] PooledArrays v1.4.3
     [aea7be01] PrecompileTools v1.2.0
     [21216c6a] Preferences v1.4.1
     [6c6a2e73] Scratch v1.2.1
     [91c51154] SentinelArrays v1.4.1
     [dc5dba14] TZJData v1.0.0+2023c
     [3783bdb8] TableTraits v1.0.1
     [bd369af6] Tables v1.11.1
     [f269a46b] TimeZones v1.13.0
     [3bb67fe8] TranscodingStreams v0.10.2
     [5ced341a] Lz4_jll v1.9.4+0
     [3161d3a3] Zstd_jll v1.5.5+0
     [0dad84c5] ArgTools v1.1.1
     [56f22d72] Artifacts
     [2a0f44e3] Base64
     [ade2ca70] Dates
     [f43a241f] Downloads v1.6.0
     [7b1f6079] FileWatching
     [9fa8497b] Future
     [b77e0a4c] InteractiveUtils
     [4af54fe1] LazyArtifacts
     [b27032c2] LibCURL v0.6.4
     [76f85450] LibGit2
     [8f399da3] Libdl
     [37e2e46d] LinearAlgebra
     [56ddb016] Logging
     [d6f4376e] Markdown
     [a63ad114] Mmap
     [ca575930] NetworkOptions v1.2.0
     [44cfe95a] Pkg v1.10.0
     [de0858da] Printf
     [3fa0cd96] REPL
     [9a3f8284] Random
     [ea8e919c] SHA v0.7.0
     [9e88b42a] Serialization
     [6462fe0b] Sockets
     [fa267f1f] TOML v1.0.3
     [a4e569a6] Tar v1.10.0
     [cf7118a7] UUIDs
     [4ec0a83e] Unicode
     [e66e0078] CompilerSupportLibraries_jll v1.0.5+1
     [deac9b47] LibCURL_jll v8.4.0+0
     [e37daf67] LibGit2_jll v1.6.4+0
     [29816b5a] LibSSH2_jll v1.11.0+1
     [c8ffd9c3] MbedTLS_jll v2.28.2+1
     [14a3606d] MozillaCACerts_jll v2023.1.10
     [4536629a] OpenBLAS_jll v0.3.23+2
     [83775a58] Zlib_jll v1.2.13+1
     [8e850b90] libblastrampoline_jll v5.8.0+1
     [8e850ede] nghttp2_jll v1.52.0+1
     [3f19e933] p7zip_jll v17.4.0+2
   ```

