paleolimbot commented on issue #36161:
URL: https://github.com/apache/arrow/issues/36161#issuecomment-1597661645

   Windows Task Manager, `memory.size()`, and `gc()` each get their numbers from a different place, so I'm not surprised that they disagree (although I'm not familiar with the details on Windows). I do know that allocations made by Arrow C++ won't show up in `gc()`; you can track those with `default_memory_pool()$bytes_allocated`. Note that some references to Arrow memory are hidden and not always apparent: for example, when converting a Table to a data.frame, some columns may be zero-copy shells around Arrow arrays, so the Arrow memory behind them stays allocated for as long as the data.frame is alive.
   
   ``` r
   library(arrow, warn.conflicts = FALSE)
   default_memory_pool()$bytes_allocated
   #> [1] 0
   default_memory_pool()$max_memory
   #> [1] 0
   
   # no bytes allocated because it has re-used R's memory
   array <- as_arrow_array(1:10)
   default_memory_pool()$bytes_allocated
   #> [1] 0
   default_memory_pool()$max_memory
   #> [1] 0
   
   # Can't re-use R memory for decimal type, so this will trigger an Arrow allocation
   array <- as_arrow_array(1:10, type = decimal(10, 3))
   default_memory_pool()$bytes_allocated
   #> [1] 192
   default_memory_pool()$max_memory
   #> [1] 256
   
   rm(array)
   gc()
   #>           used (Mb) gc trigger (Mb) limit (Mb) max used (Mb)
   #> Ncells  803037 42.9    1418702 75.8         NA  1418702 75.8
   #> Vcells 1370077 10.5    8388608 64.0      16384  2707166 20.7
   default_memory_pool()$bytes_allocated
   #> [1] 0
   default_memory_pool()$max_memory
   #> [1] 256
   ```
   
   <sup>Created on 2023-06-19 with [reprex v2.0.2](https://reprex.tidyverse.org)</sup>
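
   To compare the two allocators side by side, something like the following could work. This is only a sketch: `memory_snapshot()` is a hypothetical helper, not part of arrow, and it assumes the second column of the `gc()` matrix is the "used" figure in Mb, as in the output above. It does not try to reproduce what Task Manager or `memory.size()` report.

   ``` r
   library(arrow, warn.conflicts = FALSE)

   # Hypothetical helper (not part of arrow): one snapshot of R's heap
   # usage and the Arrow C++ memory pool
   memory_snapshot <- function() {
     pool <- default_memory_pool()
     c(
       # gc() runs a collection and returns a matrix; column 2 is "used" in Mb
       r_heap_mb = sum(gc()[, 2]),
       arrow_bytes = pool$bytes_allocated,
       arrow_peak_bytes = pool$max_memory
     )
   }

   before <- memory_snapshot()
   # Decimal conversion can't re-use R memory, so Arrow has to allocate
   arr <- as_arrow_array(1:10, type = decimal(10, 3))
   after <- memory_snapshot()

   # Change in each counter caused by creating the array
   after - before
   ```

   Anything that only moves `arrow_bytes` is memory that `gc()` will never report, which is one reason the numbers can differ.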

