[ 
https://issues.apache.org/jira/browse/ARROW-17441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Will Jones updated ARROW-17441:
-------------------------------
    Description: 
I was trying to reproduce another issue involving memory pools not releasing
memory, but encountered this confusing behavior: if I create a table, then call
{{del table}}, and then {{pool.release_unused()}}, I still see
significant memory usage. On mimalloc in particular, I see no meaningful drop
in memory usage on either call.

Am I missing something? My prior understanding was that a memory pool may hold
on to freed memory, but that {{release_unused()}} forces it to be returned, and
that the system memory pool releases memory immediately. Neither of those seems
to be true.
{code:python}
import gc
import os
import time
from uuid import uuid4

import numpy as np
import psutil
import pyarrow as pa

process = psutil.Process(os.getpid())

def gen_batches(n_groups=200, rows_per_group=200_000):
    for _ in range(n_groups):
        id_val = uuid4().bytes
        yield pa.table({
            "x": np.random.random(rows_per_group), # This will compress poorly
            "y": np.random.random(rows_per_group),
            # This compresses with delta encoding
            "a": pa.array(list(range(rows_per_group)), type=pa.int32()),
            # This compresses with RLE
            "id": pa.array([id_val] * rows_per_group),
        })

def print_rss():
    print(f"RSS: {process.memory_info().rss:,} bytes")

print(f"memory_pool={pa.default_memory_pool().backend_name}")
print_rss()
print("reading table")
tab = pa.concat_tables(list(gen_batches()))
print_rss()
print("deleting table")
del tab
gc.collect()
print_rss()
print("releasing unused memory")
pa.default_memory_pool().release_unused()
print_rss()
print("waiting 10 seconds")
time.sleep(10)
print_rss()
{code}
{code:none}
> ARROW_DEFAULT_MEMORY_POOL=mimalloc python test_pool.py && \
    ARROW_DEFAULT_MEMORY_POOL=jemalloc python test_pool.py && \
    ARROW_DEFAULT_MEMORY_POOL=system python test_pool.py
memory_pool=mimalloc
RSS: 44,449,792 bytes
reading table
RSS: 1,819,557,888 bytes
deleting table
RSS: 1,819,590,656 bytes
releasing unused memory
RSS: 1,819,852,800 bytes
waiting 10 seconds
RSS: 1,819,852,800 bytes
memory_pool=jemalloc
RSS: 45,629,440 bytes
reading table
RSS: 1,668,677,632 bytes
deleting table
RSS: 698,400,768 bytes
releasing unused memory
RSS: 699,023,360 bytes
waiting 10 seconds
RSS: 699,023,360 bytes
memory_pool=system
RSS: 44,875,776 bytes
reading table
RSS: 1,713,569,792 bytes
deleting table
RSS: 540,311,552 bytes
releasing unused memory
RSS: 540,311,552 bytes
waiting 10 seconds
RSS: 540,311,552 bytes
{code}
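
To put numbers on the output above, here is a small sketch that computes how much of the RSS growth caused by building the table is still resident at the last reading. The readings are copied from the run above. The helper name {{retained_fraction}} and the choice of the first reading as the baseline are assumptions made for illustration, and RSS also includes memory that Arrow's pools do not manage, so this is only a rough indicator.

```python
# RSS readings copied from the reported run, in bytes, in the order printed:
# start, after building the table, after del + gc.collect(),
# after release_unused(), after the 10 second wait.
REPORTED_RSS = {
    "mimalloc": [44_449_792, 1_819_557_888, 1_819_590_656,
                 1_819_852_800, 1_819_852_800],
    "jemalloc": [45_629_440, 1_668_677_632, 698_400_768,
                 699_023_360, 699_023_360],
    "system": [44_875_776, 1_713_569_792, 540_311_552,
               540_311_552, 540_311_552],
}


def retained_fraction(readings):
    """Share of the RSS growth (peak minus baseline) still resident at the end.

    Hypothetical helper: treats the first reading as the baseline, the second
    as the peak, and the last as the final state.
    """
    baseline, peak, final = readings[0], readings[1], readings[-1]
    return (final - baseline) / (peak - baseline)


for backend, readings in REPORTED_RSS.items():
    share = retained_fraction(readings)
    print(f"{backend}: {share:.1%} of the RSS growth is still resident")
```

By this measure mimalloc keeps essentially all of the growth resident, while jemalloc and the system allocator keep roughly 40% and 30% respectively, and {{release_unused()}} does not lower any of the three.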

  was:
I was trying to reproduce another issue involving memory pools not releasing
memory, but encountered this confusing behavior: if I create a table, then call
{{del table}}, and then {{pool.release_unused()}}, I still see
significant memory usage. On mimalloc in particular, I see no meaningful drop
in memory usage on either call.

Am I missing something?


> [Python] Memory kept after del and pool.release_unused()
> --------------------------------------------------------
>
>                 Key: ARROW-17441
>                 URL: https://issues.apache.org/jira/browse/ARROW-17441
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: Python
>    Affects Versions: 9.0.0
>            Reporter: Will Jones
>            Priority: Major
>



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
