[jira] [Commented] (ARROW-2879) [Python] Arrow plasma can only use a small part of specified shared memory

Wenjun Si (JIRA) Mon, 11 Mar 2019 23:40:39 -0700


    [ https://issues.apache.org/jira/browse/ARROW-2879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16790278#comment-16790278 ]


Wenjun Si commented on ARROW-2879:
----------------------------------

After changing the line at
https://github.com/apache/arrow/blob/7ddad36e0fd3707f0a893bbfda3e2f9149909708/cpp/src/plasma/malloc.cc#L79:15
to

{code}
constexpr int GRANULARITY_MULTIPLIER = 1;
{code}

we got the desired memory sizes:

{code}
(1073741824, 1040187392)
(2147483648, 2080374784)
(3221225472, 3120562176)
(4294967296, 4160749568)
(5368709120, 5205131264)
(6442450944, 6245318656)
(7516192768, 7285506048)
(8589934592, 8325693440)
(9663676416, 9370075136)
(10737418240, 10410262528)
(11811160064, 11450449920)
{code}
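
To make the numbers above easier to read, here is a minimal sketch that computes the second value of each pair as a fraction of the first. It assumes each pair is (memory configured for the store, bytes that could actually be used); the comment does not state this explicitly. The helper name `usable_fraction` is hypothetical and only for illustration.

```python
# Pairs copied verbatim from the results above.
# ASSUMPTION: each pair is (configured store size, usable bytes).
results = [
    (1073741824, 1040187392),
    (2147483648, 2080374784),
    (3221225472, 3120562176),
    (4294967296, 4160749568),
    (5368709120, 5205131264),
    (6442450944, 6245318656),
    (7516192768, 7285506048),
    (8589934592, 8325693440),
    (9663676416, 9370075136),
    (10737418240, 10410262528),
    (11811160064, 11450449920),
]


def usable_fraction(configured, usable):
    """Return usable / configured as a float (hypothetical helper)."""
    return usable / configured


fractions = [usable_fraction(c, u) for c, u in results]

for (configured, usable), frac in zip(results, fractions):
    # 2**30 bytes = 1 GiB
    print(f"{configured / 2**30:5.1f} GiB configured -> {frac:.2%} usable")
```

Under that assumption, every row comes out a little below 97%, compared with the 496000000 bytes out of a 1G store (under half) in the original report quoted below.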

We have not yet tested whether this change causes other problems.

> [Python] Arrow plasma can only use a small part of specified shared memory
> --------------------------------------------------------------------------
>
>                 Key: ARROW-2879
>                 URL: https://issues.apache.org/jira/browse/ARROW-2879
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>            Reporter: chineking
>            Priority: Major
>             Fix For: 0.14.0
>
>
> Hi, thanks for the great work on Arrow; it helps us a lot.
> However, we encountered a problem while using plasma.
> Sample code:
> {code:python}
> import numpy as np
> import pyarrow as pa
> import pyarrow.plasma as plasma
> client = plasma.connect("/tmp/plasma", "", 0)
> puts = []
> nbytes = 0
> while True:
>     a = np.ones((1000, 1000))
>     try:
>         oid = client.put(a)
>         puts.append(client.get(oid))
>         nbytes += a.nbytes
>     except pa.lib.PlasmaStoreFull:
>         print('use nbytes', nbytes)
>         break
> {code}
> We start a plasma store with 1G of memory, but the nbytes output above is only 
> 496000000, which is less than half of the memory we specified.
> I cannot figure out why plasma can use only such a small part of the shared 
> memory. Could anybody help me? Thanks a lot.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
