Hi Rob,

I only looked briefly at your example so far, but it seems to be dominated 
by one big string. Therefore you should definitely try PyPy 7.1, since that 
release reduces the memory footprint of strings dramatically.
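
If you need to prove which interpreter a deployment is actually running, a 
minimal runtime check like this works (just an illustrative sketch; 
sys.pypy_version_info only exists on PyPy, so it is safe to run anywhere):

import sys

# PyPy reports its own release number separately from the Python
# language version it implements (sys.version_info).
if hasattr(sys, "pypy_version_info"):
    print("PyPy", ".".join(str(p) for p in sys.pypy_version_info[:3]))
else:
    print("CPython", ".".join(str(p) for p in sys.version_info[:3]))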

Cheers, 

Carl Friedrich

On August 22, 2019 1:34:11 PM GMT+02:00, Robert Whitcher 
<robert.whitc...@rubrik.com> wrote:
>Shared the test file with Carl (to avoid posting it to everyone's inbox).
>The PyPy version is currently 6.0 (I don't have the ability to effect a
>change here unless I can prove something).
>
>On Thu, Aug 22, 2019 at 1:00 AM Carl Friedrich Bolz-Tereick
><cfb...@gmx.de>
>wrote:
>
>> Hi Rob,
>>
>> Which version of PyPy are you running this with? I have a long-running
>> branch that I really should merge someday that is supposed to help with
>> the memory consumption of JSON deserialization. Is there a chance you
>> could share an (anonymized) version of your test file?
>>
>> Alternatively, you could try a nightly build from this branch yourself:
>>
>> http://buildbot.pypy.org/nightly/json-decoder-maps-py3.6/
>>
>> Carl Friedrich
>>
>> On August 22, 2019 1:02:42 AM GMT+02:00, Robert Whitcher <
>> robert.whitc...@rubrik.com> wrote:
>>>
>>> Hi,
>>> I am running a very simple test case (as we are hitting OOMs on our
>>> larger PyPy deployments) and I'd love some help understanding what is
>>> happening here...
>>> We have a lot of processes that send messages to each other.
>>> These messages can be large JSON serializations of objects.
>>> But the memory being consumed seems disproportionate and is hard to
>>> manage across processes.
>>>
>>> I have this loop running:
>>>
>>> import time
>>> import json
>>>
>>> def main():
>>>     # Read the (11MB) JSON message from disk once.
>>>     with open("/tmp/test12334.1234", "r") as f:
>>>         json_msg = f.read()
>>>
>>>     # Repeatedly deserialize it, holding the result between iterations.
>>>     while True:
>>>         j = json.loads(json_msg)
>>>         time.sleep(10)
>>>
>>> if __name__ == "__main__":
>>>     main()
>>>
>>>
>>> I have tried three separate runs on both pypy and cpython (one way to
>>> capture all three numbers in a single run is sketched just below):
>>> The first does nothing but the sleep (it neither reads the file nor
>>> parses the JSON).
>>> The second just reads json_msg from the file.
>>> The third is the full loop.
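>>>
>>> (A rough sketch of capturing all three numbers in one run, assuming
>>> Linux, where ru_maxrss is reported in kilobytes:)
>>>
>>> import json
>>> import resource
>>>
>>> def rss_mb():
>>>     # Peak resident set size of this process so far (KB on Linux).
>>>     return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024.0
>>>
>>> print("startup: %.0f MB" % rss_mb())
>>>
>>> with open("/tmp/test12334.1234", "r") as f:
>>>     json_msg = f.read()
>>> print("after reading the file: %.0f MB" % rss_mb())
>>>
>>> j = json.loads(json_msg)
>>> print("after json.loads: %.0f MB" % rss_mb())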
>>>
>>> If I run this in cpython I get 80MB, 92MB and 136MB respectively.
>>> This makes sense, as the file is an 11MB JSON serialization of a
>>> dictionary and json.loads takes up some memory.
>>>
>>> However, if I run this in pypy I get 120MB, 153MB and between 360MB
>>> and 405MB once it settles out.
>>> I understand the JIT and startup memory being higher, and spending a
>>> little more while loading the string, but WOW does json.loads chew up
>>> a lot of memory.
>>>
>>> Multiplied across processes, that memory is eating a lot.
>>>
>>> What easy things am I missing?
>>>
>>> Thanks,
>>> Rob
>>>
>>>
>>>
>
>-- 
>Robert Whitcher
>Member of Technical Staff at Rubrik
>M 512-633-1771  E robert.whitc...@rubrik.com  W www.rubrik.com
_______________________________________________
pypy-dev mailing list
pypy-dev@python.org
https://mail.python.org/mailman/listinfo/pypy-dev
