Hi Brad,
The new JSON decoder caches values, but not across json.load calls. I.e.,
if you are processing many tiny messages rather than a single gigantic one,
the caching of (string) values cannot be the problem.
What *could* be the problem, in theory, is the key caching. The
key cache is partially persisted across calls. Are there arbitrarily
many different keys in your messages? The key cache shouldn't grow
without bound, however.
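
One rough way to answer the question about keys is to count the distinct keys seen across a sample of messages. This is only a sketch: the function names and the sample messages are made up for illustration, and it assumes each message is a JSON string that decodes to nested dicts and lists.

```python
import json


def collect_keys(value, seen):
    """Recursively add every dict key in a decoded JSON value to `seen`."""
    if isinstance(value, dict):
        for key, child in value.items():
            seen.add(key)
            collect_keys(child, seen)
    elif isinstance(value, list):
        for child in value:
            collect_keys(child, seen)
    return seen


def count_distinct_keys(raw_messages):
    """Return how many different keys occur across all messages."""
    seen = set()
    for raw in raw_messages:
        collect_keys(json.loads(raw), seen)
    return len(seen)


if __name__ == "__main__":
    # Hypothetical sample: the "payload" objects use data-dependent keys.
    sample = [
        '{"id": 1, "payload": {"user-1": 3}}',
        '{"id": 2, "payload": {"user-2": 4}}',
    ]
    print(count_distinct_keys(sample))
```

If that number keeps climbing as more messages are fed in (for instance because IDs or timestamps are used as keys), the key cache is worth a closer look. If it levels off quickly, it is probably not the culprit.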
What do you set your max heap size to?
To debug further, I would first try to establish whether it is indeed the json
module or something else. Maybe you could try another JSON decoder and
see whether the problem persists with that? E.g. ujson works on PyPy
(slowly), or you could even use the pure-Python builtin one that you get
by importing json.decoder.JSONDecoder.
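
A minimal sketch of such a comparison, under a few assumptions: the helper names (peak_rss, run_decoder) and the generated messages are hypothetical, the resource module is only available on Unix-like systems, and ru_maxrss reports the peak resident set size (kilobytes on Linux, bytes on macOS), so it can show growth but never shrinkage.

```python
from __future__ import print_function

import json
import resource


def peak_rss():
    """Peak resident set size of this process (units depend on the OS)."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


def run_decoder(decode, raw_messages, sample_every=1000):
    """Decode every message with `decode`, sampling peak RSS periodically.

    Returns a list of (messages_decoded, peak_rss) tuples.
    """
    samples = []
    for count, raw in enumerate(raw_messages, 1):
        decode(raw)
        if count % sample_every == 0:
            samples.append((count, peak_rss()))
    return samples


if __name__ == "__main__":
    # Stand-in for the real message stream.
    messages = ('{"id": %d, "name": "msg"}' % i for i in range(5000))
    # Swap in json.loads or ujson.loads here to compare decoders.
    decode = json.decoder.JSONDecoder().decode
    for count, rss in run_decoder(decode, messages):
        print(count, rss)
```

Because the measurement is a per-process peak, it is probably cleanest to test one decoder per fresh process and compare how the samples develop over a long run of real messages.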
Cheers,
Carl Friedrich
On 26.05.21 16:39, Brad Kish wrote:
Sorry, the containers are not dying. Our processes are exiting due to gc_max
settings.
Brad.
On May 26, 2021, at 10:31 AM, Brad Kish <brad.k...@arcticwolf.com> wrote:
We have a streaming application that processes JSON messages. Our system has
been mostly stable running PyPy2.7 6.0 for a couple of years.
Recently we tried upgrading to 7.3.4 and saw our processes leak memory until
the container was killed. This happened with both PyPy2.7 7.3.4 and PyPy3.7
7.3.4. The only difference was that CPU usage with PyPy3.7 was almost double. :(
My question is: could this be from the new JSON parser introduced in 7.2?
According to https://morepypy.blogspot.com/2019/10/pypys-new-json-parser.html,
the parser now caches values, not just keys. Is there any way to disable the
value caching? Or are there other things we can try?
Thanks,
Brad.
_______________________________________________
pypy-dev mailing list
pypy-dev@python.org
https://mail.python.org/mailman/listinfo/pypy-dev