Re: [Python-Dev] Can I make marshal.dumps() slower but stabler?
On Fri, Jul 13, 2018 at 11:46 PM Ivan Pozdeev via Python-Dev wrote:
>
> If the use case for stability is only .pyc compilation, I doubt it's even
> relevant 'cuz .pyc's are supposed to be compiled in isolation from other
> current objects (otherwise, they wouldn't be reusable or would be invalidated
> when dependent modules change, neither of which is the case), so relevant
> reference counts should always be the same.
> I may be mistaken though.
>

Good point! You're right.

Currently, there is one known unstable pyc issue (apart from frozenset order, which can be fixed with PYTHONHASHSEED):
https://bugzilla.opensuse.org/show_bug.cgi?id=1049186

It is caused by interned strings. Because of interning, the reference count of a string can be unstable.

Similarly, long objects, tuples and some other objects are cached and reused automatically. But they always have refcnt>1: one reference from the object and one from the cache.

So we can always use FLAG_REF for interned strings, even if refcnt==1.
Let's try it and wait until another issue is found.

Thanks!
--
INADA Naoki
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
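As a rough illustration of the FLAG_REF point above: in CPython's marshal format the first byte of a payload is a type code, and for format versions 3 and later the high bit (0x80) marks an object that later references may point back to. The sketch below only inspects that bit. The helper name `has_flag_ref` and the sample string are made up for this note, and whether the bit is set under the default version depends on the interpreter and on the object's reference count, which is the instability being discussed.

```python
import marshal

FLAG_REF = 0x80  # high bit of the type byte, as defined in CPython's marshal.c


def has_flag_ref(payload: bytes) -> bool:
    """Return True if the top-level object was written with FLAG_REF set."""
    return bool(payload[0] & FLAG_REF)


name = "some_identifier"  # hypothetical sample value

# Format version 2 predates FLAG_REF, so the bit is never set there.
print(has_flag_ref(marshal.dumps(name, 2)))

# With the default version the result may vary with refcount and interning.
print(has_flag_ref(marshal.dumps(name)))
```

Comparing the first byte of two builds of the same constant is one cheap way to check whether a difference between two .pyc files comes from this flag rather than from the data itself.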
Re: [Python-Dev] Can I make marshal.dumps() slower but stabler?
If the use case for stability is only .pyc compilation, I doubt it's even relevant 'cuz .pyc's are supposed to be compiled in isolation from other current objects (otherwise, they wouldn't be reusable or would be invalidated when dependent modules change, neither of which is the case), so relevant reference counts should always be the same.
I may be mistaken though.

On 13.07.2018 16:57, Christian Tismer wrote:
> Well, to my knowledge they did not modify the marshal code.
> They are in fact heavily dependent on marshal speed since that
> is used frequently to save and restore state of many actors.
> But haven't looked further since 2010 ;-)
>
> Btw., why are they considering making the algorithm slower,
> just because someone wants the algorithm stable?
>
> An optional keyword argument would give the stability, and the
> default behavior would not be changed at all.
>
> Cheers - Chris
>
> On 12.07.18 12:07, Steve Holden wrote:
> > Eve is indeed based on stackless 2, and they are well capable of ignoring
> > changes they don't think they need (or were when I was working with
> > them). At one point I seem to remember they optimised their interpreter
> > to use singleton floating-point values, saving large quantities of
> > memory by having only one floating-point zero.
> >
> > Steve Holden
> >
> > On Thu, Jul 12, 2018 at 9:55 AM, Alex Walters wrote:
> > >
> > > > -Original Message-
> > > > From: Python-Dev On Behalf Of Victor Stinner
> > > > Sent: Thursday, July 12, 2018 4:01 AM
> > > > To: Serhiy Storchaka
> > > > Cc: python-dev
> > > > Subject: Re: [Python-Dev] Can I make marshal.dumps() slower but stabler?
> > > >
> > > > 2018-07-12 8:21 GMT+02:00 Serhiy Storchaka:
> > > > >> Is there any real application for which marshal.dumps() performance is
> > > > >> critical?
> > > > >
> > > > > EVE Online is a well known example.
> > > >
> > > > EVE Online was created in 2003. I guess that it still uses Python 2.7.
> > > >
> > > > I'm not sure that a video game would pick marshal in 2018.
> > > >
> > >
> > > EVE doesn't use stock CPython, IIRC. They use a version of stackless 2,
> > > with their own patches. If a company is willing to patch python itself, I
> > > don't think their practices should be cited without more context about what
> > > they actually modified.
> > >
> > > > Victor

--
Regards,
Ivan
Re: [Python-Dev] Can I make marshal.dumps() slower but stabler?
Well, to my knowledge they did not modify the marshal code.
They are in fact heavily dependent on marshal speed since that is used frequently to save and restore state of many actors.
But haven't looked further since 2010 ;-)

Btw., why are they considering making the algorithm slower, just because someone wants the algorithm stable?

An optional keyword argument would give the stability, and the default behavior would not be changed at all.

Cheers - Chris

On 12.07.18 12:07, Steve Holden wrote:
> Eve is indeed based on stackless 2, and they are well capable of ignoring
> changes they don't think they need (or were when I was working with
> them). At one point I seem to remember they optimised their interpreter
> to use singleton floating-point values, saving large quantities of
> memory by having only one floating-point zero.
>
> Steve Holden
>
> On Thu, Jul 12, 2018 at 9:55 AM, Alex Walters wrote:
> >
> > > -Original Message-
> > > From: Python-Dev On Behalf Of Victor Stinner
> > > Sent: Thursday, July 12, 2018 4:01 AM
> > > To: Serhiy Storchaka
> > > Cc: python-dev
> > > Subject: Re: [Python-Dev] Can I make marshal.dumps() slower but stabler?
> > >
> > > 2018-07-12 8:21 GMT+02:00 Serhiy Storchaka:
> > > >> Is there any real application for which marshal.dumps() performance is
> > > >> critical?
> > > >
> > > > EVE Online is a well known example.
> > >
> > > EVE Online was created in 2003. I guess that it still uses Python 2.7.
> > >
> > > I'm not sure that a video game would pick marshal in 2018.
> > >
> >
> > EVE doesn't use stock CPython, IIRC. They use a version of stackless 2,
> > with their own patches. If a company is willing to patch python itself, I
> > don't think their practices should be cited without more context about what
> > they actually modified.
> >
> > > Victor

--
Christian Tismer-Sperling    :^)   tis...@stackless.com
Software Consulting          :     http://www.stackless.com/
Karl-Liebknecht-Str. 121     :     http://pyside.org
14482 Potsdam                :     GPG key -> 0xE7301150FB7BEE0E
phone +49 173 24 18 776  fax +49 (30) 700143-0023
Re: [Python-Dev] Can I make marshal.dumps() slower but stabler?
On Thursday, 12 July 2018 22:09:41 CEST Antoine Pitrou wrote:
> On Thu, 12 Jul 2018 22:03:30 +0200
> André Malo wrote:
> > * INADA Naoki wrote:
> > > Is there any real application for which marshal.dumps() performance is
> > > critical?
> >
> > I'm using it for spooling big chunks of data on disk, exactly for the
> > reason that it's faster than pickle.
>
> Which kind of data is that?

Basically iterators of builtin objects (dicts or tuples of strings or numbers). Typically one unit or "row" per dumps() call (they are written one after the next, and marshal.load() can easily load them in the same manner).

They're certainly never the same objects (except maybe for dict keys, which might be interned).

Cheers,
nd
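The row-per-call spooling described above could look roughly like the following. This is a minimal sketch, not André's actual code: the function names `spool_rows` and `read_rows` and the sample rows are invented for illustration. It relies on `marshal.dump()` appending one record per call and on `marshal.load()` raising `EOFError` once the file is exhausted.

```python
import marshal
import tempfile


def spool_rows(rows, fileobj):
    """Write each row as its own marshal record, one after the next."""
    for row in rows:
        marshal.dump(row, fileobj)


def read_rows(fileobj):
    """Yield rows back until marshal.load() reaches end of file."""
    while True:
        try:
            yield marshal.load(fileobj)
        except EOFError:
            return


# Hypothetical sample data: dicts and tuples of strings and numbers.
rows = [{"id": 1, "name": "a"}, ("b", 2, 3.5), {"id": 3, "name": "c"}]

with tempfile.TemporaryFile() as spool:
    spool_rows(rows, spool)
    spool.seek(0)
    restored = list(read_rows(spool))

print(restored == rows)
```

Because every row is dumped on its own, object sharing can only matter within a single row, which is why the identity question raised in this thread is mostly about repeated objects inside one `dumps()` call.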
Re: [Python-Dev] Can I make marshal.dumps() slower but stabler?
On Fri, Jul 13, 2018 at 5:03 AM André Malo wrote:
>
> * INADA Naoki wrote:
>
> > Is there any real application for which marshal.dumps() performance is
> > critical?
>
> I'm using it for spooling big chunks of data on disk, exactly for the reason
> that it's faster than pickle.
>
> Cheers,

Does your data contain repetitions of the same object (not just the same value)?

If yes, this change will affect you.
If no, you can use an older marshal version, which doesn't have the overhead of checking object identity.

>>> x = [0]*100
>>> y = [0]*100
>>> data = [x,y,x]
>>> import marshal
>>> len(marshal.dumps(data))  # x is marshaled once
1020
>>> d = marshal.loads(marshal.dumps(data))
>>> d[0] is d[2]
True
>>> d[0] is d[1]
False
>>> import json
>>> len(json.dumps(data))  # x is marshaled twice
906
>>> len(marshal.dumps(data, 2))  # x is marshaled twice
1520
>>> d = marshal.loads(marshal.dumps(data, 2))
>>> d[0] is d[2]
False

--
INADA Naoki
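To get a feel for the identity-checking overhead mentioned above, one could time the two format versions on the same data. This is only a sketch: the helper `time_dumps` and the iteration count are arbitrary choices for this note, and the timings will differ between machines and Python releases, so no figures are claimed here.

```python
import marshal
import timeit

x = [0] * 100
y = [0] * 100
data = [x, y, x]


def time_dumps(version, number=2000):
    """Return seconds spent on `number` calls of marshal.dumps(data, version)."""
    return timeit.timeit(lambda: marshal.dumps(data, version), number=number)


# Version 2 has no object references; marshal.version is the current default.
for version in (2, marshal.version):
    size = len(marshal.dumps(data, version))
    print(f"version {version}: {size} bytes, {time_dumps(version):.4f} s")
```

The trade-off shows up in both columns: the older version skips the reference bookkeeping but writes `x` twice, so its payload is larger.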
Re: [Python-Dev] Can I make marshal.dumps() slower but stabler?
On Thu, 12 Jul 2018 22:03:30 +0200
André Malo wrote:
> * INADA Naoki wrote:
>
> > Is there any real application for which marshal.dumps() performance is
> > critical?
>
> I'm using it for spooling big chunks of data on disk, exactly for the reason
> that it's faster than pickle.

Which kind of data is that?

Regards

Antoine.
Re: [Python-Dev] Can I make marshal.dumps() slower but stabler?
* INADA Naoki wrote:

> Is there any real application for which marshal.dumps() performance is
> critical?

I'm using it for spooling big chunks of data on disk, exactly for the reason that it's faster than pickle.

Cheers,
--
"Gates's behaviour had proven to me that I did not need to count on him and his two companions"
-- Karl May, "Winnetou III"
Re: [Python-Dev] Can I make marshal.dumps() slower but stabler?
On Thu, 12 Jul 2018 09:21:55 +0300
Serhiy Storchaka wrote:
> > Is there any real application for which marshal.dumps() performance is
> > critical?
>
> EVE Online is a well known example.
>
> What if we write a script which loads .pyc files and stabilizes them? This
> could solve the problem for applications which need stable .pyc files,
> with zero impact on common use.

Should python-dev maintain that script? If yes, it sounds better to make marshal itself deterministic.

Regards

Antoine.
Re: [Python-Dev] Can I make marshal.dumps() slower but stabler?
Eve is indeed based on stackless 2, and they are well capable of ignoring changes they don't think they need (or were when I was working with them). At one point I seem to remember they optimised their interpreter to use singleton floating-point values, saving large quantities of memory by having only one floating-point zero.

Steve Holden

On Thu, Jul 12, 2018 at 9:55 AM, Alex Walters wrote:
>
> > -Original Message-
> > From: Python-Dev On Behalf Of Victor Stinner
> > Sent: Thursday, July 12, 2018 4:01 AM
> > To: Serhiy Storchaka
> > Cc: python-dev
> > Subject: Re: [Python-Dev] Can I make marshal.dumps() slower but stabler?
> >
> > 2018-07-12 8:21 GMT+02:00 Serhiy Storchaka:
> > >> Is there any real application for which marshal.dumps() performance is
> > >> critical?
> > >
> > > EVE Online is a well known example.
> >
> > EVE Online was created in 2003. I guess that it still uses Python 2.7.
> >
> > I'm not sure that a video game would pick marshal in 2018.
> >
>
> EVE doesn't use stock CPython, IIRC. They use a version of stackless 2,
> with their own patches. If a company is willing to patch python itself, I
> don't think their practices should be cited without more context about what
> they actually modified.
>
> > Victor
Re: [Python-Dev] Can I make marshal.dumps() slower but stabler?
> -Original Message-
> From: Python-Dev On Behalf Of Victor Stinner
> Sent: Thursday, July 12, 2018 4:01 AM
> To: Serhiy Storchaka
> Cc: python-dev
> Subject: Re: [Python-Dev] Can I make marshal.dumps() slower but stabler?
>
> 2018-07-12 8:21 GMT+02:00 Serhiy Storchaka:
> >> Is there any real application for which marshal.dumps() performance is
> >> critical?
> >
> > EVE Online is a well known example.
>
> EVE Online was created in 2003. I guess that it still uses Python 2.7.
>
> I'm not sure that a video game would pick marshal in 2018.
>

EVE doesn't use stock CPython, IIRC. They use a version of stackless 2, with their own patches. If a company is willing to patch python itself, I don't think their practices should be cited without more context about what they actually modified.

> Victor
Re: [Python-Dev] Can I make marshal.dumps() slower but stabler?
On Thu, Jul 12, 2018 at 3:22 PM Serhiy Storchaka wrote:
>
> 12.07.18 08:43, INADA Naoki wrote:
> > I'm working on making pyc stable, via stabilizing marshal.dumps()
> > https://bugs.python.org/issue34093
>
> This is not enough for making pyc stable. The order in frozensets still
> is arbitrary.

But we can use PYTHONHASHSEED to make pyc stable. Currently, refcnt is the only known issue for reproducible pyc builds.

> > Sadly, it makes marshal.dumps() 40% slower.
> > Luckily, this overhead is small (only 4%) for dumps(compile(source)) case.
>
> What about the memory consumption?

No overhead, because we already use the same hashtable for w_ref. I just make it two-pass instead of one-pass.

> > So my question is: May I remove unstable but faster code?
> >
> > Or should I make this optional and we maintain two complex code?
> > If so, should this option enabled by default or not?
>
> My concern is that even if not make it optional, this will complicate
> the code.

When it's not optional, it adds a near-duplicate of w_object that counts references in the object tree.
https://github.com/python/cpython/pull/8226/commits/e170116e80dfd27f923c88fc11e42f0d6f687a00

> > For example, xmlrpc uses marshal. But xmlrpc has significant overhead
> > other than marshaling, like dumps(compile(source)) case. So I expect
> > marshal.dumps() performance is not critical for it too.
>
> xmlrpc doesn't use the marshal module. It uses terms marshalling and
> unmarshalling, but in different meaning.

Oh, I just grepped and misunderstood.

> > Is there any real application for which marshal.dumps() performance is critical?
>
> EVE Online is a well known example.

Do they use version>=3? FLAG_REF was introduced in version 3, and it already added significant runtime overhead. If marshaling speed is very important, version<=2 should be used.

> What if write a script which loads .pyc files and stabilize them? This
> could solve the problem for applications which need stable .pyc files,
> with zero impact on common use.

Hmm, which do you mean?

* Adding marshal.dump_stable_pyc() and using it like `marshal.dump_stable_pyc(marshal.loads(code))`
* Implementing a pure Python marshal.dumps in distutils

--
INADA Naoki
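The two-pass idea mentioned above (first count how often each object occurs in the tree, then write references only for objects seen more than once) can be sketched in pure Python. This is an illustration of the counting pass only and is not the code from the pull request; the names `count_occurrences` and `shared_objects` are hypothetical, and only a few builtin container types are walked.

```python
from collections import Counter


def count_occurrences(root):
    """First pass: count how often each object identity appears in the tree."""
    counts = Counter()

    def visit(obj):
        counts[id(obj)] += 1
        if counts[id(obj)] > 1:
            return  # children were already walked on the first visit
        if isinstance(obj, (list, tuple, set, frozenset)):
            for item in obj:
                visit(item)
        elif isinstance(obj, dict):
            for key, value in obj.items():
                visit(key)
                visit(value)

    visit(root)
    return counts


def shared_objects(root):
    """Ids seen more than once, i.e. candidates for FLAG_REF in a second pass."""
    return {obj_id for obj_id, n in count_occurrences(root).items() if n > 1}


x = [0] * 100
y = [0] * 100
data = [x, y, x]

shared = shared_objects(data)
print(id(x) in shared, id(y) in shared)
```

The point of deciding from occurrences inside the tree, rather than from the interpreter-wide reference count, is that the result depends only on the data being dumped and not on what else happens to hold a reference at that moment.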
Re: [Python-Dev] Can I make marshal.dumps() slower but stabler?
2018-07-12 8:21 GMT+02:00 Serhiy Storchaka:
>> Is there any real application for which marshal.dumps() performance is
>> critical?
>
> EVE Online is a well known example.

EVE Online was created in 2003. I guess that it still uses Python 2.7.

I'm not sure that a video game would pick marshal in 2018.

Victor
Re: [Python-Dev] Can I make marshal.dumps() slower but stabler?
12.07.18 08:43, INADA Naoki wrote:
> I'm working on making pyc stable, via stabilizing marshal.dumps()
> https://bugs.python.org/issue34093

This is not enough for making pyc stable. The order in frozensets still is arbitrary.

> Sadly, it makes marshal.dumps() 40% slower.
> Luckily, this overhead is small (only 4%) for dumps(compile(source)) case.

What about the memory consumption?

> So my question is: May I remove unstable but faster code?
>
> Or should I make this optional and we maintain two complex code?
> If so, should this option enabled by default or not?

My concern is that even if we do not make it optional, this will complicate the code.

> For example, xmlrpc uses marshal. But xmlrpc has significant overhead
> other than marshaling, like dumps(compile(source)) case. So I expect
> marshal.dumps() performance is not critical for it too.

xmlrpc doesn't use the marshal module. It uses the terms marshalling and unmarshalling, but with a different meaning.

> Is there any real application for which marshal.dumps() performance is critical?

EVE Online is a well known example.

What if we write a script which loads .pyc files and stabilizes them? This could solve the problem for applications which need stable .pyc files, with zero impact on common use.
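The frozenset ordering point can be probed by marshalling the same frozenset in fresh interpreters started with different PYTHONHASHSEED values. The sketch below is an assumption-laden experiment, not something from the thread: the helper `dump_with_seed` and the sample strings are invented, and later CPython releases are believed to marshal sets in a reproducible order, in which case both comparisons would print True.

```python
import os
import subprocess
import sys

# One-liner run in a child interpreter; it prints the payload as hex.
SNIPPET = (
    "import marshal, sys; "
    "sys.stdout.write(marshal.dumps(frozenset(['spam', 'eggs', 'ham'])).hex())"
)


def dump_with_seed(seed):
    """Marshal the same frozenset in a fresh interpreter with a given hash seed."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    result = subprocess.run(
        [sys.executable, "-c", SNIPPET],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Same seed: the iteration order, and so the payload, should repeat.
print(dump_with_seed(0) == dump_with_seed(0))

# Different seeds: the payloads may differ because the element order may differ.
print(dump_with_seed(1) == dump_with_seed(2))
```

Pinning the seed is what the thread proposes for reproducible builds; it fixes the ordering without touching marshal itself.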