Re: [Python-Dev] Can I make marshal.dumps() slower but stabler?

2018-07-13 Thread INADA Naoki
On Fri, Jul 13, 2018 at 11:46 PM Ivan Pozdeev via Python-Dev
 wrote:
>
> If the use case for stability is only .pyc compilation, I doubt it's even 
> relevant 'cuz .pyc's are supposed to be compiled in isolation from other 
> current objects (otherwise, they wouldn't be reusable or would be invalidated 
> when dependent modules change, neither of which is the case), so relevant 
> reference counts should always be the same.
> I may be mistaken, though.
>

Good point!  You're right.

Currently, there is one known unstable pyc issue (apart from frozenset order,
which can be fixed with PYTHONHASHSEED):
https://bugzilla.opensuse.org/show_bug.cgi?id=1049186

It is caused by interned strings.
Because of interning, their reference counts can be unstable.

Similarly, long objects, tuples and some other objects are cached and reused
automatically.
But they always have refcnt>1: one reference from the referring object and one
from the cache.

So we can always use FLAG_REF for interned strings, even if refcnt==1.
Let's try it and wait until another issue is found.
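
To make the refcount dependence concrete, here is a minimal sketch, assuming CPython and a marshal format version of 3 or later. `FLAG_REF = 0x80` mirrors the constant in CPython's marshal.c; the helper names are made up for illustration. Whether the two payloads actually differ depends on the interpreter version, so the script only reports what it observes:

```python
import marshal

FLAG_REF = 0x80  # assumption: high bit of the type byte, as in CPython's marshal.c


def top_level_has_flag_ref(payload: bytes) -> bool:
    """Return True if the first type byte of a marshal payload has FLAG_REF set."""
    return bool(payload[0] & FLAG_REF)


def make_value():
    # Build a fresh list at run time so that nothing else refers to it.
    return [1, 2, 3]


# Here the list exists only as a temporary argument.
unreferenced = marshal.dumps(make_value())

# Here two names keep the list alive, so its refcount is higher.
kept = make_value()
alias = kept
referenced = marshal.dumps(kept)

print("temporary has FLAG_REF:", top_level_has_flag_ref(unreferenced))
print("aliased has FLAG_REF:  ", top_level_has_flag_ref(referenced))
print("byte-for-byte equal:   ", unreferenced == referenced)

# Whatever the flags are, both payloads decode to equal values.
assert marshal.loads(unreferenced) == marshal.loads(referenced) == [1, 2, 3]
```

If the two payloads differ on a given interpreter, the only difference between the inputs is how many references existed at dump time, which is exactly the kind of instability discussed here.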

Thanks!

-- 
INADA Naoki  
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Can I make marshal.dumps() slower but stabler?

2018-07-13 Thread Ivan Pozdeev via Python-Dev
If the use case for stability is only .pyc compilation, I doubt it's 
even relevant 'cuz .pyc's are supposed to be compiled in isolation from 
other current objects (otherwise, they wouldn't be reusable or would be 
invalidated when dependent modules change, neither of which is the 
case), so relevant reference counts should always be the same.

I may be mistaken, though.

--
Regards,
Ivan



Re: [Python-Dev] Can I make marshal.dumps() slower but stabler?

2018-07-13 Thread Christian Tismer
Well, to my knowledge they did not modify the marshal code.
They are in fact heavily dependent on marshal speed since it
is used frequently to save and restore the state of many actors.

But I haven't looked further since 2010 ;-)

Btw., why are they considering making the algorithm slower,
just because someone wants the algorithm stable?

An optional keyword argument would give the stability, and the
default behavior would not be changed at all.

Cheers - Chris


On 12.07.18 12:07, Steve Holden wrote:
> Eve is indeed based on stackless 2, and are well capable of ignoring
> changes they don't think they need (or were when I was working with
> them). At one point I seem to remember they optimised their interpreter
> to use singleton floating-point values, saving large quantities of
> memory by having only one floating-point zero.
> 
> Steve Holden
> 


-- 
Christian Tismer-Sperling:^)   tis...@stackless.com
Software Consulting  : http://www.stackless.com/
Karl-Liebknecht-Str. 121 : http://pyside.org
14482 Potsdam: GPG key -> 0xE7301150FB7BEE0E
phone +49 173 24 18 776  fax +49 (30) 700143-0023





Re: [Python-Dev] Can I make marshal.dumps() slower but stabler?

2018-07-13 Thread André Malo
On Donnerstag, 12. Juli 2018 22:09:41 CEST Antoine Pitrou wrote:
> On Thu, 12 Jul 2018 22:03:30 +0200
> André Malo wrote:
> > * INADA Naoki wrote:
> > > Is there any real application which marshal.dumps() performance is
> > > critical?
> > 
> > I'm using it for spooling big chunks of data on disk, exactly for the
> > reason that it's faster than pickle.
> 
> Which kind of data is that?

Basically iterators of builtin objects (dicts or tuples of strings or
numbers). Typically one unit or "row" per dumps() call (they are written one
after the next, and marshal.load() can easily read them back in the same
manner).
They're certainly never the same objects (except maybe for dict keys, which
might be interned).
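
A minimal sketch of the spooling pattern described above, assuming one `marshal.dump()` call per row and an `EOFError` from `marshal.load()` marking the end of the spool. The function names and the sample rows are made up for illustration, and an in-memory buffer stands in for a real file:

```python
import io
import marshal


def spool_rows(rows, fp):
    """Write each row with its own marshal.dump() call, one after the next."""
    for row in rows:
        marshal.dump(row, fp)


def read_rows(fp):
    """Yield rows back until marshal.load() reports the end of the data."""
    while True:
        try:
            yield marshal.load(fp)
        except EOFError:
            # Note: a truncated final row would also end the loop here.
            return


rows = [{"id": 1, "name": "a"}, ("b", 2, 3.5), {"id": 3, "name": "c"}]
buffer = io.BytesIO()  # stands in for a file opened in binary mode
spool_rows(rows, buffer)
buffer.seek(0)
restored = list(read_rows(buffer))
assert restored == rows
```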

Cheers,
nd




Re: [Python-Dev] Can I make marshal.dumps() slower but stabler?

2018-07-12 Thread INADA Naoki
On Fri, Jul 13, 2018 at 5:03 AM André Malo  wrote:
>
> * INADA Naoki wrote:
>
> > Is there any real application which marshal.dumps() performance is
> > critical?
>
> I'm using it for spooling big chunks of data on disk, exactly for the reason
> that it's faster than pickle.
>
> Cheers,

Does your data contain repeated references to the same object (not just the
same value)?

If yes, this change will affect you.
If no, you can use an older marshal version, which doesn't have the overhead of
checking object identity.

>>> import marshal
>>> x = [0]*100
>>> y = [0]*100
>>> data = [x,y,x]
>>> len(marshal.dumps(data))  # x is marshaled once
1020
>>> d = marshal.loads(marshal.dumps(data))
>>> d[0] is d[2]
True
>>> d[0] is d[1]
False
>>> import json
>>> len(json.dumps(data))  # x is serialized twice
906
>>> len(marshal.dumps(data, 2))  # x is marshaled twice
1520
>>> d = marshal.loads(marshal.dumps(data, 2))
>>> d[0] is d[2]
False

-- 
INADA Naoki  


Re: [Python-Dev] Can I make marshal.dumps() slower but stabler?

2018-07-12 Thread Antoine Pitrou
On Thu, 12 Jul 2018 22:03:30 +0200
André Malo  wrote:
> * INADA Naoki wrote:
> 
> > Is there any real application which marshal.dumps() performance is
> > critical?  
> 
> I'm using it for spooling big chunks of data on disk, exactly for the reason 
> that it's faster than pickle.

Which kind of data is that?

Regards

Antoine.




Re: [Python-Dev] Can I make marshal.dumps() slower but stabler?

2018-07-12 Thread André Malo
* INADA Naoki wrote:

> Is there any real application which marshal.dumps() performance is
> critical?

I'm using it for spooling big chunks of data on disk, exactly for the reason 
that it's faster than pickle.

Cheers,
-- 
"Gates's conduct had proved to me that I could not count on him and his
two companions" -- Karl May, "Winnetou III"


Re: [Python-Dev] Can I make marshal.dumps() slower but stabler?

2018-07-12 Thread Antoine Pitrou
On Thu, 12 Jul 2018 09:21:55 +0300
Serhiy Storchaka  wrote:
> 
> > Is there any real application which marshal.dumps() performance is 
> > critical?  
> EVE Online is a well known example.
> 
> What if write a script which loads .pyc files and stabilize them? This 
> could solve the problem for applications which need stable .pyc files, 
> with zero impact on common use.

Should python-dev maintain that script? If yes, it sounds better to
make marshal itself deterministic.

Regards

Antoine.




Re: [Python-Dev] Can I make marshal.dumps() slower but stabler?

2018-07-12 Thread Steve Holden
EVE is indeed based on Stackless 2, and they are well capable of ignoring
changes they don't think they need (or were when I was working with them).
At one point I seem to remember they optimised their interpreter to use
singleton floating-point values, saving large quantities of memory by
having only one floating-point zero.

Steve Holden

On Thu, Jul 12, 2018 at 9:55 AM, Alex Walters wrote:

> > -----Original Message-----
> > From: Python-Dev On Behalf Of Victor Stinner
> > Sent: Thursday, July 12, 2018 4:01 AM
> > To: Serhiy Storchaka
> > Cc: python-dev
> > Subject: Re: [Python-Dev] Can I make marshal.dumps() slower but stabler?
> >
> > 2018-07-12 8:21 GMT+02:00 Serhiy Storchaka:
> > >> Is there any real application which marshal.dumps() performance is
> > >> critical?
> > >
> > > EVE Online is a well known example.
> >
> > EVE Online has been created in 2003. I guess that it still uses Python
> > 2.7.
> >
> > I'm not sure that a video game would pick marshal in 2018.
> >
>
> EVE doesn't use stock CPython, IIRC.  They use a version of stackless 2,
> with their own patches.  If a company is willing to patch python itself, I
> don't think their practices should be cited without more context about what
> they actually modified.
>
> > Victor


Re: [Python-Dev] Can I make marshal.dumps() slower but stabler?

2018-07-12 Thread Alex Walters



> -----Original Message-----
> From: Python-Dev On Behalf Of Victor Stinner
> Sent: Thursday, July 12, 2018 4:01 AM
> To: Serhiy Storchaka
> Cc: python-dev
> Subject: Re: [Python-Dev] Can I make marshal.dumps() slower but stabler?
>
> 2018-07-12 8:21 GMT+02:00 Serhiy Storchaka:
> >> Is there any real application which marshal.dumps() performance is
> >> critical?
> >
> > EVE Online is a well known example.
>
> EVE Online has been created in 2003. I guess that it still uses Python
> 2.7.
>
> I'm not sure that a video game would pick marshal in 2018.
>

EVE doesn't use stock CPython, IIRC.  They use a version of stackless 2,
with their own patches.  If a company is willing to patch python itself, I
don't think their practices should be cited without more context about what
they actually modified.

> Victor


Re: [Python-Dev] Can I make marshal.dumps() slower but stabler?

2018-07-12 Thread INADA Naoki
On Thu, Jul 12, 2018 at 3:22 PM Serhiy Storchaka  wrote:
>
> 12.07.18 08:43, INADA Naoki wrote:
> > I'm working on making pyc stable, via stabilizing marshal.dumps()
> > https://bugs.python.org/issue34093
>
> This is not enough for making pyc stable. The order in frozensets is still
> arbitrary.

But we can use PYTHONHASHSEED to make pyc stable.
Currently, refcnt is the only known issue for reproducible pyc builds.
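
As a rough illustration of pinning the hash seed, the sketch below marshals a frozenset of strings in a fresh child interpreter with PYTHONHASHSEED set. The helper name and the sample strings are made up for illustration. The script makes no claim about whether different seeds give different bytes on a particular interpreter; it only reports what it observes:

```python
import os
import subprocess
import sys

# Code run in the child: marshal a frozenset and print the bytes as hex.
CHILD_CODE = (
    "import marshal, sys; "
    "sys.stdout.write("
    "marshal.dumps(frozenset(['spam', 'eggs', 'ham', 'bacon'])).hex())"
)


def dump_frozenset_in_child(seed):
    """Marshal a frozenset in a fresh interpreter started with the given hash seed."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return bytes.fromhex(result.stdout)


if __name__ == "__main__":
    first = dump_frozenset_in_child(0)
    second = dump_frozenset_in_child(0)
    print("seed 0 twice, equal bytes:", first == second)
    print("seed 0 vs seed 1, equal bytes:", first == dump_frozenset_in_child(1))
```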

>
> > Sadly, it makes marshal.dumps() 40% slower.
> > Luckily, this overhead is small (only 4%) for dumps(compile(source)) case.
>
> What about the memory consumption?

No overhead, because we already use the same hashtable for w_ref.
I just made it two-pass instead of one-pass.
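
The two-pass idea can be sketched in pure Python. This is a toy text encoder written for illustration, not the code from the pull request: the first pass counts how often each container is reached inside the object tree, and the second pass emits a back-reference slot only for containers reached more than once. All names and the `&n=` / `*n` notation are hypothetical:

```python
from collections import Counter

CONTAINERS = (list, tuple)


def count_occurrences(obj, counts=None):
    """Pass 1: count how often each container is reached inside the tree."""
    if counts is None:
        counts = Counter()
    if isinstance(obj, CONTAINERS):
        counts[id(obj)] += 1
        if counts[id(obj)] == 1:  # descend only on the first visit
            for item in obj:
                count_occurrences(item, counts)
    return counts


def encode(obj, counts, slots=None):
    """Pass 2: emit a toy encoding; only shared containers get a slot."""
    if slots is None:
        slots = {}
    if not isinstance(obj, CONTAINERS):
        return repr(obj)
    key = id(obj)
    if key in slots:
        return f"*{slots[key]}"  # back-reference to an earlier container
    prefix = ""
    if counts[key] > 1:
        slots[key] = len(slots)
        prefix = f"&{slots[key]}="
    body = ", ".join(encode(item, counts, slots) for item in obj)
    return prefix + "[" + body + "]"


def stable_dumps(obj):
    return encode(obj, count_occurrences(obj))


x = [1, 2]
y = [1, 2]
data = [x, y, x]
print(stable_dumps(data))  # → [&0=[1, 2], [1, 2], *0]
```

Because the decision depends only on references inside the tree, holding extra references to `y` elsewhere in the program does not change the output.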

>
> > So my question is:  May I remove unstable but faster code?
> >
> > Or should I make this optional and we maintain two complex code?
> > If so, should this option enabled by default or not?
>
> My concern is that even if not make it optional, this will complicate
> the code.

When it's not optional, it adds an almost exact duplicate of w_object for
counting references in the object tree.
https://github.com/python/cpython/pull/8226/commits/e170116e80dfd27f923c88fc11e42f0d6f687a00

>
> > For example, xmlrpc uses marshal.  But xmlrpc has significant overhead
> > other than marshaling, like dumps(compile(source)) case.  So I expect
> > marshal.dumps() performance is not critical for it too.
>
> xmlrpc doesn't use the marshal module. It uses terms marshalling and
> unmarshalling, but in different meaning.
>

Oh, I just grepped and misunderstood.

> > Is there any real application which marshal.dumps() performance is critical?
> EVE Online is a well known example.
>

Do they use version>=3?
FLAG_REF was introduced in version 3, and it already added significant runtime
overhead.
If marshaling speed is very important, version<=2 should be used.
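
For anyone who wants to measure this on their own data, here is a small timing sketch that uses the `version` argument of `marshal.dumps()`. The sample row and the helper name are made up for illustration, and no particular result is implied:

```python
import marshal
import timeit

# Hypothetical sample row; substitute data that resembles the real workload.
row = {"id": 1, "name": "example", "values": (1.5, 2.5, 3.5)}


def time_dumps(version, number=100_000):
    """Return the seconds spent on `number` marshal.dumps() calls at `version`."""
    return timeit.timeit(lambda: marshal.dumps(row, version), number=number)


if __name__ == "__main__":
    # Version 2 predates FLAG_REF; marshal.version is the interpreter's default.
    for version in (2, marshal.version):
        print(f"version {version}: {time_dumps(version):.3f}s")
```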

> What if write a script which loads .pyc files and stabilize them? This
> could solve the problem for applications which need stable .pyc files,
> with zero impact on common use.
>

Hmm, which do you mean?

* Adding marshal.dump_stable_pyc() and using it like
  `marshal.dump_stable_pyc(marshal.loads(code))`
* Implementing a pure Python marshal.dumps in distutils
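
For reference, the load-and-dump-again round trip that both options build on can be sketched with the existing API. `marshal.dump_stable_pyc()` does not exist, so this uses plain `marshal.dumps()` and makes no claim that the result is stable. The 16-byte header length is an assumption that holds for CPython 3.7 and later, and `remarshal_pyc` is a made-up name:

```python
import importlib.util
import marshal

HEADER_SIZE = 16  # assumption: .pyc header length on CPython 3.7 and later


def remarshal_pyc(pyc_bytes):
    """Return the same .pyc with its code object loaded and marshalled again."""
    header, payload = pyc_bytes[:HEADER_SIZE], pyc_bytes[HEADER_SIZE:]
    code = marshal.loads(payload)
    return header + marshal.dumps(code)


# Build an in-memory .pyc-like blob: magic number, 12 zeroed header bytes, code.
code = compile("answer = 6 * 7", "<example>", "exec")
original = importlib.util.MAGIC_NUMBER + bytes(12) + marshal.dumps(code)
rewritten = remarshal_pyc(original)

# The rewritten payload still loads and runs as the same module code.
namespace = {}
exec(marshal.loads(rewritten[HEADER_SIZE:]), namespace)
assert namespace["answer"] == 42
assert rewritten[:HEADER_SIZE] == original[:HEADER_SIZE]
```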

--
INADA Naoki  


Re: [Python-Dev] Can I make marshal.dumps() slower but stabler?

2018-07-12 Thread Victor Stinner
2018-07-12 8:21 GMT+02:00 Serhiy Storchaka :
>> Is there any real application which marshal.dumps() performance is
>> critical?
>
> EVE Online is a well known example.

EVE Online was created in 2003. I guess that it still uses Python 2.7.

I'm not sure that a video game would pick marshal in 2018.

Victor


Re: [Python-Dev] Can I make marshal.dumps() slower but stabler?

2018-07-12 Thread Serhiy Storchaka

12.07.18 08:43, INADA Naoki wrote:
> I'm working on making pyc stable, via stabilizing marshal.dumps()
> https://bugs.python.org/issue34093

This is not enough for making pyc stable. The order in frozensets is still
arbitrary.

> Sadly, it makes marshal.dumps() 40% slower.
> Luckily, this overhead is small (only 4%) for the dumps(compile(source)) case.

What about the memory consumption?

> So my question is:  May I remove the unstable but faster code?
>
> Or should I make this optional, so that we maintain two complex code paths?
> If so, should this option be enabled by default or not?

My concern is that even if it is not made optional, this will complicate
the code.

> For example, xmlrpc uses marshal.  But xmlrpc has significant overhead
> other than marshaling, like the dumps(compile(source)) case.  So I expect
> marshal.dumps() performance is not critical for it either.

xmlrpc doesn't use the marshal module. It uses the terms marshalling and
unmarshalling, but with a different meaning.

> Is there any real application for which marshal.dumps() performance is
> critical?

EVE Online is a well known example.

What if we write a script which loads .pyc files and stabilizes them? This
could solve the problem for applications which need stable .pyc files,
with zero impact on common use.

