[issue42454] Move slice creation to the compiler for constants

2021-05-09 Thread Batuhan Taskaya


Batuhan Taskaya  added the comment:

I've implemented a new revision that works without making slices hashable. 
Updating PR 23496

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue42454] Move slice creation to the compiler for constants

2021-04-11 Thread Andrei Kulakov


Andrei Kulakov  added the comment:

One possibility for this being a breaking change is this scenario: some 
codebase may have logic that takes a list and uses a slice operation on it; in 
a rare circumstance the list is really a dict and a TypeError is caught 
upstream and dealt with; with this change it will no longer be caught/logged 
and the dict will be unexpectedly modified. It might also be hard to debug. 

I don't remember seeing such code but it's conceivable.

The change might still be worth it for performance improvement though - I'm not 
sure.

--
nosy: +andrei.avk

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue42454] Move slice creation to the compiler for constants

2020-12-31 Thread Mark Dickinson


Mark Dickinson  added the comment:

@Batuhan Taskaya: would it be worth starting a discussion on either 
python-ideas or python-dev? (Or on discuss.python.org.)

My concern is that we're discussing a core language change. It's not a major 
change, but it's not an insignificant one either, and right now the discussion 
is buried in an issue whose description and "Type" category suggest that it's 
purely about performance - those giving this issue a casual glance may not 
realise that there's a language change involved.

I don't have strong opinions on the change itself either way, but others might 
have.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue42454] Move slice creation to the compiler for constants

2020-12-31 Thread Batuhan Taskaya

Batuhan Taskaya  added the comment:

I've given this another shot, but even though I was able to create a patch that 
preserved slices as (type(slice), start, stop, step) tuples in the 
consts_cache, this final optimization prevented me to finalize it; 
https://github.com/python/cpython/blob/master/Python/compile.c#L5855-L5858

So, unless anyone else came up with a reasonable idea to do this optimization 
at the compile-time without actually making them hashable, I am re-proposing 
the idea of making them hashable.

Pro:
  - Up to %30+ speedup on slicing with constants (extremely common)
Cons:
  - Mappings accepts slices, which would rarely lead some esoteric cases (like 
the examples given)

Comments would be appreciated

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue42454] Move slice creation to the compiler for constants

2020-11-30 Thread Batuhan Taskaya


Batuhan Taskaya  added the comment:

> I have to say that I feel that this will complicate things too much. Part of 
> the appeal of this change is how straightforward and maintainable is. 
> Fiddling with code objects lowers the watermark.

Agreed. And I'd go for either making slices hashable as is, or rejecting this 
patch without implementing more hacks to the Python / messing with the compiler.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue42454] Move slice creation to the compiler for constants

2020-11-30 Thread Pablo Galindo Salgado


Pablo Galindo Salgado  added the comment:

> I thought about using code.__hash__ in the first place, but the compiler's 
> underlying store system (both compiler_unit->u_consts and 
> compiler->c_const_cache) uses dictionary's where the keys are constant 
> objects themselves (or tuples where one of the items is the constant). We 
> might need to change that first (either using something else for the keys, or 
> switch to lists? or something else).

I have to say that I feel that this will complicate things too much. Part of 
the appeal of this change is how straightforward and maintainable is. Fiddling 
with code objects lowers the watermark.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue42454] Move slice creation to the compiler for constants

2020-11-30 Thread Josh Rosenberg


Josh Rosenberg  added the comment:

Yep, Mark Shannon's solution of contextual hashing is what I was trying 
(without success) when my last computer died (without backing up work offsite, 
oops) and I gave up on this for a while. And Batuhan Taskaya's note about 
compiler dictionaries for the constants being a problem is where I got stuck.

Switching to lists might work (I never pursued this far enough to profile it to 
see what the performance impact was; presumably for small functions it would be 
near zero, while larger functions might compile more slowly).

The other approach I considered (and was partway through implementing when the 
computer died) was to use a dict subclass specifically for the constants 
dictionaries; inherit almost everything from regular dicts, but with built-in 
knowledge of slices so it could perform hashing on their behalf (I believe you 
could use the KnownHash APIs to keep custom code minimal; you just check for 
slices, fake their hash if you got one and call the KnownHash API, otherwise, 
defer to dict normally). Just an extension of the code.__hash__ trick, adding a 
couple more small hacks into small parts of Python so they treat slices as 
hashable only in that context without allowing non-intuitive behaviors in 
normal dict usage.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue42454] Move slice creation to the compiler for constants

2020-11-30 Thread Batuhan Taskaya


Batuhan Taskaya  added the comment:

> By leaving slices unhashable, and accounting for their presence in 
> code.__hash__, we get both the performance improvement and full backwards 
> compatibility.


I thought about using code.__hash__ in the first place, but the compiler's 
underlying store system (both compiler_unit->u_consts and 
compiler->c_const_cache) uses dictionary's where the keys are constant objects 
themselves (or tuples where one of the items is the constant). We might need to 
change that first (either using something else for the keys, or switch to 
lists? or something else).

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue42454] Move slice creation to the compiler for constants

2020-11-30 Thread Mark Shannon


Mark Shannon  added the comment:

I agree with Mark about slice hashing.
This looks like a deliberate design decision.
x[:] = [] should empty a sequence, not set the key-value pair (slice(None, 
None, None), []) in a mapping.

However, code-objects can still be hashable even if slices are not.

By leaving slices unhashable, and accounting for their presence in 
code.__hash__, we get both the performance improvement and full backwards 
compatibility.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue42454] Move slice creation to the compiler for constants

2020-11-30 Thread Batuhan Taskaya


Batuhan Taskaya  added the comment:

One thing to add, the proposed implementation also optimizes extended slices 
(mostly used in scientific stacks) such as [:, 1].

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue42454] Move slice creation to the compiler for constants

2020-11-28 Thread Terry J. Reedy


Terry J. Reedy  added the comment:

I closed #11107 in favor of this issue.  Some of the discussion and concerns 
(like slice hashing) was similar.

--
nosy: +terry.reedy

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue42454] Move slice creation to the compiler for constants

2020-11-25 Thread Batuhan Taskaya


Batuhan Taskaya  added the comment:

> What are the potential benefits or drawbacks for the user?

The only potential drawback that I can see is that it prevents you from 
distinguishing a sequence from mapping by 'accidentally' passing a slice. 

The major benefit for users is that they will have a decent speed up on their 
slice access. Other than that, I can think of some scenarios where the slice 
objects can be usable. One thing that I just came up with is this example 
(https://gist.github.com/isidentical/a799c4ae5c318bb7ac1a9f101cb709c7). I don't 
claim that we are implementing this to support these kinds of obscure cases, 
but it doesn't hurt anyone (beside maybe become a little bit confusing for a 
second) and doesn't look odd when used with a decent purpose.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue42454] Move slice creation to the compiler for constants

2020-11-25 Thread Mark Dickinson


Mark Dickinson  added the comment:

> At least for optimization, IMHO it worth taking the shot.

For me, this feels a bit backwards: IMO you should decide what behaviour you 
want first, implement the desired behaviour, and then optimize (if possible) 
while keeping that same desired behaviour. It's rare that we want an 
optimization to drive behaviour changes.

So for me, the key question that needs answering is: independent of any 
performance changes, do we want the behaviour change? Specifically, do we want 
something like "d = {}; d[1:2] = True" to "work" in Python 3.10, given that in 
previous releases it raises TypeError? What are the potential benefits or 
drawbacks for the user?

If you can get consensus that the behaviour change is fine, then by all means 
go ahead with the optimization. But I think the behaviour question needs to be 
answered first.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue42454] Move slice creation to the compiler for constants

2020-11-25 Thread Batuhan Taskaya


Batuhan Taskaya  added the comment:

> something[::-1] = something

I was thinking this for a while, and this is the actual reason that I didn't 
propose this until I saw someone's comment under Raymond's tweet about 'slices 
are not hashable' (https://twitter.com/4skvictor/status/1330433911646785539). 
At least for optimization, IMHO it worth taking the shot. 

> I could definitely see code using duck-typing via slices to distinguish 
> sequences from other iterables and mappings

That sounds a bit unorthodox. It is a very common practice to use ABCs for 
doing these checks but I've never seen a piece of code where they test if 
something is a mapping or not by throwing unhashable things in it.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue42454] Move slice creation to the compiler for constants

2020-11-24 Thread Josh Rosenberg


Josh Rosenberg  added the comment:

There is an open issue for this already, under #11107 (a reopen of the closed 
#2268, where the reopen was justified due to Python 3 making slice objects more 
common), just so you know.

I made a stab at this a while ago and gave up due to the problems with making 
slices constants while trying to keep them unhashable (and I never got to 
handling the marshal format updates properly). It doesn't seem right to 
incidentally make:

something[::-1] = something

actually work, and be completely nonsensical, when "something" happens to be a 
dict, when previously, you'd get a clear TypeError for trying to do it. I could 
definitely see code using duck-typing via slices to distinguish sequences from 
other iterables and mappings, and making mapping suddenly support slices in a 
nonsensical way is... odd.

--
nosy: +josh.r

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue42454] Move slice creation to the compiler for constants

2020-11-24 Thread Raymond Hettinger


Raymond Hettinger  added the comment:

Mostly, +1 from me as well.

If anything starts to rely on hashability, it will need a fallback path since 
slice objects are allowed to contain arbitrary objects (which might not 
themselves be hashable):   s = slice([], []).

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue42454] Move slice creation to the compiler for constants

2020-11-24 Thread Batuhan Taskaya


Batuhan Taskaya  added the comment:

As a side effect, we also now fold some esoteric cases which are not very 
common. Like;
"""
test
"""[1:-1]

We reviewed only optimizing this a while back and decided these are not that 
common to add a code path. Just wanted to mention that now we optimize these 
away without any extra additions / maintenance cost.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue42454] Move slice creation to the compiler for constants

2020-11-24 Thread Christian Heimes


Christian Heimes  added the comment:

Good point and excellent explanation. I'm no longer concerned! :)

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue42454] Move slice creation to the compiler for constants

2020-11-24 Thread Batuhan Taskaya


Batuhan Taskaya  added the comment:

> I'm slightly concerned about hashability of slice objects. Currently the 
> slice constructor does not ensure that any slice parameter is a number. You 
> can do horrible stuff like this:

The same thing can be applied to tuples, which are hashable if all the items 
are hashable. Since the underlying slice hash uses a tuple of start, stop, step 
I think we are safe on that.

>>> hash((None, {}, None))
Traceback (most recent call last):
  File "", line 1, in 
TypeError: unhashable type: 'dict'
>>> hash(slice(None, {}, None))
Traceback (most recent call last):
  File "", line 1, in 
TypeError: unhashable type: 'dict'

Also I limited the scope of the optimization only for integers, so the folded 
slices can only have arguments as either integers or Nones.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue42454] Move slice creation to the compiler for constants

2020-11-24 Thread Christian Heimes


Christian Heimes  added the comment:

I'm slightly concerned about hashability of slice objects. Currently the slice 
constructor does not ensure that any slice parameter is a number. You can do 
horrible stuff like this:

>>> slice({})
slice(None, {}, None)

which of course fails later on:

>>> [][slice({})]
Traceback (most recent call last):
  File "", line 1, in 
TypeError: slice indices must be integers or None or have an __index__ method

Would it be possible to move type checks to slice constructor?

--
nosy: +christian.heimes

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue42454] Move slice creation to the compiler for constants

2020-11-24 Thread Pablo Galindo Salgado


Pablo Galindo Salgado  added the comment:

Unless we see something fundamentally broken with the hashability, I am +1 on 
this. Even if it does not show in macro benchmarks over the 3% mark, the 
microbenchmarks are positive and the code changes are very scoped, so there is 
not a maintainability problem IMHO

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue42454] Move slice creation to the compiler for constants

2020-11-24 Thread Raymond Hettinger


Change by Raymond Hettinger :


--
nosy: +mark.dickinson, rhettinger

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue42454] Move slice creation to the compiler for constants

2020-11-24 Thread Batuhan Taskaya


Change by Batuhan Taskaya :


--
nosy: +Mark.Shannon, serhiy.storchaka

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue42454] Move slice creation to the compiler for constants

2020-11-24 Thread Batuhan Taskaya


Change by Batuhan Taskaya :


--
keywords: +patch
pull_requests: +22386
stage:  -> patch review
pull_request: https://github.com/python/cpython/pull/23496

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue42454] Move slice creation to the compiler for constants

2020-11-24 Thread Batuhan Taskaya


New submission from Batuhan Taskaya :

Currently, all the slices are created by the interpreter with BUILD_SLICE 
preceded by 2/3 LOAD_CONSTS. We can move the slice creation into the compiler 
and deduce the cost of these 3/4 instructions into a single LOAD_CONST 
operation where all values in the slices are constants (like [:1], [::2], [-1]) 
which I assume the most common type.

There are over 93.000 constant slices in 3.200~ PyPI packages, which I'd assume 
is a huge amount (ofc I don't claim to know that whether these are in loops / 
hot functions or static globals).

Making these folding could save up to 1.32x on very common slicing behaviors 
(like [:1]). Here are the benchmarks (thanks to @pablogsal);

*** -s "a=list(range(200))" "a[:1]"
Mean +- std dev: [baseline] 61.8 ns +- 0.3 ns -> [optimized] 47.1 ns +- 0.2 ns: 
1.31x faster (-24%)

*** -s "a=list(range(200))" "a[1:1:1], a[::1], a[1::], a[1::1], a[::-1], 
a[-1::], a[:], a[::]"
Mean +- std dev: [baseline] 2.38 us +- 0.02 us -> [optimized] 2.24 us +- 0.02 
us: 1.06x faster (-6%)

The macro benchmarks appeared to be +1.03x increase which is assumed as noise, 
so this is a targeted optimization.

One problem with slices being constants is that, the code objects are hashable 
so does all the items in the co_consts tuple (also the internal dict that the 
compiler uses to store all co_const items). For allowing slices to be stored in 
the code objects, and not changing any behavior in a backward-incompatible way 
I'm proposing to make the slice objects hashable. It might not be seen as a 
feature, but a side-effect caused by this optimization.

--
messages: 381759
nosy: BTaskaya, pablogsal
priority: normal
severity: normal
status: open
title: Move slice creation to the compiler for constants
type: performance
versions: Python 3.10

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com