subject:"\[Python\-ideas\] Re\: Access \(ordered\) dict by index; insert slice"

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-09-14 Thread Marco Sulla

I just had an idea: we have mp->ma_used and keys->dk_nentries.

holes = mp->ma_keys->dk_nentries - mp->ma_used

is the number of holes in the dict.
So, if holes == 0, you have O(1). But, if the dict has holes, you
can't have O(1), but you can speedup the check of the entry by
counting the NULLs in the for loop. When the number of NULLs reaches
holes, you can jump safely to the dict entry you need.

In this case, O(n) can happen only if the items are removed from the
end, as with popitem().
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/G4VO262RMJALR35G74B72PFCTXT524LQ/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-09-01 Thread Inada Naoki

On Mon, Aug 31, 2020 at 9:30 PM  wrote:
> I have a use case which relates to this request: iterating over a dict 
> starting from a given key. I would like to achieve this without having to pay 
> the full O(n) cost if I'm going to be iterating over only a few items. My 
> understanding is that this should be achievable without needing to iterate 
> through the entire dict, since the dict's internal key lookup points to a 
> particular index of dk_entries anyway.
>
[snip]
> Doing this efficiently would require either the addition of indexing to dicts 
> as well as some sort of get_key_index operation, or else could be done 
> without knowing indices if an iter_from_key operation were introduced (which 
> used the internal dk_indices to know where to start iterating over 
> dk_entries). I think this thread touches on the same sorts of debates however 
> so I'm mentioning this here.
>

Note that proposed index access is O(n), not O(1). So `get_key_index`
doesn't match your use case.

On the other hand, iter_from_key is possible idea.

Another API idea is `d.next_item(key)` and `d.prev_item(key)`. They
return None if the key is left/right end. They raise KeyError if key
not found. Other wise, they return (key, item) pair.


> I also think that even if adding new features to the built-in dict is 
> undesirable, adding a collections.IndexableDict would be very useful 
> (sortedcollections.IndexableDict exists but sorting is not acceptable for 
> many use cases).

Maybe, OrderedDict can have od.next_item(key) and od.prev_item(key).

Regards,

-- 
Inada Naoki  
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/2BIYAHUZTPRR5P6Y3MDVFBVY7PEECEDE/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-09-01 Thread Dominik Vilsmeier


On 01.09.20 11:22, Zig Zigson wrote:


I believe I described my case poorly, the process to get from one state (key) 
to the next is an external (slow) process; the value stored is not the next 
state but a value calculated while advancing the state. This dict serves as a 
way to quickly skip steps of the external process when it has repeated itself, 
and thus calculating the next key would be exorbitantly slower than iterating 
through the whole dict.
In any case as a poster pointed out above, my example is not as compelling in 
terms of a speedup as I thought, the dict key iteration is not very long in 
practice compared to other operations I need to do.


What I meant is that you *could* store the next state, e.g. alongside
that computed value, as a tuple. So you would have `state, value =
states[state]`. This increases the memory usage but it saves you from
iterating the whole dict, if that is what you want.
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/TUOTSOMBHESPGZKCAMLOKA5TUAOQVXT3/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-09-01 Thread Zig Zigson

I believe I described my case poorly, the process to get from one state (key) 
to the next is an external (slow) process; the value stored is not the next 
state but a value calculated while advancing the state. This dict serves as a 
way to quickly skip steps of the external process when it has repeated itself, 
and thus calculating the next key would be exorbitantly slower than iterating 
through the whole dict.
In any case as a poster pointed out above, my example is not as compelling in 
terms of a speedup as I thought, the dict key iteration is not very long in 
practice compared to other operations I need to do.
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/GZIIFVMRQDQLPNCZAYM4HHZ3HAHGXD2H/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-09-01 Thread Zig Zigson

Having gotten to an implementation, you are correct, the dict iteration does 
not take the lion's share, or at least there are several other steps in my 
application which dwarf the dict traversal operations in any case.
I don't think I in practice have a compelling case here, so all I'm left with 
is the vagueism that there must exist use cases where this would be the 
bottleneck, which is admittedly a cop-out.
I mostly feel that the new dict internals are so convenient in enabling this 
that it would be a shame not to be able to have this performance improvement 
for the obscure case (which I can't think of) where creating a list copy would 
be undesirable for some reason other than the (as you point out) small memory 
increase.
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/PRC2EGRSN6JJME4WBNA4PLB2YFXKNO7K/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-09-01 Thread Dominik Vilsmeier


On 31.08.20 06:01, junkneno...@gmail.com wrote:


I have a use case which relates to this request: iterating over a dict starting 
from a given key. I would like to achieve this without having to pay the full 
O(n) cost if I'm going to be iterating over only a few items. My understanding 
is that this should be achievable without needing to iterate through the entire 
dict, since the dict's internal key lookup points to a particular index of 
dk_entries anyway.

My sample use case at a high level is when the dict stores values uniquely 
representing the state of a process (say, the hash of a changing object), and 
the values represent some outcome of a step in that process. The process can 
contain loops, so at each step we check if the current state's outcome is 
already stored (thus we want a dict for O(1) lookup), and when a matching state 
is found we'd like to stop and loop over the in-between states performing some 
operation on their values (say, summing their outcome values).
We may continue the process and find state-loops many times (the actual use 
case involves non-deterministic branches and thus possibly many loops), and the 
state-dict might reach a very large size, so iterating over the entire dict 
every time we find a matching key is undesirable, as is storing keys in an 
associated list as this would ~double the memory used.


Unless I'm misunderstanding the task, it sounds like this could be
solved by repeated lookups of cycle elements. It seems to be a special
case anyway that all cycles are inserted in order into the dict. I.e.
instead of iterating from one key to another you would just iterate the
cycle:

    if outcome in states:
    cycle = [outcome]
    while (state := states[cycle[-1]]) != outcome:
        cycle.append(state)
    result = sum(cycle)

___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/R2MIVDA3Y5QYYO25HTGKUTZ74MAUNCRL/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-08-31 Thread Steven D'Aprano

On Mon, Aug 31, 2020 at 04:01:06AM -, junkneno...@gmail.com wrote:

> I have a use case which relates to this request: iterating over a dict 
> starting from a given key. I would like to achieve this without having 
> to pay the full O(n) cost if I'm going to be iterating over only a few 
> items.

>From the description of your task, it sounds like the cost of iterating 
over the full dict is trivial compared to the work you would do in the 
loop, even if you end up skipping the bulk of the items.

On my computer, it takes 7ms to iterate over a million-item dict. So 
even if you had a million keys, and had to skip over all but the very 
last, you're saving 7ms.

Or chances are your computer is faster and more powerful than mine, in 
which case that's likely to be more like 0.7ms.

So I'd like to see some profiling information that conclusively 
demonstrates that the bottleneck here is not the work you do within each 
loop, but skipping over the keys to get to the starting point.

[...]
> as is storing keys in an associated list as this would 
> ~double the memory used.

If you are storing only the keys, you're not doubling the memory. You're 
only storing pointers to the same keys as being used in the dict. On a 
64-bit machine, each pointer is 8 bytes, while the key itself is likely 
to be sufficiently larger: even a small int is 28 bytes. If your hash 
values are large, they might be 40 bytes or more.

And of course you're not duplicating the values either. 

So realistically, I would expect perhaps a 15% increase, not 100% 
increase in memory.

-- 
Steve
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/C5FUBACWRYPWLOABGCDPS5QLYKG4RRKD/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-08-31 Thread junknenopok

I have a use case which relates to this request: iterating over a dict starting 
from a given key. I would like to achieve this without having to pay the full 
O(n) cost if I'm going to be iterating over only a few items. My understanding 
is that this should be achievable without needing to iterate through the entire 
dict, since the dict's internal key lookup points to a particular index of 
dk_entries anyway.

My sample use case at a high level is when the dict stores values uniquely 
representing the state of a process (say, the hash of a changing object), and 
the values represent some outcome of a step in that process. The process can 
contain loops, so at each step we check if the current state's outcome is 
already stored (thus we want a dict for O(1) lookup), and when a matching state 
is found we'd like to stop and loop over the in-between states performing some 
operation on their values (say, summing their outcome values).
We may continue the process and find state-loops many times (the actual use 
case involves non-deterministic branches and thus possibly many loops), and the 
state-dict might reach a very large size, so iterating over the entire dict 
every time we find a matching key is undesirable, as is storing keys in an 
associated list as this would ~double the memory used.

Doing this efficiently would require either the addition of indexing to dicts 
as well as some sort of get_key_index operation, or else could be done without 
knowing indices if an iter_from_key operation were introduced (which used the 
internal dk_indices to know where to start iterating over dk_entries). I think 
this thread touches on the same sorts of debates however so I'm mentioning this 
here.

I also think that even if adding new features to the built-in dict is 
undesirable, adding a collections.IndexableDict would be very useful 
(sortedcollections.IndexableDict exists but sorting is not acceptable for many 
use cases).
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/MK5BHIJYNA5GT5ITFR6XKHPT47RWC5RF/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-08-03 Thread Inada Naoki

On Tue, Aug 4, 2020 at 3:35 AM Christopher Barker  wrote:
>
> On Sat, Aug 1, 2020 at 6:10 PM Inada Naoki  wrote:
>>
>> Repacking is mutation, and mutating dict while iterating it breaks the 
>> iterator.
>> But `d.items()[42]` don't looks like mutation.
>
>
> Pardon my ignorance, but IS repacking a mutation? Clearly it's mutating the 
> internal state, but at the logical structure of the dict will not have 
> changed.
>

You are totally right.  Repacking mutates internal state so it breaks
iterator. But it doesn't look like mutation from users' point of view.
It is the main problem.
We can not repack silently.

> Though I suppose if it's being iterated over, then the iterator is keeping an 
> index into the internal array, which would change on repacking?
>

Yes.

> which means that it's worse than not looking like a mutation, but it could 
> make active iterators return results that are actually incorrect.
>

In most cases, iterator can detect it and raise RuntimeError.

> I have to think that this could be accommodated somehow, but with ugly 
> special case code, so yeah, not worth it. Though you could keep track of if 
> there are any active views (ir even active iterators) and not repack in that 
> case. I'm sure most dict iterators are most commonly used right away.
>
> Is repacking ever currently done with dicts? If so, then how is this issue 
> worked around?
>

Repacking happens only when insertion resize; when inserting an item
but there is no space to insert.
dict.clear() also creates clean empty dict.

Currently, del/pop doesn't cause repacking.
https://bugs.python.org/issue32623

Regards,
-- 
Inada Naoki  
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/SQKYMBAN4ROLRKPWBT7OVVCSTU6JKJFT/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-08-03 Thread Christopher Barker

On Sat, Aug 1, 2020 at 6:10 PM Inada Naoki  wrote:

> Repacking is mutation, and mutating dict while iterating it breaks the
> iterator.
> But `d.items()[42]` don't looks like mutation.
>

Pardon my ignorance, but IS repacking a mutation? Clearly it's mutating the
internal state, but at the logical structure of the dict will not have
changed.

Though I suppose if it's being iterated over, then the iterator is keeping
an index into the internal array, which would change on repacking?

which means that it's worse than not looking like a mutation, but it could
make active iterators return results that are actually incorrect.

I have to think that this could be accommodated somehow, but with ugly
special case code, so yeah, not worth it. Though you could keep track of if
there are any active views (ir even active iterators) and not repack in
that case. I'm sure most dict iterators are most commonly used right away.

Is repacking ever currently done with dicts? If so, then how is this issue
worked around?

-CHB

-- 
Christopher Barker, PhD

Python Language Consulting
  - Teaching
  - Scientific Software Development
  - Desktop GUI and Web Development
  - wxPython, numpy, scipy, Cython
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/U3NKOPESDAYHQ5Z6ZUFTR3IL6YCUWXPQ/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-08-02 Thread Wes Turner

https://github.com/python/cpython/blob/master/Objects/odictobject.c :

"""
In order to preserve O(1) performance for node removal (finding nodes), we
must do better than just looping through the linked-list.  Here are options
we've considered:
1. use a second dict to map keys to nodes (a la the pure Python version).
2. keep a simple hash table mirroring the order of dict's, mapping each key
   to the corresponding node in the linked-list.
3. use a version of shared keys (split dict) that allows non-unicode keys.
4. have the value stored for each key be a (value, node) pair, and adjust
   __getitem__(), get(), etc. accordingly.
The approach with the least performance impact (time and space) is #2,
mirroring the key order of dict's dk_entries with an array of node pointers.
While lookdict() and friends (dk_lookup) don't give us the index into the
array, we make use of pointer arithmetic to get that index.  **An
alternative
would be to refactor lookdict() to provide the index, explicitly exposing
the implementation detail.**  We could even just use a custom lookup
function
for OrderedDict that facilitates our need.  However, both approaches are
significantly more complicated than just using pointer arithmetic.
The catch with mirroring the hash table ordering is that we have to keep
the ordering in sync through any dict resizes.  However, that order only
matters during node lookup.  We can simply defer any potential resizing
until we need to do a lookup.
"""

On Sat, Aug 1, 2020 at 10:06 PM Wes Turner  wrote:

> Is there any reason that these features couldn't be added to OrderedDict
> (which is a linked list)?
> https://github.com/python/cpython/blob/master/Objects/odictobject.c
>
> On Sat, Aug 1, 2020, 9:13 PM Inada Naoki  wrote:
>
>> On Sun, Aug 2, 2020 at 2:34 AM Christopher Barker 
>> wrote:
>> >
>> > On Sat, Aug 1, 2020 at 2:28 AM Marco Sulla <
>> marco.sulla.pyt...@gmail.com> wrote:
>> >>
>> >> On Sat, 1 Aug 2020 at 03:00, Inada Naoki 
>> wrote:
>> >>>
>> >>> Please teach me if you know any algorithm which has no hole, O(1)
>> >>> deletion, preserving insertion order, and efficient and fast as array.
>> >
>> >
>> > I would think the goal here would be to re-order once in a while to
>> remove the holes. But that would take time, of course, so you wouldn't want
>> to do it on every deletion. But when?
>> >
>> > One option: maybe too specialized, but it could re-pack the array when
>> an indexing operation is made -- since that operation is O(N) anyway. And
>> that would then address the issue of performance for multiple indexing
>> operations -- if you made a bunch of indexing operation in a row without
>> deleting (which would be the case, if this is an alternative to making a
>> copy in a Sequence first), then the first one would repack the internal
>> array (presumably faster than making a copy) and the rest would have O(1)
>> access.
>> >
>>
>> Repacking is mutation, and mutating dict while iterating it breaks the
>> iterator.
>> But `d.items()[42]` don't looks like mutation.
>>
>> > Given that this use case doesn't appear to be very important, I doubt
>> it's worth it, but it seems it would be possible.
>> >
>> > Another thought -- could the re-packing happen whenever the entire dict
>> is iterated through? Though maybe there's no way to know when that's going
>> to happen -- all you get are the individual calls for the next one, yes?
>> >
>>
>> You are right. it couldn't.
>>
>>
>> --
>> Inada Naoki  
>> ___
>> Python-ideas mailing list -- python-ideas@python.org
>> To unsubscribe send an email to python-ideas-le...@python.org
>> https://mail.python.org/mailman3/lists/python-ideas.python.org/
>> Message archived at
>> https://mail.python.org/archives/list/python-ideas@python.org/message/ED2GRWD4RARR2LGP45PK4M6R3MLTAF75/
>> Code of Conduct: http://python.org/psf/codeofconduct/
>>
>
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/XUH3AKC5TVPWJBJSI7UGYWTQDCDME2F5/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-08-02 Thread Marco Sulla

On Sat, 1 Aug 2020 at 20:25, Stestagg  wrote:

> I wrote some (better than the previously shared) benchmarks for this
> change a while ago.
>

I think that you could speed up the algorithm if you check if
dict->ma_keys->dk_lookup
== lookdict_unicode_nodummy. If so, the dict is a combined dict with only
string keys (quite common), and no deletion was done before, so there's no
hole in ma_keys.
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/ZP3QVG4ZPALST5OXEVKXNHGCRC57J34G/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-08-02 Thread Marco Sulla

On Sat, 1 Aug 2020 at 19:34, Christopher Barker  wrote:

> I would think the goal here would be to re-order once in a while to remove
> the holes. But that would take time, of course, so you wouldn't want to do
> it on every deletion. But when?
>

You should have a separate process that does this, so "normal" operations
will be not blocked. The process could also cache objects and do garbage
collection. A sort of domestic worker. Don't know if this is possible.

> > About the hole, I was thinking that in theory the problem can be
> circumvented using a modified version of lookdict.
>>
>> lookdict searches for a key and returns its position in the ma_keys
>> array. I suppose it's possible to do the contrary: search for the index and
>> return the key.
>> What do you think (theoretically speaking)?
>>
>
> but isn't searching for the index going to require iterating through the
> array until you find it? i.e. that O(N) operation we're trying to avoid?
>

O(n) is an+b. If a and b are small, the algorithm is fast for not-so-big n.

To sum up the alternatives:
1.  rearrange the items array at positional lookup. As Inada said, it will
mutate the dict while reading, that is quite unexpected
2. rearrange the items array at deletion. It will slow down deletion.
Probably not an option... a NAO
3. no positional indexing. It seems the more simple solution. A number of
alternatives are offered (by Tim Peters too)
4. something else.
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/VPMKJEP4A4SN6QTYWVZRR36AIDTOLSPN/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-08-01 Thread Steven D'Aprano

On Sat, Aug 01, 2020 at 07:25:42PM +0100, Stestagg wrote:

> Irrespective of where in the api this logic should exist, the
> implementation won't be algorithmically different, (I think, even with a
> `.ordered` view, as the view would have to cope with changes to the
> underlying dictionary over its lifetime, and external tracking of changes
> to dicts is not, afaik, feasible. Unlike for-loop constructs which are
> inherently scoped, I feel like you wouldn't get away with forbidding
> modifying a dict() if there's a view on keys/values/items still alive, as
> these things are first-class objects that can be stored/passed around)

Forbidding mutation of the dict while a view exists is missing the point 
of having a view in the first place: updating the owning object should 
update the view as well.



-- 
Steven
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/B4TVRQVWEKMA7F2LSPJEOMNMR5XFVTTV/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-08-01 Thread Wes Turner

Is there any reason that these features couldn't be added to OrderedDict
(which is a linked list)?
https://github.com/python/cpython/blob/master/Objects/odictobject.c

On Sat, Aug 1, 2020, 9:13 PM Inada Naoki  wrote:

> On Sun, Aug 2, 2020 at 2:34 AM Christopher Barker 
> wrote:
> >
> > On Sat, Aug 1, 2020 at 2:28 AM Marco Sulla 
> wrote:
> >>
> >> On Sat, 1 Aug 2020 at 03:00, Inada Naoki 
> wrote:
> >>>
> >>> Please teach me if you know any algorithm which has no hole, O(1)
> >>> deletion, preserving insertion order, and efficient and fast as array.
> >
> >
> > I would think the goal here would be to re-order once in a while to
> remove the holes. But that would take time, of course, so you wouldn't want
> to do it on every deletion. But when?
> >
> > One option: maybe too specialized, but it could re-pack the array when
> an indexing operation is made -- since that operation is O(N) anyway. And
> that would then address the issue of performance for multiple indexing
> operations -- if you made a bunch of indexing operation in a row without
> deleting (which would be the case, if this is an alternative to making a
> copy in a Sequence first), then the first one would repack the internal
> array (presumably faster than making a copy) and the rest would have O(1)
> access.
> >
>
> Repacking is mutation, and mutating dict while iterating it breaks the
> iterator.
> But `d.items()[42]` don't looks like mutation.
>
> > Given that this use case doesn't appear to be very important, I doubt
> it's worth it, but it seems it would be possible.
> >
> > Another thought -- could the re-packing happen whenever the entire dict
> is iterated through? Though maybe there's no way to know when that's going
> to happen -- all you get are the individual calls for the next one, yes?
> >
>
> You are right. it couldn't.
>
>
> --
> Inada Naoki  
> ___
> Python-ideas mailing list -- python-ideas@python.org
> To unsubscribe send an email to python-ideas-le...@python.org
> https://mail.python.org/mailman3/lists/python-ideas.python.org/
> Message archived at
> https://mail.python.org/archives/list/python-ideas@python.org/message/ED2GRWD4RARR2LGP45PK4M6R3MLTAF75/
> Code of Conduct: http://python.org/psf/codeofconduct/
>
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/W2KBWMS5A2UJV7OYUNABVMHPC6A6JFDF/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-08-01 Thread Inada Naoki

On Sun, Aug 2, 2020 at 2:34 AM Christopher Barker  wrote:
>
> On Sat, Aug 1, 2020 at 2:28 AM Marco Sulla  
> wrote:
>>
>> On Sat, 1 Aug 2020 at 03:00, Inada Naoki  wrote:
>>>
>>> Please teach me if you know any algorithm which has no hole, O(1)
>>> deletion, preserving insertion order, and efficient and fast as array.
>
>
> I would think the goal here would be to re-order once in a while to remove 
> the holes. But that would take time, of course, so you wouldn't want to do it 
> on every deletion. But when?
>
> One option: maybe too specialized, but it could re-pack the array when an 
> indexing operation is made -- since that operation is O(N) anyway. And that 
> would then address the issue of performance for multiple indexing operations 
> -- if you made a bunch of indexing operation in a row without deleting (which 
> would be the case, if this is an alternative to making a copy in a Sequence 
> first), then the first one would repack the internal array (presumably faster 
> than making a copy) and the rest would have O(1) access.
>

Repacking is mutation, and mutating dict while iterating it breaks the iterator.
But `d.items()[42]` don't looks like mutation.

> Given that this use case doesn't appear to be very important, I doubt it's 
> worth it, but it seems it would be possible.
>
> Another thought -- could the re-packing happen whenever the entire dict is 
> iterated through? Though maybe there's no way to know when that's going to 
> happen -- all you get are the individual calls for the next one, yes?
>

You are right. it couldn't.


-- 
Inada Naoki  
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/ED2GRWD4RARR2LGP45PK4M6R3MLTAF75/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-08-01 Thread Wes Turner

first()
https://more-itertools.readthedocs.io/en/stable/api.html#more_itertools.first
https://toolz.readthedocs.io/en/latest/api.html#toolz.itertoolz.first

last()
https://more-itertools.readthedocs.io/en/stable/api.html#more_itertools.last
https://toolz.readthedocs.io/en/latest/api.html#toolz.itertoolz.last

take()
https://more-itertools.readthedocs.io/en/stable/api.html#more_itertools.take
https://toolz.readthedocs.io/en/latest/api.html#toolz.itertoolz.take

tail()
https://more-itertools.readthedocs.io/en/stable/api.html#more_itertools.tail
https://toolz.readthedocs.io/en/latest/api.html#toolz.itertoolz.tail

... pluck(ind, seqs, default='__no__default__')
> plucks an element or several elements from each item in a sequence.
https://toolz.readthedocs.io/en/latest/api.html#toolz.itertoolz.pluck

On Sat, Aug 1, 2020, 4:59 PM Guido van Rossum  wrote:

> Yeah, it is totally doable to refactor the collection ABCs to have
> something in between `Collection` and `Sequence` that just supports
> `__getitem__`.
>
> But I would take Marco's research (and Inada's musings) seriously -- we
> don't actually want to support `__getitem__`, because of the unpredictable
> performance characteristics.
>
> I'm no longer in favor of adding .ordered() -- I think it's better to add
> something to itertools, for example first() to get the first item (see Tim
> Peters' post), and something related to get the first N items.
>
> On Sat, Aug 1, 2020 at 12:28 PM Christopher Barker 
> wrote:
>
>> On Fri, Jul 31, 2020 at 7:34 AM Guido van Rossum 
>> wrote:
>>
>>> So maybe we need to add dict.ordered() which returns a view on the items
>>> that is a Sequence rather than a set? Or ordereditems(), orderedkeys() and
>>> orderedvalues()?
>>>
>>
>> I'm still confused as to when "ordered" became synonymous with "Sequence"
>> -- so wouldn't we want to call these dict.as_sequence() or something like
>> that?
>>
>> And is there a reason that the regular dict views couldn't be both a Set
>> and a Sequence? Looking at the ABCs, I don't see a conflict -- __getitem__,
>> index() and count() would need to be added, and  Set's don't have any of
>> those. (and count could be optimized to always return 0 or 1 for
>> dict.keys() ;-) )
>>
>> But anyway, naming aside, I'm still wondering whether we necessarily want
>> the entire Sequence protocol. For the use cases at hand, isn't indexing and
>> slicing enough?
>>
>> Which brings us to the philosophy of duck typing. I wrote an earlier post
>> about that -- so here's some follow up thoughts. I suggested that I like
>> the "if I only need it to quack, I don't care if it's a duck" approach -- I
>> try to use the quack() method, and I'm happy it if works, and raise an
>> Exception (Or let whatever Exception happens be raised bubble up) if it
>> doesn't.
>>
>> Guido pointed out that having a quack() method isn't enough -- it also
>> needs to actually behave as you expect -- which is the nice thing about
>> ABCs -- if you know something is a Sequence, you don't just know that you
>> can index it, you know that indexing it will do what you expect.
>>
>> Which brings us back to the random.choice() function. It's really simple,
>> and uses exactly the approach I outlined above.
>>
>> def choice(self, seq):
>> """Choose a random element from a non-empty sequence."""
>> try:
>> i = self._randbelow(len(seq))
>> except ValueError:
>> raise IndexError('Cannot choose from an empty sequence') from
>> None
>> return seq[i]
>>
>> It checks the length of the object, picks a random index within that
>> length, and then tries to use that index to get a random item. so anything
>> with a __len__ and a __getitem__ that accepts integers will work.
>>
>> And this has worked "fine" for decades. Should it be checking that seq is
>> actually a sequence? I don't think so -- I like that I can pass in any
>> object that's indexable by an integer.
>>
>> But there's is a potential problem here -- all it does is try to pass an
>> integer to __getitem__. So all Sequences should work. But Mappings also
>> have a __getitem__, but with slightly different semantics -- a Sequence
>> should accept an integer (or object with an __index__) in the range of its
>> size, but a Mapping can accept any valid key. So for the most part, passing
>> a Mapping to random.choice() fails as it should, with a KeyError. But if
>> you happen to have a key that is an integer, it might succeed, but it would
>> not be doing "the right thing" (unless the Mapping happened to be
>> constructed exactly the right way -- but then it should probably just be a
>> Sequence).
>>
>> So: do we need a solution to this? I don't think so, it's simply the
>> nature of a dynamic typing as far as I'm concerned, but if we wanted it to
>> be more robust, we could require (maybe only with a static type
>> declaration) that the object passed in is a Sequence.
>>
>> But I think that would be a shame -- this function doesn't need a

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-08-01 Thread Guido van Rossum

Yeah, it is totally doable to refactor the collection ABCs to have
something in between `Collection` and `Sequence` that just supports
`__getitem__`.

But I would take Marco's research (and Inada's musings) seriously -- we
don't actually want to support `__getitem__`, because of the unpredictable
performance characteristics.

I'm no longer in favor of adding .ordered() -- I think it's better to add
something to itertools, for example first() to get the first item (see Tim
Peters' post), and something related to get the first N items.

On Sat, Aug 1, 2020 at 12:28 PM Christopher Barker 
wrote:

> On Fri, Jul 31, 2020 at 7:34 AM Guido van Rossum  wrote:
>
>> So maybe we need to add dict.ordered() which returns a view on the items
>> that is a Sequence rather than a set? Or ordereditems(), orderedkeys() and
>> orderedvalues()?
>>
>
> I'm still confused as to when "ordered" became synonymous with "Sequence"
> -- so wouldn't we want to call these dict.as_sequence() or something like
> that?
>
> And is there a reason that the regular dict views couldn't be both a Set
> and a Sequence? Looking at the ABCs, I don't see a conflict -- __getitem__,
> index() and count() would need to be added, and  Set's don't have any of
> those. (and count could be optimized to always return 0 or 1 for
> dict.keys() ;-) )
>
> But anyway, naming aside, I'm still wondering whether we necessarily want
> the entire Sequence protocol. For the use cases at hand, isn't indexing and
> slicing enough?
>
> Which brings us to the philosophy of duck typing. I wrote an earlier post
> about that -- so here's some follow up thoughts. I suggested that I like
> the "if I only need it to quack, I don't care if it's a duck" approach -- I
> try to use the quack() method, and I'm happy it if works, and raise an
> Exception (Or let whatever Exception happens be raised bubble up) if it
> doesn't.
>
> Guido pointed out that having a quack() method isn't enough -- it also
> needs to actually behave as you expect -- which is the nice thing about
> ABCs -- if you know something is a Sequence, you don't just know that you
> can index it, you know that indexing it will do what you expect.
>
> Which brings us back to the random.choice() function. It's really simple,
> and uses exactly the approach I outlined above.
>
> def choice(self, seq):
> """Choose a random element from a non-empty sequence."""
> try:
> i = self._randbelow(len(seq))
> except ValueError:
> raise IndexError('Cannot choose from an empty sequence') from
> None
> return seq[i]
>
> It checks the length of the object, picks a random index within that
> length, and then tries to use that index to get a random item. so anything
> with a __len__ and a __getitem__ that accepts integers will work.
>
> And this has worked "fine" for decades. Should it be checking that seq is
> actually a sequence? I don't think so -- I like that I can pass in any
> object that's indexable by an integer.
>
> But there's is a potential problem here -- all it does is try to pass an
> integer to __getitem__. So all Sequences should work. But Mappings also
> have a __getitem__, but with slightly different semantics -- a Sequence
> should accept an integer (or object with an __index__) in the range of its
> size, but a Mapping can accept any valid key. So for the most part, passing
> a Mapping to random.choice() fails as it should, with a KeyError. But if
> you happen to have a key that is an integer, it might succeed, but it would
> not be doing "the right thing" (unless the Mapping happened to be
> constructed exactly the right way -- but then it should probably just be a
> Sequence).
>
> So: do we need a solution to this? I don't think so, it's simply the
> nature of a dynamic typing as far as I'm concerned, but if we wanted it to
> be more robust, we could require (maybe only with a static type
> declaration) that the object passed in is a Sequence.
>
> But I think that would be a shame -- this function doesn't need a full
> Sequence, it only needs a Sized and __getitem__.
>
> In fact, the ABCs are designed to accommodate much of this -- for example,
> the Sized ABC only requires one feature: __len__. And Contains only
> __contains__. As far as I know there are no built-ins (or commonly used
> third party) objects that are ONLY Sized, or ONLY Contains. In fact, at
> least in the collection.abc, every ABC that has __contains__ also has
> __len__. And I can't think of anything that could support "in" that didn't
> have a size -- which could be a failure of imagination on my part. But you
> could type check for Contains is all you wanted to do was know that you
> could use it with "in".
>
> So there are ABCs there simply to support a single method. Which means
> that we could solve the "problem" of random.choice with a "Getitemable"
> ABC.
>
> Ahh -- but here's the rub -- while the ABCs only require certain methods
> -- in fact, it's implied that they have

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-08-01 Thread Christopher Barker

On Fri, Jul 31, 2020 at 7:34 AM Guido van Rossum  wrote:

> So maybe we need to add dict.ordered() which returns a view on the items
> that is a Sequence rather than a set? Or ordereditems(), orderedkeys() and
> orderedvalues()?
>

I'm still confused as to when "ordered" became synonymous with "Sequence"
-- so wouldn't we want to call these dict.as_sequence() or something like
that?

And is there a reason that the regular dict views couldn't be both a Set
and a Sequence? Looking at the ABCs, I don't see a conflict -- __getitem__,
index() and count() would need to be added, and  Set's don't have any of
those. (and count could be optimized to always return 0 or 1 for
dict.keys() ;-) )

But anyway, naming aside, I'm still wondering whether we necessarily want
the entire Sequence protocol. For the use cases at hand, isn't indexing and
slicing enough?

Which brings us to the philosophy of duck typing. I wrote an earlier post
about that -- so here's some follow up thoughts. I suggested that I like
the "if I only need it to quack, I don't care if it's a duck" approach -- I
try to use the quack() method, and I'm happy it if works, and raise an
Exception (Or let whatever Exception happens be raised bubble up) if it
doesn't.

Guido pointed out that having a quack() method isn't enough -- it also
needs to actually behave as you expect -- which is the nice thing about
ABCs -- if you know something is a Sequence, you don't just know that you
can index it, you know that indexing it will do what you expect.

Which brings us back to the random.choice() function. It's really simple,
and uses exactly the approach I outlined above.

def choice(self, seq):
"""Choose a random element from a non-empty sequence."""
try:
i = self._randbelow(len(seq))
except ValueError:
raise IndexError('Cannot choose from an empty sequence') from
None
return seq[i]

It checks the length of the object, picks a random index within that
length, and then tries to use that index to get a random item. so anything
with a __len__ and a __getitem__ that accepts integers will work.

And this has worked "fine" for decades. Should it be checking that seq is
actually a sequence? I don't think so -- I like that I can pass in any
object that's indexable by an integer.

But there's is a potential problem here -- all it does is try to pass an
integer to __getitem__. So all Sequences should work. But Mappings also
have a __getitem__, but with slightly different semantics -- a Sequence
should accept an integer (or object with an __index__) in the range of its
size, but a Mapping can accept any valid key. So for the most part, passing
a Mapping to random.choice() fails as it should, with a KeyError. But if
you happen to have a key that is an integer, it might succeed, but it would
not be doing "the right thing" (unless the Mapping happened to be
constructed exactly the right way -- but then it should probably just be a
Sequence).

So: do we need a solution to this? I don't think so, it's simply the nature
of a dynamic typing as far as I'm concerned, but if we wanted it to be more
robust, we could require (maybe only with a static type declaration) that
the object passed in is a Sequence.

But I think that would be a shame -- this function doesn't need a full
Sequence, it only needs a Sized and __getitem__.

In fact, the ABCs are designed to accommodate much of this -- for example,
the Sized ABC only requires one feature: __len__. And Contains only
__contains__. As far as I know there are no built-ins (or commonly used
third party) objects that are ONLY Sized, or ONLY Contains. In fact, at
least in the collection.abc, every ABC that has __contains__ also has
__len__. And I can't think of anything that could support "in" that didn't
have a size -- which could be a failure of imagination on my part. But you
could type check for Contains is all you wanted to do was know that you
could use it with "in".

So there are ABCs there simply to support a single method. Which means that
we could solve the "problem" of random.choice with a "Getitemable" ABC.

Ahh -- but here's the rub -- while the ABCs only require certain methods --
in fact, it's implied that they have particular behavior as well. And this
is the problem at hand. Both Sequences and Mappings have a __getitem__, but
they have somewhat different meanings, and that meaning is embedded in the
ABC itself, rather than the method: Sequences will take an integer, and
raise a IndexError if its out of range, and Mappings take any hashable, and
will raise a KeyError if it's not there.

So maybe what is needed is an Indexable ABC that implies the Sequence-like
indexing behavior.

Then if we added indexing to dict views, they would be an Indexable, but
not a Sequence.

-CHB

> On Fri, Jul 31, 2020 at 05:29 Ricky Teachey  wrote:
>
>> On Fri, Jul 31, 2020, 2:48 AM Wes Turner  wrote:
>>
>>> # Dicts and DataFrames
>>> - Src:
>>>

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-08-01 Thread Stestagg

I wrote some (better than the previously shared) benchmarks for this change
a while ago.  They are run on cPython with a patch applied that implements
dict_views __getitem__() using a method similar to `lookdict` to perform
indexing on keys/values/etc.

Irrespective of where in the api this logic should exist, the
implementation won't be algorithmically different, (I think, even with a
`.ordered` view, as the view would have to cope with changes to the
underlying dictionary over its lifetime, and external tracking of changes
to dicts is not, afaik, feasible. Unlike for-loop constructs which are
inherently scoped, I feel like you wouldn't get away with forbidding
modifying a dict() if there's a view on keys/values/items still alive, as
these things are first-class objects that can be stored/passed around)

Therefore, all index based lookups would have to use a variant of this
logic (unless someone can come up with a magic O(1) solution ;) Or explicit
compaction is used (If anyone has a patch that adds tracking 'compactness'
over the dict_keys, I can run the tests using it, to measure the impact -
However I'm not personally sure yet if the overheads of this more invasive
change are justified just for enabling indexing).

The cPython patch can be found here:
https://github.com/stestagg/dict_index/blob/master/changes.patch, and the
benchmark results are linked below.

The tl/dr from my perspective is that these results make the change
challenging to continue proposing without a better implementation than the
obvious one. (I was weakly +1 before these results).  Personally, I'm happy
that the numbers give good evidence for this change being more complex than
it at-first seems.

Some notes about the benchmarks, I've adapted an existing, not-related,
test runner for this, so there may be some compromises.  I've tried to be
reasonable about capturing OK timing data, but the intent here isn't to
spot single-% changes in performance, rather it's looking at significant
changes in runtime performance over vastly varying sizes of dicts.  The
repo including the test runner, patches and makefile are here:
https://github.com/stestagg/dict_index and I'm accepting issues/PRs there
if anyone feels that there's an omission or error that's worth correcting.

The numbers are raw, and *do not* have any interpretation layered on them,
there are many snippets of code that are not best-practice or ideal ways of
achieving things, this is as much because I wanted to see what the impact
of these non-optimal patterns would be on common operations, please take
the time to understand the exact test (check the full source if you need)
before making any meaningful decisions based on this data.

Graphs are, by default, plotted on log-log axes, so beware when just
looking at the shapes of the lines that the real-world difference in
run-time is much larger than the absolute line differences suggest.  The
solution that uses direct indexing is always coloured green in the graphs.

As the code involves a tight loop over a simple structure which is very CPU
dependent (and because I can), I've run the benchmarks on a Raspberry pi4
(ARMv7l), and on an AMD pc:

ARM Results:
https://stestagg.github.io/dict_index/pi4.html

PC Results:
https://stestagg.github.io/dict_index/pc.html

Thanks

Steve

On Sat, Aug 1, 2020 at 10:25 AM Marco Sulla 
wrote:

> On Sat, 1 Aug 2020 at 03:00, Inada Naoki  wrote:
>
>> Please teach me if you know any algorithm which has no hole, O(1)
>> deletion, preserving insertion order, and efficient and fast as array.
>>
>
> :)
>
> About the hole, I was thinking that in theory the problem can be
> circumvented using a modified version of lookdict.
> lookdict searches for a key and returns its position in the ma_keys array.
> I suppose it's possible to do the contrary: search for the index and return
> the key.
> What do you think (theoretically speaking)?
>
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/THHMGFINOJAOHQQTRUBHYKWRQZLEJ7OZ/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-08-01 Thread Tim Peters

[Steven D'Aprano ]
>> 
>> The other simple solution is `next(iter(mydict.items()))`.

[Guido]
> That one always makes me uncomfortable, because the StopIteration it
> raises when the dict is empty might be misinterpreted. Basically I never
> want to call next() unless there's a try...except StopIteration: around it,
> and that makes this a lot less simple.

Last time this came up, this appeared to reach near-consensus:

"""
exactly what more-itertools has supplied for years already :-)

If the iterable is empty/exhausted, by default ValueError is raised,
but that can be overridden by also passing an optional argument to
return instead (like dict.pop() in this respect).

So, e.g.,

first([42]) returns 42
first([]) raises ValueError
first([], 42) and first([], default=42) return 42

I don't think it belongs in the builtins.  It doesn't perfectly fit
anywhere, but given its prior history in the more-itertools and
itertoolz packages, Python's itertools package seems the least
annoying ;-) home for it.
|"""
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/OKUKAX6YE54KCKRV5OIP3XX4J2U6U3WC/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-08-01 Thread Christopher Barker

On Sat, Aug 1, 2020 at 2:28 AM Marco Sulla 
wrote:

> On Sat, 1 Aug 2020 at 03:00, Inada Naoki  wrote:
>
>> Please teach me if you know any algorithm which has no hole, O(1)
>> deletion, preserving insertion order, and efficient and fast as array.
>>
>
I would think the goal here would be to re-order once in a while to remove
the holes. But that would take time, of course, so you wouldn't want to do
it on every deletion. But when?

One option: maybe too specialized, but it could re-pack the array when an
indexing operation is made -- since that operation is O(N) anyway. And that
would then address the issue of performance for multiple indexing
operations -- if you made a bunch of indexing operation in a row without
deleting (which would be the case, if this is an alternative to making a
copy in a Sequence first), then the first one would repack the internal
array (presumably faster than making a copy) and the rest would have O(1)
access.

Given that this use case doesn't appear to be very important, I doubt it's
worth it, but it seems it would be possible.

Another thought -- could the re-packing happen whenever the entire dict is
iterated through? Though maybe there's no way to know when that's going to
happen -- all you get are the individual calls for the next one, yes?

> About the hole, I was thinking that in theory the problem can be
circumvented using a modified version of lookdict.
>
> lookdict searches for a key and returns its position in the ma_keys array.
> I suppose it's possible to do the contrary: search for the index and return
> the key.
> What do you think (theoretically speaking)?
>

but isn't searching for the index going to require iterating through the
array until you find it? i.e. that O(N) operation we're trying to avoid?

-CHB

-- 
Christopher Barker, PhD

Python Language Consulting
  - Teaching
  - Scientific Software Development
  - Desktop GUI and Web Development
  - wxPython, numpy, scipy, Cython
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/KGRYRDCLFIH3PAEOZ7HFFIN4SLV5KHIF/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-08-01 Thread Marco Sulla

On Sat, 1 Aug 2020 at 03:00, Inada Naoki  wrote:

> Please teach me if you know any algorithm which has no hole, O(1)
> deletion, preserving insertion order, and efficient and fast as array.
>

:)

About the hole, I was thinking that in theory the problem can be
circumvented using a modified version of lookdict.
lookdict searches for a key and returns its position in the ma_keys array.
I suppose it's possible to do the contrary: search for the index and return
the key.
What do you think (theoretically speaking)?
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/57BDPRYVMMALKERYPRJMQO4AH33FWOV4/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-08-01 Thread Steven D'Aprano

On Sat, Aug 01, 2020 at 03:57:34PM +1000, Chris Angelico wrote:

> Ahh, okay. So it's safe in the sense that it can't accidentally leak a
> confusing exception. Unfortunately a lot of people are going to assume
> that it means "will always give a useful return value". Definitely
> worth being very very clear about the semantics.

What you name the function is up to you :-)


> And it's really not the ideal semantics here anyway. What you actually
> want is ValueError if the dict is empty.

Your wish is my command:

mynext = exception_guard(catch=StopIteration, throw=ValueError)(next)


-- 
Steven
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/B5FIE43QBMMGGFYXDJYMWYB64D4SR6HW/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-08-01 Thread Christopher Barker

On Sat, Aug 1, 2020 at 10:19 AM Wes Turner  wrote:

> > AFAIU, direct subscripting / addressing was not a use case in the design
> phase of the current dict?
>

Nope, -- if it were, it would have presumably been implemented :-)

But order-preserving wasn't really a design goal either, as I understand
it, but rather a side effect of the implementation. As I recall the
conversation, In 3.7, when it was made "official", even then it was less
about how useful it was than that people WILL use it, and will count on it,
even if they are told by the docs that they shouldn't. So we might as well
commit to it. And it is indeed handy now and again.

So the current conversion is the result that once we have order preserving
dicts, maybe we can do a few other things with them, than a use case
driving the decision in the first place.

On Fri, Jul 31, 2020 at 6:35 PM Inada Naoki  wrote:
> There are two major points to optimize.

>
> * Iterating over `next(islice(dict.items(), n, n+1))` will produce n
> temporary tuples.
> * (CPython implementation detail) dict can detect if there is no hole.
> index access is O(1) if there is no hole.
>

Any thoughts on how much of a difference these might make? particularly the
first one. the seconds of course won't help when there are holes, which
would make performance harder to predict.

-CHB

-- 
Christopher Barker, PhD

Python Language Consulting
  - Teaching
  - Scientific Software Development
  - Desktop GUI and Web Development
  - wxPython, numpy, scipy, Cython
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/LHAZAHFYHPZTDHP3EJNM45LJWWCWQA54/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-08-01 Thread Inada Naoki

On Sat, Aug 1, 2020 at 12:40 PM Steven D'Aprano  wrote:
>
> On Fri, Jul 31, 2020 at 08:08:58PM -0700, Guido van Rossum wrote:
>
> > > The other simple solution is `next(iter(mydict.items()))`.
> > >
> >
> > That one always makes me uncomfortable, because the StopIteration it raises
> > when the dict is empty might be misinterpreted. Basically I never want to
> > call next() unless there's a try...except StopIteration: around it, and
> > that makes this a lot less simple.
>
> Acknowledged. But there are ways to solve that which perhaps aren't as
> well known as they should be.
>
> * Use a default: `next(iter(mydict.items()), MISSING)`
>
> * Use a helper to convert StopIteration to something else.
>

There is a most simple solution:

* `[first] = mydict.items()`, or `first, = mydict.items()`

Anyway, should we add some tools to itertools, instead of "itertools recipe"?

* `first(iterable, default=None)` -- same to `[first] = iterable`, but
return default value instead of ValueError when iterable is empty.
* `nth(iterable, n, default=None)`
* `consume(iterator, n=None)`

Regards,
-- 
Inada Naoki  
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/75EFRNZQS7FZWVS5BL2RZ73QKA3D4NZR/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-08-01 Thread Chris Angelico

On Sat, Aug 1, 2020 at 3:50 PM Steven D'Aprano  wrote:
>
> On Sat, Aug 01, 2020 at 02:08:08PM +1000, Chris Angelico wrote:
> > On Sat, Aug 1, 2020 at 1:43 PM Steven D'Aprano  wrote:
>
> > > Some years ago, someone (I think it was Nick Coghlan?) proposed a
> > > standard solution for this issue, a context manager + decorator function
> > > that guarded against a specific exception. Nothing much came of it, but
> > > I did experiment with the idea, and got something which you could use
> > > like this:
> > >
> > > with exception_guard(StopIteration):
> > > first = next(iter(mydict.items()))
> >
> > My understanding of this is that 'first' is unassigned if StopIteration 
> > happens.
>
> Sorry for not being more explicit about what was going on. I was stuck
> in my own head and didn't consider that others might not recall the
> discussion from all those many years ago, mea culpa.
>
> The exception guard doesn't merely catch and discard exceptions. It
> re-raises with a new exception, RuntimeError by default.
>
>
> > > or like this:
> > >
> > > safenext = exception_guard(StopIteration)(next)
> > > first = safenext(iter(mydict.items()))
> >
> > My understanding of this is that I am confused. What does safenext return?
>
> Nothing; it raises RuntimeError.
>

Ahh, okay. So it's safe in the sense that it can't accidentally leak a
confusing exception. Unfortunately a lot of people are going to assume
that it means "will always give a useful return value". Definitely
worth being very very clear about the semantics.

And it's really not the ideal semantics here anyway. What you actually
want is ValueError if the dict is empty. There's no easy way to spell
that generically, so it'd want a helper function, which means it would
probably do well as a method.

ChrisA
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/G3ZVLH3F3WT3XGEZIPHS3NOYOX633S7R/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-07-31 Thread Steven D'Aprano

On Sat, Aug 01, 2020 at 02:08:08PM +1000, Chris Angelico wrote:
> On Sat, Aug 1, 2020 at 1:43 PM Steven D'Aprano  wrote:

> > Some years ago, someone (I think it was Nick Coghlan?) proposed a
> > standard solution for this issue, a context manager + decorator function
> > that guarded against a specific exception. Nothing much came of it, but
> > I did experiment with the idea, and got something which you could use
> > like this:
> >
> > with exception_guard(StopIteration):
> > first = next(iter(mydict.items()))
> 
> My understanding of this is that 'first' is unassigned if StopIteration 
> happens.

Sorry for not being more explicit about what was going on. I was stuck 
in my own head and didn't consider that others might not recall the 
discussion from all those many years ago, mea culpa.

The exception guard doesn't merely catch and discard exceptions. It 
re-raises with a new exception, RuntimeError by default.


> > or like this:
> >
> > safenext = exception_guard(StopIteration)(next)
> > first = safenext(iter(mydict.items()))
> 
> My understanding of this is that I am confused. What does safenext return?

Nothing; it raises RuntimeError.


-- 
Steven
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/4OQ74NIUZV5J6SAF3B7E7WQ2EA33IZMQ/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-07-31 Thread Chris Angelico

On Sat, Aug 1, 2020 at 1:43 PM Steven D'Aprano  wrote:
> Some years ago, someone (I think it was Nick Coghlan?) proposed a
> standard solution for this issue, a context manager + decorator function
> that guarded against a specific exception. Nothing much came of it, but
> I did experiment with the idea, and got something which you could use
> like this:
>
> with exception_guard(StopIteration):
> first = next(iter(mydict.items()))

My understanding of this is that 'first' is unassigned if StopIteration happens.

> or like this:
>
> safenext = exception_guard(StopIteration)(next)
> first = safenext(iter(mydict.items()))
>

My understanding of this is that I am confused. What does safenext return?

ChrisA
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/U2ELA7SS6MBDCJ6ZQSN5O324RPNSZTDK/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-07-31 Thread Wes Turner

On Fri, Jul 31, 2020 at 10:59 PM Steven D'Aprano 
wrote:

> [...]

The first request in this thread was from Hans:
>
>
> https://mail.python.org/archives/list/python-ideas@python.org/message/S7UMTWK65X6BJDYZ3SSU7I7HOIASDMMJ/
>
> He is using a dict to hold an array of columns indexed by name
> `{column_name: column}` and wanted to re-order and insert columns at
> arbitrary positions.
>

Pandas solves for columnar data.
SQL is one source and destination for columnar data which Pandas supports.
Pandas handles NULL/None/NaN more consistently than dict.get("key", None).
https://pandas.pydata.org/pandas-docs/stable/user_guide/io.html#io-sql
https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy

assert df.columns.tolist() == ['a', 'b', 'c']

# this creates a copy
df2 = df[['b', 'c', 'a']]

# this doesn't create a copy
df.reindex(columns=['b', 'c', 'a'])

# this inserts a column after column b
df.insert(df.columns.get_loc('b'), 'newcolumn', df['c'])
df.insert(df.columns.tolist().index('b'), 'newcolumn2', df['c'])

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.insert.html
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.reindex.html

If you need to reorder rows (or columns transposed (df.T)), you could
select with .loc[[list, of, indexvalues]] or .iloc[[list, of, ints]]
http://www.datasciencemadesimple.com/reindex-python-change-order-column-rows-pandas-dataframe/

...

To accomplish the same with the Python standard library, you'd need to
create a data structure that is an amalgamation of list and dict: a
MutableMapping with a Sequence interface for at least .keys() (or
.keyseq()).

Is there already a "pure python" Dataframe?

Do whatever you want,
but the Pandas DataFrame API is also available in Dask, Modin, and CuDF;
for distributed and larger-than-RAM use.
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/XYRWT5YIA63ANAECLSE5LDCFVXDFUNMS/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-07-31 Thread Steven D'Aprano

On Fri, Jul 31, 2020 at 08:08:58PM -0700, Guido van Rossum wrote:

> > The other simple solution is `next(iter(mydict.items()))`.
> >
> 
> That one always makes me uncomfortable, because the StopIteration it raises
> when the dict is empty might be misinterpreted. Basically I never want to
> call next() unless there's a try...except StopIteration: around it, and
> that makes this a lot less simple.

Acknowledged. But there are ways to solve that which perhaps aren't as 
well known as they should be.

* Use a default: `next(iter(mydict.items()), MISSING)`

* Use a helper to convert StopIteration to something else.

Some years ago, someone (I think it was Nick Coghlan?) proposed a 
standard solution for this issue, a context manager + decorator function 
that guarded against a specific exception. Nothing much came of it, but 
I did experiment with the idea, and got something which you could use 
like this:

with exception_guard(StopIteration):
first = next(iter(mydict.items()))

or like this:

safenext = exception_guard(StopIteration)(next)
first = safenext(iter(mydict.items()))

I think that would be a good tool for the functools library, but I 
acknowledge that even if standard, it would be a little too obscure for 
most people thinking "I need the first key from this dict".

-- 
Steven
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/7HUGB5JJIJS6JCQBU4GM4X7BPYYNPRC3/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-07-31 Thread Steven D'Aprano

On Fri, Jul 31, 2020 at 04:30:22PM -0700, Guido van Rossum wrote:

> > So:
> > dicts preserve order
> > the dict_views preserve this same order
> >
> > Some think that we can't call them "ordered", because somehow that implies
> > other things, like the ability to insert, etc, but while I don't get that
> > argument, fine, let's call them "order preserving".
> >
> 
> Meh, who cares. (The confusion is usually between "ordered" and "sorted"
> but we're past that here, so let's just keep saying "ordered".)

I care, and I think other people should care too, because calling dicts 
"ordered" with no qualifying "by insertion order" provably confuses 
users. See for example the original poster who started this thread, who 
imagined that because dicts are now "ordered" he should be able to 
reorder the keys and insert keys in arbitrary positions.

(That doesn't mean that every single reference to a dict's order needed 
to explicitly refer back to insertion order. I'm not unreasonably 
pedantic about this :-)

[Christopher Barker]
> > 1) Because adding ANYTHING new is taking on a commitment to preserve it in
> > the future, and it's more code to write, maintain, and document. So
> > regardless of any other argument, it shouldn't be added without a
> > reasonable use case(s) -- what's reasonable is subjective, of course.

Agreed. It is much harder to remove unneeded functions than to add it, 
so we should be conservative about adding new functions. We should 
beware bloat and API creep.

> > 2) Because it could be an "attractive nuisance" while dicts are order
> > preserving, their internal structure is such that you cannot find the nth
> > item without iterating through n items, making access O(n), rather than
> > O(1) for access by key or access of the usual Sequences -- Folks expect
> > numerical indexing to be O(1), so it could be tricky. However, the only
> > other way to get an n'th item is to copy everything into a Sequence, which
> > is a lot slower, or to use islice or next() to do the order N access by
> > hand. So this is an attractive nuisance in the use case of wanting to take
> > multiple items by index, in which case, making a sequence first would be
> > better than directly accessing the dict_view object.

I'm not entirely sure that's a strong argument against this.

I never really bought the argument that having O(N) indexing was bad 
because repeated indexing would then be O(N**2). That's true, but for 
small enough N everything is fast, and even O(N**3) or worse can be 
"fast enough" for small N.

The builtins have plenty of other functions which are O(N), e.g. 
list.count, str.find etc. While one might build O(N**2) operations on 
top of those, typically people don't.

So I think that, provided the underlying indexing access is "fast 
enough", which I daresay it would be, I don't think we should care too 
much about the risk of people writing technically O(N**2) code like:

for i in range(len(mydict)):
key = mydict.getkey(i)  # get key by index

This is no worse than what we can already write using existing list or 
string methods. Beginners can and do write much worse code, and 
professionals will learn better.

-- 
Steven
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/UTXX3FUQWCM2QVZYUPESG6FXTNESWO54/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-07-31 Thread Guido van Rossum

On Fri, Jul 31, 2020 at 7:59 PM Steven D'Aprano  wrote:

> On Fri, Jul 31, 2020 at 04:08:43PM -0700, Guido van Rossum wrote:
> [...]
> > I'm guessing that indexing by 0, if it were possible, would be a
> convenient
> > idiom to implement the "first item" operation that has been requested
> > numerous times (at least for dicts).
>
> Indeed, that seems to be the only use-case which seems to be even
> remotely common. `dict.popitem` would do it, of course, but it also
> mutates the dict.
>
> The other simple solution is `next(iter(mydict.items()))`.
>

That one always makes me uncomfortable, because the StopIteration it raises
when the dict is empty might be misinterpreted. Basically I never want to
call next() unless there's a try...except StopIteration: around it, and
that makes this a lot less simple.


> The bottom line here seems to me, as far as I can tell, is that being
> able to fetch a key and/or value by index is of only marginal use.
>

Agreed.


> > Slicing would be useful to get the
> > first N items of a huge dict without materializing the full list of items
> > as a list object, which brought Chris B to request this in the first
> place.
>
> The first request in this thread was from Hans:
>
>
> https://mail.python.org/archives/list/python-ideas@python.org/message/S7UMTWK65X6BJDYZ3SSU7I7HOIASDMMJ/
>
> He is using a dict to hold an array of columns indexed by name
> `{column_name: column}` and wanted to re-order and insert columns at
> arbitrary positions.
>

A bare dict is just not the data structure for that problem.

-- 
--Guido van Rossum (python.org/~guido)
*Pronouns: he/him **(why is my pronoun here?)*

___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/ZAAYWB74TFXQTYXW2IWZNQNS235HOIJM/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-07-31 Thread Steven D'Aprano

On Fri, Jul 31, 2020 at 04:08:43PM -0700, Guido van Rossum wrote:

> Maybe it is common in numpy and pandas to keep adding operations to the
> same object until it breaks, but the key and items views already implement
> the Set ABC, and I'd rather refrain from having them *also* implement the
> Sequence ABC.

+1

I'm not a huge fan of pandas' APIs, I don't think that "pandas does it 
this way" is something we ought to emulate.

> I'm guessing that indexing by 0, if it were possible, would be a convenient
> idiom to implement the "first item" operation that has been requested
> numerous times (at least for dicts).

Indeed, that seems to be the only use-case which seems to be even 
remotely common. `dict.popitem` would do it, of course, but it also 
mutates the dict.

The other simple solution is `next(iter(mydict.items()))`.

The bottom line here seems to me, as far as I can tell, is that being 
able to fetch a key and/or value by index is of only marginal use.

> Slicing would be useful to get the
> first N items of a huge dict without materializing the full list of items
> as a list object, which brought Chris B to request this in the first place.

The first request in this thread was from Hans:

https://mail.python.org/archives/list/python-ideas@python.org/message/S7UMTWK65X6BJDYZ3SSU7I7HOIASDMMJ/

He is using a dict to hold an array of columns indexed by name 
`{column_name: column}` and wanted to re-order and insert columns at 
arbitrary positions.

-- 
Steven
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/T3SSBMEZCIOHJJ7REJXE6XZLTNQQWVFJ/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-07-31 Thread Inada Naoki

On Sat, Aug 1, 2020 at 10:19 AM Wes Turner  wrote:
>
> We should be reading the source: 
> https://github.com/python/cpython/blob/master/Objects/dictobject.c
>
> AFAIU, direct subscripting / addressing was not a use case in the design 
> phase of the current dict?
>
> Could a __getitem__(slice_or_int_index) be implemented which just skips over 
> the NULLs?
> Or would that be no faster than or exactly what islice does when next()'ing 
> through?
>

There are two major points to optimize.

* Iterating over `next(islice(dict.items(), n, n+1))` will produce n
temporary tuples.
* (CPython implementation detail) dict can detect if there is no hole.
index access is O(1) if there is no hole.

-- 
Inada Naoki  
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/BL24ZADLYFMRXQXHJ3HNQQMCEDTORLZH/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-07-31 Thread Inada Naoki

On Sat, Aug 1, 2020 at 8:14 AM Christopher Barker  wrote:
>
>
> If that's all we've come up with in this lengthy thread, it's probably not 
> worth it -- after all, both of those can be efficiently accomplished with a 
> little help from itertools.islice and/or next().
>

100% agree.

> But I think we all agree that those tools are less newbie-friendly -- but 
> they should be learned at some point, so maybe that's OK (and there is always 
> the wrap it with a list approach, which is pretty newbie friendly)

I don't agree with it.  It is somewhat newbie-frindly, but somewhat
newbie unfriendly.

Newbie will use random.sample(d.items()) or get-by-index without
knowing it is O(n) even when they can create a list once and use it
repeatedly.
When people learn islice, they will learn it is not O(1) operation too.

If dict views support direct indexing, it is very difficult to notice
it is O(n) operation.  They just think "Oh Python is fucking slow!".
So it is newbie-unfriendly at some point.

Regards,
-- 
Inada Naoki  
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/4FR2RPVA3HXVSSZMERSEYCPRHOVOX5A2/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-07-31 Thread Wes Turner

We should be reading the source:
https://github.com/python/cpython/blob/master/Objects/dictobject.c

AFAIU, direct subscripting / addressing was not a use case in the design
phase of the current dict?

Could a __getitem__(slice_or_int_index) be implemented which just skips
over the NULLs?
Or would that be no faster than or exactly what islice does when next()'ing
through?

On Fri, Jul 31, 2020 at 9:14 PM Inada Naoki  wrote:

> On Sat, Aug 1, 2020 at 10:04 AM Wes Turner  wrote:
> >
> > Actually, I think the reverse traversal case is the worst case because:
> it's not possible to use negative subscripts with islice (because that
> would require making a full copy).
> >
> > This doesn't work:
> > >>> islice(dict.keys(), -1, -5)
> >
> > Reverse traversal did work in Python 2 but was foregone when making
> .keys() a view in Python 3 in order to avoid lulling users into making
> usually unnecessary copies.
> >
>
> dict is reversible now. You can do `islice(dict, 0, 5)`.
>
> --
> Inada Naoki  
>
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/BKWWSZHVPRUIGJQQPDO2Y4KPJRBVCT2R/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-07-31 Thread Inada Naoki

On Sat, Aug 1, 2020 at 10:04 AM Wes Turner  wrote:
>
> Actually, I think the reverse traversal case is the worst case because: it's 
> not possible to use negative subscripts with islice (because that would 
> require making a full copy).
>
> This doesn't work:
> >>> islice(dict.keys(), -1, -5)
>
> Reverse traversal did work in Python 2 but was foregone when making .keys() a 
> view in Python 3 in order to avoid lulling users into making usually 
> unnecessary copies.
>

dict is reversible now. You can do `islice(dict, 0, 5)`.

-- 
Inada Naoki  
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/7F2UCQPDZ3SWSQHVPBBQQKNHVIQWKLV3/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-07-31 Thread Marco Sulla

Ok... I wrong. The array of items have a NULL key and value if one item is
deleted. Sorry for the confusion.

On Sat, 1 Aug 2020 at 01:09, Guido van Rossum  wrote:

> the key and items views already implement the Set ABC, and I'd rather
> refrain from having them *also* implement the Sequence ABC.
>

I'm not pro or against the proposal, but maybe count it's not useful at all
for keys and items. Furthermore, dict implements __reversed__, that is not
a Mapping method, but it's a Sequence method.

I think that it's useful to implement methods of other APIs without
changing their name or implement the full other API... if it's useful ^^
But maybe I'm missing something in the general picture.
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/LXPWDPYTFT3SP2XFHG2CXNSTHACSLFVA/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-07-31 Thread Inada Naoki

On Sat, Aug 1, 2020 at 9:55 AM Marco Sulla  wrote:
>
> On Sat, 1 Aug 2020 at 02:30, Stestagg  wrote:
>>
>> The dict keys is compact only *until* you delete an item, at which point, a 
>> hole is left in the array
>
> No, the array of items has no hole. The hole is inserted in the hashtable.

Yes, the array of items has hole. Otherwise, `del d[k]` become `O(n)`,
or `del d[k]` won't preserve insertion order.
Please teach me if you know any algorithm which has no hole, O(1)
deletion, preserving insertion order, and efficient and fast as array.

-- 
Inada Naoki  
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/TH27RES3NQYIVZE7YD2QEQRAK7UCDVXL/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-07-31 Thread Wes Turner

Actually, I think the reverse traversal case is the worst case because:
it's not possible to use negative subscripts with islice (because that
would require making a full copy).

This doesn't work:
>>> islice(dict.keys(), -1, -5)

Reverse traversal did work in Python 2 but was foregone when making .keys()
a view in Python 3 in order to avoid lulling users into making usually
unnecessary copies.

If it's not possible to support
(dict_view||dict_sequence_view).__getitem__(slice_or_index) without causing
a performance regression in dict, which directly affects core performance,
I'm not sure that this use case is worth it either. There's always the
linked list odict.

There are probably tests somewhere that show what happens when we delete a
dict key and value? I'm still somewhat mystified by that, tooL

On Fri, Jul 31, 2020 at 8:44 PM Guido van Rossum  wrote:

> This is not great news. A solution could be to make dict.ordered() force
> compaction -- if there are no deleted keys this would be a no-op. We'd have
> to be able to tell in constant time whether this is the case. But yeah,
> this would be a dangerous thing in the hands of folks below wizard level.
> Another solution could be to make dict.ordered() fail if there are deleted
> keys. But that's not a great experience either.
>
> All in all I retract this idea.
>
> On Fri, Jul 31, 2020 at 5:31 PM Stestagg  wrote:
>
>> On Sat, 1 Aug 2020 at 00:32, Guido van Rossum  wrote:
>>
>>> If true this would be a huge argument against adding indexing and
>>> slicing (other than the special case starting with 0). However, I don't
>>> think it's true. The dict implementation (again, starting in 3.6) actually
>>> stores the list of keys in a compact array, in the insertion order. The
>>> hash table stores indices into this array. Moreover, in order to implement
>>> the ordered property, you pretty much have to do this (since the hash table
>>> *definitely* isn't going to be ordered :-). So indexing and slicing into
>>> this array would seem to be possible, unless I am missing something. I'm
>>> CC"ing Inada Naoki because it's his code -- maybe he can enlighten us.
>>>
>>
>> I’m not Inada Naoki :). So happy to be corrected, but I looked into this
>> quite closely.
>>
>> The dict keys is compact only *until* you delete an item, at which point,
>> a hole is left in the array, which defeats direct lookups (either by
>> indexing or via a view).  Order preservation is still kept as code that
>> iterates over dictionaries knows to just skip the holes.
>>
>> I’m unsure when/if compaction is called however. So there may be a
>> technique for forcing the keys to be compacted again.  This would cause
>> very unpredictable performance in the case of large dictionaries (a main
>> use-case here)
>>
>> Steve
>>
>>
>>> ___
>> Python-ideas mailing list -- python-ideas@python.org
>> To unsubscribe send an email to python-ideas-le...@python.org
>> https://mail.python.org/mailman3/lists/python-ideas.python.org/
>> Message archived at
>> https://mail.python.org/archives/list/python-ideas@python.org/message/CWQMLXNOEPCXUXPQSTUOI5VUF4V7R36B/
>> Code of Conduct: http://python.org/psf/codeofconduct/
>>
>
>
> --
> --Guido van Rossum (python.org/~guido)
> *Pronouns: he/him **(why is my pronoun here?)*
> 
> ___
> Python-ideas mailing list -- python-ideas@python.org
> To unsubscribe send an email to python-ideas-le...@python.org
> https://mail.python.org/mailman3/lists/python-ideas.python.org/
> Message archived at
> https://mail.python.org/archives/list/python-ideas@python.org/message/JY2ZJTOE4VHC36VWTMZY4QM7U2NUCYT5/
> Code of Conduct: http://python.org/psf/codeofconduct/
>
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/FTFGGSQNU5SXHHIWULOH3Y2ROT7TZW46/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-07-31 Thread Wes Turner

On Fri, Jul 31, 2020 at 8:44 PM Guido van Rossum  wrote:

> This is not great news. A solution could be to make dict.ordered() force
> compaction -- if there are no deleted keys this would be a no-op. We'd have
> to be able to tell in constant time whether this is the case. But yeah,
> this would be a dangerous thing in the hands of folks below wizard level.
> Another solution could be to make dict.ordered() fail if there are deleted
> keys. But that's not a great experience either.
>
> All in all I retract this idea.
>
> On Fri, Jul 31, 2020 at 5:31 PM Stestagg  wrote:
>
>> On Sat, 1 Aug 2020 at 00:32, Guido van Rossum  wrote:
>>
>>> If true this would be a huge argument against adding indexing and
>>> slicing (other than the special case starting with 0). However, I don't
>>> think it's true. The dict implementation (again, starting in 3.6) actually
>>> stores the list of keys in a compact array, in the insertion order. The
>>> hash table stores indices into this array. Moreover, in order to implement
>>> the ordered property, you pretty much have to do this (since the hash table
>>> *definitely* isn't going to be ordered :-). So indexing and slicing into
>>> this array would seem to be possible, unless I am missing something. I'm
>>> CC"ing Inada Naoki because it's his code -- maybe he can enlighten us.
>>>
>>
>> I’m not Inada Naoki :). So happy to be corrected, but I looked into this
>> quite closely.
>>
>> The dict keys is compact only *until* you delete an item, at which point,
>> a hole is left in the array, which defeats direct lookups (either by
>> indexing or via a view).  Order preservation is still kept as code that
>> iterates over dictionaries knows to just skip the holes.
>>
>> I’m unsure when/if compaction is called however. So there may be a
>> technique for forcing the keys to be compacted again.  This would cause
>> very unpredictable performance in the case of large dictionaries (a main
>> use-case here)
>>
>> Steve
>>
>>
>>> ___
>> Python-ideas mailing list -- python-ideas@python.org
>> To unsubscribe send an email to python-ideas-le...@python.org
>> https://mail.python.org/mailman3/lists/python-ideas.python.org/
>> Message archived at
>> https://mail.python.org/archives/list/python-ideas@python.org/message/CWQMLXNOEPCXUXPQSTUOI5VUF4V7R36B/
>> Code of Conduct: http://python.org/psf/codeofconduct/
>>
>
>
> --
> --Guido van Rossum (python.org/~guido)
> *Pronouns: he/him **(why is my pronoun here?)*
> 
> ___
> Python-ideas mailing list -- python-ideas@python.org
> To unsubscribe send an email to python-ideas-le...@python.org
> https://mail.python.org/mailman3/lists/python-ideas.python.org/
> Message archived at
> https://mail.python.org/archives/list/python-ideas@python.org/message/JY2ZJTOE4VHC36VWTMZY4QM7U2NUCYT5/
> Code of Conduct: http://python.org/psf/codeofconduct/
>
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/JI34C44QFZTIGOMQVX4JJLHSGPTYBTGV/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-07-31 Thread Marco Sulla

On Sat, 1 Aug 2020 at 02:30, Stestagg  wrote:

> The dict keys is compact only *until* you delete an item, at which point,
> a hole is left in the array
>

No, the array of items has no hole. The hole is inserted in the hashtable.
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/YNL6HGGXI7HVEX7QW2YM7WCEVGXNLKSL/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-07-31 Thread Guido van Rossum

This is not great news. A solution could be to make dict.ordered() force
compaction -- if there are no deleted keys this would be a no-op. We'd have
to be able to tell in constant time whether this is the case. But yeah,
this would be a dangerous thing in the hands of folks below wizard level.
Another solution could be to make dict.ordered() fail if there are deleted
keys. But that's not a great experience either.

All in all I retract this idea.

On Fri, Jul 31, 2020 at 5:31 PM Stestagg  wrote:

> On Sat, 1 Aug 2020 at 00:32, Guido van Rossum  wrote:
>
>> If true this would be a huge argument against adding indexing and slicing
>> (other than the special case starting with 0). However, I don't think it's
>> true. The dict implementation (again, starting in 3.6) actually stores the
>> list of keys in a compact array, in the insertion order. The hash table
>> stores indices into this array. Moreover, in order to implement the ordered
>> property, you pretty much have to do this (since the hash table
>> *definitely* isn't going to be ordered :-). So indexing and slicing into
>> this array would seem to be possible, unless I am missing something. I'm
>> CC"ing Inada Naoki because it's his code -- maybe he can enlighten us.
>>
>
> I’m not Inada Naoki :). So happy to be corrected, but I looked into this
> quite closely.
>
> The dict keys is compact only *until* you delete an item, at which point,
> a hole is left in the array, which defeats direct lookups (either by
> indexing or via a view).  Order preservation is still kept as code that
> iterates over dictionaries knows to just skip the holes.
>
> I’m unsure when/if compaction is called however. So there may be a
> technique for forcing the keys to be compacted again.  This would cause
> very unpredictable performance in the case of large dictionaries (a main
> use-case here)
>
> Steve
>
>
>> ___
> Python-ideas mailing list -- python-ideas@python.org
> To unsubscribe send an email to python-ideas-le...@python.org
> https://mail.python.org/mailman3/lists/python-ideas.python.org/
> Message archived at
> https://mail.python.org/archives/list/python-ideas@python.org/message/CWQMLXNOEPCXUXPQSTUOI5VUF4V7R36B/
> Code of Conduct: http://python.org/psf/codeofconduct/
>


-- 
--Guido van Rossum (python.org/~guido)
*Pronouns: he/him **(why is my pronoun here?)*

___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/JY2ZJTOE4VHC36VWTMZY4QM7U2NUCYT5/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-07-31 Thread Stestagg

On Sat, 1 Aug 2020 at 00:32, Guido van Rossum  wrote:

> If true this would be a huge argument against adding indexing and slicing
> (other than the special case starting with 0). However, I don't think it's
> true. The dict implementation (again, starting in 3.6) actually stores the
> list of keys in a compact array, in the insertion order. The hash table
> stores indices into this array. Moreover, in order to implement the ordered
> property, you pretty much have to do this (since the hash table
> *definitely* isn't going to be ordered :-). So indexing and slicing into
> this array would seem to be possible, unless I am missing something. I'm
> CC"ing Inada Naoki because it's his code -- maybe he can enlighten us.
>

I’m not Inada Naoki :). So happy to be corrected, but I looked into this
quite closely.

The dict keys is compact only *until* you delete an item, at which point, a
hole is left in the array, which defeats direct lookups (either by indexing
or via a view).  Order preservation is still kept as code that iterates
over dictionaries knows to just skip the holes.

I’m unsure when/if compaction is called however. So there may be a
technique for forcing the keys to be compacted again.  This would cause
very unpredictable performance in the case of large dictionaries (a main
use-case here)

Steve

>
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/CWQMLXNOEPCXUXPQSTUOI5VUF4V7R36B/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-07-31 Thread Guido van Rossum

On Fri, Jul 31, 2020 at 4:12 PM Christopher Barker 
wrote:

> On Fri, Jul 31, 2020 at 2:56 PM Marco Sulla 
> wrote:
>
>> Yes. Since now dicts are ordered by insertion, also keys, values and
>> items are ordered the same way.
>>
>
> Preservation of oider was codified in version 3.7 (? at some point,
> anyway). And while it may not explicitly say that the views will present
> the same order, I haven't heard anyone complain about that in this
> discussion.
>

It happened in the implementation in 3.6, and in 3.7 we added it as a
guarantee to the language reference. But this means you can count on it
since 3.6.


> So:
> dicts preserve order
> the dict_views preserve this same order
>
> Some think that we can't call them "ordered", because somehow that implies
> other things, like the ability to insert, etc, but while I don't get that
> argument, fine, let's call them "order preserving".
>

Meh, who cares. (The confusion is usually between "ordered" and "sorted"
but we're past that here, so let's just keep saying "ordered".)

So in terms of what is defined by the language, and what is technically
> possible, dict views could be indexed and sliced with clear semantics. Most
> other Sequence operations are not possible, so no, dicts are not Sequences,
> and the dict views aren't either. So we won't register them with the
> Sequence ABC, no problem.
>
> So why not add this? Form what I can tell from this thread, there are
> these reasons:
>
> 1) Because adding ANYTHING new is taking on a commitment to preserve it in
> the future, and it's more code to write, maintain, and document. So
> regardless of any other argument, it shouldn't be added without a
> reasonable use case(s) -- what's reasonable is subjective, of course.
>
> 2) Because it could be an "attractive nuisance" while dicts are order
> preserving, their internal structure is such that you cannot find the nth
> item without iterating through n items, making access O(n), rather than
> O(1) for access by key or access of the usual Sequences -- Folks expect
> numerical indexing to be O(1), so it could be tricky. However, the only
> other way to get an n'th item is to copy everything into a Sequence, which
> is a lot slower, or to use islice or next() to do the order N access by
> hand. So this is an attractive nuisance in the use case of wanting to take
> multiple items by index, in which case, making a sequence first would be
> better than directly accessing the dict_view object.
>

If true this would be a huge argument against adding indexing and slicing
(other than the special case starting with 0). However, I don't think it's
true. The dict implementation (again, starting in 3.6) actually stores the
list of keys in a compact array, in the insertion order. The hash table
stores indices into this array. Moreover, in order to implement the ordered
property, you pretty much have to do this (since the hash table
*definitely* isn't going to be ordered :-). So indexing and slicing into
this array would seem to be possible, unless I am missing something. I'm
CC"ing Inada Naoki because it's his code -- maybe he can enlighten us.


> 3) dicts are not Sequences, they are Mappings, so they shouldn't have
> Sequence features. dict_views are Sets, not Sequences, so they shouldn't
> have Sequence features.
>

Right, I said this (with more words) in my last email, a few minutes ago.

Point 3) I may have misrepresented it, it was explained in a lot more
> detail, but I'm pretty sure that's what it comes down to. But I'm going to
> talk about my understanding of the point, which is pretty much how I wrote
> it. If I really did misrepresent it, then feel free to kibitz some more
>
> It seems like I have a different philosophy about typing and duck typing
> than some in this converstaion. I appreciate the ABCs, and that they
> clearly nail down what types need to be in order to "BE" one of the ABCs --
> and see how this is useful. But I don't see it as restrictive. An object is
> not a Sequence unless it fully conforms to the Sequence ABC, sure. But that
> doesn't mean an object can't have some, but not all, of the behavior of a
> Sequence without having all the rest. In this example, implementing integer
> indexing and slicing on dict_views would not make them a Sequence, but that
> doesn't mean you can't do it. It's also the case that any type can BE a
> particular ABC, and still have other functionality. So, in this case, the
> dict_views are Sets, but we can add other things to them of course.
>

And indeed, here our philosophies diverge (see my last email).


> So on to "duck typing" -- the term is comes from the aphorism: "If it
> looks like a duck, swims like a duck, and quacks like a duck, then it
> probably *is* a duck" (https://en.wikipedia.org/wiki/Duck_test).
>
> So that pretty much maps to ABCs: the ABC for a duck specifies what a duck
> is -- it's something that looks, swims, and quacks like a duck.
>
> But I've always taken a more flexible view on duck

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-07-31 Thread Christopher Barker

On Fri, Jul 31, 2020 at 2:56 PM Marco Sulla 
wrote:

> Yes. Since now dicts are ordered by insertion, also keys, values and items
> are ordered the same way.
>

Preservation of oider was codified in version 3.7 (? at some point,
anyway). And while it may not explicitly say that the views will present
the same order, I haven't heard anyone complain about that in this
discussion.

So:
dicts preserve order
the dict_views preserve this same order

Some think that we can't call them "ordered", because somehow that implies
other things, like the ability to insert, etc, but while I don't get that
argument, fine, let's call them "order preserving".

So in terms of what is defined by the language, and what is technically
possible, dict views could be indexed and sliced with clear semantics. Most
other Sequence operations are not possible, so no, dicts are not Sequences,
and the dict views aren't either. So we won't register them with the
Sequence ABC, no problem.

So why not add this? Form what I can tell from this thread, there are these
reasons:

1) Because adding ANYTHING new is taking on a commitment to preserve it in
the future, and it's more code to write, maintain, and document. So
regardless of any other argument, it shouldn't be added without a
reasonable use case(s) -- what's reasonable is subjective, of course.

2) Because it could be an "attractive nuisance" while dicts are order
preserving, their internal structure is such that you cannot find the nth
item without iterating through n items, making access O(n), rather than
O(1) for access by key or access of the usual Sequences -- Folks expect
numerical indexing to be O(1), so it could be tricky. However, the only
other way to get an n'th item is to copy everything into a Sequence, which
is a lot slower, or to use islice or next() to do the order N access by
hand. So this is an attractive nuisance in the use case of wanting to take
multiple items by index, in which case, making a sequence first would be
better than directly accessing the dict_view object.

3) dicts are not Sequences, they are Mappings, so they shouldn't have
Sequence features. dict_views are Sets, not Sequences, so they shouldn't
have Sequence features.

Point 3) I may have misrepresented it, it was explained in a lot more
detail, but I'm pretty sure that's what it comes down to. But I'm going to
talk about my understanding of the point, which is pretty much how I wrote
it. If I really did misrepresent it, then feel free to kibitz some more

It seems like I have a different philosophy about typing and duck typing
than some in this converstaion. I appreciate the ABCs, and that they
clearly nail down what types need to be in order to "BE" one of the ABCs --
and see how this is useful. But I don't see it as restrictive. An object is
not a Sequence unless it fully conforms to the Sequence ABC, sure. But that
doesn't mean an object can't have some, but not all, of the behavior of a
Sequence without having all the rest. In this example, implementing integer
indexing and slicing on dict_views would not make them a Sequence, but that
doesn't mean you can't do it. It's also the case that any type can BE a
particular ABC, and still have other functionality. So, in this case, the
dict_views are Sets, but we can add other things to them of course.

So on to "duck typing" -- the term is comes from the aphorism: "If it looks
like a duck, swims like a duck, and quacks like a duck, then it probably
*is* a duck" (https://en.wikipedia.org/wiki/Duck_test).

So that pretty much maps to ABCs: the ABC for a duck specifies what a duck
is -- it's something that looks, swims, and quacks like a duck.

But I've always taken a more flexible view on duck typing -- if I want to
know if it's a duck, yes, I need the ABC. But I don't usually care if it's
a duck -- I care if it does the one or two things I need it to do. So if I
need a quacker, then anything that quacks like a duck is fine with me, even
if it can't swim.

Bringing this back to this concrete example:

One of the use cases brought up was being able to choose a random item from
a dict. You can't pass a dict (or dict_views) to random.choice. But not
because a dict isn't a Sequence, but because they aren't subscriptable:

In [14]: random.choice(d.keys())

---
TypeError Traceback (most recent call last)
 in 
> 1 random.choice(d.keys())

~/miniconda3/envs/py3/lib/python3.8/random.py in choice(self, seq)
289 except ValueError:
290 raise IndexError('Cannot choose from an empty
sequence') from None
--> 291 return seq[i]
292
293 def shuffle(self, x, random=None):

TypeError: 'dict_keys' object is not subscriptable

Everyone on this thread knows this, but to be explicit, anything with a
length and that is subscriptable with integers between 0 and that length
can be used with random.choice. I think this is

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-07-31 Thread Guido van Rossum

On Fri, Jul 31, 2020 at 2:56 PM Marco Sulla 
wrote:

> On Fri, 31 Jul 2020 at 22:14, Chris Angelico  wrote:
>
>>  So, constructing a tuple or list from the keys or items WILL give you a
>> sequence.
>>
>
> Yes. Since now dicts are ordered by insertion, also keys, values and items
> are ordered the same way.
>
> It seems to me more simple to add some sequence methods to dict views,
> like subscript, slicing and index(), instead of creating other 3 methods
> and 3 data types.
>

Maybe it is common in numpy and pandas to keep adding operations to the
same object until it breaks, but the key and items views already implement
the Set ABC, and I'd rather refrain from having them *also* implement the
Sequence ABC. For one thing, when they want to pass one of those views as
an argument to another function *as a set or as a sequence*, this would
force the user to show the reader what they meant, and this will make the
code more readable. It would also clarify whether some other mapping
supports the same operation (the presence of the new method would indicate
it).

>
> What I have not understood well is when you need to index a dict by
> position or slice it.
>

I'm guessing that indexing by 0, if it were possible, would be a convenient
idiom to implement the "first item" operation that has been requested
numerous times (at least for dicts). Slicing would be useful to get the
first N items of a huge dict without materializing the full list of items
as a list object, which brought Chris B to request this in the first place.
The dict would have to be huge for this materialization to matter, but
Chris B did encounter the situation.

What I don't understand yet is how *frequently* the latter operation would
be useful -- if it's infrequently, Chris B's own solution using islice() on
the items() view looked pretty decent to me, and not that hard to come up
with for someone who made it that far. For the former I expect that sooner
or later someone will write a PEP and it will be accepted (assuming the PEP
doesn't overreach).

-- 
--Guido van Rossum (python.org/~guido)
*Pronouns: he/him **(why is my pronoun here?)*

___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/LOPDTPFRDSUGKYZEKJILZSWABQSUHC7X/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-07-31 Thread Marco Sulla

On Fri, 31 Jul 2020 at 22:14, Chris Angelico  wrote:

>  So, constructing a tuple or list from the keys or items WILL give you a
> sequence.
>

Yes. Since now dicts are ordered by insertion, also keys, values and items
are ordered the same way.

It seems to me more simple to add some sequence methods to dict views, like
subscript, slicing and index(), instead of creating other 3 methods and 3
data types.

What I have not understood well is when you need to index a dict by
position or slice it.
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/YMXPFPXWOJXZXIYXC6ZX3KQQ2YFTERBD/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-07-31 Thread Chris Angelico

On Sat, Aug 1, 2020 at 6:07 AM Ricky Teachey  wrote:
>
> On Fri, Jul 31, 2020 at 3:52 PM Marco Sulla  
> wrote:
>>
>> Is it not more simple to add some sequence methods to the dict views (if 
>> there's a real need)?
>> If you do tuple(dict.keys()), you get the sequence of keys in the same 
>> insertion order of the dict. It seems to me that the duck already exists and 
>> quacks.
>
>
> The problem with that is these view objects are in fact NOT ducks 
> (sequences). They are gooses (sets). Gooses don't quack. They honk.
>

They are guaranteed to iterate in the same order as the dict, though,
unless I'm mistaken. There's been a weaker guarantee for a long time
(that iteration order for keys/values/items is consistent and will not
change without a dict mutation), and I can't find anywhere that it's
now guaranteed to be the same as dict order, but I would be very
surprised if list(d) was different from list(d.keys()). So,
constructing a tuple or list from the keys or items WILL give you a
sequence.

Whether that's good enough is another question. For instance, it
requires taking a full copy of the dict, even if all you want is a
single element from it.

ChrisA
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/QN6BDOBM6KXB3LAVSE34WNTN4T2APBDA/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-07-31 Thread Ricky Teachey

On Fri, Jul 31, 2020 at 3:52 PM Marco Sulla 
wrote:

> Is it not more simple to add some sequence methods to the dict views (if
> there's a real need)?
> If you do tuple(dict.keys()), you get the sequence of keys in the same
> insertion order of the dict. It seems to me that the duck already exists
> and quacks.
>

The problem with that is these view objects are in fact NOT ducks
(sequences). They are gooses (sets). Gooses don't quack. They honk.

---
Ricky.

"I've never met a Kentucky man who wasn't either thinking about going home
or actually going home." - Happy Chandler
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/2EPKD6IH4NKWKBHWRG4YPMGQOC5PP7CV/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-07-31 Thread Marco Sulla

Is it not more simple to add some sequence methods to the dict views (if
there's a real need)?
If you do tuple(dict.keys()), you get the sequence of keys in the same
insertion order of the dict. It seems to me that the duck already exists
and quacks.
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/MFS55TWPHYTKNNJMYZACAEKOEQ2XALUE/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-07-31 Thread Wes Turner

> keyseq(), valueseq(), itemseq()

Yeah, that's a good one

On Fri, Jul 31, 2020, 12:18 PM Ricky Teachey  wrote:

> On Fri, Jul 31, 2020 at 11:55 AM Wes Turner  wrote:
>
>> +1 on (lazy) Sequence views optimized for sequential reads and random
>> access (which hide the dict implementation which varies across which
>> platforms?)
>>
>> I'm somewhat indifferent on the name:
>>
>> # Sequence views names:
>> ## method() -> Sequence
>> - orderedkeys(), orderedvalues(), ordereditems()
>> - okeys(), ovalues(), oitems()
>> - keys(), keysordered(), values(), valuesordered(), items(),
>> itemsordered()
>>   - What does this do with tab-completion?
>>
>>
> Perhaps add these to the list of possibilities also?
>
> keyseq(), valueseq(), itemseq()
>
> --
> Ricky.
>
> "I've never met a Kentucky man who wasn't either thinking about going home
> or actually going home." - Happy Chandler
>
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/H4L6MY6ZRBIARUYLYHAE5EO3QTCZMHRY/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-07-31 Thread Ricky Teachey

On Fri, Jul 31, 2020 at 11:55 AM Wes Turner  wrote:

> +1 on (lazy) Sequence views optimized for sequential reads and random
> access (which hide the dict implementation which varies across which
> platforms?)
>
> I'm somewhat indifferent on the name:
>
> # Sequence views names:
> ## method() -> Sequence
> - orderedkeys(), orderedvalues(), ordereditems()
> - okeys(), ovalues(), oitems()
> - keys(), keysordered(), values(), valuesordered(), items(), itemsordered()
>   - What does this do with tab-completion?
>
>
Perhaps add these to the list of possibilities also?

keyseq(), valueseq(), itemseq()

--
Ricky.

"I've never met a Kentucky man who wasn't either thinking about going home
or actually going home." - Happy Chandler
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/H34VNADIE7Y7R5HDSKX25FWQTXQBTW7B/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-07-31 Thread Wes Turner

+1 on (lazy) Sequence views optimized for sequential reads and random
access (which hide the dict implementation which varies across which
platforms?)

I'm somewhat indifferent on the name:

# Sequence views names:
## method() -> Sequence
- orderedkeys(), orderedvalues(), ordereditems()
- okeys(), ovalues(), oitems()
- keys(), keysordered(), values(), valuesordered(), items(), itemsordered()
  - What does this do with tab-completion?

# Get key, value, item by integer-location method names:
## method(start: int, stop: int, step: int) -> Any
- keys_iloc(), values_iloc(), items_iloc()

## __getitem__(slice_or_int: Union[slice, int]) -> Any
- keys[0], values.iloc[1:4:2], items.iloc[1:4, 2:3]
- keys_iloc[], values_iloc[], items_iloc[]
- keysiloc[], valuesiloc[], itemsiloc[]
- keys.iloc[], values.iloc[], items.iloc[]
- keypos(),

## method(index: int) -> Any:
- getkey(), getvalue(), getitem(



# Get integer-location of a key, of a value:
## method(value) -> int, method(value) ->
- index(value)
  - this is compatible with list.index and str.index but incompatible with
Series/Dataframe.index (which may or may not be irrelevant)
- orderedkeys.index(), orderedvalues.index()
- okeys.index(), ovalues.index()
- keys.ordered.index(), values.ordered.index()

- getkeypos(), getvaluepos()
- getkeyiloc(), getvalueiloc()

...

Thoughts:
- what did i miss?
- is there an opportunity to optimize performance of e.g getvaluepos() /
values.ordered.index()?
- performance considerations between islice(dict_view) and
Sequence.__getitem__?


On Fri, Jul 31, 2020 at 11:17 AM Ricky Teachey  wrote:

> On Fri, Jul 31, 2020 at 10:31 AM Guido van Rossum 
> wrote:
>
>> So maybe we need to add dict.ordered() which returns a view on the items
>> that is a Sequence rather than a set? Or ordereditems(), orderedkeys() and
>> orderedvalues()?
>>
>>
> I for one would be +1 on this idea.
>
> On the bike shed hew: oitems(), okeys(), and ovalues() would be much
> shorter, and shouldn't be too difficult to read once they get established
> (though i am sure many will differ on this point).
>
> ---
> Ricky.
>
> "I've never met a Kentucky man who wasn't either thinking about going home
> or actually going home." - Happy Chandler
>
>
>
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/Q3UZ6DQDSOSQYIHSQYWUUM7KTNTA3KKA/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-07-31 Thread Ricky Teachey

On Fri, Jul 31, 2020 at 10:31 AM Guido van Rossum  wrote:

> So maybe we need to add dict.ordered() which returns a view on the items
> that is a Sequence rather than a set? Or ordereditems(), orderedkeys() and
> orderedvalues()?
>
>
I for one would be +1 on this idea.

On the bike shed hew: oitems(), okeys(), and ovalues() would be much
shorter, and shouldn't be too difficult to read once they get established
(though i am sure many will differ on this point).

---
Ricky.

"I've never met a Kentucky man who wasn't either thinking about going home
or actually going home." - Happy Chandler
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/MFTAXYRLJS354GLUYRKLGT5L7X3KUMUY/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-07-31 Thread Guido van Rossum

So maybe we need to add dict.ordered() which returns a view on the items
that is a Sequence rather than a set? Or ordereditems(), orderedkeys() and
orderedvalues()?

On Fri, Jul 31, 2020 at 05:29 Ricky Teachey  wrote:

> On Fri, Jul 31, 2020, 2:48 AM Wes Turner  wrote:
>
>> # Dicts and DataFrames
>> - Src:
>> https://github.com/westurner/pythondictsanddataframes/blob/master/dicts_and_dataframes.ipynb
>> - Binder:
>> https://mybinder.org/v2/gh/westurner/pythondictsanddataframes/master?filepath=dicts_and_dataframes.ipynb
>>   - (interactive Jupyter Notebook hosted by https://mybinder.org/ )
>>
>
> The punchline of Wes Turner's notebook (very well put together, thank
> you!) seems to partly be that if you find yourself wanting to work with the
> position of items in a dict, you might want to consider using a
> pandas.Series (with it's .iloc method).
>
> A difficulty that immediately came to mind with this advice is type
> hinting support. I was just googling yesterday for "how to type hint using
> pandas" and the only thing I found is to use pd.Series and pd.DataFrame
> directly.
>
> But those don't support type hinting comparable to:
>
> Dict[str, float]
>
> Or:
>
> class Vector(TypedDict):
> i: float
> j: float
>
> This is a big downside of the advice "just use pandas". Although I love
> using pandas and use it all the time.
>
>> ___
> Python-ideas mailing list -- python-ideas@python.org
> To unsubscribe send an email to python-ideas-le...@python.org
> https://mail.python.org/mailman3/lists/python-ideas.python.org/
> Message archived at
> https://mail.python.org/archives/list/python-ideas@python.org/message/C7HJFKB67U74SULO6OUTLWST2MHZERCH/
> Code of Conduct: http://python.org/psf/codeofconduct/
>
-- 
--Guido (mobile)
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/VIPBHJTMFGREFQHINDNODAAJGNE2IDJB/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-07-31 Thread Ricky Teachey

On Fri, Jul 31, 2020, 2:48 AM Wes Turner  wrote:

> # Dicts and DataFrames
> - Src:
> https://github.com/westurner/pythondictsanddataframes/blob/master/dicts_and_dataframes.ipynb
> - Binder:
> https://mybinder.org/v2/gh/westurner/pythondictsanddataframes/master?filepath=dicts_and_dataframes.ipynb
>   - (interactive Jupyter Notebook hosted by https://mybinder.org/ )
>

The punchline of Wes Turner's notebook (very well put together, thank you!)
seems to partly be that if you find yourself wanting to work with the
position of items in a dict, you might want to consider using a
pandas.Series (with it's .iloc method).

A difficulty that immediately came to mind with this advice is type hinting
support. I was just googling yesterday for "how to type hint using pandas"
and the only thing I found is to use pd.Series and pd.DataFrame directly.

But those don't support type hinting comparable to:

Dict[str, float]

Or:

class Vector(TypedDict):
i: float
j: float

This is a big downside of the advice "just use pandas". Although I love
using pandas and use it all the time.

>
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/C7HJFKB67U74SULO6OUTLWST2MHZERCH/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-07-31 Thread Wes Turner

# Dicts and DataFrames
- Src:
https://github.com/westurner/pythondictsanddataframes/blob/master/dicts_and_dataframes.ipynb
- Binder:
https://mybinder.org/v2/gh/westurner/pythondictsanddataframes/master?filepath=dicts_and_dataframes.ipynb
  - (interactive Jupyter Notebook hosted by https://mybinder.org/ )

## Question / Problem / Challenge
Question: Now that dicts are insertion-ordered (since Python 3.6), can we
lookup items by integer-location?
- As of CPython 3.10: No; the public Python `dict` API neither:
  - (1) offers any method to access keys, values, or items by
integer-location; nor
  - (2) exposes anything from the underlying C code like `dict._array`
which could be used for such a method. `dict._array` would be considered an
implementation detail that could be different in Python versions and
implementations

How could lookup of `dict` keys, values, and items *by integer-location* be
implemented?
- This is the subject of this document.

## Background
- The CPython dict is an insertion-ordered Hashmap:
  https://docs.python.org/3/library/stdtypes.html#mapping-types-dict
  - https://github.com/python/cpython/blob/master/Objects/dictobject.c
  - https://github.com/python/cpython/blob/master/Objects/odictobject.c
- The Pandas Series and DataFrames are insertion-ordered columnar data
structures
  - https://pandas.pydata.org/pandas-docs/stable/reference/series.html
-
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.html#pandas.Series
- https://github.com/pandas-dev/pandas/blob/master/pandas/core/series.py
  - https://pandas.pydata.org/pandas-docs/stable/reference/frame.html
-
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.html

- https://github.com/pandas-dev/pandas/blob/master/pandas/core/frame.py

```python
import itertools
import pandas as pd
import pytest
import random
try:
display  # IPython
except NameError:
def display(*args): print(args)
pd.set_option('display.notebook_repr_html', False)
```

### CPython dict basics

```python
odict = {'a':'A', 'b':'B', 'c':'C', 'd': 'D'}
odict = dict(a='A', b='B', c='C', d='D')
odict = dict((x, x.upper()) for x in 'abcd')  # list comprehension
odict = {x:x.upper() for x in 'abcd'} # dict comprehension
odict = dict.fromkeys('abcd')
[odict.__setitem__(x, x.upper()) for x in 'abcd']
display(odict)
assert list(odict.keys()) == list('abcd') == ['a', 'b', 'c', 'd']
assert random.seed(1) or list(odict.keys()) == random.seed(2**10) or
list(odict.keys())
assert list(odict.values()) == list('ABCD')
assert list(odict.items()) == [('a', 'A'), ('b', 'B'), ('c', 'C'), ('d',
'D')]
```

{'a': 'A', 'b': 'B', 'c': 'C', 'd': 'D'}

`itertools.islice(dict.items())` is suboptimal for a number of cases
because we don't need to iterate through the items at the beginning: we
could directly address the underlying array.

```python

# This would not call next() unnecessarily:
with pytest.raises(AttributeError): # 'dict' object has no attribute
'_array'
odict._array[3]   # _array[3]
```

## Possible Solutions

### No changes to CPython

 Make an unnecessary copy of the whole dict in order to only take(3)

```python
assert list(odict.items())[3] == ('d', 'D')
```

 Call `itertools.islice(dict.items())`
- Does this call `next()`, `next()`, `next()`, `next()` unnecessarily?
- `itertools.islice(dict.items())` is suboptimal for a number of cases
because we don't need to iterate through the items at the beginning.
Directly addressing the underlying array would be much faster but unlikely
to happen because the underlying array is an implementation detail.

```python
list(itertools.islice(odict.items(), 0))
```


[]


```python
list(itertools.islice(odict.items(), 3))
```


[('a', 'A'), ('b', 'B'), ('c', 'C')]


```python
list(itertools.islice(odict.items(), 3, 3+1))
```


[('d', 'D')]


### Change CPython

 Expose something like `dict._array`
- Again, the underlying array is a \[CPython,] implementation detail
- So, there must be methods that provide access to the [(key, item),] data
that hide the implementation details.

 Overload `dict.__getitem__` (`dict[]`)

`dict[]` (`dict.__getitem__`) is already used for lookup by key value, so
`dict[3]` could either be lookup by value or lookup by integer-location:
which would be dangerously abiguous at runtime and frustratingly vague to
review. (See also the note below regarding the fate of the Pandas
`.ix.__getitem__` method).

 Make all iterators subscriptable: define `iterator.__getitem__`

`dict.items()[3]` fails with a `TypeError: 'dict_items' object is not
subscriptable`:`dict.items()` returns a view (instead of an unnecessarily
copied list like in Python 2) which is not subscriptable.

Could we define `iterator.__getitem__` such that:

```python
obj = list('abcd')
iter(obj)[3] => islice(obj, 3, 3+1)
iter(obj)[3:4] => islice(obj, 3, 4)
iter(obj)[0:4:2] => islice(obj, 1, 4, 2)
```

- This would require a change to the

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-07-30 Thread Marco Sulla

On Thu, 30 Jul 2020 at 23:42, Steven D'Aprano  wrote:

> Of course anything is possible in theory, but I think that would require
> shifting the existing keys and values in the array, and updating the
> entries in the hash table to point to their key in the new position, and
> probably other complications I haven't thought of.
>

Also list must shift the internal array. You don't have to update the
hashtable. For example, when you pop an item, the key is not removed from
the hashtable completely, but it's marked as dummy.

I think the main problem here is set, since set is not ordered. In theory,
dict views could be an "ordered" set, so you can do:

d.keys()[5]
d.values()[1:]
d.items()[3] = "omg"

In any case, if people want to propose turning dicts into fully-fledged
> reorderable, slicable sequences as well as mappings, they will probably
> need a PEP :-)
>

I'm just brainstorming. Not sure about the real use-case.
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/EF3FRDGV5DPTEIFY2DVRTAMEUUSOEFV7/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-07-30 Thread Guido van Rossum

On Thu, Jul 30, 2020 at 22:23 Greg Ewing 
wrote:

> On 31/07/20 4:04 am, Christopher Barker wrote:
> > (Side note: why ARE the views provided by methods, rather than
> > properties?)
>
> Probably because they construct objects, and are therefore
> somewhat more expensive than is usually expected for an
> attribute access.


Also for compatibility (even if imperfect) with Python 2. It’s water over
the dam now.

—Guido

-- 
--Guido (mobile)
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/RPAYXPAB24V7EDINJLHIW4FVHR4DL7FM/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-07-30 Thread Greg Ewing


On 31/07/20 4:04 am, Christopher Barker wrote:
(Side note: why ARE the views provided by methods, rather than 
properties?


Probably because they construct objects, and are therefore
somewhat more expensive than is usually expected for an
attribute access.

--
Greg
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/2SJXX4PDFTEC7YSZVGGWNCXSOGLUXK2S/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-07-30 Thread Steven D'Aprano

On Thu, Jul 30, 2020 at 06:17:24PM -0400, Wes Turner wrote:

> What new syntax do you propose?

I don't propose new syntax. If this functionality is needed, regular 
method syntax will do the job without misleading people to th8ink that 
dicts are sequences that ought to support the full sequence API or the 
full set of list methods. One or the other of these methods should be 
sufficient for the uses I've seen:

dict.getkey(index)  # return key
dict.getitem(index) # return (key, value)

I'm not even convinced that there is a compelling use-case for this 
functionality at all, but one or both of the above seem more than 
sufficient. Once you have the key, you can do everything else.

If there is a need to get the insertion index of a key, that could also 
be a method, but let's hold off adding that unless there is a good 
reason.


-- 
Steven
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/PKKO4XFUC5BIQZKXMWM2VJGO4GDELYAW/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-07-30 Thread Wes Turner

What new syntax do you propose?

On Thu, Jul 30, 2020 at 5:55 PM Steven D'Aprano  wrote:

> On Thu, Jul 30, 2020 at 05:03:58PM -0400, Wes Turner wrote:
>
> > Would these be the *non-mutating* methods desired of insertion-ordered
> > dicts?
> >
> > .iloc[sli:ce]
> > .iloc[int]
> > .iloc[[list,]]
> > .iloc[callable]
> > .iloc[bitmask]
> >
> > .index(key)
>
> No.
>
> "iloc" sounds like a new Apple product to keep strangers out of your
> home, or maybe something to do with iterators.
>
> Both slicing and "[list,]" arguments can be easily performed using a
> comprehension.
>
> I have no idea what calling it with a callable or bitmask is intended to
> do, or why they would be useful, but bitmasks are usually ints, so how
> would it distinguish between the item at index 3 and the item with
> bitmask 3?
>
> Is there a use-case for retrieving the insertion position of a key, or
> are we just adding it because we can?
>
>
> --
> Steven
> ___
> Python-ideas mailing list -- python-ideas@python.org
> To unsubscribe send an email to python-ideas-le...@python.org
> https://mail.python.org/mailman3/lists/python-ideas.python.org/
> Message archived at
> https://mail.python.org/archives/list/python-ideas@python.org/message/XWQZD45EE6MY2NKV3UYN7GZXYMYZPO2E/
> Code of Conduct: http://python.org/psf/codeofconduct/
>
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/LPG74OMOAWWKO7BBTWUJPIVSNIBNKRRV/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-07-30 Thread Steven D'Aprano

On Thu, Jul 30, 2020 at 05:03:58PM -0400, Wes Turner wrote:

> Would these be the *non-mutating* methods desired of insertion-ordered
> dicts?
> 
> .iloc[sli:ce]
> .iloc[int]
> .iloc[[list,]]
> .iloc[callable]
> .iloc[bitmask]
> 
> .index(key)

No.

"iloc" sounds like a new Apple product to keep strangers out of your 
home, or maybe something to do with iterators.

Both slicing and "[list,]" arguments can be easily performed using a 
comprehension.

I have no idea what calling it with a callable or bitmask is intended to 
do, or why they would be useful, but bitmasks are usually ints, so how 
would it distinguish between the item at index 3 and the item with 
bitmask 3?

Is there a use-case for retrieving the insertion position of a key, or 
are we just adding it because we can?

-- 
Steven
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/XWQZD45EE6MY2NKV3UYN7GZXYMYZPO2E/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-07-30 Thread Steven D'Aprano

On Thu, Jul 30, 2020 at 09:35:31PM +0200, Marco Sulla wrote:
> On Thu, 30 Jul 2020 at 19:24, Steven D'Aprano  wrote:
> 
> > You can't insert a key in a specific position. If I have this dict:
> >
> > mydict = {'a': 1, 'c': 3, 'd': 4, 'e': 5}
> >
> > I can't insert 'b':2 between keys 'a' and 'c', except by creating a new
> > dict.
> >
> 
> Not sure about this. In C code, dicts are a hashtable and an array of
> items. In theory, nothing prevents you from inserting a new key in a
> specific position of the key array instead of at the end.

Of course anything is possible in theory, but I think that would require 
shifting the existing keys and values in the array, and updating the 
entries in the hash table to point to their key in the new position, and 
probably other complications I haven't thought of.

Of course it is possible to have a data structure with O(1) insertions 
and updates by key, and O(1) insertions and updates by index, and fully 
reorderable keys. But it probably won't be small and lean and fast. Just 
because something takes constant time doesn't mean it will be a *fast* 
constant time.

In any case, if people want to propose turning dicts into fully-fledged 
reorderable, slicable sequences as well as mappings, they will probably 
need a PEP :-)

-- 
Steven
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/NLT46M65Y5326AETV6NFEXJONCQZCC65/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-07-30 Thread Wes Turner

Are you suggesting that dict views would become subscriptable?

In [3]: %doctest_mode
Exception reporting mode: Plain
Doctest mode is: ON
>>> x = dict(a=2, b=3, c=4)
>>> x.items()[3]
Traceback (most recent call last):
  File "", line 1, in 
x.items()[3]
TypeError: 'dict_items' object is not subscriptable

On Thu, Jul 30, 2020, 12:05 PM Christopher Barker 
wrote:

> On Wed, Jul 29, 2020 at 7:23 AM Wes Turner  wrote:
>
>> .iloc[] is the Pandas function for accessing by integer-location:
>>
>> https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.iloc.html
>>
>
> I'm not sure I quite get the analogy here, but ...
>
> >>> odict.iloc[3]
>>
>>
>> Would this be the only way to access only item 4 from an odict of length
>> greater than 4 with slice notation for dict views generated from selection
>> by integer-location?
>>
>
> are suggesting that we add another attribute to dicts to provide "index"
> access? I'm not sure the advantage of that, as we already have the dict
> views, so I guess it's nice not to have the type the parentheses:
>
> odict.items()[3] IS a bit klunkier than odict.iloc[3]
>
> (Side note: why ARE the views provided by methods, rather than properties?
> But I digress ...)
>
>
>> >>> odict[3:4]
>>
>
> that would be possible.
>
> What does this do? Confusing to a beginner:
>>
>> >>> odict[3,]
>>
>
> no more confusing than it is for Sequences, is it?
>
> In [4]: l[3,]
>
> ---
> TypeError Traceback (most recent call last)
>  in 
> > 1 l[3,]
>
> TypeError: list indices must be integers or slices, not tuple
>
> But yes, if we allowed dicts to be indexable with slices, they still
> couldn't be indexed by single indexes (unless that value happened to be a
> key) so there would be no way to get a single item by index, as a length-1
> slice would presumably return a dict with one item.
>
> So yeah, if indexing, slicing were to happen, it would have to be in the
> views.
>
> But this does bring up that there are a lot of places where slice notation
> could be used outside of [] -- and that would be handy for use cases like
> pandas, even if not for dicts.
>
> -CHB
>
> --
> Christopher Barker, PhD
>
> Python Language Consulting
>   - Teaching
>   - Scientific Software Development
>   - Desktop GUI and Web Development
>   - wxPython, numpy, scipy, Cython
>
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/TH54BGX66NQ2VECS3XO33ADY3GRTX5NU/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-07-30 Thread Wes Turner

On Thu, Jul 30, 2020, 3:37 PM Marco Sulla 
wrote:

> On Thu, 30 Jul 2020 at 19:24, Steven D'Aprano  wrote:
>
>> You can't insert a key in a specific position. If I have this dict:
>>
>> mydict = {'a': 1, 'c': 3, 'd': 4, 'e': 5}
>>
>> I can't insert 'b':2 between keys 'a' and 'c', except by creating a new
>> dict.
>>
>
> Not sure about this. In C code, dicts are a hashtable and an array of
> items. In theory, nothing prevents you from inserting a new key in a
> specific position of the key array instead of at the end.
>

Nothing but the cost of shifting successive elements by 1 and sometimes
copying the entire array to a new, larger array.

Would these be the *non-mutating* methods desired of insertion-ordered
dicts?

.iloc[sli:ce]
.iloc[int]
.iloc[[list,]]
.iloc[callable]
.iloc[bitmask]

.index(key)



___
> Python-ideas mailing list -- python-ideas@python.org
> To unsubscribe send an email to python-ideas-le...@python.org
> https://mail.python.org/mailman3/lists/python-ideas.python.org/
> Message archived at
> https://mail.python.org/archives/list/python-ideas@python.org/message/PX6OBI4FFJXLLTWWSA5O7SKCD3L6KMWP/
> Code of Conduct: http://python.org/psf/codeofconduct/
>
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/TIQKQ3WP2NMC6MIESMOBCW2SAHXUP74T/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-07-30 Thread Marco Sulla

On Thu, 30 Jul 2020 at 19:24, Steven D'Aprano  wrote:

> You can't insert a key in a specific position. If I have this dict:
>
> mydict = {'a': 1, 'c': 3, 'd': 4, 'e': 5}
>
> I can't insert 'b':2 between keys 'a' and 'c', except by creating a new
> dict.
>

Not sure about this. In C code, dicts are a hashtable and an array of
items. In theory, nothing prevents you from inserting a new key in a
specific position of the key array instead of at the end.
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/PX6OBI4FFJXLLTWWSA5O7SKCD3L6KMWP/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-07-30 Thread Steven D'Aprano

On Thu, Jul 30, 2020 at 08:52:34AM -0700, Christopher Barker wrote:

> The point being that dicts are now ordered, so maybe it makes sense to add
> some operations that make it natural and easy to work with them in an
> ordered way -- you know, like slicing :-)

Dicts are not sequences and cannot support ordering operations like
insert, sort, prepend, reverse or slice assignment. They merely preserve
insertion order.

You can't insert a key in a specific position. If I have this dict:

mydict = {'a': 1, 'c': 3, 'd': 4, 'e': 5}

I can't insert 'b':2 between keys 'a' and 'c', except by creating a new
dict. Nor can I sort the keys, or reverse them, except by putting them
into a list and sorting or reversing the list, and only then putting
them into a dict.

We could read the key and/or value at index N, that is straight forward
if we can decide on a colour of the bike shed. But I don't believe that
there is an straight forward way of replacing the key at index N, or of
inserting keys, or swapping them, or directly changing their order.

Dicts are, sort of, like a stack: you can "insert" (append) new keys at
the end, but not in the middle.

So let's please stop over-generalising by saying dicts are "ordered"
without the qualifier that they only preserve insertion order, just like
collections.OrderedDict.

The minimal API we could support would be a getter that returns the key
at index N. Once you have the key, you can get or set the value. So all
we really need is that one single getter. Anything else is gravy.

dict.getkey(n) # returns the key at index n by insertion order

I wouldn't object to:

dict.getitem(n) # returns (key, value) at index n

since it's hardly more trouble to return both the key and the value at
the same time. But more than that seems difficult to justify, and
supporting slices seems like overkill. You can always get them with a
comprehension:

[mydict.getkey(n) for n in range(3, 49, 7)]

--
Steven
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/ICROBIPAJEXZUV3J7U2QTTCGUL7HIODU/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-07-30 Thread David Mertz

On Thu, Jul 30, 2020 at 11:55 AM Christopher Barker 
wrote:

>  > smaller_dict = dict(islice(large_dict.items(), 0, 255))
> well, beauty is in the eye of the beholder, of course. But really, you
> think it's pretty to import itertools, then make a function call, for
> what's very much a slicing operation?
>

In your post that introduced that,that is *exactly* what I thought of when
you first described the issue, before I read down to your solution.  It
took a fraction of a second to think of for me.  It *is* slightly verbose.
What do you think about this in current Python:

>>> large = {i:i**2 for i in range(1000)}
>>> import sys
>>> from itertools import islice
>>> class IterSlicer:
... def __getitem__(self, what):
... it, sl = what
... return islice(it, sl.start or 0, sl.stop or sys.maxsize,
sl.step or 1)
...
>>> IS = IterSlicer()
>>> dict(IS[large.items(), 3:10:2])
{3: 9, 5: 25, 7: 49, 9: 81}
>>> from itertools import count
>>> set(IS[count(), 10:100:9])
{64, 37, 73, 10, 46, 82, 19, 55, 91, 28}

Potentially IterSlicer, or IS, could live in itertools (or more likely
more-itertools) just to provide a short way to use slice notation with an
arbitrary iterable... not limited to dict.items().


-- 
The dead increasingly dominate and strangle both the living and the
not-yet born.  Vampiric capital and undead corporate persons abuse
the lives and control the thoughts of homo faber. Ideas, once born,
become abortifacients against new conceptions.
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/OEO7XAM4FJM2742M4AWJVEEHBGIKZPJV/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-07-30 Thread Christopher Barker

On Wed, Jul 29, 2020 at 7:23 AM Wes Turner  wrote:

> .iloc[] is the Pandas function for accessing by integer-location:
>
> https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.iloc.html
>

I'm not sure I quite get the analogy here, but ...

>>> odict.iloc[3]
>
>
> Would this be the only way to access only item 4 from an odict of length
> greater than 4 with slice notation for dict views generated from selection
> by integer-location?
>

are suggesting that we add another attribute to dicts to provide "index"
access? I'm not sure the advantage of that, as we already have the dict
views, so I guess it's nice not to have the type the parentheses:

odict.items()[3] IS a bit klunkier than odict.iloc[3]

(Side note: why ARE the views provided by methods, rather than properties?
But I digress ...)

> >>> odict[3:4]
>

that would be possible.

What does this do? Confusing to a beginner:
>
> >>> odict[3,]
>

no more confusing than it is for Sequences, is it?

In [4]: l[3,]

---
TypeError Traceback (most recent call last)
 in 
> 1 l[3,]

TypeError: list indices must be integers or slices, not tuple

But yes, if we allowed dicts to be indexable with slices, they still
couldn't be indexed by single indexes (unless that value happened to be a
key) so there would be no way to get a single item by index, as a length-1
slice would presumably return a dict with one item.

So yeah, if indexing, slicing were to happen, it would have to be in the
views.

But this does bring up that there are a lot of places where slice notation
could be used outside of [] -- and that would be handy for use cases like
pandas, even if not for dicts.

-CHB

-- 
Christopher Barker, PhD

Python Language Consulting
  - Teaching
  - Scientific Software Development
  - Desktop GUI and Web Development
  - wxPython, numpy, scipy, Cython
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/ZSVYRCBPH2OQWKWHD4363C4EVDOXS2JD/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-07-30 Thread Christopher Barker

On Tue, Jul 28, 2020 at 10:26 PM Stephen J. Turnbull <
turnbull.stephen...@u.tsukuba.ac.jp> wrote:

> Christopher Barker writes:
>  > from itertools import islice
>  >
>  > smaller_dict = dict(islice(large_dict.items(), 0, 255))
>  >
>  > which works, and isn't doing any unnecessary copying, but it's pretty
>  > darn ugly, as far as I'm concerned.
>
> In your application, I think that's just pretty, myself.

well, beauty is in the eye of the beholder, of course. But really, you
think it's pretty to import itertools, then make a function call, for
what's very much a slicing operation?

Python has been shifting away from a sequence-focused to an iteration
focused approach for a while -- and while I do see the benefits, I also see
the drawbacks. And it seems to be coupled with what might be called a shift
to a functional style

This reminds me a bit of a thread on this list a couple years(?) back,
about a built in "group by" operation. It petered out, but one point of
disagreement was whether folks wanted an API that was more "functional",
e.g. map-like, or comprehension based. Comprehensions are functional, and
at least generator comprehensions (expressions) are all about iterables,
but there is a different style as to whether you are more using expressions
or passing functions around. I personally far prefer the comprehension
style, but was surprised by how many folks prefered the "functional" style.

This is not exactly the same, but it "feels" the similar to me -- I really
prefer things like expressions and slice notation to a pile of nested
function calls.

Think about it -- if Python had started out being built around iterables,
and itertools had been there from the beginning, would Sequences simply be
iterables that happened to have a length? And would people be expressing
that:

itertools.islice(a_list, 0, 100, 2)

was a pretty way to get a portion of a list -- why would we want "slicing"
notation at all?

That's being dramatic, but to bring this back around, the original title of
this thread is:"Access (ordered) dict by ..."

The point being that dicts are now ordered, so maybe it makes sense to add
some operations that make it natural and easy to work with them in an
ordered way -- you know, like slicing :-)

I don't expect this to be accepted, folks didn't seem to like the idea
much, and it's not a very big deal, as we've seen this is only the second
use case anyone has come up with (the other being selecting a random item
out of a dict).

But it did strike me when I had the use-case.

And I'm well aware of islice, and yet it still took me a LOT longer to
write that code than it would have to do a slice :-)

that's missing is slice notation.  But that's probably not hard to
> implement in terms of the islice function, just completely redundant.
>

I don't follow here -- the slice notation is very important here -- I'm not
sure what you mean by "implement in terms of the islice function", but that
does bring up another point:

islice does not provide a way to do the equivalent of negative indexes --
because it's designed to work with iterables that don't have a length. But
if you ARE using it on a Sequence -- that's a missing feature.

-CHB

-- 
Christopher Barker, PhD

Python Language Consulting
  - Teaching
  - Scientific Software Development
  - Desktop GUI and Web Development
  - wxPython, numpy, scipy, Cython
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/KAFHZT3A2UJVYPKJ5Y6Y7HAYFJZVMEO4/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-07-29 Thread Wes Turner

.iloc[] is the Pandas function for accessing by integer-location:

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.iloc.html

"""
Purely integer-location based indexing for selection by position.

.iloc[] is primarily integer position based (from 0 to length-1 of the
axis), but may also be used with a boolean array.

Allowed inputs are:

- An integer, e.g. 5.
- A list or array of integers, e.g. [4, 3, 0].
- A slice object with ints, e.g. 1:7.
- A boolean array.
- A callable function with one argument (the calling Series or DataFrame)
and that returns valid output for indexing (one of the above). This is
useful in method chains, when you don’t have a reference to the calling
object, but would like to base your selection on some value.

.iloc will raise IndexError if a requested indexer is out-of-bounds, except
slice indexers which allow out-of-bounds indexing (this conforms with
python/numpy slice semantics).
"""

https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#selection-by-position

I don't remember why df.iloc[3] is preferred over directly getitem'ing
df[3] ?

- Because `3` could be a key or an integer-location; to avoid ambiguity
- Because changing between df[3] and df.loc[3] and df[3:5] and df.iloc[3:5]
is unnecessary cognitive burden: it's easier to determine that the code is
intentionally accessing by integer-location from '.iloc' than from ':'
(indicating slice notation)

>>> odict.iloc[3]

Would this be the only way to access only item 4 from an odict of length
greater than 4 with slice notation for dict views generated from selection
by integer-location?

>>> odict[3:4]

What does this do? Confusing to a beginner:

>>> odict[3,]

On Wed, Jul 29, 2020, 1:28 AM Stephen J. Turnbull <
turnbull.stephen...@u.tsukuba.ac.jp> wrote:

> Christopher Barker writes:
>
>  > from itertools import islice
>  >
>  > smaller_dict = dict(islice(large_dict.items(), 0, 255))
>  >
>  > which works, and isn't doing an unnecessary copying but it's pretty
>  > darn ugly, as far as I'm concerned.
>
> In your application, I think that's just pretty, myself.  The only thing
> that's missing is slice notation.  But that's probably not hard to
> implement in terms of the islice function, just completely redundant.
> ___
> Python-ideas mailing list -- python-ideas@python.org
> To unsubscribe send an email to python-ideas-le...@python.org
> https://mail.python.org/mailman3/lists/python-ideas.python.org/
> Message archived at
> https://mail.python.org/archives/list/python-ideas@python.org/message/LPHMWNCI3VPPN7FEBEUJR4K2QDVNCPX3/
> Code of Conduct: http://python.org/psf/codeofconduct/
>
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/X4VOHLF4PBDYW7WI7YQIGO4ZZ4KT5Z6N/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-07-28 Thread Stephen J. Turnbull

Christopher Barker writes:

 > from itertools import islice
 > 
 > smaller_dict = dict(islice(large_dict.items(), 0, 255))
 > 
 > which works, and isn't doing an unnecessary copying but it's pretty
 > darn ugly, as far as I'm concerned.

In your application, I think that's just pretty, myself.  The only thing
that's missing is slice notation.  But that's probably not hard to
implement in terms of the islice function, just completely redundant.
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/LPHMWNCI3VPPN7FEBEUJR4K2QDVNCPX3/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-07-28 Thread Christopher Barker

re-awakening this thread, because I just happened on an actual real-world
use case:

I need the first 255 items in a dict, as a dict. Order is important, and I
can count on an order-preserving dict.

I ended up with:

from itertools import islice

smaller_dict = dict(islice(large_dict.items(), 0, 255))

which works, and isn't doing an unnecessary copying but it's pretty darn
ugly, as far as I'm concerned.

I could also do it the old fashioned way:

smaller_dict = {}
for k, v in dict.items():
if i > 255:
break
 smaller_dict[k] = v

which is worse.

I'd much rather:

dict(large_dict.items[:255])

I'd even more prefer:

dict[:255]

which would, in theory be possible, as slices are not hashable, so no one
is doing this in current code as an actual dict key, but yeah, that's too
much magic to ask for.

To be fair, this isn't that much data, and it's in memory already, so:

dict(tuple(large_dict.items())[:255])

would work just fine. but as Python has moved toward a lot more iterables,
and lazy evaluation, it seems like we shouldn't have to make a full copy,
just to pull a fraction of something out.

Note that this is related to an earlier thread I started about having a way
to lazy-slice a Sequence -- essentially islice, but with slice syntax.

This is also related a bit to another thread about keywords in indexing --
there were examples there where it would be nice to have slice syntax in
more places, like function calls. then this could be:

So: anyone have a cleaner way to accomplish this without any changes to
Python?

This is still an oddball use case, so may not be a reason to change Python
-- but it IS a real world, operational code, use case :-)

-CHB

On Sun, Jul 12, 2020 at 9:55 PM Christopher Barker 
wrote:

> On Sat, Jul 11, 2020 at 1:33 PM David Mertz  wrote:
>
>> On Sat, Jul 11, 2020 at 3:45 PM Christopher Barker 
>> wrote:
>>
>>> random.choice(the_dict.keys())
>>>
>>> is a little easier than:
>>>
>>> random.choice(list(the_dict.keys())
>>>
>>
>> Ummm... don't you mean:
>>
>> random.choice(list(the_dict))
>>
>
> well, sure, though I have to say that I think that that's an unfortunate
> confusing thing about python dicts. IN fact, I doubt there are many uses at
> all for dict.keys() -- most uses can jsut use the dict.
>
> I certainly see newbies write:
>
> if this in dict.keys():
>
> pretty often.
>
> maybe adding indexing to the dict views will give the dict_keys object
> more reason to exist :-)
>
> -CHB
>
>
>
>
>
>
>
>
>> If it's keys you care about I've saved you one character over your
>> proposed style, while also reading better to me.  It's only for .items()
>> where it doesn't work.  And honestly, just looking up the value from the
>> random key is not hard.
>>
>> In any case, if "reservoir sampling" is the goal here, we should just add
>> a function `random.reservoir_sample()` to accommodate using iterators
>> rather than sequences (https://en.wikipedia.org/wiki/Reservoir_sampling)
>>
>>
>> --
>> The dead increasingly dominate and strangle both the living and the
>> not-yet born.  Vampiric capital and undead corporate persons abuse
>> the lives and control the thoughts of homo faber. Ideas, once born,
>> become abortifacients against new conceptions.
>>
>
>
> --
> Christopher Barker, PhD
>
> Python Language Consulting
>   - Teaching
>   - Scientific Software Development
>   - Desktop GUI and Web Development
>   - wxPython, numpy, scipy, Cython
>

-- 
Christopher Barker, PhD

Python Language Consulting
  - Teaching
  - Scientific Software Development
  - Desktop GUI and Web Development
  - wxPython, numpy, scipy, Cython
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/BXSOYCMJ2YN7NUKXFDVFBXOMAHZR45L5/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-07-21 Thread Wes Turner

On Mon, Jul 13, 2020, 3:04 PM Christopher Barker 
wrote:

> Is there ANY chance of setting the default reply-to to the list? I know
> everyone else thinks that's a bid idea, but I doubt this was actually
> intended just for me.
>
> If it was, I apologize for bringing it back on the list.
>
> On Sun, Jul 12, 2020 at 10:12 PM Inada Naoki 
> wrote:
>
>> On Mon, Jul 13, 2020 at 1:51 PM Christopher Barker 
>> wrote:
>> As I and Eric already said several times, we can do it in C already --
>> itertools.islice.
>>
>> ...
>
> itertools.islice is implemented in C too. No need to call next() from
>> Python stack.
>>
>
> Ahh -- thanks! I had no idea islice() was in C -- yes, then, there
> probably is little to gain from a C implementation.
>
> I'd still be interested in the performance comparison, but not enough to
> try to write a direct-access-to-the-internals-of-dict C version myself :-)
>
> I still like the idea of indexing the order-preserving dict, but we do
> seem to not have any compelling use cases.
>

Some ideas for optimizing lookup of ordered indexed structures:

https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html :

> loc, .iloc, and also [] indexing can accept a callable as indexer. See
more at Selection By Callable.

Pandas > Enhancing performance > Numba > Vectorize
https://pandas.pydata.org/pandas-docs/stable/user_guide/enhancingperf.html#vectorize

(
https://scikit-learn.org/stable/modules/generated/sklearn.feature_extraction.DictVectorizer.html
)

https://docs.rapids.ai/api/cudf/stable/guide-to-udfs.html#Numba-Kernels-on-CuPy-Arrays

https://arrow.apache.org/docs/python/pandas.html#handling-pandas-indexes

> Methods like pyarrow.Table.from_pandas() have a preserve_index option
which defines how to preserve (store) or not to preserve (to not store) the
data in the index member of the corresponding pandas object. This data is
tracked using schema-level metadata in the internal arrow::Schema object.

... Type signatures are particularly helpful for indexes.


> Particularly if someone does get a reservoir sampling implementation in
> the stdlib. (see another thread for that)
>
> -CHB
>
> --
> Christopher Barker, PhD
>
> Python Language Consulting
>   - Teaching
>   - Scientific Software Development
>   - Desktop GUI and Web Development
>   - wxPython, numpy, scipy, Cython
> ___
> Python-ideas mailing list -- python-ideas@python.org
> To unsubscribe send an email to python-ideas-le...@python.org
> https://mail.python.org/mailman3/lists/python-ideas.python.org/
> Message archived at
> https://mail.python.org/archives/list/python-ideas@python.org/message/ZL2VYGVO7MB63TAJKDL4YKSB64X5YP55/
> Code of Conduct: http://python.org/psf/codeofconduct/
>
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/RHTALEYSBRERRVW7WHDU2EEJZOAVW3U7/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-07-21 Thread Guido van Rossum

I think this cannot just be considered a bug fix, and it seems somewhat
fundamental (catching arbitrary exceptions is controversial), so I
recommend finding a core dev to sponsor a PEP. (Or finding one who thinks
it is obviously a bug and will approve a PR.)

On Tue, Jul 21, 2020 at 04:43 Dominik Vilsmeier 
wrote:

> On 07.07.20 19:41, Stephen J. Turnbull wrote:
>
> > Dominik Vilsmeier writes:
> >
> >   > So if you're dealing with items views and want to compare them to a
> set
> >   > representing dict items, then you need an extra `try/except` in
> order to
> >   > handle non-hashable values in the items view.
> >
> > Sounds like you have a change to propose here, then.  Put the
> > try/except in the __eq__ for the items view class when comparing
> > against a set.  I would expect it to be accepted, as comparing items
> > views is pretty expensive so the slight additional overhead would
> > likely be acceptable, and if you get the exception, you know the
> > equality comparison against a set is false since a set cannot contain
> > that element, so this possibility can't affect worst-case performance
> > by much, if at all.
>
> So it seems there is a consensus on the fact that this is undesirable
> behavior and it can be fixed relatively easy.
>
> What's the next step then? Should this be discussed further on the
> mailing list? Should I open an issue at bpo?
> ___
> Python-ideas mailing list -- python-ideas@python.org
> To unsubscribe send an email to python-ideas-le...@python.org
> https://mail.python.org/mailman3/lists/python-ideas.python.org/
> Message archived at
> https://mail.python.org/archives/list/python-ideas@python.org/message/G7N7WV5C6JKLIFAA5NGUXW7VEH4CMRKT/
> Code of Conduct: http://python.org/psf/codeofconduct/
>
-- 
--Guido (mobile)
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/RLJCH2ENXOLCTU4JTK7TCF4NBRC3RUUH/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-07-18 Thread Rob Cliffe via Python-ideas




On 13/07/2020 18:42, Barry Scott wrote:



On 13 Jul 2020, at 05:55, Christopher Barker > wrote:


well, sure, though I have to say that I think that that's an 
unfortunate confusing thing about python dicts. IN fact, I doubt 
there are many uses at all for dict.keys() -- most uses can jsut use 
the dict.


I use key() all the time to sort the keys before printing.

for name in sorted(dict.key()):
print(name, dict[name)


Barry



But you could use items() instead:
    for name, val in sorted(dict.items()):
        print(name, val)

Rob Cliffe
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/XT3LHIUWKJ3HIPWR2ZCOUFU7HTRWORZF/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-07-14 Thread Barry Scott




> On 13 Jul 2020, at 18:56, Paul Moore  wrote:
> 
> On Mon, 13 Jul 2020 at 18:42, Barry Scott  wrote:
> 
>> On 13 Jul 2020, at 05:55, Christopher Barker  wrote:
>> 
>> well, sure, though I have to say that I think that that's an unfortunate 
>> confusing thing about python dicts. IN fact, I doubt there are many uses at 
>> all for dict.keys() -- most uses can jsut use the dict.
>> 
>> I use key() all the time to sort the keys before printing.
>> 
>> for name in sorted(dict.key()):
>>print(name, dict[name)
> 
> Why not just use sorted(dict)?

Ooo shiny I do like that version of sorted(d).

I got curious and found that iter(d) is the same as iter(d.keys())

Barry
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/J4S62GNXB4J7W5AVAQFDVCC2NRP4AMRG/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-07-13 Thread Christopher Barker

Is there ANY chance of setting the default reply-to to the list? I know
everyone else thinks that's a bid idea, but I doubt this was actually
intended just for me.

If it was, I apologize for bringing it back on the list.

On Sun, Jul 12, 2020 at 10:12 PM Inada Naoki  wrote:

> On Mon, Jul 13, 2020 at 1:51 PM Christopher Barker 
> wrote:
> As I and Eric already said several times, we can do it in C already --
> itertools.islice.
>
> ...

itertools.islice is implemented in C too. No need to call next() from
> Python stack.
>

Ahh -- thanks! I had no idea islice() was in C -- yes, then, there probably
is little to gain from a C implementation.

I'd still be interested in the performance comparison, but not enough to
try to write a direct-access-to-the-internals-of-dict C version myself :-)

I still like the idea of indexing the order-preserving dict, but we do seem
to not have any compelling use cases.

Particularly if someone does get a reservoir sampling implementation in the
stdlib. (see another thread for that)

-CHB

-- 
Christopher Barker, PhD

Python Language Consulting
  - Teaching
  - Scientific Software Development
  - Desktop GUI and Web Development
  - wxPython, numpy, scipy, Cython
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/ZL2VYGVO7MB63TAJKDL4YKSB64X5YP55/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-07-13 Thread Christopher Barker

On Mon, Jul 13, 2020 at 11:15 AM Rob Cliffe 
wrote:

> On 13 Jul 2020, at 05:55, Christopher Barker  wrote:
>
> In fact, I doubt there are many uses at all for dict.keys() -- most uses
> can jsut use the dict.
>
> But you could use items() instead:
> for name, val in sorted(dict.items()):
> print(name, val)
>

Sure. But my point was that the dict_keys object is very often unnecessary,
not that all the views aren't useful. And it really is mostly only there
for symmetry with dict_items(), which IS needed.

I'm on the fence though -- I think it adds clarity, even if it's
unnecessary.

-CHB

-- 
Christopher Barker, PhD

Python Language Consulting
  - Teaching
  - Scientific Software Development
  - Desktop GUI and Web Development
  - wxPython, numpy, scipy, Cython
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/LJRVCD54G7IUMTX5HUADOJS5URRYWBQQ/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-07-13 Thread Christopher Barker

>>  I doubt there are many uses at all for dict.keys() -- most uses can
jsut use the dict.

> >
> > I use key() all the time to sort the keys before printing.
> >
> > for name in sorted(dict.key()):
> > print(name, dict[name)
>
> Why not just use sorted(dict)?


Thanks for so nicely making my point :-)

Though I do like the keys() version: it seems more obvious to me.

-CHB


-- 
Christopher Barker, PhD

Python Language Consulting
  - Teaching
  - Scientific Software Development
  - Desktop GUI and Web Development
  - wxPython, numpy, scipy, Cython
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/Y53IRUZWOF7NK3CI5T5UQQY67IXMHK6U/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-07-13 Thread Paul Moore

On Mon, 13 Jul 2020 at 18:42, Barry Scott  wrote:

> On 13 Jul 2020, at 05:55, Christopher Barker  wrote:
>
> well, sure, though I have to say that I think that that's an unfortunate 
> confusing thing about python dicts. IN fact, I doubt there are many uses at 
> all for dict.keys() -- most uses can jsut use the dict.
>
> I use key() all the time to sort the keys before printing.
>
> for name in sorted(dict.key()):
> print(name, dict[name)

Why not just use sorted(dict)?
Paul
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/VCPJA3YZGFR7FTNHXDZYE57XTE7XB5ZZ/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-07-13 Thread Barry Scott



> On 13 Jul 2020, at 05:55, Christopher Barker  wrote:
> 
> well, sure, though I have to say that I think that that's an unfortunate 
> confusing thing about python dicts. IN fact, I doubt there are many uses at 
> all for dict.keys() -- most uses can jsut use the dict.

I use key() all the time to sort the keys before printing.

for name in sorted(dict.key()):
print(name, dict[name)


Barry

___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/DIAEDDJ5A3Y7BFSH452KPGRCAIKHLYNL/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-07-12 Thread Christopher Barker

On Sat, Jul 11, 2020 at 1:33 PM David Mertz  wrote:

> On Sat, Jul 11, 2020 at 3:45 PM Christopher Barker 
> wrote:
>
>> random.choice(the_dict.keys())
>>
>> is a little easier than:
>>
>> random.choice(list(the_dict.keys())
>>
>
> Ummm... don't you mean:
>
> random.choice(list(the_dict))
>

well, sure, though I have to say that I think that that's an unfortunate
confusing thing about python dicts. IN fact, I doubt there are many uses at
all for dict.keys() -- most uses can jsut use the dict.

I certainly see newbies write:

if this in dict.keys():

pretty often.

maybe adding indexing to the dict views will give the dict_keys object more
reason to exist :-)

-CHB








> If it's keys you care about I've saved you one character over your
> proposed style, while also reading better to me.  It's only for .items()
> where it doesn't work.  And honestly, just looking up the value from the
> random key is not hard.
>
> In any case, if "reservoir sampling" is the goal here, we should just add
> a function `random.reservoir_sample()` to accommodate using iterators
> rather than sequences (https://en.wikipedia.org/wiki/Reservoir_sampling)
>
>
> --
> The dead increasingly dominate and strangle both the living and the
> not-yet born.  Vampiric capital and undead corporate persons abuse
> the lives and control the thoughts of homo faber. Ideas, once born,
> become abortifacients against new conceptions.
>


-- 
Christopher Barker, PhD

Python Language Consulting
  - Teaching
  - Scientific Software Development
  - Desktop GUI and Web Development
  - wxPython, numpy, scipy, Cython
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/3RPW7EPXHICBPM4RXENZNZAXRGJJDKVV/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-07-11 Thread Inada Naoki

On Sun, Jul 12, 2020 at 4:43 AM Christopher Barker  wrote:
>
> > The existing dictionary memory layout doesn't support direct indexing 
> > (without stepping), so this functionality is not being added as a 
> > requirement.
>
> But it does make it much more efficient if the stepping is done inside the 
> dict object by code that knows its internal structure. Both because it can be 
> in C, and can be done without any additional references or copying. yes, it's 
> all O(n) but a very different constant.
>

It's just a difference in proportional constants.
If the performance difference is really important, dict_views must
have `d.items().tolist()` (replacement for `list(d.items())`) before
random indexing. It is much more used.

Currently, list(iterable) doesn't have any specialized algorithm for
dict views. (As far as I know, it doesn't have specialized algorithm
even for dict).

>
> > If random.choice should support non-sequence ordered container,
> just propose it to random.choice.
>
> That would indeed solve the usability issue, and so may be a good idea,
>
> The problem here is that there is no way for random.choice to efficiently 
> work with generic Mappings. This whole discussion started because now that 
> dicts preserve order, there is both a logical reason, and a practical 
> implementation for indexing. But if that is not exposed, then 
> random.choice(), nor any other function, can take advantage of it.
>

Ditto.  Iterating internal structure directly is not so important.
And there is only little performance difference between current dict
and previous dict implementation for iteration.
I suppose new dict implementation is faster only 20~40%, when dict is
clean (no item is removed yet).

So I don't think preserving order is good reason to support indexing
while `random.choice` is the only use case.

> Which would lead to adding a random_choice protocol -- but THAT sure seems 
> like overkill.
> (OK, you could have the builtin random.choice check for an actual dict, and 
> then use custom code to make a random selection, but that would really be a 
> micro-optimization!)

I already wrote sample code using `itertools.islice()`. It works for
all containers with len() and iterator.
No need for adding protocol.

> If a feature is useful, and doesn't conflict with another feature, then we 
> can add it.

I believe this is a bad idea. It leads software to be huge,
unmaintainable, and unusable.
A Relatively high bar must be set for adding a feature to builtin type
than adding a third party package on PyPI.

Regards,
-- 
Inada Naoki  
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/EHUQLA2W4YQFXTCRF6W5BOY4UH7C6RKU/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-07-11 Thread David Mertz

On Sat, Jul 11, 2020 at 4:33 PM David Mertz  wrote:

> In any case, if "reservoir sampling" is the goal here, we should just add
> a function `random.reservoir_sample()` to accommodate using iterators
> rather than sequences (https://en.wikipedia.org/wiki/Reservoir_sampling)
>

Doh! I mean *iterables* not *iterators* of course.

This would actually be a reasonable and useful function to have, and
sampling from dict.items() would fall out transparently from that.

-- 
The dead increasingly dominate and strangle both the living and the
not-yet born.  Vampiric capital and undead corporate persons abuse
the lives and control the thoughts of homo faber. Ideas, once born,
become abortifacients against new conceptions.
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/ZNSL5SMUVRPNROFLWVLV56HCULP7CBGT/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-07-11 Thread David Mertz

On Sat, Jul 11, 2020 at 3:45 PM Christopher Barker 
wrote:

> random.choice(the_dict.keys())
>
> is a little easier than:
>
> random.choice(list(the_dict.keys())
>

Ummm... don't you mean:

random.choice(list(the_dict))

If it's keys you care about I've saved you one character over your proposed
style, while also reading better to me.  It's only for .items() where it
doesn't work.  And honestly, just looking up the value from the random key
is not hard.

In any case, if "reservoir sampling" is the goal here, we should just add a
function `random.reservoir_sample()` to accommodate using iterators rather
than sequences (https://en.wikipedia.org/wiki/Reservoir_sampling)


-- 
The dead increasingly dominate and strangle both the living and the
not-yet born.  Vampiric capital and undead corporate persons abuse
the lives and control the thoughts of homo faber. Ideas, once born,
become abortifacients against new conceptions.
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/GIIGKOEDIJ3NLMUYJQ3HXFK5ZMI5IQX6/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-07-11 Thread David Mertz

On Sat, Jul 11, 2020 at 3:45 PM Christopher Barker 
wrote:

> On Fri, Jul 10, 2020 at 12:45 PM David Mertz  wrote:
> > The strongest argument I've seen is: `list(d.items())` adds six
> characters.
>
> 1) Matching our mental model / usability: if I want the nth item (or a
> random item) from a dict, I want to ask for that -- I don't want to make a
> list, just to index it and throw it away. the list(d.items()) idiom is the
> right one if I actually need a list -- it's a bit awkward to have to make a
> list, just to throw it away. 2) Performance: making an entire list just to
> get one item out is a potentially expensive operation. Again, for the
> limited use cases, probably not a big deal, I'm having a really hard time
> imagining a application where that would be a bottleneck, but it is *a*
> reason, if not a compelling one.
>

For the single case of wanting JUST ONE random key/val pair from a
dictionary EVER, I agree random.choice(d.items()) looks a little better.  I
even agree that it's something that probably seems obvious to try the first
time.

However, once you want to get multiple items, the amortized costs starts to
matter.  I think that latter situation is likely to be VASTLY more common
in real code... but I'm just saying that without evidence.  Still, this is
such a narrow situation, I just don't see it as justifying a language
change (the "get an item, only once" case).

Code that I can easily imagine writing (although I'm not sure I actually
have):

items = dict.items()
while True:
k, v = random.choice(items)
if satisfies_need(k, v):
do_stuff(k, v)
break

The hypothetical view-sequence would be an attractive nuisance that would
hurt performance in all such similar code.

-- 
The dead increasingly dominate and strangle both the living and the
not-yet born.  Vampiric capital and undead corporate persons abuse
the lives and control the thoughts of homo faber. Ideas, once born,
become abortifacients against new conceptions.
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/57GW6PGVJ3WAF6A4NKMERIOHI2SVMSGI/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-07-11 Thread Christopher Barker

I had a nice note almost written yesterday, but now there've been a bunch
more discussion, so I'm going to try to hit a few points that have been
recently made.

TL;DR: I personally think it would be a nice feature to add indexing to the
dict views. But to be fair, the only real use case I've seen is
random.choice(), so it's really not very compelling. And it seems the
consensus is coming down on the side of no.

However, I find I disagree with many of the "no" arguments, other than:
"it's too much churn for little gain", so I'm going to make my points,
thinking that maybe that trade-off will be rethought. So if yiure firmly on
the side of no, I guess there's no point in reading this, but I do hope it
will be considered.

On Fri, Jul 10, 2020 at 12:45 PM David Mertz  wrote:
> The strongest argument I've seen is: `list(d.items())` adds six
characters.

That's a misrepresentation: the reason to prefer not to use the list call
is twofold:

1) Matching our mental model / usability: if I want the nth item (or a
random item) from a dict, I want to ask for that -- I don't want to make a
list, just to index it and throw it away. the list(d.items()) idiom is the
right one if I actually need a list -- it's a bit awkward to have to make a
list, just to throw it away.

2) Performance: making an entire list just to get one item out is a
potentially expensive operation. Again, for the limited use cases, probably
not a big deal, I'm having a really hard time imagining a application where
that would be a bottleneck, but it is *a* reason, if not a compelling one.

> Moreover, even apart from the work of maintaining the feature itself, the
attractive nuisance of getting O(N) behavior rather than O(1) seems like a
strong anti-feature.

Yes, this is the only anti-feature I've seen described in this thread. But
it's only an anti-feature for the use case of making multiple indexing
operations from the same dict view, without changes to the dict. It's a
feature if you need to make only one (or very few) indexing operations from
the same non-mutated dict. After all, that's exactly why we have the dict
views in the first place: you don't want to have to make an unnecessary
copy if don't need to. That clearly applies to iteration and membership:
why not to the "getting one item out" case?

But of course, it is indeed an attractive nuisance in some cases, which is
different than the other view use cases: they are the same or more
efficient than the old "make a list" approach, whereas this would be more
efficient in some cases, and less in others -- so users would need to
evaluate the trade offs, and many wouldn't even know they should think
about that. Overall though, I think that folks would still need to make a
list if they wanted to do any other MutableSequence operations (or be
clearly working with a copy), so I don't think there's all that much danger
is this feature being accidentally used.

On Fri, Jul 10, 2020, 1:07 PM Stestagg  wrote:
>
>> I don't mind the shooting down, as long as the arguments make sense :D.
>>
>
I agree here for sure: I've no problem with folks having a different
opinion about the value of the trade offs, but I think the trade offs have
been misrepresented -- hence this post ... (no, I don't think anyone's
misrepresenting anything on purpose -- this is about the technical issues)

> It seems like we're both in agreement that the cost of implementing &
>> maintaining the change is non-zero.
>>
>
Another note there: one of the big costs is implementation and
documentation. But this is Open Source: we can all decide that a feature is
a good idea, but it'll never get done unless someone(s) actually decides it
is worth it, to them, to write the code and docs. If no one does, then it's
not going to happen. So that part of the cost is self limiting. Granted,
once written, it needs to be maintained, but that is a lesser cost, at
least in this case, where it's not a whole new object or anything.

> I don't believe that this feature would steepen the language learning
>> curve however, but actually help to shallow it slightly (Explained more
>> below)
>>
>
I agree here. Granted, it's again, only the one use case, but when my
newbie students have to figure out how to get a random key from a dict,
there is no question that:

random.choice(the_dict.keys())

is a little easier than:

random.choice(list(the_dict.keys())

and a lot easier than (untested):

idx = random.randint(0, len(the_dict))
it = iter(the_dict.keys())
for _ in range(choice):
choice = next(it)

getting an arbitrary one is a bit easier:

choice = next(iter(the_dict.keys()))

In practice, I use this as a teaching opportunity -- but the fact that it
IS a teaching opportunity kind makes my point.

Granted, if this feature were there, there'd be the need to teach folks
about why they want to avoid the attractive nuisance discussed above -- so
I'll set a net-zero.
 > >>> import numpy as np

> > >>> mapping_table = np.array(BIG_LOOKUP_DICT.items())
>

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-07-11 Thread Random832

On Thu, Jul 9, 2020, at 13:26, Stestagg wrote:
> Obviously, python is a general-purpose, turing complete language, so 
> each of these options can be written in other ways. But it would be 
> nice if the simple, readable versions also worked :D
> 
> The idea that there are future, unspecified changes to dicts() that may 
> or may not be hampered by allowing indexing sounds like FUD to me, 
> unless there are concrete references?

Does the current implementation support indexing? It is certainly possible in 
principle to preserve ordering without indexing, for example if a linked list 
is used to support the ordering, or if items are stored in an array where 
deleted items leave holes permanently until the dict is resized.

I don't know how dict works, but I am not sure how you would support indexing 
while also allowing deletion to be O(1). A specific random key function would 
be narrower in scope than this, and for anyone who wants full sequence support, 
perhaps that could be added to OrderedDict.
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/66KEODX2EPDIN5K4Y72DRRUW432V3O3S/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-07-11 Thread Stephen J. Turnbull

Chris Angelico writes:

 > I would pick repeatedly from the same dictionary but it might be
 > mutated in between. So the list would have to be reconstructed
 > fresh every time.

OK, that moves me a couple million Planck lengths away from -1 nm. :-)

I guess in that case if I cared about performance of both dict and
sequence access, I would use a derived type with the actual data store
being a list, and a dict storing the index (this was suggested to the
OP, IIRC).  If I really, *really*, REALLY cared about performance I'd
write an accelerated module (eg, in C for CPython)[1], and take advantage
of special characteristics of my application.  WDYT?

Footnotes: 
[1]  Okay, I've done that for XEmacs, though not yet for Python.  So
that's probably not something I could recommend to typical Pythonistas.
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/45DIZOVUO2RJJ7RD653HSOSGDAFYYYZM/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-07-10 Thread Steven D'Aprano

On Fri, Jul 10, 2020 at 06:06:03PM +0100, Stestagg wrote:

> I don't believe that this feature would steepen the language learning curve
> however, but actually help to shallow it slightly (Explained more below)

It steepens it because it's yet another feature to learn. Is it dicts or 
dict views that supports indexing? What's the difference between dict[1] 
and dict.keys()[1]? These things need to be taught.

Remember that there is a cohort of Python programmers who have never, 
ever been exposed to any version of Python where dict.keys returned a 
list. They do not have your expectation "but it used to work!". And that 
cohort is only going to increase.

> What making dict_* types a Sequence will do is make this code (as written)
> behave:
> 1. like it used to do
> 2. like most people seem to expect it to.

The "used to" part is easy to fix: call list. That's what dict.items etc 
used to implicitly do. Now it's explicit.

As for the second part, I agree -- I was surprised that it didn't work, 
but I fear that's because numpy is, in many places, awfully un-Pythonic. 
numpy's design seems to me to be more closely inspired by a hodge-podge 
of mismatched Fortran and C libraries than Python, which is hardly 
surprising.

> Currently numpy does something that I consider unexpected (I'm sure, given
> your previous responses, you'd disagree with this,

Heh, no, I consider numpy to be doing the wrong thing here. I think this 
exposes numpy's pre-iterator origins:

>>> np.array(iter([1, 2, 3, 4]))
array(, dtype=object)

but I'm sure that numpy fans will argue that's a good thing, if you want 
to pass an iterator to array, explicitly converting it to a sequence is 
the right thing to do.

[...]
> > I suspect that even if dict items were indexable, Raymond Hettinger
> > would not be happy with random.sample on dict views.
> >
> 
> I don't know why? I can understand deprecating sets here as they're
> unordered, so the results when seed() has been called are not consistent.
> I don't see why Raymond would object to allowing sampling an ordered
> container, one from which the results will be reproducible.

I wish to back away from predicting what Raymond would say. When I wrote 
that in the wee hours of the morning (local time) it seemed perfectly 
obvious to me why he would dislike it, but after a good sleep I now find 
myself completely unable to recall why that was the case.

So let me backpedal furiously, withdraw that remark, and let Raymond 
speak for himself :-)

[...]
> This is the proposal, I want to make these things Sequences.

But they can't be Sequences, since they are already Sets. They would 
have to be a hybrid of the two, and that, I feel, comes with more 
baggage than just being one or the other.

It's not that I react to the very idea with horror, but it's not a clean 
design to mix Sequence and Set (OrderedSet?) so without a good reason 
for it, I'd rather hold off. Not so much "Never!!!" as "Not yet."

Besides:

- to make views a Sequence, they would have to also support count and 
index methods, so this proposal doesn't go far enough;

- and in Python 2, they weren't merely Sequences, they were full-blown 
actual lists, which supported slicing, in-place sorting, and various 
other methods.

So this proposal doesn't come close to either returning to the Python 2 
days or offering a Sequence API. It's just a hybrid that falls short of 
being either a sequence or a list.

> > ...not that we can jump to the 350th key without
> > stepping through the previous 349 keys.
> 
> The existing dictionary memory layout doesn't support direct indexing
> (without stepping), so this functionality is not being added as a
> requirement.

Oh, I'm sorry, I misunderstood.

-- 
Steven
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/FXRP465KOBIB7DABEPHVNX4S2EOXVXWU/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-07-10 Thread Inada Naoki

I'm -1 too.

Making ordered containers to sequence-like only for random.choice is a
wrong idea.

If random.choice should support non-sequence ordered container,
just propose it to random.choice.
And if the proposal was rejected, you can do it by yourself with
helper functions.

>>> import random
>>> from itertools import islice
>>> def nth(iterable, n):
... return next(islice(iterable, n, None))
>>> def choice_from_container(c):
... return nth(c, random.randrange(len(c)))
>>> d = dict.fromkeys(range(1))
>>> choice_from_container(d)
61
>>> choice_from_container(d)
9858
>>> choice_from_container(d.keys())
2436
>>> choice_from_container(d.items())
(7685, None)

-- 
Inada Naoki  
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/XEXTKEOLJBNWHYPUQMGFTNN4SZQHWLMH/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-07-10 Thread David Mertz

Definitely -1 after reading all the discussion.

The strongest argument I've seen is: `list(d.items())` adds six characters.

So far, there's absolutely nothing described that cannot easily be
accomplished with list(). Moreover, even apart from the work of maintaining
the feature itself, the attractive nuisance of getting O(N) behavior rather
than O(1) seems like a strong anti-feature.

On Fri, Jul 10, 2020, 1:07 PM Stestagg  wrote:

>
> I too have sometimes proposed what I think of as "minor quality-of-life"
>> enhancements, and had them shot down. It stings a bit, and can be
>> frustrating, but remember it's not personal.
>>
>
> I don't mind the shooting down, as long as the arguments make sense :D.
>
> It seems like we're both in agreement that the cost of implementing &
> maintaining the change is non-zero.
> I'm asserting that the end benefit of this change is also non-zero, and in
> my opinion higher than the cost. But I also acknowledge that the benefit
> may not be enough to overcome the inertia behind getting a change made.
>
> The reason I'm persevering is to try and weed out the immaterial or
> incorrect reasons for not making this change, so hopefully we're left with
> a good understanding of the pros-cons.
>
>
>>
>> The difficulty is that our QOL enhancement is someone else's bloat.
>> Every new feature is something that has to be not just written once, but
>> maintained, documented, tested and learned. Every new feature steepens
>> the learning curve for the language; every new feature increases the
>> size of the language, increases the time it takes to build, increases
>> the time it takes for the tests to run.
>>
>
> Yeah, I can see that more code = more code overhead, and that's got to be
> justified.
> I don't believe that this feature would steepen the language learning
> curve however, but actually help to shallow it slightly (Explained more
> below)
>
>
>>
>> This one might only be one new method on three classes, but it all adds
>> up, and we can't add *everything*.
>
>
>> (I recently started writing what was intended to be a fairly small
>> class, and before I knew it I was up to six helper classes, nearly 200
>> methods, and approaching 1500 LOC, for what was conceptually intended to
>> be a *lightweight* object. I've put this aside to think about it for a
>> while, to decide whether to start again from scratch with a smaller API,
>> or just remove the word "lightweight" from the description :-)
>>
>
> Absolute method count is seldom a standalone indicator of a dead end.
> Often classes with many methods, (especially if they're accompanied by
> lots of helpers)
> are a side-effect of some abstraction failure.  Usually when I'm
> consulting on fixing projects
> with these characteristics, it's a case of the developers not correctly
> choosing their abstractions, or letting them leak.
> It sounds like you let that one get away from you, chalk it up to a
> learning experience.
>
> The "It walks like a zoo, squaks/lows/grunts/chitters like a zoo" problem
> is very real.
> This is more of a "It used to be a duck. Now it walks like a duck, but
> doesn't sound like a duck because it's a coot" problem
>
>>
>> So each new feature has to carry its own weight. Even if the weight in
>> effort to write, effort to learn, code, tests and documentation is
>> small, the benefit gained must be greater or it will likely be rejected.
>>
>> "Nice to have" is unlikely to be enough, unless you happen to be one of
>> the most senior core devs scratching your own itch, and sometimes not
>> even then.
>>
>>
>> > >>> import numpy as np
>> > >>> mapping_table = np.array(BIG_LOOKUP_DICT.items())
>> > [[1, 99],
>> >  [2, 23],
>> >  ...
>> > ]
>>
>> That worked in Python 2 by making a copy of the dict items into a list.
>> It will equally work in Python 3 by making a copy of the items into a
>> list.
>>
>> And I expect that even if dict.items() was indexable, numpy would
>> still have to copy the items. I don't know how numpy works in detail,
>> but I doubt that it will be able to use a view of a hash table internals
>> as a fast array without copying.
>>
>> Bottom line here is that adding indexing to dict views won't save you
>> either time or memory or avoid making a copy in this example. All it
>> will save you is writing an explicit call to `list`. And we know what
>> the Zen says about being explicit.
>>
>
> What making dict_* types a Sequence will do is make this code (as written)
> behave:
> 1. like it used to do
> 2. like most people seem to expect it to.
>
> Currently numpy does something that I consider unexpected (I'm sure, given
> your previous responses, you'd disagree with this, but from canvassing
> Python devs, I feel like many people share my opinions here)
> with that code.
>
>
>>
>>
>> > >>> import sqlite3
>> > >>> conn = sqlite3.connect(":memory:")
>> > >>> params = {'a': 1, 'b': 2}
>> > >>> placeholders = ', '.join(f':{p}' for p in params)
>> > >>> statement = f"select {placeholders}"
>> >

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-07-10 Thread Stestagg

> I too have sometimes proposed what I think of as "minor quality-of-life"
> enhancements, and had them shot down. It stings a bit, and can be
> frustrating, but remember it's not personal.
>

I don't mind the shooting down, as long as the arguments make sense :D.

It seems like we're both in agreement that the cost of implementing &
maintaining the change is non-zero.
I'm asserting that the end benefit of this change is also non-zero, and in
my opinion higher than the cost. But I also acknowledge that the benefit
may not be enough to overcome the inertia behind getting a change made.

The reason I'm persevering is to try and weed out the immaterial or
incorrect reasons for not making this change, so hopefully we're left with
a good understanding of the pros-cons.


>
> The difficulty is that our QOL enhancement is someone else's bloat.
> Every new feature is something that has to be not just written once, but
> maintained, documented, tested and learned. Every new feature steepens
> the learning curve for the language; every new feature increases the
> size of the language, increases the time it takes to build, increases
> the time it takes for the tests to run.
>

Yeah, I can see that more code = more code overhead, and that's got to be
justified.
I don't believe that this feature would steepen the language learning curve
however, but actually help to shallow it slightly (Explained more below)


>
> This one might only be one new method on three classes, but it all adds
> up, and we can't add *everything*.


> (I recently started writing what was intended to be a fairly small
> class, and before I knew it I was up to six helper classes, nearly 200
> methods, and approaching 1500 LOC, for what was conceptually intended to
> be a *lightweight* object. I've put this aside to think about it for a
> while, to decide whether to start again from scratch with a smaller API,
> or just remove the word "lightweight" from the description :-)
>

Absolute method count is seldom a standalone indicator of a dead end.
Often classes with many methods, (especially if they're accompanied by lots
of helpers)
are a side-effect of some abstraction failure.  Usually when I'm consulting
on fixing projects
with these characteristics, it's a case of the developers not correctly
choosing their abstractions, or letting them leak.
It sounds like you let that one get away from you, chalk it up to a
learning experience.

The "It walks like a zoo, squaks/lows/grunts/chitters like a zoo" problem
is very real.
This is more of a "It used to be a duck. Now it walks like a duck, but
doesn't sound like a duck because it's a coot" problem

>
> So each new feature has to carry its own weight. Even if the weight in
> effort to write, effort to learn, code, tests and documentation is
> small, the benefit gained must be greater or it will likely be rejected.
>
> "Nice to have" is unlikely to be enough, unless you happen to be one of
> the most senior core devs scratching your own itch, and sometimes not
> even then.
>
>
> > >>> import numpy as np
> > >>> mapping_table = np.array(BIG_LOOKUP_DICT.items())
> > [[1, 99],
> >  [2, 23],
> >  ...
> > ]
>
> That worked in Python 2 by making a copy of the dict items into a list.
> It will equally work in Python 3 by making a copy of the items into a
> list.
>
> And I expect that even if dict.items() was indexable, numpy would
> still have to copy the items. I don't know how numpy works in detail,
> but I doubt that it will be able to use a view of a hash table internals
> as a fast array without copying.
>
> Bottom line here is that adding indexing to dict views won't save you
> either time or memory or avoid making a copy in this example. All it
> will save you is writing an explicit call to `list`. And we know what
> the Zen says about being explicit.
>

What making dict_* types a Sequence will do is make this code (as written)
behave:
1. like it used to do
2. like most people seem to expect it to.

Currently numpy does something that I consider unexpected (I'm sure, given
your previous responses, you'd disagree with this, but from canvassing
Python devs, I feel like many people share my opinions here)
with that code.


>
>
> > >>> import sqlite3
> > >>> conn = sqlite3.connect(":memory:")
> > >>> params = {'a': 1, 'b': 2}
> > >>> placeholders = ', '.join(f':{p}' for p in params)
> > >>> statement = f"select {placeholders}"
> > >>> print(f"Running: {statement}")
> > Running: select :a, :b
> > >>> cur=conn.execute(statement, params.values())
> > >>> cur.fetchall()
> > [(1, 2)]
>
> Why are you passing a view to a values when you could pass the dict
> itself? Is there some reason you don't do this?
>
> # statement = "select :a, :b"
> py> cur=conn.execute(statement, params)
> py> cur.fetchall()
> [(1, 2)]
>
> I'm not an expert on sqlite, so I might be missing something here, but I
> would have expected that this is the prefered solution. It matches the
> example in the docs, which uses a

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-07-10 Thread Chris Angelico

On Sat, Jul 11, 2020 at 12:45 AM Steven D'Aprano  wrote:
>
> On Fri, Jul 10, 2020 at 02:50:17PM +1000, Chris Angelico wrote:
>
> > And immediately above that part, I said that I had made use of this,
> > and had used it in Python by listifying the dict first. Okay, so I
> > didn't actually dig up the code where I'd done this, but that's a use
> > case. I have *actually done this*. In both languages.
>
> I'm not calling you a liar, I'm just pointing out that saying "I have
> done this" is not a use-case. You must have had a reason for *why* you
> did it, beyond just "because I can!" (or in this case, "because I can't,
> so I used a list instead").
>
> It's the *why* that's important.
>

Okay, fair enough. I don't remember exactly what it was, but basically
I had a collection of things, identified by some sort of name, and
wanted to pick one of the currently available ones. If I *just* had
the things, I would have a list, and if I *just* had the names, I
would also have a list, and either way, I can pick one at random. But
having the dict means I have to go to the extra hassle to listify it.

Unfortunately I don't have the code to hand, and it's a bit hard to
search your hard drive for "a place where I wanted to call
random.choice on a dict but didn't". :)

ChrisA
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/CEB7UYI6AGL7ACX756JRH65NWY4GSEVH/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-07-10 Thread Ronald Oussoren via Python-ideas




> On 10 Jul 2020, at 15:38, Eric V. Smith  wrote:
> 
> 
> On 7/10/2020 9:20 AM, Paul Moore wrote:
>> On Fri, 10 Jul 2020 at 13:47, Stestagg  wrote:
>> 
>>> The intent of my statement was: The current implementation of dict does not 
>>> allow any reasonable implementation of dict_keys/dict_values/dict_items 
>>> with O(1) index behaviour.  I.e. the performance scalability of any future 
>>> changes to the dict_* types that add direct indexing is *not* likely to be 
>>> adversely affected by changes to the implementation of dict(), unless 
>>> somehow iteration in O(n) time becomes impossible.
>> So you're saying that you'd like to be able to index
>> dict.keys()/dict.values()/dict.items(), but are fine with O(n)
>> performance.
>> 
>> OK,
>> 
>> list(d.items())[n]
> 
> Or next(itertools.islice(d.items(), n, None))
> 
> There are multiple O(n) ways of doing this. If we're not going to require 
> O(1), then I don't see the point of adding another ways of achieving the same 
> thing. And I don't think we should require O(1): I wouldn't want to constrain 
> any implementation for a niche use case.

I agree.

Ronald

—

Twitter / micro.blog: @ronaldoussoren
Blog: https://blog.ronaldoussoren.net/___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/NSPUWPJYS7E6W7YRBPGKH5KFFQK5K5YW/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-07-10 Thread Steven D'Aprano

On Fri, Jul 10, 2020 at 02:50:17PM +1000, Chris Angelico wrote:

> And immediately above that part, I said that I had made use of this,
> and had used it in Python by listifying the dict first. Okay, so I
> didn't actually dig up the code where I'd done this, but that's a use
> case. I have *actually done this*. In both languages.

I'm not calling you a liar, I'm just pointing out that saying "I have 
done this" is not a use-case. You must have had a reason for *why* you 
did it, beyond just "because I can!" (or in this case, "because I can't, 
so I used a list instead").

It's the *why* that's important.

When we ask for use-cases, the implication is that contrived examples 
don't really count:

- to win a bet
- just out of curiosity, to see if it can be done (I've done this)
- to write obfuscated code
- to solve an exercise:

"Exercise 7: choose a random key:value pair from 
a dict, and print the result."

etc. So *on its own* the ability to choose a random key:value pair from 
a dict is not very compelling. If it were combined with a *why* then it 
could become a stronger example, presuming of course that this use-case 
would not be equally well served by listifying the key:value pairs 
first.

As I said, perhaps I just lack imagination.

-- 
Steven
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/7YT4V5H64H7CZMXSM3FEMQDQFZK3J22O/
Code of Conduct: http://python.org/psf/codeofconduct/

1 2 >

1 - 100 of 179 matches

Mail list logo