[issue40411] frozen collection.Counter

2020-05-14 Thread paul rubin


paul rubin  added the comment:

Note: PEP 603 may essentially take care of this, if it is accepted.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue40411] frozen collection.Counter

2020-04-28 Thread Kyle Stanley


Kyle Stanley  added the comment:

paul rubin wrote:
> Yeah I think the basic answer to this ticket is "Python doesn't really have 
> multisets and a proposal to add them should go somewhere else".  Fair 
> enough-- consider the request withdrawn from here.

Not necessarily, python-ideas is just more so the designated location for 
cultivating Python change proposals in their early stages. If you're interested 
enough in this being added to the standard library, you can present your case 
there using some of the guidelines I suggested earlier. Also, if others are 
interested in the feature and have specific use cases for it, that can 
significantly increase the odds of it being added. The python-ideas mailing 
list tends to get much more traffic than an individual open issue on bpo.

In the context of significant new feature proposals, bpo is typically used once 
it's been approved of (with a formal PEP for especially large changes), to 
further discuss the implementation details and for tracking related PRs.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue40411] frozen collection.Counter

2020-04-27 Thread paul rubin


paul rubin  added the comment:

Totally tangential: Veky, your ordinal example would work ok in Haskell and 
you'd have omega work the same way as epsilon0.  Take a look at Herman Ruge 
Jervell's book "Proof Theory" for a really nice tree-based ordinal notation 
that goes much higher than epsilon0.  It would be cool if your student 
implemented it.  Raymond, I agree that ordinals are a very esoteric use of 
multisets ;).

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue40411] frozen collection.Counter

2020-04-27 Thread paul rubin


paul rubin  added the comment:

Yeah I think the basic answer to this ticket is "Python doesn't really have 
multisets and a proposal to add them should go somewhere else".  Fair enough-- 
consider the request withdrawn from here.

Regarding minimalism vs completeness, regarding some feature X (say X is 
multisets), it's reasonable per minimalism to decide not to have X.  But if on 
weighing the use cases you decide to have X after all, I think it's best to 
implement X properly and completely with all the edge cases handled, rather 
than implement a half-baked subset.  That was something Java was historically 
very good at and Python wasn't.  I guess that is one for the theorists though.  
Thanks everyone.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue40411] frozen collection.Counter

2020-04-27 Thread Raymond Hettinger


Raymond Hettinger  added the comment:

If you're interested in pursuing this, please follow Kyle's suggestion and move 
this to Python ideas.  The two frozencounter use cases presented are pretty 
exotic, something encountered rarely in a lifetime.  That is well below the 
threshold for growing the standard library.  We cater to the commonplace. It is 
for PyPI to cater to everything else :-)

--
resolution:  -> postponed
stage:  -> resolved
status: open -> closed

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue40411] frozen collection.Counter

2020-04-27 Thread Kyle Stanley


Kyle Stanley  added the comment:

paul rubin wrote:
> In that case, it works for what it is made for, so it really then is a 
> separate question of whether implementing multisets properly is worthwhile.

Indeed, that seems to be the primary question to address. Personally, I don't 
have any strong opposition to including a multiset implementation in 
collections, I'm just not yet convinced they have enough widespread practical 
utility (that can't already be provided with Counter or a slightly modified 
version of it) to justify including them in collections.
 
> I dislike the concept of shovelling off basic functionality to 3rd party 
> libraries.  The rejection of that philosophy (Python's old motto "Batteries 
> included") is one of the things that attracted me to Python in the first 
> place.  Though that value is mostly historical now, I'm sad about the loss.  
> It's impossible to write a large Ruby or Javascript (npm) application with 
> 100s of third party modules, every one of which is an attack vector ("supply 
> chain attack" is the current buzzword), and  whose implementations vary 
> widely in quality and usability.

I can certainly sympathize with that sentiment of wanting to avoid excessive 
dependencies, and I think in general that we do still try to include more in 
the Python standard library compared to many other ecosystems. AFAIK, there's 
zero desire for the Python ecosystem to slowly turn into the "dependency hell" 
that NPM has become, with almost every library relying on a chain of minor 3rd 
party utilities.

However, rather than including everything that *might* be useful, the goal has 
been to include features that have substantial practical utility for a decent 
volume of users (and library maintainers), as long as they fit with the 
existing theme of a module (or are widely needed enough to justify a separate 
module) for a couple of primary reasons:

1) Making the stdlib modules as digestible as reasonably possible
2) Maximizing the ratio of practical benefit to implementation and long-term 
maintenance cost, which is especially important since development is 
voluntarily driven

Also, I would consider "batteries included" to still be a general goal, but 
it's practically impossible for the stdlib to fully cover what everyone would 
consider to be "basic functionality" while still maintaining its present 
quality. It's even more complicated by the fact that "basic functionality" will 
differ significantly depending upon who you ask, based upon what they 
personally use Python for and their unique experiences.

So, we tend to lean towards minimalism, focusing on implementing and improving 
features that have the most or strongest concrete use cases for a large volume 
of users. There are of course exceptions to this, such as features that are 
highly useful for a relatively small volume of users, but are difficult to 
implement a proper or optimized version. That has to be determined on a 
case-by-case basis though.

--
nosy:  -veky

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue40411] frozen collection.Counter

2020-04-27 Thread Vedran Čačić

Vedran Čačić  added the comment:

Just another usecase: one of my students is writing an implementation of 
ordinal arithmetic in Python as a graduate thesis. FrozenCounters whose keys 
are FrozenCounters and whose values are natural numbers are _exactly_ 
isomorphic (via Cantor's normal form) to countable ordinals below epsilon_0 
(and if a Counter can have itself as a key, we can go above epsilon_0 too).

Examples: zero = f{}
  one = f{zero: 1}
  seven = f{zero: 7}
  omega = f{one: 1}
  w^7*3+5 = f{seven: 3, zero: 5}
  w^w^w = f{f{omega: 1}: 1}
  epsilon0 = f{epsilon0: 1}
  and so on

I realize this is not something everybody does, but just to show that the need 
exists. We started with https://github.com/tamuhey/python-frozen-counter, 
afterwards we rolled our own implementation. But if it were in the stdlib, it 
would have saved us a lot of work.

--
nosy: +veky

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue40411] frozen collection.Counter

2020-04-27 Thread paul rubin


paul rubin  added the comment:

Oops I meant "*without* 100s of third party modules" in the case of ruby gems 
or npm.  There are just a few pip modules that I really use all the time, most 
notably bs4.  I continue to use urllib/urllib2 instead of requests because I'm 
used to them and because they eliminate an unnecessary dependency.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue40411] frozen collection.Counter

2020-04-27 Thread paul rubin


paul rubin  added the comment:

Kyle, thanks, I saw your comment after posting my own, and I looked at 
Raymond's mailing list post that you linked.  I do think that "completing the 
grid" is a good thing in the cases where it's obvious how to do it (if there's 
one and one obvious way, then doing it that way should work), and having 
missing combinations in the grid is also a form of interface complexity (expect 
subset to work, find that it doesn't, and implement some hack).  Also it seems 
to me that sets, dicts, bags are basic data objects in many languages these 
days, without a lot of design space to move around in.  So we don't have to 
worry much about constraining future choices.  We could just look at Smalltalk 
or Ruby or whatever and just do what they did.  

However, that is philosophical.  I did learn from Raymond's message that 
Counter doesn't really attempt to be a multiset implementation.  In that case, 
it works for what it is made for, so it really then is a separate question of 
whether implementing multisets properly is worthwhile.  (That's as opposed to 
saying we already have a multiset implementation but some functionality is 
missing from it).

I dislike the concept of shovelling off basic functionality to 3rd party 
libraries.  The rejection of that philosophy (Python's old motto "Batteries 
included") is one of the things that attracted me to Python in the first place. 
 Though that value is mostly historical now, I'm sad about the loss.  It's 
impossible to write a large Ruby or Javascript (npm) application with 100s of 
third party modules, every one of which is an attack vector ("supply chain 
attack" is the current buzzword), and  whose implementations vary widely in 
quality and usability.  I looked at the docs for Ruby's multiset gem 
(https://maraigue.hhiro.net/multiset/) and they are partly in Japanese.  Python 
has until the past few years managed to keep away from that kind of thing.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue40411] frozen collection.Counter

2020-04-27 Thread paul rubin


paul rubin  added the comment:

Yes, the idea was for them to be hashable, to be used as dict keys.  E.g. if 
you use frozensets to model graphs, you'd use frozen multisets for hypergraphs. 
 My immediate use case involved word puzzles, e.g. treating words as bags of 
scrabble tiles with letters on them, and finding out what other words you could 
form from the letters.  It is not a pressing need.  

I'm trying to remember what the other missing operation I ran into was, a few 
days ago.  It may have been subset.  I found a workaround but it seemed a bit 
ugly.  It would be nice if Counters could also be thought of as multisets which 
can do the same things sets can do with unneeded head scratching.  It seems 
like counters and multisets/bags really are different things conceptually.  
Their operations and implementations overlap, but they are different.  E.g. it 
can make sense for a counter to have a negative count of something, but not for 
a bag.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue40411] frozen collection.Counter

2020-04-27 Thread Raymond Hettinger


Raymond Hettinger  added the comment:

Do you want to freeze a counter to prevent future count changes or bto make it 
to be hashable?  Essentially the only reason we have frozenset is to use sets 
as dict keys to model graphs, but that wouldn't make much sense for counters.

What multiset methods do you need? We already have addition, subtraction, 
union, and intersection.  A symmetric_difference doesn't seem to make sense or 
have a use case.   AFAICT, all that is missing are subset/superset tests which 
were omitted because their semantics conflict with the existing dict equality 
semantics and because the use cases are rare.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue40411] frozen collection.Counter

2020-04-27 Thread Kyle Stanley


Kyle Stanley  added the comment:

Also, adding Raymond to the nosy list in case he has any specific comments 
about a frozen collections.Counter.

--
nosy: +rhettinger
versions: +Python 3.9

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue40411] frozen collection.Counter

2020-04-27 Thread Kyle Stanley


Kyle Stanley  added the comment:

Rather than starting this out as a bpo issue, I would highly recommend bringing 
forth a proposal for this in python-id...@python.org and explaining in as much 
detail as possible:

1) How a frozen variation of collection.Counter would benefit your specific use 
case
2) The current available workaround you mentioned, issues with it, and the 
comparative benefit your proposal would provide
3) Why it should be included in the standard library, compared to a 3rd party 
package on PyPI

The 3rd one can be a bit tricky, but it typically involves explaining how 
widespread the potential use case(s) would be and how tricky it would be for 
users to implement something like this on their own (from both a functionality 
and performance perspective).

>From a brief glance of the archives, it doesn't look like this specific idea 
>was proposed, other than part of a more generalized one to add frozen 
>equivalents for the entire collections module, which was rejected on the basis 
>of the use cases being too theoretical and overly broad.

Also, while not specific to your particular proposal, I would recommend reading 
over the following post from Raymond Hettinger in the python-ideas archive 
(from the previously mentioned thread): 
https://mail.python.org/archives/list/python-id...@python.org/message/QVBVJU4RNJ5MDKBJ5CNGINYG24WZDZX7/.
 It explains why the collections module tries to be minimalist, and 
specifically the requirement of new features providing a tangible, real-world 
benefit.

It's of course up to you how you decide to format the proposal, but by 
answering the above points and demonstrating clear value in specific use cases, 
it has a much better chance of eventually making into the standard library (or 
at the least, a constructive discussion about it).

--
nosy: +aeros

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue40411] frozen collection.Counter

2020-04-27 Thread paul rubin


New submission from paul rubin :

It would be nice to have frozen Counters analogous to frozensets, so they are 
usable as dictionary keys.  One can of course create frozenset(counter.items()) 
but that means the set items are tuples rather than the original set elements, 
so it's no longer quick to check membership.  I can work around this in my 
immediate application but it seems like a shortcoming.  There are some other 
set operations that aren't supported by counters either, that would be nice if 
they are conceptually multisets.

--
components: Library (Lib)
messages: 367472
nosy: phr
priority: normal
severity: normal
status: open
title: frozen collection.Counter
type: enhancement

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com