from:"Steve Jorgensen"

[Python-ideas] Abstract dataclasses and dataclass fields

2023-12-21 Thread Steve Jorgensen

I am finding that it would be useful to be able to define a dataclass that is 
an abstract base class and define some of its fields as abstract.

As I am typing this, I realize that I could presumably write some code to 
implement what I'm asking for. Maybe it is a good enough idea to make part of 
the standard API in any case though? I'm thinking that a field would be made 
abstract by passing `abstract=True` as an argument to `dataclasses.field()`.
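
A minimal sketch of how this could be approximated today, assuming the abstract marker is carried in the existing `metadata` argument of `dataclasses.field()` rather than in a new `abstract=True` argument (which does not exist). The `ABSTRACT` constant and the `Shape`/`Circle` classes are made up for illustration. A subclass "implements" an abstract field by redeclaring it, which replaces the inherited field and its metadata.

```python
from dataclasses import dataclass, field, fields

# Hypothetical marker; dataclasses itself attaches no meaning to this metadata.
ABSTRACT = {"abstract": True}


@dataclass
class Shape:
    # Declared abstract: a subclass is expected to redeclare this field.
    name: str = field(default=None, metadata=ABSTRACT)

    def __post_init__(self):
        # Refuse to instantiate while any field still carries the marker.
        missing = [f.name for f in fields(self) if f.metadata.get("abstract")]
        if missing:
            raise TypeError(
                f"Can't instantiate {type(self).__name__} "
                f"with abstract fields: {', '.join(missing)}"
            )


@dataclass
class Circle(Shape):
    name: str = "circle"  # redeclaring replaces the abstract field
    radius: float = 1.0
```

With this sketch, `Circle()` can be created, while `Shape()` raises `TypeError` because `name` is still marked abstract.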
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/TFDJDTM7ZOYKBOPAYSDCM3T7SYD2RIJL/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Bind/normalize params for @functools.cache

2023-03-19 Thread Steve Jorgensen

Steve Jorgensen wrote:
> I was surprised to find, when I pass arguments to a function decorated with 
> `@functools.cache` in different, equivalent ways, the cache does not 
> recognize them as the same.
> counter = itertools.count(1)
> @functools.cache
> def example(a, b, c=0):
>     return (next(counter), a, b, c)
> example(1, 2)  # => (1, 1, 2, 0)
> example(1, b=2)  # => (2, 1, 2, 0)
> example(1, 2, 0)  # => (3, 1, 2, 0)
> When I wrote my own implementation as a coding exercise, I noticed the same 
> weakness while testing it and solved that by having the decorator function 
> get the signature of the decorated function, then use the bind method of the 
> signature to bind the parameter values, then call the apply_defaults method 
> on the bound arguments, and then finally, use the args and kwargs properties 
> of the bound arguments to make the cache key.
> It seems like functools.cache should do the same thing. If it is undesirable 
> for that to be the default behavior, then it could be optional (e.g. 
> @functools.cache(normalize=True) ).
> I have not tested to see if functools.lru_cache has the same issue. I presume 
> that it does, so my suggestion would apply to that as well.

After saying that, I realized that, if the behavior should be optional, then 
maybe it would make sense to provide another wrapper to normalize the 
parameters instead (see possible implementation below)? On the other hand, 
since the primary use of such a thing would be for caching, maybe it does make 
more sense to include the behavior in 'functools.cache' et al., as I originally 
suggested, or maybe have both.

from functools import wraps
from inspect import signature


def bind_call_params(func):
    """
    Transform a function to always receive its arguments in the same form
    (which are positional and which are keyword) even if its
    implementation is less strict than what is described by its
    signature.

    This is for use in cases where the form in which the parameters
    are passed may be significant to a decorator (e.g. '@functools.cache').
    """
    sig = signature(func)

    @wraps(func)
    def wrapper(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        bound.apply_defaults()
        return func(*bound.args, **bound.kwargs)

    return wrapper
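
A usage sketch for the wrapper above, showing one detail that is easy to get wrong: the normalizing wrapper has to be the outer decorator, so that `functools.cache` only ever sees the normalized call. The `example` function and `counter` are the ones from the original message; the decorator order is my reading of how the pieces would compose, not something stated in the thread.

```python
import functools
import itertools
from inspect import signature


def bind_call_params(func):
    # Same idea as the wrapper above, repeated so this block is self-contained.
    sig = signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        bound.apply_defaults()
        return func(*bound.args, **bound.kwargs)

    return wrapper


counter = itertools.count(1)


@bind_call_params   # outermost: normalizes the call first
@functools.cache    # innermost: receives only the normalized arguments
def example(a, b, c=0):
    return (next(counter), a, b, c)


first = example(1, 2)
second = example(1, b=2)
third = example(1, 2, 0)
```

All three calls reach the cache as `example(1, 2, 0)`, so the function body runs once. With the decorators in the opposite order, the cache would key on the raw arguments and the normalization would have no effect on caching.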
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/B7UC2472UBGCMO2S3NZWRTDZLJ7OOPRJ/

[Python-ideas] Bind/normalize params for @functools.cache

2023-03-19 Thread Steve Jorgensen

I was surprised to find, when I pass arguments to a function decorated with 
`@functools.cache` in different, equivalent ways, the cache does not recognize 
them as the same.

import functools
import itertools

counter = itertools.count(1)

@functools.cache
def example(a, b, c=0):
    return (next(counter), a, b, c)

example(1, 2)  # => (1, 1, 2, 0)
example(1, b=2)  # => (2, 1, 2, 0)
example(1, 2, 0)  # => (3, 1, 2, 0)

When I wrote my own implementation as a coding exercise, I noticed the same 
weakness while testing it and solved that by having the decorator function get 
the signature of the decorated function, then use the bind method of the 
signature to bind the parameter values, then call the apply_defaults method on 
the bound arguments, and then finally, use the args and kwargs properties of 
the bound arguments to make the cache key.
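
The steps described above can be sketched roughly as follows. This is not the code from my exercise and not how `functools.cache` is implemented; `normalized_cache` is a made-up name, and the plain dict stands in for a real cache.

```python
import functools
import inspect
import itertools


def normalized_cache(func):
    """Cache results keyed on the bound, default-filled arguments."""
    sig = inspect.signature(func)
    results = {}

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        bound.apply_defaults()
        # bound.args / bound.kwargs are the same for every equivalent call.
        key = (bound.args, tuple(sorted(bound.kwargs.items())))
        if key not in results:
            results[key] = func(*bound.args, **bound.kwargs)
        return results[key]

    return wrapper


counter = itertools.count(1)


@normalized_cache
def example(a, b, c=0):
    return (next(counter), a, b, c)


first = example(1, 2)
second = example(1, b=2)
third = example(1, 2, 0)
```

Here all three calls produce the key `((1, 2, 0), ())`, so they share one cache entry.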

It seems like functools.cache should do the same thing. If it is undesirable 
for that to be the default behavior, then it could be optional (e.g. 
@functools.cache(normalize=True) ).

I have not tested to see if functools.lru_cache has the same issue. I presume 
that it does, so my suggestion would apply to that as well.
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/DS72SMJRJNKO3UVDS7ZVKAAPES45PLOQ/

[Python-ideas] Re: Add InvalidStateError to the standard exception hierarchy

2022-09-01 Thread Steve Jorgensen

Paul Moore wrote:
> What's wrong with defining a custom exception? It's literally one line:
> `class InvalidStateError(Exception): pass`. Two lines if you want to put
> the `pass` on its own line.
> The built in exceptions are ones that are raised by the core interpreter.
> Even the stdlib doesn't get builtin exceptions, look at sqlite3.Error, for
> example. Defining a custom exception in the module alongside the function
> that raises it is both normal practice, and far more discoverable.
> Paul
> On Thu, 1 Sept 2022 at 22:42, Steve Jorgensen stevec...@gmail.com wrote:
> > I frequently find that I want to raise an exception when the target of a
> > call is not in an appropriate state to perform the requested operation.
> > Rather than choosing between `Exception` or defining a custom exception, it
> > would be nice if there were a built-in `InvalidStateError` exception that
> > my code could raise.
> > In cases where I want to define a custom exception anyway, I think it
> > would be nice if it could have a generic `InvalidStateError` exception
> > class for it to inherit from.
> > Of course, I would be open to other ideas for what the name of this
> > exception should be. Other possibilities off the top of my head are
> > `BadStateError` or `StateError`.
> >
OK, but by that logic, why do we have standard exceptions like `ValueError` 
when we could define custom exceptions for the cases where that should be 
raised?
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/5JGBBE7JEKWYPEQO6NC4B7UFKJN2UK6K/

[Python-ideas] Re: Add InvalidStateError to the standard exception hierarchy

2022-09-01 Thread Steve Jorgensen

Matthias Görgens wrote:
> > If the target of the call isn't in an appropriate state, isn't that a
> > bug in the constructor that it allows you to construct objects that are
> > in an invalid state?
> > You should fix the object so that it is never in an invalid state rather
> > than blaming the caller.
> > You can't really do that with files that have been closed.
> Unless you disallow manual closing of files altogether.
> That being said, I'd suggest that people raise custom exception, so your
> callers can catch exactly what they want to handle.
> An generic exception like ValueError or the proposed InvalidStateError
> could be thrown by almost anything you call in your block, instead of just
> what you actually intend to catch.

I didn't say that I was talking about a file. In fact, today, I'm talking about 
an object that manages a subprocess. If a caller tries to call a method of the 
manager to interact with the subprocess when the subprocess has not yet been 
started or after it has been terminated, then I want to raise an appropriate 
exception. I am raising a custom exception, and it annoys me that it has to 
simply inherit from Exception when I think that an invalid state condition is a 
common enough kind of issue that it should have a standard exception class in 
the hierarchy.
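
To make the case concrete, here is a rough sketch of the kind of manager I mean. The `InvalidStateError` class defined here is a stand-in for the proposed built-in, which does not exist; `SubprocessManager` and `SubprocessNotRunningError` are invented for this example.

```python
import subprocess
import sys


class InvalidStateError(Exception):
    """Stand-in for the proposed generic exception (not a real built-in)."""


class SubprocessNotRunningError(InvalidStateError):
    """Raised when an operation needs a running subprocess."""


class SubprocessManager:
    def __init__(self, argv):
        self._argv = list(argv)
        self._process = None

    def start(self):
        self._process = subprocess.Popen(self._argv)

    def terminate(self):
        # Nothing is wrong with any argument; the object is in the wrong state.
        if self._process is None:
            raise SubprocessNotRunningError("subprocess has not been started")
        if self._process.poll() is not None:
            raise SubprocessNotRunningError("subprocess has already exited")
        self._process.terminate()
        self._process.wait()


manager = SubprocessManager([sys.executable, "-c", "pass"])
```

Calling `manager.terminate()` before `manager.start()` raises `SubprocessNotRunningError`, which callers could also catch as the generic `InvalidStateError`.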
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/YC62WQIXUTM3ULVA64SBXBS5YZ3M2XGT/

[Python-ideas] Re: Add InvalidStateError to the standard exception hierarchy

2022-09-01 Thread Steve Jorgensen

Matthias Görgens wrote:
> > If the target of the call isn't in an appropriate state, isn't that a
> > bug in the constructor that it allows you to construct objects that are
> > in an invalid state?
> > You should fix the object so that it is never in an invalid state rather
> > than blaming the caller.
> > You can't really do that with files that have been closed.
> Unless you disallow manual closing of files altogether.
> That being said, I'd suggest that people raise custom exception, so your
> callers can catch exactly what they want to handle.
> An generic exception like ValueError or the proposed InvalidStateError
> could be thrown by almost anything you call in your block, instead of just
> what you actually intend to catch.

It depends on context whether it makes sense to define a custom exception, and 
I agree that I frequently should define a custom exception. In that case 
though, it would still be nice to have an appropriate generic exception for 
that to inherit from, just as I would inherit from `ValueError` for a special 
case of a value error.
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/GNBFLWNXWBV54C73MOZJDEXJPDIOVBGM/

[Python-ideas] Re: Add InvalidStateError to the standard exception hierarchy

2022-09-01 Thread Steve Jorgensen

Jean Abou Samra wrote:
> On 01/09/2022 at 23:40, Steve Jorgensen wrote:
> > I frequently find that I want to raise an exception when the target of a 
> > call is not in an appropriate state to perform the requested operation. 
> > Rather than choosing between `Exception` or defining a custom exception, it 
> > would be nice if there were a built-in `InvalidStateError` exception that 
> > my code could raise.
> > In cases where I want to define a custom exception anyway, I think it would 
> > be nice if it could have a generic `InvalidStateError` exception class for 
> > it to inherit from.
> > Of course, I would be open to other ideas for what the name of this 
> > exception should be. Other possibilities off the top of my head are 
> > `BadStateError` or `StateError`.
> https://docs.python.org/3/library/exceptions.html#ValueError states that
> ValueError is “Raised when an operation or function receives an argument
> that has the right type but an inappropriate value, and the situation is
> not described by a more precise exception such as IndexError
> (https://docs.python.org/3/library/exceptions.html#IndexError).” How would
> a "state error" differ from this more precisely? What value would this new
> exception type add? Both ValueError and this proposed StateError are very
> generic.

`ValueError` is for when the value of an argument passed to the function 
is unacceptable. The exception that I propose would be for when there is 
nothing wrong with any argument value, but the object is not in the correct 
state for that method to be called.

I should have provided an example. One example is when trying to call methods 
to interact with a remote system either before a connection has been made or 
after the connection has been terminated.
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/W2G5XNWKS6KHXSCH45QPLFRUMZIVNS4L/

[Python-ideas] Add InvalidStateError to the standard exception hierarchy

2022-09-01 Thread Steve Jorgensen

I frequently find that I want to raise an exception when the target of a call 
is not in an appropriate state to perform the requested operation. Rather than 
choosing between `Exception` or defining a custom exception, it would be nice 
if there were a built-in `InvalidStateError` exception that my code could raise.

In cases where I want to define a custom exception anyway, I think it would be 
nice if it could have a generic `InvalidStateError` exception class for it to 
inherit from.

Of course, I would be open to other ideas for what the name of this exception 
should be. Other possibilities off the top of my head are `BadStateError` or 
`StateError`.
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/NMHNKSEZG7UZ6AIFTVGQXVECCNYYVODT/

[Python-ideas] Re: Make dataclass aware that it might be used with Enum

2022-07-20 Thread Steve Jorgensen

Ethan Furman wrote:
> On 7/9/22 12:19, Steve Jorgensen wrote:
> > [...] It works great to combine them by defining the dataclass as a mixin 
> > for the Enum class. Why would
> > it not be good to include that as an example in the official docs, assuming 
> > (as I believe) that it is a
> > particularly useful combination?
> > Do you have some real-world examples that show this?
> --
> ~Ethan~

I have only used it in 1 real-world case so far. It's a good use case but not a 
good example case. I'll keep using this pattern though, and I'll probably end 
up with a good example soonish.
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/FGK4R4ES3STAS2PZLYX5UOV5HZRIFSF2/

[Python-ideas] Re: Dataclasses for complex models, A proposal for datatrees,

2022-07-15 Thread Steve Jorgensen

I think there have not been any replies to this so far because it's too much 
effort to figure out what you're actually suggesting. Can you try to make the 
request again, starting with a clear summary and then breaking out some of the 
details?
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/7XPPPC63XVXFIXP2WIT6ARRX7CTYPRSX/

[Python-ideas] Re: Make dataclass aware that it might be used with Enum

2022-07-12 Thread Steve Jorgensen

Chris Angelico wrote:
> On Mon, 11 Jul 2022 at 03:54, Steve Jorgensen stevec...@gmail.com wrote:
> > David Mertz, Ph.D. wrote:
> > I've seen this thread, and also wondered why anyone could EVER want a
> > dataclass that is an enum.  Nothing I've seen in the thread gives me any
> > hint about that, really.
> > On Sun, Jul 10, 2022 at 7:44 AM Barry Scott ba...@barrys-emacs.org wrote:
> > On 9 Jul 2022, at 22:53, Steve Jorgensen stevec...@gmail.com wrote:
> > I don't think that dataclasses have the limited set of intended uses
> > that you are interpreting them as having. To me, the fact that they can be
> > frozen makes them a good fit with Enum.
> > Please quote the email that you are replying to.
> > It is usually considered a code smell to have a class that is two or more
> > things.
> > This seems to be what you are trying to do.
> > How can one class be a set of fields and also the enum for one of its own
> > fields?
> > I do not understand why this is resonable.
> > Barry
> > --
> > Keeping medicines from the bloodstreams of the sick; food
> > from the bellies of the hungry; books from the hands of the
> > uneducated; technology from the underdeveloped; and putting
> > advocates of freedom in prisons.  Intellectual property is
> > to the 21st century what the slave trade was to the 16th.
> > Sorry, I don't know how I communicated that I was trying to have one class 
> > be a set of fields and also the enum for one of its own fields.
> > I'm really just wanting to have each member of the enum be an instance of a 
> > frozen dataclass. If an of the dataclass fields were of an enum type, then 
> > it would presumably not be for the same enum. In my example, none of the 
> > fields of the dataclass contains an enum. One contains a string, and the 
> > other contains an int.
> > Just throwing an idea out there, but would it work better to have an
> enum-namedtuple instead?
> ChrisA

The only benefit I can think of for namedtuple vs a dataclass is compactness in 
memory, but the number of members of an enum is typically very small. I think 
the extra flexibility of a dataclass makes it more desirable for this purpose.
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/F4YM66UAQ3GXXBIMPNX6MLEQA22K7UVL/

[Python-ideas] Re: Make dataclass aware that it might be used with Enum

2022-07-10 Thread Steve Jorgensen

David Mertz, Ph.D. wrote:
> I've seen this thread, and also wondered why anyone could EVER want a
> dataclass that is an enum.  Nothing I've seen in the thread gives me any
> hint about that, really.
> On Sun, Jul 10, 2022 at 7:44 AM Barry Scott ba...@barrys-emacs.org wrote:
> > On 9 Jul 2022, at 22:53, Steve Jorgensen stevec...@gmail.com wrote:
> > I don't think that dataclasses have the limited set of intended uses
> > that you are interpreting them as having. To me, the fact that they can be
> > frozen makes them a good fit with Enum.
> > Please quote the email that you are replying to.
> > It is usually considered a code smell to have a class that is two or more
> > things.
> > This seems to be what you are trying to do.
> > How can one class be a set of fields and also the enum for one of its own
> > fields?
> > I do not understand why this is resonable.
> > Barry
> > 
> > -- 
> Keeping medicines from the bloodstreams of the sick; food
> from the bellies of the hungry; books from the hands of the
> uneducated; technology from the underdeveloped; and putting
> advocates of freedom in prisons.  Intellectual property is
> to the 21st century what the slave trade was to the 16th.

Sorry, I don't know how I communicated that I was trying to have one class be a 
set of fields and also the enum for one of its own fields.

I'm really just wanting to have each member of the enum be an instance of a 
frozen dataclass. If any of the dataclass fields were of an enum type, then it 
would presumably not be for the same enum. In my example, none of the fields of 
the dataclass contains an enum. One contains a string, and the other contains 
an int.
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/KWL2FXQ2FKRMGBAB5PMR3GIRAQBC6CLR/

[Python-ideas] Re: Make dataclass aware that it might be used with Enum

2022-07-09 Thread Steve Jorgensen

I don't think that dataclasses have the limited set of intended uses that you 
are interpreting them as having. To me, the fact that they can be frozen makes 
them a good fit with Enum.
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/V6U7UMQRTLDZ2W6SWREL472L6ZH7MHB5/

[Python-ideas] Re: Make dataclass aware that it might be used with Enum

2022-07-09 Thread Steve Jorgensen

Ethan Furman wrote:
> On 7/7/22 09:01, Steve Jorgensen wrote:
> > Actually, maybe these are fundamentally incompatible?
> > Their intended use seems fundamentally incompatible:
> - dataclass was designed for making many mutable records (hundreds, 
> thousands, or more)
> - enum was designed to make a handful of named constants (I haven't yet seen 
> one with even a hundred elements)
> The repr from a combined dataclass/enum looks like a dataclass, giving no 
> clue that the object is an enum, and omitting 
> any information about which enum member it is and which enum it is from.
> Given these conflicts of interest, I don't see any dataclass examples making 
> it into the enum documentation.
> --
> ~Ethan~

Per my subsequent self-reply, they are only incompatible when trying to do them 
at the same time in the same class definition. It works great to combine them 
by defining the dataclass as a mixin for the Enum class. Why would it not be 
good to include that as an example in the official docs, assuming (as I 
believe) that it is a particularly useful combination?
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/VFGXT4QOWYF3UJVWYOR54GNTKEG2XT7D/

[Python-ideas] Re: Make dataclass aware that it might be used with Enum

2022-07-07 Thread Steve Jorgensen

After some playing around, I figured out a pattern that works without any 
changes to the implementations of `dataclass` or `Enum`, and I like this 
because it keeps the 2 kinds of concern separate. Maybe I'll try submitting an 
MR to add an example like this to the documentation for `Enum`.

In [1]: from dataclasses import dataclass

In [2]: from enum import Enum

In [3]: @dataclass(frozen=True)
   ...: class CreatureDataMixin:
   ...:     size: str
   ...:     legs: int
   ...: 

In [4]: class Creature(CreatureDataMixin, Enum):
   ...:     BEETLE = ('small', 6)
   ...:     DOG = ('medium', 4)
   ...: 

In [5]: Creature.DOG
Out[5]: Creature(size='medium', legs=4)
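
The same pattern as a plain script, as a sketch of what the members offer beyond the repr: the dataclass fields are ordinary attributes, the usual `Enum` attributes such as `name` still work, and the frozen mixin rejects assignment to the fields. The exact repr text can differ between Python versions, so this deliberately does not rely on it.

```python
from dataclasses import dataclass
from enum import Enum


@dataclass(frozen=True)
class CreatureDataMixin:
    size: str
    legs: int


class Creature(CreatureDataMixin, Enum):
    BEETLE = ('small', 6)
    DOG = ('medium', 4)


dog = Creature.DOG
dog_summary = (dog.name, dog.size, dog.legs)

# Members are immutable with respect to the dataclass fields.
try:
    dog.legs = 3
    assignment_rejected = False
except AttributeError:  # FrozenInstanceError subclasses AttributeError
    assignment_rejected = True

# Members can be filtered by their fields like any other objects.
six_legged = [c.name for c in Creature if c.legs == 6]
```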
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/G2VALQ4RIVFKIOKVW4XZAHZMLSZWL2XS/

[Python-ideas] Re: Make dataclass aware that it might be used with Enum

2022-07-07 Thread Steve Jorgensen

Steve Jorgensen wrote:
> Perhaps, this has already been addressed in a newer release (?) but in Python 
> 3.9, making `@dataclass` work with `Enum` is a bit awkward.
> Currently, it order to make it work, I have to:
> 1. Pass `init=False` to `@dataclass` and hand-write the `__init__` method
> 2. Pass `repr=False` to `@dataclass` and use `Enum`'s representation or write 
> a custom __repr__
> Example:
> In [72]: @dataclass(frozen=True, init=False, repr=False)
> ...: class Creature(Enum):
> ...:     legs: int
> ...:     size: str
> ...:     Beetle = (6, 'small')
> ...:     Dog = (4, 'medium')
> ...:     def __init__(self, legs, size):
> ...:         self.legs = legs
> ...:         self.size = size
> ...:
> In [73]: Creature.Dog
> Out[73]: 

Actually, maybe these are fundamentally incompatible? `@dataclass` is a 
decorator, so it acts on the class after it has already been defined, but 
`Enum` acts before that, when `@dataclass` cannot yet have generated the 
`__init__`. Right?
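
The ordering in question can be checked with a small stand-alone sketch. `RecordingMeta` and `recording_decorator` are invented names that play the roles of the `Enum` metaclass and `@dataclass` respectively; they only record when each one runs.

```python
events = []


class RecordingMeta(type):
    """Plays the role of Enum's metaclass: runs while the class is created."""

    def __new__(mcls, name, bases, namespace):
        events.append(f"metaclass builds {name}")
        return super().__new__(mcls, name, bases, namespace)


def recording_decorator(cls):
    """Plays the role of @dataclass: runs on the finished class object."""
    events.append(f"decorator receives {cls.__name__}")
    return cls


@recording_decorator
class Example(metaclass=RecordingMeta):
    pass
```

After this runs, `events` lists the metaclass step before the decorator step, which is why anything the metaclass does at class creation cannot use methods that the decorator adds later.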
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/T775WMOLR6TNOXDAU37ZA2FKQB3SMJT6/

[Python-ideas] Make dataclass aware that it might be used with Enum

2022-07-06 Thread Steve Jorgensen

Perhaps, this has already been addressed in a newer release (?) but in Python 
3.9, making `@dataclass` work with `Enum` is a bit awkward.

Currently, in order to make it work, I have to:
1. Pass `init=False` to `@dataclass` and hand-write the `__init__` method
2. Pass `repr=False` to `@dataclass` and use `Enum`'s representation or write a 
custom __repr__

Example:
In [72]: @dataclass(frozen=True, init=False, repr=False)
    ...: class Creature(Enum):
    ...:     legs: int
    ...:     size: str
    ...:     Beetle = (6, 'small')
    ...:     Dog = (4, 'medium')
    ...:     def __init__(self, legs, size):
    ...:         self.legs = legs
    ...:         self.size = size
    ...: 

In [73]: Creature.Dog
Out[73]: 
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/EXPSE4KQYM5SWPFCWH4QPOTS6UCP5FNL/

[Python-ideas] Re: dataclass field argument to allow converting value on init

2022-07-01 Thread Steve Jorgensen

Dexter Hill wrote:
> Steve Jorgensen wrote:
> > Would we want something more general that could deal with cases where the 
> > input does not have a 1-to-1 mapping to the field that differ only, 
> > perhaps, in type hint? What if we want 1 argument to initializes 2 
> > properties or vice verse, etc.?
> > That's definitely an improvement that could be made, although I think it 
> > would require a large amount of changes. I don't know if you had syntax in 
> > mind for it, or an easy way to represent it, but at least from what I 
> > understand you would probably a whole new function like `field`, but that 
> > handles just that functionality, otherwise it would add a lot of arguments 
> > to `field`.
> Steve Jorgensen wrote:
> > In any case, having a new `InitFn` is worth digging into, I don't think it 
> > needs to have 2 arguments for type since the type annotation already covers 
> > 1 of those cases. I think it makes the most sense for the type annotation 
> > to apply to the property and the type of the argument to be provided either 
> > through an optional argument to `InitFn` or maybe that can be derived from 
> > the signature of the function that `InitFn` refers to.
> > So the use case would be either this:
> ```py
> @dataclass
> class Foo:
>     x: InitFn[str] = field(converter=chr)
> ```
> where the field `x` has the type string, and the type for the `x` parameter 
> in `__init__` would be derrived from `chr`, or optionally:
> ```py
> @dataclass
> class Foo:
>     x: InitFn[str, int] = field(converter=chr)
> ```
> where you can provide a second type argument that specifies the type 
> parameter for `__init__`?

How about this variation?

Use `init_using` instead of `converter` as the name of the argument to 
field, allow either a callable or a method name to be supplied, and expect the 
custom init function to behave like `__post_init__` in that it assigns to 
properties rather than returning a converted value. That will allow it to 
initialize more than 1 property. Next, we can say that if the same callable 
object or the same method name is passed to `init_using`, then it is called 
only once. Finally, we say that the class' init argument(s) and their type 
hints are taken from the `init_using` target.

```
import mimetypes
from dataclasses import dataclass, field
from pathlib import Path

# Proposed syntax: `init_using` is not an existing argument of `field()`.
@dataclass
class DocumentFile:
    filename: str = field(init_using='_init_name_and_ctype')
    content_type: str = field(init_using='_init_name_and_ctype')
    description: str | None = field(default=None)

    # In this case, the function takes a `filename` argument which is the same
    # as one of the property names that it initializes, but it could take an
    # argument with a completely different name, and the class init would have
    # that as its argument instead.
    def _init_name_and_ctype(self, filename: str | Path = '/tmp/example.txt') -> None:
        self.filename = str(filename)
        # guess_type() returns a (type, encoding) tuple; keep only the type.
        self.content_type = mimetypes.guess_type(filename)[0]

# Roughly translates to

class DocumentFile:
    filename: str
    content_type: str
    description: str | None

    def __init__(self, filename: str | Path = '/tmp/example.txt', description: str | None = None):
        self.description = description
        self._init_name_and_ctype(filename)

    def _init_name_and_ctype(self, filename: str | Path = '/tmp/example.txt') -> None:
        self.filename = str(filename)
        self.content_type = mimetypes.guess_type(filename)[0]
```
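
For comparison, a sketch of how close existing dataclass features get to this, assuming `InitVar`, `field(init=False)` and `__post_init__` are acceptable. The `path` parameter name is made up for the example. The difference from the proposal is that the initialization logic has to live in `__post_init__`, which is what the earlier messages wanted to avoid when `__post_init__` is inherited from a base class.

```python
import mimetypes
from dataclasses import InitVar, dataclass, field
from pathlib import Path
from typing import Optional, Union


@dataclass
class DocumentFile:
    # Init-only argument: accepted by __init__ but not stored as a field.
    path: InitVar[Union[str, Path]] = "/tmp/example.txt"
    description: Optional[str] = None
    # Both of these are derived from the single `path` argument.
    filename: str = field(init=False)
    content_type: Optional[str] = field(init=False)

    def __post_init__(self, path):
        self.filename = str(path)
        # guess_type() returns (type, encoding); only the type is kept.
        self.content_type = mimetypes.guess_type(self.filename)[0]


doc = DocumentFile(Path("notes.txt"))
```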
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/CGOCLL2YRITOXJWQB55PHYUTYKF4BLSB/

[Python-ideas] Re: dataclass field argument to allow converting value on init

2022-06-29 Thread Steve Jorgensen

Paul Bryan wrote:
> Could the type hint for the __init__ parameter be inferred from the
> (proposed) init_fn's own parameter type hint itself?
> On Tue, 2022-06-28 at 16:39 +0000, Steve Jorgensen wrote:

I think I was already suggesting that possibility: "an optional argument to 
`InitFn` or maybe that can be derived from the signature of the function that 
`InitFn` refers to." Are we saying the same thing?
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/W6SYLYIQLORAJJCVXYPZFLV25XZG43DH/

[Python-ideas] Re: dataclass field argument to allow converting value on init

2022-06-28 Thread Steve Jorgensen

Dexter Hill wrote:
> Ah right I see what you mean. In my example I avoided the use of `__init__` 
> and specifically `__post_init__` as (and it's probably a fairly uncommon use 
> case), in my actual project, `__post_init__` is defined on a base class, and 
> inherited by all other classes, and I wanted to avoid overriding 
> `__post_init__` (and super-ing). The idea was to have the conversion 
> generated by the dataclass, within the `__init__`, so that no function would 
> need to be defined (similarly to how converters work in attrs).
> With your suggestion, what do you think about having something similar to 
> `InitVar` so it's more in line with how `__post_init__` currently works? For 
> example, like one of my other suggestions, having a type called `InitFn` 
> which takes two types: the type for `__init__` and the type of the actual 
> field.

Now I see why you wanted to avoid using `__post_init__`. I had been thinking of 
trying to use `__post_init__` instead of adding more ways to initialize, but 
your reasoning makes a lot of sense.

Would we want something more general that could deal with cases where the input 
and the field do not have a 1-to-1 mapping that differs only, perhaps, in type 
hint? What if we want 1 argument to initialize 2 properties or vice versa, 
etc.?

In any case, having a new `InitFn` is worth digging into. I don't think it 
needs to have 2 arguments for type since the type annotation already covers 1 
of those cases. I think it makes the most sense for the type annotation to 
apply to the property and the type of the argument to be provided either 
through an optional argument to `InitFn` or maybe that can be derived from the 
signature of the function that `InitFn` refers to.
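
To illustrate the last point, here is a minimal sketch of how an argument type could be derived from the signature of a conversion function. The helper name `init_arg_hint` and the sample function `code_point_to_char` are hypothetical, made up for this example; only `inspect.signature` and `typing.get_type_hints` are existing standard-library calls.

```python
import inspect
from typing import Any, Callable, get_type_hints


def init_arg_hint(fn: Callable[..., Any]) -> Any:
    """Return the type hint of `fn`'s first parameter, or Any if it has none."""
    first = next(iter(inspect.signature(fn).parameters), None)
    if first is None:
        return Any
    return get_type_hints(fn).get(first, Any)


def code_point_to_char(code: int) -> str:
    return chr(code)


print(init_arg_hint(code_point_to_char))
```

A function without annotations falls back to `Any` in this sketch, so an explicit, optional argument type would still be needed for such cases.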
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/BCN2BUZSM6KH5VSTKHYWI3CB5UVDDNUH/

[Python-ideas] Re: dataclass field argument to allow converting value on init

2022-06-25 Thread Steve Jorgensen

Dexter Hill wrote:
> Do you mind providing a little example of what you mean? I'm not sure I 100% 
> understand what your use of `__post_init__` is. In my mind, it would be 
> something like:
> ```py
> @dataclass
> class Foo:
>     x: str = field(init=int, converter=chr)
> # which converts to
> class Foo:
>     def __init__(self, x: int):
>         self.x = chr(x)
> ```
> without any use of `__post_init__`. If it were to be something like:
> ```py
> class Foo:
>     def __init__(self, x: int):
>         self.__post_init__(x)
>     def __post_init__(self, x: int):
>         self.x = chr(x)
> ```
> which, I think is what you are suggesting (please correct me if I'm wrong), 
> then I feel that may be confusing if you were to override `__post_init__`, 
> which is often much easier than overriding `__init__`.
> For example, in a situation like:
> ```py
> @dataclass
> class Foo:
>     x: str = field(init=int, converter=chr)
>     y: InitVar[str]
> ```
> if the user were to override `__post_init__`, would they know that they need 
> to include `x` as the first argument? It's not typed with `InitVar` so it 
> might not be clear that it's passed to `__post_init__`.
That's close to what I mean. I'm actually suggesting to not have `converter`, 
though, and instead use an explicit `__post_init__` for that, so
```py
@dataclass
class Foo:
    x: str = field(init=int)

    def __post_init__(self, x: int):
        self.x = chr(x)

# converts to
class Foo:
    def __init__(self, x: int):
        self.__post_init__(x)

    def __post_init__(self, x: int):
        self.x = chr(x)
```
Writing that out is helpful because now I see that the argument type can 
possibly be taken from the `__post_init__` signature, meaning there is no need 
to use the type as the value for the `init` argument to `field`. In that case, 
instead of `init=int`, it could maybe be something like `post_init=True`.
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/LI7ZSAZ6VGQV4OEP7ZOXIWIKA4VLMWXJ/

[Python-ideas] Re: dataclass field argument to allow converting value on init

2022-06-25 Thread Steve Jorgensen

Dexter Hill wrote:
> I don't mind that solution although my concern is whether it would be 
> confusing to have `init` have two different purposes depending on the 
> argument. And, if `__post_init__` were overridden, which I would say it 
> commonly is, that would mean the user would have to manually do the 
> conversion, as well as remembering to add an extra argument for the 
> conversion function (assuming I'm understanding what you're saying).
> If no type was provided to `init` but a conversion function was, it would be 
> a case of getting the type from the function signature, right?

The reason I am saying to use the 'init' argument is that it seems to me to be 
a variation on what that argument already does. It controls whether the 
argument is passed to the generated `__init__` method. Passing a type as the 
value for 'init' would now behave like sort of a cross between `init=False` and 
`InitVar`. The field would still be created (unlike `InitVar`) but would not be 
automatically assigned the value passed as its corresponding argument, leaving 
that responsibility to `__post_init__`. Like with `InitVar`, the argument would 
be passed to `__post_init__` since it was not processed by `__init__`.

The type annotation would continue to specify the type of the field, and the 
type passed to the 'init' argument would specify the type of its constructor 
argument.
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/4DUTNRIRLJKOY3CDRGIU6TZ4NV2RWP5Q/

[Python-ideas] Re: dataclass field argument to allow converting value on init

2022-06-24 Thread Steve Jorgensen

Steve Jorgensen wrote:
> Simão Afonso wrote:
> > On 2022-06-23 17:35:59, Steve Jorgensen wrote:
> > > What if, instead, the `init` parameter could accept either a boolean
> > > (as it does now) or a type? When given a type, that would mean to
> > > create the property and accept the argument but pass the argument to
> > > `__post_init__` rather than using it to initialize the property
> > > directly. The type passed to `init` would become the type hint for the
> > > argument.
> > What if you wanted to create a boolean type from a function?
> Then you would pass `type=bool`
Oops. That was another typo. You would pass `init=bool`.
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/YAAMMAP4YWLJ5YWZG6DLFLVTBX73MFGR/

[Python-ideas] Re: dataclass field argument to allow converting value on init

2022-06-24 Thread Steve Jorgensen

Simão Afonso wrote:
> On 2022-06-23 17:35:59, Steve Jorgensen wrote:
> > What if, instead, the `init` parameter could accept either a boolean
> > (as it does now) or a type? When given a type, that would mean to
> > create the property and accept the argument but pass the argument to
> > `__post_init__` rather than using it to initialize the property
> > directly. The type passed to `init` would become the type hint for the
> > argument.
> What if you wanted to create a boolean type from a function?
Then you would pass `type=bool`
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/C5RAIKFHB3KCXJGGGWYWZAGNQ7OJ3AUS/

[Python-ideas] Re: Generalized deferred computation in Python

2022-06-24 Thread Steve Jorgensen

Steve Jorgensen wrote:
> I think I have an idea how to do something like what you're asking with less 
> magic, and I think an example implementation of this could actually be done 
> in pure Python code (though a more performant implementation would need 
> support at the C level).
> What if a deferred object has 1 magic method ( __isdeferred__ ) that is 
> invoked directly rather than causing a thunk, and invocation of any other 
> method does cause a thunk. For the example implementation, a thunk would 
> simply mean that the value is computed and stored within the instance, and 
> method calls on the wrapper are now delegated to that. In the proper 
> implementation, the object would change its identity to become its computed 
> result.

I haven't had any replies to this, but I think it warrants some attention, so 
I'll try to clarify what I'm suggesting.

Basically, have a deferred object be a wrapper around any kind of callable, and 
give the wrapper a single method `__isdeferred__` that does not trigger 
unwrapping. Any other method call or anything else that depends on knowing the 
actual object results in the callable being executed and the wrapper object 
being replaced by that result. From then on, it is no longer deferred.

I like this idea because it is very easy to reason about and fairly flexible. 
Whether the deferred object is a closure or not depends entirely on its 
callable. When it gets unwrapped is easy to understand (basically anything 
other than assignment, passing as an argument, or asking whether it is 
deferred).

What this does NOT help much with is use as argument defaults. Personally, I 
think that's OK. I think that there are good arguments (separately) for dynamic 
argument defaults and deferred objects and that trying to come up with 1 
concept that covers both of those is not necessarily a good idea. It's not a 
good idea if we can't come up with a way to do it that IS easy to reason about, 
anyway.
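
A rough pure-Python sketch of the example implementation described above follows. The class name `Deferred` and the helper `_force` are my own illustrative choices. In this version the wrapper only delegates to the computed value rather than changing identity, and implicit special-method lookups such as `len(d)` or `d + 1` are not covered, since those bypass `__getattr__`.

```python
from typing import Any, Callable


class Deferred:
    """Wrap a zero-argument callable and run it on first real use."""

    def __init__(self, func: Callable[[], Any]) -> None:
        self._func = func
        self._computed = False
        self._value = None

    def __isdeferred__(self) -> bool:
        # Asking whether the object is still deferred never forces it.
        return not self._computed

    def _force(self) -> Any:
        if not self._computed:
            self._value = self._func()
            self._computed = True
        return self._value

    def __getattr__(self, name: str) -> Any:
        # Only reached for names not found on the wrapper itself, i.e. any
        # attribute or method of the eventual result.
        return getattr(self._force(), name)


calls = []

def expensive() -> str:
    calls.append("ran")
    return "hello world"

d = Deferred(expensive)
print(d.__isdeferred__(), calls)   # nothing has run yet
print(d.upper())                   # first real use runs the callable
print(d.__isdeferred__(), calls)
```

Assignment and passing `d` as an argument leave it deferred; the first attribute or method access on the result is what triggers the computation, and it happens only once.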
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/OWDM7AUSYECALBQ2JVNQL3H2GH2NFSYV/

[Python-ideas] Re: dataclass field argument to allow converting value on init

2022-06-23 Thread Steve Jorgensen

Sorry for typos.
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/2RMKRA2GEQ3HADXG4TXYTCLRUX2CR5QG/

[Python-ideas] Re: dataclass field argument to allow converting value on init

2022-06-23 Thread Steve Jorgensen

Dexter Hill wrote:
> The idea is to have a `default_factory` like argument (either in the `field` 
> function, or a new function entirely) that takes a function as an argument, 
> and that function, with the value provided by `__init__`, is called and the 
> return value is used as the value for the respective field. For example:
> ```py
> @dataclass
> class Foo:
>     x: str = field(init_fn=chr)
> f = Foo(65)
> f.x # "A"
> ```
> The `chr` function is called, given the value `65` and `x` is set to its 
> return value of `"A"`. I understand that there is both `__init__` and 
> `__post_init__` which can be used for this purpose, but sometimes it isn't 
> ideal to override them. If you overrode `__init__`, and were using 
> `__post_init__`, you would need to manually call it, and in my case, 
> `__post_init__` is implemented on a base class, which all other classes 
> inherit, and so overloading it would require re-implementing the logic from 
> it (and that's ignoring the fact that you also need to type the field with 
> `InitVar` to even have it passed to `__post_init__` in the first place).
> I've created a proof of concept, shown below:
> ```py
> def initfn(fn, default=None):
>     class Inner:
>         def __set_name__(_, owner_cls, owner_name):
>             old_setattr = getattr(owner_cls, "__setattr__")
>             def __setattr__(self, attr_name, value):
>                 if attr_name == owner_name:
>                     # Bypass `__setattr__`
>                     self.__dict__[attr_name] = fac(value)
>                 else:
>                     old_setattr(self, attr_name, value)
>             setattr(owner_cls, "__setattr__", __setattr__)
>     def fac(value):
>         if isinstance(value, Inner):
>             return default
>         return fn(value)
>     return field(default=Inner())
> ```
> It makes use of the fact that providing `default` as an argument to `field` 
> means it checks the value for a `__set_name__` function, and calls it with 
> the class and field name as arguments. Overriding `__setattr__` is just used 
> to catch when a value is being assigned to a field, and if that field's name 
> matches the name given to `__set_name__`, it calls the function on the value, 
> and sets the field to that instead.
> It can be used like so:
> ```py
> @dataclass
> class Foo:
>     x: str = initfn(fn=chr, default="Z")
> f = Foo(65)
> f2 = Foo()
> f.x # "A"
> f2.x # "Z"
> ```
> It adds a little overhead, especially with having to override `__setattr__` 
> however, I believe it would have very little overhead if directly implemented 
> in the dataclass library.
> Even in the case of being able to override one of the init functions, I still 
> think it would be nice to have as a quality of life feature as I feel calling 
> a function is too simple to want to override the functions, if that makes 
> sense.
> Thanks.
> Dexter

What if, instead, the `init` parameter could accept either a boolean (as it 
does now) or a type? When given a type, that would mean to create the 
property and accept the argument but pass the argument to `__post_init__` 
rather than using it to initialize the property directly. The type passed to 
`init` would become the type hint for the argument.
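
For comparison with the proof of concept quoted above, a conversion on init can also be sketched today with a descriptor-typed field, which does not require replacing `__setattr__`. The `Converted` class is a hypothetical helper written for this example, and it assumes that the default is given as a raw value (here the code point 90) that is converted in the same way as an explicit argument.

```python
from dataclasses import dataclass


class Converted:
    """Descriptor that passes every assigned value through `fn`."""

    def __init__(self, fn, *, default):
        self._fn = fn
        self._default = default  # raw (unconverted) default

    def __set_name__(self, owner, name):
        self._name = "_" + name

    def __get__(self, obj, objtype=None):
        if obj is None:
            # Class access: dataclasses uses this to discover the default,
            # which __init__ then assigns, so it goes through __set__ too.
            return self._default
        return getattr(obj, self._name)

    def __set__(self, obj, value):
        setattr(obj, self._name, self._fn(value))


@dataclass
class Foo:
    x: str = Converted(chr, default=90)


print(Foo(65).x)  # A
print(Foo().x)    # Z
```

Unlike the `init=<type>` idea, this converts on every assignment to `x`, not only during `__init__`, and the annotation describes the stored value rather than the argument.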
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/YERVGXA5QJUHOQW357GVN7JERB2AJT6P/

[Python-ideas] Re: Ellipsis (...) to be roughly synonymous with * in destructuring but without capture.

2022-06-22 Thread Steve Jorgensen

> No need to have an object there - you could just define it as a syntactic 
> construct instead. Assignment targets aren't themselves objects (although the 
> same syntax can often be used on the RHS, when it would resolve to one).

Right. Thanks. That _should_ have been obvious. :)

> Having a way to say "allow additional elements without iterating over them" 
> would be useful, but creating a new way to spell the non-assignment wouldn't 
> be of sufficiently great value to justify the syntax IMO.

I mostly agree. I included that option for completeness. It would still have 
the benefit of avoiding the memory usage of creating a list and keeping 
references to the items until the list itself can be collected.

Come to think of it, can (or could) Python already optimize that using current 
syntax, noticing that the variable assigned to is never used after it is 
"assigned" to? If that optimization were implemented (I presume it is not 
implemented now) then there is actually no point to this proposal at all except 
to allow "..." in final positions in the expression to the left of "=" and to 
have that mean to not iterate.
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/AHUOIVOS4GXHAI3AT7O5M2MI4BJJER24/

[Python-ideas] Re: Ellipsis (...) to be roughly synonymous with * in destructuring but without capture.

2022-06-22 Thread Steve Jorgensen

Steve Jorgensen wrote:
> This is based on previous discussions of possible ways of matching all 
> remaining items during destructuring but without iterating over the remaining final 
> items. This is not exactly a direct replacement for that idea though, and 
> skipping iteration of final items might or might not be part of the goal.
> In this proposal, the ellipsis (...) can be used in the expression on the 
> left side of the equals sign in destructuring anywhere that `*` can 
> appear and has approximately the same meaning. The difference is that when 
> the ellipsis is used, the matched items are not stored in variables. This can 
> be useful when the matched data might be very large.
> ..., last_one = 
> a, ..., z = 
> first_one, ... = 
> Additionally, when the ellipsis comes last and the data is being retrieved by 
> iterating, stop retrieving items since that might be expensive and we know 
> that we will not use them.
> Alternative A:
> Still iterate over items when the ellipsis comes last (for side effects) but 
> introduce a new `final_ellipsis` object that is used to stop iteration. The 
> negation of `ellipsis` (e.g. `-...`) could return `final_ellipsis` in that 
> case.
> Alternative B:
> Still iterate over items when the ellipsis comes last (for side effects) and 
> don't provide any new means of skipping iteration over final items. The 
> programmer can use islice to achieve that.

Correction: "are not stored in variables" should say "are not stored in a 
variable"
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/CCQYEZH465W4ARBMBIUWK6YN4J5HNA5B/

[Python-ideas] Ellipsis (...) to be roughly synonymous with * in destructuring but without capture.

2022-06-22 Thread Steve Jorgensen

This is based on previous discussions of possible ways of matching all
remaining items during destructuring but without iterating over the remaining final
items. This is not exactly a direct replacement for that idea though, and
skipping iteration of final items might or might not be part of the goal.

In this proposal, the ellipsis (...) can be used in the expression on the left
side of the equals sign in destructuring anywhere that `*` can appear
and has approximately the same meaning. The difference is that when the
ellipsis is used, the matched items are not stored in variables. This can be
useful when the matched data might be very large.

..., last_one =
a, ..., z =
first_one, ... =

Additionally, when the ellipsis comes last and the data is being retrieved by
iterating, stop retrieving items since that might be expensive and we know that
we will not use them.

Alternative A:

Still iterate over items when the ellipsis comes last (for side effects) but
introduce a new `final_ellipsis` object that is used to stop iteration. The
negation of `ellipsis` (e.g. `-...`) could return `final_ellipsis` in that case.

Alternative B:

Still iterate over items when the ellipsis comes last (for side effects) and
don't provide any new means of skipping iteration over final items. The
programmer can use islice to achieve that.
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/QPMFXOOHKQJ6YFM35SJXZMANBQTRZ3FY/

[Python-ideas] Re: Generalized deferred computation in Python

2022-06-22 Thread Steve Jorgensen

I think I have an idea how to do something like what you're asking with less 
magic, and I think an example implementation of this could actually be done in 
pure Python code (though a more performant implementation would need support at 
the C level).

What if a deferred object has 1 magic method ( __isdeferred__ ) that is invoked 
directly rather than causing a thunk, and invocation of any other method does 
cause a thunk. For the example implementation, a thunk would simply mean that 
the value is computed and stored within the instance, and method calls on the 
wrapper are now delegated to that. In the proper implementation, the object 
would change its identity to become its computed result.
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/RNZXM55GFZ5DHOHP6QZZ744HUVNDB2BV/

[Python-ideas] Re: Add a line_offsets() method to str

2022-06-20 Thread Steve Jorgensen

Steve Jorgensen wrote:
> Jonathan Slenders wrote:
> > Hi everyone,
> > Today was the 3rd time I came across a situation where it was needed to
> > retrieve all the positions of the line endings (or beginnings) in a very
> > long python string as efficiently as possible. First time, it was needed in
> > prompt_toolkit, where I spent a crazy amount of time looking for the most
> > performant solution. Second time was in a commercial project where
> > performance was very critical too. Third time is for the Rich/Textual
> > project from Will McGugan. (See:
> > https://twitter.com/willmcgugan/status/1537782771137011715 )
> > The problem is that the `str` type doesn't expose any API to efficiently
> > find all \n positions. Every Python implementation is either calling
> > `.index()` in a loop and collecting the results or running a regex over the
> > string and collecting all positions.
> > For long strings, depending on the implementation, this results in a lot of
> > overhead due to either:
> > - calling Python functions (or any other Python instruction) for every \n
> > character in the input. The amount of executed Python instructions is O(n)
> > here.
> > - Copying string data into new strings.
> > The fastest solution I've been using for some time, does this (simplified):
> > `accumulate(chain([0], map(len, text.splitlines(True))))`. The performance
> > is great here, because the amount of Python instructions is O(1).
> > Everything is chained in C-code thanks to itertools. Because of that, it
> > can outperform the regex solution with a factor of ~2.5. (Regex isn't slow,
> > but iterating over the results is.)
> > The bad things about this solution are, however:
> > - Very cumbersome syntax.
> > - We call `splitlines()` which internally allocates a huge amount of
> > strings, only to use their lengths. That is still much more overhead than a
> > simple for-loop in C would be.
> > Performance matters here, because for these kinds of problems, the list of
> > integers that gets produced is typically used as an index to quickly find
> > character offsets in the original string, depending on which line is
> > displayed/processed. The bisect library helps too to quickly convert any
> > index position of that string into a line number. The point is, that for
> > big inputs, the amount of Python instructions executed is not O(n), but
> > O(1). Of course, some of the C code remains O(n).
> > So, my ask here.
> > Would it make sense to add a `line_offsets()` method to `str`?
> > Or even `character_offsets(character)` if we want to do that for any
> > character?
> > Or `indexes(...)/indices(...)` if we would allow substrings of arbitrary
> > lengths?
> > Thanks,
> > Jonathan
> I presume there is some reason that `re.findall` did not work or was not 
> optimal?

I just saw your reply elsewhere in the conversation that says

> That requires a more complex regex pattern. I was actually using:
> re.compile(r"\n|\r(?!\n)")
> And then the regex becomes significantly slower than the splitlines() 
> solution, which is still much slower than it has to be.
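
For reference, the two approaches being compared can be sketched side by side as follows. The sample `text` is made up for illustration; the `accumulate` expression and the regex pattern are the ones quoted in this thread, and `bisect_right` shows the offset-to-line lookup mentioned in the original message.

```python
import re
from bisect import bisect_right
from itertools import accumulate, chain

text = "ab\ncd\r\nef\rgh"

# Approach from the thread: cumulative lengths of the lines (endings kept).
# The last entry equals len(text).
offsets = list(accumulate(chain([0], map(len, text.splitlines(True)))))

# Regex approach with the pattern quoted above; note that it recognizes only
# \n, \r\n and \r, whereas splitlines() also splits on other boundaries.
line_end = re.compile(r"\n|\r(?!\n)")
starts = [0] + [m.end() for m in line_end.finditer(text)]

print(offsets)  # [0, 3, 7, 10, 12]
print(starts)   # [0, 3, 7, 10]

# Map a character offset back to a 0-based line number.
print(bisect_right(starts, text.index("f")) - 1)  # 2
```

The regex version runs a Python-level loop over the matches, which is where the extra cost described above comes from.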
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/JGY2YNOCKZ2KS7BMQMNCEY3YHIRJC3UL/

[Python-ideas] Re: Add a line_offsets() method to str

2022-06-20 Thread Steve Jorgensen

Jonathan Slenders wrote:
> Hi everyone,
> Today was the 3rd time I came across a situation where it was needed to
> retrieve all the positions of the line endings (or beginnings) in a very
> long python string as efficiently as possible. First time, it was needed in
> prompt_toolkit, where I spent a crazy amount of time looking for the most
> performant solution. Second time was in a commercial project where
> performance was very critical too. Third time is for the Rich/Textual
> project from Will McGugan. (See:
> https://twitter.com/willmcgugan/status/1537782771137011715 )
> The problem is that the `str` type doesn't expose any API to efficiently
> find all \n positions. Every Python implementation is either calling
> `.index()` in a loop and collecting the results or running a regex over the
> string and collecting all positions.
> For long strings, depending on the implementation, this results in a lot of
> overhead due to either:
> - calling Python functions (or any other Python instruction) for every \n
> character in the input. The amount of executed Python instructions is O(n)
> here.
> - Copying string data into new strings.
> The fastest solution I've been using for some time, does this (simplified):
> `accumulate(chain([0], map(len, text.splitlines(True))))`. The performance
> is great here, because the amount of Python instructions is O(1).
> Everything is chained in C-code thanks to itertools. Because of that, it
> can outperform the regex solution with a factor of ~2.5. (Regex isn't slow,
> but iterating over the results is.)
> The bad things about this solution are, however:
> - Very cumbersome syntax.
> - We call `splitlines()` which internally allocates a huge amount of
> strings, only to use their lengths. That is still much more overhead than a
> simple for-loop in C would be.
> Performance matters here, because for these kinds of problems, the list of
> integers that gets produced is typically used as an index to quickly find
> character offsets in the original string, depending on which line is
> displayed/processed. The bisect library helps too to quickly convert any
> index position of that string into a line number. The point is, that for
> big inputs, the amount of Python instructions executed is not O(n), but
> O(1). Of course, some of the C code remains O(n).
> So, my ask here.
> Would it make sense to add a `line_offsets()` method to `str`?
> Or even `character_offsets(character)` if we want to do that for any
> character?
> Or `indexes(...)/indices(...)` if we would allow substrings of arbitrary
> lengths?
> Thanks,
> Jonathan

I presume there is some reason that `re.findall` did not work or was not 
optimal?
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/PO3V3XXHZL7CF4YCD635AF57OYG2RORC/

[Python-ideas] Re: Bare wildcard in de-structuring to ignore remainder and stop iterating (restart)

2022-06-20 Thread Steve Jorgensen

Steven D'Aprano wrote:
> Okay, I'm convinced.
> If we need this feature (and I'm not convinced about that part), then it 
> makes sense to keep the star and write it as `spam, eggs, *... = items`.

I thought about that, but to me, there are several reasons to not do that and 
to have the ellipsis mean multiple rather than prepending * for that:
1. In common usage outside of programming, the ellipsis means a continuation 
and not just a single additional thing.
2. Having `*...` mean any number of things implies that `...` means a single 
thing, and I don't think there is a reason to match 1 thing but not assign it 
to a variable. It is also already fine to repeat `_` in the left side 
expression.
3. I am guessing (though I could be wrong) that support for `*...` would be a 
bigger change and more complicated in the Python source code.
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/2YGDMCGY5NBMIO57F6M7K3HP6HRYKTWZ/

[Python-ideas] Re: Bare wildcard in de-structuring to ignore remainder and stop iterating (restart)

2022-06-17 Thread Steve Jorgensen

Also in reply to Paul & Stephen, …

Yes. I really like the idea of using the ellipsis in the expression on the 
left. It avoids any breaking changes, avoids adding new semantics to '*', and 
also reads quite well.
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/GAMCAQDMLKDDNMRITIJHWZEHKCRMZ5DE/

[Python-ideas] Re: Bare wildcard in de-structuring to ignore remainder and stop iterating (restart)

2022-06-17 Thread Steve Jorgensen

Steve Jorgensen wrote:
> Restarting this with an improved title "Bare" vs "Raw", and I will try not to 
> digress so much in the new thread.
> My suggestion is to allow a bare asterisk at the end of a destructuring 
> expression to indicate that additional elements are to be ignored if present 
> and not iterated over if the rhs is being evaluated by iterating.
> (first, second, *) = items
> This provides a way of using destructuring from something that will be 
> processed by iterating and for which the number of items might be very large 
> and/or accessing of successive items is expensive.
> As Paul Moore pointed out in the original thread, itertools.islice can be 
> used to limit the number of items iterated over. That's a nice solution, but 
> it required knowing or thinking of the solution, an additional import, and 
> repetition of the count of items to be destructured at the outermost nesting 
> level on the lhs.
> What are people's impressions of this idea? Is it valuable enough to pursue 
> writing a PEP?
> If so, then what should I do in writing the PEP to make sure that it's 
> somewhat close to something that can potentially be accepted? Perhaps, there 
> is a guide for doing that?

First, thanks very much for the thoughtful and helpful replies so far.

Since my last message here, I have noticed a couple of issues with the 
suggestion.

1. In a function declaration, the bare "*" specifically expects to match 
nothing, and in this case, I am suggesting that it have no expectation. That's 
a bit of a cognitive dissonance.

2. The new structural pattern matching that was introduced in Python 3.10 
introduces a very similar concept by using an underscore as a wildcard that 
matches and doesn't bind to anything.

That leads me to want to change the proposal to say that we give the same 
meaning to "_" in ordinary destructuring that it has in structural pattern 
matching, and then, I believe that a final "*_" in the expression on the left 
would end up with exactly the same meaning that I originally proposed for the 
bare "*".

Although that would be a breaking change, it is already conventional to use "_" 
as a variable name only when we specifically don't care what it contains 
following its assignment, so for any code to be affected by the change would be 
highly unusual.
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/2DJXQ22GN3ABWGT2VUTGIXEUMMA6XOLO/

[Python-ideas] Bare wildcard in de-structuring to ignore remainder and stop iterating (restart)

2022-06-16 Thread Steve Jorgensen

Restarting this with an improved title "Bare" vs "Raw", and I will try not to 
digress so much in the new thread.

My suggestion is to allow a bare asterisk at the end of a destructuring 
expression to indicate that additional elements are to be ignored if present 
and not iterated over if the rhs is being evaluated by iterating.

(first, second, *) = items

This provides a way of using destructuring from something that will be 
processed by iterating and for which the number of items might be very large 
and/or accessing of successive items is expensive.

As Paul Moore pointed out in the original thread, itertools.islice can be used 
to limit the number of items iterated over. That's a nice solution, but it 
required knowing or thinking of the solution, an additional import, and 
repetition of the count of items to be destructured at the outermost nesting 
level on the lhs.
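
As a concrete picture of the `itertools.islice` approach described above, here is a small sketch. The endless `count(1)` iterator and the variable names are assumptions chosen only for illustration.

```python
from itertools import count, islice

items = count(1)  # an endless iterator: 1, 2, 3, ...

# The count (2) has to be repeated to match the number of targets on the left.
first, second = islice(items, 2)

# Only two items were consumed, so the next one is still available.
third = next(items)
```

The repeated `2` is the redundancy in question: it must be kept in sync with the number of names on the left by hand.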

What are people's impressions of this idea? Is it valuable enough to pursue 
writing a PEP?

If so, then what should I do in writing the PEP to make sure that it's somewhat 
close to something that can potentially be accepted? Perhaps, there is a guide 
for doing that?
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/4DN7T3NZEAUPJBA2SNJ4YWM564QPVE5N/

[Python-ideas] Re: PEP 671 (late-bound arg defaults), next round of discussion!

2022-06-16 Thread Steve Jorgensen

Is there anything that I can do, as a random Python user, to help move this to 
the next stage? I'm happy to go along with whatever the preponderance of 
responses here seem to think in terms of which syntax choice is best. Although 
I have a slight preference, all of the options seem decent to me.

I am definitely in favor of having the PEP accepted and implemented.
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/5572SP7T2GR5PYIVTYN5VESHV5XJ2JA5/

[Python-ideas] Re: PEP 671 (late-bound arg defaults), next round of discussion!

2022-06-13 Thread Steve Jorgensen

To clarify my statement about readability of the '@' prefix option…

I think that its meaning is less clear if one doesn't already know what the 
syntax means. I think the code would be easier to skim, however, using that 
option after one does know its meaning.

My favorite options are '@' or '?=' (tied), followed by ':=' followed by '=>'.
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/TDPKOPGWQ4ORRJDHWJMX5GMW2TQ5FI5B/

[Python-ideas] Re: PEP 671 (late-bound arg defaults), next round of discussion!

2022-06-13 Thread Steve Jorgensen

Ah and since previous parameters can be referenced, and `self` or `cls` is the 
first argument to any method, that is always available to default value 
expressions. Correct?
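
Since PEP 671 is a proposal rather than an implemented feature, the following is only a sketch of how a default that depends on `self` is usually spelled today: a `None` sentinel, with the real default computed inside the body. `Buffer`, `default_size`, and `read` are hypothetical names.

```python
class Buffer:
    def __init__(self, default_size=4096):
        self.default_size = default_size

    def read(self, size=None):
        # Stand-in for a late-bound default that refers to `self`:
        # the default is computed at call time, not at definition time.
        if size is None:
            size = self.default_size
        return size
```

With late-bound defaults as discussed in this thread, the `if size is None` step would presumably move into the signature, but the exact spelling depends on the syntax that is eventually chosen.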
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/ZB3FSVZH2JVRI6LAMK7WCUSITC4RYBUO/

[Python-ideas] Re: PEP 671 (late-bound arg defaults), next round of discussion!

2022-06-13 Thread Steve Jorgensen

One thing was not clear to me from the current PEP 671 text.

When that is used in a method, what is the closure for the expressions? 
Would/should assignments in the class definition be available or only global 
variables in the module and local variables in the function (if applicable) in 
which the class definition happens?
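
For comparison, here is how today's early-bound defaults interact with class scope. An ordinary default expression is evaluated once, while the class body is executing, so it can see names assigned earlier in the class body, whereas code inside a method body cannot see those names unqualified. Which of these two behaviors late-bound defaults would follow is the open question; `Example`, `limit`, and the method names are made up for this sketch.

```python
class Example:
    limit = 10

    # Early-bound default: evaluated in the class body, so `limit` is visible.
    def early(self, n=limit):
        return n

    # Inside the function body, the class attribute needs qualification.
    def qualified(self):
        return self.limit

    # An unqualified reference is looked up as a global and fails here.
    def bare(self):
        return limit
```

Calling `Example().bare()` raises `NameError` as long as no global named `limit` exists.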
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/XQD56GF3W2L223HSSBOVMIWTKF2AERH6/

[Python-ideas] Re: PEP 671 (late-bound arg defaults), next round of discussion!

2022-06-13 Thread Steve Jorgensen

I couldn't figure out the best place in the reply tree to post this, so
replying to the OP, answering the questions, taking into account other
discussion that has happened.

> 1) If this feature existed in Python 3.11 exactly as described, would
> you use it?

Definitely

> 2) Independently: Is the syntactic distinction between "=" and "=>" a
> cognitive burden?

No, but I feel there is some cognitive burden with the distinction between that
and other arrow notations that we have now and will likely have later.

> 4) If "no" to question 1, is there some other spelling or other small
> change that WOULD mean you would use it? (Some examples in the PEP.)

Technically this is not applicable since I would use it anyway, but…

I would slightly prefer any one of the alternative syntaxes. At first, I was
not liking the '@' prefix idea because the '@' is separated from the default
expression that it is conceptually associated with. That option does have a
strong redeeming aspect though, which is that I think it might be the easiest
to read.

> 5) Do you know how to compile CPython from source, and would you be
> willing to try this out? Please? :)

Sure. I don't think I need to try it to know that I would appreciate it though,
unless I were to find that it is buggy or something.
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/Y6VHQZI5FDR25WUBFDF2NRRRPVSTT7RL/

[Python-ideas] Re: Null wildcard in de-structuring to ignore remainder and stop iterating

2022-06-12 Thread Steve Jorgensen

I had actually not thought about the question of what should happen when 
performing multiple index operations on the same iterator, and maybe that's a 
reason that the idea of adding index lookup using brackets is not as good as it 
first seems.

The whole point of adding that would be to reduce the number of situations in 
which it matters whether you have a sequence or an iterator. As soon as we 
consider what should happen for multiple index lookups on a single iterator, 
that concept breaks down.
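
A small sketch of why the concept breaks down: `item_at` below is a hypothetical stand-in for what `iterator[index]` might do, built on `itertools.islice`. On a sequence, repeated lookups are independent; on an iterator, each lookup consumes items, so the same index gives different answers.

```python
from itertools import islice

def item_at(iterable, index):
    """Hypothetical stand-in for `iterable[index]` on a non-sequence."""
    return next(islice(iterable, index, None))

seq = [10, 20, 30, 40, 50, 60]
it = iter(seq)

# On a sequence, repeated lookups are independent of each other.
seq_results = (seq[2], seq[2])

# On an iterator, each lookup consumes items, so the answers drift.
it_results = (item_at(it, 2), item_at(it, 2))
```

Here `seq_results` is `(30, 30)` while `it_results` is `(30, 60)`.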

The next thing that makes me think of, which is even farther afield from the 
initial topic of this thread, would be to have some new function in the standard 
library that is similar to 'islice' but returns a list (or a slice of the 
original sequence) instead of a new iterator and performs optimally when given 
a list or tuple as an argument. Maybe it could be named something like 
'gslice', short for "greedy slice".

Hypothetical simplistic implementation:

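import collections.abc
from itertools import islice

def gslice(source, start_or_stop=None, stop=None, step=None):
    # As written, a lone start_or_stop acts as the start index in both branches.
    if isinstance(source, collections.abc.Sequence):
        return source[slice(start_or_stop, stop, step)]
    elif isinstance(source, collections.abc.Iterable):
        # islice needs the source iterable as its first argument.
        return list(islice(source, start_or_stop, stop, step))
    else:
        raise TypeError("'source' must be a sequence or iterable")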
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/RRSJ65RYDRJ2X4K235M4M4AYJSTQAINB/

[Python-ideas] Re: Null wildcard in de-structuring to ignore remainder and stop iterating

2022-06-08 Thread Steve Jorgensen

My current thinking in response to that is that using islice is a decent 
solution except that it's not obvious. You have to jump outside of the thinking 
about the destructuring capability and consider what else could be used to 
help. Probably, the first thing that _would_ come to mind from outside would be 
slicing with square brackets, but that would restrict the solution to only 
work with sequences and not other iterables and iterators as islice does.

That brings up a tangential idea. Why not allow square-bracket indexing of 
generators instead of having to import and utilize islice for that?
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/ZGMFS4Y56MDPQLEIKW6PQVW2WDHRSGZV/

[Python-ideas] Add .except() and .only(), and .values_at(). instance methods to dict

2022-06-05 Thread Steve Jorgensen

I think these are extremely common needs that are worth having standard 
methods for. If adding instance methods seems like a bad idea, then maybe add 
functions to the standard library that perform the same operations.

m = {'a': 123, 'b': 456, 'c': 789}
m.except(('a', 'c'))  # {'b': 456}
m.only(('b', 'c'))  # {'b': 456, 'c': 789}
m.values_at(('a', 'b'))  # [123, 456]

…or…

from mappings import except, only, values_at

m = {'a': 123, 'b': 456, 'c': 789}
except(m, ('a', 'c'))  # {'b': 456}
only(m, ('b', 'c'))  # {'b': 456, 'c': 789}
values_at(m, ('a', 'b'))  # [123, 456]
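
For anyone who wants to try the idea today, here is a minimal sketch of the three operations as plain functions. Because `except` is a reserved word in Python, the first one is spelled `except_` here; that spelling, and raising `KeyError` from `values_at` for a missing key, are assumptions of this sketch rather than part of the proposal.

```python
def except_(mapping, keys):
    """Return a copy of `mapping` without the given keys."""
    excluded = set(keys)
    return {k: v for k, v in mapping.items() if k not in excluded}

def only(mapping, keys):
    """Return a dict with just the given keys that are present in `mapping`."""
    wanted = set(keys)
    return {k: v for k, v in mapping.items() if k in wanted}

def values_at(mapping, keys):
    """Return the values for `keys`, in the order the keys were given."""
    return [mapping[k] for k in keys]

m = {'a': 123, 'b': 456, 'c': 789}
```

With these definitions, `except_(m, ('a', 'c'))`, `only(m, ('b', 'c'))`, and `values_at(m, ('a', 'b'))` give the results shown in the examples above.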
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/SMHI3ABM4XLASYYDGSTY45BKHTM7QMK2/

[Python-ideas] Re: Null wildcard in de-structuring to ignore remainder and stop iterating

2022-06-04 Thread Steve Jorgensen

OK. That's not terrible. It is a redundancy though, having to re-state the 
count of variables that are to be de-structured into on the left.
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/F3YHX7F3HGKFYAX7JH3LJNJRSDN2XOYE/

[Python-ideas] Re: Null wildcard in de-structuring to ignore remainder and stop iterating

2022-06-04 Thread Steve Jorgensen

I was using the reading of lines from a file as a contrived example. There are 
many other possible cases, such as de-structuring from an iterator like 
`itertools.repeat()` with no `times` argument, which will generate values 
endlessly.
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/YH722VXD32AX4MDIDOXVP64YVPNXTNQ6/

[Python-ideas] Null wildcard in de-structuring to ignore remainder and stop iterating

2022-06-04 Thread Steve Jorgensen

A contrived use case:

with open('document.txt', 'r') as io:
    (line1, line2, *) = io

It is possible to kind of achieve the same result using `*_` except that would 
actually read all the lines from the file, even if we only want the first 2.

…so I am suggesting that we use the bare `*` here to mean that we don't care 
whether there are additional items in the sequence, _and_ we want to stop 
iterating.
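
To make the cost of `*_` concrete, here is a small sketch using a generator in place of a file. `numbered_lines` and `consumed` are names invented for the illustration; the generator records every item it is asked to produce.

```python
def numbered_lines(log):
    """Yield lines one at a time, recording each one that is produced."""
    for n in range(1, 6):
        log.append(n)
        yield f"line {n}\n"

consumed = []
line1, line2, *_ = numbered_lines(consumed)
```

After the assignment, `consumed` holds all five numbers even though only the first two lines were wanted, which is the behavior the bare `*` is meant to avoid.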
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/NURDVNQUMKDH7242FCQBBYIU7WSATTB6/

[Python-ideas] Means of avoiding accidental import of/from an Implicit Namespace Package

2022-06-02 Thread Steve Jorgensen

More than once, I've had bugs that were hard to track down because I was 
accidentally using an implicit namespace without realizing it.

The last time this happened, it was a typo, and my init file was named 
`_init__.py` instead of `__init__.py`. The init file imported from sub-modules, 
including one with a class that was supposed to be registered via an 
`__init_subclass__` callback that was not happening.

I'm sure that implicit namespace packages are here to stay, and I imagine I 
will actually want to use them on purpose at some point, but it would be nice 
if we could come up with a straightforward way to avoid the accidental usages.

One idea that comes to mind is to add a new built-in context manager within 
which the importing of a purely implicit namespace raises an exception.
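
Short of a built-in context manager, something close can be approximated with a helper function. This is a rough sketch, not the proposed feature: `import_regular` is a hypothetical name, and the check assumes Python 3.7 or later, where the spec of an implicit namespace package has `origin` set to `None` and a non-`None` `submodule_search_locations`.

```python
import importlib
import importlib.util

def import_regular(name):
    """Import `name`, refusing to accept an implicit namespace package."""
    spec = importlib.util.find_spec(name)
    if (
        spec is not None
        and spec.origin is None
        and spec.submodule_search_locations is not None
    ):
        raise ImportError(
            f"{name!r} resolved to an implicit namespace package "
            "(no __init__.py was found)"
        )
    # A missing module still raises ModuleNotFoundError here, as usual.
    return importlib.import_module(name)
```

With a helper like this, a directory containing a mistyped `_init__.py` would fail loudly at import time instead of silently becoming a namespace package.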
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/V7QU3IDGKITJ3J4FL7G6YAFKIXM44IC2/

[Python-ideas] Re: Enhance flexibility of dataclass repr

2022-05-30 Thread Steve Jorgensen

I should add that…

I did find it is already possible to define a dataclass field for a property 
that is implemented as a `@property`-decorated function, but it's a bit of a 
hack. It only works if the property has a setter that succeeds, even if the 
attribute is supposed to be read-only or if it is not appropriate for its 
setter to be called during initialization.
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/LHEBFGTOZRO4LD7N5JLQLPV46CE5NCNP/

[Python-ideas] Enhance flexibility of dataclass repr

2022-05-30 Thread Steve Jorgensen

The desire for this came up for me in relation to a set of dataclasses used to
define a tree structure where each item has a reference to its parent.
Including the complete expansion of the parent (with its children and its
parent with its children, etc.) is WAY too much information, but at the same
time, I do want to at least identify the parent. Currently, the only way to get
what I'm looking for is to write a custom `__repr__` from scratch.
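
To show what that workaround looks like in practice, here is a minimal sketch. `Node` and its fields are hypothetical; the `__repr__` is written by hand but reuses `dataclasses.fields()` so that it still honors `repr=False`, then adds the parent's name.

```python
from dataclasses import dataclass, field, fields
from typing import Optional

@dataclass
class Node:
    name: str
    # Exclude the full parent from the repr to avoid expanding the whole tree.
    parent: Optional["Node"] = field(default=None, repr=False)

    def __repr__(self):
        # Rebuild the usual "name=value" parts for fields that allow repr.
        parts = [f"{f.name}={getattr(self, f.name)!r}" for f in fields(self) if f.repr]
        # Identify the parent by name only.
        parent_name = self.parent.name if self.parent is not None else None
        parts.append(f"parent_name={parent_name!r}")
        return f"{type(self).__name__}({', '.join(parts)})"
```

Because the class defines its own `__repr__`, the `@dataclass` decorator leaves it in place rather than generating one.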

It would be great if there was at least 1 way to take advantage of the
automatic repr and still customize its handling for specific fields.

The first thing I thought of in that regard for my example was to add a
`parent_name` property using `@property` and specify `repr=False` for `parent`.
The auto-generated repr is not aware of properties defined that way though.
Maybe that could be solved by adding an argument named something like
`descriptor=` to `field()` where a `True` value means that getting
and setting happens through a separately defined descriptor (e.g.
via `@property`) and should not be implemented automatically, even though it
should be otherwise treated as a dataclass property.

The second thought I had is to be able to customize `repr` for any field. One
way to do that might be to allow `field()`'s `repr` argument to accept a method
name string and/or a callable that accepts an instance of the class in addition
to accepting `True` or `False`.

I actually like the idea of having both of those capabilities.

Opinions?
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/BE5I4MXLPBW3RUKSV5M35CEJRJHISKNW/

[Python-ideas] Re: Sanitize filename (path part) 2nd try

2020-05-11 Thread Steve Jorgensen

Andrew Barnert wrote:
> On May 11, 2020, at 00:40, Steve Jorgensen ste...@stevej.name wrote:
> > Proposal:
> > Add a new function (possibly os.path.sanitizepart) to sanitize a value for
> > use as a single component of a path. In the default case, the value must 
> > also not be a
> > reference to the current or parent directory ("." or "..") and must not 
> > contain control
> > characters.

> If not: the result can contain the path separator, illegal characters that 
> aren’t
> control characters, nonprinting characters that aren’t control characters, 
> and characters
> whose bytes (in the filesystem’s encoding) are ASCII control characters?
> And it can be a reserved name, or even something like C:; as long as it’s not 
> the Unix
> . or ..?

Are there non-printing characters outside of those in the Unicode general 
category of "C" that make sense to omit? There are combining characters and 
such that do not have glyphs but are visible in the sense that they modify the 
glyphs displayed for the characters that they combine with.

Regarding names like "C:", you are absolutely right to point that out. When the 
platform is Windows, certainly, ":" should not be allowed, and perhaps 
colon should not be allowed at all. I'll need to research that a bit. This 
matters because if the path part is used without explicit "./" prefixed to it, 
then it will refer to a root path, so same problem as allowing a name starting 
with "/" in *NIX. That should be unconditionally disallowed in the case of WIN 
or GENERAL systems.
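
To illustrate why a drive-qualified name matters under Windows path rules, here is a small sketch using the standard `ntpath` module, which can be imported on any platform. `has_drive` is a hypothetical helper, and the paths are made-up examples.

```python
import ntpath  # Windows path rules, importable on any platform

def has_drive(part):
    """Return True if `part` starts with a Windows drive such as "C:"."""
    drive, _rest = ntpath.splitdrive(part)
    return bool(drive)

# Joining a drive-qualified part onto a base on another drive discards the base.
joined = ntpath.join("C:\\data", "D:evil.txt")
```

Here `joined` is `"D:evil.txt"`, so the intended base directory is lost entirely, which is the kind of escape a sanitizer for a single path part would need to prevent.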
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/SDMTI5KQKWYZV3MOTFRS27M7RED56THZ/

[Python-ideas] Re: Sanitize filename (path part) 2nd try

2020-05-11 Thread Steve Jorgensen

Andrew Barnert wrote:
> On May 11, 2020, at 00:40, Steve Jorgensen ste...@stevej.name wrote:
> > Proposal:
> > Add a new function (possibly os.path.sanitizepart) to sanitize a value for
> > use as a single component of a path. In the default case, the value must 
> > also not be a
> > reference to the current or parent directory ("." or "..") and must not 
> > contain control
> > characters.
> > “Also” in addition to what? Are there other requirements enforced besides 
> > these
> two that aren’t specified anywhere?

Sorry that was not clear. In addition to ensuring that it is a single part, 
meaning that it contains no path separators.
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/LVJ62W42HQNNQOJXIKS7KLRIOY5IE7JT/

[Python-ideas] Re: Sanitize filename (path part) 2nd try

2020-05-11 Thread Steve Jorgensen

Steve Jorgensen wrote:
> Based on responses to my previous proposal, I am convinced that it was 
> over-ambitious
> and not appropriate for inclusion in the Python standard library, so starting 
> over with a
> more narrowly scoped suggestion.
> Proposal:
> Add a new function (possibly os.path.sanitizepart) to sanitize a value for
> use as a single component of a path. In the default case, the value must also 
> not be a
> reference to the current or parent directory ("." or "..") and must not 
> contain control
> characters.
> When an invalid character is encountered, then ValueError will be raised
> in the default case, or the character may be replaced or escaped.
> When an invalid name is encountered, then ValueError will be raised in the
> default case, or the first character may be replaced, escaped, or prefixed.
> Control characters (those in the Unicode general category of "C") are treated 
> as invalid
> by default.
> After applying any transformations, if the result would still be invalid, 
> then an
> exception is raised.
> Proposed function signature: sanitizepart(name, replace=None, escape=None,
> prefix=None, flags=0)
> When replace is supplied, it is used as a replacement for any invalid
> characters or for the first character of an invalid name. When prefix is not
> also supplied, this is also used as the replacement for the first character 
> of the name if
> it is invalid, not simply due to containing invalid characters.
> When escape is supplied (typically "%") it is used as the escape character
> in the same way that "%" is used in URL encoding. When a non-ASCII character 
> is escaped,
> it is represented as a sequence of encoded bytes/octets. When prefix is not
> also supplied, this is also used to escape the first character of the name if 
> it is
> invalid, not simply due to containing invalid characters.
> replace and escape are mutually exclusive.
> When prefix is supplied (typically "_"), it is prepended to the name if it is
> invalid, not simply due to containing invalid characters.
> Flags:
> 
> path.PERMIT_RELATIVE (1): Permit relative path values ("." "..")
> path.PERMIT_CTRL (2): Permit characters in the Unicode general category of 
> "C".

Somewhere between the 1st and 2nd proposal, I lost track of the 
system-specificity issue. Even with this more focused proposal, there is the 
issue of different path separators on Windows vs *nix, so the function needs 
another argument for that. Presumably, it would have a default of `None` 
meaning to use the current platform and would have constants for `NIX`, `WIN`, 
and `GENERAL` where `WIN` and `GENERAL` behave the same, recognizing either "/" 
or "\" as a file separator character.
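
Putting the quoted proposal and the separator point together, here is a rough sketch of the default behavior plus the `replace` option. It is not a reference implementation: `escape` and `prefix` are left out, the flags are plain module-level constants, the `separators` parameter stands in for the platform argument described above, and treating an empty name as invalid is an assumption of this sketch.

```python
import unicodedata

PERMIT_RELATIVE = 1  # permit "." and ".."
PERMIT_CTRL = 2      # permit characters in Unicode general category "C"

def sanitizepart(name, replace=None, flags=0, separators="/\\"):
    """Sketch of the proposal; `escape` and `prefix` are left out."""

    def bad_char(ch):
        if ch in separators:
            return True
        if flags & PERMIT_CTRL:
            return False
        return unicodedata.category(ch).startswith("C")

    def bad_name(value):
        if value == "":
            return True  # assumption: an empty part is never valid
        return value in (".", "..") and not flags & PERMIT_RELATIVE

    if replace is None:
        for ch in name:
            if bad_char(ch):
                raise ValueError(f"invalid character {ch!r} in {name!r}")
        result = name
    else:
        result = "".join(replace if bad_char(ch) else ch for ch in name)
        if bad_name(result):
            # Replace the first character of an invalid name.
            result = replace + result[1:]

    # After any transformation, the result must be valid.
    if bad_name(result) or any(bad_char(ch) for ch in result):
        raise ValueError(f"cannot sanitize {name!r}")
    return result
```

For example, `sanitizepart("a/b", replace="_")` gives `"a_b"`, while `sanitizepart("a/b")` raises `ValueError`.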
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/SRQJ2BZHYYVIPW7CGABLNCWLZMOMCZO3/

[Python-ideas] Re: Sanitize filename (path part) 2nd try

2020-05-11 Thread Steve Jorgensen

Steve Jorgensen wrote:

> When escape is supplied (typically "%") it is used as the escape character
> in the same way that "%" is used in URL encoding. When a non-ASCII character 
> is escaped,
> it is represented as a sequence of encoded bytes/octets.

I neglected to say that the octet sequence would be for the UTF-8 
representation of the non-ASCII character. This is consistent with ECMAScript's 
`encodeURI` (see https://www.ecma-international.org/ecma-262/5.1/#sec-15.1.3).

Also, to clarify why this is needed, it is for when there are non-ASCII control 
characters such as \u2066 (Left-to-Right Isolate) in the given name value and 
control characters are not being allowed. Other non-ASCII Unicode characters 
are permitted, so this is not applicable to those.
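
A minimal sketch of that escaping rule, assuming "%" as the escape character: characters in Unicode general category "C" are replaced by the percent-encoded bytes of their UTF-8 representation. `escape_controls` is a hypothetical name, and also escaping the escape character itself (so the result stays unambiguous) is an assumption of this sketch, not something stated in the proposal.

```python
import unicodedata

def escape_controls(name, escape="%"):
    """Percent-escape category "C" characters using their UTF-8 bytes."""
    out = []
    for ch in name:
        if unicodedata.category(ch).startswith("C") or ch == escape:
            # "surrogatepass" lets lone surrogates (category Cs) be encoded too.
            encoded = ch.encode("utf-8", "surrogatepass")
            out.extend(f"{escape}{byte:02X}" for byte in encoded)
        else:
            out.append(ch)
    return "".join(out)
```

For instance, U+2066 is encoded in UTF-8 as the bytes E2 81 A6, so `escape_controls("a\u2066b")` gives `"a%E2%81%A6b"`, while other non-ASCII characters such as "é" pass through unchanged.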
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/O6AIDG4BDFQUYYZJYVX24LSNHYHO5JFL/

[Python-ideas] Sanitize filename (path part) 2nd try

2020-05-11 Thread Steve Jorgensen

Based on responses to my previous proposal, I am convinced that it was
over-ambitious and not appropriate for inclusion in the Python standard
library, so starting over with a more narrowly scoped suggestion.

Proposal:

Add a new function (possibly `os.path.sanitizepart`) to sanitize a value for
use as a single component of a path. In the default case, the value must also
not be a reference to the current or parent directory ("." or "..") and must
not contain control characters.

When an invalid character is encountered, then `ValueError` will be raised in
the default case, or the character may be replaced or escaped.
When an invalid name is encountered, then `ValueError` will be raised in the
default case, or the first character may be replaced, escaped, or prefixed.
Control characters (those in the Unicode general category of "C") are treated
as invalid by default.
After applying any transformations, if the result would still be invalid, then
an exception is raised.

Proposed function signature: `sanitizepart(name, replace=None, escape=None,
prefix=None, flags=0)`

When `replace` is supplied, it is used as a replacement for any invalid
characters or for the first character of an invalid name. When `prefix` is not
also supplied, this is also used as the replacement for the first character of
the name if it is invalid, not simply due to containing invalid characters.

When `escape` is supplied (typically "%") it is used as the escape character in
the same way that "%" is used in URL encoding. When a non-ASCII character is
escaped, it is represented as a sequence of encoded bytes/octets. When `prefix`
is not also supplied, this is also used to escape the first character of the
name if it is invalid, not simply due to containing invalid characters.

`replace` and `escape` are mutually exclusive.

When `prefix` is supplied (typically "_"), it is prepended to the name if it is
invalid, not simply due to containing invalid characters.

Flags:
- path.PERMIT_RELATIVE (1): Permit relative path values ("." "..")
- path.PERMIT_CTRL (2): Permit characters in the Unicode general category of
"C".
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/LRIKMG3G4I4YQNK6BTU7MICHT7X67MEF/

[Python-ideas] Re: Sanitize filename (path part)

2020-05-11 Thread Steve Jorgensen

Stephen J. Turnbull wrote:
> Steve Jorgensen writes:
> > I'm thinking of this specifically in terms of
> > sanitizing input,
> > assuming that later usage of the value might or might not properly
> > protect against potential vulnerabilities. This is also limited to
> > the case where the value is supposed to be a single path referring
> > to an entry within a single directory context.
> > This sounds extremely specialized to me.  For example, presumably
> you're not referring to dotted module specifications in Python, but
> those usually do map to filesystem paths in implementations, and I can
> imagine vulnerabilities (the one on top of my head requires a fair
> amount of Python ignorance and environmental serendipity, which sort
> of proves my point about situation-specificity) using Python module
> paths as mapped to filesystem paths.
> ISTM that it might be useful to provide a toolbox for scanning paths
> with various validation operations, but that it's really up to
> applications to decide which operations to use and what parameters
> (eg, evil code point set, bytes vs code points vs code units vs
> characters), and so on.  PyPI seems ideal for that, until it matures
> more than a discussion on the mailing lists can provide.
> Steve (T)

…so maybe it makes sense to have only the more specific sanitization in the 
standard library, then. In the POSIX case, I think that means just blocking "/" 
characters and "." or ".." values.
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/EVSXG4ZPE5OXNV3NCHPIU5YKAJRMM3NF/

[Python-ideas] Re: Sanitize filename (path part)

2020-05-10 Thread Steve Jorgensen

Dan Sommers wrote:

> I know what sanitize means (in English and in the technical sense I
> believe you intend here), but can you provide some context and actual
> use cases?
> Sanitize on input so that your application code doesn't "accidentally"
> spit out the contents of /etc/shadow?  Sanitize on output so that your
> code doesn't produce syntactically broken links in an HTML document or
> weird results in an xterm?  Sanitize in both directions for safe round
> tripping to a database server?

I'm thinking of this specifically in terms of sanitizing input, assuming that 
later usage of the value might or might not properly protect against potential 
vulnerabilities. This is also limited to the case where the value is supposed 
to be a single path referring to an entry within a single directory context.
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/UYMWQOXF26M2O52JZJJAJ76MI2NYKTNC/

[Python-ideas] Re: Sanitize filename (path part)

2020-05-10 Thread Steve Jorgensen

Dan Sommers wrote:
> On Sun, 10 May 2020 00:34:43 -
> "Steve Jorgensen" ste...@stevej.name wrote:
> > I believe the Python standard library should include a means of
> > sanitizing a filesystem entry, and this should not be something
> > requiring a 3rd party package.
> I'm not disagreeing.
> > What I am envisioning is a function (presumably in os.path with a
> > signature roughly like
> > {{{
> > sanitizepart(name, permissive=False, mode=ESCAPE, system=None)
> > }}}
> > When permissive is False, characters that are generally unsafe are
> > rejected. When permissive is True, only path separator characters
> > are rejected. Generally unsafe characters besides path separators
> > would include things like a leading ".", any non-printing character,
> > any wildcard, piping and redirection characters, etc.
> Okay, now I'm disagreeing.  ;-)
> I know what sanitize means (in English and in the technical sense I
> believe you intend here), but can you provide some context and actual
> use cases?
> Sanitize on input so that your application code doesn't "accidentally"
> spit out the contents of /etc/shadow?  Sanitize on output so that your
> code doesn't produce syntactically broken links in an HTML document or
> weird results in an xterm?  Sanitize in both directions for safe round
> tripping to a database server?  All of those use cases potentially
> require separate handling, especially in terms of quoting and escaping.
> For another example, suppose I'm writing a command line utility on a
> POSIX system to compute a hash of the contents of a file.  There's
> nothing wrong with ".profile" as a file name.  Why are you rejecting
> leading "."  characters?  What about leading "-"s, or embedded "|"s?
> Yes, certain shells and shell commands can make them "difficult" to deal
> with in one way or another, but they're not "generally unsafe."
> A very, very, very long time ago, we wrote some software for a customer
> who liked "editing" our data files to make minor corrections instead
> of using our software.  Our solution was to use "illegal" filenames that
> the shell rejected, but that an application could access directly
> anyway.  I guess the point is that "sanitize" can mean different things
> to different parts of a system.
> Dan

I totally get what you're saying. For the sake of simplicity, I thought that 
the 2 permissiveness options should be one that only prevents path traversal 
and one that is extremely conservative, omitting characters that are often safe 
and appropriate but may be unsafe in some cases.

In regard to dot files, those can be safe in some cases, but unsafe in others — 
writing to configuration files that will be read by shell helpers or editors, 
for instance.
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/QQ2FO6ARZD4WM45OPYGBXEGXYQO72PRY/

[Python-ideas] Re: Sanitize filename (path part)

2020-05-10 Thread Steve Jorgensen

Steve Jorgensen wrote:
> Steve Jorgensen wrote:
> > I believe the Python standard library should include
> > a means of sanitizing a filesystem
> > entry, and this should not be something requiring a 3rd party package.
> > One of reasons I think this should be in the standard lib is because that 
> > provides a
> > common, simple means for code reviewers and static analysis services such 
> > as Veracode to
> > recognize that a value is sanitized in an accepted manner.
> > What I am envisioning is a function (presumably in os.path with a
> > signature roughly like
> > {{{
> > sanitizepart(name, permissive=False, mode=ESCAPE, system=None)
> > }}}
> > When permissive is False, characters that are generally
> > unsafe are rejected. When permissive is True, only path
> > separator characters are rejected. Generally unsafe characters besides path 
> > separators
> > would include things like a leading ".", any non-printing character, any 
> > wildcard, piping
> > and redirection characters, etc.
> > The mode argument indicates what to do with unacceptable characters.
> > Escape them (ESCAPE), omit them (OMIT) or raise an exception
> > (RAISE). This could also double as an escape character argument when a 
> > string
> > is given. The default escape character should probably be "%" (same as URL 
> > encoding).
> > The system argument accepts a combination of bit flags indicating what
> > operating system's rules to apply, or None meaning to use rules for the
> > current platform. Systems would probably include SYS_POSIX,
> > SYS_WIN, and SYS_MISC where miscellaneous means to enforce rules
> > for all commonly used systems. One example of a distinction is that on a 
> > POSIX system,
> > backslash characters are not path separators, but on Windows, both forward 
> > and backward
> > slashes are path separators.
> > {{{
> > from os import path
> > from os.path import sanitizepart
> > print(repr(
> > os.path.sanitizepart('/ABC\QRS%', system=path.SYS_WIN))
> > # => '%2fABC%5cQRS%%'
> > os.path.sanitizepart('/ABC\QRS%', True, mode=path.STRIP,
> > system=path.SYS_POSIX))
> > # => 'ABC\QRS%'
> > os.path.sanitizepart('../AB*\x01\n', system=path.SYS_POSIX))
> > # => '%2e.%2fABC%26CD%2a%01%10'
> > os.path.sanitizepart('../AB*\x01\n', True, system=path.SYS_POSIX))
> > # => '..%2eAB*\x01\n'
> > }}}
> > Existing work:
> https://pypi.org/project/pathvalidate/#sanitize-a-filename

More existing work:
* https://pypi.org/project/sanitize-filename/
* http://detox.sourceforge.net/
* https://sourceforge.net/p/glindra/news/2005/08/glindra-rename--lower--portable/
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/ITEHIWIFNGM5WOMOC5UAHKQVMLVIBR6Z/

[Python-ideas] Re: Sanitize filename (path part)

2020-05-10 Thread Steve Jorgensen

Steve Jorgensen wrote:
> Steve Jorgensen wrote:
> > Andrew Barnert wrote:
> > On May 9, 2020, at 17:35, Steve Jorgensen
> > ste...@stevej.name wrote:
> > I believe the Python standard library should
> > include
> > a means of sanitizing a filesystem entry, and this should not be something 
> > requiring a
> > 3rd
> > party package.
> > One of reasons I think this should be in the standard lib is because that 
> > provides a
> > common, simple means for code reviewers and static analysis services such 
> > as Veracode to
> > recognize that a value is sanitized in an accepted manner.
> > This does seem like a good idea. People who do this themselves get it wrong 
> > all
> > the time, occasionally with disastrous consequences, so if Python can solve 
> > that, that
> > would be great.
> > But, at least historically, this has been more complicated than what you’re 
> > suggesting
> > here. For example, don’t you have to catch things like directories named 
> > “Con” or files
> > whose 8.3 representation has “CON” as the 8 part? I don’t think you can 
> > hang an entire
> > Windows system by abusing those anymore, but you can still produce 
> > filenames that some
> > APIs, and some tools (possibly including Explorer, cmd, powershell, Cygwin, 
> > mingw/native
> > shells, Python itself…) can’t access (or can only access if the user 
> > manually specified a
> > .\ absolute path, or whatever).
> > Yes. I am aware of some of the unsafe names in DOS and older Windows. As I
> > mentioned in my other reply, there is a distinction between the ones that 
> > are merely
> > invalid and those that are actually unsafe. In researching existing Linux 
> > tools just now,
> > I was reminded that a leading dash is frequently unsafe because many tools 
> > will treat an
> > argument starting with dash as an option argument.
> > Is there an established algorithm/rule that lots of
> > people in the industry trust that
> > Python can just reference, instead of having to research or invent it? 
> > Because otherwise,
> > we run the risk of making things worse instead of better.
> > An excellent point! I just started digging into that and found references to
> > detox and Glindra. Neither of those seems to be well maintained though. The 
> > documentation
> > pages for Glindra no longer exist and detox is not in standard package 
> > repositories for
> > CentOS later than 6 (and only in EPEL for that). Still digging.
> > Extremely apropos to the question of what characters might be problematic
> and/or unsafe: https://dwheeler.com/essays/fixing-unix-linux-filenames.html

That article links to another by the same author that is specific to 
vulnerabilities caused by file names.
https://dwheeler.com/secure-programs/Secure-Programs-HOWTO/file-names.html
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/FDZOXS2BNZHJ4XAG7WU7BO3AA7KF6WWK/

[Python-ideas] Re: Sanitize filename (path part)

2020-05-10 Thread Steve Jorgensen

Steve Jorgensen wrote:
> Andrew Barnert wrote:
> > On May 9, 2020, at 17:35, Steve Jorgensen
> > ste...@stevej.name wrote:
> > I believe the Python standard library should
> > include
> > a means of sanitizing a filesystem entry, and this should not be something 
> > requiring a
> > 3rd
> > party package.
> > One of reasons I think this should be in the standard lib is because that 
> > provides a
> > common, simple means for code reviewers and static analysis services such 
> > as Veracode to
> > recognize that a value is sanitized in an accepted manner.
> > This does seem like a good idea. People who do this themselves get it wrong 
> > all
> > the time, occasionally with disastrous consequences, so if Python can solve 
> > that, that
> > would be great.
> > But, at least historically, this has been more complicated than what you’re 
> > suggesting
> > here. For example, don’t you have to catch things like directories named 
> > “Con” or files
> > whose 8.3 representation has “CON” as the 8 part? I don’t think you can 
> > hang an entire
> > Windows system by abusing those anymore, but you can still produce 
> > filenames that some
> > APIs, and some tools (possibly including Explorer, cmd, powershell, Cygwin, 
> > mingw/native
> > shells, Python itself…) can’t access (or can only access if the user 
> > manually specified a
> > .\ absolute path, or whatever).
> > Yes. I am aware of some of the unsafe names in DOS and older Windows. As I
> mentioned in my other reply, there is a distinction between the ones that are 
> merely
> invalid and those that are actually unsafe. In researching existing Linux 
> tools just now,
> I was reminded that a leading dash is frequently unsafe because many tools 
> will treat an
> argument starting with dash as an option argument.
> > Is there an established algorithm/rule that lots of
> > people in the industry trust that
> > Python can just reference, instead of having to research or invent it? 
> > Because otherwise,
> > we run the risk of making things worse instead of better.
> > An excellent point! I just started digging into that and found references to
> detox and Glindra. Neither of those seems to be well maintained though. The 
> documentation
> pages for Glindra no longer exist and detox is not in standard package 
> repositories for
> CentOS later than 6 (and only in EPEL for that). Still digging.

Extremely apropos to the question of what characters might be problematic and/or 
unsafe: https://dwheeler.com/essays/fixing-unix-linux-filenames.html
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/EDJQA7SDUWEHJ53GYXIGX2HPTU3JEM6X/

[Python-ideas] Re: Sanitize filename (path part)

2020-05-10 Thread Steve Jorgensen

Andrew Barnert wrote:
> On May 9, 2020, at 17:35, Steve Jorgensen ste...@stevej.name wrote:
> > I believe the Python standard library should include a means of
> > sanitizing a filesystem entry, and this should not be something
> > requiring a 3rd party package.
> > One of the reasons I think this should be in the standard lib is because
> > that provides a common, simple means for code reviewers and static
> > analysis services such as Veracode to recognize that a value is
> > sanitized in an accepted manner.
> This does seem like a good idea. People who do this themselves get it wrong
> all the time, occasionally with disastrous consequences, so if Python can
> solve that, that would be great.
> But, at least historically, this has been more complicated than what you’re
> suggesting here. For example, don’t you have to catch things like
> directories named “Con” or files whose 8.3 representation has “CON” as the
> 8 part? I don’t think you can hang an entire Windows system by abusing those
> anymore, but you can still produce filenames that some APIs, and some tools
> (possibly including Explorer, cmd, powershell, Cygwin, mingw/native shells,
> Python itself…) can’t access (or can only access if the user manually
> specified a \.\ absolute path, or whatever).

Yes. I am aware of some of the unsafe names in DOS and older Windows. As I 
mentioned in my other reply, there is a distinction between the ones that are 
merely invalid and those that are actually unsafe. In researching existing 
Linux tools just now, I was reminded that a leading dash is frequently unsafe 
because many tools will treat an argument starting with dash as an option 
argument.
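
As a small illustration of how a sanitizer could neutralize a leading dash, here is a sketch that reuses the "%" escape style from the proposal. The function name `escape_leading_dash` is hypothetical, and whether a leading dash should be escaped at all is still an open question in this thread:

```python
def escape_leading_dash(name: str, escape: str = "%") -> str:
    """Escape a leading "-" so tools do not parse the name as an option."""
    if name.startswith("-"):
        # ord("-") is 0x2d, so "-rf" becomes "%2drf".
        return f"{escape}{ord('-'):02x}" + name[1:]
    return name


print(escape_leading_dash("-rf"))   # %2drf
print(escape_leading_dash("a-b"))   # a-b (only a leading dash is touched)
```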

> Is there an established algorithm/rule that lots of people in the industry 
> trust that
> Python can just reference, instead of having to research or invent it? 
> Because otherwise,
> we run the risk of making things worse instead of better.

An excellent point! I just started digging into that and found references to 
detox and Glindra. Neither of those seems to be well maintained though. The 
documentation pages for Glindra no longer exist and detox is not in standard 
package repositories for CentOS later than 6 (and only in EPEL for that). Still 
digging.
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/2LLQWDJJFDM7QJHLMUE73VNJ2T2FA2VM/

[Python-ideas] Re: Sanitize filename (path part)

2020-05-10 Thread Steve Jorgensen

Responding to points individually to avoid confusing multi-topic threads. :)

Andrew Barnert wrote:
 < snip >
> > When permissive is False, characters that are generally unsafe are
> > rejected. When permissive is True, only path separator characters are
> > rejected. Generally unsafe characters besides path separators would
> > include things like a leading ".", any non-printing character, any
> > wildcard, piping and redirection characters, etc.
> I think neither of these is what I’d usually want.
> I never want to sanitize just pathsep characters without sanitizing all
> illegal characters.
> I do often want to sanitize all illegal characters (just \0 and the path
> sep on POSIX, a larger set that I don’t know by heart on Windows).

Sanitization and validation are not the same thing though. \0 is invalid and 
will result in an error when passed to a function that attempts to use it to 
reference a file, so allowing that character to pass through sanitization 
doesn't constitute an exploitable vulnerability.
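
A quick way to see this from Python, as a sketch (the helper name `rejects_nul` is only for illustration): CPython refuses a name with an embedded NUL before it reaches the filesystem.

```python
import os


def rejects_nul(name: str) -> bool:
    """Return True if os.stat() refuses `name` outright as malformed."""
    try:
        os.stat(name)
    except ValueError:
        # CPython raises ValueError ("embedded null byte") for such names.
        return True
    except OSError:
        # For example FileNotFoundError: the name was well-formed, just absent.
        return False
    return False


print(rejects_nul("abc\0def"))  # True
```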

Having said that, it's usually friendlier to fail sooner rather than later, so 
maybe it actually does make sense for sanitization to fail for illegal 
characters as well as for valid, unsafe characters.

Hmm. I just realized that ".." and (to a lesser extent) "." are valid path 
parts but are nevertheless usually not safe to allow.
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/V6EH7JSEKJTT57HHQU3CCQOYE3E7I2G3/

[Python-ideas] Re: Sanitize filename (path part)

2020-05-09 Thread Steve Jorgensen

Steve Jorgensen wrote:
> I believe the Python standard library should include a means of sanitizing a 
> filesystem
> entry, and this should not be something requiring a 3rd party package.
> One of reasons I think this should be in the standard lib is because that 
> provides a
> common, simple means for code reviewers and static analysis services such as 
> Veracode to
> recognize that a value is sanitized in an accepted manner.
> What I am envisioning is a function (presumably in os.path with a
> signature roughly like
> {{{
> sanitizepart(name, permissive=False, mode=ESCAPE, system=None)
> }}}
> When permissive is False, characters that are generally
> unsafe are rejected. When permissive is True, only path
> separator characters are rejected. Generally unsafe characters besides path 
> separators
> would include things like a leading ".", any non-printing character, any 
> wildcard, piping
> and redirection characters, etc.
> The mode argument indicates what to do with unacceptable characters.
> Escape them (ESCAPE), omit them (OMIT) or raise an exception
> (RAISE). This could also double as an escape character argument when a string
> is given. The default escape character should probably be "%" (same as URL 
> encoding).
> The system argument accepts a combination of bit flags indicating what
> operating system's rules to apply, or None meaning to use rules for the
> current platform. Systems would probably include SYS_POSIX,
> SYS_WIN, and SYS_MISC where miscellaneous means to enforce rules
> for all commonly used systems. One example of a distinction is that on a 
> POSIX system,
> backslash characters are not path separators, but on Windows, both forward 
> and backward
> slashes are path separators.
> {{{
> from os import path
> from os.path import sanitizepart
> print(repr(
> os.path.sanitizepart('/ABC\QRS%', system=path.SYS_WIN))
> # => '%2fABC%5cQRS%%'
> os.path.sanitizepart('/ABC\\QRS%', True, mode=path.STRIP,
> system=path.SYS_POSIX))
> 
> # => 'ABC\QRS%'
> os.path.sanitizepart('../AB*\x01\n', system=path.SYS_POSIX))
> 
> # => '%2e.%2fABC%26CD%2a%01%10'
> os.path.sanitizepart('../AB*\x01\n', True, system=path.SYS_POSIX))
> 
> # => '..%2eAB*\x01\n'
> }}}

Existing work:
https://pypi.org/project/pathvalidate/#sanitize-a-filename
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/FI2V2EZGLSYB3AAV5V5RNEOFJQWQE45S/

[Python-ideas] Sanitize filename (path part)

2020-05-09 Thread Steve Jorgensen

I believe the Python standard library should include a means of sanitizing a 
filesystem entry, and this should not be something requiring a 3rd party 
package.

One of the reasons I think this should be in the standard lib is because that 
provides a common, simple means for code reviewers and static analysis services 
such as Veracode to recognize that a value is sanitized in an accepted manner.

What I am envisioning is a function (presumably in `os.path`) with a signature 
roughly like
{{{
sanitizepart(name, permissive=False, mode=ESCAPE, system=None)
}}}

When `permissive` is `False`, characters that are generally unsafe are 
rejected. When `permissive` is `True`, only path separator characters are 
rejected. Generally unsafe characters besides path separators would include 
things like a leading ".", any non-printing character, any wildcard, piping and 
redirection characters, etc.

The `mode` argument indicates what to do with unacceptable characters: escape 
them (`ESCAPE`), omit them (`OMIT`), or raise an exception (`RAISE`). This could 
also double as an escape character argument when a string is given. The default 
escape character should probably be "%" (same as URL encoding).

The `system` argument accepts a combination of bit flags indicating what 
operating system's rules to apply, or `None` meaning to use rules for the 
current platform. Systems would probably include `SYS_POSIX`, `SYS_WIN`, and 
`SYS_MISC` where miscellaneous means to enforce rules for all commonly used 
systems. One example of a distinction is that on a POSIX system, backslash 
characters are not path separators, but on Windows, both forward and backward 
slashes are path separators.

{{{
from os import path

# Windows rules: both "/" and "\" are separators; "%" itself is doubled.
path.sanitizepart('/ABC\\QRS%', system=path.SYS_WIN)
# => '%2fABC%5cQRS%%'

# Permissive, omitting instead of escaping; "\" is not a POSIX separator.
path.sanitizepart('/ABC\\QRS%', True, mode=path.OMIT,
                  system=path.SYS_POSIX)
# => 'ABC\\QRS%'

# Strict: leading ".", wildcard, and non-printing characters are escaped.
path.sanitizepart('../AB*\x01\n', system=path.SYS_POSIX)
# => '%2e.%2fAB%2a%01%0a'

# Permissive: only the path separator is escaped.
path.sanitizepart('../AB*\x01\n', True, system=path.SYS_POSIX)
# => '..%2fAB*\x01\n'
}}}
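
To make the proposal more concrete, here is a rough pure-Python sketch of how such a function might behave. Everything in it is an assumption for illustration: the constant values, the exact set of "generally unsafe" characters, and the choice to double the escape character. It does not handle reserved Windows names, a string-valued `mode`, or characters above U+00FF (which would need a longer escape).

```python
import os

# Hypothetical constants; the names follow the proposal, the values are arbitrary.
ESCAPE, OMIT, RAISE = "escape", "omit", "raise"
SYS_POSIX, SYS_WIN = 1, 2
SYS_MISC = SYS_POSIX | SYS_WIN

# Assumed "generally unsafe" set: wildcards, piping and redirection characters.
_UNSAFE = set("*?[]|<>&;")
_ESCAPE_CHAR = "%"


def sanitizepart(name, permissive=False, mode=ESCAPE, system=None):
    if system is None:
        system = SYS_WIN if os.name == "nt" else SYS_POSIX
    separators = {"/"}
    if system & SYS_WIN:
        separators.add("\\")

    def is_bad(index, char):
        if char in separators:
            return True
        if permissive:
            return False
        if index == 0 and char == ".":
            return True
        return char in _UNSAFE or not char.isprintable()

    out = []
    for index, char in enumerate(name):
        if not is_bad(index, char):
            # When escaping, double the escape character to keep output unambiguous.
            doubled = mode == ESCAPE and char == _ESCAPE_CHAR
            out.append(char * 2 if doubled else char)
        elif mode == RAISE:
            raise ValueError(f"unacceptable character {char!r} at index {index}")
        elif mode == ESCAPE:
            out.append(f"{_ESCAPE_CHAR}{ord(char):02x}")
        # mode == OMIT: the character is simply dropped.
    return "".join(out)
```

Under these assumptions, `sanitizepart('/ABC\\QRS%', system=SYS_WIN)` returns `'%2fABC%5cQRS%%'`.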
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/SQH4LPERFLKBLXPDUOVJMV24JBCBUCYO/

[Python-ideas] Re: Instance method to test equivalence between set and iterable

2020-03-24 Thread Steve Jorgensen

Steven D'Aprano wrote:
> On Mon, Mar 23, 2020 at 12:03:50AM -0000, Steve Jorgensen wrote:
> > Every set is a superset of itself and a subset of itself. A set may 
> > not be a "formal" subset or a "formal" superset of itself. issubset 
> > and issuperset refer to standard subsets and supersets, not formal 
> > subsets and supersets.
> Sorry, I don't understand your terminology "formal" and "standard". I 
> think you might mean "proper" rather than formal? But I don't know what 
> you mean by "standard".

Right. I meant "proper". Not "formal". By "standard", I simply mean without the 
"proper" qualifier.
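
In Python terms, the distinction maps onto the non-strict and strict comparison operators on sets. A short check:

```python
a = {1, 2, 3}

# Subset/superset without the "proper" qualifier: a set qualifies for itself.
print(a <= a, a.issubset(a))      # True True
print(a >= a, a.issuperset(a))    # True True

# Proper subset/superset: strict, so a set never qualifies for itself.
print(a < a, a > a)               # False False
print({1, 2} < a)                 # True
```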
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/GGAKYH5HFZHIPTOIXJA64MY2W7BAIZMQ/

[Python-ideas] Re: Instance method to test equivalence between set and iterable

2020-03-22 Thread Steve Jorgensen

Paul Moore wrote:
> On Sun, 22 Mar 2020 at 20:01, Steve Jorgensen ste...@stevej.name wrote:
> >
> > Currently, the issubset and issuperset methods of set objects accept
> > arbitrary iterables as arguments. An iterable that is both a subset and
> > superset is, in a sense, "equal" to the set. It would be inappropriate
> > for == to return True for such a comparison, however, since that would
> > break the Hashable contract.
> > Should sets have an additional method, something like like(other),
> > issimilar(other), or isequivalent(other), that returns True for any
> > iterable that contains all of the items in the set and no items that
> > are not in the set? It would therefore be true in the same cases where
> > s == set(other) or s.issubset(other) and s.issuperset(other) is true.
> What is the practical use case for this? It seems like it would be a
> pretty rare need, at best.
> Paul

Basically, it is for a sense of completeness. It feels weird that there is a 
way to check whether an iterable is a subset of a set or a superset of a set 
but no way to directly ask whether it is equivalent to the set.

Even though the need for it might not be common, I think that the collection of 
methods makes more sense if a method like this is present.
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/MRCHHRVCXEUAB3HBV4WRMZ56O3HUJQYL/

[Python-ideas] Re: Instance method to test equivalence between set and iterable

2020-03-22 Thread Steve Jorgensen

Steven D'Aprano wrote:
> On Sun, Mar 22, 2020 at 07:59:59PM -0000, Steve Jorgensen wrote:
> > Currently, the issubset and issuperset methods of set objects 
> > accept arbitrary iterables as arguments. An iterable that is both a 
> > subset and superset is, in a sense, "equal" to the set. It would be 
> > inappropriate for == to return True for such a comparison, 
> > however, since that would break the Hashable contract.
> I think the "arbitrary iterables" part is a distraction. We are 
> fundamentally talking about a comparison on sets, even if Python relaxes 
> the requirements and also allows one operand to be an arbitrary iterable.
> I don't believe that a set A can be both a superset and subset of 
> another set B at the same time. On a Venn Diagram, that would require A 
> to be both completely surrounded by B and B to be completely surrounded 
> by A at the same time, which is impossible.
> I think you might be talking about sets which partially overlap:
> A = {1, 2, 3, 4}
> B = {2, 3, 4, 5}

Every set is a superset of itself and a subset of itself. A set may not be a 
"formal" subset or a "formal" superset of itself. `issubset` and `issuperset` 
refer to standard subsets and supersets, not formal subsets and supersets.

In Python, you can trivially check that…
```
In [1]: {1, 2, 3}.issubset({1, 2, 3})
Out[1]: True

In [2]: {1, 2, 3}.issuperset({1, 2, 3})
Out[2]: True

In [3]: {1, 2, 3}.issubset((1, 2, 3))
Out[3]: True

In [4]: {1, 2, 3}.issuperset((1, 2, 3))
Out[4]: True
```
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/QRI7LQAR7TZXSWOVYY5KLS52HK2GU7IK/

[Python-ideas] Re: Instance method to test equivalence between set and iterable

2020-03-22 Thread Steve Jorgensen

Bar Harel wrote:
> Hey Steve,
> How about set.symmetric_difference()?
> Does it not do what you want?
> Best regards,
> Bar Harel
> On Sun, Mar 22, 2020, 10:03 PM Steve Jorgensen ste...@stevej.name wrote:
> > Currently, the issubset and
> > issuperset methods of set objects accept
> > arbitrary iterables as arguments. An iterable that is both a subset and
> > superset is, in a sense, "equal" to the set. It would be inappropriate for
> > == to return True for such a comparison, however, since that
> > would
> > break the Hashable contract.
> > Should sets have an additional method, something like like(other),
> > issimilar(other), or isequivalent(other), that returns True for any
> > iterable that contains all of the items in the set and no items that
> > are not in the set? It would therefore be true in the same cases where
> > s == set(other) or s.issubset(other) and s.issuperset(other) is true.
> >
Indirectly, it does, but that returns a set, not a `bool`. It would also, 
therefore, do more work than necessary to determine the result in many cases.
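
For comparison, this is what the `symmetric_difference` route looks like. The result has to be built in full and then tested for emptiness:

```python
s = {1, 2, 3}

# symmetric_difference() accepts any iterable and always returns a set.
print(s.symmetric_difference([3, 2, 1]))         # set()
print(s.symmetric_difference([1, 2, 3, 4]))      # {4}

# The "equivalence" answer is the truthiness of that set, negated.
print(not s.symmetric_difference([3, 2, 1]))     # True
```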

A Python implementation for what I'm talking about would be something like the 
following.

```
def like(self, other):
    found = set()
    for item in other:
        if item not in self:
            return False
        found.add(item)
    return len(found) == len(self)
```
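
To try the idea without patching `set`, the same logic can be written as a free function (`set_like` is just a placeholder name). Because it stops at the first foreign item, it even terminates on an endless iterator:

```python
import itertools


def set_like(s, other):
    """Return True if `other` yields exactly the items of set `s` (duplicates allowed)."""
    found = set()
    for item in other:
        if item not in s:
            return False  # stop at the first item that is not in the set
        found.add(item)
    return len(found) == len(s)


print(set_like({1, 2, 3}, [3, 2, 1, 1]))          # True
print(set_like({1, 2, 3}, (n for n in [1, 2])))   # False: 3 is missing
print(set_like({1, 2, 3}, itertools.count(1)))    # False: stops when it sees 4
```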
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/XURB3B3RVM23ECR7BZZFFW7ISLLR63NQ/

[Python-ideas] Instance method to test equivalence between set and iterable

2020-03-22 Thread Steve Jorgensen

Currently, the `issubset` and `issuperset` methods of set objects accept 
arbitrary iterables as arguments. An iterable that is both a subset and 
superset is, in a sense, "equal" to the set. It would be inappropriate for `==` 
to return `True` for such a comparison, however, since that would break the 
`Hashable` contract.

Should sets have an additional method, something like `like(other)`, 
`issimilar(other)`, or `isequivalent(other)`, that returns `True` for any 
iterable that contains all of the items in the set and no items that are 
not in the set? For a set `s`, it would therefore be true in the same cases 
where `s == set(other)` or `s.issubset(other) and s.issuperset(other)` is true.
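
A short demonstration of the current behavior, using a `frozenset` and a tuple so that both operands are hashable:

```python
s = frozenset({1, 2, 3})
other = (3, 1, 2, 2)  # a hashable iterable that contains a duplicate

print(s == other)                                  # False: never equal to a tuple
print(s == frozenset(other))                       # True
print(s.issubset(other) and s.issuperset(other))   # True
```

The proposed method would return the same answer as the last two lines without requiring the caller to build a second set first.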
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/ULQQ7TZBPQN3RAGKIP52XHFD6LR4HIB4/

[Python-ideas] Re: Formalized pretty & encoding-aware object representation (was dunder methods for...)

2020-03-20 Thread Steve Jorgensen

Steve Jorgensen wrote:
> Based on the conversations stemming from my previous post, it is clear that 
> the topic
> was too implementation-specific. It is not clear whether dunder methods are 
> an appropriate
> component of the solution (they might or might not be).
> Also, it presumably makes sense to start by looking at prior art rather than 
> inventing
> from scratch.
> Quotes from previous thread regarding prior art to look at:
> Jonathan Fine wrote:
> 
> > Here's some comments on the state of the art. In
> > addition to
> > https://docs.python.org/3/library/pprint.html
> > there's also
> > https://docs.python.org/3/library/reprlib.html
> > and
> > https://docs.python.org/3/library/json.html
> > I expect that these three modules have some overlap in purpose and design
> > (but probably not in code).
> > And if you're brave, there's also
> > https://docs.python.org/3/library/pickle.html
> > and
> > https://github.com/psf/black
> > Time to declare a special interest. I'm a long-time user and great fan of
> > TeX / LaTeX. And some nice way of pretty-printing Python objects using TeX
> > notation could be useful.
> > And also related is Geoffrey French's Larch environment for editing Python,
> > which has a pretty-printing component.
> > http://www.britefury.com/larch_site/
> > with best wishes
> > Jonathan
> > Alex Hall wrote:
> > Might be helpful to look at https://github.com/tommikaikkonen/prettyprinter
> >  and https://github.com/wolever/pprintpp
> >
> Angus Hollands wrote:
> > Has anyone mentioned the IPython pretty printer yet? I'm late to the
> > conversation unfortunately, so apologies if someone else already raised it.
> > https://ipython.readthedocs.io/en/stable/api/generated/IPython.lib.pretty.html#IPython.lib.pretty.pretty
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/3WCI2E3BL4VBJ6W33PWNZLR25YUW3662/

[Python-ideas] Re: Syntax for loop invariants

2020-03-17 Thread Steve Jorgensen

haael wrote:
> Python has more and more optional tools for formal correctness checking. 
> Why not loop invariants?
> A loop invariant is a statement such that if it was true before a loop 
> iteration, it will also be true after the iteration. It can be 
> implemented as an assertion of an implication.
>     now_value = False
>     while running_condition(...):
>         prev_value = now_value
>         now_value = invariant_condition(...)
>         assert now_value if prev_value else True
> 
> Here for ellipsis we can substitute any values and variables.
> I propose the new syntax:
>     while running_condition(...):
>         invariant invariant_condition(...)
> 
> The keyword 'invariant' is allowed only inside a loop. The interpreter 
> will create a separate boolean variable holding the truth value of each 
> invariant. On the loop entry, the value is reset to false. When the 
> 'invariant' statement is encountered, the interpreter will evaluate the 
> expression, test the implication 'prev_value -> now_value' and update 
> the value. If the implication is not met, an exception will be thrown 
> 'InvariantError' which is a subclass of 'AssertionError'.
> Like assertions, invariants will be checked only in debug mode.
> I am developing a library for formal proofs and such a feature would be 
> handy.
> haael

If something like this would be appropriate to have, then maybe it would be 
more appropriate to have a more general-purpose DbC-like (design by contract) 
capability that could be used to check various kinds of pre/post conditions 
around various kinds of code constructs.
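
As a rough illustration of what a more general design-by-contract helper might 
look like, here is a minimal sketch built on a context manager. The names 
`contract` and `ContractError` are hypothetical (they are not part of the 
standard library), and the conditions are passed as zero-argument callables so 
that they are evaluated at the moment of the check.

```python
from contextlib import contextmanager


class ContractError(AssertionError):
    """Hypothetical error raised when a pre- or postcondition fails."""


@contextmanager
def contract(pre=None, post=None):
    # Like assert statements, the checks are skipped when Python runs with -O.
    if __debug__ and pre is not None and not pre():
        raise ContractError("precondition failed")
    yield
    if __debug__ and post is not None and not post():
        raise ContractError("postcondition failed")


# Checking the same condition before and after each iteration
# approximates a loop invariant.
total = 0
for i in range(5):
    invariant = lambda: total >= 0
    with contract(pre=invariant, post=invariant):
        total += i
```

The same helper could wrap a function body or any other block, which is the 
sense in which it is more general than a loop-only keyword.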
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/TEBTN2W4TNZUCELN3ZBORZEYWSYI6XHK/

[Python-ideas] Re: Formalized pretty & encoding-aware object representation (was dunder methods for...)

2020-03-16 Thread Steve Jorgensen

Oops. Somehow this subject was posted twice. Please ignore this thread & follow 
the other thread with the same subject line.
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/NBPCLSFPDZIK2SGDUDK7CHHMHXROD7X5/

[Python-ideas] Re: Formalized pretty & encoding-aware object representation (was dunder methods for...)

2020-03-16 Thread Steve Jorgensen

Steve Jorgensen wrote:
> Based on the conversations stemming from my previous post, it is clear that 
> the topic
> was too implementation-specific. It is not clear whether dunder methods are 
> an appropriate
> component of the solution (they might or might not be).
> Also, it presumably makes sense to start by looking at prior art rather than 
> inventing
> from scratch.

There has been some argument regarding whether objects should say how to 
present themselves "prettily". I think a case can be made either way, but in 
either case, it makes sense that it should be easy to override the 
representation for an object type without subclassing or monkey-patching it. 
Also, it might make sense not to clutter up the dunder-method space for all 
kinds of objects with this kind of thing.

Without using dunder methods, it could still be possible for any body of code 
to provide default special-representational rules for its objects by 
registering hooks. Also, as a hybrid approach, the defaults for representation 
could be determined first by looking at a default registry and then by falling 
back to dunder methods if present.
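
To make the hybrid idea concrete, here is a minimal sketch of a lookup that 
consults a registry first and falls back to a dunder method. Everything named 
here (`register_pretty`, `pretty`, `__pretty__`, and the `Point` class) is 
hypothetical and only meant to show the lookup order, not a proposed API.

```python
_pretty_registry = {}


def register_pretty(cls, hook):
    """Register a representation hook for cls without touching the class."""
    _pretty_registry[cls] = hook


def pretty(obj):
    # 1. Registry first: the most specific registered class in the MRO wins.
    for klass in type(obj).__mro__:
        hook = _pretty_registry.get(klass)
        if hook is not None:
            return hook(obj)
    # 2. Fall back to a (hypothetical) dunder method if the type defines one.
    dunder = getattr(type(obj), "__pretty__", None)
    if dunder is not None:
        return dunder(obj)
    # 3. Last resort: the ordinary repr.
    return repr(obj)


class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __pretty__(self):
        return f"Point<{self.x}, {self.y}>"


# Override the representation of a built-in type without subclassing it.
register_pretty(float, lambda value: f"{value:.2f}")

print(pretty(3.14159))      # 3.14
print(pretty(Point(1, 2)))  # Point<1, 2>
print(pretty(7))            # 7
```

Because the registry is checked before the dunder method, a caller could also 
register a hook for `Point` to override its built-in `__pretty__` without 
subclassing or monkey-patching.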
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/EMXMEPFSXTUMFGY2LN5UHWCJYSVBKEEK/

[Python-ideas] Formalized pretty & encoding-aware object representation (was dunder methods for...)

2020-03-16 Thread Steve Jorgensen

Based on the conversations stemming from my previous post, it is clear that the 
topic was too implementation-specific. It is not clear whether dunder methods 
are an appropriate component of the solution (they might or might not be).

This suggestion is an attempt to solve two interrelated but different issues, 
possibly through the same mechanism, two unrelated mechanisms, or partially 
overlapping mechanisms.

Although the current `__str__` and `__repr__` concepts seem perfectly 
appropriate to me, I think there is also justification for a means of having 
standard pretty-informal (str-like) and pretty-formal (repr-like) 
representations for various types of object. In the informal case, it should 
be possible to pass information about the file object that the representation 
will be written to (especially its encoding and possibly `isatty()`) to the 
representation code. In the formal case, either the representation code should 
interact with the pretty-printer or it should be able to return data in a form 
that tells the pretty-printer how to nest portions of the representation.

It presumably makes sense to start by looking at prior art rather than 
inventing from scratch.

Quotes from previous thread regarding prior art to look at:

Jonathan Fine wrote:

> Here's some comments on the state of the art. In addition to
> https://docs.python.org/3/library/pprint.html
> there's also
> https://docs.python.org/3/library/reprlib.html
> and
> https://docs.python.org/3/library/json.html
> I expect that these three modules have some overlap in purpose and design
> (but probably not in code).
> And if you're brave, there's also
> https://docs.python.org/3/library/pickle.html
> and
> https://github.com/psf/black
> Time to declare a special interest. I'm a long-time user and great fan of
> TeX / LaTeX. And some nice way of pretty-printing Python objects using TeX
> notation could be useful.
> And also related is Geoffrey French's Larch environment for editing Python,
> which has a pretty-printing component.
> http://www.britefury.com/larch_site/
> with best wishes
> Jonathan

Alex Hall wrote:
> Might be helpful to look at https://github.com/tommikaikkonen/prettyprinter
>  and https://github.com/wolever/pprintpp
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/GLXVPG6UAOTEKDVCV362CTGB4EGYYWPP/

[Python-ideas] Formalized pretty & encoding-aware object representation (was dunder methods for...)

2020-03-16 Thread Steve Jorgensen

Based on the conversations stemming from my previous post, it is clear that the 
topic was too implementation-specific. It is not clear whether dunder methods 
are an appropriate component of the solution (they might or might not be).

Also, it presumably makes sense to start by looking at prior art rather than 
inventing from scratch.

Quotes from previous thread regarding prior art to look at:

Jonathan Fine wrote:

> Here's some comments on the state of the art. In addition to
> https://docs.python.org/3/library/pprint.html
> there's also
> https://docs.python.org/3/library/reprlib.html
> and
> https://docs.python.org/3/library/json.html
> I expect that these three modules have some overlap in purpose and design
> (but probably not in code).
> And if you're brave, there's also
> https://docs.python.org/3/library/pickle.html
> and
> https://github.com/psf/black
> Time to declare a special interest. I'm a long-time user and great fan of
> TeX / LaTeX. And some nice way of pretty-printing Python objects using TeX
> notation could be useful.
> And also related is Geoffrey French's Larch environment for editing Python,
> which has a pretty-printing component.
> http://www.britefury.com/larch_site/
> with best wishes
> Jonathan

Alex Hall wrote:
> Might be helpful to look at https://github.com/tommikaikkonen/prettyprinter
>  and https://github.com/wolever/pprintpp
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/MQVTUDPIX7LWTPMPSBAQLPCDZSPMBUEU/

[Python-ideas] Re: Magnitude and ProtoMagnitude ABCs — primarily for argument validation

2020-03-15 Thread Steve Jorgensen

Steve Jorgensen wrote:
> Steve Jorgensen wrote:
> > Steve Jorgensen wrote:
> > 
> > The problem I came up with trying to spike out
> > my
> > proposal last night is that there
> > doesn't seem to be any way to implement it without creating infinite 
> > recursion in the
> > issubclass call. If I make Orderable a real or virtual subclass
> > of ProtoOrderable and Orderable's __subclasshook__
> > or metaclass __subclasscheck__ (I tried both ways) tries to check whether
> > C is a subclass of ProtoOrderable, then an infinite recursion
> > occurs.
> > It wasn't immediately obvious to me why that is the case, but when I 
> > thought about it
> > deeply, I can see why that must happen.
> > An alternative that I thought about previously but seems very smelly to me 
> > for several
> > reasons is to have both Orderable and NonOrderable ABCs. In that
> > case, what should be done to prevent a class from being both orderable and 
> > non-orderable
> > or figure out which should take precedence in that case?
> > As a meta-solution (wild-assed idea) what if metaclass registration could 
> > accept
> > keyword arguments, similar to passing keyword arguments to a class 
> > definition? That way,
> > a
> > single ABC (ProtoOrderable or whatever better name) could be a real or
> > virtual subclass that is explicitly orderable or non-orderable depending on
> > orderable=.
> > I have been unable to implement the class hierarchy that I proposed, and I 
> > think
> > I've determined that it's just not a practical fit with how the virtual base 
> > class
> > mechanism works, so…
> > Maybe just a single TotalOrdered or TotalOrderable ABC with a
> > register_explicit_only method. The __subclasshook__ method would
> > skip the rich comparison methods check and return NotImplemented for any
> > class registered using register_explicit_only (or any of its true
> > subclasses).
> > The only weird edge case in the above is that if someone registers another 
> > ABC using
> > TotalOrdered.register_explicit_only and uses that as a virtual base class of
> > something else, the register_explicit_only registration will not apply to 
> > the
> > virtual subclass. I'm thinking that's completely acceptable as a known 
> > limitation if
> > documented?
> > Code spike of that idea:
> from abc import ABCMeta
> from weakref import WeakSet
> 
> 
> class TotallyOrderable(metaclass=ABCMeta):
>     _explicit_only_registry = WeakSet()
> 
>     @classmethod
>     def register_explicit_only(cls, C):
>         if cls is not TotallyOrderable:
>             raise NotImplementedError(
>                 f"{cls.__name__} does not implement 'register_explicit_only'")
> 
>         cls._explicit_only_registry.add(C)
> 
>     @classmethod
>     def __subclasshook__(cls, C):
>         if cls is not TotallyOrderable:
>             return NotImplemented
> 
>         for B in C.__mro__:
>             if B in cls._explicit_only_registry:
>                 return NotImplemented
> 
>         return cls._check_overrides_rich_comparison_methods(C)
> 
>     @classmethod
>     def _check_overrides_rich_comparison_methods(cls, C):
>         mro = C.__mro__
>         for method in ('__lt__', '__le__', '__gt__', '__ge__'):
>             for B in mro:
>                 if B is not object and method in B.__dict__:
>                     if B.__dict__[method] is None:
>                         return NotImplemented
>                     break
>             else:
>                 return NotImplemented
>         return True

Naming question: Should an abstract base class for this concept be named 
`TotalOrderable`, `TotallyOrderable`, `TotalOrdered`, or `TotallyOrdered`?
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/IPVNBE6VQZJZPF5ZB7XLPCAIX47SBMIL/

[Python-ideas] Re: dunder methods for encoding & prettiness aware formal & informal representations

2020-03-15 Thread Steve Jorgensen

Alex Hall wrote:
> Might be helpful to look at https://github.com/tommikaikkonen/prettyprinter
>  and https://github.com/wolever/pprintpp

Right! Thx. :)
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/HHE6NKJ5C7HBYJO2ASHXMKYLVC6ZBVLE/

[Python-ideas] Re: dunder methods for encoding & prettiness aware formal & informal representations

2020-03-15 Thread Steve Jorgensen

Jonathan Fine wrote:
> Hi Steve (for clarity Jorgensen)
> Thank you for your good idea, and your enthusiasm. And I thank Guido, for
> suggesting a good contribution this list can make.
> Here's some comments on the state of the art. In addition to
> https://docs.python.org/3/library/pprint.html
> there's also
> https://docs.python.org/3/library/reprlib.html
> and
> https://docs.python.org/3/library/json.html
> I expect that these three modules have some overlap in purpose and design
> (but probably not in code).
> And if you're brave, there's also
> https://docs.python.org/3/library/pickle.html
> and
> https://github.com/psf/black
> Time to declare a special interest. I'm a long-time user and great fan of
> TeX / LaTeX. And some nice way of pretty-printing Python objects using TeX
> notation could be useful.
> And also related is Geoffrey French's Larch environment for editing Python,
> which has a pretty-printing component.
> http://www.britefury.com/larch_site/
> with best wishes
> Jonathan

I feel kind of silly for jumping right to the idea of prototyping rather than 
looking for prior art. :)

It clearly makes more sense to choose an existing popular library as a 
candidate starting point for promotion into the stdlib rather than starting 
from scratch.
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/OHMJBHPBLOUQ42E6J4YGOGNUBLWUFAH4/

[Python-ideas] Re: dunder methods for encoding & prettiness aware formal & informal representations

2020-03-15 Thread Steve Jorgensen

Guido van Rossum wrote:
> I think the idea you're looking for is an alternative for the pprint module
> that allows classes to have formatting hooks that get passed in some
> additional information (or perhaps a PrettyPrinter object) that can affect
> the formatting.
> This would seem to be an ideal thing to try to design and put on PyPI,
> except it would be more effective if there was a standard, rather than
> several competing such modules, with different APIs for the formatting
> hooks.
> So I encourage having a discussion (might as well be here) about the design
> of the new PrettyPrinter API.

I like it. :)

I'll do some prototyping to see if I come up with any promising patterns.
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/RPW742V6X3UAXGNZ5GZVOFFZIKSO5MCL/

[Python-ideas] dunder methods for encoding & prettiness aware formal & informal representations

2020-03-15 Thread Steve Jorgensen

This is really an idea for an idea. I'm not sure what the ideal dunder method 
names or APIs should be.


Encoding awareness:

The informal (`str`) representations of `inf` and `-inf` are "inf" and "-inf", 
and that seems appropriate as a known-safe value. If we're writing the 
representation to a stream that has a Unicode encoding, though, those values 
might prefer to represent themselves as "∞" and "-∞". If there were a dunder 
method for informal representation to which the destination stream was passed, 
then the object could decide how to represent itself based on the properties of 
the stream.
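
As a sketch of what stream-aware informal representation could mean in 
practice, the function below picks "∞" only when the destination stream 
reports an encoding that can represent it. The function name `informal_float` 
is hypothetical, and a plain function stands in for whatever dunder method 
might eventually be chosen.

```python
import io
import math


def informal_float(value, stream):
    """Return an informal representation of value suited to stream."""
    text = str(value)  # known-safe default, e.g. "inf" or "-inf"
    if math.isinf(value):
        fancy = "∞" if value > 0 else "-∞"
        encoding = getattr(stream, "encoding", None)
        if encoding:
            try:
                fancy.encode(encoding)
            except (UnicodeEncodeError, LookupError):
                pass  # the stream cannot represent the symbol
            else:
                text = fancy
    return text


utf8 = io.TextIOWrapper(io.BytesIO(), encoding="utf-8")
ascii_stream = io.TextIOWrapper(io.BytesIO(), encoding="ascii")

print(informal_float(float("inf"), utf8))          # ∞
print(informal_float(float("-inf"), utf8))         # -∞
print(informal_float(float("inf"), ascii_stream))  # inf
```

This version returns the text rather than writing to the stream, which is one 
of the two options raised in the open questions.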


Prettiness awareness:

It would be nice if an object could have control of how it is represented when 
pretty-printed. If there is any way for that to be done now, it is not at all 
evident from the pprint module documentation. It would be nice if there were 
some method that, if implemented for the object, would be used to allow the 
object to tell the pretty printer to treat it as a composite with starting 
text, component objects, and ending text.
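
One possible shape for such a method is sketched below. The method name 
`__pretty_parts__`, the `pretty_format` function, and the `Pair` class are all 
hypothetical. The object only reports its starting text, components, and 
ending text, and the printer decides where to break lines.

```python
def pretty_format(obj, indent=0, width=40):
    """Format obj, nesting any object that reports composite parts."""
    parts = getattr(type(obj), "__pretty_parts__", None)
    if parts is None:
        return repr(obj)
    start, components, end = parts(obj)
    rendered = [pretty_format(c, indent + 4, width) for c in components]
    one_line = start + ", ".join(rendered) + end
    # Keep everything on one line when it fits within the width.
    if "\n" not in one_line and indent + len(one_line) <= width:
        return one_line
    # Otherwise put each component on its own indented line.
    pad = " " * (indent + 4)
    body = ",\n".join(pad + r for r in rendered)
    return f"{start}\n{body}\n{' ' * indent}{end}"


class Pair:
    def __init__(self, first, second):
        self.first, self.second = first, second

    def __pretty_parts__(self):
        # starting text, component objects, ending text
        return "Pair(", [self.first, self.second], ")"


print(pretty_format(Pair(1, 2)))
print(pretty_format(Pair(1, 2), width=5))
```

With the default width the first call prints `Pair(1, 2)` on one line, while 
the narrow width in the second call forces one component per line.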


Additional thoughts & open questions:

Perhaps there should only be stream awareness for informal representation and 
prettiness awareness for formal representation (separate concepts and APIs) or 
perhaps both ideas are applicable to both kinds of representation.

Is it better for a stream-aware representation method to return the value to be 
written to the stream or to directly append its representation to that stream?
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/OQPPJ7SNM5CZUI5RYT5R4Z6YZWMNNTZS/

[Python-ideas] More appropriate behavior for the NotImplemented object

2020-03-11 Thread Steve Jorgensen

I realize this is probably something that would be hard to change for 
compatibility reasons. Maybe someone can think of a way around that though?

It seems to me that `not NotImplemented` should result in `NotImplemented` and 
attempting to convert it to `bool` should raise a `TypeError` exception.

Take the following example:

```
def __lt__(self, other):
    return not self.__ge__(other)

def __le__(self, other):
    return not self.__gt__(other)

def __ge__(self, other):
    ...  # body omitted in the original message
```

Currently, this will not work because `NotImplemented` is truthy and `not 
NotImplemented` is `False`, so it is necessary to complicate the 
implementations of `__lt__` and `__le__` to specifically check whether the 
complementary method returned `NotImplemented`.

If the value of `not NotImplemented` were `NotImplemented`, then the coding 
pattern above would simply work.
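
For comparison, here is a minimal sketch of the more complicated form that is 
needed today, with an explicit check for `NotImplemented` before negating. The 
`Version` class is a made-up example and is not taken from any library.

```python
class Version:
    def __init__(self, number):
        self.number = number

    def __ge__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self.number >= other.number

    def __lt__(self, other):
        result = self.__ge__(other)
        # Pass NotImplemented through untouched instead of negating it.
        if result is NotImplemented:
            return NotImplemented
        return not result


print(Version(1) < Version(2))                  # True
print(Version(1).__lt__(3) is NotImplemented)   # True
```

The identity test (`is NotImplemented`) avoids ever evaluating the truth value 
of `NotImplemented`, which is the step that goes wrong in the shorter pattern.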
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/U7GOYMMMBQQPSD45JDNCSOO7VULDZTD6/

[Python-ideas] Re: Magnitude and ProtoMagnitude ABCs — primarily for argument validation

2020-03-05 Thread Steve Jorgensen

Andrew Barnert wrote:
> On Mar 5, 2020, at 11:05, Steve Jorgensen ste...@stevej.name wrote:
> > Steve Jorgensen wrote:
> > Steve Jorgensen wrote:
> > 
> > The problem I came up with trying to spike out
> > my
> > proposal last night is that there
> > doesn't seem to be any way to implement it without creating infinite 
> > recursion in the
> > issubclass call.
> Is this something we should be looking to add to the ABC mechanism in
> general?
> Would a way to “unregister” classes that would be implicitly accepted be 
> simpler than a
> way to “register_explicit_only” classes so they skip the implicit test?
> > If I make Orderable a real or virtual subclass
> > of ProtoOrderable and Orderable's __subclasshook__
> > or metaclass __subclasscheck__ (I tried both ways) tries to check whether
> > C is a subclass of ProtoOrderable, then an infinite recursion
> > occurs.
> > It wasn't immediately obvious to me why that is the case, but when I 
> > thought about it
> > deeply, I can see why that must happen.
> > An alternative that I thought about previously but seems very smelly to me 
> > for several
> > reasons is to have both Orderable and NonOrderable ABCs. In that
> > case, what should be done to prevent a class from being both orderable and 
> > non-orderable
> > or figure out which should take precedence in that case?
> > As a meta-solution (wild-assed idea) what if metaclass registration could 
> > accept
> > keyword arguments, similar to passing keyword arguments to a class 
> > definition? That way,
> > a
> > single ABC (ProtoOrderable or whatever better name) could be a real or
> > virtual subclass that is explicitly orderable or non-orderable depending on
> > orderable=.
> > I have been unable to implement the class hierarchy that I proposed, and I 
> > think
> > I've determined that it's just not a practical fit with how the virtual base 
> > class
> > mechanism works, so…
> > Maybe just a single TotalOrdered or TotalOrderable ABC with a
> > register_explicit_only method. The __subclasshook__ method would
> > skip the rich comparison methods check and return NotImplemented for any
> > class registered using register_explicit_only (or any of its true
> > subclasses).
> > The only weird edge case in the above is that if someone registers another 
> > ABC using
> > TotalOrdered.register_explicit_only and uses that as a virtual base class of
> > something else, the register_explicit_only registration will not apply to 
> > the
> > virtual subclass. I'm thinking that's completely acceptable as a known 
> > limitation if
> > documented?
> > Code spike of that idea:
> > from abc import ABCMeta
> > from weakref import WeakSet
> > 
> > 
> > class TotallyOrderable(metaclass=ABCMeta):
> >     _explicit_only_registry = WeakSet()
> > 
> >     @classmethod
> >     def register_explicit_only(cls, C):
> >         if cls is not TotallyOrderable:
> >             raise NotImplementedError(
> >                 f"{cls.__name__} does not implement 'register_explicit_only'")
> > 
> >         cls._explicit_only_registry.add(C)
> > 
> >     @classmethod
> >     def __subclasshook__(cls, C):
> >         if cls is not TotallyOrderable:
> >             return NotImplemented
> > 
> >         for B in C.__mro__:
> >             if B in cls._explicit_only_registry:
> >                 return NotImplemented
> > 
> >         return cls._check_overrides_rich_comparison_methods(C)
> > 
> >     @classmethod
> >     def _check_overrides_rich_comparison_methods(cls, C):
> >         mro = C.__mro__
> >         for method in ('__lt__', '__le__', '__gt__', '__ge__'):
> >             for B in mro:
> >                 if B is not object and method in B.__dict__:
> >                     if B.__dict__[method] is None:
> >                         return NotImplemented
> >                     break
> >             else:
> >                 return NotImplemented
> >         return True

Maybe so because I found a limitation with my code spike. Calling 
`register_explicit_only` doesn't bust

[Python-ideas] Re: Magnitude and ProtoMagnitude ABCs — primarily for argument validation

2020-03-05 Thread Steve Jorgensen

Steve Jorgensen wrote:
> Steve Jorgensen wrote:
> 
> > The problem I came up with trying to spike out my
> > proposal last night is that there
> > doesn't seem to be any way to implement it without creating infinite 
> > recursion in the
> > issubclass call. If I make Orderable a real or virtual subclass
> > of ProtoOrderable and Orderable's __subclasshook__
> > or metaclass __subclasscheck__ (I tried both ways) tries to check whether
> > C is a subclass of ProtoOrderable, then an infinite recursion
> > occurs.
> > It wasn't immediately obvious to me why that is the case, but when I 
> > thought about it
> > deeply, I can see why that must happen.
> > An alternative that I thought about previously but seems very smelly to me 
> > for several
> > reasons is to have both Orderable and NonOrderable ABCs. In that
> > case, what should be done to prevent a class from being both orderable and 
> > non-orderable
> > or figure out which should take precedence in that case?
> > As a meta-solution (wild-assed idea) what if metaclass registration could 
> > accept
> > keyword arguments, similar to passing keyword arguments to a class 
> > definition? That way,
> > a
> > single ABC (ProtoOrderable or whatever better name) could be a real or
> > virtual subclass that is explicitly orderable or non-orderable depending on
> > orderable=.
> > I have been unable to implement the class hierarchy that I proposed, and I 
> > think
> I've determined that it's just not a practical fit with how the virtual base 
> class
> mechanism works, so…
> Maybe just a single TotalOrdered or TotalOrderable ABC with a
> register_explicit_only method. The __subclasshook__ method would
> skip the rich comparison methods check and return NotImplemented for any
> class registered using register_explicit_only (or any of its true
> subclasses).
> The only weird edge case in the above is that if someone registers another 
> ABC using
> TotalOrdered.register_explicit_only and uses that as a virtual base class of
> something else, the register_explicit_only registration will not apply to the
> virtual subclass. I'm thinking that's completely acceptable as a known 
> limitation if
> documented?

Code spike of that idea:
```
from abc import ABCMeta
from weakref import WeakSet


class TotallyOrderable(metaclass=ABCMeta):
    # Classes (and their true subclasses) that must be registered explicitly.
    _explicit_only_registry = WeakSet()

    @classmethod
    def register_explicit_only(cls, C):
        if cls is not TotallyOrderable:
            raise NotImplementedError(
                f"{cls.__name__} does not implement 'register_explicit_only'")

        cls._explicit_only_registry.add(C)

    @classmethod
    def __subclasshook__(cls, C):
        if cls is not TotallyOrderable:
            return NotImplemented

        # Skip the implicit check for explicit-only classes.
        for B in C.__mro__:
            if B in cls._explicit_only_registry:
                return NotImplemented

        return cls._check_overrides_rich_comparison_methods(C)

    @classmethod
    def _check_overrides_rich_comparison_methods(cls, C):
        mro = C.__mro__
        for method in ('__lt__', '__le__', '__gt__', '__ge__'):
            for B in mro:
                if B is not object and method in B.__dict__:
                    if B.__dict__[method] is None:
                        return NotImplemented
                    break
            else:
                # No class other than object defines this method.
                return NotImplemented
        return True
```
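
A short usage sketch may help show what the spike is meant to do. The class 
below is a condensed restatement of the spike (same names, with the guard in 
`register_explicit_only` left out) so that the example runs on its own. The 
`Plain` class and the choice of `set` as the explicit-only class are 
illustrative assumptions. Registration happens before any `issubclass` check 
because `ABCMeta` caches the results of earlier checks.

```python
from abc import ABCMeta
from weakref import WeakSet

_METHODS = ('__lt__', '__le__', '__gt__', '__ge__')


class TotallyOrderable(metaclass=ABCMeta):
    _explicit_only_registry = WeakSet()

    @classmethod
    def register_explicit_only(cls, C):
        cls._explicit_only_registry.add(C)

    @classmethod
    def __subclasshook__(cls, C):
        if cls is not TotallyOrderable:
            return NotImplemented
        # Explicit-only classes never pass the implicit check.
        if any(B in cls._explicit_only_registry for B in C.__mro__):
            return NotImplemented
        for method in _METHODS:
            for B in C.__mro__:
                if B is not object and method in B.__dict__:
                    if B.__dict__[method] is None:
                        return NotImplemented
                    break
            else:
                return NotImplemented
        return True


class Plain:
    """Defines no rich comparison methods of its own."""


# set has all four methods but is only partially ordered, so opt it out.
TotallyOrderable.register_explicit_only(set)

print(issubclass(int, TotallyOrderable))        # True
print(issubclass(set, TotallyOrderable))        # False
print(issubclass(frozenset, TotallyOrderable))  # True (not opted out)
print(issubclass(Plain, TotallyOrderable))      # False
```

The `frozenset` result shows why the explicit opt-out is needed at all: the 
implicit check only sees that the four methods exist, not how they behave.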
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/2OZBPQPYIFFG2E6BS2EYLDJF2QP5FRTG/

[Python-ideas] Re: Magnitude and ProtoMagnitude ABCs — primarily for argument validation

2020-03-05 Thread Steve Jorgensen

Steve Jorgensen wrote:

> The problem I came up with trying to spike out my proposal last night is that 
> there
> doesn't seem to be any way to implement it without creating infinite recursion 
> in the
> issubclass call. If I make Orderable a real or virtual subclass
> of ProtoOrderable and Orderable's __subclasshook__
> or metaclass __subclasscheck__ (I tried both ways) tries to check whether
> C is a subclass of ProtoOrderable, then an infinite recursion
> occurs.
> It wasn't immediately obvious to me why that is the case, but when I thought 
> about it
> deeply, I can see why that must happen.
> An alternative that I thought about previously but seems very smelly to me 
> for several
> reasons is to have both Orderable and NonOrderable ABCs. In that
> case, what should be done to prevent a class from being both orderable and 
> non-orderable
> or figure out which should take precedence in that case?
> As a meta-solution (wild-assed idea) what if metaclass registration could 
> accept
> keyword arguments, similar to passing keyword arguments to a class 
> definition? That way, a
> single ABC (ProtoOrderable or whatever better name) could be a real or
> virtual subclass that is explicitly orderable or non-orderable depending on
> orderable=.

I have been unable to implement the class hierarchy that I proposed, and I 
think I've determined that it's just not a practical fit with how the virtual 
base class mechanism works, so…

Maybe just a single `TotalOrdered` or `TotalOrderable` ABC with a 
`register_explicit_only` method. The `__subclasshook__` method would skip the 
rich comparison methods check and return `NotImplemented` for any class 
registered using `register_explicit_only` (or any of its true subclasses).

The only weird edge case in the above is that if someone registers another ABC 
using `TotalOrdered.register_explicit_only` and uses that as a virtual base 
class of something else, the `register_explicit_only` registration will not 
apply to the virtual subclass. I'm thinking that's completely acceptable as a 
known limitation if documented?
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/AF4O3PFQ7VNHCUVBWB3NENYNGPU74SVX/

[Python-ideas] Re: Magnitude and ProtoMagnitude ABCs — primarily for argument validation

2020-03-04 Thread Steve Jorgensen

Andrew Barnert wrote:
> On Mar 4, 2020, at 00:07, Steve Jorgensen ste...@stevej.name wrote:
> > Taking one step back out of the realm of mathematical
> > definition, however, the original idea was simply to distinguish what I now 
> > understand to
> > be "totally ordered" types from other types, be they "partially ordered" or 
> > unordered —
> > not even having a full complement of rich comparison operators or having 
> > all but using
> > them in weirder ways than sets do.
> Is there any commonly used or even imaginable useful type that uses them in
> weirder ways than set and float (which are both partially ordered) or 
> np.array (where they
> aren’t even Boolean-valued)? In particular, transitivity keeps coming up, but 
> all of those
> examples are transitive (it’s never true that a being
> true than a to
> distinguish them, but if there aren’t, it doesn’t seem unreasonable for 
> PartiallyOrdered
> to “wrongly” pick up hypothetical pathological types that no one will ever 
> write in
> exchange for automatically being right about every actual type anyone uses. 
> After all,
> Iterable is a virtual superclass of any type with __iter__, even if it 
> returns the number
> 42 instead of an Iterator, and so on; technically every implicit ABC in 
> Python is “wrong”
> like this, but in practice it doesn’t come up and implicit ABCs are very 
> useful.

I see what you're saying. I guess what I was getting at is that for purposes of 
determining whether something is totally orderable or not, it doesn't matter 
what kind of not-totally-orderable the thing is — partially orderable (like 
sets), non-orderable (without a full complement of operators), or some other 
weird thing that has the full complement of operators.
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/WDB6UPXAMCJUMWNZBEJ2466JCBGU5PIH/

[Python-ideas] Re: Magnitude and ProtoMagnitude ABCs — primarily for argument validation

2020-03-04 Thread Steve Jorgensen

Stéfane Fermigier wrote:
> On Wed, Mar 4, 2020 at 8:24 AM Steve Jorgensen ste...@stevej.name wrote:
> > Chris Angelico wrote:
> > On Wed, Mar 4, 2020 at 6:04 PM Steve Jorgensen
> > ste...@stevej.name wrote:
> >  
> > https://en.wikipedia.org/wiki/Partially_ordered_set
> > "Partially ordered" means you can compare pairs of elements and find
> > which one comes first. "Totally ordered" means you can compare ANY
> > pair of elements, and you'll always know which comes first.
> > ChrisA
> > Ah. Good to know. I don't think "Partially ordered" actually applies,
> > then, because that still seems to imply that transitivity would apply to
> > comparisons between any given pair of objects. Simply having
> > implementations of all the rich comparison operators does not make that
> > true, however, and in particular, that's not true for sets.
> > Not quite: https://en.wikipedia.org/wiki/Partially_ordered_set#Examples
> (see
> example 2).
> Or:
> https://math.stackexchange.com/questions/1305004/what-is-meant-by-ordering-o...
> S.

Ah! That Wikipedia article is very helpful. I see that it is not necessary for 
all items in a partially ordered set to be comparable.

Taking one step back out of the realm of mathematical definition, however, the 
original idea was simply to distinguish what I now understand to be "totally 
ordered" types from other types, be they "partially ordered" or unordered — not 
even having a full complement of rich comparison operators or having all but 
using them in weirder ways than sets do.
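
To make the distinction concrete, here is a minimal sketch that checks the partial-order axioms for `<=` on a handful of sets and then checks totality. The helper names `is_partial_order` and `is_total` are made up for illustration, and a brute-force check over a small sample only demonstrates the idea; it proves nothing about the type in general.

```python
from itertools import product

# A small sample of sets; {1, 2} and {1, 3} are the incomparable pair.
samples = [set(), {1}, {2}, {1, 2}, {1, 3}, {1, 2, 3}]


def is_partial_order(items):
    """Check reflexivity, antisymmetry and transitivity of <= on items."""
    reflexive = all(a <= a for a in items)
    antisymmetric = all(
        a == b for a, b in product(items, repeat=2) if a <= b and b <= a
    )
    transitive = all(
        a <= c for a, b, c in product(items, repeat=3) if a <= b and b <= c
    )
    return reflexive and antisymmetric and transitive


def is_total(items):
    """Check that every pair is comparable in at least one direction."""
    return all(a <= b or b <= a for a, b in product(items, repeat=2))


print(is_partial_order(samples))  # the subset relation satisfies the axioms
print(is_total(samples))          # but some pairs are not comparable
print(is_total([1, 2, 3]))        # integers are comparable pairwise
```

On this sample the subset relation passes all three axioms but fails the totality check, which is exactly the "partially ordered but not totally ordered" case.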
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/S6VZ4DWZBL3NLBFZKJYPN5EE5OMRAF3V/

[Python-ideas] Re: Magnitude and ProtoMagnitude ABCs — primarily for argument validation

2020-03-03 Thread Steve Jorgensen

Steve Jorgensen wrote:
> Chris Angelico wrote:
> > On Wed, Mar 4, 2020 at 6:04 PM Steve Jorgensen
> > ste...@stevej.name wrote:
> >  
> > https://en.wikipedia.org/wiki/Partially_ordered_set
> > "Partially ordered" means you can compare pairs of elements and find
> > which one comes first. "Totally ordered" means you can compare ANY
> > pair of elements, and you'll always know which comes first.
> > ChrisA
> > Ah. Good to know. I don't think "Partially ordered" actually applies, then,
> because that still seems to imply that transitivity would apply to 
> comparisons between any
> given pair of objects. Simply having implementations of all the rich 
> comparison operators
> does not make that true, however, and in particular, that's not true for sets.
> If we consider just the sets {1, 2} and {1, 3}, …
> In [1]: {1, 2} < {1, 3}
> Out[1]: False
> 
> In [2]: {1, 2} >= {1, 3}
> Out[2]: False
> 
> Neither is a subset of the other, so both of those tests return
> False.

Ah. Maybe I'm arguing against a different point than what you were making then. 
Just because sets are not partially ordered does not mean that "partially 
ordered" is not a useful distinction in addition to "totally ordered".

In that case, maybe the hierarchy would be something like…
* ProtoOrdered (or ProtoOrderable): Orderability is explicit and never 
inferred. Unordered unless also a subclass of PartiallyOrdered or 
TotallyOrdered.
* * PartiallyOrdered
* * * TotallyOrdered

A class that does not directly or virtually subclass any of those but 
implements all the rich comparison operators would be treated as an inferred 
virtual subclass of `TotallyOrdered`.
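
The decision rule described above can be sketched without any ABC machinery. Everything here is hypothetical: the `Ordering` enum, the `_EXPLICIT` dict (a stand-in for explicit registration) and the `ordering_of` function are illustration-only names, not a proposed API.

```python
from enum import Enum


class Ordering(Enum):
    UNORDERED = "unordered"
    PARTIAL = "partial"
    TOTAL = "total"


_COMPARISONS = ("__lt__", "__le__", "__gt__", "__ge__")

# Explicit declarations; these always win over inference.
_EXPLICIT = {set: Ordering.PARTIAL, frozenset: Ordering.PARTIAL}


def _defines(cls, name):
    """True if cls or a base other than object defines the method."""
    return any(name in base.__dict__ for base in cls.__mro__ if base is not object)


def ordering_of(cls):
    # 1. An explicit declaration on the class or any base is never overridden.
    for base in cls.__mro__:
        if base in _EXPLICIT:
            return _EXPLICIT[base]
    # 2. Otherwise infer "total" from a full set of comparison methods.
    if all(_defines(cls, name) for name in _COMPARISONS):
        return Ordering.TOTAL
    return Ordering.UNORDERED


class Plain:
    pass


class TaggedSet(set):
    pass


print(ordering_of(int))        # Ordering.TOTAL (inferred)
print(ordering_of(TaggedSet))  # Ordering.PARTIAL (explicit, via set)
print(ordering_of(Plain))      # Ordering.UNORDERED
```

The check skips `object` because `object` itself carries default comparison slots, so a plain `hasattr` test would succeed for every class.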
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/7FV27SQYFR6M66JHHYMFW7EDKHXNJ3MJ/

[Python-ideas] Re: Magnitude and ProtoMagnitude ABCs — primarily for argument validation

2020-03-03 Thread Steve Jorgensen

Chris Angelico wrote:
> On Wed, Mar 4, 2020 at 6:04 PM Steve Jorgensen ste...@stevej.name wrote:

> https://en.wikipedia.org/wiki/Partially_ordered_set
> "Partially ordered" means you can compare pairs of elements and find
> which one comes first. "Totally ordered" means you can compare ANY
> pair of elements, and you'll always know which comes first.
> ChrisA

Ah. Good to know. I don't think "Partially ordered" actually applies, then, 
because that still seems to imply that transitivity would apply to comparisons 
between any given pair of objects. Simply having implementations of all the 
rich comparison operators does not make that true, however, and in particular, 
that's not true for sets.

If we consider just the sets `{1, 2}` and `{1, 3}`, …
```
In [1]: {1, 2} < {1, 3}
Out[1]: False

In [2]: {1, 2} >= {1, 3}
Out[2]: False
```
Neither is a subset of the other, so both of those tests return `False`.
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/3IOKPBV6DIQCJ5FNLVSMP3M7HHJ2STO2/

[Python-ideas] Re: Magnitude and ProtoMagnitude ABCs — primarily for argument validation

2020-03-03 Thread Steve Jorgensen

Greg Ewing wrote:
> On 4/03/20 7:42 am, Steve Jorgensen wrote:
> > That's a much better term. Orderable and
> > ProtoOrderable.
> > I would suggest "TotallyOrdered" and "PartiallyOrdered".

Possibly, but the reasoning is not obvious to me. Can you explain? I get that 
`TotallyOrdered` is consistent with 
https://docs.python.org/2/library/functools.html#functools.total_ordering, but 
I don't get the `PartiallyOrdered` term.

In case I was not sufficiently clear about my proposal (just making sure), the 
`Proto`… prefix in my concept simply means that the determination of whether the 
class is orderable is explicit and not determined by whether the rich comparison 
methods are present. A class that has `ProtoOrderable` but not `Orderable` as 
an actual or virtual base class is not orderable, but a class that is not a 
subclass of either is assumed to be orderable if it implements all the rich 
comparison methods.
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/LU3UFEXBQZJS2TUZQFCPVFH7Q37I62E7/

[Python-ideas] Re: Magnitude and ProtoMagnitude ABCs — primarily for argument validation

2020-03-03 Thread Steve Jorgensen

Guido van Rossum wrote:
> On Tue, Mar 3, 2020 at 10:43 AM Steve Jorgensen ste...@stevej.name wrote:
> > Guido van Rossum wrote:
> > I think it’s usually called Orderable. It’s a
> > useful concept in static
> > type
> > checking too (e.g. mypy), where we’d use it as an upper bound for type
> > variables, if we had it. I guess to exclude sets you’d have to introduce
> > TotalOrderable.
> > Right. That's a much better term. Orderable and
> > ProtoOrderable.
> > Or even PartialOrderable and Orderable. This would follow Rust's
> PartialOrd
> and Ord (https://doc.rust-lang.org/std/cmp/trait.PartialOrd.html
> and
> https://doc.rust-lang.org/std/cmp/trait.Ord.html).
> But beware, IIRC there are pathological cases involving floats, (long) ints
> and rounding where transitivity may be violated in Python (though I believe
> only Tim Peters can produce an example :-). I'm honestly not sure that
> that's enough to sink the idea. (If it were, NaN would be a bigger problem.)

Yeah. Violations of transitivity are already breaking their contracts, so 
having a new way of expressing the contract has no effect on that.

The problem I ran into while trying to spike out my proposal last night is that 
there doesn't seem to be any way to implement it without creating infinite 
recursion in the `issubclass` call. If I make `Orderable` a real or virtual 
subclass of `ProtoOrderable` and `Orderable`'s `__subclasshook__` or metaclass 
`__subclasscheck__` (I tried both ways) tries to check whether `C` is a 
subclass of `ProtoOrderable`, then an infinite recursion occurs.

It wasn't immediately obvious to me why that is the case, but after thinking 
it through, I can see why that must happen.
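
For what it's worth, the likely mechanism is that `ABCMeta.__subclasscheck__` also walks `cls.__subclasses__()`, so checking against `ProtoOrderable` ends up asking `Orderable` again. A minimal sketch of one way to sidestep that, under the assumption (not part of the proposal as written) that the two ABCs are siblings rather than parent and child:

```python
from abc import ABC

_ORDERING_METHODS = ("__lt__", "__le__", "__gt__", "__ge__")


def _defines_all(C, names):
    """True if every name is defined by C or a base other than object."""
    for name in names:
        for B in C.__mro__:
            if B is object:
                continue  # object's default comparison slots do not count
            if B.__dict__.get(name) is not None:
                break
        else:
            return False
    return True


class ProtoOrderable(ABC):
    """Marker: orderability must be declared explicitly, never inferred."""


class Orderable(ABC):
    # Deliberately NOT a subclass of ProtoOrderable (assumption of this sketch).
    @classmethod
    def __subclasshook__(cls, C):
        if cls is not Orderable:
            return NotImplemented
        if issubclass(C, ProtoOrderable):
            # Explicit territory: skip inference and fall through to the
            # ordinary inheritance and registration checks.
            return NotImplemented
        if _defines_all(C, _ORDERING_METHODS):
            return True
        return NotImplemented


# Sets have all the operators but opt out of being inferred as orderable.
ProtoOrderable.register(set)
ProtoOrderable.register(frozenset)

print(issubclass(int, Orderable))  # True (inferred)
print(issubclass(set, Orderable))  # False (explicitly not inferred)
```

Because `Orderable` is not among `ProtoOrderable.__subclasses__()`, the inner `issubclass` call never loops back into the hook. In this sketch a `ProtoOrderable` class opts back in with `Orderable.register(SomeClass)`; the cost is that `issubclass(Orderable, ProtoOrderable)` is no longer true.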

An alternative that I thought about previously but seems very smelly to me for 
several reasons is to have both `Orderable` and `NonOrderable` ABCs. In that 
case, what should be done to prevent a class from being both orderable and 
non-orderable or figure out which should take precedence in that case?

As a meta-solution (wild-assed idea) what if metaclass registration could 
accept keyword arguments, similar to passing keyword arguments to a class 
definition? That way, a single ABC (`ProtoOrderable` or whatever better name) 
could be a real or virtual subclass that is explicitly orderable or 
non-orderable depending on `orderable=`.
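
A rough sketch of what that could look like with today's tools, assuming a custom metaclass that extends `ABCMeta.register` with a keyword argument. The names `FlaggedABCMeta`, `is_orderable` and `_orderable_flags` are invented for illustration; nothing like this exists in the standard library.

```python
from abc import ABCMeta


class FlaggedABCMeta(ABCMeta):
    """Hypothetical metaclass whose register() records an `orderable` flag."""

    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        cls._orderable_flags = {}  # per-ABC map: registered class -> flag

    def register(cls, subclass, *, orderable=True):
        cls._orderable_flags[subclass] = orderable
        return super().register(subclass)

    def is_orderable(cls, candidate):
        """Return the explicit flag for candidate, or None if none was given."""
        for base in candidate.__mro__:
            if base in cls._orderable_flags:
                return cls._orderable_flags[base]
        return None


class ProtoOrderable(metaclass=FlaggedABCMeta):
    pass


ProtoOrderable.register(set, orderable=False)
ProtoOrderable.register(int, orderable=True)

print(ProtoOrderable.is_orderable(set))   # False
print(ProtoOrderable.is_orderable(bool))  # True, found via int in the MRO
print(ProtoOrderable.is_orderable(dict))  # None, nothing declared
```

This keeps a single ABC, so a class cannot end up declared both orderable and non-orderable; the latest registration simply replaces the flag.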
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/BFFPWPZGNPJCT3KFFU6DJHI5RBG2NBYC/

[Python-ideas] Re: Magnitude and ProtoMagnitude ABCs — primarily for argument validation

2020-03-03 Thread Steve Jorgensen

Guido van Rossum wrote:
> I think it’s usually called Orderable. It’s a useful concept in static type
> checking too (e.g. mypy), where we’d use it as an upper bound for type
> variables, if we had it. I guess to exclude sets you’d have to introduce
> TotalOrderable.
> On Tue, Mar 3, 2020 at 04:03 Steve Jorgensen ste...@stevej.name wrote:
> > I have encountered cases in which I would like to
> > validate that an
> > argument can be properly compared with other instances of its type. This is
> > true of numbers, strings, dates, … but not for NoneClass, type,
> > ….
> > One way that I have tried to handle this is to check whether the object
> > can be compared to itself using >, <, >=,
> > and <= and that it is
> > neither > or < itself and is both >= and
> > <= itself. The most
> > glaring example of why this is insufficient is the set type. A
> > set
> > object meets all of those criteria, but given any 2 instances, it is not
> > true that if set a > b is False then a <= b
> > is True. The operators
> > are not acting as comparisons of relative magnitude in this case but as
> > tests for superset/subset relations — which is fine and good but doesn't
> > help with this situation.
> > What I think would be helpful is to have a Magnitude abstract base class
> > that is a subclass of ProtoMagnitude (or whatever better names anyone can
> > imagine).
> > The subclass hook for Magnitude would return True for any
> > class with
> > instance methods for all of the rich comparison methods, but it would skip
> > that check and return False for any real or virtual subclass of
> > ProtoMagnitude (overridable by registering as a Magnitude
> > subclass).
> > The, set type would then be registered as a virtual base class of
> > ProtoMagnitude but not Magnitude so that issubclass(set,
> > Magnitude)
> > would return False.
> > For performance optimization, the module that defines these ABCs would
> > register the obviously appropriate built-in and standard-lib types with
> > Magnitude: Number, str, list,
> > tuple, date, …
> > Why not have this be a separate distribution package? This concept is only
> > reliable if all of the code that makes use of it shares a common
> > implementation. It does no good to register a class as ProtoMagnitude,
> > for instance, if an instance of that will passed to code in another library
> > that is unaware of the ProtoMagnitude and Magnitude ABCs in the
> > package, or maybe has its own independent system for attempting to
> > accomplish the same goal.
> > -- 
> > --Guido (mobile)
> >

Right. That's a much better term. `Orderable` and `ProtoOrderable`.
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/DIZPDIRF3254ZZZMCWSPEUOBLKC2MQZZ/

[Python-ideas] Magnitude and ProtoMagnitude ABCs — primarily for argument validation

2020-03-03 Thread Steve Jorgensen

I have encountered cases in which I would like to validate that an argument can 
be properly compared with other instances of its type. This is true of numbers, 
strings, dates, … but not for `NoneType`, `type`, ….

One way that I have tried to handle this is to check whether the object can be 
compared to itself using `>`, `<`, `>=`, and `<=` and that it is neither `>` nor 
`<` itself and is both `>=` and `<=` itself. The most glaring example of why 
this is insufficient is the `set` type. A `set` object meets all of those 
criteria, but given any 2 instances `a` and `b`, it is not true that if `a > b` 
is `False` then `a <= b` is `True`. The operators are not acting as comparisons 
of relative magnitude in this case but as tests for superset/subset relations, 
which is fine and good but doesn't help with this situation.
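
A minimal sketch of that self-comparison heuristic and of how sets slip through it. The function name `passes_self_check` is made up for illustration.

```python
def passes_self_check(obj):
    """Heuristic: obj is neither > nor < itself, and is both >= and <= itself."""
    try:
        return (
            not obj > obj
            and not obj < obj
            and obj >= obj
            and obj <= obj
        )
    except TypeError:
        # Raised by types that do not support ordering at all, such as None.
        return False


a, b = {1, 2}, {1, 3}

print(passes_self_check(3))     # True
print(passes_self_check(None))  # False
print(passes_self_check(a))     # True, although sets are not totally ordered
print(a > b, a <= b)            # False False: neither holds for this pair
```

The last line is the failure case: for a totally ordered type, `a > b` being `False` would force `a <= b` to be `True`.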

What I think would be helpful is to have a `Magnitude` abstract base class that 
is a subclass of `ProtoMagnitude` (or whatever better names anyone can imagine).

The subclass hook for `Magnitude` would return `True` for any class with 
instance methods for all of the rich comparison methods, but it would skip that 
check and return `False` for any real or virtual subclass of `ProtoMagnitude` 
(overridable by registering as a `Magnitude` subclass). The `set` type would 
then be registered as a virtual subclass of `ProtoMagnitude` but not 
`Magnitude` so that `issubclass(set, Magnitude)` would return `False`.

For performance optimization, the module that defines these ABCs would register 
the obviously appropriate built-in and standard-lib types with `Magnitude`: 
`Number`, `str`, `list`, `tuple`, `date`, …

Why not have this be a separate distribution package? This concept is only 
reliable if all of the code that makes use of it shares a common 
implementation. It does no good to register a class as `ProtoMagnitude`, for 
instance, if an instance of that will be passed to code in another library that 
is unaware of the `ProtoMagnitude` and `Magnitude` ABCs in the package, or maybe 
has its own independent system for attempting to accomplish the same goal.
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/7WC4SF2GYVLP56K6Q74OKFPJGHGWAPIP/

[Python-ideas] Re: Means of comparing slices for intersection or containment containment and computing intersections or unions

2020-02-29 Thread Steve Jorgensen

Andrew Barnert wrote:
> On Feb 29, 2020, at 10:03, Steve Jorgensen ste...@stevej.name wrote:
> > In that case, I still do think that this kind of
> > functionality is of enough general use to have something for it in the 
> > Python standard
> > library, though it should probably be through the introduction of a new 
> > type (possibly
> > named something like "bounds") since neither range nor slice is really a 
> > good fit. I'm
> > thinking it should/would be much more limited in scope than intervaltree 
> > (which does look
> > really nice).
> > There are a ton of different libraries on PyPI for interval/discrete 
> > range/range
> values and sets and algebra and/or arithmetic on them, not to mention related 
> things like
> saturating values within bounds. They all provide different functionality 
> with different
> interfaces. Why do we need to pick one (or redesign and reimplement one 
> without even
> looking for it) in particular?

To me, it just feels like a missing core feature. What I'm talking about is 
something far simpler and less ambitious than what I would expect to see in an 
external addon but something that might likely be useful to any/all such things.

I have decided that it makes more sense for me to publish something like what 
I'm looking for in a library of tools though, and then use that as the basis 
for a new post after I have that ready.
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/5JQXHETABMXTKPGDLDCAWQS3W3C7LBK4/

[Python-ideas] Re: Means of comparing slices for intersection or containment containment and computing intersections or unions

2020-02-29 Thread Steve Jorgensen

Steve Jorgensen wrote:
> Christopher Barker wrote:
> > On Sat, Feb 29, 2020 at 4:37 AM Alex Hall
> > alex.moj...@gmail.com wrote:
> > It seems like most of this would be very easy
> > to
> > implement yourself with
> > the exact semantics that you prefer and find most intuitive, while other
> > people might have different expectations.
> > I have to agree here. You are proposing that a slice object be treated as a
> > general purpose interval, but that is not, in fact what they are. This is
> > made clear by: " Presumably, these operations would raise exceptions when
> > used with slices that have
> >  step values other than None."
> > and also: "whereas a slice represents a possibly continuous range of any
> > kind of value to which magnitude is applicable." well, sort of. Given the
> > implementation and duck typing, I suppose that's true. But in fact, slices
> > were designed for, and are (at least mostly) used to, well, slice
> > sequences, which are always integer indexes, and hav semantics specific to
> > that use case:
> > OK. That does make sense to me.

In that case, I still do think that this kind of functionality is of enough 
general use to have something for it in the Python standard library, though it 
should probably be through the introduction of a new type (possibly named 
something like "bounds") since neither range nor slice is really a good fit. 
I'm thinking it should/would be much more limited in scope than intervaltree 
(which does look really nice).
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/CKEPZYKUIZWGNA27NFXHMAMNPSXZZJIT/

[Python-ideas] Re: Means of comparing slices for intersection or containment containment and computing intersections or unions

2020-02-29 Thread Steve Jorgensen

Christopher Barker wrote:
> On Sat, Feb 29, 2020 at 4:37 AM Alex Hall alex.moj...@gmail.com wrote:
> > It seems like most of this would be very easy to
> > implement yourself with
> > the exact semantics that you prefer and find most intuitive, while other
> > people might have different expectations.
> > I have to agree here. You are proposing that a slice object be treated as a
> general purpose interval, but that is not, in fact what they are. This is
> made clear by: " Presumably, these operations would raise exceptions when
> used with slices that have
>  step values other than None."
> and also: "whereas a slice represents a possibly continuous range of any
> kind of value to which magnitude is applicable." well, sort of. Given the
> implementation and duck typing, I suppose that's true. But in fact, slices
> were designed for, and are (at least mostly) used to, well, slice
> sequences, which are always integer indexes, and hav semantics specific to
> that use case:


OK. That does make sense to me.
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/RDGKOLMVBJECFT2YO6UBD2KVVQ25WJTL/

[Python-ideas] Re: Means of comparing slices for intersection or containment containment and computing intersections or unions

2020-02-29 Thread Steve Jorgensen

Steve Jorgensen wrote:
> I am purposefully proposing this for slices as opposed to ranges because it 
> is about
> the bounds of the slices, not the items in synthetic sequences. Also, slices 
> can refer to
> any type of value, not just integers.
> Presumably, these operations would raise exceptions when used with slices 
> that have
> step values other than None. Alternatively, those could
> hypothetically be valid in restricted cases such as when all properties are 
> either int or
> Fraction types. Probably better to have them be simply unsupported though.
> a in b           # True if a is fully contained within b
> a.intersects(b)  # True if any value could be within both a and b
> a & b            # Intersection of a and b or None if no intersection
> a | b            # Union of a and b or Exception if neither contiguous nor
>                  # overlapping.
> 
> Also, it might be nice to be able to test whether a non-slice value falls 
> within the
> slice's bounds. This would be using x in s as a shorthand for s.start
> <= x < s.end. Again, this is different than asking whether a value is "in" a
> range because a rage is a sequence of discrete integers whereas a slice 
> represents a
> possibly continuous range of any kind of value to which magnitude is 
> applicable.
> slice(1, 2) in (0, 3)
> # => True because 1 >= 0 and 2 <= 3
> 
> slice(0.5, 1.5) in slice(0, 2)
> # => True because 0.5 >= 1.5 and 0.5 < 2
> 
> 1 in (0, 3)
> # => True because 0 <= 1 < 3
> 
> 'Joe' in slice('Alice', 'Riley')
> # => True because 'Alice' <= 'Joe' < 'Riley'
> 
> slice(1.1, 5.9).intersects(slice(2, 10.5))
> # => True because either...
> #  1.1 <=  2<  5.9 or
> #  1.1 <  10.5  <  5.5 or
> #  2   <=  1.1  < 10.5 or
> #  2   <   5.9  < 10.5
> 
> slice(5.5, 15.5) & slice(10.25, 20.25)
> # => slice(10.25, 15.5)
> 
> slice(5.5, 15.5) | slice(10.25, 20.25)
> # => slice(5.5, 20.25)
> 
> slice('abc', 'fff') & slice('eee', 'xyz')
> # => slice('fff', 'eee')

I notice I made a couple of subtle mistakes in the above. All the more reason 
to implement these concepts correctly as standard. :)
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/O6TLJRFOW2Q4A6PWY4O6IKY3P5ZPA5PQ/

[Python-ideas] Means of comparing slices for intersection or containment containment and computing intersections or unions

2020-02-29 Thread Steve Jorgensen

I am purposefully proposing this for slices as opposed to ranges because it is 
about the bounds of the slices, not the items in synthetic sequences. Also, 
slices can refer to any type of value, not just integers.

Presumably, these operations would raise exceptions when used with slices that 
have `step` values other than `None`. Alternatively, those could hypothetically 
be valid in restricted cases such as when all properties are either int or 
Fraction types. Probably better to have them be simply unsupported though.

```
a in b           # True if a is fully contained within b
a.intersects(b)  # True if any value could be within both a and b
a & b            # Intersection of a and b, or None if no intersection
a | b            # Union of a and b, or an exception if they are neither
                 # contiguous nor overlapping
```
Also, it might be nice to be able to test whether a non-slice value falls 
within the slice's bounds. This would be using `x in s` as a shorthand for 
`s.start <= x < s.stop`. Again, this is different than asking whether a value 
is "in" a range because a range is a sequence of discrete integers whereas a 
slice represents a possibly continuous range of any kind of value to which 
magnitude is applicable.

```
# Proposed behavior; slice objects do not support these operations today.
slice(1, 2) in slice(0, 3)
# => True because 1 >= 0 and 2 <= 3

slice(0.5, 1.5) in slice(0, 2)
# => True because 0.5 >= 0 and 1.5 <= 2

1 in slice(0, 3)
# => True because 0 <= 1 < 3

'Joe' in slice('Alice', 'Riley')
# => True because 'Alice' <= 'Joe' < 'Riley'

slice(1.1, 5.9).intersects(slice(2, 10.5))
# => True because either...
#  1.1 <=  2    <  5.9 or
#  1.1 <  10.5  <  5.9 or
#  2   <=  1.1  < 10.5 or
#  2   <   5.9  < 10.5

slice(5.5, 15.5) & slice(10.25, 20.25)
# => slice(10.25, 15.5)

slice(5.5, 15.5) | slice(10.25, 20.25)
# => slice(5.5, 20.25)

slice('abc', 'fff') & slice('eee', 'xyz')
# => slice('eee', 'fff')
```
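
Since `slice` does not support any of this today, here is a minimal sketch of the same semantics on a stand-alone class. The name `Bounds` and its methods are hypothetical, it assumes half-open `[start, stop)` intervals with no `step`, and it ignores empty or reversed intervals.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Bounds:
    """Hypothetical half-open interval [start, stop) over any ordered values."""

    start: Any
    stop: Any

    def __contains__(self, item):
        if isinstance(item, Bounds):
            # Containment of a whole interval.
            return self.start <= item.start and item.stop <= self.stop
        # Containment of a single value.
        return self.start <= item < self.stop

    def intersects(self, other):
        # True if some value could fall within both intervals.
        return self.start < other.stop and other.start < self.stop

    def __and__(self, other):
        if not self.intersects(other):
            return None
        return Bounds(max(self.start, other.start), min(self.stop, other.stop))

    def __or__(self, other):
        # Touching intervals (one stops where the other starts) are allowed.
        if self.stop < other.start or other.stop < self.start:
            raise ValueError("bounds are neither contiguous nor overlapping")
        return Bounds(min(self.start, other.start), max(self.stop, other.stop))


print(Bounds(1, 2) in Bounds(0, 3))                    # True
print("Joe" in Bounds("Alice", "Riley"))               # True
print(Bounds(5.5, 15.5) & Bounds(10.25, 20.25))        # Bounds(start=10.25, stop=15.5)
print(Bounds("abc", "fff") & Bounds("eee", "xyz"))     # Bounds(start='eee', stop='fff')
```

Only `<`, `<=`, `min` and `max` are used, so the class works for any values that compare consistently with each other, such as numbers, strings or dates.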
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/EGAMXMDCKBA4I5BSQL4KCI2DE3NM7L7F/

[Python-ideas] Re: Incremental step on road to improving situation around iterable strings

2020-02-25 Thread Steve Jorgensen

Steven D'Aprano wrote:
> On Sun, Feb 23, 2020 at 11:25:12PM +0200, Alex Hall wrote:
> > "Strings are not iterable - you cannot loop over them
> > or treat them as a
> > collection.
> > Are you implying that we should deprecate the in operator for
> strings 
> too?

I would not get rid of the `in` behavior, but the `in` behavior of a string is 
actually not like that of the `in` operator for a typical collection.  Seen as 
simply a collection of single-character strings, "b" would be in "abcd", but 
"bc" would not. The `in` operator for strings is checking whether the left 
operand is a substring as opposed to an item. `(2, 3)` is not `in` `(1, 2, 3, 
4)`.
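
A short illustration of the difference, using only built-in behavior:

```python
text = "abcd"
chars = list(text)  # ['a', 'b', 'c', 'd'], the "collection of characters" view

print("b" in text)    # True: single character
print("bc" in text)   # True: substring test, not an item test
print("bc" in chars)  # False: no single item equals "bc"
print((2, 3) in (1, 2, 3, 4))  # False: tuples test items, not subsequences
```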
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/4H2IEA6MNOBH2JKENGLOYIE33O7BT4ST/
