[Python-Dev] Re: PEP 616 "String methods to remove prefixes and suffixes" accepted

2020-04-20 Thread Raymond Hettinger
Please consider adding underscores to the names:  remove_prefix() and 
remove_suffix().

The latter method causes a mental hiccup when first read as removes-uffix, 
forcing mental backtracking to get to remove-suffix. We had a similar problem 
with addinfourl initially being read as add-in-four-l before mentally 
backtracking to add-info-url.

The PEP says this alternative was considered, but I disagree with the rationale 
given in the PEP.  The reason that "startswith" and "endswith" don't have 
underscores is that they aren't needed to disambiguate the text.  Our rules are 
to add underscores and to spell-out words when it improves readability, which 
in this case it does.   Like casing conventions, our rules and preferences for 
naming evolved after the early modules were created -- the older the module, 
the more likely that it doesn't follow modern conventions.

We only have one chance to get this right (bugs can be fixed, but API choices 
persist for a very long time).  Take it from someone with experience with this 
particular problem.  I created imap() but later regretted the naming pattern 
when it came to ifilter() and islice(), which sometimes cause mental hiccups, 
initially being read as if-ilter and is-lice.


Raymond
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/ZMXSQ5T6L6CR5GUIBFEYLJJF7FE4B4US/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 616 "String methods to remove prefixes and suffixes" accepted

2020-04-20 Thread Guido van Rossum
Congrats Dennis! I hope your PR lands soon.

On Mon, Apr 20, 2020 at 12:40 PM Eric V. Smith wrote:

> Congratulations, Dennis!
>
> Not 10 minutes ago I was writing code that could have used this
> functionality. And I got it wrong on my first attempt! I'm looking
> forward to it in 3.9.
>
> Eric
>
> On 4/20/2020 2:26 PM, Victor Stinner wrote:
> > Hi,
> >
> > The Python Steering Council accepts the PEP 616 "String methods to
> > remove prefixes and suffixes":
> > https://www.python.org/dev/peps/pep-0616/
> >
> > Congrats Dennis Sweeney!
> >
> > We just have one last request: we expect the documentation to explain
> > well the difference between removeprefix()/removesuffix() and
> > lstrip()/strip()/rstrip(), since it is the rationale of the PEP ;-)
> >
> > You can find the WIP implementation at:
> >
> > * https://github.com/python/cpython/pull/18939
> > * https://bugs.python.org/issue39939
> >
> > Victor
>


-- 
--Guido van Rossum (python.org/~guido)
*Pronouns: he/him **(why is my pronoun here?)*

Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/IQMXG57KDF4G62KKWKXAXKNYSAU7IE5G/


[Python-Dev] Re: PEP 616 "String methods to remove prefixes and suffixes" accepted

2020-04-20 Thread Eric V. Smith

Congratulations, Dennis!

Not 10 minutes ago I was writing code that could have used this 
functionality. And I got it wrong on my first attempt! I'm looking 
forward to it in 3.9.


Eric

On 4/20/2020 2:26 PM, Victor Stinner wrote:
> Hi,
>
> The Python Steering Council accepts the PEP 616 "String methods to
> remove prefixes and suffixes":
> https://www.python.org/dev/peps/pep-0616/
>
> Congrats Dennis Sweeney!
>
> We just have one last request: we expect the documentation to explain
> well the difference between removeprefix()/removesuffix() and
> lstrip()/strip()/rstrip(), since it is the rationale of the PEP ;-)
>
> You can find the WIP implementation at:
>
> * https://github.com/python/cpython/pull/18939
> * https://bugs.python.org/issue39939
>
> Victor
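
The difference the Steering Council asks the documentation to explain can be sketched with a short example. This is an illustrative sketch only, assuming Python 3.9 or later; the file name is made up for the demonstration.

```python
# Requires Python 3.9+, where the PEP 616 methods are available.
filename = "test_settings.py"  # hypothetical input

# lstrip() treats its argument as a *set of characters*, so it keeps
# removing any leading 't', 'e', 's' or '_' until it hits another character.
stripped = filename.lstrip("test_")

# removeprefix() removes the exact substring, and at most once.
removed = filename.removeprefix("test_")

print(stripped)  # ings.py
print(removed)   # settings.py
```

The lstrip() call silently eats the start of "settings", which is the kind of mistake the new methods are meant to prevent.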

Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/VV2CFGYTJXADLK5NJXECU55HS5PYNUK3/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-04-02 Thread Nick Coghlan
On Thu., 2 Apr. 2020, 8:30 am Victor Stinner wrote:

> I suggest you wait one more week to let other people comment on the
> PEP. After this delay, if you consider that the PEP is ready for
> pronouncement, you can submit it to the Steering Council, right.
>

Note that the submission to the Steering Council doesn't have to be a
request for immediate pronouncement - it's a notification that the PEP is
mature enough for the Council to decide whether to appoint a Council member
as BDFL-Delegate or to appoint someone else.

The decision on whether to wait for more questions is then up to the
Council and/or the appointed BDFL-Delegate.

PEP 616 definitely looks mature enough for that step to me (and potentially
even immediately accepted - it did get dissected pretty thoroughly, after
all!)

Cheers,
Nick.



Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/4MJAT2DJAWRSZCG465QQOQRSG3NUIH7D/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-04-02 Thread Nick Coghlan
On Sat., 28 Mar. 2020, 8:39 am Guido van Rossum wrote:

> On Fri, Mar 27, 2020 at 3:29 PM Dennis Sweeney <
> sweeney.dennis...@gmail.com> wrote:
>
>> > If I saw that in a code review I'd flag it for non-obviousness. One
>> should
>> > use 'string != new_string' unless there is severe pressure to squeeze
>> > every nanosecond out of this particular code (and it better be inside an
>> > inner loop).
>>
>> I thought that someone had suggested that such things go in the PEP,
>
>
> I'm sure someone did.
>

I think that may have been me in a tangent thread where folks were worried
about O(N) checks on long strings.

I know at least I temporarily forgot to account for string equality checks
starting with a few O(1) checks to speed up common cases (IIRC: identity,
length, first code point, last code point), which means explicitly calling
len() is just as likely to slow things down as it is to speed them up.

Cheers,
Nick.


Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/UYA5ICAM6TWREXS7SSA4WKWRA2DAQOMF/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-04-01 Thread Victor Stinner
I suggest you wait one more week to let other people comment on the
PEP. After this delay, if you consider that the PEP is ready for
pronouncement, you can submit it to the Steering Council, right.

Victor

On Wed, 1 Apr 2020 at 21:56, Dennis Sweeney wrote:
>
> Hello all,
>
> It seems that most of the discussion has settled down, but I didn't quite 
> understand from reading PEP 1 what the next step should be -- is this an 
> appropriate time to open an issue on the Steering Council GitHub repository 
> requesting pronouncement on PEP 616?
>
> Best,
> Dennis



-- 
Night gathers, and now my watch begins. It shall not end until my death.
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/C4SR7J2X6KNMC5N7SZLMWV76VPY2G22U/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-04-01 Thread Dennis Sweeney
Hello all,

It seems that most of the discussion has settled down, but I didn't quite 
understand from reading PEP 1 what the next step should be -- is this an 
appropriate time to open an issue on the Steering Council GitHub repository 
requesting pronouncement on PEP 616?

Best,
Dennis
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/ZXKU3EM6HEG6R7C65L7UN65IGTBB7VHH/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-29 Thread Victor Stinner
My intent is to help people like me follow the discussion on the
PEP. There are more than 100 messages, so it's hard to follow PEP
updates.

Victor

On Sun, 29 Mar 2020 at 14:55, Rob Cliffe via Python-Dev wrote:
>
>
>
> On 28/03/2020 17:02, Victor Stinner wrote:
> > What do you think of adding a Version History section which lists the
> > most important changes since you proposed the first version of the PEP?
> > I recall:
> >
> > * Version 3: don't accept tuple
> > * Version 2: Rename cutprefix/cutsuffix to removeprefix/removesuffix,
> > accept tuple
> > * Version 1: initial version
> >
> > For example, for my PEP 587, I wrote detailed changes, but I don't
> > think that you should go into the details ;-)
> > https://www.python.org/dev/peps/pep-0587/#version-history
> >
> > Victor
> >
> >
> IMHO that's overkill.  A list of rejected ideas, and why they were
> rejected, seems sufficient.
> Rob Cliffe



-- 
Night gathers, and now my watch begins. It shall not end until my death.
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/NTRCLXO3IQCBQW64XC7P4FCJN72IJZVQ/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-29 Thread Rob Cliffe via Python-Dev



On 28/03/2020 17:02, Victor Stinner wrote:
> What do you think of adding a Version History section which lists the
> most important changes since you proposed the first version of the PEP?
> I recall:
>
> * Version 3: don't accept tuple
> * Version 2: Rename cutprefix/cutsuffix to removeprefix/removesuffix,
>   accept tuple
> * Version 1: initial version
>
> For example, for my PEP 587, I wrote detailed changes, but I don't
> think that you should go into the details ;-)
> https://www.python.org/dev/peps/pep-0587/#version-history
>
> Victor


IMHO that's overkill.  A list of rejected ideas, and why they were 
rejected, seems sufficient.

Rob Cliffe
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/L6CHS3PFUY3CWUERHMYS3OWU327P4RIE/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-28 Thread Dennis Sweeney
Sure -- I can add in a short list of those major changes.

Best,
Dennis
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/TKCHV76P3CYYSZDSB3TH3I4UTFCUNKU5/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-28 Thread Eric V. Smith

On 3/26/2020 9:10 PM, Ethan Furman wrote:
> First off, thank you for being so patient -- trying to champion a PEP
> can be exhausting.
>
> On 03/26/2020 05:22 PM, Dennis Sweeney wrote:
>
> > So now if I understand the dilemma up to this point we have:
> >
> > Benefits of writing ``return self`` in the PEP:
> >     a - Makes it clear that the optimization of not copying is allowed
> >     b - Makes it clear that ``self.__class__.__getitem__`` isn't used
> >
> > Benefits of writing ``return self[:]`` in the PEP:
> >     c - Makes it clear that returning self is an implementation detail
> >     d - For subclasses not overriding ``__getitem__`` (the majority
> >         of cases), makes it clear that this method will return a base
> >         str like the other str methods.
> >
> > Did I miss anything?
>
> The only thing you missed is that, for me at least, points A, C, and D
> are not at all clear from the example code.  If I wanted to be explicit
> about the return type being `str` I would write:
>
>     return str(self)   # subclasses are coerced to str


That does seem like the better solution, including the comment.
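
The "subclasses are coerced to str" behavior discussed above can be checked directly. A minimal sketch, assuming Python 3.9+ on CPython; `Tagged` is a hypothetical subclass introduced only for illustration.

```python
class Tagged(str):
    """Hypothetical str subclass, used only for this illustration."""


value = Tagged("prefix-body")

# Prefix found: the result is a plain str, not a Tagged instance.
hit = value.removeprefix("prefix-")

# Prefix not found: the content is unchanged, but the result is
# still a base str rather than the subclass.
miss = value.removeprefix("nope")

print(type(hit).__name__, hit)
print(type(miss).__name__, miss)
```

Either way the caller gets a base str back, matching the other str methods.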

Eric
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/43GEOBNJYN22QLYMG2PGN3KDOSQQHR6E/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-28 Thread Victor Stinner
What do you think of adding a Version History section which lists the
most important changes since you proposed the first version of the PEP?
I recall:

* Version 3: don't accept tuple
* Version 2: Rename cutprefix/cutsuffix to removeprefix/removesuffix,
accept tuple
* Version 1: initial version

For example, for my PEP 587, I wrote detailed changes, but I don't
think that you should go into the details ;-)
https://www.python.org/dev/peps/pep-0587/#version-history

Victor

On Sat, 28 Mar 2020 at 06:11, Dennis Sweeney wrote:
>
> PEP 616 -- String methods to remove prefixes and suffixes
> is available here: https://www.python.org/dev/peps/pep-0616/
>
> Changes:
> - Only accept single affixes, not tuples
> - Make the specification more concise
> - Make fewer stylistic prescriptions for usage
> - Fix typos
>
> A reference implementation GitHub PR is up to date here:
> https://github.com/python/cpython/pull/18939
>
> Are there any more comments for it before submission?



-- 
Night gathers, and now my watch begins. It shall not end until my death.
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/4UKSBWZE2RIP4VPADVHKQRLS7E7OBMTS/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-27 Thread Dennis Sweeney
PEP 616 -- String methods to remove prefixes and suffixes
is available here: https://www.python.org/dev/peps/pep-0616/

Changes:
- Only accept single affixes, not tuples
- Make the specification more concise
- Make fewer stylistic prescriptions for usage
- Fix typos
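
As a sketch of what "only accept single affixes, not tuples" means in practice (assuming Python 3.9+): passing a tuple raises TypeError, unlike startswith(), which does accept a tuple. The helper `remove_first_prefix` is hypothetical and not part of the PEP.

```python
def remove_first_prefix(text, prefixes):
    """Hypothetical helper: remove the first prefix in `prefixes` that matches."""
    for prefix in prefixes:
        if prefix and text.startswith(prefix):
            return text[len(prefix):]
    return text


# The built-in method rejects a tuple argument.
try:
    "foobar".removeprefix(("foo", "bar"))
except TypeError as exc:
    tuple_error = exc
else:
    tuple_error = None

print(type(tuple_error).__name__)                   # TypeError
print(remove_first_prefix("foobar", ("x", "foo")))  # bar
```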

A reference implementation GitHub PR is up to date here:
https://github.com/python/cpython/pull/18939

Are there any more comments for it before submission?
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/UJE3WCQXSZI76IW54D2SKKL6OFQ2VFMA/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-27 Thread Guido van Rossum
On Fri, Mar 27, 2020 at 3:29 PM Dennis Sweeney 
wrote:

> > > One may also continue using ``startswith()``
> > > and ``endswith()``
> > > methods for control flow instead of testing the lengths as above.
> > >
> > > That's worse, in a sense, since "foofoobar".removeprefix("foo") returns
> > "foobar" which still starts with "foo".
>
> I meant that startswith might be called before removeprefix, as it was
> in the ``deccheck.py`` example.
>

Not having read the full PEP, that wasn't clear to me. Sorry!


> > If I saw that in a code review I'd flag it for non-obviousness. One
> should
> > use 'string != new_string' unless there is severe pressure to squeeze
> > every nanosecond out of this particular code (and it better be inside an
> > inner loop).
>
> I thought that someone had suggested that such things go in the PEP,


I'm sure someone did. But not every bit of feedback is worth acting upon,
and sometimes a weird compromise is cooked up that addresses somebody's nit
while making things less understandable for everyone else. I think this is
one of those cases.


> but
> since these are more stylistic considerations, I would be more than happy
> to
> trim it down to just
>
> The builtin ``str`` class will gain two new methods which will behave
> as follows when ``type(self) is type(prefix) is str``::
>
>     def removeprefix(self: str, prefix: str, /) -> str:
>         if self.startswith(prefix):
>             return self[len(prefix):]
>         else:
>             return self[:]
>
>     def removesuffix(self: str, suffix: str, /) -> str:
>         # suffix='' should not call self[:-0].
>         if suffix and self.endswith(suffix):
>             return self[:-len(suffix)]
>         else:
>             return self[:]
>
> These methods, even when called on ``str`` subclasses, should always
> return base ``str`` objects.
>
> Methods with the corresponding semantics will be added to the builtin
> ``bytes`` and ``bytearray`` objects.  If ``b`` is either a ``bytes``
> or ``bytearray`` object, then ``b.removeprefix()`` and
> ``b.removesuffix()``
> will accept any bytes-like object as an argument. The two methods will
> also be added to ``collections.UserString``, with similar behavior.
>

Excellent!

-- 
--Guido van Rossum (python.org/~guido)
*Pronouns: he/him **(why is my pronoun here?)*

Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/4B3IO4QE5XTJYAFGNEMC6JNLHJKMTEL2/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-27 Thread Dennis Sweeney
> > One may also continue using ``startswith()``
> > and ``endswith()``
> > methods for control flow instead of testing the lengths as above.
> > 
> > That's worse, in a sense, since "foofoobar".removeprefix("foo") returns
> "foobar" which still starts with "foo".

I meant that startswith might be called before removeprefix, as it was 
in the ``deccheck.py`` example.
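
The two points in this exchange can be sketched side by side, assuming Python 3.9+. removeprefix() strips at most one copy of the prefix, so calling startswith() on the result says nothing about whether a removal happened, while calling startswith() before removeprefix() works for control flow. `remove_all_prefixes` is a hypothetical helper for the case where every copy should go.

```python
once = "foofoobar".removeprefix("foo")
print(once)                     # foobar
print(once.startswith("foo"))   # True: one copy of the prefix remains


def remove_all_prefixes(text, prefix):
    """Hypothetical helper: strip `prefix` repeatedly until it no longer matches."""
    while prefix and text.startswith(prefix):
        text = text[len(prefix):]
    return text


print(remove_all_prefixes("foofoobar", "foo"))  # bar

# startswith() called *before* removeprefix(), used for control flow.
line = "foofoobar"
if line.startswith("foo"):
    handled = line.removeprefix("foo")
else:
    handled = None
```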

> If I saw that in a code review I'd flag it for non-obviousness. One should
> use 'string != new_string' unless there is severe pressure to squeeze
> every nanosecond out of this particular code (and it better be inside an
> inner loop).

I thought that someone had suggested that such things go in the PEP, but 
since these are more stylistic considerations, I would be more than happy to
trim it down to just

The builtin ``str`` class will gain two new methods which will behave
as follows when ``type(self) is type(prefix) is str``::

    def removeprefix(self: str, prefix: str, /) -> str:
        if self.startswith(prefix):
            return self[len(prefix):]
        else:
            return self[:]

    def removesuffix(self: str, suffix: str, /) -> str:
        # suffix='' should not call self[:-0].
        if suffix and self.endswith(suffix):
            return self[:-len(suffix)]
        else:
            return self[:]

These methods, even when called on ``str`` subclasses, should always
return base ``str`` objects.

Methods with the corresponding semantics will be added to the builtin
``bytes`` and ``bytearray`` objects.  If ``b`` is either a ``bytes``
or ``bytearray`` object, then ``b.removeprefix()`` and ``b.removesuffix()``
will accept any bytes-like object as an argument. The two methods will
also be added to ``collections.UserString``, with similar behavior.
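
The last paragraph of the proposed text can be illustrated as follows. This is a sketch assuming Python 3.9+; the variable names are arbitrary.

```python
from collections import UserString

# bytes accepts any bytes-like object as the affix, e.g. a bytearray.
from_bytes = b"foobar".removeprefix(bytearray(b"foo"))

# bytearray does too, e.g. a memoryview; the result is a bytearray.
from_bytearray = bytearray(b"foobar").removesuffix(memoryview(b"bar"))

# UserString gains the same methods and returns a UserString.
from_userstring = UserString("foobar").removeprefix("foo")

print(from_bytes)        # b'bar'
print(from_bytearray)    # bytearray(b'foo')
print(from_userstring)   # bar
```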
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/HQRI26F6UPWL24LJOFFMKNAMYJSC2CAL/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-27 Thread Guido van Rossum
On Fri, Mar 27, 2020 at 1:55 PM Dennis Sweeney 
wrote:

> I like how that would take the pressure off of the Python sample. How's
> something like this?
>
> Specification
> =============
>
> The builtin ``str`` class will gain two new methods which will behave
> as follows when ``type(self) is str``::
>
>     def removeprefix(self: str, prefix: str, /) -> str:
>         if self.startswith(prefix):
>             return self[len(prefix):]
>         else:
>             return self
>
>     def removesuffix(self: str, suffix: str, /) -> str:
>         if suffix and self.endswith(suffix):
>             return self[:-len(suffix)]
>         else:
>             return self
>
> These methods, even when called on ``str`` subclasses, should always
> return base ``str`` objects.  One should not rely on the behavior
> of ``self`` being returned (as in ``s.removesuffix('') is s``) -- this
> optimization should be considered an implementation detail.
>

I'd suggest to drop the last sentence ("One should ... detail.") and
instead write 'return self[:]' in the methods.


> To test
> whether any affixes were removed during the call, one may use the
> constant-time behavior of comparing the lengths of the original and
> new strings::
>
>     >>> string = 'Python String Input'
>     >>> new_string = string.removeprefix('Py')
>     >>> modified = (len(string) != len(new_string))
>     >>> modified
>     True
>

If I saw that in a code review I'd flag it for non-obviousness. One should
use 'string != new_string' *unless* there is severe pressure to squeeze
every nanosecond out of this particular code (and it better be inside an
inner loop).
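
The plain comparison suggested here can be sketched as follows, assuming Python 3.9+. It reuses the names from the quoted example; `untouched` is an extra name added for illustration.

```python
string = "Python String Input"

# Prefix present: the result differs from the original.
new_string = string.removeprefix("Py")
modified = string != new_string

# Prefix absent: the result compares equal to the original.
untouched = string.removeprefix("Java")
unmodified = string == untouched

print(modified)     # True
print(unmodified)   # True
```

The direct comparison states the intent without asking the reader to reason about string lengths.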


> One may also continue using ``startswith()`` and ``endswith()``
> methods for control flow instead of testing the lengths as above.
>

That's worse, in a sense, since "foofoobar".removeprefix("foo") returns
"foobar" which still starts with "foo".

> Note that without the check for the truthiness of ``suffix``,
> ``s.removesuffix('')`` would be mishandled and always return the empty
> string due to the unintended evaluation of ``self[:-0]``.
>

That's a good one (I started suggesting dropping that when I read this :-)
but maybe it ought to go in a comment (and shorter -- at most one line).


> Methods with the corresponding semantics will be added to the builtin
> ``bytes`` and ``bytearray`` objects.  If ``b`` is either a ``bytes``
> or ``bytearray`` object, then ``b.removeprefix()`` and
> ``b.removesuffix()``
> will accept any bytes-like object as an argument.  Although the methods
> on the immutable ``str`` and ``bytes`` types may make the
> aforementioned
> optimization of returning the original object,
> ``bytearray.removeprefix()``
> and ``bytearray.removesuffix()`` should *always* return a copy, never
> the
> original object.
>

This could also be simplified by writing 'return self[:]'.


> The two methods will also be added to ``collections.UserString``, with
> similar behavior.
>
> My hesitation to write "return self" is resolved by saying that it should
> not be relied on, so I think this is a win.
>

Writing 'return self[:]' seems to say the same thing in fewer words though.
:-)

-- 
--Guido van Rossum (python.org/~guido)
*Pronouns: he/him **(why is my pronoun here?)*

Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/OXFHYRBN74MAZGA5QCLMFCCQCRH5SDNJ/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-27 Thread Dennis Sweeney
I like how that would take the pressure off of the Python sample. How's 
something like this?

Specification
=============

The builtin ``str`` class will gain two new methods which will behave
as follows when ``type(self) is str``::

    def removeprefix(self: str, prefix: str, /) -> str:
        if self.startswith(prefix):
            return self[len(prefix):]
        else:
            return self

    def removesuffix(self: str, suffix: str, /) -> str:
        if suffix and self.endswith(suffix):
            return self[:-len(suffix)]
        else:
            return self

These methods, even when called on ``str`` subclasses, should always
return base ``str`` objects.  One should not rely on the behavior
of ``self`` being returned (as in ``s.removesuffix('') is s``) -- this
optimization should be considered an implementation detail.  To test
whether any affixes were removed during the call, one may use the
constant-time behavior of comparing the lengths of the original and
new strings::

    >>> string = 'Python String Input'
    >>> new_string = string.removeprefix('Py')
    >>> modified = (len(string) != len(new_string))
    >>> modified
    True

One may also continue using ``startswith()`` and ``endswith()``
methods for control flow instead of testing the lengths as above.

Note that without the check for the truthiness of ``suffix``,
``s.removesuffix('')`` would be mishandled and always return the empty
string due to the unintended evaluation of ``self[:-0]``.
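
The ``self[:-0]`` pitfall described above can be reproduced with two small pure-Python functions. Both are illustrative stand-ins for the pseudo-code in this message, not the C implementation; the function names are made up.

```python
def removesuffix_buggy(s, suffix):
    # Without the truthiness check: every string ends with '',
    # and s[:-0] is the same as s[:0], i.e. the empty string.
    if s.endswith(suffix):
        return s[:-len(suffix)]
    return s


def removesuffix_fixed(s, suffix):
    # The `suffix and` guard skips the slice when suffix is empty.
    if suffix and s.endswith(suffix):
        return s[:-len(suffix)]
    return s


print(repr(removesuffix_buggy("abc", "")))   # ''
print(repr(removesuffix_fixed("abc", "")))   # 'abc'
print(repr(removesuffix_fixed("abc", "c")))  # 'ab'
```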

Methods with the corresponding semantics will be added to the builtin
``bytes`` and ``bytearray`` objects.  If ``b`` is either a ``bytes``
or ``bytearray`` object, then ``b.removeprefix()`` and ``b.removesuffix()``
will accept any bytes-like object as an argument.  Although the methods
on the immutable ``str`` and ``bytes`` types may make the aforementioned
optimization of returning the original object, ``bytearray.removeprefix()``
and ``bytearray.removesuffix()`` should *always* return a copy, never the
original object.

The two methods will also be added to ``collections.UserString``, with
similar behavior.

My hesitation to write "return self" is resolved by saying that it should not 
be relied on, so I think this is a win.

Best,
Dennis
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/YZD2BTB5RT6DZUTEGHTRNAJZHBMRATPS/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-27 Thread Guido van Rossum
How about just presenting pseudo code with the caveat that that's for the
base str and bytes classes only, and then stipulating that for subclasses
the return value is still a str/bytes/bytearray instance, and leaving it at
that? After all the point of the Python code is to show what the C code
should do in a way that's easy to grasp -- giving a Python implementation
is not meant to constrain the C implementation to have *exactly* the same
behavior in all corner cases (since that would lead to seriously contorted
C code).

On Fri, Mar 27, 2020 at 1:02 PM Dennis Sweeney 
wrote:

> I was trying to start with the intended behavior of the str class,
> then move on to generalizing to other classes, because I think completing a
> single example and *then* generalizing is an instructional style that's
> easier to digest, whereas intermixing all of the examples at once can get
> confused (can I call str.removeprefix(object(), 17)?). Is something missing
> that's not already there in the following sentence in the PEP?
>
> Although the methods on the immutable ``str`` and ``bytes`` types may make
> the aforementioned optimization of returning the original object,
> ``bytearray.removeprefix()`` and ``bytearray.removesuffix()`` should always
> return a copy, never the original object.
>
> Best,
> Dennis
>


-- 
--Guido van Rossum (python.org/~guido)
*Pronouns: he/him **(why is my pronoun here?)*

Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/RTK46ZXQYPWVRJIWPAD3EXTJBSU27VKF/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-27 Thread Dennis Sweeney
I was trying to start with the intended behavior of the str class, then 
move on to generalizing to other classes, because I think completing a single 
example and *then* generalizing is an instructional style that's easier to 
digest, whereas intermixing all of the examples at once can get confused (can I 
call str.removeprefix(object(), 17)?). Is something missing that's not already 
there in the following sentence in the PEP?

Although the methods on the immutable ``str`` and ``bytes`` types may make the 
aforementioned optimization of returning the original object, 
``bytearray.removeprefix()`` and ``bytearray.removesuffix()`` should always 
return a copy, never the original object.
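
Why the copy matters for the mutable type can be sketched like this, assuming Python 3.9+. The variable names are arbitrary.

```python
data = bytearray(b"payload")

# The prefix is absent, so the content is unchanged, but the result
# is a new bytearray rather than the original object.
result = data.removeprefix(b"missing")

print(result == data)   # True
print(result is data)   # False

# Mutating the result therefore leaves the original untouched.
result.append(0x21)
print(data)             # bytearray(b'payload')
print(result)           # bytearray(b'payload!')
```

If the original object were returned, the append would silently change `data` as well.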

Best,
Dennis
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/IO33NJUQTN27TU342NAJAAMR7YGEPQRE/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-27 Thread Dennis Sweeney
I appreciate the input and attention to detail!

Using the ``str()`` constructor was sort of what I had thought originally, and 
that's why I had gone overboard with "casting" in one iteration of the sample 
code. When I realized that this isn't quite "casting" and that ``__str__`` can 
be overridden, I went even more overboard and suggested that 
``str.__getitem__(self, ...)`` and ``str.__len__(self)`` could be written, 
which does have the behavior of effectively "casting", but looks nasty. Do you 
think that the following is a happy medium?

def removeprefix(self: str, prefix: str, /) -> str:
    # coerce subclasses to str
    self_str = str(self)
    prefix_str = str(prefix)
    if self_str.startswith(prefix_str):
        return self_str[len(prefix_str):]
    else:
        return self_str

def removesuffix(self: str, suffix: str, /) -> str:
    # coerce subclasses to str
    self_str = str(self)
    suffix_str = str(suffix)
    if suffix_str and self_str.endswith(suffix_str):
        return self_str[:-len(suffix_str)]
    else:
        return self_str

Followed by the text:

If ``type(self) is str`` (rather than a subclass) and if the given affix is 
empty or is not found, then these methods may, but are not required to, make 
the optimization of returning ``self``.
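
To see why ``removesuffix`` needs the ``if suffix_str and ...`` guard while
``removeprefix`` does not, here is a small sketch. The helper name
``naive_removesuffix`` and the class ``Tagged`` are hypothetical and exist only
to show the failure mode; the last few lines assume Python 3.9+ for the
built-in methods.

```python
def naive_removesuffix(s: str, suffix: str) -> str:
    # Deliberately missing the empty-suffix guard.
    if s.endswith(suffix):
        return s[:-len(suffix)]
    return s

# -0 == 0, so s[:-0] is s[:0], i.e. the empty string.
assert "abc"[:-0] == ""
assert naive_removesuffix("abc", "") == ""    # wrong: the string vanished

# The prefix side has no such trap: s[0:] is the whole string.
assert "abc"[len(""):] == "abc"

class Tagged(str):
    pass

# The built-ins (Python 3.9+) handle the empty affix and return base str
# even when called on a subclass instance.
t = Tagged("abc")
assert t.removesuffix("") == "abc"
assert type(t.removesuffix("")) is str
assert type(t.removeprefix("a")) is str
```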
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/W6DMWMSF22HPKG6MYYCXQ6QE7QIWBNSI/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-27 Thread Steve Holden
On Wed, Mar 25, 2020 at 5:42 PM Dennis Sweeney 
wrote:

> I'm removing the tuple feature from this PEP. So now, if I understand
> correctly, I don't think there's disagreement about behavior, just about
> how that behavior should be summarized in Python code.
> [...]
>     return (the original object unchanged, or a copy of the object,
>             depending on implementation details,
>             but always make a copy when working with subclasses)
>
> is well-summarized by
>
>     return self[:]
>
> especially if followed by the text
>
>     Note that ``self[:]`` might not actually make a copy -- if the affix
>     is empty or not found, and if ``type(self) is str``, then these methods
>     may, but are not required to, make the optimization of returning
>     ``self``.
>     However, when called on instances of subclasses of ``str``, these
>     methods should return base ``str`` objects, not ``self``.
>

Perhaps:

Note that ``self[:]`` might not actually make a copy of ``self``.
If the affix is empty or not found, and if ``type(self)`` is immutable,
then these methods may, but are not required to, make the
optimization of returning ``self``. ...

[...]
>
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/UWIFAGIT6CKVYGWOCWHAUNFCVSS6TJ3X/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-27 Thread senthil
On Sun, Mar 22, 2020 at 05:00:10AM -, Dennis Sweeney wrote:
> I like "removeprefix" and "removesuffix". My only concern before had
> been length, but three more characters than "cut***fix" is a small
> price to pay for clarity.

I personally rely on auto-complete of my editor while writing. So,
thinking about these methods in "correct" terms might be more
important to me than the length.

+1 for removeprefix and removesuffix.

Thanks,
Senthil
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/L4NKSHGHVPLJ4PLBED5M6CCT5UX2BH5K/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-26 Thread Ethan Furman

First off, thank you for being so patient -- trying to champion a PEP can be 
exhausting.

On 03/26/2020 05:22 PM, Dennis Sweeney wrote:

> Ethan Furman wrote:
> > I don't understand that last bit -- surely, if I'm bothering to implement
> > removeprefix and removesuffix in my subclass, I would also want to
> > return self to keep my subclass?  Why would I want to go through the extra
> > overhead of either calling my own __getitem__ method, or have the
> > str.__getitem__ method discard my subclass?
>
> I should clarify: by "when working with subclasses" I meant "when
> str.removeprefix() is called on a subclass that does not override
> removeprefix", and in that case it should return a base str.

Okay.

> > However, if you are saying that self[:] will call
> > self.__class__.__getitem__ so my subclass only has to override
> > __getitem__ instead of removeprefix and removesuffix, that I can be
> > happy with.
>
> I was only saying that the new methods should match 20 other methods in
> the str API by always returning a base str

Okay.

> If ``return self[:]`` in the PEP is too closely linked to "must call
> user-supplied ``__getitem__`` methods" for it not to be true, and so you're
> suggesting ``return self`` is more faithful, I can understand.
>
> So now if I understand the dilemma up to this point we have:
>
> Benefits of writing ``return self`` in the PEP:
> a - Makes it clear that the optimization of not copying is allowed
> b - Makes it clear that ``self.__class__.__getitem__`` isn't used
>
> Benefits of writing ``return self[:]`` in the PEP:
> c - Makes it clear that returning self is an implementation detail
> d - For subclasses not overriding ``__getitem__`` (the majority of cases),
>     makes it clear that this method will return a base str like the other
>     str methods.
>
> Did I miss anything?


The only thing you missed is that, for me at least, points A, C, and D are
not at all clear from the example code.  If I wanted to be explicit about
the return type being `str` I would write:

    return str(self)   # subclasses are coerced to str
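
A quick sketch of what ``str(self)`` does for subclasses, together with the
``__str__`` caveat raised elsewhere in this thread. The class names ``Plain``
and ``Loud`` are made up for the example; nothing here depends on the new
methods.

```python
class Plain(str):
    pass

class Loud(str):
    def __str__(self):
        # A subclass is free to override __str__ ...
        return "LOUD"

p = Plain("abc")
assert type(str(p)) is str and str(p) == "abc"   # coerced to base str
assert type(p[:]) is str and p[:] == "abc"       # slicing also gives base str

loud = Loud("abc")
assert str(loud) == "LOUD"   # ... so str(self) is not a pure "cast"
assert loud[:] == "abc"      # slicing still sees the underlying characters
```

So ``str(self)`` states the return type plainly, but it only matches
``self[:]`` when the subclass leaves ``__str__`` alone.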

--
~Ethan~
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/XWVJNFN5O3BZQ6YQQEWHMOGRWST3MY6M/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-26 Thread Dennis Sweeney
> I don't understand that last bit -- surely, if I'm bothering to implement
> removeprefix and removesuffix in my subclass, I would also want to
> return self to keep my subclass?  Why would I want to go through the extra
> overhead of either calling my own __getitem__ method, or have the
> str.__getitem__ method discard my subclass?

I should clarify: by "when working with subclasses" I meant "when
str.removeprefix() is called on a subclass that does not override
removeprefix", and in that case it should return a base str. I was
not taking a stance on how the methods should be overridden, and
I'm not sure there are many use cases where it should be.

> However, if you are saying that self[:] will call
> self.__class__.__getitem__ so my subclass only has to override
> __getitem__ instead of removeprefix and removesuffix, that I can be
> happy with.

I was only saying that the new methods should match 20 other methods in 
the str API by always returning a base str (the exceptions being format,
format_map, and (r)partition for some reason). I did not mean to suggest
that they should ever call user-supplied ``__getitem__`` code -- I don't
think they need to. I haven't found anyone trying to use ``str`` as a
mixin class/ABC, and it seems that this would be very difficult to do
given that none of its methods currently rely on 
``self.__class__.__getitem__``.

If ``return self[:]`` in the PEP is too closely linked to "must call 
user-supplied ``__getitem__`` methods" for it not to be true, and so you're
suggesting ``return self`` is more faithful, I can understand.

So now if I understand the dilemma up to this point we have:

Benefits of writing ``return self`` in the PEP:
- Makes it clear that the optimization of not copying is allowed
- Makes it clear that ``self.__class__.__getitem__`` isn't used

Benefits of writing ``return self[:]`` in the PEP:
- Makes it clear that returning self is an implementation detail
- For subclasses not overriding ``__getitem__`` (the majority of cases),
  makes it clear that this method will return a base str like the other
  str methods.

Did I miss anything?

All the best,
Dennis
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/EQVVXMC7XQJSQIHEB7ND2OLWBQLC7QYM/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-26 Thread Sebastian Rittau

On 26.03.20 at 06:28, Cameron Simpson wrote:

> On 24Mar2020 18:49, Brett Cannon wrote:
> > -1 on "cut*" because my brain keeps reading it as "cute".
> > +1 on "trim*" as it is clear what's going on and no confusion with
> > preexisting methods.
> > +1 on "remove*" for the same reasons as "trim*".
>
> I reiterate my huge -1 on "trim" because it will confuse every PHP
> user who comes to us from the dark side. Over there "trim" means what
> our "strip" means.
>
> I've got (differing) opinions about the others, but "trim" is a big
> one to me.

As a full stack developer with terrible memory, I agree. JavaScript also
uses trim() like Python's strip(), and this would quickly get confusing.


 - Sebastian

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/RNVBORUJDVWBQPELPSPYMNHBVLCZ4B5B/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-26 Thread Dennis Sweeney
> I imagine it's an implementation detail of which ones depend on 
> ``__getitem__``.

If we write

    class MyStr(str):
        def __getitem__(self, key):
            raise ZeroDivisionError()

then all of the assertions from before still pass, so in fact *none* of
the methods rely on ``__getitem__``. As of now ``str`` does not behave
as an ABC at all.
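
A self-contained version of that experiment, as a sketch: only a handful of
methods are exercised here, not the full list of assertions from the earlier
message, and the sample string is arbitrary.

```python
class MyStr(str):
    def __getitem__(self, key):
        raise ZeroDivisionError()

s = MyStr("  Foo  ")

# None of these built-in methods go through the overridden __getitem__.
assert s.strip() == "Foo"
assert s.lower() == "  foo  "
assert s.split() == ["Foo"]
assert s.partition("o")[0] == "  F"

# Explicit indexing, on the other hand, does hit the override.
try:
    s[0]
except ZeroDivisionError:
    indexing_raised = True
else:
    indexing_raised = False
assert indexing_raised
```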

But it's an interesting proposal to essentially make it an ABC. Although
it makes me curious what all of the different reasons people actually have
for subclassing ``str``. All of the examples I found in the stdlib were
either (1) contrived test cases (2) strings (e.g. grammar tokens) with
some extra attributes along for the ride, or (3) string-based enums.
None of types (2) or (3) ever overrode ``__getitem__``, so it doesn't
feel like that common of a use case.

> I don't see removeprefix and removesuffix explicitly being implemented
> in terms of slicing operations as a huge win - you've demonstrated that
> someone who wants a persistent string subclass still would need to
> override a /lot/ of methods, so two more shouldn't hurt much - I just
> think that "consistent with most of the other methods" is a
> /particularly/ good reason to avoid explicitly defining these operations
> in terms of __getitem__. 

Making sure I understand: would you prefer the PEP to say ``return self``
rather than ``return self[:]``? I never had the intention of ``self[:]``
meaning "this must have exactly the behavior of 
``self.__getitem__(slice(None, None))`` regardless of type", but I can 
understand if that's how you're saying it could be interpreted.
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/A64Q6BXTXJYNTA4NX2GHBMOG6FPZUCZP/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-25 Thread Cameron Simpson

On 25Mar2020 08:14, Paul Moore  wrote:

> [...] The issue for me is how the function
> should behave with a list of affixes if one is a prefix of another,
> e.g., removeprefix(('Test', 'Tests')). The empty string case is just
> one form of that. The behaviour should be defined clearly, and while I
> imagine "always remove the longest" is the "obvious" sensible choice,
> I am fairly certain there will be other opinions :-) So deferring the
> decision for now until we have more experience with the single-affix
> form seems perfectly reasonable.


I'd like to preface this with "I'm fine to implement multiple affixes 
later, if at all". That said:


To me "first match" is the _only_ sensible choice. "longest match" can 
always be implemented with a "first match" function by sorting on length 
if desired.


Also, "longest first" requires the implementation to do a prescan of the 
supplied affixes whereas "first match" lets the implementation just 
iterate over the choices as supplied.
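
The point about building "longest match" on top of "first match" can be
sketched as follows. Both function names are hypothetical helpers written for
this illustration; the tuple-of-affixes form was dropped from the PEP, so this
is not an API that exists on ``str``.

```python
def remove_first_prefix(s, prefixes):
    """Remove the first prefix, in the order supplied, that matches."""
    for prefix in prefixes:
        if s.startswith(prefix):
            return s[len(prefix):]
    return s

def remove_longest_prefix(s, prefixes):
    """Longest-match policy, built from first-match by sorting on length."""
    by_length = sorted(prefixes, key=len, reverse=True)
    return remove_first_prefix(s, by_length)

name = "Tests.py"
# First match: 'Test' is tried first and wins, leaving the trailing 's'.
assert remove_first_prefix(name, ("Test", "Tests")) == "s.py"
# Longest match: 'Tests' is tried first after sorting.
assert remove_longest_prefix(name, ("Test", "Tests")) == ".py"
```

The reverse construction is not possible: a function that always picks the
longest affix gives the caller no way to express a priority order.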


I'm beginning to think I must again threaten my partner's anecdote about 
Netscape Proxy's rule system, which prioritised rules by the lexical 
length of their regexp, not their config file order of appearance. That 
way lies (and, indeed, lay) madness.


Cheers,
Cameron Simpson 
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/RRWO6NIRC23F3FWEYJYFDUSVL6PTIQ5C/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-25 Thread Cameron Simpson

On 24Mar2020 18:49, Brett Cannon  wrote:

> -1 on "cut*" because my brain keeps reading it as "cute".
> +1 on "trim*" as it is clear what's going on and no confusion with
> preexisting methods.
> +1 on "remove*" for the same reasons as "trim*".


I reiterate my huge -1 on "trim" because it will confuse every PHP user 
who comes to us from the dark side. Over there "trim" means what our 
"strip" means.


I've got (differing) opinions about the others, but "trim" is a big one 
to me.


Cheers,
Cameron Simpson 
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/FCMSA66UPTJV7YDINYJFQCVPO6B67JZN/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-25 Thread Steven D'Aprano
On Tue, Mar 24, 2020 at 07:14:16PM +0100, Victor Stinner wrote:

> I would prefer to raise ValueError("empty separator") to avoid any
> risk of confusion. I'm not sure that str.cutprefix("") or
> str.cutsuffix("") does make any sense.

They make as much sense as any other null-operation, such as subtracting 
0 or deleting empty slices from lists.

Every string s is unchanged if you prepend or concatenate the empty 
string:

assert s == ''+s == s+''

so removing the empty string should obey the same invariant:

assert s == s.removeprefix('') == s.removesuffix('')
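
The same two invariants, run over a few sample strings. This is only a sketch
and assumes Python 3.9+, where the methods ended up under the ``remove*``
names; the sample list is arbitrary.

```python
samples = ["", "a", "spam", "  padded  ", "prefix-and-suffix"]

for s in samples:
    # Adding the empty string changes nothing...
    assert s == "" + s == s + ""
    # ...and neither does removing it.
    assert s == s.removeprefix("") == s.removesuffix("")
    # The other extreme: removing the whole string leaves the empty string.
    assert s.removeprefix(s) == "" == s.removesuffix(s)
```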


-- 
Steven
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/X7N57XKA3S7TK4W6OUJBCCLDSTERK636/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-25 Thread Paul Ganssle
I imagine it's an implementation detail of which ones depend on __getitem__.

The only methods that would be reasonably amenable to a guarantee like
"always returns the same thing as __getitem__" would be (l|r|)strip(),
split(), splitlines(), and .partition(), because they only work with
subsets of the input string.

Most of the other stuff involves constructing new strings and it's
harder to cast them in terms of other "primitive operations" since
strings are immutable.

I suspect that to the extent that the ones that /could/ be implemented
in terms of __getitem__ are returning base strings, it's either because
no one thought about doing it at the time and they used another
mechanism or it was a deliberate choice to be consistent with the other
methods.

I don't see removeprefix and removesuffix explicitly being implemented
in terms of slicing operations as a huge win - you've demonstrated that
someone who wants a persistent string subclass still would need to
override a /lot/ of methods, so two more shouldn't hurt much - I just
think that "consistent with most of the other methods" is a
/particularly/ good reason to avoid explicitly defining these operations
in terms of __getitem__. The /default/ semantics are the same (i.e. if
you don't explicitly change the return type of __getitem__, it won't
change the return type of the remove* methods), and the only difference
is that for all the /other/ methods, it's an implementation detail
whether they call __getitem__, whereas for the remove methods it would
be explicitly documented.

In my ideal world, a lot of these methods would be redefined in terms of
a small set of primitives that people writing subclasses could implement
as a protocol that would allow methods called on the functions to retain
their class, but I think the time for that has passed. Still, I don't
think it would /hurt/ for new methods to be defined in terms of what
primitive operations exist where possible.

Best,
Paul

On 3/25/20 3:09 PM, Dennis Sweeney wrote:
> I was surprised by the following behavior:
>
> class MyStr(str):
>     def __getitem__(self, key):
>         if (isinstance(key, slice)
>                 and key.start is key.stop is key.step is None):
>             return self
>         return type(self)(super().__getitem__(key))
>
>
> my_foo = MyStr("foo")
> MY_FOO = MyStr("FOO")
> My_Foo = MyStr("Foo")
> empty = MyStr("")
>
> assert type(my_foo.casefold()) is str
> assert type(MY_FOO.capitalize()) is str
> assert type(my_foo.center(3)) is str
> assert type(my_foo.expandtabs()) is str
> assert type(my_foo.join(())) is str
> assert type(my_foo.ljust(3)) is str
> assert type(my_foo.lower()) is str
> assert type(my_foo.lstrip()) is str
> assert type(my_foo.replace("x", "y")) is str
> assert type(my_foo.split()[0]) is str
> assert type(my_foo.splitlines()[0]) is str
> assert type(my_foo.strip()) is str
> assert type(empty.swapcase()) is str
> assert type(My_Foo.title()) is str
> assert type(MY_FOO.upper()) is str
> assert type(my_foo.zfill(3)) is str
>
> assert type(my_foo.partition("z")[0]) is MyStr
> assert type(my_foo.format()) is MyStr
>
> I was under the impression that all of the ``str`` methods exclusively 
> returned base ``str`` objects. Is there any reason why those two are 
> different, and is there a reason that would apply to ``removeprefix`` and 
> ``removesuffix`` as well?
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/L3ZQLTUWWTNKCWTTSJSOX3ME4EDSS4FR/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-25 Thread Dennis Sweeney
I was surprised by the following behavior:

class MyStr(str):
    def __getitem__(self, key):
        if (isinstance(key, slice)
                and key.start is key.stop is key.step is None):
            return self
        return type(self)(super().__getitem__(key))


my_foo = MyStr("foo")
MY_FOO = MyStr("FOO")
My_Foo = MyStr("Foo")
empty = MyStr("")

assert type(my_foo.casefold()) is str
assert type(MY_FOO.capitalize()) is str
assert type(my_foo.center(3)) is str
assert type(my_foo.expandtabs()) is str
assert type(my_foo.join(())) is str
assert type(my_foo.ljust(3)) is str
assert type(my_foo.lower()) is str
assert type(my_foo.lstrip()) is str
assert type(my_foo.replace("x", "y")) is str
assert type(my_foo.split()[0]) is str
assert type(my_foo.splitlines()[0]) is str
assert type(my_foo.strip()) is str
assert type(empty.swapcase()) is str
assert type(My_Foo.title()) is str
assert type(MY_FOO.upper()) is str
assert type(my_foo.zfill(3)) is str

assert type(my_foo.partition("z")[0]) is MyStr
assert type(my_foo.format()) is MyStr

I was under the impression that all of the ``str`` methods exclusively returned 
base ``str`` objects. Is there any reason why those two are different, and is 
there a reason that would apply to ``removeprefix`` and ``removesuffix`` as 
well?
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/TVDATHMCK25GT4OTBUBDWG3TBJN6DOKK/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-25 Thread Ethan Furman

Dennis Sweeney wrote:
---

It appears that in CPython, self[:] is self is true for base
str objects, so I think return self[:] is consistent with (1) the
premise that returning self is an implementation detail that is
neither mandated nor forbidden, and (2) the premise that the
methods should return base str objects even when called on str
subclasses.


Ethan Furman wrote:
---

The Python interpreter in my head sees self[:] and returns a copy.


Dennis Sweeney wrote:
---

I think I'm still in the camp that ``return self[:]`` more precisely
prescribes the desired behavior. It would feel strange to me to write
``return self`` and then say "but you don't actually have to return self,
and in fact you shouldn't when working with subclasses".



I don't understand that last bit -- surely, if I'm bothering to implement
`removeprefix` and `removesuffix` in my subclass, I would also want to
`return self` to keep my subclass?  Why would I want to go through the extra
overhead of either calling my own `__getitem__` method, or have the
`str.__getitem__` method discard my subclass?

However, if you are saying that `self[:]` *will* call
`self.__class__.__getitem__` so my subclass only has to override
`__getitem__` instead of `removeprefix` and `removesuffix`, that I can be
happy with.

--
~Ethan~
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/GTSUGU3CLYGS6R6DPEPNKD4IBN6PESGW/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-25 Thread Paul Ganssle
I've said a few times that I think it would be good if the behavior were
defined /in terms of __getitem__/'s behavior. If the rough behavior is this:

def removeprefix(self, prefix):
    if self.startswith(prefix):
        return self[len(prefix):]
    else:
        return self[:]

Then you can shift all the guarantees about whether the subtype is str
and whether it might return `self` when the prefix is missing onto the
implementation of __getitem__.

For CPython's implementation of str, `self[:]` returns `self`, so it's
clearly true that __getitem__ is allowed to return `self` in some
situations. Subclasses that do not override __getitem__ will return the
str base class, and subclasses that /do/ overwrite __getitem__ can
choose what they want to do. So someone could make their subclass do this:

class MyStr(str):
    def __getitem__(self, key):
        if (isinstance(key, slice)
                and key.start is key.stop is key.step is None):
            return self
        return type(self)(super().__getitem__(key))

They would then get "removeprefix" and "removesuffix" for free, with the
desired semantics and optimizations.

If we go with this approach (which again I think is much friendlier to
subclassers), that obviates the problem of whether `self[:]` is a good
summary of something that can return `self`: since "does the same thing
as self[:]" /is/ the behavior it's trying to describe, there's no ambiguity.
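
A runnable sketch of that idea, written as a free function because the
built-in ``str.removeprefix`` does not dispatch through ``__getitem__``. The
function and the ``MyStr`` class below illustrate the proposal only; they are
not how CPython implements the method.

```python
def removeprefix(self, prefix):
    # Defined purely in terms of slicing, as proposed above.
    if self.startswith(prefix):
        return self[len(prefix):]
    return self[:]

class MyStr(str):
    def __getitem__(self, key):
        # A full slice [:] hands back the same object; slice objects expose
        # start, stop and step.
        if (isinstance(key, slice)
                and key.start is key.stop is key.step is None):
            return self
        # Everything else is re-wrapped so the subclass is preserved.
        return type(self)(super().__getitem__(key))

s = MyStr("foobar")
assert removeprefix(s, "foo") == "bar"
assert type(removeprefix(s, "foo")) is MyStr   # subclass kept "for free"
assert removeprefix(s, "xyz") is s             # no copy when nothing matches

# A plain str gets plain str back from the same function.
assert type(removeprefix("foobar", "foo")) is str
```

Under this scheme the subclass author overrides one method and the affix
functions inherit its choices, which is the friendliness to subclassers
argued for above.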

Best,
Paul

On 3/25/20 1:36 PM, Dennis Sweeney wrote:
> I'm removing the tuple feature from this PEP. So now, if I understand
> correctly, I don't think there's disagreement about behavior, just about
> how that behavior should be summarized in Python code. 
>
> Ethan Furman wrote:
>>> It appears that in CPython, self[:] is self is true for base str
>>> objects, so I think return self[:] is consistent with (1) the premise
>>> that returning self is an implementation detail that is neither mandated
>>> nor forbidden, and (2) the premise that the methods should return base
>>> str objects even when called on str subclasses.
>> The Python interpreter in my head sees self[:] and returns a copy. A
>> note that says a str is returned would be more useful than trying to
>> exactly mirror internal details in the Python "roughly equivalent" code.
> I think I'm still in the camp that ``return self[:]`` more precisely
> prescribes the desired behavior. It would feel strange to me to write
> ``return self`` and then say "but you don't actually have to return self,
> and in fact you shouldn't when working with subclasses". To me, it feels
> like
>
> return (the original object unchanged, or a copy of the object, 
> depending on implementation details, 
> but always make a copy when working with subclasses)
>
> is well-summarized by
>
>return self[:]
>
> especially if followed by the text
>
> Note that ``self[:]`` might not actually make a copy -- if the affix
> is empty or not found, and if ``type(self) is str``, then these methods
> may, but are not required to, make the optimization of returning ``self``.
> However, when called on instances of subclasses of ``str``, these
> methods should return base ``str`` objects, not ``self``.
>
> ...which is a necessary explanation regardless. Granted, ``return self[:]``
> isn't perfect if ``__getitem__`` is overridden, but at the cost of three
> characters, the Python gains accuracy over both the optional nature of
> returning ``self`` in all cases and the impossibility (assuming no dunders
> are overridden) of returning self for subclasses. It also dissuades readers
> from relying on the behavior of returning self, which we're specifying is
> an implementation detail.
>
> Is that text explanation satisfactory?
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/GG5BOKPQCP7J5RRWABEYOZDNDTH3UC6T/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-25 Thread Eric V. Smith

On 3/25/2020 1:36 PM, Dennis Sweeney wrote:

> I'm removing the tuple feature from this PEP. So now, if I understand
> correctly, I don't think there's disagreement about behavior, just about
> how that behavior should be summarized in Python code.

I think that's right.

> Ethan Furman wrote:
> > > It appears that in CPython, self[:] is self is true for base str
> > > objects, so I think return self[:] is consistent with (1) the premise
> > > that returning self is an implementation detail that is neither
> > > mandated nor forbidden, and (2) the premise that the methods should
> > > return base str objects even when called on str subclasses.
> > The Python interpreter in my head sees self[:] and returns a copy. A
> > note that says a str is returned would be more useful than trying to
> > exactly mirror internal details in the Python "roughly equivalent" code.
>
> I think I'm still in the camp that ``return self[:]`` more precisely
> prescribes the desired behavior. It would feel strange to me to write
> ``return self`` and then say "but you don't actually have to return self,
> and in fact you shouldn't when working with subclasses". To me, it feels
> like
>
>     return (the original object unchanged, or a copy of the object,
>             depending on implementation details,
>             but always make a copy when working with subclasses)
>
> is well-summarized by
>
>     return self[:]
>
> especially if followed by the text
>
>     Note that ``self[:]`` might not actually make a copy -- if the affix
>     is empty or not found, and if ``type(self) is str``, then these methods
>     may, but are not required to, make the optimization of returning ``self``.
>     However, when called on instances of subclasses of ``str``, these
>     methods should return base ``str`` objects, not ``self``.
>
> ...which is a necessary explanation regardless. Granted, ``return self[:]``
> isn't perfect if ``__getitem__`` is overridden, but at the cost of three
> characters, the Python gains accuracy over both the optional nature of
> returning ``self`` in all cases and the impossibility (assuming no dunders
> are overridden) of returning self for subclasses. It also dissuades readers
> from relying on the behavior of returning self, which we're specifying is
> an implementation detail.
>
> Is that text explanation satisfactory?


Yes, that makes sense to me. I haven't had time to review the most 
recent updates, and I'll probably wait until you update it one more time.


Eric
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/VE65KII4ZAQV4BR4RACCIOQ27BWRPNYI/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-25 Thread Dennis Sweeney
I'm removing the tuple feature from this PEP. So now, if I understand
correctly, I don't think there's disagreement about behavior, just about
how that behavior should be summarized in Python code. 

Ethan Furman wrote:
> > It appears that in CPython, self[:] is self is true for base str
> > objects, so I think return self[:] is consistent with (1) the premise
> > that returning self is an implementation detail that is neither mandated
> > nor forbidden, and (2) the premise that the methods should return base
> > str objects even when called on str subclasses.
> The Python interpreter in my head sees self[:] and returns a copy. A
> note that says a str is returned would be more useful than trying to
> exactly mirror internal details in the Python "roughly equivalent" code.

I think I'm still in the camp that ``return self[:]`` more precisely prescribes
the desired behavior. It would feel strange to me to write ``return self``
and then say "but you don't actually have to return self, and in fact
you shouldn't when working with subclasses". To me, it feels like

return (the original object unchanged, or a copy of the object, 
depending on implementation details, 
but always make a copy when working with subclasses)

is well-summarized by

   return self[:]

especially if followed by the text

Note that ``self[:]`` might not actually make a copy -- if the affix
is empty or not found, and if ``type(self) is str``, then these methods
may, but are not required to, make the optimization of returning ``self``.
However, when called on instances of subclasses of ``str``, these
methods should return base ``str`` objects, not ``self``.

...which is a necessary explanation regardless. Granted, ``return self[:]``
isn't perfect if ``__getitem__`` is overridden, but at the cost of three
characters, the Python code gains accuracy over both the optional nature of
returning ``self`` in all cases and the impossibility (assuming no dunders
are overridden) of returning self for subclasses. It also dissuades readers
from relying on the behavior of returning self, which we're specifying is
an implementation detail.

Is that text explanation satisfactory?
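
A minimal sketch of the semantics discussed above, written as a free function so that it runs on any Python 3. The names ``removeprefix_sketch`` and ``Tagged`` are hypothetical and exist only for illustration. It shows that ``return s[:]`` yields a base ``str`` even for a subclass instance, while leaving open whether the original object is returned for a plain ``str``:

```python
def removeprefix_sketch(s: str, prefix: str) -> str:
    """Roughly equivalent Python for the single-prefix behavior."""
    if s.startswith(prefix):
        return s[len(prefix):]
    # A full slice may or may not copy a plain str (implementation detail),
    # but it does not preserve the type of a str subclass.
    return s[:]


class Tagged(str):
    """Hypothetical str subclass that overrides no dunder methods."""


plain = "spam.py"
tagged = Tagged("spam.py")

print(removeprefix_sketch(plain, "spam"))           # prefix found: '.py'
print(type(removeprefix_sketch(tagged, "eggs")))    # prefix not found: <class 'str'>
print(removeprefix_sketch(plain, "eggs") == plain)  # equal; identity is not promised
```

Code that needs the subclass type back would have to re-wrap the result itself, since the sketch (like the proposed methods) never returns the subclass instance.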
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/4E77QD52JCMHSP7O62C57XILLQN6SPCT/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-25 Thread Walter Dörwald

On 25 Mar 2020, at 9:48, Stephen J. Turnbull wrote:

> Walter Dörwald writes:
>
> > A `find()` that supports multiple search strings (and returns the
> > leftmost position where a search string can be found) is a great help in
> > implementing some kind of tokenizer:
>
> In other words, you want the equivalent of Emacs's "(search-forward
> (regexp-opt list-of-strings))", which also meets the requirement of
> returning which string was found (as "(match-string 0)").

Sounds like it. I'm not familiar with Emacs.

> Since Python already has a functionally similar API for regexps, we
> can add a regexp-opt (with appropriate name) method to re, perhaps as
> .compile_string_list(), and provide a convenience function
> re.search_string_list() for your application.

If you're using regexps anyway, building the appropriate or-expression
shouldn't be a problem. I guess that's what most lexers/tokenizers do
anyway.

> I'm applying practicality before purity, of course.  To some extent
> we want to encourage simple string approaches, and putting this in
> regex is not optimal for that.

Exactly. I'm always a bit hesitant when using regexps, if there's a
simpler string approach.

> Steve


Servus,
   Walter
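
For reference, the "or-expression" approach mentioned above might look like the following sketch. The helper name `find_any` is hypothetical (it is not an existing `str` or `re` API); it relies only on `re.escape`, `re.compile` and `Pattern.search`, and it assumes a non-empty collection of non-empty search strings. Sorting longest-first is just one possible way to resolve ties between search strings that start at the same position:

```python
import re


def find_any(text, needles, start=0):
    """Return (position, needle) for the leftmost needle found, or (-1, None)."""
    # Escape each needle so it is matched literally; put longer needles first
    # so that "<=" wins over "<" when both match at the same position.
    ordered = sorted(needles, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(n) for n in ordered))
    match = pattern.search(text, start)
    if match is None:
        return -1, None
    return match.start(), match.group(0)


print(find_any("a = b + c", ["+", "="]))  # (2, '=')
print(find_any("x <= y", ["<", "<="]))    # (2, '<=')
print(find_any("abc", ["z"]))             # (-1, None)
```

A tokenizer would typically compile the pattern once and call `search` repeatedly with an advancing `start`, rather than rebuilding it on every call as this sketch does.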
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/46KMMKYHW7DIDNZFO27GNQCJVILNSQ6Q/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-25 Thread Stephen J. Turnbull
Walter Dörwald writes:

 > A `find()` that supports multiple search strings (and returns the 
 > leftmost position where a search string can be found) is a great help in 
 > implementing some kind of tokenizer:

In other words, you want the equivalent of Emacs's "(search-forward
(regexp-opt list-of-strings))", which also meets the requirement of
returning which string was found (as "(match-string 0)").

Since Python already has a functionally similar API for regexps, we
can add a regexp-opt (with appropriate name) method to re, perhaps as
.compile_string_list(), and provide a convenience function
re.search_string_list() for your application.

I'm applying practicality before purity, of course.  To some extent
we want to encourage simple string approaches, and putting this in
regex is not optimal for that.

Steve
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/SUUZNXUTB774GWSLPKLPVHN7VK237I2D/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-25 Thread Paul Moore
On Wed, 25 Mar 2020 at 00:42, Dennis Sweeney
 wrote:
>
> There were at least two comments suggesting keeping it to one affix at a time:
>
> https://mail.python.org/archives/list/python-dev@python.org/message/GPXSIDLKTI6WKH5EKJWZEG5KR4AQ6P3J/
>
> https://mail.python.org/archives/list/python-dev@python.org/message/EDWFPEGQBPTQTVZV5NDRC2DLSKCXVJPZ/
>
> But I didn't see any big objections to the rest of the PEP, so I think maybe 
> we keep it restricted for now.

That sounds like a good idea. The issue for me is how the function
should behave with a list of affixes if one is a prefix of another,
e.g., removeprefix(('Test', 'Tests')). The empty string case is just
one form of that. The behaviour should be defined clearly, and while I
imagine "always remove the longest" is the "obvious" sensible choice,
I am fairly certain there will be other opinions :-) So deferring the
decision for now until we have more experience with the single-affix
form seems perfectly reasonable.

I'm not even sure that switching to multiple affixes later would need
a PEP - it might be fine to add via a simple feature request issue.
But that can be a decision for later, too.

Paul
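
To make the ambiguity concrete, here is a small sketch of two candidate policies for a tuple of prefixes. Both helper names are hypothetical; neither is a proposed or existing `str` method, and the sketch takes no position on which policy is preferable:

```python
def remove_first_listed_prefix(s, prefixes):
    """Remove the first prefix in tuple order that matches."""
    for p in prefixes:
        if s.startswith(p):
            return s[len(p):]
    return s


def remove_longest_prefix(s, prefixes):
    """Remove the longest matching prefix, regardless of tuple order."""
    matches = [p for p in prefixes if s.startswith(p)]
    if not matches:
        return s
    return s[len(max(matches, key=len)):]


name = "TestsForFoo"
print(remove_first_listed_prefix(name, ("Test", "Tests")))  # 'sForFoo'
print(remove_longest_prefix(name, ("Test", "Tests")))       # 'ForFoo'
# The empty string matches everything, so it shadows later entries:
print(remove_first_listed_prefix(name, ("", "Tests")))      # 'TestsForFoo'
```

Under the first-listed policy the result depends on the order of the tuple; under the longest-match policy it does not, which is one reason the choice needs to be spelled out.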
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/4ELRLN2V3OIXD7PPSTMCGMH3METJWS5W/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-24 Thread Steven D'Aprano
On Tue, Mar 24, 2020 at 04:53:55PM +0100, Walter Dörwald wrote:

> But for `cutprefix()` (or whatever it's going to be called). I'm +1 on 
> supporting multiple prefixes. For ambiguous cases, IMHO the most 
> straightforward option would be to chop off the first prefix found.

The Zen of Python has something to say about guessing in the face of 
ambiguity.


-- 
Steven
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/B6VA7OGD7PCIG67OELV2B4ZQK6KFT4Z2/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-24 Thread Victor Stinner
Thanks for the pointers to emails.

Ethan Furman: "This is why replace() still only takes a single
substring to match and this isn't supported: (...)"

Hum ok, it makes sense. I agree that we can start with only accepting
str (reject tuple), and maybe reconsider the idea of accepting a tuple
of str later.

Please move the idea to Rejected Ideas, and also try to summarize the
reasons why the idea was rejected. I saw:

* surprising result for empty prefix/suffix
* surprising result for "FooBar text".cutprefix(("Foo", "FooBar"))
* issue with unordered sequence like set: only accept tuple which is ordered
* str.replace() only accepts str.replace(str, str) to avoid these
issues: the idea of accepting str.replace(tuple of str, str) or
variant was rejected multiple times. XXX does someone have references
to past discussions? I found https://bugs.python.org/issue33647 which
is a little bit different.

You may mention re.sub() as an existing efficient solution for the
complex cases.

I have to confess that I had to think twice when I wrote my example
line.cutsuffix(("\r\n", "\r", "\n")). Did I write suffixes in the
correct order to get what I expect? :-) "\r\n" starts with "\r".

Victor
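
As an illustration of the ordering problem and of the regex alternative, here is a sketch. Both helper names are hypothetical. The first helper assumes "remove the first listed suffix that matches" semantics; the second uses `re.sub` with the alternatives deliberately listed in the "wrong" order, relying on the `\Z` end-of-string anchor to make the order irrelevant:

```python
import re


def cut_first_listed_suffix(s, suffixes):
    """Remove the first suffix in tuple order that matches."""
    for suffix in suffixes:
        if s.endswith(suffix):
            return s[:len(s) - len(suffix)]
    return s


# The anchor forces the match to end at the end of the string, so "\r"
# alone cannot match in "abc\r\n" and the engine falls through to "\r\n".
_ONE_EOL = re.compile(r"(?:\r|\n|\r\n)\Z")


def cut_one_eol(line):
    """Remove exactly one trailing line terminator, if present."""
    return _ONE_EOL.sub("", line, count=1)


line = "abc\r\n"
print(repr(cut_first_listed_suffix(line, ("\r\n", "\r", "\n"))))  # 'abc'
print(repr(cut_first_listed_suffix(line, ("\r", "\n", "\r\n"))))  # 'abc\r'
print(repr(cut_one_eol(line)))                                    # 'abc'
```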

Le mer. 25 mars 2020 à 01:44, Dennis Sweeney
 a écrit :
>
> There were at least two comments suggesting keeping it to one affix at a time:
>
> https://mail.python.org/archives/list/python-dev@python.org/message/GPXSIDLKTI6WKH5EKJWZEG5KR4AQ6P3J/
>
> https://mail.python.org/archives/list/python-dev@python.org/message/EDWFPEGQBPTQTVZV5NDRC2DLSKCXVJPZ/
>
> But I didn't see any big objections to the rest of the PEP, so I think maybe 
> we keep it restricted for now.



-- 
Night gathers, and now my watch begins. It shall not end until my death.
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/XVKF24TEJOCNUTK5IOXJIUOBBC6BNT5K/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-24 Thread Ethan Furman

On 03/24/2020 06:10 PM, Victor Stinner wrote:

> Ethan Furman: "This is why replace() still only takes a single
> substring to match and this isn't supported: (...)"


Correction:  The above quote belongs to Steven D'Aprano.

--
~Ethan~
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/62SRLYGK74JGUNVQ4WS63HQOQBW3LOUS/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-24 Thread Dennis Sweeney
There were at least two comments suggesting keeping it to one affix at a time:

https://mail.python.org/archives/list/python-dev@python.org/message/GPXSIDLKTI6WKH5EKJWZEG5KR4AQ6P3J/

https://mail.python.org/archives/list/python-dev@python.org/message/EDWFPEGQBPTQTVZV5NDRC2DLSKCXVJPZ/

But I didn't see any big objections to the rest of the PEP, so I think maybe we 
keep it restricted for now.
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/QBCB2QMUMYBLPXHB6VKIKFK7OODYVKX5/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-24 Thread Victor Stinner
Le mer. 25 mars 2020 à 00:29, Dennis Sweeney
 a écrit :
> Lastly, since the issue of multiple prefixes/suffixes is more controversial 
> and seems that it would not affect how the single-affix cases would work, I 
> can remove that from this PEP and allow someone else with a stronger opinion 
> about it to propose and defend a set of semantics in a different PEP. Is 
> there any objection to deferring this to a different PEP?

name.cutsuffix(('Mixin', 'Tests', 'Test')) is used in the "Motivating
examples from the Python standard library" section. It looks like a
nice usage of this feature. You added "There were many other such
examples in the stdlib."

What do you mean by controversial? I proposed to raise an exception if the
prefix/suffix is empty to make cutsuffix(("", "suffix")) less
surprising. But I'm also fine if you keep this behavior, since
startswith/endswith accepts an empty string, and someone wrote that
accepting an empty prefix/suffix is a useful feature.

Or did someone write that cutprefix/cutsuffix must not accept a tuple
of strings? (I'm not sure that I was able to read carefully all
emails.)

I like the ability to pass multiple prefixes and suffixes because it
makes the method similar to lstrip(), rstrip(), strip(), startswith(),
and endswith(), which all accept multiple "values" (characters to remove,
prefixes, suffixes).

Victor
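
A short sketch of how the existing methods treat their "multiple values", using only built-in `str` methods. The example word is arbitrary. Note that the argument to `lstrip()` is a set of characters rather than a prefix, whereas `startswith()` takes a tuple of whole prefixes:

```python
word = "subsection"

# lstrip() removes any leading characters drawn from the set {'s', 'u', 'b'},
# so it also eats the leading "s" of "section".
print(word.lstrip("sub"))               # 'ection'

# startswith() treats each tuple element as a whole prefix.
print(word.startswith(("pre", "sub")))  # True

# Removing one whole prefix with today's idiom:
prefix = "sub"
stripped = word[len(prefix):] if word.startswith(prefix) else word
print(stripped)                         # 'section'
```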
-- 
Night gathers, and now my watch begins. It shall not end until my death.
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/IBPEMCBGC5GXUH7BWZPYGWS22WFICN6L/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-24 Thread Eric V. Smith

On 3/24/2020 7:21 PM, Dennis Sweeney wrote:

> It seems that there is a consensus on the names ``removeprefix`` and
> ``removesuffix``. I will update the PEP accordingly. I'll also simplify sample
> Python implementation to primarily reflect *intent* over strict type-checking
> correctness, and I'll adjust the accompanying commentary accordingly.
>
> Lastly, since the issue of multiple prefixes/suffixes is more controversial and
> seems that it would not affect how the single-affix cases would work, I can
> remove that from this PEP and allow someone else with a stronger opinion about
> it to propose and defend a set of semantics in a different PEP. Is there any
> objection to deferring this to a different PEP?


No objection. I think that's a good idea.

Eric
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/LPTDYL6ZX47D26B4TGZWR5K6I5PWX77U/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-24 Thread Dennis Sweeney
It seems that there is a consensus on the names ``removeprefix`` and
``removesuffix``. I will update the PEP accordingly. I'll also simplify the sample
Python implementation to primarily reflect *intent* over strict type-checking
correctness, and I'll adjust the accompanying commentary accordingly.

Lastly, since the issue of multiple prefixes/suffixes is more controversial and
it seems that it would not affect how the single-affix cases would work, I can
remove that from this PEP and allow someone else with a stronger opinion about
it to propose and defend a set of semantics in a different PEP. Is there any
objection to deferring this to a different PEP?

All the best,
Dennis
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/5JJ5YDUPCLVYSCCFOI4MQG64SLY22HU5/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-24 Thread Greg Ewing

On 25/03/20 9:14 am, Dennis Sweeney wrote:

> I think my confusion is about just how precise this sort of
> "reference implementation" should be. Should it behave with ``str``
> and ``tuple`` subclasses exactly how it would when implemented?


No, I don't think so. The purpose of a Python implementation
of a proposed feature is to get the intended semantics across,
not to reproduce all the quirks of an imagined C implementation.

If you were to bake these details into a Python reference
implementation, you would be implying that these are *intended*
restrictions, which (unless I misunderstand) is not what you
are intending.

(Back when yield-from was being designed, I described the
intended semantics in prose, and gave an approximate Python
equivalent, which went through several revisions as we thrashed
out exactly how the feature should behave. But I don't think
it ever exactly matched all the details of the actual
implementation, nor was it intended to. The prose turned out
to be much more readable, anyway. :-)

--
Greg
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/RD72YLECP7WTXVQBRPHECMGHFQGHWYSO/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-24 Thread Ethan Furman

Dennis Sweeney wrote:
> Steven D'Aprano wrote:
>> Dennis Sweeney wrote:
>>> I think then maybe it would be preferred to
>>> use something like the following in the PEP:
>>>
>>>     def cutprefix(self, prefix, /):
>>>         if isinstance(prefix, str):
>>>             if self.startswith(prefix):
>>>                 return self[len(prefix):]
>>>             return self[:]
>>
>> Didn't we have a discussion about not mandating a copy when nothing
>> changes? For strings, I'd just return self. It is only bytearray that
>> requires a copy to be made.
>
> It appears that in CPython, ``self[:] is self`` is true for base ``str``
>  objects, so I think ``return self[:]`` is consistent with (1) the premise
>  that returning self is an implementation detail that is neither mandated
>  nor forbidden, and (2) the premise that the methods should return base
>  ``str`` objects even when called on ``str`` subclasses.

The Python interpreter in my head sees `self[:]` and returns a copy.  A
note that says a `str` is returned would be more useful than trying to
exactly mirror internal details in the Python "roughly equivalent" code.

>>>         elif isinstance(prefix, tuple):
>>>             for option in prefix:
>>>                 if self.startswith(option):
>>>                     return self[len(option):]
>>
>> I'd also remove the entire multiple substrings feature, for reasons I've
>> already given. "Compatibility with startswith" is not a good reason to
>> add this feature and you haven't established any good use-cases for it.
>> A closer analog is str.replace(substring, ''), and after almost 30 years
>> of real-world experience, that method still only takes a single
>> substring, not a tuple.
>
> The ``test_concurrent_futures.py`` example seemed to be a good use case to
>  me. I agree that it would be good to see how common that actually is though.
>  But it seems to me that any alternative behavior, e.g. repeated removal,
>  could be implemented by a user on top of the remove-only-the-first-found
>  behavior or by fluently chaining multiple method calls. Maybe you're right
>  that it's too complex, but I think it's at least worth discussing.


I agree with Steven -- a tuple of options is not necessary for the affix removal
methods.

--
~Ethan~
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/EDWFPEGQBPTQTVZV5NDRC2DLSKCXVJPZ/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-24 Thread Kyle Stanley
> -1 on "cut*" (feels too much like what .partition() does)
> -0 on "trim*" (this is the name used in .NET instead of "strip", so I
> foresee new confusion)
> +1 on "remove*" (because this is exactly what it does)

I'm also most strongly in favor of "remove*" (out of the above options).
I'm opposed to cut*, mainly because it's too ambiguous in comparison to
other options such as "remove*" and "replace*", which would do a much
better job of explaining the operation performed.

Without the .NET conflict, I would normally be +1 on "trim*" as well; with
it in mind though, I'd lower it down to +0. Personally, I don't consider a
conflict in a different ecosystem enough to lower it down to -0, but it
still has some influence on my preference.

So far, the consensus seems to be in favor of "remove*" with several +1s
and no arguments against it (as far as I can tell), whereas the other
options have been rather controversial.

On Tue, Mar 24, 2020 at 3:38 PM Steve Dower  wrote:

> On 24Mar2020 1849, Brett Cannon wrote:
> > -1 on "cut*" because my brain keeps reading it as "cute".
> > +1 on "trim*" as it is clear what's going on and no confusion with
> > preexisting methods.
> > +1 on "remove*" for the same reasons as "trim*".
> >
> > And if no consensus is reached in this thread for a name I would assume
> > the SC is going to ultimately decide on the name if the PEP is accepted as
> > the burden of being known as "the person who chose _those_ method names on
> > str" is more than any one person should have to bear. ;)
>
> -1 on "cut*" (feels too much like what .partition() does)
> -0 on "trim*" (this is the name used in .NET instead of "strip", so I
> foresee new confusion)
> +1 on "remove*" (because this is exactly what it does)
>
> Cheers,
> Steve
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/EHK7WWVFUOMSD7NJDLOM7S5JKXK6WE3Z/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-24 Thread Paul Sokolovsky
Hello,

On Tue, 24 Mar 2020 22:51:55 +0100
Victor Stinner  wrote:

> > === config.something ===
> > # If you'd like to remove some prefix from your lines, set it here
> > REMOVE_PREFIX = ""
> > ==
> >
> > === src.py ===
> > ...
> > line = line.cutprefix(config.REMOVE_PREFIX)
> > ...
> > ==  
> 
> Just use:
> 
> if config.REMOVE_PREFIX:
>     line = line.cutprefix(config.REMOVE_PREFIX)

Or even just:

if line.startswith(config.REMOVE_PREFIX):
    line = line[len(config.REMOVE_PREFIX):]

But point taken - indeed, any confusing, inconsistent behavior can
be fixed on the users' side with more if's, once they discover it.
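
For comparison, a sketch of the config scenario using the method name that was eventually adopted, `str.removeprefix` (available from Python 3.9). The function names and the module-level `REMOVE_PREFIX` constant are hypothetical stand-ins for the `config.REMOVE_PREFIX` setting above. With an empty prefix, both the method call and the manual `startswith` idiom leave the line unchanged, so no extra guard is needed in either form:

```python
# Hypothetical stand-in for config.REMOVE_PREFIX; empty means "remove nothing".
REMOVE_PREFIX = ""


def strip_configured_prefix(line, prefix=REMOVE_PREFIX):
    """Method form (Python 3.9+)."""
    return line.removeprefix(prefix)


def strip_configured_prefix_manual(line, prefix=REMOVE_PREFIX):
    """Manual form that works on older versions."""
    if line.startswith(prefix):
        line = line[len(prefix):]
    return line


print(strip_configured_prefix("DEBUG: started"))                   # 'DEBUG: started'
print(strip_configured_prefix("DEBUG: started", "DEBUG: "))        # 'started'
print(strip_configured_prefix_manual("DEBUG: started", "DEBUG: ")) # 'started'
```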


-- 
Best regards,
 Paul  mailto:pmis...@gmail.com
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/WBPUTU2U5OC6M5GN32GOIJQQGMXLVPAC/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-24 Thread Eric Fahlgren
On Tue, Mar 24, 2020 at 2:53 PM Ethan Furman  wrote:

> On 03/24/2020 01:37 PM, Eric V. Smith wrote:
> > On 3/24/2020 3:30 PM, Steve Dower wrote:
> >> On 24Mar2020 1849, Brett Cannon wrote:
> >>> -1 on "cut*" because my brain keeps reading it as "cute".
> >>> +1 on "trim*" as it is clear what's going on and no confusion with
> >>> preexisting methods.
> >>> +1 on "remove*" for the same reasons as "trim*".
> >>>
> >>> And if no consensus is reached in this thread for a name I would
> >>> assume the SC is going to ultimately decide on the name if the PEP is
> >>> accepted as the burden of being known as "the person who chose _those_
> >>> method names on str" is more than any one person should have to bear. ;)
> >>
> >> -1 on "cut*" (feels too much like what .partition() does)
> >> -0 on "trim*" (this is the name used in .NET instead of "strip", so I
> >> foresee new confusion)
> >> +1 on "remove*" (because this is exactly what it does)
>
I think name choice is easier if you write the documentation first:

cutprefix - Removes the specified prefix.
trimprefix - Removes the specified prefix.
stripprefix - Removes the specified prefix.
removeprefix - Removes the specified prefix.  Duh. :)
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/45YMVG53WMKN66JXZV7VO2LPFQ5W3Z4F/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-24 Thread Victor Stinner
Le mar. 24 mars 2020 à 20:06, Paul Sokolovsky  a écrit :
> === config.something ===
> # If you'd like to remove some prefix from your lines, set it here
> REMOVE_PREFIX = ""
> ==
>
> === src.py ===
> ...
> line = line.cutprefix(config.REMOVE_PREFIX)
> ...
> ==

Just use:

if config.REMOVE_PREFIX:
    line = line.cutprefix(config.REMOVE_PREFIX)

Victor
-- 
Night gathers, and now my watch begins. It shall not end until my death.
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/VUMXNEHLDNKOIAFXVV6CPSRBSAEB5AEI/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-24 Thread Dennis Sweeney
Steven D'Aprano wrote:
> On Tue, Mar 24, 2020 at 08:14:33PM -, Dennis Sweeney wrote:
> > I think then maybe it would be preferred to
> > use something like the following in the PEP:
> >
> >     def cutprefix(self, prefix, /):
> >         if isinstance(prefix, str):
> >             if self.startswith(prefix):
> >                 return self[len(prefix):]
> >             return self[:]
>
> Didn't we have a discussion about not mandating a copy when nothing
> changes? For strings, I'd just return self. It is only bytearray that
> requires a copy to be made.

It appears that in CPython, ``self[:] is self`` is true for base ``str`` 
objects, so I think ``return self[:]`` is consistent with (1) the premise that 
returning self is an implementation detail that is neither mandated nor 
forbidden, and (2) the premise that the methods should return base ``str`` 
objects even when called on ``str`` subclasses.

> >     elif isinstance(prefix, tuple):
> >         for option in prefix:
> >             if self.startswith(option):
> >                 return self[len(option):]
>
> I'd also remove the entire multiple substrings feature, for reasons I've
> already given. "Compatibility with startswith" is not a good reason to
> add this feature and you haven't established any good use-cases for it.
> A closer analog is str.replace(substring, ''), and after almost 30 years
> of real-world experience, that method still only takes a single
> substring, not a tuple.

The ``test_concurrent_futures.py`` example seemed to be a good use case to me. 
I agree that it would be good to see how common that actually is though. But it 
seems to me that any alternative behavior, e.g. repeated removal, could be 
implemented by a user on top of the remove-only-the-first-found behavior or by 
fluently chaining multiple method calls. Maybe you're right that it's too 
complex, but I think it's at least worth discussing.
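
A sketch of the two user-level constructions mentioned above, built on single-affix helpers. All helper names are hypothetical; plain slicing is used so the sketch does not depend on the final method names:

```python
def cut_prefix_once(s, prefix):
    """Remove one leading occurrence of prefix, if present."""
    return s[len(prefix):] if s.startswith(prefix) else s


def cut_prefix_repeatedly(s, prefix):
    """Repeated removal layered on top of the remove-once behavior."""
    if not prefix:
        return s  # an empty prefix always matches; avoid looping forever
    while s.startswith(prefix):
        s = cut_prefix_once(s, prefix)
    return s


def cut_suffix_once(s, suffix):
    """Remove one trailing occurrence of suffix, if present."""
    if suffix and s.endswith(suffix):
        return s[:-len(suffix)]
    return s


print(cut_prefix_repeatedly("ababc", "ab"))  # 'c'

# Chaining single-suffix calls; unlike "remove only the first found",
# this can strip more than one of the listed suffixes.
name = "FooTestsMixin"
print(cut_suffix_once(cut_suffix_once(name, "Mixin"), "Tests"))  # 'Foo'
```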
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/TRTHGTLOEQXSYYXKQ6RFEXMGDI7O57EL/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-24 Thread Ethan Furman

On 03/24/2020 01:37 PM, Eric V. Smith wrote:
> On 3/24/2020 3:30 PM, Steve Dower wrote:
>> On 24Mar2020 1849, Brett Cannon wrote:
>>> -1 on "cut*" because my brain keeps reading it as "cute".
>>> +1 on "trim*" as it is clear what's going on and no confusion with
>>> preexisting methods.
>>> +1 on "remove*" for the same reasons as "trim*".
>>>
>>> And if no consensus is reached in this thread for a name I would assume
>>> the SC is going to ultimately decide on the name if the PEP is accepted as
>>> the burden of being known as "the person who chose _those_ method names on
>>> str" is more than any one person should have to bear. ;)
>>
>> -1 on "cut*" (feels too much like what .partition() does)
>> -0 on "trim*" (this is the name used in .NET instead of "strip", so I
>> foresee new confusion)
>> +1 on "remove*" (because this is exactly what it does)
>
> I actually prefer "without*" because it seems more descriptive, but I don't
> expect it to get any traction.
>
> So "remove" would get my +1.


I still think "strip" is the most optimal, as strip, stripprefix, and stripsuffix would 
all be together -- but if that's not going to happen, "remove" is good.

+2 on "strip"   ;-)
+1 on "remove"

--
~Ethan~
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/EZLFUTAFMD6PCSVHK7M6L6G2HEXVENXG/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-24 Thread Eric V. Smith

On 3/24/2020 3:30 PM, Steve Dower wrote:
> On 24Mar2020 1849, Brett Cannon wrote:
>> -1 on "cut*" because my brain keeps reading it as "cute".
>> +1 on "trim*" as it is clear what's going on and no confusion with
>> preexisting methods.
>> +1 on "remove*" for the same reasons as "trim*".
>>
>> And if no consensus is reached in this thread for a name I would
>> assume the SC is going to ultimately decide on the name if the PEP is
>> accepted as the burden of being known as "the person who chose
>> _those_ method names on str" is more than any one person should have
>> to bear. ;)
>
> -1 on "cut*" (feels too much like what .partition() does)
> -0 on "trim*" (this is the name used in .NET instead of "strip", so I
> foresee new confusion)
> +1 on "remove*" (because this is exactly what it does)

I actually prefer "without*" because it seems more descriptive, but I 
don't expect it to get any traction.


So "remove" would get my +1.

Eric
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/KQNVH74JAZBXLD6YNQSHRQ6UEIKTTMVQ/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-24 Thread Gregory P. Smith
On Tue, Mar 24, 2020 at 11:55 AM Brett Cannon  wrote:

> -1 on "cut*" because my brain keeps reading it as "cute".
> +1 on "trim*" as it is clear what's going on and no confusion with
> preexisting methods.
> +1 on "remove*" for the same reasons as "trim*".
>
> And if no consensus is reached in this thread for a name I would assume
> the SC is going to ultimately decide on the name if the PEP is accepted as
> the burden of being known as "the person who chose _those_ method names on
> str" is more than any one person should have to bear. ;)
>

"raymondLuxuryYacht*" pronounced Throatwobbler Mangrove it is!

Never fear, the entire stdlib is full of naming inconsistencies and
questionable choices accumulated over time.  Whatever is chosen will be
lost in the noise and people will happily use it.

The original PEP mentioned that trim had a different use in PHP which is
why I suggest avoiding that one.  I don't know how much crossover there
actually is between PHP and Python programmers these days outside of FB.

-gps

* https://montypython.fandom.com/wiki/Raymond_Luxury-Yacht

___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/XC5GLSUXP6RNKEDFDGWLBSKT4TPSP5GU/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-24 Thread Steven D'Aprano
On Tue, Mar 24, 2020 at 08:14:33PM -, Dennis Sweeney wrote:

> I think then maybe it would be preferred to 
> use something like the following in the PEP:
> 
>     def cutprefix(self, prefix, /):
>         if isinstance(prefix, str):
>             if self.startswith(prefix):
>                 return self[len(prefix):]
>             return self[:]

Didn't we have a discussion about not mandating a copy when nothing 
changes? For strings, I'd just return `self`. It is only bytearray that 
requires a copy to be made.

>         elif isinstance(prefix, tuple):
>             for option in prefix:
>                 if self.startswith(option):
>                     return self[len(option):]

I'd also remove the entire multiple substrings feature, for reasons I've 
already given. "Compatibility with startswith" is not a good reason to 
add this feature and you haven't established any good use-cases for it.
 
A closer analog is str.replace(substring, ''), and after almost 30 years 
of real-world experience, that method still only takes a single 
substring, not a tuple.
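
To make the point about copies concrete, here is a minimal sketch. It uses free functions with hypothetical names (cutprefix_str, cutprefix_bytearray) rather than the proposed methods, and it only illustrates the unmatched case: an immutable str can be returned as-is, while a mutable bytearray has to be copied so the caller never receives an alias of the input.

```python
def cutprefix_str(s, prefix):
    """Hypothetical helper: str is immutable, so returning s itself is safe."""
    if s.startswith(prefix):
        return s[len(prefix):]
    return s


def cutprefix_bytearray(b, prefix):
    """Hypothetical helper: bytearray is mutable, so always return a new object."""
    if b.startswith(prefix):
        return b[len(prefix):]
    return b[:]


text = "FooBar"
assert cutprefix_str(text, "Foo") == "Bar"
assert cutprefix_str(text, "Baz") is text      # no copy needed

data = bytearray(b"FooBar")
unchanged = cutprefix_bytearray(data, b"Baz")
unchanged[0:3] = b"XXX"                        # mutate the result...
assert data == bytearray(b"FooBar")            # ...and the input is unaffected
```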

-- 
Steven
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/4CL4G4OY2DREUVSOHCYFDLCWGBQ6ULLD/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-24 Thread Steven D'Aprano
On Tue, Mar 24, 2020 at 08:14:33PM -, Dennis Sweeney wrote:

> I think my confusion is about just how precise this sort of "reference 
> implementation" should be. Should it behave with ``str`` and ``tuple`` 
> subclasses exactly how it would when implemented? If so, I would expect the 
> following to work:

I think that for the purposes of a relatively straight-forward PEP like 
this, you should start simple and only add complexity if needed to 
resolve questions.

The Python implementation ought to show the desired semantics, not try 
to be an exact translation of the C code. Think of the Python 
equivalents in the itertools docs:

https://docs.python.org/3/library/itertools.html

See for example:

https://www.python.org/dev/peps/pep-0584/#reference-implementation

https://www.python.org/dev/peps/pep-0572/#appendix-b-rough-code-translations-for-comprehensions

You already state that the methods will show "roughly the following 
behavior", so there's no expectation that it will be precisely what 
the real methods do.

Aim for clarity over emulation of unusual corner cases. The reference 
implementation is informative not prescriptive.

-- 
Steven
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/F5Z24BQF5MNHL6BPIQGGIXGH23ZEREFA/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-24 Thread Dennis Sweeney
I think my confusion is about just how precise this sort of "reference 
implementation" should be. Should it behave with ``str`` and ``tuple`` 
subclasses exactly how it would when implemented? If so, I would expect the 
following to work:

    class S(str): __len__ = __getitem__ = __iter__ = None
    class T(tuple): __len__ = __getitem__ = __iter__ = None

    x = str.cutprefix("FooBar", T(("a", S("Foo"), 17)))
    assert x == "Bar"
    assert type(x) is str

and so I think the ``str.__getitem__(self, slice(str.__len__(prefix), None))`` 
monstrosity would be the most technically correct, unless I'm missing 
something. But I've never seen Python code so ugly. And I suppose this is a 
slippery slope -- should it also guard against people redefining ``len = lambda 
x: 5`` and ``str = list`` in the global scope? Clearly not. I think then maybe 
it would be preferable to use something like the following in the PEP:

    def cutprefix(self, prefix, /):
        if isinstance(prefix, str):
            if self.startswith(prefix):
                return self[len(prefix):]
            return self[:]
        elif isinstance(prefix, tuple):
            for option in prefix:
                if self.startswith(option):
                    return self[len(option):]
            return self[:]
        else:
            raise TypeError()


    def cutsuffix(self, suffix, /):
        if isinstance(suffix, str):
            if self.endswith(suffix):
                return self[:len(self)-len(suffix)]
            return self[:]
        elif isinstance(suffix, tuple):
            for option in suffix:
                if self.endswith(option):
                    return self[:len(self)-len(option)]
            return self[:]
        else:
            raise TypeError()

The above would fail the assertions as written before, but would pass them for 
subclasses ``class S(str): pass`` and ``class T(tuple): pass`` that do not 
override any dunder methods. Is this an acceptable compromise if it appears 
alongside a clarifying sentence like the following?

These methods should always return base ``str`` objects, even when called 
on ``str`` subclasses.

I'm looking for guidance as to whether that's an appropriate level of precision 
for a PEP. If so, I'll make that change.
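
As a quick check of the compromise described above, the following sketch restates the simplified cutprefix as a free function and runs it against subclasses that do not override any dunder methods. The class names S and T follow the message; the TypeError text is an assumption. The base-``str`` results rely on how CPython slices ``str`` subclass instances.

```python
def cutprefix(self, prefix, /):
    """Sketch of the simplified reference implementation from the message."""
    if isinstance(prefix, str):
        if self.startswith(prefix):
            return self[len(prefix):]
        return self[:]
    elif isinstance(prefix, tuple):
        for option in prefix:
            if self.startswith(option):
                return self[len(option):]
        return self[:]
    else:
        raise TypeError("prefix must be str or a tuple of str")


class S(str):
    pass


class T(tuple):
    pass


x = cutprefix(S("FooBar"), T(("a", S("Foo"))))
assert x == "Bar"
assert type(x) is str      # slicing a str subclass yields a base str (CPython)

y = cutprefix(S("FooBar"), "Baz")
assert y == "FooBar"
assert type(y) is str      # self[:] on a subclass instance is also a base str
```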

All the best,
Dennis
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/PV6ANJL7KN4VHPSNPZSAZGQCEWHEKYG2/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-24 Thread Steve Dower

On 24Mar2020 1849, Brett Cannon wrote:
> -1 on "cut*" because my brain keeps reading it as "cute".
> +1 on "trim*" as it is clear what's going on and no confusion with
> preexisting methods.
> +1 on "remove*" for the same reasons as "trim*".
>
> And if no consensus is reached in this thread for a name I would assume
> the SC is going to ultimately decide on the name if the PEP is accepted as
> the burden of being known as "the person who chose _those_ method names on
> str" is more than any one person should have to bear. ;)


-1 on "cut*" (feels too much like what .partition() does)
-0 on "trim*" (this is the name used in .NET instead of "strip", so I 
foresee new confusion)

+1 on "remove*" (because this is exactly what it does)

Cheers,
Steve
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/KVU75BNXIUBIOYM6ZJSPZSKNRS7Y6CYU/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-24 Thread Paul Sokolovsky
Hello,

On Tue, 24 Mar 2020 19:14:16 +0100
Victor Stinner  wrote:

[]

> The behavior of tuple containing an empty string is a little bit
> surprising.
> 
> cutsuffix("Hello World", ("", " World")) returns "Hello World",
> whereas cutsuffix("Hello World", (" World", "")) returns "Hello".
> 
> cutprefix() has the same behavior: the first empty string stops the
> loop and returns the string unchanged.
> 
> I would prefer to raise ValueError("empty separator") to avoid any
> risk of confusion. I'm not sure that str.cutprefix("") or
> str.cutsuffix("") does make any sense.

str.cutprefix("")/str.cutsuffix("") definitely makes sense, e.g.:

=== config.something ===
# If you'd like to remove some prefix from your lines, set it here
REMOVE_PREFIX = ""
==

=== src.py ===
...
line = line.cutprefix(config.REMOVE_PREFIX)
...
==


Now one may ask whether str.cutprefix(("", "nonempty")) makes sense.
A response can be "the more complex the functionality, the more complex
and confusing the corner cases there are to handle".
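
The config scenario above can be sketched as follows. cutprefix here is a plain stand-in function for the proposed method, and clean_lines and its remove_prefix parameter are hypothetical names for illustration only. The point is that an empty default prefix is a natural no-op.

```python
def cutprefix(s, prefix):
    """Stand-in for the proposed method, written as a plain function."""
    if s.startswith(prefix):
        return s[len(prefix):]
    return s


def clean_lines(lines, remove_prefix=""):
    """Strip a configurable prefix; the default "" leaves every line as-is."""
    return [cutprefix(line, remove_prefix) for line in lines]


lines = ["> quoted", "plain"]
assert clean_lines(lines) == ["> quoted", "plain"]      # empty prefix: no-op
assert clean_lines(lines, "> ") == ["quoted", "plain"]
```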

[]

-- 
Best regards,
 Paul  mailto:pmis...@gmail.com
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/3M23LCIYMJ4TEQ6GNMHD4NJKTYBMDGGZ/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-24 Thread Brett Cannon
-1 on "cut*" because my brain keeps reading it as "cute".
+1 on "trim*" as it is clear what's going on and no confusion with preexisting 
methods.
+1 on "remove*" for the same reasons as "trim*".

And if no consensus is reached in this thread for a name I would assume the SC 
is going to ultimately decide on the name if the PEP is accepted as the burden 
of being known as "the person who chose _those_ method names on str" is more 
than any one person should have to bear. ;)
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/Z7TK4C5PPECBRTCTPKCJEABFM62TDYWW/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-24 Thread Victor Stinner
Hi Dennis,

Thanks for the updated PEP, it looks way better! I love the ability to
pass a tuple of strings ;-)

--

The behavior of tuple containing an empty string is a little bit surprising.

cutsuffix("Hello World", ("", " World")) returns "Hello World",
whereas cutsuffix("Hello World", (" World", "")) returns "Hello".

cutprefix() has the same behavior: the first empty string stops the
loop and returns the string unchanged.

I would prefer to raise ValueError("empty separator") to avoid any
risk of confusion. I'm not sure that str.cutprefix("") or
str.cutsuffix("") does make any sense.

"abc".startswith("") and "abc".startswith(("", "a")) are true, but
that's fine since startswith() doesn't modify the string. Moreover, we
cannot change the behavior now :-) But for new methods, we can try to
design them correctly to avoid any risk of confusion.
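
A small sketch of the order sensitivity described above, assuming a first-match-wins loop like the one in the PEP draft. cutsuffix is written as a free function. cutsuffix_no_empty is a hypothetical variant showing what rejecting empty affixes could look like; its error text reuses the "empty separator" wording suggested above.

```python
def cutsuffix(s, suffix):
    """First matching suffix wins; an empty string always matches."""
    options = (suffix,) if isinstance(suffix, str) else suffix
    for option in options:
        if s.endswith(option):
            return s[:len(s) - len(option)]
    return s


def cutsuffix_no_empty(s, suffix):
    """Hypothetical stricter variant: reject empty affixes up front."""
    options = (suffix,) if isinstance(suffix, str) else suffix
    if any(option == "" for option in options):
        raise ValueError("empty separator")
    return cutsuffix(s, options)


# The empty string matches first and nothing is removed.
assert cutsuffix("Hello World", ("", " World")) == "Hello World"
# Same alternatives in the other order: " World" is removed.
assert cutsuffix("Hello World", (" World", "")) == "Hello"
```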

--

It reminds me of https://bugs.python.org/issue28029: "".replace("", s, n)
now returns s instead of an empty string for all non-zero n. The
behavior changes in Python 3.9.

There are also discussions about "abc".split("") and
re.compile("").split("abc"). str.split() raises ValueError("empty
separator") whereas re.split returns ['', 'a', 'b', 'c', ''] which can
be (IMO) surprising.

See also https://bugs.python.org/issue28937 "str.split(): allow
removing empty strings (when sep is not None)".

Note: on the other hand, str.strip("") is accepted and returns the
string unmodified. But this method doesn't accept a tuple of
substrings. It's different from cutprefix/cutsuffix.
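
For reference, the existing empty-string behaviours mentioned above can be checked directly. This sketch assumes Python 3.7 or later, where re.split() splits on empty matches.

```python
import re

# str.split() refuses an empty separator.
try:
    "abc".split("")
except ValueError as exc:
    split_error = str(exc)
else:
    split_error = None

regex_parts = re.split("", "abc")    # splits around every character
stripped = "abc".strip("")           # empty character set: nothing to strip

assert split_error == "empty separator"
assert regex_parts == ["", "a", "b", "c", ""]
assert stripped == "abc"
assert "abc".startswith("") and "abc".startswith(("", "a"))
```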

Victor
-- 
Night gathers, and now my watch begins. It shall not end until my death.
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/JF22AARPSQSRNFOIAHEILIBDNMSGMYWA/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-24 Thread Walter Dörwald

On 24 Mar 2020, at 2:42, Steven D'Aprano wrote:

> On Sun, Mar 22, 2020 at 10:25:28PM -, Dennis Sweeney wrote:
>
>> Changes:
>> - More complete Python implementation to match what the type
>>   checking in the C implementation would be
>> - Clarified that returning ``self`` is an optimization
>> - Added links to past discussions on Python-Ideas and Python-Dev
>> - Specified ability to accept a tuple of strings
>
> I am concerned about that tuple of strings feature.
> [...]
> Aside from those questions about the reference implementation, I am
> concerned about the feature itself. No other string method that returns
> a modified copy of the string takes a tuple of alternatives.
>
> * startswith and endswith do take a tuple of (pre/suff)ixes, but they
>   don't return a modified copy; they just return a True or False flag;
>
> * replace does return a modified copy, and only takes a single
>   substring at a time;
>
> * find/index/partition/split etc don't accept multiple substrings
>   to search for.
>
> That makes startswith/endswith the unusual ones, and we should be
> conservative before emulating them.


Actually I would like for other string methods to gain the ability to 
search for/chop off multiple substrings too.


A `find()` that supports multiple search strings (and returns the 
leftmost position where a search string can be found) is a great help in 
implementing some kind of tokenizer:


```python
def tokenize(source, delimiter):
    lastpos = 0
    while True:
        pos = source.find(delimiter, lastpos)
        if pos == -1:
            token = source[lastpos:].strip()
            if token:
                yield token
            break
        else:
            token = source[lastpos:pos].strip()
            if token:
                yield token
            yield source[pos]
            lastpos = pos + 1

print(list(tokenize(" [ 1, 2, 3] ", ("[", ",", "]"))))
```

This would output `['[', '1', ',', '2', ',', '3', ']']` if `str.find()` 
supported multiple substrings.


Of course to be really usable `find()` would have to return **which** 
substring was found, which would make the API more complicated (and 
somewhat incompatible with the existing `find()`).
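
Since `str.find()` accepts only a single substring today, the idea can be tried with a small helper. The sketch below is only an illustration: `find_any` is a hypothetical function (not an existing or proposed API) that returns both the leftmost position and the delimiter found there, and it assumes all delimiters are non-empty.

```python
def find_any(source, needles, start=0):
    """Return (position, needle) for the leftmost match, or (-1, None)."""
    best_pos, best_needle = -1, None
    for needle in needles:
        pos = source.find(needle, start)
        # Strict "<" means ties go to the needle listed first.
        if pos != -1 and (best_pos == -1 or pos < best_pos):
            best_pos, best_needle = pos, needle
    return best_pos, best_needle


def tokenize(source, delimiters):
    if any(d == "" for d in delimiters):
        raise ValueError("empty delimiter")   # would never advance otherwise
    lastpos = 0
    while True:
        pos, found = find_any(source, delimiters, lastpos)
        if pos == -1:
            token = source[lastpos:].strip()
            if token:
                yield token
            break
        token = source[lastpos:pos].strip()
        if token:
            yield token
        yield found
        lastpos = pos + len(found)


tokens = list(tokenize(" [ 1, 2, 3] ", ("[", ",", "]")))
assert tokens == ["[", "1", ",", "2", ",", "3", "]"]
```

Returning the matched delimiter alongside the position is what lets the tokenizer handle delimiters longer than one character.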


But for `cutprefix()` (or whatever it's going to be called), I'm +1 on 
supporting multiple prefixes. For ambiguous cases, IMHO the most 
straightforward option would be to chop off the first prefix found.



[...]


Servus,
   Walter
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/3MYYK6AINVTVCNVYC53FEB4T3LQGPWSC/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-23 Thread Greg Ewing

On 24/03/20 3:43 pm, Dennis Sweeney wrote:
> This was an attempt to ensure no one can do funny business with tuple
> or str subclassing. I was trying to emulate the ``PyTuple_Check``
> followed by ``PyTuple_GET_SIZE`` and ``PyTuple_GET_ITEM`` that are
> done by the C implementation of ``str.startswith()``


The C code uses those functions for efficiency, not to prevent
"funny business". PyTuple_GET_SIZE and PyTuple_GET_ITEM are macros
that directly access fields of the tuple struct, and PyTuple_Check
is much faster than a full isinstance check.

There is no point in trying to emulate these in Python code.

--
Greg
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/X3OBCBM3XTZD7XFEQ2ULR6XGEXB6PRLZ/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-23 Thread Dennis Sweeney
Steven D'Aprano wrote:
> Having confirmed that prefix is a tuple, you call tuple() to 
> make a copy of it in order to iterate over it. Why?
> 
> Having confirmed that option is a string, you call str() on
> it to (potentially) make a copy. Why?

This was an attempt to ensure no one can do funny business with tuple or str 
subclassing. I was trying to emulate the ``PyTuple_Check`` followed by 
``PyTuple_GET_SIZE`` and ``PyTuple_GET_ITEM`` that are done by the C 
implementation of ``str.startswith()`` to ensure that only the tuple/str 
methods are used, not arbitrary user subclass code. It seems that that's what 
most of the ``str`` methods force.

I was mistaken in how to do this with pure Python. I believe I actually wanted 
something like:

    def cutprefix(self, prefix, /):
        if not isinstance(self, str):
            raise TypeError()

        if isinstance(prefix, tuple):
            for option in tuple.__iter__(prefix):
                if not isinstance(option, str):
                    raise TypeError()

                if str.startswith(self, option):
                    return str.__getitem__(
                        self, slice(str.__len__(option), None))

            return str.__getitem__(self, slice(None, None))

        if not isinstance(prefix, str):
            raise TypeError()

        if str.startswith(self, prefix):
            return str.__getitem__(self, slice(str.__len__(prefix), None))
        else:
            return str.__getitem__(self, slice(None, None))

... which looks even uglier.

> We ought to get some real-life exposure to the simple case first, before 
> adding support for multiple prefixes/suffixes.

I could be (and have been) convinced either way about whether or not to 
generalize to tuples of strings. I thought Victor made a good point about 
compatibility with ``startswith()``.
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/CQVVWGPC454LWATA2Y7BZ5OEAGVSTHEZ/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-23 Thread Dennis Sweeney
This should be fixed now.
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/TQQXDLROEKI5ANEF3J7ESFO2VNYRVDYB/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-23 Thread Steven D'Aprano
On Sun, Mar 22, 2020 at 10:25:28PM -, Dennis Sweeney wrote:

> Changes:
> - More complete Python implementation to match what the type checking in 
> the C implementation would be
> - Clarified that returning ``self`` is an optimization
> - Added links to past discussions on Python-Ideas and Python-Dev
> - Specified ability to accept a tuple of strings

I am concerned about that tuple of strings feature.

First, an implementation question: you do this when the prefix is a 
tuple:

    if isinstance(prefix, tuple):
        for option in tuple(prefix):
            if not isinstance(option, str):
                raise TypeError()
            option_str = str(option)

which looks like two unnecessary copies:

1. Having confirmed that `prefix` is a tuple, you call tuple() to 
   make a copy of it in order to iterate over it. Why?

2. Having confirmed that option is a string, you call str() on
   it to (potentially) make a copy. Why?


Aside from those questions about the reference implementation, I am 
concerned about the feature itself. No other string method that returns 
a modified copy of the string takes a tuple of alternatives.

* startswith and endswith do take a tuple of (pre/suff)ixes, but they
  don't return a modified copy; they just return a True or False flag;

* replace does return a modified copy, and only takes a single 
  substring at a time;

* find/index/partition/split etc don't accept multiple substrings 
  to search for.

That makes startswith/endswith the unusual ones, and we should be 
conservative before emulating them.
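
The asymmetry listed above is easy to confirm interactively. The sketch below reflects the behaviour of the CPython releases I am aware of; raises_type_error is a small hypothetical helper added for the check.

```python
word = "extraordinary"

# startswith/endswith accept a tuple of alternatives...
assert word.startswith(("ex", "extra"))
assert word.endswith(("ary", "ory"))


def raises_type_error(func, *args):
    """Return True if calling func(*args) raises TypeError."""
    try:
        func(*args)
    except TypeError:
        return True
    return False


# ...but replace, find, partition and split take a single substring only.
assert raises_type_error(word.replace, ("ex", "extra"), "")
assert raises_type_error(word.find, ("ex", "extra"))
assert raises_type_error(word.partition, ("ex", "extra"))
assert raises_type_error(word.split, ("ex", "extra"))
```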

The difficulty here is that the notion of "cut one of these prefixes" is 
ambiguous if two or more of the prefixes match. It doesn't matter for 
startswith:

"extraordinary".startswith(('ex', 'extra'))

since it is True whether you match left-to-right, shortest-to-longest, 
or even in random order. But for cutprefix, which prefix should be 
deleted?

Of course we can make a ruling by fiat, right now, and declare that it 
will cut the first matching prefix reading left to right, whether that's 
what users expect or not. That seems reasonable when your prefixes are 
hard-coded in the source, as above.

But what happens here?

prefixes = get_prefixes('user.config')
result = mystring.cutprefix(prefixes)

Whatever decision we make -- delete the shortest match, longest match, 
first match, last match -- we're going to surprise and annoy the people 
who expected one of the other behaviours.
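
To show how much the choice of policy matters, here is a sketch of two of the candidate rules applied to the example above. Both function names are hypothetical; neither is claimed to be what the PEP will specify.

```python
def cut_first_match(s, prefixes):
    """Remove the first prefix, in tuple order, that matches."""
    for prefix in prefixes:
        if s.startswith(prefix):
            return s[len(prefix):]
    return s


def cut_longest_match(s, prefixes):
    """Remove the longest matching prefix, regardless of tuple order."""
    matches = [p for p in prefixes if s.startswith(p)]
    if not matches:
        return s
    return s[len(max(matches, key=len)):]


prefixes = ("ex", "extra")
assert cut_first_match("extraordinary", prefixes) == "traordinary"
assert cut_longest_match("extraordinary", prefixes) == "ordinary"
# With first-match, reordering the tuple changes the answer.
assert cut_first_match("extraordinary", ("extra", "ex")) == "ordinary"
```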

This is why replace() still only takes a single substring to match and 
this isn't supported:

"extraordinary".replace(('ex', 'extra'), '')

We ought to get some real-life exposure to the simple case first, before 
adding support for multiple prefixes/suffixes.


-- 
Steven
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/GPXSIDLKTI6WKH5EKJWZEG5KR4AQ6P3J/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-23 Thread Stephen J. Turnbull
Cameron Simpson writes:

 > As a diversion, _are_ there use cases where an empty affix is useful or 
 > reasonable or likely?

In the "raise on failure" design,

"aba".cutsuffix('.doc')

raises,

"aba".cutsuffix('.doc', '')

returns "aba".
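
A sketch of that use case, under two assumptions: the alternatives are passed as a tuple, and cutsuffix_strict is a hypothetical name for a "raise on failure" variant. The empty string then acts as an explicit "no suffix is fine too" fallback.

```python
def cutsuffix_strict(s, suffixes):
    """Hypothetical raise-on-failure variant: the first matching suffix wins."""
    if isinstance(suffixes, str):
        suffixes = (suffixes,)
    for suffix in suffixes:
        if s.endswith(suffix):
            return s[:len(s) - len(suffix)]
    raise ValueError("no suffix matched")


try:
    cutsuffix_strict("aba", ".doc")
except ValueError:
    failed = True
else:
    failed = False

assert failed                                           # no match: raises
assert cutsuffix_strict("aba", (".doc", "")) == "aba"   # "" is the fallback
assert cutsuffix_strict("report.doc", (".doc", "")) == "report"
```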

BTW, since I'm here, thanks for your discussion of context managers
for loop invariants.  It was very enlightening.
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/DWJSRGHICP6AM6BVPVCQLTA5JOC5IJKO/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-23 Thread Cameron Simpson

On 22Mar2020 23:33, Rob Cliffe wrote:
> Sorry, another niggle re handling an empty affix:  With your Python
> implementation,
>
> 'aba'.cutprefix(('', 'a')) == 'aba'
> 'aba'.cutsuffix(('', 'a')) == 'ab'
>
> This seems surprising.


That surprises me too. I expect the first matching affix to be used. It 
is the only way for the caller to have a predictable policy.


As a diversion, _are_ there use cases where an empty affix is useful or 
reasonable or likely?


Cheers,
Cameron Simpson 
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/ZL4HWLP7TI3CAKM65WXXGTTTE77A7YSL/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-23 Thread Rob Cliffe via Python-Dev
Sorry, another niggle re handling an empty affix:  With your Python 
implementation,

'aba'.cutprefix(('', 'a')) == 'aba'
'aba'.cutsuffix(('', 'a')) == 'ab'
This seems surprising.
Rob Gadfly Cliffe


On 22/03/2020 23:23, Dennis Sweeney wrote:
> Much appreciated! I will add that single quote and change those snippets to::
>
>   >>> s = 'FooBar' * 100 + 'Baz'
>   >>> prefixes = ('Bar', 'Foo')
>   >>> while len(s) != len(s := s.cutprefix(prefixes)): pass
>   >>> s
>   'Baz'
>
> and::
>
>   >>> s = 'FooBar' * 100 + 'Baz'
>   >>> prefixes = ('Bar', 'Foo')
>   >>> while s.startswith(prefixes): s = s.cutprefix(prefixes)
>   >>> s
>   'Baz'

___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/XZD7TRPNXNVL4FL4NNUX6KB3OREVALCX/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-23 Thread Eric V. Smith

On 3/23/2020 12:02 PM, Rhodri James wrote:
> On 23/03/2020 14:50, Dan Stromberg wrote:
>> I tend to be mistrustful of code that tries to guess the best thing to do,
>> when something expected isn't found.
>>
>> How about:
>>
>> def cutprefix(self: str, pre: str, raise_on_no_match: bool=False, /) -> str:
>>     if self.startswith(pre):
>>         return self[len(pre):]
>>     if raise_on_no_match:
>>         raise ValueError('prefix not found')
>>     return self[:]
>
> I'm firmly of the opinion that the functions should either raise or
> not, and should definitely not have a parameter to switch behaviours.
> Probably it should do nothing; if the programmer needs to know that
> the prefix wasn't there, cutprefix() probably wasn't the right thing
> to use anyway.


Agreed, and I think we shouldn't raise. If raising is important, the 
user can write a trivial wrapper that raises if no substitution was 
done. Let's not over-complicate this.
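
Such a wrapper could look like the following sketch. cutprefix is a plain-function stand-in for the proposed method, and cutprefix_or_raise is a hypothetical name. It checks startswith() directly rather than comparing lengths, so an empty prefix counts as a match.

```python
def cutprefix(s, prefix):
    """Stand-in for the proposed non-raising method."""
    if s.startswith(prefix):
        return s[len(prefix):]
    return s


def cutprefix_or_raise(s, prefix):
    """User-level wrapper: raise when the prefix is not present."""
    if not s.startswith(prefix):
        raise ValueError(f"prefix {prefix!r} not found in {s!r}")
    return cutprefix(s, prefix)


assert cutprefix("FooBar", "Baz") == "FooBar"          # silent no-op
assert cutprefix_or_raise("FooBar", "Foo") == "Bar"

try:
    cutprefix_or_raise("FooBar", "Baz")
except ValueError:
    raised = True
else:
    raised = False
assert raised
```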


Eric
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/NYJMGSLPW4KFOYUT6ZWL6PSKXCF3EDHM/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-23 Thread Rhodri James

On 23/03/2020 14:50, Dan Stromberg wrote:
> On Fri, Mar 20, 2020 at 3:28 PM Victor Stinner wrote:
>>> The builtin ``str`` class will gain two new methods with roughly the
>>> following behavior::
>>>
>>>     def cutprefix(self: str, pre: str, /) -> str:
>>>         if self.startswith(pre):
>>>             return self[len(pre):]
>>>         return self[:]
>
> I tend to be mistrustful of code that tries to guess the best thing to do,
> when something expected isn't found.
>
> How about:
>
> def cutprefix(self: str, pre: str, raise_on_no_match: bool=False, /) -> str:
>     if self.startswith(pre):
>         return self[len(pre):]
>     if raise_on_no_match:
>         raise ValueError('prefix not found')
>     return self[:]


I'm firmly of the opinion that the functions should either raise or not, 
and should definitely not have a parameter to switch behaviours. 
Probably it should do nothing; if the programmer needs to know that the 
prefix wasn't there, cutprefix() probably wasn't the right thing to use 
anyway.


--
Rhodri James *-* Kynesim Ltd
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/SA5RZNAJESODSMNGBX7OD2F77YMHWH5Z/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-23 Thread Dan Stromberg
On Fri, Mar 20, 2020 at 3:28 PM Victor Stinner  wrote:

> > The builtin ``str`` class will gain two new methods with roughly the
> > following behavior::
> >
> >     def cutprefix(self: str, pre: str, /) -> str:
> >         if self.startswith(pre):
> >             return self[len(pre):]
> >         return self[:]
>

I tend to be mistrustful of code that tries to guess the best thing to do,
when something expected isn't found.

How about:

    def cutprefix(self: str, pre: str, raise_on_no_match: bool=False, /) -> str:
        if self.startswith(pre):
            return self[len(pre):]
        if raise_on_no_match:
            raise ValueError('prefix not found')
        return self[:]
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/NYMLVK35CVNWUL6OWZDB2CRA5W2HPMIH/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-22 Thread Dennis Sweeney
Much appreciated! I will add that single quote and change those snippets to::

 >>> s = 'FooBar' * 100 + 'Baz'
 >>> prefixes = ('Bar', 'Foo')
 >>> while len(s) != len(s := s.cutprefix(prefixes)): pass
 >>> s
 'Baz'

and::

 >>> s = 'FooBar' * 100 + 'Baz'
 >>> prefixes = ('Bar', 'Foo')
 >>> while s.startswith(prefixes): s = s.cutprefix(prefixes)
 >>> s
 'Baz'
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/QJ54X6WHQQ5HFROSJOLGJF4QMFINMAPY/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-22 Thread Rob Cliffe via Python-Dev



On 22/03/2020 22:25, Dennis Sweeney wrote:
> Here's an updated version.
>
> Online: https://www.python.org/dev/peps/pep-0616/
> Source: https://raw.githubusercontent.com/python/peps/master/pep-0616.rst
>
> Changes:
>  - More complete Python implementation to match what the type checking in
>    the C implementation would be
>  - Clarified that returning ``self`` is an optimization
>  - Added links to past discussions on Python-Ideas and Python-Dev
>  - Specified ability to accept a tuple of strings
>  - Shorter abstract section and fewer stdlib examples
>  - Mentioned
>  - Typo and formatting fixes
>
> I didn't change the name because it didn't seem like there was a strong
> consensus for an alternative yet. I liked the suggestions of ``dropprefix`` or
> ``removeprefix``.
>
> All the best,
> Dennis
___


Proofreading:

> it would not be obvious for users to have to call
> 'foobar'.cutprefix(('foo,)) for the common use case of a single prefix.

Missing single quote after the last foo.

> >>> s = 'foobar' * 100 + 'bar'
> >>> prefixes = ('bar', 'foo')
> >>> while len(s) != len(s := s.cutprefix(prefixes)): pass
> >>> s
> 'bar'
>
> or the more obvious and readable alternative:
>
> >>> s = 'foo' * 100 + 'bar'
> >>> prefixes = ('bar', 'foo')
> >>> while s.startswith(prefixes): s = s.cutprefix(prefixes)
> >>> s
> 'bar'


Er no, in both these examples s is reduced to an empty string.
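
This is easy to verify with a stand-in for the proposed method. In the sketch below, cutprefix is a free function with first-match semantics and strip_all_prefixes is a hypothetical helper wrapping the loop from the PEP snippet. It assumes every prefix is non-empty, since an empty prefix would make the loop run forever.

```python
def cutprefix(s, prefixes):
    """Stand-in for the proposed method: remove the first matching prefix."""
    for prefix in prefixes:
        if s.startswith(prefix):
            return s[len(prefix):]
    return s


def strip_all_prefixes(s, prefixes):
    """Keep cutting until none of the (non-empty) prefixes matches."""
    while s.startswith(prefixes):
        s = cutprefix(s, prefixes)
    return s


prefixes = ("bar", "foo")
# The trailing 'bar' is itself a listed prefix, so it is removed as well.
assert strip_all_prefixes("foobar" * 100 + "bar", prefixes) == ""
assert strip_all_prefixes("foo" * 100 + "bar", prefixes) == ""
# A tail that is not among the prefixes survives.
assert strip_all_prefixes("FooBar" * 100 + "Baz", ("Bar", "Foo")) == "Baz"
```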

Best wishes
Rob Cliffe

___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/HSCTQB4FVHM54REZEUKE5TRONFM7ZH2Q/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-22 Thread Dennis Sweeney
Here's an updated version.

Online: https://www.python.org/dev/peps/pep-0616/
Source: https://raw.githubusercontent.com/python/peps/master/pep-0616.rst

Changes:
- More complete Python implementation to match what the type checking in 
the C implementation would be
- Clarified that returning ``self`` is an optimization
- Added links to past discussions on Python-Ideas and Python-Dev
- Specified ability to accept a tuple of strings
- Shorter abstract section and fewer stdlib examples
- Mentioned 
- Typo and formatting fixes

I didn't change the name because it didn't seem like there was a strong 
consensus for an alternative yet. I liked the suggestions of ``dropprefix`` or 
``removeprefix``.

All the best,
Dennis
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/RY7GS4GF7OT7CLZVEDSULMY53QZYDN5Y/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-22 Thread Paul Ganssle
> And we *have* to decide that it returns a plain str instance if called
> on a subclass instance (unless overridden, of course) since the base
> class (str) won't know the signature of the subclass constructor.
> That's also why all other str methods return an instance of plain str
> when called on a subclass instance.

My suggestion is to rely on __getitem__ here (for subclasses), in which
case we don't actually need to know the subclass constructor. The rough
implementation in the PEP shows how to do it without needing to know the
subclass constructor:

def redbikeshed(self, prefix):
    if self.startswith(prefix):
        return self[len(prefix):]
    return self[:]

The actual implementation doesn't need to be written that way; as
long as the result is always the result of slicing the original
string, it's safe to do so* and more convenient for subclass
implementers (who now only have to implement __getitem__ to get the
affix-trimming functions for free).
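
A minimal sketch of the slicing-based idea, not the PEP's implementation:
Tagged is a hypothetical subclass that overrides __getitem__, and trim_prefix
is an illustrative free function. Because the result is always a slice of the
original, the subclass type survives without the function needing to know the
constructor signature.

```python
class Tagged(str):
    """Hypothetical str subclass whose slices keep the subclass type."""

    def __getitem__(self, index):
        return type(self)(super().__getitem__(index))


def trim_prefix(s, prefix):
    """Illustrative free function: the result is always a slice of s."""
    if s.startswith(prefix):
        return s[len(prefix):]
    return s[:]


result = trim_prefix(Tagged("prefix-body"), "prefix-")
print(type(result).__name__, result)
```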

One downside to this scheme is that I think it makes getting the type
hinting right more complicated, since the return type of these functions
is basically, "Whatever the return type of self.__getitem__ is", but I
don't think anyone will complain if you write -> str with the
understanding that __getitem__ should return a str or a subtype thereof.

Best,
Paul

*Assuming they haven't messed with __getitem__ to do something
non-standard, but if they've done that I think they've tossed Liskov
substitution out the window and will have to re-implement these methods
if they want them to work.

On 3/22/20 2:03 PM, Guido van Rossum wrote:
> On Sun, Mar 22, 2020 at 4:20 AM Eric V. Smith wrote:
>
>> Agreed. I think the PEP should say that a str will be returned (in
>> the event of a subclass, assuming that's what we decide), but if the
>> argument is exactly a str, that it may or may not return the original
>> object.
>
>
> Yes. Returning self if the class is exactly str is *just* an
> optimization -- it must not be mandated nor ruled out.
>
> -- 
> --Guido van Rossum (python.org/~guido )
> /Pronouns: he/him //(why is my pronoun here?)/


___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/Y3O7CBHJB4R34TYL7RDEU2TB5OPSNI3H/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-22 Thread Mike Miller


On 2020-03-21 20:38, Guido van Rossum wrote:
> It's not great, and I actually think that "stripprefix" and "stripsuffix" are
> reasonable. (I found that in Go, everything we call "strip" is called "Trim",
> and there are "TrimPrefix" and "TrimSuffix" functions that correspond to the PEP
> 616 functions.)


To jump on the bikeshed, trimprefix and trimsuffix are the best I've read so 
far, due to the definitions of the words in English.


Though often used interchangeably, when I think of "strip" I think of removing 
multiple things, somewhat indiscriminately with an arm motion, which is how the 
functions currently work.  e.g. "strip paint", "strip clothes":


https://www.dictionary.com/browse/strip
to take away or remove

When I think of trim, I think more of a single cut of higher precision with 
scissors.  e.g. "trim hair", "trim branches":


https://www.dictionary.com/browse/trim
to put into a neat or orderly condition by clipping…


Which is what this method would do.  That trim matches Go is a small but decent
benefit.  Another person warned against inconsistency with PHP, but I don't think
PHP should be considered for design guidance, IMHO.  Perhaps as an example of
what not to do, which happily is in agreement with the above.


-Mike

p.s.  +1, I do support this PEP, with or without name change, since some 
mentioned concern over that.

___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/PE7KP36HUDXCQX7NYGEXSECOQOMVDZKG/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-22 Thread Guido van Rossum
On Sun, Mar 22, 2020 at 4:20 AM Eric V. Smith  wrote:

> Agreed. I think the PEP should say that a str will be returned (in the
> event of a subclass, assuming that's what we decide), but if the
> argument is exactly a str, that it may or may not return the original
> object.
>

Yes. Returning self if the class is exactly str is *just* an optimization
-- it must not be mandated nor ruled out.

And we *have* to decide that it returns a plain str instance if called on a
subclass instance (unless overridden, of course) since the base class (str)
won't know the signature of the subclass constructor. That's also why all
other str methods return an instance of plain str when called on a subclass
instance.
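
To illustrate the constructor-signature point, here is a sketch with a
hypothetical subclass, Name, whose constructor takes an extra required
argument. Existing str methods such as upper() and replace() hand back plain
str objects, since str has no way to supply the origin argument:

```python
class Name(str):
    """Hypothetical subclass whose constructor takes an extra argument."""

    def __new__(cls, value, origin):
        obj = super().__new__(cls, value)
        obj.origin = origin
        return obj


n = Name("Guido", origin="mailing list")
upper = n.upper()
replaced = n.replace("G", "g")

# Both results are plain str; the subclass and its origin attribute are gone.
print(type(upper).__name__, type(replaced).__name__)
```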

-- 
--Guido van Rossum (python.org/~guido)
*Pronouns: he/him **(why is my pronoun here?)*

___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/ZZTY3OCJFZTZM74MVWRYL23LFJGNKICU/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-22 Thread MRAB

On 2020-03-22 05:00, Dennis Sweeney wrote:

> I like "removeprefix" and "removesuffix". My only concern before had been length, but
> three more characters than "cut***fix" is a small price to pay for clarity.


How about "dropprefix" and "dropsuffix"?
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/UL3SAVB3RGFTRFERG3J3VQNPQBXWTV7G/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-22 Thread Eric V. Smith

On 3/22/2020 12:25 PM, Paul Ganssle wrote:


> Sorry, I think I accidentally left out a clause here - I meant that
> the rationale for /always returning a 'str'/ (as opposed to returning
> a subclass) is missing, it just says in the PEP:
>
>> The only difference between the real implementation and the above is
>> that, as with other string methods like replace, the methods will
>> raise a TypeError if any of self, pre or suf is not an instace of
>> str, and will cast subclasses of str to builtin str objects.
>
> I think the rationale for these differences is not made entirely
> clear, specifically the "and will cast subclasses of str to builtin
> str objects" part.

Agreed. I don't understand the rationale, either. If we stick with it,
it should definitely be stated. And if we don't, that reason should be
explained, too.


Eric

___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/B4AGI7TJU5HC7VMYEO7VK63LTDMU7Q4M/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-22 Thread Paul Ganssle
Sorry, I think I accidentally left out a clause here - I meant that the
rationale for /always returning a 'str'/ (as opposed to returning a
subclass) is missing, it just says in the PEP:

> The only difference between the real implementation and the above is
> that, as with other string methods like replace, the methods will
> raise a TypeError if any of self, pre or suf is not an instace of str,
> and will cast subclasses of str to builtin str objects.

I think the rationale for these differences is not made entirely clear,
specifically the "and will cast subclasses of str to builtin str
objects" part.

I think it would be best to define the truncation in terms of
__getitem__ - possibly with the caveat that implementations are allowed
(but not required) to return `self` unchanged if no match is found.

Best,
Paul

P.S. Dennis - just noticed in this reply that there is a typo in the PEP
- s/instace/instance

On 3/22/20 12:15 PM, Victor Stinner wrote:
> tl; dr A method implemented in C is more efficient than hand-written
> pure-Python code, and it's less error-prone
>
> I don't know whether it has already been said, but I hate
> having to compute the string length manually when writing:
>
> if line.startswith("prefix"): line = line[6:]
>
> Usually what I do is open a Python REPL, type len("prefix"),
> and copy-paste the result :-)
>
> Hard-coding the length invites mistakes. What if I write
> line[7:] and it works most of the time because of a space, but
> sometimes the space is randomly omitted and the application fails?
>
> --
>
> The lazy approach is:
>
> if line.startswith("prefix"): line = line[len("prefix"):]
>
> Such code makes my "micro-optimizer heart" bleed, since I know that
> Python is stupid and calls len() at runtime; the compiler is unable to
> optimize it (sadly for good reasons: the name len can be overridden)  :-(
>
> => line.cutprefix("prefix") is more efficient! ;-) It's also shorter.
>
> Victor
>
> On Sun, 22 Mar 2020 at 17:02, Paul Ganssle wrote:
>> I don't see any rationale in the PEP or in the python-ideas thread
>> (admittedly I didn't read the whole thing, I just Ctrl + F-ed "subclass"
>> there). Is this just for consistency with other methods like .casefold?
>>
>> I can understand why you'd want it to be consistent, but I think it's
>> misguided in this case. It adds unnecessary complexity for subclass
>> implementers to need to re-implement these two additional methods, and I
>> can see no obvious reason why this behavior would be necessary, since
>> these methods can be implemented in terms of string slicing.
>>
>> Even if you wanted to use `str`-specific optimizations in C that aren't
>> available if you are constrained to use the subclass's __getitem__, it's
>> inexpensive to add a "PyUnicode_CheckExact(self)" check to hit a "fast
>> path" that doesn't use slice.
>>
>> I think defining this in terms of string slicing makes the most sense
>> (and, notably, slice itself returns `str` unless explicitly overridden,
>> the default is for it to return `str` anyway...).
>>
>> Either way, it would be nice to see the rationale included in the PEP
>> somewhere.
>>
>> Best,
>> Paul
>>
>> On 3/22/20 7:16 AM, Eric V. Smith wrote:
>>> On 3/22/2020 1:42 AM, Nick Coghlan wrote:
>>>> On Sun, 22 Mar 2020 at 15:13, Cameron Simpson wrote:
>>>>> On 21Mar2020 12:45, Eric V. Smith wrote:
>>>>>> On 3/21/2020 12:39 PM, Victor Stinner wrote:
>>>>>>> Well, if CPython is modified to implement tagged pointers and supports
>>>>>>> storing a short strings (a few latin1 characters) as a pointer, it may
>>>>>>> become harder to keep the same behavior for "x is y" where x and y are
>>>>>>> strings.
>>>>> Are you suggesting that it could become impossible to write this
>>>>> function:
>>>>>
>>>>>     def myself(o):
>>>>>         return o
>>>>>
>>>>> and not be able to rely on "o is myself(o)"? That seems... a pretty
>>>>> nasty breaking change for the language.
>>>> Other way around - because strings are immutable, their identity isn't
>>>> supposed to matter, so it's possible that functions that currently
>>>> return the exact same object in some cases may in the future start
>>>> returning a different object with the same value.
>>>>
>>>> Right now, in CPython, with no tagged pointers, we return the full
>>>> existing pointer wherever we can, as that saves us a data copy. With
>>>> tagged pointers, the pointer storage effectively *is* the instance, so
>>>> you can't really replicate that existing "copy the reference not the
>>>> storage" behaviour any more.
>>>>
>>>> That said, it's also possible that identity for tagged pointers would
>>>> be value based (similar to the effect of the small integer cache and
>>>> string interning), in which case the entire question would become
>>>> moot.
>>>>
>>>> Either way, the PEP shouldn't be specifying that a new object *must*
>>>> be returned, and it also shouldn't be specifying that the same object
 

[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-22 Thread Victor Stinner
tl; dr A method implemented in C is more efficient than hand-written
pure-Python code, and it's less error-prone

I don't know whether it has already been said, but I hate
having to compute the string length manually when writing:

if line.startswith("prefix"): line = line[6:]

Usually what I do is open a Python REPL, type len("prefix"),
and copy-paste the result :-)

Hard-coding the length invites mistakes. What if I write
line[7:] and it works most of the time because of a space, but
sometimes the space is randomly omitted and the application fails?

--

The lazy approach is:

if line.startswith("prefix"): line = line[len("prefix"):]

Such code makes my "micro-optimizer heart" bleed, since I know that
Python is stupid and calls len() at runtime; the compiler is unable to
optimize it (sadly for good reasons: the name len can be overridden)  :-(

=> line.cutprefix("prefix") is more efficient! ;-) It's also shorter.
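
The hard-coded-length pitfall can be sketched as follows. cut_prefix is a
hypothetical helper standing in for the proposed method, and the sample lines
are made up for illustration. The slice with a literal 7 silently eats a
character whenever the space is missing, while the helper derives the length
from the prefix itself:

```python
def cut_prefix(line, prefix):
    """Hypothetical helper standing in for the proposed method."""
    if line.startswith(prefix):
        return line[len(prefix):]
    return line


lines = ["prefix value", "prefixvalue"]

# Hand-written slice with a hard-coded (and wrong) length of 7.
hard_coded = [text[7:] if text.startswith("prefix") else text for text in lines]

# The helper computes the length from the prefix, so it cannot drift.
with_helper = [cut_prefix(text, "prefix") for text in lines]

print(hard_coded)   # ['value', 'alue']
print(with_helper)  # [' value', 'value']
```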

Victor

On Sun, 22 Mar 2020 at 17:02, Paul Ganssle wrote:
>
> I don't see any rationale in the PEP or in the python-ideas thread
> (admittedly I didn't read the whole thing, I just Ctrl + F-ed "subclass"
> there). Is this just for consistency with other methods like .casefold?
>
> I can understand why you'd want it to be consistent, but I think it's
> misguided in this case. It adds unnecessary complexity for subclass
> implementers to need to re-implement these two additional methods, and I
> can see no obvious reason why this behavior would be necessary, since
> these methods can be implemented in terms of string slicing.
>
> Even if you wanted to use `str`-specific optimizations in C that aren't
> available if you are constrained to use the subclass's __getitem__, it's
> inexpensive to add a "PyUnicode_CheckExact(self)" check to hit a "fast
> path" that doesn't use slice.
>
> I think defining this in terms of string slicing makes the most sense
> (and, notably, slice itself returns `str` unless explicitly overridden,
> the default is for it to return `str` anyway...).
>
> Either way, it would be nice to see the rationale included in the PEP
> somewhere.
>
> Best,
> Paul
>
> On 3/22/20 7:16 AM, Eric V. Smith wrote:
> > On 3/22/2020 1:42 AM, Nick Coghlan wrote:
> >> On Sun, 22 Mar 2020 at 15:13, Cameron Simpson  wrote:
> >>> On 21Mar2020 12:45, Eric V. Smith  wrote:
> >>>> On 3/21/2020 12:39 PM, Victor Stinner wrote:
> >>>>> Well, if CPython is modified to implement tagged pointers and supports
> >>>>> storing a short strings (a few latin1 characters) as a pointer, it may
> >>>>> become harder to keep the same behavior for "x is y" where x and y are
> >>>>> strings.
> >>> Are you suggesting that it could become impossible to write this
> >>> function:
> >>>
> >>>     def myself(o):
> >>>         return o
> >>>
> >>> and not be able to rely on "o is myself(o)"? That seems... a pretty
> >>> nasty breaking change for the language.
> >> Other way around - because strings are immutable, their identity isn't
> >> supposed to matter, so it's possible that functions that currently
> >> return the exact same object in some cases may in the future start
> >> returning a different object with the same value.
> >>
> >> Right now, in CPython, with no tagged pointers, we return the full
> >> existing pointer wherever we can, as that saves us a data copy. With
> >> tagged pointers, the pointer storage effectively *is* the instance, so
> >> you can't really replicate that existing "copy the reference not the
> >> storage" behaviour any more.
> >>
> >> That said, it's also possible that identity for tagged pointers would
> >> be value based (similar to the effect of the small integer cache and
> >> string interning), in which case the entire question would become
> >> moot.
> >>
> >> Either way, the PEP shouldn't be specifying that a new object *must*
> >> be returned, and it also shouldn't be specifying that the same object
> >> *can't* be returned.
> >
> > Agreed. I think the PEP should say that a str will be returned (in the
> > event of a subclass, assuming that's what we decide), but if the
> > argument is exactly a str, that it may or may not return the original
> > object.
> >
> > Eric

[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-22 Thread Victor Stinner
Dennis: please add references to past discussions in python-ideas and
python-dev. Link to the first email of each thread in these lists.

Victor
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/WZDQJKEZTR3TTKEVF3MDAP6FCI4SMRDU/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-22 Thread Victor Stinner
On Sun, 22 Mar 2020 at 06:07, Gregory P. Smith wrote:
> Nice PEP! That this discussion wound up in the NP-complete "naming things" 
> territory as the main topic right from the start/prefix/beginning speaks 
> highly of it. :)

Maybe we should have a rule to disallow bikeshedding until the
foundations of a PEP are settled. Or always create two threads per
PEP: one for bikeshedding only, one for everything else :-D

Victor
-- 
Night gathers, and now my watch begins. It shall not end until my death.
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/IEKQRYMI4QSS3XHSQ73KDFEKJN6E4FJZ/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-22 Thread Paul Ganssle
I don't see any rationale in the PEP or in the python-ideas thread
(admittedly I didn't read the whole thing, I just Ctrl + F-ed "subclass"
there). Is this just for consistency with other methods like .casefold?

I can understand why you'd want it to be consistent, but I think it's
misguided in this case. It adds unnecessary complexity for subclass
implementers to need to re-implement these two additional methods, and I
can see no obvious reason why this behavior would be necessary, since
these methods can be implemented in terms of string slicing.

Even if you wanted to use `str`-specific optimizations in C that aren't
available if you are constrained to use the subclass's __getitem__, it's
inexpensive to add a "PyUnicode_CheckExact(self)" check to hit a "fast
path" that doesn't use slice.

I think defining this in terms of string slicing makes the most sense
(and, notably, slice itself returns `str` unless explicitly overridden,
the default is for it to return `str` anyway...).

Either way, it would be nice to see the rationale included in the PEP
somewhere.

Best,
Paul

On 3/22/20 7:16 AM, Eric V. Smith wrote:
> On 3/22/2020 1:42 AM, Nick Coghlan wrote:
>> On Sun, 22 Mar 2020 at 15:13, Cameron Simpson  wrote:
>>> On 21Mar2020 12:45, Eric V. Smith  wrote:
>>>> On 3/21/2020 12:39 PM, Victor Stinner wrote:
>>>>> Well, if CPython is modified to implement tagged pointers and supports
>>>>> storing a short strings (a few latin1 characters) as a pointer, it may
>>>>> become harder to keep the same behavior for "x is y" where x and y are
>>>>> strings.
>>> Are you suggesting that it could become impossible to write this
>>> function:
>>>
>>>     def myself(o):
>>>         return o
>>>
>>> and not be able to rely on "o is myself(o)"? That seems... a pretty
>>> nasty breaking change for the language.
>> Other way around - because strings are immutable, their identity isn't
>> supposed to matter, so it's possible that functions that currently
>> return the exact same object in some cases may in the future start
>> returning a different object with the same value.
>>
>> Right now, in CPython, with no tagged pointers, we return the full
>> existing pointer wherever we can, as that saves us a data copy. With
>> tagged pointers, the pointer storage effectively *is* the instance, so
>> you can't really replicate that existing "copy the reference not the
>> storage" behaviour any more.
>>
>> That said, it's also possible that identity for tagged pointers would
>> be value based (similar to the effect of the small integer cache and
>> string interning), in which case the entire question would become
>> moot.
>>
>> Either way, the PEP shouldn't be specifying that a new object *must*
>> be returned, and it also shouldn't be specifying that the same object
>> *can't* be returned.
>
> Agreed. I think the PEP should say that a str will be returned (in the
> event of a subclass, assuming that's what we decide), but if the
> argument is exactly a str, that it may or may not return the original
> object.
>
> Eric



___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/RTQWEE4KZYIIXL3HK3C6IJ2ATQ6CM7PG/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-22 Thread Victor Stinner
On Sun, 22 Mar 2020 at 01:45, Dennis Sweeney wrote:
> For accepting multiple prefixes, I can't tell if there's a consensus about
> whether ``s = s.cutprefix("a", "b", "c")`` should be the same as
>
>     for prefix in ["a", "b", "c"]:
>         s = s.cutprefix(prefix)
>
> or
>
>     for prefix in ["a", "b", "c"]:
>         if s.startswith(prefix):
>             s = s.cutprefix(prefix)
>             break
>
> The latter seems to be harder for users to implement through other means, and
> it's the behavior that test_concurrent_futures.py has implemented now, so
> maybe that's what we want.

I expect that "FooBar".cutprefix(("Foo", "Bar")) returns "Bar". IMO
it's consistent with "FooFoo".cutprefix("Foo") which only returns
"Foo" and not "":
https://www.python.org/dev/peps/pep-0616/#remove-multiple-copies-of-a-prefix
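
The two candidate semantics from Dennis's message can be compared with a small
sketch. cut_each and cut_first are hypothetical helper names, written with
plain startswith() and slicing rather than the proposed method:

```python
def cut_each(s, prefixes):
    """Hypothetical: try every prefix in turn, removing each that matches."""
    for prefix in prefixes:
        if s.startswith(prefix):
            s = s[len(prefix):]
    return s


def cut_first(s, prefixes):
    """Hypothetical: remove only the first prefix that matches, then stop."""
    for prefix in prefixes:
        if s.startswith(prefix):
            return s[len(prefix):]
    return s


print(repr(cut_each("FooBar", ("Foo", "Bar"))))   # ''
print(repr(cut_first("FooBar", ("Foo", "Bar"))))  # 'Bar'
```

Only the second variant gives the "Bar" result expected here.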

If you want to remove both prefixes,
"FooBar".cutprefix("Foo").cutprefix("Bar") should be called to get "".


> Also, it seems more elegant to me to accept variadic arguments, rather than a
> single tuple of arguments. Is it worth it to match the related-but-not-the-same
> API of "startswith" if it makes for uglier Python? My gut reaction is to prefer
> the varargs, but maybe someone has a different perspective.

I suggest accepting a tuple of strings:

str.cutprefix(("prefix1", "prefix2"))

To be consistent with startswith():

str.startswith(("prefix1", "prefix2"))

cutprefix() and startswith() can be used together and so I would
prefer to have the same API:

prefixes = ("context: ", "ctx:")
has_prefix = False
if line.startswith(prefixes):
    line = line.cutprefix(prefixes)
    has_prefix = True

A different API would look more surprising, no? Compare it to:

prefixes = ("context: ", "ctx:")
has_prefix = False
if line.startswith(prefixes):
    line = line.cutprefix(*prefixes)  # <== HERE
    has_prefix = True

The difference is even more visible if you pass the prefixes directly:
.cutprefix("context: ", "ctx:")
vs
.cutprefix(("context: ", "ctx:"))

Victor
--
Night gathers, and now my watch begins. It shall not end until my death.
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/JTFKF2ASUR5QV3I73O72RHYL5S72OGDW/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-22 Thread Eric V. Smith

On 3/22/2020 1:42 AM, Nick Coghlan wrote:

> On Sun, 22 Mar 2020 at 15:13, Cameron Simpson wrote:
>> On 21Mar2020 12:45, Eric V. Smith wrote:
>>> On 3/21/2020 12:39 PM, Victor Stinner wrote:
>>>> Well, if CPython is modified to implement tagged pointers and supports
>>>> storing a short strings (a few latin1 characters) as a pointer, it may
>>>> become harder to keep the same behavior for "x is y" where x and y are
>>>> strings.
>> Are you suggesting that it could become impossible to write this
>> function:
>>
>>     def myself(o):
>>         return o
>>
>> and not be able to rely on "o is myself(o)"? That seems... a pretty
>> nasty breaking change for the language.
> Other way around - because strings are immutable, their identity isn't
> supposed to matter, so it's possible that functions that currently
> return the exact same object in some cases may in the future start
> returning a different object with the same value.
>
> Right now, in CPython, with no tagged pointers, we return the full
> existing pointer wherever we can, as that saves us a data copy. With
> tagged pointers, the pointer storage effectively *is* the instance, so
> you can't really replicate that existing "copy the reference not the
> storage" behaviour any more.
>
> That said, it's also possible that identity for tagged pointers would
> be value based (similar to the effect of the small integer cache and
> string interning), in which case the entire question would become
> moot.
>
> Either way, the PEP shouldn't be specifying that a new object *must*
> be returned, and it also shouldn't be specifying that the same object
> *can't* be returned.


Agreed. I think the PEP should say that a str will be returned (in the 
event of a subclass, assuming that's what we decide), but if the 
argument is exactly a str, that it may or may not return the original 
object.


Eric

___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/JHM7T6JZU56PWYRJDG45HMRBXE3CBXMX/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-22 Thread Cameron Simpson

On 22Mar2020 08:10, Ivan Pozdeev  wrote:

> On 22.03.2020 7:46, Steven D'Aprano wrote:
>> On Sun, Mar 22, 2020 at 06:57:52AM +0300, Ivan Pozdeev via Python-Dev wrote:
>>> Does it need to be separate methods?
>> Yes.
>>
>> Overloading a single method to do two dissimilar things is poor design.
> They are similar. We're removing stuff from an edge in both cases. The
> only difference is whether input is treated as a character set or as a
> raw substring.


That is not the only difference. strip() does not just remove a 
character from the set provided (as a str). It removes as many of them 
as there are; that is why "foo.ext".strip(".ext") can actually be quite 
misleading to someone looking for a suffix remover - it often looks like 
it did the right thing.


By contrast, cutprefix/cutsuffix (or stripsuffix, whatever) remove only 
_one_ instance of the affix.


To my mind they are quite different, which is the basis of my personal
dislike of reusing the word "strip". Just extending "strip()" with a
funky new affix mode would be even worse, since it can _still_ be
misleading if the caller omitted the special mode.
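
A short sketch of the difference, using only existing str methods. cut_suffix
is a hypothetical helper built from endswith() and slicing, and the file names
are made up for illustration. strip() treats its argument as a set of
characters and keeps going, so it sometimes looks right and sometimes does
not:

```python
def cut_suffix(s, suffix):
    """Hypothetical helper: remove one trailing copy of suffix, if present."""
    if suffix and s.endswith(suffix):
        return s[:-len(suffix)]
    return s


looks_right = "foo.ext".strip(".ext")    # character-set stripping, happens to work
goes_wrong = "index.txt".strip(".txt")   # also eats the trailing 'x' of "index"
exact = cut_suffix("index.txt", ".txt")  # removes the suffix exactly once

print(looks_right, goes_wrong, exact)    # foo inde index
```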


Cheers,
Cameron Simpson 
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/E4XEFKAWBHHOYAGBQUIUZHGB3J4HXBSJ/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-22 Thread Stephen J. Turnbull
Ivan Pozdeev via Python-Dev writes:
 > 
 > On 22.03.2020 7:46, Steven D'Aprano wrote:
 > > On Sun, Mar 22, 2020 at 06:57:52AM +0300, Ivan Pozdeev via Python-Dev 
 > > wrote:
 > >
 > >> Does it need to be separate methods?
 > > Yes.
 > >
 > > Overloading a single method to do two dissimilar things is poor design.
 > >
 > They are similar. We're removing stuff from an edge in both
 > cases. The only difference is whether input is treated as a
 > character set or as a raw substring.

That is true.  However, the rule of thumb (due to Guido, IIRC) is if
the parameter is normally going to be a literal constant, and there
are few such constants (like <= 3), put them in the name of the
function rather than as values for an optional parameter.  Overloading
doesn't save much, if any, typing in this case.

That's why we have strip, rstrip, and lstrip in the first place,
although nowadays we'd likely spell the modifiers out (and maybe use
start/end rather than left/right, which I would guess force BIDI users
to translate to start/end on the fly).

Steve
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/56MCGZR4AHLCG6UWV5TOEYH2PNS52SNO/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-21 Thread Nick Coghlan
On Sun, 22 Mar 2020 at 15:13, Cameron Simpson  wrote:
>
> On 21Mar2020 12:45, Eric V. Smith  wrote:
> >On 3/21/2020 12:39 PM, Victor Stinner wrote:
> >>Well, if CPython is modified to implement tagged pointers and supports
> >>storing short strings (a few latin1 characters) as a pointer, it may
> >>become harder to keep the same behavior for "x is y" where x and y are
> >>strings.
>
> Are you suggesting that it could become impossible to write this
> function:
>
>     def myself(o):
>         return o
>
> and not be able to rely on "o is myself(o)"? That seems... a pretty
> nasty breaking change for the language.

Other way around - because strings are immutable, their identity isn't
supposed to matter, so it's possible that functions that currently
return the exact same object in some cases may in the future start
returning a different object with the same value.

Right now, in CPython, with no tagged pointers, we return the full
existing pointer wherever we can, as that saves us a data copy. With
tagged pointers, the pointer storage effectively *is* the instance, so
you can't really replicate that existing "copy the reference not the
storage" behaviour any more.

That said, it's also possible that identity for tagged pointers would
be value based (similar to the effect of the small integer cache and
string interning), in which case the entire question would become
moot.

Either way, the PEP shouldn't be specifying that a new object *must*
be returned, and it also shouldn't be specifying that the same object
*can't* be returned.
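
To illustrate the point, a small sketch (the helper `remove_prefix` is a stand-in written for this example, not the implementation under discussion): callers that need to know whether anything was removed can compare values or lengths, which works whether or not the same object comes back.

```python
def remove_prefix(s: str, prefix: str) -> str:
    """Stand-in helper for illustration only."""
    if s.startswith(prefix):
        return s[len(prefix):]
    return s  # an implementation is equally free to return a copy here


original = "foobar"
unchanged = remove_prefix(original, "baz")
changed = remove_prefix(original, "foo")

# Portable checks: these depend only on the value, never on object identity.
print(unchanged == original)          # True
print(len(changed) != len(original))  # True

# Not portable: whether `unchanged is original` holds is an implementation detail.
```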

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/NDRZ4G2S2GG74UYBCZ46N7QPL3SFFO5K/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-21 Thread Barney Gale
My 2c on the naming:

'start' and 'end' in 'startswith' and 'endswith' are verbs, whereas we're
looking for a noun if we want to cut/strip/trim a string. You can use
'start' and 'end' as nouns for this case but 'prefix' and 'suffix' seems a
more obvious choice in English to me.

Pathlib has `with_suffix()` and `with_name()`, which would give us
something like `without_prefix()` or `without_suffix()` in this case.

I think the name "strip", and the default (no-argument) behaviour of
stripping whitespace implies that the method is used to strip something
down to its bare essentials, like stripping a bed of its covers. Usually
you use strip() to remove whitespace and get to the real important data. I
don't think such an implication holds for removing a *specific*
prefix/suffix.

I also don't much like "strip" as the semantics are quite different - if
i'm understanding correctly, we're removing a *single* instance of a
*single* *multi-character* string. A verb like "trim" or "cut" seems
appropriate to highlight that difference.
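
A short sketch of that difference, assuming a hypothetical helper named `cut_prefix` (the name and body are illustrative, not the PEP's implementation): `lstrip` keeps removing characters from a set, while the proposed behaviour removes one copy of one substring.

```python
def cut_prefix(s: str, prefix: str) -> str:
    """Hypothetical helper: remove at most one leading copy of prefix."""
    return s[len(prefix):] if s.startswith(prefix) else s


s = "ababc"
# Character set {'a', 'b'}: every leading 'a' or 'b' is removed.
print(s.lstrip("ab"))                         # c
# Single multi-character prefix: only the first "ab" is removed.
print(cut_prefix(s, "ab"))                    # abc
# Removing a second copy takes a second call.
print(cut_prefix(cut_prefix(s, "ab"), "ab"))  # c
```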

Barney



On Fri, 20 Mar 2020 at 18:59, Dennis Sweeney 
wrote:

> Browser Link: https://www.python.org/dev/peps/pep-0616/
>
> PEP: 616
> Title: String methods to remove prefixes and suffixes
> Author: Dennis Sweeney 
> Sponsor: Eric V. Smith 
> Status: Draft
> Type: Standards Track
> Content-Type: text/x-rst
> Created: 19-Mar-2020
> Python-Version: 3.9
> Post-History: 30-Aug-2002
>
>
> Abstract
> ========
>
> This is a proposal to add two new methods, ``cutprefix`` and
> ``cutsuffix``, to the APIs of Python's various string objects.  In
> particular, the methods would be added to Unicode ``str`` objects,
> binary ``bytes`` and ``bytearray`` objects, and
> ``collections.UserString``.
>
> If ``s`` is one of these objects, and ``s`` has ``pre`` as a prefix, then
> ``s.cutprefix(pre)`` returns a copy of ``s`` in which that prefix has
> been removed.  If ``s`` does not have ``pre`` as a prefix, an
> unchanged copy of ``s`` is returned.  In summary, ``s.cutprefix(pre)``
> is roughly equivalent to ``s[len(pre):] if s.startswith(pre) else s``.
>
> The behavior of ``cutsuffix`` is analogous: ``s.cutsuffix(suf)`` is
> roughly equivalent to
> ``s[:-len(suf)] if suf and s.endswith(suf) else s``.
>
>
> Rationale
> =========
>
> There have been repeated issues [#confusion]_ on the Bug Tracker
> and StackOverflow related to user confusion about the existing
> ``str.lstrip`` and ``str.rstrip`` methods.  These users are typically
> expecting the behavior of ``cutprefix`` and ``cutsuffix``, but they
> are surprised that the parameter for ``lstrip`` is interpreted as a
> set of characters, not a substring.  This repeated issue is evidence
> that these methods are useful, and the new methods allow a cleaner
> redirection of users to the desired behavior.
>
> As another testimonial for the usefulness of these methods, several
> users on Python-Ideas [#pyid]_ reported frequently including similar
> functions in their own code for productivity.  The implementation
> often contained subtle mistakes regarding the handling of the empty
> string (see `Specification`_).
>
>
> Specification
> =============
>
> The builtin ``str`` class will gain two new methods with roughly the
> following behavior::
>
>     def cutprefix(self: str, pre: str, /) -> str:
>         if self.startswith(pre):
>             return self[len(pre):]
>         return self[:]
>
>     def cutsuffix(self: str, suf: str, /) -> str:
>         if suf and self.endswith(suf):
>             return self[:-len(suf)]
>         return self[:]
>
> The only difference between the real implementation and the above is
> that, as with other string methods like ``replace``, the
> methods will raise a ``TypeError`` if any of ``self``, ``pre`` or
> ``suf`` is not an instance of ``str``, and will cast subclasses of
> ``str`` to builtin ``str`` objects.
>
> Note that without the check for the truthiness of ``suf``,
> ``s.cutsuffix('')`` would be mishandled and always return the empty
> string due to the unintended evaluation of ``self[:-0]``.
>
> Methods with the corresponding semantics will be added to the builtin
> ``bytes`` and ``bytearray`` objects.  If ``b`` is either a ``bytes``
> or ``bytearray`` object, then ``b.cutsuffix()`` and ``b.cutprefix()``
> will accept any bytes-like object as an argument.
>
> Note that the ``bytearray`` methods return a copy of ``self``; they do
> not operate in place.
>
> The following behavior is considered a CPython implementation detail,
> but is not guaranteed by this specification::
>
>     >>> x = 'foobar' * 10**6
>     >>> x.cutprefix('baz') is x is x.cutsuffix('baz')
>     True
>     >>> x.cutprefix('') is x is x.cutsuffix('')
>     True
>
> That is, for CPython's immutable ``str`` and ``bytes`` objects, the
> methods return the original object when the affix is not found or if
> the affix is empty.  Because these types test for equality using
> shortcuts for identity and length, the following equivalent
> expressions are 

[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-21 Thread Nick Coghlan
On Sun, 22 Mar 2020 at 14:01, Dennis Sweeney
 wrote:
>
> Is there a proven use case for anything other than the empty string as the 
> replacement? I prefer your "replacewhatever" to another "stripwhatever" name, 
> and I think it's clear and nicely fits the behavior you proposed. But should 
> we allow a naming convenience to dictate that the behavior should be 
> generalized to a use case we're not sure exists, where the same argument 
> is passed 99% of the time?

I think so, as if we don't, then we'd end up with the following three
methods on str objects (using Guido's suggested names of
"removeprefix" and "removesuffix", as I genuinely like those):

* replace()
* removeprefix()
* removesuffix()

And the following questions still end up with relatively non-obvious answers:

Q: How do I do a replace, but only at the start or end of the string?
A: Use "new_prefix + s.removeprefix(old_prefix)" or
"s.removesuffix(old_suffix) + new_suffix"

Q: How do I remove a substring from anywhere in a string, rather than
just from the start or end?
A: Use "s.replace(substr, '')"

Most of that objection would go away if the PEP added a plain old
"remove()" method in addition to removeprefix() and removesuffix(),
though - the "replace the substring with an empty string" trick isn't
the most obvious spelling in the world, whereas I'd expect a lot of folks
to reach for "s.remove(substr)" based on the regular sequence API, and
I think Guido's right that in many cases where a prefix or suffix is
being changed, you also want to add it if the old prefix/suffix is
missing (and in the cases where you don't, you can either
use startswith()/endswith() first, or else check for a length change).
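
The recipes in the Q&A above can be sketched as follows. The helpers here are stand-ins written for this example so that it does not depend on any particular Python version (`str.removeprefix()` and `str.removesuffix()` were eventually added in Python 3.9); `replace_prefix` and `replace_prefix_if_present` are hypothetical names.

```python
def removeprefix(s: str, prefix: str) -> str:
    """Stand-in for the proposed method: drop one leading copy of prefix."""
    return s[len(prefix):] if s.startswith(prefix) else s


def replace_prefix(s: str, old: str, new: str) -> str:
    """Swap old for new at the start; adds new even when old is missing."""
    return new + removeprefix(s, old)


def replace_prefix_if_present(s: str, old: str, new: str) -> str:
    """Swap old for new only when s really starts with old."""
    return new + s[len(old):] if s.startswith(old) else s


print(replace_prefix("http://example.org", "http://", "https://"))
# https://example.org
print(replace_prefix("example.org", "http://", "https://"))
# https://example.org
print(replace_prefix_if_present("example.org", "http://", "https://"))
# example.org

# Removing a substring from anywhere in the string uses the existing replace().
print("a-b-a".replace("-", ""))  # aba
```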

> I think a downside would be that a pass-a-string-or-a-tuple-of-strings 
> interface would be more mental effort to keep track of than a ``*args`` 
> variadic interface for "(cut/remove/without/trim)prefix", even if the former 
> is how ``startswith()`` works.

I doubt we'd use *args for any new string methods, precisely because
we don't use it for any of the existing ones.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/ULNZVYYZBX6RHEAVWGO4AIDOQSNSCURJ/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-21 Thread Cameron Simpson

On 21Mar2020 14:17, mus...@posteo.org  wrote:

On Fri, 20 Mar 2020 20:49:12 -
"Dennis Sweeney"  wrote:

exactly same way (as a character set) in each case. Looking at how
the argument is used, I'd argue that ``lstrip``/``rstrip``/``strip``
are much more similar to each other than they are to the proposed
methods


Correct, but I don't like the word "cut" because it suggests that
something is cut into pieces which can be used later separately.

I'd propose to use "trim" instead of "cut" because it makes clear that
something is cut off and discarded, and it is clearly different from
"strip".


Please, NO. "trim" is a VERY well known PHP function, and does what our 
strip does. I'm very against this (otherwise fine) word for this 
reason.


I still prefer "cut", though the consensus seems to be for "strip".

Cheers,
Cameron Simpson 
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/7P55K6BICBQ4YEKXD373SX2SRYRWKNU2/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-21 Thread Cameron Simpson

On 21Mar2020 14:40, Eric V. Smith  wrote:

On 3/21/2020 2:09 PM, Steven D'Aprano wrote:
If you want to know whether a prefix/suffix was removed, there's a more
reliable way than identity and a cheaper way than O(N) equality. Just
compare the length of the string before and after. If the lengths are
the same, nothing was removed.


That's a good point. This should probably go in the PEP, and maybe the 
documentation.


+1000 to this. - Cameron
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/GHBT5RREZRMKXZDE6ZG3EZGLU3CM7VNW/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-21 Thread Ivan Pozdeev via Python-Dev



On 22.03.2020 7:46, Steven D'Aprano wrote:

On Sun, Mar 22, 2020 at 06:57:52AM +0300, Ivan Pozdeev via Python-Dev wrote:


Does it need to be separate methods?

Yes.

Overloading a single method to do two dissimilar things is poor design.

They are similar. We're removing stuff from an edge in both cases. The only difference is whether input is treated as a character set or as 
a raw substring.

As written in the PEP preface, the very reason for the PEP is that people
are continuously trying to use *strip methods for the suggested
functionality -- which shows that this is where they are expecting to find
it.

They are only expecting to find it in strip() because there is no other
alternative where it could be. There's nothing inherent about strip that
means to delete a prefix or suffix, but when the only other choices are
such obviously wrong methods as upper(), find(), replace(), count() etc
it is easy to jump to the wrong conclusion that strip does what is
wanted.




--
Regards,
Ivan
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/V5N5K6WFWM4QPJ5YUGSCE6HY47P25PVG/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-21 Thread Cameron Simpson

On 21Mar2020 12:45, Eric V. Smith  wrote:

On 3/21/2020 12:39 PM, Victor Stinner wrote:

Well, if CPython is modified to implement tagged pointers and supports
storing short strings (a few latin1 characters) as a pointer, it may
become harder to keep the same behavior for "x is y" where x and y are
strings.


Are you suggesting that it could become impossible to write this 
function:


   def myself(o):
       return o

and not be able to rely on "o is myself(o)"? That seems... a pretty 
nasty breaking change for the language.


Good point. And I guess it's still a problem for interned strings, 
since even a copy could be the same object:



   >>> s = 'for'
   >>> s[:] is 'for'
   True

So I now agree with Ned, we shouldn't be prescriptive here, and we 
should explicitly say in the PEP that there's no way to tell if the 
strip/cut/whatever took place, other than comparing via equality, not 
identity.


Unless Victor asserts that a function like myself() above cannot be 
relied on to have its return value "is" its passed in value, I disagree.  
The beauty of returning the original object on no change is that the 
test is O(1) and the criterion is clear. It is easy to document that 
stripping an empty affix returns the original string.


I guess a test for len(stripped_string) == len(unstripped_string) is 
also O(1), and is less prescriptive. I just don't see the weight to 
Ned's characterisation of "a is/is-not b" as overly prescriptive; 
returning the same reference as one is given seems nearly the easiest 
thing a function can ever do.


Cheers,
Cameron Simpson 
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/22RNX6ABI7KATARTGJPHBI3OKAE4XHED/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-21 Thread Cameron Simpson

On 22Mar2020 05:09, Steven D'Aprano  wrote:
I agree with Ned -- whether the string object is returned unchanged or 
a copy is an implementation decision, not a language decision.


[Eric]

The only reason I can think of is to enable the test above: did a
suffix/prefix removal take place? That seems like a useful thing.


We don't make this guarantee about string identity for any other string
method, and CPython's behaviour varies from method to method:

   py> s = 'a b c'
   py> s is s.strip()
   True
   py> s is s.lower()
   False

and version to version:

   py> s is s.replace('a', 'a')  # 2.7
   False
   py> s is s.replace('a', 'a')  # 3.5
   True

I've never seen anyone relying on this behaviour, and I don't expect
these new methods will change that. Thinking that `is` is another way of
writing `==`, yes, I see that frequently. But relying on object identity
to see whether a new string was created by a method, no.


Well, ok, expressed on this basis, colour me convinced. I'm not ok with 
not mandating that no change to the string returns an equal string (but, 
really, _only_ because i can do a test with len(), as I consider a test 
of content wildly excessive - potentially quite expensive - strings are 
not always short).



If you want to know whether a prefix/suffix was removed, there's a more
reliable way than identity and a cheaper way than O(N) equality. Just
compare the length of the string before and after. If the lengths are
the same, nothing was removed.


Aye.
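
The length comparison can be sketched like this. Both function names are hypothetical and written only for this example; `remove_suffix` stands in for the proposed method.

```python
def remove_suffix(s: str, suffix: str) -> str:
    """Stand-in for the proposed method: drop one trailing copy of suffix."""
    if suffix and s.endswith(suffix):
        return s[:-len(suffix)]
    return s


def remove_suffix_reporting(s: str, suffix: str):
    """Return (result, removed), using an O(1) length comparison."""
    result = remove_suffix(s, suffix)
    return result, len(result) != len(s)


print(remove_suffix_reporting("archive.tar.gz", ".gz"))   # ('archive.tar', True)
print(remove_suffix_reporting("archive.tar.gz", ".zip"))  # ('archive.tar.gz', False)
# An empty suffix removes nothing, so it is reported as "not removed".
print(remove_suffix_reporting("archive", ""))             # ('archive', False)
```

The check never looks at object identity or string contents, so it does not depend on whether an implementation returns the original object or a copy.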

Cheers,
Cameron Simpson 
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/TOO62DCWEANP23FN6MI4YIPQIIDAQ53U/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-21 Thread Gregory P. Smith
Nice PEP! That this discussion wound up in the NP-complete "naming things"
territory as the main topic right from the start/prefix/beginning speaks
highly of it. :)

The only things left I have to add are (a) agreed, don't specify whether it
is a copy or not for str and bytes, BUT (b) do specify that for bytearray.

Being the only mutable type, it matters. Consistency with other bytearray
methods based on https://docs.python.org/3/library/stdtypes.html#bytearray
suggests copy.
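
A sketch of what copy semantics mean for bytearray, assuming a hypothetical helper `remove_prefix_copy` (not the PEP's implementation): the result is always a new object, so mutating it never touches the original buffer.

```python
def remove_prefix_copy(data: bytearray, prefix: bytes) -> bytearray:
    """Hypothetical helper: always return a new bytearray, never data itself."""
    if data.startswith(prefix):
        return data[len(prefix):]
    return data[:]  # slicing a bytearray creates a copy


buf = bytearray(b"HDRpayload")
body = remove_prefix_copy(buf, b"HDR")
same = remove_prefix_copy(buf, b"XYZ")  # prefix absent: still a copy

same[0:3] = b"???"  # mutate the returned object only

print(body)         # bytearray(b'payload')
print(same)         # bytearray(b'???payload')
print(buf)          # bytearray(b'HDRpayload')
print(same is buf)  # False
```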

(Someone always wants inplace versions of bytearray methods, that is a
separate topic not for this pep)

Fwiw I *like* your cutprefix/suffix names. Avoiding the terms strip and
trim is wise to avoid confusion and having the name read as nice English is
Pythonic.  I'm not going to vote on other suggestions.

-gps

On Sat, Mar 21, 2020, 9:32 PM Kyle Stanley  wrote:

> > In this case, being in line with the existing string API method names
> take priority over PEP 8, e.g. splitlines, startswith, endswith,
> splitlines, etc.
>
> Oops, I just realized that I wrote "splitlines" twice there. I guess that
> goes to show how much I use that specific method in comparison to the
> others, but the point still stands. Here's a more comprehensive set of
> existing string methods to better demonstrate it (Python 3.8.2):
>
> >>> [m for m in dir(str) if not m.startswith('_')]
> ['capitalize', 'casefold', 'center', 'count', 'encode', 'endswith',
> 'expandtabs', 'find', 'format', 'format_map', 'index', 'isalnum',
> 'isalpha', 'isascii', 'isdecimal', 'isdigit', 'isidentifier', 'islower',
> 'isnumeric', 'isprintable', 'isspace', 'istitle', 'isupper', 'join',
> 'ljust', 'lower', 'lstrip', 'maketrans', 'partition', 'replace', 'rfind',
> 'rindex', 'rjust', 'rpartition', 'rsplit', 'rstrip', 'split', 'splitlines',
> 'startswith', 'strip', 'swapcase', 'title', 'translate', 'upper', 'zfill']
>
> On Sun, Mar 22, 2020 at 12:17 AM Kyle Stanley  wrote:
>
>> Ivan Pozdeev wrote:
>> > I must note that names conforming to
>> https://www.python.org/dev/peps/pep-0008/#function-and-variable-names
>> would be "strip_prefix" and "strip_suffix".
>>
>> In this case, being in line with the existing string API method names
>> take priority over PEP 8, e.g. splitlines, startswith, endswith,
>> splitlines, etc. Although I agree that an underscore would probably be a
>> bit easier to read here, it would be rather confusing to randomly swap
>> between the naming convention for the same API. The benefit gained in
>> *slightly* easier readability wouldn't make up for the headache IMO.
>>
>> On Sun, Mar 22, 2020 at 12:13 AM Ivan Pozdeev via Python-Dev <
>> python-dev@python.org> wrote:
>>
>>> On 22.03.2020 6:38, Guido van Rossum wrote:
>>>
>>> On Sat, Mar 21, 2020 at 6:46 PM Nick Coghlan  wrote:
>>>
 On Sat., 21 Mar. 2020, 11:19 am Nathaniel Smith,  wrote:

> On Fri, Mar 20, 2020 at 11:54 AM Dennis Sweeney
>  wrote:
> > This is a proposal to add two new methods, ``cutprefix`` and
> > ``cutsuffix``, to the APIs of Python's various string objects.
>
> The names should use "start" and "end" instead of "prefix" and
> "suffix", to reduce the jargon factor and for consistency with
> startswith/endswith.
>

 This would also be more consistent with startswith() & endswith(). (For
 folks querying this: the relevant domain here is "str builtin method
 names", and we already use startswith/endswith there, not
 hasprefix/hassuffix. The most challenging relevant audience for new str
 builtin method *names* is also 10 year olds learning to program in school,
 not adults reading the documentation)

>>>
>>> To my language sense, hasprefix/hassuffix are horrible compared to
>>> startswith/endswith. If you were to talk about this kind of condition using
>>> English instead of Python, you wouldn't say "if x has prefix y", you'd say
>>> "if x starts with y". (I doubt any programming language uses hasPrefix or
>>> has_prefix for this, making it a strawman.)
>>>
>>> *But*, what would you say if you wanted to express the idea of removing
>>> something from the start or end? It's pretty verbose to say "remove y from
>>> the end of x", and it's not easy to translate that into a method name.
>>> x.removefromend(y)? Blech! And x.removeend(y) has the double 'e', which
>>> confuses the reader.
>>>
>>> The thing is that it's hard to translate "starts" (a verb) into a noun
>>> -- the "start" of something is its very beginning (i.e., in Python,
>>> position zero), while a "prefix" is a noun that specifically describes an
>>> initial substring (and I'm glad we don't have to use *that* :-).
>>>
>>>
 I think the concern about stripstart() & stripend() working with
 substrings, while strip/lstrip/rstrip work with character sets, is valid,
 but I also share the concern about introducing "cut" as yet another verb to
 learn in the already wide string API.

>>>
>>> It's not great, and I actually think that "stripprefix" and

[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-21 Thread Dennis Sweeney
I like "removeprefix" and "removesuffix". My only concern before had been 
length, but three more characters than "cut***fix" is a small price to pay for 
clarity.
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/Y4O2AIODGI2Z45A32UK5EHR7A7RLQFOK/

