On Sun, 8 Jan 2023 at 08:32, Stephen J. Turnbull
<stephenjturnb...@gmail.com> wrote:
>
> Steven D'Aprano writes:
>
>  > On Sat, Jan 07, 2023 at 10:48:48AM -0800, Peter Ludemann wrote:
>  > > You can get almost the same result using pattern matching. For example,
>  > > your "foo:bar;baz".partition(":", ";")
>  > > can be done by a well-known matching idiom:
>  > > re.match(r'([^:]*):([^;]*);(.*)', 'foo:bar;baz').groups()
>
>  > I think that the regex solution is also wrong because it requires you
>  > to know *exactly* what order the separators are found in the source
>  > string.
>
> But that's characteristic of many examples.  In "structured" mail
> headers like Content-Type, you want the separators to come in the
> order ':', '=', ';'.  In a URI scheme with an authority component, you
> want them in the order '@', ':'.

+1 (while also recognising the caveats you mention subsequently)

> Except that you don't, in both those
> examples.  In Content-Type, the '=' is optional, and there may be
> multiple ';'.  In authority, the existing ':' is optional, and there's
> an optional ':' to separate password from username before the '@'.

Setting aside the usual discussions about permissive parsing and
supporting the various implementations in the wild: in the long term,
wouldn't the least ambiguous and most computationally efficient
approach be to reduce special cases like that, both in the data and
in the code?

> user, _, domain = "example.com".partition('@')
>
> does the wrong thing!

Yep - it's important to choose partition arguments (I'm
mostly-resisting the temptation to call them a 'pattern') that are
appropriate for the input.
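
To make the failure mode concrete, here is a minimal sketch of what
str.partition and str.rpartition return when the separator is absent.
The helper name split_user_domain is hypothetical and only for
illustration; it assumes a missing '@' should mean "no user".

```python
def split_user_domain(address: str) -> tuple[str, str]:
    """Hypothetical helper: return (user, domain), with user '' if no '@'."""
    # rpartition puts the whole string in the LAST slot when '@' is absent,
    # so the domain slot is the one that gets filled.
    user, _, domain = address.rpartition("@")
    return user, domain


# Plain partition puts the whole string in the FIRST slot instead,
# which is why the quoted example ends up with user == "example.com".
naive = "example.com".partition("@")
```

Which of the two is "appropriate" still depends on the input: for a
userinfo part that may itself contain '@', neither is obviously right.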

Structural pattern matching _seems_ like it could correspond here, in
terms of selecting appropriate arguments -- but it is, as I understand
it, limited to at most one star pattern per sequence pattern (by
sensible design), and a str has to be split first because sequence
patterns do not match strings.

>  I would prefer "one bite per call" partition
> to a partition at multiple points.

That does seem clearer - and clearer is, generally, probably better.
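
For illustration, a "one bite per call" version of the authority
example might look like the sketch below. The function name
parse_authority and the dict keys are my own assumptions, and this is
deliberately not a complete RFC 3986 parser (IPv6 literals such as
[::1]:80 would be split in the wrong place).

```python
def parse_authority(authority: str) -> dict[str, str]:
    """Rough sketch for [user[:password]@]host[:port]; not RFC-complete."""
    # Bite 1: split off the optional userinfo at the last '@'.
    userinfo, _, hostport = authority.rpartition("@")
    # Bite 2: the optional ':' inside userinfo separates user and password.
    user, _, password = userinfo.partition(":")
    # Bite 3: the optional ':' after the host introduces the port.
    host, _, port = hostport.partition(":")
    return {"user": user, "password": password, "host": host, "port": port}
```

Each call handles one optional separator, so a missing piece simply
comes back as an empty string rather than shifting the other fields.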

I suppose one analysis (that I don't have the ability to perform
easily) would be to determine how many regular-expression call sites
could be migrated compatibly and beneficially to partition calls that
take multiple separator arguments.
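
A per-call-site compatibility check could be sketched roughly as
follows. Note that multi_partition is a hypothetical stand-in for the
proposed behaviour (I am assuming it partitions at each separator in
turn, left to right), and agree is just an illustrative helper; neither
exists in the standard library.

```python
import re


def multi_partition(text: str, *seps: str) -> tuple[str, ...]:
    """Hypothetical: partition at each separator in turn, left to right."""
    parts = []
    rest = text
    for sep in seps:
        head, found, rest = rest.partition(sep)
        parts.extend([head, found])
    parts.append(rest)
    return tuple(parts)


# The regex from earlier in the thread.
PATTERN = re.compile(r'([^:]*):([^;]*);(.*)')


def agree(text: str) -> bool:
    """Return True if the regex and the partition sketch yield the same fields."""
    m = PATTERN.match(text)
    regex_fields = m.groups() if m else None
    p = multi_partition(text, ":", ";")
    # Odd slots hold the separators found; an empty one means "not found".
    partition_fields = p[0::2] if all(p[1::2]) else None
    return regex_fields == partition_fields
```

Even for this small pattern the two are not fully equivalent: '.' does
not match a newline by default, so agree("a:b;c\nd") is False, which is
the kind of difference such an analysis would need to surface.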
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/GLOHDOCG7K3BMDCO7N2Y5BD3SP2X3FSW/
Code of Conduct: http://python.org/psf/codeofconduct/
