On Sun, 8 Jan 2023 at 08:32, Stephen J. Turnbull <stephenjturnb...@gmail.com> wrote:
>
> Steven D'Aprano writes:
>
>  > On Sat, Jan 07, 2023 at 10:48:48AM -0800, Peter Ludemann wrote:
>  > > You can get almost the same result using pattern matching. For example, your
>  > >     "foo:bar;baz".partition(":", ";")
>  > > can be done by a well-known matching idiom:
>  > >     re.match(r'([^:]*):([^;]*);(.*)', 'foo:bar;baz').groups()
>
>  > I think that the regex solution is also wrong because it requires you
>  > to know *exactly* what order the separators are found in the source
>  > string.
>
> But that's characteristic of many examples.  In "structured" mail
> headers like Content-Type, you want the separators to come in the
> order ':', '=', ';'.  In a URI scheme with an authority component, you
> want them in the order '@', ':'.

+1 (while also recognising the caveats you mention subsequently)

> Except that you don't, in both those
> examples.  In Content-Type, the '=' is optional, and there may be
> multiple ';'.  In authority, the existing ':' is optional, and there's
> an optional ':' to separate password from username before the '@'.

Trying to avoid the usual discussions about permissive parsing /
supporting various implementations in the wild: long-term, the least
ambiguous and most computationally efficient environment would
probably want to reduce special cases like that?  (both in-data and
in-code)

>     user, _, domain = "example.com".partition('@')
>
> does the wrong thing!

Yep - it's important to choose partition arguments (I'm mostly
resisting the temptation to call them a 'pattern') that are
appropriate for the input.

Structural pattern matching _seems_ like it could correspond here, in
terms of selecting appropriate arguments -- but it is, as I understand
it, limited to at most one star (wildcard) subpattern per sequence
pattern (by sensible design).

> I would prefer "one bite per call" partition
> to a partition at multiple points.

That does seem clearer - and clearer is, generally, probably better.

I suppose an analysis (that I don't have the ability to perform
easily) could be to determine how many regular expression call sites
could be migrated compatibly and beneficially by using multiple
partition arguments.
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/GLOHDOCG7K3BMDCO7N2Y5BD3SP2X3FSW/
Code of Conduct: http://python.org/psf/codeofconduct/
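
For reference, the behaviours discussed in this message can be sketched
roughly as below. partition_chain is a hypothetical helper written only
for illustration, not a proposed API: it assumes a multi-argument
partition would behave like repeated "one bite per call" str.partition
calls applied left to right, which the thread has not settled. The
regex is the idiom quoted above.

```python
import re


def partition_chain(text, *separators):
    """Hypothetical helper: one str.partition call per separator, left to right.

    Assumption for illustration only; the thread does not define the
    semantics of a multi-argument partition.
    """
    pieces = []
    rest = text
    for sep in separators:
        # Each call takes "one bite": everything before the first `sep`.
        head, found, rest = rest.partition(sep)
        pieces.extend([head, found])
    pieces.append(rest)
    return tuple(pieces)


# Separators present, in the expected order.
in_order = partition_chain("foo:bar;baz", ":", ";")

# Separators present but swapped: no error, just a differently shaped result.
swapped = partition_chain("foo;bar:baz", ":", ";")

# Missing separator: partition puts the whole string in the first slot,
# so `user` receives the domain and `domain` is empty.
user, at, domain = "example.com".partition("@")

# rpartition puts the whole string in the last slot instead.
r_user, r_at, r_domain = "example.com".rpartition("@")

# The quoted regex idiom only matches when ':' comes before ';'.
idiom = re.compile(r"([^:]*):([^;]*);(.*)")
regex_in_order = idiom.match("foo:bar;baz")
regex_swapped = idiom.match("foo;bar:baz")

print(in_order)
print(swapped)
print((user, at, domain), (r_user, r_at, r_domain))
print(regex_in_order.groups() if regex_in_order else None, regex_swapped)
```

In either approach the caller has to check whether each separator was
actually found (the empty `at` above, or a failed match) before trusting
the fields.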