On 2020-10-22 08:50, M.-A. Lemburg wrote:
On 22.10.2020 04:12, David Mertz wrote:
To bring it back to a concrete idea, here's how I see things:

 1. The idea of f-string-like assignment targets has little support.  Only
    Chris, and maybe the OP who seems to have gone away.
 2. The idea of a "scanning language" seems to garner a fair amount of
    enthusiasm from everyone who has commented.
 3. Having the scanning language be "inspired by" f-strings seems to fit nicely
    with Python.
 4. Lots of folks like C scanf() as another inspiration for the need.  I was not
    being sarcastic in saying that I thought COBOL PICTURE clauses are another
    similar useful case.  I think Perl 6 "rules" were trying to do something
    along those lines... but, well, Perl.
 5. In my opinion, this is naturally a function, or several related functions,
    not new syntax (I think Steven agrees)

So the question is, what should the scanning language look like?  Another
question is: "Does this already exist?"

I'm looking around PyPI, and I see this that looks vaguely along the same lines.
But most likely I am missing things: https://pypi.org/project/rebulk/

In terms of API, assuming functions, I think there are two basic models.  We
could have two (or more) functions that were related though:

# E.g. pat_with_names = "{foo:f}/{bar:4s}/{baz:3d}"
matches = scan_to_obj(pat_with_names, haystack)
# something like (different match objects are possible choices: dict, dataclass,
# etc.)
print(matches.foo)
print(matches['bar'])

Alternately:

# pat_only = "{:f}/{:4s}/{:3d}"
foo, bar, baz = scan_to_tuple(pat_only, haystack)
# names, if bound, have the types indicated by scanning language

There are open questions about partial matching, defaults, exceptions to raise,
etc.  But the general utility of something along those lines seems to have
rough consensus.

I like this idea :-)

There are lots of use cases where regular expressions + subsequent
type conversion are just overkill for a small parsing task.

The above would fit this space quite nicely, esp. since it already
comes with a set of typical formats you have to parse, without having
to worry about the nitty-gritty details (as you have to do with REs) or
the type conversion from string to e.g. float.

One limitation is that only a few types would be supported: 's' for str, 'd' or 'x' for int, 'f' for float.
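
To make that concrete, here is a minimal sketch of how scan_to_tuple could be built on top of re with just those four type codes. Everything in it is an assumption made for illustration: the helper names, treating the width as a maximum (as scanf does), ignoring the width for 'f', and requiring the whole string to match are all choices the thread has not settled.

```python
import re

# A field looks like "{name:WIDTHtype}"; name and width are optional.
_FIELD = re.compile(r"\{(\w*):(\d*)([sdxf])\}")

# Hypothetical converter table for the four type codes discussed here.
_CONVERTERS = {
    "s": str,
    "d": int,
    "x": lambda text: int(text, 16),
    "f": float,
}


def _field_regex(type_code, width):
    """Return the regex fragment for one field (width = maximum length)."""
    repeat = "{1,%s}" % width if width else "+"
    if type_code == "s":
        return "." + repeat + "?"          # lazy, so later literals can match
    if type_code == "d":
        return "[+-]?[0-9]" + repeat
    if type_code == "x":
        return "[0-9a-fA-F]" + repeat
    # 'f': the width is ignored in this sketch.
    return r"[+-]?(?:[0-9]+\.?[0-9]*|\.[0-9]+)(?:[eE][+-]?[0-9]+)?"


def scan_to_tuple(pattern, haystack):
    """Scan haystack according to pattern and return the converted values."""
    parts, converters, last = [], [], 0
    for field in _FIELD.finditer(pattern):
        _name, width, type_code = field.groups()   # the name is ignored here
        parts.append(re.escape(pattern[last:field.start()]))
        parts.append("(" + _field_regex(type_code, width) + ")")
        converters.append(_CONVERTERS[type_code])
        last = field.end()
    parts.append(re.escape(pattern[last:]))
    match = re.fullmatch("".join(parts), haystack)
    if match is None:
        raise ValueError(f"{haystack!r} does not match {pattern!r}")
    return tuple(conv(text) for conv, text in zip(converters, match.groups()))


print(scan_to_tuple("{:f}/{:4s}/{:3d}", "3.14/abcd/042"))  # (3.14, 'abcd', 42)
```

The converter table is fixed, which is exactly the limitation: there is no place to plug in another target type.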

But what if you wanted to scan to a Decimal instead of a float, or scan a date? A date could be formatted any number of ways!

So perhaps the scanning format should also let you specify the target type. For example, given "{?datetime:%H:%M}", it would look up the pre-registered name "datetime" to get a scanner; the scanner would be given the format, the string and the position and would return the value and the new position. I used '?' in the scan format to distinguish it from a format string.
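
A rough sketch of that protocol, to show the shape of it. The registry, the register_scanner decorator, the scan function and the "decimal" scanner are all hypothetical names invented for this example; only the (format, string, position) -> (value, new position) contract comes from the description above.

```python
import re
from datetime import datetime
from decimal import Decimal

# Hypothetical registry: name used in "{?name:format}" -> scanner callable.
SCANNERS = {}


def register_scanner(name):
    """Decorator that registers a scanner under the given name."""
    def decorator(func):
        SCANNERS[name] = func
        return func
    return decorator


@register_scanner("datetime")
def scan_datetime(fmt, string, pos):
    # strptime() must consume its whole input, so try the longest candidate
    # substring first and shrink it until one parses.
    for end in range(len(string), pos, -1):
        try:
            return datetime.strptime(string[pos:end], fmt), end
        except ValueError:
            continue
    raise ValueError(f"no datetime matching {fmt!r} at position {pos}")


_DECIMAL = re.compile(r"[+-]?[0-9]+(?:\.[0-9]+)?")


@register_scanner("decimal")
def scan_decimal(fmt, string, pos):
    # The format is unused in this sketch.
    match = _DECIMAL.match(string, pos)
    if match is None:
        raise ValueError(f"no decimal at position {pos}")
    return Decimal(match.group()), match.end()


_FIELD = re.compile(r"\{\?(\w+)(?::([^}]*))?\}")


def scan(pattern, string):
    """Match literals exactly and hand each "{?name:format}" to its scanner."""
    values, pos, last = [], 0, 0
    for field in _FIELD.finditer(pattern):
        literal = pattern[last:field.start()]
        if not string.startswith(literal, pos):
            raise ValueError(f"expected {literal!r} at position {pos}")
        pos += len(literal)
        name, fmt = field.group(1), field.group(2) or ""
        value, pos = SCANNERS[name](fmt, string, pos)
        values.append(value)
        last = field.end()
    if string[pos:] != pattern[last:]:
        raise ValueError(f"unexpected text at position {pos}")
    return tuple(values)


when, price = scan("at {?datetime:%H:%M} pay {?decimal}", "at 12:30 pay 9.99")
print(when.time(), price)  # 12:30:00 9.99
```

The shrinking loop in scan_datetime is only there because strptime has no "parse a prefix" mode; a real scanner would want something cheaper.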

It might even be possible to use the same format for both formatting and scanning. For example, given "{?datetime:%H:%M}", string formatting would just ignore the "?datetime" part.
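
As a rough illustration of the formatting side, a string.Formatter subclass can drop the "?datetime" part before the field is looked up. The class name is made up for this sketch, and str.format itself would of course not accept such a field today.

```python
import string
from datetime import datetime


class ScanAwareFormatter(string.Formatter):
    """Hypothetical formatter that ignores a "?name" suffix in field names."""

    def parse(self, format_string):
        for literal, field, spec, conversion in super().parse(format_string):
            if field is not None:
                # "{?datetime:%H:%M}" becomes "{:%H:%M}" for formatting.
                field = field.split("?", 1)[0]
            yield literal, field, spec, conversion


formatter = ScanAwareFormatter()
print(formatter.format("{?datetime:%H:%M}", datetime(2020, 10, 22, 8, 50)))  # 08:50
```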
