On Sat, Jun 29, 2019 at 3:24 AM Anders Hovmöller <bo...@killingar.net> wrote:
>
>
>
> > On 28 Jun 2019, at 19:01, Rhodri James <rho...@kynesim.co.uk> wrote:
> >
> > On 27/06/2019 18:58, James Lu wrote:
> >>> On Jun 26, 2019, at 7:13 PM, Chris Angelico <ros...@gmail.com> wrote:
> >>>
> >>> The main advantage of sscanf over a regular expression is that it
> >>> performs a single left-to-right pass over the format string and the
> >>> target string simultaneously, with no backtracking. (This is also its
> >>> main DISadvantage compared to a regular expression.) A tiny amount of
> >>> look-ahead in the format string is the sole exception (for instance,
> >>> format string "%s$%d" would collect a string up until it finds a
> >>> dollar sign, which would otherwise have to be written "%[^$]$%d").
> >>> There is significant value in having an extremely simple parsing tool
> >>> available; the question is, is it worth complicating matters with yet
> >>> another way to parse strings? (We still have fewer ways to parse than
> >>> ways to format strings. I think.)
> >> I agree. Python should have an equivalent of scanf, but perhaps it should 
> >> have some extensions:
> >> %P - read pickled object
> >> %J - read JSON object
> >> %M - read msgpack object
> >
> > I somewhat disagree; scanf (or rather sscanf) always looks like a brilliant 
> > idea right up until I come to use it, at which point I almost always do 
> > something else that gives me better control.  I get very paranoid about 
> > parsing, and rolling my own usually feels safer.  Whether or not it is 
> > safer is, of course, another issue :-/
>
>
> And let's not forget how bad positional matching is. So if one were to 
> implement such a library it would be best if one can supply names for the 
> parts and have it spit out a dict.
>

Dunno about that; positional matching works really nicely with
unpacking assignment:

spam, eggs, sausages = sscanf(string, "%d /// %s ||| %d")

No dictionary needed. Of course, if you _want_ named placeholders,
that can be a good feature to support, but I wouldn't say that
positional matching is "bad".
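
To make the comparison concrete, here is a rough sketch of both styles
side by side. There is no sscanf in the standard library, so this uses
the re module as a stand-in for the format string; the sample input
line is made up for illustration, and the values come back as strings
rather than converted integers.

```python
import re

# Made-up input that fits the "%d /// %s ||| %d" layout above.
line = "12 /// bacon ||| 7"

# Regex stand-in for the format string. The named groups play the
# role of named placeholders.
pattern = re.compile(
    r"(?P<spam>\d+) /// (?P<eggs>\S+) \|\|\| (?P<sausages>\d+)"
)
match = pattern.fullmatch(line)

# Positional style: unpack straight into local names.
spam, eggs, sausages = match.groups()

# Named style: the same match, returned as a dict.
fields = match.groupdict()
```

Both views come from the same match object, so supporting names does
not have to cost anything for callers who prefer unpacking.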

Here's a random thought, though. Let's break this into two separate parts.

1) For all the different types of object that can be read (integer,
string, JSON blob, etc), have a function that will read one, stop when
it's done, and report both the parsed object and the point where it
stopped parsing.
2) Have a general template handler that looks for the literal text
tokens, says, "oh, you want that kind of object up to the
slash-slash-slash", and splits up the string to hand to the main
parser.
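
A minimal sketch of how those two parts might fit together, under some
loud assumptions: the names read_int, read_str, READERS and sscanf are
all hypothetical, only %d and %s are handled, and literal text must
match exactly (real sscanf treats whitespace more loosely). Each reader
returns the parsed value plus the position where it stopped; the
template handler finds the next literal token and hands the reader only
the text before it.

```python
import re

DIGITS = "0123456789"

def read_int(text, pos=0):
    """Part 1: read one signed decimal integer, return (value, end)."""
    end = pos
    if end < len(text) and text[end] in "+-":
        end += 1
    first_digit = end
    while end < len(text) and text[end] in DIGITS:
        end += 1
    if end == first_digit:
        raise ValueError(f"expected an integer at position {pos}")
    return int(text[pos:end]), end

def read_str(text, pos=0):
    """Part 1: a string reader simply takes the whole piece it is given."""
    return text[pos:], len(text)

# Hypothetical directive table; %J and friends could be added here.
READERS = {"%d": read_int, "%s": read_str}

def sscanf(text, template):
    """Part 2: split on the literal tokens and dispatch to the readers."""
    # The capture group keeps the directives, so parts alternates
    # literal, directive, literal, ... and always ends on a literal.
    parts = re.split(r"(%[ds])", template)
    results = []
    pos = 0
    for i, part in enumerate(parts):
        if part in READERS:
            literal = parts[i + 1]
            # An empty literal means "read to the end of the string".
            stop = text.find(literal, pos) if literal else len(text)
            if stop < 0:
                raise ValueError(f"literal {literal!r} not found")
            piece = text[pos:stop]
            value, end = READERS[part](piece, 0)
            if end != len(piece):
                raise ValueError(f"unparsed text in {piece!r}")
            results.append(value)
            pos = stop
        elif part:
            if not text.startswith(part, pos):
                raise ValueError(f"expected {part!r} at position {pos}")
            pos += len(part)
    return tuple(results)

spam, eggs, sausages = sscanf("12 /// bacon ||| 7", "%d /// %s ||| %d")
```

Splitting on the first occurrence of the literal keeps this a single
left-to-right pass, but it also means a value that contains its own
delimiter (a JSON blob containing " /// ", say) would be cut short. A
fuller version would probably let the reader decide where it stops and
only then check for the literal.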

I know for sure that the first half of that will have value in other
contexts. Just last night I was wishing that I could call
compile(data, "-", "eval") and have it read one Python expression and
leave behind the rest. (To my knowledge, that doesn't exist either.)
There are ways to hack around the JSON module to get that effect, but
it's not exactly a supported feature.
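
For the JSON case, one route that does exist in the standard library is
json.JSONDecoder.raw_decode, which returns the decoded object together
with the index where it stopped. Whether that is the workaround meant
above is an assumption; the sample data is made up. Note that, unlike
json.loads, raw_decode does not skip leading whitespace.

```python
import json

decoder = json.JSONDecoder()

# Made-up input: one JSON document followed by unrelated text.
data = '{"spam": [1, 2, 3]} and then some trailing text'

# raw_decode parses one JSON value from the start of the string and
# reports the index just past it, rather than rejecting the leftovers.
obj, end = decoder.raw_decode(data)
rest = data[end:]
```

This is the "parsed object plus stopping point" shape described in part
1; nothing comparable is shown here for compile(), since none is known
to exist.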

The second half? Less sure, and the API would make or break it. Anyone
feel inspired by this and want to come up with one?

ChrisA
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/WBAFXIWUY4NUPBPMENHYEW552LVNFPFV/