[Metamath] Re: How should I approach comment parsing?

Benoit Thu, 12 May 2022 16:21:16 -0700

The EBNF in the Metamath book differs slightly from metamath.c in that "$( 
$( $)" returns an error in metamath.c but is accepted by the EBNF grammar 
and by mmverify.py. I opens an issue there 
(https://github.com/metamath/metamath-book/issues/233).


I clarified a bit readc() in mmverify.py, and I wrote a similar function 
for my OCaml verifier, which can forbid the above example, like 
metamath.c.  I copy both below:
-----------------------------------
    def readc(self) -> StringOption:
        """Read the next token once included files have been expanded and
        comments have been skipped.
        """
        tok = self.readf()
        while tok == '$(':
            # Note that we use 'read' in this while-loop, and not 'readf',
            # since inclusion statements within a comment are still comments
            # so should be skipped.
            # The following line is not necessary but makes things clearer;
            # note the similarity with the first three lines of 'readf'.
            tok = self.read()
            while tok and tok != '$)':
                tok = self.read()
            if not tok:
                raise MMError("Unclosed comment at end of file.")
            assert(tok == '$)')
            tok = self.read()
        vprint(70, "Token once comments skipped:", tok)
        return tok
-----------------------------------
(** Read a comment (and ignore its content, so does not need to be passed a
    [parse_env] as argument).  Comments should not nest, so opening-comment
    tokens are ignored. *)
let read_comment ic =
  Printf.printf "Reading a comment.\n";
  let rec read_comment_content ic =
    match read_token ic with
    | Some CloseComm -> Printf.printf "End of comment\n"
    | Some OpenComm ->
        Printf.printf "Comments should not nest.  Ignoring.\n";
        read_comment_content ic
    | Some _ -> read_comment_content ic
    | None -> Printf.printf "File ended while reading comment.  Ignoring.\n"
  in
  read_comment_content ic
-----------------------------------

Benoît


On Friday, May 13, 2022 at 12:39:25 AM UTC+2 Philip White wrote:

> I'm working on implementing comment parsing in my verifier. It doesn't 
> seem to lend itself well to elegant implementations, however, since the 
> token structure of comments seems to be different than that of 
> non-comment text. In addition, the token rules of $t comments and 
> ordinary comments seems to be different (for example, normal comments 
> don't have the C-style comments, but $t comments do). 
>
> I can certainly figure this out by handling all the special cases and 
> bolting together a few different parsers, but I'm wondering if anyone 
> has found an nice technique for this. 
>
> - Philip 
>

-- 
You received this message because you are subscribed to the Google Groups 
"Metamath" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/metamath/2770d75f-6d32-4ab5-be5f-0634296feeaan%40googlegroups.com.

[Metamath] Re: How should I approach comment parsing?

Reply via email to