Akim Demaille <[email protected]> wrote:

>> I hope this list is the right place for this.

>> In the past few weeks, I started working on "%language
>> PHP".  You can browse the code at
>> <URI:https://github.com/scfc/bison-php> (bison's "master" is
>> "upstream").

> I am very curious about this.  Are you really wanting to _use_ a Bison parser
> in PHP?  Or is this some kind of experimental toy project?  Unfortunately the
> current maintainers don't spend a lot of time on Bison, and features that
> might never be maintained will finish by hindering the development of the
> whole project.

<excursus>
The starting point of this endeavour is
<URI:https://bugzilla.wikimedia.org/show_bug.cgi?id=17465>.
ATM, MediaWiki uses an external OCaml program (generated by
ocamllex/ocamlyacc) to determine whether a "<math>" segment
contains only "safe" TeX by validating it against (a subset
of) the TeX grammar.

So I looked for existing PHP scanner/parser generators and
found several, let's say, code dumps.  Few were working
code, none were actively maintained, and their grammars, if
documented at all, looked rather funky and/or lacked
flex's/Bison's functionality.

Rather than whipping one of them into usable shape, I
adapted Bison's Java generator, as grammar, concept & Co.
are familiar to *everyone*, the infrastructure of
testsuites, mailing lists, bug trackers, etc. is already in
place and it would be much easier to port new features like
GLR if the need arises.
</excursus>

>> But the second "is" should be a "$is".  I tried some
>> variants of "$$" and patsubst at different places, but
>> unfortunately, m4's levels of quoting have always exceeded
>> my imagination :-).

> Bison is not ready for this, not at all.  The easiest would be to
> post-process the result as use some kind of new quadrigraph to denote $, say
> @S|@ :)

> Autoconf uses this:

>     s/\@<:\@/[/g;
>     s/\@:>\@/]/g;
>     s/\@\{:\@/(/g;
>     s/\@:\}\@/)/g;
>     s/\@S\|\@/\$/g;
>     s/\@%:\@/#/g;
>     s/\@&t\@//g;

> A cleaner design requires more thinking.
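For illustration only, the post-processing pass suggested above could
be sketched roughly as follows, in Python rather than Perl.  The table
mirrors the Autoconf substitutions quoted above and applies them in the
same order; the name "expand_quadrigraphs" is hypothetical and nothing
here is part of Bison or Autoconf.

```python
# Quadrigraph -> literal text, in the same order as the quoted
# Autoconf substitutions.  "@&t@" expands to nothing and comes last.
QUADRIGRAPHS = [
    ("@<:@", "["),
    ("@:>@", "]"),
    ("@{:@", "("),
    ("@:}@", ")"),
    ("@S|@", "$"),
    ("@%:@", "#"),
    ("@&t@", ""),
]


def expand_quadrigraphs(text: str) -> str:
    """Apply each substitution globally, one after the other."""
    for quadrigraph, literal in QUADRIGRAPHS:
        text = text.replace(quadrigraph, literal)
    return text


if __name__ == "__main__":
    # Hypothetical skeleton output in which "$" was written as "@S|@".
    print(expand_quadrigraphs("@S|@is = @S|@this->yylex@{:@@:}@;"))
```

The idea is that a skeleton would spell "$" as "@S|@" so that m4 never
sees a dollar sign, and the final pass restores it in the output.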
Actually, the real problem lay elsewhere as Bison expected
"type identifier" as the argument to %lex-param and just
silently accepted, but "mistreated" "$identifier" :-).  What
do you think of the patch:

| diff --git a/data/php.m4 b/data/php.m4
| index 7095d8a..6ab2600 100644
| --- a/data/php.m4
| +++ b/data/php.m4
| @@ -262,7 +262,9 @@ m4_define([b4_lex_param_call],
|  [$1])])
|  m4_define([b4_param_calls],
|  [m4_map([b4_param_call], [$@])])
| -m4_define([b4_param_call], [, $2])
| +# FIXME: This should probably better be dealt with in parse-gram.y's
| +# add_param ().
| +m4_define([b4_param_call], [, m4_bpatsubst($1, [^.* ], [])])

that I posted in
<URI:news:[email protected]>?

I would strongly disagree with your assumption that Bison
isn't ready for this, though.  The code at
<URI:https://github.com/scfc/bison-php> is already working
for simple examples.  The only real showstopper yet - on the
C side - is the use of "$variables" in actions for which I
have written a patch that I will post for discussion in the
next few days (which reminds me that I still have to reply
to Bruno on bug-gnulib :-().

Tim
