The ? quantifier is harder than it may seem, because it pairs a nulling
rule with a regular rule, and that pairing has all sorts of tricky
aspects. Star rules (A ::= B*) already do this, but most of the
tricky stuff is handled at a low level, in the Libmarpa C code. So,
to support ?, I would either have to hack the inner part of Marpa, or
else execute the same logic twice, once at the Perl level and once at
the C level, with all the interaction issues that implies.
It would be straightforward to do this in a front-end/wrapper to the
SLIF, and several people have aired the idea of doing this, but so far
nobody has taken it on.
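To make the wrapper idea concrete, here is a minimal sketch of a
preprocessor that rewrites "sym?" into a helper symbol plus the two
rules mentioned above (one regular, one empty). Everything in it is an
assumption made for illustration: the name expand_optional, the "_opt"
suffix, and the choice of ::= for the generated rules are not part of
Marpa, and the output has not been run through Marpa here.

```perl
#!/usr/bin/env perl
# Hypothetical front-end sketch: expand "sym?" in SLIF-style grammar text.
# expand_optional() and the "_opt" suffix are illustrative, not Marpa API.
use strict;
use warnings;

sub expand_optional {
    my ($grammar) = @_;
    my %seen;     # symbols that appeared with a trailing '?'
    my @order;    # the same symbols, in first-seen order

    # Replace each identifier followed by '?' with "<identifier>_opt".
    # This naive pattern does not skip quoted strings, so a literal
    # such as 'why?' would be rewritten too.
    $grammar =~ s{\b([A-Za-z_]\w*)\?}{
        push @order, $1 unless $seen{$1}++;
        "${1}_opt"
    }gex;

    # Append the rule pair for each optional symbol.
    for my $sym (@order) {
        $grammar .= "${sym}_opt ::= $sym\n";    # the regular rule
        $grammar .= "${sym}_opt ::=\n";         # the empty (nulling) rule
    }
    return $grammar;
}

my $input = <<'END';
duration ::= 'P' year? month?
year ::= posnum 'Y'
month ::= posnum 'M'
END

print expand_optional($input);
```

The sketch assumes the grammar text ends with a newline and that no
existing symbol is already named with the "_opt" suffix. A real
front-end would need a proper tokenizer rather than a single regex.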
-- jeffrey
On 05/13/2014 07:50 AM, Steven Haryanto wrote:
Greetings Jeffrey and all,
I'm just getting started with Marpa::R2::Scanless. I have high hopes
for what I can do with Marpa:
* (currently in progress) migrate Language::Expr away from
Regexp::Grammars. With RG there are lots of problems: limitations when
writing grammars (e.g. having to avoid left recursion), exponential
parsing time as the length of the input string increases, the whole
debacle of failing to run under Perl 5.18, limited error
messages/diagnostics, reentrancy problems (can't use regex matching in
action code), etc.
* migrate Org::Parser away from parsing with regexes, with the hope of
speeding up the parsing (and improving the readability of the parser
code :-) ). Parsing my 400KB (8000 lines) todo.org file takes about
0.8-1s on my Core i7-4770 PC (and probably a couple of seconds on my
Core i5 laptop). I wish the time could go down to 0.1-0.2s or less.
* write a markdown parser and markdown-to-POD converter. The current
Markdown::POD module uses Markdent which is Moose-based and has a
heavy startup cost, about 0.4s on a fast computer and 1+s on a rather
slow one, which is annoying for command-line scripts. It also has
trouble parsing _ (emphasis), causing text like 'some_identifier and
another_identifier' to be converted to POD 'someI<identifier and
another>identifier'.
* rewrite my Ledger::Parser to use Marpa and increase its compliance
and feature support.
* write more parsers and converters for other formats, which I so far
haven't done because the only tools I had at hand were Perl regexes
and Regexp::Grammars.
For now I'm playing and exercising with some simple grammars. I was
wondering whether Marpa BNF can (or will) support the "zero-or-one"
quantifier ?, as commonly found in regexes. This is convenient when
stating a list of things that are optional but need to be in order.
For example, consider the case of parsing an ISO 8601 duration (I
apologize in advance for using MarpaX::Simple; it's just a thin
wrapper to keep things as simple and as short as possible):
----------------------------
#!/usr/bin/env perl
# parses ISO 8601 duration literal
use 5.010;
use MarpaX::Simple qw(gen_parser);
my $parser = gen_parser(
    grammar => <<'_',
:start ::= duration_literal
duration_literal ~ 'P' year_opt month_opt week_opt day_opt
                 | 'P' year_opt month_opt week_opt day_opt 'T' hour_opt minute_opt second_opt
year_opt ~ posnum 'Y'
year_opt ~
month_opt ~ posnum 'M'
month_opt ~
week_opt ~ posnum 'W'
week_opt ~
day_opt ~ posnum 'D'
day_opt ~
hour_opt ~ posnum 'H'
hour_opt ~
minute_opt ~ posnum 'M'
minute_opt ~
second_opt ~ posnum 'S'
second_opt ~
posnum ~ digits
       | digits '.' digits
digits ~ [0-9]+
_
);
$parser->('P');
$parser->('P1Y');
$parser->('P2M');
$parser->('P2MT2M');
---------------------
It would be nice if I could write (or can I?) something like this, as
in a regexp:
---------------------
duration_literal ~ 'P' year? month? week? day?
                 | 'P' year? month? week? day? 'T' hour? minute? second?
year ~ posnum 'Y'
month ~ posnum 'M'
# and so on
---------------------
Expect more (stupid) questions from me :-)
Regards,
Steven
--
You received this message because you are subscribed to the Google
Groups "marpa parser" group.
To unsubscribe from this group and stop receiving emails from it, send
an email to [email protected]
<mailto:[email protected]>.
For more options, visit https://groups.google.com/d/optout.