? quantifier like in regexp

Steven Haryanto Tue, 13 May 2014 07:50:30 -0700

Greetings Jeffrey and all,

I'm just getting started with Marpa::R2::Scanless, and I have high hopes 
for what I can do with Marpa:


* (currently in progress) migrate Language::Expr away from 
Regexp::Grammars. With RG there are lots of problems: limitations when 
writing the grammar (e.g. having to avoid left recursion), exponential 
parsing time as the length of the input string increases, the whole debacle 
of failing to run under Perl 5.18, limited error messages/diagnostics, 
reentrancy problems (can't use regex matching in action code), etc.

* migrate Org::Parser away from parsing with regexes, in the hope of 
speeding up the parsing (and improving the readability of the parser 
code :-) ). Parsing my 400KB (8000 lines) todo.org file takes about 0.8-1s 
on my Core i7-4770 PC (and probably a couple of seconds on my Core i5 
laptop); I'd like to bring that down to around 0.1-0.2s.

* write a Markdown parser and Markdown-to-POD converter. The current 
Markdown::POD module uses Markdent, which is Moose-based and has a heavy 
startup cost: about 0.4s on a fast computer and 1+s on a rather slow one, 
which is annoying for command-line scripts. It also has trouble parsing _ 
(emphasis), causing text like 'some_identifier and another_identifier' to 
be converted to the POD 'someI<identifier and another>identifier'.

* rewrite my Ledger::Parser to use Marpa and increase its compliance and 
feature support.

* write more parsers and converters for other formats, which I haven't done 
so far because the only tools I had at hand were Perl regexes and 
Regexp::Grammars.

For now I'm experimenting with some simple grammars. I was wondering 
whether Marpa's BNF supports (or will support) the "zero-or-one" quantifier 
?, as commonly found in regexes. It is convenient for stating a list of 
things that are each optional but must appear in a fixed order. For 
example, consider parsing an ISO 8601 duration (I apologize in advance for 
using MarpaX::Simple, it's just a thin wrapper to keep things as simple and 
as short as possible):

----------------------------
#!/usr/bin/env perl

# parses ISO 8601 duration literal

use 5.010;
use MarpaX::Simple qw(gen_parser);

my $parser = gen_parser(
grammar => <<'_',
:start ::= duration_literal

duration_literal ~ 'P' year_opt month_opt week_opt day_opt
    | 'P' year_opt month_opt week_opt day_opt 'T' hour_opt minute_opt second_opt

year_opt ~ posnum 'Y'
year_opt ~
month_opt ~ posnum 'M'
month_opt ~
week_opt ~ posnum 'W'
week_opt ~
day_opt ~ posnum 'D'
day_opt ~
hour_opt ~ posnum 'H'
hour_opt ~
minute_opt ~ posnum 'M'
minute_opt ~
second_opt ~ posnum 'S'
second_opt ~

posnum ~ digits
    | digits '.' digits
digits ~ [0-9]+
_
);

$parser->('P');
$parser->('P1Y');
$parser->('P2M');
$parser->('P2MT2M');
---------------------

It would be nice if I could write something like this instead, as in a 
regexp (or can I already?):

---------------------
duration_literal ~ 'P' year? month? week? day?
    | 'P' year? month? week? day? 'T' hour? minute? second?

year ~ posnum 'Y'
month ~ posnum 'M'
# and so on
---------------------
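
In case it helps to show what I mean, here is a rough sketch of the 
rewriting I have in mind, done as a plain string preprocessor in Perl. 
Everything in it is my own assumption: expand_optional is a made-up name 
(not a Marpa or MarpaX::Simple function), the name_opt naming just follows 
my grammar above, and I have not checked how Marpa treats the generated 
rules. It also blindly rewrites any "word?" it sees, including inside 
quoted strings and comments.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper, not part of Marpa: rewrite every "name?" in a
# grammar string to "name_opt", then append the two rules that make
# name_opt optional. $op is the rule operator ('~' by default).
sub expand_optional {
    my ($grammar, $op) = @_;
    $op //= '~';

    my (%seen, @order);

    # Replace each "name?" and remember the names in first-seen order.
    $grammar =~ s{\b([A-Za-z_]\w*)\?}{
        push @order, $1 unless $seen{$1}++;
        $1 . '_opt'
    }gex;

    # Make sure the appended rules start on a fresh line.
    $grammar .= "\n" unless $grammar =~ /\n\z/;

    # For each name: one rule that matches it, one empty rule.
    my $extra = join '', map { "${_}_opt $op $_\n${_}_opt $op\n" } @order;

    return $grammar . $extra;
}

my $src = <<'_';
duration_literal ~ 'P' year? month? week? day?
    | 'P' year? month? week? day? 'T' hour? minute? second?

year ~ posnum 'Y'
month ~ posnum 'M'
_

# Print the expanded grammar text.
print expand_optional($src);
```

The output is ordinary grammar text in the same style as my first example, 
so it could in principle be passed to gen_parser as the grammar string. 
Built-in support would of course be much nicer than a hack like this.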

Expect more (stupid) questions from me :-)

Regards,
Steven

