As a quick guess, you might want to use actions instead of events. As
Andrew Rodland pointed out in his recent YAPC talk, triggering only on
actual matches is a feature of actions. Events happen while the parse
is proceeding, but actions only occur once the parse is complete. At
the point it calls actions, Marpa knows what will wind up in the parse
and what will not, and it will not call the action for anything that is
not part of the actual parse.
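Here is a minimal sketch of what that could look like. It assumes a toy grammar (plain numbers joined by '=' instead of MathML) and a hypothetical semantics package called My_Actions. The action names chain, pair and arg are made up for illustration and are not part of your grammar.

```perl
use strict;
use warnings;
use Marpa::R2;

# Toy stand-in for the MathML grammar: an equality chain such as "1=2=3".
# Every name below is illustrative only.
my $dsl = <<'END_OF_DSL';
:default ::= action => ::first
lexeme default = latm => 1
:start ::= Equal
Equal ::= argRule '=' Equal    action => chain
        | argRule '=' argRule  action => pair
argRule ::= Expression         action => arg
Expression ::= number
number ~ [\d]+
END_OF_DSL

# Actions run only during evaluation, after the parse is complete,
# so they see only the rule instances that are part of the final parse.
# The first argument of each action is the per-parse object (unused here).
sub My_Actions::arg   { my ( undef, $expr ) = @_; return $expr; }
sub My_Actions::pair  { my ( undef, $left, undef, $right ) = @_; return [ $left, $right ]; }
sub My_Actions::chain { my ( undef, $left, undef, $rest )  = @_; return [ $left, @{$rest} ]; }

my $grammar = Marpa::R2::Scanless::G->new( { source => \$dsl } );
my $recce   = Marpa::R2::Scanless::R->new(
    { grammar => $grammar, semantics_package => 'My_Actions' } );

my $input = '1=2=3';
$recce->read( \$input );

# value() returns a reference to the parse result, or undef if there is no parse.
my $value_ref = $recce->value();
die "No parse\n" if not defined $value_ref;

# The result is an array ref holding one entry per argument of the chain.
print join( ', ', @{ ${$value_ref} } ), "\n";
```

If you also need positions, the SLIF array descriptor actions (for example action => [start,length,values]) may be worth a look, but I have not tried them against your grammar.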
I hope this is helpful, jeffrey
On 06/26/2014 04:10 AM, Ion Toloaca wrote:
Hello everyone,
I am trying to use Marpa to parse mathematical formulas (MathML) and
extract the relevant arguments.
For that I trigger events whenever an argument gets matched. It
actually works fine, but there is one problem: I get too many matches.
The argument rule gets completed many times, but only sometimes does it
actually get matched as part of a formula. This is exactly the point
where I'm stuck: I need a way to discard the argument-rule matches that
never become part of an actual formula match.
Here is a simplified version of what I'm working on:
(Notation means a mathematical formula; Presentation means MathML content.)
/////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
#HERE THE STRUCTURE OF THE GRAMMAR IS DEFINED
#:default ::= action => getString
lexeme default = latm => 1
:start ::= Expression
ExpressionList ::= Expression ExpressionList
| Expression
Expression ::= Notation
|| Presentation
#Presentation MathML
Presentation ::= mrowB ExpressionList mrowE
| moB '(' moE ExpressionList moB ')' moE
| moB text moE
| miB text miE
| mnB text mnE
| mtextB text mtextE
| msB text msE
| mfracB ExpressionList mfracE
| msqrtB Expression msqrtE
| msupB ExpressionList msupE
| msubB ExpressionList msubE
| msubsupB ExpressionList msubsupE
| munderB ExpressionList munderE
| moverB ExpressionList moverE
| munderoverB ExpressionList munderoverE
| mtdB ExpressionList mtdE
| mtrB ExpressionList mtrE
| mtableB ExpressionList mtableE
| mathB ExpressionList mathE
| emB ExpressionList emE
| mstyleB ExpressionList mstyleE
| mspaceB mspaceE
| miSingle
mfracB ::= ws '<mfrac' attribs '>' ws
mfracE ::= ws '</mfrac>' ws
msqrtB ::= ws '<msqrt' attribs '>' ws
msqrtE ::= ws '</msqrt>' ws
msupB ::= ws '<msup' attribs '>' ws
msupE ::= ws '</msup>' ws
msubB ::= ws '<msub' attribs '>' ws
msubE ::= ws '</msub>' ws
munderB ::= ws '<munder' attribs '>' ws
munderE ::= ws '</munder>' ws
moverB ::= ws '<mover' attribs '>' ws
moverE ::= ws '</mover>' ws
mnB ::= ws '<mn' attribs '>' ws
mnE ::= ws '</mn>' ws
miB ::= ws '<mi' attribs '>' ws
miE ::= ws '</mi>' ws
msB ::= ws '<ms' attribs '>' ws
msE ::= ws '</ms>' ws
mspaceB ::= ws '<mspace' attribs '>' ws
mspaceE ::= ws '</mspace>' ws
moB ::= ws '<mo' attribs '>' ws
moE ::= ws '</mo>' ws
mstyleB ::= ws '<mstyle' attribs '>' ws
mstyleE ::= ws '</mstyle>' ws
mtextB ::= ws '<mtext' attribs '>' ws
mtextE ::= ws '</mtext>' ws
emB ::= ws '<em' attribs '>' ws
emE ::= ws '</em>' ws
mtdB ::= ws '<mtd' attribs '>' ws
mtdE ::= ws '</mtd>' ws
mtrB ::= ws '<mtr' attribs '>' ws
mtrE ::= ws '</mtr>' ws
mtableB ::= ws '<mtable' attribs '>' ws
mtableE ::= ws '</mtable>' ws
msubsupB ::= ws '<msubsup' attribs '>' ws
msubsupE ::= ws '</msubsup>' ws
munderoverB ::= ws '<munderover' attribs '>' ws
munderoverE ::= ws '</munderover>' ws
mrowB ::= ws '<mrow' attribs '>' ws
mrowE ::= ws '</mrow>' ws
mathB ::= ws '<math' attribs '>' ws action => getString
mathE ::= ws '</math>' ws
miSingle ::= ws '<mi' attribs '/>' ws
ws ::= spaces action => getNothing
ws ::= action => getNothing # empty
spaces ~ space+
space ~ [\s]
attribs ::= ws || attrib || attrib attribs
attrib ::= ws notEqSignS '=' ws '"' notQuoteS '"' ws
notEqSignS ~ notEqSign+
notEqSign ~ [^=<>/]
notQuoteS ~ notQuote+
notQuote ~ [^"]
text ~ char+
char ~ [^<>]
#ARGUMENT RULE
argRule::= Expression
Notation::=_equal_eqN210
#HERE WE HAVE AN ACTUAL FORMULA - EQUAL WITH ARGUMENTS 'A=B=C=D'
_equal_eqN210::= rule252
# I had to give 3 different names to the argument rules because Marpa
# doesn't handle consecutive matches of the same rule the way I want
# (it returns just the longest match when ->last_completed() is called)
rule252::= argRuleN210A1Seq1 rule53 rule252
| argRuleN210A1Seq2 rule53 argRuleN210A1Seq3
argRuleN210A1Seq1::= argRule
argRuleN210A1Seq2::= argRule
argRuleN210A1Seq3::= argRule
rule53::= moB '=' moE
event 'rule252' = completed rule252
event 'rule53_C' = completed rule53
event 'argRuleN210A1Seq1' = completed argRuleN210A1Seq1
event 'argRuleN210A1Seq2' = completed argRuleN210A1Seq2
event 'argRuleN210A1Seq3' = completed argRuleN210A1Seq3
event '_equal_eqN210_C' = completed _equal_eqN210
# event '_equal_eqN210_P' = predicted _equal_eqN210
# (not sure how to use it, or whether I need it at all)
///////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
EXAMPLE OF RESULTS:
On the input '<math><mn>1</mn><mo>=</mo><mn>2</mn></math>' the
following events are triggered:
Event                Begin  End  MathML content                               Verdict
'argRuleN210A1Seq1'  7      17   <mn>1</mn>                                   USELESS
'argRuleN210A1Seq2'  7      17   <mn>1</mn>                                   PERFECT
'argRuleN210A1Seq1'  17     27   <mo>=</mo>                                   USELESS
'rule53_C'           17     27   <mo>=</mo>                                   PERFECT
'argRuleN210A1Seq2'  17     27   <mo>=</mo>                                   USELESS
'_equal_eqN210_C'    7      37   <mn>1</mn><mo>=</mo><mn>2</mn>               PERFECT
'rule252'            27     37   <mn>2</mn>                                   USELESS
'argRuleN210A1Seq1'  7      37   <mn>1</mn><mo>=</mo><mn>2</mn>               USELESS
'argRuleN210A1Seq2'  7      37   <mn>1</mn><mo>=</mo><mn>2</mn>               USELESS
'argRuleN210A1Seq3'  27     37   <mn>2</mn>                                   PERFECT
'argRuleN210A1Seq1'  1      44   <math><mn>1</mn><mo>=</mo><mn>2</mn></math>  USELESS (but out of range anyway)
'argRuleN210A1Seq2'  1      44   <math><mn>1</mn><mo>=</mo><mn>2</mn></math>  USELESS (but out of range anyway)
To sum up: How can I only get the 'good' matches (that are eventually
part of a notation match) for arguments and leave aside the other
matches?
Thank you in advance for your help.
Best regards,
Toloaca Ion
--
You received this message because you are subscribed to the Google
Groups "marpa parser" group.
To unsubscribe from this group and stop receiving emails from it, send
an email to [email protected]
<mailto:[email protected]>.
For more options, visit https://groups.google.com/d/optout.