I have a natural-language tokenizer that returns a stream of tokens of the form ['word', 'POS'], where POS is the part of speech. For the sake of argument, the stream might look like (['The', 'art'], ['dog', 'n'], ['barked', 'v'], ['.', 'P']). What I'd like to do is define a grammar for my target language and parse an incoming sequence of these tokens. I really want to use Marpa here because parts of speech can easily be ambiguous at the word level, and I'd like Marpa to do the disambiguation.
There's an example gist at https://gist.github.com/wki/1511584 that illustrates how to use the old XS interface with an external tokenizer, but I see no way to translate that technique into the R2 paradigm. Where do I start?
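For concreteness, here's roughly what I have in mind after reading the Marpa::R2 docs on the SLIF's external scanning methods (a zero-length read(), then lexeme_alternative() / lexeme_complete()). The toy grammar, the POS-to-lexeme mapping, and the never-matching "unicorn" lexeme pattern are all my own guesses, so corrections are welcome:

use strict;
use warnings;
use Marpa::R2;

# Toy grammar. All lexemes come from the external tokenizer,
# so each is tied to a pattern that can never match anything
# ([^\s\S] matches no character at all).
my $dsl = <<'END_OF_DSL';
:default ::= action => [name,values]
sentence ::= NP VP period
NP       ::= article noun
VP       ::= verb
article  ~ unicorn
noun     ~ unicorn
verb     ~ unicorn
period   ~ unicorn
unicorn  ~ [^\s\S]
END_OF_DSL

my $grammar = Marpa::R2::Scanless::G->new( { source => \$dsl } );
my $recce   = Marpa::R2::Scanless::R->new( { grammar => $grammar } );

# Output of the external tokenizer; a word may carry more than
# one candidate part of speech.
my @tokens = (
    [ 'The',    'art' ],
    [ 'dog',    'n'   ],
    [ 'barked', 'v'   ],
    [ '.',      'P'   ],
);

# Map the tokenizer's POS tags to lexeme names in the grammar.
my %lexeme_for_pos =
    ( art => 'article', n => 'noun', v => 'verb', P => 'period' );

# lexeme_complete() wants spans in an input string, so build a
# dummy string that the token positions can refer to.
my $input = join ' ', map { $_->[0] } @tokens;
$recce->read( \$input, 0, 0 );    # zero-length read: external scanning only

my $search_from = 0;
for my $token (@tokens) {
    my ( $word, @pos_tags ) = @{$token};
    my $start  = index $input, $word, $search_from;
    my $length = length $word;

    # Offer every candidate POS at this position; Marpa rejects
    # the alternatives the grammar cannot use here.
    for my $tag (@pos_tags) {
        $recce->lexeme_alternative( $lexeme_for_pos{$tag}, $word )
            // warn "grammar rejected '$word' as $lexeme_for_pos{$tag}\n";
    }
    $recce->lexeme_complete( $start, $length );
    $search_from = $start + $length;
}

my $value_ref = $recce->value();
die "No parse\n" if not defined $value_ref;

In particular, I'm unsure whether lexeme_alternative() is the right way to express word-level POS ambiguity (e.g. feeding both 'n' and 'v' for "bark"), or whether I should be working at a lower level instead.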
