The question is: 'How do we make sure that we never get a syntax error?' (e.g. really robust glue rules). Here are my thoughts.
The most common problems are shift/reduce and reduce/reduce conflicts (strictly speaking these are grammar conflicts rather than runtime syntax errors, but they are what makes the parser choke). I had a brief look at the GNU Bison documentation, and I think the method described there (GLR parsing) is worth trying: when a conflict occurs, the parser splits into several parsers, one for each possible shift or reduction, so each branch can proceed as usual until the ambiguity is resolved.

I have another thought as well: we could adopt a statistical model, much like the Hidden Markov Model used in the POS tagger. We would train the parser on a corpus and, at each conflict, choose the most appropriate action based on the training results. This idea is still not very specific so far.

To make the syntax rules more robust, I think we can provide rules that include a special 'error' token in their context. This token acts as a marker for error recovery: we predefine it in the transfer module, and when an error is encountered the transfer skips over it instead of failing. This method does not eliminate errors, because it does not change the original rules; however, since the rules are written by hand, we can correct them once an error is reported.

These are my ideas so far. I am not sure whether I am going in the right direction, and I would appreciate some guidance and advice if anyone has time.

_______________________________________________
Apertium-stuff mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/apertium-stuff
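To make the GLR idea concrete: as I understand the Bison manual, instead of choosing one side of a conflict, the parser forks and pursues every alternative in parallel. Here is a toy breadth-first sketch of that "split on conflict" behaviour — not Bison itself, just a hypothetical transition table where one (state, token) pair allows two moves:

```python
def glr_like_parse(tokens, table, start, accept):
    """Breadth-first simulation of GLR-style forking.

    `table` maps (state, token) -> list of next states; more than one
    entry for a key models a shift/reduce-style conflict.  Instead of
    choosing one action, we fork: every alternative is pursued in
    parallel, and a branch that hits a dead end is silently dropped.
    """
    branches = {start}
    for tok in tokens:
        branches = {nxt for st in branches
                    for nxt in table.get((st, tok), [])}
        if not branches:
            return False          # every branch died: a real syntax error
    return accept in branches     # at least one branch reached acceptance

# Hypothetical conflicted table: in state 0, token 'a' allows two moves.
table = {
    (0, 'a'): [1, 2],   # the "conflict": fork into states 1 and 2
    (1, 'b'): [3],
    (2, 'c'): [3],
}

print(glr_like_parse(['a', 'b'], table, 0, 3))  # True  (branch via state 1)
print(glr_like_parse(['a', 'c'], table, 0, 3))  # True  (branch via state 2)
print(glr_like_parse(['a', 'a'], table, 0, 3))  # False (all branches die)
```

The point is that the conflict itself never aborts the parse; only input that kills every branch does.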
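The statistical idea is admittedly vague, but a minimal baseline might look like this: count, in a hand-disambiguated corpus, which action turned out to be correct in each conflict context, and pick the most frequent one at run time. This is only a most-frequent-action baseline, not a full HMM, and the states, tags, and counts below are entirely made up for illustration:

```python
from collections import Counter

# Hypothetical training counts: for each conflict context
# (parser state, lookahead tag), how often each action was correct.
corpus_counts = {
    (5, 'NOUN'): Counter({'shift': 40, 'reduce_NP': 9}),
    (5, 'VERB'): Counter({'reduce_NP': 31, 'shift': 4}),
}

def best_action(state, lookahead):
    """Pick the action most frequently correct in the training data;
    return None for unseen contexts so the caller can fall back to a
    default (e.g. Bison's usual preference for shifting)."""
    counts = corpus_counts.get((state, lookahead))
    if not counts:
        return None
    return counts.most_common(1)[0][0]

print(best_action(5, 'NOUN'))   # shift
print(best_action(5, 'VERB'))   # reduce_NP
print(best_action(9, 'ADJ'))    # None (unseen context)
```

A real HMM-style model would additionally condition on the previous decisions (as the POS tagger does over tag sequences), but even this frequency table would resolve conflicts deterministically instead of arbitrarily.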
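Bison's 'error' token implements what is usually called panic-mode recovery: discard input until a synchronisation point, then resume, so one bad stretch never aborts the whole input. The same idea can be sketched outside Bison; the sync tokens and the stand-in chunk parser below are hypothetical:

```python
SYNC = {'.', ';'}   # hypothetical synchronisation tokens

def parse_chunk(chunk):
    """Stand-in for the real rule matcher: rejects chunks with '?'."""
    if '?' in chunk:
        raise ValueError('no rule matches')
    return ' '.join(chunk)

def parse_with_recovery(tokens):
    """Panic-mode recovery in the spirit of Bison's 'error' token:
    when a chunk fails, discard it up to the next sync token and
    carry on, instead of aborting the whole input."""
    out, chunk = [], []
    for tok in tokens:
        if tok in SYNC:
            try:
                out.append(parse_chunk(chunk))
            except ValueError:
                out.append('<error>')  # recovered; later chunks still parse
            chunk = []
        else:
            chunk.append(tok)
    return out

print(parse_with_recovery(['a', 'b', '.', 'x', '?', '.', 'c', '.']))
# ['a b', '<error>', 'c']
```

As noted above, this does not eliminate the error — the '<error>' marker is exactly the report that tells us which hand-written rule to fix.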
