What about PathParser?

A parser has two parts - a tokenizer and a grammar.

Paths are composite - they have their own internal structure, and their own grammar - so they are not tokens. Compare expressions, which are composite in the same way.

Adding "/" as a token makes sense (if you look at the code you will see that it is simply missing, as are a few others).
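A minimal sketch of the tokenizer/grammar split, using the "<x:one>/<x:two>" example from the question. This is not Jena code: PathSketch, Tok, Kind, tokenize and parsePath are hypothetical names made up for illustration, and the grammar covers only simple sequence paths. The tokenizer produces flat tokens (an IRI, a "/"); the path itself only appears once a grammar is applied over those tokens.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: none of these names are Jena classes.
final class PathSketch {

    enum Kind { IRI, SLASH }

    static final class Tok {
        final Kind kind;
        final String text;
        Tok(Kind kind, String text) { this.kind = kind; this.text = text; }
    }

    // Tokenizer: flat, no nesting. It only knows "<...>" and "/".
    static List<Tok> tokenize(String s) {
        List<Tok> out = new ArrayList<>();
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;
            } else if (c == '/') {
                out.add(new Tok(Kind.SLASH, "/"));
                i++;
            } else if (c == '<') {
                // Scan to the closing '>' so a '/' inside an IRI is not a SLASH token.
                int end = s.indexOf('>', i);
                if (end < 0) throw new IllegalArgumentException("Unterminated IRI at " + i);
                out.add(new Tok(Kind.IRI, s.substring(i + 1, end)));
                i = end + 1;
            } else {
                throw new IllegalArgumentException("Unexpected character '" + c + "' at " + i);
            }
        }
        return out;
    }

    // Grammar over the tokens: path ::= IRI ( '/' IRI )*
    // Returns the steps of the sequence path in order.
    static List<String> parsePath(String s) {
        List<Tok> toks = tokenize(s);
        List<String> steps = new ArrayList<>();
        int pos = 0;
        steps.add(expectIri(toks, pos++));
        while (pos < toks.size()) {
            if (toks.get(pos).kind != Kind.SLASH)
                throw new IllegalArgumentException("Expected '/' at token " + pos);
            pos++;
            steps.add(expectIri(toks, pos++));
        }
        return steps;
    }

    private static String expectIri(List<Tok> toks, int pos) {
        if (pos >= toks.size() || toks.get(pos).kind != Kind.IRI)
            throw new IllegalArgumentException("Expected IRI at token " + pos);
        return toks.get(pos).text;
    }
}
```

Calling PathSketch.parsePath("<x:one>/<x:two>") yields the two steps x:one and x:two. Real SPARQL property paths also have alternatives, inverses, repetition, negation and grouping, which is where a hand-written grammar like this stops being manageable.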

If you want a parser with a grammar, use JavaCC. We already have one for paths and it's wrapped up in PathParser. It calls into the ARQ parser and returns a Path.

The same can be done for any part of SPARQL - call the SPARQL parser.

TokenizerText is not a general tokenizer - it does not do any common prefix matching except in certain necessary cases for Turtle. It is carefully constructed around that use case for speed.

Handcoded parsers quickly get out of control. It's borderline for Turtle which has a very simple grammar over the tokens.

For a 2x speed-up, it seemed worth it.

        Andy

On 03/01/17 20:32, Claude Warren wrote:
Should the TokenizerText parser be extended to parse paths?

so it could parse something like "<x:one>/<x:two>"

This would involve adding Path to Token as well as some other changes, but
does it make sense?

Claude
