On 14/05/15 20:27, Martynas Jusevičius wrote:
Andy,

I took a crack at it:
https://github.com/Graphity/graphity-core/blob/master/src/main/java/org/graphity/core/riot/lang/RDFPostReader.java
https://github.com/Graphity/graphity-core/blob/master/src/main/java/org/graphity/core/riot/lang/TokenizerText.java

TokenizerRDFPost

I'd drop the "extends TokenizerText", or at least write an AbstractTokenizerText with the machinery you want and an
"abstract protected Token parseToken" method.

Throw out all unused code so it won't accidentally get in the way in the future.

(If you do this, please contribute it - it would be useful and maybe should have been done originally if it makes no speed difference.)
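
A minimal sketch of that split, assuming nothing about the real Jena or Graphity classes: SimpleToken, AbstractTokenizerSketch and FormTokenizerSketch are hypothetical names, and the toy "key=value&key=value" input is only for illustration. The base class keeps the shared iterator machinery and the subclass supplies only parseToken.

```java
import java.io.IOException;
import java.io.PushbackReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical stand-in for a token; this is not the Jena Token class.
final class SimpleToken {
    final String type;
    final String image;

    SimpleToken(String type, String image) {
        this.type = type;
        this.image = image;
    }

    @Override
    public String toString() {
        return type + ":" + image;
    }
}

// Shared machinery: buffering of the next token and end-of-input handling.
abstract class AbstractTokenizerSketch implements Iterator<SimpleToken> {
    protected final PushbackReader reader;   // allows one character of pushback
    private SimpleToken next;
    private boolean finished;

    protected AbstractTokenizerSketch(Reader in) {
        this.reader = new PushbackReader(in, 1);
    }

    // The only method a concrete tokenizer has to write.
    // Returns the next token, or null at end of input.
    protected abstract SimpleToken parseToken() throws IOException;

    @Override
    public boolean hasNext() {
        if (finished) return false;
        if (next != null) return true;
        try {
            next = parseToken();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        if (next == null) finished = true;
        return next != null;
    }

    @Override
    public SimpleToken next() {
        if (!hasNext()) throw new NoSuchElementException();
        SimpleToken t = next;
        next = null;
        return t;
    }
}

// Toy subclass: words separated by '=' and '&', with the delimiters as tokens.
final class FormTokenizerSketch extends AbstractTokenizerSketch {
    FormTokenizerSketch(Reader in) {
        super(in);
    }

    @Override
    protected SimpleToken parseToken() throws IOException {
        int ch = reader.read();
        if (ch == -1) return null;
        if (ch == '=') return new SimpleToken("EQ", "=");
        if (ch == '&') return new SimpleToken("AMP", "&");
        StringBuilder sb = new StringBuilder();
        while (ch != -1 && ch != '=' && ch != '&') {
            sb.append((char) ch);
            ch = reader.read();
        }
        if (ch != -1) reader.unread(ch);     // leave the delimiter for the next call
        return new SimpleToken("WORD", sb.toString());
    }
}
```

Wrapping a java.io.StringReader over "su=a&pu=b" in FormTokenizerSketch and iterating yields a WORD, EQ, WORD, AMP, WORD, EQ, WORD sequence. With this shape, no unused parsing code from the parent class can be reached by accident.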



It was surely one of the more labor-intensive pieces of code in a while...

That means you are on the right track! When a parser isn't tedious it is either not helpful or slow :-)


Works with the example from RDF/POST spec, but I need to do more
testing. Probably could be more DRY as well. If you have some advice,
please let me know.

For grammars and tokenizers, comprehensive testing of each pays big rewards. There is not much worse than chasing bugs when the core machinery is not doing the right thing. Tests pin that down and make you think of every case that can come up.
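
One way to organise such tests is a table of input/expected pairs, one per input shape, including the awkward ones (empty input, trailing delimiter, repeated delimiters). The sketch below is plain Java with no test framework. TokenizerCases and its toy tokenize method are hypothetical and stand in for the real tokenizer under test.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class TokenizerCases {
    // Toy tokenizer under test (hypothetical): splits on '=' and '&'
    // and keeps the delimiters as tokens of their own.
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        StringBuilder word = new StringBuilder();
        for (int i = 0; i < input.length(); i++) {
            char ch = input.charAt(i);
            if (ch == '=' || ch == '&') {
                if (word.length() > 0) {
                    tokens.add(word.toString());
                    word.setLength(0);
                }
                tokens.add(String.valueOf(ch));
            } else {
                word.append(ch);
            }
        }
        if (word.length() > 0) tokens.add(word.toString());
        return tokens;
    }

    // One entry per case that can come up, edge cases included.
    static Map<String, List<String>> cases() {
        Map<String, List<String>> c = new LinkedHashMap<>();
        c.put("", Arrays.<String>asList());
        c.put("a", Arrays.asList("a"));
        c.put("a=b", Arrays.asList("a", "=", "b"));
        c.put("a=", Arrays.asList("a", "="));
        c.put("=b", Arrays.asList("=", "b"));
        c.put("&&", Arrays.asList("&", "&"));
        c.put("a=b&c=d", Arrays.asList("a", "=", "b", "&", "c", "=", "d"));
        return c;
    }

    // Runs every case and returns a description of each failure;
    // an empty list means all cases passed.
    static List<String> runAll() {
        List<String> failures = new ArrayList<>();
        for (Map.Entry<String, List<String>> e : cases().entrySet()) {
            List<String> actual = tokenize(e.getKey());
            if (!actual.equals(e.getValue())) {
                failures.add("input '" + e.getKey() + "': expected "
                        + e.getValue() + " but got " + actual);
            }
        }
        return failures;
    }
}
```

Adding a new case is one line in cases(), which keeps the cost of covering another corner of the grammar low.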

For speed, the tokenizer is more likely to be the bottleneck. PeekReader should give reasonable (for Java) I/O speed for one-character lookahead tokenizing.
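
To illustrate what one-character lookahead over buffered input looks like, here is a small sketch. OneCharLookahead is a hypothetical stand-in written for this example; it is not PeekReader and does not claim to mirror its API. The idea is that peek() and read() work on a char array filled in large chunks, so the tokenizer rarely touches the underlying Reader.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;

// Hypothetical stand-in for a peeking reader; not Jena's PeekReader.
final class OneCharLookahead {
    private final Reader in;
    private final char[] buf = new char[8192];  // filled in bulk from the Reader
    private int pos = 0;
    private int len = 0;
    private boolean eof = false;

    OneCharLookahead(Reader in) {
        this.in = in;
    }

    // Refills the buffer; returns false once the input is exhausted.
    private boolean fill() {
        if (eof) return false;
        try {
            do {
                len = in.read(buf, 0, buf.length);
            } while (len == 0);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        pos = 0;
        if (len < 0) {
            eof = true;
            len = 0;
            return false;
        }
        return true;
    }

    // Looks at the next character without consuming it; -1 at end of input.
    int peek() {
        if (pos >= len && !fill()) return -1;
        return buf[pos];
    }

    // Consumes and returns the next character; -1 at end of input.
    int read() {
        int ch = peek();
        if (ch != -1) pos++;
        return ch;
    }

    // Example use: read up to, but not including, the next '=' or '&'.
    static String readWord(OneCharLookahead r) {
        StringBuilder sb = new StringBuilder();
        for (int ch = r.peek(); ch != -1 && ch != '=' && ch != '&'; ch = r.peek()) {
            sb.append((char) r.read());
        }
        return sb.toString();
    }
}
```

Because the tokenizer decides what to do from peek() alone, it never needs to push characters back, and each character costs an array access rather than a call into the underlying stream.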

        Andy


