Gah - character encoding in Java is still a horrific minefield :(

Removed the unnecessary overload.
Rob

On 12/11/2013 18:55, "Andy Seaborne" <[email protected]> wrote:

>>
>>jena/trunk/jena-arq/src/main/java/org/apache/jena/riot/tokens/TokenizerFactory.java
>>
>
>
>> Modified: jena/trunk/jena-arq/src/main/java/org/apache/jena/riot/tokens/TokenizerFactory.java
>> URL: http://svn.apache.org/viewvc/jena/trunk/jena-arq/src/main/java/org/apache/jena/riot/tokens/TokenizerFactory.java?rev=1541118&r1=1541117&r2=1541118&view=diff
>> ==============================================================================
>> --- jena/trunk/jena-arq/src/main/java/org/apache/jena/riot/tokens/TokenizerFactory.java (original)
>> +++ jena/trunk/jena-arq/src/main/java/org/apache/jena/riot/tokens/TokenizerFactory.java Tue Nov 12 15:53:36 2013
>> @@ -42,6 +42,13 @@ public class TokenizerFactory
>>          Tokenizer tokenizer = new TokenizerText(peekReader) ;
>>          return tokenizer ;
>>      }
>> +
>> +    public static Tokenizer makeTokenizerUTF8(String string)
>> +    {
>> +        PeekReader peekReader = PeekReader.readString(string);
>> +        Tokenizer tokenizer = new TokenizerText(peekReader);
>> +        return tokenizer;
>> +    }
>>
>>      public static Tokenizer makeTokenizerASCII(InputStream in)
>>      {
>>
>>
>
>Rob -
>
>There is TokenizerFactory.makeTokenizerString which is identical to
>makeTokenizerUTF8. "String" was a better name because a string isn't
>UTF8 in Java.
>
> Andy
>
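
To illustrate the point that a string isn't UTF8 in Java: a charset only comes into play when bytes are turned into characters, so a String-based factory method has nothing to decode. The following is a minimal sketch using plain JDK classes only, not Jena's PeekReader or TokenizerText; the class and method names (EncodingSketch, decodeUtf8, fromString) are made up for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only (not part of Jena).
public class EncodingSketch {

    // Drain a Reader into a String, one char at a time.
    static String readAll(Reader reader) throws IOException {
        StringBuilder sb = new StringBuilder();
        int c;
        while ((c = reader.read()) != -1) {
            sb.append((char) c);
        }
        return sb.toString();
    }

    // Bytes -> chars: this is the only place a charset is needed.
    static String decodeUtf8(InputStream in) throws IOException {
        return readAll(new InputStreamReader(in, StandardCharsets.UTF_8));
    }

    // String -> chars: already decoded, so no charset is involved.
    static String fromString(String s) throws IOException {
        return readAll(new StringReader(s));
    }
}
```

With this shape, a "UTF8" name only makes sense on the InputStream variant; the String variant is charset-agnostic, which is presumably why "String" reads as the better name.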
