[jira] [Commented] (JENA-641) org.apache.jena.atlas.AtlasException on particular Turtle file

Andy Seaborne (JIRA) Mon, 17 Feb 2014 02:08:21 -0800

    [ 
https://issues.apache.org/jira/browse/JENA-641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13903119#comment-13903119
 ]


Andy Seaborne commented on JENA-641:
------------------------------------

Damian's idea looks viable for parsing and the best way to take some degree of 
control.

This isn't the only place this happens.  The SPARQL parser is susceptible as 
well: there, the bytes->chars conversion is controlled by javacc, not directly 
by Jena.  Arguably, it's more serious on the data side (data files are bigger 
and produced by someone else).

Even if the UTF_8.Decoder developers (that would be the openjdk team and 
similarly for other VMs; it's part of the std runtime) make a change, that 
change then needs to become known and roll through parts of the Java 
ecosystem.  (Yes, we could replace that part of javacc as well; javacc is 
modular.)
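
For illustration only, here is a minimal sketch of what taking control of the 
bytes->chars step can look like in plain Java.  It is not Jena's code and not 
necessarily what Damian proposed; the class and method names (Utf8DecodeSketch, 
utf8Reader, readAll) are made up for this example.  It only uses the standard 
java.nio.charset API: a CharsetDecoder configured with CodingErrorAction.REPORT 
raises MalformedInputException on bad bytes, while CodingErrorAction.REPLACE 
substitutes U+FFFD and carries on.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.MalformedInputException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of Jena or javacc.
class Utf8DecodeSketch {

    // Build a UTF-8 Reader whose behaviour on bad bytes is chosen by the caller.
    static Reader utf8Reader(InputStream in, CodingErrorAction onBadBytes) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(onBadBytes)
                .onUnmappableCharacter(onBadBytes);
        return new InputStreamReader(in, decoder);
    }

    // Drain a Reader into a String.
    static String readAll(Reader r) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[1024];
        int n;
        while ((n = r.read(buf)) != -1) {
            sb.append(buf, 0, n);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // 0xE9 is "e acute" in ISO-8859-1 but an incomplete sequence in UTF-8.
        byte[] bad = { 'a', (byte) 0xE9, 'b', 'c', 'd' };

        // REPLACE: the bad byte becomes U+FFFD and decoding continues.
        String lenient = readAll(
                utf8Reader(new ByteArrayInputStream(bad), CodingErrorAction.REPLACE));
        System.out.println("lenient length: " + lenient.length());

        // REPORT: the same input raises MalformedInputException.
        try {
            readAll(utf8Reader(new ByteArrayInputStream(bad), CodingErrorAction.REPORT));
        } catch (MalformedInputException e) {
            System.out.println("strict: " + e.getMessage());
        }
    }
}
```

Whether a parser should report or replace is a policy decision; the point of 
the sketch is only that the choice can be made explicitly by whoever constructs 
the decoder.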



> org.apache.jena.atlas.AtlasException on particular Turtle file
> --------------------------------------------------------------
>
>                 Key: JENA-641
>                 URL: https://issues.apache.org/jira/browse/JENA-641
>             Project: Apache Jena
>          Issue Type: Bug
>          Components: RIOT
>    Affects Versions: Jena 2.11.1
>            Reporter: Vladimir Alexiev
>            Priority: Minor
>         Attachments: getty-codes.ttl
>
>
> {noformat}
> > riot --validate getty-codes.ttl
> Exception in thread "main" org.apache.jena.atlas.AtlasException: java.nio.charset.MalformedInputException: Input length = 1
>         at org.apache.jena.atlas.io.IO.exception(IO.java:206)
>         at org.apache.jena.atlas.io.CharStreamBuffered$SourceReader.fill(CharStreamBuffered.java:77)
>         at org.apache.jena.atlas.io.CharStreamBuffered.fillArray(CharStreamBuffered.java:154)
>         at org.apache.jena.atlas.io.CharStreamBuffered.advance(CharStreamBuffered.java:137)
>         at org.apache.jena.atlas.io.PeekReader.advanceAndSet(PeekReader.java:243)
>         at org.apache.jena.atlas.io.PeekReader.init(PeekReader.java:237)
>         at org.apache.jena.atlas.io.PeekReader.peekChar(PeekReader.java:159)
>         at org.apache.jena.atlas.io.PeekReader.makeUTF8(PeekReader.java:100)
>         at org.apache.jena.riot.tokens.TokenizerFactory.makeTokenizerUTF8(TokenizerFactory.java:41)
>         at org.apache.jena.riot.RiotReader.createParser(RiotReader.java:131)
>         at riotcmd.CmdLangParse.parseRIOT(CmdLangParse.java:253)
>         at riotcmd.CmdLangParse.parseFile(CmdLangParse.java:182)
>         at riotcmd.CmdLangParse.parseFile(CmdLangParse.java:172)
>         at riotcmd.CmdLangParse.exec(CmdLangParse.java:148)
>         at arq.cmdline.CmdMain.mainMethod(CmdMain.java:102)
>         at arq.cmdline.CmdMain.mainRun(CmdMain.java:63)
>         at arq.cmdline.CmdMain.mainRun(CmdMain.java:50)
>         at riotcmd.riot.main(riot.java:35)
> Caused by: java.nio.charset.MalformedInputException: Input length = 1
>         at java.nio.charset.CoderResult.throwException(Unknown Source)
>         at sun.nio.cs.StreamDecoder.implRead(Unknown Source)
>         at sun.nio.cs.StreamDecoder.read(Unknown Source)
>         at java.io.InputStreamReader.read(Unknown Source)
>         at java.io.Reader.read(Unknown Source)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
