On 15/11/13 10:02, Stian Soiland-Reyes wrote:
Could there not be non-String Readers coming from more
character-set-correct environments beyond files and network streams?
E.g. databases or other libraries? Readers allow the string provider
to also do streaming from the source, like we do internally within
RIOT for the statements.
Anything is possible ... although only providing Readers would be a bit
strange.
I've created JENA-589 to add Reader-ness.
But the project's experience from (RDF/)XML is that this in itself is
trouble because Windows users pass in FileReaders that have the wrong
charset set. At that point, there is nothing that can be done to fix
the problem. Result - potentially corrupt data. XML is particularly
difficult because of in-content processing instructions.
The right use of Readers is "short-range" - they are used inside RIOT.
But they are under the control of the parser system that knows the
charset for each syntax (UTF-8).
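
A minimal sketch of that "short-range" pattern, using only the JDK: the entry point accepts bytes (an InputStream) and creates the Reader itself with UTF-8, so the caller never gets to choose the charset. The class ShortRangeReader and the method parseLines are hypothetical names for illustration and are not part of RIOT; splitting into lines merely stands in for real parsing.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class ShortRangeReader {
    // Hypothetical entry point: it takes bytes, not characters.
    // The Reader is created here, with the charset the syntax requires (UTF-8),
    // and never escapes this method.
    public static List<String> parseLines(InputStream in) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                // Stand-in for real parsing: keep non-empty lines.
                if (!line.trim().isEmpty()) {
                    lines.add(line);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        byte[] data = "first\n\nsecond caf\u00e9\n".getBytes(StandardCharsets.UTF_8);
        for (String line : parseLines(new ByteArrayInputStream(data))) {
            System.out.println(line);
        }
    }
}
```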
Andy
On 15 November 2013 09:01, Andy Seaborne wrote:
On 15/11/13 05:59, Phillip Rhodes wrote:
FWIW... I just grabbed the Jena source, imported it into my workspace
and set my project to reference that, so I could debug into this as
it's running. What I see is the code getting to line 864 in RDFDataMgr
and executing this code:
parser = RiotReader.createParser(tokenizer, lang, base, output) ;
which returns null. The next line tries to use the null reference,
and hence the NPE.
Looking at RiotReader.createParser(), I don't see anything in there
that mentions JSON-LD at all. But the docs for java-jsonld say
something like:
"JenaJSONLD must be initialized so that the readers and writers are
registered with Jena." and I do call the init() method. But I don't
see how that registration process plays with this createParser() code.
I'm guessing things have just gotten out of sync between Jena and the
JSON-LD stuff?
The problem is reading from a reader (I should have noticed this earlier in
the thread).
If you use an InputStream it will work (= it does for me).
Readers are troublesome because they fix the charset before Jena gets a
chance to set the encoding according to syntax. A FileReader is particularly
troublesome because it fixes the charset as the system default, so if that is
not UTF-8 (and it isn't on Windows) the data will not be decoded as UTF-8.
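
To illustrate the effect with plain JDK classes (no Jena involved): the sketch below decodes the same UTF-8 bytes twice, once as UTF-8 and once as ISO-8859-1, which is used here only as a stand-in for a non-UTF-8 system default. The class name CharsetDemo and its decode helper are made up for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetDemo {
    // Read every character from the bytes through a Reader bound to the given charset.
    // Once the Reader exists, the charset decision has been made and cannot be revisited.
    public static String decode(byte[] bytes, Charset charset) {
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(bytes), charset)) {
            StringBuilder sb = new StringBuilder();
            int c;
            while ((c = reader.read()) != -1) {
                sb.append((char) c);
            }
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // "caf\u00e9" contains one non-ASCII character (e with acute accent),
        // which UTF-8 encodes as the two bytes C3 A9.
        byte[] utf8 = "caf\u00e9".getBytes(StandardCharsets.UTF_8);

        String right = decode(utf8, StandardCharsets.UTF_8);
        String wrong = decode(utf8, StandardCharsets.ISO_8859_1);

        System.out.println(right.equals("caf\u00e9"));  // true: round trip is intact
        System.out.println(wrong.length());             // 5: each of the two bytes became its own character
    }
}
```

A consumer handed the second Reader only ever sees the five-character string; it cannot tell which charset was used for decoding, which is why passing an InputStream and letting the parser choose is the safer route.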
StringReader is the only use case, but it currently follows a hardwired path
inside RDFDataMgr. The general ReaderRIOT interface does not have Readers.
Regrettably, I think I'll have to go and add Readers to ReaderRIOT (it only
results in parallel code - InputStream and Reader have no commonality) just
because of StringReader.
You should be able to pass the input stream from the HTTP request to the
RDFDataMgr or model.read directly.
Andy
Phil