Andy,
Enclosed is a simple test that reproduces the error. I'm using Jena 2.7.2, ARQ
2.9.2, and Fuseki 0.2.3. The data is UTF-8. Here is the exception:
Exception in thread "main" java.lang.NullPointerException
    at org.openjena.riot.lang.LangRDFXML$ErrorHandlerBridge.warning(LangRDFXML.java:199)
    at com.hp.hpl.jena.rdf.arp.impl.ARPSaxErrorHandler.warning(ARPSaxErrorHandler.java:46)
    at com.hp.hpl.jena.rdf.arp.impl.XMLHandler.warning(XMLHandler.java:203)
    at com.hp.hpl.jena.rdf.arp.impl.XMLHandler.warning(XMLHandler.java:185)
    at com.hp.hpl.jena.rdf.arp.impl.XMLHandler.warning(XMLHandler.java:180)
    at com.hp.hpl.jena.rdf.arp.impl.ParserSupport.warning(ParserSupport.java:202)
    at com.hp.hpl.jena.rdf.arp.impl.ParserSupport.checkString(ParserSupport.java:113)
    at com.hp.hpl.jena.rdf.arp.impl.ARPDatatypeLiteral.<init>(ARPDatatypeLiteral.java:37)
    at com.hp.hpl.jena.rdf.arp.states.WantTypedLiteral.endElement(WantTypedLiteral.java:46)
    at com.hp.hpl.jena.rdf.arp.impl.XMLHandler.endElement(XMLHandler.java:133)
    at org.apache.xerces.parsers.AbstractSAXParser.endElement(Unknown Source)
    at org.apache.xerces.impl.XMLNamespaceBinder.handleEndElement(Unknown Source)
    at org.apache.xerces.impl.XMLNamespaceBinder.endElement(Unknown Source)
    at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanEndElement(Unknown Source)
    at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
    at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
    at org.apache.xerces.parsers.DTDConfiguration.parse(Unknown Source)
    at org.apache.xerces.parsers.DTDConfiguration.parse(Unknown Source)
    at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
    at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
    at com.hp.hpl.jena.rdf.arp.impl.RDFXMLParser.parse(RDFXMLParser.java:155)
    at com.hp.hpl.jena.rdf.arp.ARP.load(ARP.java:120)
    at org.openjena.riot.lang.LangRDFXML.parse(LangRDFXML.java:105)
    at org.apache.jena.fuseki.http.DatasetGraphAccessorHTTP.readGraph(DatasetGraphAccessorHTTP.java:307)
    at org.apache.jena.fuseki.http.DatasetGraphAccessorHTTP.exec(DatasetGraphAccessorHTTP.java:277)
    at org.apache.jena.fuseki.http.DatasetGraphAccessorHTTP.doGet(DatasetGraphAccessorHTTP.java:82)
    at org.apache.jena.fuseki.http.DatasetGraphAccessorHTTP.httpGet(DatasetGraphAccessorHTTP.java:76)
    at org.apache.jena.fuseki.http.DatasetAdapter.getModel(DatasetAdapter.java:47)
    at NPETest.main(NPETest.java:24)
Thank you for your help and your hard work building and maintaining Jena!
-Elli
________________________________
From: Andy Seaborne <[email protected]>
To: [email protected]
Sent: Thursday, July 19, 2012 4:53 PM
Subject: Re: NullPointerException when writing Unicode data
On 19/07/12 18:11, Elli Schwarz wrote:
> Andy,
>
>
> The data in question is this:
I was looking for a complete sample of RDF/XML.
What is the charset?
>
>
> Arāḑ Muḩtallah
>
>
> Apparently the "h" is a Latin U+0068, but it is combined with a U+0327
> combining code to get the curve under the "h". My understanding is that
> Unicode Normal Form C prefers one character (a precomposed character)
> instead of the two codes for the one character.
>
> Regardless of the data, I shouldn't expect a NullPointerException; I would
> expect a warning. The only way I found the Unicode warning was through
> stepping through the code in my debugger to figure out what was causing the
> problem.
No, it shouldn't, but the description so far leaves me guessing about the
setup. A complete, minimal example, please.
Andy
>
>
> Thanks,
>
> Elli
>
>
>
> ________________________________
> From: Andy Seaborne <[email protected]>
> To: [email protected]
> Sent: Thursday, July 19, 2012 1:01 PM
> Subject: Re: NullPointerException when writing Unicode data
>
> (switch to the users list)
>
> On 19/07/12 17:53, Elli Schwarz wrote:
>> Hello,
>>
>> I am attempting to write a graph stored in Fuseki out as RDF/XML, and I get
>> a NullPointerException from line 199 of LangRDFXML. It looks like the
>> variable errorHandler is null.
>>
>> There is actually a warning that "... {W131} String not in Unicode Normal
>> Form C: ..." that is coming from Jena's XMLHandler, but instead of this
>> being propagated back as a warning it is throwing a NullPointerException.
>>
>> It seems that it isn't a fatal error, so no exception should be thrown at
>> all; only a warning should be logged. I'm guessing this is a bug?
>>
>> I'm using Jena 2.7.2, ARQ 2.9.2, and I'm connecting to a Fuseki 0.2.3 back
>> end (this error occurs when I do ds.getModel(modelName), where ds is a Fuseki
>> DatasetAccessor).
>>
>> Thank you!
>> -Elli
>>
>
> What does the data look like?
>
> Andy
>
import org.apache.jena.fuseki.DatasetAccessor;
import org.apache.jena.fuseki.DatasetAccessorFactory;
import com.hp.hpl.jena.rdf.model.Model;
import com.hp.hpl.jena.rdf.model.ModelFactory;
import com.hp.hpl.jena.rdf.model.ResourceFactory;

public class NPETest {
    /**
     * Stores one typed literal in a named graph on a local Fuseki server,
     * then reads the graph back to trigger the NullPointerException.
     *
     * @param args unused
     */
    public static void main(String[] args) {
        DatasetAccessor ds = DatasetAccessorFactory
                .createHTTP("http://localhost:3030/ds/data");
        Model m = ModelFactory.createDefaultModel();
        // The literal from the thread: the "h" is U+0068 followed by U+0327,
        // so the string is not in Unicode Normal Form C.
        m.add(ResourceFactory.createResource("http://example.com/test"),
                ResourceFactory.createProperty("http://example.com/prop"),
                ResourceFactory.createTypedLiteral("Arāḑ Muḩtallah"));
        ds.add("http://example.com/npeTest", m);
        ds.getModel("http://example.com/npeTest"); // NPE thrown here
    }
}
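
For anyone who wants to check the Normal Form C point from the thread without a Fuseki server, here is a minimal sketch using only the JDK's java.text.Normalizer. The class name NfcCheck and its two helper methods are made up for illustration and are not part of Jena. The sample string is written with Unicode escapes, assuming the data really is "h" (U+0068) followed by U+0327 as described above.

```java
import java.text.Normalizer;

// Hypothetical helper class, not part of Jena or Fuseki.
class NfcCheck {

    /** Returns true if the string is already in Unicode Normal Form C. */
    static boolean isNfc(String s) {
        return Normalizer.isNormalized(s, Normalizer.Form.NFC);
    }

    /** Returns the string converted to Unicode Normal Form C. */
    static String toNfc(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFC);
    }

    public static void main(String[] args) {
        // Assumption: "h" (U+0068) followed by the combining cedilla U+0327.
        String decomposed = "Muh\u0327tallah";
        String composed = toNfc(decomposed);

        System.out.println("input is NFC:      " + isNfc(decomposed));
        System.out.println("normalized is NFC: " + isNfc(composed));
        System.out.println("input length:      " + decomposed.length());
        System.out.println("normalized length: " + composed.length());
    }
}
```

Normalizing a literal this way before adding it to the model might avoid the W131 warning on read-back, but that is only a possible workaround for the data and does not address the NullPointerException itself.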