Shawn Smith created JENA-1269:
---------------------------------
Summary: Spilling a data bag with boolean literals throws a parse exception
Key: JENA-1269
URL: https://issues.apache.org/jira/browse/JENA-1269
Project: Apache Jena
Issue Type: Bug
Components: ARQ
Affects Versions: Jena 3.1.1
Reporter: Shawn Smith
Spilling bindings with boolean literals to a DistinctDataBag or SortedDataBag
results in parse errors when the data bag reads the bindings back in. This
occurs with:
{noformat}
"false"^^<http://www.w3.org/2001/XMLSchema#boolean>
"true"^^<http://www.w3.org/2001/XMLSchema#boolean>
{noformat}
It looks like there's a mismatch where booleans don't round trip correctly
through BindingOutputStream and BindingInputStream. BindingOutputStream writes
the boolean literals to the spill file as the bare keywords true or false, then
BindingInputStream parses them as symbol tokens instead of node tokens and
fails.
Here's a unit test that reproduces the parse error:
{code:java}
import org.apache.jena.atlas.data.*;
import org.apache.jena.datatypes.xsd.XSDDatatype;
import org.apache.jena.graph.*;
import org.apache.jena.riot.system.SerializationFactoryFinder;
import org.apache.jena.sparql.core.Var;
import org.apache.jena.sparql.engine.binding.*;
import org.junit.Assert;
import org.junit.Test;
public class JenaSparqlClientTest {
    @Test
    public void testSpillBooleans() {
        Node literal = NodeFactory.createLiteral("true", XSDDatatype.XSDboolean);
        Binding parent = BindingFactory.binding(Var.alloc("a"),
                NodeFactory.createLiteral("xyz"));
        Binding binding = BindingFactory.binding(parent, Var.alloc("b"), literal);
        // Binding binding = BindingFactory.binding(BindingFactory.noParent,
        //         Var.alloc("b"), literal);
        SerializationFactory<Binding> serializationFactory =
                SerializationFactoryFinder.bindingSerializationFactory();
        // A threshold of 0 forces the bag to spill to disk immediately.
        SortedDataBag<Binding> dataBag = BagFactory.newSortedBag(
                new ThresholdPolicyCount<>(0), serializationFactory, null);
        try {
            dataBag.add(binding);
            dataBag.flush();
            // Spill file looks like the following (uses Turtle syntax for literals):
            //   VARS ?b ?a .
            //   true "xyz" .
            // On reading back the dataBag it throws:
            //
            //   org.apache.jena.riot.RiotException: [line: 2, col: 7 ] Not a valid
            //   token for an RDF term: [KEYWORD:false]
            //
            // If the test is modified to leave out the 'parent' binding (uncomment
            // 'noParent' line) it throws:
            //
            //   org.apache.jena.riot.RiotException: [line: 2, col: 6 ] Too many items
            //   in a line. Expected 1
            //
            Binding actual = dataBag.iterator().next();
            Assert.assertEquals(binding, actual);
        } finally {
            dataBag.close();
        }
    }
}
{code}
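To make the suspected mismatch easier to see in isolation, here is a toy model that uses no Jena classes at all. Everything in it (BooleanRoundTripSketch, write, readStrict, readTolerant) is hypothetical and made up for illustration; it is not how BindingOutputStream or BindingInputStream are implemented. It only shows the general shape of the problem: a writer that abbreviates xsd:boolean literals to a bare keyword, paired with a reader that only accepts the full quoted form.

```java
// Toy model only: these helpers are hypothetical and are NOT Jena classes.
final class BooleanRoundTripSketch {
    static final String XSD_BOOLEAN = "http://www.w3.org/2001/XMLSchema#boolean";

    // Writer side: abbreviates xsd:boolean literals to a bare keyword
    // (as Turtle allows) and writes every other literal in full form.
    static String write(String lexical, String datatypeIri) {
        boolean isBoolean = XSD_BOOLEAN.equals(datatypeIri)
                && (lexical.equals("true") || lexical.equals("false"));
        if (isBoolean) {
            return lexical;
        }
        return "\"" + lexical + "\"^^<" + datatypeIri + ">";
    }

    // Strict reader: accepts only the full form "lexical"^^<iri>.
    // Returns {lexical, datatypeIri}; rejects anything else.
    static String[] readStrict(String token) {
        int close = token.lastIndexOf("\"^^<");
        if (!token.startsWith("\"") || close < 1 || !token.endsWith(">")) {
            throw new IllegalArgumentException("not a typed literal: " + token);
        }
        String lexical = token.substring(1, close);
        String iri = token.substring(close + 4, token.length() - 1);
        return new String[] { lexical, iri };
    }

    // Tolerant reader: additionally maps the bare keywords back to xsd:boolean.
    static String[] readTolerant(String token) {
        if (token.equals("true") || token.equals("false")) {
            return new String[] { token, XSD_BOOLEAN };
        }
        return readStrict(token);
    }

    public static void main(String[] args) {
        String spilled = write("true", XSD_BOOLEAN);
        System.out.println("spilled token: " + spilled);
        try {
            readStrict(spilled);
        } catch (IllegalArgumentException e) {
            System.out.println("strict reader failed: " + e.getMessage());
        }
        String[] back = readTolerant(spilled);
        System.out.println("tolerant reader: " + back[0] + " ^^ " + back[1]);
    }
}
```

In this sketch the round trip can be repaired on either side: the writer could stop abbreviating, or the reader could accept the keywords. Which side (if either) is the right place to fix it in Jena is not something this toy model can answer.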