Shawn Smith created JENA-1269:
---------------------------------

             Summary: Spilling a data bag with boolean literals throws a parse exception
                 Key: JENA-1269
                 URL: https://issues.apache.org/jira/browse/JENA-1269
             Project: Apache Jena
          Issue Type: Bug
          Components: ARQ
    Affects Versions: Jena 3.1.1
            Reporter: Shawn Smith


Spilling bindings with boolean literals to a DistinctDataBag or SortedDataBag 
results in parse errors when the data bag reads the bindings back in.  This 
occurs with:

{noformat}
"false"^^<http://www.w3.org/2001/XMLSchema#boolean>
"true"^^<http://www.w3.org/2001/XMLSchema#boolean>
{noformat}

It looks like booleans don't round trip correctly through BindingOutputStream 
and BindingInputStream.  BindingOutputStream writes the boolean literals to 
the spill file as bare true or false (Turtle shorthand), then 
BindingInputStream parses them as symbol tokens instead of node tokens and 
fails.
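
To make the suspected mismatch concrete, here is a small toy model of it. None of this is Jena code: the class and method names are hypothetical, and the reader is deliberately simplified to accept only quoted literals. It only illustrates why a writer that abbreviates booleans and a reader that expects node tokens cannot round trip, while the full typed form can.

```java
/** Toy model of the round-trip mismatch. Hypothetical names; not Jena's implementation. */
final class BooleanRoundTripSketch {

    static final String XSD_BOOLEAN = "http://www.w3.org/2001/XMLSchema#boolean";

    /** Writer that abbreviates xsd:boolean literals to a bare word, Turtle-style. */
    static String writeAbbreviated(String lexical, String datatype) {
        if (XSD_BOOLEAN.equals(datatype)) {
            return lexical; // emits a bare word such as: true
        }
        return writeFull(lexical, datatype);
    }

    /** Writer that always emits the full typed-literal form. */
    static String writeFull(String lexical, String datatype) {
        return "\"" + lexical + "\"^^<" + datatype + ">";
    }

    /**
     * Simplified reader: accepts only quoted literals and returns
     * {lexical form, datatype IRI or null}. A bare word is rejected,
     * mimicking a parser that sees it as a keyword rather than a node.
     */
    static String[] readTerm(String token) {
        if (!token.startsWith("\"")) {
            throw new IllegalArgumentException(
                "Not a valid token for an RDF term: [KEYWORD:" + token + "]");
        }
        int closeQuote = token.indexOf('"', 1);
        String lexical = token.substring(1, closeQuote);
        String rest = token.substring(closeQuote + 1);
        String datatype = (rest.startsWith("^^<") && rest.endsWith(">"))
            ? rest.substring(3, rest.length() - 1)
            : null;
        return new String[] { lexical, datatype };
    }

    public static void main(String[] args) {
        // The full form survives the round trip.
        String[] ok = readTerm(writeFull("true", XSD_BOOLEAN));
        System.out.println(ok[0] + " / " + ok[1]);

        // The abbreviated form is rejected by the reader.
        try {
            readTerm(writeAbbreviated("true", XSD_BOOLEAN));
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this model either side could be changed to restore the round trip: the writer could stop abbreviating booleans, or the reader could accept the bare words as boolean literals. Which of the two fits Jena better is not something this sketch decides.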

Here's a unit test that reproduces the parse error:

{code:java}
import org.apache.jena.atlas.data.*;
import org.apache.jena.datatypes.xsd.XSDDatatype;
import org.apache.jena.graph.*;
import org.apache.jena.riot.system.SerializationFactoryFinder;
import org.apache.jena.sparql.core.Var;
import org.apache.jena.sparql.engine.binding.*;
import org.junit.Assert;
import org.junit.Test;

public class JenaSparqlClientTest {
  @Test
  public void testSpillBooleans() {
    Node literal = NodeFactory.createLiteral("true", XSDDatatype.XSDboolean);

    Binding parent = BindingFactory.binding(Var.alloc("a"),
        NodeFactory.createLiteral("xyz"));
    Binding binding = BindingFactory.binding(parent, Var.alloc("b"), literal);
//    Binding binding = BindingFactory.binding(BindingFactory.noParent, Var.alloc("b"), literal);

    SerializationFactory<Binding> serializationFactory =
        SerializationFactoryFinder.bindingSerializationFactory();
    SortedDataBag<Binding> dataBag = BagFactory.newSortedBag(
        new ThresholdPolicyCount<>(0), serializationFactory, null);
    try {
      dataBag.add(binding);
      dataBag.flush();

      // Spill file looks like the following (uses Turtle syntax for literals):
      // VARS ?b ?a .
      // true "xyz" .

      // On reading back the dataBag it throws:
      //
      //  org.apache.jena.riot.RiotException: [line: 2, col: 7 ] Not a valid token for an RDF term: [KEYWORD:false]
      //
      // If the test is modified to leave out the 'parent' binding (uncomment 'noParent' line) it throws:
      //
      //  org.apache.jena.riot.RiotException: [line: 2, col: 6 ] Too many items in a line.  Expected 1
      //

      Binding actual = dataBag.iterator().next();
      Assert.assertEquals(binding, actual);
    } finally {
      dataBag.close();
    }
  }
}
{code}




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
