Joshua did answer, but I thought I'd just add...

Blank nodes have no canonical representation. They will typically be represented with a number internally. When a query returns them, they should appear in one of the accepted formats for a blank node, which is usually _: followed by an identifier that is unique to that store. For instance, a blank node may appear as _:1234.
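To make the difference between a blank node's identity and its printed label concrete, here is a rough Python sketch. It is a toy model, not how Jena or any particular store works: the BlankNode class and the serialize function are hypothetical names invented for this illustration, and the example.org IRIs are placeholders. Labels are handed out fresh for each export, so two unrelated nodes can end up printed with the same label.

```python
import itertools


class BlankNode:
    """Hypothetical blank node: it has identity, but no global name."""

    _counter = itertools.count(1)

    def __init__(self):
        # Store-internal number; it is never meant to leave the store.
        self.internal_id = next(BlankNode._counter)


def serialize(triples):
    """Render triples as N-Triples-like lines.

    Blank node labels are assigned per call, so they are only
    meaningful within the one document this call returns.
    """
    labels = {}

    def term(t):
        if isinstance(t, BlankNode):
            if t not in labels:
                labels[t] = "_:b%04d" % (len(labels) + 1)
            return labels[t]
        return t  # IRIs and literals are passed through as written

    return [f"{term(s)} {term(p)} {term(o)} ." for s, p, o in triples]


a, b = BlankNode(), BlankNode()
export_1 = serialize([(a, "<http://example.org/name>", '"species"')])
export_2 = serialize([(b, "<http://example.org/name>", '"genus"')])

print(export_1[0])  # _:b0001 <http://example.org/name> "species" .
print(export_2[0])  # _:b0001 <http://example.org/name> "genus" .
print(a is b)       # False: same label, different nodes
```

Within a single export the labels do stay distinct, which is all that the label is good for.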
Just like RDF documents in Turtle, a blank node representation need only be unique to the current context. The next query result can return data about a completely different blank node and ALSO use _:1234 as the identifier, even though the nodes are completely different. In practice, this isn't likely to happen with most RDF stores (since stores usually just print the internal identifier), but it's possible. You tend to see it when a store exports data, since the first blank node may show up as _:0001, the second as _:0002, and so on. The next export can be showing completely different data but use those same identifiers in the document.

Some systems like to skolemize blank nodes, which means that a pseudo-global identifier (in the form of an IRI) is created for them. It looks like the first endpoint you referred to did this (using a scheme of "nodeID"). I don't like what the second endpoint did, since if it started with a letter (possible, since they're hex digits) then it would be parseable as a URI.

The *only* way you can refer to a blank node again is by its properties. Most schemas or ontologies will have a property on the blank node that uniquely identifies it. In some cases it may be a group of properties that uniquely identifies it. You need to use a variable to refer to the blank node, and then include a triple pattern in your query that connects that variable to the property/value that you need. If you have blank nodes that cannot be uniquely identified (by any one property, or by a combination of properties), then you may not be using an appropriate schema for the data.

As a convenience, instead of a variable to represent your blank node, you can use blank node syntax. This is just a variable without a real name. So instead of your WHERE clause containing:

    $blank ex:identifier "identifying-value" .
    $blank ex:property $resultData

You can instead say:

    _:b1 ex:identifier "identifying-value" .
    _:b1 ex:property $resultData

But remember that this is essentially just an unnamed variable. It might look like a blank node, but it can bind to anything (including blank nodes and IRIs).

Regards,
Paul

On Wed, Mar 6, 2013 at 10:54 AM, Ziqi Zhang <[email protected]> wrote:

> Hi
>
> I may have misunderstood something but here is my problem.
>
> I am using Jena API to get triples from this SPARQL endpoint:
> http://sparql.sindice.com/sparql
> My query is:
> --------------
> SELECT DISTINCT ?s ?o WHERE {
> ?s rdf:type rdfs:Class .
> {?s foaf:name "species"@en .}
> UNION {?s foaf:name "species" .}
> OPTIONAL {?s owl:equivalentClass ?o .}
> }
> --------------
>
> The query should return 5 results, each about a *blank node*. If you send
> the query using the web interface above, you should get the following
> results:
> ---------------------------
> s o
> nodeID://b122741495 http://purl.org/science/protein/bysequence/ncbi_gene.42069
> nodeID://b122741495 http://purl.org/science/protein/bysequence/ncbi_gene.42504
> nodeID://b122741495 http://purl.org/science/protein/bysequence/ncbi_gene.47877
> nodeID://b122741495 http://purl.org/science/protein/bysequence/ncbi_gene.42945
> nodeID://b122741495 nodeID://b122741495
> ---------------------------
>
> However, using Java Jena API and the following, code, I get completely
> different blank node IDs:
> ----------------------
> s o
> 32ec7330:13d4066f80a:-7fff http://purl.org/science/protein/bysequence/ncbi_gene.42069
> 32ec7330:13d4066f80a:-7fff http://purl.org/science/protein/bysequence/ncbi_gene.42504
> 32ec7330:13d4066f80a:-7fff http://purl.org/science/protein/bysequence/ncbi_gene.47877
> 32ec7330:13d4066f80a:-7fff http://purl.org/science/protein/bysequence/ncbi_gene.42945
> ----------------------
>
> Why are the IDs different? because they are different, I cannot do further
> queries on the node at the sparql end point. What I mean is, if I then
> query:
> "Select ?p ?o where{32ec7330:13d4066f80a:-7fff ?p ?o .}"
> I will have no results, because the node ID does not match with
> "nodeID://b122741495".
>
> Would really appreciate any insight to this!
>
> --
> Ziqi Zhang
