Re: Disabling BNode UID generation

2022-02-10 Thread Ryan Shaw
Thank you both.

> On Feb 9, 2022, at 1:42 PM, Andrii Berezovskyi  wrote:
> 
> Ryan,
> 
> Here is an example of how we use it in JUnit: 
> https://github.com/eclipse/lyo/blob/aa3b18e4f28f3960d3a86a0b54151dccec2f139f/core/oslc4j-jena-provider/src/test/java/org/eclipse/lyo/oslc4j/provider/jena/JenaModelHelperTest.java#L64
> 
> And here is an AssertJ helper we wrote: 
> https://github.com/eclipse/lyo/blob/aa3b18e4f28f3960d3a86a0b54151dccec2f139f/core/oslc4j-jena-provider/src/test/java/org/eclipse/lyo/oslc4j/provider/jena/helpers/JenaAssert.java
> 
> /Andrew
> 
> On 2022-02-09, 17:10, "Shaw, Ryan"  wrote:
> 
> Thank you, Andy.
> 
> I agree that working on the triple level is the correct way to approach
> this. I was looking for something quick and dirty that would work with
> textual diffing by a VCS, hence my focus on the blank node labels.
> 
> Are there any examples of how to use the isomorphism utilities in Jena?
> 
>> On Feb 5, 2022, at 12:48 PM, Andy Seaborne  wrote:
>> 
>> 
>> 
>> On 04/02/2022 19:09, Shaw, Ryan wrote:
>>> Hello,
>>> I am trying to experiment with generating diffable N-Triples or flat Turtle 
>>> files.
>> ...
>>> Thanks,
>>> Ryan
>> 
>> 
>> Info: There is work on a charter for
>> 
>> "RDF Dataset Canonicalization and Hash Working Group"
>> 
>> https://w3c.github.io/rch-wg-charter/
>> 
>> The end of section 1 has some links to related work.
>> 
>> Given RDF is inherently unordered, canonicalization and "diff of triples" 
>> are related.
>> 
>> 
>> For diff-able files, what counts as "different" between two files?
>> 
>> Instead of changing the bnode algorithm, have you considered making use of 
>> bnode-isomorphism? That is, during a diff, maintain a growing mapping from 
>> bnodes in one list of triples to bnodes in the other list?
>> Iso.isomorphicTriples
>> 
>> (The list being the triples in encounter order during parsing.) It works
>> not so much on the syntax as on the abstraction of triples, e.g. a
>> Turtle file and an NT file produced by parsing the TTL file can be defined
>> to be "the same".
>> 
>> It's fairly portable across files generated by other systems as well, except
>> for Turtle lists: Jena has a fixed order for triple generation for a list,
>> but it isn't necessarily the same for all systems.
>> 
>> Jena's Turtle algorithm, which is in LangTurtleBase, generates in list
>> order, with rdf:first, then rdf:rest; the triple referencing the list
>> appears after the list. It happens to be the way the spec explains it:
>>  https://www.w3.org/TR/turtle/#sec-parsing-triples
>> but that is defining the outcome and isn't a requirement.
>> 
>>   Andy
> 
> 



Re: Disabling BNode UID generation

2022-02-09 Thread Andrii Berezovskyi
Ryan,

Here is an example of how we use it in JUnit: 
https://github.com/eclipse/lyo/blob/aa3b18e4f28f3960d3a86a0b54151dccec2f139f/core/oslc4j-jena-provider/src/test/java/org/eclipse/lyo/oslc4j/provider/jena/JenaModelHelperTest.java#L64

And here is an AssertJ helper we wrote: 
https://github.com/eclipse/lyo/blob/aa3b18e4f28f3960d3a86a0b54151dccec2f139f/core/oslc4j-jena-provider/src/test/java/org/eclipse/lyo/oslc4j/provider/jena/helpers/JenaAssert.java

/Andrew

On 2022-02-09, 17:10, "Shaw, Ryan"  wrote:

Thank you, Andy. 

I agree that working on the triple level is the correct way to approach 
this. I was looking for something quick and dirty that would work with textual 
diffing by a VCS, hence my focus on the blank node labels.

Are there any examples of how to use the isomorphism utilities in Jena?

> On Feb 5, 2022, at 12:48 PM, Andy Seaborne  wrote:
> 
> 
> 
> On 04/02/2022 19:09, Shaw, Ryan wrote:
>> Hello,
>> I am trying to experiment with generating diffable N-Triples or flat
>> Turtle files.
> ...
>> Thanks,
>> Ryan
> 
> 
> Info: There is work on a charter for
> 
> "RDF Dataset Canonicalization and Hash Working Group"
> 
> https://w3c.github.io/rch-wg-charter/
> 
> The end of section 1 has some links to related work.
> 
> Given RDF is inherently unordered, canonicalization and "diff of triples"
> are related.
> 
> 
> For diff-able files, what counts as "different" between two files?
> 
> Instead of changing the bnode algorithm, have you considered making use
> of bnode-isomorphism? That is, during a diff, maintain a growing mapping from
> bnodes in one list of triples to bnodes in the other list?
> Iso.isomorphicTriples
> 
> (The list being the triples in encounter order during parsing.) It works
> not so much on the syntax as on the abstraction of triples, e.g. a Turtle
> file and an NT file produced by parsing the TTL file can be defined to be "the
> same".
> 
> It's fairly portable across files generated by other systems as well,
> except for Turtle lists: Jena has a fixed order for triple generation for a
> list, but it isn't necessarily the same for all systems.
> 
> Jena's Turtle algorithm, which is in LangTurtleBase, generates in list
> order, with rdf:first, then rdf:rest; the triple referencing the list
> appears after the list. It happens to be the way the spec explains it:
>   https://www.w3.org/TR/turtle/#sec-parsing-triples
> but that is defining the outcome and isn't a requirement.
> 
>Andy




Re: Disabling BNode UID generation

2022-02-09 Thread Andy Seaborne




On 09/02/2022 16:09, Shaw, Ryan wrote:

Thank you, Andy.

I agree that working on the triple level is the correct way to approach this. I 
was looking for something quick and dirty that would work with textual diffing 
by a VCS, hence my focus on the blank node labels.

Are there any examples of how to use the isomorphism utilities in Jena?


See the code - the isomorphism code takes two groups of triples in 
various grouping forms and returns true or false.  You'll probably want 
to look at how it does it and build similar for your use case to get a 
diff of triples.
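
As a rough illustration of the "growing mapping" idea (this is not Jena's own
code), an order-sensitive comparison of two triple lists could be sketched as
follows. It uses only the Java standard library. The class name BNodeListDiff,
the method firstMismatch, and the representation of a triple as a three-element
String array of N-Triples terms are all assumptions made for this example.
Unlike a full isomorphism check, it assumes both lists arrive in the same
encounter order.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class BNodeListDiff {

    /**
     * Walks both lists in parallel and returns the index of the first triple
     * that differs under a consistent blank node mapping, or -1 if none does.
     * Each triple is {subject, predicate, object}; blank nodes start with "_:".
     */
    public static int firstMismatch(List<String[]> left, List<String[]> right) {
        Map<String, String> leftToRight = new HashMap<>();
        Map<String, String> rightToLeft = new HashMap<>();
        int common = Math.min(left.size(), right.size());
        for (int i = 0; i < common; i++) {
            for (int pos = 0; pos < 3; pos++) {
                if (!matches(left.get(i)[pos], right.get(i)[pos], leftToRight, rightToLeft)) {
                    return i;
                }
            }
        }
        // One list is a prefix of the other: report the first extra triple.
        return left.size() == right.size() ? -1 : common;
    }

    private static boolean matches(String a, String b,
                                   Map<String, String> leftToRight,
                                   Map<String, String> rightToLeft) {
        boolean aBlank = a.startsWith("_:");
        boolean bBlank = b.startsWith("_:");
        if (aBlank != bBlank) {
            return false;
        }
        if (!aBlank) {
            return a.equals(b); // IRIs and literals must be identical
        }
        // Grow the mapping in both directions so it stays one-to-one.
        String knownB = leftToRight.putIfAbsent(a, b);
        String knownA = rightToLeft.putIfAbsent(b, a);
        return (knownB == null || knownB.equals(b))
            && (knownA == null || knownA.equals(a));
    }

    public static void main(String[] args) {
        List<String[]> left = List.of(
            new String[] {"_:a", "<http://example.org/p>", "_:b"},
            new String[] {"_:b", "<http://example.org/q>", "\"1\""});
        List<String[]> right = List.of(
            new String[] {"_:x", "<http://example.org/p>", "_:y"},
            new String[] {"_:y", "<http://example.org/q>", "\"1\""});
        System.out.println(firstMismatch(left, right)); // prints -1
    }
}
```

Returning an index rather than true/false is a deliberate choice here: it gives
a starting point for reporting a diff, at the cost of only working when the two
inputs list their triples in the same order.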


Andy



On Feb 5, 2022, at 12:48 PM, Andy Seaborne  wrote:



On 04/02/2022 19:09, Shaw, Ryan wrote:

Hello,
I am trying to experiment with generating diffable N-Triples or flat Turtle 
files.

...

Thanks,
Ryan



Info: There is work on a charter for

"RDF Dataset Canonicalization and Hash Working Group"

https://w3c.github.io/rch-wg-charter/

The end of section 1 has some links to related work.

Given RDF is inherently unordered, canonicalization and "diff of triples" are 
related.


For diff-able files, what counts as "different" between two files?

Instead of changing the bnode algorithm, have you considered making use of 
bnode-isomorphism? That is, during a diff, maintain a growing mapping from 
bnodes in one list of triples to bnodes in the other list?
Iso.isomorphicTriples

(The list being the triples in encounter order during parsing.) It works not so much
on the syntax as on the abstraction of triples, e.g. a Turtle file and an NT file produced by
parsing the TTL file can be defined to be "the same".

It's fairly portable across files generated by other systems as well, except for
Turtle lists: Jena has a fixed order for triple generation for a list, but it
isn't necessarily the same for all systems.

Jena's Turtle algorithm, which is in LangTurtleBase, generates in list order,
with rdf:first, then rdf:rest; the triple referencing the list appears
after the list. It happens to be the way the spec explains it:
   https://www.w3.org/TR/turtle/#sec-parsing-triples
but that is defining the outcome and isn't a requirement.

Andy




Re: Disabling BNode UID generation

2022-02-09 Thread Beaudet, David
I ran across an API call the other day that checks isomorphism. See the
TopBraid SHACL library's JUnit test runner. I think it's called by the DASH test
case class to make sure the resulting graph matches the expected response.


On Feb 9, 2022 11:10, "Shaw, Ryan"  wrote:
Thank you, Andy.

I agree that working on the triple level is the correct way to approach this. I 
was looking for something quick and dirty that would work with textual diffing 
by a VCS, hence my focus on the blank node labels.

Are there any examples of how to use the isomorphism utilities in Jena?

> On Feb 5, 2022, at 12:48 PM, Andy Seaborne  wrote:
>
>
>
> On 04/02/2022 19:09, Shaw, Ryan wrote:
>> Hello,
>> I am trying to experiment with generating diffable N-Triples or flat Turtle 
>> files.
> ...
>> Thanks,
>> Ryan
>
>
> Info: There is work on a charter for
>
> "RDF Dataset Canonicalization and Hash Working Group"
>
> https://w3c.github.io/rch-wg-charter/
>
> The end of section 1 has some links to related work.
>
> Given RDF is inherently unordered, canonicalization and "diff of triples" are 
> related.
>
>
> For diff-able files, what counts as "different" between two files?
>
> Instead of changing the bnode algorithm, have you considered making use of 
> bnode-isomorphism? That is, during a diff, maintain a growing mapping from 
> bnodes in one list of triples to bnodes in the other list?
> Iso.isomorphicTriples
>
> (The list being the triples in encounter order during parsing.) It works
> not so much on the syntax as on the abstraction of triples, e.g. a Turtle file
> and an NT file produced by parsing the TTL file can be defined to be "the
> same".
>
> It's fairly portable across files generated by other systems as well, except
> for Turtle lists: Jena has a fixed order for triple generation for a list, but
> it isn't necessarily the same for all systems.
>
> Jena's Turtle algorithm, which is in LangTurtleBase, generates in list order,
> with rdf:first, then rdf:rest; the triple referencing the list appears
> after the list. It happens to be the way the spec explains it:
>   
> https://www.w3.org/TR/turtle/#sec-parsing-triples
> but that is defining the outcome and isn't a requirement.
>
>Andy



Re: Disabling BNode UID generation

2022-02-09 Thread Shaw, Ryan
Thank you, Andy. 

I agree that working on the triple level is the correct way to approach this. I 
was looking for something quick and dirty that would work with textual diffing 
by a VCS, hence my focus on the blank node labels.

Are there any examples of how to use the isomorphism utilities in Jena?

> On Feb 5, 2022, at 12:48 PM, Andy Seaborne  wrote:
> 
> 
> 
> On 04/02/2022 19:09, Shaw, Ryan wrote:
>> Hello,
>> I am trying to experiment with generating diffable N-Triples or flat Turtle 
>> files.
> ...
>> Thanks,
>> Ryan
> 
> 
> Info: There is work on a charter for
> 
> "RDF Dataset Canonicalization and Hash Working Group"
> 
> https://w3c.github.io/rch-wg-charter/
> 
> The end of section 1 has some links to related work.
> 
> Given RDF is inherently unordered, canonicalization and "diff of triples" are 
> related.
> 
> 
> For diff-able files, what counts as "different" between two files?
> 
> Instead of changing the bnode algorithm, have you considered making use of 
> bnode-isomorphism? That is, during a diff, maintain a growing mapping from 
> bnodes in one list of triples to bnodes in the other list?
> Iso.isomorphicTriples
> 
> (The list being the triples in encounter order during parsing.) It works
> not so much on the syntax as on the abstraction of triples, e.g. a Turtle file
> and an NT file produced by parsing the TTL file can be defined to be "the
> same".
> 
> It's fairly portable across files generated by other systems as well, except
> for Turtle lists: Jena has a fixed order for triple generation for a list, but
> it isn't necessarily the same for all systems.
> 
> Jena's Turtle algorithm, which is in LangTurtleBase, generates in list order,
> with rdf:first, then rdf:rest; the triple referencing the list appears
> after the list. It happens to be the way the spec explains it:
>   https://www.w3.org/TR/turtle/#sec-parsing-triples
> but that is defining the outcome and isn't a requirement.
> 
>Andy



Re: Disabling BNode UID generation

2022-02-05 Thread Andy Seaborne




On 04/02/2022 19:09, Shaw, Ryan wrote:

Hello,

I am trying to experiment with generating diffable N-Triples or flat Turtle 
files.

...

Thanks,
Ryan



Info: There is work on a charter for

"RDF Dataset Canonicalization and Hash Working Group"

https://w3c.github.io/rch-wg-charter/

The end of section 1 has some links to related work.

Given RDF is inherently unordered, canonicalization and "diff of 
triples" are related.



For diff-able files, what counts as "different" between two files?

Instead of changing the bnode algorithm, have you considered making use 
of bnode-isomorphism? That is, during a diff, maintain a growing 
mapping from bnodes in one list of triples to bnodes in the other list?

Iso.isomorphicTriples

(The list being the triples in encounter order during parsing.) It works
not so much on the syntax as on the abstraction of triples, e.g. a
Turtle file and an NT file produced by parsing the TTL file can be
defined to be "the same".


It's fairly portable across files generated by other systems as well,
except for Turtle lists: Jena has a fixed order for triple generation
for a list, but it isn't necessarily the same for all systems.


Jena's Turtle algorithm, which is in LangTurtleBase, generates in list
order, with rdf:first, then rdf:rest; the triple referencing the
list appears after the list. It happens to be the way the spec explains it:

   https://www.w3.org/TR/turtle/#sec-parsing-triples
but that is defining the outcome and isn't a requirement.

Andy


Re: Disabling BNode UID generation

2022-02-05 Thread Andy Seaborne

Hi Ryan,

There is an option when creating a parser to provide different policies
for generating the internal system identifier for a blank node.


RDFParser.create()..labelToNode(...)...

It shouldn't depend on JenaParameters.disableBNodeUIDGeneration, but
there is a bug in the Turtle parser (JENA-2274) where it uses the core
blank node ID allocator, so you'll need the global flag as well.


public static void main(String... args) {
    // Blank nodes in [] and in an RDF collection (AKA list)
    String s = "PREFIX :  :s :p [ :q (1 2) ] .";

    JenaParameters.disableBNodeUIDGeneration = true;
    LabelToNode bnodes = LabelToNode.createIncremental();
    StreamRDF output =
        StreamRDFWriter.getWriterStream(System.out, Lang.NT);
    RDFParser.create()
        .fromString(s)
        .lang(Lang.TTL)
        .labelToNode(bnodes)
        .parse(output);
}

The NT writer shows the label (it adds the "_:B")

Output at 4.4.0 (some URIs shortened to use in email).

_:BA10 rdf:first "1"^^ .
_:BA10 rdf:rest _:BA11 .
_:BA11 rdf:first "2"^^ .
_:BA11 rdf:rest   .
_:BX400  _:BA10 .
  _:BX400 .

There are two series of bnode labels here: BA10.. for lists and BX400.. for
other blank nodes.


That will change to something like:

_:B0001 rdf:first "1"^^ .
_:B0001 rdf:rest _:B0002 .
_:B0002 rdf:first "2"^^ .
_:B0002 rdf:rest  .
_:B  _:B0001 .
  _:B .

and JenaParameters.disableBNodeUIDGeneration is not used when
  https://issues.apache.org/jira/browse/JENA-2274
happens.
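
For the "quick and dirty" textual route, the same incremental idea can also be
applied after the fact to N-Triples text. The following is a minimal sketch
that does not use Jena at all: it rewrites blank node labels to b0, b1, ... in
order of first appearance. The class name BNodeRelabel and the b-prefixed
labels are assumptions for this example, and the scanner assumes terms are
separated by whitespace. The result is only stable across runs if the triples
are also written in a stable order.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class BNodeRelabel {

    /** Rewrites every blank node label to b0, b1, ... in order of first appearance. */
    public static List<String> relabel(List<String> lines) {
        Map<String, String> labels = new HashMap<>();
        List<String> out = new ArrayList<>();
        for (String line : lines) {
            out.add(relabelLine(line, labels));
        }
        return out;
    }

    private static String relabelLine(String line, Map<String, String> labels) {
        StringBuilder sb = new StringBuilder();
        int n = line.length();
        int i = 0;
        while (i < n) {
            char c = line.charAt(i);
            int end; // index one past the token that starts at i
            if (c == '<') {
                // IRI: copy unchanged up to and including '>'
                int close = line.indexOf('>', i);
                end = close < 0 ? n : close + 1;
            } else if (c == '"') {
                // Literal: copy unchanged, skipping backslash-escaped characters
                int j = i + 1;
                while (j < n && line.charAt(j) != '"') {
                    j += line.charAt(j) == '\\' ? 2 : 1;
                }
                end = Math.min(j + 1, n);
            } else if (c == '#') {
                end = n; // comment runs to the end of the line
            } else if (line.startsWith("_:", i)) {
                int j = i + 2;
                while (j < n && !Character.isWhitespace(line.charAt(j))) {
                    j++;
                }
                while (j > i + 2 && line.charAt(j - 1) == '.') {
                    j--; // a trailing '.' ends the triple, not the label
                }
                String old = line.substring(i + 2, j);
                String fresh = labels.get(old);
                if (fresh == null) {
                    fresh = "b" + labels.size();
                    labels.put(old, fresh);
                }
                sb.append("_:").append(fresh);
                i = j;
                continue;
            } else {
                end = i + 1;
            }
            sb.append(line, i, end);
            i = end;
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> input = List.of(
            "_:x9f <http://example.org/p> \"keep _:x9f\" .",
            "_:a1 <http://example.org/q> _:x9f .");
        relabel(input).forEach(System.out::println);
        // _:b0 <http://example.org/p> "keep _:x9f" .
        // _:b1 <http://example.org/q> _:b0 .
    }
}
```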

Andy

On 04/02/2022 21:20, Shaw, Ryan wrote:




On Feb 4, 2022, at 4:03 PM, Andy Seaborne  wrote:

Ryan,

Please, could you show example code that illustrates what you are seeing? Presumably it
isn't a mix within one parser run, because it looks like the "Annn" IDs come from a
different place than the UUID IDs.


--
package test;

import java.nio.file.Path;
import java.nio.file.Paths;
import org.apache.jena.rdf.model.Model;
import org.apache.jena.riot.RDFDataMgr;
import org.apache.jena.shared.impl.JenaParameters;

public class Test {

 public static void main(String[] args) {
 JenaParameters.disableBNodeUIDGeneration = true;

 Path path = Paths.get(args[0]);
 Model model = RDFDataMgr.loadModel(path.toString());
 model.getGraph().stream().forEach(System.out::println);
 }
}
--

When I run the above on the following input:

--
PREFIX : 

PREFIX iso8601: 
PREFIX owl: 
PREFIX rdf: 
PREFIX rdfs: 
PREFIX time: 
PREFIX xsd: 

:when1
   a [
   a owl:Class ;
   owl:equivalentClass [
   owl:intersectionOf (
   time:Instant
   [
 a owl:Restriction ;
 owl:allValuesFrom [
 owl:unionOf (
 [
   owl:intersectionOf (
   time:DateTimeDescription
   [
 a owl:Restriction ;
 owl:onProperty time:year ;
 owl:someValuesFrom [
 a rdfs:Datatype ;
 owl:onDatatype xsd:integer ;
 owl:withRestrictions (
 [
   xsd:minInclusive 1984
 ]
   )
   ]
   ]
 )
 ]
 [
   owl:intersectionOf (
   time:GeneralDateTimeDescription
   [
 owl:complementOf [
 a owl:Restriction ;
 owl:hasValue iso8601:Gregorian ;
 owl:onProperty time:hasTRS
   ]
   ]
 )
 ]
   

Re: Disabling BNode UID generation

2022-02-04 Thread Shaw, Ryan



> On Feb 4, 2022, at 4:03 PM, Andy Seaborne  wrote:
> 
> Ryan,
> 
> Please, could you show example code that illustrates what you are seeing?
> Presumably it isn't a mix within one parser run, because it looks like the
> "Annn" IDs come from a different place than the UUID IDs.

--
package test;

import java.nio.file.Path;
import java.nio.file.Paths;
import org.apache.jena.rdf.model.Model;
import org.apache.jena.riot.RDFDataMgr;
import org.apache.jena.shared.impl.JenaParameters;

public class Test {

    public static void main(String[] args) {
        JenaParameters.disableBNodeUIDGeneration = true;

        Path path = Paths.get(args[0]);
        Model model = RDFDataMgr.loadModel(path.toString());
        model.getGraph().stream().forEach(System.out::println);
    }
}
--

When I run the above on the following input:

--
PREFIX : 

PREFIX iso8601: 
PREFIX owl: 
PREFIX rdf: 
PREFIX rdfs: 
PREFIX time: 
PREFIX xsd: 

:when1
  a [
    a owl:Class ;
    owl:equivalentClass [
      owl:intersectionOf (
        time:Instant
        [
          a owl:Restriction ;
          owl:allValuesFrom [
            owl:unionOf (
              [
                owl:intersectionOf (
                  time:DateTimeDescription
                  [
                    a owl:Restriction ;
                    owl:onProperty time:year ;
                    owl:someValuesFrom [
                      a rdfs:Datatype ;
                      owl:onDatatype xsd:integer ;
                      owl:withRestrictions (
                        [
                          xsd:minInclusive 1984
                        ]
                      )
                    ]
                  ]
                )
              ]
              [
                owl:intersectionOf (
                  time:GeneralDateTimeDescription
                  [
                    owl:complementOf [
                      a owl:Restriction ;
                      owl:hasValue iso8601:Gregorian ;
                      owl:onProperty time:hasTRS
                    ]
                  ]
                )
              ]
            )
          ] ;
          owl:onProperty time:inDateTime
        ]
      )
    ]
  ] ;
  rdfs:label "1984 or some later year" ;
.
--

I get:

--
231b263db2bfddb58dfa9937b3b7c3a0 @owl:withRestrictions A12
231b263db2bfddb58dfa9937b3b7c3a0 @owl:onDatatype xsd:integer
231b263db2bfddb58dfa9937b3b7c3a0 @rdf:type rdfs:Datatype
8c6c72efd28cd325bc71b79a849faaa1 @xsd:minInclusive 
"1984"^^http://www.w3.org/2001/XMLSchema#integer
A14 @rdf:rest A17
A14 @rdf:first 820826bd52ee5a08eb7622801e144c6e
https://periodo.github.io/edtf-ontology/cases/level-2/range/on-or-after/when1 
@rdfs:label "1984 or some later year"
https://periodo.github.io/edtf-ontology/cases/level-2/range/on-or-after/when1 
@rdf:type 132cb499d890d8900be00c8642b044d4
A13 @rdf:rest rdf:nil
A13 @rdf:first faf2ba604cbef59a8f51d8c9242632f3
A18 @rdf:rest rdf:nil
A18 @rdf:first 5fdb8fff2b57077c0eeb04c51a5c81f9
132cb499d890d8900be00c8642b044d4 @owl:equivalentClass 
f36891f02e7b41fedf7ff3f1328aa1a5
132cb499d890d8900be00c8642b044d4 @rdf:type owl:Class
A12 @rdf:rest rdf:nil
A12 @rdf:first 8c6c72efd28cd325bc71b79a849faaa1
ce14b1c198fa8523fcc9cd87b6675ed7 @owl:onProperty 
http://www.w3.org/2006/time#hasTRS
ce14b1c198fa8523fcc9cd87b6675ed7 @owl:hasValue 
http://www.opengis.net/def/uom/ISO-8601/0/Gregorian
ce14b1c198fa8523fcc9cd87b6675ed7 @rdf:type owl:Restriction
faf2ba604cbef59a8f51d8c9242632f3 @owl:someValuesFrom 
231b263db2bfddb58dfa9937b3b7c3a0
faf2ba604cbef59a8f51d8c9242632f3 @owl:onProperty 
http://www.w3.org/2006/time#year
faf2ba604cbef59a8f51d8c9242632f3 @rdf:type owl:Restriction
7689913c074aeda41f891ce04e89887f @owl:complementOf 
ce14b1c198fa8523fcc9cd87b6675ed7
5fdb8fff2b57077c0eeb04c51a5c81f9 @owl:onProperty 
http://www.w3.org/2006/time#inDateTime
5fdb8fff2b57077c0eeb04c51a5c81f9 @owl:allValuesFrom 

Re: Disabling BNode UID generation

2022-02-04 Thread Andy Seaborne

Ryan,

Please, could you show example code that illustrates what you are
seeing? Presumably it isn't a mix within one parser run, because it looks
like the "Annn" IDs come from a different place than the UUID IDs.


Which version of Jena are you running?

Andy

On 04/02/2022 19:09, Shaw, Ryan wrote:

Hello,

I am trying to experiment with generating diffable N-Triples or flat Turtle 
files.

I was hoping that I could do this by setting 
JenaParameters.disableBNodeUIDGeneration to true, so that blank nodes would be 
assigned IDs in increasing order as the parser created them. But it seems that 
only some methods of blank node creation respect this setting. When I parse 
Turtle with BNode UID generation disabled, I get a mix of `ANNN` (incremented, 
as expected) and random UUID BNode IDs. When I parse N-Triples I get all random 
UUIDs.

Is there any way to tap into the parsing pipeline to ensure that all BNode IDs 
are deterministically (ideally incrementally) generated?

Thanks,
Ryan


Re: Disabling BNode UID generation

2022-02-04 Thread Martynas Jusevičius
Hi Ryan,

Isn't it easier to skolemize the bnodes into URIs that you control?

If you only have URIs, then you could even hash the graph with SPARQL:
https://stackoverflow.com/questions/65798817/how-to-generate-a-hash-of-an-rdf-graph-using-sparql
It works but probably doesn't scale that well.
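
Outside SPARQL, the same idea can be sketched in plain Java: once no blank
nodes remain, sorting the N-Triples lines and hashing them gives a digest that
does not depend on triple order. This is not what the linked answer does; it is
a minimal sketch assuming the input is already skolemized and that both files
serialize each term identically. The class name TripleSetHash and the choice of
SHA-256 are assumptions for this example.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Collection;
import java.util.List;
import java.util.TreeSet;

public class TripleSetHash {

    /** SHA-256 over the sorted, de-duplicated N-Triples lines, as lowercase hex. */
    public static String hash(Collection<String> ntriplesLines) {
        // A TreeSet sorts the lines and drops duplicates, so input order is irrelevant.
        TreeSet<String> sorted = new TreeSet<>();
        for (String line : ntriplesLines) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                sorted.add(trimmed);
            }
        }
        MessageDigest digest;
        try {
            digest = MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // SHA-256 is a required algorithm
        }
        for (String line : sorted) {
            digest.update(line.getBytes(StandardCharsets.UTF_8));
            digest.update((byte) '\n');
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    public static void main(String[] args) {
        String t1 = "<http://example.org/s> <http://example.org/p> \"1\" .";
        String t2 = "<http://example.org/s> <http://example.org/q> \"2\" .";
        // Same triples in a different order give the same digest.
        System.out.println(hash(List.of(t1, t2)).equals(hash(List.of(t2, t1))));
    }
}
```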

Martynas

On Fri, Feb 4, 2022 at 8:09 PM Shaw, Ryan  wrote:
>
> Hello,
>
> I am trying to experiment with generating diffable N-Triples or flat Turtle 
> files.
>
> I was hoping that I could do this by setting 
> JenaParameters.disableBNodeUIDGeneration to true, so that blank nodes would 
> be assigned IDs in increasing order as the parser created them. But it seems 
> that only some methods of blank node creation respect this setting. When I 
> parse Turtle with BNode UID generation disabled, I get a mix of `ANNN` 
> (incremented, as expected) and random UUID BNode IDs. When I parse N-Triples 
> I get all random UUIDs.
>
> Is there any way to tap into the parsing pipeline to ensure that all BNode 
> IDs are deterministically (ideally incrementally) generated?
>
> Thanks,
> Ryan


Disabling BNode UID generation

2022-02-04 Thread Shaw, Ryan
Hello,

I am trying to experiment with generating diffable N-Triples or flat Turtle 
files. 

I was hoping that I could do this by setting 
JenaParameters.disableBNodeUIDGeneration to true, so that blank nodes would be 
assigned IDs in increasing order as the parser created them. But it seems that 
only some methods of blank node creation respect this setting. When I parse 
Turtle with BNode UID generation disabled, I get a mix of `ANNN` (incremented, 
as expected) and random UUID BNode IDs. When I parse N-Triples I get all random 
UUIDs.

Is there any way to tap into the parsing pipeline to ensure that all BNode IDs 
are deterministically (ideally incrementally) generated?

Thanks,
Ryan