Actually, no, because when an owl:Thing is to be deleted, it is not actually
deleted, but simply marked as inactive, e.g. we insert a triple
?s meshv:active "false"^^xsd:boolean
When an owl:Thing is to be updated or inserted though, we do remove and replace
all of its triples. So, to use the bulk load, we would have to know already
what has been updated and what subjects (owl:Thing) are no longer present in
the dataset.
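To make the bookkeeping described above concrete, here is a minimal sketch in plain Python of how the delta could be planned on the client side. It is not the Virtuoso bulk loader and not my production code. Triples are assumed to be tuples of already-serialized terms, and the names plan_update, group_by_subject, ACTIVE and FALSE are hypothetical, chosen only for this illustration.

```python
from collections import defaultdict

# Assumed spellings of the "inactive" marker, for illustration only.
ACTIVE = "meshv:active"
FALSE = '"false"^^xsd:boolean'


def group_by_subject(triples):
    """Index an iterable of (s, p, o) tuples by subject."""
    by_subject = defaultdict(set)
    for s, p, o in triples:
        by_subject[s].add((s, p, o))
    return by_subject


def plan_update(old_triples, new_triples):
    """Return (to_delete, to_insert) for a subject-level update.

    - A subject whose triples differ (or that is new) has all of its old
      triples removed and all of its new triples inserted.
    - A subject missing from the new data is not deleted; it only gets
      the inactive marker.
    - A subject whose triples are unchanged produces no work.
    """
    old = group_by_subject(old_triples)
    new = group_by_subject(new_triples)
    to_delete, to_insert = set(), set()

    for subject, triples in new.items():
        if old.get(subject) != triples:
            to_delete |= old.get(subject, set())
            to_insert |= triples

    for subject in old.keys() - new.keys():
        to_insert.add((subject, ACTIVE, FALSE))

    return to_delete, to_insert


old_graph = {("ex:A", "rdfs:label", '"a"'), ("ex:B", "rdfs:label", '"b"')}
new_graph = {("ex:A", "rdfs:label", '"a"'), ("ex:C", "rdfs:label", '"c"')}
deletes, inserts = plan_update(old_graph, new_graph)
```

In this toy run ex:A is untouched, ex:C is inserted, and ex:B only receives the inactive marker. The two resulting sets are the kind of information a delta-style load would need to be given up front.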
I’ve been able to prototype something like this using Python rdflib, which is
maybe easier than Jena but probably slightly less capable. This led me to
try to install the virtuoso package from pypi.python.org, aka
https://github.com/maparent/virtuoso-python. It appears to be quite finicky
to install and get working. Since I can do SPARQL queries through pyodbc
directly, I’m wondering why this package requires a patched version of pyodbc,
and hope that maparent is maybe present on this list…
From: Hugh Williams [mailto:hwilli...@openlinksw.com]
Sent: Tuesday, November 21, 2017 10:11 AM
To: Davis, Daniel (NIH/NLM) [C] <daniel.da...@nih.gov>
Cc: virtuoso-users@lists.sourceforge.net
Subject: Re: [Virtuoso-users] Two questions
Hi Daniel,
A possible solution to this problem would be to use the Virtuoso Delta aware
bulk load option which requires the use of NQUAD datasets as detailed at:
http://vos.openlinksw.com/owiki/wiki/VOS/VirtRDFBulkLoaderWithDelete
Best Regards
Hugh Williams
Professional Services
OpenLink Software, Inc. // http://www.openlinksw.com/
Weblog -- http://www.openlinksw.com/blogs/
LinkedIn -- http://www.linkedin.com/company/openlink-software/
Twitter -- http://twitter.com/OpenLink
Google+ -- http://plus.google.com/100570109519069333827/
Facebook -- http://www.facebook.com/OpenLinkSoftware
Universal Data Access, Integration, and Management Technology Providers
On 20 Nov 2017, at 21:10, Davis, Daniel (NIH/NLM) [C]
<daniel.da...@nih.gov> wrote:
I wish also to clarify – I won’t have blank nodes in any of the graphs I am
considering, at least for now.
From: Davis, Daniel (NIH/NLM) [C]
Sent: Monday, November 20, 2017 4:04 PM
To: virtuoso-users@lists.sourceforge.net
Subject: Two questions
I am seeking to develop a new way to load updated/changed data into my Virtuoso
graph. The current way I use involves a lot of churn – I load an NTriples
file representing the data into an updates graph, and I delete all old triples
and copy all new triples if the subject (owl:Thing) appears in the old graph at
all. I also mark all subjects that exist in the old graph as “inactive” if
they no longer appear in the incoming data, but I do not delete them.
I just noticed that rdflib (a Python package) includes logic to compare two
graphs, which can be constructed from a query such as DESCRIBE/CONSTRUCT. It
also has an implementation of graph checksum logic based on the following
paper: http://www.hpl.hp.com/techreports/2003/HPL-2003-235R1.pdf
Does Virtuoso have a way to check that one graph is the same as another, or
whether two query solutions are the same?
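For illustration, here is a minimal client-side sketch of what "the same graph" can mean when there are no blank nodes: the two graphs are equal exactly when their sets of triples are equal, so an order-independent digest over the serialized triples is enough. This is a simplified stand-in, not the algorithm from the paper above, not rdflib's implementation, and not a Virtuoso feature. It assumes each term is already serialized in one consistent form, and the names triple_line, graph_digest and same_graph are hypothetical.

```python
import hashlib


def triple_line(s, p, o):
    """Render one triple as an N-Triples-style line (terms pre-serialized)."""
    return f"{s} {p} {o} ."


def graph_digest(triples):
    """Order-independent SHA-256 digest of a blank-node-free graph.

    Duplicates are dropped and lines are sorted first, so the result does
    not depend on the order in which the triples were retrieved.
    """
    lines = sorted({triple_line(*t) for t in triples})
    return hashlib.sha256("\n".join(lines).encode("utf-8")).hexdigest()


def same_graph(a, b):
    """Exact comparison; only valid when neither graph has blank nodes."""
    return set(a) == set(b)


g1 = [("<http://example.org/A>", "<http://example.org/p>", '"x"'),
      ("<http://example.org/B>", "<http://example.org/p>", '"y"')]
g2 = list(reversed(g1))  # same triples, different order

print(same_graph(g1, g2))
print(graph_digest(g1) == graph_digest(g2))
```

With blank nodes this shortcut breaks down, because their labels are not stable between serializations; that is the case graph isomorphism checks are meant to handle.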
Dan Davis, Systems/Applications Architect (Contractor),
Office of Computer and Communications Systems,
National Library of Medicine, NIH
------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Virtuoso-users mailing list
Virtuoso-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/virtuoso-users