Hi Kingsley,

In response to your advice, I have a few questions. I recently performed a clean install of VOS. I'm running Version: 07.20.3214, Build: Oct 6 2015 on Debian + Ubuntu. I checked the RDFa option under cartridges, but I didn't see the "HTML (and variants)" option that you suggested double checking. Where do I configure the URIBurner options?
Here is a screen capture of my import settings for the crawler job:
https://docs.google.com/document/d/1Y0Z9b5vBftbgniwmVTp10WT0gblXQKC93ivvvnggPYE/edit?usp=sharing

When I execute a SPARQL query, it returns duplicate data:
http://52.23.175.123:8890/sparql?default-graph-uri=&query=PREFIX+xapi%3A+%3Chttp%3A%2F%2Fpurl.org%2Fxapi%2Fontology%23%3E%0D%0A%0D%0ASELECT+DISTINCT+*%0D%0A%0D%0AWHERE+%7B%0D%0A%0D%0A+++%3FVerb+a+xapi%3AVerb+.%0D%0A%0D%0A%0D%0A%7D%0D%0A&should-sponge=&format=text%2Fhtml&timeout=0&debug=on

Are the URIs that contain an IP address coming from the Sponger? Did I duplicate the imported data by selecting too many options?

Thank you for the support and advice. It would be helpful if there were more information about these settings/hatch options.

Kind Regards,
J Haag

SPARQL example: http://52.23.175.123:8890/sparql

    PREFIX xapi: <http://purl.org/xapi/ontology#>
    SELECT DISTINCT *
    WHERE {
       ?Verb a xapi:Verb .
    }

Your advice was to do the following:

[1] Uncheck "WebDAV" checkbox
[2] Check "Sponger" checkbox -- otherwise "HTML (and variants)" Sponger Cartridge won't be invoked (this includes the ability to read RDFa)
[3] Check "Show Sponger Extractor Cartridges" -- and then check the HTML Cartridge.

Also double check the "HTML (and variants)" Cartridge options. You need to set: rdfa=yes, in options. Here is a dump of the options used by URIBurner:

fallback-mode=no
*rdfa=yes*
reify_html5md=1
reify_rdfa=0
reify_jsonld=1
reify_all_grddl=0
passthrough_mode=yes
loose=yes
reify_html=0
reify_html_misc=0
reify_turtle=yes

As for what's the best solution for your goal? This is the best solution since you can schedule your content crawling.

Your result should ultimately match:
http://linkeddata.uriburner.com/about/html/http/xapi.vocab.pub/datasets/adl/verbs/index.html -- Using /about sponger service.
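
The long query URL above is the SPARQL example in URL-encoded form. As a rough sketch only (assuming Python 3 and its standard library; the helper names build_query_url and extract_query are made up for illustration, and the parameter names are simply copied from the URL in this message), the request can be rebuilt or decoded like this:

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Endpoint copied from the message; treat it as an example address.
ENDPOINT = "http://52.23.175.123:8890/sparql"

QUERY = """PREFIX xapi: <http://purl.org/xapi/ontology#>

SELECT DISTINCT *
WHERE {
   ?Verb a xapi:Verb .
}
"""


def build_query_url(endpoint: str, query: str, result_format: str = "text/html") -> str:
    """Return a GET URL that submits `query` to a SPARQL endpoint.

    The parameter names mirror those visible in the URL from the message;
    they are not taken from any other documentation.
    """
    params = {
        "default-graph-uri": "",
        "query": query,
        "format": result_format,
        "timeout": "0",
    }
    return f"{endpoint}?{urlencode(params)}"


def extract_query(url: str) -> str:
    """Recover the plain SPARQL text from a URL built as above."""
    return parse_qs(urlsplit(url).query)["query"][0]


if __name__ == "__main__":
    url = build_query_url(ENDPOINT, QUERY)
    print(url)                 # the encoded request URL
    print(extract_query(url))  # the readable query again
```

Decoding the URL this way makes it easier to compare the query that was actually sent with the readable SPARQL example.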

> Message: 1
> Date: Fri, 2 Oct 2015 12:43:18 -0400
> From: Kingsley Idehen <kide...@openlinksw.com>
> Subject: Re: [Virtuoso-users] Automating RDF data imports in VIrtuoso
> To: virtuoso-users@lists.sourceforge.net
> Message-ID: <560eb426.4060...@openlinksw.com>
> Content-Type: text/plain; charset="windows-1252"
>
> On 9/29/15 10:57 AM, Haag, Jason wrote:
>> Following up on my original inquiry: I currently have several RDF
>> datasets available on my server. Each data set has an RDF dump
>> available as RDF/XML, JSON-LD, and Turtle. These dumps are generated
>> automatically without virtuoso from an HTML page marked up using RDFa.
>>
>> What is the best option for automating the import of this data on a
>> regular basis into the virtuoso DB? I would like to automatically
>> import RDFa data ideally, or even rdf/xml or turtle files would be
>> fine too. I tried this with the attached settings, but the data
>> doesn't appear in the database. What do I need to enable or change in
>> my settings in order to automatically import RDF data? See attached
>> screen captures. Thanks for any tips or advice!
>
> Do the following:
>
> [1] Uncheck "WebDAV" checkbox
> [2] Check "Sponger" checkbox -- otherwise "HTML (and variants)" Sponger
> Cartridge won't be invoked (this includes the ability to read RDFa)
> [3] Check "Show Sponger Extractor Cartridges" -- and then check the HTML
> Cartridge.
>
> Also double check the "HTML (and variants)" Cartridge options. You need
> to set: rdfa=yes, in options. Here is a dump of the options used by
> URIBurner:
>
> fallback-mode=no
> *rdfa=yes*
> reify_html5md=1
> reify_rdfa=0
> reify_jsonld=1
> reify_all_grddl=0
> passthrough_mode=yes
> loose=yes
> reify_html=0
> reify_html_misc=0
> reify_turtle=yes
>
> As for what's the best solution for your goal? This is the best solution
> since you can schedule your content crawling. Your result should
> ultimately match:
>
> http://linkeddata.uriburner.com/about/html/http/xapi.vocab.pub/datasets/adl/verbs/index.html
> -- Using /about sponger service.
>
> --
> Regards,
>
> Kingsley Idehen
> Founder & CEO
> OpenLink Software
> Company Web: http://www.openlinksw.com
> Personal Weblog 1: http://kidehen.blogspot.com
> Personal Weblog 2: http://www.openlinksw.com/blog/~kidehen
> Twitter Profile: https://twitter.com/kidehen
> Google+ Profile: https://plus.google.com/+KingsleyIdehen/about
> LinkedIn Profile: http://www.linkedin.com/in/kidehen
> Personal WebID: http://kingsley.idehen.net/dataspace/person/kidehen#this