Lewis,

In your original e-mail you mentioned you were using Jena's TDB. Is that
running as a SPARQL Server i.e. Fuseki?

I'm interested to know how you are inserting your RDF into TDB. Earlier
this year I started work on a project to implement client applications for
the SPARQL Graph Store HTTP Protocol, which is supported by Jena Fuseki,
and can be found on GitHub:

<https://github.com/philipfennell/grasp>

The GRASP project contains an XQuery library that allows MarkLogic to talk
to triple stores that support this protocol, and there is also an XProc
library that does the same.

Please feel free to have a look and see if you find anything that is
useful there.

As for your question:

> How would/could I synchronize an individual
> XML document and its associated triple graph

Using the Graph Store Protocol you could load each XML document's RDF
graph as a separate graph, with its own unique Graph URI, and use that URI
as the reference to the Graph when, as David suggests, you maintain the
reference in a Document Property within MarkLogic.
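
A minimal sketch of the one-graph-per-document idea, in Python rather than
the XQuery or XProc that GRASP provides. The endpoint URL, the dataset name
"ds" and the graph URI scheme are all assumptions made for illustration;
only the use of PUT with a "graph" query parameter comes from the Graph
Store HTTP Protocol itself. The request is built but not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed Graph Store endpoint of a local Fuseki; the dataset name "ds"
# is hypothetical and will differ per installation.
GRAPH_STORE = "http://localhost:3030/ds/data"


def graph_uri_for(doc_uri: str) -> str:
    """Derive a unique graph URI from a document URI (hypothetical scheme)."""
    return "http://example.org/graphs" + doc_uri


def build_put_graph(doc_uri: str, turtle: str) -> Request:
    """Build (but do not send) a PUT that replaces the document's graph."""
    query = urlencode({"graph": graph_uri_for(doc_uri)})
    return Request(
        url=f"{GRAPH_STORE}?{query}",
        data=turtle.encode("utf-8"),
        headers={"Content-Type": "text/turtle; charset=utf-8"},
        method="PUT",
    )


req = build_put_graph("/docs/article-1.xml", "<urn:s> <urn:p> <urn:o> .")
# urllib.request.urlopen(req) would send it to a running server.
```

Because PUT replaces the whole named graph, re-running it for a changed
document would also drop triples that are no longer present.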

One issue with using document properties is that, I believe, you can only
set them once the document has been inserted so you need to pass the
reference in the load transaction and then add the property in a
subsequent update transaction.

Are you running the XML document insert and the RDF Graph creation from
the pipeline?

Is it possible to create the metadata reference URI / Graph URI in the
XProc pipeline, inserting it into your source XML as an Atom/HTML-style
link element:

<link rel="metadata" href="... The Graph URI ..."/>

or as a suitable attribute?
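
As a rough illustration of the link-element option (in Python, not XProc),
the following sketch inserts such an element as the first child of the
document root. The function name, the no-namespace "link" element and the
example URIs are assumptions; a real pipeline would more likely do this
with an XProc insert step.

```python
import xml.etree.ElementTree as ET


def add_metadata_link(xml_text: str, graph_uri: str) -> str:
    """Return the XML with a metadata link element added under the root."""
    root = ET.fromstring(xml_text)
    # Hypothetical placement: first child of the root element.
    link = ET.Element("link", {"rel": "metadata", "href": graph_uri})
    root.insert(0, link)
    return ET.tostring(root, encoding="unicode")


linked = add_metadata_link(
    "<article><title>T</title></article>",
    "http://example.org/graphs/docs/article-1.xml",
)
```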

You could either leave the link in the XML or, alternatively, use a Post
Commit Trigger to extract the Graph URI from the document, removing the
link element, and add the URI as a document property.
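
The extract-and-remove step could look roughly like this. It is only a
Python sketch of the logic; in MarkLogic it would live in an XQuery trigger
module, and the function name and the no-namespace "link" element are
assumptions. Setting the document property itself is not shown.

```python
import xml.etree.ElementTree as ET


def extract_graph_uri(xml_text: str):
    """Return (graph_uri, xml_without_link); (None, xml_text) if no link."""
    root = ET.fromstring(xml_text)
    for parent in root.iter():
        for child in list(parent):
            if child.tag == "link" and child.get("rel") == "metadata":
                # Drop the link from the document and hand back its target.
                parent.remove(child)
                return child.get("href"), ET.tostring(root, encoding="unicode")
    return None, xml_text
```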

Do you need to update the RDF when updates are made to the source XML?


I hope this helps.


Regards


Philip Fennell
Consultant
MarkLogic Corporation
[email protected]
Phone: +44 (0) 203 402 3619
Mobile:  +44 (0) 7824 830 866
www.marklogic.com <http://www.marklogic.com/>
 
On 22/11/2012 18:04, "McGibbney, Lewis John" <[email protected]>
wrote:

>Hi David,
>
>Thank you for your reply.
>
>Regarding your assumption on what I meant by "synchronize"... yes you are
>completely correct.
>
>"I think this is more a question for Jena Developers"
>I had my fears that you would say that ;) however your final comments are
>both very helpful and interesting. I did not know about property
>documents, this might be a way to at least create *a* link of sorts which
>is required to associate the XML doc with its sister RDF.
>
>Thank you
>
>Lewis
>________________________________________
>From: [email protected]
>[[email protected]] On Behalf Of David Lee
>[[email protected]]
>Sent: 22 November 2012 17:01
>To: MarkLogic Developer Discussion
>Subject: Re: [MarkLogic Dev General] Synchronizing content from
>heterogeneous   data stores
>
>By "synchronize" am I to assume that at some later date you will have new
>XML documents and will rerun the XProc pipeline, produce new RDF and
>store those into Jena?
>You want to update Jena with the new RDF values AND delete ones that are
>no longer present?
>
>I think this is more a question for Jena Developers ... I would guess you
>would want to tag the RDF values somehow so that you knew which ones to
>replace. I am not sure how MarkLogic will help (or hinder) this process
>as everything is done prior to being put into the ML Database ...
>Unless you have a way of also storing the RDF IDs associated with the
>XML document. That could possibly be put in a property document of the
>XML Document ... but I don't know how you identify RDF triples in Jena.
>
>
>
>--------------------------------------------------------------------------
>---
>David Lee
>Lead Engineer
>MarkLogic Corporation
>[email protected]
>Phone: +1 812-482-5224
>Cell:  +1 812-630-7622
>www.marklogic.com
>
>
>-----Original Message-----
>From: [email protected]
>[mailto:[email protected]] On Behalf Of McGibbney,
>Lewis John
>Sent: Thursday, November 22, 2012 11:11 AM
>To: [email protected]
>Subject: [MarkLogic Dev General] Synchronizing content from heterogeneous
>data stores
>
>Hi All,
>
>Currently I have a stack of XML documents in MarkLogic. They get there
>via an XProc pipeline.
>I am currently working to run some custom parsers on the XML (within the
>pipeline) *just* before it gets inserted into MarkLogic.
>The parsers extract RDF relationships (triples) from the XML content and
>I would like to send this extracted structure to a triple store (e.g.
>Jena TDB).
>The idea is then to build my application on top of MarkLogic and use the
>triples to complement structured or text based queries within the search
>application.
>
>Currently I would really appreciate some clarification on one major
>area if possible...
>
>How would/could I synchronize an individual XML document and its
>associated triple graph within the triple store? This is my major area of
>confusion. I am really curious to hear from anyone out there who has
>attempted anything similar.
>
>Thanks very much for any feedback on this one, I realize it is a pretty
>lengthy question but any suggestions would be great.
>
>All the best
>
>Lewis
>
>Glasgow Caledonian University is a registered Scottish charity, number
>SC021474
>
>Winner: Times Higher Education's Widening Participation Initiative of the
>Year 2009 and Herald Society's Education Initiative of the Year 2009.
>http://www.gcu.ac.uk/newsevents/news/bycategory/theuniversity/1/name,6219,
>en.html
>
>Winner: Times Higher Education's Outstanding Support for Early Career
>Researchers of the Year 2010, GCU as a lead with Universities Scotland
>partners.
>http://www.gcu.ac.uk/newsevents/news/bycategory/theuniversity/1/name,15691
>,en.html
>_______________________________________________
>General mailing list
>[email protected]
>http://developer.marklogic.com/mailman/listinfo/general

