Hi Joshua,

My take:

--> (A) How I define the data to grab, whether some SQL statement or the like. <--

Have a look at the user documentation here:

https://manifoldcf.apache.org/release/release-1.9/en_US/end-user-documentation.html#jdbcrepository

It should be pretty clear how you define what you are looking for.

--> (B) How to use this data as individual variables which I can arrange into a linked data relationship (ManifoldCF mapping module?) <--

Rafa's previous reply about the RepositoryDocument is appropriate. Basically, an output connector will be handed one of those objects for every MCF "document". The javadoc for it is here:

https://manifoldcf.apache.org/release/trunk/api/framework/org/apache/manifoldcf/agents/interfaces/RepositoryDocument.html

--> (C) How difficult would it be to connect to Marmotta's webservice(s). I'm not familiar with the exact mechanism, but I saw ManifoldCF has support for elasticsearch so maybe I could put something together that talks to Marmotta.. <--

You can readily write your own output connector. There's a book, in fact, describing how to do that. See:

https://github.com/DaddyWri/manifoldcfinaction/tree/master/pdfs

... and read Chapter 9.

Thanks,
Karl

On Sun, Jul 5, 2015 at 11:53 AM, Joshua Dunham <[email protected]> wrote:

> That sounds promising. Would you recommend ManifoldCF for this? If so,
> do you know of any resources which I can use to get up to speed with
> using it in this way?
>
> -J
>
> On 4 July 2015 at 21:48, <[email protected]> wrote:
> > Hi Joshua,
> >
> > The ManifoldCF unit logic in terms of indexing is the Repository
> > Document which, simplifying a lot, models a document composed of
> > content plus metadata (key-value). It should be relatively easy to
> > triplify that structure and push it to Marmotta using SPARQL update
> > queries or Marmotta's java client for adding resources.
> > The Generic Database connector uses a set of queries for crawling the
> > database. You would have to use those queries to get your data. I'm
> > not completely sure if each record result is converted directly to a
> > Repository Document, that is something that I would need to check.
> >
> > Hope that helps,
> > Cheers,
> > Rafa
> >
> > On Sun, Jul 5, 2015 at 2:56 AM, Joshua Dunham <[email protected]>
> > wrote:
> >>
> >> Hi ManifoldCF Users (and Devs)
> >>
> >> I'm wondering if ManifoldCF can work in my use case. I have some
> >> random mySQL and Oracle DB's that I would like to connect to and
> >> extract certain known bits of info, format them each a certain way and
> >> then store the info in Apache Marmotta [1]. Marmotta is an RDF triple
> >> store for linked data so I would need to parse and store the mySQL and
> >> Oracle DB's info into a linked format, which is no problem for me to
> >> create the relationships etc, I just need something that would let me
> >> specifically do this.
> >>
> >> From what I've read, ManifoldCF can connect to mySQL and Oracle
> >> (via non-distributed libraries), and store the results out in several
> >> target data stores. What isn't clear is
> >> (A) How I define the data to grab, whether some SQL statement or the
> >> like.
> >> (B) How to use this data as individual variables which I can arrange
> >> into a linked data relationship (ManifoldCF mapping module?)
> >> (C) How difficult would it be to connect to Marmotta's webservice(s).
> >> I'm not familiar with the exact mechanism, but I saw ManifoldCF has
> >> support for elasticsearch so maybe I could put something together that
> >> talks to Marmotta..
> >>
> >> Would this be possible? If so, could someone point me in the right
> >> direction?
> >>
> >> Thanks!
> >> -Joshua
> >>
> >>
> >> [1] - http://marmotta.apache.org/index.html
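P.S. For what it's worth, the "triplify the key-value metadata and push it with a SPARQL update" idea from the thread above could look roughly like the sketch below. It is only an illustration under stated assumptions: it uses plain Java collections as a stand-in for a document's metadata and does not touch any ManifoldCF or Marmotta classes. The class name TriplifySketch, the method names, and the example.org URIs are all made up for the example, and it assumes the metadata keys (e.g. column names) are already safe to append to a URI.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: turns key-value metadata into a SPARQL INSERT DATA string.
// Nothing here is ManifoldCF or Marmotta API; the map stands in for document metadata.
class TriplifySketch {

    // Escape characters that may not appear raw inside a SPARQL string literal.
    static String escapeLiteral(String value) {
        return value.replace("\\", "\\\\")
                    .replace("\"", "\\\"")
                    .replace("\n", "\\n")
                    .replace("\r", "\\r");
    }

    // Emit one triple per metadata value: <subject> <predicateBase + key> "value" .
    // Assumption: keys are simple names that are safe to append to predicateBase.
    static String toSparqlInsert(String subjectUri, String predicateBase,
                                 Map<String, List<String>> metadata) {
        StringBuilder sb = new StringBuilder("INSERT DATA {\n");
        for (Map.Entry<String, List<String>> field : metadata.entrySet()) {
            for (String value : field.getValue()) {
                sb.append("  <").append(subjectUri).append("> <")
                  .append(predicateBase).append(field.getKey()).append("> \"")
                  .append(escapeLiteral(value)).append("\" .\n");
            }
        }
        return sb.append("}").toString();
    }

    public static void main(String[] args) {
        // Made-up row: one single-valued field and one multi-valued field.
        Map<String, List<String>> row = new LinkedHashMap<>();
        row.put("name", List.of("Widget"));
        row.put("tag", List.of("a", "b"));
        System.out.println(
            toSparqlInsert("http://example.org/row/1", "http://example.org/col/", row));
    }
}
```

In a real output connector the map would be filled from whatever fields the RepositoryDocument carries, and the resulting string would be sent to the triple store's SPARQL update endpoint; both of those steps are left out here on purpose, since they depend on the connector framework and on how Marmotta is deployed.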
