Hi Joshua,

My take:

--> (A) How I define the data to grab, whether some SQL statement or the
like. <--

Have a look at the user documentation here:
https://manifoldcf.apache.org/release/release-1.9/en_US/end-user-documentation.html#jdbcrepository

It should be pretty clear how you define what you are looking for.

--> (B) How to use this data as individual variables which I can arrange
into a linked data relationship (ManifoldCF mapping module?) <--

Rafa's previous reply about the RepositoryDocument is appropriate.
Basically, an output connector will be handed one of those objects for
every MCF "document".  The javadoc for it is here:

https://manifoldcf.apache.org/release/trunk/api/framework/org/apache/manifoldcf/agents/interfaces/RepositoryDocument.html
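
To make the "triplify" idea concrete, here is a minimal sketch, assuming the document's metadata has already been copied out of the RepositoryDocument into a plain map of field name to values. The class MetadataTriplifier, the subject URI and the predicate base are hypothetical names chosen for illustration; none of them come from ManifoldCF or Marmotta.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not part of ManifoldCF or Marmotta.
class MetadataTriplifier {

    // Escape a value so it can be embedded in a double-quoted SPARQL literal.
    static String escapeLiteral(String value) {
        return value.replace("\\", "\\\\")
                    .replace("\"", "\\\"")
                    .replace("\n", "\\n")
                    .replace("\r", "\\r");
    }

    // Build one INSERT DATA update with one triple per (field, value) pair.
    // Assumption: subjectUri and predicateBase + field name are already valid
    // IRIs; this sketch does not validate or encode them.
    static String toSparqlUpdate(String subjectUri, String predicateBase,
                                 Map<String, List<String>> metadata) {
        StringBuilder sb = new StringBuilder("INSERT DATA {\n");
        for (Map.Entry<String, List<String>> field : metadata.entrySet()) {
            for (String value : field.getValue()) {
                sb.append("  <").append(subjectUri).append("> <")
                  .append(predicateBase).append(field.getKey()).append("> \"")
                  .append(escapeLiteral(value)).append("\" .\n");
            }
        }
        return sb.append("}").toString();
    }

    public static void main(String[] args) {
        // Example metadata, standing in for the fields of one MCF "document".
        Map<String, List<String>> metadata = new LinkedHashMap<>();
        metadata.put("title", List.of("Quarterly \"Q2\" report"));
        metadata.put("author", List.of("alice", "bob"));
        System.out.println(toSparqlUpdate(
            "http://example.org/doc/42", "http://example.org/field/", metadata));
    }
}
```

Every value is emitted as a plain string literal here. A real mapping would probably pick proper vocabularies and datatypes per field, which is where your linked data relationships would go.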

--> (C) How difficult would it be to connect to Marmotta's webservice(s).
I'm not familiar with the exact mechanism, but I saw ManifoldCF has
support for elasticsearch so maybe I could put something together that
talks to Marmotta..<--

You can readily write your own output connector.  There's a book, in fact,
describing how to do that.  See:

https://github.com/DaddyWri/manifoldcfinaction/tree/master/pdfs

... and read Chapter 9.
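
The part of such a connector that actually talks to the triple store can be quite small. Below is a rough sketch, assuming the target exposes a standard SPARQL 1.1 update endpoint that accepts a direct POST with the application/sparql-update content type. The class SparqlUpdateClient is hypothetical, and the endpoint URL is only a placeholder guess for a local install; check the Marmotta documentation for the real one. None of the ManifoldCF connector plumbing from the book is shown.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; requires Java 11 or later.
class SparqlUpdateClient {

    // Build a SPARQL 1.1 Protocol "update via POST directly" request:
    // the update string is the body, sent as application/sparql-update.
    static HttpRequest buildRequest(String endpointUrl, String update) {
        return HttpRequest.newBuilder(URI.create(endpointUrl))
            .header("Content-Type", "application/sparql-update")
            .POST(HttpRequest.BodyPublishers.ofString(update, StandardCharsets.UTF_8))
            .build();
    }

    // Send the request and return the HTTP status code; the body is discarded.
    static int send(HttpClient client, HttpRequest request)
            throws IOException, InterruptedException {
        return client.send(request, HttpResponse.BodyHandlers.discarding()).statusCode();
    }

    public static void main(String[] args) {
        // Placeholder endpoint (an assumption); pass the real one as the first argument.
        String endpoint = args.length > 0
            ? args[0] : "http://localhost:8080/marmotta/sparql/update";
        String update = "INSERT DATA { <http://example.org/doc/42> "
            + "<http://example.org/field/title> \"Example\" . }";
        HttpRequest request = buildRequest(endpoint, update);
        // Only print what would be sent; call send(...) against a running server.
        System.out.println(request.method() + " " + request.uri());
    }
}
```

An output connector would call something like send(HttpClient.newHttpClient(), request) once per document and treat a non-2xx status as a failure.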

Thanks,
Karl


On Sun, Jul 5, 2015 at 11:53 AM, Joshua Dunham <[email protected]>
wrote:

> That sounds promising. Would you recommend ManifoldCF for this? If so,
> do you know of any resources which I can use to get up to speed with
> using it in this way?
>
> -J
>
> On 4 July 2015 at 21:48,  <[email protected]> wrote:
> > Hi Joshua,
> >
> > The basic unit ManifoldCF works with for indexing is the Repository
> > Document which, simplifying a lot, models a document composed of content
> > plus metadata (key-value). It should be relatively easy to triplify that
> > structure and push it to Marmotta using SPARQL update queries or
> > Marmotta’s Java client for adding resources.
> > The Generic Database connector uses a set of queries for crawling the
> > database. You would have to use those queries to get your data. I’m not
> > completely sure if each record result is converted directly to a
> > Repository Document; that is something that I would need to check.
> >
> > Hope that helps,
> > Cheers, Rafa
> >
> >
> >
> >
> > On Sun, Jul 5, 2015 at 2:56 AM, Joshua Dunham <[email protected]>
> > wrote:
> >>
> >> Hi ManifoldCF Users (and Devs)
> >>
> >> I'm wondering if ManifoldCF can work in my use case. I have some
> >> random MySQL and Oracle DBs that I would like to connect to and
> >> extract certain known bits of info, format them each a certain way and
> >> then store the info in Apache Marmotta [1]. Marmotta is an RDF triple
> >> store for linked data, so I would need to parse and store the MySQL and
> >> Oracle DBs' info in a linked format. Creating the relationships etc. is
> >> no problem for me; I just need something that would let me do
> >> specifically this.
> >>
> >> From what I've read, ManifoldCF can connect to MySQL and Oracle
> >> (via non-distributed libraries), and store the results out in several
> >> target data stores. What isn't clear is
> >> (A) How I define the data to grab, whether some SQL statement or the
> >> like.
> >> (B) How to use this data as individual variables which I can arrange
> >> into a linked data relationship (ManifoldCF mapping module?)
> >> (C) How difficult would it be to connect to Marmotta's webservice(s).
> >> I'm not familiar with the exact mechanism, but I saw ManifoldCF has
> >> support for elasticsearch so maybe I could put something together that
> >> talks to Marmotta..
> >>
> >> Would this be possible? If so, could someone point me in the right
> >> direction?
> >>
> >> Thanks!
> >> -Joshua
> >>
> >>
> >> [1] - http://marmotta.apache.org/index.html
> >
> >
>
