Re: Can Jena Full Text search work with other Jena based API like Virtuoso Jena or MarkLogic Jena ?

Alex To Fri, 13 Sep 2019 20:02:25 -0700

Hi Andy

I had to pull the develop branch from
https://github.com/marklogic/marklogic-jena/tree/develop to get a version
that works with Jena 3.1x.0.


Then I updated the file
https://github.com/marklogic/marklogic-jena/blob/develop/marklogic-jena/build.gradle

with the following changes:

1. Line 9: change version *3.0-SNAPSHOT* to *3.1.0*
2. Line 13: change *3.10.0* to *3.12.0*

Then I ran "gradlew install" to install it to my local Maven repository.
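
On the slow load: Andy's guess, quoted below, is that each added triple may be
committed on its own rather than the whole file going in as one transaction.
Here is a minimal sketch of why that matters. It is a toy model only:
CountingStore, loadWithAutocommit and loadInOneTransaction are hypothetical
names made up for illustration, not the Jena or MarkLogic API, and the
"commit" here just increments a counter where a real store would pay a
round trip.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy illustration of autocommit vs. one transaction; NOT the Jena or MarkLogic API. */
public class AutocommitVsBatch {

    /** Hypothetical store that counts how many commits it performs. */
    public static class CountingStore {
        private final List<String> committed = new ArrayList<>();
        private final List<String> pending = new ArrayList<>();
        private boolean inTransaction = false;
        public int commits = 0;

        public void begin() {
            inTransaction = true;
        }

        public void add(String triple) {
            pending.add(triple);
            if (!inTransaction) {
                flush(); // autocommit: every add becomes its own commit
            }
        }

        public void commit() {
            flush();
            inTransaction = false;
        }

        private void flush() {
            committed.addAll(pending);
            pending.clear();
            commits++;
        }

        public int size() {
            return committed.size();
        }
    }

    /** Adds triples with no enclosing transaction; returns the number of commits. */
    public static int loadWithAutocommit(CountingStore store, List<String> triples) {
        for (String t : triples) {
            store.add(t);
        }
        return store.commits;
    }

    /** Adds all triples inside one begin/commit pair; returns the number of commits. */
    public static int loadInOneTransaction(CountingStore store, List<String> triples) {
        store.begin();
        for (String t : triples) {
            store.add(t);
        }
        store.commit();
        return store.commits;
    }

    public static void main(String[] args) {
        List<String> triples = new ArrayList<>();
        for (int i = 0; i < 10_000; i++) {
            triples.add("<s" + i + "> <p> \"o\" .");
        }
        // prints "autocommit commits: 10000"
        System.out.println("autocommit commits: " + loadWithAutocommit(new CountingStore(), triples));
        // prints "single-txn commits: 1"
        System.out.println("single-txn commits: " + loadInOneTransaction(new CountingStore(), triples));
    }
}
```

In Jena itself the corresponding calls on a Dataset are begin(ReadWrite.WRITE),
commit() and end(). Whether the MarkLogic driver turns those into a single
server-side transaction is an assumption I have not verified.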

On Sat, Sep 14, 2019 at 8:05 AM Andy Seaborne <a...@apache.org> wrote:

> The Maven Central artifact com.marklogic:marklogic-jena is 3.0.6 but our
> code depends on 3.1.0 - what code is it using?
>
> On 13/09/2019 01:18, Alex To wrote:
> > I created a small program to try out Lucene with MarkLogic Jena here
> >
> >
> https://github.com/AlexTo/jena-lab/blob/master/src/main/java/com/company/MainMarkLogic.java
> >
> >
> > My observation is as follows (see my comment at line 54 & 56)
> >
> > 1. If the model reads a small file with 2 triples, the loading can finish
> > quickly
> > 2. If the model reads a slightly larger file (1.5MB), the loading takes
> > forever so I have to terminate it
>
> Pure speculation, but parts 1 & 2 sound like the data load is not going
> to MarkLogic as a single transaction but as "autocommit" - one
> transaction for each triple added.
>
>      Andy
>
>
> > 3. After loading the small file, searching the Lucene index direct shows
> > that the triples are indexed
> > 4. After loading the small file, run SPARQL query with "text:query" won't
> > finish
> >
> > For now I created 2 separate implementations in my program to support full
> > text search with Jena or MarkLogic, but I look forward to knowing whether
> > it is still possible to use Jena Elastic indexing with TextDataset, because
> > then I can provide a single UI for users to configure their search
> > regardless of the back end. :)
> >
> >
> > On Fri, Sep 13, 2019 at 1:07 AM Dan Davis <dansm...@gmail.com> wrote:
> >
> >> I was incorrect, and apologize. Virtuoso's Jena 3 driver includes an
> >> implementation of Dataset, so while my application only uses
> >> virtuoso.jena.driver.VirtGraph and
> >> virtuoso.jena.driver.VirtuosoQueryExecution (and factory), a more flexible
> >> integration is possible. I look forward to experimenting with it and seeing
> >> what I can do on the backend.
> >>
> >> On Thu, Sep 12, 2019 at 10:19 AM Dan Davis <dansm...@gmail.com> wrote:
> >>
> >>> Virtuoso's Jena driver implements the Model interface, rather than the
> >>> DatasetGraph API. It translates the SPARQL query into its own JDBC
> >>> interface. You can see the architecture at
> >>> http://docs.openlinksw.com/virtuoso/rdfnativestorageprovidersjena/#rdfnativestorageprovidersjenawhatisv
> >>>
> >>> However, Virtuoso has its own full-text indexing, which can be effective. Its
> >>> rules for translating words into queries are not as flexible as
> >>> lucene/solr/elastic, but it does allow you to specify what should be
> >>> indexed - e.g. which objects from which data properties in which
> >>> graphs.
> >>>
> >>> I use Virtuoso behind virt_jena and virt_jdbc. You can see the code at
> >>> https://github.com/HHS/lodestar, which is run underneath
> >>> https://github.com/HHS/meshrdf. You will see that
> >>> https://github.com/HHS/lodestar is a fork from EBI, but the NLM copy has
> >>> been updated to Jena 3. The EBI version is ahead on UI features, however.
> >>>
> >>> I cannot speak to MarkLogic, Stardog, etc.
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> EBI's lodestar still uses Jena 2, but the fork at HHS has been updated to
> >>> Jena 3.
> >>>
> >>> Virtuoso has its own full-text indexing, which is not as flexible in how
> >>> it indexes as Elastic/Solr/Lucene. It still works.
> >>>
> >>>
> >>>
> >>>
> >>> On Thu, Sep 12, 2019 at 7:03 AM Andy Seaborne <a...@apache.org> wrote:
> >>>
> >>>> Yes, probably - but.
> >>>>
> >>>> The Jena text index will work in conjunction with any (Jena)
> >>>> DatasetGraph API implementation. 3rd party systems are not tested in the
> >>>> build.
> >>>>
> >>>> The "but" is efficiency. Both those systems have their own built-in text
> >>>> indexing, which executes as part of the native query engine. This may be a
> >>>> factor for you, it may not.
> >>>>
> >>>> Let us know how you get on trying it.
> >>>>
> >>>> ----
> >>>>
> >>>> There is a SPARQL 1.2 issue about standardizing text query.
> >>>>
> >>>> Issue 40 : SPARQL 1.2 Community Group:
> >>>> https://github.com/w3c/sparql-12/issues/40
> >>>>
> >>>>       Andy
> >>>>
> >>>> On 12/09/2019 02:53, Alex To wrote:
> >>>>> Hi
> >>>>>
> >>>>> I have so far been happy with Jena + Lucene / Elastic. Just trying to get a
> >>>>> quick answer on whether it can work with other Jena-based APIs like
> >>>>> Virtuoso / MarkLogic.
> >>>>>
> >>>>> If I wrap a MarkLogic Dataset in a Jena TextDataset, can it work as
> >>>>> expected?
> >>>>>
> >>>>> Given that a MarkLogic / Virtuoso Dataset implements the Jena Dataset
> >>>>> interface, it may work, but I am not sure because "text:query" seems to
> >>>>> be more Jena-specific.
> >>>>>
> >>>>> I will try it out myself in the next couple of days to see if it works,
> >>>>> but if there is a quick answer it may save me a couple of hours :)
> >>>>>
> >>>>> Thanks a lot
> >>>>>
> >>>>> Regards
> >>>>>
> >>>>
> >>>
> >>
> >
> >
>
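
To make Andy's point in the thread above concrete (the text index works with
any DatasetGraph API implementation, at some cost in efficiency), here is a
rough sketch of the wrapping idea. It is a toy model under my own assumptions:
TripleStore, InMemoryStore and TextIndexedStore are hypothetical names for
illustration, not Jena's classes, and the tokenizing is far simpler than what
Lucene does.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Toy model of a text index layered over any store; NOT Jena's actual classes. */
public class TextIndexWrapperSketch {

    /** Hypothetical minimal store interface, standing in for a dataset API. */
    public interface TripleStore {
        void add(String subject, String predicate, String literal);
        int size();
    }

    /** One possible back end; a vendor-backed store would implement the same interface. */
    public static class InMemoryStore implements TripleStore {
        private final List<String[]> triples = new ArrayList<>();

        public void add(String s, String p, String o) {
            triples.add(new String[] {s, p, o});
        }

        public int size() {
            return triples.size();
        }
    }

    /** Wrapper that forwards writes to the base store and indexes literals on the side. */
    public static class TextIndexedStore implements TripleStore {
        private final TripleStore base;
        private final Map<String, Set<String>> index = new HashMap<>();

        public TextIndexedStore(TripleStore base) {
            this.base = base;
        }

        public void add(String s, String p, String o) {
            base.add(s, p, o); // the base store never needs to know about the index
            for (String token : o.toLowerCase(Locale.ROOT).split("\\W+")) {
                if (!token.isEmpty()) {
                    index.computeIfAbsent(token, k -> new LinkedHashSet<>()).add(s);
                }
            }
        }

        public int size() {
            return base.size();
        }

        /** Returns the subjects whose indexed literals contain the given word. */
        public Set<String> textQuery(String word) {
            return index.getOrDefault(word.toLowerCase(Locale.ROOT), new LinkedHashSet<>());
        }
    }

    public static void main(String[] args) {
        TextIndexedStore store = new TextIndexedStore(new InMemoryStore());
        store.add("ex:a", "rdfs:label", "Apple pie");
        store.add("ex:b", "rdfs:label", "Cherry pie");
        System.out.println(store.textQuery("pie"));   // prints [ex:a, ex:b]
        System.out.println(store.textQuery("apple")); // prints [ex:a]
    }
}
```

The wrapper only relies on the interface, which is why any conforming back end
can sit underneath it. The text lookup, though, happens outside the base
store's own query engine, which is the efficiency "but" Andy mentions.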


-- 

Alex To

PhD Candidate

School of Computer Science

Knowledge Discovery and Management Research Group

Faculty of Engineering & IT

THE UNIVERSITY OF SYDNEY | NSW | 2006

Desk 4e69 | Building J12| 1 Cleveland Street

M. +61423330656
