Re: Can Jena Full Text search work with other Jena based API like Virtuoso Jena or MarkLogic Jena ?

2019-09-18 Thread Alex To
Hi Dan
Thanks for your suggestion, but I am not trying to load a large dataset yet.

I am trying to see if I can use Jena full text search with other Jena-based
APIs such as MarkLogic or Virtuoso, but it seems like it doesn't work as
expected. Not a Jena problem, though. My setup is:

1. Input file: dbpedia.owl (2.5MB)
2. Import using MarkLogic Jena without TextDataset: 1 minute
3. Import using MarkLogic Jena with TextDataset wrapping around it: 13
minutes
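
For completeness, the wrapping itself is just the usual jena-text pattern; a minimal
sketch (the base dataset would come from the MarkLogic Jena driver, which is shown
here only as a placeholder, and the index location is an assumption):

import java.nio.file.Paths;
import org.apache.jena.query.Dataset;
import org.apache.jena.query.DatasetFactory;
import org.apache.jena.query.text.EntityDefinition;
import org.apache.jena.query.text.TextDatasetFactory;
import org.apache.jena.query.text.TextIndexConfig;
import org.apache.jena.vocabulary.RDFS;
import org.apache.lucene.store.FSDirectory;

// Stand-in only: in the real setup this is the dataset obtained from the MarkLogic Jena driver.
Dataset base = DatasetFactory.create();
EntityDefinition entDef = new EntityDefinition("uri", "label", RDFS.label.asNode());
Dataset textDataset = TextDatasetFactory.createLucene(
        base,
        FSDirectory.open(Paths.get("/tmp/lucene-index")),   // hypothetical index directory
        new TextIndexConfig(entDef));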

Regards

On Thu, Sep 19, 2019 at 10:54 AM Dan Davis  wrote:

> dbpedia is not actually that large.  Make sure you test with RDF datasets
> that really represent your data.
>
> On Wed, Sep 18, 2019 at 8:14 PM Alex To  wrote:
>
> > Update: I switched from Lucene to Elasticsearch 6.4.3 and Kibana. Both
> Jena
> > and MarkLogic Jena works with indexing, I haven't tried querying
> MarkLogic
> > with text:query though.
> >
> > Using Kibana, I could see the number of documents increasing while
> > importing data with MarkLogic however it is very slow.
> >
> > Importing dbpedia.owl (2.5MB)  with MarkLogic Jena takes less than a
> minute
> > without indexing.
> >
> > With TextDataset wrapping around MarkLogic dataset, it takes 13 minutes
> so
> > I guess MarkLogic dataset does not seem to send triples in batch when
> using
> > with TextDataset.
> >
> >
> >
> > On Tue, Sep 17, 2019 at 9:58 AM Alex To  wrote:
> >
> > > Hi Andy
> > >
> > > I ended up creating separate implementation for Jena and MarkLogic full
> > > text search for now due to time constraints of the project. I will
> > > investigate further  at a later time.
> > >
> > > Thank you
> > >
> > > Best Regards
> > >
> > > On Sun, Sep 15, 2019 at 6:53 PM Andy Seaborne  wrote:
> > >
> > >> Alex,
> > >>
> > >> I can't try it out - I don't have a Marklogic system.
> > >>
> > >> Can you see in the server logs what is happening?
> > >>
> > >>  > Pure speculation but parts 1 & 2 sounds like the data load is not
> > going
> > >>  > to MarkLogic as a single transaction but as "autocommit" - one
> > >>  > transaction for each triple added.
> > >>
> > >>  Andy
> > >>
> > >> On 13/09/2019 23:04, Andy Seaborne wrote:
> > >> > The maven central artifact com.marklogic:marklogic-jena is 3.0.6 but
> > >> our
> > >> > code depends on 3.1.0 - what code is it using?
> > >> >
> > >> > On 13/09/2019 01:18, Alex To wrote:
> > >> >> I created a small program to try out Lucene with MarkLogic Jena
> here
> > >> >>
> > >> >>
> > >>
> >
> https://github.com/AlexTo/jena-lab/blob/master/src/main/java/com/company/MainMarkLogic.java
> > >> >>
> > >> >>
> > >> >>
> > >> >> My observation is as follows (see my comment at line 54 & 56)
> > >> >>
> > >> >> 1. If the model reads a small file with 2 triples, the loading can
> > >> finish
> > >> >> quickly
> > >> >> 2. If the model reads a slightly larger file (1.5MB), the loading
> > takes
> > >> >> forever so I have to terminate it
> > >> >
> > >> > Pure speculation but parts 1 & 2 sounds like the data load is not
> > going
> > >> > to MarkLogic as a single transaction but as "autocommit" - one
> > >> > transaction for each triple added.
> > >> >
> > >> >  Andy
> > >> >
> > >> >
> > >> >> 3. After loading the small file, searching the Lucene index direct
> > >> shows
> > >> >> that the triples are indexed
> > >> >> 4. After loading the small file, run SPARQL query with "text:query"
> > >> won't
> > >> >> finish
> > >> >>
> > >> >> For now I created 2 separate implementation in my program to
> support
> > >> Full
> > >> >> Text search with Jena or MarkLogic but I look forward to know more
> > >> >> whether
> > >> >> it is still possible to use Jena Elastic indexing with TextDataset
> > >> >> because
> > >> >> then I can provide a single UI to users to configure their search
> > >> >> regardless of the back end. :)
> > >> >>

Re: Can Jena Full Text search work with other Jena based API like Virtuoso Jena or MarkLogic Jena ?

2019-09-18 Thread Alex To
Update: I switched from Lucene to Elasticsearch 6.4.3 and Kibana. Both Jena
and MarkLogic Jena work with indexing; I haven't tried querying MarkLogic
with text:query though.

Using Kibana, I could see the number of documents increasing while
importing data with MarkLogic; however, it is very slow.

Importing dbpedia.owl (2.5MB)  with MarkLogic Jena takes less than a minute
without indexing.

With a TextDataset wrapping around the MarkLogic dataset, it takes 13 minutes,
so I guess the MarkLogic dataset does not send triples in batches when used
with a TextDataset.
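
If Andy's autocommit speculation (quoted further down) is right, one thing worth
trying is doing the whole load inside a single write transaction on the wrapping
dataset. A rough sketch, assuming the data goes in through the dataset's default
model (whether the MarkLogic driver actually batches this is exactly what would
need testing):

import org.apache.jena.query.ReadWrite;
import org.apache.jena.riot.RDFDataMgr;

// textDataset is the TextDataset wrapping the MarkLogic dataset
textDataset.begin(ReadWrite.WRITE);
try {
    RDFDataMgr.read(textDataset.getDefaultModel(), "dbpedia.owl");
    textDataset.commit();
} finally {
    textDataset.end();
}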



On Tue, Sep 17, 2019 at 9:58 AM Alex To  wrote:

> Hi Andy
>
> I ended up creating separate implementation for Jena and MarkLogic full
> text search for now due to time constraints of the project. I will
> investigate further  at a later time.
>
> Thank you
>
> Best Regards
>
> On Sun, Sep 15, 2019 at 6:53 PM Andy Seaborne  wrote:
>
>> Alex,
>>
>> I can't try it out - I don't have a Marklogic system.
>>
>> Can you see in the server logs what is happening?
>>
>>  > Pure speculation but parts 1 & 2 sounds like the data load is not going
>>  > to MarkLogic as a single transaction but as "autocommit" - one
>>  > transaction for each triple added.
>>
>>  Andy
>>
>> On 13/09/2019 23:04, Andy Seaborne wrote:
>> > The maven central artifact com.marklogic:marklogic-jena is 3.0.6 but
>> our
>> > code depends on 3.1.0 - what code is it using?
>> >
>> > On 13/09/2019 01:18, Alex To wrote:
>> >> I created a small program to try out Lucene with MarkLogic Jena here
>> >>
>> >>
>> https://github.com/AlexTo/jena-lab/blob/master/src/main/java/com/company/MainMarkLogic.java
>> >>
>> >>
>> >>
>> >> My observation is as follows (see my comment at line 54 & 56)
>> >>
>> >> 1. If the model reads a small file with 2 triples, the loading can
>> finish
>> >> quickly
>> >> 2. If the model reads a slightly larger file (1.5MB), the loading takes
>> >> forever so I have to terminate it
>> >
>> > Pure speculation but parts 1 & 2 sounds like the data load is not going
>> > to MarkLogic as a single transaction but as "autocommit" - one
>> > transaction for each triple added.
>> >
>> >  Andy
>> >
>> >
>> >> 3. After loading the small file, searching the Lucene index direct
>> shows
>> >> that the triples are indexed
>> >> 4. After loading the small file, run SPARQL query with "text:query"
>> won't
>> >> finish
>> >>
>> >> For now I created 2 separate implementation in my program to support
>> Full
>> >> Text search with Jena or MarkLogic but I look forward to know more
>> >> whether
>> >> it is still possible to use Jena Elastic indexing with TextDataset
>> >> because
>> >> then I can provide a single UI to users to configure their search
>> >> regardless of the back end. :)
>> >>
>> >>
>> >> On Fri, Sep 13, 2019 at 1:07 AM Dan Davis  wrote:
>> >>
>> >>> I am incorrect, and apologize. Virtuoso's Jena 3 driver includes an
>> >>> implementation of Dataset, and so while application is only using the
>> >>> virtuoso.jena.driver.VirtGraph and
>> >>> virtuoso.jena.driver.VirtuosoQueryExecution (and factory), a more
>> >>> flexible
>> >>> integration is possible. I look forward to experimenting with it and
>> >>> seeing
>> >>> what I can do on the backend.
>> >>>
>> >>> On Thu, Sep 12, 2019 at 10:19 AM Dan Davis 
>> wrote:
>> >>>
>> >>>> Virtuoso's Jena driver implements the model interface, rather than
>> the
> >>>> DatasetGraphAPI. It is translating the SPARQL query into its own JDBC
>> >>>> interface. You can see the architecture at
>> >>>>
>> >>>
>> http://docs.openlinksw.com/virtuoso/rdfnativestorageprovidersjena/#rdfnativestorageprovidersjenawhatisv.
>>
>> >>>
>> >>> However,
>> >>>> Virtuoso has its own full-text indexing, which can be effective. Its
>> >>> rules
>> >>>> for translating words into queries is not as flexible as
>> >>>> lucene/solr/elastic, but it does allow you to specify what should be
>> >>>> indexed - e.g. which obj

Best way to re-index with Jena full text search

2019-09-17 Thread Alex To
Hi everyone

I have a question. I want to add a new field for Jena Full Text search
using EntityDefinition.set(String field, Node predicate).

For example, if I want to include ("prefLabel", SKOS.prefLabel) as a new
mapping, then obviously I need to re-index existing data, because I think
Jena creates index entries as triples are added.
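
Concretely, the mapping change itself is just the following (a sketch; the field
and vocabulary names are the ones from the example above):

import org.apache.jena.query.text.EntityDefinition;
import org.apache.jena.vocabulary.SKOS;

EntityDefinition entDef = new EntityDefinition("uri", "label");
entDef.set("prefLabel", SKOS.prefLabel.asNode());   // new field to index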

So the question is: what is the most efficient way to re-index existing data,
or do I have to re-import all the data each time I add a new field?

Thanks a lot

Best regards


Re: Can Jena Full Text search work with other Jena based API like Virtuoso Jena or MarkLogic Jena ?

2019-09-16 Thread Alex To
Hi Andy

I ended up creating separate implementations for Jena and MarkLogic full
text search for now due to time constraints of the project. I will
investigate further at a later time.

Thank you

Best Regards

On Sun, Sep 15, 2019 at 6:53 PM Andy Seaborne  wrote:

> Alex,
>
> I can't try it out - I don't have a Marklogic system.
>
> Can you see in the server logs what is happening?
>
>  > Pure speculation but parts 1 & 2 sounds like the data load is not going
>  > to MarkLogic as a single transaction but as "autocommit" - one
>  > transaction for each triple added.
>
>  Andy
>
> On 13/09/2019 23:04, Andy Seaborne wrote:
> > The maven central artifact com.marklogic:marklogic-jena is 3.0.6 but our
> > code depends on 3.1.0 - what code is it using?
> >
> > On 13/09/2019 01:18, Alex To wrote:
> >> I created a small program to try out Lucene with MarkLogic Jena here
> >>
> >>
> https://github.com/AlexTo/jena-lab/blob/master/src/main/java/com/company/MainMarkLogic.java
> >>
> >>
> >>
> >> My observation is as follows (see my comment at line 54 & 56)
> >>
> >> 1. If the model reads a small file with 2 triples, the loading can
> finish
> >> quickly
> >> 2. If the model reads a slightly larger file (1.5MB), the loading takes
> >> forever so I have to terminate it
> >
> > Pure speculation but parts 1 & 2 sounds like the data load is not going
> > to MarkLogic as a single transaction but as "autocommit" - one
> > transaction for each triple added.
> >
> >  Andy
> >
> >
> >> 3. After loading the small file, searching the Lucene index direct shows
> >> that the triples are indexed
> >> 4. After loading the small file, run SPARQL query with "text:query"
> won't
> >> finish
> >>
> >> For now I created 2 separate implementation in my program to support
> Full
> >> Text search with Jena or MarkLogic but I look forward to know more
> >> whether
> >> it is still possible to use Jena Elastic indexing with TextDataset
> >> because
> >> then I can provide a single UI to users to configure their search
> >> regardless of the back end. :)
> >>
> >>
> >> On Fri, Sep 13, 2019 at 1:07 AM Dan Davis  wrote:
> >>
> >>> I am incorrect, and apologize. Virtuoso's Jena 3 driver includes an
> >>> implementation of Dataset, and so while application is only using the
> >>> virtuoso.jena.driver.VirtGraph and
> >>> virtuoso.jena.driver.VirtuosoQueryExecution (and factory), a more
> >>> flexible
> >>> integration is possible. I look forward to experimenting with it and
> >>> seeing
> >>> what I can do on the backend.
> >>>
> >>> On Thu, Sep 12, 2019 at 10:19 AM Dan Davis  wrote:
> >>>
> >>>> Virtuoso's Jena driver implements the model interface, rather than the
> >>>> DatasetGraphAPI. It is translating the SPARQL query into its own JDBC
> >>>> interface. You can see the architecture at
> >>>>
> >>>
> http://docs.openlinksw.com/virtuoso/rdfnativestorageprovidersjena/#rdfnativestorageprovidersjenawhatisv.
>
> >>>
> >>> However,
> >>>> Virtuoso has its own full-text indexing, which can be effective. Its
> >>> rules
> >>>> for translating words into queries is not as flexible as
> >>>> lucene/solr/elastic, but it does allow you to specify what should be
> >>>> indexed - e.g. which objects from which data properties in which
> >>>> graphs.
> >>>>
> >>>> I use Virtuoso behind virt_jena and virt_jdbc.  You can see the code
> at
> >>>> https://github.com/HHS/lodestar, which is run underneath
> >>>> https://github.com/HHS/meshrdf.   You will see that
> >>>> https://github.com/HHS/lodestar is a fork from EBI, but the NLM copy
> >>>> has
> >>>> been updated to Jena 3. The EBI version is ahead on UI features
> >>>> however.
> >>>>
> >>>> I cannot speak to MarkLogic, Stardog, etc.
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> EBI's lodestar still uses Jena 2, but the fork at HHS has been
> >>>> updated to
> >>>> Jena 3.
> >>>>
> >>>> Virtuoso has its own full-text indexing, which is not as flexible in
> >>>

Re: Can Jena Full Text search work with other Jena based API like Virtuoso Jena or MarkLogic Jena ?

2019-09-13 Thread Alex To
Hi Andy

I had to pull the develop branch from here
https://github.com/marklogic/marklogic-jena/tree/develop to get the version
that works with Jena 3.1x.0,

then update the file
https://github.com/marklogic/marklogic-jena/blob/develop/marklogic-jena/build.gradle

with the following changes:

1. Line 9: change version *3.0-SNAPSHOT* to *3.1.0*
2. Line 13: change *3.10.0* to *3.12.0*

Then run "gradlew install" to install it to my local Maven repository.

On Sat, Sep 14, 2019 at 8:05 AM Andy Seaborne  wrote:

> The maven central artifact com.marklogic:marklogic-jena is 3.0.6 but our
> code depends on 3.1.0 - what code is it using?
>
> On 13/09/2019 01:18, Alex To wrote:
> > I created a small program to try out Lucene with MarkLogic Jena here
> >
> >
> https://github.com/AlexTo/jena-lab/blob/master/src/main/java/com/company/MainMarkLogic.java
> >
> >
> > My observation is as follows (see my comment at line 54 & 56)
> >
> > 1. If the model reads a small file with 2 triples, the loading can finish
> > quickly
> > 2. If the model reads a slightly larger file (1.5MB), the loading takes
> > forever so I have to terminate it
>
> Pure speculation but parts 1 & 2 sounds like the data load is not going
> to MarkLogic as a single transaction but as "autocommit" - one
> transaction for each triple added.
>
>  Andy
>
>
> > 3. After loading the small file, searching the Lucene index direct shows
> > that the triples are indexed
> > 4. After loading the small file, run SPARQL query with "text:query" won't
> > finish
> >
> > For now I created 2 separate implementation in my program to support Full
> > Text search with Jena or MarkLogic but I look forward to know more
> whether
> > it is still possible to use Jena Elastic indexing with TextDataset
> because
> > then I can provide a single UI to users to configure their search
> > regardless of the back end. :)
> >
> >
> > On Fri, Sep 13, 2019 at 1:07 AM Dan Davis  wrote:
> >
> >> I am incorrect, and apologize. Virtuoso's Jena 3 driver includes an
> >> implementation of Dataset, and so while application is only using the
> >> virtuoso.jena.driver.VirtGraph and
> >> virtuoso.jena.driver.VirtuosoQueryExecution (and factory), a more
> flexible
> >> integration is possible. I look forward to experimenting with it and
> seeing
> >> what I can do on the backend.
> >>
> >> On Thu, Sep 12, 2019 at 10:19 AM Dan Davis  wrote:
> >>
> >>> Virtuoso's Jena driver implements the model interface, rather than the
> >>> DatasetGraphAPI. It is translating the SPARQL query into its own JDBC
> >>> interface. You can see the architecture at
> >>>
> >>
> http://docs.openlinksw.com/virtuoso/rdfnativestorageprovidersjena/#rdfnativestorageprovidersjenawhatisv
> .
> >> However,
> >>> Virtuoso has its own full-text indexing, which can be effective. Its
> >> rules
> >>> for translating words into queries is not as flexible as
> >>> lucene/solr/elastic, but it does allow you to specify what should be
> >>> indexed - e.g. which objects from which data properties in which
> >>> graphs.
> >>>
> >>> I use Virtuoso behind virt_jena and virt_jdbc.  You can see the code at
> >>> https://github.com/HHS/lodestar, which is run underneath
> >>> https://github.com/HHS/meshrdf.   You will see that
> >>> https://github.com/HHS/lodestar is a fork from EBI, but the NLM copy
> has
> >>> been updated to Jena 3. The EBI version is ahead on UI features
> however.
> >>>
> >>> I cannot speak to MarkLogic, Stardog, etc.
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> EBI's lodestar still uses Jena 2, but the fork at HHS has been updated
> to
> >>> Jena 3.
> >>>
> >>> Virtuoso has its own full-text indexing, which is not as flexible in
> how
> >>> it indexes as Elastic/Solr/Lucene.   It still works.
> >>>
> >>>
> >>>
> >>>
> >>> On Thu, Sep 12, 2019 at 7:03 AM Andy Seaborne  wrote:
> >>>
> >>>> Yes, probably - but.
> >>>>
> >>>> The Jena text index will work in conjunction with any (Jena)
> >>>> DatasetGraphAPI implementation. 3rd party systems are not tested in
> the
> >>>> build.
> >>>>
> >>>> The "but" is efficiency. Both those systems have their own built-in
> text
> >>>>

Re: Can Jena Full Text search work with other Jena based API like Virtuoso Jena or MarkLogic Jena ?

2019-09-12 Thread Alex To
I created a small program to try out Lucene with MarkLogic Jena here

https://github.com/AlexTo/jena-lab/blob/master/src/main/java/com/company/MainMarkLogic.java


My observations are as follows (see my comments at lines 54 & 56):

1. If the model reads a small file with 2 triples, the loading can finish
quickly
2. If the model reads a slightly larger file (1.5MB), the loading takes
forever so I have to terminate it
3. After loading the small file, searching the Lucene index directly shows
that the triples are indexed
4. After loading the small file, running a SPARQL query with "text:query"
won't finish

For now I created 2 separate implementations in my program to support full
text search with Jena or MarkLogic, but I would like to know whether
it is still possible to use Jena Elastic indexing with TextDataset, because
then I can provide a single UI for users to configure their search
regardless of the back end. :)


On Fri, Sep 13, 2019 at 1:07 AM Dan Davis  wrote:

> I am incorrect, and apologize. Virtuoso's Jena 3 driver includes an
> implementation of Dataset, and so while application is only using the
> virtuoso.jena.driver.VirtGraph and
> virtuoso.jena.driver.VirtuosoQueryExecution (and factory), a more flexible
> integration is possible. I look forward to experimenting with it and seeing
> what I can do on the backend.
>
> On Thu, Sep 12, 2019 at 10:19 AM Dan Davis  wrote:
>
> > Virtuoso's Jena driver implements the model interface, rather than the
> > DatasetGraphAPI. It is translating the SPARQL query into its own JDBC
> > interface. You can see the architecture at
> >
> http://docs.openlinksw.com/virtuoso/rdfnativestorageprovidersjena/#rdfnativestorageprovidersjenawhatisv.
> However,
> > Virtuoso has its own full-text indexing, which can be effective. Its
> rules
> > for translating words into queries is not as flexible as
> > lucene/solr/elastic, but it does allow you to specify what should be
> > indexed - e.g. which objects from which data properties in which
> > graphs.
> >
> > I use Virtuoso behind virt_jena and virt_jdbc.  You can see the code at
> > https://github.com/HHS/lodestar, which is run underneath
> > https://github.com/HHS/meshrdf.   You will see that
> > https://github.com/HHS/lodestar is a fork from EBI, but the NLM copy has
> > been updated to Jena 3. The EBI version is ahead on UI features however.
> >
> > I cannot speak to MarkLogic, Stardog, etc.
> >
> >
> >
> >
> >
> > EBI's lodestar still uses Jena 2, but the fork at HHS has been updated to
> > Jena 3.
> >
> > Virtuoso has its own full-text indexing, which is not as flexible in how
> > it indexes as Elastic/Solr/Lucene.   It still works.
> >
> >
> >
> >
> > On Thu, Sep 12, 2019 at 7:03 AM Andy Seaborne  wrote:
> >
> >> Yes, probably - but.
> >>
> >> The Jena text index will work in conjunction with any (Jena)
> >> DatasetGraphAPI implementation. 3rd party systems are not tested in the
> >> build.
> >>
> >> The "but" is efficiency. Both those systems have their own built-in text
> >> indexing which execute as part of the native query engine. This may be a
> >> factor for you, it may not.
> >>
> >> Let us know how you get on trying it.
> >>
> >> 
> >>
> >> There is a SPARQL 1.2 issue about standardizing text query.
> >>
> >> Issue 40 : SPARQL 1.2 Community Group:
> >> https://github.com/w3c/sparql-12/issues/40
> >>
> >>  Andy
> >>
> >> On 12/09/2019 02:53, Alex To wrote:
> >> > Hi
> >> >
> >> > I have so far been happy with Jena + Lucene / Elastic. Just trying to
> >> get a
> >> > quick answer whether it can work with other Jena based API like
> >> Virtuoso /
> >> > MarkLogic.
> >> >
> >> > If I wrap a MarkLogic Dataset in a Jena TextDataset, can it work as
> >> > expected ?
> >> >
> >> > Given that a MarkLogic / Virtuoso Dataset implements Jena Dataset
> >> > interface, it may work but I am not sure because the "text:query"
> seems
> >> to
> >> > be more Jena specific.
> >> >
> >> > I will try out myself in the next couple of days to see if it works
> but
> >> if
> >> > there is a quick answer it may save me a couple of hours :)
> >> >
> >> > Thank a lot
> >> >
> >> > Regards
> >> >
> >>
> >
>


-- 

Alex To

PhD Candidate

School of Computer Science

Knowledge Discovery and Management Research Group

Faculty of Engineering & IT

THE UNIVERSITY OF SYDNEY | NSW | 2006

Desk 4e69 | Building J12| 1 Cleveland Street

M. +61423330656


Can Jena Full Text search work with other Jena based API like Virtuoso Jena or MarkLogic Jena ?

2019-09-11 Thread Alex To
Hi

I have so far been happy with Jena + Lucene / Elastic. Just trying to get a
quick answer whether it can work with other Jena based API like Virtuoso /
MarkLogic.

If I wrap a MarkLogic Dataset in a Jena TextDataset, can it work as
expected ?

Given that a MarkLogic / Virtuoso Dataset implements Jena Dataset
interface, it may work but I am not sure because the "text:query" seems to
be more Jena specific.

I will try out myself in the next couple of days to see if it works but if
there is a quick answer it may save me a couple of hours :)

Thank a lot

Regards


Re: Jena Lucene full text search return no results

2019-09-05 Thread Alex To
I figured it out; I need to include GRAPH ?g in the query.
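
For anyone hitting the same thing, a sketch of the query shape that worked for me
(only the GRAPH pattern is new compared to the original query; prefixes as before):

ParameterizedSparqlString q = new ParameterizedSparqlString(
        "SELECT ?s WHERE { " +
        "  GRAPH ?g { ?s text:query 'Taxi' } " +
        "}");
q.setNsPrefix("text", "http://jena.apache.org/text#");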

Thanks

On Fri, Sep 6, 2019 at 2:08 PM Alex To  wrote:

>
> Hi
>
>  I created a Jena Lucene index with this configuration
>
> var entDef = new EntityDefinition(
> "uri",
> "label",
> "graph", RDFS.label.asNode());
> entDef.setLangField("lang");
> entDef.setUidField("uid");
>
>
> Then I loaded schema.org ontology and queried using Lucene API directly
> on the index directory, I got expected result
>
> QueryParser parser = new QueryParser("label", analyzer);
> Query query = parser.parse("tax~");
>
> But if I use Jena query, I got no result
>
> ParameterizedSparqlString q = new ParameterizedSparqlString(
> "SELECT ?s WHERE { " +
> "   ?s text:query 'Taxi' " +
> "}");
>
> q.setNsPrefix("text", "http://jena.apache.org/text#");
>
>
> A minimal project to illustrate the problem can be found here
> https://github.com/AlexTo/jena-lab/blob/master/src/main/java/com/company/MainSearch.java
>
> If you run index() and then searchLucene() you can see expected results
> but if you run searchJena() you see nothing. What did I miss??
>
> Thank you
>
> Best Regards
>


-- 

Alex To

PhD Candidate

School of Computer Science

Knowledge Discovery and Management Research Group

Faculty of Engineering & IT

THE UNIVERSITY OF SYDNEY | NSW | 2006

Desk 4e69 | Building J12| 1 Cleveland Street

M. +61423330656


Jena Lucene full text search return no results

2019-09-05 Thread Alex To
Hi

 I created a Jena Lucene index with this configuration

var entDef = new EntityDefinition(
        "uri",                    // entity field: stores the subject URI
        "label",                  // default/primary indexed field
        "graph",                  // graph field: stores the graph URI
        RDFS.label.asNode());     // predicate whose values go into "label"
entDef.setLangField("lang");
entDef.setUidField("uid");


Then I loaded the schema.org ontology and queried using the Lucene API directly on
the index directory; I got the expected result:

QueryParser parser = new QueryParser("label", analyzer);
Query query = parser.parse("tax~");

But if I use a Jena query, I get no results:

ParameterizedSparqlString q = new ParameterizedSparqlString(
"SELECT ?s WHERE { " +
"   ?s text:query 'Taxi' " +
"}");

q.setNsPrefix("text", "http://jena.apache.org/text#");


A minimal project to illustrate the problem can be found here
https://github.com/AlexTo/jena-lab/blob/master/src/main/java/com/company/MainSearch.java

If you run index() and then searchLucene(), you can see the expected results, but
if you run searchJena() you see nothing. What did I miss?

Thank you

Best Regards


Re: How to use ontModel.listHierarchyRootClasses() properly

2019-09-05 Thread Alex To
Thanks Lorenz

I figured out the union thing too, so I ended up using the SPARQL below
instead. It seems to work fine, but I would love to know what the
equivalent is using the Jena API.

"SELECT DISTINCT ?s WHERE { " +
"  ?s a owl:Class . " +
"  FILTER (!isBlank(?s)) " +
"  FILTER (?s != owl:Thing && ?s != owl:Nothing) . " +
"  OPTIONAL { " +
"    ?s rdfs:subClassOf ?super . " +
"    FILTER (?super != rdfs:Resource && ?super != owl:Thing && ?s != ?super) " +
"  } . " +
"  FILTER (!bound(?super))" +
"}");


Regards


On Thu, Sep 5, 2019 at 8:00 PM Lorenz Buehmann <
buehm...@informatik.uni-leipzig.de> wrote:

> schema.org contains a bunch of anonymous classes like the union of other
> classes which are used as domain or range of a property, that's why you
> get null values because they do not have a URI. If you'd just call
>
> System.out.println(clazz);
>
>
> you'd see the blank node Ids.
>
>
> Among all those blank nodes, there is indeed the one top level class
> http://schema.org/Thing .
>
>
> I thought just adding a filter on the iterator is enough, i.e.
>
> topClazzez = topClazzez.filterDrop(OntResource::isAnon);
>
>
> but while it doesn't return the blank nodes anymore, it leads to an
> exception:
>
>
> Exception in thread "main"
> org.apache.jena.ontology.ConversionException: Cannot convert node
> http://www.w3.org/2000/01/rdf-schema#Class to OntClass: it does not have
> rdf:type owl:Class or equivalent
>   at org.apache.jena.ontology.impl.OntClassImpl$1.wrap(OntClassImpl.java:82)
>   at org.apache.jena.enhanced.EnhNode.convertTo(EnhNode.java:152)
>   at org.apache.jena.enhanced.EnhNode.convertTo(EnhNode.java:31)
>   at org.apache.jena.enhanced.Polymorphic.asInternal(Polymorphic.java:62)
>   at org.apache.jena.enhanced.EnhNode.as(EnhNode.java:107)
>   at org.apache.jena.ontology.impl.OntResourceImpl.lambda$listDirectPropertyValues$8(OntResourceImpl.java:1536)
>   at org.apache.jena.util.iterator.Map1Iterator.next(Map1Iterator.java:46)
>   at org.apache.jena.ontology.impl.OntResourceImpl.computeDirectValues(OntResourceImpl.java:1580)
>   at org.apache.jena.ontology.impl.OntResourceImpl.listDirectPropertyValues(OntResourceImpl.java:1553)
>   at org.apache.jena.ontology.impl.OntClassImpl.listSuperClasses(OntClassImpl.java:180)
>   at org.apache.jena.ontology.impl.OntClassImpl.isHierarchyRoot(OntClassImpl.java:739)
>   at org.apache.jena.util.iterator.FilterIterator.hasNext(FilterIterator.java:56)
>   at org.apache.jena.util.iterator.WrappedIterator.hasNext(WrappedIterator.java:90)
>   at org.apache.jena.util.iterator.FilterIterator.hasNext(FilterIterator.java:55)
>   at com.company.Main.main(Main.java:25)
>
>
>
> On 05.09.19 03:11, Alex To wrote:
> > Hi I am trying to load schema.org ontology and get all top classes
> > using ontModel.listHierarchyRootClasses() but can't get the expected
> > results with different OntModelSpec.
> >
> > If I use OWL_MEM, it lists 2500+ records with all the records have "null"
> > URI.
> > If I use OWL_DL_MEM, it lists 0 records
> >
> > The code is very simple as follows
> >
> > Model model = ModelFactory.createDefaultModel();
> > model.read("https://schema.org/docs/schemaorg.owl");
> > OntModel ontModel =
> > ModelFactory.createOntologyModel(OntModelSpec.OWL_MEM, model);
> > ExtendedIterator<OntClass> topClazzez =
> ontModel.listHierarchyRootClasses();
> > while (topClazzez.hasNext()) {
> > OntClass clazz = topClazzez.next();
> > System.out.println(clazz.getURI());
> > }
> >
> > A minimal Maven project ready to run to demonstrate my problem is here
> > https://github.com/AlexTo/jena-lab (have a look at the Main.java)
> >
> > Thanks a lot
>


--


How to use ontModel.listHierarchyRootClasses() properly

2019-09-04 Thread Alex To
Hi, I am trying to load the schema.org ontology and get all top classes
using ontModel.listHierarchyRootClasses(), but I can't get the expected
results with different OntModelSpec values.

If I use OWL_MEM, it lists 2500+ records, all of which have a "null" URI.
If I use OWL_DL_MEM, it lists 0 records.

The code is very simple, as follows:

Model model = ModelFactory.createDefaultModel();
model.read("https://schema.org/docs/schemaorg.owl");
OntModel ontModel = ModelFactory.createOntologyModel(OntModelSpec.OWL_MEM, model);
ExtendedIterator<OntClass> topClazzez = ontModel.listHierarchyRootClasses();
while (topClazzez.hasNext()) {
    OntClass clazz = topClazzez.next();
    System.out.println(clazz.getURI());
}

A minimal Maven project ready to run to demonstrate my problem is here
https://github.com/AlexTo/jena-lab (have a look at the Main.java)

Thanks a lot
-- 

Alex To

PhD Candidate

School of Computer Science

Knowledge Discovery and Management Research Group

Faculty of Engineering & IT

THE UNIVERSITY OF SYDNEY | NSW | 2006

Desk 4e69 | Building J12| 1 Cleveland Street

M. +61423330656 <%2B61450061602>


Re: CSV to rdf

2019-02-18 Thread Alex To
I'm surprised that no one mentioned RML (http://rml.io/), which is a W3C
standard, and the Java library https://github.com/RMLio/rmlmapper-java/

I've been happily using this library to transform XML and CSV from web
services.

Regards


On Sat, Feb 16, 2019 at 12:34 AM John A. Fereira  wrote:

>
> When this question has come up in the past I’ve recommended the VIVO
> Harvester tool (https://github.com/vivo-project/VIVO-Harvester) which can
> transform content to RDF from many different sources (csv, jdbc database,
> json, xml, web services).  It consists of a suite of tools for ingesting
> content into the VIVO semantic web application but could be used for
> creating RDF in general (it is based on Jena).  It essentially transforms
> from one of the supported data sources into a “flat” rdf structure which
> can then be transformed into RDF which models the desired ontology using
> XSLT.   Note that database such as MySQL can get loaded from a CSV file,
> and then a tool like D2R Map could be used to transform content from the
> database to RDF.
>
> On 2/14/19, 11:31 PM, "Conal Tuohy"  wrote:
>
> Elio, this small XSLT may be a helpful example:
>
> https://github.com/TEIC/Stylesheets/blob/dev/profiles/default/csv/from.xsl
>
> It opens the CSV file, splits it into lines, and then generates a new
> TEI
> XML document in which it simply re-encodes the tabular data using TEI
> , , and  elements. You can see how you could use the
> same
> technique, but output RDF/XML or TriX, if that's what you want to do.
>
> Regards
>
> Conal
>
> On Thu, 14 Feb 2019 at 23:59, elio hbeich 
> wrote:
>
> > Dear all
> >
> > Do you have any suggestion about tools or XSLT that can transform
> CSV to
> > RDF
> >
> > Thank you in advance,
> > Elio HBEICH
> >
>
>
> --
> Conal Tuohy
> http://conaltuohy.com/
> @conal_tuohy
> +61-466-324297
>
>
>

-- 

Alex To

PhD Candidate

School of Computer Science

Knowledge Discovery and Management Research Group

Faculty of Engineering & IT

THE UNIVERSITY OF SYDNEY | NSW | 2006

Desk 4e69 | Building J12| 1 Cleveland Street

M. +61423330656


Re: Loosely converting JSON/XML to RDF

2018-11-07 Thread Alex To
Perhaps you want to review JSON-LD and RML to understand what the two
standards are doing; then your question becomes very obvious.

With JSON-LD, you add "extra elements" to existing JSON data to translate the
JSON to triples, with almost no control over how subjects, predicates and
objects in the triples are generated from the JSON.
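
On the Jena side, a JSON-LD document (with its context) can then be read straight
into a model; a minimal sketch (the file name is a placeholder):

import org.apache.jena.rdf.model.Model;
import org.apache.jena.riot.Lang;
import org.apache.jena.riot.RDFDataMgr;

Model model = RDFDataMgr.loadModel("data.jsonld", Lang.JSONLD);  // hypothetical input file
model.write(System.out, "TURTLE");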

With RML, you specify exactly which XML/JSON elements should become
subjects, predicates or objects. You can even specify the format of the
URIs to be generated, etc.

Have a look at the example here http://rml.io/RML_examples.html

On Wed, Nov 7, 2018 at 9:41 PM Laura Morales  wrote:

> This made me thinking... if I can convert CSV, XML, and other formats to
> JSON, and then use JSON-LD context and framing to change the data to my
> linking, why do tools such as RML, YARRRML, and SPARQL-Generate exist at
> all? Do they do anything at all that can't be done with JSON-LD?
>
>
>
>
> Sent: Monday, November 05, 2018 at 9:10 AM
> From: "Christopher Johnson" 
> To: users@jena.apache.org
> Subject: Re: Loosely converting JSON/XML to RDF
> Another approach is to use JSON-LD. A JSON document can be "converted" to
> RDF by adding a context and using the toRDF method[1] in one of the JSON-LD
> libraries. Defining the context is similar to what is done with RML,
> basically mapping data objects to structured vocabulary terms. If your XML
> is sufficiently denormalized, you can also convert that to JSON and repeat
> the same process as above.
>
> Christopher Johnson
> Scientific Associate
> Universitätsbibliothek Leipzig
>
> [1] https://json-ld.org/spec/latest/json-ld-api/#object-to-rdf-conversion
>
> On Mon, 5 Nov 2018 at 08:55, Alex To  wrote:
>
> > We have web services returning XML and JSON in our environment. We use
> >
> https://github.com/RMLio/rmlmapper-java[https://github.com/RMLio/rmlmapper-java]
> to map XML/JSON to RDF with
> > satisfied results.
> >
> > Or course you need a valid URI for your XML or Json elements for e.g. in
> > our XML, if we have ... then we use RML to
> map
> > it to
> >
> >
> http://ourdomain.com/resources/students/{id}
> rdfs:type
> > http://ourdomain.com/ont/Student
> >
> > You can define your own URI generation scheme whatever works for you
> >
> > You can read more about RDF Mapping Language (RML) from W3C website.
> >
> > Regards
> >
> > On Mon, 5 Nov 2018 at 6:34 pm, Laura Morales  wrote:
> >
> > > I have a mixed set of datasets in XML, JSON, and RDF formats. I would
> > like
> > > to convert all the XML/JSON ones to RDF such that I can only use one
> > query
> > > language/library to access all the data, instead of having three
> > different
> > > ones. I'm also not interested in using any particular ontology or
> > > vocabulary for the conversion, so anything will work as long as I can
> > make
> > > the conversion.
> > > What would be an appropriate strategy for this? Since RDF requires
> > > absolute IRIs, would it be a good idea for example to convert all
> > > properties to
http://example.org/property-name-1,
> > http://example.org/property-name-2,
> ...? And maybe use UUIDs for nodes?
> > > Or is there a better way of doing this?
> > >
> >
>


--


Re: Loosely converting JSON/XML to RDF

2018-11-04 Thread Alex To
We have web services returning XML and JSON in our environment. We use
https://github.com/RMLio/rmlmapper-java to map XML/JSON to RDF with
satisfying results.

Of course, you need a valid URI for your XML or JSON elements. For example, in
our XML, if we have ... then we use RML to map
it to

http://ourdomain.com/resources/students/{id} rdfs:type
http://ourdomain.com/ont/Student

You can define your own URI generation scheme, whatever works for you.

You can read more about the RDF Mapping Language (RML) on the W3C website.

Regards

On Mon, 5 Nov 2018 at 6:34 pm, Laura Morales  wrote:

> I have a mixed set of datasets in XML, JSON, and RDF formats. I would like
> to convert all the XML/JSON ones to RDF such that I can only use one query
> language/library to access all the data, instead of having three different
> ones. I'm also not interested in using any particular ontology or
> vocabulary for the conversion, so anything will work as long as I can make
> the conversion.
> What would be an appropriate strategy for this? Since RDF requires
> absolute IRIs, would it be a good idea for example to convert all
> properties to http://example.org/property-name-1,
> http://example.org/property-name-2, ...? And maybe use UUIDs for nodes?
> Or is there a better way of doing this?
>


Re: Inference

2018-10-30 Thread Alex To
See SPARQL property path https://www.w3.org/TR/sparql11-query/#propertypaths

Assuming you have the Schema vocab loaded in the same dataset (or graph),
the following should work

SELECT ?entity
WHERE {
?entity rdf:type/rdfs:subClassOf* :CreativeWork
}
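
For reference, a minimal sketch of running that query from Java against a local
model (the data file and the schema: prefix target are placeholders; against Fuseki
the same query text applies):

import org.apache.jena.query.*;
import org.apache.jena.rdf.model.Model;
import org.apache.jena.riot.RDFDataMgr;

Model model = RDFDataMgr.loadModel("data.ttl");   // data plus the schema.org vocabulary
String q = "PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> "
         + "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> "
         + "PREFIX schema: <http://schema.org/> "
         + "SELECT ?entity WHERE { ?entity rdf:type/rdfs:subClassOf* schema:CreativeWork }";
try (QueryExecution qe = QueryExecutionFactory.create(QueryFactory.create(q), model)) {
    ResultSetFormatter.out(qe.execSelect());
}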


On Tue, Oct 30, 2018 at 5:20 PM Laura Morales  wrote:

> Let's say I have a node of type schema:Book and one of type
> schema:VideoGame. In the Schema vocabulary, both are subclasses of
> schema:CreativeWork.
> Can somebody please give me a hint how to query Fuseki for
> schema:CreativeWork in order to retrieve both types?
>


-- 

Alex To

PhD Candidate

School of Information Technologies

Knowledge Discovery and Management Research Group

Faculty of Engineering & IT

THE UNIVERSITY OF SYDNEY | NSW | 2006

Desk 4e69 | Building J12| 1 Cleveland Street

M. +61423330656


Getting URI of the ontology from ontology model

2018-07-31 Thread Alex To
Hi

May I know what is the best way to get the URI of an ontology? I have
looked at this page creating-ontology-models
<https://jena.apache.org/documentation/ontology/#creating-ontology-models> and
this page OntDocumentManager
<https://jena.apache.org/documentation/javadoc/jena/org/apache/jena/ontology/OntDocumentManager.html#addModel-java.lang.String-org.apache.jena.rdf.model.Model-boolean->
but
can't seem to figure it out.

The problem is that, in my application, the user will input a URI, e.g.
"https://www.w3.org/2002/07/owl#", and I use the Jena Ontology API to import the
ontology from this URL.

However, the actual URI of the ontology is "http://www.w3.org/2002/07/owl#"
(it is HTTP and not HTTPS).

Even worse, the OWL ontology specifies an import of
"http://www.w3.org/2000/01/rdf-schema" but the URI of RDFS is
"http://www.w3.org/2000/01/rdf-schema#".

How do I get the URI of the ontology regardless of the URL where the
ontology is resolved from?

Do I need to query the imported model for the triple "?s a owl:Ontology"
to find out, or does Jena have some built-in function for this?
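
As a sketch of the triple-lookup approach mentioned above (no claim that this is
the only or the built-in way; 'model' is the imported ontology model):

import org.apache.jena.rdf.model.ResIterator;
import org.apache.jena.vocabulary.OWL;
import org.apache.jena.vocabulary.RDF;

ResIterator it = model.listResourcesWithProperty(RDF.type, OWL.Ontology);
while (it.hasNext()) {
    // the ontology's own URI, independent of the URL it was resolved from
    System.out.println(it.next().getURI());
}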

Thank you

Best Regards

-- 

Alex To

PhD Candidate

School of Information Technologies

Knowledge Discovery and Management Research Group

Faculty of Engineering & IT

THE UNIVERSITY OF SYDNEY | NSW | 2006

Desk 4e69 | Building J12| 1 Cleveland Street

M. +61423330656


-- 

Alex To

PhD Candidate

School of Information Technologies

Knowledge Discovery and Management Research Group

Faculty of Engineering & IT

THE UNIVERSITY OF SYDNEY | NSW | 2006

Desk 4e69 | Building J12| 1 Cleveland Street

M. +61423330656 


Re-using Lucene index

2017-09-27 Thread Alex Vasinca
Hi,



My team and I are currently developing an application involving Jena.
We're using jena-text together with Lucene indexing for full text search
support, but we're doing it through SPARQL queries. However, on every
startup of the app the indexing is done all over again, even if the data
didn't change.


Is there any way to re-use the already built index and get a Dataset object
from it?



I have to mention that, as we use SPARQL queries for the search, we need a
Dataset object, which we can successfully get from the TextDatasetFactory
class with the TextDatasetFactory#create(Dataset base, TextIndex textIndex)
method, but this method triggers the indexing again if we stop the app and
then start it again.
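
For what it's worth, a sketch of the variant we would expect to point at an on-disk
index rather than build one in memory (the paths, the TDB base dataset and the entity
definition are assumptions; whether indexing is re-triggered still depends on how the
data is loaded at startup):

import java.nio.file.Paths;
import org.apache.jena.query.Dataset;
import org.apache.jena.query.text.EntityDefinition;
import org.apache.jena.query.text.TextDatasetFactory;
import org.apache.jena.query.text.TextIndexConfig;
import org.apache.jena.tdb.TDBFactory;
import org.apache.jena.vocabulary.RDFS;
import org.apache.lucene.store.FSDirectory;

Dataset base = TDBFactory.createDataset("/data/tdb");            // persistent base dataset
EntityDefinition entDef = new EntityDefinition("uri", "text", RDFS.label.asNode());
Dataset dataset = TextDatasetFactory.createLucene(
        base,
        FSDirectory.open(Paths.get("/data/lucene")),             // existing index directory
        new TextIndexConfig(entDef));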

I hope you can understand the use case I’m targeting from the above
explanation.



Cheers,

Alex



Sent from Mail <https://go.microsoft.com/fwlink/?LinkId=550986> for Windows
10


-- 

Alex Vasinca

Software Developer

Alternative e-mail: alexandru.vasi...@gmail.com

Mobile: (+4) 0742737706


Re: Upgrade to Jena 3.1.0 breaks the query

2016-06-21 Thread Alex Shkop

Thanks, Andy. The workaround fixed the problem in my environment.

Alex

On 21.06.16 14:01, Andy Seaborne wrote:

Alex,

Thanks - the key point here is TDB - it follows a slightly different 
path through the code.


https://issues.apache.org/jira/browse/JENA-1198

The best workaround for Fuseki I came up with is to add a context 
setting to the dataset:


:tdb_dataset_readwrite
a tdb:DatasetTDB ;
tdb:location  "/fuseki/databases/test" ;
# Fuseki 2.4.0 only - remove for later versions.
ja:context [ ja:cxtName "arq:optFilterPlacementBGP" ;
 ja:cxtValue "true" ] ;
.

which is something you should remove at the next upgrade and also test 
in your environment.


    Andy

On 21/06/16 07:24, Alex Shkop wrote:

Hi,

I'm deploying Fuseki with docker. I'm using a container identical to
that one https://hub.docker.com/r/stain/jena-fuseki/, but with Fuseki
2.4.0. And I can reproduce an issue with this simple assembler file:

@prefix :  <http://base/#> .
@prefix tdb:   <http://jena.hpl.hp.com/2008/tdb#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix ja:<http://jena.hpl.hp.com/2005/11/Assembler#> .
@prefix rdfs:  <http://www.w3.org/2000/01/rdf-schema#> .
@prefix fuseki: <http://jena.apache.org/fuseki#> .
@prefix text:<http://jena.apache.org/text#> .

@prefix rdfs:<http://www.w3.org/2000/01/rdf-schema#> .
@prefix schema: <http://schema.org/> .


[] ja:loadClass "com.hp.hpl.jena.tdb.TDB" .
tdb:DatasetTDB  rdfs:subClassOf  ja:RDFDataset .
tdb:GraphTDBrdfs:subClassOf  ja:Model .


Unrelated:
These 3 triples are no longer necessary




# Here we define the actual dataset
:tdb_dataset_readwrite
 a tdb:DatasetTDB ;
 tdb:location  "/fuseki/databases/test" .




:service_tdb_all  a   fuseki:Service ;
 rdfs:label"TDB test" ;
 fuseki:dataset:tdb_dataset_readwrite ;
 fuseki:name   "test" ;
 fuseki:serviceQuery   "query" , "sparql" ;
 fuseki:serviceReadGraphStore  "get" ;
 fuseki:serviceReadWriteGraphStore
 "data" ;
 fuseki:serviceUpdate  "update" ;
 fuseki:serviceUpload  "upload" .

Thanks,
Alex

On 17.06.16 19:54, Andy Seaborne wrote:

Hi,

I can't reproduce this with Fuseki 2.4.0 - could explain your Fuseki
setup please?

Andy



On 17/06/16 16:25, Alex Shkop wrote:

Hello

I've upgraded to Jena 3.1.0 and some of my SPARQL queries now crash 
with
NullPointerException. The simplest query that causes a crash looks 
like

this:

SELECT ?s
WHERE {
 ?s ?p ?o .
 ?s <http://www.w3.org/2000/01/rdf-schema#member> ?m .
 FILTER (!bound(?test))
}

Here's the stack trace:
java.lang.NullPointerException
 at
org.apache.jena.sparql.algebra.optimize.TransformFilterPlacement.placePropertyFunctionProcedure(TransformFilterPlacement.java:453) 




 at
org.apache.jena.sparql.algebra.optimize.TransformFilterPlacement.placePropertyFunction(TransformFilterPlacement.java:432) 




 at
org.apache.jena.sparql.algebra.optimize.TransformFilterPlacement.transform(TransformFilterPlacement.java:200) 




 at
org.apache.jena.sparql.algebra.optimize.TransformFilterPlacement.transform(TransformFilterPlacement.java:159) 




 at
org.apache.jena.sparql.algebra.TransformWrapper.transform(TransformWrapper.java:59) 




 at
org.apache.jena.sparql.algebra.op.OpFilter.apply(OpFilter.java:100)
 at
org.apache.jena.sparql.algebra.Transformer$ApplyTransformVisitor.visitFilter(Transformer.java:401) 




 at
org.apache.jena.sparql.algebra.OpVisitorByType.visit(OpVisitorByType.java:110) 




 at
org.apache.jena.sparql.algebra.op.OpFilter.visit(OpFilter.java:103)
 at
org.apache.jena.sparql.algebra.OpWalker$WalkerVisitor.visit1(OpWalker.java:85) 




 at
org.apache.jena.sparql.algebra.OpWalker$WalkerVisitor.visitFilter(OpWalker.java:91) 




 at
org.apache.jena.sparql.algebra.OpVisitorByType.visit(OpVisitorByType.java:110) 




 at
org.apache.jena.sparql.algebra.op.OpFilter.visit(OpFilter.java:103)
 at
org.apache.jena.sparql.algebra.OpWalker$WalkerVisitor.visit1(OpWalker.java:83) 




 at
org.apache.jena.sparql.algebra.OpVisitorByType.visitModifer(OpVisitorByType.java:42) 




 at
org.apache.jena.sparql.algebra.OpVisitorByType.visit(OpVisitorByType.java:158) 




 at
org.apache.jena.sparql.algebra.op.OpProject.visit(OpProject.java:47)
 at org.apache.jena.sparql.algebra.OpWalker.walk(OpWalker.java:43)
 at org.apache.jena.sparql.algebra.OpWalker.walk(OpWalker.java:38)
 at
org.apache.jena.sparql.algebra.Transformer.applyTransformation(Transformer.java

Re: Upgrade to Jena 3.1.0 breaks the query

2016-06-21 Thread Alex Shkop

Hi,

I'm deploying Fuseki with docker. I'm using a container identical to 
that one https://hub.docker.com/r/stain/jena-fuseki/, but with Fuseki 
2.4.0. And I can reproduce an issue with this simple assembler file:


@prefix :  <http://base/#> .
@prefix tdb:   <http://jena.hpl.hp.com/2008/tdb#> .
@prefix rdf:   <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix ja:<http://jena.hpl.hp.com/2005/11/Assembler#> .
@prefix rdfs:  <http://www.w3.org/2000/01/rdf-schema#> .
@prefix fuseki: <http://jena.apache.org/fuseki#> .
@prefix text:<http://jena.apache.org/text#> .

@prefix rdfs:<http://www.w3.org/2000/01/rdf-schema#> .
@prefix schema: <http://schema.org/> .


[] ja:loadClass "com.hp.hpl.jena.tdb.TDB" .
tdb:DatasetTDB  rdfs:subClassOf  ja:RDFDataset .
tdb:GraphTDBrdfs:subClassOf  ja:Model .


# Here we define the actual dataset
:tdb_dataset_readwrite
a tdb:DatasetTDB ;
tdb:location  "/fuseki/databases/test" .




:service_tdb_all  a   fuseki:Service ;
rdfs:label"TDB test" ;
fuseki:dataset:tdb_dataset_readwrite ;
fuseki:name   "test" ;
fuseki:serviceQuery   "query" , "sparql" ;
fuseki:serviceReadGraphStore  "get" ;
fuseki:serviceReadWriteGraphStore
"data" ;
fuseki:serviceUpdate  "update" ;
fuseki:serviceUpload  "upload" .

Thanks,
Alex

On 17.06.16 19:54, Andy Seaborne wrote:

Hi,

I can't reproduce this with Fuseki 2.4.0 - could explain your Fuseki 
setup please?


Andy



On 17/06/16 16:25, Alex Shkop wrote:

Hello

I've upgraded to Jena 3.1.0 and some of my SPARQL queries now crash with
NullPointerException. The simplest query that causes a crash looks like
this:

SELECT ?s
WHERE {
 ?s ?p ?o .
 ?s <http://www.w3.org/2000/01/rdf-schema#member> ?m .
 FILTER (!bound(?test))
}

Here's the stack trace:
java.lang.NullPointerException
 at
org.apache.jena.sparql.algebra.optimize.TransformFilterPlacement.placePropertyFunctionProcedure(TransformFilterPlacement.java:453) 



 at
org.apache.jena.sparql.algebra.optimize.TransformFilterPlacement.placePropertyFunction(TransformFilterPlacement.java:432) 



 at
org.apache.jena.sparql.algebra.optimize.TransformFilterPlacement.transform(TransformFilterPlacement.java:200) 



 at
org.apache.jena.sparql.algebra.optimize.TransformFilterPlacement.transform(TransformFilterPlacement.java:159) 



 at
org.apache.jena.sparql.algebra.TransformWrapper.transform(TransformWrapper.java:59) 



 at 
org.apache.jena.sparql.algebra.op.OpFilter.apply(OpFilter.java:100)

 at
org.apache.jena.sparql.algebra.Transformer$ApplyTransformVisitor.visitFilter(Transformer.java:401) 



 at
org.apache.jena.sparql.algebra.OpVisitorByType.visit(OpVisitorByType.java:110) 



 at 
org.apache.jena.sparql.algebra.op.OpFilter.visit(OpFilter.java:103)

 at
org.apache.jena.sparql.algebra.OpWalker$WalkerVisitor.visit1(OpWalker.java:85) 



 at
org.apache.jena.sparql.algebra.OpWalker$WalkerVisitor.visitFilter(OpWalker.java:91) 



 at
org.apache.jena.sparql.algebra.OpVisitorByType.visit(OpVisitorByType.java:110) 



 at 
org.apache.jena.sparql.algebra.op.OpFilter.visit(OpFilter.java:103)

 at
org.apache.jena.sparql.algebra.OpWalker$WalkerVisitor.visit1(OpWalker.java:83) 



 at
org.apache.jena.sparql.algebra.OpVisitorByType.visitModifer(OpVisitorByType.java:42) 



 at
org.apache.jena.sparql.algebra.OpVisitorByType.visit(OpVisitorByType.java:158) 



 at
org.apache.jena.sparql.algebra.op.OpProject.visit(OpProject.java:47)
 at org.apache.jena.sparql.algebra.OpWalker.walk(OpWalker.java:43)
 at org.apache.jena.sparql.algebra.OpWalker.walk(OpWalker.java:38)
 at
org.apache.jena.sparql.algebra.Transformer.applyTransformation(Transformer.java:147) 



 at
org.apache.jena.sparql.algebra.Transformer.transformation(Transformer.java:140) 



 at
org.apache.jena.sparql.algebra.Transformer.transformation(Transformer.java:129) 



 at
org.apache.jena.sparql.algebra.Transformer.transformation(Transformer.java:123) 



 at
org.apache.jena.sparql.algebra.Transformer.transform(Transformer.java:56) 


 at
org.apache.jena.sparql.algebra.Transformer.transformSkipService(Transformer.java:86) 



 at
org.apache.jena.sparql.algebra.Transformer.transformSkipService(Transformer.java:68) 



 at
org.apache.jena.sparql.algebra.optimize.Optimize.apply(Optimize.java:282) 


 at
org.apache.jena.sparql.algebra.optimize.Optimize.rewrite(Optimize.java:225) 


 at
org.apache.jena.sparql.algebra.optimize.Optimize.optimize(Optimize.java:78) 


 at org.apache.jena.sparql.algebra.Algebra.optimize(Algebra.java:65)
 at
org.apache

Re[2]: Problem with getting jena jar files.

2015-07-04 Thread Alex Sviridov
 Thank you very much. I added the aoache snapshot repository and I got the jar 
file. Can you answer the following questions:
1) is is standart apache policy not to keep snapshot in central maven?
2) how can I building jena from sources to get this osgi jar? I tried maven 
install both to parent and module. I got the output that jar was copied but in 
local maven repo I had only pom.


Saturday, 4 July 2015, 21:00 +01:00, from Andy Seaborne a...@apache.org:
On 04/07/15 20:40, Alex Sviridov wrote:
   Thank you for your answer. But if I add <type>POM</type> then the necessary
 classes are not found.

<type>pom</type>  Lowercase.

Which classes?

(do you have a mix of 2.13.0 and 3.0.0-SNAPSHOT? because there is a 
package name change between them)

Snapshots need

 <repository>
   <id>apache.snapshots</id>
   <name>Apache Snapshot Repository</name>
   <url>http://repository.apache.org/snapshots</url>
   <releases>
     <enabled>false</enabled>
   </releases>
   <snapshots>
     <enabled>true</enabled>
   </snapshots>
 </repository>

because they are not in maven central.


 Really, I don't understand who and why did this way. It really as nightmare 
 to get jena-osgi jar file. Please, help me to get as I use osgi and need 
 jena as osgi bundle.


The artifact is apache-jena-osgi (that currently goes to jena-osgi which 
is the jar IIRC)

I'm trying to! (and I don't use OSGI currently)

Andy


 Saturday, 4 July 2015, 20:21 +01:00, from Andy Seaborne a...@apache.org:
 On 04/07/15 13:15, Alex Sviridov wrote:

 I can't get apache-jena-osgi jar. I tried central repo
 <dependency>
   <groupId>org.apache.jena</groupId>
   <artifactId>apache-jena-osgi</artifactId>
   <version>2.13.0</version>
 </dependency>
 but constantly get
 The POM for org.apache.jena:jena-osgi:jar:2.12.2-SNAPSHOT is missing, no 
 dependency information available


 I think it is because you need to add <type>pom</type>

<dependency>
 <groupId>org.apache.jena</groupId>
 <artifactId>apache-jena-osgi</artifactId>
 <version>2.13.0</version>
 <type>pom</type>
</dependency>

 org.apache.jena:apache-jena-osgi is not the bundle itself but an
 indirection point that pulls in the right modules (which the project
 reserves the right to change, hence the indirection point).

 I don't know why it is saying 2.12.2-SNAPSHOT -- the released
 apache-jena-osgi looks OK to me.

 (there is supposed to be some documentation at /download/osgi.html but
 no one has written it yet. Hint, hint :-)

 Andy


 Finally I downloaded sources and built it myself. Here is the output of 
 maven install:
 [INFO] --- maven-install-plugin:2.5.2:install (default-install) @ jena-osgi ---
 [INFO] Installing /home/Me/SoftProjects/LIB/jena-master/apache-jena-osgi/jena-osgi/target/jena-osgi-3.0.0-SNAPSHOT.jar to /home/Me/.m2/repository/org/apache/jena/jena-osgi/3.0.0-SNAPSHOT/jena-osgi-3.0.0-SNAPSHOT.jar
 [INFO] Installing /home/Me/SoftProjects/LIB/jena-master/apache-jena-osgi/jena-osgi/pom.xml to /home/Me/.m2/repository/org/apache/jena/jena-osgi/3.0.0-SNAPSHOT/jena-osgi-3.0.0-SNAPSHOT.pom
 [INFO] --- maven-bundle-plugin:2.5.3:install (default-install) @ jena-osgi ---
 [INFO] Installing org/apache/jena/jena-osgi/3.0.0-SNAPSHOT/jena-osgi-3.0.0-SNAPSHOT.jar
 [INFO] Writing OBR metadata
 [INFO] Reactor Summary:
 [INFO] Apache Jena - OSGi ................................ SUCCESS [4.898s]
 [INFO] Apache Jena - OSGi bundle ......................... SUCCESS [26.290s]
 [INFO] BUILD SUCCESS
 [INFO] Total time: 32.665s
 [INFO] Finished at: Sat Jul 04 14:32:44 MSK 2015
 [INFO] Final Memory: 30M/450M
 However in
 /home/Me/.m2/repository/org/apache/jena/jena-osgi/3.0.0-SNAPSHOT/ there is
 no jar file, only the pom.

 How can I get jena-osgi.jar? I have never had such a strange problem with
 getting a jar file.











-- 
Alex Sviridov


Re[2]: Problem with getting jena jar files.

2015-07-04 Thread Alex Sviridov
 Thank you for your answer. But if I add <type>POM</type> then the necessary
classes are not found.

Really, I don't understand who decided to do it this way, and why. It really is a nightmare to
get the jena-osgi jar file. Please help me to get it, as I use OSGi and need Jena as an
OSGi bundle.


Saturday, 4 July 2015, 20:21 +01:00, from Andy Seaborne a...@apache.org:
On 04/07/15 13:15, Alex Sviridov wrote:

 I can't get apache-jena-osgi jar. I tried central repo
 <dependency>
  <groupId>org.apache.jena</groupId>
  <artifactId>apache-jena-osgi</artifactId>
  <version>2.13.0</version>
 </dependency>
 but constantly get
 The POM for org.apache.jena:jena-osgi:jar:2.12.2-SNAPSHOT is missing, no 
 dependency information available


I think it is because you need to add <type>pom</type>

  <dependency>
   <groupId>org.apache.jena</groupId>
   <artifactId>apache-jena-osgi</artifactId>
   <version>2.13.0</version>
   <type>pom</type>
  </dependency>

org.apache.jena:apache-jena-osgi is not the bundle itself but an 
indirection point that pulls in the right modules (which the project 
reserves the right to change, hence the indirection point).

I don't know why it is saying 2.12.2-SNAPSHOT -- the released 
apache-jena-osgi looks OK to me.

(there is supposed to be some documentation at /download/osgi.html but 
no one has written it yet. Hint, hint :-)

Andy


 Finally I downloaded sources and built it myself. Here is the output of 
 maven install:
 [INFO] --- maven-install-plugin:2.5.2:install (default-install) @ jena-osgi ---
 [INFO] Installing /home/Me/SoftProjects/LIB/jena-master/apache-jena-osgi/jena-osgi/target/jena-osgi-3.0.0-SNAPSHOT.jar to /home/Me/.m2/repository/org/apache/jena/jena-osgi/3.0.0-SNAPSHOT/jena-osgi-3.0.0-SNAPSHOT.jar
 [INFO] Installing /home/Me/SoftProjects/LIB/jena-master/apache-jena-osgi/jena-osgi/pom.xml to /home/Me/.m2/repository/org/apache/jena/jena-osgi/3.0.0-SNAPSHOT/jena-osgi-3.0.0-SNAPSHOT.pom
 [INFO] --- maven-bundle-plugin:2.5.3:install (default-install) @ jena-osgi ---
 [INFO] Installing org/apache/jena/jena-osgi/3.0.0-SNAPSHOT/jena-osgi-3.0.0-SNAPSHOT.jar
 [INFO] Writing OBR metadata
 [INFO] Reactor Summary:
 [INFO] Apache Jena - OSGi ................................ SUCCESS [4.898s]
 [INFO] Apache Jena - OSGi bundle ......................... SUCCESS [26.290s]
 [INFO] BUILD SUCCESS
 [INFO] Total time: 32.665s
 [INFO] Finished at: Sat Jul 04 14:32:44 MSK 2015
 [INFO] Final Memory: 30M/450M
 However in
 /home/Me/.m2/repository/org/apache/jena/jena-osgi/3.0.0-SNAPSHOT/ there is
 no jar file, only the pom.

 How can I get jena-osgi.jar? I have never had such strange problem with 
 getting jar file.








-- 
Alex Sviridov


Re[2]: Problem with getting jena jar files.

2015-07-04 Thread Alex Sviridov
 Thank you for your time.

Saturday, 4 July 2015, 21:25 +01:00, from Andy Seaborne a...@apache.org:
On 04/07/15 21:18, Alex Sviridov wrote:
   Thank you very much. I added the aoache snapshot repository and I got the 
 jar file. Can you answer the following questions:
Please add this to the stackoverflow question.

 1) is is standart apache policy not to keep snapshot in central maven?

Not Apache specific - maven central has releases, not snapshots.

 2) how can I building jena from sources to get this osgi jar? I tried maven 
 install both to parent and module. I got the output that jar was copied but 
 in local maven repo I had only pom.

Build it from the top, not partially.

As on stackoverflow - probably because you are doing a partial build of 
one part of jena, so dependent snapshots for the bundling and parent 
POMs are not available.
No, I first ran maven install on the parent pom. It took a long time to build the
whole project. But I repeat, after that I had no OSGi jar in the local repo.



Andy


 Saturday, 4 July 2015, 21:00 +01:00, from Andy Seaborne a...@apache.org:
 On 04/07/15 20:40, Alex Sviridov wrote:
   Thank you for your answer. But if I add <type>POM</type> then the
 necessary classes are not found.

 <type>pom</type>  Lowercase.

 Which classes?

 (do you have a mix of 2.13.0 and 3.0.0-SNAPSHOT? because there is a
 package name change between them)

 Snapshots need

   <repository>
     <id>apache.snapshots</id>
     <name>Apache Snapshot Repository</name>
     <url>http://repository.apache.org/snapshots</url>
     <releases>
       <enabled>false</enabled>
     </releases>
     <snapshots>
       <enabled>true</enabled>
     </snapshots>
   </repository>

 because they are not in maven central.


 Really, I don't understand who set it up this way, or why. It is a real 
 nightmare to get the jena-osgi jar file. Please help me get it, as I use OSGi 
 and need Jena as an OSGi bundle.


 The artifact is apache-jena-osgi (that currently goes to jena-osgi which
 is the jar IIRC)

 I'm trying to! (and I don't use OSGI currently)

 Andy


 Saturday, 4 July 2015, 20:21 +01:00 from Andy Seaborne   a...@apache.org :
 On 04/07/15 13:15, Alex Sviridov wrote:

 I can't get the apache-jena-osgi jar. I tried the central repo
 <dependency>
    <groupId>org.apache.jena</groupId>
    <artifactId>apache-jena-osgi</artifactId>
    <version>2.13.0</version>
 </dependency>
 but constantly get
 The POM for org.apache.jena:jena-osgi:jar:2.12.2-SNAPSHOT is missing, no 
 dependency information available


 I think it is because you need to add <type>pom</type>

 <dependency>
   <groupId>org.apache.jena</groupId>
   <artifactId>apache-jena-osgi</artifactId>
   <version>2.13.0</version>
   <type>pom</type>
 </dependency>

 org.apache.jena:apache-jena-osgi is not the bundle itself but an
 indirection point that pulls in the right modules (which the project
 reserves the right to change, hence the indirection point).

 I don't know why it is saying 2.12.2-SNAPSHOT -- the released
 apache-jena-osgi looks OK to me.

 (there is supposed to be some documentation at /download/osgi.html but
 no one has written it yet. Hint, hint :-)

 Andy


 Finally I downloaded sources and built it myself. Here is the output of 
 maven install:
 [INFO] --- maven-install-plugin:2.5.2:install (default-install) @ jena-osgi ---
 [INFO] Installing /home/Me/SoftProjects/LIB/jena-master/apache-jena-osgi/jena-osgi/target/jena-osgi-3.0.0-SNAPSHOT.jar
   to /home/Me/.m2/repository/org/apache/jena/jena-osgi/3.0.0-SNAPSHOT/jena-osgi-3.0.0-SNAPSHOT.jar
 [INFO] Installing /home/Me/SoftProjects/LIB/jena-master/apache-jena-osgi/jena-osgi/pom.xml
   to /home/Me/.m2/repository/org/apache/jena/jena-osgi/3.0.0-SNAPSHOT/jena-osgi-3.0.0-SNAPSHOT.pom
 [INFO]
 [INFO] --- maven-bundle-plugin:2.5.3:install (default-install) @ jena-osgi ---
 [INFO] Installing org/apache/jena/jena-osgi/3.0.0-SNAPSHOT/jena-osgi-3.0.0-SNAPSHOT.jar
 [INFO] Writing OBR metadata
 [INFO]
 [INFO] Reactor Summary:
 [INFO]
 [INFO] Apache Jena - OSGi ............................ SUCCESS [4.898s]
 [INFO] Apache Jena - OSGi bundle ..................... SUCCESS [26.290s]
 [INFO]
 [INFO] BUILD SUCCESS
 [INFO]
 [INFO] Total time: 32.665s
 [INFO] Finished at: Sat Jul 04 14:32:44 MSK 2015
 [INFO] Final Memory: 30M/450M
 However, in /home/Me/.m2/repository/org/apache/jena/jena-osgi/3.0.0-SNAPSHOT/
 there is no jar file, only the pom.

 How can I get jena-osgi.jar? I have never had such a strange problem with 
 getting a jar file.














-- 
Alex Sviridov


Problem with getting jena jar files.

2015-07-04 Thread Alex Sviridov

I can't get the apache-jena-osgi jar. I tried the central repo
<dependency>
    <groupId>org.apache.jena</groupId>
    <artifactId>apache-jena-osgi</artifactId>
    <version>2.13.0</version>
</dependency>
but constantly get
The POM for org.apache.jena:jena-osgi:jar:2.12.2-SNAPSHOT is missing, no 
dependency information available


Finally I downloaded sources and built it myself. Here is the output of maven 
install:
[INFO] --- maven-install-plugin:2.5.2:install (default-install) @ jena-osgi ---
[INFO] Installing /home/Me/SoftProjects/LIB/jena-master/apache-jena-osgi/jena-osgi/target/jena-osgi-3.0.0-SNAPSHOT.jar
  to /home/Me/.m2/repository/org/apache/jena/jena-osgi/3.0.0-SNAPSHOT/jena-osgi-3.0.0-SNAPSHOT.jar
[INFO] Installing /home/Me/SoftProjects/LIB/jena-master/apache-jena-osgi/jena-osgi/pom.xml
  to /home/Me/.m2/repository/org/apache/jena/jena-osgi/3.0.0-SNAPSHOT/jena-osgi-3.0.0-SNAPSHOT.pom
[INFO]
[INFO] --- maven-bundle-plugin:2.5.3:install (default-install) @ jena-osgi ---
[INFO] Installing org/apache/jena/jena-osgi/3.0.0-SNAPSHOT/jena-osgi-3.0.0-SNAPSHOT.jar
[INFO] Writing OBR metadata
[INFO]
[INFO] Reactor Summary:
[INFO]
[INFO] Apache Jena - OSGi ............................ SUCCESS [4.898s]
[INFO] Apache Jena - OSGi bundle ..................... SUCCESS [26.290s]
[INFO]
[INFO] BUILD SUCCESS
[INFO]
[INFO] Total time: 32.665s
[INFO] Finished at: Sat Jul 04 14:32:44 MSK 2015
[INFO] Final Memory: 30M/450M
However, in /home/Me/.m2/repository/org/apache/jena/jena-osgi/3.0.0-SNAPSHOT/ 
there is no jar file, only the pom. 

How can I get jena-osgi.jar? I have never had such a strange problem getting a 
jar file.



-- 
Alex Sviridov

Integrating D2R Update with Fuseki Server

2013-07-05 Thread Alex Lee
Hi,

I am trying to get a Fuseki server to interface with D2R Update as its data
source. My initial attempts have involved including the D2R packages in the
fuseki-server.jar and running the server with a simple D2R assembler config
file. Through this method, I've gotten as far as getting the server to
start up with a data source, but have been unsuccessful in getting any
queries to work. As errors have come up, I've simply added or
removed whichever classes were causing problems, but this has been very
slow and brute-force. I'm hoping there is a better solution that
preferably doesn't involve modifying source code.

The root of the issue seems to lie in the fact D2R Update uses Jena 2.6.3
and Jena ARQ 2.8.5, while the oldest version of Fuseki uses Jena 2.7.0 and
Jena ARQ 2.9.0. Based on the javadocs for each of these packages, it seems
there have been significant changes in their structures between versions,
such that D2R doesn't play nice with newer Jena/ARQ, and vice versa for
Fuseki with older Jena/ARQ.

My question is: Has anyone been able to successfully implement this setup?
Is it even possible to get this to work with just modifying various
run-time configurations, or is direct modification and re-compilation of
the code necessary?

Thanks,

--
Alex


Jena TDB with Java x64

2013-04-18 Thread Alex Shapiro
Hello,
Everything I've read about Jena working in an x64 environment says that it should 
work faster than on x86.
But somehow, when we migrated our system from x86 to x64 (Java 6 update 14, both 
32 and 64 bit versions), we encountered a slowdown in Jena's performance - it 
works about 3 times slower than on the 32 bit system. The only difference in 
configuration is the maximum Java heap size parameter - on x86 it is 1Gb and on x64 
it is 2Gb.
Does anybody know any reason for such behavior or how can we fix this?

Alexander Shapiro
Software Engineer
dbMotion Ltd.


RE: Two tdb instances using same data files

2013-03-11 Thread Alex Shapiro
Thank you Marco and Andy! I perfectly understand that changes made in one JVM 
will not update the model in the second JVM, and that this is in general a bad idea 
:-). We are working on changing the architecture of our application. Meanwhile, 
let's say I know when the update is done in one JVM and can notify the second JVM 
about the change - will it help to close the model in the second JVM and reopen it, 
or reset the model somehow, to get the changes made in the first JVM?

Alex



-Original Message-
From: Andy Seaborne [mailto:andy.seaborne.apa...@gmail.com] On Behalf Of Andy 
Seaborne
Sent: Sunday, March 10, 2013 20:15
To: users@jena.apache.org
Subject: Re: Two tdb instances using same data files

On 10/03/13 17:05, Marco Neumann wrote:
 Ok yes so this is a very bad idea, as mentioned earlier. I would 
 consider replacing the file access with an endpoint and executing 
 select and update via SPARQL.

Yes, use an endpoint - use Fuseki as a shared database server.

It will go wrong otherwise.  Even having an external lock and sync'ing the 
database inside the exclusive writer lock does not make it work. The two JVMs 
will still see inconsistent views of the database, and it will get corrupted.  
A write action by JVM1 does not update caches in JVM2.

Andy
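
A minimal sketch of the shared-endpoint approach described above, assuming a 
Fuseki dataset published at http://dbserver:3030/ds (the host and dataset name 
are placeholders); each JVM sends its queries over HTTP instead of opening the 
TDB files itself:

import com.hp.hpl.jena.query.QueryExecution;
import com.hp.hpl.jena.query.QueryExecutionFactory;
import com.hp.hpl.jena.query.ResultSet;
import com.hp.hpl.jena.query.ResultSetFormatter;

public class SharedFusekiRead {
    public static void main(String[] args) {
        // Placeholder endpoint: the Fuseki server is the only process that
        // opens the TDB files; every other JVM goes through it over HTTP.
        String endpoint = "http://dbserver:3030/ds/query";
        String query = "SELECT (count(*) AS ?c) WHERE { ?s ?p ?o }";
        QueryExecution qexec =
            QueryExecutionFactory.sparqlService(endpoint, query);
        try {
            ResultSet results = qexec.execSelect();
            ResultSetFormatter.out(System.out, results);
        } finally {
            qexec.close();
        }
    }
}

Updates would go the same way, through the server's SPARQL Update endpoint, so 
the caching problem above never arises: only the Fuseki JVM touches the files.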






RE: Two tdb instances using same data files

2013-03-10 Thread Alex Shapiro
Generally there is a web service on each machine that passes the requests to 
TDB.
We are using TDBFactory.createModel(shared_files_dir_location) to open a 
model.
All the manipulations are done with the model object.
Yes, I know that this is a bad idea :-) We are working on this.

Alex



-Original Message-
From: Marco Neumann [mailto:marco.neum...@gmail.com] 
Sent: Sunday, March 10, 2013 15:20
To: users@jena.apache.org
Subject: Re: Two tdb instances using same data files

how do you access the tdb databases? in general it's a bad idea to grant access 
to the files to more than one client.




On Sun, Mar 10, 2013 at 9:12 AM, Alex Shapiro alex.shap...@dbmotion.com wrote:
 Hi,
 We have 2 tdb instances (2 JVMs on separate machines) that access the same 
 data files on shared location. No simultaneous WRITE operations are allowed. 
 The question is whether we should reset/update/close and open again the model 
 on second JVM after WRITE operation was executed on first one? If the answer 
 is yes - how do we do this?
 We have an old version of tdb - 0.8.9.

 Thanks in advance,

 Alexander Shapiro
 Software Engineer
 dbMotion Ltd.
 www.dbMotion.com





-- 


---
Marco Neumann
KONA




RE: Two tdb instances using same data files

2013-03-10 Thread Alex Shapiro
There is an external lock mechanism we use to prevent concurrent writes. There 
are simply no write requests that we allow to be processed at the same time.

Alex



-Original Message-
From: Marco Neumann [mailto:marco.neum...@gmail.com] 
Sent: Sunday, March 10, 2013 16:01
To: users@jena.apache.org
Subject: Re: Two tdb instances using same data files

how do you guarantee that there are no concurrent read/writes on the files in 
the current setup?

On Sun, Mar 10, 2013 at 9:33 AM, Alex Shapiro alex.shap...@dbmotion.com wrote:
 Generally there is a web services on each machine that passes the requests to 
 tdb.
 We are using TDBFactory.createModel(shared_files_dir_location) to open a 
 model.
 All the manipulations are done with model object.
 Yes, I know that this is a bad idea :-) We are working on this.

 Alex



 -Original Message-
 From: Marco Neumann [mailto:marco.neum...@gmail.com]
 Sent: Sunday, March 10, 2013 15:20
 To: users@jena.apache.org
 Subject: Re: Two tdb instances using same data files

 how do you access the tdb databases? in general it's a bad idea to grant 
 access to the files to more than one client.




 On Sun, Mar 10, 2013 at 9:12 AM, Alex Shapiro alex.shap...@dbmotion.com 
 wrote:
 Hi,
 We have 2 tdb instances (2 JVMs on separate machines) that access the same 
 data files on shared location. No simultaneous WRITE operations are allowed. 
 The question is whether we should reset/update/close and open again the 
 model on second JVM after WRITE operation was executed on first one? If the 
 answer is yes - how do we do this?
 We have an old version of tdb - 0.8.9.

 Thanks in advance,

 Alexander Shapiro
 Software Engineer
 dbMotion Ltd.
 www.dbMotion.com





 --


 ---
 Marco Neumann
 KONA





-- 


---
Marco Neumann
KONA




RE: Two tdb instances using same data files

2013-03-10 Thread Alex Shapiro
I'm not sure I understand the question. The server sides on both machines are 
always on - the model is created/opened once when the server side is started.
The HTTP requests are processed separately - there is no session stored for 
the HTTP connection.

Alex



-Original Message-
From: Marco Neumann [mailto:marco.neum...@gmail.com] 
Sent: Sunday, March 10, 2013 17:42
To: users@jena.apache.org
Subject: Re: Two tdb instances using same data files

they are non static connections?

On Sun, Mar 10, 2013 at 11:40 AM, Alex Shapiro alex.shap...@dbmotion.com 
wrote:
 There is an external lock mechanism we use to prevent concurrent write. There 
 are simply no write requests that we allow to process in the same time.

 Alex



 -Original Message-
 From: Marco Neumann [mailto:marco.neum...@gmail.com]
 Sent: Sunday, March 10, 2013 16:01
 To: users@jena.apache.org
 Subject: Re: Two tdb instances using same data files

 how do you guarantee that there are no concurrent read/writes on the files in 
 the current setup?

 On Sun, Mar 10, 2013 at 9:33 AM, Alex Shapiro alex.shap...@dbmotion.com 
 wrote:
 Generally there is a web services on each machine that passes the requests 
 to tdb.
 We are using TDBFactory.createModel(shared_files_dir_location) to open a 
 model.
 All the manipulations are done with model object.
 Yes, I know that this is a bad idea :-) We are working on this.

 Alex



 -Original Message-
 From: Marco Neumann [mailto:marco.neum...@gmail.com]
 Sent: Sunday, March 10, 2013 15:20
 To: users@jena.apache.org
 Subject: Re: Two tdb instances using same data files

 how do you access the tdb databases? in general it's a bad idea to grant 
 access to the files to more than one client.




 On Sun, Mar 10, 2013 at 9:12 AM, Alex Shapiro alex.shap...@dbmotion.com 
 wrote:
 Hi,
 We have 2 tdb instances (2 JVMs on separate machines) that access the same 
 data files on shared location. No simultaneous WRITE operations are 
 allowed. The question is whether we should reset/update/close and open 
 again the model on second JVM after WRITE operation was executed on first 
 one? If the answer is yes - how do we do this?
 We have an old version of tdb - 0.8.9.

 Thanks in advance,

 Alexander Shapiro
 Software Engineer
 dbMotion Ltd.
 www.dbMotion.com





 --


 ---
 Marco Neumann
 KONA





 --


 ---
 Marco Neumann
 KONA





-- 


---
Marco Neumann
KONA




RE: Two tdb instances using same data files

2013-03-10 Thread Alex Shapiro
It keeps the reference to the model in memory.

Alex


-Original Message-
From: Marco Neumann [mailto:marco.neum...@gmail.com] 
Sent: Sunday, March 10, 2013 18:25
To: users@jena.apache.org
Subject: Re: Two tdb instances using same data files

does the app terminate after each connection or do they constantly hold a 
reference in memory to the model db?


On Sun, Mar 10, 2013 at 12:22 PM, Alex Shapiro alex.shap...@dbmotion.com 
wrote:
 I'm not sure I understand the question. The server sides on both machines are 
 always on - the model is created/opened once when the server side is started.
 The http requests are processed separately - there is no session stored for 
 http connection.

 Alex



 -Original Message-
 From: Marco Neumann [mailto:marco.neum...@gmail.com]
 Sent: Sunday, March 10, 2013 17:42
 To: users@jena.apache.org
 Subject: Re: Two tdb instances using same data files

 they are non static connections?

 On Sun, Mar 10, 2013 at 11:40 AM, Alex Shapiro alex.shap...@dbmotion.com 
 wrote:
 There is an external lock mechanism we use to prevent concurrent write. 
 There are simply no write requests that we allow to process in the same time.

 Alex



 -Original Message-
 From: Marco Neumann [mailto:marco.neum...@gmail.com]
 Sent: Sunday, March 10, 2013 16:01
 To: users@jena.apache.org
 Subject: Re: Two tdb instances using same data files

 how do you guarantee that there are no concurrent read/writes on the files 
 in the current setup?

 On Sun, Mar 10, 2013 at 9:33 AM, Alex Shapiro alex.shap...@dbmotion.com 
 wrote:
 Generally there is a web services on each machine that passes the requests 
 to tdb.
 We are using TDBFactory.createModel(shared_files_dir_location) to open a 
 model.
 All the manipulations are done with model object.
 Yes, I know that this is a bad idea :-) We are working on this.

 Alex



 -Original Message-
 From: Marco Neumann [mailto:marco.neum...@gmail.com]
 Sent: Sunday, March 10, 2013 15:20
 To: users@jena.apache.org
 Subject: Re: Two tdb instances using same data files

 how do you access the tdb databases? in general it's a bad idea to grant 
 access to the files to more than one client.




 On Sun, Mar 10, 2013 at 9:12 AM, Alex Shapiro alex.shap...@dbmotion.com 
 wrote:
 Hi,
 We have 2 tdb instances (2 JVMs on separate machines) that access the same 
 data files on shared location. No simultaneous WRITE operations are 
 allowed. The question is whether we should reset/update/close and open 
 again the model on second JVM after WRITE operation was executed on first 
 one? If the answer is yes - how do we do this?
 We have an old version of tdb - 0.8.9.

 Thanks in advance,

 Alexander Shapiro
 Software Engineer
 dbMotion Ltd.
 www.dbMotion.com





 --


 ---
 Marco Neumann
 KONA





 --


 ---
 Marco Neumann
 KONA





 --


 ---
 Marco Neumann
 KONA





-- 


---
Marco Neumann
KONA




Re: Want to run SPARQL Query with Hadoop Map Reduce Framework

2012-06-26 Thread Alex Miller
 Right now I am only using DBPedia, Geoname and NYTimes from the LOD cloud. And
 later on I want to extend my dataset.

 By the way, yes, I can use SPARQL directly to collect my required
 statistics, but my assumption is that using Hadoop could give me a boost in
 collecting those stats.

 Sincerely
 Md Mizanur


Hello Md,

The Revelytix Spinner product supports SPARQL in Hadoop if you're
interested (SPARQL translated to map/reduce jobs). To fully use the
parallelism of Hadoop you would need to import all of the data.  You might
also find that just using Spinner outside of Hadoop, with simple federation via
the SERVICE extension, is sufficient; that is also supported.

http://www.revelytix.com/content/download-spinner
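
As a side note, Jena ARQ itself will evaluate SERVICE blocks, so a rough sketch 
of that kind of federation (plain ARQ, not Spinner; the DBpedia endpoint and 
resource below are only illustrative) looks like:

import com.hp.hpl.jena.query.QueryExecution;
import com.hp.hpl.jena.query.QueryExecutionFactory;
import com.hp.hpl.jena.query.ResultSet;
import com.hp.hpl.jena.query.ResultSetFormatter;
import com.hp.hpl.jena.rdf.model.ModelFactory;

public class ServiceFederationSketch {
    public static void main(String[] args) {
        // The SERVICE block is answered by the remote endpoint; the local
        // model is empty and only hosts the query execution.
        String query =
            "SELECT ?label WHERE { " +
            "  SERVICE <http://dbpedia.org/sparql> { " +
            "    <http://dbpedia.org/resource/Apache_Jena> " +
            "      <http://www.w3.org/2000/01/rdf-schema#label> ?label " +
            "  } " +
            "} LIMIT 5";
        QueryExecution qexec = QueryExecutionFactory.create(
            query, ModelFactory.createDefaultModel());
        try {
            ResultSet results = qexec.execSelect();
            ResultSetFormatter.out(System.out, results);
        } finally {
            qexec.close();
        }
    }
}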

Alex Miller


Performance/optimization of minus queries

2012-06-11 Thread Alex Hall
I started a Fuseki server (using the latest 0.2.3-SNAPSHOT release) with a
TDB database using a default configuration, and loaded a file with ~500K
triples into a graph called data:input. Now, I'm trying to do some
validation on that data, specifically find resources that use a property
but are not explicitly declared as members of that property's domain:

SELECT (count(*) as ?c) WHERE {
 GRAPH data:input {
  ?p rdfs:domain ?d . ?s ?p ?o
  MINUS { ?s a ?d } } }

(I know that if we're using rdfs:domain then any subjects using that
property can be inferred to be members of that property's domain, but
that's beside the point).

This query doesn't return in any reasonable amount of time (I let it run
for about half an hour). So, my next step was to eliminate the join in this
query using a temporary graph:

INSERT { GRAPH data:output { ?s temp:typeByDomain ?d } } WHERE {
 GRAPH data:input {
  ?p rdfs:domain ?d . ?s ?p ?o } }

SELECT (count(*) as ?c) WHERE {
  GRAPH data:output { ?s temp:typeByDomain ?d }
  MINUS { GRAPH data:input { ?s a ?d } } }

This query takes about 15 minutes to execute on my machine -- still longer
than I'd like, but at least it's progress.

Next I attempted to eliminate the effects of materializing the entire
result set by converting this to an ASK query:

ASK WHERE {
  GRAPH data:output { ?s temp:typeByDomain ?d }
  MINUS { GRAPH data:input { ?s a ?d } } }

This query takes about 5 minutes to complete, which is certainly better
than not completing at all but still slower than I would like. Is there any
way to tune or optimize TDB to better handle this query? As I mentioned, I
am using the default TDB configuration (just specifying --loc with an empty
directory to the fuseki-server script and accepting whatever it gives me).
From what I can tell in the online help, most of the performance tuning
relates to the ordering of triple patterns within a join. Are there any
other suggestions to try?

FWIW, here are the approximate cardinalities of the various query patterns
in my dataset:
?s ?p ?o: 532,000
?p rdfs:domain ?d: 200
{?p rdfs:domain ?d . ?s ?p ?o}: 62,000
{?s rdf:type ?d}: 37,000
{?p rdfs:domain ?d . ?s ?p ?o} MINUS { ?s rdf:type ?d }: 39,000
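
In case it's useful for isolating the query cost from HTTP overhead, here is a 
rough timing sketch run directly against the TDB directory (with the Fuseki 
server stopped, so only one JVM touches the files; the location /data/tdb and 
the example.org URIs are placeholders for the data:/temp: prefixes used above):

import com.hp.hpl.jena.query.Dataset;
import com.hp.hpl.jena.query.QueryExecution;
import com.hp.hpl.jena.query.QueryExecutionFactory;
import com.hp.hpl.jena.tdb.TDBFactory;

public class TimeMinusQuery {
    public static void main(String[] args) {
        // Open the same TDB location Fuseki was started with (--loc).
        Dataset dataset = TDBFactory.createDataset("/data/tdb");
        String ask =
            "ASK WHERE { " +
            "  GRAPH <http://example.org/data/output> { " +
            "    ?s <http://example.org/temp/typeByDomain> ?d } " +
            "  MINUS { GRAPH <http://example.org/data/input> { ?s a ?d } } " +
            "}";
        long start = System.currentTimeMillis();
        QueryExecution qexec = QueryExecutionFactory.create(ask, dataset);
        try {
            System.out.println("ASK result: " + qexec.execAsk());
        } finally {
            qexec.close();
        }
        System.out.println("Elapsed ms: " + (System.currentTimeMillis() - start));
        dataset.close();
    }
}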

Thanks,
Alex