Re: Jena native store indexes

2017-04-26 Thread james anderson
good morning;

> On 2017-04-26, at 10:33, baran...@gmail.com wrote:
> 
> […]
> 
> Is it so far fetched thinking to standardise 'only' the SPARQL 'syntax' for 
> text-indexing?

that would not resolve your complaint.
there are proposals which use facilities described in the recommendation to 
extend bgp entailment by associating alternative matching mechanisms with 
particular (combinations of) predicates.
with those, the syntax need not change, but the problem shifts to agreeing on 
the entailment regime.

best regards, from berlin,
---
james anderson | ja...@dydra.com | http://dydra.com







Re: Jena native store indexes

2017-04-26 Thread baran . ha

Hello,

You generalised the problem of standardisation by suggesting that the
extensions be standardised first, as an important step towards the main
standardisation, I think. Formulating things in this way is essentially
important for all users and very stimulating. This is the first email in
many years that I have printed out; it is on my desk, ready for further
study over the next few days.


The next thing is text indexing: I have had nothing to do with text-indexing
implementations, but I can imagine what a 'huge' problem it would be NOW to
'standardise' it for SPARQL. On the other hand, on the Fuseki side I can add
text indexing for 'all' properties of interest, and on the Virtuoso side I
can use the bif:contains extension:


Is it so far fetched thinking to standardise 'only' the SPARQL 'syntax'  
for text-indexing?


Thank you very much, baran

PS: For about 6-7 years I have been on the other side of this environment:
a querying UI for public endpoints, switching fluently from one endpoint to
another, a pure HTML+JavaScript thing...


--
Using Opera's mail client: http://www.opera.com/mail/


Re: Jena native store indexes

2017-04-25 Thread Rob Vesse
Actually, no, I am not fundamentally satisfied. I was trying to explain how the
current situation came to be, in reply to your assertion that “some idiocy” was
responsible and in the context of your specific complaint about text indexing.

In general, property functions as they exist in a variety of implementations
all try to address a limitation of the language, namely that we have limited
ways to introduce new solutions into a query:

1 - Pattern matches
2 - BIND()/Project Expressions
3 - Aggregation
4 - Values

2 is limited in that you can only introduce additional columns to pre-existing
solutions introduced by the other forms; 3 is limited in that it reduces data;
4 only permits static data.

What I would like to see in the language is a generalised mechanism to allow
inserting extensions that expand the possible solutions, e.g.

SELECT *
WHERE
{
 ?s a  .
 INVOKE (?s, "arg1", "arg2") RETURNS (?o)
 ?s ?p ?o
}

However, no such extension currently exists to my knowledge, nor do I have the
free time to investigate the potential ways to implement such a solution. If no
such extensions come into existence, then there is very little chance that they
would make their way into future standards. So I can complain about this all I
want, but it won’t change anything.

On the other hand, text indexing, which is by now a widely supported extension,
will likely be a prime candidate for future standardisation.

There are other limitations in the language that have been discussed on these
lists in the past, e.g. supporting custom aggregations. Why doesn’t the language
support standard deviation as a standard aggregate? Ultimately a working group
has limited time and limited scope; not everything that everybody wants present
in the language will make it into the standard. That is why we have
vendor-specific extensions, despite all the interoperability problems that
those create for myself and other users.

I would reiterate the point I often make when people ask why X cannot achieve Y:

A tool is designed for a specific set of jobs; it is not designed to solve
every possible problem! Don’t forget that you are a programmer and that you
have a general-purpose programming language at your disposal. You can use this
to achieve solutions to many more problems than your tool alone provides for.
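
As a concrete, hedged illustration of that point: standard deviation is not a
standard SPARQL 1.1 aggregate, but it is easy to compute client-side once the
values have been fetched. The sketch below assumes the results have already
been parsed from the SPARQL 1.1 Query Results JSON Format into a Python dict.
The function name `stddev_of`, the variable `price` and the sample data are
all made up for the example.

```python
import statistics


def stddev_of(results, var):
    """Sample standard deviation of one numeric variable in a SPARQL JSON result.

    `results` is a dict in the SPARQL 1.1 Query Results JSON Format:
    {"head": {...}, "results": {"bindings": [{var: {"value": "..."}}, ...]}}.
    Rows in which `var` is unbound are skipped.
    """
    values = [
        float(row[var]["value"])
        for row in results["results"]["bindings"]
        if var in row
    ]
    # statistics.stdev raises StatisticsError when fewer than two values remain.
    return statistics.stdev(values)


# Hypothetical result of a query such as: SELECT ?price WHERE { ... }
example = {
    "head": {"vars": ["price"]},
    "results": {
        "bindings": [
            {"price": {"type": "literal", "value": v}}
            for v in ["2", "4", "4", "4", "5", "5", "7", "9"]
        ]
    },
}

spread = stddev_of(example, "price")
```

The query stays plain, portable SPARQL; only the aggregation moves into the
client, at the cost of transferring every value instead of a single number.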

Rob

On 24/04/2017 12:30, "baran...@gmail.com"  wrote:

> Where SPARQL is now with respect to text indexing is 'fundamentally' not
> acceptable for me. And you seem to be 'fundamentally' satisfied...








Re: Jena native store indexes

2017-04-25 Thread baran . ha
On Mon, 24 Apr 2017 15:05:19 +0200, Martynas Jusevičius wrote:

> "Should have been, could have been". It is how it is, your opinion is just
> one of many and you will achieve nothing by complaining on this list. Go
> create a W3C Community Group and initiate some real work to achieve the
> standardisation that you think is required.


You 'fundamentally misunderstand' what I want to achieve: a bit of
background information about problems that have been waiting at the back of
my head for answers for many years. For me, the comments of Rob and Andy
have been very informative about how they think, and vice versa they have
learned how a user thinks about this and that in the context of this thread.


The rest sounds to me like a posting to 'Army Times', though I miss the '!'
at the end.


baran



Re: Jena native store indexes

2017-04-24 Thread Martynas Jusevičius
"Should have been, could have been". It is how it is, your opinion is just
one of many and you will achieve nothing by complaining on this list. Go
create a W3C Community Group and initiate some real work to achieve the
standardisation that you think is required.



Re: Jena native store indexes

2017-04-24 Thread baran . ha


Hello,

> You seem to fundamentally misunderstand how the standardisation process
> works.

The point is not whether I understand standardisation or not, the point is
your argument:

> At the time that SPARQL 1.1 was standardised indexing was not a
> widely used extension so there was no impetus to standardise it.

No supply, no demand. The torture of creating a text index for each property
outside of SPARQL syntax, and then not even being compatible with other
SPARQL implementations, yields no statistical statement about whether text
indexing has been widely used or not.

In my posting I pointed out that text indexing should have had top priority
when developing a query language for the Semantic Web environment from
scratch. You don't think so, and this has nothing to do with the
'fundamental' knowledge of a user; it has to do with setting different
priorities.

Where SPARQL is now with respect to text indexing is 'fundamentally' not
acceptable for me. And you seem to be 'fundamentally' satisfied...

baran



Re: Jena native store indexes

2017-04-24 Thread Andy Seaborne

On 24/04/17 10:57, Rob Vesse wrote:

> You seem to fundamentally misunderstand how the standardisation
> process works. The intent of a standard is never to specify every
> feature that exists or that could exist but rather to specify a set
> of standard functionality that will be useful to end users while also
> being amenable to multiple interoperable implementations.
>
> For a technology like text indexing, where there is a huge variety of
> approaches, standardising would be hugely difficult. For example, if
> you pick a particular technology, e.g. Lucene, then you automatically
> exclude any implementations in languages/environments where Lucene is
> not usable. If you specify a behaviour then you potentially create a
> huge burden for implementers in trying to make disparate underlying
> technologies produce a specific set of answers for a specific set
> of standardised test cases that may be of little relation to
> real-world use cases.


Indeed, all those issues.

For SPARQL 1.1, text indexing was discussed as a possible work item 
(consult the email archives) but the task was huge. There was no 
standard text search language (unlike regex, which are defined by XQuery).


No one volunteered to do the work.

Typical WG lifecycle - reasonable number of people at the start when 
defining the work program, fewer to do the work, fewer still to complete 
the work, respond to comments, etc.


Andy









Re: Jena native store indexes

2017-04-24 Thread Rob Vesse
You seem to fundamentally misunderstand how the standardisation process works.
The intent of a standard is never to specify every feature that exists or that
could exist but rather to specify a set of standard functionality that will be
useful to end users while also being amenable to multiple interoperable
implementations.

For a technology like text indexing, where there is a huge variety of
approaches, standardising would be hugely difficult. For example, if you pick a
particular technology, e.g. Lucene, then you automatically exclude any
implementations in languages/environments where Lucene is not usable. If you
specify a behaviour then you potentially create a huge burden for implementers
in trying to make disparate underlying technologies produce a specific set of
answers for a specific set of standardised test cases that may be of little
relation to real-world use cases.

Additionally, each round of standardisation takes commonly used real-world
extensions as input and works to standardise those. At the time that SPARQL 1.1
was standardised indexing was not a widely used extension so there was no
impetus to standardise it. One might imagine that a future round of
standardisation would choose to consider this as one candidate for a new
feature in a future version of the standard.

Rob

On 22/04/2017 11:02, "baran...@gmail.com"  wrote:

> ...(text search with text-indexing) cannot be officially expressed in
> SPARQL.
>
> I don't think Jena Development was responsible for this, but I assume they
> know who was, and as a user I also want to know who in the history of
> SPARQL development is responsible for this idiocy...






Re: Jena native store indexes

2017-04-22 Thread dandh988
Idiocy IMHO is rather strong. If Jena provided specialist text indexing
natively, why doesn't it provide other indexing? I process IFC files extensively
and use stored inference and secondary indexing to handle the quirks of the IFC
format. I would not expect Jena or SPARQL to provide native support for the
queries required.


Dick


Re: Jena native store indexes

2017-04-22 Thread baran . ha

On Wed, 12 Apr 2017 15:01:34 +0200, Rob Vesse  wrote:

> In the RDF world it may still be useful to create secondary indexes as
> others have noted for certain kinds of specialised search that cannot be
> officially expressed in SPARQL.

Primarily text indexing is meant here, I assume.

But indexing just the object literals of my rdfs:label values is definitely
not 'secondary' indexing. I know what a performance jump it makes and,
judging from my experience with querying clients, I think text indexing for
'all' corresponding properties must have 'top priority' in Semantic Web
query issues.

And from the statement above I can easily conclude:

...(text search with text-indexing) cannot be officially expressed in
SPARQL.

I don't think Jena Development was responsible for this, but I assume they
know who was, and as a user I also want to know who in the history of
SPARQL development is responsible for this idiocy...


baran



Re: Jena native store indexes

2017-04-12 Thread Rob Vesse
An RDF store is basically a four-column database, so an implementation can
automatically construct the necessary indexes to service any simple scan, i.e.
a basic graph pattern. Queries can be answered efficiently by having a
sufficiently smart optimiser and using precomputed statistics about the data to
perform the index scans and joins in the most efficient order.

This is very different from relational databases, which have to deal with
arbitrarily structured tables and which typically index only the primary and
foreign keys of tables by default. Therefore, in the relational world it is
common to define your own custom indexes based on how your application accesses
the data, e.g. a person's name is unlikely to be a primary key but is often
used to search the database.

In the RDF world it may still be useful to create secondary indexes, as others
have noted, for certain kinds of specialised search that cannot be officially
expressed in SPARQL.
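
To make the "indexes that can service any simple scan" idea a little more
concrete, here is a minimal sketch. It is only an illustration under stated
assumptions, not how TDB or any other store is actually implemented: it drops
the graph column and works on plain triples, keeps everything in memory, and
the class name `TripleIndex` and its methods are invented for the example. It
keeps three orderings of the same data (SPO, POS, OSP), so that whichever
terms of a triple pattern are bound, one of them can be used as a lookup
prefix.

```python
class TripleIndex:
    """Toy in-memory triple store with three covering indexes."""

    def __init__(self):
        self.spo = {}  # subject -> predicate -> {objects}
        self.pos = {}  # predicate -> object -> {subjects}
        self.osp = {}  # object -> subject -> {predicates}

    @staticmethod
    def _put(index, a, b, c):
        index.setdefault(a, {}).setdefault(b, set()).add(c)

    def add(self, s, p, o):
        # Every triple is stored once per ordering.
        self._put(self.spo, s, p, o)
        self._put(self.pos, p, o, s)
        self._put(self.osp, o, s, p)

    @staticmethod
    def _scan(index, a, b=None):
        """Yield (a, b, c) entries of one index; `a` is bound, `b` is optional."""
        level = index.get(a, {})
        keys = [b] if b is not None else list(level)
        for key in keys:
            for c in level.get(key, ()):
                yield a, key, c

    def match(self, s=None, p=None, o=None):
        """Return the sorted triples matching a pattern; None means 'variable'."""
        if s is not None and p is None and o is not None:
            found = [(s, p2, o) for _, _, p2 in self._scan(self.osp, o, s)]
        elif s is not None:
            found = [(s, p2, o2) for _, p2, o2 in self._scan(self.spo, s, p)]
        elif p is not None:
            found = [(s2, p, o2) for _, o2, s2 in self._scan(self.pos, p, o)]
        elif o is not None:
            found = [(s2, p2, o) for _, s2, p2 in self._scan(self.osp, o)]
        else:
            found = [(s2, p2, o2)
                     for s2 in self.spo
                     for _, p2, o2 in self._scan(self.spo, s2)]
        # Covers the fully bound case, where the SPO scan ignores `o`.
        return sorted(t for t in found if o is None or t[2] == o)


store = TripleIndex()
store.add("ex:alice", "rdfs:label", "Alice")
store.add("ex:alice", "foaf:knows", "ex:bob")
store.add("ex:bob", "rdfs:label", "Bob")
labels = store.match(p="rdfs:label")
```

No index had to be declared by the user: the three orderings already cover
every combination of bound and unbound terms in a single triple pattern.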


On 11/04/2017 18:30, "Laura Morales"  wrote:

> But is Jena (or any RDF store for that matter) expected to perform well
> even if I don't explicitly add any index?
>
>> You 'can' create text-indexes for selected properties of your data for
>> text search with a much better performance:
>>
>> https://jena.apache.org/documentation/query/text-query.html







Re: Jena native store indexes

2017-04-11 Thread A. Soroka
The Jena list can't really answer questions about "any RDF store", but for TDB, 
you begin with basic covering indexes, so you do not need to add anything (in 
fact you cannot add anything) to provide more indexing for standard SPARQL 
forms.

As has been pointed out, there are _extensions_ to SPARQL provided by Jena that 
can make use of additional indexes:

https://jena.apache.org/documentation/query/text-query.html

and

https://jena.apache.org/documentation/query/spatial-query.html
---
A. Soroka
The University of Virginia Library

> On Apr 11, 2017, at 1:30 PM, Laura Morales  wrote:
> 
> But is Jena (or any RDF store for that matter) expected to perform well even 
> if I don't explicitly add any index?
> 
> 
>> You 'can' create text-indexes for selected properties of your data for
>> text search with a much better performance:
>> 
>> https://jena.apache.org/documentation/query/text-query.html



Re: Jena native store indexes

2017-04-11 Thread Laura Morales
But is Jena (or any RDF store for that matter) expected to perform well even 
if I don't explicitly add any index?


> You 'can' create text-indexes for selected properties of your data for
> text search with a much better performance:
> 
> https://jena.apache.org/documentation/query/text-query.html


Re: Jena native store indexes

2017-04-11 Thread baran . ha


> When writing SPARQL queries, should I be aware of any particular
> index? Should I create new indexes myself (how)?

You 'can' create text indexes for selected properties of your data, which
gives much better performance for text search:


https://jena.apache.org/documentation/query/text-query.html
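
As a rough illustration of why such an index helps (this is a toy sketch, not
how the Jena text index or Lucene works internally), a text index over
selected properties can be pictured as an inverted index from words to
subjects. The function names, the whitespace-and-lowercase tokenisation and
the sample triples below are all assumptions made for the example.

```python
import re
from collections import defaultdict


def build_text_index(triples, properties):
    """Map each lower-cased word of the selected properties' literals to subjects."""
    index = defaultdict(set)
    for s, p, o in triples:
        if p in properties:
            for token in re.findall(r"\w+", o.lower()):
                index[token].add(s)
    return index


def search(index, query):
    """Return the subjects whose indexed literals contain every query word."""
    tokens = re.findall(r"\w+", query.lower())
    if not tokens:
        return set()
    # .get avoids creating empty entries in the defaultdict for unknown words.
    return set.intersection(*(index.get(t, set()) for t in tokens))


triples = [
    ("ex:jena", "rdfs:label", "Apache Jena"),
    ("ex:fuseki", "rdfs:label", "Apache Jena Fuseki"),
    ("ex:lucene", "rdfs:label", "Apache Lucene"),
    ("ex:jena", "ex:homepage", "https://jena.apache.org/"),
]
index = build_text_index(triples, {"rdfs:label"})
hits = search(index, "jena apache")
```

A word lookup touches only the subjects that contain the word, whereas a
FILTER with regex() or CONTAINS() over the same labels has to test every
label literal in turn.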



Jena native store indexes

2017-04-11 Thread Laura Morales
With RDBMSes, indexes are a big topic and should always be taken into 
consideration when writing queries. With RDF stores, by contrast, I barely see 
them mentioned at all. So I was wondering how indexes work in the Jena native 
store, or in other native stores in general. When writing SPARQL queries, should 
I be aware of any particular index? Should I create new indexes myself (how)?