Re: [RT] New resource query API

2015-06-10 Thread Alexander Klimetschek
On 09.06.2015, at 22:24, Carsten Ziegeler cziege...@apache.org wrote:
 If you mean whatever query language is supported by mongo or other nosql 
 providers (and fits into a string), then that's the best you can do on the 
 Sling layer.
 
 Thanks for confirming my point why an abstraction is necessary :)

But that's my point: introducing abstraction here comes at great cost. It's 
much easier to set up an external search index such as Solr, pump data from 
different backends (behind the different resource providers) and query this one 
directly for a search across everything (if this is what an application needs).

Cheers,
Alex



Re: [RT] New resource query API

2015-06-09 Thread Alexander Klimetschek
On 04.06.2015, at 21:56, Carsten Ziegeler cziege...@apache.org wrote:
 Ok, could you then please provide an implementation for the mongo
 resource provider or one of the other nosql providers?

If you mean jcr xpath (or any other jcr/oak supported query language), no.

If you mean whatever query language is supported by mongo or other nosql 
providers (and fits into a string), then that's the best you can do on the 
Sling layer.

Cheers,
Alex

Re: [RT] New resource query API

2015-06-09 Thread Carsten Ziegeler
Am 09.06.15 um 21:01 schrieb Alexander Klimetschek:
 On 04.06.2015, at 21:56, Carsten Ziegeler cziege...@apache.org wrote:
 Ok, could you then please provide an implementation for the mongo
 resource provider or one of the other nosql providers?
 
 If you mean jcr xpath (or any other jcr/oak supported query language), no.
 
 If you mean whatever query language is supported by mongo or other nosql 
 providers (and fits into a string), then that's the best you can do on the 
 Sling layer.
 
Thanks for confirming my point why an abstraction is necessary :)

Carsten
-- 
Carsten Ziegeler
Adobe Research Switzerland
cziege...@apache.org


Re: [RT] New resource query API

2015-06-04 Thread Carsten Ziegeler
Am 04.06.15 um 15:52 schrieb Alexander Klimetschek:
 On 03.06.2015, at 14:52, Carsten Ziegeler cziege...@apache.org wrote:
 Well, let's agree that we disagree here. For the majority of users,
 there is only JCR anyway, which means there is no difference between
 using a nice api and fiddling with strings by hand when it comes to
 performance.
 
 And that's exactly the argument for not having a new query API that is 
 agnostic to the individual resource provider implementations and would 
 support aggregation.
 
 What's there with findResources() is fine - just do your jcr query there. 
 Only passing offset/limit is missing.
 
Ok, could you then please provide an implementation for the mongo
resource provider or one of the other nosql providers?

Thanks
Carsten
-- 
Carsten Ziegeler
Adobe Research Switzerland
cziege...@apache.org


Re: [RT] New resource query API

2015-06-04 Thread Alexander Klimetschek
On 03.06.2015, at 14:52, Carsten Ziegeler cziege...@apache.org wrote:
 Well, let's agree that we disagree here. For the majority of users,
 there is only JCR anyway, which means there is no difference between
 using a nice api and fiddling with strings by hand when it comes to
 performance.

And that's exactly the argument for not having a new query API that is agnostic 
to the individual resource provider implementations and would support 
aggregation.

What's there with findResources() is fine - just do your jcr query there. Only 
passing offset/limit is missing.
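
To illustrate what "only passing offset/limit is missing" means in practice: ResourceResolver.findResources(query, language) hands back a plain Iterator, so today a caller has to skip and cap on its own side. A minimal sketch of that workaround follows. The Paging class and its page() method are hypothetical names made up for this example, and the skipping happens on the caller's side, so the backend still produces the skipped rows (which is the gap being pointed out).

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

/** Hypothetical helper, not part of Sling: applies offset/limit to any result iterator. */
final class Paging {

    /** Skips the first {@code offset} entries, then exposes at most {@code limit} entries. */
    static <T> Iterator<T> page(Iterator<T> results, long offset, long limit) {
        // Client-side skip: the underlying query still has to produce these rows.
        for (long i = 0; i < offset && results.hasNext(); i++) {
            results.next();
        }
        return new Iterator<T>() {
            private long remaining = limit;

            @Override
            public boolean hasNext() {
                return remaining > 0 && results.hasNext();
            }

            @Override
            public T next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                remaining--;
                return results.next();
            }
        };
    }
}
```

With Sling this would wrap the iterator returned by findResources(), for example Paging.page(resolver.findResources(statement, "xpath"), 40, 20) for the third page of 20.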

Cheers,
Alex

Re: [RT] New resource query API

2015-06-03 Thread Carsten Ziegeler
Am 03.06.15 um 11:40 schrieb Alexander Klimetschek:
 On 02.06.2015, at 16:39, Carsten Ziegeler cziege...@apache.org wrote:
 The query contains the sort information (which properties and whether
 ascending or descending), so you can get the values of the props and
 compare them.
 
 But then you need to
 
 a) be able to understand and fully evaluate the query on the aggregate level

Nope, just the sorting

 b) cannot use an index for that and do a property read for every result entry

It's true, the property is read for every entry - but only(!) if there
is more than one resource provider providing results.

 
 I am just saying, you are getting into a very complex and performance 
 critical business. And even if you say users won't use such edge-case 
 queries, it's ok if we don't look for perfect performance, they will use it 
 (experience tells me, people always find ways to run difficult & slow queries 
 :D), and then you have just created a new performance critical area out of 
 nowhere.

Well, let's agree that we disagree here. For the majority of users,
there is only JCR anyway, which means there is no difference between
using a nice api and fiddling with strings by hand when it comes to
performance.

Regards
Carsten
-- 
Carsten Ziegeler
Adobe Research Switzerland
cziege...@apache.org


Re: [RT] New resource query API

2015-06-03 Thread Alexander Klimetschek
On 02.06.2015, at 16:39, Carsten Ziegeler cziege...@apache.org wrote:
 The query contains the sort information (which properties and whether
 ascending or descending), so you can get the values of the props and
 compare them.

But then you need to

a) be able to understand and fully evaluate the query on the aggregate level
b) cannot use an index for that and do a property read for every result entry

I am just saying, you are getting into a very complex and performance critical 
business. And even if you say users won't use such edge-case queries, it's ok 
if we don't look for perfect performance, they will use it (experience tells 
me, people always find ways to run difficult & slow queries :D), and then you 
have just created a new performance critical area out of nowhere.

Cheers,
Alex



Re: [RT] New resource query API

2015-06-02 Thread Bertrand Delacretaz
On Wed, May 27, 2015 at 10:35 AM, Bertrand Delacretaz
bdelacre...@apache.org wrote:
 ...I'd
 like to discuss what the alternatives are, before we invent YAQA (*)...

FWIW I've played a bit with the Oak query code to see if its parsers
are reusable.

That's not the case out of the box but it looks like refactoring Oak's
SQL2Parser and XPathToSQL2Converter to provide access to the abstract
syntax tree that they generate wouldn't be too hard.

Going this way would allow us to reuse (a subset of) the JCR query
languages, instead of inventing yet another one.

-Bertrand

 (*) Yet Another Query API ;-)


Re: [RT] New resource query API

2015-06-02 Thread Alexander Klimetschek
On 02.06.2015, at 05:17, Daniel Klco daniel.k...@gmail.com wrote:
 @Alex,
 
 Sorting wouldn't necessarily be slow.  What you could do is have the API
 return an iterator which wraps the sorted result iterator from the various
 resource providers.

And how do you know that result X from provider A is < result Y from provider B?

Cheers,
Alex



Re: [RT] New resource query API

2015-06-02 Thread Carsten Ziegeler
Am 02.06.15 um 12:32 schrieb Alexander Klimetschek:
 On 02.06.2015, at 05:17, Daniel Klco daniel.k...@gmail.com wrote:
 @Alex,

 Sorting wouldn't necessarily be slow.  What you could do is have the API
 return an iterator which wraps the sorted result iterator from the various
 resource providers.
 
 And how do you know that result X from provider A is < result Y from provider 
 B?
 
The query contains the sort information (which properties and whether
ascending or descending), so you can get the values of the props and
compare them.
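
A rough sketch of that comparison, assuming each result exposes its properties as a map (a stand-in for a resource's value map). SortCriterion and ResultOrdering are hypothetical names, not taken from the draft API, and values are assumed to be mutually Comparable. Note that every compare() call reads the sort properties from both entries, which is the per-entry property read discussed in this thread.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;

/** Hypothetical: one entry of the query's sort information. */
final class SortCriterion {
    final String property;
    final boolean ascending;

    SortCriterion(String property, boolean ascending) {
        this.property = property;
        this.ascending = ascending;
    }
}

/** Hypothetical: builds a comparator for results coming from different providers. */
final class ResultOrdering {

    @SuppressWarnings({"unchecked", "rawtypes"})
    static Comparator<Map<String, Object>> from(List<SortCriterion> criteria) {
        return (a, b) -> {
            for (SortCriterion c : criteria) {
                // Property read per entry; assumes values implement Comparable.
                Comparable x = (Comparable) a.get(c.property);
                Comparable y = (Comparable) b.get(c.property);
                int cmp;
                if (x == null || y == null) {
                    // Missing values sort first when ascending (last when descending).
                    cmp = (x == null ? 0 : 1) - (y == null ? 0 : 1);
                } else {
                    cmp = x.compareTo(y);
                }
                if (cmp != 0) {
                    return c.ascending ? cmp : -cmp;
                }
            }
            return 0; // equal on all sort properties
        };
    }
}
```

This only covers the ordering part; it does not evaluate any other part of the query.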

Carsten
-- 
Carsten Ziegeler
Adobe Research Switzerland
cziege...@apache.org


Re: [RT] New resource query API

2015-06-02 Thread Daniel Klco
@Alex,

Sorting wouldn't necessarily be slow.  What you could do is have the API
return an iterator which wraps the sorted result iterator from the various
resource providers.  This iterator would keep track of the next value of
the iterator from every resource provider result and return the closest
value for every next call of the wrapping iterator.  This should result in
performance of approximately O(n), where n is the number of resource providers
for each next call.

In the case where sorting isn't provided, I would think it would just
interleave the results from the various resource providers.
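
A minimal sketch of such a wrapping iterator, assuming every provider already returns its results sorted by the same order. MergingIterator is a hypothetical name, not an existing Sling class. It does the linear scan over the buffered heads described above, so each next() call costs O(n) comparisons for n providers; a java.util.PriorityQueue of cursors would bring that down to O(log n).

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical sketch: merges already-sorted per-provider iterators into one sorted iterator. */
final class MergingIterator<T> implements Iterator<T> {

    /** One provider's iterator plus its buffered next value. */
    private static final class Cursor<T> {
        final Iterator<T> source;
        T head;

        Cursor(Iterator<T> source) {
            this.source = source;
            this.head = source.next();
        }
    }

    private final List<Cursor<T>> cursors = new ArrayList<>();
    private final Comparator<? super T> order;

    MergingIterator(List<? extends Iterator<T>> sources, Comparator<? super T> order) {
        this.order = order;
        for (Iterator<T> source : sources) {
            if (source.hasNext()) {
                cursors.add(new Cursor<>(source)); // empty providers are dropped up front
            }
        }
    }

    @Override
    public boolean hasNext() {
        return !cursors.isEmpty();
    }

    @Override
    public T next() {
        if (cursors.isEmpty()) {
            throw new NoSuchElementException();
        }
        // Linear scan over the buffered heads: O(n) for n providers.
        Cursor<T> best = cursors.get(0);
        for (Cursor<T> cursor : cursors) {
            if (order.compare(cursor.head, best.head) < 0) {
                best = cursor;
            }
        }
        T result = best.head;
        if (best.source.hasNext()) {
            best.head = best.source.next(); // refill from the same provider
        } else {
            cursors.remove(best); // this provider is exhausted
        }
        return result;
    }
}
```

Only one value per provider is buffered at a time, so nothing is re-sorted and the full result set is never loaded.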

-Dan

On Mon, Jun 1, 2015 at 5:00 PM, Alexander Klimetschek aklim...@adobe.com
wrote:

 On 01.06.2015, at 08:50, Carsten Ziegeler cziege...@apache.org wrote:
  So I think, we should support sorting across providers

 This means sorting will be very slow. Since you have to re-sort the
 partial results on the sling level without usage of an index.

 Cheers,
 Alex


Re: [RT] New resource query API

2015-06-02 Thread Carsten Ziegeler
Am 02.06.15 um 01:39 schrieb Bertrand Delacretaz:
 On Wed, May 27, 2015 at 10:35 AM, Bertrand Delacretaz 
 FWIW I've played a bit with the Oak query code to see if its parsers
 are reusable.
 
 That's not the case out of the box but it looks like refactoring Oak's
 SQL2Parser and XPathToSQL2Converter to provide access to the abstract
 syntax tree that they generate wouldn't be too hard.
 
 Going this way would allow us to reuse (a subset of) the JCR query
 languages, instead of inventing yet another one.
 
Which is a) not possible today, b) a dangerous route as we have to
clearly define the subset and c) still ugly;

Keep also in mind it's not just the provider that has to implement it,
but for sorting also the resource resolver implementation needs to be
aware of it.

I think we really should complete our abstraction.

Carsten
-- 
Carsten Ziegeler
Adobe Research Switzerland
cziege...@apache.org


Re: [RT] New resource query API

2015-06-01 Thread Carsten Ziegeler
One thing we have to think of is how to use pagination and sorting.
A search could go across the whole resource tree hitting different
resource providers. For example one use case is getting all vanity paths.
Sorting such a search is possible, the resource resolver implementation
always picks one result from all providers, compares them, brings them
into an order etc. That's definitely doable.

However pagination is a different beast. Of course, for paging the
result needs to be sorted. Today, a common practice for doing pagination
is key based pagination: with the search result you get a key that is
used for the next page. We can easily do this if the search is just
hitting a single resource provider. However if the search targets more
than one, pagination becomes a problem. For example, say the search has a
page size of 20 and two providers might provide resources. Asking each
of them for 20 entries is not very efficient. Asking each provider for
ten, hoping that the results are evenly split, usually does not work either.
I guess one could come up with some clever algorithm that solves the problem
by potentially doing two (or more) searches against a provider. But this
will be complicated and will not really perform well.

So I think, we should support sorting across providers, but pagination
only across a single provider and throw some exception if a search
potentially hits more than one provider. It's a limitation we can document.

WDYT?

Regards
Carsten
-- 
Carsten Ziegeler
Adobe Research Switzerland
cziege...@apache.org


Re: [RT] New resource query API

2015-06-01 Thread Alexander Klimetschek
On 30.05.2015, at 01:31, Stefan Seifert sseif...@pro-vision.de wrote:
 
 And this is a typical case where abstraction fails: performance. Which is
 extremely important for queries.
 
 Well, this is a broad statement and neither true nor wrong.
 
 i'm of the same opinion as carsten. i did a quick check of most of the queries in 
 our projects from the last years and most of them can be expressed with an 
 API like this and the code maintainability would benefit from it. and for new 
 developers it's easier to learn a fluent API than a query syntax.

But do you have queries across resource providers? Do you know the 
implementation complexity and performance limitations you are asking for?

If you have different resource providers with their own search index, and you 
have to aggregate on the resource resolver/tree level, you basically start 
having JOINs without any indexes, and these will be slow.

 and the abstraction may even help improve performance for the inexperienced 
 ones - there was some time in jackrabbit 2 where the same query in either 
 xpath or sql syntax differed quite a bit in performance - if such an 
 abstraction is implemented in an intelligent way it could always use the most 
 performant query variant, and the user of the query does not have to care 
 about those implementation details. of course this makes the implementation 
 of the abstraction much more complex.

Choosing the most performant query is absolutely non-trivial. And it requires 
your resource provider implementation to be able to ask its underlying 
database/repository to figure that out (before you translate to the particular 
query itself). Here you'll have tons of places where the abstraction will leak 
and/or you can't get the right performance.

Also, be careful about telling (inexperienced) developers that the abstraction 
will magically run the most performant query - this just doesn't work that way 
in practice. Especially if you have different backends involved, they will have 
to understand what's going on.

Cheers,
Alex

 btw. i assume we do not remove the old support for directly passing a query 
 string to the resource resolver, but add the additional support for the 
 abstraction? this would allow experienced developers who know they are only 
 using JCR to still use direct JCR queries against the resource resolver.
 
 stefan 



RE: [RT] New resource query API

2015-06-01 Thread Stefan Seifert

But do you have queries across resource providers? Do you know the
implementation complexity and performance limitations you are asking for?

no, i never required searching across different providers in the past, it would 
even be ok for me to not support cross-provider searching in the beginning to 
keep things simple.


Choosing the most performant query is absolutely non-trivial. And it
requires your resource provider implementation to be able to ask its
underlying database/repository to figure that out (before you translate to the
particular query itself). Here you'll have tons of places where the
abstraction will leak and/or you can't get the right performance.

the idea is that there is one resource provider impl per backend, so each 
provider only has to know the specifics of its backend. but ok, 
concerning JCR there are already two underlying implementations with JCR2 and 
oak. i agree that this is an absolutely nontrivial task and should not be part 
of a first implementation either.

stefan



Re: [RT] New resource query API

2015-06-01 Thread Alexander Klimetschek
On 01.06.2015, at 08:50, Carsten Ziegeler cziege...@apache.org wrote:
 So I think, we should support sorting across providers

This means sorting will be very slow. Since you have to re-sort the partial 
results on the sling level without usage of an index.

Cheers,
Alex

Re: [RT] New resource query API

2015-06-01 Thread Carsten Ziegeler
Am 01.06.15 um 13:58 schrieb Alexander Klimetschek:

 But do you have queries across resource providers? Do you know the 
 implementation complexity and performance limitations you are asking for?
 
 If you have different resource providers with their own search index, and you 
 have to aggregate on the resource resolver/tree level, you basically start 
 having JOINs without any indexes, and these will be slow.

These are not joins, you get two (or more) result sets and you simply
return all of them.

Carsten


-- 
Carsten Ziegeler
Adobe Research Switzerland
cziege...@apache.org


Re: [RT] New resource query API

2015-06-01 Thread Carsten Ziegeler
Am 01.06.15 um 14:00 schrieb Alexander Klimetschek:
 On 01.06.2015, at 08:50, Carsten Ziegeler cziege...@apache.org wrote:
 So I think, we should support sorting across providers
 
 This means sorting will be very slow. Since you have to re-sort the partial 
 results on the sling level without usage of an index.
 
No, that's not true - each provider sorts, so all you have to do is
merge the sorted results, which is trivial and does not require any
resorting or processing of the full data set

Carsten


-- 
Carsten Ziegeler
Adobe Research Switzerland
cziege...@apache.org


Re: [RT] New resource query API

2015-05-30 Thread Carsten Ziegeler
Am 30.05.15 um 10:31 schrieb Stefan Seifert:

 btw. i assume we do not remove the old support for directly passing a query 
 string to the resource resolver, but add the additional support for the 
 abstraction? this would allow experienced developers who know they are only 
 using JCR to still use direct JCR queries against the resource resolver.
 
Exactly, right - thanks for pointing this out.

Carsten
-- 
Carsten Ziegeler
Adobe Research Switzerland
cziege...@apache.org


Re: [RT] New resource query API

2015-05-30 Thread Carsten Ziegeler
Am 30.05.15 um 03:17 schrieb Alexander Klimetschek:
 On 28.05.2015, at 23:08, Carsten Ziegeler cziege...@apache.org wrote:
 I agree with this, however users of the resource api do not know which
 provider is serving the resources. That's the whole point of an abstraction.
 
 And this is a typical case where abstraction fails: performance. Which is 
 extremely important for queries.
 
Well, this is a broad statement and neither true nor wrong.

Look at the use cases in Sling where we use a query, e.g. the job
handling. Replacing the current approach of generating a very long
string for the JCR query with a more modern api does not change anything
with respect to the query; but it provides the abstraction. The
execution of the query is still as fast or slow as before.
The same is true for most of the other technical queries we have in Sling.
A good abstraction allows code to run on different providers. Today,
we have abstracted nearly everything - with the only exception being
queries. Having an 80% abstraction is more or less as good as not having an
abstraction - which means, it's bad.
I don't understand why anyone is opposed to completing the abstraction.
We have been trying to reach this goal for a very long time now.


Carsten

-- 
Carsten Ziegeler
Adobe Research Switzerland
cziege...@apache.org


RE: [RT] New resource query API

2015-05-30 Thread Stefan Seifert
 And this is a typical case where abstraction fails: performance. Which is
extremely important for queries.

Well, this is a broad statement and neither true nor wrong.

i'm of the same opinion as carsten. i did a quick check of most of the queries in 
our projects from the last years and most of them can be expressed with an API 
like this and the code maintainability would benefit from it. and for new 
developers it's easier to learn a fluent API than a query syntax.

and the abstraction may even help improve performance for the inexperienced 
ones - there was some time in jackrabbit 2 where the same query in either xpath 
or sql syntax differed quite a bit in performance - if such an abstraction is 
implemented in an intelligent way it could always use the most performant query 
variant, and the user of the query does not have to care about those 
implementation details. of course this makes the implementation of the 
abstraction much more complex.

btw. i assume we do not remove the old support for directly passing a query 
string to the resource resolver, but add the additional support for the 
abstraction? this would allow experienced developers who know they are only 
using JCR to still use direct JCR queries against the resource resolver.

stefan 


Re: [RT] New resource query API

2015-05-29 Thread Carsten Ziegeler
Am 29.05.15 um 01:13 schrieb Alexander Klimetschek:
 On 28.05.2015, at 09:32, Carsten Ziegeler cziege...@apache.org wrote:

 Am 28.05.15 um 18:25 schrieb Alexander Klimetschek:
 On 27.05.2015, at 01:35, Bertrand Delacretaz bdelacre...@apache.org wrote:
 I'm happy to collaborate on creating these examples (which can simply
 be unit tests for a relevant ResourceProvider) but before that I'd
 like to discuss what the alternatives are, before we invent YAQA (*).

 We have the querybuilder [1] in our CQ/AEM product, we could contribute 
 this to Sling, I think (pending internal legal processes of course).

 I guess that would be nice, but isn't the query builder an http api? How
 is that translated to resource queries?
 
 No, it's both. It is a normal API [1], usage example at [2], but it also has 
 a form that is easy to transport over http using GET/POST parameters and 
 comes with a (popular) servlet that provides the result as json. The general 
 predicate format is not depending on an order or correctly nested brackets, 
 so you can easily build advanced search forms.
 
 [1] 
 https://docs.adobe.com/docs/en/aem/6-0/develop/ref/javadoc/com/day/cq/search/package-summary.html
 [2] 
 https://docs.adobe.com/docs/en/aem/6-0/develop/ref/javadoc/com/day/cq/search/QueryBuilder.html
 
Right but it's JCR based, any plans on basing this on resources?

Carsten


-- 
Carsten Ziegeler
Adobe Research Switzerland
cziege...@apache.org


Re: [RT] New resource query API

2015-05-29 Thread Carsten Ziegeler
Am 29.05.15 um 01:16 schrieb Alexander Klimetschek:
 When you run a query across multiple backends, you have to aggregate the 
 results. This is non-trivial, and in most cases you are better off using an 
 external search index that covers everything. And from my experience, you 
 usually don't have the use case to search across different providers, 
 e.g. if you have a) a file system provider for bundles and code and b) a 
 database provider providing ecommerce order entries, you never search across 
 both at the same time.
 
I agree with this, however users of the resource api do not know which
provider is serving the resources. That's the whole point of an abstraction.

Carsten

 Cheers,
 Alex
 
 On 28.05.2015, at 11:06, Carsten Ziegeler cziege...@apache.org wrote:

 Just to clarify as it seems people got the proposal wrong: this is about
 a new API, not an implementation. It's an abstraction on the resource
 level. Of course with a JCR provider underneath, the search is delegated
 to that provider. Same with other providers.
 It should be easy for every provider to implement the api.

 Typical use cases are for example the job handling which searches for
 jobs in the resource tree or the resource resolver implementation
 looking for vanity paths etc.

 Right now - although these parts use the resource api - it's not
 possible to run them with a different provider than jcr.

 Carsten

 Am 18.05.15 um 08:17 schrieb Carsten Ziegeler:
 The current resource query api has several problems:
 - it's using the JCR spec to define a query
 - it's not clear which queries are supported by providers
 - queries are string based
 - implementing queries in a resource provider is way too hard as this
 would require to implement the complete jcr query api.

 I've created a draft for a new, object based API at [1]. The main idea
 is to use a builder pattern to create Query objects. These are immutable
 and have a unique identifier. The QueryManager service can be used to
 execute a query in the context of a resource resolver. The manager
 delegates the query to the providers. As each Query object has this
 identifier, implementations can use this to cache the parsing of the query.
 In addition to the query object you can pass in query instructions to
 specify a limit or range for the query.

 Obviously this is a reduced set compared to the full fledged jcr search
 api, however it should be suitable for the majority of use cases.

 [1]
 https://svn.apache.org/repos/asf/sling/whiteboard/cziegeler/api-v3/src/main/java/org/apache/sling/api/resource/query/

 Regards
 Carsten



 -- 
 Carsten Ziegeler
 Adobe Research Switzerland
 cziege...@apache.org
 
 


-- 
Carsten Ziegeler
Adobe Research Switzerland
cziege...@apache.org


Re: [RT] New resource query API

2015-05-29 Thread Alexander Klimetschek
On 28.05.2015, at 23:08, Carsten Ziegeler cziege...@apache.org wrote:
 I agree with this, however users of the resource api do not know which
 provider is serving the resources. That's the whole point of an abstraction.

And this is a typical case where abstraction fails: performance. Which is 
extremely important for queries.

Cheers,
Alex

Re: [RT] New resource query API

2015-05-29 Thread Alexander Klimetschek
On 28.05.2015, at 23:07, Carsten Ziegeler cziege...@apache.org wrote:
 No, it's both. It is a normal API [1], usage example at [2], but it also has 
 a form that is easy to transport over http using GET/POST parameters and 
 comes with a (popular) servlet that provides the result as json. The general 
 predicate format is not depending on an order or correctly nested brackets, 
 so you can easily build advanced search forms.
 
 [1] 
 https://docs.adobe.com/docs/en/aem/6-0/develop/ref/javadoc/com/day/cq/search/package-summary.html
 [2] 
 https://docs.adobe.com/docs/en/aem/6-0/develop/ref/javadoc/com/day/cq/search/QueryBuilder.html
 
 Right but it's JCR based, any plans on basing this on resources?

It converts into a JCR xpath query, yes. But other than that it gives both JCR 
(Node) and Resource API (Resource) in the search result hits for convenience. 
Nothing that should be a problem. Moving it to Sling would mean a few changes 
anyway, while keeping it backwards compatible on the query statement side.

Cheers,
Alex

Re: [RT] New resource query API

2015-05-28 Thread Alexander Klimetschek
On 28.05.2015, at 09:32, Carsten Ziegeler cziege...@apache.org wrote:
 
 Am 28.05.15 um 18:25 schrieb Alexander Klimetschek:
 On 27.05.2015, at 01:35, Bertrand Delacretaz bdelacre...@apache.org wrote:
 I'm happy to collaborate on creating these examples (which can simply
 be unit tests for a relevant ResourceProvider) but before that I'd
 like to discuss what the alternatives are, before we invent YAQA (*).
 
 We have the querybuilder [1] in our CQ/AEM product, we could contribute this 
 to Sling, I think (pending internal legal processes of course).
 
 I guess that would be nice, but isn't the query builder an http api? How
 is that translated to resource queries?

No, it's both. It is a normal API [1], usage example at [2], but it also has a 
form that is easy to transport over http using GET/POST parameters and comes 
with a (popular) servlet that provides the result as json. The general 
predicate format is not depending on an order or correctly nested brackets, so 
you can easily build advanced search forms.

[1] 
https://docs.adobe.com/docs/en/aem/6-0/develop/ref/javadoc/com/day/cq/search/package-summary.html
[2] 
https://docs.adobe.com/docs/en/aem/6-0/develop/ref/javadoc/com/day/cq/search/QueryBuilder.html

Cheers,
Alex


Re: [RT] New resource query API

2015-05-28 Thread Alexander Klimetschek
When you run a query across multiple backends, you have to aggregate the 
results. This is non-trivial, and in most cases you are better off using an 
external search index that covers everything. And from my experience, you 
usually don't have the use case to search across different providers, e.g. 
if you have a) a file system provider for bundles and code and b) a database 
provider providing ecommerce order entries, you never search across both at the 
same time.

Cheers,
Alex

 On 28.05.2015, at 11:06, Carsten Ziegeler cziege...@apache.org wrote:
 
 Just to clarify as it seems people got the proposal wrong: this is about
 a new API, not an implementation. It's an abstraction on the resource
 level. Of course with a JCR provider underneath, the search is delegated
 to that provider. Same with other providers.
 It should be easy for every provider to implement the api.
 
 Typical use cases are for example the job handling which searches for
 jobs in the resource tree or the resource resolver implementation
 looking for vanity paths etc.
 
 Right now - although these parts use the resource api - it's not
 possible to run them with a different provider than jcr.
 
 Carsten
 
 Am 18.05.15 um 08:17 schrieb Carsten Ziegeler:
 The current resource query api has several problems:
 - it's using the JCR spec to define a query
 - it's not clear which queries are supported by providers
 - queries are string based
 - implementing queries in a resource provider is way too hard as this
 would require to implement the complete jcr query api.
 
 I've created a draft for a new, object based API at [1]. The main idea
 is to use a builder pattern to create Query objects. These are immutable
 and have a unique identifier. The QueryManager service can be used to
 execute a query in the context of a resource resolver. The manager
 delegates the query to the providers. As each Query object has this
 identifier, implementations can use this to cache the parsing of the query.
 In addition to the query object you can pass in query instructions to
 specify a limit or range for the query.
 
 Obviously this is a reduced set compared to the full fledged jcr search
 api, however it should be suitable for the majority of use cases.
 
 [1]
 https://svn.apache.org/repos/asf/sling/whiteboard/cziegeler/api-v3/src/main/java/org/apache/sling/api/resource/query/
 
 Regards
 Carsten
 
 
 
 -- 
 Carsten Ziegeler
 Adobe Research Switzerland
 cziege...@apache.org




Re: [RT] New resource query API

2015-05-28 Thread Carsten Ziegeler
Just to clarify as it seems people got the proposal wrong: this is about
a new API, not an implementation. It's an abstraction on the resource
level. Of course with a JCR provider underneath, the search is delegated
to that provider. Same with other providers.
It should be easy for every provider to implement the api.

Typical use cases are for example the job handling which searches for
jobs in the resource tree or the resource resolver implementation
looking for vanity paths etc.

Right now - although these parts use the resource api - it's not
possible to run them with a different provider than jcr.

Carsten

Am 18.05.15 um 08:17 schrieb Carsten Ziegeler:
 The current resource query api has several problems:
 - it's using the JCR spec to define a query
 - it's not clear which queries are supported by providers
 - queries are string based
 - implementing queries in a resource provider is way too hard as this
 would require to implement the complete jcr query api.
 
 I've created a draft for a new, object based API at [1]. The main idea
 is to use a builder pattern to create Query objects. These are immutable
 and have a unique identifier. The QueryManager service can be used to
 execute a query in the context of a resource resolver. The manager
 delegates the query to the providers. As each Query object has this
 identifier, implementations can use this to cache the parsing of the query.
 In addition to the query object you can pass in query instructions to
 specify a limit or range for the query.
 
 Obviously this is a reduced set compared to the full fledged jcr search
 api, however it should be suitable for the majority of use cases.
 
 [1]
 https://svn.apache.org/repos/asf/sling/whiteboard/cziegeler/api-v3/src/main/java/org/apache/sling/api/resource/query/
 
 Regards
 Carsten
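
For readers who have not looked at the draft: the builder idea in the quoted proposal could look roughly like the sketch below. This is not the API from the whiteboard link. SketchQuery, at(), propertyEquals() and the way the identifier is derived are all assumptions made up for illustration. The identifier here is computed from the query content, so two equal queries share an id and a provider could use it as a cache key for its parsed or translated form.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.UUID;

/** Hypothetical immutable query object; not the draft API itself. */
final class SketchQuery {
    private final String path;
    private final List<String> conditions;
    private final String id;

    private SketchQuery(String path, List<String> conditions) {
        this.path = path;
        // Defensive copy plus unmodifiable view keeps the query immutable.
        this.conditions = Collections.unmodifiableList(new ArrayList<>(conditions));
        // Assumption: the id is derived from the content, so equal queries get equal ids.
        String canonical = path + "|" + String.join("&", this.conditions);
        this.id = UUID.nameUUIDFromBytes(canonical.getBytes(StandardCharsets.UTF_8)).toString();
    }

    String getId() { return id; }
    String getPath() { return path; }
    List<String> getConditions() { return conditions; }

    /** Starts a query below the given resource path. */
    static Builder at(String path) {
        return new Builder(path);
    }

    static final class Builder {
        private final String path;
        private final List<String> conditions = new ArrayList<>();

        private Builder(String path) {
            this.path = path;
        }

        /** Adds an equality condition on a property. */
        Builder propertyEquals(String name, String value) {
            conditions.add(name + "=" + value);
            return this;
        }

        SketchQuery build() {
            return new SketchQuery(path, conditions);
        }
    }
}
```

A job lookup would then read SketchQuery.at("/var/jobs").propertyEquals("state", "queued").build(), with limit or range passed separately as query instructions.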
 


-- 
Carsten Ziegeler
Adobe Research Switzerland
cziege...@apache.org


Re: [RT] New resource query API

2015-05-28 Thread Alexander Klimetschek
On 27.05.2015, at 01:35, Bertrand Delacretaz bdelacre...@apache.org wrote:
 I'm happy to collaborate on creating these examples (which can simply
 be unit tests for a relevant ResourceProvider) but before that I'd
 like to discuss what the alternatives are, before we invent YAQA (*).

We have the querybuilder [1] in our CQ/AEM product, we could contribute this to 
Sling, I think (pending internal legal processes of course).

[1] https://docs.adobe.com/docs/en/aem/6-1/develop/search/querybuilder-api.html

Cheers,
Alex

Re: [RT] New resource query API

2015-05-28 Thread Carsten Ziegeler
Am 28.05.15 um 18:25 schrieb Alexander Klimetschek:
 On 27.05.2015, at 01:35, Bertrand Delacretaz bdelacre...@apache.org wrote:
 I'm happy to collaborate on creating these examples (which can simply
 be unit tests for a relevant ResourceProvider) but before that I'd
 like to discuss what the alternatives are, before we invent YAQA (*).
 
 We have the querybuilder [1] in our CQ/AEM product, we could contribute this 
 to Sling, I think (pending internal legal processes of course).
 
I guess that would be nice, but isn't the query builder an HTTP API? How
is that translated to resource queries?

Carsten


-- 
Carsten Ziegeler
Adobe Research Switzerland
cziege...@apache.org


Re: [RT] New resource query API

2015-05-27 Thread Bertrand Delacretaz
Hi,

On Mon, May 18, 2015 at 8:17 AM, Carsten Ziegeler cziege...@apache.org wrote:
 The current resource query api has several problems:
 - it's using the JCR spec to define a query..

Why is that a problem?
Creating a good query API is hard work, so I'd be much more in favor
of reusing an existing query API than inventing our own.

AFAIK Oak parses queries to the internal JCR query object model, so
translating that to a subset that random ResourceProviders can
implement should be possible.

 - it's not clear which queries are supported by providers..

Agreed, that's a difficult one to solve, especially with a query that
spans multiple providers.

 ...Obviously this is a reduced set compared to the full-fledged jcr search
 api, however it should be suitable for the majority of use cases...

IMO it's impossible to validate such a query API in the abstract,
without having examples of how queries look, based on a set of
realistic use cases.

I'm happy to collaborate on creating these examples (which can simply
be unit tests for a relevant ResourceProvider) but before that I'd
like to discuss what the alternatives are, before we invent YAQA (*).

-Bertrand

(*) Yet Another Query API ;-)


Re: [RT] New resource query API

2015-05-27 Thread Carsten Ziegeler
On 27.05.15 at 10:35, Bertrand Delacretaz wrote:
 Hi,
 
 On Mon, May 18, 2015 at 8:17 AM, Carsten Ziegeler cziege...@apache.org 
 wrote:
 The current resource query api has several problems:
 - it's using the JCR spec to define a query..
 
 Why is that a problem?
 Creating a good query API is hard work, so I'd be much more in favor
 of reusing an existing query API than inventing our own.
 
 AFAIK Oak parses queries to the internal JCR query object model, so
 translating that to a subset that random ResourceProviders can
 implement should be possible.

The goal of this api is to do what I call technical queries, so queries
we do in our code. It's not necessarily meant to be used for user-facing
queries. Instead of using a subset of an api that doesn't look too
appealing to me, I would rather stay within the resource api.

 
 ...Obviously this is a reduced set compared to the full-fledged jcr search
 api, however it should be suitable for the majority of use cases...
 
 IMO it's impossible to validate such a query API in the abstract,
 without having examples of how queries look, based on a set of
 realistic use cases.

We have queries in Sling, and we already have people contributing within
this thread, so I guess this works out fine.
 
 I'm happy to collaborate on creating these examples (which can simply
 be unit tests for a relevant ResourceProvider) but before that I'd
 like to discuss what the alternatives are, before we invent YAQA (*).
 
Make a proposal and we can discuss it

Thanks
Carsten
-- 
Carsten Ziegeler
Adobe Research Switzerland
cziege...@apache.org


RE: [RT] New resource query API

2015-05-26 Thread Stefan Seifert

 5. full text search on any property (jcr:contains) - is this possible with
.property(*).approx(searchterm)? or perhaps something
 like .anyProperty().approx(searchterm) - or a special signature
like .anyPropertyApprox(searchterm)?

Haven't thought about this one yet, but I guess a special signature
sounds better.

do you want to add it? i've not found it in the updated API.

stefan


Re: [RT] New resource query API

2015-05-26 Thread Carsten Ziegeler
On 26.05.15 at 22:10, Stefan Seifert wrote:
 
 5. full text search on any property (jcr:contains) - is this possible with
 .property(*).approx(searchterm)? or perhaps something
 like .anyProperty().approx(searchterm) - or a special signature
 like .anyPropertyApprox(searchterm)?

 Haven't thought about this one yet, but I guess a special signature
 sounds better.
 
 do you want to add it? i've not found it in the updated API.
 
I changed my mind and went with property(*) - the main reason is to
keep the query interface smaller as that needs to be interpreted by the
providers.


Carsten
-- 
Carsten Ziegeler
Adobe Research Switzerland
cziege...@apache.org


RE: [RT] New resource query API

2015-05-26 Thread Stefan Seifert

I changed my mind and went with property(*) - the main reason is to
keep the query interface smaller as that needs to be interpreted by the
providers.

ok

stefan 


Re: [RT] New resource query API

2015-05-26 Thread Carsten Ziegeler
I've updated the API with the feedback I received so far. In addition I
changed the paging from a skip number to key based paging. I guess these
interfaces are not perfect yet, but show the direction.
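
To illustrate the difference between skip-based and key-based paging, here is a
minimal sketch. It is not the whiteboard API: the class KeyPagingSketch, its
page(afterKey, limit) method and the TreeMap standing in for a provider's
sorted index are all assumptions made for this example. The idea is that the
caller passes the last key it has seen instead of a number of results to skip,
so the provider can seek directly to the continuation point.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical sketch of key-based paging; names are not from the Sling draft.
final class KeyPagingSketch {
    // Sorted map standing in for a provider's index: sort key -> resource path
    private final NavigableMap<String, String> store = new TreeMap<>();

    void put(String key, String path) {
        store.put(key, path);
    }

    // Returns up to 'limit' keys strictly after 'afterKey' (null means "from the start").
    List<String> page(String afterKey, int limit) {
        NavigableMap<String, String> tail =
                afterKey == null ? store : store.tailMap(afterKey, false);
        List<String> result = new ArrayList<>();
        for (String key : tail.keySet()) {
            if (result.size() == limit) {
                break;
            }
            result.add(key);
        }
        return result;
    }
}
```

A caller would request the first page with page(null, n) and then pass the last
key of each page to get the next one. Unlike a skip count, this stays stable
when entries before the continuation point are added or removed between calls.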

Regards
Carsten
-- 
Carsten Ziegeler
Adobe Research Switzerland
cziege...@apache.org


RE: [RT] New resource query API

2015-05-21 Thread Stefan Seifert

 6. property conditions with deep property paths should be supported as well
- if the underlying provider supports it. so .property() could optionally
 accept a path to a deeper property. to clarify in javadocs.

I'm not sure if we should go there, so your use case is searching for a
resource which has a child resource that has a property foo=bar (or
something like that)?

yes - this is supported by xpath query currently, and - at least in jackrabbit 
2 - it was performant if it matches with a proper lucene index configuration 
for the search index that includes a certain level of child nodes.

i looked in some of our old projects as well and found this use case in some 
places.

stefan


Re: [RT] New resource query API

2015-05-21 Thread Carsten Ziegeler
Thanks for your feedback Stefan, more inline...

On 21.05.15 at 13:00, Stefan Seifert wrote:
 
 1. Query can reference further Queries for nesting with and/or expressions. 
 in this case the sort* methods do not make sense. perhaps the two sort 
 methods should be moved to QueryInstructions?

Yes, I had it this way in my first draft, but for some reason (which I
can't remember...) decided against it. Maybe moving is better.

 
 2. it would be useful not only to filter by property values, but by 
 node/resource name as well (e.g. resource name = X)

ah good one.

 
 3. I suppose the isA could not only mean a resource type but a JCR primary 
 type as well if the resource provider supports it? to clarify in javadocs.

Yep

 
 4. perhaps we should not use arrays in public interfaces as output values, 
 they always have the problem with immutability (e.g. Query.getPaths etc.)

ok

 
 5. full text search on any property (jcr:contains) - is this possible with 
 .property(*).approx(searchterm)? or perhaps something 
 like .anyProperty().approx(searchterm) - or a special signature
like .anyPropertyApprox(searchterm)?

Haven't thought about this one yet, but I guess a special signature
sounds better.

 
 6. property conditions with deep property paths should be supported as well - 
 if the underlying provider supports it. so .property() could optionally 
 accept a path to a deeper property. to clarify in javadocs.

I'm not sure if we should go there, so your use case is searching for a
resource which has a child resource that has a property foo=bar (or
something like that)?

 
 7. all in all this query supports a lot of features, although not everything 
 that is possible with XPath etc. what happens if a resource provider can only 
 support a subset of those features?

The main idea is to cover most search use cases, looking for example at
the queries we have in the Sling code base, I guess we can cover all of
those. So the set should be implementable by all providers. We could
make this mandatory. Having optional parts is always problematic as
you never know if something is supported or not, and even asking before
if something is supported doesn't help if you get back a no.
Therefore I personally would like to keep this simple and therefore
implementable by everyone.

Carsten

-- 
Carsten Ziegeler
Adobe Research Switzerland
cziege...@apache.org


RE: [RT] New resource query API

2015-05-21 Thread Stefan Seifert
hello carsten.

some feedback:

1. Query can reference further Queries for nesting with and/or expressions. in 
this case the sort* methods do not make sense. perhaps the two sort methods 
should be moved to QueryInstructions?

2. it would be useful not only to filter by property values, but by 
node/resource name as well (e.g. resource name = X)

3. I suppose the isA could not only mean a resource type but a JCR primary type 
as well if the resource provider supports it? to clarify in javadocs.

4. perhaps we should not use arrays in public interfaces as output values, they 
always have the problem with immutability (e.g. Query.getPaths etc.)

5. full text search on any property (jcr:contains) - is this possible with 
.property(*).approx(searchterm)? or perhaps something like 
.anyProperty().approx(searchterm) - or a special signature like 
.anyPropertyApprox(searchterm)?

6. property conditions with deep property paths should be supported as well - 
if the underlying provider supports it. so .property() could optionally accept 
a path to a deeper property. to clarify in javadocs.

7. all in all this query supports a lot of features, although not everything 
that is possible with XPath etc. what happens if a resource provider can only 
support a subset of those features?
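
to illustrate point 4, a minimal sketch of the array problem. ArrayLeakSketch
and its two getters are hypothetical names made up for this example, they are
not the Query interface from the draft. returning the internal array lets any
caller change the supposedly immutable object, while an unmodifiable list view
rejects writes.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch; not the Query interface from the draft.
final class ArrayLeakSketch {
    private final String[] paths;

    ArrayLeakSketch(String... paths) {
        // Copy on the way in so the caller's array cannot alter this object later
        this.paths = paths.clone();
    }

    // Problematic: hands out the internal array, so callers can overwrite entries
    String[] getPathsArray() {
        return paths;
    }

    // Safer: read-only view; set/add/remove throw UnsupportedOperationException
    List<String> getPaths() {
        return Collections.unmodifiableList(Arrays.asList(paths));
    }
}
```

an array getter could also return paths.clone() on every call, but that costs a
copy per access and still looks mutable to the caller.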


stefan


-Original Message-
From: Carsten Ziegeler [mailto:cziege...@apache.org]
Sent: Monday, May 18, 2015 8:17 AM
To: Sling Developers
Subject: [RT] New resource query API

The current resource query api has several problems:
- it's using the JCR spec to define a query
- it's not clear which queries are supported by providers
- queries are string based
- implementing queries in a resource provider is way too hard as this
would require implementing the complete jcr query api.

I've created a draft for a new, object based API at [1]. The main idea
is to use a builder pattern to create Query objects. These are immutable
and have a unique identifier. The QueryManager service can be used to
execute a query in the context of a resource resolver. The manager
delegates the query to the providers. As each Query object has this
identifier, implementations can use this to cache the parsing of the query.
In addition to the query object you can pass in query instructions to
specify a limit or range for the query.

Obviously this is a reduced set compared to the full-fledged jcr search
api, however it should be suitable for the majority of use cases.

[1]
https://svn.apache.org/repos/asf/sling/whiteboard/cziegeler/api-v3/src/main/java/org/apache/sling/api/resource/query/

Regards
Carsten
--
Carsten Ziegeler
Adobe Research Switzerland
cziege...@apache.org


[RT] New resource query API

2015-05-18 Thread Carsten Ziegeler
The current resource query api has several problems:
- it's using the JCR spec to define a query
- it's not clear which queries are supported by providers
- queries are string based
- implementing queries in a resource provider is way too hard as this
would require implementing the complete jcr query api.

I've created a draft for a new, object based API at [1]. The main idea
is to use a builder pattern to create Query objects. These are immutable
and have a unique identifier. The QueryManager service can be used to
execute a query in the context of a resource resolver. The manager
delegates the query to the providers. As each Query object has this
identifier, implementations can use this to cache the parsing of the query.
In addition to the query object you can pass in query instructions to
specify a limit or range for the query.

Obviously this is a reduced set compared to the full-fledged jcr search
api, however it should be suitable for the majority of use cases.
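
To make the idea more concrete, here is a minimal sketch of how a builder,
an immutable query object with an identifier, and a provider-side cache could
fit together. This is not the code at [1]: SketchQuery, SketchProvider, the
builder methods at() and isA(), and the way the identifier is derived from the
query content are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these names are taken from the draft at [1].
final class SketchQuery {
    private final List<String> paths;
    private final String resourceType;
    private final String id;

    private SketchQuery(Builder b) {
        // Defensive copy so later builder calls cannot change this query
        this.paths = Collections.unmodifiableList(new ArrayList<>(b.paths));
        this.resourceType = b.resourceType;
        // Identifier derived from the query content: equal queries get equal ids
        this.id = "paths=" + String.join(",", paths) + ";isA=" + resourceType;
    }

    String getId() { return id; }
    List<String> getPaths() { return paths; }
    String getResourceType() { return resourceType; }

    static Builder builder() { return new Builder(); }

    static final class Builder {
        private final List<String> paths = new ArrayList<>();
        private String resourceType;

        Builder at(String path) { paths.add(path); return this; }
        Builder isA(String type) { this.resourceType = type; return this; }
        SketchQuery build() { return new SketchQuery(this); }
    }
}

// Stand-in for a resource provider that caches its parsed form per query id.
final class SketchProvider {
    private final Map<String, String> parsedById = new HashMap<>();
    int parseCount = 0;

    // 'limit' plays the role of the separate query instructions
    String execute(SketchQuery query, int limit) {
        String parsed = parsedById.computeIfAbsent(query.getId(), id -> {
            parseCount++; // the translation to a native query happens once per id
            return "NATIVE[" + id + "]";
        });
        return parsed + " LIMIT " + limit;
    }
}
```

Usage would look like
SketchQuery.builder().at("/content").isA("app/page").build(), with the limit
passed separately at execution time. Because the limit is not part of the
query object, the same cached translation can serve calls with different
limits or ranges.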

[1]
https://svn.apache.org/repos/asf/sling/whiteboard/cziegeler/api-v3/src/main/java/org/apache/sling/api/resource/query/

Regards
Carsten
-- 
Carsten Ziegeler
Adobe Research Switzerland
cziege...@apache.org