Re: New Query API - in a distinct bundle? (was [jira] [Commented] (SLING-4752) New resource query API)

Carsten Ziegeler Mon, 22 Jun 2015 17:58:03 -0700

Thanks Justin for the detailed response. I guess we all have different
experience and have different use cases in mind.


I think we all agree that the current way of searching in the resource
api is tied to JCR - and that we don't have an abstraction for the query.
My main point is simple: we need this abstraction. I want to specify a
query and I don't want to care about the implementation or the storage.
You're right that there will be situations where a search does not run
in the best possible way because the abstraction does not allow it, and
yes, there is no extension mechanism. For the latter, as soon as there is
an extension mechanism you lose the abstraction. For the former, well,
this might be true. On the other hand when ORM became popular there was
the long debate whether a hand-crafted SQL query is more efficient than
the ones generated by the ORM tools. And in the end it became clear that
the generated ones were good enough if not better. So I don't see why
this shouldn't work in this case as well. On the other hand if you really
want to do a specific query against a specific resource provider, do
that, don't use the abstraction.

Or in other words, the proposed API will not cover 100%, it might cover
60% in a nice way. And that alone is reason for me to go this way. Of
course we can be pessimistic and say people will try to use it for the
remaining 40% and fail. We could also say this with other things like
the adapter pattern we have which allows you to break out of the
abstraction. My use cases work pretty well with that new api and can be
efficiently implemented.

For the idea of donating the query builder - are there any concrete
plans? If this happens, who would do the refactoring? Where would
the refactoring take place, at Adobe, in Sling? We all agree that
throwing this code into Sling by itself does not help.

So whoever wants to get his hands dirty, please come up with a concrete
proposal which we can discuss.

Thanks
Carsten

On 22.06.15 at 17:49, Justin Edelson wrote:
> Hi,
> 
> 
> Apologies for not tracking this discussion, but I wanted to weigh in before
> things got much further.
> 
> IIUC, the core problem we are trying to solve is to provide a query syntax
> independent of any particular ResourceResolver implementation. While, to be
> honest, this is not a problem I have personally run into using Sling for
> the past 6 years, I can certainly see why it is one.
> 
> But I do think we have a good answer available which was Alex's original
> proposal to have Adobe donate the QueryBuilder code to Sling. Now the
> QueryBuilder code as-is wouldn't solve this problem; it would require a
> refactoring, but I believe this refactoring is manageable. This would have
> the following benefits:
> 
> 1) Adopt a syntax many (but certainly not all) Sling developers are
> familiar with.
> 2) Provide a path to avoid YAQL. While yes, in the near term we will have
> "Sling QueryBuilder" and "AEM QueryBuilder", the AEM QueryBuilder could be
> deprecated (obviously up to AEM Product Management) and eventually removed.
> 3) An opportunity to fix some of the issues with QueryBuilder (granted,
> this isn't necessarily Sling's problem to solve).
> 
> One thing which concerns me about the current Query API is that it appears
> to be completely non-extensible. How, for example, would one implement
> something like
> https://docs.adobe.com/docs/en/cq/5-6-1/javadoc/com/day/cq/search/eval/RelativeDateRangePredicateEvaluator.html
> ? If I'm reading this correctly, the date math has to be done by the
> caller. Which isn't that problematic at first, but the code would be
> significantly more verbose than
> 
> relativedaterange.property=jcr:lastModified
> relativedaterange.lowerBound=-1d
> 
> What is potentially problematic about not having this type of extensibility
> is that it prevents specific implementations from providing the best
> implementation possible. For example, let's say that MongoDB has a really
> efficient way to query for documents modified in the last day. If I do the
> date math in Java code, I'm making it that much harder for the MongoDB
> ResourceProvider to optimize this query (sorry, this isn't a great
> example, but it's late and I'm getting tired). Plus, the query isn't really
> expressing what I want -- I want to find resources modified in the last
> day, not from some absolute date. So someone reading my code later has to
> figure out what the calls to Calendar.add(Calendar.DAY_OF_MONTH, -1) are
> there for.
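
To make the verbosity point concrete, here is a minimal sketch of the caller-side date math in plain Java. The class and method names (RelativeDateExample, lowerBoundDaysAgo) are invented for illustration and are not part of Sling or AEM; only java.util.Calendar is real API.

```java
import java.util.Calendar;

// Hypothetical helper, not part of Sling or AEM: shows the date math a caller
// has to do when the query API only accepts absolute dates.
final class RelativeDateExample {

    // Returns a copy of 'now' moved back by the given number of days.
    // The input calendar is left untouched.
    static Calendar lowerBoundDaysAgo(Calendar now, int days) {
        Calendar bound = (Calendar) now.clone();
        bound.add(Calendar.DAY_OF_MONTH, -days);
        return bound;
    }

    public static void main(String[] args) {
        Calendar lowerBound = lowerBoundDaysAgo(Calendar.getInstance(), 1);
        // The absolute value would then be passed to the query as the lower
        // bound for jcr:lastModified; the "last day" intent is no longer
        // visible to the provider or to a later reader.
        System.out.println("jcr:lastModified >= " + lowerBound.getTime());
    }
}
```

Compared with the two relativedaterange lines above, the intent ("modified in the last day") is only implied by the arithmetic here.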
> 
> Here's a better example: JCR is unable to compare two properties, i.e. give
> me all nodes where property foo equals the value of property bar. But
> MongoDB *can* do this (it isn't super-efficient, but it is possible). I can
> almost see how you would do this with the new Query API, but it would be
> ugly at best. Or, more broadly, how would the MongoDB $where operator be
> supported?
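
As a rough illustration of the property comparison described above, the sketch below applies "foo equals bar" as an in-memory filter over already-fetched results, which is roughly what a provider without native support would have to fall back to. Resources are modelled as plain property maps; the class and method names are hypothetical and do not correspond to any Sling, JCR or MongoDB API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: compares two properties of each result in memory.
final class PropertyComparisonFilter {

    // Keeps only those results where both properties exist and are equal.
    static List<Map<String, Object>> whereEqual(
            List<Map<String, Object>> results, String left, String right) {
        List<Map<String, Object>> matches = new ArrayList<>();
        for (Map<String, Object> props : results) {
            Object leftValue = props.get(left);
            // A missing property never matches, so two absent properties
            // are not treated as "equal".
            if (leftValue != null && leftValue.equals(props.get(right))) {
                matches.add(props);
            }
        }
        return matches;
    }
}
```

A provider that can push this comparison down to its store would avoid loading every candidate first, which is the optimisation opportunity the paragraph above is concerned with.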
> 
> The advantage of the AEM QueryBuilder's model is that figuring all of this
> stuff out isn't the responsibility of the platform developer. We just need
> to provide a solid basis and then let downstream users add their own hooks.
> As soon as you say that these are the only 8 operations anyone is ever
> going to do on a property or the 4 operations anyone is ever going to do on
> a resource, you're into "640k should be enough memory for anyone" territory.
> 
> So how specifically would the Sling QueryBuilder be different than the AEM
> QueryBuilder?
> 
> I think of QueryBuilder queries being processed in these separate steps
> (FWIW, none of this is proprietary information, it is based on public
> documentation):
> 
> 1) A map of key/value pairs is turned into a PredicateGroup object. While
> technically this step is optional (you can build a PredicateGroup by hand),
> it is pretty common. This would be common functionality across all
> ResourceResolvers and the code from AEM could probably be brought over
> as-is.
> 2) The PredicateGroup (which is a nested tree) at this point represents the
> query statement. It is then passed to the ResourceResolver (this part is
> somewhat different than the AEM QueryBuilder).
> 3) Each ResourceProvider analyzes the predicates and decides whether or not
> it knows how to evaluate all of them. If it can't, it should return no
> results (this is debatable, but I think it makes sense). The only exception
> is where you had an or clause, i.e. this query:
> 
> fulltext=Management
> group.p.or=true
> group.1_jcrType=dam:Asset
> group.2_resourceType=some/resource/type
> 
> If a non-JCR provider didn't know how to evaluate the jcrType predicate
> type, it could still evaluate the query because it is OR'd with a
> resourceType predicate (which let's say it does know how to evaluate). But
> if it didn't know how to evaluate the fulltext predicate type, it shouldn't
> return any results.
> 
> 4) The ResourceProvider uses PredicateEvaluators to map each predicate to
> its native query syntax. For this to work, each ResourceProvider would
> expose its own PredicateEvaluator interface (in theory,
> a ResourceProvider doesn't need to do this if the evaluation process isn't
> intended to be pluggable). IOW, the current AEM PredicateEvaluator
> interface would be renamed JcrPredicateEvaluator.
> 5) At least in JCR (based on current functionality), some Predicates can't
> be evaluated in a native query (i.e. XPath) and will need to be handled as
> filters on the result set, but this is an implementation detail left to
> the ResourceProvider.
> 6) The ResourceProvider returns results to the ResourceResolver.
> 7) Sorting is handled (or not) as currently proposed.
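
The decision in step 3 can be sketched as a small recursive check over the predicate tree. This is only an illustration under stated assumptions: PredicateSketch, Node and canEvaluate are invented names, not the AEM PredicateGroup API, and treating an empty group as evaluable is a choice made here for simplicity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of step 3, not Sling or AEM API.
final class PredicateSketch {

    // A node is either a leaf with a predicate type (e.g. "fulltext")
    // or a group whose children are combined with AND or OR.
    static final class Node {
        final String type;         // null for groups
        final boolean or;          // only meaningful for groups
        final List<Node> children;

        private Node(String type, boolean or, List<Node> children) {
            this.type = type;
            this.or = or;
            this.children = children;
        }

        static Node leaf(String type) {
            return new Node(type, false, new ArrayList<Node>());
        }

        static Node group(boolean or, Node... children) {
            return new Node(null, or, new ArrayList<>(Arrays.asList(children)));
        }

        boolean isGroup() {
            return type == null;
        }
    }

    // A provider can evaluate a leaf if it knows the predicate type, an AND
    // group if it can evaluate every child, and an OR group if it can
    // evaluate at least one child.
    static boolean canEvaluate(Node node, Set<String> knownTypes) {
        if (!node.isGroup()) {
            return knownTypes.contains(node.type);
        }
        if (node.children.isEmpty()) {
            return true; // assumption: an empty group constrains nothing
        }
        for (Node child : node.children) {
            boolean ok = canEvaluate(child, knownTypes);
            if (node.or && ok) {
                return true;
            }
            if (!node.or && !ok) {
                return false;
            }
        }
        return !node.or;
    }

    public static void main(String[] args) {
        // fulltext AND (jcrType OR resourceType), as in the example query.
        Node query = Node.group(false,
                Node.leaf("fulltext"),
                Node.group(true, Node.leaf("jcrType"), Node.leaf("resourceType")));
        Set<String> nonJcrProvider =
                new HashSet<>(Arrays.asList("fulltext", "resourceType"));
        System.out.println(canEvaluate(query, nonJcrProvider));
    }
}
```

With this rule, a provider that knows fulltext and resourceType accepts the example query, while one that does not know fulltext rejects it and returns no results.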
> 
> To be clear, I don't have a concrete proposal for how to replicate (or not)
> AEM QueryBuilder's facet support. Alex might...
> 
> Regards,
> Justin
> 
> 
>>
>> Cheers,
>> Alex
>>
> 


-- 
Carsten Ziegeler
Adobe Research Switzerland
cziege...@apache.org
