Thomas Mueller commented on OAK-4934:

Sure, it can be done. 

In some cases, knowing the values is important, because the values can have a 
(more or less) large impact on query time and cost. For example, for a property 
called "status", there might be "status='active'" with 1000 entries, and 
"status='inactive'" with 1 million entries. The property index does distinguish 
between the two (cost estimates are for each value). Nodetype restrictions and 
path restrictions are special cases.

What about using a utility (that internally uses regular expression logic or 
similar) to reduce the details? The query could be transformed depending on the 
detail level:
* Detail level 3 = as of now
* Detail level 2 = without literals, but with path and nodetypes
* Detail level 1 = without literals, but with path
* Detail level 0 = without any literals

Here, literals are strings, numbers, dates, and so on.
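
A rough sketch of what such a utility might look like, in Python for brevity. This is not existing Oak code: the function name reduce_query, the '?' placeholder, and the choice to treat the FROM [...] selector as "the nodetype" and the string argument of ISDESCENDANTNODE / ISCHILDNODE / ISSAMENODE as "the path" are all assumptions made for illustration. It is regex-based, so it will not handle every query (for example, paths that contain parentheses).

```python
import re

# Single-quoted SQL2 string literal; '' is an escaped quote.
_STRING = re.compile(r"'(?:[^']|'')*'")

# One alternation, tried left to right at each position, so that
# bracketed names and quoted strings are consumed as whole tokens.
_TOKEN = re.compile(r"""
      (?P<pathfn>\b(?:ISDESCENDANTNODE|ISCHILDNODE|ISSAMENODE)\s*\([^()]*\))
    | (?P<nodetype>\bFROM\s+\[[^\]]*\])
    | (?P<name>\[[^\]]*\])
    | (?P<string>'(?:[^']|'')*')
    | (?P<number>\b\d+(?:\.\d+)?\b)
""", re.VERBOSE | re.IGNORECASE)


def reduce_query(query, level):
    """Reduce the detail of a JCR-SQL2 query string (hypothetical helper).

    level 3: unchanged
    level 2: literals replaced, path and nodetype kept
    level 1: literals and nodetype replaced, path kept
    level 0: literals, nodetype and path replaced
    """
    if level >= 3:
        return query

    def replace(match):
        kind = match.lastgroup
        text = match.group()
        if kind == "pathfn":
            # Keep the path restriction unless all literals must go.
            return text if level >= 1 else _STRING.sub("'?'", text)
        if kind == "nodetype":
            # Assumption: the nodetype is the selector in FROM [...].
            return text if level >= 2 else re.sub(r"\[[^\]]*\]", "[?]", text)
        if kind == "name":
            # Property names such as [jcr:content/metadata/status] stay.
            return text
        if kind == "string":
            return "'?'"
        return "?"  # numeric literal

    return _TOKEN.sub(replace, query)


query = ("SELECT * FROM [app:Asset] AS a "
         "WHERE a.[jcr:content/metadata/status] = 'published' "
         "AND ISDESCENDANTNODE(a, '/content/dam')")
for level in (3, 2, 1, 0):
    print(level, reduce_query(query, level))
```

With this sketch, two queries that differ only in the value of the status property reduce to the same string at level 2 and below, while the path restriction only disappears at level 0.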

> Query shapes for JCR Query
> --------------------------
>                 Key: OAK-4934
>                 URL: https://issues.apache.org/jira/browse/OAK-4934
>             Project: Jackrabbit Oak
>          Issue Type: Wish
>          Components: query
>            Reporter: Chetan Mehrotra
> For certain requirements it would be good to have a notion/support to deduce 
> query shape [1]
> {quote}
>  A combination of query predicate, sort, and projection specifications.
> For the query predicate, only the structure of the predicate, including the 
> field names, are significant; the values in the query predicate are 
> insignificant. As such, a query predicate \{ type: 'food' \} is equivalent to 
> the query predicate \{ type: 'utensil' \} for a query shape.
> {quote}
> So transforming that to Oak the shape should represent a JCR-SQL2 query 
> string (xpath query gets transformed to SQL2) which is a *canonical* 
> representation of actual query ignoring the property restriction values. 
> Example we have 2 queries
> * SELECT   * FROM [app:Asset] AS a WHERE  a.[jcr:content/metadata/status] = 
> 'published'
> * SELECT   * FROM [app:Asset] AS a WHERE  a.[jcr:content/metadata/status] = 
> 'disabled'
> The query shape would be 
> SELECT * FROM [app:Asset] AS a WHERE  a.[jcr:content/metadata/status] = 'A'. 
> The plan for query having given shape would remain same irrespective of value 
> of property restrictions. Path restriction can cause some difference though
> The shape can then be used for
> * Stats Collection - Currently stats collection gets overflown if same query 
> with different value gets invoked
> * Allow configuring hints - See support in Mongo [2] for an example. One 
> specify via config that for a query of such and such shape this index should 
> be used
> * Less noisy diagnostics - If a query gets invoked with bad plan the QE can 
> log the warning once instead of logging it for each query invocation 
> involving different values.
> [1] https://docs.mongodb.com/manual/reference/glossary/#term-query-shape
> [2] https://docs.mongodb.com/manual/reference/command/planCacheSetFilter/
