Re: Querying vs iterating

Julian Sedding Mon, 20 Jun 2016 06:44:15 -0700

Hi Roy

From your question ("hard to put an index to it") I assume that you are
running on an Oak repository. If that is incorrect, my answer does not
apply.


Oak will always consider traversal as an alternative to existing
indexes. For most queries the cost of traversal is so high that an
index is chosen. However, if no suitable index exists (and
theoretically also if the traversal is cheaper than a lookup in a
matching index), it will do a traversal behind the scenes. Note that
traversal logs a warning every 10000 traversed nodes. So if you plan
to traverse more than that you should really consider creating an
index.
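
To make the "cheapest plan wins" idea concrete, here is a toy model in
Java. It is only a sketch: the class name, the method name and all cost
numbers are made up for illustration, and this is not Oak's real cost
formula or API. It only shows that traversal is always one of the
candidates and is picked when nothing cheaper exists.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model only: names and cost numbers are hypothetical and do not
// reflect Oak's actual cost calculation.
final class PlanChoiceSketch {

    // Returns the name of the cheapest plan. "traversal" is always a
    // candidate; an index replaces it only if its cost is strictly lower.
    static String cheapestPlan(double traversalCost, Map<String, Double> indexCosts) {
        String best = "traversal";
        double bestCost = traversalCost;
        for (Map.Entry<String, Double> e : indexCosts.entrySet()) {
            if (e.getValue() < bestCost) {
                best = e.getKey();
                bestCost = e.getValue();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Map<String, Double> indexes = new LinkedHashMap<>();
        indexes.put("propertyIndex", 50.0);

        // Large tree: traversal is expensive, so the index is chosen.
        System.out.println(cheapestPlan(1_000_000.0, indexes)); // propertyIndex
        // Small subtree (about 20 nodes): traversal is cheaper than the index.
        System.out.println(cheapestPlan(20.0, indexes));        // traversal
        // No suitable index at all: traversal is the only option.
        System.out.println(cheapestPlan(20.0, new LinkedHashMap<>())); // traversal
    }
}
```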

In short: with Oak, using a query on a small subtree should give you
what you want, even without an index.
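
As an illustration, a JCR-SQL2 statement limited to one subtree could be
built as in the sketch below. The helper class and method are
hypothetical, the example path and property name are invented, and the
code assumes the configured property name contains no ']' character.
The resulting string would be passed to
QueryManager.createQuery(statement, Query.JCR_SQL2) from the standard
JCR API.

```java
// Sketch only: a hypothetical helper that builds a JCR-SQL2 statement
// matching all descendants of rootPath that have the given property.
final class SubtreeQuerySketch {

    static String propertyExistsQuery(String rootPath, String propertyName) {
        // Assumption: property names with ']' are not supported by this sketch.
        if (propertyName.contains("]")) {
            throw new IllegalArgumentException("unsupported property name: " + propertyName);
        }
        // Single quotes inside a JCR-SQL2 string literal are escaped by doubling.
        String escapedPath = rootPath.replace("'", "''");
        return "SELECT * FROM [nt:base] AS r"
                + " WHERE ISDESCENDANTNODE(r, '" + escapedPath + "')"
                + " AND r.[" + propertyName + "] IS NOT NULL";
    }

    public static void main(String[] args) {
        // Example values are invented for illustration.
        System.out.println(propertyExistsQuery("/content/mysite/page", "myConfiguredProp"));
    }
}
```

The ISDESCENDANTNODE condition is what keeps the traversal confined to
the small subtree when no index applies.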

Regards
Julian


On Thu, Jun 16, 2016 at 4:44 PM, Steven Walters <kemu...@gmail.com> wrote:
> Hopefully other people chime in here. I've only had bad experiences
> with queries, to the point that I personally never use them, so I
> always end up iterating/navigating myself.
>
> Theoretically, if you have a REALLY GOOD index then you may get
> similar performance, but if your index(es) are inefficient, then it's
> just wasted CPU cycles (you'd wish those CPU cycles were going to a
> good cause, but they're not).
>
> The transition of Sling (and AEM) from Jackrabbit 2.x to Oak made this
> experience worse, with the awkward indexing policies/process in Oak
> and the fact that Oak never seemed to use multiple indexes.
> Oak always seemed to calculate the cost of the entire query against
> all the available indexes and then choose only the ONE best index.
> This sounds like a good idea in theory, but most DBMSs I've used
> in the past utilize ALL the indexes they can, not just one.
>
> So basically I guess this comes down to: "If you have a good index (in
> that it covers ALL the conditions/attributes/properties of your
> query) then using a query should be fine, otherwise iterate yourself."
> Having any condition missing from the index can be fatal for
> performance. For example, lacking evaluatePathRestrictions = true
> is basically death of the system if you have a lot of
> content.
>
> But really, I hope some other people with more positive experiences
> can provide some better advice.
>
> On Thu, Jun 16, 2016 at 11:08 PM, Roy Teeuwen <r...@teeuwen.be> wrote:
>> Ok, it would be handy to have an estimate of the approximate number / levels
>> of resources at which to go for iterating vs querying :).
>>
>> Greets
>> Roy
>>> On 16 Jun 2016, at 16:06, Steven Walters <kemu...@gmail.com> wrote:
>>>
>>> If you know there are that few resources, then I'd say iterating would
>>> perform better than XPath / JCR-SQL2 queries.
>>> This is primarily past experience speaking, in that queries have
>>> generally turned out (often MUCH) slower than directly iterating if you
>>> know what you're actually looking for.
>>>
>>>
>>> On Thu, Jun 16, 2016 at 10:28 PM, Roy Teeuwen <r...@teeuwen.be> wrote:
>>>
>>>> Hello all,
>>>>
>>>> Let's say I have a resource with around 10-20 child/grand-child resources,
>>>> not going deeper than 3 levels max. What is the most performant way of
>>>> searching for the child resources containing a specific property (the
>>>> property is configurable with OSGi, so hard to put an index on it):
>>>> iterating the child / grand-child resources until you find it, or making an
>>>> XPath/JCR-SQL2 query? When would one option start to be more performant
>>>> than the other?
>>>>
>>>> Thanks!
>>>> Roy
>>
