Thanks for the hint!
We have written a small service dedicated to calculating facets.
It splits the huge AQL query I posted above into 5 queries:
- main - to filter, sort and retrieve matching entities:
-
LET docs = (FOR a IN Asset
    FILTER a.name LIKE 'test-asset-%'
    SORT a.name
    RETURN a)
RETURN {
    counts: (RETURN {
        total: LENGTH(docs),
        offset: 0,
        to: 5
    }),
    items: (FOR a IN docs LIMIT 0, 5 RETURN a)
}
- 4 small ones to purely calculate facets:
-
LET docs = (FOR a IN Asset
    FILTER a.name LIKE 'test-asset-%'
    RETURN a)
LET attributeX = (
    FOR a IN docs
        COLLECT attr = a.attributeX WITH COUNT INTO length
        RETURN { value: attr, count: length }
)
RETURN {
    counts: (RETURN {
        total: LENGTH(docs),
        offset: 0,
        to: -1,
        facets: {
            attributeX: {
                from: 0,
                to: 1000,
                total: LENGTH(attributeX)
            }
        }
    }),
    facets: {
        attributeX: (FOR a IN attributeX LIMIT 0, 1000 RETURN a)
    }
}
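Since the four facet queries differ only in the attribute name, they could be generated from one template. The following is just a rough sketch of that idea, not our actual service code: the class FacetQueries, its method names and the {attr} placeholder are made up for illustration, and the attribute name is spliced in as plain text, so only simple identifiers are accepted.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (illustration only): builds the facet-only AQL text per attribute.
final class FacetQueries {

    // Same shape as the facet query above; {attr} marks where the attribute name goes.
    private static final String TEMPLATE = String.join("\n",
        "LET docs = (FOR a IN Asset",
        "    FILTER a.name LIKE 'test-asset-%'",
        "    RETURN a)",
        "LET {attr} = (",
        "    FOR a IN docs",
        "        COLLECT attr = a.{attr} WITH COUNT INTO length",
        "        RETURN { value: attr, count: length }",
        ")",
        "RETURN {",
        "    counts: (RETURN {",
        "        total: LENGTH(docs),",
        "        offset: 0,",
        "        to: -1,",
        "        facets: { {attr}: { from: 0, to: 1000, total: LENGTH({attr}) } }",
        "    }),",
        "    facets: { {attr}: (FOR a IN {attr} LIMIT 0, 1000 RETURN a) }",
        "}");

    /** Returns the facet query for a single attribute. */
    static String forAttribute(String attribute) {
        // The name is spliced into the query text, so accept plain identifiers only.
        if (!attribute.matches("[A-Za-z_][A-Za-z0-9_]*")) {
            throw new IllegalArgumentException("not a plain attribute name: " + attribute);
        }
        return TEMPLATE.replace("{attr}", attribute);
    }

    /** Returns one facet query per attribute, in the given order. */
    static List<String> forAttributes(List<String> attributes) {
        List<String> queries = new ArrayList<>();
        for (String attribute : attributes) {
            queries.add(forAttribute(attribute));
        }
        return queries;
    }
}
```

Note that an attribute whose name collides with a variable already used in the template (docs, a, attr, length) would need a different LET variable name.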
We run these using Java 8's Fork/Join framework, essentially doing
Map (split into sub-queries) / Reduce (merge results), potentially against
an ArangoDB cluster.
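To give an idea of the Map/Reduce step, here is a minimal Fork/Join sketch. It is a simplification under assumptions, not our real code: the class names are invented, the executor function is only a stand-in for whatever driver call actually sends the AQL to ArangoDB, and the merge is a shallow putAll that assumes each sub-query returns distinct top-level keys (the nested counts/facets results above would need a deeper merge).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;
import java.util.function.Function;

// Hypothetical sketch: fan sub-queries out with Fork/Join, then merge the partial results.
final class SubQueryFanOut {

    /** Runs one sub-query; 'executor' stands in for the real database call. */
    static final class QueryTask extends RecursiveTask<Map<String, Object>> {
        private final String query;
        private final Function<String, Map<String, Object>> executor;

        QueryTask(String query, Function<String, Map<String, Object>> executor) {
            this.query = query;
            this.executor = executor;
        }

        @Override
        protected Map<String, Object> compute() {
            return executor.apply(query);
        }
    }

    /** Map: one task per sub-query. Reduce: merge the partial results in query order. */
    static final class FanOutTask extends RecursiveTask<Map<String, Object>> {
        private final List<String> queries;
        private final Function<String, Map<String, Object>> executor;

        FanOutTask(List<String> queries, Function<String, Map<String, Object>> executor) {
            this.queries = queries;
            this.executor = executor;
        }

        @Override
        protected Map<String, Object> compute() {
            List<QueryTask> tasks = new ArrayList<>();
            for (String query : queries) {
                tasks.add(new QueryTask(query, executor));
            }
            invokeAll(tasks); // forks the tasks and waits until all of them are done
            Map<String, Object> merged = new LinkedHashMap<>();
            for (QueryTask task : tasks) {
                merged.putAll(task.join()); // shallow merge; later keys overwrite earlier ones
            }
            return merged;
        }
    }

    /** Runs all sub-queries on the common pool and returns the merged result. */
    static Map<String, Object> run(List<String> queries,
                                   Function<String, Map<String, Object>> executor) {
        return ForkJoinPool.commonPool().invoke(new FanOutTask(queries, executor));
    }
}
```

The common pool is sized for CPU-bound work, so for blocking database calls a dedicated ForkJoinPool with higher parallelism may be the better choice.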
We run a custom ArangoDB build from Jan's feature branch, and the results
are pretty impressive.
Remember, we started at 140 seconds for the full AQL query and are now
down to 11 seconds (with sort/filter + skiplist on name) or 4 seconds (w/o
sort/filter and 4 facets).
I think we are satisfied for now :-) and hope this PR will make it into
'master'.
Also, it would be awesome to see the AQL pipeline advanced in the
future to accommodate more analytical types of queries (facets with
sub-facets, ElasticSearch-style sub-grouping, etc.).