+1 Erik
Using [$start to $end] is definitely not efficient in this case. As noted,
it requires reading all the values and creating a copy of them (using a syntax I
don't quite follow: ' || $o1 || ').
Is this maybe a snippet of a larger expression that generates a string and
evals it?
1) ( for .. return expr ) [ $start to $end ] -> in general does not optimize
as well -- it requires generating the entire sequence before evaluating the
range.
2) searchable-function()[ $start to $end ] -> in general *can* optimize well
-- it is equivalent to
fn:subsequence( searchable-function() , $start , $end - $start + 1 )
and will optimize in the same cases as that fn:subsequence() call. Even better,
if the function takes arguments for range and ordering, use those instead of
generic methods; they were most likely added to the function specifically to
allow better optimizations.
3) Code generation / xdmp:eval() -> I am inferring from the "||" expressions
that you are creating a string and then running xdmp:eval(). If at all possible,
do not do that.
This boils down to Erik's suggestion: simply one call
cts:search( /path , $query , < options for sort order > )[$start to $end]
/fn:value()
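
A minimal sketch of that single call, under several assumptions: the two
namespace URIs are placeholders, a date range index on
prism:publicationDate is assumed to exist (the query trace reports a range
ordering constraint on it), and the three variables are stand-ins for
values the legacy code computes elsewhere.

```xquery
xquery version "1.0-ml";

(: Placeholder namespace URIs: substitute the ones your documents use. :)
declare namespace npg   = "http://example.com/placeholder/npg";
declare namespace prism = "http://example.com/placeholder/prism";

(: Stand-ins for values built elsewhere in the application. :)
declare variable $final-query as cts:query :=
  cts:field-value-query("doi", "10.1038/212441a0");
declare variable $start as xs:integer := 1;
declare variable $end   as xs:integer := 10;

(: Order inside the search via the range index, then take one page. :)
fn:subsequence(
  cts:search(
    /record[.//npg:Article],
    $final-query,
    (
      "unfiltered",
      (: Assumes an element range index of type date on this element. :)
      cts:index-order(
        cts:element-reference(xs:QName("prism:publicationDate"), "type=date"),
        "descending")
    )
  ),
  $start,
  $end - $start + 1
)
```

With the ordering expressed as a search option there is no FLWOR and no
string to evaluate, so only the requested page of results should need to be
fetched.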
From: [email protected]
[mailto:[email protected]] On Behalf Of Erik Hennum
Sent: Tuesday, May 30, 2017 12:11 PM
To: MarkLogic Developer Discussion <[email protected]>
Subject: Re: [MarkLogic Dev General] Optimising XQuery Timeouts
Hi, Basavaraj:
I suspect some of the xdmp:value() call got lost along the way, given the
string concatenation operator || with no strings.
It might speed things up to execute fn:subsequence(cts:search(...), $start,
$end - $start + 1), specifying the sort order with cts:index-order() arguments
to cts:search() instead of evaluating a for iteration.
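
If the sort order has to vary per request, which may be why the original
code concatenates $o1 and $o2 into a string, one option is to map the
request parameters to a cts:order item. This is only a sketch: the
function name, the "date" sort key, and the prism namespace URI are
hypothetical, and it assumes a date range index on prism:publicationDate.

```xquery
xquery version "1.0-ml";

(: Placeholder namespace URI: substitute the one your documents use. :)
declare namespace prism = "http://example.com/placeholder/prism";

(: Hypothetical helper: turn a sort key and direction into a cts:order item. :)
declare function local:sort-order(
  $key as xs:string,
  $direction as xs:string
) as cts:order
{
  (: Accept only the two known directions; anything else means descending. :)
  let $dir := if ($direction eq "ascending") then "ascending" else "descending"
  return
    if ($key eq "date") then
      (: Assumes an element range index of type date on this element. :)
      cts:index-order(
        cts:element-reference(xs:QName("prism:publicationDate"), "type=date"),
        $dir)
    else
      (: Fall back to relevance ordering for unrecognised keys. :)
      cts:score-order($dir)
};

(: The result goes in the options argument of cts:search(). :)
local:sort-order("date", "descending")
```

Because the order is built as a value rather than as query text, no
xdmp:value() or xdmp:eval() call is involved.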
Erik Hennum
________________________________
From:
[email protected]<mailto:[email protected]>
[[email protected]] on behalf of Basavaraj Kalloli
[[email protected]]
Sent: Monday, May 29, 2017 11:25 AM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] Optimising XQuery Timeouts
Hi Erik,
Here is the full extract of the xdmp:value() call:
xdmp:value((for $i in cts:search(/record[.//npg:Article], $final-query,
('unfiltered' ), 0.0) order by || $o1 || $o2 || return $i)[$start to $end])
It looks like that query is bringing in everything from the database and then
ordering it. As I type this, I think it would be better to order as part
of the search query and return only the results from $start to $end.
I will also look into the empty not-queries and post more details. The problem
we have been having is that this is legacy code which is making it difficult to
investigate.
Thanks for the pointers. Could the above code be the reason the
queries are running slowly with concurrent requests?
Cheers,
Basavaraj
On Fri, May 26, 2017 at 3:58 PM, Erik Hennum
<[email protected]<mailto:[email protected]>> wrote:
Hi, Basavaraj:
Can you show the full xdmp:value() call?
It looks like the FLWOR expression is ordering based on an XPath into each
retrieved document. It would be more efficient to order within the
cts:search() call based on range indexes.
The $final-query reported by query trace has some odd subqueries of the form:
cts:not-query(cts:or-query((), ()), 1)
An empty cts:or-query() is always false, making its negation always true, so
these subqueries contribute no selectivity.
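
One way to keep such no-op subqueries out of the final query is to emit
the negation only when there is something to negate. A minimal sketch,
assuming the exclusions arrive as a possibly empty sequence of queries;
the helper name and the two variables are hypothetical:

```xquery
xquery version "1.0-ml";

(: Hypothetical helper: negate a set of exclusions, or return the
   empty sequence when there are none. :)
declare function local:exclude($queries as cts:query*) as cts:query?
{
  if (fn:empty($queries))
  then ()
  else cts:not-query(cts:or-query($queries))
};

(: Stand-ins for values built elsewhere in the application. :)
declare variable $excluded as cts:query* := ();
declare variable $collections as xs:string* :=
  ("lab_animal", "journals_nature");

(: An empty result from local:exclude() simply drops out of the sequence
   passed to cts:and-query(). :)
cts:and-query((
  local:exclude($excluded),
  cts:collection-query($collections),
  cts:field-value-query("doi", "10.1038/212441a0", ("lang=en"))
))
```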
You might look at expressing the searchable expression criteria as a subquery
with the $final-query. Also, you could try to call cts:search() directly
instead of evalling it with xdmp:value().
Finally, if you're retrieving a large number of documents, the best practice is
to page over the result set.
Hoping that helps,
Erik Hennum
________________________________
From:
[email protected]<mailto:[email protected]>
[[email protected]<mailto:[email protected]>]
on behalf of Basavaraj Kalloli
[[email protected]<mailto:[email protected]>]
Sent: Friday, May 26, 2017 4:13 AM
To: MarkLogic Developer Discussion
Subject: [MarkLogic Dev General] Optimising XQuery Timeouts
Hi All,
I have been trying to resolve a couple of our queries which time out every
couple of hours. I believe this is down to the number of concurrent requests.
Things evaluated/investigated:
* I turned on the debug logs to see if there are any deadlocks
o I couldn't find any traces of locks
* Next I profiled my query: there are 234 expressions, and
normally it returns in under 0.00738 secs
* Next I tried query-trace; the output looks like:
xdmp:value("(for $i in cts:search(/record[.//npg:Article], $final-query,
('u...")
Analyzing path for search: fn:collection()/record[descendant::npg:Article]
Step 1 is searchable: fn:collection()
Step 2 is searchable: record[descendant::npg:Article]
Path is fully searchable.
Gathering constraints.
Step 2 predicate 1 contributed 1 constraint: descendant::npg:Article
Step 2 predicate 1 contributed 2 constraints: descendant::npg:Article
Step 2 contributed 3 constraints: record[descendant::npg:Article]
Comparison contributed string scatter value constraint: xdmp:collection =
("http://ns.nature.com/graphs/articles-labanimal", "lab_animal",
"http://ns.nature.com/graphs/articles-nature", ...)
Search query contributed 1 constraint:
cts:and-query((cts:not-query(cts:or-query((), ()), 1),
cts:collection-query(("http://ns.nature.com/graphs/articles-labanimal",
"lab_animal", "http://ns.nature.com/graphs/articles-nature", "journals_nature",
"http://ns.nature.com/graphs/articles-palgrave", "journals_palgrave")),
cts:or-query(cts:field-value-query("doi", "10.1038/212441a0", ("lang=en"), 0),
()), cts:not-query(cts:or-query((), ()), 1), cts:not-query(cts:or-query((),
()), 1), cts:not-query(cts:or-query((), ()), 1)), ())
Order by clause contributed 1 range ordering constraint for $i: order by
xs:date($i/descendant::prism:publicationDate) descending
Ordering can be unfiltered.
Executing search.
Selected 1 fragment.
I don't see anything unusual in the output: no traces of unsearchable
expressions or missing indexes.
* I also tried query-meters:
<qm:query-meters xsi:schemaLocation="http://marklogic.com/xdmp/query-meters
query-meters.xsd" xmlns:qm="http://marklogic.com/xdmp/query-meters"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<qm:elapsed-time>PT0.003874S</qm:elapsed-time>
<qm:requests>0</qm:requests>
<qm:list-cache-hits>150</qm:list-cache-hits>
<qm:list-cache-misses>0</qm:list-cache-misses>
<qm:in-memory-list-hits>0</qm:in-memory-list-hits>
<qm:triple-cache-hits>0</qm:triple-cache-hits>
<qm:triple-cache-misses>0</qm:triple-cache-misses>
<qm:triple-value-cache-hits>0</qm:triple-value-cache-hits>
<qm:triple-value-cache-misses>0</qm:triple-value-cache-misses>
<qm:expanded-tree-cache-hits>1</qm:expanded-tree-cache-hits>
<qm:expanded-tree-cache-misses>0</qm:expanded-tree-cache-misses>
<qm:compressed-tree-cache-hits>0</qm:compressed-tree-cache-hits>
<qm:compressed-tree-cache-misses>0</qm:compressed-tree-cache-misses>
<qm:in-memory-compressed-tree-hits>0</qm:in-memory-compressed-tree-hits>
<qm:value-cache-hits>3</qm:value-cache-hits>
<qm:value-cache-misses>35</qm:value-cache-misses>
<qm:regexp-cache-hits>1</qm:regexp-cache-hits>
<qm:regexp-cache-misses>1</qm:regexp-cache-misses>
<qm:link-cache-hits>0</qm:link-cache-hits>
<qm:link-cache-misses>0</qm:link-cache-misses>
<qm:filter-hits>0</qm:filter-hits>
<qm:filter-misses>0</qm:filter-misses>
<qm:fragments-added>0</qm:fragments-added>
<qm:fragments-deleted>0</qm:fragments-deleted>
<qm:fs-program-cache-hits>0</qm:fs-program-cache-hits>
<qm:fs-program-cache-misses>0</qm:fs-program-cache-misses>
<qm:db-program-cache-hits>0</qm:db-program-cache-hits>
<qm:db-program-cache-misses>0</qm:db-program-cache-misses>
<qm:env-program-cache-hits>0</qm:env-program-cache-hits>
<qm:env-program-cache-misses>0</qm:env-program-cache-misses>
<qm:fs-main-module-sequence-cache-hits>0</qm:fs-main-module-sequence-cache-hits>
<qm:fs-main-module-sequence-cache-misses>0</qm:fs-main-module-sequence-cache-misses>
<qm:db-main-module-sequence-cache-hits>0</qm:db-main-module-sequence-cache-hits>
<qm:db-main-module-sequence-cache-misses>0</qm:db-main-module-sequence-cache-misses>
<qm:fs-library-module-cache-hits>0</qm:fs-library-module-cache-hits>
<qm:fs-library-module-cache-misses>0</qm:fs-library-module-cache-misses>
<qm:db-library-module-cache-hits>0</qm:db-library-module-cache-hits>
<qm:db-library-module-cache-misses>0</qm:db-library-module-cache-misses>
<qm:fragments>
<qm:fragment>
<qm:root xmlns="">record</qm:root>
<qm:expanded-tree-cache-hits>1</qm:expanded-tree-cache-hits>
<qm:expanded-tree-cache-misses>0</qm:expanded-tree-cache-misses>
</qm:fragment>
</qm:fragments>
<qm:documents>
<qm:document>
<qm:uri>/n5061/xml/212441a0.xml</qm:uri>
<qm:expanded-tree-cache-hits>1</qm:expanded-tree-cache-hits>
<qm:expanded-tree-cache-misses>0</qm:expanded-tree-cache-misses>
</qm:document>
</qm:documents>
<qm:hosts>
<qm:host>
<qm:host-name>#removed#</qm:host-name>
<qm:round-trip-count>1</qm:round-trip-count>
<qm:round-trip-time>PT0.000921S</qm:round-trip-time>
</qm:host>
<qm:host>
<qm:host-name>#removed#</qm:host-name>
<qm:round-trip-count>1</qm:round-trip-count>
<qm:round-trip-time>PT0.000856S</qm:round-trip-time>
</qm:host>
</qm:hosts>
</qm:query-meters>
The only thing that strikes me is the value cache misses. I don't
know whether I can do anything about them, or whether there is anything else I
could try. I am running out of ideas, so it would be great if anyone could
share some thoughts/pointers.
Thanks,
Basavaraj Kalloli
_______________________________________________
General mailing list
[email protected]<mailto:[email protected]>
Manage your subscription at:
http://developer.marklogic.com/mailman/listinfo/general