+1 Erik
Using [$start to $end] is definitely not efficient in this case.  As noted, 
it requires reading all the values and creating a copy of them (using a syntax 
I don't quite get:  ' || $o1 || ').
Is this maybe a snippet of a larger expression that generates a string and 
evals it?

1)  ( for .. return expr )[ $start to $end ]  -> in general does not optimize 
as well -- the entire sequence has to be generated before the range is 
applied.

2)  searchable-function()[ $start to $end ]  -> in general *can* optimize well 
-- it is equivalent to

    fn:subsequence( searchable-function() , $start , $end - $start + 1 )

and will optimize in the same cases.  Even better, if the function takes 
arguments for range and ordering, use those instead of the generic methods; 
they were most likely added to the function specifically to allow better 
optimizations.

3) Code Generation / xdmp:eval() -> I am inferring from the "||" expressions 
that you are creating a string and then running xdmp:eval().  If at all 
possible, do not do that.

This boils down to Erik's suggestion and simply one call:

    cts:search( /path , $query , < options for sort order > )[$start to $end]/fn:value()
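
To make points 2 and 3 concrete, here is a rough sketch of what a direct call 
might look like, with no string building and no xdmp:value().  It is not the 
original code: the namespace URIs, the local:order() helper, the sort keys and 
the input values are all made up for illustration, and it assumes a date 
element range index exists on prism:publicationDate (the query trace suggests 
one does, but I can't confirm that from here).

```xquery
xquery version "1.0-ml";

(: Placeholder namespace URIs; substitute the ones your documents use. :)
declare namespace npg   = "http://example.com/ns/npg";
declare namespace prism = "http://example.com/ns/prism";

(: Hypothetical helper: turn a caller-supplied sort key into ordering options,
   replacing the order-by text that was being concatenated into a string.
   Assumes a date element range index on prism:publicationDate. :)
declare function local:order($sort-key as xs:string) as cts:order*
{
  let $date-ref :=
    cts:element-reference(xs:QName("prism:publicationDate"), "type=date")
  return
    if ($sort-key eq "date-desc") then cts:index-order($date-ref, "descending")
    else if ($sort-key eq "date-asc") then cts:index-order($date-ref, "ascending")
    else cts:score-order("descending")
};

(: Hypothetical inputs; the real values come from the calling code. :)
let $final-query := cts:collection-query("journals_nature")
let $start := 1
let $end := 10
return
  (: The ordering is passed to cts:search() itself and only the requested
     page is taken from the result. :)
  fn:subsequence(
    cts:search(
      /record[.//npg:Article],
      $final-query,
      ("unfiltered", local:order("date-desc"))
    ),
    $start,
    $end - $start + 1
  )
```

Because the ordering is chosen by ordinary XQuery logic rather than by pasting 
text into a query string, the search can be called directly instead of being 
evaluated from a generated string.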



From: [email protected] 
[mailto:[email protected]] On Behalf Of Erik Hennum
Sent: Tuesday, May 30, 2017 12:11 PM
To: MarkLogic Developer Discussion <[email protected]>
Subject: Re: [MarkLogic Dev General] Optimising XQuery Timeouts

Hi, Basavaraj:

I suspect some of the xdmp:value() call got lost along the way, given the 
string concatenation operator || with no strings.

It might speed things up to execute fn:subsequence(cts:search(...), $start, 
$end - $start + 1), specifying the sort order with cts:index-order() arguments 
to cts:search() instead of evaluating a for iteration.

Erik Hennum

________________________________
From: 
[email protected]<mailto:[email protected]>
 [[email protected]] on behalf of Basavaraj Kalloli 
[[email protected]]
Sent: Monday, May 29, 2017 11:25 AM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] Optimising XQuery Timeouts

Hi Erik,

Here is the full extract of the xdmp:value() call:

xdmp:value((for $i in cts:search(/record[.//npg:Article], $final-query, 
('unfiltered' ), 0.0) order by || $o1 || $o2 || return $i)[$start to $end])

It looks like that query brings in everything from the database and then 
orders it. As I am typing this, I think it would be better to order as part of 
the search query and return only the results from $start to $end.

I will also look into the empty not-queries and post more details. The problem 
we have been having is that this is legacy code, which makes it difficult to 
investigate.

Thanks for the pointers. Could the code above be the reason the queries run 
slowly with concurrent requests?

Cheers,
Basavaraj

On Fri, May 26, 2017 at 3:58 PM, Erik Hennum 
<[email protected]<mailto:[email protected]>> wrote:
Hi, Basavaraj:

Can you show the full xdmp:value() call?

It looks like the FLWOR expression is ordering based on an XPath into each 
retrieved document.  It would be more efficient to order within the 
cts:search() call based on range indexes.

The $final-query reported by query trace has some odd subqueries of the form:
cts:not-query(cts:or-query((), ()), 1)
An empty cts:or-query() is always false, making the negation always true, so 
these subqueries contribute no selectivity.
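
As a rough sketch of one way to avoid emitting those empty subqueries, the 
code that assembles $final-query could add a negation only when there is 
something to negate.  The local:exclude() helper, the $excluded-dois variable 
and the literal values are hypothetical; I don't know how the legacy code 
actually builds the query.

```xquery
xquery version "1.0-ml";

(: Hypothetical helper: negate a set of exclusion queries, or return the
   empty sequence when there is nothing to exclude. :)
declare function local:exclude($queries as cts:query*) as cts:query?
{
  if (fn:empty($queries))
  then ()
  else cts:not-query(cts:or-query($queries))
};

(: Hypothetical inputs for illustration. :)
let $excluded-dois := ()
let $exclusions :=
  for $doi in $excluded-dois
  return cts:field-value-query("doi", $doi)
return
  (: An empty result from local:exclude() simply drops out of the sequence,
     so no always-true not-query ends up in the and-query. :)
  cts:and-query((
    local:exclude($exclusions),
    cts:collection-query("journals_nature"),
    cts:field-value-query("doi", "10.1038/212441a0")
  ))
```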

You might look at expressing the searchable expression criteria as a subquery 
with the $final-query.  Also, you could try to call cts:search() directly 
instead of evalling it with xdmp:value().

Finally, if you're retrieving a large number of documents, the best practice is 
to page over the result set.


Hoping that helps,


Erik Hennum

________________________________
From: 
[email protected]<mailto:[email protected]>
 
[[email protected]<mailto:[email protected]>]
 on behalf of Basavaraj Kalloli 
[[email protected]<mailto:[email protected]>]
Sent: Friday, May 26, 2017 4:13 AM
To: MarkLogic Developer Discussion
Subject: [MarkLogic Dev General] Optimising XQuery Timeouts

Hi All,

I have been trying to resolve a couple of our queries which time out every 
couple of hours. I believe this is down to the number of concurrent requests. 
Things evaluated/investigated:

* I turned on the debug logs to see if there are any deadlocks
  o I couldn't find any traces of locks
* Next I profiled my query: there are 234 expressions and it normally 
returns in under 0.00738 secs
* Next I tried query-trace; the output looks like:

xdmp:value("(for $i in cts:search(/record[.//npg:Article], $final-query, 
('u...")
Analyzing path for search: fn:collection()/record[descendant::npg:Article]
Step 1 is searchable: fn:collection()
Step 2 is searchable: record[descendant::npg:Article]
Path is fully searchable.
Gathering constraints.
Step 2 predicate 1 contributed 1 constraint: descendant::npg:Article
Step 2 predicate 1 contributed 2 constraints: descendant::npg:Article
Step 2 contributed 3 constraints: record[descendant::npg:Article]
Comparison contributed string scatter value constraint: xdmp:collection = 
("http://ns.nature.com/graphs/articles-labanimal", "lab_animal", 
"http://ns.nature.com/graphs/articles-nature", ...)
Search query contributed 1 constraint: 
cts:and-query((cts:not-query(cts:or-query((), ()), 1), 
cts:collection-query(("http://ns.nature.com/graphs/articles-labanimal", 
"lab_animal", "http://ns.nature.com/graphs/articles-nature", "journals_nature", 
"http://ns.nature.com/graphs/articles-palgrave", "journals_palgrave")), 
cts:or-query(cts:field-value-query("doi", "10.1038/212441a0", ("lang=en"), 0), 
()), cts:not-query(cts:or-query((), ()), 1), cts:not-query(cts:or-query((), 
()), 1), cts:not-query(cts:or-query((), ()), 1)), ())
Order by clause contributed 1 range ordering constraint for $i: order by 
xs:date($i/descendant::prism:publicationDate) descending
Ordering can be unfiltered.
Executing search.
Selected 1 fragment.
I don't see anything unusual in the output: no traces of unsearchable 
expressions or missing indexes.

* I also tried query-meters:

<qm:query-meters xsi:schemaLocation="http://marklogic.com/xdmp/query-meters 
query-meters.xsd" xmlns:qm="http://marklogic.com/xdmp/query-meters" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <qm:elapsed-time>PT0.003874S</qm:elapsed-time>
  <qm:requests>0</qm:requests>
  <qm:list-cache-hits>150</qm:list-cache-hits>
  <qm:list-cache-misses>0</qm:list-cache-misses>
  <qm:in-memory-list-hits>0</qm:in-memory-list-hits>
  <qm:triple-cache-hits>0</qm:triple-cache-hits>
  <qm:triple-cache-misses>0</qm:triple-cache-misses>
  <qm:triple-value-cache-hits>0</qm:triple-value-cache-hits>
  <qm:triple-value-cache-misses>0</qm:triple-value-cache-misses>
  <qm:expanded-tree-cache-hits>1</qm:expanded-tree-cache-hits>
  <qm:expanded-tree-cache-misses>0</qm:expanded-tree-cache-misses>
  <qm:compressed-tree-cache-hits>0</qm:compressed-tree-cache-hits>
  <qm:compressed-tree-cache-misses>0</qm:compressed-tree-cache-misses>
  <qm:in-memory-compressed-tree-hits>0</qm:in-memory-compressed-tree-hits>
  <qm:value-cache-hits>3</qm:value-cache-hits>
  <qm:value-cache-misses>35</qm:value-cache-misses>
  <qm:regexp-cache-hits>1</qm:regexp-cache-hits>
  <qm:regexp-cache-misses>1</qm:regexp-cache-misses>
  <qm:link-cache-hits>0</qm:link-cache-hits>
  <qm:link-cache-misses>0</qm:link-cache-misses>
  <qm:filter-hits>0</qm:filter-hits>
  <qm:filter-misses>0</qm:filter-misses>
  <qm:fragments-added>0</qm:fragments-added>
  <qm:fragments-deleted>0</qm:fragments-deleted>
  <qm:fs-program-cache-hits>0</qm:fs-program-cache-hits>
  <qm:fs-program-cache-misses>0</qm:fs-program-cache-misses>
  <qm:db-program-cache-hits>0</qm:db-program-cache-hits>
  <qm:db-program-cache-misses>0</qm:db-program-cache-misses>
  <qm:env-program-cache-hits>0</qm:env-program-cache-hits>
  <qm:env-program-cache-misses>0</qm:env-program-cache-misses>
  <qm:fs-main-module-sequence-cache-hits>0</qm:fs-main-module-sequence-cache-hits>
  <qm:fs-main-module-sequence-cache-misses>0</qm:fs-main-module-sequence-cache-misses>
  <qm:db-main-module-sequence-cache-hits>0</qm:db-main-module-sequence-cache-hits>
  <qm:db-main-module-sequence-cache-misses>0</qm:db-main-module-sequence-cache-misses>
  <qm:fs-library-module-cache-hits>0</qm:fs-library-module-cache-hits>
  <qm:fs-library-module-cache-misses>0</qm:fs-library-module-cache-misses>
  <qm:db-library-module-cache-hits>0</qm:db-library-module-cache-hits>
  <qm:db-library-module-cache-misses>0</qm:db-library-module-cache-misses>
  <qm:fragments>
    <qm:fragment>
      <qm:root xmlns="">record</qm:root>
      <qm:expanded-tree-cache-hits>1</qm:expanded-tree-cache-hits>
      <qm:expanded-tree-cache-misses>0</qm:expanded-tree-cache-misses>
    </qm:fragment>
  </qm:fragments>
  <qm:documents>
    <qm:document>
      <qm:uri>/n5061/xml/212441a0.xml</qm:uri>
      <qm:expanded-tree-cache-hits>1</qm:expanded-tree-cache-hits>
      <qm:expanded-tree-cache-misses>0</qm:expanded-tree-cache-misses>
    </qm:document>
  </qm:documents>
  <qm:hosts>
    <qm:host>
      <qm:host-name>#removed#</qm:host-name>
      <qm:round-trip-count>1</qm:round-trip-count>
      <qm:round-trip-time>PT0.000921S</qm:round-trip-time>
    </qm:host>
    <qm:host>
      <qm:host-name>#removed#</qm:host-name>
      <qm:round-trip-count>1</qm:round-trip-count>
      <qm:round-trip-time>PT0.000856S</qm:round-trip-time>
    </qm:host>
  </qm:hosts>
</qm:query-meters>

The only thing that strikes me is that there are value cache misses. I don't 
know whether I can do anything about them, or whether there is anything else 
I could try. I am running out of ideas, so it would be great if anyone could 
share some thoughts/pointers.

Thanks,
Basavaraj Kalloli

_______________________________________________
General mailing list
[email protected]<mailto:[email protected]>
Manage your subscription at:
http://developer.marklogic.com/mailman/listinfo/general

