Jakob,

I'm fairly confused about what you're trying to do with this latest query: I don't see how it relates to anything else in this thread.

There is very little performance difference between cts:search() and an equivalent XPath expression. Both make use of the same indexes. The main difference is that XPath results are in document order, while cts:search orders by relevance.

I'm not sure why you're seeing XDMP-EXPNTREECACHEFULL with your cts:search, but there are only two possibilities. Either your cts:search() returns too many matches to fit into the expanded-tree cache, or the rather odd second argument (starting with collection... was that intentional?) returns too many matches for the expanded-tree cache. Note that cts:search() expects a cts:query as its second argument, not the result of an XPath expression. Generally, the solution to either problem is to limit the number of matches.
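As a rough sketch of what limiting the matches might look like: the query below passes a cts:query as the second argument and keeps only the first ten results. It assumes that the status attribute sits on cr:query as in your message, and that an attribute value query is what you are after; the page size of 10 is arbitrary. Note that it returns matching documents, not the cr:query elements themselves.

```xquery
xquery version "1.0-ml";
declare namespace cr = "http://www.crossref.org/qrschema/2.0";

(: Sketch only: element, attribute, and collection names are taken from
   the thread; the choice of query constructor is an assumption. :)
cts:search(
  fn:collection("collection-2009-7-11"),
  cts:element-attribute-value-query(
    xs:QName("cr:query"), xs:QName("status"), "resolved")
)[1 to 10]
```

The [1 to 10] predicate is the same pagination idea the tutorial below describes: only the requested slice of results needs to be brought into the expanded-tree cache.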

The tutorial at http://developer.marklogic.com/howto/tutorials/2006-09-paginated-search.xqy might be helpful to you - it discusses this issue and others. The Search Developer Guide (http://developer.marklogic.com/pubs/4.1/books/search-dev-guide.pdf) might be helpful, too.

If you are using 4.1, you might also want to look at the search:search() API as an alternative to cts:search. It also has a tutorial and is also covered in the search guide.

  http://developer.marklogic.com/pubs/4.1/apidocs/SearchAPI.html
  http://developer.marklogic.com/howto/tutorials/2009-07-search-api-walkthrough.xqy
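A minimal sketch of a search:search() call, assuming 4.1 with the Search API module at its standard location. The query text "resolved" and the page length of 10 are illustrative only; with default options this is a plain word search and is not constrained to the status attribute.

```xquery
xquery version "1.0-ml";
import module namespace search = "http://marklogic.com/appservices/search"
  at "/MarkLogic/appservices/search/search.xqy";

(: Default options, starting at result 1, page length 10. :)
search:search("resolved", (), 1, 10)
```

The result is a search:response element that carries the total count and the requested page of matches, so pagination is handled by the start and page-length arguments.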

BTW, it may be that you expect fn:collection("collection-2009-7-11")/cr:crossref_result/cr:query_result[1] to return at most one node. That isn't how XPath works: that expression could return any number of nodes. The positional predicate is evaluated separately for each context node, not for the sequence of all nodes. You might have meant something more like (//a)[1].
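A small self-contained illustration of the difference, using a made-up in-memory document (the element names root, item, and a are hypothetical and have nothing to do with your data):

```xquery
xquery version "1.0";

let $doc :=
  <root>
    <item><a>1</a><a>2</a></item>
    <item><a>3</a><a>4</a></item>
  </root>
return (
  (: [1] is applied per context node: the first a child of every item :)
  fn:count($doc/item/a[1]),
  (: parentheses build the whole sequence first, then [1] picks one node :)
  fn:count(($doc/item/a)[1])
)
```

The first count is 2 (one a per item) and the second is 1.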

-- Mike

On 2009-07-16 12:40, Jakob Fix wrote:
Mike,

thanks for your reply, I agree with the points you make regarding namespaces.

Moving on, would it not be faster to use something like the cts:query
expressions shown in my original message instead of a simple xpath?  I
had the impression that it was recommended to me by one of the posters
(you?) to create an index in order to accelerate the search.


I'm also trying this (which gives an "expanded tree cache full" error :( ):

declare namespace cr = "http://www.crossref.org/qrschema/2.0";

(xdmp:query-trace(true()),

cts:search(
   fn:doc(),
   fn:collection("collection-2009-7-11")/cr:crossref_result/cr:query_result[1]/cr:body[1]/cr:query[@status='resolved']
),

xdmp:query-trace(false()))

[1.0-ml] XDMP-EXPNTREECACHEFULL: xdmp:eval("declare namespace cr =
"http://www.crossref.org/qrschema/2....", (), <options
xmlns="xdmp:eval"><database>10374816636749856048</database><modules>10374816636749...</options>)
-- Expanded tree cache full on host

So, I don't even see the output of the trace ...

cheers,
Jakob.



On Thu, Jul 16, 2009 at 20:21, Michael Blakeley <[email protected]> wrote:
Jakob,

Do you really want 'query' in *any* namespace? It looks to me like 'query'
is in the empty namespace, and is always a child of the root 'result', so I
would write '/result/query' or '//query' instead of '//*:query'. If you need
to find 'query' in multiple namespaces, I recommend enumerating all the
possibilities.

Expressions using '*:' are best avoided in production code. They tend to
introduce bugs into your application, and they can't be resolved using the
server's indexes. While '*:' expressions can be useful when debugging, they
should be removed as soon as possible. When doing code reviews, I treat them
as a red flag.

thanks,
-- Mike

On 2009-07-16 04:38, Jakob Fix wrote:
Here I am again ...

1) added a number of test items to the collection "test"
2) each document contains xml like this

<result>
   ....
   <query key="555-555" status="resolved" fl_count="0">
     <doi type="journal_title">10.1787/1684341x</doi>
     <issn type="print">16095316</issn>
     <journal_title>Documents de l OCDE</journal_title>
   </query>
</result>

3) I am interested in all documents in the "test" collection that match
the XPath //*:query[@status="resolved"] on the one side, and
[@status="unresolved"] on the other side - using the XPath directly
works, but is too slow over many thousand documents.

4) I've created an attribute range index for xs:string, "query",
"status" (no namespaces defined; btw, I also created an
element-attribute word index, but it seems that this is not necessary)

5) I was hoping the following query would return the expected results,
but it doesn't:

cts:search(fn:doc(),
    cts:and-query((
      cts:collection-query(("test")),
      cts:element-attribute-word-match(xs:QName("query"),
xs:QName("status"), "unresolved")
    ))
)
return xdmp:node-uri($x)

6) three of the four test documents have a @status="resolved" and one
"unresolved" so I expected one uri for the above query.  However, the
result is this:
/data/2009/07/16/1684341X.xml
/data/2009/07/16/16097513.xml
/data/2009/07/16/16812328.xml
/data/2009/07/16/16097408.xml

I do get an empty sequence when asking for @status="resolved" ...  Is
this just a configuration problem, or is my query wrongly constructed?

Thanks, as usual, for your help,
Jakob.
_______________________________________________
General mailing list
[email protected]
http://xqzone.com/mailman/listinfo/general