My impression (no guarantees) is that the cts:not-query *should* work and be equivalent to the XPath expression if the following two conditions are true:

- The searchable expression (/terminology/conceptDef) is the fragment root, and
- The definingConcepts and concept elements are guaranteed to appear only as a direct child (not a descendant of arbitrary depth) of their stated parent elements.

This latter condition arises because cts:element-query matches the specified element as a descendant of the context node at any depth, and thus is not equivalent to David's XPath expression.

Hope that helps and doesn't make things more confusing. :-)

Doug Glidden
Software Engineer
The Boeing Company
[email protected]

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Danny Sokolsky
Sent: Thursday, November 19, 2009 16:45
To: General Mark Logic Developer Discussion
Subject: RE: [MarkLogic Dev General] search for non-existing elements

To really tell if it will help, try adding an xdmp:query-trace to your query and then look at how many fragments are selected. I did the following experiment and it seems Geert's not-query of an element-query has some promise:

    xquery version "1.0-ml";

    xdmp:document-insert("a.xml", <foo></foo>),
    xdmp:document-insert("b.xml", <foo><bar/></foo>)
    ;
    xquery version "1.0-ml";

    count(/foo)
    (: returns 2 :)
    ;
    xquery version "1.0-ml";

    xdmp:query-trace(fn:true()),
    /foo[cts:contains(., cts:not-query(
        cts:element-query(xs:QName("bar"), cts:and-query( () ) ) ) )]
    (: log shows:
       2009-11-19 13:37:49.217 Info: danny: line 4: Selected 1 fragment to filter :)
    ;
    xquery version "1.0-ml";

    xdmp:query-trace(fn:true()),
    /foo[empty(bar)]
    (: log shows:
       2009-11-19 13:42:35.936 Info: danny: line 4: Selected 2 fragments to filter :)

So for this small test case, the not-query of the element-query seems to work (with Doug's caution).

-Danny

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Glidden, Douglass A
Sent: Thursday, November 19, 2009 1:25 PM
To: [email protected]
Subject: RE: [MarkLogic Dev General] search for non-existing elements

A word of warning: unless you're 100% certain that your indices are set up such that the query inside the cts:not-query will _never_ need filtering, cts:not-query can miss correct results.
The cts:not-query is one of the few queries that can err by failing to return positive matches (i.e. it gives false negatives instead of false positives). Not saying you can't use it, just be very careful...

Doug Glidden
Software Engineer
The Boeing Company
[email protected]

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Geert Josten
Sent: Thursday, November 19, 2009 16:16
To: General Mark Logic Developer Discussion
Subject: RE: [MarkLogic Dev General] search for non-existing elements

Hi David,

> Each conceptDef is a fragment (its parent, terminology is a fragment
> Root) I rewrote it as this :

I don't think that defining 'terminology' as fragment root puts each conceptDef in its own fragment. Actually, as 'terminology' seems to be the root element of your document, it would put the whole document in a single fragment, which isn't very helpful.

> My guess is that the non-existence of elements is not
> indexed. I would think there would be some way in ML to
> index that information but I can't find it.

It is pretty difficult to put something in an index that isn't there.. :-)

> Instead what I did was pre-process the XML file to generate a 'tree
> structure' of the elements containing just the id's.
> Like
>
> <concept id="1">
>   <concept id="2">
>     <concept id="3"/>
>     <concept id="4"/>
>   </concept>
> </concept>
>
> This tree xml is 4MB instead of 50MB and processing through it to find
> out what things have child elements is almost instant.
>
> Then I cross reference back to the main XML using the id and that's
> instant as well.

Yes, this works reasonably well, unless there are lots of updates, which means updating this tree each time as well. Note: you could store this tree in the document properties if you like.

> So while this isn't blocking me, I'm curious. Is there a way to tell
> ML to index the non-existence of elements or attributes? I would
> think that's a very common search.
> I tried searching for word matches with the empty text but that
> didn't work.

Well.. There is the cts:not-query. But it usually doesn't work as expected. The documentation states:

"The cts:not-query constructor is fragment-based, so it returns true only if the specified query does not produce a match anywhere in a fragment."

It is best to read the full explanation:

http://developer.marklogic.com/pubs/4.1/apidocs/cts-query.html#cts:not-query

A bit of a guess, but it might work if you would do the following (untested!):

    cts:search(doc("/NDFRT/NDFRT_Public_2009.05.12_TDE.xml")//conceptDef,
      cts:not-query(
        cts:element-query(xs:QName('definingConcepts'),
          cts:element-query(xs:QName('concept'), cts:and-query(()))
        )
      )
    )

Add a fragment root for 'conceptDef' instead of 'terminology'; otherwise the cts:not-query won't help at all. Perhaps you need to reindex as well.

HTH!

Kind regards,
Geert

Drs. G.P.H. Josten
Consultant
http://www.daidalos.nl/

Daidalos BV
Source of Innovation
Hoekeindsehof 1-4
2665 JZ Bleiswijk
Tel.: +31 (0) 10 850 1200
Fax: +31 (0) 10 850 1199
http://www.daidalos.nl/
KvK 27164984

The information sent in or with this email message originates from Daidalos BV and is intended exclusively for the addressee. If you have received this message unintentionally, we request that you delete it. No rights can be derived from this message.

_______________________________________________
General mailing list
[email protected]
http://xqzone.com/mailman/listinfo/general
