Geert - Thanks! Actually, wrapping the range query in a cts:document-fragment-query returns accurate results and performs reasonably well.
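A minimal sketch of that wrapping, reusing the element and attribute names from the example in the original message quoted below. The exact query that was run is not shown in this thread, so the search text "some terms", the variable names, and the presence of a range index on citation/@type are assumptions for illustration only:

```xquery
xquery version "1.0-ml";

import module namespace search = "http://marklogic.com/appservices/search"
  at "/MarkLogic/appservices/search/search.xqy";

(: Assumption: an element-attribute range index exists on citation/@type.
   The range query is wrapped so it is resolved against document fragments
   rather than only against the citation fragments. :)
let $citation-query :=
  cts:document-fragment-query(
    cts:element-attribute-range-query(
      xs:QName("citation"), xs:QName("type"), "=", "some-type"))

(: Hypothetical Search API options: search //doc and pass the wrapped
   query as an additional-query, as Geert suggested. :)
let $options :=
  <options xmlns="http://marklogic.com/appservices/search">
    <searchable-expression>//doc</searchable-expression>
    <additional-query>{ $citation-query }</additional-query>
  </options>

return (
  (: Direct form of the type 3 search :)
  cts:search(//doc, $citation-query),
  (: Search API form :)
  search:search("some terms", $options)
)
```

As Geert notes below, cts:document-fragment-query may match more widely than the //doc search path, so it is worth checking the results against known content.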
Mike - I had originally tried to do it without the additional fragment roots, for the reasons you list here (actually after reading some posts, including yours, on a similar topic a few months ago). However, without them there seems to be no way to get accurate result and facet counts using Search API. <item-frequency> can get the citation facets closer (it still counts non-matching citations), but the result count estimates will be based on the citations' ancestor doc (I think using cts:remainder?), which is off by enough to confuse our users. Unfortunately, I think we are boxed into this fragment root configuration if we want Search API to be accurate. However, relative to many or most ML implementations, our database is small, so I guess it's a reasonable tradeoff.

-Will

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Michael Blakeley
Sent: Thursday, January 12, 2012 1:22 PM
To: General MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] Range query not returning all fragments?

You might consider removing the fragment root, too. It may not be necessary, and if your citations are small then it will become fairly costly as the database grows.

With default settings, the database indexes the words in "Some citation info" as citation element words. So a cts:search with //citation and an element-word-query will be fairly efficient, even without the fragment.

-- Mike

On 12 Jan 2012, at 13:02, Geert Josten wrote:

> Hi Will,
>
> The problem is that the citation query part doesn't match the fragments
> selected by your searchable path, because - as you rightly state - the
> citations are in different fragments than your docs. You can cross
> fragment 'borders' using cts:document-fragment-query
> (http://developer.marklogic.com/pubs/5.0/apidocs/cts-query.html#cts:document-fragment-query),
> but that may result in too tolerant results as I
> suspect it goes wider than your //doc search path.
>
> Anyhow, you can add cts:document-fragment-query, with a nested
> cts:element-word-query or similar, as an additional-query to search:search.
>
> Kind regards,
> Geert
>
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of Will Thompson
> Sent: Thursday, 12 January 2012 21:32
> To: General MarkLogic Developer Discussion
> Subject: [MarkLogic Dev General] Range query not returning all
> fragments?
>
> Hi folks,
>
> I think I have run into a bit of an edge case, and I am struggling to find
> a complete solution in ML 5. Using Search API I am trying to provide the
> following, with (reasonably) accurate facet and result counts:
>
> 1. Search //doc for terms
> 2. Search //citation for terms
> 3. Search //doc ancestors of <citation> meeting certain criteria
>
> The content looks like:
>
> <root>
>   <doc>
>     <p>Some paragraph text. <citation type="type">Some citation info</citation></p>
>     <p>...</p>
>   </doc>
>   ...
> </root>
>
> I created fragment roots on <doc> and <citation> to satisfy 1 and 2. The
> problem is that searching for one fragment type with a query for a
> descendant fragment type breaks a search of type 3. For example:
>
> cts:search(//doc,
>   cts:element-attribute-range-query(QName("","citation"),QName("","type"),"=","some-type"))
>
> What I have discovered is that although there are now both doc *and*
> citation fragments matching this range query, ML only returns citation
> fragments, which get nulled out in index resolution (I assume), and search
> returns 0 results. Removing the fragment root for citation, the range
> query returns doc fragments, and search returns the correct results.
> However, without the citation fragments, a search over //citation will
> return inaccurate result and facet counts, since they will be calculated
> based on its ancestor fragments.
>
> So I don't know how to satisfy both 2 and 3. Our content model and search
> requirements do not seem very exotic.
> Intuitively, it seems like the query
> should return all of the matching doc and citation fragments instead of
> assuming I want only citation, especially since that assumption may differ
> from what is defined in the XPath expression. But it appears to be by
> design, and I am stumped for a workaround.
>
> Is there another way to approach this that I have not considered?
>
> Best regards,
>
> Will
>
> _______________________________________________
> General mailing list
> [email protected]
> http://developer.marklogic.com/mailman/listinfo/general
