Sorry, how do element word positions help with pagination? Different QNames for different meaning is definitely a GOOD THING, no doubt. :)
Kind regards,
Geert

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Michael Blakeley
Sent: Wednesday, October 19, 2011 8:13
To: General MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] How to get different facet counts for different searchable-expression in Search API

There too you could use different QNames: if some QNames only occur as descendants of cite, then there is no ambiguity. From a storage point of view, adding QNames is almost free.

Element word positions could also help. If you wrap the user query in an element-query on cite, element word positions can be used to figure out which documents actually match. Position indexes are somewhat expensive, but in most cases I think that would be cheaper than having a dozen small fragments per document.

-- Mike

On Oct 18, 2011, at 23:04, Geert Josten <[email protected]> wrote:

> Right, of course. I was not paying attention: I was thinking not of facet
> counts but of search result counts (for pagination and such). I reckon that if
> you want the latter to match returned results as well, you would need
> fragmentation on 'cite', provided you would really be showing cites
> individually, and not just documents that happen to contain a matching cite.. ;-)
>
> Thnx,
> Geert
>
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of Michael Blakeley
> Sent: Wednesday, October 19, 2011 7:58
> To: General MarkLogic Developer Discussion
> CC: General MarkLogic Developer Discussion
> Subject: Re: [MarkLogic Dev General] How to get different facet counts for
> different searchable-expression in Search API
>
> Well, I haven't looked at the search API code lately, but I presume it is
> using cts:frequency, since that is the most efficient way to get counts from
> a range index. The reason to use different QNames is that they can be tied
> to different range indexes.
> For example, if I had a /doc/country element and
> a /doc//cite/country element, with a range index on country, I would always
> get facets based on the entire document. If I have a document that was
> published in UK but cited an article from FR, both would show up in the
> facets. The range index for a QName contains every value for that QName.
>
> But if I have a /doc/country element and a /doc//cite/cite-country element, I
> can build a range index on each and query them separately. So I can see
> "published in XX" separate from "cites articles published in XX". I can also
> see both together if I wish, because cts:element-values allows a sequence of
> QNames. Essentially I am choosing QNames to tell the database what to index.
>
> Naturally there would be even more flexibility if we could create range
> indexes based on simple XPath expressions as well as QNames. But the existing
> functionality is quite powerful, and enriching existing XML with expressive
> QNames works well for most applications.
>
> -- Mike
>
> On Oct 18, 2011, at 22:30, Geert Josten <[email protected]> wrote:
>
>> Hi Mike,
>>
>> In what way does selecting a different range index influence the counts in
>> this case? I'd say you are still selecting the same doc fragments, so I'd
>> expect the counts to not change at all. Am I overlooking something? Or is
>> the search:search library really using count, and not the fragment-based
>> xdmp:estimate?
>>
>> Kind regards,
>> Geert
>>
>> -----Original Message-----
>> From: [email protected]
>> [mailto:[email protected]] On Behalf Of Michael Blakeley
>> Sent: Wednesday, October 19, 2011 2:31
>> To: General MarkLogic Developer Discussion
>> Subject: Re: [MarkLogic Dev General] How to get different facet counts for
>> different searchable-expression in Search API
>>
>> Will, if I can jump in.... I think your idea of using different QNames is
>> the right way to look at it.
>>
>> Facets are built from range indexes, and range indexes contain lists of
>> values and fragment ids for a given QName. So if the query matches the
>> fragment, the facet will show all the values in that fragment. In your case
>> the fragment is the entire document, so you will see all the values in the
>> matching documents, whether they occur under /doc or under /doc//cite. Now,
>> you *could* create a fragment root on 'cite', but I think that would be
>> counter-productive. It's better to use different QNames and have different
>> range indexes.
>>
>> So I think what you'd want to do is simply arrange for a different set of
>> search options for doc vs cite, including both searchable expression and
>> constraints. Testing for that could be as simple as a call to
>> cts:contains($user-search, 'select:cite') before you call search:search().
>> Or if that might generate false positives, you could search:parse the user
>> query and then look at the cts:query XML to see whether or not the parser
>> found a select:cite term. If it did, then you can switch to the correct
>> options before calling search:resolve.
>>
>> -- Mike
>>
>> On 18 Oct 2011, at 17:14 , Will Thompson wrote:
>>
>>> Micah,
>>>
>>> I think I may have explained poorly. This is essentially what I'm doing --
>>> Docs are, generally, like this:
>>>
>>> <doc>
>>>   <search-meta/>
>>>   <p>...<cite><search-meta/></cite>...</p>
>>>   <section>
>>>     <p>...<cite><search-meta/></cite>...</p>
>>>     ...
>>>   </section>
>>> </doc>
>>>
>>> Searches operate over //doc by default, but if you add the operator/state
>>> "select:cite" it changes the searchable expression to //cite. The results
>>> are correct, but the problem is that the facet counts appear to be for
>>> *both* doc and cite metadata, and thus do not change when toggling
>>> searchable-expressions via operator/state.
>>>
>>> This won't make any sense to our users, who will expect the facet counts to
>>> match what they think they're searching for.
>>>
>>> -W
>>>
>>>
>>> -----Original Message-----
>>> From: [email protected]
>>> [mailto:[email protected]] On Behalf Of Micah Dubinko
>>> Sent: Tuesday, October 18, 2011 6:56 PM
>>> To: General MarkLogic Developer Discussion
>>> Subject: Re: [MarkLogic Dev General] How to get different facet counts for
>>> different searchable-expression in Search API
>>>
>>> Hi Will,
>>>
>>> Everything you want to search exists in document fragments (not properties)
>>> right?
>>>
>>> What would happen if you switched in a different searchable-expression via
>>> operator and state? The combined query is taken into account by faceting,
>>> but the searchable-expression is not.
>>>
>>> -m
>>>
>>>
>>> On Oct 18, 2011, at 4:42 PM, Will Thompson wrote:
>>>
>>>> Our app has typically searched only document-type elements, but I recently
>>>> added metadata to citation elements (contained within and scattered about
>>>> document elements) so that they can be optionally searched using an
>>>> operator. i.e.: "term1 term2 select:citations" The operator changes the
>>>> searchable-expression and transform-results to search only citation
>>>> elements and return citation-specific snippets.
>>>>
>>>> However, I need the facet counts to reflect the search being performed -
>>>> i.e.: only show estimates for document element direct-child metadata
>>>> during normal search, and only for citations when that is toggled using
>>>> the operator.
>>>>
>>>> My first thought was to use different names or namespace for the citation
>>>> metadata and have the operator toggle a separate set of constraints
>>>> associated with those names. But constraints are not supported children of
>>>> search:state under search:operator.
>>>>
>>>> Any ideas on how to accomplish this with Search API?
>>>>
>>>> Thanks!
>>>>
>>>> -Will
>>>>
>>>> _______________________________________________
>>>> General mailing list
>>>> [email protected]
>>>> http://developer.marklogic.com/mailman/listinfo/general
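
Mike's point about range indexes (one list of values and fragment ids per QName) can be illustrated with a toy model. The sketch below is in Python rather than XQuery so that it runs without a MarkLogic server. It is only an assumption-laden illustration: the data structures, the function names `build_range_indexes` and `facet_counts`, and the two sample documents are all hypothetical, and nothing here reflects how MarkLogic actually stores or queries its indexes. It only shows why a shared `country` QName mixes "published in" and "cited" values at document-level fragmentation, while a separate `cite-country` QName keeps them apart.

```python
from collections import Counter

# Toy model (hypothetical, not MarkLogic's real index format):
# a range index is a list of (fragment_id, value) pairs for one element QName.
def build_range_indexes(fragments):
    """fragments: {fragment_id: [(qname, value), ...]} -> {qname: [(fragment_id, value), ...]}"""
    indexes = {}
    for frag_id, elements in fragments.items():
        for qname, value in elements:
            indexes.setdefault(qname, []).append((frag_id, value))
    return indexes

def facet_counts(indexes, qnames, matching_fragments):
    """For each value, count the matching fragments that hold it under any of the given QNames."""
    seen = set()
    for qname in qnames:
        for frag_id, value in indexes.get(qname, []):
            if frag_id in matching_fragments:
                seen.add((value, frag_id))  # a fragment counts once per value
    return Counter(value for value, _ in seen)

matching = {"doc1", "doc2"}  # pretend the user query matched both documents

# Case 1: the same QName is used under /doc and under /doc//cite.
# doc1 was published in UK and cites FR; doc2 was published in FR and cites FR.
shared = build_range_indexes({
    "doc1": [("country", "UK"), ("country", "FR")],
    "doc2": [("country", "FR"), ("country", "FR")],
})
shared_counts = facet_counts(shared, ["country"], matching)  # UK: 1, FR: 2 (mixed)

# Case 2: the citation metadata gets its own QName, hence its own range index.
distinct = build_range_indexes({
    "doc1": [("country", "UK"), ("cite-country", "FR")],
    "doc2": [("country", "FR"), ("cite-country", "FR")],
})
published = facet_counts(distinct, ["country"], matching)       # UK: 1, FR: 1
cited = facet_counts(distinct, ["cite-country"], matching)      # FR: 2
both = facet_counts(distinct, ["country", "cite-country"], matching)  # UK: 1, FR: 2
```

In case 1 there is no way to ask only for "published in" values, because the single index holds every `country` value in the fragment. In case 2 each facet can be computed on its own, and passing both QNames reproduces the combined view, which mirrors Mike's remark that cts:element-values accepts a sequence of QNames.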
