Which release? So you're using an element-range index? What about using a path-range index?
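To make the path-range-index question concrete, here is a rough sketch of ordering search results straight from a range index instead of with an "order by" clause. It assumes a release that provides cts:index-order (MarkLogic 7 or later, as far as I know) and a path range index of type xs:date already configured on the path shown. The path and the word query are made up for illustration and are not taken from your database. It sorts on a single index only, so on its own it does not settle the two-index case.

```xquery
xquery version "1.0-ml";

(: Hypothetical query; substitute the real $q. :)
let $q := cts:word-query("example")

(: Hypothetical path; a path range index of type xs:date must
   already exist on exactly this path, or the reference fails. :)
let $date-ref :=
  cts:path-reference("/container/group/doc/old/path/date", "type=date")

return
  (: The ordering is passed as a search option, so the sort keys come
     from the range index and are not read from each result document. :)
  cts:search(
    fn:doc(),
    $q,
    ("unfiltered", cts:index-order($date-ref, "descending"))
  )[1 to 10]
```

If the release in use has no cts:index-order, this sketch does not apply as written.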
-- Mike

On 2 Aug 2013, at 09:26, Ron Hitchens <[email protected]> wrote:

>
>    I have a sorting problem that I can't find a good solution
> for.  I'm working on a project where a lot of content exists in
> one form which was not designed for efficient searching or sorting.
> In fact, MarkLogic is not used at all for search at the moment,
> that's what I'm adding.
>
>    This existing format has multiple versions of the content
> in each document, with an element range index on an xs:date field.
> I can do efficient sorts on this content alone using the ranged
> date field in an "order by" clause.
>
>    Here's the complication: a new type of content is being added
> in a newer, more MarkLogic-friendly schema.  These documents all
> have a common metadata section with a ranged date field.  Different
> name and namespace, but serving the same purpose.
>
>    My problem is that I need to do searches across both types of
> content and sort them together.  Searching one kind or the other
> and sorting by their respective date fields works great for massive
> result sets.  But doing them together blows the expanded tree cache
> if the result set is large.
>
>    Because of the odd layout of the old content, my searchable
> expression is rather funky and looks something like this:
>
> cts:search
>   (fn:doc()/(/container/group[@state="live"]/doc[fn:not(@foo)]|x:new1|x:new2),
>    $q, "unfiltered")
>
>    Note that the first one returns a sub-element of the document,
> which is actually a fragment root.  The other two on the end return
> root elements.
>
>    A FLWOR like this doesn't work:
>
> for $result in cts:search ( . . .)
> order by xs:date (($result/old/path/date, $result/new/path/m:sort-date)[1])
> return $result
>
>    It runs ok and will do the right thing if the result
> set is reasonably small (a few thousand), but will blow the cache
> if there are too many results.  Trying to ignore one of the
> dates also blows the cache:
>
> for $result in cts:search ( . . .)
> order by xs:date ($result/old/path/date)
> return $result
>
>    But removing the last two components of the XPath (|x:new1|x:new2)
> will then run fast.  I'm not sure why this prevents the range index
> from kicking in, probably because of the complexity of the XPath.
>
>    Sorting combined results by relevance in either direction is fast.
>
>    Does anyone have a voodoo trick to enable fast sorting using values
> from two different range indexes?  I don't need to look into the documents
> to get the sort keys, it seems like it shouldn't have to consume expanded
> tree cache space for this.
>
> ---
> Ron Hitchens {mailto:[email protected]}   Ronsoft Technologies
>      +44 7879 358 212 (voice)          http://www.ronsoft.com
>      +1 707 924 3878 (fax)              Bit Twiddling At Its Finest
> "No amount of belief establishes any fact." -Unknown
>
> _______________________________________________
> General mailing list
> [email protected]
> http://developer.marklogic.com/mailman/listinfo/general
