In general I agree with Mike that you will be able to use an index of the kind /a/(b|c).
But creating a good path range index for $result/(old/path/date|new/path/m:sort-date) may not be easy. ML 6 doesn't allow you to create an index with a top-level grouping operator, i.e. (old/path/date|new/path/m:sort-date). You can create an index of the form, say, indexroot/(old/path/date|new/path/m:sort-date). But then, for that index to be used by fast order by, "indexroot" must match against $result (in fact it should be a leaf of $result).

It would be a good idea for Ron to contact Stephen Buxton and submit an RFE.

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Michael Blakeley
Sent: Friday, August 02, 2013 10:49 AM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] Fast order-by with multiple range indexes?

With ML6 I think you could create a single, useful path range index on both elements. There is a table of permitted XPath syntax at http://docs.marklogic.com/guide/admin/range_index#id_54948 and syntax like /a/(b|c) is supported.

So your answer might be "this will be slow now, but we have a plan to make it faster when we upgrade".

-- Mike

On 2 Aug 2013, at 10:40 , Ron Hitchens <[email protected]> wrote:

>
> This one is on 5.x. I plan to move them to 6.x for the next phase
> (where everything goes to a consistent model), but it's not possible
> now.
>
> If you know a way on 6.x I'd love to see it. We can't use path
> range indexes, but how would that work? Something like this?:
>
> order by xs:date ($result/(old/path/date|new/path/m:sort-date))
>
> On Aug 2, 2013, at 6:13 PM, Michael Blakeley <[email protected]> wrote:
>
>> Which release?
>>
>> So you're using an element-range index? What about using a path-range index?
>>
>> -- Mike
>>
>> On 2 Aug 2013, at 09:26 , Ron Hitchens <[email protected]> wrote:
>>
>>>
>>> I have a sorting problem that I can't find a good solution for.
>>> I'm working on a project where a lot of content exists in one form which
>>> was not designed for efficient searching or sorting.
>>> In fact, MarkLogic is not used at all for search at the moment,
>>> that's what I'm adding.
>>>
>>> This existing format has multiple versions of the content in each
>>> document, with an element range index on an xs:date field.
>>> I can do efficient sorts on this content alone using the ranged date
>>> field in an "order by" clause.
>>>
>>> Here's the complication: a new type of content is being added in a
>>> newer, more MarkLogic-friendly schema. These documents all have a
>>> common metadata section with a ranged date field. Different name
>>> and namespace, but serving the same purpose.
>>>
>>> My problem is that I need to do searches across both types of
>>> content and sort them together. Searching one kind or the other and
>>> sorting by their respective date fields works great for massive
>>> result sets. But doing them together blows the expanded tree cache
>>> if the result set is large.
>>>
>>> Because of the odd layout of the old content, my searchable
>>> expression is rather funky and looks something like this:
>>>
>>> cts:search
>>> (fn:doc()/(/container/group[@state="live"]/doc[fn:not(@foo)]|x:new1|x:new2),
>>> $q, "unfiltered")
>>>
>>> Note that the first one returns a sub-element of the document, which
>>> is actually a fragment root. The other two on the end return root
>>> elements.
>>>
>>> A FLWOR like this doesn't work:
>>>
>>> for $result in cts:search ( . . .)
>>> order by xs:date (($result/old/path/date,
>>> $result/new/path/m:sort-date)[1])
>>> return $result
>>>
>>> It runs ok and will do the right thing if the result set is
>>> reasonably small (a few thousand), but will blow the cache if there are
>>> too many results. Trying to ignore one of the dates also blows the
>>> cache:
>>>
>>> for $result in cts:search ( . . .)
>>> order by xs:date ($result/old/path/date)
>>> return $result
>>>
>>> But removing the last two components of the XPath (|x:new1|x:new2)
>>> will then run fast. I'm not sure why this prevents the range index
>>> from kicking in, probably because of the complexity of the XPath.
>>>
>>> Sorting combined results by relevance in either direction is fast.
>>>
>>> Does anyone have a voodoo trick to enable fast sorting using values
>>> from two different range indexes? I don't need to look into the
>>> documents to get the sort keys, it seems like it shouldn't have to
>>> consume expanded tree cache space for this.
>>>
>>> ---
>>> Ron Hitchens {mailto:[email protected]} Ronsoft Technologies
>>> +44 7879 358 212 (voice) http://www.ronsoft.com
>>> +1 707 924 3878 (fax) Bit Twiddling At Its Finest
>>> "No amount of belief establishes any fact." -Unknown
>>>
>>> _______________________________________________
>>> General mailing list
>>> [email protected]
>>> http://developer.marklogic.com/mailman/listinfo/general
