Random thought:
Instead of trying to find a way to specify in XPath a sortable expression that
matches up with a set of range indexes, it might be cleaner to specify the
ordering range index directly, such as:
  for $item in cts:search(...)
  order by cts:field-reference("fieldname")
  return $item
Perk: the behavior when the index doesn't exist will be an explanatory error
instead of slow execution.
Note this doesn't work today. Just thinking aloud.
-jh-
On Aug 2, 2013, at 12:45 PM, Gajanan Chinchwadkar
<[email protected]> wrote:
> In general I agree with Mike that you will be able to use an index of the form
> /a/(b|c).
>
> But creating a good path range index for
> $result/(old/path/date|new/path/m:sort-date) may not be easy. ML 6 doesn't
> allow you to create an index with a top level grouping operator, i.e.
> (old/path/date|new/path/m:sort-date). You can create an index such as
> indexroot/(old/path/date|new/path/m:sort-date). But then, for that index to be
> used by a fast order by, "indexroot" must match against $result (in fact it
> should be a leaf of $result).
>
> It would be a good idea for Ron to contact Stephen Buxton and submit an RFE.
>
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of Michael Blakeley
> Sent: Friday, August 02, 2013 10:49 AM
> To: MarkLogic Developer Discussion
> Subject: Re: [MarkLogic Dev General] Fast order-by with multiple range
> indexes?
>
> With ML6 I think you could create a single, useful path range index on both
> elements. There is a table of permitted XPath syntax at
> http://docs.marklogic.com/guide/admin/range_index#id_54948 and syntax like
> /a/(b|c) is supported.
>
> So your answer might be "this will be slow now, but we have a plan to make
> it faster when we upgrade".
>
> -- Mike
>
> On 2 Aug 2013, at 10:40 , Ron Hitchens <[email protected]> wrote:
>
>>
>> This one is on 5.x. I plan to move them to 6.x for the next phase
>> (where everything goes to a consistent model), but it's not possible
>> now.
>>
>> If you know a way on 6.x I'd love to see it. We can't use path
>> range indexes, but how would that work? Something like this?:
>>
>> order by xs:date ($result/(old/path/date|new/path/m:sort-date))
>>
>> On Aug 2, 2013, at 6:13 PM, Michael Blakeley <[email protected]> wrote:
>>
>>> Which release?
>>>
>>> So you're using an element-range index? What about using a path-range index?
>>>
>>> -- Mike
>>>
>>> On 2 Aug 2013, at 09:26 , Ron Hitchens <[email protected]> wrote:
>>>
>>>>
>>>> I have a sorting problem that I can't find a good solution for. I'm
>>>> working on a project where a lot of content exists in one form which
>>>> was not designed for efficient searching or sorting.
>>>> In fact, MarkLogic is not used at all for search at the moment;
>>>> that's what I'm adding.
>>>>
>>>> This existing format has multiple versions of the content in each
>>>> document, with an element range index on an xs:date field.
>>>> I can do efficient sorts on this content alone using the ranged date
>>>> field in an "order by" clause.
>>>>
>>>> Here's the complication: a new type of content is being added in a
>>>> newer, more MarkLogic-friendly schema. These documents all have a
>>>> common metadata section with a ranged date field. Different name
>>>> and namespace, but serving the same purpose.
>>>>
>>>> My problem is that I need to do searches across both types of
>>>> content and sort them together. Searching one kind or the other and
>>>> sorting by their respective date fields works great for massive
>>>> result sets. But doing them together blows the expanded tree cache
>>>> if the result set is large.
>>>>
>>>> Because of the odd layout of the old content, my searchable
>>>> expression is rather funky and looks something like this:
>>>>
>>>> cts:search
>>>> (fn:doc()/(/container/group[@state="live"]/doc[fn:not(@foo)]|x:new1|x:new2),
>>>> $q, "unfiltered")
>>>>
>>>> Note that the first one returns a sub-element of the document, which
>>>> is actually a fragment root. The other two on the end return root
>>>> elements.
>>>>
>>>> A FLWOR like this doesn't work:
>>>>
>>>> for $result in cts:search ( . . .)
>>>> order by xs:date (($result/old/path/date,
>>>>     $result/new/path/m:sort-date)[1])
>>>> return $result
>>>>
>>>> It runs ok and will do the right thing if the result set is
>>>> reasonably small (a few thousand), but will blow the cache if there
>>>> are too many results. Trying to ignore one of the dates also blows
>>>> the cache:
>>>>
>>>> for $result in cts:search ( . . .)
>>>> order by xs:date ($result/old/path/date)
>>>> return $result
>>>>
>>>> But removing the last two components of the XPath (|x:new1|x:new2)
>>>> will then run fast. I'm not sure why this prevents the range index
>>>> from kicking in, probably because of the complexity of the XPath.
>>>>
>>>> Sorting combined results by relevance in either direction is fast.
>>>>
>>>> Does anyone have a voodoo trick to enable fast sorting using values
>>>> from two different range indexes? I don't need to look into the
>>>> documents to get the sort keys; it seems like it shouldn't have to
>>>> consume expanded tree cache space for this.
>>>>
>>>> ---
>>>> Ron Hitchens {mailto:[email protected]} Ronsoft Technologies
>>>> +44 7879 358 212 (voice) http://www.ronsoft.com
>>>> +1 707 924 3878 (fax) Bit Twiddling At Its Finest
>>>> "No amount of belief establishes any fact." -Unknown
>>>>
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> General mailing list
>>>> [email protected]
>>>> http://developer.marklogic.com/mailman/listinfo/general
>>>>