Thanks David,
I ended up separating time and date values into an attribute and element,
respectively, as you suggested, with an element-range-index on the
element's value. As I wanted to get this as JSON in the end, this is what
I'm using right now (0.002 secs :-)):
xdmp:to-json(
  for $date in cts:element-values(xs:QName("normalized-date"), ())
  order by $date descending
  return
    map:new((
      map:entry("date", $date),
      map:entry("count", cts:frequency($date))
    ))
)
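
For the archives, here is a rough sketch of an alternative that would avoid the extra element. It assumes the element range index of type dateTime on normalized-dateTime from my first mail is still in place, and that each document carries a single normalized-dateTime. The $counts map and the string keys are only my choice for illustration, and I have not timed this against real data:

```xquery
xquery version "1.0-ml";

(: Sketch only: group the indexed dateTime values by their date part. :)
let $counts := map:map()
let $_ :=
  for $dt in cts:element-values(xs:QName("normalized-dateTime"), (), (),
                                cts:collection-query("item"))
  (: Use the lexical form of the date as the map key. :)
  let $day := xs:string(xs:date($dt))
  (: Add this value's frequency to the running total for its day. :)
  return map:put($counts, $day,
                 (map:get($counts, $day), 0)[1] + cts:frequency($dt))
for $day in map:keys($counts)
order by $day descending
return concat($day, " - ", map:get($counts, $day))
```

This reads every distinct dateTime value from the index, so it will probably be slower than the dedicated date index when there are many distinct timestamps. The string ordering is only chronological if all values share the same timezone.
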
cheers,
Jakob.
On Thu, Apr 10, 2014 at 11:26 PM, David Ennis <[email protected]> wrote:
> I would personally use another element or even split the date and time
> into two fields. Then you would use a range index of type date on the
> date field. Of course, you can also just store the date part as an
> attribute of the current element and work off of an element-attribute
> range index, using the same solution.
>
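> To get your results (pseudocode):
>
> for $res in (
>   (: set the frequency order options here, e.g. "frequency-order" :)
>   for $date in cts:element-values(xs:QName("yourdatefieldhere"), (),
>                                   ("frequency-order"))
>   return
>     <res><date>{$date}</date><count>{cts:frequency($date)}</count></res>
> )
> (: cast to integer so the counts sort numerically, not as strings :)
> order by xs:integer($res/count)
> return $res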
>
> The result from cts:element-values comes from an index and is already a
> list of distinct values.
>
> Regards,
> David
>
> On 10 April 2014 22:50, Jakob Fix <[email protected]> wrote:
>
>> Hello,
>>
>> How would one go about this one: I'm storing dateTime values for each
>> document.
>> I want to retrieve, efficiently, all unique dates (irrespective of the
>> time part) and the number of items with that date.
>>
>> The naive implementation (naive because it takes about 1.5 seconds for
>> not much data):
>>
>> let $c := collection("item")
>> return
>>   for $d in distinct-values($c !
>>       xs:date(xs:dateTime(.//normalized-dateTime/text())))
>>   order by $d descending
>>   return concat($d, " - ",
>>     count($c//item[xs:date(xs:dateTime(.//normalized-dateTime)) = $d]))
>>
>> The profiler tells me that it's spending about 85% of the time in the
>> predicate in the last line, so an index would probably speed up the lookups.
>>
>> I created an element range index for the normalized-dateTime element, of
>> type dateTime. What I hoped would be a bit less naive turns out not to
>> be feasible:
>>
>> for $d in distinct-values(collection("item")/item !
>>     xs:date(xs:dateTime(./normalized-dateTime/text())))
>> order by $d descending
>> return concat($d, " - ",
>>   count(
>>     cts:search(/item,
>>       cts:and-query((
>>         cts:collection-query("item"),
>>         cts:element-range-query(xs:QName("normalized-dateTime"), "=", $d)
>>       ))
>>     )
>>   )
>> )
>>
>> Now it complains that there is no element range index of type date for
>> the normalized-dateTime element, which is correct...
>>
>> Would the recommendation be to add another element, normalized-date, that
>> contained only the date part and work with that, or is there possibly
>> another, even simpler solution?
>>
>> cheers,
>> Jakob.
>>
>>
>> _______________________________________________
>> General mailing list
>> [email protected]
>> http://developer.marklogic.com/mailman/listinfo/general