I'm experimenting with range indexes over dateTime and dayTimeDuration values. I have a large number (millions) of small documents/fragments, each with a dateTime and/or a dayTimeDuration.
Currently these are stored with millisecond accuracy. For most purposes I don't need the millisecond accuracy, but it's useful on occasion. I am wondering: is there a detrimental effect from this precision?

One example is log file entries: there may be hundreds that occur within the same second. If I make a range index over the dateTime field, each entry will get a nearly unique value, and if I query the index, say using cts:element-attribute-values(), pretty much every fragment will contribute a unique value (so the number of unique entries in the index is high). However, if I truncate the dateTimes to whole seconds, there will be vastly fewer unique values.

I am curious what the effect, if any, of doing this would be. Does the size or search time of a range index depend on the number of unique values, or more on the number of fragments? I am thinking it would have to depend on both, since the index needs to map value -> (set of fragments). So what is the difference when the common case is nearly 1:1 value:fragment versus 1:many? I suspect a similar issue arises with double and geo values as well.

My guess/hope is that it doesn't make much difference, but I am curious whether there might be an easy, dramatic savings in time or space from truncating precision.
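For concreteness, here is roughly the kind of truncation I have in mind -- just a sketch, not my real code. The QNames ("entry"/"when") are placeholders for my actual schema, and the lexicon call assumes a dateTime range index is configured on that attribute:

xquery version "1.0-ml";

(: Sketch: truncate a dateTime to whole-second precision by dropping
   the fractional seconds from its lexical form. :)
declare function local:truncate-to-second($dt as xs:dateTime) as xs:dateTime
{
  xs:dateTime(fn:replace(xs:string($dt), "\.\d+", ""))
};

(: Both inputs collapse to 2011-06-14T09:15:23Z, so the value lexicon
   would hold one entry instead of two. :)
(local:truncate-to-second(xs:dateTime("2011-06-14T09:15:23.456Z")),
 local:truncate-to-second(xs:dateTime("2011-06-14T09:15:23.789Z")),

 (: Placeholder QNames; counts the distinct values in the range index. :)
 fn:count(cts:element-attribute-values(xs:QName("entry"), xs:QName("when"))))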
-David

----------------------------------------
David A. Lee
Senior Principal Software Engineer
Epocrates, Inc.
[email protected]
812-482-5224
