David,
The answer to question (1) is that range value positions aren't needed
for sorting, and the answer to (2) is "yes".
You've gathered the right information, and your deductions are correct.
The '4357 unordered' trace appears in two situations that I know of:
either the range index is incomplete, or the cts:search searchable
expression (arg1) is in a different fragment than the sort key. It looks
like the latter applies here. If so, the search can't use the range
index effectively because most of the fragments to be sorted do not
contain the sort key directly.
To sort this result set efficiently, you will need to have the
range-indexed values at the same fragment level as your search.
Depending on your application and content, that might involve tweaking
your query to avoid the fragment link, or reconsidering your
fragmentation policy, or possibly copying or moving an element or two
from one fragmentation level to another.
As you suggested, the bottleneck is probably I/O. The query-meters
output shows 8385 expanded-tree-cache misses, so there could be up to
8385 read operations (some of the blocks might already be in disk cache
or buffer cache). Over the 17.26 seconds elapsed, that implies up to
about 490 logical reads/sec, which isn't bad for a RAID-5. The 10k disks can
probably sustain 200-300 reads/sec each, so you might expect better. But
in my experience RAID-5 random read performance usually isn't much
better than single disk performance. I prefer RAID-10, or RAID-50 with
small parity groups (for example, you could use a RAID-0 stripe across
two 3+1 RAID-5 volumes), and a small stripe size (8-32 kB). The average
document size on disk may play a role here too, if it's larger than the
disk's physical read size.
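Here is a rough back-of-the-envelope check of those numbers, as a sketch only. The read count, elapsed time, and fragment count come from the trace below. The per-disk figure of 200-300 reads/sec is just my estimate from above, the "ideal" aggregate assumes random reads scale perfectly across all eight spindles (which RAID-5 rarely does), and the function names are made up for illustration.

```python
# Figures taken from the trace and query-meters output.
EXPANDED_TREE_CACHE_MISSES = 8385  # upper bound on read operations
ELAPSED_SECONDS = 17.261083        # qm:elapsed-time PT17.261083S
FRAGMENTS_TO_FILTER = 4359         # "Selected 4359 fragments to filter"

# Assumptions, not measurements: eight disks at 200-300 reads/sec each.
DISK_COUNT = 8
PER_DISK_READS_PER_SEC = (200, 300)


def reads_per_second(reads, seconds):
    """Upper bound on logical reads/sec if every cache miss went to disk."""
    return reads / seconds


def ms_per_fragment(seconds, fragments):
    """Average elapsed milliseconds spent per fragment filtered."""
    return seconds * 1000.0 / fragments


def ideal_aggregate_range(disks, per_disk_range):
    """Naive ceiling: per-disk rate times disk count, with perfect scaling."""
    low, high = per_disk_range
    return disks * low, disks * high


if __name__ == "__main__":
    rate = reads_per_second(EXPANDED_TREE_CACHE_MISSES, ELAPSED_SECONDS)
    per_fragment = ms_per_fragment(ELAPSED_SECONDS, FRAGMENTS_TO_FILTER)
    low, high = ideal_aggregate_range(DISK_COUNT, PER_DISK_READS_PER_SEC)
    print(f"observed upper bound: {rate:.0f} reads/sec")
    print(f"average cost: {per_fragment:.2f} ms per fragment")
    print(f"naive ideal for the array: {low}-{high} reads/sec")
```

The observed rate comes out a little under 490 reads/sec, or about 4 ms per fragment, which is in the range of a single random disk read. That fits the picture of an array behaving like a single disk for this workload.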
-- Mike
On 2009-04-26 22:22, Dave Feldmeier wrote:
I am sorting results and the performance is less than I would like. A search
without ordering takes 0.2 seconds and a search with ordering takes 17 seconds.
First some questions:
1. For improving sort performance, is it sufficient to set up an element range index, or must I
also set the "range value positions" radio button to "True"?
2. Does an element range index improve performance if all the element values
are unique?
For the following example, I am doing a word query and sorting based on the value of an element (each
document has a unique value of this element and the type is "string"). I have set up an element
range index for the element and set the "range value positions" radio button to "True".
Each document has two fragments, and the fragment that I'm retrieving for each
document is probably around 20 KB.
In the following trace (see below), the path is searchable (good), the word
query contributes a constraint (good), the order by clause contributed a
constraint (good), but then I get the line:
Selected 4359 fragments to filter (2 ordered, 4357 unordered)
If I understand this line, it appears that the element range index is not
helping very much and that I'm fetching 4357 fragments (even so, does 17
seconds seem reasonable for this many fragments?).
The MarkLogic version is 4.0-3 running on Red Hat Enterprise Linux version 4.
The hardware is eight 300G 10K RPM disks in a RAID5 array with dual quad core
64-bit Intel 5405 processors, and I'd think that this would be plenty fast.
-Dave
David Feldmeier
Twin Dolphin Software, Inc.
303 Twin Dolphin Drive, Suite 600
Redwood City, CA 94065
2009-04-26 00:40:34.000 Info: gazelle2-8007: /lib/patent.xqy line 335: this:search_sort(cts:element-query(expanded-QName("", "ASSC_AENSC"), cts:word-query("test", ("lang=en"), 1), ()), "PATNUM", "ascending", 1, 30, "US")
2009-04-26 00:40:34.000 Info: gazelle2-8007: /lib/patent.xqy line 335: Analyzing path for search: collection("US")
2009-04-26 00:40:34.000 Info: gazelle2-8007: /lib/patent.xqy line 335: Step 1 is searchable: collection("US")
2009-04-26 00:40:34.000 Info: gazelle2-8007: /lib/patent.xqy line 335: Path is fully searchable.
2009-04-26 00:40:34.000 Info: gazelle2-8007: /lib/patent.xqy line 335: Gathering constraints.
2009-04-26 00:40:34.001 Info: gazelle2-8007: /lib/patent.xqy line 335: Search query contributed 1 constraint: cts:element-query(expanded-QName("", "ASSC_AENSC"), cts:word-query("test", ("lang=en"), 1), ())
2009-04-26 00:40:34.001 Info: gazelle2-8007: /lib/patent.xqy line 335: Order by clause contributed 1 range ordering constraint for $i: order by $i/child::PATENT/child::PATNUM ascending
2009-04-26 00:40:34.001 Info: gazelle2-8007: /lib/patent.xqy line 335: Executing search.
2009-04-26 00:40:34.091 Info: gazelle2-8007: /lib/patent.xqy line 335: Selected 4359 fragments to filter (2 ordered, 4357 unordered).
2009-04-26 00:40:51.263 Info: gazelle2-8007:<qm:query-meters
xsi:schemaLocation="http://marklogic.com/xdmp/query-meters query-meters.xsd"
xmlns:qm="http://marklogic.com/xdmp/query-meters"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
2009-04-26 00:40:51.263 Info: gazelle2-8007:<qm:elapsed-time>PT17.261083S</qm:elapsed-time>
2009-04-26 00:40:51.263 Info: gazelle2-8007:<qm:requests>1</qm:requests>
2009-04-26 00:40:51.263 Info: gazelle2-8007:<qm:list-cache-hits>722</qm:list-cache-hits>
2009-04-26 00:40:51.263 Info: gazelle2-8007:<qm:list-cache-misses>0</qm:list-cache-misses>
2009-04-26 00:40:51.263 Info: gazelle2-8007:<qm:in-memory-list-hits>0</qm:in-memory-list-hits>
2009-04-26 00:40:51.263 Info: gazelle2-8007:<qm:expanded-tree-cache-hits>4688</qm:expanded-tree-cache-hits>
2009-04-26 00:40:51.263 Info: gazelle2-8007:<qm:expanded-tree-cache-misses>8385</qm:expanded-tree-cache-misses>
2009-04-26 00:40:51.263 Info: gazelle2-8007:<qm:compressed-tree-cache-hits>3948</qm:compressed-tree-cache-hits>
2009-04-26 00:40:51.263 Info: gazelle2-8007:<qm:compressed-tree-cache-misses>4437</qm:compressed-tree-cache-misses>
2009-04-26 00:40:51.263 Info: gazelle2-8007:<qm:in-memory-compressed-tree-hits>0</qm:in-memory-compressed-tree-hits>
2009-04-26 00:40:51.263 Info: gazelle2-8007:<qm:value-cache-hits>1</qm:value-cache-hits>
2009-04-26 00:40:51.263 Info: gazelle2-8007:<qm:value-cache-misses>4545</qm:value-cache-misses>
2009-04-26 00:40:51.263 Info: gazelle2-8007:<qm:regexp-cache-hits>1</qm:regexp-cache-hits>
2009-04-26 00:40:51.263 Info: gazelle2-8007:<qm:regexp-cache-misses>4</qm:regexp-cache-misses>
2009-04-26 00:40:51.263 Info: gazelle2-8007:<qm:link-cache-hits>0</qm:link-cache-hits>
2009-04-26 00:40:51.263 Info: gazelle2-8007:<qm:link-cache-misses>8714</qm:link-cache-misses>
2009-04-26 00:40:51.263 Info: gazelle2-8007:<qm:fragments-added>0</qm:fragments-added>
2009-04-26 00:40:51.263 Info: gazelle2-8007:<qm:fragments-deleted>0</qm:fragments-deleted>
2009-04-26 00:40:51.263 Info: gazelle2-8007:<qm:fs-program-cache-hits>1</qm:fs-program-cache-hits>
2009-04-26 00:40:51.263 Info: gazelle2-8007:<qm:fs-program-cache-misses>0</qm:fs-program-cache-misses>
2009-04-26 00:40:51.263 Info: gazelle2-8007:<qm:db-program-cache-hits>0</qm:db-program-cache-hits>
2009-04-26 00:40:51.263 Info: gazelle2-8007:<qm:db-program-cache-misses>0</qm:db-program-cache-misses>
2009-04-26 00:40:51.263 Info: gazelle2-8007:<qm:fs-main-module-sequence-cache-hits>0</qm:fs-main-module-sequence-cache-hits>
2009-04-26 00:40:51.263 Info: gazelle2-8007:<qm:fs-main-module-sequence-cache-misses>0</qm:fs-main-module-sequence-cache-misses>
2009-04-26 00:40:51.263 Info: gazelle2-8007:<qm:db-main-module-sequence-cache-hits>0</qm:db-main-module-sequence-cache-hits>
2009-04-26 00:40:51.263 Info: gazelle2-8007:<qm:db-main-module-sequence-cache-misses>0</qm:db-main-module-sequence-cache-misses>
2009-04-26 00:40:51.263 Info: gazelle2-8007:<qm:fs-library-module-cache-hits>0</qm:fs-library-module-cache-hits>
2009-04-26 00:40:51.263 Info: gazelle2-8007:<qm:fs-library-module-cache-misses>0</qm:fs-library-module-cache-misses>
2009-04-26 00:40:51.263 Info: gazelle2-8007:<qm:db-library-module-cache-hits>0</qm:db-library-module-cache-hits>
2009-04-26 00:40:51.263 Info: gazelle2-8007:<qm:db-library-module-cache-misses>0</qm:db-library-module-cache-misses>