David,

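To answer your questions: (1) an element range index is sufficient, and range value positions aren't needed for sorting; (2) yes, a range index helps even when every value is unique.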

You've gathered the right information, and your deductions are correct. The '4357 unordered' trace appears in two situations that I know of: either the range index is incomplete, or the cts:search searchable expression (arg1) is in a different fragment than the sort key. It looks like the latter applies here. If so, the search can't use the range index effectively because most of the fragments to be sorted do not contain the sort key directly.

To sort this result set efficiently, you will need to have the range-indexed values at the same fragment level as your search. Depending on your application and content, that might involve tweaking your query to avoid the fragment link, or reconsidering your fragmentation policy, or possibly copying or moving an element or two from one fragmentation level to another.

As you suggested, the bottleneck is probably I/O. From the query-meters output there could be up to 8385 read operations (some of the blocks might already be in disk cache or buffer cache). That implies up to 490 logical reads/sec, which isn't bad for a RAID-5. The 10k disks can probably sustain 200-300 reads/sec each, so you might expect better. But in my experience RAID-5 random read performance usually isn't much better than single disk performance. I prefer RAID-10, or RAID-50 with small parity groups (for example, you could use a RAID-0 stripe across two 3+1 RAID-5 volumes), and a small stripe size (8-32 kB). The average document size on disk may play a role here too, if it's larger than the disk's physical read size.
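
For anyone who wants to check that arithmetic, here is a rough Python sketch. It takes the elapsed time and the expanded-tree-cache-misses count from the query-meters output quoted below and treats every miss as one read, which is an upper bound since some blocks may already be cached. The per-disk range of 200-300 reads/sec and the "all eight spindles in parallel" ceiling are assumptions for illustration, not measurements, and the function names are made up for this example.

```python
# Back-of-envelope read rate from the query-meters output in this thread.
ELAPSED_SECONDS = 17.261083    # qm:elapsed-time PT17.261083S
EXPANDED_TREE_MISSES = 8385    # qm:expanded-tree-cache-misses

# Assumptions: eight 10k RPM disks, each good for roughly
# 200-300 random reads/sec (a rule of thumb, not a measurement).
DISK_COUNT = 8
PER_DISK_READS = (200, 300)


def reads_per_second(misses, elapsed):
    """Upper bound on logical reads/sec, counting every cache miss as a read."""
    return misses / elapsed


def aggregate_range(disks, per_disk):
    """Idealized ceiling if random reads spread evenly over every spindle.

    This is not a prediction: RAID-5 random reads rarely get close to it.
    """
    low, high = per_disk
    return disks * low, disks * high


if __name__ == "__main__":
    rate = reads_per_second(EXPANDED_TREE_MISSES, ELAPSED_SECONDS)
    low, high = aggregate_range(DISK_COUNT, PER_DISK_READS)
    print(f"observed: up to {rate:.0f} logical reads/sec")
    print(f"single disk: {PER_DISK_READS[0]}-{PER_DISK_READS[1]} reads/sec")
    print(f"idealized array ceiling: {low}-{high} reads/sec")
```

The observed figure comes out at roughly 486 reads/sec, in line with the 490 quoted above. That is above what one disk could sustain, but well short of the idealized ceiling for eight spindles.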

-- Mike

On 2009-04-26 22:22, Dave Feldmeier wrote:
I am sorting results and the performance is less than I would like. A search 
without ordering takes 0.2 seconds and a search with ordering takes 17 seconds.

First some questions:

  1.  For improving sort performance, is it sufficient to set up an element range index, or must I also set the "range value positions" radio button to "True"?
  2.  Does an element range index improve performance if all the element values are unique?

For the following example, I am doing a word query and sorting based on the value of an element (each document has a unique value of this element and the type is "string"). I have set up an element range index for the element and set the "range value positions" radio button to "True".

Each document has two fragments and the fragment that I'm retrieving for each 
probably is something like 20K.

In the following trace (see below), the path is searchable (good), the word 
query contributes a constraint (good), the order by clause contributed a 
constraint (good), but then I get the line:
Selected 4359 fragments to filter (2 ordered, 4357 unordered)
If I understand this line, it appears that the element range index is not 
helping very much and that I'm fetching 4357 fragments (even so, does 17 
seconds seem reasonable for this many fragments?).

The MarkLogic version is 4.0-3 running on Red Hat Enterprise Linux version 4. 
The hardware is eight 300G 10K RPM disks in a RAID5 array with dual quad core 
64-bit Intel 5405 processors, and I'd think that this would be plenty fast.

-Dave

David Feldmeier
Twin Dolphin Software, Inc.
303 Twin Dolphin Drive, Suite 600
Redwood City, CA 94065


2009-04-26 00:40:34.000 Info: gazelle2-8007: /lib/patent.xqy line 335: this:search_sort(cts:element-query(expanded-QName("", "ASSC_AENSC"), cts:word-query("test", ("lang=en"), 1), ()), "PATNUM", "ascending", 1, 30, "US")
2009-04-26 00:40:34.000 Info: gazelle2-8007: /lib/patent.xqy line 335: Analyzing path for search: collection("US")
2009-04-26 00:40:34.000 Info: gazelle2-8007: /lib/patent.xqy line 335: Step 1 is searchable: collection("US")
2009-04-26 00:40:34.000 Info: gazelle2-8007: /lib/patent.xqy line 335: Path is fully searchable.
2009-04-26 00:40:34.000 Info: gazelle2-8007: /lib/patent.xqy line 335: Gathering constraints.
2009-04-26 00:40:34.001 Info: gazelle2-8007: /lib/patent.xqy line 335: Search query contributed 1 constraint: cts:element-query(expanded-QName("", "ASSC_AENSC"), cts:word-query("test", ("lang=en"), 1), ())
2009-04-26 00:40:34.001 Info: gazelle2-8007: /lib/patent.xqy line 335: Order by clause contributed 1 range ordering constraint for $i: order by $i/child::PATENT/child::PATNUM ascending
2009-04-26 00:40:34.001 Info: gazelle2-8007: /lib/patent.xqy line 335: Executing search.
2009-04-26 00:40:34.091 Info: gazelle2-8007: /lib/patent.xqy line 335: Selected 4359 fragments to filter (2 ordered, 4357 unordered).
2009-04-26 00:40:51.263 Info: gazelle2-8007:<qm:query-meters xsi:schemaLocation="http://marklogic.com/xdmp/query-meters query-meters.xsd" xmlns:qm="http://marklogic.com/xdmp/query-meters" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
2009-04-26 00:40:51.263 Info: gazelle2-8007:<qm:elapsed-time>PT17.261083S</qm:elapsed-time>
2009-04-26 00:40:51.263 Info: gazelle2-8007:<qm:requests>1</qm:requests>
2009-04-26 00:40:51.263 Info: gazelle2-8007:<qm:list-cache-hits>722</qm:list-cache-hits>
2009-04-26 00:40:51.263 Info: gazelle2-8007:<qm:list-cache-misses>0</qm:list-cache-misses>
2009-04-26 00:40:51.263 Info: gazelle2-8007:<qm:in-memory-list-hits>0</qm:in-memory-list-hits>
2009-04-26 00:40:51.263 Info: gazelle2-8007:<qm:expanded-tree-cache-hits>4688</qm:expanded-tree-cache-hits>
2009-04-26 00:40:51.263 Info: gazelle2-8007:<qm:expanded-tree-cache-misses>8385</qm:expanded-tree-cache-misses>
2009-04-26 00:40:51.263 Info: gazelle2-8007:<qm:compressed-tree-cache-hits>3948</qm:compressed-tree-cache-hits>
2009-04-26 00:40:51.263 Info: gazelle2-8007:<qm:compressed-tree-cache-misses>4437</qm:compressed-tree-cache-misses>
2009-04-26 00:40:51.263 Info: gazelle2-8007:<qm:in-memory-compressed-tree-hits>0</qm:in-memory-compressed-tree-hits>
2009-04-26 00:40:51.263 Info: gazelle2-8007:<qm:value-cache-hits>1</qm:value-cache-hits>
2009-04-26 00:40:51.263 Info: gazelle2-8007:<qm:value-cache-misses>4545</qm:value-cache-misses>
2009-04-26 00:40:51.263 Info: gazelle2-8007:<qm:regexp-cache-hits>1</qm:regexp-cache-hits>
2009-04-26 00:40:51.263 Info: gazelle2-8007:<qm:regexp-cache-misses>4</qm:regexp-cache-misses>
2009-04-26 00:40:51.263 Info: gazelle2-8007:<qm:link-cache-hits>0</qm:link-cache-hits>
2009-04-26 00:40:51.263 Info: gazelle2-8007:<qm:link-cache-misses>8714</qm:link-cache-misses>
2009-04-26 00:40:51.263 Info: gazelle2-8007:<qm:fragments-added>0</qm:fragments-added>
2009-04-26 00:40:51.263 Info: gazelle2-8007:<qm:fragments-deleted>0</qm:fragments-deleted>
2009-04-26 00:40:51.263 Info: gazelle2-8007:<qm:fs-program-cache-hits>1</qm:fs-program-cache-hits>
2009-04-26 00:40:51.263 Info: gazelle2-8007:<qm:fs-program-cache-misses>0</qm:fs-program-cache-misses>
2009-04-26 00:40:51.263 Info: gazelle2-8007:<qm:db-program-cache-hits>0</qm:db-program-cache-hits>
2009-04-26 00:40:51.263 Info: gazelle2-8007:<qm:db-program-cache-misses>0</qm:db-program-cache-misses>
2009-04-26 00:40:51.263 Info: gazelle2-8007:<qm:fs-main-module-sequence-cache-hits>0</qm:fs-main-module-sequence-cache-hits>
2009-04-26 00:40:51.263 Info: gazelle2-8007:<qm:fs-main-module-sequence-cache-misses>0</qm:fs-main-module-sequence-cache-misses>
2009-04-26 00:40:51.263 Info: gazelle2-8007:<qm:db-main-module-sequence-cache-hits>0</qm:db-main-module-sequence-cache-hits>
2009-04-26 00:40:51.263 Info: gazelle2-8007:<qm:db-main-module-sequence-cache-misses>0</qm:db-main-module-sequence-cache-misses>
2009-04-26 00:40:51.263 Info: gazelle2-8007:<qm:fs-library-module-cache-hits>0</qm:fs-library-module-cache-hits>
2009-04-26 00:40:51.263 Info: gazelle2-8007:<qm:fs-library-module-cache-misses>0</qm:fs-library-module-cache-misses>
2009-04-26 00:40:51.263 Info: gazelle2-8007:<qm:db-library-module-cache-hits>0</qm:db-library-module-cache-hits>
2009-04-26 00:40:51.263 Info: gazelle2-8007:<qm:db-library-module-cache-misses>0</qm:db-library-module-cache-misses>


