Thanks for your suggestions, Mike. See below.

> I strongly recommend pagination in your query: see
> http://developer.marklogic.com/howto/tutorials/2006-09-paginated-search.xqy

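For anyone following along, a paginated form of the search can be sketched roughly as below. This is only an illustration, not the code from the tutorial: the variable names, the page size of 10, and the start position are assumed values, and the search expression is the one from the original post further down.

```xquery
xquery version "1.0-ml";

(: Hypothetical paging values, chosen only for illustration :)
declare variable $page-size as xs:integer := 10;
declare variable $start as xs:integer := 1;

(: Same search as in the original post, limited to one page of results.
   fn:subsequence returns $page-size items beginning at position $start. :)
fn:subsequence(
  cts:search(
    fn:doc()//content/entry,
    cts:field-word-query("FullText", "president"),
    "unfiltered"
  ),
  $start,
  $page-size
)
```

To fetch the second page under these assumptions, $start would be set to 11.
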
This greatly increases the performance, but there is a hitch. In my case, the search results page has a special requirement: every category that has at least one matching record must be displayed. (Categories are things like map, image, biography, etc.) I think this means that I have to loop through all matching records in order to grab all of the matched categories... unless there is a way to craft a fast search that pulls out only the categories. Then I could combine the fast category search with the fast paginated search. I'll explore that some more.

> As well as xdmp:query-meters(), you should consult
> xdmp:query-trace() - see
> http://developer.marklogic.com/pubs/4.0/books/performance.pdf

Here's the output from query-meters() and query-trace(). I didn't see anything notable, except that I'm not sure what the value of the <qm:elapsed-time> element means. (The search took approximately 5 seconds to return.)

/eval line 1: Analyzing path for search: doc()
/eval line 1: Step 1 is searchable: doc()
/eval line 1: Path is fully searchable.
/eval line 1: Gathering constraints.
/eval line 1: Search query contributed 1 constraint: cts:field-word-query("FullText", "president", ("lang=en"), 1)
/eval line 1: Executing search.
/eval line 1: Selected 4090 fragments

<qm:query-meters xsi:schemaLocation="http://marklogic.com/xdmp/query-meters query-meters.xsd"
    xmlns:qm="http://marklogic.com/xdmp/query-meters"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <qm:elapsed-time>PT0S</qm:elapsed-time>
  <qm:requests>1</qm:requests>
  <qm:list-cache-hits>4</qm:list-cache-hits>
  <qm:list-cache-misses>0</qm:list-cache-misses>
  <qm:in-memory-list-hits>0</qm:in-memory-list-hits>
  <qm:expanded-tree-cache-hits>1</qm:expanded-tree-cache-hits>
  <qm:expanded-tree-cache-misses>0</qm:expanded-tree-cache-misses>
  <qm:compressed-tree-cache-hits>0</qm:compressed-tree-cache-hits>
  <qm:compressed-tree-cache-misses>0</qm:compressed-tree-cache-misses>
  <qm:in-memory-compressed-tree-hits>0</qm:in-memory-compressed-tree-hits>
  <qm:value-cache-hits>0</qm:value-cache-hits>
  <qm:value-cache-misses>0</qm:value-cache-misses>
  <qm:regexp-cache-hits>0</qm:regexp-cache-hits>
  <qm:regexp-cache-misses>0</qm:regexp-cache-misses>
  <qm:link-cache-hits>0</qm:link-cache-hits>
  <qm:link-cache-misses>0</qm:link-cache-misses>
  <qm:fragments-added>0</qm:fragments-added>
  <qm:fragments-deleted>0</qm:fragments-deleted>
  <qm:fs-program-cache-hits>0</qm:fs-program-cache-hits>
  <qm:fs-program-cache-misses>0</qm:fs-program-cache-misses>
  <qm:db-program-cache-hits>0</qm:db-program-cache-hits>
  <qm:db-program-cache-misses>0</qm:db-program-cache-misses>
  <qm:fs-main-module-sequence-cache-hits>0</qm:fs-main-module-sequence-cache-hits>
  <qm:fs-main-module-sequence-cache-misses>0</qm:fs-main-module-sequence-cache-misses>
  <qm:db-main-module-sequence-cache-hits>0</qm:db-main-module-sequence-cache-hits>
  <qm:db-main-module-sequence-cache-misses>0</qm:db-main-module-sequence-cache-misses>
  <qm:fs-library-module-cache-hits>0</qm:fs-library-module-cache-hits>
  <qm:fs-library-module-cache-misses>0</qm:fs-library-module-cache-misses>
  <qm:db-library-module-cache-hits>0</qm:db-library-module-cache-hits>
  <qm:db-library-module-cache-misses>0</qm:db-library-module-cache-misses>
  <qm:fragments>
    <qm:fragment>
      <qm:root xmlns="">entry</qm:root>
      <qm:expanded-tree-cache-hits>1</qm:expanded-tree-cache-hits>
      <qm:expanded-tree-cache-misses>0</qm:expanded-tree-cache-misses>
    </qm:fragment>
  </qm:fragments>
  <qm:documents>
    <qm:document>
      <qm:uri>/C/TEMP/EBookDump/436672.xml</qm:uri>
      <qm:expanded-tree-cache-hits>1</qm:expanded-tree-cache-hits>
      <qm:expanded-tree-cache-misses>0</qm:expanded-tree-cache-misses>
    </qm:document>
  </qm:documents>

> Looking at the structure of your documents, I'd try storing
> each entry as a separate document. So your search would
> become /entry rather than /content/entry.

I removed the content element and created and loaded a separate document for each record. This didn't change the performance, however.

>
> On 2008-12-29 11:58, Grant Lindley wrote:
> > I'm comparing full-text search performance between MarkLogic 4.0 and
> > SQL Server 2005 from a C# .NET web page.
> >
> > So far searches take about twice as long in MarkLogic compared to SQL
> > Server, and I'm looking for suggestions to improve performance in ML.
> >
> > The test data consists of 14,035 searchable records that take up 52 MB
> > in an XML text file.
> >
> > Here's a sample record:
> >
> > <content>
> >   <entry entryId="121866">
> >     <title>Alvar Aalto</title>
> >     <sortTitle>Aalto, Alvar</sortTitle>
> >     <searchTitle>Aalto, Alvar</searchTitle>
> >     <synopsis>Finland's most distinguished designer, Alvar Aalto is
> > renowned for his building designs as well as for his unique birchwood
> > furniture designs that are the archetype of Finnish furniture.
> >     </synopsis>
> >     <mainText> Finland's most distinguished architect and designer, ...
> > [long text removed]</mainText>
> >     <entryDate></entryDate>
> >     <searchExclude>False</searchExclude>
> >     <hyperlink>False</hyperlink>
> >     <furtherReading>Alvar Aalto Museum Web Site
> > (http://www.alvaraalto.fi)</furtherReading>
> >     <siteCredits>ABC-CLIO</siteCredits>
> >     <citationCredits></citationCredits>
> >     <citationCredits2></citationCredits2>
> >     <accentUpdated>True</accentUpdated>
> >     <category categoryId="22">
> >       <displayTitle>Individuals</displayTitle>
> >       <formOrder>30</formOrder>
> >       <filterable>True</filterable>
> >       <categoryTypeId>5</categoryTypeId>
> >       <longDescription>Individuals</longDescription>
> >     </category>
> >     <subTopic subTopicId="62" topicId="3">
> >       <displayTitle>Finland</displayTitle>
> >       <description>Finland</description>
> >       <sortOrder>-1</sortOrder>
> >     </subTopic>
> >     <topic topicId="3">
> >       <description>Europe</description>
> >     </topic>
> >   </entry>
> > </content>
> >
> > The elements that are included in the search are title, sortTitle,
> > mainText, and siteCredits.
> >
> > For the MarkLogic index settings, I have selected only basic stemmed
> > searches and fast phrase searches.
> >
> > The best results so far have been obtained when the entry element has
> > been added as a fragment root.
> >
> > Here's the code currently being used to execute the search:
> >
> > cts:search(fn:doc()//content/entry,
> >   cts:field-word-query("FullText", "president"), "unfiltered" )
> >
> > where "FullText" is a field that has been set up with the four
> > searchable elements above.
> >
> > I tried running with xdmp:query-meters() and didn't find any cache
> > misses.
> >
> > I'm experienced with SQL Server, but brand new to MarkLogic, so any
> > suggestions would be appreciated.
> >
> > -Grant
> >
> > ------------------------------------------------------------------------
> >
> > _______________________________________________
> > General mailing list
> > [email protected]
> > http://xqzone.com/mailman/listinfo/general
