Grant,

I strongly recommend pagination in your query: see http://developer.marklogic.com/howto/tutorials/2006-09-paginated-search.xqy
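
A minimal sketch of what that could look like for the query you posted, assuming a page size of 10. The variables $page and $page-size are hypothetical names of my own; in practice they would come from your .NET page. The tutorial above covers this more fully.

```xquery
xquery version "1.0-ml";

(: $page and $page-size are hypothetical; supply them from the request. :)
let $page := 1
let $page-size := 10
let $start := ($page - 1) * $page-size + 1
let $end := $start + $page-size - 1
return
  (: The positional predicate restricts the result to a single page. :)
  cts:search(
    fn:doc()//content/entry,
    cts:field-word-query("FullText", "president"),
    "unfiltered"
  )[$start to $end]
```

With a range predicate like this, the server should only need to fetch the fragments for the requested page rather than every match.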

As well as xdmp:query-meters(), you should consult xdmp:query-trace() - see http://developer.marklogic.com/pubs/4.0/books/performance.pdf
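
As a rough sketch, tracing can be switched on at the top of the request and the meters read at the end. This assumes the same search as in your message; the trace output should appear in the server log rather than in the query result.

```xquery
xquery version "1.0-ml";

(: Enable tracing for this request; trace messages go to the server log. :)
xdmp:query-trace(fn:true()),

(: The search being measured, limited here to the first 10 results. :)
cts:search(
  fn:doc()//content/entry,
  cts:field-word-query("FullText", "president"),
  "unfiltered"
)[1 to 10],

(: Timing and cache statistics for the request up to this point. :)
xdmp:query-meters()
```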

Looking at the structure of your documents, I'd try storing each entry as a separate document. So your search would become /entry rather than /content/entry.
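
For illustration, and assuming each entry has already been stored as its own document and the "FullText" field is unchanged, the search might then read:

```xquery
xquery version "1.0-ml";

(: Assumes every <entry> is now the root element of its own document. :)
cts:search(
  /entry,
  cts:field-word-query("FullText", "president"),
  "unfiltered"
)
```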

You could use Corb (http://developer.marklogic.com/svn/corb/trunk/README.html) to split up the documents in your existing database, or you could use RecordLoader (http://developer.marklogic.com/howto/tutorials/2006-06-recordloader.xqy) to split up the source documents as you insert each one.
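
If you go the Corb route, the per-document module could be something along these lines. This is only a sketch: it assumes Corb passes each document URI in as the external variable $URI, and the "/entry/{entryId}.xml" URI scheme is a hypothetical choice of mine, not something from your data. It leaves the original document in place.

```xquery
xquery version "1.0-ml";

(: Assumption: Corb supplies the URI of the document to process. :)
declare variable $URI as xs:string external;

(: Write each <entry> out as its own document, keyed by its entryId. :)
for $entry in fn:doc($URI)/content/entry
return
  xdmp:document-insert(
    fn:concat("/entry/", $entry/@entryId, ".xml"),  (: hypothetical URI scheme :)
    $entry
  )
```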

thanks,
-- Mike

On 2008-12-29 11:58, Grant Lindley wrote:
I'm comparing full-text search performance between MarkLogic 4.0 and SQL
Server 2005 from a C# .NET web page.

So far searches take about twice as long in MarkLogic compared to SQL
Server, and I'm looking for suggestions to improve performance in ML.

The test data consists of 14,035 searchable records that take up 52 MB
in an XML text file.

Here's a sample record:

<content>
   <entry entryId="121866">
     <title>Alvar Aalto</title>
     <sortTitle>Aalto, Alvar</sortTitle>
     <searchTitle>Aalto, Alvar</searchTitle>
     <synopsis>Finland's most distinguished designer, Alvar Aalto is
renowned for his building designs as well as for his unique birchwood
furniture designs that are the archetype of Finnish furniture.
</synopsis>
     <mainText>  Finland's most distinguished architect and designer, ...
[long text removed]</mainText>
     <entryDate></entryDate>
     <searchExclude>False</searchExclude>
     <hyperlink>False</hyperlink>
     <furtherReading>Alvar Aalto Museum Web Site
(http://www.alvaraalto.fi)</furtherReading>
     <siteCredits>ABC-CLIO</siteCredits>
     <citationCredits></citationCredits>
     <citationCredits2></citationCredits2>
     <accentUpdated>True</accentUpdated>
     <category categoryId="22">
       <displayTitle>Individuals</displayTitle>
       <formOrder>30</formOrder>
       <filterable>True</filterable>
       <categoryTypeId>5</categoryTypeId>
       <longDescription>Individuals</longDescription>
     </category>
     <subTopic subTopicId="62" topicId="3">
       <displayTitle>Finland</displayTitle>
       <description>Finland</description>
       <sortOrder>-1</sortOrder>
     </subTopic>
     <topic topicId="3">
       <description>Europe</description>
     </topic>
   </entry>
</content>

The elements that are included in the search are title, sortTitle,
mainText, and siteCredits.

For the MarkLogic index settings, I have selected only basic stemmed
searches and fast phrase searches.

The best results so far have been obtained when the entry element has
been added as a fragment root.

Here's the code currently being used to execute the search:

   cts:search(fn:doc()//content/entry,
              cts:field-word-query("FullText", "president"),
              "unfiltered")

where "FullText" is a field that has been set up with the four
searchable elements above.

I tried running with xdmp:query-meters() and didn't find any cache
misses.

I'm experienced with SQL Server, but brand new to MarkLogic, so any
suggestions would be appreciated.

-Grant




------------------------------------------------------------------------

_______________________________________________
General mailing list
[email protected]
http://xqzone.com/mailman/listinfo/general