Re: [MarkLogic Dev General] MarkLogic vs SQL Server search performance

Michael Blakeley Mon, 29 Dec 2008 13:29:32 -0800

Grant,

I strongly recommend pagination in your query: see http://developer.marklogic.com/howto/tutorials/2006-09-paginated-search.xqy
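
As a rough sketch of what pagination could look like for the search quoted below: a positional predicate on cts:search() limits the results to one page. The $start and $page-size values are hypothetical and would normally come from the request.

```xquery
xquery version "1.0-ml";

(: Hypothetical paging parameters, not part of the original query. :)
let $start := 1
let $page-size := 10
let $query := cts:field-word-query("FullText", "president")
return
  (: Return only one page of results instead of every match. :)
  cts:search(fn:doc()//content/entry, $query, "unfiltered")
    [$start to $start + $page-size - 1]
```

Returning ten entries rather than every match should reduce the amount of XML that has to be read and serialized back to the .NET page.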

As well as xdmp:query-meters(), you should consult xdmp:query-trace() - see http://developer.marklogic.com/pubs/4.0/books/performance.pdf
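
One possible way to combine the two, offered as a sketch rather than a recipe: turn tracing on before the search, turn it off afterwards, and return the meters last. The trace output is written to the server error log, not to the query result. The [1 to 10] page is an assumption added for illustration.

```xquery
xquery version "1.0-ml";

(: Enable query tracing; trace messages go to the server error log. :)
xdmp:query-trace(fn:true()),

(: The search under test, limited to a hypothetical first page of 10. :)
cts:search(fn:doc()//content/entry,
  cts:field-word-query("FullText", "president"), "unfiltered")[1 to 10],

(: Disable tracing again, then report timing and cache statistics. :)
xdmp:query-trace(fn:false()),
xdmp:query-meters()
```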

Looking at the structure of your documents, I'd try storing each entry as a separate document. So your search would become /entry rather than /content/entry.
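
Assuming each entry has been stored as its own document and the "FullText" field is still configured as described below, the search might then look something like this sketch (the [1 to 10] page is illustrative):

```xquery
xquery version "1.0-ml";

(: Each entry is now a document root, so search /entry directly. :)
cts:search(/entry,
  cts:field-word-query("FullText", "president"),
  "unfiltered")[1 to 10]
```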

You could use Corb (http://developer.marklogic.com/svn/corb/trunk/README.html) to split up the documents in your existing database, or you could use RecordLoader (http://developer.marklogic.com/howto/tutorials/2006-06-recordloader.xqy) to split up the source documents as you insert each one.
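
For a data set of this size, the split could also be sketched directly in XQuery. This is not what Corb or RecordLoader do internally, just a minimal illustration; the "/entries/" URI scheme is made up for the example, and everything runs in a single transaction, which is why a batching tool is the better fit for larger databases.

```xquery
xquery version "1.0-ml";

(: Copy every entry into its own document.
   The "/entries/<entryId>.xml" URI pattern is hypothetical. :)
for $entry in fn:doc()/content/entry
let $uri := fn:concat("/entries/", fn:string($entry/@entryId), ".xml")
return xdmp:document-insert($uri, $entry)
```

The original /content documents would still need to be deleted afterwards so that entries are not stored twice.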


thanks,
-- Mike

On 2008-12-29 11:58, Grant Lindley wrote:

I'm comparing full-text search performance between MarkLogic 4.0 and SQL
Server 2005 from a C# .NET web page.

So far searches take about twice as long in MarkLogic compared to SQL
Server, and I'm looking for suggestions to improve performance in ML.

The test data consists of 14,035 searchable records that take up 52 MB
in an XML text file.

Here's a sample record:

<content>
   <entry entryId="121866">
     <title>Alvar Aalto</title>
     <sortTitle>Aalto, Alvar</sortTitle>
     <searchTitle>Aalto, Alvar</searchTitle>
     <synopsis>Finland's most distinguished designer, Alvar Aalto is
renowned for his building designs as well as for his unique birchwood
furniture designs that are the archetype of Finnish furniture.
</synopsis>
     <mainText>  Finland's most distinguished architect and designer, ...
[long text removed]</mainText>
     <entryDate></entryDate>
     <searchExclude>False</searchExclude>
     <hyperlink>False</hyperlink>
     <furtherReading>Alvar Aalto Museum Web Site
(http://www.alvaraalto.fi)</furtherReading>
     <siteCredits>ABC-CLIO</siteCredits>
     <citationCredits></citationCredits>
     <citationCredits2></citationCredits2>
     <accentUpdated>True</accentUpdated>
     <category categoryId="22">
       <displayTitle>Individuals</displayTitle>
       <formOrder>30</formOrder>
       <filterable>True</filterable>
       <categoryTypeId>5</categoryTypeId>
       <longDescription>Individuals</longDescription>
     </category>
     <subTopic subTopicId="62" topicId="3">
       <displayTitle>Finland</displayTitle>
       <description>Finland</description>
       <sortOrder>-1</sortOrder>
     </subTopic>
     <topic topicId="3">
       <description>Europe</description>
     </topic>
   </entry>
</content>

The elements that are included in the search are title, sortTitle,
mainText, and siteCredits.

For the MarkLogic index settings, I have selected only basic stemmed
searches and fast phrase searches.

The best results so far have been obtained when the entry element has
been added as a fragment root.

Here's the code currently being used to execute the search:

   cts:search(fn:doc()//content/entry, cts:field-word-query("FullText",
"president"), "unfiltered" )

where "FullText" is a field that has been set up with the four
searchable elements above.

I tried running with xdmp:query-meters() and didn't find any cache
misses.

I'm experienced with SQL Server, but brand new to MarkLogic, so any
suggestions would be appreciated.

-Grant




------------------------------------------------------------------------

_______________________________________________
General mailing list
[email protected]
http://xqzone.com/mailman/listinfo/general
