A call to xdmp:value() could be doing almost anything. But my guess is snippet 
generation and highlighting. Going further out on a limb, it may have something 
to do with XPath across fragment boundaries. If the snippet code tries to go up 
to the '/logfile' root element, it will be loading a fairly large fragment with 
many (millions of?) child links. That could get ugly, especially if the working 
set is too large for the CPU's on-die caches.

You could test that theory by turning off snippet display, and by creating some 
logfile documents of various sizes. If I'm on the right track, you'll see that 
elapsed time is related to the number of fragments. I don't know if it will be 
O(n) or something worse, though.
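
As a rough sketch of the first test, here is the same search with snippeting 
switched off via the Search API's "empty-snippet" transform. I have not run this 
against your data, so treat it as a starting point. Profile it the same way you 
profiled the original and compare the time spent in xdmp:value().

```xquery
xquery version "1.0-ml";
import module namespace search = "http://marklogic.com/appservices/search"
  at "/MarkLogic/appservices/search/search.xqy";

(: Same query and options as the original search, plus one extra option:
   transform-results apply="empty-snippet" asks for results without snippets.
   If the slow xdmp:value() call disappears from the profile, snippet
   generation is the likely culprit. :)
search:search("INFO",
  <options xmlns="http://marklogic.com/appservices/search">
    <additional-query>{cts:directory-query("/logs/", "infinity")}</additional-query>
    <searchable-expression>/logfile/log</searchable-expression>
    <transform-results apply="empty-snippet"/>
  </options>,
  1, 10)
```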

I would consider making '/log' the root element, with no subfragments. If you 
have metadata in the logfile, you could represent that with a directory 
structure under /logs/, and perhaps have a metadata document with a known base 
URI inside each directory. Or you could repeat the metadata in each log-entry 
document. But I would try to get away from using sub-document fragments.
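
Something like the following would split one existing logfile document into 
per-entry documents. The source URI and the target URI scheme are made up for 
illustration, and for a logfile with millions of entries you would want to 
batch the work, or better, split at load time rather than in one transaction.

```xquery
xquery version "1.0-ml";

(: Hypothetical URI of one existing logfile document; substitute your own. :)
let $source-uri := "/logs/example.xml"
(: Assumed layout: one directory per logfile, e.g. /logs/example/ :)
let $dir := concat(substring-before($source-uri, ".xml"), "/")
for $log at $i in doc($source-uri)/logfile/log
return
  (: Each log entry becomes its own document with 'log' as the root element. :)
  xdmp:document-insert(concat($dir, $i, ".xml"), $log)
```

With that layout the searchable-expression would become /log, and the 
directory query on "/logs/" with depth "infinity" would still cover every entry.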

-- Mike

On 11 Jul 2011, at 07:55, Lee, David wrote:

> I'm playing with search:search
> I have about 4GB of data (few million xml files)
> Most searches return within a second or two, but something magic is 
> happening if I use the word "INFO"
>  
>  
> With the following search it takes nearly 3 minutes and returns no results.
> I tried other words, some returning many results, some few, and some none, 
> and can only replicate it with the magic word "INFO"
>  
> search:search("INFO",
>   <options xmlns="http://marklogic.com/appservices/search">
>     <additional-query>{cts:directory-query("/logs/", "infinity")}</additional-query>
>     <searchable-expression>/logfile/log</searchable-expression>
>   </options>,
>   1, 10)
>  
>  
> Profiling shows the majority of time is spent in xdmp:value
>  
> MarkLogic/appservices/utils/higher-order.xqy:52
> xdmp:value($expr/hof:lambda/@expr)
> 1   100   124556878   100   124556880
>  
>  
> That's about 124 seconds. Everything else is noise (mostly under 100 µs).
>  
> Any ideas?
>  
>  
> ----------------------------------------
> David A. Lee
> Senior Principal Software Engineer
> Epocrates, Inc.
> [email protected]
> 812-482-5224
>  
> _______________________________________________
> General mailing list
> [email protected]
> http://developer.marklogic.com/mailman/listinfo/general

