As Danny suggested, streaming makes it possible to yield the entire result set 
without pinning down expanded tree cache space. However, streaming only works 
for a fairly narrow range of XQuery expressions. A streaming query can't use 
FLWOR, direct constructors, or enclosed expressions. So the information would 
arrive in a somewhat different format, which means changes to your StAX handler.

Here's one way to do it:

  declare variable $keys as xs:string+ :=
    xdmp:get-request-field('key') ;

  xdmp:directory("/db/netvisn/content/", "infinity")
    /content/lookupInfo[key = $keys]

That will stream - if it can. The "if it can" part comes in because many 
environments don't allow streaming. For example, cq can't stream results simply 
because of the way it evaluates queries and handles the results. I believe 
that's also true for XCC. But you could make a request to an HTTP app server 
and have it stream the results back. That's why I used xdmp:get-request-field 
to define the keys.
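
If streaming turns out not to be available in your environment, the pagination 
Geert suggested is the other way to keep each request small. Here is a minimal 
sketch of that idea, not a tested implementation. The 'start' and 'page-size' 
request fields and their defaults are names I made up for illustration; the 
directory and element names come from your original query. Note that the page 
size counts matching documents, not lookupInfo elements.

  xquery version "1.0-ml";

  declare variable $keys as xs:string+ :=
    xdmp:get-request-field('key') ;

  (: Hypothetical paging fields: 1-based start position and page size. :)
  declare variable $start as xs:integer :=
    xs:integer(xdmp:get-request-field('start', '1')) ;
  declare variable $page-size as xs:integer :=
    xs:integer(xdmp:get-request-field('page-size', '100')) ;

  (: Same constraints as the original cts:and-query. :)
  declare variable $cq := cts:and-query((
    cts:directory-query("/db/netvisn/content/", "infinity"),
    cts:element-value-query(xs:QName("key"), $keys))) ;

  (: Take one page of matching documents, then pull out the
     lookupInfo elements whose key is one of the requested keys. :)
  fn:subsequence(cts:search(fn:doc(), $cq, "filtered"), $start, $page-size)
    /content/lookupInfo[key = $keys]

The caller would then request start=1, start=101, and so on until a page comes 
back empty.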

You might also want to take a step back and see if you're making best use of 
MarkLogic's strengths. Perhaps I am jumping to conclusions, but expanded tree 
cache problems often come up in applications where the documents try to act 
like RDBMS tables. MarkLogic works best when documents act more like rows. In 
most cases that is an easy change to make... easier than fighting with the 
product, anyway.
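
To make that concrete, here is a rough sketch of how the lookup might look if 
each lookupInfo lived in its own small document. The one-lookupInfo-per-document 
layout is purely an assumption for illustration, since I haven't seen how your 
content is actually structured.

  xquery version "1.0-ml";

  declare variable $keys as xs:string+ :=
    xdmp:get-request-field('key') ;

  (: Assumes lookupInfo is the root element of its own document
     (hypothetical layout, not the current one). :)
  cts:search(
    fn:doc()/lookupInfo,
    cts:and-query((
      cts:directory-query("/db/netvisn/content/", "infinity"),
      cts:element-value-query(xs:QName("key"), $keys))))

With a layout like that, the indexes select only the documents that hold a 
requested key, and each document that gets loaded is small.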

-- Mike

On 11 Aug 2011, at 15:43 , Gary Larsen wrote:

> Hi Danny,
>  
> I didn’t realize the for loop had that side effect. Below is the original 
> query (converted from eXist), where I’m trying to create a structure that’s 
> passed to a StAX handler in Java. Perhaps a better approach would be to run 
> the $cq and cts:search for each key rather than all at once.
>  
> let $keys := ("i9297889D3DB8430DB22F66EF3BA331DF",
>   "i955A9E9DE3964B3A91D4F26BE35C88EB", "i598A118E534B4A5DA42FDBB36FF039AA",
>   "i7A8EAC0921E049D98CC80C3A6DC032A7", "i8DB23187DFCD4D95AF094A2DDB7F43EA")
> let $cq := cts:and-query((
>   cts:directory-query("/db/netvisn/content/", "infinity"),
>   cts:element-value-query(xs:QName("key"), $keys)))
> return
>   <results>{
>     for $info in cts:search(fn:doc(), $cq, "filtered")/content/lookupInfo
>     return <SyncData>{ $info/key }{ $info }</SyncData>
>   }</results>
>  
> Thanks,
> Gary
>  
> From: [email protected] 
> [mailto:[email protected]] On Behalf Of Danny Sokolsky
> Sent: Thursday, August 11, 2011 5:07 PM
> To: General MarkLogic Developer Discussion
> Subject: Re: [MarkLogic Dev General] Quering documents with fragments
>  
> Also, because the cts:search is in a for loop and building up a result set, 
> the entire cts:search result set must be put in memory in order to build your 
> result set.  If you just returned the cts:search, it would still be a lot of 
> results, but it could stream it out.  Like Geert says, pagination is a great 
> way to do these kinds of things.
>  
> -Danny
>  
> From: [email protected] 
> [mailto:[email protected]] On Behalf Of Gary Larsen
> Sent: Thursday, August 11, 2011 2:02 PM
> To: 'General MarkLogic Developer Discussion'
> Subject: Re: [MarkLogic Dev General] Quering documents with fragments
>  
> Hi Geert,
>  
> Thanks for clarifying that only query results are stored to the cache, and I 
> take from your suggestion of paging that read only query results would not 
> clog the cache.
>  
> I may have some unexpected data somewhere and need to research that.
>  
> Thanks!
> gary
>  
> From: [email protected] 
> [mailto:[email protected]] On Behalf Of Geert Josten
> Sent: Thursday, August 11, 2011 4:38 PM
> To: General MarkLogic Developer Discussion
> Subject: Re: [MarkLogic Dev General] Quering documents with fragments
>  
> Hi Gary,
>  
> The cts querying part is based on indexes only, so neither the fragments nor 
> the data itself is involved. It is the construction of the results structure 
> that is causing the problem. To put it simply: you are returning way too many 
> results at once. Start using pagination. Try returning something like 20 or 
> 100 results per page. You will see that pagination works really fast.
>  
> You might also want to take a look at search:search, it can do pagination for 
> you as well. Use additional-query for the directory-query, and perhaps a 
> constraint for your keys (so you could use “key:mykey1”).
>  
> Kind regards,
> Geert
>  
> From: [email protected] 
> [mailto:[email protected]] On Behalf Of Gary Larsen
> Sent: Thursday, August 11, 2011 9:18 PM
> To: 'General MarkLogic Developer Discussion'
> Subject: [MarkLogic Dev General] Quering documents with fragments
>  
> Hi,
>  
> I’m trying to fix a query that is throwing XDMP-EXPNTREECACHEFULL errors. It’s 
> possible that the documents being queried have a large number of fragments (> 
> 30K). The query does not reference any elements in the fragments, but I’m 
> wondering whether it loads the root document AND the fragments into the 
> expanded tree cache. Here’s the main part of the query:
>  
> let $cq := cts:and-query((
>  cts:directory-query('/db/netvisn/content/','infinity'),
>  cts:element-value-query(xs:QName('key'), $keys)
> ))  return <results>
> {for $info in cts:search(doc(), $cq, 'filtered')/content/lookupInfo return
> <SyncData>{$info/key}
>  
> Do I need to surround the cts:element-value-query() with a 
> cts:document-fragment-query() to avoid grabbing the fragments?
>  
> Thanks
> _______________________________________________
> General mailing list
> [email protected]
> http://developer.marklogic.com/mailman/listinfo/general

