Thanks for everyone's suggestions; everything helps in writing queries that perform well.
The problem turned out to be a wrong QName in cts:element-value-query(), which was indeed bringing back tons of unwanted data.

gary

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Michael Blakeley
Sent: Thursday, August 11, 2011 7:07 PM
To: General MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] Quering documents with fragments

As Danny suggested, streaming makes it possible to yield the entire result set without pinning down expanded tree cache space. However, streaming only works for a fairly narrow range of XQuery expressions. A streaming query can't use FLWOR, direct constructors, or enclosed expressions. So the information would arrive in a somewhat different format, which means changes to your Stax handler. Here's one way to do it:

    declare variable $keys as xs:string+ := xdmp:get-request-field('key') ;

    xdmp:directory("/db/netvisn/content/", "infinity")
    /content/lookupInfo[key = $keys]

That will stream - if it can. The "if it can" part comes in because many environments don't allow streaming. For example, cq can't stream results simply because of the way it evaluates queries and handles the results. I believe that's also true for XCC. But you could make an HTTP request to an HTTP server, and stream back the results. That's why I used xdmp:get-request-field to define the keys.

You might also want to take a step back and see if you're making best use of MarkLogic's strengths. Perhaps I am jumping to conclusions, but expanded tree cache problems often come up in applications where the documents try to act like RDBMS tables. MarkLogic works best when documents act more like rows. In most cases that is an easy change to make... easier than fighting with the product, anyway.

-- Mike

On 11 Aug 2011, at 15:43 , Gary Larsen wrote:

> Hi Danny,
>
> I didn't realize the side effect of the for loop.
> Below is the original query (converted from eXist) where I'm trying to create a structure that's passed to a Stax handler in Java. Perhaps a better approach would be to run the $cq and cts:search for each key rather than all at once.
>
>     $keys := ("i9297889D3DB8430DB22F66EF3BA331DF", "i955A9E9DE3964B3A91D4F26BE35C88EB", "i598A118E534B4A5DA42FDBB36FF039AA", "i7A8EAC0921E049D98CC80C3A6DC032A7", "i8DB23187DFCD4D95AF094A2DDB7F43EA")
>
>     let $cq := cts:and-query((cts:directory-query("/db/netvisn/content/", "infinity"),
>         cts:element-value-query(fn:QName( "key"), $keys)))
>     return <results>
>     { for $info in cts:search(fn:doc(), $cq, "filtered")/content/lookupInfo return
>         <SyncData>{ $info/key }{ $info }</SyncData> }
>     </results>
>
> Thanks,
> Gary
>
> From: [email protected] [mailto:[email protected]] On Behalf Of Danny Sokolsky
> Sent: Thursday, August 11, 2011 5:07 PM
> To: General MarkLogic Developer Discussion
> Subject: Re: [MarkLogic Dev General] Quering documents with fragments
>
> Also, because the cts:search is in a for loop and building up a result set, the entire cts:search result set must be put in memory in order to build your result set. If you just returned the cts:search, it would still be a lot of results, but it could stream it out. Like Geert says, pagination is a great way to do these kinds of things.
>
> -Danny
>
> From: [email protected] [mailto:[email protected]] On Behalf Of Gary Larsen
> Sent: Thursday, August 11, 2011 2:02 PM
> To: 'General MarkLogic Developer Discussion'
> Subject: Re: [MarkLogic Dev General] Quering documents with fragments
>
> Hi Geert,
>
> Thanks for clarifying that only query results are stored to the cache, and I take from your suggestion of paging that read only query results would not clog the cache.
>
> I may have some unexpected data somewhere and need to research that.
>
> Thanks!
> gary
>
> From: [email protected] [mailto:[email protected]] On Behalf Of Geert Josten
> Sent: Thursday, August 11, 2011 4:38 PM
> To: General MarkLogic Developer Discussion
> Subject: Re: [MarkLogic Dev General] Quering documents with fragments
>
> Hi Gary,
>
> The cts querying part is based on indexes only, so neither the fragments nor the data itself is involved. It is the compilation of the results structure that is causing the problem. To put it simply: you are returning way too many results at once. Start using pagination. Try returning something like 20 or 100 results per page. You will see that the pagination works really fast.
>
> You might also want to take a look at search:search; it can do pagination for you as well. Use additional-query for the directory-query, and perhaps a constraint for your keys (so you could use "key:mykey1").
>
> Kind regards,
> Geert
>
> From: [email protected] [mailto:[email protected]] On Behalf Of Gary Larsen
> Sent: Thursday, August 11, 2011 21:18
> To: 'General MarkLogic Developer Discussion'
> Subject: [MarkLogic Dev General] Quering documents with fragments
>
> Hi,
>
> I'm trying to fix a query which is throwing XDMP-EXPNTREECACHEFULL errors. It's possible that the documents being queried have a large number of fragments (> 30K). The query is not referencing any elements in the fragments, but I'm wondering if the query is loading the root document AND fragments into the expanded tree cache. Here's the main part of the query:
>
>     let $cq := cts:and-query((
>         cts:directory-query('/db/netvisn/content/','infinity'),
>         cts:element-value-query(xs:QName('key'), $keys)
>     )) return <results>
>     {for $info in cts:search(doc(), $cq, 'filtered')/content/lookupInfo return
>     <SyncData>{$info/key}
>
> Do I need to surround the cts:element-value-query() with a cts:document-fragment-query() to avoid grabbing the fragments?
>
> Thanks
> _______________________________________________
> General mailing list
> [email protected]
> http://developer.marklogic.com/mailman/listinfo/general
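
For readers following the thread, the pagination advice from Geert could look roughly like the sketch below. It is not code posted by anyone in the thread. The "page" request field, the $page and $page-size variables, and the page size of 20 are assumptions made for illustration, and xs:QName("key") assumes the key element is in no namespace (the thread's resolution was a wrong QName, so adjust it to match the real data). The range predicate pages over matching documents, not over individual lookupInfo elements.

```xquery
xquery version "1.0-ml";

(: "key" follows Mike's example; "page" is a hypothetical request field. :)
declare variable $keys as xs:string+ := xdmp:get-request-field("key");
declare variable $page as xs:integer :=
  xs:integer(xdmp:get-request-field("page", "1"));
(: Assumed page size; Geert suggested something like 20 or 100. :)
declare variable $page-size as xs:integer := 20;

(: First and last positions of the requested page, 1-based. :)
let $start := ($page - 1) * $page-size + 1
let $end := $start + $page-size - 1
let $cq := cts:and-query((
  cts:directory-query("/db/netvisn/content/", "infinity"),
  cts:element-value-query(xs:QName("key"), $keys)
))
return
  <results>{
    (: Only the documents in positions $start to $end are expanded. :)
    for $info in
      cts:search(fn:doc(), $cq, "filtered")[$start to $end]/content/lookupInfo
    return <SyncData>{ $info/key }{ $info }</SyncData>
  }</results>
```

The caller would then request page 1, 2, 3 and so on until an empty results element comes back, so each request builds only one page of results in memory.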
