Thanks for everyone's suggestions; every one of them helps in writing queries
that perform well.

The problem turned out to be a wrong QName in cts:element-value-query(),
which was indeed bringing back tons of unwanted data.
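
For anyone who hits the same thing, one way to sanity-check a QName is to
estimate how many fragments the query matches before running the search. This
is only a minimal sketch: the key value is made up, and passing "" as the
namespace assumes the key element is in no namespace, which may not hold for
other data.

```xquery
xquery version "1.0-ml";

(: Hypothetical key value, for illustration only. :)
let $keys := ("i00000000000000000000000000000000")
let $cq := cts:and-query((
  cts:directory-query("/db/netvisn/content/", "infinity"),
  (: fn:QName takes a namespace URI and a local name.
     "" is an assumption that "key" is in no namespace. :)
  cts:element-value-query(fn:QName("", "key"), $keys)
))
(: xdmp:estimate answers from the indexes alone (an unfiltered count),
   so it does not load fragments into the expanded tree cache. :)
return xdmp:estimate(cts:search(fn:doc(), $cq))
```

An estimate far larger than the number of keys suggests the query is not
matching what was intended.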

gary

-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Michael
Blakeley
Sent: Thursday, August 11, 2011 7:07 PM
To: General MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] Quering documents with fragments

As Danny suggested, streaming makes it possible to yield the entire result
set without pinning down expanded tree cache space. However, streaming only
works for a fairly narrow range of XQuery expressions. A streaming query
can't use FLWOR, direct constructors, or enclosed expressions. So the
information would arrive in a somewhat different format, which means changes
to your Stax handler.

Here's one way to do it:

  declare variable $keys as xs:string+ :=
    xdmp:get-request-field('key') ;

  xdmp:directory("/db/netvisn/content/", "infinity")
    /content/lookupInfo[key = $keys]

That will stream - if it can. The "if it can" part comes in because many
environments don't allow streaming. For example, cq can't stream results
simply because of the way it evaluates queries and handles the results. I
believe that's also true for XCC. But you could make an HTTP request to an
HTTP server, and stream back the results. That's why I used
xdmp:get-request-field to define the keys.

You might also want to take a step back and see if you're making best use of
MarkLogic's strengths. Perhaps I am jumping to conclusions, but expanded
tree cache problems often come up in applications where the documents try to
act like RDBMS tables. MarkLogic works best when documents act more like
rows. In most cases that is an easy change to make... easier than fighting
with the product, anyway.

-- Mike

On 11 Aug 2011, at 15:43 , Gary Larsen wrote:

> Hi Danny,
>  
> I didn't realize the side effect of the for loop.  Below is the original query
> (converted from eXist), where I'm trying to create a structure that's passed
> to a Stax handler in Java.  Perhaps a better approach would be to run the
> $cq and cts:search for each key rather than all at once.
>  
> $keys := ("i9297889D3DB8430DB22F66EF3BA331DF",
>   "i955A9E9DE3964B3A91D4F26BE35C88EB", "i598A118E534B4A5DA42FDBB36FF039AA",
>   "i7A8EAC0921E049D98CC80C3A6DC032A7", "i8DB23187DFCD4D95AF094A2DDB7F43EA")
>  
> let $cq := cts:and-query((
>   cts:directory-query("/db/netvisn/content/", "infinity"),
>   cts:element-value-query(fn:QName( "key"), $keys)))
> return <results>
>   { for $info in cts:search(fn:doc(), $cq, "filtered")/content/lookupInfo
>     return <SyncData>{ $info/key }{ $info }</SyncData> }
> </results>
>  
> Thanks,
> Gary
>  
> From: [email protected]
> [mailto:[email protected]] On Behalf Of Danny Sokolsky
> Sent: Thursday, August 11, 2011 5:07 PM
> To: General MarkLogic Developer Discussion
> Subject: Re: [MarkLogic Dev General] Quering documents with fragments
>  
> Also, because the cts:search is in a for loop and building up a result
> set, the entire cts:search result set must be put in memory in order to
> build your result set.  If you just returned the cts:search, it would still
> be a lot of results, but it could stream it out.  Like Geert says,
> pagination is a great way to do these kinds of things.
>  
> -Danny
>  
> From: [email protected]
> [mailto:[email protected]] On Behalf Of Gary Larsen
> Sent: Thursday, August 11, 2011 2:02 PM
> To: 'General MarkLogic Developer Discussion'
> Subject: Re: [MarkLogic Dev General] Quering documents with fragments
>  
> Hi Geert,
>  
> Thanks for clarifying that only query results are stored in the cache. I
> take it from your suggestion of paging that read-only query results would
> not clog the cache.
>  
> I may have some unexpected data somewhere and need to research that.
>  
> Thanks!
> gary
>  
> From: [email protected]
> [mailto:[email protected]] On Behalf Of Geert Josten
> Sent: Thursday, August 11, 2011 4:38 PM
> To: General MarkLogic Developer Discussion
> Subject: Re: [MarkLogic Dev General] Quering documents with fragments
>  
> Hi Gary,
>  
> The cts querying part is based on indexes only, so neither the fragments nor
> the data itself is involved. It is the compilation of the results structure
> that is causing the problem. To put it simply: you are returning way too many
> results at once. Start using pagination. Try returning something like 20 or
> 100 results per page. You will see that the pagination works really fast.
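>  
> A minimal sketch of that kind of pagination, reusing the directory and key
> element from the query in this thread. The "page" request field, the page
> size of 20, and the empty namespace passed to fn:QName are assumptions made
> for illustration, not something the original query defines.
>  
> ```xquery
> xquery version "1.0-ml";
>
> (: "page" is a hypothetical request field; it defaults to the first page. :)
> declare variable $keys as xs:string+ := xdmp:get-request-field("key");
> declare variable $page as xs:integer :=
>   xs:integer(xdmp:get-request-field("page", "1"));
> declare variable $page-size as xs:integer := 20;
>
> let $start := ($page - 1) * $page-size + 1
> let $cq := cts:and-query((
>   cts:directory-query("/db/netvisn/content/", "infinity"),
>   (: "" assumes the key element is in no namespace. :)
>   cts:element-value-query(fn:QName("", "key"), $keys)
> ))
> return
>   (: The positional predicate limits the search to one page of documents. :)
>   (cts:search(fn:doc(), $cq))[$start to $start + $page-size - 1]
>     /content/lookupInfo
> ```
>  
> Each request then touches at most $page-size documents, so the expanded tree
> cache only has to hold one page at a time.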
>  
> You might also want to take a look at search:search; it can do pagination
> for you as well. Use additional-query for the directory-query, and perhaps a
> constraint for your keys (so you could use "key:mykey1").
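>  
> As a rough sketch of that setup with the Search API: the constraint name
> "key" and the query text "key:mykey1" come from the suggestion above, while
> the empty namespace on the element and the page length of 20 are assumptions.
>  
> ```xquery
> xquery version "1.0-ml";
>
> import module namespace search = "http://marklogic.com/appservices/search"
>   at "/MarkLogic/appservices/search/search.xqy";
>
> let $options :=
>   <options xmlns="http://marklogic.com/appservices/search">
>     <!-- Restrict every search to the content directory. -->
>     <additional-query>{
>       cts:directory-query("/db/netvisn/content/", "infinity")
>     }</additional-query>
>     <!-- "key:..." in the query text becomes an element value query;
>          ns="" assumes the key element is in no namespace. -->
>     <constraint name="key">
>       <value>
>         <element ns="" name="key"/>
>       </value>
>     </constraint>
>   </options>
> (: Start at result 1 and return 20 results per page. :)
> return search:search("key:mykey1", $options, 1, 20)
> ```
>  
> search:search returns a search:response element describing the matches
> rather than the lookupInfo elements themselves, so the consumer of the
> results would need to handle that format.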
>  
> Kind regards,
> Geert
>  
> From: [email protected]
> [mailto:[email protected]] On Behalf Of Gary Larsen
> Sent: Thursday, August 11, 2011 21:18
> To: 'General MarkLogic Developer Discussion'
> Subject: [MarkLogic Dev General] Quering documents with fragments
>  
> Hi,
>  
> I'm trying to fix a query that is throwing XDMP-EXPNTREECACHEFULL errors.
> It's possible that the documents being queried have a large number of
> fragments (> 30K).  The query does not reference any elements in the
> fragments, but I'm wondering if the query is loading the root document AND
> the fragments into the expanded tree cache.  Here's the main part of the query:
>  
> let $cq := cts:and-query((
>  cts:directory-query('/db/netvisn/content/','infinity'),
>  cts:element-value-query(xs:QName('key'), $keys)
> ))  return <results>
> {for $info in cts:search(doc(), $cq, 'filtered')/content/lookupInfo return
> <SyncData>{$info/key}
>  
> Do I need to surround the cts:element-value-query() with a
> cts:document-fragment-query() to avoid grabbing the fragments?
>  
> Thanks
> _______________________________________________
> General mailing list
> [email protected]
> http://developer.marklogic.com/mailman/listinfo/general
