Yes, I mean exactly what you described: creating an aggregate document. Whether that works does depend on the size of the files. We need facet counts for a number of different items, so putting everything in the same book document made things work nicely for our application. We did not try to do much manipulation on those documents, as we did all of the edits on the original (smaller) chapter files.
As for the size, our search is user-driven: we return about 20 records at a time, and the user pages to get more results. We did find that returning too many results caused a problem, since each of those documents has to be opened for the search response.

Brad Rix
Technical Lead
Office: +1.303.542.2172
Mobile: +1.303.915.2771
Fax: +1.303.544.0522
IM/Gmail Name: bradford.rix
www.FlatironsSolutions.com
[email protected]

From: [email protected] On Behalf Of Jakob Fix
Sent: Tuesday, January 08, 2013 2:51 PM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] "left join" using search:search or cts:search?

Hi Brad,

I think I understand what you mean by "materialising" the content, but just to make sure: do you suggest that we should generate a separate aggregate document for each book and the chapters it contains (and then of course also figures and tables while we are at it)?

I'm a bit worried that this might get big relative to the document sizes recommended for MarkLogic, as we might have many dozens of chapters, each with its metadata, abstract, etc., as well as hundreds of tables and figures.

But we'll give it some thought, also I always wanted to tinker with CPF! ;-)

cheers,
Jakob.

On Tue, Jan 8, 2013 at 7:50 PM, Rix, Brad <[email protected]> wrote:

An alternate solution is to materialize your content so that the chapter content is included in the book content under a new URI. The materialized book would have pointers back to the original assets (if necessary). This does duplicate the storage requirements, but it gives you full flexibility to use the search APIs with all of their capabilities. You would then direct your search to the materialized URI locations instead of the individual documents.
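
A minimal sketch of that materialization step, assuming a layout the thread does not actually specify: books rooted at a `book` element, chapters rooted at a `chapter` element, and both carrying a `bookid` element. The `fullbook` and `part` wrapper elements and the `source` attribute are hypothetical names chosen for illustration; the URIs are the example ones Brad gives for this layout.

```xquery
xquery version "1.0-ml";

(: Sketch only. Assumed, not from the thread: books are rooted at <book>,
   chapters at <chapter>, and both carry a <bookid> element. :)
let $book-uri := "/docs/assets/book.xml"
let $book := fn:doc($book-uri)/book

(: Find the chapter documents that point at this book. :)
let $chapters :=
  cts:search(/chapter,
    cts:element-value-query(xs:QName("bookid"), fn:string($book/bookid)))
return
  xdmp:document-insert(
    "/docs/fullbooks/fullbook.xml",
    <fullbook source="{$book-uri}">
      {$book}
      {
        (: Keep a pointer back to each original chapter document. :)
        for $chapter in $chapters
        return <part source="{xdmp:node-uri($chapter)}">{$chapter}</part>
      }
    </fullbook>)
```

Searches and facets would then run against the documents under /docs/fullbooks/, while edits continue to be made on the original book and chapter documents.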
You could also have a CPF pipeline that triggers to create the materialized version whenever a child (chapter) or top-level asset is updated, or create it yourself as you update/insert the chapter documents. For example, if you have the following structure:

    /docs/assets/book.xml
    /docs/assets/chapter.xml

you could materialize into

    /docs/fullbooks/fullbook.xml

and then search /docs/fullbooks instead of /docs/assets.

Brad Rix

From: [email protected] On Behalf Of Damon Feldman
Sent: Tuesday, January 08, 2013 11:36 AM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] "left join" using search:search or cts:search?

Jakob,

I want to highlight that Charles' code uses range index values on both "sides" of the join. He pulls values using element-values(), so that comes from a range index. Then he uses element-range-query() to do the shotgun OR.

This is important because the values will line up properly (particularly if they use the default collation or, if you have a non-default collation set, both indexes use the same one). Word queries apply stemming and ignore punctuation. Value queries can even be a little different from the values in the range indexes, so use range indexes on both sides of the join.

Yours,
Damon

From: [email protected] On Behalf Of Charles Greer
Sent: Tuesday, January 08, 2013 10:56 AM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] "left join" using search:search or cts:search?

Hi Jakob,

The way to accomplish this kind of thing using cts:search is pretty well understood right now.
Use a "scatter query" or "shotgun OR" technique: use cts:values to fetch the identifiers from the chapter documents that interest you (with a subquery as needed), then use the output of that call as input to a cts:search on book documents.

    (: cts:query("for chapters") is a placeholder for the query that selects the chapters :)
    let $vals := cts:values($index-reference-with-book-identifiers, 0, (), cts:query("for chapters"))
    return cts:search(/, cts:element-range-query(xs:QName("bookid"), "=", $vals))

This runs quickly because it's all XQuery on the server.

There's no corresponding technique in the search API at this time, although with a custom constraint one could accomplish it. It's on my todo list to provide examples of this kind of custom code...

The technique would be to make an XQuery module with this kind of code in it, and then define a custom constraint that would invoke it, so book:term-in-chapter would put "term-in-chapter" into the first clause above, and then the extension would return results from the cts:search.

Hope that helps a little. There's good discussion of this technique in the "Inside MarkLogic" paper by Jason Hunter.

Charles

On 01/08/2013 07:29 AM, Jakob Fix wrote:

Hi,

Is it possible to find documents based on a criteria match on a related document with the search:search (or cts:search) API?

To explain: I have separate documents for books and chapters, where each chapter has a reference to its parent book via an identifier, and I would like to find books which contain a specific term either themselves or in one of their chapters. I can't seem to find a way with cts:query to make something like a left join in SQL.

thanks,
Jakob.
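
Putting the question and the shotgun OR answer together, the whole lookup might be sketched as below. This is a sketch under assumptions the thread does not confirm: books are rooted at a `book` element, chapters at a `chapter` element, both carry a `bookid` element, and a string element range index is configured on `bookid`. The same index is used on both sides of the join, which follows Damon's advice about keeping the values and collation aligned.

```xquery
xquery version "1.0-ml";

(: Sketch only. Assumed, not from the thread: books are rooted at <book>,
   chapters at <chapter>, both carry a <bookid> element, and a string
   element range index is configured on bookid. :)
let $term := "term-in-chapter"
let $bookid := xs:QName("bookid")

(: Step 1: read book identifiers straight from the range index, limited to
   chapter documents that contain the term. :)
let $ids :=
  cts:values(
    cts:element-reference($bookid), (), (),
    cts:element-query(xs:QName("chapter"), cts:word-query($term)))

(: Step 2: a book matches if it contains the term itself, or if its
   identifier is among those collected from matching chapters. The range
   query is only added when step 1 found something. :)
let $query :=
  cts:or-query((
    cts:word-query($term),
    if (fn:exists($ids))
    then cts:element-range-query($bookid, "=", $ids)
    else ()
  ))
return cts:search(/book, $query)
```

The or-query is what gives the "left join" flavour: books that match on their own are returned even when none of their chapters match. The guard around the range query is a precaution so that the result does not depend on how an empty value list is evaluated.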
--
Charles Greer
Senior Engineer
MarkLogic Corporation
[email protected]
Phone: +1 707 408 3277
www.marklogic.com

_______________________________________________
General mailing list
[email protected]
http://developer.marklogic.com/mailman/listinfo/general
