Hi Brad,

I think I understand what you mean by "materialising" the content, but just to make sure: are you suggesting that we generate a separate aggregate document for each book together with the chapters it contains (and then, while we are at it, its figures and tables as well)?
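
If I read you correctly, the materialisation step would look roughly like the following. This is only an untested sketch: the element names (book, chapter, bookid, fullbook, chapters), the source-uri pointers, the external $book-uri variable and the one-document-per-book naming under /docs/fullbooks/ are my own assumptions for illustration, not something taken from your mail.

```xquery
xquery version "1.0-ml";

(: Assumption: books and chapters live under /docs/assets/ and both carry
   a <bookid> element; all element names here are hypothetical. :)
declare variable $book-uri as xs:string external;

let $book := fn:doc($book-uri)/book
let $book-id := fn:string($book/bookid)

(: Collect the chapter documents that point at this book. :)
let $chapters :=
  cts:search(
    fn:collection()/chapter,
    cts:and-query((
      cts:directory-query("/docs/assets/", "infinity"),
      cts:element-value-query(xs:QName("bookid"), $book-id, "exact")
    ))
  )

(: Write one aggregate document, keeping pointers back to the originals. :)
return
  xdmp:document-insert(
    fn:concat("/docs/fullbooks/", $book-id, ".xml"),
    <fullbook>
      <source-uri>{$book-uri}</source-uri>
      {$book/node()}
      <chapters>{
        for $c in $chapters
        return
          <chapter source-uri="{xdmp:node-uri($c)}">{$c/node()}</chapter>
      }</chapters>
    </fullbook>
  )
```

Searches would then be pointed at /docs/fullbooks/ only, and a CPF pipeline would re-run something like this whenever a book or one of its chapters changes.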

I'm a bit worried that these aggregate documents might grow beyond the document sizes recommended as optimal for MarkLogic: a book can have many dozens of chapters, each with its own metadata, abstract, etc., plus hundreds of tables and figures. But we'll give it some thought. Besides, I always wanted to tinker with CPF! ;-)

cheers,
Jakob.

On Tue, Jan 8, 2013 at 7:50 PM, Rix, Brad <[email protected]> wrote:

> An alternative solution is to materialize your content to include the
> chapter content in the book content in a new URI. The materialized book
> would have pointers back to the original assets (if necessary). This does
> duplicate the storage requirements, but gives you full flexibility to use
> the search APIs with all of their capabilities.
>
> Then you would direct your search to the materialized URI locations
> instead of the individual documents.
>
> You could also have a CPF pipeline that triggers the creation of a
> materialized version whenever a child (chapter) or top-level asset is
> updated or created, as you update/insert the chapter documents.
>
> For example, if you have the following structure:
>
>   /docs/assets/book.xml
>   /docs/assets/chapter.xml
>
> You could materialize into
>
>   /docs/fullbooks/fullbook.xml
>
> Then search /docs/fullbooks instead of /docs/assets.
>
> Brad Rix
> Technical Lead
> www.FlatironsSolutions.com
>
> From: [email protected] [mailto:[email protected]] On Behalf Of Damon Feldman
> Sent: Tuesday, January 08, 2013 11:36 AM
> To: MarkLogic Developer Discussion
> Subject: Re: [MarkLogic Dev General] "left join" using search:search or cts:search?
>
> Jakob,
>
> I want to highlight that Charles' code uses range index values on both
> "sides" of the join. He pulls values using element-values(), so that comes
> from a range index. Then he uses element-range-query() to do the shotgun
> OR. This is important because the values will line up properly
> (particularly if both indexes use the default collation or, if you have
> set a non-default collation, both use the same one).
>
> Word queries apply stemming and ignore punctuation. Value queries can even
> be a little different from the values in the range indexes, so use range
> indexes on both sides of the join.
>
> Yours,
> Damon
>
> From: [email protected] [mailto:[email protected]] On Behalf Of Charles Greer
> Sent: Tuesday, January 08, 2013 10:56 AM
> To: MarkLogic Developer Discussion
> Subject: Re: [MarkLogic Dev General] "left join" using search:search or cts:search?
>
> Hi Jakob,
>
> The way to accomplish this kind of thing using cts:search is pretty well
> understood right now.
> Use a "scatter query" or "shotgun OR" technique: use cts:values to fetch
> the identifiers from the chapter documents that interest you (with a
> subquery as needed).
>
> Then use the output of that call as input to a cts:search on book
> documents.
>
>   (: $index-reference-with-book-identifiers and cts:query("for chapters")
>      are placeholders for your own index reference and chapter query :)
>   let $vals := cts:values($index-reference-with-book-identifiers, 0, (),
>                           cts:query("for chapters"))
>   return cts:search(/, cts:element-range-query(xs:QName("bookid"), "=",
>                                                $vals))
>
> This runs quickly because it's all XQuery on the server.
>
> There's no corresponding technique in the search API at this time,
> although with a custom constraint one could accomplish it. It's on my todo
> list to provide examples of this kind of custom code... The technique
> would be to make an XQuery module with this kind of code in it, and then
> define a custom constraint that would invoke it, so
>
>   book:term-in-chapter
>
> would put "term-in-chapter" into the first clause above, and then the
> extension would return results from the cts:search.
>
> Hope that helps a little. There's good discussion of this technique in
> the "Inside MarkLogic" paper by Jason Hunter.
>
> Charles
>
> On 01/08/2013 07:29 AM, Jakob Fix wrote:
>
> Hi,
>
> Is it possible to find documents based on a criterion match on a related
> document with the search:search (or cts:search) API? To explain:
>
> I have separate documents for books and chapters, where each chapter has a
> reference to its parent book via an identifier, and I would like to find
> books which contain a specific term either themselves or in one of their
> chapters. I can't seem to find a way with cts:query to make something
> like a left join in SQL.
>
> thanks,
> Jakob.
>
> _______________________________________________
> General mailing list
> [email protected]
> http://developer.marklogic.com/mailman/listinfo/general
>
> --
> Charles Greer
> Senior Engineer
> MarkLogic Corporation
> www.marklogic.com
