Yes, I mean exactly what you described: create an aggregate document. Whether 
that works does depend fully on the size of the files. We needed facet counts 
for a number of different items, so putting everything in the same book 
document worked nicely for our application.
We did not try to do much manipulation on those aggregate documents, as we did 
all of the edits on the original (smaller) chapter files.

As for the size, our search is user driven: we return about 20 records at a 
time, and the user pages to get more results. We did find that returning too 
many results at once caused problems, because each of those documents has to 
be opened to build the search response.
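
As a rough sketch of that kind of paging, assuming the aggregate documents 
live under /docs/fullbooks/ (the directory from the example further down the 
thread) and that the query text and page number are hypothetical inputs from 
the application, something like this keeps each response to 20 results:

```xquery
xquery version "1.0-ml";
import module namespace search = "http://marklogic.com/appservices/search"
  at "/MarkLogic/appservices/search/search.xqy";

(: Hypothetical inputs: the user's query text and a 1-based page number. :)
let $qtext := "term"
let $page := 2
let $page-length := 20
(: search:search takes the 1-based index of the first result to return. :)
let $start := ($page - 1) * $page-length + 1
let $options :=
  <options xmlns="http://marklogic.com/appservices/search">
    <!-- Restrict the search to the aggregate documents (assumed location). -->
    <additional-query>{
      cts:directory-query("/docs/fullbooks/", "infinity")
    }</additional-query>
  </options>
return search:search($qtext, $options, $start, $page-length)
```

Only the documents on the requested page are touched when the response is 
built, which is why a small page length matters with large aggregates.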

Brad Rix
Technical Lead
Office: +1.303.542.2172
Mobile: +1.303.915.2771
Fax: +1.303.544.0522
IM/Gmail Name: bradford.rix

www.FlatironsSolutions.com
[email protected]

From: [email protected] 
[mailto:[email protected]] On Behalf Of Jakob Fix
Sent: Tuesday, January 08, 2013 2:51 PM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] "left join" using search:search or 
cts:search?

Hi Brad,

I think I understand what you mean by "materialising" the content, but just to 
make sure:
Do you suggest that we should generate a separate aggregate document for each 
book and its containing chapters (and then of course also figures and tables 
while we are at it)?

I'm a bit worried that this might get big relative to the document sizes 
recommended as optimal for MarkLogic, as we might have up to many dozens of 
chapters, each with its metadata, abstract, etc., plus hundreds of tables and 
figures. But we'll give it some thought. Also, I always wanted to tinker with 
CPF! ;-)

cheers,
Jakob.

On Tue, Jan 8, 2013 at 7:50 PM, Rix, Brad <[email protected]> wrote:
An alternate solution is to materialize your content: include the chapter 
content in the book content under a new URI. The materialized book would have 
pointers back to the original assets (if necessary). This does duplicate the 
storage requirements, but it gives you full flexibility to use the search APIs 
with all of their capabilities.

Then you would direct your search to the materialized URI locations instead of 
the individual documents.

You could also have a CPF pipeline that triggers to create a materialized 
version once a child (chapter) or top-level asset is updated, or you could 
create the materialized version yourself as you update/insert the chapter 
documents.

For example, if you have the following structure:

/docs/assets/book.xml
/docs/assets/chapter.xml

You could materialize into
/docs/fullbooks/fullbook.xml

Then search /docs/fullbooks instead of /docs/assets.
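
A minimal sketch of the materialization step, run by hand rather than from 
CPF. The <book>, <chapter>, <bookid>, <fullbook> and <source> element names 
are assumptions for illustration only; adjust them to the real schema:

```xquery
xquery version "1.0-ml";

(: Assumed layout: <book> and <chapter> documents both carry a <bookid>. :)
let $book := fn:doc("/docs/assets/book.xml")/book
(: Find the chapters under /docs/assets/ that point at this book. :)
let $chapters :=
  cts:search(/chapter,
    cts:and-query((
      cts:directory-query("/docs/assets/", "infinity"),
      cts:element-value-query(xs:QName("bookid"), fn:string($book/bookid))
    )))
(: Copy book and chapters into one document, with pointers to the originals. :)
let $fullbook :=
  <fullbook>
    <source uri="{xdmp:node-uri($book)}"/>
    {$book}
    {
      for $chapter in $chapters
      return (<source uri="{xdmp:node-uri($chapter)}"/>, $chapter)
    }
  </fullbook>
return xdmp:document-insert("/docs/fullbooks/fullbook.xml", $fullbook)
```

The same code could be the body of a CPF action or be called from whatever 
code inserts or updates the chapter documents.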

Brad Rix

From: [email protected] 
[mailto:[email protected]] On Behalf Of Damon Feldman
Sent: Tuesday, January 08, 2013 11:36 AM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] "left join" using search:search or 
cts:search?

Jakob,

I want to highlight that Charles' code uses range index values on both "sides" 
of the join. He pulls values using cts:values(), so they come from a range 
index. Then he uses cts:element-range-query() to do the shotgun OR. This is 
important because the values will line up properly, provided both indexes use 
the same collation (either the default or, if you have set a non-default 
collation, the same one on both).

Word queries apply stemming and ignore punctuation, and even value queries can 
match a little differently from the values stored in the range indexes, so use 
range indexes on both sides of the join.

Yours,
Damon


From: [email protected] 
[mailto:[email protected]] On Behalf Of Charles Greer
Sent: Tuesday, January 08, 2013 10:56 AM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] "left join" using search:search or 
cts:search?

Hi Jakob,

The way to accomplish this kind of thing using cts:search is pretty well 
understood right now. Use a "scatter query" or "shotgun OR" technique: use 
cts:values to fetch the identifiers from the chapter documents that interest 
you (with a subquery as needed), then use the output of that call as input to 
a cts:search on book documents.

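(: $index-reference-with-book-identifiers and $query-for-chapters are
   placeholders: a range index reference on the chapters' book identifier
   and a cts:query that selects the chapters of interest. :)
let $vals := cts:values($index-reference-with-book-identifiers, 0, (),
  $query-for-chapters)
return cts:search(/, cts:element-range-query(xs:QName("bookid"), "=", $vals))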

This runs quickly because it's all XQuery on the server.
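
To make that concrete for the original question (books that contain a term 
themselves or in one of their chapters), here is a sketch. It assumes a range 
index on a <bookid> element that appears in both book and chapter documents, 
and hypothetical <book> and <chapter> root elements; none of those names come 
from the actual data:

```xquery
xquery version "1.0-ml";

(: Assumptions: range index on <bookid>; <book> and <chapter> root elements. :)
let $term := "term"
let $bookid := xs:QName("bookid")
(: Step 1: book identifiers taken from chapters that contain the term. :)
let $ids-from-chapters :=
  cts:values(
    cts:element-reference($bookid), (), (),
    cts:and-query((
      cts:element-query(xs:QName("chapter"), cts:and-query(())),
      cts:word-query($term)
    )))
(: Step 2: books that match the term themselves OR are referenced by one
   of those chapters (the "shotgun OR" over the identifier values). :)
return
  cts:search(/book,
    cts:or-query((
      cts:word-query($term),
      cts:element-range-query($bookid, "=", $ids-from-chapters)
    )))
```

Both sides of the join go through the same range index, so the identifier 
values compare exactly.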


There's no corresponding technique in search API at this time, although with a 
custom constraint one could accomplish it.  It's on my todo list to provide 
examples of this kind of custom code...  The technique would be to make an 
XQuery module with this kind of code in it, and then define a custom constraint 
that would invoke it, so

book:term-in-chapter

would put "term-in-chapter" into the first clause above, and then the extension 
would return results from the cts:search.
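
A sketch of what such a module might look like, using the parse-function 
signature for Search API custom constraints. The namespace, module path, and 
element names are hypothetical, and it again assumes a range index on 
<bookid> in both book and chapter documents:

```xquery
xquery version "1.0-ml";
module namespace book = "http://example.com/book-constraint";

(: Hypothetical wiring in the search options:
   <constraint name="book">
     <custom facet="false">
       <parse apply="term-in-chapter"
              ns="http://example.com/book-constraint"
              at="/lib/book-constraint.xqy"/>
     </custom>
   </constraint>
   With that in place, the query text book:sometext calls the function
   below with "sometext" on the right-hand side. :)
declare function book:term-in-chapter(
  $constraint-qtext as xs:string,
  $right as schema-element(cts:query)
) as schema-element(cts:query)
{
  (: The text the user typed after "book:". :)
  let $term := fn:string($right//cts:text)
  let $bookid := xs:QName("bookid")
  (: First clause: identifiers of books whose chapters contain the term. :)
  let $ids :=
    cts:values(
      cts:element-reference($bookid), (), (),
      cts:and-query((
        cts:element-query(xs:QName("chapter"), cts:and-query(())),
        cts:word-query($term)
      )))
  (: Return the range query as XML so the Search API can combine it with
     the rest of the parsed query. :)
  return <root>{ cts:element-range-query($bookid, "=", $ids) }</root>/*
};
```

Unlike the plain cts:search version, this returns a query rather than 
results, so search:search can still apply paging, facets, and snippeting.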

Hope that helps a little.  There's good discussion of this technique in the 
"Inside MarkLogic" paper by Jason Hunter.

Charles

On 01/08/2013 07:29 AM, Jakob Fix wrote:
Hi,

Is it possible to find documents based on a criteria match on a related 
document with the search:search (or cts:search) api ? To explain:

I have separate documents for books and chapters, where each chapter has a 
reference to their parent book via an identifier, and I would like to find 
books which contain either a specific term themselves or in one of their 
chapters. I can't seem to find a way with the cts:query to make something like 
a left join in SQL.

thanks,
Jakob.


_______________________________________________
General mailing list
[email protected]
http://developer.marklogic.com/mailman/listinfo/general


--
Charles Greer
Senior Engineer
MarkLogic Corporation
[email protected]
Phone: +1 707 408 3277
www.marklogic.com
