Mattio,

Searches return fragments, and the number and size of the fragments a search returns often determine performance, especially where disk I/O is the bottleneck. It sounds as if your queries may be returning the entire document rather than the parent fragment that links to your chapter fragments, but it is difficult to know from the information you've provided. I *think* there is a way to return only the parent fragment in your searches, but I confess that I don't know it off the top of my head (perhaps a colleague will remind us).

Frankly, we discourage people from using fragments because they add a layer of complexity to most applications. We of course support them, and they are very useful in some cases, but we typically try to stay away from them. Mike S. suggested what most people do, which is to break the books into chapter documents and to store the book metadata on each chapter, so that you can search the metadata and the contents of each chapter at the same time.
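As a rough illustration of that split, here is a minimal sketch in Python, run outside the database before loading. The element names (book, metadata, chapter) and the chapter-doc wrapper are assumptions made for the example, not anything taken from your markup, so adjust them to your schema.

```python
import copy
import xml.etree.ElementTree as ET


def split_book(book_xml):
    """Return one standalone XML string per chapter, each carrying the book metadata."""
    book = ET.fromstring(book_xml)
    metadata = book.find("metadata")  # hypothetical element name
    docs = []
    for chapter in book.findall("chapter"):  # hypothetical element name
        doc = ET.Element("chapter-doc")  # hypothetical wrapper element
        if metadata is not None:
            # Copy the book-level metadata onto every chapter document.
            doc.append(copy.deepcopy(metadata))
        doc.append(copy.deepcopy(chapter))
        docs.append(ET.tostring(doc, encoding="unicode"))
    return docs


# Made-up sample data, for illustration only.
sample = """<book>
  <metadata><title>Moby Dick</title><author>Herman Melville</author></metadata>
  <chapter><heading>Loomings</heading><p>Call me Ishmael.</p></chapter>
  <chapter><heading>The Carpet-Bag</heading><p>I stuffed a shirt or two.</p></chapter>
</book>"""

chapter_docs = split_book(sample)
```

Each string in chapter_docs would then be loaded as its own document, so a single search can match the title metadata and the chapter text together.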
If you keep each book as a single document, one option you might consider is to use properties, which are themselves a special type of fragment. Fortunately, properties are indexed per your database settings, so you could issue the same search against the properties instead of the document. There are a few ways to do this: with xdmp:directory-properties() or xdmp:collection-properties() if you have your documents in directories or collections, and in 4.1 you have cts:properties-query().

The advantage of the property approach is that in 4.1 you can "join" across your property fragment and your chapter fragments in your searches. So, you could search your chapters and your front matter (e.g., title, author, pub-year, etc.) in the same search. I have to imagine that this will become a requirement at some point. :-)

Kelly

Message: 3
Date: Fri, 21 Aug 2009 20:58:52 -0400
From: Mattio Valentino <[email protected]>
Subject: [MarkLogic Dev General] Searching large documents above the fragment root level.
To: General Mark Logic Developer Discussion <[email protected]>
Message-ID: <[email protected]>
Content-Type: text/plain; charset=ISO-8859-1

I have large documents (books) stored in MarkLogic. My fragment roots are set to the chapter level because we display material at that level and we have a search feature at that level. Performance is good with those queries.

We also have a feature where we want to search at the title level, where a title's metadata is returned as a result if the title contains the search term anywhere within it. I've written this query a number of different ways and I can't get good performance out of it.

There are a number of requirements I'm leaving out, but does anyone have a pattern or general strategy for these types of queries, where you are searching at the document level instead of the fragment root level?

Thanks,
Mattio
