It is very hard to give you concrete advice without knowing more about your domain and use cases, but here are two points that came to mind:
1. You can make use of the highlighting features to show the content that matched. Highlighters can return whole blocks of text, and by using positionIncrements correctly you can get this right.

2. Yes, Elasticsearch is a document-oriented store, but is it really necessary for you to index entire books as one document? I would certainly look at indexing sections, chapters, or maybe even pages as single documents that hold a string reference to the book ID. The exception is when you use book-level data along with full-text searches on the texts, and even then I would consider denormalization in some scenarios.

--

Itamar Syn-Hershko
http://code972.com | @synhershko <https://twitter.com/synhershko>
Freelance Developer & Consultant
Author of RavenDB in Action <http://manning.com/synhershko/>

On Thu, Jun 19, 2014 at 10:13 PM, liorg <[email protected]> wrote:

> Well, assuming we have a book type. The book holds a lot of metadata, let's
> say something like the following:
>
> {
>   "author": {
>     "name": "Jose",
>     "lastName": "Martin"
>   },
>   "sections": [{
>     "chapters": [{
>       "pages": [{
>         "pageNum": 1,
>         "numOfChars": 1000,
>         "text": "let my people...",
>         "numofWords": 125
>       },
>       {
>         "pageNum": 2,
>         "numOfChars": 1005,
>         "text": "let my people go...",
>         "numofWords": 150
>       }],
>       "chapterName": "the start"
>     },
>     {
>       "pages": [{
>         "pageNum": 3,
>         "numOfChars": 1000,
>         "text": "will do...",
>         "numofWords": 125
>       },
>       {
>         "pageNum": 4,
>         "numOfChars": 1005,
>         "text": "will do later on...",
>         "numofWords": 150
>       }],
>       "chapterName": "the end"
>     }],
>     "sectionName": "prologue"
>   }]
> }
>
> We want to search for all the pages that have "let my people" in their
> text and more than 100 words.
> So, when we use ES we can use nested objects and query on the nested page
> object, but the actual returned values are the books (parents) that have
> those matching pages.
> Now, if we want to show the user the pages he was looking for, we cannot
> do that, as we get the whole book type returned with all its metadata and
> not just the nested objects that matched the criteria. We need to
> search again (maybe in memory?) for the pages that matched the criteria in
> order to display the user his search results. (The whole type is returned
> as ES does not yet support returning the nested objects that matched the
> criteria.)
>
> I hope it is better understood now.
>
> On Thursday, June 19, 2014 7:22:13 PM UTC+3, Itamar Syn-Hershko wrote:
>
>> This is usually something that's being solved using parent-child, but the
>> question here really is what do you mean by needing to retrieve both books
>> & pages.
>>
>> Can you describe the actual scenario and what you are trying to achieve?
>>
>> On Thu, Jun 19, 2014 at 7:12 PM, liorg <[email protected]> wrote:
>>
>>> Hi,
>>>
>>> We have a somewhat complex type holding some nested docs with arrays
>>> (let's assume a hierarchy of books, and for each book we have an array of
>>> pages containing its metadata).
>>>
>>> We want to search for the nested doc (search for all the books that
>>> have the term "XYZ" in one of their pages), but we want to get back not
>>> only the book, but the pages themselves.
>>>
>>> We've understood that it's problematic to achieve with ES (see
>>> https://github.com/elasticsearch/elasticsearch/issues/3022).
>>>
>>> We have a problem achieving it with a parent-child model, as the data
>>> model comes from our already existing MongoDB model (and besides, we are
>>> not sure if a parent-child model fits here).
>>>
>>> So...
>>>
>>> 1. Is there a workaround we can use to get the results of the nested
>>> doc (the actual pages)?
>>> 2. If not, is there a recommended way we can search for the data again
>>> in memory after it was narrowed down by the ES server?
>>> 3. Any advice will be appreciated, as this is quite a big obstacle in our
>>> way to implementing a solution using ES.
>>>
>>> Thanks,
>>>
>>> Lior
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "elasticsearch" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to [email protected].
>>>
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/elasticsearch/7602d608-5730-472e-8259-763ff29614ea%40googlegroups.com
>>> <https://groups.google.com/d/msgid/elasticsearch/7602d608-5730-472e-8259-763ff29614ea%40googlegroups.com?utm_medium=email&utm_source=footer>.
>>> For more options, visit https://groups.google.com/d/optout.
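
To illustrate point 2 of the reply at the top of this thread (indexing pages as single documents with a string reference to the book ID), here is a minimal sketch in Python. It only builds plain dictionaries and does not talk to a cluster. The function names, the flat field names such as "bookId" and "authorName", and the example ID "book-1" are assumptions made for illustration and do not come from the thread. The query body uses the bool/filter form found in current Elasticsearch versions; the 1.x releases that were current in 2014 expressed the same idea with a filtered query, so treat the query shape as an assumption to check against your version.

```python
from typing import Any, Dict, Iterator


def flatten_book_to_pages(book_id: str, book: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """Yield one flat document per page of the nested book structure.

    Book-level fields (author, section and chapter names) are copied onto
    every page document, i.e. denormalized, and the page keeps a string
    reference to its book through the hypothetical "bookId" field.
    """
    author = book.get("author", {})
    for section in book.get("sections", []):
        for chapter in section.get("chapters", []):
            for page in chapter.get("pages", []):
                yield {
                    "bookId": book_id,
                    "authorName": author.get("name"),
                    "authorLastName": author.get("lastName"),
                    "sectionName": section.get("sectionName"),
                    "chapterName": chapter.get("chapterName"),
                    "pageNum": page.get("pageNum"),
                    "numOfChars": page.get("numOfChars"),
                    "numofWords": page.get("numofWords"),
                    "text": page.get("text"),
                }


def page_query(phrase: str, min_words: int) -> Dict[str, Any]:
    """Build a search body: phrase match on the page text, plus a filter
    keeping only pages with more than `min_words` words."""
    return {
        "query": {
            "bool": {
                "must": [{"match_phrase": {"text": phrase}}],
                "filter": [{"range": {"numofWords": {"gt": min_words}}}],
            }
        }
    }
```

With pages indexed this way, each hit is itself a page, so a search such as page_query("let my people", 100) returns the matching pages directly, and the "bookId" field leads back to the book. The trade-off is that book-level fields are duplicated on every page and have to be rewritten when they change.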
