On Fri, Jul 1, 2011 at 9:28 AM, mark harwood <markharw...@yahoo.co.uk> wrote:
>>> I think what would be best is a smallish but feature complete demo,
>
> For the nested stuff I had a reasonable demo on LUCENE-2454 that was based
> around resumes - that use case has the one-to-many characteristics that lends
> itself to nested e.g. a person has many different qualifications and records of
> employment.
> This scenario was illustrated here:
> http://www.slideshare.net/MarkHarwood/proposal-for-nested-document-support-in-lucene
>
> I also had the "book search" type scenario where a book has many sections and
> for the purposes of efficient highlighting/summarisation these sections were
> treated as child docs which could be read quickly (rather than highlighting a
> whole book)

I think both resumes and book search, and also others like the variants of a
product SKU, would all make good examples for the nested docs use case.

> I'm not sure what the "parent" was in your doctor and cities example, Mike.
> If a doctor is in only one city then there is no point making city a child doc
> as the one city info can happily be combined with the doctor info into a single
> document with no conflict (doctors have different properties to cities).
> If the city is the parent with many child doctor docs that makes more sense but
> feels like a less likely use case e.g. "find me a city with doctor x and a
> different doctor y"
> Searching for a person with excellent java and preferably good lucene skills
> feels like a more real-world example.

In my example the city was parent -- I raised this example to explain that
index-time joining is more general than just nested docs (ie, I think we
should keep the name "join" for this module... also because we should
factor out more general search-time-only join capabilities into it).

> It feels like documenting some of the trade-offs behind index design choices is
> useful too e.g. nesting is not too great for very volatile content with
> constantly changing children while search-time join is more costly in RAM and
> 2-pass processing

+1, especially once we've factored out generic joins.

Mike