Re: revisit naming for grouping/join?

Michael McCandless Sun, 03 Jul 2011 08:26:13 -0700

On Fri, Jul 1, 2011 at 9:28 AM, mark harwood <markharw...@yahoo.co.uk> wrote:
>>> I think what would be best is a smallish but feature complete demo,
>
> For the nested stuff I had a reasonable demo on LUCENE-2454 that was based
> around resumes - that use case has the one-to-many characteristics that lend
> themselves to nesting, e.g. a person has many different qualifications and
> records of employment.
> This scenario was illustrated here:
> http://www.slideshare.net/MarkHarwood/proposal-for-nested-document-support-in-lucene
>
> I also had the "book search" type scenario where a book has many sections and
> for the purposes of efficient highlighting/summarisation these sections were
> treated as child docs which could be read quickly (rather than highlighting a
> whole book)


I think both resumes and book search, and also others like the
variants of a product SKU, would all make good examples for the nested
docs use case.

> I'm not sure what the "parent" was in your doctor and cities example, Mike.
> If a doctor is in only one city then there is no point making city a child
> doc, as the one city info can happily be combined with the doctor info into a
> single document with no conflict (doctors have different properties to cities).
> If the city is the parent with many child doctor docs that makes more sense,
> but feels like a less likely use case e.g. "find me a city with doctor x and a
> different doctor y".
> Searching for a person with excellent java and preferably good lucene skills
> feels like a more real-world example.

In my example the city was parent -- I raised this example to explain
that index-time joining is more general than just nested docs (ie, I
think we should keep the name "join" for this module... also because
we should factor out more general search-time-only join capabilities
into it).
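
To make the city-as-parent case concrete, here is a minimal toy sketch in
plain Java. It uses no Lucene classes; the names BlockJoinSketch, addBlock
and citiesWithAll are hypothetical and exist only for illustration. It
assumes the index-time join is expressed by adding each city's doctor docs
as a contiguous block, with the city (parent) doc last, so that a single
in-order pass can answer "find me a city with doctor x and a different
doctor y".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy illustration only: an in-memory list stands in for the index.
class BlockJoinSketch {
    // One "document": a type tag ("doctor" or "city") and a name.
    static final class Doc {
        final String type;
        final String name;
        Doc(String type, String name) { this.type = type; this.name = name; }
    }

    // Docs in index order: each block is its children followed by the parent.
    private final List<Doc> index = new ArrayList<>();

    // Index-time join: child doctor docs first, then the parent city doc.
    void addBlock(String city, String... doctors) {
        for (String d : doctors) index.add(new Doc("doctor", d));
        index.add(new Doc("city", city)); // the parent closes the block
    }

    // Single pass: collect children until a parent is reached, then test it.
    List<String> citiesWithAll(String... wanted) {
        List<String> hits = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        for (Doc doc : index) {
            if (doc.type.equals("doctor")) {
                seen.add(doc.name);
            } else {
                if (seen.containsAll(Arrays.asList(wanted))) hits.add(doc.name);
                seen.clear(); // the next block starts fresh
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        BlockJoinSketch idx = new BlockJoinSketch();
        idx.addBlock("Boston", "x", "y");
        idx.addBlock("Denver", "x");
        System.out.println(idx.citiesWithAll("x", "y")); // prints [Boston]
    }
}
```

The sketch also hints at the cost of the index-time approach: because the
relationship lives in the block layout, changing one doctor would mean
re-adding that city's whole block.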

> It feels like documenting some of the trade-offs behind index design choices
> is useful too e.g. nesting is not too great for very volatile content with
> constantly changing children while search-time join is more costly in RAM and
> 2-pass processing

+1, especially once we've factored out generic joins.

Mike

