RE: revisit naming for grouping/join?

Steven A Rowe Wed, 06 Jul 2011 12:12:20 -0700

From my external POV on this debate, it seems as though the main point of 
contention is naming the nature of the relation between documents.


Instead of doing that, a name that says there is some form of relation, 
but leaves its nature open, might work: something like "docrelation"?  
(Avoiding the "related documents" IR concept would be important here.)

Steve

> -----Original Message-----
> From: Chris Hostetter [mailto:hossman_luc...@fucit.org]
> Sent: Wednesday, July 06, 2011 2:59 PM
> To: dev@lucene.apache.org
> Subject: Re: revisit naming for grouping/join?
> 
> 
> : Also... I think we are over-thinking the name ;)  We can't convey
> : *everything* in this name; as long as the name makes it clear that
> : you'll want to consider this / read its javadocs whenever doing
> : something with "nested docs", I think that's sufficient.  I think
> : NestedQueryWrapper (maybe NestedDocsQuery) and NestedDocsCollector are
> : good enough, at least better than the functional-driven names they now
> : have...
> 
> Yeah, that's fair ... i'm not in love with NestedDocsQuery and
> NestedDocsCollector but i agree they are better than what we have now.
> 
> : Honestly at this point I'm tempted to just stick with what we have
> : (the functionally driven names, instead of the dominant use case
> : driven name).
> :
> : At its heart, this query is performing a join (well, finishing the
> : join that was done during indexing), and despite our efforts to more
> : descriptively capture the dominant use case, I don't think we're
> : succeeding.  We are basically struggling to find ways to explain what
> : a join does, into these class names.
> 
> I really think it's a bad idea to use "Join" in the name ... i understand
> that to you this is a "join", but as you say it's really just finishing a
> join that was already done at index time -- for most users "join" is
> going to have the connotation of a SQL join where you don't have to
> normalize the data in advance (ie: build the index with all the docs you
> want to join in a block) and we shouldn't use it unless we are talking
> about a truly generic query time join -- particularly if we are going to
> use examples in the doc that seem like the kind of thing you would do
> with a query time join in SQL.
> 
> i know you feel like "nested" (or "subdocs" or "parent") undersells the
> *possible* use cases of this feature, but the thing to remember is that
> even in the use cases where the real life data isn't something you might
> think of as being organized in a "nested" or "hierarchical" model, in
> order to use this feature the user must map their source data model to a
> Lucene Document model that *does* capture a hierarchy relationship so
> they can index their data in the appropriate way.  X and Y may not be in a
> hierarchy, but if you want to join them like this, then the Document for
> X and the Document for Y must be thought of as being in a hierarchy and
> indexed in lock step with each other.
> 
> "Block" just doesn't feel like it really conveys this ... but anything
> along the "Nested", "Parent", "Subdoc" line of terminology would at
> least give some point of reference to the idea that the *Document* model in
> Lucene needs to be organized in this way -- and i think it's really
> important that the name make that clear.
> 
> -Hoss
> 
