On Sep 6, 2008, at 4:36 AM, Otis Gospodnetic wrote:
Regarding real-time search and Solr, my feeling is the focus should
be on first adding real-time search to Lucene, and then we'll figure
out how to incorporate that into Solr later.
I've read Jason's Wiki as well. Actually, I had to read it a number
of times to understand bits and pieces of it. I have to admit there
is still some fuzziness about the whole thing in my head - is
"Ocean" something that already works, a separate project on
googlecode.com? I think so. If so, and if you are working on
getting it integrated into Lucene, would it be clearer to just
refer to it as "real-time search"?
If this is to be initially integrated into Lucene, why are things
like replication, crowding/field collapsing, locallucene, name
service, tag index, etc. all mentioned there on the Wiki and bundled
with description of how real-time search works and is to be
implemented? I suppose mentioning replication kind of makes sense
because the replication approach is closely tied to real-time search
- all query nodes need to see index changes fast. But Lucene itself
offers no replication mechanism, so maybe the replication is
something to figure out separately, say on the Solr level, later on
"once we get there". I think even just the essential real-time
search requires substantial changes to Lucene (I remember seeing
large patches in JIRA), which makes it hard to digest, understand,
comment on, and ultimately commit (hence the lukewarm response, I
think). Bringing other non-essential elements into the discussion at
the same time makes it more difficult to process all this new stuff,
at least for me. Am I the only one who finds this hard?
Yeah, I agree. There's a place for RT search in Lucene, but it seems
to me we have a pretty good search server in Solr. It needs some
things going forward, but those are reasonable to work on there. It makes
sense to me not to duplicate efforts on all of those fronts and have
two projects/communities that share > 80-90% of their functionality
(either existing, or planned). As Yonik says, it may take longer than
just doing it by oneself, but in the long run, the outcome is usually
better.
My two cents,
Grant