Re: [Ferret-talk] Multithreading / multiprocessing woes

Scott Davies Wed, 21 Nov 2007 14:07:27 -0800

On Nov 21, 2007 12:24 PM, Erik Hatcher <[EMAIL PROTECTED]> wrote:
>
> How so?   It's a "search server" with the same goals that I imagine
> you'd have for the JRuby+Lucene+DRb combination.


It's a bit more than I need right out of the gate, what with the
caching, replication, faceted search, etc.  Of course, that might not
be a problem if it uses sensible configuration defaults I can safely
ignore to start with.

> It's not really complicated, especially with the solr-ruby library.
> Add documents, delete them, query for them.  Leverage highlighting
> and more-like-this features, dismax querying, etc.

My particular application does enough weird things that, for the most
part, I'd prefer unfettered access to the low-level Lucene APIs.  (For
example, my application uses a lot of gnarly query trees involving
filters and ranges, and I'm not sure whether those are easily
transmitted through the Solr APIs.  Then I have "run all of these
queries against each of the documents in this specific set and tell me
which document/query pairs match in one fell swoop" routines, in which
case it might be a good idea to copy the documents into a temporary
RAM index to run the queries against.)
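
To make that last idea concrete, here is a minimal sketch in plain Ruby of
the "copy the documents into a throwaway in-memory index, then run every
query against it" pattern. It is only an illustration: TinyRamIndex and
matching_pairs are hypothetical names, not Ferret, Lucene, or Solr APIs,
and a "query" here is simply a list of terms that must all be present,
nothing like a real query tree with filters and ranges.

```ruby
require 'set'

# Toy stand-in for a temporary RAM index: maps each term to the ids of the
# documents that contain it. Hypothetical class, not a Ferret/Lucene API.
class TinyRamIndex
  def initialize
    @postings = Hash.new { |hash, term| hash[term] = Set.new }
    @doc_ids  = Set.new
  end

  # Tokenize on word characters and record which document each term is in.
  def add(doc_id, text)
    @doc_ids << doc_id
    text.downcase.scan(/\w+/).each { |term| @postings[term] << doc_id }
  end

  # Return the ids of documents containing ALL of the given terms.
  def search(terms)
    terms.map { |term| @postings.fetch(term.downcase, Set.new) }
         .reduce(@doc_ids) { |acc, ids| acc & ids }
  end
end

# Build a throwaway index over just this document set, run every query
# against it once, and collect the [doc_id, query_name] pairs that match.
def matching_pairs(docs, queries)
  index = TinyRamIndex.new
  docs.each { |doc_id, text| index.add(doc_id, text) }
  queries.flat_map do |name, terms|
    index.search(terms).map { |doc_id| [doc_id, name] }
  end
end

docs = {
  1 => "Ferret is a Ruby search library",
  2 => "Lucene is a Java search library"
}
queries = { ruby_search: %w[ruby search], java: %w[java] }
p matching_pairs(docs, queries)
```

The point of the temporary index is that each query is evaluated once
against the whole set rather than once per document, which is the same
reason a real RAM-backed index might pay off for this kind of batch check.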

>
> > , (2) not particularly well-documented given its size
>
> Wow.   Have you seen the Solr wiki?   http://wiki.apache.org/solr -
> there are nooks and crannies documented on that wiki that go well
> beyond what I'd consider good documentation.
>
> By all means point me to areas that aren't documented that you need
> to know (off list) and I'll get those taken care of.

Wikis are fine for looking up details when you already mostly know
what you're doing, but they're not nearly as useful when you're in the
earlier stages trying to get the big "What does this system look like
and how does it work?" picture and evaluate initial plans of attack.
Ferret and Lucene both have entire *books* written about them that are
*excellent* for those purposes.  (They're not free-as-in-beer, but are
well worth the cost.)  By comparison, Solr has a very simple "here is
how you get a straightforward app off the ground" tutorial that says
little about how Solr is actually organized, and then you're basically
left staring at a wiki page with a thousand bullet points and no clear
path to big-picture enlightenment.  And given the choice between (1)
using a lower-level system that's been very well-documented in a
well-organized explanatory fashion and (2) using a slightly
higher-level system I still haven't acquired a mental "big picture"
for, I generally find (1) more productive.

This isn't a criticism of Solr's documentation nearly as much as a
hearty "Book-style documentation is useful, and, holy crap, Ferret and
Lucene actually HAVE IT.  Woohoo!", plus an added bonus testament to
my own laziness.

> > (3) likely to get in my way when I want to do anything low-level and
> > gnarly with Lucene.
>
> Maybe, but not much in your way.  You'd have to wrap your low-level
> mojo inside some Solr API perhaps, but not even if we're just talking
> about custom analyzers or similarity implementation.

Yeah, my guess is that if I sit down and figure out how Solr is laid
out, adding APIs to do what I want won't be too hard.  Might still be
kind of tedious implementing all the necessary marshaling, though.

-- Scott
_______________________________________________
Ferret-talk mailing list
[email protected]
http://rubyforge.org/mailman/listinfo/ferret-talk
