Re: LuceneRAR project announcement

Joseph Ottinger Wed, 19 Jan 2005 12:34:45 -0800

On Wed, 19 Jan 2005, Erik Hatcher wrote:

> On Jan 19, 2005, at 2:27 PM, Joseph Ottinger wrote:
> > After babbling endlessly about an RDBMS directory and my lack of success
> > with it, I've created a project on java.net to create a Lucene JCA
> > component, to allow J2EE components to interact with a Lucene service.
> > It's at https://lucenerar.dev.java.net/ currently.
>
> Could you elaborate on some use cases?


Sure, and I'll pick the one that's been driving me along:

I have a set of J2EE servers, all of which can generate new content for
search, and all of which will be performing searches. They're on separate
machines. Sharing directories isn't my idea of "doing J2EE correctly."

Therefore, I chose to represent Lucene as an enterprise service, one
reached through a remote interface, so that every module can communicate
with Lucene without being aware of the communication layer... for the
most part. Plus, I no longer violate my purist's sensibilities.
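
As a rough illustration of that idea (not the actual LuceneRAR API), the
sketch below shows callers depending only on a small service interface.
The names SearchService and InMemorySearchService are hypothetical, and
the in-memory class is merely a stand-in: a real implementation would
forward the same calls to the shared Lucene service.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical facade: the only thing a J2EE module would see.
// Whether the index is local or remote is hidden behind it.
interface SearchService {
    void add(String id, String text);

    // Returns the ids of documents whose text contains the term.
    List<String> search(String term);
}

// Stand-in implementation used only to make the sketch runnable.
// It does a naive case-insensitive substring match, not a Lucene query.
final class InMemorySearchService implements SearchService {
    private final List<String[]> docs = new ArrayList<>();

    @Override
    public void add(String id, String text) {
        docs.add(new String[] {id, text});
    }

    @Override
    public List<String> search(String term) {
        String needle = term.toLowerCase();
        List<String> ids = new ArrayList<>();
        for (String[] doc : docs) {
            if (doc[1].toLowerCase().contains(needle)) {
                ids.add(doc[0]);
            }
        }
        return ids;
    }
}
```

Swapping the stand-in for a remote-backed implementation would not change
any calling code, which is the point of treating search as a service.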

> What drove you to consider JCA rather than some other technique?  I'm
> curious why it is important to get all J2EE with it rather than working
> with Lucene much more naturally at a lower level of abstraction.

JCA allows me to provide it as a system service instead of as a dependency
represented at each component layer. An EJB would have served almost as
well, except an EJB has filesystem restrictions that a Connector does not.

> I briefly browsed the source tree from java.net and saw this comment in
> your Hits.java:
>
> "This method loads a LuceneRAR hits object with its equivalent from the
> Apache Lucene Hits object. It basically walks the Lucene Hits object,
> copying values as it goes, so it may not be as light or fast as its
> Apache equivalent"
>
> I'll say!

Haha, it's good to see my propensity for understatement is still alive. :)

The Hits object could CERTAINLY use optimization - callbacks into the
connector would probably be acceptable, for example. The code you were
looking at has a lot of other areas that are, um, surprisingly crippled as
well.

For example, the add() method... well, first, THAT's the signature. Yes,
that's right. It adds constant text. Every time.

Likewise, the super-flexible "search()" -- again, that's the signature. It
searches for "time." That's it. Nothing more. Nothing less.

This is very much a first-cut "can I get it working?" version. I think,
for very limited definitions of "working," the answer is "yes." I
certainly don't think it's got that show-room floor gleam going for it
yet.

> For large result sets, which are more often the norm than the exception
> for a search, you are going to take a huge performance hit doing
> something like this, not to mention possibly even killing the process
> as you run out of RAM.

*nod* As stated, a callback would be far preferable. Given that
Lucene's internal Hits object is final and nonserializable, at least my
client's Hit object gives me an opportunity to do that.
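
A minimal sketch of the callback approach, under the assumption that hits
can be visited one at a time in rank order. HitHandler and HitSource are
hypothetical names, not LuceneRAR or Lucene classes, and the hit data here
is plain arrays rather than a real index.

```java
import java.util.List;

// Hypothetical callback: invoked once per hit, in rank order.
interface HitHandler {
    // Return false to stop the walk early.
    boolean onHit(int rank, String docId, float score);
}

// Hypothetical holder of results on the service side.
final class HitSource {
    private final List<String> docIds;
    private final float[] scores;

    HitSource(List<String> docIds, float[] scores) {
        if (docIds.size() != scores.length) {
            throw new IllegalArgumentException("ids and scores differ in length");
        }
        this.docIds = docIds;
        this.scores = scores;
    }

    // Pushes hits to the handler one by one instead of copying them all
    // into a result object. Returns how many hits were visited.
    int forEachHit(HitHandler handler) {
        int visited = 0;
        for (int i = 0; i < docIds.size(); i++) {
            visited++;
            if (!handler.onHit(i, docIds.get(i), scores[i])) {
                break;
            }
        }
        return visited;
    }
}
```

A caller that only wants the first page of results returns false once it
has enough, so the rest of a large result set is never materialized.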

> JCA sounds like an unnecessary abstraction around Lucene - though I'm
> open to be convinced otherwise.

I'm more than happy to talk about it. If I can fulfill my needs with no
code, hey, that's great! I just haven't been able to successfully do so
yet, and everyone to whom I've spoken who says that they HAVE managed...
well, they've almost invariably done so by lowering the bar a great deal
in order to accept what Lucene requires.

I'm certainly not castigating those who've done this - in fact, in many
ways, I'm very impressed. It's just something I'd prefer not to do, given
any alternative.

-----------------------------------------------------------------------
Joseph B. Ottinger                             http://enigmastation.com
IT Consultant                                    [EMAIL PROTECTED]


