Hi Geoffrey,
I'm hoping to have some time to look at Lucene integration after we put
10.5 to bed. In the meantime, I was wondering if you have any experience
with implementations of Lucene Directory which place the Lucene indexes
inside a relational database? According to the following link, people
have been disappointed with the performance of this approach (I don't
know exactly what that means). At first blush, however, the approach
seems like an attractive way to keep the Lucene indexes transactionally
consistent with the original character data:
http://wiki.apache.org/lucene-java/LuceneFAQ#head-e55d8e6971f9f01daaf3e14ce1d2f34485adba6e
Thanks,
-Rick
Rick Hillegas wrote:
Hi Geoffrey,
I'm on the road right now but I'd like to make some suggestions after
I gather my thoughts and get over my jet lag. I think that it is
definitely possible to hook into the query processing layer in order
to fork the tuple stream so that a listener process can populate the
Lucene indexes. I think that scraping the replication log stream would
raise a lot of issues around when work is really committed vs. when
savepoints are rolled back, and I would recommend against that approach.
Regards,
-Rick
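
To make the commit vs. savepoint concern concrete, here is a minimal
sketch of a listener-side buffer that holds row changes until the
transaction outcome is known. The class and method names
(PendingIndexChanges, record, savepoint, rollbackTo, commit) are
hypothetical and are not part of Derby or Lucene; the changes are plain
strings standing in for whatever the tuple stream would deliver.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical buffer: holds row changes until the transaction outcome is known. */
class PendingIndexChanges {
    private final List<String> pending = new ArrayList<>();

    /** Record a change seen in the tuple stream; nothing reaches the index yet. */
    void record(String change) {
        pending.add(change);
    }

    /** Return a mark for the current position so later changes can be discarded. */
    int savepoint() {
        return pending.size();
    }

    /** Drop every change recorded after the given savepoint mark. */
    void rollbackTo(int mark) {
        if (mark < 0 || mark > pending.size()) {
            throw new IllegalArgumentException("unknown savepoint mark: " + mark);
        }
        pending.subList(mark, pending.size()).clear();
    }

    /** Transaction aborted: discard everything. */
    void rollback() {
        pending.clear();
    }

    /** Transaction committed: return the surviving changes for indexing and reset. */
    List<String> commit() {
        List<String> survivors = new ArrayList<>(pending);
        pending.clear();
        return survivors;
    }
}
```

With a buffer like this, only the list returned by commit() would be
handed to the Lucene writer, so work undone by a savepoint rollback
never reaches the index.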
Geoffrey Hendrey wrote:
Ok, well on to plan B then. Is there some stage in the preparation of
inserts, updates, and deletes at which the logical identity of a row
is established? That could be a good place to provide a lucene hook,
or a more general interceptor.
On Mar 18, 2009, at 6:55 AM, Jørgen Løland <[email protected]>
wrote:
Geoff Hendrey wrote:
I've been following Knut's pointers and reading the docs on the classes
that marshal themselves over the wire via their writeObject method.
So, question about this:
"Type=update, Table=employee, Page=4321, Index=4, field 3=50000"
Do the page and index, collectively, constitute a "row ID"?
If that pair is always constant, then these three fields are sufficient
to permanently identify the row, and we can use that information to
constitute a document ID in Lucene.
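
If the (page, index) pair turns out not to be stable, one alternative is
to derive the document ID from the row's logical key instead of its
physical location. The sketch below assumes the indexed table has a
primary key, which this thread does not establish; DocumentIds and
fromPrimaryKey are hypothetical names, not Derby or Lucene APIs.

```java
import java.util.StringJoiner;

/** Hypothetical helper: builds a document ID from a table name and key values. */
final class DocumentIds {
    private DocumentIds() {}

    /**
     * Joins the table name and the primary-key column values with '|'.
     * Assumption: the values themselves contain no '|'; a real
     * implementation would need to escape or encode them.
     */
    static String fromPrimaryKey(String table, Object... keyValues) {
        if (keyValues.length == 0) {
            throw new IllegalArgumentException("need at least one key column");
        }
        StringJoiner id = new StringJoiner("|", table + "|", "");
        for (Object value : keyValues) {
            id.add(String.valueOf(value));
        }
        return id.toString();
    }
}
```

For example, DocumentIds.fromPrimaryKey("employee", 1042) yields
"employee|1042", and that value does not depend on which page the
record currently lives on.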
It's constant until the record is moved to another page (which means
"no", really).