Hi Eirikur,
I recently checked in a first attempt at initializing an index for an existing store (tested with the txfile store).
The indexer scans all docs in the store if there is no index on startup.
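To illustrate the startup behavior described above, here is a minimal sketch. It is not the code that was checked in: `DocumentStore`, `SearchIndex`, `InMemoryIndex` and `StartupIndexer` are hypothetical names invented for this example, and the in-memory index merely stands in for a Lucene index so the sketch runs without extra jars.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical view of a content store: just enough to enumerate documents. */
interface DocumentStore {
    List<String> listUris();
    String readContent(String uri);
}

/** Hypothetical index abstraction; a real one would wrap a Lucene index. */
interface SearchIndex {
    boolean exists();
    void create();
    void add(String uri, String content);
}

/** In-memory stand-in for an index so the sketch runs without Lucene. */
class InMemoryIndex implements SearchIndex {
    private Map<String, String> entries; // null means "no index yet"

    public boolean exists() { return entries != null; }
    public void create() { entries = new LinkedHashMap<>(); }
    public void add(String uri, String content) { entries.put(uri, content); }
    int size() { return entries == null ? 0 : entries.size(); }
}

class StartupIndexer {
    /**
     * Builds the index from the store only when no index exists yet.
     * Returns the number of documents indexed (0 if an index was already there).
     */
    static int initializeIfMissing(DocumentStore store, SearchIndex index) {
        if (index.exists()) {
            return 0; // an index is present, so leave it untouched
        }
        index.create();
        int count = 0;
        for (String uri : store.listUris()) {
            index.add(uri, store.readContent(uri));
            count++;
        }
        return count;
    }
}
```

Calling `StartupIndexer.initializeIfMissing(store, index)` a second time does nothing, which is the point: the full scan only happens when the index is missing.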
Give it a try if you want.
Regards, Stefan
Eirikur Hrafnsson wrote:
Hi James,
do you have time now to update/integrate the batch indexer for Slide? I really need it badly : /
best regards Eirikur, Idega.
On 12.3.2005, at 06:15, James Mason wrote:
Sorry I didn't post this earlier. I was hoping I'd have time to clean it up and actually integrate it into Slide, but I've been completely swamped lately.
I've uploaded a code dump from my first working version to http://cvs.apache.org/~masonjm/batchindexer/
I know that it contains bugs, since I've fixed a few since I made the dump. Also, keep in mind that the code won't work as posted. The only implementation I've made works with Autonomy for the search engine, and I didn't post the piece that actually talks to Autonomy. There's nothing in there that would be useful for Lucene anyway.
To make this generally useful there will need to be an implementation of QueueProcessor that supports Lucene. I've included an example implementation (for Autonomy) that should be a good starting point.
There also needs to be a way to start/stop the batch indexer. I've implemented a Spring-based MVC webapp for controlling it on my server, but I'm not sure if this is the best approach for a more general solution. Also, this is one area I know for sure contains bugs. Someone who actually knows what they're doing should take a look at the run() logic for BatchIndexer to make it properly resumable. My latest version seems to work alright, but this is an earlier snapshot so the logic still has errors.
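One way to think about a resumable run() is sketched below. This is not the BatchIndexer from the code dump: `ResumableBatchJob` and `UriProcessor` are hypothetical names, and the checkpoint is kept in memory here, whereas a real indexer would presumably persist it so a restart could pick up where it left off.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical callback that indexes a single resource. */
interface UriProcessor {
    void process(String uri);
}

/** Hypothetical resumable batch job; calling run() again continues from the checkpoint. */
class ResumableBatchJob implements Runnable {
    private final List<String> uris;
    private final UriProcessor processor;
    private final AtomicBoolean stopRequested = new AtomicBoolean(false);
    // Position of the first URI that has not been processed yet.
    private final AtomicInteger checkpoint = new AtomicInteger(0);

    ResumableBatchJob(List<String> uris, UriProcessor processor) {
        this.uris = uris;
        this.processor = processor;
    }

    @Override
    public void run() {
        stopRequested.set(false); // a fresh run clears any earlier stop request
        while (!stopRequested.get() && checkpoint.get() < uris.size()) {
            processor.process(uris.get(checkpoint.get()));
            // Advance only after the item completed, so a failure or stop
            // never skips an unprocessed item.
            checkpoint.incrementAndGet();
        }
    }

    /** Asks the loop to exit after the item currently being processed. */
    void requestStop() { stopRequested.set(true); }

    int checkpoint() { return checkpoint.get(); }

    boolean isFinished() { return checkpoint.get() >= uris.size(); }
}
```

Because the checkpoint only moves forward after an item has been processed, stopping and calling run() again neither repeats finished items nor skips pending ones.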
Also, since this whole thing uses Spring to glue everything together you'll need to get the Spring jars for it to work. I *think* I patched the code in CVS to expose the ApplicationContext to the lower levels. I think a servlet filter would be a better approach, but be aware that if you want to do this with Slide 2.1 you'll need to go through some extra steps.
Holler if there are any questions.
-James
On Wed, 2005-03-09 at 11:15 +0000, Eirikur Hrafnsson wrote:
Hi Stefan,
On 9.3.2005, at 08:52, Stefan Lützkendorf wrote:
Hi Eirikur,
the reindex problem is still unresolved :-(. I'm currently thinking about this, because I think it's crucial too.
Yup, especially when you want to use Lucene on an existing store. Somebody mentioned he was working on a batch indexer when we last discussed this and he was going to commit it, was it Christophe or Daniel perhaps...I can't find the email....
cheers Eiki, Idega.
Stefan
Eirikur Hrafnsson wrote:
Hi all (long time no bugging you... ; )
A while ago I asked if there was a way to re-index the Lucene index for Slide. This is a pretty crucial feature in my opinion, since the Slide index is always stored on the file system regardless of what kind of store you have. That makes it harder to move a website from development to production, to back it up, and especially to enable Lucene indexing on an existing Slide store... Is this possible today?
Best Regards
Eirikur S. Hrafnsson, [EMAIL PROTECTED] Chief Software Engineer Idega Software http://www.idega.com
p.s. the SimpleXMLExtractor XPath stuff still doesn't work if you specify a namespace other than "DAV:" : (
--------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
-- Stefan Lützkendorf -- [EMAIL PROTECTED]
