Hi All:
Today I also read Otis's article 'Parsing, indexing, and searching XML with
Digester and Lucene' at:
http://www-106.ibm.com/developerworks/java/library/j-lucene/

After a long delay, I have decided to release a demo of WebLucene, even though
it is still far from finished.

WebLucene is a web interface for Lucene that uses XML as a lightweight
protocol. Developers can convert a data source (text, DB, MS Word, PDF, etc.)
into XML, index it with the Lucene engine, and get full-text search results
over HTTP. Because the output is XML, it can easily be integrated with a JSP,
ASP, or PHP front end, or transformed with XSLT on the server side.
http://sourceforge.net/projects/weblucene/
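
To illustrate the "XML over HTTP" idea, here is a minimal sketch of how a
Java front end might read titles out of an XML search result. The element
names (SearchResult, Record, Title) and the class name are hypothetical and
used only for illustration; WebLucene's actual output format may differ.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class SearchResultReader {
    // Hypothetical response layout, NOT WebLucene's real schema.
    static final String SAMPLE =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
        + "<SearchResult query=\"lucene\">"
        + "<Record id=\"1\"><Title>First hit</Title></Record>"
        + "<Record id=\"2\"><Title>Second hit</Title></Record>"
        + "</SearchResult>";

    // Returns the text of the first <Title> in each <Record>, in document order.
    static List<String> titles(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList records = doc.getElementsByTagName("Record");
            List<String> out = new ArrayList<>();
            for (int i = 0; i < records.getLength(); i++) {
                Element record = (Element) records.item(i);
                out.add(record.getElementsByTagName("Title").item(0).getTextContent());
            }
            return out;
        } catch (Exception e) {
            // Wrap parser/IO failures so callers need no checked exceptions.
            throw new IllegalStateException("Could not parse search result", e);
        }
    }

    public static void main(String[] args) {
        for (String title : titles(SAMPLE)) {
            System.out.println(title);
        }
    }
}
```

In a real deployment the XML string would come from the HTTP response body
rather than from a constant.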

I think the following features of this application address some of the most
frequently asked questions on the Lucene user list:
1 Custom sorting: with docID-based sorting, results can be sorted in data
source order.
2 Internationalization: CJKTokenizer. XML input also avoids many double-byte
character decoding problems for applications running on ISO-8859-1 platforms.
3 I rewrote parts of SAXIndexer to support indexing RSS-like XML sources.
4 Highlighting support: WebLuceneHighlighter is a token-based highlighter.
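
For readers unfamiliar with item 2: a common way to tokenize CJK text without
a dictionary is to emit overlapping two-character tokens (bigrams), which is
the general approach a CJK bigram tokenizer takes. The sketch below is a
standalone illustration of that technique only. It is not the CJKTokenizer
source, it does not use the Lucene API, and the class and method names are
made up for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class CjkBigramSketch {
    // Treat Han, Hiragana, Katakana, and Hangul code points as CJK.
    static boolean isCjk(int cp) {
        Character.UnicodeScript s = Character.UnicodeScript.of(cp);
        return s == Character.UnicodeScript.HAN
            || s == Character.UnicodeScript.HIRAGANA
            || s == Character.UnicodeScript.KATAKANA
            || s == Character.UnicodeScript.HANGUL;
    }

    // CJK runs become overlapping bigrams; other letter/digit runs become
    // single lowercased tokens; everything else is treated as a separator.
    static List<String> tokens(String text) {
        int[] cps = text.codePoints().toArray();
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < cps.length) {
            if (isCjk(cps[i])) {
                int start = i;
                while (i < cps.length && isCjk(cps[i])) i++;
                if (i - start == 1) {
                    // A lone CJK character is kept as a one-character token.
                    out.add(new String(cps, start, 1));
                }
                for (int j = start; j + 1 < i; j++) {
                    out.add(new String(cps, j, 2));
                }
            } else if (Character.isLetterOrDigit(cps[i])) {
                int start = i;
                while (i < cps.length
                        && Character.isLetterOrDigit(cps[i])
                        && !isCjk(cps[i])) i++;
                out.add(new String(cps, start, i - start).toLowerCase(Locale.ROOT));
            } else {
                i++; // whitespace or punctuation
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // "\u5168\u6587\u68C0\u7D22" is a four-character Chinese phrase.
        System.out.println(tokens("Lucene \u5168\u6587\u68C0\u7D22"));
    }
}
```

With this sketch, a run of four CJK characters C1C2C3C4 yields the tokens
C1C2, C2C3, and C3C4, so a query for any two adjacent characters can match.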

TODO:
1 RSS indexing demo
2 Documentation

Regards

Che, Dong
http://www.chedong.com/

