I am proud to announce LuSql, a simple but powerful tool for building Lucene indexes from relational databases. It is a command-line Java application that constructs a Lucene index from an arbitrary SQL query against a JDBC-accessible SQL database. The user can control a number of parameters, including the SQL query to use, the indexing, storage, and term-vector settings of individual fields, the analyzer, the stop word list, and other tuning parameters. In its default mode it uses threading to take advantage of multiple cores.
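To make the idea concrete, here is a minimal sketch of the general pattern described above: each row of a query result becomes one document, each column is indexed and/or stored according to a per-field setting, and the conversion is spread across worker threads. This is not LuSql's code or its command-line interface. The names RowIndexSketch, FieldSpec, Doc, toDoc and buildDocs are hypothetical, and plain Java maps stand in for both the JDBC result set and the Lucene document so that the example runs with the JDK alone.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class RowIndexSketch {

    /** Hypothetical per-field setting: is the column searchable, and is its value kept? */
    public static final class FieldSpec {
        public final boolean indexed;
        public final boolean stored;
        public FieldSpec(boolean indexed, boolean stored) {
            this.indexed = indexed;
            this.stored = stored;
        }
    }

    /** Hypothetical stand-in for a Lucene document. */
    public static final class Doc {
        public final Map<String, String> indexed = new LinkedHashMap<>();
        public final Map<String, String> stored = new LinkedHashMap<>();
    }

    /** Maps one row (column name to value) to a document using the per-field settings. */
    public static Doc toDoc(Map<String, String> row, Map<String, FieldSpec> specs) {
        Doc doc = new Doc();
        for (Map.Entry<String, String> col : row.entrySet()) {
            FieldSpec spec = specs.get(col.getKey());
            if (spec == null) {
                continue; // assumption: columns without a setting are skipped
            }
            if (spec.indexed) {
                doc.indexed.put(col.getKey(), col.getValue());
            }
            if (spec.stored) {
                doc.stored.put(col.getKey(), col.getValue());
            }
        }
        return doc;
    }

    /** Converts all rows on a fixed-size thread pool, preserving row order in the result. */
    public static List<Doc> buildDocs(List<Map<String, String>> rows,
                                      Map<String, FieldSpec> specs,
                                      int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Doc>> pending = new ArrayList<>();
            for (Map<String, String> row : rows) {
                Callable<Doc> task = () -> toDoc(row, specs);
                pending.add(pool.submit(task));
            }
            List<Doc> docs = new ArrayList<>();
            for (Future<Doc> future : pending) {
                docs.add(future.get()); // waits for each task in submission order
            }
            return docs;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        Map<String, FieldSpec> specs = new LinkedHashMap<>();
        specs.put("id", new FieldSpec(false, true));    // stored only
        specs.put("title", new FieldSpec(true, true));  // indexed and stored
        specs.put("body", new FieldSpec(true, false));  // indexed only

        Map<String, String> row = new LinkedHashMap<>();
        row.put("id", "42");
        row.put("title", "An example article");
        row.put("body", "Full text of the article");

        int cores = Runtime.getRuntime().availableProcessors();
        List<Doc> docs = buildDocs(List.of(row), specs, cores);
        System.out.println("documents built: " + docs.size());
    }
}
```

In a real indexer the rows would come from a JDBC ResultSet and each Doc would be a Lucene Document handed to an IndexWriter; those parts are left out here because they need a database driver and the Lucene jar on the classpath.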
LuSql can handle complex queries, allows for additional per record sub-queries, and has a plug-in architecture for arbitrary Lucene document manipulation. Its only dependencies are three Apache Commons libraries, the Lucene core itself, and a JDBC driver. LuSql has been extensively tested, including a large 6+ million full-text & article metadata document collection, producing an 86GB Lucene index.

http://lab.cisti-icist.nrc-cnrc.gc.ca/cistilabswiki/index.php/LuSql

If you have any questions, please contact me.

Thanks,
Glen Newton :-)

--
Glen Newton | [EMAIL PROTECTED]
Researcher, Information Science, CISTI Research
& NRC W3C Advisory Committee Representative
http://tinyurl.com/yvchmu
tel/tél: 613-990-9163 | facsimile/télécopieur 613-952-8246
Canada Institute for Scientific and Technical Information (CISTI)
National Research Council Canada (NRC)| M-55, 1200 Montreal Road
http://www.nrc-cnrc.gc.ca/
Institut canadien de l'information scientifique et technique (ICIST)
Conseil national de recherches Canada | M-55, 1200 chemin Montréal
Ottawa, Ontario K1A 0R6
Government of Canada | Gouvernement du Canada