You could write a "dummy" Analyzer that ignores the document text and simply returns the tokens from your external process. As for statistics, what kind are you interested in? I suppose you could store them in a field along with the document, or set the boost values for the field/document, but that may be too simple for your needs.
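To make the "dummy" Analyzer idea a bit more concrete, here is a rough sketch in plain Java. It deliberately does not use the Lucene classes, since the exact Analyzer/TokenStream methods depend on the Lucene version; all class and method names below (PrecomputedToken, PrecomputedTokenSource, DummyAnalyzerSketch) are made up for illustration. It only shows the shape: the text argument is ignored and the externally computed tokens are replayed one at a time.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for one token handed over by the external indexer.
final class PrecomputedToken {
    final String term;   // token text exactly as the external process produced it
    final int position;  // 0-based position within the document

    PrecomputedToken(String term, int position) {
        this.term = term;
        this.position = position;
    }
}

// Hypothetical pull-style token source: next() returns null when exhausted.
final class PrecomputedTokenSource {
    private final Iterator<PrecomputedToken> tokens;

    PrecomputedTokenSource(List<PrecomputedToken> tokens) {
        this.tokens = tokens.iterator();
    }

    PrecomputedToken next() {
        return tokens.hasNext() ? tokens.next() : null;
    }
}

// Plays the role of the "dummy" analyzer: it never parses the text it is given.
final class DummyAnalyzerSketch {
    static PrecomputedTokenSource tokenSource(String ignoredText, List<String> externalTerms) {
        List<PrecomputedToken> out = new ArrayList<>();
        for (int i = 0; i < externalTerms.size(); i++) {
            // Positions are assigned in arrival order (an assumption of this sketch).
            out.add(new PrecomputedToken(externalTerms.get(i), i));
        }
        return new PrecomputedTokenSource(out);
    }
}
```

In Lucene itself the same logic would presumably live in a TokenStream subclass that your Analyzer returns, with the external tokens looked up per document or per field.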

Ralf Bierig wrote:

Hi,

in the context of a distributed information retrieval project, we would like to use Lucene for its indexing capabilities but not for retrieval. In particular, we would like to populate a Lucene index with the tokens and statistics already computed by an external indexer, thereby bypassing the document-based parsing, analysis, and ingestion into the index which characterise Lucene's standard workflow. Is this possible? That is, is it possible to feed precomputed statistics into a Lucene index? And is it possible to control which statistics are associated with each document? (As we will not use Lucene for retrieval, we are not interested in supplying the statistics it needs to perform a search.)

Any help greatly appreciated, many thanks.

Cheers,


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



--

Grant Ingersoll
Sr. Software Engineer
Center for Natural Language Processing
Syracuse University
School of Information Studies
335 Hinds Hall
Syracuse, NY 13244
http://www.cnlp.org
Voice: 315-443-5484
Fax: 315-443-6886
