: a) once a doc is added to an index, it will not get modified/deleted
: b) all the fields added are keywords (mostly numbers) - no analysis is
: required.
: c) indexing speed is more important than querying speed.
: d) every document is the same - there is no boost or relevancy required.
:
: e) Query results should be sorted in the order they were indexed.

given those statements, it really doesn't sound like Lucene (or any inverted index structure) is useful to you at all.

if you really have an unbounded preference for indexing speed over query speed, you should use a data structure where "add" is a constant time operation, even if that means querying is done via a linear scan of every doc -- which actually helps you, because a linear scan automatically returns everything in the order it was added.

have you considered just using delimited files and something like perl for finding every record where the specified columns match your input criteria? or, if you are a *little* concerned about query performance, using Hadoop map/reduce to scan multiple text files spread across many boxes?

(Disclaimer: I haven't been following the whole thread, but I did spot check the first message to see that the query types are all simple field equality tests combined in a boolean expression.)

-Hoss
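
For what it's worth, here is a rough sketch of the delimited-file idea: appending a record is a cheap write to the end of a file, and a query is a linear scan that tests field equality on each row, so matches come back in the order they were added. It is in Python rather than perl, and the tab-separated layout, the column positions, and the names `add_record`, `scan`, and `demo` are all made up for illustration, not taken from the thread.

```python
import csv
import os
import tempfile


def add_record(path, fields):
    """Append one record to the end of the delimited file."""
    # Reopening the file on every add keeps the sketch short; a real
    # loader would presumably hold one handle open for the whole batch.
    with open(path, "a", newline="") as f:
        csv.writer(f, delimiter="\t").writerow(fields)


def scan(path, predicate):
    """Linear scan: yield matching rows in the order they were appended."""
    with open(path, newline="") as f:
        for row in csv.reader(f, delimiter="\t"):
            if predicate(row):
                yield row


def demo():
    # Hypothetical three-column records: id, field_a, field_b.
    path = os.path.join(tempfile.mkdtemp(), "records.tsv")
    add_record(path, ["1001", "42", "7"])
    add_record(path, ["1002", "42", "9"])
    add_record(path, ["1003", "13", "7"])

    # Boolean expression over simple equality tests:
    # (field_a == "42" AND field_b == "9") OR field_a == "13"
    def matches(row):
        return (row[1] == "42" and row[2] == "9") or row[1] == "13"

    return list(scan(path, matches))
```

Calling `demo()` returns the second and third records, in insertion order. Query cost grows linearly with the file size, which is the trade-off you accept in exchange for cheap adds.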