There are no issues running Lucene on any drive that provides fast and reliable random access reads.
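If you want a rough number for "fast random access reads" on a particular mount, something like the following sketch might help. It is not part of Lucene and does not use any Lucene API; the function name `random_read_latency` and its parameters are made up for illustration, and it assumes you have a large existing file on the drive in question to read from.

```python
import os
import random
import time


def random_read_latency(path, reads=1000, block_size=4096, seed=0):
    """Return (mean, worst) seconds per random block read from `path`.

    Hypothetical helper for illustration only; not a Lucene API.
    """
    size = os.path.getsize(path)
    if size < block_size:
        raise ValueError("file is smaller than one block")
    rng = random.Random(seed)
    timings = []
    # buffering=0 turns off Python-level buffering; the OS page cache
    # can still serve reads, so use a file larger than available RAM.
    with open(path, "rb", buffering=0) as f:
        for _ in range(reads):
            # Pick an offset that leaves room for one full block.
            offset = rng.randrange(0, size - block_size + 1)
            start = time.perf_counter()
            f.seek(offset)
            f.read(block_size)
            timings.append(time.perf_counter() - start)
    return sum(timings) / len(timings), max(timings)
```

Run it against the same file on the NAS or SAN mount and on a local disk and compare the two results. Treat the numbers as a coarse indication only, since caching and network conditions will skew them.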
Some SAN drives will work better than cheap local disks, and even cheap local disks work pretty well. It is even possible to run Lucene with an index in a file system that is decidedly unfriendly to random access reads, like HDFS.

How well it works depends a lot on your particular workload. The long tail applies here: most retrieval applications are pretty small and only a few are really, really huge.

For small applications, with up to a few million documents, queries arriving every few seconds, and low update rates, you should be fine almost no matter what you are using.

For hundreds of queries per second against hundreds of millions of documents with lots of updates, you have a completely different kettle of fish that will require completely different techniques. For really large systems, you have to implement scalable clustered systems, and the necessary considerations are much broader than just disk I/O rates.

On Thu, Mar 19, 2009 at 5:58 PM, 이지홍 <[email protected]> wrote:

> I wonder if there are any known issues having a lucene index on a NAS
> or SAN drive? Some basic tests show that it works fine. But are there
> performance issues with indexing on NAS for instance?

-- 
Ted Dunning, CTO
DeepDyve
