Avi, --- Avi Drissman <[EMAIL PROTECTED]> wrote: > I've successfully used Lucene to do indexing of about 50-100K files, > and have been keeping the index on a local disk. It's time to move > up, and now I'm planning to index from 100-500K files. > > I'm trying to decide whether or not it pays to hold the index in our > database. Our database (FrontBase) has decent blob support, and a > ~300 meg index likely wouldn't faze it, but I have some concerns. > > First, I'm looking at Directory, and there are two functions: > * OutputStream createFile(String name) > * InputStream openFile(String name) > > How much of the streams do they take advantage of? Does Lucene seek > around? I'm concerned about huge re-writing of files. > > Second is speed. I was looking at SQLDirectory, and although I'd > probably write my own (inspired by that), who's using it? How is the > speed compared to flat-files?
Haven't used it. Reported speed (by the author) was poor. > Third is replication. We're aiming for a replicated environment. If > we wanted to build the index on the disk rather than in the database, > > every server would have to keep their own copy. Does anyone have any > experience in this? I've done that. I simply used scp to copy the index from the build machine to a set of maybe dozen servers. You probably don't want to copy directly into the final destination directory, but rather a temp directory first, and then rename/move to the target directory (atomic and quick, esp. if on the same disk, as opposed to slow copy over the network). Otis __________________________________________________ Do you Yahoo!? Yahoo! Platinum - Watch CBS' NCAA March Madness, live on your desktop! http://platinum.yahoo.com --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
