Dror Matalon wrote:

On Tue, Feb 10, 2004 at 03:59:50AM +0100, [EMAIL PROTECTED] wrote:

Hi Lucene Users!

Searching the documentation, the API and this mailing list results in:
"no way to store objects or binary data in an UnIndexed
org.apache.lucene.document.Field to attach it to the index directly"

Is there a way to do this? What would you suggest to do?


1. Store the binary data in files and store the path in Lucene. There are
scalability issues here when you handle more than a few hundred
thousand objects.

Just a comment: for ext2fs and BSD FFS (dunno about NT), the scalability issues with this approach can be partially addressed by building a tree of subdirectories instead of using just one. For example, a file named "myThesis.pdf" would go into /m/y/t/myThesis.pdf. This reduces the time needed to list the files in any given directory (both Unixes can already cache the inode numbers for name/inode lookups, so looking up a longer path adds no significant time).
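
To make the subdirectory scheme concrete, here is a minimal Java sketch of how such a path could be computed. The class name ShardedPath, the method shardedPath, the depth parameter, the '_' padding rule and the /data/docs root are all assumptions made for illustration; none of them comes from Lucene or from the original suggestion. The resulting path string is what you would then store in the Lucene document.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Locale;

// Hypothetical helper, not part of Lucene: spreads files over a tree of
// single-character subdirectories derived from the file name.
class ShardedPath {

    // Maps fileName to root/<c1>/<c2>/.../<c_depth>/fileName, where c1..c_depth
    // are the lower-cased leading characters of the name.
    static Path shardedPath(Path root, String fileName, int depth) {
        String lower = fileName.toLowerCase(Locale.ROOT);
        Path dir = root;
        for (int i = 0; i < depth; i++) {
            // Assumption: names shorter than 'depth' are padded with '_' so
            // that every file ends up at the same directory depth.
            char c = i < lower.length() ? lower.charAt(i) : '_';
            // Assumption: anything that is not a letter or digit (dots,
            // separators, spaces) is replaced by '_' to keep names safe.
            if (!Character.isLetterOrDigit(c)) {
                c = '_';
            }
            dir = dir.resolve(String.valueOf(c));
        }
        return dir.resolve(fileName);
    }

    public static void main(String[] args) {
        // The root directory here is only an example.
        Path p = shardedPath(Paths.get("/data/docs"), "myThesis.pdf", 3);
        System.out.println(p);
    }
}
```

On a Unix-like system this places "myThesis.pdf" under /data/docs/m/y/t/, matching the example above. Names that share a prefix still cluster in the same directory, so with very skewed file names a hash of the name may spread the files more evenly than its leading characters.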


FreeBSD also has a special kind of filesystem, which uses inodes in a flat space (no directories). It was specifically designed for storing large numbers of files efficiently. Recent versions of Java on FreeBSD (1.4.2) seem to be very stable and to perform well, so that could also be an option.

After all, a filesystem _is_ a kind of very specialized database... ;-)

--
Best regards,
Andrzej Bialecki

-------------------------------------------------
Software Architect, System Integration Specialist
CEN/ISSS EC Workshop, ECIMF project chair
EU FP6 E-Commerce Expert/Evaluator
-------------------------------------------------
FreeBSD developer (http://www.freebsd.org)

