A bit of clarification: Lucene index is made of multiple "segments". Compound format: stores each segment in a single file - less files created/opened. Not-compound format: stores each segment in multi-files - more files created/opened. Not-compound is likely to be faster for indexing. Optimizing the index (no matter if compound or not) would merge index segments into one segment - this would also make search faster.
Also see http://lucene.apache.org/java/docs/fileformats.html "Simon Willnauer" <[EMAIL PROTECTED]> wrote on 10/10/2006 03:56:17: > Hi, > > In Lucene there are two types of index structure compound index and > multi-file index. In multi-file index, when new documents are inserted > to an index, they are stored in a separate segment; this causes > increase of files in an index structure. Therefore, multi-file index > has more files than compound index. > Compound index type consists of three files; two of them are > "deletable" file that shows the unused files in index and "segments" > file that shows the segment names and their size. The third one > contains the all indexed documents and their field values. In compound > index all indexed files are merged into one single file. So, the > number of files in the index is minimized. > The advantage of multi-file is the time for indexing documents takes > less than compound file. Because, in compound file the indexed files > are in addition merged into one single file. This can be suitable when > the number of documents is large while indexing. > On the other hand, the advantage of compound file appears in > searching. Because, the total number of file accesses for reading data > are minimum in compound index. In contrast, using multi-file index the > file fetches increase because the program needs to open more files in > order to retrieve required documents from the index. This is important > while search time in an application is in consideration. > > I do prefer to use compoundFile(true) as I work on unix platforms > otherwise you will end up with "too many open files" very often! > > best regards Simon > > btw: nice to see another person from Berlin on the list! > > ----------- > > mailto: [EMAIL PROTECTED] > > On 10/10/06, Supriya Kumar Shyamal <[EMAIL PROTECTED]> wrote: > > Hello All, > > > > I have question regarding the use of Compound file fo rindex, what is > > the advantage & disadvantage of enabling use of compound file(which is > > default I think) or disabling the useo of it. > > > > Thanks, > > supriya > > > > -- > > Mit freundlichen Grüßen / Regards > > > > Supriya Kumar Shyamal > > > > Software Developer > > tel +49 (30) 443 50 99 -22 > > fax +49 (30) 443 50 99 -99 > > email [EMAIL PROTECTED] > > ___________________________ > > artnology GmbH > > Milastr. 4 > > 10437 Berlin > > ___________________________ > > > > http://www.artnology.com > > __________________________________________________________________________ > > > > News / Aktuelle Projekte: > > * artnology gewinnt Ausschreibung des Bundesministeriums des Innern: > > Softwarelösung für die Verwaltung der Sammlung zeitgenössischer > > Kunstwerke zur kulturellen Repräsentation des Bundes. > > > > Projektreferenzen: > > * Globaler eShop und Corporate-Site für Springer: www.springeronline.com > > * E-Detailing-Portal für Novartis: www.interaktiv.novartis.de > > * Service-Center-Plattform für Biogen: www.ms-life.de > > * eCRM-System für Grünenthal: www.gruenenthal.com > > > > ___________________________________________________________________________ > > > > > > --------------------------------------------------------------------- > > To unsubscribe, e-mail: [EMAIL PROTECTED] > > For additional commands, e-mail: [EMAIL PROTECTED] > > > > --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]