Hi, I'm wondering if there is some kind of formula to estimate the size of a Lucene index. Searching the list, I did not find any pointers.
Does anybody have a hint?

What I figured out from the file format description and some empirical tests is that, for each index file:

Field files:
  field data   .fdt: NumberOfDocs * NumberOfFieldsPerDoc
  field index  .fdx: NumberOfDocs * 8
  field info   .fnm: ignored

Term files:
  term data    .tis: NumberOfTerms * 8
  term index   .tii: no idea so far
  term freq    .frq: estimated as NumberOfDocs * NumberOfTerms

Normalization:
  norm file    .nrm: NumberOfDocs

This concerns only un-stored fields, of course. I estimate the total NumberOfTerms of my document collection as 10% of the NumberOfDocuments.

Does anyone have similar experience?

lofi
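
For what it's worth, the rules of thumb above can be written down as a small Python sketch. The function name estimate_index_size and its parameters are made up for illustration, the results are assumed to be in bytes, and .fnm and .tii are left out because there is no estimate for them. It only restates the guesses from this post and is not a formula taken from Lucene itself.

```python
def estimate_index_size(num_docs, fields_per_doc, num_terms=None):
    """Rough per-file size estimate (assumed bytes) using the guesses from this post."""
    if num_terms is None:
        # Guess from the post: total terms are about 10% of the number of documents.
        num_terms = num_docs // 10

    sizes = {
        ".fdt": num_docs * fields_per_doc,  # field data
        ".fdx": num_docs * 8,               # field index
        ".tis": num_terms * 8,              # term data
        ".frq": num_docs * num_terms,       # term frequencies (likely an upper bound)
        ".nrm": num_docs,                   # norms
        # .fnm is ignored and .tii is unknown, so neither is counted here.
    }
    sizes["total"] = sum(sizes.values())
    return sizes


if __name__ == "__main__":
    # Hypothetical collection: 100,000 documents with 5 fields each.
    for name, size in estimate_index_size(100_000, 5).items():
        print(f"{name:>6}: {size:,}")
```

With these guesses the .frq term dominates everything else, since it grows with NumberOfDocs * NumberOfTerms, so that is probably the estimate most worth checking against a real index.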
