> I don't really understand the low level fileformat details of lucene (I
> let Yonik worry about those things for me) and I've already forgotten the
> details from earlier in this thread of what field properties on a per
> field basis and which are stored on a per document basis -- but as someone
> who has been burned by lucene storing a field norm for field F for all
> docs as soon as one doc has a value for field F, my gut reaction is to shy
> away from any proposal to move an existing document based field property
> "global"

Norms are somewhat special. You could store a field norm as a per-document field value, but then you'd have to fetch that value (one byte) for every document to calculate the relevance. That is at least an order of magnitude slower than reading all norm values into memory at once, as the current implementation does.

You don't lose anything if you define binary/compressed to be global. If there's a real need to store different types of data in the same field, just store binary data and let the application decide what to do with it, e.g. indicate with the first byte whether it's pdf/gif/compressed with your favorite compression algorithm/blowfish encoded/uncompressed/utf8/....

Robert
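
For illustration, here is a minimal sketch of the first-byte idea in plain Java. It does not touch any Lucene API; the class name TaggedField, the method names, and the tag values are all hypothetical and chosen only for this example. The application would store the resulting byte array as an ordinary binary field and interpret the tag itself when reading it back.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper: prefixes a binary payload with a one-byte type tag
// so the application can tell what a stored binary field contains.
class TaggedField {
    // Assumed tag values; any application-defined scheme would do.
    static final byte TYPE_RAW = 0;
    static final byte TYPE_UTF8 = 1;
    static final byte TYPE_COMPRESSED = 2;

    // Build the stored value: the tag byte followed by the payload bytes.
    static byte[] encode(byte type, byte[] payload) {
        byte[] stored = new byte[payload.length + 1];
        stored[0] = type;
        System.arraycopy(payload, 0, stored, 1, payload.length);
        return stored;
    }

    // Read the tag from the first byte of a stored value.
    static byte typeOf(byte[] stored) {
        if (stored.length == 0) {
            throw new IllegalArgumentException("stored value has no type tag");
        }
        return stored[0];
    }

    // Return the payload without the leading tag byte.
    static byte[] payloadOf(byte[] stored) {
        if (stored.length == 0) {
            throw new IllegalArgumentException("stored value has no type tag");
        }
        return Arrays.copyOfRange(stored, 1, stored.length);
    }

    public static void main(String[] args) {
        byte[] stored = encode(TYPE_UTF8, "hello".getBytes(StandardCharsets.UTF_8));
        // The application, not the index, decides how to interpret the payload.
        if (typeOf(stored) == TYPE_UTF8) {
            System.out.println(new String(payloadOf(stored), StandardCharsets.UTF_8));
        }
    }
}
```

The cost is one extra byte per stored value, and the index itself never needs to know which kinds of data the field holds.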