Hi,

I am wondering whether we could make some changes to Lucene.

1. Move the Analyzer down from the document level to the field level, so that a special analyzer can be applied to some fields while the other fields still use the default analyzer from the document level. For example, I do not need to index the numbers in the "content" field, which reduces the index size a lot when I have Excel files. But I always need "created_date" to be indexed, even though it is a numeric field.
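
To make point 1 concrete, here is a rough sketch in plain Java of the behaviour I have in mind. It does not use the Lucene API at all: the class PerFieldAnalysisSketch, its methods, and the two toy "analyzers" are hypothetical names made up for illustration, and the tokenization is deliberately simplistic.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

// Hypothetical illustration only: none of these names are Lucene classes.
class PerFieldAnalysisSketch {

    // Default "analyzer": lowercase, then split on anything that is not a letter or digit.
    static List<String> defaultAnalyzer(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Special "analyzer": same as the default, but drops tokens made only of digits.
    static List<String> noNumbersAnalyzer(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : defaultAnalyzer(text)) {
            if (!t.matches("\\p{N}+")) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Field name -> analyzer override; fields without an entry use the default.
    private final Map<String, Function<String, List<String>>> perField = new HashMap<>();

    void setFieldAnalyzer(String field, Function<String, List<String>> analyzer) {
        perField.put(field, analyzer);
    }

    List<String> analyze(String field, String text) {
        return perField
                .getOrDefault(field, PerFieldAnalysisSketch::defaultAnalyzer)
                .apply(text);
    }

    public static void main(String[] args) {
        PerFieldAnalysisSketch sketch = new PerFieldAnalysisSketch();
        sketch.setFieldAnalyzer("content", PerFieldAnalysisSketch::noNumbersAnalyzer);

        // Numbers are dropped from "content" but kept in "created_date".
        System.out.println(sketch.analyze("content", "Q3 2003 revenue 1500"));
        System.out.println(sketch.analyze("created_date", "20030305"));
    }
}
```

With this setup the "content" field loses its purely numeric tokens, while "created_date" falls through to the default analyzer and keeps its value.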
I know some workarounds have been posted to the group, but I think this would be a good feature to have.

2. Does it affect performance or space much if I use a large maximum field length, such as 1000000? (Normally only "content" needs that.) If so, could we associate this parameter with the field rather than the document?

3. Based on the article (http://www.onjava.com/pub/a/onjava/2003/03/05/lucene.html) from Otis (it is really great!), mergeFactor controls both when the indexed documents are put into one segment and when the segments are merged into one big segment. On Windows, a larger mergeFactor can cause the "Too many files open" problem when merging the segments, but a lower mergeFactor slows down indexing. Could we expose two different parameters here? One would control when the documents are written into one segment on disk, so I could set it larger when the machine has enough memory; the other would control when the segments are merged. Right now both are tied to powers of mergeFactor.

Regards,
Hui
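
P.S. To illustrate point 3, below is a small simulation in plain Java of the two-parameter behaviour I am asking about. It is a simplified model based on my reading of the article, not Lucene's actual merging code, and the names MergeSketch, flushEvery and mergeEvery are hypothetical. Setting flushEvery equal to mergeEvery reproduces the "powers of mergeFactor" pattern as I understand it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model only: this is not Lucene code.
class MergeSketch {

    // Returns the segment sizes (in documents) left after indexing `docs` documents.
    // flushEvery: how many buffered documents are written out as one new segment.
    // mergeEvery: how many equal-sized segments trigger a merge into one bigger segment.
    static List<Integer> simulate(int docs, int flushEvery, int mergeEvery) {
        if (flushEvery < 1 || mergeEvery < 2) {
            throw new IllegalArgumentException("flushEvery must be >= 1 and mergeEvery >= 2");
        }
        List<Integer> segments = new ArrayList<>();
        int buffered = 0;
        for (int i = 0; i < docs; i++) {
            buffered++;
            if (buffered == flushEvery) {
                segments.add(buffered);   // write the buffer out as a new segment
                buffered = 0;
                mergeTail(segments, mergeEvery);
            }
        }
        if (buffered > 0) {
            segments.add(buffered);       // final partial flush when the writer is closed
        }
        return segments;
    }

    // While the last `mergeEvery` segments all have the same size, merge them into one.
    private static void mergeTail(List<Integer> segments, int mergeEvery) {
        while (segments.size() >= mergeEvery) {
            int n = segments.size();
            int size = segments.get(n - 1);
            boolean sameSize = true;
            for (int j = n - mergeEvery; j < n; j++) {
                if (segments.get(j) != size) {
                    sameSize = false;
                    break;
                }
            }
            if (!sameSize) {
                return;
            }
            for (int j = 0; j < mergeEvery; j++) {
                segments.remove(segments.size() - 1);
            }
            segments.add(size * mergeEvery);
        }
    }

    public static void main(String[] args) {
        // Both parameters tied together, as with a single mergeFactor of 10.
        System.out.println(simulate(250, 10, 10));
        // Larger in-memory buffer, same merge threshold.
        System.out.println(simulate(250, 50, 10));
    }
}
```

In this model, 250 documents with both values at 10 leave seven segments (two of 100 documents and five of 10), while raising only flushEvery to 50 leaves five segments of 50 documents each without touching the merge threshold.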
