Hi,
I am wondering whether we could make some changes in a future version of
Lucene.
1. Move the Analyzer from the document level down to the field level, so
that a special analyzer can be applied to some fields while the other fields
still use the default analyzer set at the document level.
For example, I do not need to index numbers in the "content" field. That
reduces the index size a lot when I index Excel files. But I always need
"created_date" to be indexed, even though it is a numeric field.

I know some workarounds have been posted to the list, but I think this
would be a good feature to have.
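
To make item 1 concrete, here is a rough sketch in plain Java of what I
mean. It does not use any Lucene classes; PerFieldAnalysis, keepAll and
dropNumbers are names I made up for illustration, and the tokenizing rules
are only assumptions.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch, not Lucene API: pick an analyzer per field name
// and fall back to a default for every other field.
class PerFieldAnalysis {
    private final Map<String, Function<String, List<String>>> perField = new HashMap<>();
    private final Function<String, List<String>> defaultAnalyzer;

    PerFieldAnalysis(Function<String, List<String>> defaultAnalyzer) {
        this.defaultAnalyzer = defaultAnalyzer;
    }

    // Register a special analyzer for one field.
    void register(String field, Function<String, List<String>> analyzer) {
        perField.put(field, analyzer);
    }

    // Analyze text with the field's own analyzer, or the default one.
    List<String> analyze(String field, String text) {
        return perField.getOrDefault(field, defaultAnalyzer).apply(text);
    }

    // Default behaviour: lowercase, split on anything that is not a letter
    // or digit, and keep every token (numbers included).
    static List<String> keepAll(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Special behaviour: same tokens, but drop those made only of digits.
    static List<String> dropNumbers(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : keepAll(text)) {
            if (!t.matches("\\p{N}+")) {
                tokens.add(t);
            }
        }
        return tokens;
    }
}
```

With this, "content" would be registered with dropNumbers, while
"created_date" is not registered at all and so still goes through the
default analyzer and keeps its numeric value.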

2. Does it affect performance or index size much if I use a large maximum
field length, such as 1000000? (Normally only "content" needs that.) If so,
could we associate this parameter with the field rather than the document?
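
A small sketch of item 2, again in plain Java and not tied to the real
Lucene classes. PerFieldLimit and its methods are hypothetical names, and I
am assuming the limit is counted in terms per field.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not Lucene API: a maximum number of terms that can
// be overridden per field, with one default for all other fields.
class PerFieldLimit {
    private final int defaultLimit;
    private final Map<String, Integer> limits = new HashMap<>();

    PerFieldLimit(int defaultLimit) {
        this.defaultLimit = defaultLimit;
    }

    // Override the limit for a single field, e.g. "content".
    void setLimit(String field, int maxTerms) {
        limits.put(field, maxTerms);
    }

    // The limit that applies to this field.
    int limitFor(String field) {
        return limits.getOrDefault(field, defaultLimit);
    }

    // Keep only the leading terms allowed for this field.
    List<String> truncate(String field, List<String> terms) {
        int n = Math.min(terms.size(), limitFor(field));
        return terms.subList(0, n);
    }
}
```

Only "content" would get the large value (1000000); every other field would
keep the small default.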

3. According to Otis's article
(http://www.onjava.com/pub/a/onjava/2003/03/05/lucene.html), which is really
great, mergeFactor controls both when the indexed documents are written out
as one segment and when the segments are merged into one big segment. On
Windows, a larger mergeFactor can cause the "Too many files open" problem
when merging segments, but a lower mergeFactor slows down indexing. Could we
expose two separate parameters here? One would control when the documents
are written to a segment on disk, so I could set it higher when the machine
has enough memory; the other would control when the segments are merged.
Right now both are governed by powers of mergeFactor.
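
Here is a toy simulation of the two parameters I am asking for in item 3.
It is plain Java, not the real IndexWriter logic; TwoKnobIndexer,
flushThreshold and mergeThreshold are hypothetical names, and the merge rule
(merge when the newest mergeThreshold segments have the same size) is a
simplifying assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not Lucene API: buffering and merging are driven by
// two independent settings instead of one mergeFactor.
class TwoKnobIndexer {
    final int flushThreshold; // docs held in memory before a segment is written
    final int mergeThreshold; // equal-size segments on disk before they are merged
    int buffered = 0;         // docs currently held in memory
    final List<Integer> segments = new ArrayList<>(); // doc count per disk segment

    TwoKnobIndexer(int flushThreshold, int mergeThreshold) {
        if (flushThreshold < 1 || mergeThreshold < 2) {
            throw new IllegalArgumentException("need flushThreshold >= 1 and mergeThreshold >= 2");
        }
        this.flushThreshold = flushThreshold;
        this.mergeThreshold = mergeThreshold;
    }

    // Buffer one document; write a segment once the buffer is full.
    void addDocument() {
        buffered++;
        if (buffered >= flushThreshold) {
            flush();
        }
    }

    // Write the buffered documents out as one new segment.
    void flush() {
        if (buffered == 0) {
            return;
        }
        segments.add(buffered);
        buffered = 0;
        mergeIfNeeded();
    }

    // Merge while the newest mergeThreshold segments all have the same size.
    private void mergeIfNeeded() {
        while (segments.size() >= mergeThreshold) {
            int last = segments.size() - 1;
            int size = segments.get(last);
            for (int i = last - mergeThreshold + 1; i < last; i++) {
                if (segments.get(i) != size) {
                    return; // tail is uneven, nothing to merge yet
                }
            }
            for (int i = 0; i < mergeThreshold; i++) {
                segments.remove(segments.size() - 1);
            }
            segments.add(size * mergeThreshold);
        }
    }
}
```

In this toy model, flushThreshold can be raised on a machine with plenty of
memory, while mergeThreshold stays small so that a merge only ever touches a
few segments at a time.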

Regards,
Hui

