Hello Doug,

A lot of Lucene classes still use Vector and Hashtable instead of ArrayList and HashMap, for reasons of compatibility with Java 1.
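To make the kind of replacement concrete, here is a minimal sketch. FieldRegistry is a hypothetical stand-in written for illustration, not the actual FieldInfos code, and it uses generics, which assumes a much newer Java than the versions discussed in this thread. The point is only that ArrayList and HashMap skip the per-call synchronization that Vector and Hashtable perform.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a class that maps field names to numbers.
// It is NOT Lucene's FieldInfos; all names here are illustrative.
class FieldRegistry {
    // Unsynchronized replacements for Vector and Hashtable.
    private final List<String> byNumber = new ArrayList<String>();
    private final Map<String, Integer> byName = new HashMap<String, Integer>();

    // Registers a field name and returns its number.
    // A name that was already registered keeps its original number.
    int add(String name) {
        Integer existing = byName.get(name);
        if (existing != null) {
            return existing.intValue();
        }
        int number = byNumber.size();
        byNumber.add(name);
        byName.put(name, Integer.valueOf(number));
        return number;
    }

    // Looks up a field name by its number.
    String fieldName(int number) {
        return byNumber.get(number);
    }

    int size() {
        return byNumber.size();
    }
}
```

Unlike Vector and Hashtable, these collections are not thread-safe on their own, so any instance shared between threads would need external synchronization.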
Since the changes proposed and made by Aviran to the FieldInfos class already make Lucene incompatible with Java 1, but can give us a reasonable performance gain, shouldn't we go ahead with the whole Hashtable -> HashMap and Vector -> ArrayList replacement around the code to get even more performance in other places?

Max

DC> I think perhaps it is time to make some incompatible changes to Lucene's
DC> API. There are a number of places where it is showing its age. I'd
DC> like to try to make as many API changes at once as is possible, so that
DC> folks only have to port application code once.
DC>
DC> I propose we do this as follows:
DC>
DC> 1. Make a 1.9 release which has all the new APIs and deprecates all the
DC> outdated APIs. Existing applications should compile and run fine, but
DC> with lots of deprecation warnings.
DC>
DC> 2. Make a 2.0 release which removes all deprecated code.
DC>
DC> Thus 1.9 would be a migration release. Before an application is moved
DC> to 2.0, folks should first make sure that it compiles against 1.9
DC> without deprecation warnings. Once it does then it should move to 2.0
DC> without incident.
DC>
DC> Does this sound like a good plan?
DC>
DC> What changes would I like to see in the API? Here are a few candidates:
DC>
DC> 1. Replace Field factory methods (Field.Text, Field.Keyword, etc.) with
DC> a few methods that use type-safe enumerations, as described in:
DC> http://www.mail-archive.com/[EMAIL PROTECTED]/msg08479.html
DC>
DC> 2. Similarly, replace BooleanQuery.add() with a type-safe enumeration,
DC> also as described in:
DC> http://www.mail-archive.com/[EMAIL PROTECTED]/msg08479.html
DC>
DC> 3. Replace public IndexWriter fields (mergeFactor, minMergeDocs, etc.)
DC> with get/set accessors. Also, minMergeDocs should be renamed
DC> maxBufferedDocs.
DC>
DC> 4. Rename PhrasePrefixQuery to be something like MultiPhraseQuery. Also
DC> make MultipleTermPositions a private nested class of this, as this is
DC> the only place MultipleTermPositions is used.
DC>
DC> 5. Rename InputStream to IndexInput and OutputStream to IndexOutput.
DC> Also make both of these interfaces and add BufferedIndexInput and
DC> BufferedIndexOutput as the implementation used by FSDirectory,
DC> RAMDirectory, etc. This would permit unbuffered and native
DC> implementations (e.g., that use mmap) that could potentially speed
DC> things considerably.
DC>
DC> 6. Replace DateField with something that formats dates suitably for
DC> RangeQuery.
DC>
DC> 7. Move language-specific analyzers into separate downloads?
DC>
DC> 8. Add support for span queries to query parser?
DC>
DC> Do you have other candidates?
DC>
DC> Doug
DC>
DC> ---------------------------------------------------------------------
DC> To unsubscribe, e-mail: [EMAIL PROTECTED]
DC> For additional commands, e-mail: [EMAIL PROTECTED]
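
As a rough illustration of candidates 1 and 3 from the quoted list, together with the deprecate-in-1.9, remove-in-2.0 plan, here is a minimal sketch. Every name in it (Store, SketchField, SketchWriter), the default value of 10, and the range check are assumptions made up for the example; none of it is taken from Lucene's source or from the linked proposal.

```java
// Hypothetical sketch only; these are not Lucene's actual classes.

// Type-safe enumeration: a closed set of instances with a private constructor,
// so callers cannot pass an arbitrary boolean or int by mistake.
final class Store {
    static final Store YES = new Store("YES");
    static final Store NO = new Store("NO");

    private final String name;

    private Store(String name) {
        this.name = name;
    }

    @Override
    public String toString() {
        return name;
    }
}

// One constructor taking an enumeration value instead of several
// factory methods that each hard-code a combination of flags.
final class SketchField {
    private final String name;
    private final String value;
    private final Store store;

    SketchField(String name, String value, Store store) {
        if (store == null) {
            throw new NullPointerException("store");
        }
        this.name = name;
        this.value = value;
        this.store = store;
    }

    boolean isStored() {
        return store == Store.YES;
    }
}

// Migration-release style: the public field stays but is deprecated,
// and accessors are added so callers can port before the field is removed.
class SketchWriter {
    /** @deprecated use {@link #getMergeFactor()} and {@link #setMergeFactor(int)} */
    @Deprecated
    public int mergeFactor = 10; // illustrative default

    public int getMergeFactor() {
        return mergeFactor;
    }

    public void setMergeFactor(int mergeFactor) {
        // Illustrative validation; a setter gives a place to enforce it.
        if (mergeFactor < 2) {
            throw new IllegalArgumentException("mergeFactor must be at least 2");
        }
        this.mergeFactor = mergeFactor;
    }
}
```

Code that still reads or writes the field directly keeps compiling with a deprecation warning, while code that has switched to the accessors would be unaffected once the field is removed.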