Re: 1.4.x TermInfosWriter.indexInterval not public static ?
Chris Hostetter wrote:
> 1) If making it mutable requires changes to other classes to propagate it, then why is it now an instance variable instead of a static? (Presumably making it an instance variable allows subclasses to override the value, but if other classes have internal expectations of the value, that doesn't seem safe.)

It's an instance variable because it can vary from instance to instance. This value is specified when an index segment is written, and is subsequently read from disk and used when reading that segment. It's an instance variable in both the writing and reading code. The thing that's lacking is a way to pass in alternate values to the writing code.

The reason that other classes are involved is that the reading and writing code are in non-public classes. We don't want to expose the implementation too much by making these public, but would rather expose the value through getter/setter methods on the relevant public API.

> 2) Should it be configurable through a get/set method, or through a system property? (which rehashes the instance/global question)

That's indeed the question. My guess is that a system property would probably be sufficient for most, but perhaps not for all. Similarly with a static setter/getter. But a getter/setter on IndexWriter would make everyone happy.

> 3) Is it important that a writer updating an existing index use the same value as the writer that initially created the index? If so, should there really be a preferredIndexInterval variable which is mutable, and a currentIndexInterval which is set to the value of the index currently being updated, such that preferredIndexInterval is used when making an index from scratch and currentIndexInterval is used when adding segments to a new index?

It's used whenever an index segment is created. Index segments are created when documents are added and when index segments are merged to form larger index segments. Merging happens frequently while indexing. Optimization merges all segments.
The value can vary in each segment.

The default value is probably good for all but folks with very large indexes, who may wish to increase the default somewhat. Also, folks with smaller indexes and very high query volumes may wish to decrease the default. It's a classic time/memory tradeoff: higher values use less memory and make searches a bit slower; smaller values use more memory and make searches a bit faster.

Unless there are objections I will add this as:

  IndexWriter.setTermIndexInterval()
  IndexWriter.getTermIndexInterval()

Both will be marked Expert.

Further discussion should move to the lucene-dev list.

Doug

- To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
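The time/memory tradeoff described above can be illustrated with a small, self-contained sketch. This is not Lucene code: `SparseTermIndex`, `demoTerms`, and the interval values 128 and 16 are assumptions made up for illustration. The sketch keeps every Nth term of a sorted term list in memory, binary-searches that sparse index, then scans at most N terms of the full list.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of a sorted term list with a sparse in-memory index (hypothetical, not Lucene code). */
final class SparseTermIndex {
    private final String[] sortedTerms;                        // stands in for the on-disk term list
    private final List<String> indexTerms = new ArrayList<>(); // every indexInterval-th term, held in memory
    private final int indexInterval;
    private int lastScanned;                                   // terms scanned by the most recent lookup

    SparseTermIndex(String[] sortedTerms, int indexInterval) {
        if (indexInterval < 1) {
            throw new IllegalArgumentException("indexInterval must be >= 1");
        }
        this.sortedTerms = sortedTerms;
        this.indexInterval = indexInterval;
        for (int i = 0; i < sortedTerms.length; i += indexInterval) {
            indexTerms.add(sortedTerms[i]);
        }
    }

    int indexSize() { return indexTerms.size(); }

    int lastScanned() { return lastScanned; }

    /** Returns the position of term in the full list, or -1 if it is absent. */
    int lookup(String term) {
        lastScanned = 0;
        // Binary search the in-memory index for the last entry <= term.
        int lo = 0, hi = indexTerms.size() - 1, block = -1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (indexTerms.get(mid).compareTo(term) <= 0) {
                block = mid;
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        if (block < 0) {
            return -1; // term sorts before every indexed term
        }
        // Linear scan of at most indexInterval terms in the full list.
        int start = block * indexInterval;
        int end = Math.min(start + indexInterval, sortedTerms.length);
        for (int i = start; i < end; i++) {
            lastScanned++;
            int cmp = sortedTerms[i].compareTo(term);
            if (cmp == 0) return i;
            if (cmp > 0) break;
        }
        return -1;
    }

    /** Builds n zero-padded terms; they are already in sorted order for n <= 10000. */
    static String[] demoTerms(int n) {
        String[] terms = new String[n];
        for (int i = 0; i < n; i++) {
            terms[i] = String.format("term%04d", i);
        }
        return terms;
    }

    public static void main(String[] args) {
        String[] terms = demoTerms(1000);
        for (int interval : new int[] {128, 16}) {
            SparseTermIndex index = new SparseTermIndex(terms, interval);
            int pos = index.lookup("term0999");
            System.out.println("interval=" + interval + " inMemoryEntries=" + index.indexSize()
                    + " position=" + pos + " scanned=" + index.lastScanned());
        }
    }
}
```

With these 1000 toy terms, an interval of 128 keeps 8 entries in memory and scans 104 terms to reach the last term, while an interval of 16 keeps 63 entries and scans 8. The numbers only show the direction of the tradeoff; they say nothing about real Lucene performance.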
Re: 1.4.x TermInfosWriter.indexInterval not public static ?
Kevin A. Burton wrote:
> What's the desired pattern of use of TermInfosWriter.indexInterval?

There isn't one. It is not a part of the public API. It is an unsupported internal feature.

> Do I have to compile my own version of Lucene to change this?

Yes.

> The last API was public static final but this is neither public nor static.

It was never public. It used to be static and final, but is now an instance variable.

> I'm wondering if we should just make this a value that can be set at runtime. Considering the memory savings for larger installs this can/will be important.

The place to put getter/setters would be IndexWriter, since that's the public home of all other index parameters. Some changes to DocumentWriter and SegmentMerger would be required to pass this value through to TermInfosWriter from IndexWriter.

Doug
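The pass-through described above might look roughly like the following sketch. Only the names IndexWriter, DocumentWriter, SegmentMerger, and TermInfosWriter come from this thread; every `Sketch*` class, its fields, and the default of 128 are hypothetical stand-ins, and nothing here reflects Lucene's actual internals. The sketch assumes each new segment, whether flushed or merged, is written with the writer's current setting.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical public entry point: the only place callers set the interval. */
final class SketchIndexWriter {
    private int termIndexInterval = 128; // arbitrary default chosen for this sketch
    private final List<SketchSegment> segments = new ArrayList<>();
    private int nextSegmentId = 0;

    void setTermIndexInterval(int interval) {
        if (interval < 1) {
            throw new IllegalArgumentException("interval must be >= 1");
        }
        this.termIndexInterval = interval;
    }

    int getTermIndexInterval() { return termIndexInterval; }

    int segmentCount() { return segments.size(); }

    /** Adding a document writes a new small segment with the current interval. */
    SketchSegment addDocument() {
        SketchSegment segment =
            new SketchDocumentWriter(termIndexInterval).writeSegment("_" + nextSegmentId++);
        segments.add(segment);
        return segment;
    }

    /** Merging all segments writes one new segment, again with the current interval. */
    SketchSegment optimize() {
        SketchSegment merged = new SketchSegmentMerger(termIndexInterval)
            .merge("_" + nextSegmentId++, new ArrayList<>(segments));
        segments.clear();
        segments.add(merged);
        return merged;
    }

    public static void main(String[] args) {
        SketchIndexWriter writer = new SketchIndexWriter();
        writer.setTermIndexInterval(256);
        SketchSegment first = writer.addDocument();
        writer.setTermIndexInterval(64);
        SketchSegment merged = writer.optimize();
        System.out.println(first.name + " interval=" + first.indexInterval);
        System.out.println(merged.name + " interval=" + merged.indexInterval);
    }
}

/** Hypothetical stand-in for the non-public per-segment term writer. */
final class SketchTermInfosWriter {
    final int indexInterval; // fixed for the lifetime of the segment being written
    SketchTermInfosWriter(int indexInterval) { this.indexInterval = indexInterval; }
}

/** Hypothetical record of a written segment and the interval stored with it. */
final class SketchSegment {
    final String name;
    final int indexInterval;
    SketchSegment(String name, int indexInterval) {
        this.name = name;
        this.indexInterval = indexInterval;
    }
}

/** Hypothetical internal class that forwards the interval when flushing documents. */
final class SketchDocumentWriter {
    private final int termIndexInterval;
    SketchDocumentWriter(int termIndexInterval) { this.termIndexInterval = termIndexInterval; }

    SketchSegment writeSegment(String name) {
        SketchTermInfosWriter terms = new SketchTermInfosWriter(termIndexInterval);
        return new SketchSegment(name, terms.indexInterval);
    }
}

/** Hypothetical internal class that forwards the interval when merging segments. */
final class SketchSegmentMerger {
    private final int termIndexInterval;
    SketchSegmentMerger(int termIndexInterval) { this.termIndexInterval = termIndexInterval; }

    SketchSegment merge(String name, List<SketchSegment> inputs) {
        // The inputs may each carry a different interval; the output uses the merger's own.
        SketchTermInfosWriter terms = new SketchTermInfosWriter(termIndexInterval);
        return new SketchSegment(name, terms.indexInterval);
    }
}
```

The design point is that the internal classes stay non-public and receive the value through their constructors, so the only public surface is the getter/setter pair on the writer.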
Re: 1.4.x TermInfosWriter.indexInterval not public static ?
: What's the desired pattern of use of TermInfosWriter.indexInterval?
:
: There isn't one. It is not a part of the public API. It is an
: unsupported internal feature.

: It was never public. It used to be static and final, but is now an
: instance variable.

: The place to put getter/setters would be IndexWriter, since that's the
: public home of all other index parameters. Some changes to
: DocumentWriter and SegmentMerger would be required to pass this value
: through to TermInfosWriter from IndexWriter.

I don't really understand what this variable does, but from what I do understand, changing its value can have significant performance impacts depending on the nature of the data being indexed. That leads me to believe that making it configurable would be a good idea, but it raises some questions:

1) If making it mutable requires changes to other classes to propagate it, then why is it now an instance variable instead of a static? (Presumably making it an instance variable allows subclasses to override the value, but if other classes have internal expectations of the value, that doesn't seem safe.)

2) Should it be configurable through a get/set method, or through a system property? (which rehashes the instance/global question)

3) Is it important that a writer updating an existing index use the same value as the writer that initially created the index? If so, should there really be a preferredIndexInterval variable which is mutable, and a currentIndexInterval which is set to the value of the index currently being updated, such that preferredIndexInterval is used when making an index from scratch and currentIndexInterval is used when adding segments to a new index?

-Hoss
1.4.x TermInfosWriter.indexInterval not public static ?
What's the desired pattern of use of TermInfosWriter.indexInterval?

Do I have to compile my own version of Lucene to change this? The last API was public static final but this is neither public nor static.

I'm wondering if we should just make this a value that can be set at runtime. Considering the memory savings for larger installs this can/will be important.

Kevin

--
Kevin A. Burton, Location - San Francisco, CA
AIM/YIM - sfburtonator, Web - http://peerfear.org/
GPG fingerprint: 5FB2 F3E2 760E 70A8 6174 D393 E84D 8D04 99F1 4412