Re: Possible to mix/match indexes with diff TermInfosWriter.INDEX_INTERVAL ??
Kevin A. Burton wrote:
> I finally had some time to take Doug's advice and reburn our indexes with a larger TermInfosWriter.INDEX_INTERVAL value.

You know... it looks like the problem is that TermInfosReader uses INDEX_INTERVAL during seeks and is probably just jumping RIGHT past the offsets that I need.

If this is going to be a practical way of reducing Lucene's memory footprint for HUGE indexes, then it's going to need a way to change this value based on the index that's currently being opened.

Is there any way to determine the INDEX_INTERVAL from the file? According to http://jakarta.apache.org/lucene/docs/fileformats.html, the .tis file (which, per the docs, the .tii file closely resembles) should have this data:

TermInfoFile (.tis)--> TIVersion, TermCount, IndexInterval, SkipInterval, TermInfos

The only problem is that the .tii and .tis files I have on disk don't have a constant preamble, and it doesn't look like there's an index interval in there...

Kevin

--
Use Rojo (RSS/Atom aggregator). Visit http://rojo.com. Ask me for an invite! Also see irc.freenode.net #rojo if you want to chat.
Rojo is Hiring! - http://www.rojonetworks.com/JobsAtRojo.html
If you're interested in RSS, Weblogs, Social Networking, etc... then you should work for Rojo! If you recommend someone and we hire them you'll get a free iPod!
Kevin A. Burton, Location - San Francisco, CA
AIM/YIM - sfburtonator, Web - http://peerfear.org/
GPG fingerprint: 5FB2 F3E2 760E 70A8 6174 D393 E84D 8D04 99F1 4412

To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
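To illustrate the hypothesis above, here is a toy model in Java of a sampled term index. It is not Lucene's actual reader: the class and method names are hypothetical, and the lookup strategy (jump to `slot * interval`, then scan at most one interval) is an assumption chosen only to show how a reader that assumes a different interval than the writer used can land in the wrong place.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: names and lookup strategy are hypothetical, not Lucene code.
class IntervalMismatchSketch {

    /** Writer side: keep every interval-th term of the sorted list as an index entry. */
    public static List<String> buildIndex(List<String> terms, int interval) {
        List<String> index = new ArrayList<>();
        for (int i = 0; i < terms.size(); i += interval) {
            index.add(terms.get(i));
        }
        return index;
    }

    /**
     * Reader side: pick the last index entry <= target, assume it sits at
     * position slot * readerInterval, and scan at most readerInterval terms.
     * Returns the position of the target, or -1 if the scan misses it.
     */
    public static int lookup(List<String> terms, List<String> index,
                             int readerInterval, String target) {
        int slot = 0;
        while (slot + 1 < index.size() && index.get(slot + 1).compareTo(target) <= 0) {
            slot++;
        }
        int start = slot * readerInterval;
        int end = Math.min(terms.size(), start + readerInterval);
        for (int pos = start; pos < end; pos++) {
            if (terms.get(pos).equals(target)) {
                return pos;
            }
        }
        return -1;
    }

    /** Builds n zero-padded terms so that string order matches numeric order. */
    public static List<String> sampleTerms(int n) {
        List<String> terms = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            terms.add(String.format("term%05d", i));
        }
        return terms;
    }

    public static void main(String[] args) {
        List<String> terms = sampleTerms(1024);
        List<String> index = buildIndex(terms, 256);               // written with 256
        System.out.println(lookup(terms, index, 256, "term00600")); // 600
        System.out.println(lookup(terms, index, 128, "term00600")); // -1
    }
}
```

In this model the index slot is found correctly either way; only the translation from slot to position depends on the interval, which is why the mismatched reader scans the wrong window.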
Re: Possible to mix/match indexes with diff TermInfosWriter.INDEX_INTERVAL ??
Kevin A. Burton wrote:
> I finally had some time to take Doug's advice and reburn our indexes with a larger TermInfosWriter.INDEX_INTERVAL value.

It looks like you're using a pre-1.4 version of Lucene. Since 1.4 this is no longer called TermInfosWriter.INDEX_INTERVAL, but rather TermInfosWriter.indexInterval.

> Is this setting incompatible with older indexes burned with the lower value?

Prior to 1.4, yes. After 1.4, no.

Doug
Re: Possible to mix/match indexes with diff TermInfosWriter.INDEX_INTERVAL ??
Doug Cutting wrote:
> Kevin A. Burton wrote:
>> I finally had some time to take Doug's advice and reburn our indexes with a larger TermInfosWriter.INDEX_INTERVAL value.
>
> It looks like you're using a pre-1.4 version of Lucene. Since 1.4 this is no longer called TermInfosWriter.INDEX_INTERVAL, but rather TermInfosWriter.indexInterval.

Yes... we're trying to be conservative and haven't migrated yet. Though doing so might be required for this move, I think...

>> Is this setting incompatible with older indexes burned with the lower value?
>
> Prior to 1.4, yes. After 1.4, no.

What happens after 1.4? Can I take indexes burned with 256 (a greater value) in 1.3 and open them up correctly with 1.4?

Kevin

PS. Once I get this working I'm going to create a wiki page documenting this process.
Re: Possible to mix/match indexes with diff TermInfosWriter.INDEX_INTERVAL ??
Kevin A. Burton wrote:
>>> Is this setting incompatible with older indexes burned with the lower value?
>>
>> Prior to 1.4, yes. After 1.4, no.
>
> What happens after 1.4? Can I take indexes burned with 256 (a greater value) in 1.3 and open them up correctly with 1.4?

Not without hacking things.

If your 1.3 indexes were generated with 256 then you can modify your version of Lucene 1.4+ to use 256 instead of 128 when reading a Lucene 1.3 format index (SegmentTermEnum.java:54 today).

Prior to 1.4 this was a constant, hardwired into the index format. In 1.4 and later each index segment stores this value as a parameter. So once 1.4 has re-written your index you'll no longer need a modified version.

Doug
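The difference between the two formats can be sketched roughly as follows. This is a self-contained Java illustration, not the code in SegmentTermEnum.java. It assumes that all header fields are 32-bit integers, that the newer layout starts with a negative version marker (the `-2` below is a placeholder), and that the older layout starts directly with the non-negative term count. The field order of the newer layout follows the file-formats page quoted earlier in the thread; the class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Illustrative sketch only: layout details are assumptions, not Lucene code.
class TermDictHeaderSketch {

    /**
     * Returns the index interval to use for a term dictionary header.
     * Old layout (assumed): [termCount]; the interval is not stored, so the
     * caller-supplied legacyInterval is used.
     * New layout (assumed): [version < 0][termCount][indexInterval][skipInterval].
     */
    public static int indexIntervalOf(byte[] header, int legacyInterval) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(header))) {
            int first = in.readInt();
            if (first >= 0) {
                // Old layout: the first int is already the term count.
                return legacyInterval;
            }
            in.readInt();        // term count, not needed here
            return in.readInt(); // index interval stored in the file
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Helper that packs ints big-endian, standing in for a real file header. */
    public static byte[] ints(int... values) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            for (int v : values) {
                out.writeInt(v);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        byte[] oldStyle = ints(5000);              // term count only
        byte[] newStyle = ints(-2, 5000, 256, 16); // version, count, interval, skip
        System.out.println(indexIntervalOf(oldStyle, 128)); // 128
        System.out.println(indexIntervalOf(oldStyle, 256)); // 256
        System.out.println(indexIntervalOf(newStyle, 128)); // 256
    }
}
```

Passing `256` as `legacyInterval` corresponds to the modification described above for indexes built by a 1.3 release patched to use 256; once the interval is stored in the segment, the fallback value no longer matters.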
Re: Possible to mix/match indexes with diff TermInfosWriter.INDEX_INTERVAL ??
Doug Cutting wrote:
> Not without hacking things. If your 1.3 indexes were generated with 256 then you can modify your version of Lucene 1.4+ to use 256 instead of 128 when reading a Lucene 1.3 format index (SegmentTermEnum.java:54 today).
>
> Prior to 1.4 this was a constant, hardwired into the index format. In 1.4 and later each index segment stores this value as a parameter. So once 1.4 has re-written your index you'll no longer need a modified version.

Thanks for the feedback, Doug. This makes more sense now. I didn't understand why the website documented the .tii file as storing the index interval.

I think I'm going to investigate just moving to 1.4... I need to do it anyway. Might as well bite the bullet now.

Kevin