Re: Possible to mix/match indexes with diff TermInfosWriter.INDEX_INTERVAL ??

2005-02-24 Thread Kevin A. Burton
Kevin A. Burton wrote:
> I finally had some time to take Doug's advice and reburn our indexes
> with a larger TermInfosWriter.INDEX_INTERVAL value.

You know... it looks like the problem is that TermInfosReader uses 
INDEX_INTERVAL during seeks and is probably just jumping RIGHT past the 
offsets that I need.

If this is going to be a practical way of reducing Lucene's memory 
footprint for HUGE indexes then it's going to need a way to change this 
value based on the index that's currently being opened.
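
To make the suspected seek problem concrete, here is a toy model of a sampled term index. It is not Lucene's code: the class and method names are made up for illustration, and it assumes the reader derives its starting position purely from the index slot number times its own interval. Real Lucene also stores a file pointer per index entry, so the actual failure mode may differ in detail.

```java
import java.util.ArrayList;
import java.util.List;

public class IntervalMismatchDemo {

    // Writer side: keep every writeInterval-th term as an index entry.
    static List<String> buildIndex(List<String> terms, int writeInterval) {
        List<String> index = new ArrayList<>();
        for (int i = 0; i < terms.size(); i += writeInterval) {
            index.add(terms.get(i));
        }
        return index;
    }

    // Reader side: pick the last index slot whose term is <= target, ASSUME it
    // sits at ordinal slot * readInterval, then scan at most readInterval terms.
    // Returns the ordinal of the target, or -1 if the scan misses it.
    static int lookup(List<String> terms, List<String> index,
                      int readInterval, String target) {
        int slot = 0;
        while (slot + 1 < index.size()
                && index.get(slot + 1).compareTo(target) <= 0) {
            slot++;
        }
        int start = slot * readInterval;
        int end = Math.min(terms.size(), start + readInterval);
        for (int i = start; i < end; i++) {
            if (terms.get(i).equals(target)) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // 1024 sorted terms: term0000 .. term1023
        List<String> terms = new ArrayList<>();
        for (int i = 0; i < 1024; i++) {
            terms.add(String.format("term%04d", i));
        }
        List<String> index = buildIndex(terms, 256);

        // Reader interval matches the writer: the term is found.
        System.out.println(lookup(terms, index, 256, "term0700")); // 700
        // Reader assumes 128 for an index written with 256: the scan window
        // is computed from the wrong ordinal and the term is missed.
        System.out.println(lookup(terms, index, 128, "term0700")); // -1
    }
}
```

In this model the mismatch does not corrupt anything on disk; the reader simply looks in the wrong window, which is why the interval has to travel with the index rather than live in the reader as a constant.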

Is there any way to determine the INDEX_INTERVAL from the file? According to:

http://jakarta.apache.org/lucene/docs/fileformats.html

the .tis file (which, per the docs, the .tii file closely resembles) 
should have this data:

TermInfoFile (.tis)-- TIVersion, TermCount, IndexInterval, 
SkipInterval, TermInfos

The only problem is that the .tii and .tis files I have on disk don't 
have a constant preamble, and it doesn't look like there's an index 
interval stored in them...

Kevin
--
Use Rojo (RSS/Atom aggregator).  Visit http://rojo.com. Ask me for an 
invite!  Also see irc.freenode.net #rojo if you want to chat.

Rojo is Hiring! - http://www.rojonetworks.com/JobsAtRojo.html
If you're interested in RSS, Weblogs, Social Networking, etc... then you 
should work for Rojo!  If you recommend someone and we hire them you'll 
get a free iPod!
   
Kevin A. Burton, Location - San Francisco, CA
  AIM/YIM - sfburtonator,  Web - http://peerfear.org/
GPG fingerprint: 5FB2 F3E2 760E 70A8 6174 D393 E84D 8D04 99F1 4412

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


Re: Possible to mix/match indexes with diff TermInfosWriter.INDEX_INTERVAL ??

2005-02-24 Thread Doug Cutting
Kevin A. Burton wrote:
> I finally had some time to take Doug's advice and reburn our indexes
> with a larger TermInfosWriter.INDEX_INTERVAL value.

It looks like you're using a pre-1.4 version of Lucene.  Since 1.4 this 
is no longer called TermInfosWriter.INDEX_INTERVAL, but rather 
TermInfosWriter.indexInterval.

> Is this setting incompatible with older indexes burned with the lower
> value?

Prior to 1.4, yes.  After 1.4, no.

Doug


Re: Possible to mix/match indexes with diff TermInfosWriter.INDEX_INTERVAL ??

2005-02-24 Thread Kevin A. Burton
Doug Cutting wrote:
> Kevin A. Burton wrote:
>> I finally had some time to take Doug's advice and reburn our indexes
>> with a larger TermInfosWriter.INDEX_INTERVAL value.
>
> It looks like you're using a pre-1.4 version of Lucene.  Since 1.4
> this is no longer called TermInfosWriter.INDEX_INTERVAL, but rather
> TermInfosWriter.indexInterval.

Yes... we're trying to be conservative and haven't migrated yet.  Though 
doing so might be required for this move I think...

>> Is this setting incompatible with older indexes burned with the lower
>> value?
>
> Prior to 1.4, yes.  After 1.4, no.

What happens after 1.4?  Can I take indexes burned with 256 (a greater 
value) in 1.3 and open them up correctly with 1.4?

Kevin
PS.  Once I get this working I'm going to create a wiki page documenting 
this process.

Kevin


Re: Possible to mix/match indexes with diff TermInfosWriter.INDEX_INTERVAL ??

2005-02-24 Thread Doug Cutting
Kevin A. Burton wrote:
>>> Is this setting incompatible with older indexes burned with the lower
>>> value?
>>
>> Prior to 1.4, yes.  After 1.4, no.
>
> What happens after 1.4?  Can I take indexes burned with 256 (a greater
> value) in 1.3 and open them up correctly with 1.4?

Not without hacking things.  If your 1.3 indexes were generated with 256 
then you can modify your version of Lucene 1.4+ to use 256 instead of 
128 when reading a Lucene 1.3 format index (SegmentTermEnum.java:54 today).

Prior to 1.4 this was a constant, hardwired into the index format.  In 
1.4 and later each index segment stores this value as a parameter.  So 
once 1.4 has re-written your index you'll no longer need a modified version.
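
As a rough illustration of the difference, the sketch below reads the start of a .tis-style header and decides which interval to use. It is not the code in SegmentTermEnum.java. It assumes the layout quoted earlier in the thread (TIVersion, TermCount, IndexInterval, SkipInterval) with a 32-bit version, a 64-bit term count and 32-bit intervals, all big-endian, and it assumes that a pre-1.4 file has no version marker and starts with a non-negative term count. The class name, the helper methods and the version value -2 are illustrative assumptions, and the headers are synthetic bytes rather than real index files.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class TisHeaderSketch {

    // Returns the index interval a reader should use for this header.
    // legacyInterval is the value to assume when the file stores none.
    static int readIndexInterval(DataInputStream in, int legacyInterval)
            throws IOException {
        int first = in.readInt();
        if (first >= 0) {
            // Assumed pre-1.4 layout: no version marker, so the first int
            // is the term count and the interval is not stored in the file.
            return legacyInterval;
        }
        in.readLong();       // TermCount (skipped)
        return in.readInt(); // IndexInterval, stored per segment
    }

    // Synthetic 1.4-style header: TIVersion, TermCount, IndexInterval, SkipInterval.
    static byte[] header14(long termCount, int indexInterval, int skipInterval)
            throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeInt(-2); // assumed version marker; only its sign matters here
        out.writeLong(termCount);
        out.writeInt(indexInterval);
        out.writeInt(skipInterval);
        return bytes.toByteArray();
    }

    // Synthetic pre-1.4-style header: just a non-negative term count.
    static byte[] legacyHeader(int termCount) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        new DataOutputStream(bytes).writeInt(termCount);
        return bytes.toByteArray();
    }

    static DataInputStream open(byte[] data) {
        return new DataInputStream(new ByteArrayInputStream(data));
    }

    public static void main(String[] args) throws IOException {
        // 1.4-style segment: the stored value wins, whatever the fallback is.
        System.out.println(readIndexInterval(open(header14(1000L, 256, 16)), 128)); // 256
        // Old-format segment read with the stock fallback of 128.
        System.out.println(readIndexInterval(open(legacyHeader(1000)), 128));       // 128
        // Old-format segment built with 256: the reader must be told so.
        System.out.println(readIndexInterval(open(legacyHeader(1000)), 256));       // 256
    }
}
```

The third call corresponds to the modified reader described above: for an old-format index there is nothing in the file to consult, so the fallback has to be changed by hand until the index has been re-written in the newer format.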

Doug


Re: Possible to mix/match indexes with diff TermInfosWriter.INDEX_INTERVAL ??

2005-02-24 Thread Kevin A. Burton
Doug Cutting wrote:
> Not without hacking things.  If your 1.3 indexes were generated with
> 256 then you can modify your version of Lucene 1.4+ to use 256 instead
> of 128 when reading a Lucene 1.3 format index (SegmentTermEnum.java:54
> today).
>
> Prior to 1.4 this was a constant, hardwired into the index format.  In
> 1.4 and later each index segment stores this value as a parameter.  So
> once 1.4 has re-written your index you'll no longer need a modified
> version.

Thanks for the feedback, Doug.

This makes more sense now. I didn't understand why the website 
documented the .tii file as storing the index interval.

I think I'm going to investigate just moving to 1.4 ...  I need to do it 
anyway.  Might as well bite the bullet now.

Kevin