Hi,

If it is really the case that every 128th term is loaded into memory,
could you use a relational database or a B-tree to do the work of
indexing the terms instead?

Even if you create another level of indexing on top of the .tii file,
it is just a hack and would not scale well.

I would think a b/b+ tree based approach is the way to go for better
memory utilization.
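
To make sure we are talking about the same thing, here is a minimal
sketch of the every-Nth-term scheme as I understand it from the file
formats description quoted below. It is not Lucene's actual code: the
class name SparseTermIndex, its methods, and the use of an in-memory
list to stand in for the on-disk .tis file are all my own assumptions
for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only; these are not Lucene classes.
final class SparseTermIndex {
    // Stands in for the sorted term dictionary on disk (the .tis file).
    private final List<String> allTerms;
    // Stands in for the in-memory index (the .tii file): every interval-th term.
    private final List<String> indexedTerms = new ArrayList<>();
    private final int interval;

    SparseTermIndex(List<String> sortedTerms, int interval) {
        if (interval < 1) {
            throw new IllegalArgumentException("interval must be >= 1");
        }
        this.allTerms = sortedTerms;
        this.interval = interval;
        // Keep only terms 0, interval, 2*interval, ... in memory.
        for (int i = 0; i < sortedTerms.size(); i += interval) {
            indexedTerms.add(sortedTerms.get(i));
        }
    }

    // Number of terms held in memory.
    int inMemoryEntries() {
        return indexedTerms.size();
    }

    // Returns the position of the term in the full dictionary, or -1 if absent.
    int lookup(String term) {
        // Step 1: binary search the in-memory entries.
        int slot = Collections.binarySearch(indexedTerms, term);
        if (slot < 0) {
            // Not an indexed term: take the last indexed term that sorts before it.
            slot = -slot - 2;
        }
        if (slot < 0) {
            return -1; // sorts before the first term in the dictionary
        }
        // Step 2: scan at most `interval` entries of the full dictionary.
        int start = slot * interval;
        int end = Math.min(start + interval, allTerms.size());
        for (int i = start; i < end; i++) {
            int cmp = allTerms.get(i).compareTo(term);
            if (cmp == 0) {
                return i;
            }
            if (cmp > 0) {
                break; // passed the place where the term would be
            }
        }
        return -1;
    }
}
```

With an interval of 128, the in-memory part is 1/128th of the term
count, but it still grows linearly with the number of terms, which is
the scaling concern I have.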

Cheers,

Jian


On Sat, 22 Jan 2005 08:32:50 -0800 (PST), Otis Gospodnetic
<[EMAIL PROTECTED]> wrote:
> There, Kevin, that's what I was referring to: the .tii file.
> 
> Otis
> 
> --- Paul Elschot <[EMAIL PROTECTED]> wrote:
> 
> > On Saturday 22 January 2005 01:39, Kevin A. Burton wrote:
> > > Kevin A. Burton wrote:
> > >
> > > > We have one large index right now... it's about 60G ... When I open it
> > > > the Java VM used 940M of memory.  The VM does nothing else besides
> > > > open this index.
> > >
> > > After thinking about it I guess 1.5% of memory per index really isn't
> > > THAT bad.  What would be nice is if there were a way to do this from disk
> > > and then use a buffer (either via the filesystem or in-VM memory) to
> > > access these variables.
> >
> > It's even documented. From:
> > http://jakarta.apache.org/lucene/docs/fileformats.html :
> >
> > >The term info index, or .tii file.
> > >This contains every IndexIntervalth entry from the .tis file, along with its
> > >location in the "tis" file. This is designed to be read entirely into memory
> > >and used to provide random access to the "tis" file.
> >
> > My guess is that this is what you see happening.
> > To see the actual .tii file, you need the non-default file format.
> >
> > Once searching starts you'll also see that the field norms are loaded;
> > these take one byte per searched field per document.
> >
> > > This would be similar to the way the MySQL index cache works...
> >
> > It would be possible to add another level of indexing to the terms.
> > No one has done this yet, so I guess it's preferred to buy RAM instead...
> >
> > Regards,
> > Paul Elschot
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: [EMAIL PROTECTED]
> > For additional commands, e-mail: [EMAIL PROTECTED]
> >
> >
> 