Sounds interesting. (Is there a B-tree serialization impl in Java?)
.V

jian chen wrote:

Hi,

If it is really the case that every 128th term is loaded into memory,
could you use a relational database or a B-tree index to do the work
of indexing the terms instead?

Even if you create another level of indexing on top of the .tii file,
it is just a hack and would not scale well.

I would think a b/b+ tree based approach is the way to go for better
memory utilization.

Cheers,

Jian


On Sat, 22 Jan 2005 08:32:50 -0800 (PST), Otis Gospodnetic
<[EMAIL PROTECTED]> wrote:


There, Kevin, that's what I was referring to: the .tii file.

Otis

--- Paul Elschot <[EMAIL PROTECTED]> wrote:



On Saturday 22 January 2005 01:39, Kevin A. Burton wrote:


Kevin A. Burton wrote:



We have one large index right now... it's about 60G ... When I open it
the Java VM used 940M of memory. The VM does nothing else besides
open this index.

After thinking about it I guess 1.5% of memory per index really isn't
THAT bad. What would be nice is if there was a way to do this from disk
and then use a buffer (either via the filesystem or in-VM memory) to
access these variables.


It's even documented. From:
http://jakarta.apache.org/lucene/docs/fileformats.html :



The term info index, or .tii file.
This contains every IndexIntervalth entry from the .tis file, along with its
location in the "tis" file. This is designed to be read entirely into memory
and used to provide random access to the "tis" file.
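
To make the quoted description concrete, here is a minimal sketch of the
general sparse-index idea: keep every interval-th term of a sorted
dictionary in memory, binary-search that small array, then scan at most
one interval of the full dictionary. The class name SparseTermIndex and
its methods are hypothetical, the full dictionary is modeled as a plain
array rather than a file on disk, and none of this is Lucene's actual
code or file layout.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of a sparse term index; hypothetical, not Lucene's classes. */
final class SparseTermIndex {
    private final String[] terms;      // stands in for the sorted ".tis" dictionary
    private final String[] indexTerms; // every interval-th term (the ".tii" role)
    private final int interval;

    SparseTermIndex(String[] sortedTerms, int interval) {
        this.terms = sortedTerms;
        this.interval = interval;
        List<String> sampled = new ArrayList<>();
        for (int i = 0; i < sortedTerms.length; i += interval) {
            sampled.add(sortedTerms[i]);
        }
        this.indexTerms = sampled.toArray(new String[0]);
    }

    /** Returns the position of term in the full dictionary, or -1 if absent. */
    int lookup(String term) {
        int slot = Arrays.binarySearch(indexTerms, term);
        if (slot < 0) {
            // binarySearch returns -(insertionPoint) - 1 on a miss;
            // we want the last sampled term that sorts before the target.
            slot = -slot - 2;
        }
        if (slot < 0) {
            return -1; // target sorts before the first dictionary entry
        }
        int start = slot * interval;
        int end = Math.min(start + interval, terms.length);
        // Sequential scan of at most one interval of the full dictionary.
        for (int i = start; i < end; i++) {
            int cmp = terms[i].compareTo(term);
            if (cmp == 0) return i;
            if (cmp > 0) break;
        }
        return -1;
    }

    /** Number of terms held in memory by the sparse index. */
    int inMemoryEntries() {
        return indexTerms.length;
    }

    public static void main(String[] args) {
        String[] dict = {"apple", "banana", "cherry", "date", "elder", "fig", "grape"};
        SparseTermIndex index = new SparseTermIndex(dict, 3);
        System.out.println("position of elder: " + index.lookup("elder"));
        System.out.println("in-memory entries: " + index.inMemoryEntries());
    }
}
```

In this model the memory cost grows with the number of sampled terms
rather than with the full dictionary, which is the trade-off the thread
is discussing: a larger interval means less RAM but a longer scan per
lookup.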


My guess is that this is what you see happening.
To see the actual .tii file, you need the non-default file format.

Once searching starts you'll also see that the field norms are loaded;
these take one byte per searched field per document.
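
As a rough back-of-the-envelope aid, the two figures mentioned in this
thread (one index entry per 128 terms, and one norm byte per searched
field per document) can be turned into a small calculation. The class
and method names below are hypothetical, and the document, field, and
term counts in main are made-up inputs, not measurements from Kevin's
index. The sketch counts index entries only; it does not try to guess
how many bytes each in-memory entry costs.

```java
/** Hypothetical helper for rough estimates; not part of Lucene. */
final class IndexMemoryEstimate {

    /** One byte per searched field per document, as stated above. */
    static long normsBytes(long numDocs, int searchedFields) {
        return numDocs * searchedFields;
    }

    /** Number of sampled terms when every interval-th term is kept (rounded up). */
    static long indexEntries(long totalTerms, int interval) {
        return (totalTerms + interval - 1) / interval;
    }

    public static void main(String[] args) {
        long numDocs = 10_000_000L;     // made-up input
        int searchedFields = 3;         // made-up input
        long totalTerms = 100_000_000L; // made-up input
        System.out.println("norms bytes: " + normsBytes(numDocs, searchedFields));
        System.out.println("index entries: " + indexEntries(totalTerms, 128));
    }
}
```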



This would be similar to the way the MySQL index cache works...


It would be possible to add another level of indexing to the terms.
No one has done this yet, so I guess it's preferred to buy RAM
instead...

Regards,
Paul Elschot


--------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]





--
RiA-SoA w/JDNC <http://www.SandraSF.com> forums
- help develop a community
My blog <http://www.sandrasf.com/adminBlog>

