Hi,
I ran into a problem earlier this week, where by an index of 8
million small documents resulted in an index file of 2GB. It turns
out this is a common file system limit (some say it might be a java
limit as well).
Anyway, I have no idea which index file it was, but it seems that I
need to be able to control the distribution of the index ?
Each document looks like this:
exact_keywords: my_phrase_with_underscores , indexed, but not tokenized.
body: my phrase with underscores (repeated N times for weight)
<EOM> title words <EOM> body <EOM>
add_info1:
add_info2: (just stored, short info)
add_info3:
The body is pretty short (say 120 words or so).
As it turns out, the Index I had created was actually usable (thank
heaven, after 24 hours of indexing :)).
*** Anyway, is there anyway to control how big the indexes grow ? ****
Cheers,
Winton
p.s. A big thank you to Doug -- I was able to deliver my experimental
results on time :) My boss is V.Happy :)
--
To unsubscribe, e-mail: <mailto:[EMAIL PROTECTED]>
For additional commands, e-mail: <mailto:[EMAIL PROTECTED]>