: The first int to Lucene41PostingsFormat is the min block size (default
: 25) and the second is the max (default 48) for the block tree terms
: dict.
We were discussing over on the solr-user mailing list how Tom would/could
go about configuring Solr to use a custom subclass of
Lucene41Postin
Thanks Mike,
Do you know how I can configure Solr to use the min=200 and
max=398 block sizes you suggested? Or should I ask on the Solr list?
Tom
On Sat, Jan 10, 2015 at 4:46 AM, Michael McCandless <
luc...@mikemccandless.com> wrote:
> The first int to Lucene41PostingsFormat is the min block s
Thanks Mike,
> OK. It would be good to know where all your RAM is being consumed,
> and how much of that is really the terms index: it ought to be a very
> small part of it.
>
I made a bunch of heap dumps. I just watched with jconsole and ran
jmap -histo when memory use got high.
I've appende
On Sat, Jan 10, 2015 at 7:58 PM, Tom Burton-West wrote:
> Thanks Mike,
>
> We run our Solr 3.x indexing with 10GB/shard. I've been testing Solr 4
> with 4, 6, and 8GB for heap. As of Friday night when the indexes were about
> half done (about 400GB on disk) only the 4GB had issues. I'll find out
Tom:
I'll be very interested to see your final numbers. I did a worst-case
test at one point and saw a 2/3 reduction, but that was deliberately
"worst case": I used a bunch of string/text types, did some faceting on
them, etc. IOW, not real-world at all. So it'll be cool to see what you
come up
Thanks Mike,
We run our Solr 3.x indexing with 10GB/shard. I've been testing Solr 4
with 4, 6, and 8GB for heap. As of Friday night when the indexes were about
half done (about 400GB on disk) only the 4GB had issues. I'll find out on
Monday if the other runs had issues. If we can go from 10GB i
The first int to Lucene41PostingsFormat is the min block size (default
25) and the second is the max (default 48) for the block tree terms
dict.
The max must be >= 2*(min-1).
Since you were using 8X the default before, maybe try min=200 and
max=398? However, block tree should have been more RAM
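A quick way to sanity-check a min/max pair against the rule quoted above (max >= 2*(min-1)) before wiring it into a custom postings format. This is only a sketch: BlockSizeCheck and its methods are hypothetical names, not part of Lucene or Solr, and it checks nothing beyond that one inequality.

```java
// Hypothetical helper, not part of Lucene or Solr. It only checks the
// rule from this thread, max >= 2*(min-1), for the two ints that would be
// passed to Lucene41PostingsFormat as min and max block size.
public class BlockSizeCheck {

    // True when the pair satisfies max >= 2*(min-1).
    static boolean isValid(int min, int max) {
        return max >= 2 * (min - 1);
    }

    // Smallest max that the rule allows for a given min.
    static int smallestValidMax(int min) {
        return 2 * (min - 1);
    }

    public static void main(String[] args) {
        // Defaults from the thread, the suggested pair, and one pair
        // that is one short of the bound.
        int[][] candidates = { {25, 48}, {200, 398}, {200, 397} };
        for (int[] c : candidates) {
            System.out.println("min=" + c[0] + " max=" + c[1]
                    + " valid=" + isValid(c[0], c[1]));
        }
    }
}
```

For min=200 the smallest max the rule allows is 2*(200-1) = 398, which is the value suggested above. The defaults of 25 and 48 sit on the same boundary.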
Hello all,
We have over 3 billion unique terms in our indexes and with Solr 3.x we set
the TermIndexInterval to about 8 times its default value in order to index
without OOMs. (
http://www.hathitrust.org/blogs/large-scale-search/too-many-words-again)
We are now working with Solr 4 and running in