On 25-May-07, at 2:09 PM, Yonik Seeley wrote:

> On 5/25/07, Mike Klaas <[EMAIL PROTECTED]> wrote:
>
>> HashDocSet maxSize: perhaps consider increasing this, or making this
>> by default a parameter which is tuned automatically (.5% of maxDocs,
>> for instance)

> I think when HashDocSet is large enough, it can be slower than
> OpenBitSet for taking intersections, even when it still saves memory.
> So it depends on what one is optimizing for.

> I picked 3000 long ago since it seemed the fastest for faceting
> with one particular data set (between 500K and 1M docs), but that was
> before OpenBitSet.

Wasn't HashDocSet significantly optimized for intersection recently?

> It also caps the max table size at 4096 entries
> (16K RAM) (power-of-two hash table with a load factor of .75).  Does
> it make sense to go up to 8K entries?  Do you have any data on
> different sizes?
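As a sanity check on the numbers above, here is the arithmetic spelled out (assumptions: one 4-byte int per slot, and usable capacity = slots * load factor):

```python
# Sanity check on the table-size arithmetic quoted above.
# Assumptions: one 4-byte int per hash slot; the table is only filled
# to slots * load_factor before falling back to a bitset.
def hash_table_footprint(slots, load_factor=0.75, bytes_per_slot=4):
    """Return (max docs the table can hold, table size in bytes)."""
    assert slots & (slots - 1) == 0, "slot count must be a power of two"
    return int(slots * load_factor), slots * bytes_per_slot

print(hash_table_footprint(4096))  # 4096 slots: 3072-doc cap (hence ~3000), 16 KB
print(hash_table_footprint(8192))  # 8192 slots: 6144-doc cap, 32 KB
```

So going to 8K entries doubles the per-set RAM to 32 KB and roughly doubles the doc-count cap.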

Unfortunately, I don't. I'm using 20K right now for indices ranging from 3M to 8M docs, but that was based on advice from the wiki, and the memory savings seemed worth it (each bit filter is pushing 500 KB to 1 MB at that scale). I might have time to run some experiments before 1.2 is released; if not, 3000 seems like a well-founded default.
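For anyone curious where those figures come from, here is the rough comparison (assumptions: an OpenBitSet filter costs maxDoc / 8 bytes, one bit per doc and ignoring object overhead; a HashDocSet needs the smallest power-of-two table whose slots * load factor covers the doc count, at 4 bytes per slot):

```python
# Rough memory comparison behind the 500 KB - 1 MB per-filter figures.
def bitset_bytes(max_doc):
    # one bit per document in the index
    return max_doc // 8

def hashset_bytes(num_docs, load_factor=0.75, bytes_per_slot=4):
    # smallest power-of-two table whose usable capacity covers num_docs
    slots = 1
    while slots * load_factor < num_docs:
        slots *= 2
    return slots * bytes_per_slot

print(bitset_bytes(3_000_000))   # ~375 KB per bit filter at 3M docs
print(bitset_bytes(8_000_000))   # ~1 MB per bit filter at 8M docs
print(hashset_bytes(20_000))     # a 20K-doc HashDocSet: ~128 KB
```

So at this index size a small matching set is several times cheaper as a hash set than as a bitset, which is why the higher maxSize seemed worth it.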

>> Most people will start with the example solrconfig.xml, I suspect,
>> and getting the performance-related settings right at the start will
>> help the perception of Solr's performance.  I'd be tempted to
>> increase the default filterCache size too, but that can have quite
>> high memory requirements.

> Yeah, many people won't think to increase the VM heap size.
> Perhaps that's better as a documentation fix.

I just added a note to SolrPerformanceFactors. Most of the information is already on the wiki.
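For anyone following along, the setting being discussed is the filterCache entry in the example solrconfig.xml; it looks roughly like this sketch (the numbers here are illustrative, not a tested recommendation -- check the shipped example config):

```xml
<!-- illustrative values only: a larger size/autowarmCount trades heap
     for cache hit rate; each cached bit filter costs ~maxDoc/8 bytes -->
<filterCache
  class="solr.LRUCache"
  size="512"
  initialSize="512"
  autowarmCount="256"/>
```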

> What about commenting out most of the default parameters in the dismax
> handler config, so it becomes more standard & usable (w/o editing its
> config) after someone customizes their schema?

Makes sense, but I agree with Hoss that it is nice for the user to be able to easily use the example OOB.

-Mike
