Peter Keegan wrote:
I ran a query performance tester against 8-cpu and 16-cpu Xeon servers
(16/32 cpu hyperthreaded) on Linux. Here are the results:

8-cpu:  275 qps
16-cpu: 305 qps
(the dual-core Opteron servers are still faster)

Here is the stack trace of 8 of the 16 query threads during the test:

        at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:281)
        - waiting to lock <0x0000002adf5b2110> (a org.apache.lucene.index.SegmentReader)
        at org.apache.lucene.search.IndexSearcher.doc(IndexSearcher.java:83)
        at org.apache.lucene.search.MultiSearcher.doc(MultiSearcher.java:146)
        at org.apache.lucene.search.Hits.doc(Hits.java:103)

SegmentReader.document is a synchronized method. I have one stored field
(binary, uncompressed) with an average length of 0.5 KB. The retrieval of
this stored field is within this synchronized code. Since I am using
MMapDirectory, does this retrieval need to be synchronized?

Yes, since in FieldReader the file positions must be synchronized.

The way to avoid this would be to:

1. Add a clone() method to FieldReader that clones its two IndexInputs.
2. Add a ThreadLocal to SegmentReader whose value is a cloned FieldReader.
3. Use the ThreadLocal's FieldReader in the document() method.

TermInfosReader has a similar optimization, using a ThreadLocal containing a SegmentTermEnum for each thread.
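
The three steps above could look roughly like the following. This is only a sketch of the general pattern, not Lucene code: PositionedReader and PerThreadReader are hypothetical stand-ins for FieldReader and SegmentReader, a byte array stands in for the index files, and ThreadLocal.withInitial assumes Java 8 or later.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a reader that keeps a mutable file position.
// Sharing one instance between threads would require synchronization.
class PositionedReader implements Cloneable {
    private final byte[] data;  // shared, read-only backing data
    private int position;       // mutable state, private to each instance

    PositionedReader(byte[] data) {
        this.data = data;
    }

    // Seek to offset, then read length bytes. Not thread-safe on one instance.
    byte[] read(int offset, int length) {
        position = offset;
        byte[] out = new byte[length];
        System.arraycopy(data, position, out, 0, length);
        position += length;
        return out;
    }

    // Step 1: a clone shares the backing data but has its own position.
    @Override
    public PositionedReader clone() {
        try {
            PositionedReader copy = (PositionedReader) super.clone();
            copy.position = 0;
            return copy;
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e);
        }
    }
}

// Hypothetical stand-in for the class whose document() was synchronized.
class PerThreadReader {
    // Counts clones, only so the example can show one clone per thread.
    final AtomicInteger clones = new AtomicInteger();

    // Step 2: each thread lazily receives its own cloned reader.
    private final ThreadLocal<PositionedReader> local;

    PerThreadReader(PositionedReader original) {
        this.local = ThreadLocal.withInitial(() -> {
            clones.incrementAndGet();
            return original.clone();
        });
    }

    // Step 3: no lock is needed, because the position being moved
    // belongs to the calling thread's private clone.
    byte[] document(int offset, int length) {
        return local.get().read(offset, length);
    }
}
```

The trade-off is one extra reader (and its file handles or buffers) per thread that touches the segment, in exchange for removing the lock from the read path.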

Doug
