[jira] Updated: (LUCENE-2143) Understand why NRT performance is affected by flush frequency

Michael McCandless (JIRA) Mon, 14 Dec 2009 05:17:44 -0800

     [ 
https://issues.apache.org/jira/browse/LUCENE-2143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Michael McCandless updated LUCENE-2143:
---------------------------------------

    Attachment: SearchTest.java

So, good news / bad news...

The good news is I got a more mainstream test env (CentOS 5.4) online.

The bad news is the strange anomalies when testing NRT still occur,
and flushing every 100 docs does not work around them.

But then the good news is, I managed to isolate the problem to the
hotspot compiler: somehow, it consistently compiles Lucene's search
code less efficiently (20-30% slower) depending on which test is being
run, which basically makes it impossible to really test performance
tradeoffs of NRT.

I've attached a simple SearchTest.java that shows the hotspot issue.

Run it like this:
{code}
java SearchTest /path/to/index <warmMethod>
{code}

I'm testing against a 5M doc Wikipedia index.

The <warmMethod> can be:

  * "writer": open a writer, index docs, then roll back

  * "nrt": same as "writer", but periodically get an NRT reader

  * "reader": just open an IndexReader N times, then close it

  * "searcher": same as "reader", but do some searching against each
    opened reader

  * "none": do no warming

After the warming, the test just runs a set of searches (TermQuery for
terms 0, 1, 2 ... 9) 10 times, then prints the min time.
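
The "run it 10 times, print the min" step can be sketched as below. This is
only an illustration, not the code in the attached SearchTest.java: the class
name MinTimer, the method minMillis, and the stand-in workload are all
hypothetical, and the real test would run the TermQuery searches where the
placeholder loop is.

```java
final class MinTimer {

    /**
     * Runs the workload the given number of rounds and returns the fastest
     * wall-clock time in milliseconds. Taking the min (rather than the mean)
     * discards rounds disturbed by GC pauses or JIT compilation.
     */
    static long minMillis(Runnable workload, int rounds) {
        if (rounds <= 0) {
            throw new IllegalArgumentException("rounds must be positive");
        }
        long best = Long.MAX_VALUE;
        for (int i = 0; i < rounds; i++) {
            long start = System.nanoTime();
            workload.run();
            long elapsedMs = (System.nanoTime() - start) / 1_000_000L;
            best = Math.min(best, elapsedMs);
        }
        return best;
    }

    public static void main(String[] args) {
        // Placeholder workload (an assumption for this sketch); the real test
        // would search for the terms 0..9 against the index here.
        long[] sink = new long[1];
        Runnable workload = () -> {
            long sum = 0;
            for (int i = 0; i < 50_000_000; i++) {
                sum += i % 7;
            }
            sink[0] = sum; // keep the result live so the loop is not optimized away
        };
        System.out.println("  " + minMillis(workload, 10) + " msec");
    }
}
```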

On nearly every JRE version I've tested, on OpenSolaris 2009.06 and
CentOS 5.4, warming with NRT causes a "permanent" loss of search
performance of somewhere between 20% and 30%.  E.g., here are my results on
OpenSolaris:

{code}
nrt...
  5718 msec
searcher...
  4664 msec
reader...
  4771 msec
writer...
  4785 msec
none...
  4839 msec
{code}

On CentOS:
{code}
nrt...
  4550 msec
searcher...
  3760 msec
reader...
  4730 msec
writer...
  3780 msec
none...
  3766 msec
{code}

(In this case the "reader" warming also kicked hotspot into the slow
mode... it seems to be intermittent because sometimes "reader" is
fast)
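
To put the timings above on a common scale, the slowdown of each run relative
to a baseline can be computed as in the sketch below. Using the "none" run as
the baseline is an assumption made here for illustration; the class name
Slowdown and the method percentSlower are hypothetical.

```java
final class Slowdown {

    /** Percent by which measuredMs exceeds baselineMs. */
    static double percentSlower(long measuredMs, long baselineMs) {
        return 100.0 * (measuredMs - baselineMs) / baselineMs;
    }

    public static void main(String[] args) {
        // Timings (msec) copied from the two result listings above.
        System.out.printf("OpenSolaris nrt vs none:  %.1f%%%n", percentSlower(5718, 4839));
        System.out.printf("CentOS nrt vs none:       %.1f%%%n", percentSlower(4550, 3766));
        System.out.printf("CentOS reader vs none:    %.1f%%%n", percentSlower(4730, 3766));
    }
}
```

With these inputs the three ratios come out at about 18%, 21%, and 26%
respectively.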

I run java as "java -server -Xms1g -Xmx1g"

It's very odd... I suspect something buggy in hotspot, but I'm not
sure how to isolate it.  It seems to somehow kick itself into a state
where it produces less optimal code for searching.  And we're not
talking about that many methods: just the hot spots for running
TermQuery...

I even printed out the assembly code (-XX:+PrintOptoAssembly) and it
was very strange -- e.g., even IndexInput.readVInt was compiled
differently if you warmed with "nrt" vs. the others.  I don't get it.

I'm trying to find a workaround that makes hotspot more manageable so
I can run real tests....


> Understand why NRT performance is affected by flush frequency
> -------------------------------------------------------------
>
>                 Key: LUCENE-2143
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2143
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Index
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>             Fix For: 3.1
>
>         Attachments: SearchTest.java
>
>
> In LUCENE-2061 (perf tests for NRT), I test NRT performance by first
> getting a baseline QPS with only searching, using enough threads to
> saturate.
> Then, I pick an indexing rate (I used 100 docs/sec), and index docs at
> that rate, and I also reopen a NRT reader at different frequencies
> (10/sec, 1/sec, every 5 seconds, etc.), and then again test QPS
> (saturated).
> I think this is a good approach for testing NRT -- apps can see, as a
> function of "freshness" and at a fixed indexing rate, what the cost is
> to QPS.  You'd expect as index rate goes up, and freshness goes up,
> QPS will go down.
> But I found something very strange: the low frequency reopen rates
> often caused a highish hit to QPS.  When I forced IW to flush every
> 100 docs (= once per second), the performance was generally much
> better.
> I actually would've expected the reverse -- flushing in batch ought to
> use fewer resources.
> One theory is something odd about my test env (based on OpenSolaris),
> so I'd like to retest on a more mainstream OS.
> I'm opening this issue to get to the bottom of it...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org
