Thank you guys for all the suggestions and help!  I've identified the main
culprit with debug=timing: it was the mlt (MoreLikeThis) component.  After
I removed it, query speed went back to something reasonable.  Another
culprit is the expand component, but I can't remove that one.  We've
downgraded our Amazon instance to 60G of memory with general-purpose SSD
and the performance is pretty good.  It's only $0.70/hr versus $2.80/hr for
the 244G-memory instance :)
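
For anyone else chasing the same thing, attaching debug=timing is just a
matter of adding the parameter to the query; a minimal example (host and
core name below are placeholders, not our actual setup):

  curl 'http://localhost:8983/solr/collection1/select?q=(phillip%20morris)&rows=50&wt=json&debug=timing'

The response then includes a debug/timing section with prepare and process
times for each search component (query, facet, mlt, highlight, expand,
etc.), which is how the mlt time stood out.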

I also added all the suggested JVM parameters.  Now I have a gc.log that I
can dig into.
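
In case it helps anyone else, the GC log itself only needs the standard
logging switches; a typical Java 7/8-style set looks like this (not
necessarily the exact list that was suggested, and the log path is just an
example):

  -verbose:gc
  -Xloggc:/var/solr/logs/gc.log
  -XX:+PrintGCDetails
  -XX:+PrintGCDateStamps
  -XX:+PrintGCTimeStamps
  -XX:+PrintGCApplicationStoppedTime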

One thing I would like to understand is how memory is managed by Solr.

If I do 'top -u solr', I see something like this:

Mem:  62920240k total, 62582524k used,   337716k free,   133360k buffers
Swap:        0k total,        0k used,        0k free, 54500892k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
    
 4266 solr      20   0  192g 5.1g 854m S  0.0  8.4  37:09.97 java

There are two things I'd like to confirm:
1) Mem: 62920240k total, 62582524k used.  I think this is what the Solr
admin "physical memory" bar graph reports on.  Can I assume that most of
the used memory is the OS caching parts of the index (the 54500892k
"cached" figure above)?

2) Then there's VIRT 192g and RES 5.1g for the java process.  What exactly
is Solr using the 5.1g of resident (physical) memory for?
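
To try to break those numbers down, here's what I'm looking at (pid 4266 is
the java process from the top output above; these are just standard Linux
tools, nothing Solr-specific):

  free -m                   # the "cached" column is the OS page cache,
                            # which is where index data sits after reads
  pmap -x 4266 | tail -n 1  # total mapped vs resident memory for the JVM;
                            # VIRT includes the full size of the mmapped
                            # index files, RES is what's actually resident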




Rebecca Tang
Applications Developer, UCSF CKM
Industry Documents Digital Libraries
E: rebecca.t...@ucsf.edu





On 2/25/15 7:57 PM, "Otis Gospodnetic" <otis.gospodne...@gmail.com> wrote:

>Lots of suggestions here already.  +1 for those JVM params from Boogie and
>for looking at JMX.
>Rebecca, try SPM <http://sematext.com/spm> (will look at JMX for you,
>among
>other things), it may save you time figuring out
>JVM/heap/memory/performance issues.  If you can't tell what's slow via
>SPM,
>we can have a look at your metrics (charts are sharable) and may be able
>to
>help you faster than guessing.
>
>Otis
>--
>Monitoring * Alerting * Anomaly Detection * Centralized Log Management
>Solr & Elasticsearch Support * http://sematext.com/
>
>
>On Wed, Feb 25, 2015 at 4:27 PM, Erick Erickson <erickerick...@gmail.com>
>wrote:
>
>> Before diving in too deeply, try attaching &debug=timing to the query.
>> Near the bottom of the response there'll be a list of the time taken
>> by each _component_. So there'll be separate entries for query,
>> highlighting, etc.
>>
>> This may not show any surprises, you might be spending all your time
>> scoring. But it's worth doing as a check and might save you from going
>> down some dead-ends. I mean if your query winds up spending 80% of its
>> time in the highlighter you know where to start looking..
>>
>> Best,
>> Erick
>>
>>
>> On Wed, Feb 25, 2015 at 12:01 PM, Boogie Shafer
>> <boogie.sha...@proquest.com> wrote:
>> > rebecca,
>> >
>> > you probably need to dig into your queries, but if you want to
>> force/preload the index into memory you could try doing something like
>> >
>> > cat `find /path/to/solr/index` > /dev/null
>> >
>> >
>> > if you haven't already reviewed the following, you might take a look
>> > here:
>> > https://wiki.apache.org/solr/SolrPerformanceProblems
>> >
>> > perhaps going back to a very vanilla/default solr configuration and
>> > building back up from that baseline to better isolate what specific
>> > setting might be impacting your environment
>> >
>> > ________________________________________
>> > From: Tang, Rebecca <rebecca.t...@ucsf.edu>
>> > Sent: Wednesday, February 25, 2015 11:44
>> > To: solr-user@lucene.apache.org
>> > Subject: RE: how to debug solr performance degradation
>> >
>> > Sorry, I should have been more specific.
>> >
>> > I was referring to the Solr admin UI page. Today we started up an AWS
>> > instance with 240G of memory to see whether fitting all of our index
>> > (183G) in memory, with enough left over for the JVM, could improve the
>> > performance.
>> >
>> > I attached the admin UI screen shot with the email.
>> >
>> > The top bar is "Physical Memory" and we have 240.24 GB, but only 4%
>> > (9.52 GB) is used.
>> >
>> > The next bar is Swap Space and it's at 0.00 MB.
>> >
>> > The bottom bar is JVM Memory which is at 2.67 GB and the max is 26G.
>> >
>> > My understanding is that when Solr starts up, it reserves some memory
>>for
>> > the JVM, and then it tries to use up as much of the remaining physical
>> > memory as possible.  And I used to see the physical memory at anywhere
>> > between 70% to 90+%.  Is this understanding correct?
>> >
>> > And now, even with 240G of memory, our index is performing at 10 - 20
>> > seconds for a query.  Granted that our queries have fq's and
>>highlighting
>> > and faceting, I think with a machine this powerful I should be able to
>> get
>> > the queries executed under 5 seconds.
>> >
>> > This is what we send to Solr:
>> > q=(phillip%20morris)
>> > &wt=json
>> > &start=0
>> > &rows=50
>> > &facet=true
>> > &facet.mincount=0
>> > &facet.pivot=industry,collection_facet
>> > &facet.pivot=availability_facet,availabilitystatus_facet
>> > &facet.field=dddate
>> >
>> > &fq%3DNOT(pg%3A1%20AND%20(dt%3A%22blank%20document%22%20OR%20dt%3A%22blank%20page%22%20OR%20dt%3A%22file%20folder%22%20OR%20dt%3A%22file%20folder%20begin%22%20OR%20dt%3A%22file%20folder%20cover%22%20OR%20dt%3A%22file%20folder%20end%22%20OR%20dt%3A%22file%20folder%20label%22%20OR%20dt%3A%22file%20sheet%22%20OR%20dt%3A%22file%20sheet%20beginning%22%20OR%20dt%3A%22tab%20page%22%20OR%20dt%3A%22tab%20sheet%22))
>> > &facet.field=dt_facet
>> > &facet.field=brd_facet
>> > &facet.field=dg_facet
>> > &hl=true
>> > &hl.simple.pre=%3Ch1%3E
>> > &hl.simple.post=%3C%2Fh1%3E
>> > &hl.requireFieldMatch=false
>> > &hl.preserveMulti=true
>> > &hl.fl=ot,ti
>> > &f.ot.hl.fragsize=300
>> > &f.ot.hl.alternateField=ot
>> > &f.ot.hl.maxAlternateFieldLength=300
>> > &f.ti.hl.fragsize=300
>> > &f.ti.hl.alternateField=ti
>> > &f.ti.hl.maxAlternateFieldLength=300
>> > &fq={!collapse%20field=signature}
>> > &expand=true
>> > &sort=score+desc,availability_facet+asc
>> >
>> >
>> > My guess is that it's performing so badly because it's only using 4%
>>of
>> > the memory? And searches require disk access.
>> >
>> >
>> > Rebecca
>> > ________________________________________
>> > From: Shawn Heisey [apa...@elyograg.org]
>> > Sent: Tuesday, February 24, 2015 5:23 PM
>> > To: solr-user@lucene.apache.org
>> > Subject: Re: how to debug solr performance degradation
>> >
>> > On 2/24/2015 5:45 PM, Tang, Rebecca wrote:
>> >> We gave the machine 180G mem to see if it improves performance.
>> However,
>> >> after we increased the memory, Solr started using only 5% of the
>> physical
>> >> memory.  It has always used 90-something%.
>> >>
>> >> What could be causing solr to not grab all the physical memory
>>(grabbing
>> >> so little of the physical memory)?
>> >
>> > I would like to know what memory numbers in which program you are
>> > looking at, and why you believe those numbers are a problem.
>> >
>> > The JVM has a very different view of memory than the operating system.
>> > Numbers in "top" mean different things than numbers on the dashboard
>>of
>> > the admin UI, or the numbers in jconsole.  If you're on Windows, then
>> > replace "top" with task manager, process explorer, resource monitor,
>>etc.
>> >
>> > Please provide as many details as you can about the things you are
>> > looking at.
>> >
>> > Thanks,
>> > Shawn
>> >
>>
