Thanks for the links. I think it would be worth getting more detailed
info, because it could be the performance threshold, or it could be
something else entirely (such as an updated Java version, or something
loosely related to RAM: what is held in memory before the commit, what
is cached, leaked custom query objects holding on to some big object,
etc.). Btw, if I study the graph, I see that there *are* warning signs.
That's the point of testing/measuring after all, IMHO.
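
To illustrate the last point, here is a hypothetical sketch of how a
custom query object can quietly pin a big object (class and field names
are made up, not taken from anyone's actual code):

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    // Made-up custom query wrapper that drags a large per-request buffer
    // along with it.
    class TrackedQuery {
        final String queryString;
        final byte[] rawSource;   // e.g. the fetched source doc, ~0.5 MB

        TrackedQuery(String queryString, byte[] rawSource) {
            this.queryString = queryString;
            this.rawSource = rawSource;
        }
    }

    class QueryStats {
        // Unbounded map that is never cleared: every entry retains a
        // TrackedQuery and, through it, the large buffer, so heap usage
        // creeps up until commits and GC start to struggle.
        static final Map<TrackedQuery, Long> HIT_COUNTS =
                new ConcurrentHashMap<TrackedQuery, Long>();

        static void record(TrackedQuery q) {
            Long current = HIT_COUNTS.get(q);
            HIT_COUNTS.put(q, current == null ? 1L : current + 1L);
        }
    }

A heap dump (jmap or visualvm) would show the retained size of such a
map growing over time, which is exactly the kind of detail the summary
graphs hide.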

--roman
On 8 Feb 2014 13:51, "Shawn Heisey" <s...@elyograg.org> wrote:

> On 2/8/2014 11:02 AM, Roman Chyla wrote:
> > I would be curious what the cause is. Samarth says that it worked
> > for over a year (and supposedly docs were being added all the time).
> > Did the index grow considerably in the last period? Perhaps he could
> > attach visualvm while it is in the 'black hole' state to see what is
> > actually going on. I don't know if the instance is also used for
> > searching, but if it is only indexing, maybe just shorter commit
> > intervals would alleviate the problem (see the sketch below).
> > To add context, our indexer is configured with a 16 GB heap on a
> > machine with 64 GB of RAM, but a busy one, so sometimes there is no
> > cache to spare for the OS. The index is 300 GB (of which 140 GB is
> > stored values), and it is working just 'fine' -- 30 docs/s on
> > average -- but our docs are large (0.5 MB on average) and fetched
> > from two databases, so the slowness is outside Solr. I didn't see
> > big improvements with a bigger heap, but I don't remember exact
> > numbers. This is Solr 4.
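> >
> > For illustration, a minimal SolrJ sketch of a shorter, bounded
> > commit interval driven from the client side (untested, SolrJ 4.x;
> > the URL, core name and field values are placeholders):
> >
> >     import java.io.IOException;
> >
> >     import org.apache.solr.client.solrj.SolrServerException;
> >     import org.apache.solr.client.solrj.impl.HttpSolrServer;
> >     import org.apache.solr.common.SolrInputDocument;
> >
> >     public class CommitWithinExample {
> >         public static void main(String[] args)
> >                 throws SolrServerException, IOException {
> >             // Placeholder URL and core name.
> >             HttpSolrServer server =
> >                     new HttpSolrServer("http://localhost:8983/solr/collection1");
> >
> >             SolrInputDocument doc = new SolrInputDocument();
> >             doc.addField("id", "doc-1");
> >             doc.addField("title", "example");
> >
> >             // commitWithin = 15000 ms: Solr commits this update within
> >             // 15 seconds, so uncommitted state never piles up in RAM
> >             // for very long.
> >             server.add(doc, 15000);
> >
> >             server.shutdown();
> >         }
> >     }
> >
> > (Shortening the hard autoCommit interval in solrconfig.xml achieves
> > a similar effect on the server side.)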
>
> For this discussion, refer to this image, or the Google Books link where
> I originally found it:
>
> https://dl.dropboxusercontent.com/u/97770508/performance-dropoff-graph.png
>
> http://books.google.com/books?id=dUiNGYCiWg0C&pg=PA33#v=onepage&q&f=false
>
> Computer systems have had a long history of performance curves like
> this.  Everything goes really well, possibly for a really long time,
> until you cross some threshold where a resource cannot keep up with the
> demands being placed on it.  That threshold is usually something you
> can't calculate in advance.  Once it is crossed, even by a tiny amount,
> performance drops VERY quickly.
>
> I do recommend that people closely analyze their GC characteristics,
> but jconsole, jvisualvm, and other tools like that are actually not
> very good at this task.  They only give you summary info -- how many
> GCs occurred and the total amount of time spent doing GC -- and often
> at a useless granularity: jconsole reports the time in minutes on a
> system that has been running for any length of time.
>
> I *was* having occasional super-long GC pauses (15 seconds or more), but
> I did not know it, even though I had religiously looked at GC info in
> jconsole and jstat.  I discovered the problem indirectly, and had to
> find additional tools to quantify it.  After discovering it, I tuned my
> garbage collection and have not had the problem since.
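>
> (As a generic illustration only -- not a specific recommendation for
> the setup being discussed -- a CMS-oriented starting point often looks
> something like the following, with the sizes as placeholders and every
> value re-checked against real GC logs:
>
>     -Xms8g -Xmx8g
>     -XX:+UseConcMarkSweepGC
>     -XX:+UseParNewGC
>     -XX:+UseCMSInitiatingOccupancyOnly
>     -XX:CMSInitiatingOccupancyFraction=70
>     -XX:+CMSParallelRemarkEnabled
>     -XX:+ParallelRefProcEnabled
>
> The point is not the exact flags but measuring before and after.)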
>
> If you have detailed GC logs enabled, this is a good free tool for
> offline analysis:
>
> https://code.google.com/p/gclogviewer/
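>
> (For anyone who does not have them yet: on Oracle/OpenJDK HotSpot,
> detailed GC logs are typically enabled with startup flags along these
> lines -- the log path is a placeholder:
>
>     -Xloggc:/var/solr/logs/gc.log
>     -XX:+PrintGCDetails
>     -XX:+PrintGCDateStamps
>     -XX:+PrintGCApplicationStoppedTime
>
> The resulting gc.log is what these offline analyzers read.)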
>
> I have also had good results with this free tool, but it requires a
> little more work to set up:
>
> http://www.azulsystems.com/jHiccup
>
> Azul Systems has an alternate Java implementation for Linux that
> virtually eliminates GC pauses, but it isn't free.  I do not have any
> information about how much it costs.  We found our own solution, but for
> those who can throw money at the problem, I've heard good things about it.
>
> Thanks,
> Shawn
>
>
