On Mon, Jun 29, 2020 at 3:13 PM Erick Erickson <erickerick...@gmail.com> wrote:
> ps aux | grep solr

[solr@faspbsy0002 database-backups]$ ps aux | grep solr
solr 72072 1.6 33.4 22847816 10966476 ? Sl 13:35 1:36 java -server -Xms16g -Xmx16g -XX:+UseG1GC -XX:+ParallelRefProcEnabled -XX:G1HeapRegionSize=8m -XX:MaxGCPauseMillis=200 -XX:+UseLargePages -XX:+AggressiveOpts -verbose:gc -XX:+PrintHeapAtGC -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -XX:+PrintTenuringDistribution -XX:+PrintGCApplicationStoppedTime -Xloggc:/opt/solr/server/logs/solr_gc.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=9 -XX:GCLogFileSize=20M -Dsolr.log.dir=/opt/solr/server/logs -Djetty.port=8983 -DSTOP.PORT=7983 -DSTOP.KEY=solrrocks -Duser.timezone=UTC -Djetty.home=/opt/solr/server -Dsolr.solr.home=/opt/solr/server/solr -Dsolr.data.home= -Dsolr.install.dir=/opt/solr -Dsolr.default.confdir=/opt/solr/server/solr/configsets/_default/conf -Xss256k -Dsolr.jetty.https.port=8983 -Dsolr.log.muteconsole -XX:OnOutOfMemoryError=/opt/solr/bin/oom_solr.sh 8983 /opt/solr/server/logs -jar start.jar --module=http

> should show you all the parameters Solr is running with, as would the
> admin screen. You should see something like:
>
> -XX:OnOutOfMemoryError=your_solr_directory/bin/oom_solr.sh
>
> And there should be some logs lying around if that was the case,
> similar to:
> $SOLR_LOGS_DIR/solr_oom_killer-$SOLR_PORT-$NOW.log

This log is not being written. From reading oom_solr.sh, it does appear that a solr_oom_killer-$SOLR_PORT-$NOW.log should be written to the logs directory, but no such file exists. There are log files in /opt/solr/server/logs, and they are indeed being written to; there are fresh entries, but no sign of any problem. If I grep for "oom" in the logs directory, the only references I see are benign: a few entries that list all the startup flags (oom_solr.sh is among the settings visible in those entries), and someone did a search for "Mushroom," which accounts for one more match.

> As for memory, It Depends (tm).
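For anyone chasing the same question, this is roughly where each kind of OOM evidence would live. The paths and port match the ps output above; the `find_oom_logs` helper is purely illustrative, not part of Solr.

```shell
# Hypothetical helper: list Solr OOM-killer logs in a given directory.
find_oom_logs() {
  ls "$1"/solr_oom_killer-*.log 2>/dev/null || echo "no oom killer logs in $1"
}

# Java-level OOMs caught by oom_solr.sh would leave files here:
find_oom_logs /opt/solr/server/logs

# Kernel-level OOM kills (the OS killing a process, not Java) only show
# up in the kernel ring buffer, never in Solr's own logs:
dmesg 2>/dev/null | grep -iE 'out of memory|killed process' || true
```

If the first command finds nothing and dmesg is also clean, the crash is probably neither kind of OOM.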
> There are configurations you can make choices about that will affect
> the heap requirements. You can't really draw comparisons between
> different projects. Your Drupal + Solr app has how many documents?
> Indexed how? Searched how? vs. this one.
>
> The usual suspects for configuration settings that are responsible
> include:
>
> - filterCache size too large. Each filterCache entry is bounded by
>   maxDoc/8 bytes. I've seen people set this to over 1M...
>
> - using non-docValues for fields used for sorting, grouping, function
>   queries or faceting. Solr will uninvert the field on the heap,
>   whereas if you have specified docValues=true, the memory is out in
>   OS memory space rather than heap.
>
> - people just putting too many docs in a collection in a single JVM
>   in aggregate. All replicas in the same instance are using part of
>   the heap.
>
> - having unnecessary options on your fields, although that's more
>   MMap space than heap.
>
> The problem basically is that all of Solr's access is essentially
> random, so for performance reasons lots of stuff has to be in memory.
>
> That said, Solr hasn't been as careful as it should be about using up
> memory; that's ongoing.
>
> If you really want to know what's using up memory, throw a heap
> analysis tool at it. That'll give you a clue what's hogging memory
> and you can go from there.
>
>> On Jun 29, 2020, at 1:48 PM, David Hastings <hastings.recurs...@gmail.com> wrote:
>>
>> little nit picky note here, use 31gb, never 32.
>>
>> On Mon, Jun 29, 2020 at 1:45 PM Ryan W <rya...@gmail.com> wrote:
>>
>>> It figures it would happen again a couple hours after I suggested
>>> the issue might be resolved. Just now, Solr stopped running. I
>>> cleared the cache in my app a couple times around the time that it
>>> happened, so perhaps that was somehow too taxing for the server.
>>> However, I've never allocated so much RAM to a website before, so
>>> it's odd that I'm getting these failures.
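Erick's maxDoc/8 bound is easy to sanity-check with arithmetic. The numbers below are made-up examples, not taken from this thread:

```shell
# Worst case, each filterCache entry is a bitset over all docs: maxDoc/8 bytes.
max_doc=50000000      # example: 50M docs in the core (hypothetical)
cache_size=512        # example: filterCache size from solrconfig.xml
bytes_per_entry=$((max_doc / 8))
total_mb=$((bytes_per_entry * cache_size / 1024 / 1024))
echo "filterCache worst case: ${bytes_per_entry} bytes/entry, ${total_mb} MB total"
# prints: filterCache worst case: 6250000 bytes/entry, 3051 MB total
```

At that maxDoc, a 1M-entry filterCache could in the worst case reach terabytes, which is Erick's point (in practice sparse filters are stored more compactly, so this is an upper bound).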
>>> My colleagues were astonished when I said people on the solr-user
>>> list were telling me I might need 32GB just for Solr.
>>>
>>> I manage another project that uses Drupal + Solr, and we have a
>>> total of 8GB of RAM on that server and Solr never, ever stops. I've
>>> been managing that site for years and never seen a Solr outage. On
>>> that project, Drupal + Solr is OK with 8GB, but somehow this other
>>> project needs 64 GB or more?
>>>
>>> "The thing that's unsettling about this is that assuming you were
>>> hitting OOMs, and were running the OOM-killer script, you _should_
>>> have had very clear evidence that that was the cause."
>>>
>>> How do I know if I'm running the OOM-killer script?
>>>
>>> Thank you.
>>>
>>> On Mon, Jun 29, 2020 at 12:12 PM Erick Erickson <erickerick...@gmail.com> wrote:
>>>
>>>> The thing that's unsettling about this is that assuming you were
>>>> hitting OOMs, and were running the OOM-killer script, you _should_
>>>> have had very clear evidence that that was the cause.
>>>>
>>>> If you were not running the killer script, then apologies for not
>>>> asking about that in the first place. Java's performance is
>>>> unpredictable when OOMs happen, which is the point of the killer
>>>> script: at least Solr stops rather than do something inexplicable.
>>>>
>>>> Best,
>>>> Erick
>>>>
>>>>> On Jun 29, 2020, at 11:52 AM, David Hastings <hastings.recurs...@gmail.com> wrote:
>>>>>
>>>>> sometimes just throwing money/ram/ssd at the problem is just the
>>>>> best answer.
>>>>>
>>>>> On Mon, Jun 29, 2020 at 11:38 AM Ryan W <rya...@gmail.com> wrote:
>>>>>
>>>>>> Thanks everyone. Just to give an update on this issue, I bumped
>>>>>> the RAM available to Solr up to 16GB a couple weeks ago, and
>>>>>> haven't had any problem since.
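One way to answer "how do I know if I'm running the OOM-killer script?" is to look for the OnOutOfMemoryError flag on the live process, as in the ps output earlier in the thread. A sketch; the `check_oom_script` helper is made up for illustration:

```shell
# Extract the OOM-killer script path, if any, from a java command line.
check_oom_script() {
  echo "$1" | grep -o 'OnOutOfMemoryError=[^ ]*' || echo "no OOM killer script configured"
}

# Against the live process (the [s] bracket trick keeps grep from matching itself):
check_oom_script "$(ps -eo args 2>/dev/null | grep '[s]tart\.jar' || true)"
```

If the flag is absent from the running JVM's arguments, the killer script was never armed, which would explain the missing solr_oom_killer logs.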
>>>>>>
>>>>>> On Tue, Jun 16, 2020 at 1:00 PM David Hastings <hastings.recurs...@gmail.com> wrote:
>>>>>>
>>>>>>> me personally, around 290gb. as much as we could shove into them
>>>>>>>
>>>>>>> On Tue, Jun 16, 2020 at 12:44 PM Erick Erickson <erickerick...@gmail.com> wrote:
>>>>>>>
>>>>>>>> How much physical RAM? A rule of thumb is that you should
>>>>>>>> allocate no more than 25-50 percent of the total physical RAM
>>>>>>>> to Solr. That's cumulative, i.e. the sum of the heap
>>>>>>>> allocations across all your JVMs should be below that
>>>>>>>> percentage. See Uwe Schindler's MMapDirectory blog...
>>>>>>>>
>>>>>>>> Shot in the dark...
>>>>>>>>
>>>>>>>> On Tue, Jun 16, 2020, 11:51 David Hastings <hastings.recurs...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> To add to this, i generally have solr start with this:
>>>>>>>>> -Xms31000m -Xmx31000m
>>>>>>>>>
>>>>>>>>> and the only other things that run on them are MariaDB Galera
>>>>>>>>> cluster nodes that are not in use (aside from replication)
>>>>>>>>>
>>>>>>>>> the 31gb is not an accident either, you dont want 32gb.
>>>>>>>>>
>>>>>>>>> On Tue, Jun 16, 2020 at 11:26 AM Shawn Heisey <apa...@elyograg.org> wrote:
>>>>>>>>>
>>>>>>>>>> On 6/11/2020 11:52 AM, Ryan W wrote:
>>>>>>>>>>> I will check "dmesg" first, to find out any hardware error
>>>>>>>>>>> message.
>>>>>>>>>>
>>>>>>>>>> <snip>
>>>>>>>>>>
>>>>>>>>>>> [1521232.781801] Out of memory: Kill process 117529 (httpd)
>>>>>>>>>>> score 9 or sacrifice child
>>>>>>>>>>> [1521232.782908] Killed process 117529 (httpd), UID 48,
>>>>>>>>>>> total-vm:675824kB, anon-rss:181844kB, file-rss:0kB,
>>>>>>>>>>> shmem-rss:0kB
>>>>>>>>>>>
>>>>>>>>>>> Is this a relevant "Out of memory" message? Does this
>>>>>>>>>>> suggest an OOM situation is the culprit?
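The 31gb-never-32gb advice is about compressed ordinary object pointers (oops): the JVM can use 32-bit object references only while the heap stays below roughly 32 GB, so a 32 GB heap actually fits fewer objects than a 31 GB one. You can confirm on your own JVM with `java -Xmx31g -XX:+PrintFlagsFinal -version | grep UseCompressedOops`; the helper below just encodes the rule of thumb, nothing more:

```shell
# Rule of thumb only; the exact cutoff depends on JVM version and settings.
compressed_oops_likely() {
  # $1 = -Xmx in whole GB
  if [ "$1" -lt 32 ]; then echo "yes"; else echo "no"; fi
}
compressed_oops_likely 31   # prints yes
compressed_oops_likely 32   # prints no
```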
>>>>>>>>>>
>>>>>>>>>> Because this was in the "dmesg" output, it indicates that it
>>>>>>>>>> is the operating system killing programs because the
>>>>>>>>>> *system* doesn't have any memory left. It wasn't Java that
>>>>>>>>>> did this, and it wasn't Solr that was killed. It very well
>>>>>>>>>> could have been Solr that was killed at another time, though.
>>>>>>>>>>
>>>>>>>>>> The process that it killed this time is named httpd ... which
>>>>>>>>>> is most likely the Apache webserver. Because the UID is 48,
>>>>>>>>>> this is probably an OS derived from Redhat, where the
>>>>>>>>>> "apache" user has UID and GID 48 by default. Apache with its
>>>>>>>>>> default config can be VERY memory hungry when it gets busy.
>>>>>>>>>>
>>>>>>>>>>> -XX:InitialHeapSize=536870912 -XX:MaxHeapSize=536870912
>>>>>>>>>>
>>>>>>>>>> This says that you started Solr with the default 512MB heap,
>>>>>>>>>> which is VERY VERY small. The default is small so that Solr
>>>>>>>>>> will start on virtually any hardware. Almost every user must
>>>>>>>>>> increase the heap size. And because the OS is killing
>>>>>>>>>> processes, it is likely that the system does not have enough
>>>>>>>>>> memory installed for what you have running on it.
>>>>>>>>>>
>>>>>>>>>> It is generally not a good idea to share the server hardware
>>>>>>>>>> between Solr and other software, unless the system has a lot
>>>>>>>>>> of spare resources, memory in particular.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Shawn
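For completeness: the 536870912 Shawn decodes is exactly 512 MB, and the usual way to raise it is SOLR_HEAP in solr.in.sh (the include file's location varies by install; /etc/default/solr.in.sh is common with the service installer), e.g. SOLR_HEAP="16g". A quick check of the arithmetic, with a small hypothetical converter:

```shell
# Illustrative helper: convert a "<n>g" heap string to bytes.
heap_bytes() {
  echo $(( ${1%g} * 1024 * 1024 * 1024 ))
}
heap_bytes 16                          # prints 17179869184
# And the default Shawn decoded:
echo $(( 536870912 / 1024 / 1024 ))    # prints 512 (MB)
```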