Does your web application, by any chance, allow deep paging, or anything else that requires returning rows from the end of a large result set? Something like a query with parameters such as &rows=10&start=1000000? That can easily cause an OOM in Solr when using a sharded index, since a large number of rows typically has to be returned and combined from all shards just to get the few requested rows in the correct order.
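To put rough numbers on that, here is a back-of-the-envelope sketch in Python. It is only arithmetic, not anything Solr itself runs. The function name is made up for illustration, the shard count of 8 is an assumption, and it assumes each shard has to return its top start + rows entries to the node that merges the results:

```python
def rows_fetched_for_page(start, rows, num_shards):
    """Estimate rows pulled from shards for one deep-paging request.

    Assumption: every shard returns its top (start + rows) entries,
    because the merging node cannot know in advance which shard
    holds the rows that belong on the requested page.
    """
    per_shard = start + rows
    total = per_shard * num_shards
    return per_shard, total


# Hypothetical setup: 8 shards, request with &rows=10&start=1000000
per_shard, total = rows_fetched_for_page(start=1_000_000, rows=10, num_shards=8)
print(f"per shard: {per_shard:,}  total merged: {total:,}")
```

The cost grows with start rather than with rows, so a small page size does not help once the offset is large.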
For the above example with 8 shards, Solr would have to fetch 1 000 010 rows from each shard. That's over 8 million rows! Even if it's just identifiers, that's a lot of memory for an operation that seems so simple on the surface.

If this is the case, you'll need to prevent the web application from issuing such queries. This may mean something like supporting paging only among the first 10 000 results. A typical requirement may also be to be able to see the last results of a query, but this can be accomplished by allowing sorting in both ascending and descending order.

Regards,
Ere

Kojo wrote on 14.8.2019 at 16:20:
> Shawn,
>
> Only my web application access this solr. at a first look at http server
> logs I didnt find something different. Sometimes I have a very big crawler
> access to my servers, this was my first bet.
>
> No scheduled crons running at this time too.
>
> I think that I will reconfigure my boxes with two solr nodes each instead
> of four and increase heap to 16GB. This box only run Solr and has 64Gb.
> Each Solr will use 16Gb and the box will still have 32Gb for the OS. What
> do you think?
>
> This is a production server, so I will plan to migrate.
>
> Regards,
> Koji
>
>
> On Tue, 13 Aug 2019 at 12:58, Shawn Heisey <apa...@elyograg.org>
> wrote:
>
>> On 8/13/2019 9:28 AM, Kojo wrote:
>>> Here are the last two gc logs:
>>>
>>>
>> https://send.firefox.com/download/6cc902670aa6f7dd/#Ee568G9vUtyK5zr-nAJoMQ
>>
>> Thank you for that.
>>
>> Analyzing the 20MB gc log actually looks like a pretty healthy system.
>> That log covers 58 hours of runtime, and everything looks very good to me.
>>
>> https://www.dropbox.com/s/yu1pyve1bu9maun/gc-analysis-kojo.png?dl=0
>>
>> But the small log shows a different story. That log only covers a
>> little more than four minutes.
>>
>> https://www.dropbox.com/s/vkxfoihh12brbnr/gc-analysis-kojo2.png?dl=0
>>
>> What happened at approximately 10:55:15 PM on the day that the smaller
>> log was produced? Whatever happened caused Solr's heap usage to
>> skyrocket and require more than 6GB.
>>
>> Thanks,
>> Shawn
>>
>

-- 
Ere Maijala
Kansalliskirjasto / The National Library of Finland