Hi all,
We run Solr (4.2 and 5.0) in a real-time environment with large data
volumes. Each day two Solr cores are generated that can reach ~8-10 GB,
depending on insertion rates and hardware.
Currently, all cores are loaded on solr startup.
The query rate is not high but the response must be quick and must be
returned even for old data and over a large time frame.
There are many simple queries (facet/facet.pivot on low-cardinality
fields), but there are also heavy queries such as facet.pivot on
high-cardinality fields. We use distributed search to query the cores and
usually query over 1-2 weeks of data (around 7-28 cores).
After some large queries (with facet.pivot on high-cardinality fields) we
sometimes encounter a "java.lang.OutOfMemoryError: Java heap space"
exception.
The software is deployed to customer sites, so increasing memory is not
always possible, and customers may be willing to accept slower responses
for the larger queries if we can provide them.
We looked at the LotsOfCores functionality that was added in 4.1 and 4.2.
It allows defining an upper limit on the number of loaded cores and
unloads cores on an LRU basis when the cache is full. However, our case
seems to require a more general mechanism:
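For reference, the LRU behavior described above can be approximated with an
access-ordered LinkedHashMap. This is only an illustrative sketch of the
eviction idea, not Solr's actual LotsOfCores implementation; the class and
field names are made up:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch of an LRU cache of loaded cores, similar in spirit
// to the LotsOfCores transient-core cache. Not Solr's actual code.
public class TransientCoreCache extends LinkedHashMap<String, String> {
    private final int maxOnlineCores; // upper limit of loaded cores

    public TransientCoreCache(int maxOnlineCores) {
        super(16, 0.75f, true); // accessOrder = true gives LRU ordering
        this.maxOnlineCores = maxOnlineCores;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
        // When the cache exceeds its limit, the least recently used core
        // is dropped; real code would close and unload that core here.
        return size() > maxOnlineCores;
    }
}
```

Touching a core via get() refreshes it, so the eldest entry is always the
least recently used one.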
* Only cores that receive updates/inserts must stay loaded at all times.
Other cores, which are only queried, should be loaded / unloaded on demand
while the query runs, until completion – according to memory demands.
* The memory consumption of each facet/facet.pivot must be estimated. If
there is not enough memory to run the query on all cores concurrently, it
must be split into sequential queries, unloading already-queried or
irrelevant cores (but not permanent cores) and loading older cores to
complete the query.
* Periodically, the oldest cores should be unloaded according to a
configurable policy (for example, one type of high-volume core is kept
loaded for 1 week, while smaller cores remain loaded for a month). The
policy lets data that we know is queried less often, but is higher volume,
be kept live over shorter time periods.
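The per-core-type retention policy from the last bullet could be as simple
as a map from core type to a maximum loaded age. A minimal sketch, with
hypothetical names and purely illustrative durations:

```java
import java.time.Duration;
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a configurable unload policy: high-volume core
// types stay loaded for a shorter window than smaller ones. Not a real
// Solr API; all names are illustrative.
public class CoreRetentionPolicy {
    private final Map<String, Duration> maxLoadedAge = new HashMap<>();
    private final Duration defaultMaxAge;

    public CoreRetentionPolicy(Duration defaultMaxAge) {
        this.defaultMaxAge = defaultMaxAge;
    }

    public void setPolicy(String coreType, Duration maxAge) {
        maxLoadedAge.put(coreType, maxAge);
    }

    // True if a core of this type, created at 'created', has outlived its
    // configured window and should be unloaded now.
    public boolean shouldUnload(String coreType, Instant created, Instant now) {
        Duration maxAge = maxLoadedAge.getOrDefault(coreType, defaultMaxAge);
        return Duration.between(created, now).compareTo(maxAge) > 0;
    }
}
```

A background task could then walk the loaded, non-permanent cores and
unload those for which shouldUnload() returns true.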
We are considering adding the following functionality to Solr (optional –
turned on by new configs):
The flow of the SolrCore.execute() function will be changed as follows:
- Change the status of the core to “USED”
- Call a new waitForResource(SolrRequestHandler, SolrQueryRequest) function:
    - Estimate the required memory for this query/handler on this core
    - If there are not enough free resources to run the query:
        - If all cores are permanent and can’t be unloaded:
            - Throw an "OutOfMemoryError" exception // here the status of
              the core should be changed back to “UNUSED”
        - Else:
            - Try to unload unused, non-permanent cores
            - If unloading unused cores didn’t release enough resources and
              no further core can be unloaded:
                - Throw an "OutOfMemoryError" exception // here the status
                  of the core should be changed back to “UNUSED”
            - If unloading unused cores didn’t release enough resources but
              there are cores that can still be unloaded:
                - Wait with a timeout until some resource is released
                - Check again until the required resource is available or
                  the exception is thrown
    - Reserve the resource
- Call the current SolrCore.execute()
- Change the status of the core back to “UNUSED”
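The reserve/wait/release part of this flow can be sketched as a simple
memory gate guarded by a monitor. This is a minimal, self-contained
illustration under assumed names (ResourceGate, tryReserve, release are
hypothetical); real code would also attempt to unload unused,
non-permanent cores before waiting:

```java
import java.util.concurrent.TimeUnit;

// Sketch of the proposed waitForResource() admission control: a query
// reserves an estimated number of bytes before executing and releases
// them afterwards. Hypothetical names; not an existing Solr API.
public class ResourceGate {
    private long freeBytes;
    private final long waitTimeoutMs;

    public ResourceGate(long totalBytes, long waitTimeoutMs) {
        this.freeBytes = totalBytes;
        this.waitTimeoutMs = waitTimeoutMs;
    }

    // Returns true once the estimated memory has been reserved, or false
    // if the timeout expires; the caller would then reject the query
    // (the flow above throws an OutOfMemoryError at that point).
    public synchronized boolean tryReserve(long estimatedBytes) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(waitTimeoutMs);
        while (freeBytes < estimatedBytes) {
            long remainingMs = TimeUnit.NANOSECONDS.toMillis(deadline - System.nanoTime());
            if (remainingMs <= 0) {
                return false; // wait with timeout expired
            }
            try {
                wait(remainingMs); // wait until some resource is released
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        freeBytes -= estimatedBytes; // reserve the resource
        return true;
    }

    // Called when a query completes (core status back to "UNUSED") or a
    // core is unloaded, freeing memory for waiting queries.
    public synchronized void release(long bytes) {
        freeBytes += bytes;
        notifyAll();
    }
}
```

In SolrCore.execute() the pairing would be tryReserve() before running the
handler and release() in a finally block afterwards.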
We would like some initial feedback on the design / functionality we’re
proposing, as we feel it would really benefit real-time, high-volume
indexing systems such as ours. We are also happy to contribute the code
back if you feel there is a need for this functionality.
Best regards,
Lyuba