It seems the bullets didn't render nicely, so I'm resending the explanation without them.
The flow of the SolrCore.execute() function will be changed: change the status of the core to “USED”, call the waitForResource(SolrRequestHandler, SolrQueryRequest) function, then perform the current SolrCore.execute() flow and change the status of the core back to “UNUSED”. In waitForResource(SolrRequestHandler, SolrQueryRequest), first estimate the memory required for this query/handler on this core. If there are not enough free resources to run the query, and there are still not enough even after unloading all unused, non-permanent cores, throw an "OutOfMemoryError" exception and change the status of the core to “UNUSED”; otherwise wait with a timeout until some resource is released, then check again, until the required resource is available or the exception is thrown.

Best regards,
Lyuba

---------- Forwarded message ----------
From: Lyuba Romanchuk <[email protected]>
Date: Tue, Apr 9, 2013 at 11:47 AM
Subject: Adding new functionality to avoid "java.lang.OutOfMemoryError: Java heap space" exception
To: [email protected]

Hi all,

We run Solr (4.2 and 5.0) in a real-time environment with big data. Each day two Solr cores are generated that can reach ~8-10g, depending on the insertion rates and on the hardware. Currently, all cores are loaded on Solr startup. The query rate is not high, but responses must be quick and must be returned even for old data and over a large time frame. There are many simple queries (facet/facet.pivot on fields with small value distributions), but there are also heavy queries such as facet.pivot on widely distributed fields. We use distributed search to query the cores, usually over 1-2 weeks of data (around 7-28 cores). After some large queries (with facet.pivot on widely distributed fields) we sometimes encounter a "java.lang.OutOfMemoryError: Java heap space" exception.
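The execute() change summarized above could be sketched roughly as follows. This is a self-contained illustration, not the actual Solr code: CoreSketch, Status, Handler, and the stub waitForResource() are hypothetical stand-ins for SolrCore, its proposed status field, SolrRequestHandler, and the proposed resource gate.

```java
import java.util.concurrent.atomic.AtomicReference;

/**
 * Hypothetical sketch of the proposed change to SolrCore.execute():
 * mark the core USED, gate on resources, run the handler, then mark
 * the core UNUSED again. All types here are illustrative stand-ins,
 * not the real Solr API.
 */
class CoreSketch {
    enum Status { USED, UNUSED }

    /** Stand-in for SolrRequestHandler. */
    interface Handler { void handleRequest(String req, StringBuilder rsp); }

    private final AtomicReference<Status> status = new AtomicReference<>(Status.UNUSED);

    Status status() { return status.get(); }

    /** Stand-in for waitForResource(); the real version would estimate memory. */
    protected void waitForResource(Handler handler, String req) { }

    void execute(Handler handler, String req, StringBuilder rsp) {
        status.set(Status.USED);
        try {
            waitForResource(handler, req);   // may throw OutOfMemoryError
            handler.handleRequest(req, rsp); // current execute() flow
        } finally {
            status.set(Status.UNUSED);       // restored even if the gate throws
        }
    }
}
```

The try/finally guarantees the core returns to “UNUSED” even when waitForResource() throws, which is the "change the status of the core to UNUSED" step attached to the exception paths above.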
The software is deployed to customer sites, so increasing memory is not always possible, and customers may prefer slower responses for the larger queries if we can provide them.

We looked at the LotsOfCores functionality that was added in 4.1 and 4.2. It makes it possible to define an upper limit on the number of loaded cores and to unload cores on an LRU basis when the cache fills up. However, our case seems to call for a more general mechanism:

* Only cores that are used for updates/inserts must be loaded at all times. Other cores, which are only queried, should be loaded/unloaded on demand while the query runs, until completion, according to memory demands.
* The memory consumption of each facet/facet.pivot must be estimated. If there is not enough memory to run the query against all cores concurrently, it must be split into sequential queries, unloading already-queried or irrelevant cores (but not permanent cores) and loading older cores to complete the query.
* Occasionally, the oldest cores should be unloaded according to a configurable policy (for example, one type of high-volume core is kept loaded for one week, while smaller cores can remain loaded for a month). The policy allows data that we know is queried less, but is higher volume, to be kept live for shorter periods.
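For reference, the existing LotsOfCores behavior we evaluated is configured in the 4.x-style solr.xml roughly as below (core names and paths are made up for illustration): loadOnStartup and transient are per-core attributes, and transientCacheSize caps how many transient cores stay loaded before LRU eviction.

```xml
<solr persistent="true">
  <cores adminPath="/admin/cores" transientCacheSize="32">
    <!-- "Permanent" core: loaded at startup, never auto-unloaded -->
    <core name="core_today" instanceDir="cores/core_today"
          loadOnStartup="true" transient="false"/>
    <!-- Older, query-only core: loaded on demand, eligible for LRU unload -->
    <core name="core_old" instanceDir="cores/core_old"
          loadOnStartup="false" transient="true"/>
  </cores>
</solr>
```

The limitation for our use case is that eviction is purely count-based and LRU; it is not driven by the estimated memory cost of the query in flight, which is what the proposal below adds.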
We are considering adding the following functionality to Solr (optional, turned on by new configs). The flow of the SolrCore.execute() function will be changed:

- Change the status of the core to “USED”
- Call the waitForResource(SolrRequestHandler, SolrQueryRequest) function
  - Estimate the required memory for this query/handler on this core
  - If there are not enough free resources to run the query, then
    - If all cores are permanent and none can be unloaded, then
      - Throw an "OutOfMemoryError" exception // here the status of the core should be changed to “UNUSED”
    - Else
      - Try to unload unused, non-permanent cores
      - If unloading unused cores did not release enough resources and no more cores can be unloaded, then
        - Throw an "OutOfMemoryError" exception // here the status of the core should be changed to “UNUSED”
      - If unloading unused cores did not release enough resources but there are still cores that can be unloaded, then
        - Wait with a timeout until some resource is released
        - Check again until the required resource is available or the exception is thrown
  - Reserve the resource
- Call the current SolrCore.execute()
- Change the status of the core to “UNUSED”

We would like to get some initial feedback on the design/functionality we are proposing, as we feel it would really benefit real-time, high-volume indexing systems such as ours. We would also be happy to contribute the code back if you feel there is a need for this functionality.

Best regards,
Lyuba
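The wait/unload/reserve loop above could be sketched as the following self-contained gate. Everything here is an illustrative assumption: the byte-based accounting, the timeout parameter, and the unloadUnusedCores() hook are hypothetical stand-ins, not existing Solr APIs.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/**
 * Hypothetical sketch of the proposed waitForResource() gate.
 * Tracks free "resource" in bytes; callers block until their
 * estimate can be reserved, trying to unload unused cores first,
 * and fail with OutOfMemoryError once the timeout expires.
 */
class CoreResourceGate {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition released = lock.newCondition();
    private long freeBytes;

    CoreResourceGate(long totalBytes) { this.freeBytes = totalBytes; }

    /** Blocks until requiredBytes can be reserved, or throws after timeoutMs. */
    void waitForResource(long requiredBytes, long timeoutMs) throws InterruptedException {
        lock.lock();
        try {
            long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
            while (freeBytes < requiredBytes) {
                // Try to unload unused, non-permanent cores first.
                freeBytes += unloadUnusedCores(requiredBytes - freeBytes);
                if (freeBytes >= requiredBytes) break;
                long remaining = deadline - System.nanoTime();
                if (remaining <= 0L) {
                    // Caller is expected to flip the core back to UNUSED.
                    throw new OutOfMemoryError("cannot reserve " + requiredBytes + " bytes");
                }
                released.awaitNanos(remaining); // wait until some resource is released
            }
            freeBytes -= requiredBytes;         // reserve the resource
        } finally {
            lock.unlock();
        }
    }

    /** Returns a reservation when the request finishes, waking any waiters. */
    void release(long bytes) {
        lock.lock();
        try {
            freeBytes += bytes;
            released.signalAll();
        } finally {
            lock.unlock();
        }
    }

    /** Stand-in for unloading unused, non-permanent cores; returns bytes freed. */
    protected long unloadUnusedCores(long wantedBytes) { return 0L; }
}
```

A Condition with awaitNanos() gives the "wait with timeout, then check again" behavior directly: a release() from a finishing query wakes all waiters, each of which re-evaluates whether its reservation now fits before the deadline passes.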
