On Mon, 2014-10-20 at 16:25 +0200, Shawn Heisey wrote:
> In general, once OOME happens, program operation (and in some cases the
> status of the most recently indexed documents) is completely
> undetermined.  We can be sure that the data which has already been
> written to disk will be correct, but nothing beyond that.  That's why it
> is considered better to crash the program and restart it for OOME.

Any idea why Lucene/Solr does not do this by itself? It could be
optional (defaulting to "yes, please crash"), but it seems to me that
shutting down on OOM would be the right thing to do. Relying on every
user to set up a system-specific watch mechanism on the JVM to get a
reliable Solr is a problem: a) it is not done in a lot of cases and
b) it is prone to errors.

If System.exit() is not available due to JVM options, or shutdown is
not reliable for other reasons, the searcher could be marked as
unreliable so that all subsequent calls would result in an error such as
"Service unavailable due to OOM. Please restart". That forces action
instead of a silent "something's wrong, but we don't know what".
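
To make the second idea concrete, here is a rough sketch of what I have
in mind. This is not existing Lucene/Solr code: the class OomGuard and
its methods are hypothetical names made up for illustration, and a real
implementation would have to hook into the actual request handling. It
only shows the "remember the OOM and refuse everything afterwards" part.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Supplier;

/**
 * Hypothetical guard (not part of Lucene/Solr): once any guarded call
 * has thrown OutOfMemoryError, every later call is rejected up front.
 */
final class OomGuard {
    // Set once on the first OOM and never cleared; only a restart resets it.
    private final AtomicBoolean unreliable = new AtomicBoolean(false);

    /** Runs the request unless an earlier request has hit an OOM. */
    <T> T call(Supplier<T> request) {
        if (unreliable.get()) {
            throw new IllegalStateException(
                "Service unavailable due to OOM. Please restart");
        }
        try {
            return request.get();
        } catch (OutOfMemoryError e) {
            // State is undetermined from here on: mark and rethrow.
            unreliable.set(true);
            throw e;
        }
    }

    /** True once an OOM has been observed by this guard. */
    boolean isUnreliable() {
        return unreliable.get();
    }
}
```

The flag is deliberately one-way. Setting it allocates nothing, so it
should have a fair chance of succeeding even when the heap is exhausted,
though that is an assumption on my part and not a guarantee.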

- Toke Eskildsen, State and University Library

