Hi Yonik:

Please see my comments below:

On 7/23/2012 8:52 AM, Yonik Seeley wrote:
> On Mon, Jul 23, 2012 at 11:37 AM, Nagendra Nagarajayya
> <nnagaraja...@transaxtions.com> wrote:
>> Realtime NRT algorithm enables NRT functionality in
>> Solr by not closing the Searcher object  and so is very fast. I am in the
>> process of contributing the algorithm back to Apache Solr as a patch.
> Since you're in the process of contributing this back, perhaps you
> could explain your approach - it never made sense to me.
>
> Replacing the reader in an existing SolrIndexSearcher as you do means
> that all the related caches will be invalid (meaning you can't use
> solr's caches).  You could just ensure that there is no auto-warming
> set up for Solr's caches (which is now the default), or you could
> disable caching altogether.  It's not clear what you're comparing
> against when you claim it's faster.

Solr with RankingAlgorithm does not replace the reader in the SolrIndexSearcher object. All it does is override the IndexSearcher.getIndexReader() method so as to supply an NRTReader if realtime is enabled. All direct references to the "reader" member have been replaced with getIndexReader() method calls.

The performance is better because SolrIndexSearcher is not closed every 1 sec as it is with soft-commit. SolrIndexSearcher is a heavy object with caches, etc., and is reference counted. So with soft-commit, every 1 sec this object needs to be closed and re-allocated, the indexes need to be re-opened, and the caches invalidated, all while waiting for existing searchers to complete, which makes this very expensive. Realtime NRT does not close the SolrIndexSearcher object but makes available a new NRTReader with the document updates, i.e. getIndexReader() returns a new NRTReader.
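
To make the accessor override concrete, here is a minimal sketch in plain Java. It is not the actual patch: SketchReader, SketchIndexSearcher, SketchRealtimeSearcher and the Supplier of fresh readers are hypothetical stand-ins for the Lucene/Solr classes, assumed only for illustration. It shows that once every internal access goes through getIndexReader(), a subclass can hand out a newer reader without the searcher object being closed or re-created.

```java
import java.util.function.Supplier;

// Hypothetical stand-in for an index reader; NOT the real Lucene class.
class SketchReader {
    final long version;
    SketchReader(long version) { this.version = version; }
}

// Hypothetical stand-in for a searcher that keeps its reader in a private field.
class SketchIndexSearcher {
    private final SketchReader reader; // reader captured when the searcher was opened

    SketchIndexSearcher(SketchReader reader) { this.reader = reader; }

    // Internal code calls this accessor instead of touching the field directly.
    SketchReader getIndexReader() { return reader; }

    // Example of internal code that goes through the accessor.
    long visibleVersion() { return getIndexReader().version; }
}

// Subclass that overrides the accessor; the searcher itself is never closed.
class SketchRealtimeSearcher extends SketchIndexSearcher {
    private final boolean realtime;
    private final Supplier<SketchReader> nrtReaders; // assumed source of fresh readers

    SketchRealtimeSearcher(SketchReader initial, boolean realtime,
                           Supplier<SketchReader> nrtReaders) {
        super(initial);
        this.realtime = realtime;
        this.nrtReaders = nrtReaders;
    }

    @Override
    SketchReader getIndexReader() {
        // With realtime enabled, supply a fresh reader; otherwise keep the original one.
        return realtime ? nrtReaders.get() : super.getIndexReader();
    }
}
```

With realtime disabled the subclass behaves like the base searcher, so the override only changes behavior when the flag is set.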

> There are also consistency and concurrency issues with replacing the
> reader in an existing SolrIndexSearcher, which is supposed to have a
> static view of the index.  If a reader replacement happens in the
> middle of a request, it's bound to cause trouble, including returning
> the wrong documents!

The reader member is not replaced in the existing SolrIndexSearcher object. The IndexSearcher.getIndexReader() method has been overridden in SolrIndexSearcher, and all direct reader member access has been replaced with a getIndexReader() method call, allowing an NRT reader to be supplied when realtime is enabled. The concurrency is handled by the getNRTReader() method, with the static index view now increased to the granularity provided by the NRTIndexReader.
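
The internals of getNRTReader() are not spelled out here, so the following is only a sketch of one common way to keep a static view per request while newer readers are being published. It is not necessarily what the patch does. All names (ViewReader, SwappingSearcher, PinnedRequest) are hypothetical: the request reads the accessor once, pins that reader, and uses it for every phase, so a reader published mid-request is only seen by later requests.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical immutable reader snapshot; NOT the real Lucene class.
final class ViewReader {
    final long version;
    ViewReader(long version) { this.version = version; }
}

// Hypothetical searcher whose accessor returns the most recently published reader.
class SwappingSearcher {
    private final AtomicReference<ViewReader> latest;

    SwappingSearcher(ViewReader initial) { latest = new AtomicReference<>(initial); }

    // Called by the indexing side when document updates become visible.
    void publish(ViewReader newer) { latest.set(newer); }

    ViewReader getIndexReader() { return latest.get(); }
}

// A request pins one reader at the start and uses it for all of its phases.
class PinnedRequest {
    private final ViewReader pinned;

    PinnedRequest(SwappingSearcher searcher) {
        this.pinned = searcher.getIndexReader(); // read the accessor exactly once
    }

    // Both phases resolve against the same pinned reader.
    long queryPhaseVersion() { return pinned.version; }
    long fetchPhaseVersion() { return pinned.version; }
}
```

A request that instead called getIndexReader() separately in each phase could see two different readers, which is the situation the concern above describes.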


Regards,

Nagendra Nagarajayya
http://solr-ra.tgels.org
http://rankingalgorithm.tgels.org

> -Yonik
> http://lucidimagination.com



