Re: More Like This and Caching

2013-05-10 Thread Giammarco Schisani
Hi David, Jason and Otis,

Thank you for the feedback on the question. It is very much appreciated.

To confirm which caches are being used, I will remove one of the Solr servers
from the cluster, restart it, note the status of the various Solr caches,
issue some MLT queries to it, and compare the cache status against the notes
taken earlier. I believe this will provide a definitive answer.
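
The comparison step described above could be sketched roughly as follows. This
is only an illustration: the two snapshots are assumed to have been copied by
hand from the Solr admin statistics page into dictionaries, the helper name
diff_cache_stats is hypothetical, and the numbers are made up rather than
measured.

```python
def diff_cache_stats(before, after, counters=("lookups", "hits", "inserts")):
    """Return per-cache counter deltas between two snapshots.

    Only caches whose counters changed are included in the result.
    """
    changes = {}
    for cache, after_stats in after.items():
        before_stats = before.get(cache, {})
        delta = {c: after_stats.get(c, 0) - before_stats.get(c, 0) for c in counters}
        if any(delta.values()):
            changes[cache] = delta
    return changes


# Made-up numbers for illustration only, not measurements from a real server.
before = {
    "queryResultCache": {"lookups": 0, "hits": 0, "inserts": 0},
    "documentCache": {"lookups": 0, "hits": 0, "inserts": 0},
    "filterCache": {"lookups": 0, "hits": 0, "inserts": 0},
}
after = {
    "queryResultCache": {"lookups": 2, "hits": 1, "inserts": 1},
    "documentCache": {"lookups": 20, "hits": 10, "inserts": 10},
    "filterCache": {"lookups": 0, "hits": 0, "inserts": 0},
}

changed = diff_cache_stats(before, after)
for cache, delta in changed.items():
    print(cache, delta)
```

Any cache that appears in the output was touched by the queries issued between
the two snapshots; a cache that is absent saw no lookups, hits, or inserts.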

I will reply to this thread with my findings.

Kind regards,
Giammarco




More Like This and Caching

2013-05-09 Thread Giammarco Schisani
Hi all,

Could anybody explain which Solr cache (e.g. queryResultCache,
documentCache, fieldCache, etc.) can be used by the More Like This handler?

One of my colleagues had previously suggested that the More Like This
handler does not take advantage of any of the Solr caches.

However, if I issue two identical MLT requests to the same Solr instance,
the second request will execute much faster than the first request (for
example, the first request will execute in 200ms and the second request
will execute in 20ms). This makes me believe that at least one of the Solr
caches is being used by the More Like This handler.

I think the documentCache is the cache that is most likely being used,
but would you be able to confirm?

As information, I am currently using Solr version 3.6.1.

Kind regards,
Giammarco Schisani


RE: More Like This and Caching

2013-05-09 Thread David Parks
I'm not the expert here, but perhaps what you're noticing is actually the
OS's disk cache. The actual Solr index isn't cached by Solr, but as you read
the blocks off disk, the OS disk cache probably did cache those blocks for
you. On the 2nd run the index blocks were read out of memory.

There was a very extensive discussion on this list not long back titled
"Re: SolrCloud loadbalancing, replication, and failover". Look that thread up
and you'll get a lot of in-depth information on the topic.

David





Re: More Like This and Caching

2013-05-09 Thread Jason Hellman
Purely from empirical observation, both the DocumentCache and QueryResultCache 
are being populated and reused in reloads of a simple MLT search.  You can see 
in the cache inserts how much extra-curricular activity is happening to 
populate the MLT data by how many inserts and lookups occur on the first load. 

(lifted right out of the MLT wiki http://wiki.apache.org/solr/MoreLikeThis )

http://localhost:8983/solr/select?q=apache&mlt=true&mlt.fl=manu,cat&mlt.mindf=1&mlt.mintf=1&fl=id,score

There is no activity in the filterCache, fieldCache, or fieldValueCache - and 
that makes plenty of sense.
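
For anyone scripting the same request, here is a minimal sketch of building
that query string with the Python standard library, so that the "&" separators
between parameters cannot get lost. The host, port, and parameter values are
the ones from the wiki example; the variable names are just for illustration.

```python
from urllib.parse import urlencode

base_url = "http://localhost:8983/solr/select"

# Parameters from the MLT wiki example request.
params = {
    "q": "apache",
    "mlt": "true",
    "mlt.fl": "manu,cat",
    "mlt.mindf": "1",
    "mlt.mintf": "1",
    "fl": "id,score",
}

# safe="," keeps the commas in the field lists readable instead of
# percent-encoding them as %2C.
mlt_url = base_url + "?" + urlencode(params, safe=",")
print(mlt_url)
```

Issuing this URL twice against a freshly restarted instance and checking the
cache statistics after each request is the kind of reload test described above.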




Re: More Like This and Caching

2013-05-09 Thread Otis Gospodnetic
This is correct: the doc cache serves previously read docs regardless of which
query read them, and the query cache serves repeat queries. Plus the OS cache
for the actual index files.
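
The difference between the two caches can be illustrated with a toy model. This
is a sketch only: the ToySearcher class and its plain dictionaries are
hypothetical stand-ins for the queryResultCache and documentCache and do not
reflect how Solr implements them.

```python
# Toy model only: these dicts stand in for Solr's queryResultCache and
# documentCache; this is not how Solr implements them.
class ToySearcher:
    def __init__(self, index, stored_docs):
        self.index = index              # query string -> list of matching doc ids
        self.stored_docs = stored_docs  # doc id -> stored fields
        self.query_cache = {}           # query string -> list of doc ids
        self.doc_cache = {}             # doc id -> stored fields
        self.stats = {"query_hits": 0, "query_misses": 0,
                      "doc_hits": 0, "doc_misses": 0}

    def search(self, query):
        # The query cache only helps when the same query is repeated.
        if query in self.query_cache:
            self.stats["query_hits"] += 1
            ids = self.query_cache[query]
        else:
            self.stats["query_misses"] += 1
            ids = self.index.get(query, [])
            self.query_cache[query] = ids
        return [self._load(doc_id) for doc_id in ids]

    def _load(self, doc_id):
        # The doc cache helps whenever a doc was read before, by any query.
        if doc_id in self.doc_cache:
            self.stats["doc_hits"] += 1
        else:
            self.stats["doc_misses"] += 1
            self.doc_cache[doc_id] = self.stored_docs[doc_id]
        return self.doc_cache[doc_id]


searcher = ToySearcher(
    index={"apache": [1, 2], "lucene": [2, 3]},
    stored_docs={1: {"id": 1}, 2: {"id": 2}, 3: {"id": 3}},
)
searcher.search("apache")  # new query; docs 1 and 2 are read for the first time
searcher.search("lucene")  # new query; doc 2 is already cached, doc 3 is not
searcher.search("apache")  # repeated query; docs 1 and 2 are already cached
print(searcher.stats)
```

The second search is a different query, so the query cache does not help, yet
doc 2 is still served from the doc cache because an earlier query read it.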

Otis
Solr & ElasticSearch Support
http://sematext.com/