Re: cache monitoring tools?

2011-12-15 Thread Justin Caratzas
Dmitry,

That's beyond the scope of this thread, but in short: Munin runs
plugins, which are scripts that output graph configuration and values
when polled by the Munin server.  It uses a plain-text protocol, so the
scripts can be written in any language.  Munin then feeds this data
into RRDtool, which renders the graphs.  There are some examples[1] of
Solr plugins that people have used to scrape the stats.jsp page.

Justin

1. http://exchange.munin-monitoring.org/plugins/search?keyword=solr
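
To make the plugin idea concrete, here is a minimal sketch of a Munin
plugin in Python. It follows the usual plugin convention: called with
"config" it prints the graph description, otherwise it prints the
current value. The field name "hitratio", the graph labels and the
fetch_hit_ratio() placeholder are assumptions for illustration; a real
plugin would fetch and parse the Solr stats page at that point.

```python
#!/usr/bin/env python3
"""Minimal Munin plugin sketch that reports a Solr cache hit ratio."""
import sys


def fetch_hit_ratio():
    # Placeholder (assumption): a real plugin would request the Solr stats
    # page here and parse the hit ratio out of the response.
    return 0.86


def render(args, get_value=fetch_hit_ratio):
    """Return the lines Munin expects for the given plugin arguments."""
    if args and args[0] == "config":
        # Graph description, requested once by Munin with "config".
        return [
            "graph_title Solr filterCache hit ratio",
            "graph_vlabel ratio",
            "graph_category solr",
            "hitratio.label hit ratio",
        ]
    # Normal poll: one "<field>.value <number>" line per data series.
    return ["hitratio.value {}".format(get_value())]


if __name__ == "__main__":
    print("\n".join(render(sys.argv[1:])))
```

Because the output is plain text on stdout, the same plugin could just
as well be a shell or Perl script.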

Dmitry Kan dmitry@gmail.com writes:

 Thanks, Justin. With Zabbix I can gather JMX-exposed stats from Solr. How
 about Munin: what protocol or mechanism does it use to collect stats? It
 wasn't obvious from their online documentation...


Re: cache monitoring tools?

2011-12-12 Thread Justin Caratzas
Dmitry,

The only added stress that Munin puts on each box is one request per
stat every 5 minutes to our admin stats handler.  Given that we get 25
requests per second, this doesn't make much of a difference.  We don't
have a sharded index (yet), as our index is only 2-3 GB, but we do have
slave servers with replicated indexes that handle the queries, while
our master handles updates/commits.

Justin
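
As a back-of-the-envelope check of the overhead claim, the sketch below
compares Munin's polling requests with regular query traffic over one
5-minute interval. The 25 requests per second and the 5-minute interval
come from the message above; the number of tracked stats (20) is a
hypothetical value chosen only for illustration.

```python
def polling_share(num_stats, query_rate_per_s=25.0, poll_interval_s=300.0):
    """Fraction of all requests in one interval that come from Munin."""
    munin_requests = num_stats  # one request per stat per interval
    query_requests = query_rate_per_s * poll_interval_s  # 25 * 300 = 7500
    return munin_requests / (munin_requests + query_requests)


# Hypothetical: 20 tracked stats.
share = polling_share(20)
print("Munin share of requests: {:.4%}".format(share))
```

Even with a few dozen stats, the polling traffic stays well under 1% of
the requests the box already serves.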

Dmitry Kan dmitry@gmail.com writes:

 Justin, in terms of overhead, have you noticed whether Munin adds much of
 it when used in production? As for the Solr farm: how big is a shard's
 index (assuming you have a sharded architecture)?

 Dmitry


Re: cache monitoring tools?

2011-12-11 Thread Justin Caratzas
At my work, we use Munin and Nagios for monitoring and alerts.  Munin is
great because writing a plugin for it is so simple, and with Solr's
statistics handler, we can track almost any Solr stat we want.  It also
comes with bundled plugins for load, file system stats, processes,
etc.

http://munin-monitoring.org/

Justin

Paul Libbrecht p...@hoplahup.net writes:

 Allow me to chime in and ask a generic question about monitoring tools
 for people close to developers: are any of the tools mentioned in this
 thread actually able to show graphs of loads, e.g. cache counts or CPU
 load, alongside a console log or an HTTP request log?

 I am currently working on such a tool, but I have a bad feeling that I
 am reinventing the wheel.

 thanks in advance

 Paul



 On 8 Dec 2011, at 08:53, Dmitry Kan wrote:

 Otis, Tomás: thanks for the great links!
 
 2011/12/7 Tomás Fernández Löbbe tomasflo...@gmail.com
 
 Hi Dimitry, I pointed to the wiki page to enable JMX, then you can use any
 tool that visualizes JMX stuff like Zabbix. See
 
 http://www.lucidimagination.com/blog/2011/10/02/monitoring-apache-solr-and-lucidworks-with-zabbix/
 
 On Wed, Dec 7, 2011 at 11:49 AM, Dmitry Kan dmitry@gmail.com wrote:
 
 The culprit seems to be the merger (frontend) SOLR. Talking to one shard
 directly takes substantially less time (1-2 sec).
 
 On Wed, Dec 7, 2011 at 4:10 PM, Dmitry Kan dmitry@gmail.com wrote:
 
 Tomás: thanks. The page you gave didn't mention caches specifically. Is
 there more documentation on that? I have used the SolrMeter tool, which
 draws cache diagrams. Is there a similar tool that uses JMX directly
 and presents the cache usage at runtime?
 
 pravesh:
 I have increased the size of filterCache, but the search hasn't become
 any faster, taking almost 9 sec on avg :(
 
 name: search
 class: org.apache.solr.handler.component.SearchHandler
 version: $Revision: 1052938 $
 description: Search using components:
 
 
 org.apache.solr.handler.component.QueryComponent,org.apache.solr.handler.component.FacetComponent,org.apache.solr.handler.component.MoreLikeThisComponent,org.apache.solr.handler.component.HighlightComponent,org.apache.solr.handler.component.StatsComponent,org.apache.solr.handler.component.DebugComponent,
 
 stats: handlerStart : 1323255147351
 requests : 100
 errors : 3
 timeouts : 0
 totalTime : 885438
 avgTimePerRequest : 8854.38
 avgRequestsPerSecond : 0.008789442
 
 the stats (copying fieldValueCache as well here, to show term
 statistics):
 
 name: fieldValueCache
 class: org.apache.solr.search.FastLRUCache
 version: 1.0
 description: Concurrent LRU Cache(maxSize=1, initialSize=10,
 minSize=9000, acceptableSize=9500, cleanupThread=false)
 stats: lookups : 79
 hits : 77
 hitratio : 0.97
 inserts : 1
 evictions : 0
 size : 1
 warmupTime : 0
 cumulative_lookups : 79
 cumulative_hits : 77
 cumulative_hitratio : 0.97
 cumulative_inserts : 1
 cumulative_evictions : 0
 item_shingleContent_trigram :
 
 
 {field=shingleContent_trigram,memSize=326924381,tindexSize=4765394,time=215426,phase1=213868,nTerms=14827061,bigTerms=35,termInstances=114359167,uses=78}
 name: filterCache
 class: org.apache.solr.search.FastLRUCache
 version: 1.0
 description: Concurrent LRU Cache(maxSize=153600, initialSize=4096,
 minSize=138240, acceptableSize=145920, cleanupThread=false)
 stats: lookups : 1082854
 hits : 940370
 hitratio : 0.86
 inserts : 142486
 evictions : 0
 size : 142486
 warmupTime : 0
 cumulative_lookups : 1082854
 cumulative_hits : 940370
 cumulative_hitratio : 0.86
 cumulative_inserts : 142486
 cumulative_evictions : 0
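
 As a rough cross-check of the filterCache figures above, the sketch
 below parses "key : value" lines like the ones pasted here and
 recomputes the hit ratio. The parse_stats() helper is hypothetical and
 not part of Solr; it only assumes the text layout shown in this
 message.

```python
def parse_stats(text):
    """Parse 'key : value' lines into a dict of floats, skipping the rest."""
    stats = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("stats:"):
            line = line[len("stats:"):]  # first stat shares a line with "stats:"
        key, sep, value = line.partition(" : ")
        if not sep:
            continue  # not a 'key : value' line
        try:
            stats[key.strip()] = float(value)
        except ValueError:
            pass  # non-numeric value, ignore
    return stats


# Values copied from the filterCache block above.
FILTER_CACHE = """stats: lookups : 1082854
hits : 940370
inserts : 142486
evictions : 0
size : 142486"""

stats = parse_stats(FILTER_CACHE)
hit_ratio = stats["hits"] / stats["lookups"]
print("filterCache hit ratio: {:.4f}".format(hit_ratio))
print("evictions:", int(stats["evictions"]))
```

 The recomputed ratio agrees with the reported 0.86, and with zero
 evictions and a size below maxSize=153600 the filterCache no longer
 looks undersized, so the remaining latency probably has another cause.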
 
 
 index size: 3,25 GB
 
 Does anyone have pointers on where to look to optimize query time?
 
 
 2011/12/7 Tomás Fernández Löbbe tomasflo...@gmail.com
 
 Hi Dimitry, cache information is exposed via JMX, so you should be able
 to monitor that information with any JMX tool. See
 http://wiki.apache.org/solr/SolrJmx
 
 On Wed, Dec 7, 2011 at 6:19 AM, Dmitry Kan dmitry@gmail.com
 wrote:
 
 Yes, we do require that much.
 Ok, thanks, I will try increasing the maxsize.
 
 On Wed, Dec 7, 2011 at 10:56 AM, pravesh suyalprav...@yahoo.com wrote:
 
 facet.limit=50
 Your facet.limit seems too high. Do you actually require this much?
 
 Since there are a lot of evictions from the filterCache, increase its
 maxSize value to your acceptable limit.
 
 Regards
 Pravesh
 
 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/cache-monitoring-tools-tp3566645p3566811.html
 Sent from the Solr - User mailing list archive at Nabble.com.
 
 
 
 
 --
 Regards,
 
 Dmitry Kan
 
 
 
 
 


Re: inconsistent JVM crash with version 4.0-SNAPSHOT

2011-11-25 Thread Justin Caratzas
Lasse Aagren aag...@dtic.dk writes:

 Hi,

 We are running Solr-Lucene 4.0-SNAPSHOT (1199777M - hudson - 2011-11-09 
 14:58:50) on several servers running:

 64bit Debian Squeeze (6.0.3)
 OpenJDK6 (b18-1.8.9-0.1~squeeze1)
 Tomcat 6.0.28 (6.0.28-9+squeeze1)

 Some of the servers have 48G RAM, and in that case Java has 16G (-Xmx16g); 
 others have 96G RAM, and in that case Java has 48G (-Xmx48G).

 We are seeing some inconsistent crashes of tomcat's JVM under different 
 Solr/Lucene operations/circumstances. Sadly we can't replicate it. 

 It doesn't happen often, but often enough that we can't rely on it in 
 production.

 When it happens, something like the following appears in the logs:

 ==
 #
 # A fatal error has been detected by the Java Runtime Environment:
 #
 #  SIGSEGV (0xb) at pc=0x7f6c318d0902, pid=16516, tid=139772378892032
 #
 # JRE version: 6.0_18-b18
 # Java VM: OpenJDK 64-Bit Server VM (14.0-b16 mixed mode linux-amd64 )
 # Derivative: IcedTea6 1.8.9
 # Distribution: Debian GNU/Linux 6.0.2 (squeeze), package 6b18-1.8.9-0.1~squeeze1
 # Problematic frame:
 # j  org.apache.lucene.search.MultiTermQueryWrapperFilter.getDocIdSet(Lorg/apache/lucene/index/IndexReader$AtomicReaderContext;Lorg/apache/lucene/util/Bits;)Lorg/apache/lucene/search/DocIdSet;+193
 #
 # An error report file with more information is saved as:
 # /tmp/hs_err_pid16516.log
 #
 # If you would like to submit a bug report, please include
 # instructions how to reproduce the bug and visit:
 #   http://icedtea.classpath.org/bugzilla
 #
 ==

 Every time it happens the problematic frame is:

 Problematic frame:
 # j  org.apache.lucene.search.MultiTermQueryWrapperFilter.getDocIdSet(Lorg/apache/lucene/index/IndexReader$AtomicReaderContext;Lorg/apache/lucene/util/Bits;)Lorg/apache/lucene/search/DocIdSet;+193

 And /tmp/hs_err_pid16516.log is attached to this mail.

 Has anyone seen this before? 

 Please don't hesitate to ask for further specification about our setup.

 Best regards,

I seem to remember that a recent Java release fixed seemingly random
SIGSEGVs that caused Solr/Lucene to crash non-deterministically.

http://lucene.apache.org/solr/#26+October+2011+-+Java+7u1+fixes+index+corruption+and+crash+bugs+in+Apache+Lucene+Core+and+Apache+Solr

Hopefully this will provide you with some answers. If not, please let
the list know.

justin
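
To confirm that every crash really dies in the same place, the
"Problematic frame" can be pulled out of each hs_err report and
compared. The following is a minimal sketch, assuming the report is
read unwrapped from disk (not re-wrapped by a mail client as in the
quoted message); the helper name and the shortened sample text are made
up for illustration.

```python
def problematic_frame(report_text):
    """Return the frame line that follows '# Problematic frame:', or None."""
    lines = report_text.splitlines()
    for i, line in enumerate(lines):
        if line.strip().startswith("# Problematic frame:"):
            for following in lines[i + 1:]:
                # Drop the leading '#' and surrounding whitespace.
                frame = following.strip().lstrip("#").strip()
                if frame:
                    return frame
    return None


# Shortened, hypothetical sample in the same layout as the report above.
SAMPLE = """#
# JRE version: 6.0_18-b18
# Problematic frame:
# j  org.example.Filter.getDocIdSet()V+193
#
"""

print(problematic_frame(SAMPLE))
```

Running this over all hs_err_pid*.log files and counting the distinct
results shows quickly whether the crashes share one frame.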



Re: Easy way to tell if there are pending documents

2011-11-16 Thread Justin Caratzas

You can enable the stats handler
(https://issues.apache.org/jira/browse/SOLR-1750) and inspect the
JSON programmatically.

-- Justin
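
A minimal sketch of what that could look like in Python follows. The
URL (host, port and handler path) is an assumption about a local setup
and must be adjusted. Because the exact nesting of the JSON may differ
between versions, the helper simply searches the whole response for a
"docsPending" key, the stat name Otis mentions below.

```python
import json
import urllib.request

# Assumption: stats handler reachable at this path on a local Solr.
STATS_URL = "http://localhost:8983/solr/admin/mbeans?stats=true&wt=json"


def find_key(node, key):
    """Depth-first search for `key` in parsed JSON; None if absent."""
    if isinstance(node, dict):
        if key in node:
            return node[key]
        children = list(node.values())
    elif isinstance(node, list):
        children = node
    else:
        return None
    for child in children:
        found = find_key(child, key)
        if found is not None:
            return found
    return None


def docs_pending(url=STATS_URL):
    """Fetch the stats JSON and return the number of uncommitted documents."""
    with urllib.request.urlopen(url) as response:
        payload = json.loads(response.read().decode("utf-8"))
    value = find_key(payload, "docsPending")
    if value is None:
        raise LookupError("docsPending not found in stats response")
    return int(value)
```

An unsafe operation could then call docs_pending() first and abort when
the result is greater than zero.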

Latter, Antoine antoine.lat...@legis.wisconsin.gov writes:

 Thank you, that does help - but I am more looking for a way to get at this 
 programmatically.

 -----Original Message-----
 From: Otis Gospodnetic [mailto:otis_gospodne...@yahoo.com] 
 Sent: Tuesday, November 15, 2011 11:22 AM
 To: solr-user@lucene.apache.org
 Subject: Re: Easy way to tell if there are pending documents

 Antoine,

 On Solr Admin Stats page search for docsPending.  I think this is what you 
 are looking for.

 Otis
 
 Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Lucene ecosystem 
 search :: http://search-lucene.com/



From: Latter, Antoine antoine.lat...@legis.wisconsin.gov
To: 'solr-user@lucene.apache.org' solr-user@lucene.apache.org
Sent: Monday, November 14, 2011 11:39 AM
Subject: Easy way to tell if there are pending documents

Hi Solr,

Does anyone know of an easy way to tell if there are pending documents 
waiting for commit?

Our application performs operations that are never safe to perform
while commits are pending. We make this work by making sure that all
indexing operations end in a commit, and stop the unsafe operations
from running while a commit is running.

This works great most of the time, except when we have enough disk
space to add documents to the pending area, but not enough disk
space to do a commit - then the indexing operations only error out
after they've done all of their adds.

It would be nice if the unsafe operation could somehow detect that there are 
pending documents and abort.

In the interim I'll have the unsafe operation perform a commit when it 
starts, but I've been weeding out useless commits from my app recently and I 
don't like them creeping back in.

Thanks,
Antoine






Re: Geo spatial search with multi-valued locations (SOLR-2155 / lucene-spatial-playground)

2011-09-03 Thread Justin Caratzas
Mike,

I've applied the patch to a trunk checkout from June.  There were some
trivial conflicts, but it was mostly easy to apply.  It has been in
production for a couple of months with no major hiccups so far :).

Justin

Smiley, David W. dsmi...@mitre.org writes:

 Hi Mike.

 I have hopes that LSP will be ready in time for Solr 4. It's usable now with 
 the understanding that it's still fairly early and so there are bound to be 
 bugs. I've been focusing a lot on testing lately.  You could try applying 
 SOLR-2155 but I think there was some Lucene/Solr code re-organization 
 regarding the ValueSource API. It shouldn't be hard to update.  I don't think 
 JTeam's plugin handles multi-value but I could be wrong (Chris Male will be 
 sure to jump in and correct me if so).  QBase/Metacarta has a Solr plugin 
 I've used indirectly through a packaged deal with their products 
 (http://www.metacarta.com/products-overview.htm). I have no idea if you can get 
 it stand-alone. As of a few months ago, it was based on a version of Solr 
 trunk from March 2010 and they have yet to update it.

 ~ David Smiley

 On Aug 29, 2011, at 2:27 PM, Mike Austin wrote:

 Besides the full integration into Solr for this, would you recommend any
 third-party Solr plugins such as
 http://www.jteam.nl/products/spatialsolrplugin.html, or others?
 
 I can understand that spatial features can get complex and there could be
 many use cases, but this seems like a basic feature that you would use
 with a standard set of spatial features like what is in Solr 4 now.
 
 Thanks,
 Mike
 
 On Mon, Aug 29, 2011 at 12:38 PM, Darren Govoni dar...@ontrenet.com wrote:
 
 It doesn't.
 
 
 On 08/29/2011 01:37 PM, Mike Austin wrote:
 
 I've been trying to follow the progress of this and I'm not sure what the
 current status is.  Can someone update me on what is currently in Solr 4,
 and whether it supports multi-valued locations in a single document?  I saw
 that SOLR-2155 was not included and is now lucene-spatial-playground.
 
 Thanks,
 Mike