Solr JMX and Cacti

2009-07-20 Thread Edward Capriolo
Hey all,

We have several deployments of Solr across our enterprise. Our largest
one is several GB, and when enough documents are added an OOM
exception occurs.

To debug this problem I have enabled JMX. My goal is to write some
Cacti templates similar to the ones I have done for Hadoop:
http://www.jointhegrid.com/hadoop/. The only Cacti template for Solr I
have found is old and broken, and uses curl and PHP to try to read
the values off the web interface. I have a few general
questions/comments and would also like to know how others are dealing
with this.

1) SNMP has counters/gauges. With JMX it is hard to know what a
variable is without watching it for a while. Some fields are obvious
(total_x, cumulative_x), but it would be worthwhile to add some notes
to the MBean info saying "works like a counter" or "works like a
gauge". This way a network engineer like me does not have to go code
surfing to figure out how to graph them.

Has anyone written up a list of what the attributes are, their types,
and what they mean?

2) I am assuming the values that are not counter style are sampled.
What is the sampling rate, and is it adjustable?

Any tips are helpful. Thank you,


Re: Solr JMX and Cacti

2009-07-20 Thread Ryan McKinley


Check:
http://svn.apache.org/repos/asf/lucene/solr/trunk/src/java/org/apache/solr/handler/RequestHandlerBase.java

For Cacti, you should probably ignore the two 'rate' based
calculations as they are just derivatives:

  lst.add("avgTimePerRequest", (float) totalTime / (float) this.numRequests);
  lst.add("avgRequestsPerSecond", (float) numRequests*1000 / (float)(System.currentTimeMillis()-handlerStart));
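
Both quoted values are lifetime averages computed from raw counters, so a grapher can instead poll the counter itself and compute a per-interval rate from two successive samples. A minimal sketch of that arithmetic follows. The class name, method names, and sample numbers are hypothetical and not part of Solr.

```java
public class RateSketch {
    /** Requests per second between two polls of a monotonically increasing counter. */
    static double intervalRate(long prevCount, long prevMillis, long currCount, long currMillis) {
        long elapsed = currMillis - prevMillis;
        if (elapsed <= 0 || currCount < prevCount) {
            // No time elapsed, or the counter went backwards (for example after a restart).
            return Double.NaN;
        }
        return (currCount - prevCount) * 1000.0 / elapsed;
    }

    /** Lifetime average, the same formula as avgRequestsPerSecond in the quoted source. */
    static double lifetimeRate(long numRequests, long handlerStartMillis, long nowMillis) {
        return numRequests * 1000.0 / (nowMillis - handlerStartMillis);
    }

    public static void main(String[] args) {
        // Hypothetical samples: two polls taken 300 seconds apart.
        System.out.println("interval rate: " + intervalRate(1000, 0, 1600, 300_000));
        // Hypothetical handler that has served 1600 requests over 3200 seconds.
        System.out.println("lifetime rate: " + lifetimeRate(1600, 0, 3_200_000));
    }
}
```

The interval rate reacts to recent load, while the lifetime average flattens out the longer the handler has been running, which is why the raw counter is usually the better thing to graph.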






Re: Solr JMX and Cacti

2009-07-20 Thread Edward Capriolo
On Mon, Jul 20, 2009 at 12:31 PM, Ryan McKinley <ryan...@gmail.com> wrote:

 On Jul 20, 2009, at 9:16 AM, Edward Capriolo wrote:

 Thanks Ryan,

 Actually, I typically graph the derivatives directly, as graphing a
 derivative is usually easier than writing Cacti CDEFs, which can be
 fickle when exporting the templates between versions. But I see your
 point.

 However, do you see the point I was getting at? Without MBean info
 stating that these values are derivatives, I have to dig through source
 code. It is not a complaint, just a note that there seems to be so
 much work on JMX counters, yet just a few words of description in the
 MBean info would eliminate the need to dig through the source tree
 when it actually comes time for someone to render these counters.


 no doubt -- i am unfamiliar with how these get passed to JMX (or where extra
 docs would be helpful) -- feel free to submit a patch that adds this info,
 perhaps to the wiki? javadoc? this way it will be easier for the next guy


 Also one more question on my mind: how are the JMX objects affected by
 a multi-core deployment? Does each core have its own objects or are
 they shared?


 each core/handler gets its own object -- they are not shared across cores.


 Thank you,
 Edward


Ryan,

After adding a <jmx /> element to solrconfig.xml and setting some
command line -D options, JMX is available on a TCP port. From that
point Java tools can read the values directly.

I have console programs that output the values so Cacti 'data input
methods' can read the data in. I just subclass this
...http://www.jointhegrid.com/svn/hadoop-cacti-jtg/trunk/src/com/jointhegrid/hadoopjmx/JMXBase.java
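
For readers who have not seen this pattern, here is a minimal sketch of such a console program. It is not the JMXBase class linked above. The class name, the argument layout, and the demo object name are assumptions for illustration. It reads the named attributes from one MBean and prints them as space-separated name:value pairs, the form a Cacti script data input method parses for multiple output fields. The URL pattern assumes the JVM's standard remote JMX agent.

```java
import java.lang.management.ManagementFactory;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class JmxCactiSketch {
    /** Reads the named attributes of one MBean into an insertion-ordered map. */
    static Map<String, Object> read(MBeanServerConnection conn, String objectName,
                                    String... attrs) throws Exception {
        ObjectName name = new ObjectName(objectName);
        Map<String, Object> values = new LinkedHashMap<>();
        for (String attr : attrs) {
            values.put(attr, conn.getAttribute(name, attr));
        }
        return values;
    }

    /** Formats the values as "name:value name:value" on a single line. */
    static String toCactiLine(Map<String, Object> values) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Object> e : values.entrySet()) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(e.getKey()).append(':').append(e.getValue());
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        if (args.length >= 3) {
            // Hypothetical argument layout: host:port objectName attribute...
            JMXServiceURL url = new JMXServiceURL(
                    "service:jmx:rmi:///jndi/rmi://" + args[0] + "/jmxrmi");
            try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
                String[] attrs = Arrays.copyOfRange(args, 2, args.length);
                System.out.println(toCactiLine(
                        read(connector.getMBeanServerConnection(), args[1], attrs)));
            }
        } else {
            // No arguments: demo against this JVM's own platform MBean server.
            System.out.println(toCactiLine(read(
                    ManagementFactory.getPlatformMBeanServer(),
                    "java.lang:type=Threading", "ThreadCount")));
        }
    }
}
```

The object name and attribute names to pass for a Solr handler are whatever jconsole shows for that deployment; none are hard-coded here.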

The jconsole GUI tool then allows you to browse the JMX tree. If the
MBean info is filled in, it is displayed directly to the user. Patching
the attributes to have more verbose descriptions would be very
helpful. I will open a Jira for that.
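
The same MBean info that jconsole displays can be read programmatically, which is where a note such as "works like a counter" would surface once it is filled in. A minimal sketch, assuming a hypothetical class name and using the JVM's own Memory MBean as a stand-in for a Solr handler:

```java
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;
import javax.management.MBeanAttributeInfo;
import javax.management.MBeanInfo;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;

public class MBeanInfoSketch {
    /** Returns one "name (type): description" line per attribute of the given MBean. */
    static List<String> describe(MBeanServerConnection conn, String objectName)
            throws Exception {
        MBeanInfo info = conn.getMBeanInfo(new ObjectName(objectName));
        List<String> lines = new ArrayList<>();
        for (MBeanAttributeInfo attr : info.getAttributes()) {
            lines.add(attr.getName() + " (" + attr.getType() + "): "
                    + attr.getDescription());
        }
        return lines;
    }

    public static void main(String[] args) throws Exception {
        // Stand-in object name; pass a different one as the first argument.
        String name = args.length > 0 ? args[0] : "java.lang:type=Memory";
        for (String line : describe(ManagementFactory.getPlatformMBeanServer(), name)) {
            System.out.println(line);
        }
    }
}
```

When an MBean supplies no meaningful description, the last field is empty or simply repeats the attribute name, which is the gap described above.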

Thanks,
Edward