Re: MBean cassandra.db.CompactionManager TotalBytesCompacted counts backwards

2012-10-08 Thread Bryan Talbot
I'm attempting to plot how busy the node is doing compactions but there
seem to be only a few metrics reported that might be suitable:
CompletedTasks, PendingTasks, TotalBytesCompacted,
TotalCompactionsCompleted.

It's not clear to me what the difference between CompletedTasks and
TotalCompactionsCompleted is, but I am plotting TotalCompactionsCompleted /
sec as one metric; however, this rate is nearly always less than 1 and
doesn't capture how many resources the compaction uses.  A compaction of
the 4 smallest SSTables counts the same as a compaction of the 4 largest
SSTables, but the cost is hugely different.  Thus, I'm also plotting
TotalBytesCompacted / sec.
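
A bytes-per-second series like the one described above could be derived
from two successive readings roughly as follows. This is only a sketch:
the function name counter_rate and its parameters are hypothetical, and
dropping a sample pair whose value went backwards is a common monitoring
convention, not something the thread says Cassandra prescribes.

```python
from typing import Optional


def counter_rate(prev_value: int, prev_ts: int,
                 cur_value: int, cur_ts: int) -> Optional[float]:
    """Return units/second between two samples, or None if the pair is unusable.

    Hypothetical helper: timestamps are in seconds, values are raw counter
    readings such as TotalBytesCompacted.
    """
    elapsed = cur_ts - prev_ts
    if elapsed <= 0:
        # Out-of-order or duplicate sample: no meaningful rate.
        return None
    delta = cur_value - prev_value
    if delta < 0:
        # The reading moved backwards (restart, or the attribute is not a
        # true monotonic counter). Skip the pair instead of plotting a
        # negative rate.
        return None
    return delta / elapsed
```

For example, counter_rate(100, 0, 700, 60) gives 10.0, while a pair whose
value decreased yields None and leaves a gap in the plot.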

Since the TotalBytesCompacted value sometimes moves backwards I'm not
confident that it's reporting what it is meant to report.  The code and
comments indicate that it should only be incremented by the final size of
the newly created SSTable or by the bytes-compacted-so-far for a larger
compaction, so I don't see why it should be reasonable for it to sometimes
decrease.

How should the impact of compaction be measured if not by bytes compacted?

-Bryan


On Sun, Oct 7, 2012 at 7:39 AM, Edward Capriolo edlinuxg...@gmail.com wrote:

 I have not looked at this JMX object in a while, however the
 compaction manager can support multiple threads. Also it moves from
 0-filesize each time it has to compact a set of files.

 That is more useful for showing current progress rather than lifetime
 history.







-- 
Bryan Talbot
Architect / Platform team lead, Aeria Games and Entertainment
Silicon Valley | Berlin | Tokyo | Sao Paulo


Re: MBean cassandra.db.CompactionManager TotalBytesCompacted counts backwards

2012-10-07 Thread Edward Capriolo
I have not looked at this JMX object in a while, however the
compaction manager can support multiple threads. Also it moves from
0-filesize each time it has to compact a set of files.

That is more useful for showing current progress rather than lifetime history.
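
To illustrate why a value that tracks current progress can appear to go
backwards when sampled periodically, here is a toy model of the behaviour
described above. It is not Cassandra's code: the task tuples, the
in_flight_bytes function, and the assumption that the reported number is
the summed progress of running compactions are all hypothetical.

```python
def in_flight_bytes(tasks, now):
    """Sum the progress of every task still running at time `now`.

    Each task is (start_second, total_bytes, bytes_per_second). A task
    counts from 0 up towards total_bytes and drops out once it finishes.
    """
    total = 0
    for start, size, speed in tasks:
        done = (now - start) * speed
        if 0 <= done < size:
            total += done
    return total


# Two overlapping compactions handled by separate threads (made-up numbers).
tasks = [(0, 600, 10), (30, 900, 10)]
samples = [in_flight_bytes(tasks, t) for t in (0, 30, 59, 60, 90, 120)]
print(samples)  # [0, 300, 880, 300, 600, 0]
```

In this model the reading falls from 880 to 300 at the moment the first
task completes, even though no work was undone.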



On Fri, Oct 5, 2012 at 7:27 PM, Bryan Talbot btal...@aeriagames.com wrote:
 I've recently added compaction rate (in bytes / second) to my monitors for
 cassandra and am seeing some odd values.  I wasn't expecting the values for
 TotalBytesCompacted to sometimes decrease from one reading to the next.  It
 seems that the value should be monotonically increasing while a server is
 running -- obviously it would start again at 0 when the server is restarted
 or if the counter rolls over (unlikely for a 64 bit long).

 Below are two samples taken 60 seconds apart: the value decreased by
 2,954,369,012 between the two readings.

 reported_metric=[timestamp:1349476449, status:200,
 request:[mbean:org.apache.cassandra.db:type=CompactionManager,
 attribute:TotalBytesCompacted, type:read], value:7548675470069]

 previous_metric=[timestamp:1349476389, status:200,
 request:[mbean:org.apache.cassandra.db:type=CompactionManager,
 attribute:TotalBytesCompacted, type:read], value:7551629839081]


 I briefly looked at the code for CompactionManager and a few related classes
 and don't see anyplace that is performing subtraction explicitly; however,
 there are many additions of signed long values that are not validated and
 could conceivably contain a negative value thus causing the
 totalBytesCompacted to decrease.  It's interesting to note that all of
 the differences I've seen so far are more than the overflow value of a
 signed 32 bit value.  The OS (CentOS 5.7) and sun java vm (1.6.0_29) are
 both 64 bit.  JNA is enabled.

 Is this expected and normal?  If so, what is the correct interpretation of
 this metric?  I'm seeing negative values a few times per hour when
 reading it once every 60 seconds.

 -Bryan



MBean cassandra.db.CompactionManager TotalBytesCompacted counts backwards

2012-10-05 Thread Bryan Talbot
I've recently added compaction rate (in bytes / second) to my monitors for
cassandra and am seeing some odd values.  I wasn't expecting the values for
TotalBytesCompacted to sometimes decrease from one reading to the next.  It
seems that the value should be monotonically increasing while a server is
running -- obviously it would start again at 0 when the server is restarted
or if the counter rolls over (unlikely for a 64 bit long).

Below are two samples taken 60 seconds apart: the value decreased by
2,954,369,012 between the two readings.

reported_metric=[timestamp:1349476449, status:200,
request:[mbean:org.apache.cassandra.db:type=CompactionManager,
attribute:TotalBytesCompacted, type:read], value:7548675470069]

previous_metric=[timestamp:1349476389, status:200,
request:[mbean:org.apache.cassandra.db:type=CompactionManager,
attribute:TotalBytesCompacted, type:read], value:7551629839081]
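
The difference quoted above can be re-derived from the two samples with a
few lines of arithmetic. The dictionary layout is only an assumption made
for this sketch; the timestamps and values are copied from the readings
above.

```python
# Timestamps and values copied from the two readings; the dict shape is
# just a convenient stand-in for the monitor's own record format.
previous = {"timestamp": 1349476389, "value": 7551629839081}
reported = {"timestamp": 1349476449, "value": 7548675470069}

elapsed = reported["timestamp"] - previous["timestamp"]  # 60 seconds
delta = reported["value"] - previous["value"]            # -2954369012 bytes

print(elapsed, delta)
```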


I briefly looked at the code for CompactionManager and a few related
classes and don't see anyplace that is performing subtraction explicitly;
however, there are many additions of signed long values that are not
validated and could conceivably contain a negative value thus causing the
totalBytesCompacted to decrease.  It's interesting to note that all of
the differences I've seen so far are more than the overflow value of a
signed 32 bit value.  The OS (CentOS 5.7) and sun java vm (1.6.0_29) are
both 64 bit.  JNA is enabled.
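
As a quick check of the 32-bit observation, the one decrease quoted in
this message can be compared against the signed and unsigned 32-bit
limits. Only that single value is checked here, since the other observed
differences are not listed; the variable names are illustrative.

```python
INT32_MAX = 2**31 - 1        # 2147483647, largest signed 32-bit value
UINT32_RANGE = 2**32         # 4294967296, number of distinct 32-bit values

observed_drop = 2954369012   # decrease between the two quoted samples

exceeds_int32 = observed_drop > INT32_MAX        # True
below_uint32_range = observed_drop < UINT32_RANGE  # True
print(exceeds_int32, below_uint32_range)
```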

Is this expected and normal?  If so, what is the correct interpretation of
this metric?  I'm seeing negative values a few times per hour when
reading it once every 60 seconds.

-Bryan