Re: Monitoring with Cacti

2010-09-13 Thread Dan Di Spaltro
Cloudkick does monitor JMX now.  That + custom alerts is pretty powerful.

I work for Cloudkick, btw

On Sun, Sep 12, 2010 at 9:39 AM, Dave Viner davevi...@pobox.com wrote:

 I haven't tried cacti, but I'm using CloudKick as an external service for
 monitoring Cassandra.  It's super easy to get set up.  Happy to share my
 setup if that'd help.

 It doesn't currently monitor JMX information, but it does offer some basic
 checks like thread pool and column family stats -
 https://support.cloudkick.com/Cassandra_Checks.

 Dave Viner


 On Fri, Sep 10, 2010 at 8:31 PM, Edward Capriolo edlinuxg...@gmail.com wrote:

 On Fri, Sep 10, 2010 at 7:29 PM, aaron morton aa...@thelastpickle.com
 wrote:
  Am going through the rather painful process of trying to monitor
 cassandra using Cacti (it's what we use at work). At the moment it feels
 like a losing battle :)
 
  Does anyone know of some cacti resources for monitoring the JVM or
 Cassandra metrics other than...
 
  mysql-cacti-templates
  http://code.google.com/p/mysql-cacti-templates/
  - provides templates and data sources that require ssh and can monitor
 JVM heap and a few things.
 
  Cassandra-cacti-m6
  http://www.jointhegrid.com/cassandra/cassandra-cacti-m6.jsp
  Coded for version 0.6* , have made some changes to stop it looking for
 stats that no longer exist. Missing some metrics I think but it's probably
 the best bet so far. If I get it working I'll contribute it back to them.
 Most of the problems were probably down to how much effort it takes to
 set up cacti.
 
  jmxterm
  http://www.cyclopsgroup.org/projects/jmxterm/
  Allows for command line access to JMX. I started down the path of
 writing a cacti data source to use this just to see how it worked. Looks
 like a lot of work.
 
  Thanks for any advice.
  Aaron
 
 

 Setting up cacti is easy, the second time, and third time :)
 As for cassandra-cacti-m6 (I am the author): unfortunately, I have
 been fighting the JMX switcheroo battle for about 3 years now across
 hadoop/hbase/cassandra/hornetq/vserver.

 In a nutshell there is ALWAYS work involved. First, because, as you
 noticed, attributes get changed, removed, added, and renamed. Second, it
 takes a human to logically group things together. For example, if you have
 two items, cache hits and cache misses, you really do not want two separate
 graphs that will scale independently. You want one slick stacked graph,
 with nice colors, and you want a CDEF to calculate the cache hit
 percentage by dividing one into the other and show that at the bottom.
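
As a rough illustration of the arithmetic such a CDEF would perform: the sketch below is plain Python, not Cacti CDEF syntax, and it assumes the percentage means hits divided by hits plus misses, which the message does not spell out. The function name is hypothetical.

```python
def cache_hit_percentage(hits, misses):
    """Return cache hits as a percentage of all lookups.

    Assumption: "hit percentage" means hits / (hits + misses) * 100.
    Returns None when there were no lookups, since the ratio is undefined.
    """
    total = hits + misses
    if total == 0:
        return None
    return 100.0 * hits / total
```

Plotting this single derived value under the stacked hits/misses graph avoids having two independently scaled graphs to compare by eye.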

 If you want to have a 0.7 branch of cassandra-cacti-m6 I would love
 the help. We are not on 0.7 yet, so I have not had the time just to go
 out and make graphs for a version we are not using yet :) but if you
 come up with patches they are happily accepted.

 Edward





-- 
Dan Di Spaltro


Re: TechCrunch article on Twitter and Cassandra

2010-07-10 Thread Dan Di Spaltro
This sounds more like high-throughput external analytics, i.e. they
will know in advance all the queries consumers will run.  This isn't
for internal analytics.

On Sat, Jul 10, 2010 at 10:33 AM, Marty Greenia martygree...@gmail.com wrote:
 It almost seems counter-intuitive. For analytics, you'd think they'd want a
 database that supports more sophisticated query functionality (sql). Whereas
 for everyday tweet storage, something fast and high-throughput (cassandra)
 makes sense.

 I'd be curious to hear the details as well.

 On Sat, Jul 10, 2010 at 10:25 AM, S Ahmed sahmed1...@gmail.com wrote:

 Nice link.
 From what I understood, they are not using it to store tweets but rather
 will use mysql?  I wish they went into more detail as to why...

 On Sat, Jul 10, 2010 at 1:25 AM, Kochheiser,Todd W - TOK-DITT-1
 twkochhei...@bpa.gov wrote:

 A good read.

 http://techcrunch.com/2010/07/09/twitter-analytics-mysql/

 Todd





-- 
Dan Di Spaltro


Re: Time-series data model

2010-04-15 Thread Dan Di Spaltro
This is actually fairly similar to how we store metrics at Cloudkick.
The post below has a much more in-depth explanation of some of that:

https://www.cloudkick.com/blog/2010/mar/02/4_months_with_cassandra/

So we store each natural point in the NumericArchive table.

<ColumnFamily CompareWith="LongType"
  Name="NumericArchive" />

<ColumnFamily CompareWith="LongType" Name="Rollup5m"
  ColumnType="Super" CompareSubcolumnsWith="BytesType" />
<ColumnFamily CompareWith="LongType" Name="Rollup20m"
  ColumnType="Super" CompareSubcolumnsWith="BytesType" />
<ColumnFamily CompareWith="LongType" Name="Rollup30m"
  ColumnType="Super" CompareSubcolumnsWith="BytesType" />
<ColumnFamily CompareWith="LongType" Name="Rollup60m"
  ColumnType="Super" CompareSubcolumnsWith="BytesType" />
<ColumnFamily CompareWith="LongType" Name="Rollup4h"
  ColumnType="Super" CompareSubcolumnsWith="BytesType" />
<ColumnFamily CompareWith="LongType" Name="Rollup12h"
  ColumnType="Super" CompareSubcolumnsWith="BytesType" />
<ColumnFamily CompareWith="LongType" Name="Rollup1d"
  ColumnType="Super" CompareSubcolumnsWith="BytesType" />

our keys look like:
serviceuuid.metric-name

Anyways, this has been working out very well for us.
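
To make the layout above a little more concrete, here is a minimal sketch of how raw points might be folded into 5-minute rollup buckets. It is only an illustration: the message does not say which aggregates the rollup subcolumns hold, so the count/min/max/avg fields, the helper names, and the use of Unix-second timestamps are all assumptions.

```python
from collections import defaultdict


def row_key(service_uuid, metric_name):
    """Build a row key in the serviceuuid.metric-name form described above."""
    return "%s.%s" % (service_uuid, metric_name)


def rollup(points, bucket_seconds=300):
    """Group (unix_timestamp, value) pairs into fixed-width time buckets.

    Returns {bucket_start: {"count", "min", "max", "avg"}}. The bucket start
    plays the role of the LongType supercolumn name; the aggregate names are
    hypothetical stand-ins for the BytesType subcolumns.
    """
    buckets = defaultdict(list)
    for ts, value in points:
        # Truncate each timestamp down to the start of its bucket.
        buckets[ts - ts % bucket_seconds].append(value)
    return {
        start: {
            "count": len(values),
            "min": min(values),
            "max": max(values),
            "avg": sum(values) / len(values),
        }
        for start, values in buckets.items()
    }
```

Calling `rollup(points, bucket_seconds=1200)` would give the 20-minute equivalent, and so on for the other column families.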

2010/4/15 Ted Zlatanov t...@lifelogs.com:
 On Thu, 15 Apr 2010 11:27:47 +0200 Jean-Pierre Bergamin ja...@ractive.ch wrote:

 JB On 14.04.2010 15:22, Ted Zlatanov wrote:
 On Wed, 14 Apr 2010 15:02:29 +0200 Jean-Pierre Bergamin ja...@ractive.ch wrote:

 JB The metrics are stored together with a timestamp. The queries we want to
 JB perform are:
 JB * The last value of a specific metric of a device
 JB * The values of a specific metric of a device between two timestamps t1
 JB and t2

 Make your key devicename-metricname-MMDD-HHMM (with whatever time
 sharding makes sense to you; I use UTC by-hours and by-day in my
 environment).  Then your supercolumn is the collection time as a
 LongType and your columns inside the supercolumn can express the metric
 in detail (collector agent, detailed breakdown, etc.).

 JB Just for my understanding: what is time sharding? I couldn't find an
 JB explanation anywhere. Do you mean that the time-series data is rolled
 JB up in 5 minutes, 1 hour, 1 day etc. slices?

 Yes.  The usual meaning of shard in RDBMS world is to segment your
 database by some criteria, e.g. US vs. Europe in Amazon AWS because
 their data centers are laid out so.  I was taking a linguistic shortcut
 to mean break down your rows by some convenient criteria.  You can
 actually set up your Partitioner in Cassandra to literally shard your
 keyspace rows based on the key, but I just meant slice in my note.

 JB So this would be defined as:
 JB <ColumnFamily Name="measurements" ColumnType="Super"
 JB CompareWith="UTF8Type" CompareSubcolumnsWith="LongType" />

 JB So when I want to read all values of one metric between two timestamps
 JB t0 and t1, I'd have to read the supercolumns that match a key range
 JB (device1:metric1:t0 - device1:metric1:t1) and then all the
 JB supercolumns for this key?

 Yes.  This is a single multiget if you can construct the key range
 explicitly.  Cassandra already loads a lot of this into memory and filters
 it after the fact, which is why it pays to slice your keys and to stitch
 them together on the client side if you have to go across a time
 boundary.  You'll also get better key load balancing with deeper slicing
 if you use the randomizing partitioner.
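
When a query crosses a shard boundary, the client has to list every row key in the range before handing them to one multiget. A minimal sketch of that step, assuming hourly UTC shards and taking the devicename-metricname-MMDD-HHMM layout literally; `shard_keys` is a hypothetical helper, not part of any Cassandra client library.

```python
from datetime import datetime, timedelta


def shard_keys(device, metric, start, end):
    """Return one row key per hour touched by the interval [start, end].

    Assumptions: start and end are naive datetimes already in UTC, shards
    are one hour wide, and keys follow devicename-metricname-MMDD-HHMM.
    """
    # Round the start down to the beginning of its hour.
    current = start.replace(minute=0, second=0, microsecond=0)
    keys = []
    while current <= end:
        keys.append("%s-%s-%s" % (device, metric, current.strftime("%m%d-%H%M")))
        current += timedelta(hours=1)
    return keys
```

The resulting list is what would be passed to the multiget; the per-row results then still need to be stitched together and trimmed to the exact timestamps on the client side.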

 In the result set, you'll get each matching supercolumn with all the
 columns inside it.  You may have to page through supercolumns.

 Ted





-- 
Dan Di Spaltro


Re: Stalled Bootstrapping Process

2010-04-01 Thread Dan Di Spaltro
But I didn't restart the red one.

On Thu, Apr 1, 2010 at 12:18 PM, Jonathan Ellis jbel...@gmail.com wrote:

 There shouldn't be anything to clean up.  (The temporary streaming
 files it anticompacted are automatically removed on restart)

 On Thu, Apr 1, 2010 at 2:17 PM, Dan Di Spaltro dan.dispal...@gmail.com
 wrote:
  Okay, so should I run any more commands like cleanup before?
 
  On Thu, Apr 1, 2010 at 12:09 PM, Jonathan Ellis jbel...@gmail.com
 wrote:
 
  Bootstrap source restarting will always fail bootstrap.  You'll need
  to restart the blue one too now, I'm afraid.
 
  On Thu, Apr 1, 2010 at 2:01 PM, Dan Di Spaltro dan.dispal...@gmail.com wrote:
   Before the Red one rebooted it had 1 active STREAM-STAGE.  Now it has 0 in
   STREAM-STAGE.
  
   On Thu, Apr 1, 2010 at 11:57 AM, Dan Di Spaltro
   dan.dispal...@gmail.com
   wrote:
  
   Red one.
   Gary - both say nothing is happening with no destinations or sources.
  
   On Thu, Apr 1, 2010 at 11:55 AM, Jonathan Ellis jbel...@gmail.com
   wrote:
  
   which node rebooted, the red one, or the blue one?
  
   On Thu, Apr 1, 2010 at 11:26 AM, Dan Di Spaltro
   dan.dispal...@gmail.com
   wrote:
So we are adding another node to the cluster with the latest 0.6 branch
(RC1).  It seems to be hung in some limbo state.
Before bootstrapping, our cluster had 50-60GB spread fairly evenly across 4
machines, with RF=3.  One machine had more load than the others, and sure
enough bootstrapping selected that node.  That is the red machine.  The
light blue machine is the new machine.
I have attached a graph to illustrate when the bootstrap process started.
In jconsole the streamingservice status was "performing anticompaction..."
for over 18-24 hrs.  It is currently in "nothing is happening".  It did
have 1 active STREAM-STAGE task, but the machine had to be rebooted for
something unrelated to cassandra.  Now the light blue machine appears to be
getting data, but it's growing at virtually the same rate as the other
machines, which makes me think it is part of the cluster and not actually
streaming data from the machine it's supposed to.
Any other ideas on how to debug?
   
--
Dan Di Spaltro
   
  
  
  
 




-- 
Dan Di Spaltro


Re: Stalled Bootstrapping Process

2010-04-01 Thread Dan Di Spaltro
Sorry, I meant the red one restarted about a day ago.  The graph shows
the dip in disk space.  But it nowhere near returned to the previous
amount of disk usage.  I was referring to how the red one didn't
reclaim all its space (I figure about 60GB actually belongs on that
machine).  Is that normal (it's currently taking up about 100GB)?

2 minutes ago, I restarted the blue one.

Now the streamservice task is performing anti-compaction on the red one.

On Thu, Apr 1, 2010 at 12:25 PM, Jonathan Ellis jbel...@gmail.com wrote:

 On Thu, Apr 1, 2010 at 2:22 PM, Dan Di Spaltro dan.dispal...@gmail.com 
 wrote:
  But I didn't restart the red one.

On Thu, Apr 1, 2010 at 11:57 AM, Dan Di Spaltro
dan.dispal...@gmail.com
wrote:
   
Red one.
   
On Thu, Apr 1, 2010 at 11:55 AM, Jonathan Ellis jbel...@gmail.com
wrote:
   
which node rebooted, the red one, or the blue one?

 I'm confused.

--
Dan Di Spaltro


Re: Hackathon?!?

2010-03-22 Thread Dan Di Spaltro
Yeah, the 22nd seems like it's as good as it's going to get.

I'll bring your b-day present =)

Best,

On Mon, Mar 22, 2010 at 2:20 PM, Chris Goffinet goffi...@digg.com wrote:

 Dan

 Did we all agree April 22 works for all?

 -Chris


 On Mon, Mar 22, 2010 at 2:13 PM, Dan Di Spaltro dan.dispal...@gmail.com wrote:

 Great - Chris, you still going to put together the invite?

 On Thu, Mar 11, 2010 at 5:36 AM, Jonathan Ellis jbel...@gmail.com wrote:

 Ack, I agreed to speak at http://nosqleu.com/, I never did hear a
 final date but they put up a schedule online (april 20-22).

 But, 22 probably is a better date, and Eric and Stu are fully capable
 of representing rackspace without me. :)

 -Jonathan

 On Wed, Mar 10, 2010 at 10:50 PM, Chris Goffinet goffi...@digg.com
 wrote:
  We could do it on April 22 (1 week later), that's my birthday :-) What
 better way to celebrate haha.
 
  -Chris
 
  On Mar 10, 2010, at 9:58 AM, Jonathan Ellis wrote:
 
  I'm in either way, but if we push it a week later then the twitter
  guys could (a) make it and (b) pimp it at their own conference.
 
  On Wed, Mar 10, 2010 at 12:26 AM, Jeff Hodges jhod...@twitter.com
 wrote:
  Ah, hell. Thought this was the first day. Can't make it.
  --
  Jeff
 
  On Mar 9, 2010 9:32 PM, Ryan King r...@twitter.com wrote:
 
  I'm already committed to talking about cassandra that day at our
  company's developer conference (chirp.twitter.com).
 
  -ryan
 
  On Tue, Mar 9, 2010 at 6:26 PM, Jeff Hodges jhod...@twitter.com
 wrote:
  I'm down.
  --
  Jeff
 
  ...
 
 




 --
 Dan Di Spaltro




 --
 Chris Goffinet




-- 
Dan Di Spaltro