Finding bottleneck of a cluster

2012-07-05 Thread rohit bhatia
Our Cassandra cluster consists of 8 nodes (16 cores, 32G RAM, 12G heap,
1600MB young gen, Cassandra 1.0.5, JDK 1.7, 128 concurrent writer
threads). The replication factor is 2 with 10 column families, and we
serve counter-incrementing, write-intensive tasks (CL=ONE).

I am trying to figure out the bottleneck:

1) Is using JDK 1.7 any way detrimental to cassandra?

2) What is the max write operation qps that should be expected? Is the
Netflix benchmark also applicable for counter-incrementing tasks?

http://techblog.netflix.com/2011/11/benchmarking-cassandra-scalability-on.html

3) At around 50,000 qps for the cluster (~12,500 qps per node), the CPU
idle time is around 30%; Cassandra is not disk bound (insignificant
read operations, CPU iowait around 0.05%) and is not swapping
(around 15 GB of RAM is free or inactive). The average GC pause
times for ParNew are 100 ms, occurring every second, so Cassandra spends
10% of its time stuck in the stop-the-world collector.
The OS load is around 16-20 and the average write latency is 3 ms.
tpstats does not show any significant pending tasks.

At this point, several nodes suddenly start dropping Mutation
messages. There are also lots of pending MutationStage and
ReplicateOnWriteStage tasks in tpstats.
The number of threads in the Java process increases to around 25,000
from the usual 300-400. Almost all the new threads seem to be named
pool-2-thread-*.
The OS load jumps to around 30-40, and the write request latency starts
spiking to more than 500 ms (even to several tens of seconds sometimes).
Even the local write latency increases fourfold, from 50 microseconds
to 200 microseconds. This happens across all the nodes, within around
2-3 minutes.
My guess is that this might be due to the 128 writer threads not being
able to perform more writes (though with an average local write latency
of 100-150 microseconds, each thread should be able to serve 10,000
qps, and with 128 writer threads, a node should be able to serve
1,280,000 qps).
Could there be any other reason for this? What else should I monitor,
since system.log does not seem to say anything conclusive before
messages are dropped?



Thanks
Rohit


Re: GC freeze just after repair session

2012-07-05 Thread Ravikumar Govindarajan
We have modified maxTenuringThreshold from 1 to 5. Maybe it is causing
problems. Will change it back to 1 and see how the system behaves.

concurrent_compactors=8. We will reduce this, since our system can't handle
that many simultaneous compactions anyway. That should also ease GC to some
extent.

Ideally we would like to collect maximum garbage from ParNew itself, during
compactions. What are the steps to take towards achieving this?
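
(A hedged aside, not from the thread: one direction sometimes suggested is to
give short-lived objects more room to die young, i.e. a larger young
generation, roomier survivor spaces, and a low tenuring threshold, so that
compaction garbage is collected by ParNew instead of being promoted into the
CMS old generation. Illustrative values only, relative to the -Xmn800M
configuration quoted below in this thread:

-Xmn1600M -XX:SurvivorRatio=4 -XX:MaxTenuringThreshold=2

Any such change should be validated against GC logs under real compaction
load.)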

On Wed, Jul 4, 2012 at 4:07 PM, aaron morton aa...@thelastpickle.com wrote:

 It *may* have been compaction from the repair, but it's not a big CF.

 I would look at the logs to see how much data was transferred to the node.
 Was there a compaction going on while the GC storm was happening? Do you
 have a lot of secondary indexes ?

 If you think it correlated to compaction you can try reducing the
 concurrent_compactors

 Cheers

 -
 Aaron Morton
 Freelance Developer
 @aaronmorton
 http://www.thelastpickle.com

 On 3/07/2012, at 6:33 PM, Ravikumar Govindarajan wrote:

 Recently, we faced a severe freeze [around 30-40 mins] on one of our
 servers. There were many mutations/reads dropped. The issue happened just
 after a routine nodetool repair for the below CF completed [1.0.7, NTS,
 DC1:3,DC2:2]

 Column Family: MsgIrtConv
 SSTable count: 12
 Space used (live): 17426379140
  Space used (total): 17426379140
 Number of Keys (estimate): 122624
 Memtable Columns Count: 31180
  Memtable Data Size: 81950175
 Memtable Switch Count: 31
 Read Count: 8074156
  Read Latency: 15.743 ms.
 Write Count: 2172404
 Write Latency: 0.037 ms.
  Pending Tasks: 0
 Bloom Filter False Postives: 1258
 Bloom Filter False Ratio: 0.03598
  Bloom Filter Space Used: 498672
 Key cache capacity: 20
 Key cache size: 20
  Key cache hit rate: 0.9965579513062582
 Row cache: disabled
 Compacted row minimum size: 51
  Compacted row maximum size: 89970660
 Compacted row mean size: 226626


 Our heap config is as follows

 -Xms8G -Xmx8G -Xmn800M -XX:+HeapDumpOnOutOfMemoryError -XX:+UseParNewGC
 -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:SurvivorRatio=8
 -XX:MaxTenuringThreshold=5 -XX:CMSInitiatingOccupancyFraction=75
 -XX:+UseCMSInitiatingOccupancyOnly

 from yaml
 in_memory_compaction_limit=64
 compaction_throughput_mb_sec=8
 multi_threaded_compaction=false

 INFO [AntiEntropyStage:1] 2012-06-29 09:21:26,085 AntiEntropyService.java (line 762) [repair #2b6fcbf0-c1f9-11e1--2ea8811bfbff] MsgIrtConv is fully synced
 INFO [AntiEntropySessions:8] 2012-06-29 09:21:26,085 AntiEntropyService.java (line 698) [repair #2b6fcbf0-c1f9-11e1--2ea8811bfbff] session completed successfully
 INFO [CompactionExecutor:857] 2012-06-29 09:21:31,219 CompactionTask.java (line 221) Compacted to [/home/sas/system/data/ZMail/MsgIrtConv-hc-858-Data.db,].  47,907,012 to 40,554,059 (~84% of original) bytes for 4,564 keys at 6.252080MB/s.  Time: 6,186ms.

 After this, the logs were fully filled with GC [ParNew/CMS] messages. ParNew
 ran every 3 seconds, while CMS ran approximately every 30 seconds,
 continuously for 40 minutes.

 INFO [ScheduledTasks:1] 2012-06-29 09:23:39,921 GCInspector.java (line 122) GC for ParNew: 776 ms for 2 collections, 2901990208 used; max is 8506048512
 INFO [ScheduledTasks:1] 2012-06-29 09:23:42,265 GCInspector.java (line 122) GC for ParNew: 2028 ms for 2 collections, 3831282056 used; max is 8506048512

 ...

 INFO [ScheduledTasks:1] 2012-06-29 10:07:53,884 GCInspector.java (line 122) GC for ParNew: 817 ms for 2 collections, 2808685768 used; max is 8506048512
 INFO [ScheduledTasks:1] 2012-06-29 10:07:55,632 GCInspector.java (line 122) GC for ParNew: 1165 ms for 3 collections, 3264696776 used; max is 8506048512
 INFO [ScheduledTasks:1] 2012-06-29 10:07:57,773 GCInspector.java (line 122) GC for ParNew: 1444 ms for 3 collections, 4234372296 used; max is 8506048512
 INFO [ScheduledTasks:1] 2012-06-29 10:07:59,387 GCInspector.java (line 122) GC for ParNew: 1153 ms for 2 collections, 4910279080 used; max is 8506048512
 INFO [ScheduledTasks:1] 2012-06-29 10:08:00,389 GCInspector.java (line 122) GC for ParNew: 697 ms for 2 collections, 4873857072 used; max is 8506048512
 INFO [ScheduledTasks:1] 2012-06-29 10:08:01,443 GCInspector.java (line 122) GC for ParNew: 726 ms for 2 collections, 4941511184 used; max is 8506048512

 After this, the node stabilized and was back up and running. Any pointers
 would be greatly helpful.





Re: datastax aws ami

2012-07-05 Thread Deno Vichas

I did, with no luck, but I got my fire put out.

For some reason one of my nodes upgraded itself after rebooting to fix
the leap second bug; apt-get put on 1.0.8. Seeing that my
cluster was running 1.0.7, I had to upgrade the rest of the nodes.
Upgrading was very simple: stop, apt-get install, start, one node at a time.


thanks,
deno


On 7/4/2012 3:18 AM, aaron morton wrote:

Try the DataStax forums: http://www.datastax.com/support-forums/

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 3/07/2012, at 7:28 AM, Deno Vichas wrote:


is the 2.1 image still around?

On 7/2/2012 11:24 AM, Deno Vichas wrote:

all,

I've got a DataStax 2.1 AMI instance that's screwed up. For some
reason it won't read the config file. What's the recommended way to
replace this node with a new one? It doesn't seem like you can use
the AMI to bring up single nodes, as it wants to do whole clusters.



thanks,
deno



steps to add node in cassandra 1.1.2

2012-07-05 Thread prasenjit mukherjee
I am using Cassandra version 1.1.2. I found the document for adding a node in
version 0.7: http://www.datastax.com/docs/0.7/getting_started/configuring

Is it still valid? Is there documentation on this topic in the
Cassandra twiki/docs?

-Prasenjit


London meetup - 16th July

2012-07-05 Thread Dave Gardner
The next London meetup is coming up on 16th July.

We've got two speakers - Richard Churchill talking about his
experiences rolling out Cassandra at ServiceTick and Tom Wilkie
talking about real time analytics on top of Cassandra.

http://www.meetup.com/Cassandra-London/events/69791362/

Dave


CorruptedBlockException

2012-07-05 Thread Nury Redjepow
Hello to all,

 I have cassandra instance I'm trying to use to store millions of file with 
size ~ 3MB. Data structure is simple, 1 row for 1 file, with row key being the 
id of file.
I'm loaded 1GB of data, and total available space is 10GB. And after a few 
hour, all the available space was taken. In log, it says that while compaction 
a CorruptedBlockException has occured. But I don't understand, how was all the 
available space taken away.

Data structure
CREATE KEYSPACE largeobjects WITH placement_strategy = 'SimpleStrategy'
AND strategy_options={replication_factor:1};

create column family content
  with column_type = 'Standard'  
  and comparator = 'UTF8Type'
  and default_validation_class = 'BytesType'
  and key_validation_class = 'TimeUUIDType'
  and read_repair_chance = 0.1
  and dclocal_read_repair_chance = 0.0
  and gc_grace = 864000
  and min_compaction_threshold = 4
  and max_compaction_threshold = 32
  and replicate_on_write = true
  and compaction_strategy = 'SizeTieredCompactionStrategy'
  and caching = 'keys_only';


Log messages

INFO [FlushWriter:9] 2012-07-04 19:56:00,783 Memtable.java (line 266) Writing Memtable-content@240294142(3955135/49439187 serialized/live bytes, 91 ops)
 INFO [FlushWriter:9] 2012-07-04 19:56:00,814 Memtable.java (line 307) Completed flushing /var/lib/cassandra/data/largeobjects/content/largeobjects-content-hd-1608-Data.db (1991862 bytes) for commitlog position ReplayPosition(segmentId=24245436475633, position=78253718)
 INFO [OptionalTasks:1] 2012-07-04 19:56:02,784 MeteredFlusher.java (line 62) flushing high-traffic column family CFS(Keyspace='largeobjects', ColumnFamily='content') (estimated 46971537 bytes)
 INFO [OptionalTasks:1] 2012-07-04 19:56:02,784 ColumnFamilyStore.java (line 633) Enqueuing flush of Memtable-content@1755783901(3757723/46971537 serialized/live bytes, 121 ops)
 INFO [FlushWriter:9] 2012-07-04 19:56:02,785 Memtable.java (line 266) Writing Memtable-content@1755783901(3757723/46971537 serialized/live bytes, 121 ops)
 INFO [FlushWriter:9] 2012-07-04 19:56:02,835 Memtable.java (line 307) Completed flushing /var/lib/cassandra/data/largeobjects/content/largeobjects-content-hd-1609-Data.db (1894897 bytes) for commitlog position ReplayPosition(segmentId=24245436475633, position=82028986)
 INFO [OptionalTasks:1] 2012-07-04 19:56:04,785 MeteredFlusher.java (line 62) flushing high-traffic column family CFS(Keyspace='largeobjects', ColumnFamily='content') (estimated 56971025 bytes)
 INFO [OptionalTasks:1] 2012-07-04 19:56:04,785 ColumnFamilyStore.java (line 633) Enqueuing flush of Memtable-content@1441175031(4557682/56971025 serialized/live bytes, 124 ops)
 INFO [FlushWriter:9] 2012-07-04 19:56:04,786 Memtable.java (line 266) Writing Memtable-content@1441175031(4557682/56971025 serialized/live bytes, 124 ops)
 INFO [FlushWriter:9] 2012-07-04 19:56:04,814 Memtable.java (line 307) Completed flushing /var/lib/cassandra/data/largeobjects/content/largeobjects-content-hd-1610-Data.db (2287280 bytes) for commitlog position ReplayPosition(segmentId=24245436475633, position=86604648)
 INFO [CompactionExecutor:39] 2012-07-04 19:56:04,815 CompactionTask.java (line 109) Compacting [SSTableReader(path='/var/lib/cassandra/data/largeobjects/content/largeobjects-content-hd-1610-Data.db'), SSTableReader(path='/var/lib/cassandra/data/largeobjects/content/largeobjects-content-hd-1608-Data.db'), SSTableReader(path='/var/lib/cassandra/data/largeobjects/content/largeobjects-content-hd-1609-Data.db'), SSTableReader(path='/var/lib/cassandra/data/largeobjects/content/largeobjects-content-hd-1607-Data.db')]
 INFO [OptionalTasks:1] 2012-07-04 19:56:05,786 MeteredFlusher.java (line 62) flushing high-traffic column family CFS(Keyspace='largeobjects', ColumnFamily='content') (estimated 28300225 bytes)
 INFO [OptionalTasks:1] 2012-07-04 19:56:05,786 ColumnFamilyStore.java (line 633) Enqueuing flush of Memtable-content@1828084851(2264018/28300225 serialized/live bytes, 38 ops)
 INFO [FlushWriter:9] 2012-07-04 19:56:05,787 Memtable.java (line 266) Writing Memtable-content@1828084851(2264018/28300225 serialized/live bytes, 38 ops)
 INFO [FlushWriter:9] 2012-07-04 19:56:05,823 Memtable.java (line 307) Completed flushing /var/lib/cassandra/data/largeobjects/content/largeobjects-content-hd-1612-Data.db (1134604 bytes) for commitlog position ReplayPosition(segmentId=24245436475633, position=88874176)
ERROR [CompactionExecutor:39] 2012-07-04 19:56:06,667 AbstractCassandraDaemon.java (line 134) Exception in thread Thread[CompactionExecutor:39,1,main]
java.io.IOError: org.apache.cassandra.io.compress.CorruptedBlockException: (/var/lib/cassandra/data/largeobjects/content/largeobjects-content-hd-1610-Data.db): corruption detected, chunk at 1573104 of length 65545.
 at org.apache.cassandra.db.compaction.PrecompactedRow.merge(PrecompactedRow.java:116)
 at org.apache.cassandra.db.compaction.PrecompactedRow.<init>(PrecompactedRow.java:99)
 at 

Re: Finding bottleneck of a cluster

2012-07-05 Thread rohit bhatia
Also,


Looking at the GC log, I see messages like this across different servers
before they start dropping messages:

2012-07-04T10:48:20.336+: 96771.117: [GC 96771.118: [ParNew:
1367297K->57371K(1474560K), 0.0617350 secs]
6641571K->5340088K(12419072K), 0.0634460 secs] [Times: user=0.56
sys=0.01, real=0.06 secs]
Total time for which application threads were stopped: 0.0850010 seconds
Total time for which application threads were stopped: 16.7663710 seconds

The 16-second pause doesn't seem to be caused by the minor/major GCs,
which are quite fast and are also logged. The "Total time for which ..."
messages are produced by the PrintGCApplicationStoppedTime parameter, which
is supposed to log whenever threads reach a safepoint. Is there
any way I can figure out what caused the Java threads to pause?
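
(A hedged aside: HotSpot can attribute such pauses to their safepoint
operations. As I understand the JDK 7 flags, so verify on your build,
something like:

-XX:+PrintSafepointStatistics -XX:PrintSafepointStatisticsCount=1

prints, for each safepoint, the VM operation that triggered it (a GC, a
thread dump, bias revocation, etc.) along with its spin/block/sync times,
which separates GC pauses from other stop-the-world causes.)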

Thanks
Rohit

On Thu, Jul 5, 2012 at 12:19 PM, rohit bhatia rohit2...@gmail.com wrote:
 Our Cassandra cluster consists of 8 nodes (16 cores, 32G RAM, 12G heap,
 1600MB young gen, Cassandra 1.0.5, JDK 1.7, 128 concurrent writer
 threads). The replication factor is 2 with 10 column families, and we
 serve counter-incrementing, write-intensive tasks (CL=ONE).

 I am trying to figure out the bottleneck,

 1) Is using JDK 1.7 any way detrimental to cassandra?

 2) What is the max write operation qps that should be expected? Is the
 Netflix benchmark also applicable for counter-incrementing tasks?
 
 http://techblog.netflix.com/2011/11/benchmarking-cassandra-scalability-on.html

 3) At around 50,000 qps for the cluster (~12,500 qps per node), the CPU
 idle time is around 30%; Cassandra is not disk bound (insignificant
 read operations, CPU iowait around 0.05%) and is not swapping
 (around 15 GB of RAM is free or inactive). The average GC pause
 times for ParNew are 100 ms, occurring every second, so Cassandra spends
 10% of its time stuck in the stop-the-world collector.
 The OS load is around 16-20 and the average write latency is 3 ms.
 tpstats does not show any significant pending tasks.

 At this point, several nodes suddenly start dropping Mutation
 messages. There are also lots of pending MutationStage and
 ReplicateOnWriteStage tasks in tpstats.
 The number of threads in the Java process increases to around 25,000
 from the usual 300-400. Almost all the new threads seem to be named
 pool-2-thread-*.
 The OS load jumps to around 30-40, and the write request latency starts
 spiking to more than 500 ms (even to several tens of seconds sometimes).
 Even the local write latency increases fourfold, from 50 microseconds
 to 200 microseconds. This happens across all the nodes, within around
 2-3 minutes.
 My guess is that this might be due to the 128 writer threads not being
 able to perform more writes (though with an average local write latency
 of 100-150 microseconds, each thread should be able to serve 10,000
 qps, and with 128 writer threads, a node should be able to serve
 1,280,000 qps).
 Could there be any other reason for this? What else should I monitor,
 since system.log does not seem to say anything conclusive before
 messages are dropped?



 Thanks
 Rohit


Re: cassandra on re-Start

2012-07-05 Thread puneet loya
-- Forwarded message --
From: Rob Coli rc...@palominodb.com
Date: Mon, Jul 2, 2012 at 11:19 PM
Subject: Re: cassandra on re-Start
To: user@cassandra.apache.org


On Mon, Jul 2, 2012 at 5:43 AM, puneet loya puneetl...@gmail.com wrote:
 When I restarted the system, it showed that the keyspace does not exist.

 It is not even letting me create the keyspace with the same name again.

Paste the error you get.

=Rob

--
=Robert Coli
AIMGTALK - rc...@palominodb.com
YAHOO - rcoli.palominob
SKYPE - rcoli_palominodb

The name of the keyspace is DA.
On trying to create the keyspace, it gives an exception; I am getting a
TTransportException when creating the keyspace.

The previous keyspace 'DA' that was created still exists, because when I
checked the folders, the folder named 'DA' is still there, but I cannot
access it.


Cheers,
Puneet


Upgrade for Cassandra 0.8.4 to 1.+

2012-07-05 Thread Raj N
Hi experts,
 I am planning to upgrade from 0.8.4 to 1.+. Whats the latest stable
version?

Thanks
-Rajesh


batch_mutate

2012-07-05 Thread Leonid Ilyevsky
My current way of inserting rows one by one is too slow (I use CQL3 prepared
statements), so I want to try batch_mutate.

Could anybody give me more details about the interface? In the javadoc it says:

public void batch_mutate(java.util.Map<java.nio.ByteBuffer, java.util.Map<java.lang.String, java.util.List<Mutation>>> mutation_map,
                         ConsistencyLevel consistency_level)
              throws InvalidRequestException, UnavailableException, TimedOutException, org.apache.thrift.TException

Description copied from interface: Cassandra.Iface
Mutate many columns or super columns for many row keys. See also: Mutation.
mutation_map maps key to column family to a list of Mutation objects to take
place at that scope.


I need to understand the meaning of the elements of the mutation_map parameter.
My guess is that the key in the outer map is the column family name; is this
correct? The key in the inner map is probably a key into the column family (it
is somewhat confusing that it is a String while the outer key is a ByteBuffer;
I wonder what the rationale is). If this is correct, how should I do it if my
key is a composite one? Does anybody have an example?

Thanks,

Leonid




Re: Upgrade for Cassandra 0.8.4 to 1.+

2012-07-05 Thread rohit bhatia
http://cassandra.apache.org/ says 1.1.2

On Thu, Jul 5, 2012 at 7:46 PM, Raj N raj.cassan...@gmail.com wrote:
 Hi experts,
  I am planning to upgrade from 0.8.4 to 1.+. What's the latest stable
 version?

 Thanks
 -Rajesh


RE: batch_mutate

2012-07-05 Thread Leonid Ilyevsky
I actually found an answer to my first question at 
http://wiki.apache.org/cassandra/API. So I got it wrong: actually the outer key 
is the key in the table, and the inner key is the table name (this was somewhat 
counter-intuitive). Does it mean that the popular use case is when we need to 
update multiple column families using the same key? Shouldn't we design our 
space in such a way that those columns live in the same column family?
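
(For illustration, a minimal sketch of assembling such a mutation_map with the
raw Thrift API; the row key, column family name, and column contents are
hypothetical, and client is assumed to be an already-open Cassandra.Client:

import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import org.apache.cassandra.thrift.*;
import org.apache.cassandra.utils.ByteBufferUtil;

// One column to write, with a microsecond timestamp.
Column col = new Column(ByteBufferUtil.bytes("colName"));
col.setValue(ByteBufferUtil.bytes("colValue"));
col.setTimestamp(System.currentTimeMillis() * 1000);

// Wrap the column in a Mutation.
ColumnOrSuperColumn cosc = new ColumnOrSuperColumn();
cosc.setColumn(col);
Mutation m = new Mutation();
m.setColumn_or_supercolumn(cosc);

// Outer key = row key (ByteBuffer); inner key = column family name (String).
Map<ByteBuffer, Map<String, List<Mutation>>> mutationMap =
    new HashMap<ByteBuffer, Map<String, List<Mutation>>>();
Map<String, List<Mutation>> byCf = new HashMap<String, List<Mutation>>();
byCf.put("MyColumnFamily", Arrays.asList(m));
mutationMap.put(ByteBufferUtil.bytes("rowKey1"), byCf);

client.batch_mutate(mutationMap, ConsistencyLevel.ONE);

A composite row key would be serialized into that outer ByteBuffer the same
way as any other key.)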

From: Leonid Ilyevsky [mailto:lilyev...@mooncapital.com]
Sent: Thursday, July 05, 2012 10:39 AM
To: 'user@cassandra.apache.org'
Subject: batch_mutate

My current way of inserting rows one by one is too slow (I use CQL3 prepared
statements), so I want to try batch_mutate.

Could anybody give me more details about the interface? In the javadoc it says:

public void batch_mutate(java.util.Map<java.nio.ByteBuffer, java.util.Map<java.lang.String, java.util.List<Mutation>>> mutation_map,
                         ConsistencyLevel consistency_level)
              throws InvalidRequestException, UnavailableException, TimedOutException, org.apache.thrift.TException

Description copied from interface: Cassandra.Iface
Mutate many columns or super columns for many row keys. See also: Mutation.
mutation_map maps key to column family to a list of Mutation objects to take
place at that scope.


I need to understand the meaning of the elements of the mutation_map parameter.
My guess is that the key in the outer map is the column family name; is this
correct? The key in the inner map is probably a key into the column family (it
is somewhat confusing that it is a String while the outer key is a ByteBuffer;
I wonder what the rationale is). If this is correct, how should I do it if my
key is a composite one? Does anybody have an example?

Thanks,

Leonid






Wide rows and reads

2012-07-05 Thread Oleg Dulin

Here is my flow:

One process writes a really wide row (250K+ supercolumns, each with
5 subcolumns, for a total of 1K or so per supercolumn).


Second process comes in literally 2-3 seconds later and starts reading from it.

My observation is that nothing good happens. It is ridiculously slow to 
read. It seems that if I wait long enough, the reads from that row will 
be much faster.


Could someone enlighten me as to what exactly happens when I do this ?

Regards,
Oleg




Re: Wide rows and reads

2012-07-05 Thread Philip Shon
From what I understand, wide rows have quite a bit of overhead, especially
if you are picking columns that are far apart from each other for a given
row.

This post by Aaron Morton was quite good at explaining this issue
http://thelastpickle.com/2011/07/04/Cassandra-Query-Plans/

-Phil

On Thu, Jul 5, 2012 at 12:17 PM, Oleg Dulin oleg.du...@gmail.com wrote:

 Here is my flow:

 One process writes a really wide row (250K+ supercolumns, each with 5
 subcolumns, for a total of 1K or so per supercolumn)

 Second process comes in literally 2-3 seconds later and starts reading
 from it.

 My observation is that nothing good happens. It is ridiculously slow to
 read. It seems that if I wait long enough, the reads from that row will be
 much faster.

 Could someone enlighten me as to what exactly happens when I do this ?

 Regards,
 Oleg





JNA on Windows

2012-07-05 Thread Fredrik Stigbäck
Hello.
I have a question regarding JNA and Windows.
I read about the problem that taking snapshots might require 2x the
process space due to how hard links are created.
Is JNA for Windows supported?
Looking at JIRA issue
https://issues.apache.org/jira/browse/CASSANDRA-1371 it looks like it, but
checking the Cassandra code base
(org.apache.cassandra.utils.CLibrary) the only thing I see is
Native.register("c"), which tries to load the C library. I think that
doesn't exist on Windows, which will result in creating links with cmd
or fsutil, and which might then trigger these extensive memory
requirements.
I'd be happy if someone could shed some light on this issue.
Regards
/Fredrik


Composite Slice Query returning non-sliced data

2012-07-05 Thread Sunit Randhawa
Hello,

I have 2 Columns for a 'RowKey' as below:

 #1: set CF['RowKey']['1000']='A=1,B=2';
 #2: set CF['RowKey']['1000:C1']='A=2,B=3';

#2 has the composite column and #1 does not.

Now when I execute the composite slice query by 1000 and C1, I get
both of the columns above.

I am hoping to get #2 only, since I am specifically providing C1 as
the start and finish of the composite range with
Composite.ComponentEquality.EQUAL.


I am not sure if this is by design.

Thanks,
Sunit.


Re: Enable CQL3 from Astyanax

2012-07-05 Thread aaron morton
Can you provide an example ? 

select * should return all the columns from the CF. 

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 5/07/2012, at 4:31 AM, Thierry Templier wrote:

 Thanks Aaron.
 
 I wonder if it's possible to obtain columns from a CQL 3 select query (with a
 select *) that aren't defined in the create table. These fields are present
 when all attributes are loaded but not when using CQL3. Is this the normal
 behavior? Thanks very much!
 
 Thierry
 
 Thanks for contributing. 
 
 I'm behind the curve on CQL 3, but here is a post about some of the changes 
 http://www.datastax.com/dev/blog/whats-new-in-cql-3-0
 
 Cheers
 
 
 -
 Aaron Morton
 Freelance Developer
 @aaronmorton
 http://www.thelastpickle.com
 
 On 28/06/2012, at 2:30 AM, Thierry Templier wrote:
 
 Hello Aaron,
 
 I created an issue on the Astyanax github for this problem. I added a fix 
 to support CQL3 in the tool.
 See the link https://github.com/Netflix/astyanax/issues/75.
 
 Thierry
 
 Had a quick look, the current master does not appear to support it. 
 
 Cheers
 
 
 



Re: Thrift version and OOM errors

2012-07-05 Thread aaron morton
agree. 

It's a good idea to remove as many variables as possible and get to a
stable/known state. Use a clean install and a well-known client and see if the
problems persist.

Cheers
 
-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 5/07/2012, at 4:58 PM, Tristan Seligmann wrote:

 On Jul 4, 2012 2:02 PM, Vasileios Vlachos vasileiosvlac...@gmail.com 
 wrote:
 
  Any ideas what could be causing strange message lengths?
 
 One cause of this that I've seen is a client using unframed Thrift transport 
 while the server expects framed, or vice versa. I suppose a similar cause 
 could be something that is not a Thrift client at all mistakenly connecting 
 to Cassandra's Thrift port. 
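
(For reference, a minimal sketch of a framed Thrift connection in Java,
assuming the default RPC port 9160; an unframed client would omit the
TFramedTransport wrapper, and both sides must agree:

import org.apache.cassandra.thrift.Cassandra;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.transport.TFramedTransport;
import org.apache.thrift.transport.TSocket;
import org.apache.thrift.transport.TTransport;

// Framed transport length-prefixes every message; a mismatch with the
// server's expectation shows up as nonsensical "message length" errors.
TTransport transport = new TFramedTransport(new TSocket("127.0.0.1", 9160));
Cassandra.Client client = new Cassandra.Client(new TBinaryProtocol(transport));
transport.open();
// ... issue requests ...
transport.close();

Exception handling is omitted for brevity.)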
 



Re: GC freeze just after repair session

2012-07-05 Thread aaron morton
 Ideally we would like to collect maximum garbage from ParNew itself, during 
 compactions. What are the steps to take towards to achieving this?
I'm not sure what you are asking. 

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 5/07/2012, at 6:56 PM, Ravikumar Govindarajan wrote:

 We have modified maxTenuringThreshold from 1 to 5. Maybe it is causing
 problems. Will change it back to 1 and see how the system behaves.
 
 concurrent_compactors=8. We will reduce this, since our system can't handle
 that many simultaneous compactions anyway. That should also ease GC to some
 extent.
 
 Ideally we would like to collect maximum garbage from ParNew itself, during
 compactions. What are the steps to take towards achieving this?
 
 On Wed, Jul 4, 2012 at 4:07 PM, aaron morton aa...@thelastpickle.com wrote:
 It *may* have been compaction from the repair, but it's not a big CF.
 
 I would look at the logs to see how much data was transferred to the node. 
 Was their a compaction going on while the GC storm was happening ? Do you 
 have a lot of secondary indexes ? 
 
 If you think it correlated to compaction you can try reducing the 
 concurrent_compactors 
 
 Cheers
 
 -
 Aaron Morton
 Freelance Developer
 @aaronmorton
 http://www.thelastpickle.com
 
 On 3/07/2012, at 6:33 PM, Ravikumar Govindarajan wrote:
 
 Recently, we faced a severe freeze [around 30-40 mins] on one of our 
 servers. There were many mutations/reads dropped. The issue happened just 
 after a routine nodetool repair for the below CF completed [1.0.7, NTS, 
 DC1:3,DC2:2]
 
  Column Family: MsgIrtConv
  SSTable count: 12
  Space used (live): 17426379140
  Space used (total): 17426379140
  Number of Keys (estimate): 122624
  Memtable Columns Count: 31180
  Memtable Data Size: 81950175
  Memtable Switch Count: 31
  Read Count: 8074156
  Read Latency: 15.743 ms.
  Write Count: 2172404
  Write Latency: 0.037 ms.
  Pending Tasks: 0
  Bloom Filter False Postives: 1258
  Bloom Filter False Ratio: 0.03598
  Bloom Filter Space Used: 498672
  Key cache capacity: 20
  Key cache size: 20
  Key cache hit rate: 0.9965579513062582
  Row cache: disabled
  Compacted row minimum size: 51
  Compacted row maximum size: 89970660
  Compacted row mean size: 226626
 
 
 Our heap config is as follows
 
 -Xms8G -Xmx8G -Xmn800M -XX:+HeapDumpOnOutOfMemoryError -XX:+UseParNewGC 
 -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:SurvivorRatio=8 
 -XX:MaxTenuringThreshold=5 -XX:CMSInitiatingOccupancyFraction=75 
 -XX:+UseCMSInitiatingOccupancyOnly
 
 from yaml
 in_memory_compaction_limit=64
 compaction_throughput_mb_sec=8
 multi_threaded_compaction=false
 
  INFO [AntiEntropyStage:1] 2012-06-29 09:21:26,085 AntiEntropyService.java 
 (line 762) [repair #2b6fcbf0-c1f9-11e1--2ea8811bfbff] MsgIrtConv is 
 fully synced
  INFO [AntiEntropySessions:8] 2012-06-29 09:21:26,085 
 AntiEntropyService.java (line 698) [repair 
 #2b6fcbf0-c1f9-11e1--2ea8811bfbff] session completed successfully
  INFO [CompactionExecutor:857] 2012-06-29 09:21:31,219 CompactionTask.java 
 (line 221) Compacted to 
 [/home/sas/system/data/ZMail/MsgIrtConv-hc-858-Data.db,].  47,907,012 to 
 40,554,059 (~84% of original) bytes for 4,564 keys at 6.252080MB/s.  Time: 
 6,186ms.
 
 After this, the logs were fully filled with GC [ParNew/CMS] messages. ParNew
 ran every 3 seconds, while CMS ran approximately every 30 seconds,
 continuously for 40 minutes.
 
  INFO [ScheduledTasks:1] 2012-06-29 09:23:39,921 GCInspector.java (line 122) 
 GC for ParNew: 776 ms for 2 collections, 2901990208 used; max is 8506048512
  INFO [ScheduledTasks:1] 2012-06-29 09:23:42,265 GCInspector.java (line 122) 
 GC for ParNew: 2028 ms for 2 collections, 3831282056 used; max is 8506048512
 
 .
 
  INFO [ScheduledTasks:1] 2012-06-29 10:07:53,884 GCInspector.java (line 122) 
 GC for ParNew: 817 ms for 2 collections, 2808685768 used; max is 8506048512
  INFO [ScheduledTasks:1] 2012-06-29 10:07:55,632 GCInspector.java (line 122) 
 GC for ParNew: 1165 ms for 3 collections, 3264696776 used; max is 8506048512
  INFO [ScheduledTasks:1] 2012-06-29 10:07:57,773 GCInspector.java (line 122) 
 GC for ParNew: 1444 ms for 3 collections, 4234372296 used; max is 8506048512
  INFO [ScheduledTasks:1] 2012-06-29 10:07:59,387 GCInspector.java (line 122) 
 GC for ParNew: 1153 ms for 2 collections, 4910279080 used; max is 8506048512
  INFO [ScheduledTasks:1] 2012-06-29 10:08:00,389 GCInspector.java (line 122) 
 GC for ParNew: 697 ms for 2 collections, 4873857072 used; max is 8506048512
  INFO [ScheduledTasks:1] 2012-06-29 10:08:01,443 

Re: steps to add node in cassandra 1.1.2

2012-07-05 Thread aaron morton
1.1 docs for the same:
http://www.datastax.com/docs/1.1/operations/cluster_management

Cheers


-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 5/07/2012, at 9:17 PM, prasenjit mukherjee wrote:

 I am using Cassandra version 1.1.2. I found the document for adding a node in
 version 0.7: http://www.datastax.com/docs/0.7/getting_started/configuring
 
 Is it still valid? Is there documentation on this topic in the
 Cassandra twiki/docs?
 
 -Prasenjit



Composite key in thrift java api

2012-07-05 Thread Leonid Ilyevsky
I need to create a ByteBuffer instance containing the proper composite key,
based on the values of the components of the key. I am going to use it for an
update operation.
I tried simply concatenating the buffers corresponding to the components, but
I am not sure this is correct, because I am getting an exception that comes
from the server:

InvalidRequestException(why:Not enough bytes to read value of component 0)

In the server log I see this:

org.apache.thrift.transport.TTransportException: Cannot read. Remote side has 
closed. Tried to read 4 bytes, but only got 0 bytes. (This is often indicative 
of an internal error on the server side. Please check your server logs.)

(I believe here when it says server side it actually means client, because it 
is the server's log).

Seems like the buffer that my client sends is too short. I suspect there is a
way in Thrift to do it properly, but I don't know how.
Hector appears to have a Composite class that may help, but at this point
I would really prefer to do it in Cassandra itself, without Hector.

Thanks!

Leonid





Re: CorruptedBlockException

2012-07-05 Thread aaron morton
 But I don't understand, how was all the available space taken away.
Take a look on disk at /var/lib/cassandra/data/your_keyspace and 
/var/lib/cassandra/commitlog to see what is taking up a lot of space. 

Cassandra stores the column names as well as the values, so that can take up 
some space. 

 it says that a CorruptedBlockException occurred during compaction.
Are you able to reproduce this error ? 

Thanks

 
-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 6/07/2012, at 12:04 AM, Nury Redjepow wrote:

 Hello to all,
 
  I have a Cassandra instance I'm trying to use to store millions of files of
 ~3 MB each. The data structure is simple: 1 row per file, with the row key
 being the id of the file.
 I loaded 1 GB of data, and the total available space is 10 GB. After a few
 hours, all the available space was taken. The log says that a
 CorruptedBlockException occurred during compaction. But I don't understand
 how all the available space was taken away.
 
 Data structure
 CREATE KEYSPACE largeobjects WITH placement_strategy = 'SimpleStrategy'
 AND strategy_options={replication_factor:1};
 
 create column family content
   with column_type = 'Standard'  
   and comparator = 'UTF8Type'
   and default_validation_class = 'BytesType'
   and key_validation_class = 'TimeUUIDType'
   and read_repair_chance = 0.1
   and dclocal_read_repair_chance = 0.0
   and gc_grace = 864000
   and min_compaction_threshold = 4
   and max_compaction_threshold = 32
   and replicate_on_write = true
   and compaction_strategy = 'SizeTieredCompactionStrategy'
   and caching = 'keys_only';
 
 
 Log messages
 
 INFO [FlushWriter:9] 2012-07-04 19:56:00,783 Memtable.java (line 266) Writing Memtable-content@240294142(3955135/49439187 serialized/live bytes, 91 ops)
 INFO [FlushWriter:9] 2012-07-04 19:56:00,814 Memtable.java (line 307) Completed flushing /var/lib/cassandra/data/largeobjects/content/largeobjects-content-hd-1608-Data.db (1991862 bytes) for commitlog position ReplayPosition(segmentId=24245436475633, position=78253718)
 INFO [OptionalTasks:1] 2012-07-04 19:56:02,784 MeteredFlusher.java (line 62) flushing high-traffic column family CFS(Keyspace='largeobjects', ColumnFamily='content') (estimated 46971537 bytes)
 INFO [OptionalTasks:1] 2012-07-04 19:56:02,784 ColumnFamilyStore.java (line 633) Enqueuing flush of Memtable-content@1755783901(3757723/46971537 serialized/live bytes, 121 ops)
 INFO [FlushWriter:9] 2012-07-04 19:56:02,785 Memtable.java (line 266) Writing Memtable-content@1755783901(3757723/46971537 serialized/live bytes, 121 ops)
 INFO [FlushWriter:9] 2012-07-04 19:56:02,835 Memtable.java (line 307) Completed flushing /var/lib/cassandra/data/largeobjects/content/largeobjects-content-hd-1609-Data.db (1894897 bytes) for commitlog position ReplayPosition(segmentId=24245436475633, position=82028986)
 INFO [OptionalTasks:1] 2012-07-04 19:56:04,785 MeteredFlusher.java (line 62) flushing high-traffic column family CFS(Keyspace='largeobjects', ColumnFamily='content') (estimated 56971025 bytes)
 INFO [OptionalTasks:1] 2012-07-04 19:56:04,785 ColumnFamilyStore.java (line 633) Enqueuing flush of Memtable-content@1441175031(4557682/56971025 serialized/live bytes, 124 ops)
 INFO [FlushWriter:9] 2012-07-04 19:56:04,786 Memtable.java (line 266) Writing Memtable-content@1441175031(4557682/56971025 serialized/live bytes, 124 ops)
 INFO [FlushWriter:9] 2012-07-04 19:56:04,814 Memtable.java (line 307) Completed flushing /var/lib/cassandra/data/largeobjects/content/largeobjects-content-hd-1610-Data.db (2287280 bytes) for commitlog position ReplayPosition(segmentId=24245436475633, position=86604648)
 INFO [CompactionExecutor:39] 2012-07-04 19:56:04,815 CompactionTask.java (line 109) Compacting [SSTableReader(path='/var/lib/cassandra/data/largeobjects/content/largeobjects-content-hd-1610-Data.db'), SSTableReader(path='/var/lib/cassandra/data/largeobjects/content/largeobjects-content-hd-1608-Data.db'), SSTableReader(path='/var/lib/cassandra/data/largeobjects/content/largeobjects-content-hd-1609-Data.db'), SSTableReader(path='/var/lib/cassandra/data/largeobjects/content/largeobjects-content-hd-1607-Data.db')]
 INFO [OptionalTasks:1] 2012-07-04 19:56:05,786 MeteredFlusher.java (line 62) flushing high-traffic column family CFS(Keyspace='largeobjects', ColumnFamily='content') (estimated 28300225 bytes)
 INFO [OptionalTasks:1] 2012-07-04 19:56:05,786 ColumnFamilyStore.java (line 633) Enqueuing flush of Memtable-content@1828084851(2264018/28300225 serialized/live bytes, 38 ops)
 INFO [FlushWriter:9] 2012-07-04 19:56:05,787 Memtable.java (line 266) Writing Memtable-content@1828084851(2264018/28300225 serialized/live bytes, 38 ops)
 INFO [FlushWriter:9] 2012-07-04 19:56:05,823 Memtable.java (line 307) Completed flushing /var/lib/cassandra/data/largeobjects/content/largeobjects-content-hd-1612-Data.db (1134604

Re: cassandra on re-Start

2012-07-05 Thread aaron morton
Sounds like this problem in 1.1.0:
https://issues.apache.org/jira/browse/CASSANDRA-4219
Upgrade if you are on 1.1.0.

If not please paste the entire exception. 

Cheers


-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 6/07/2012, at 1:32 AM, puneet loya wrote:

 
 
 -- Forwarded message --
 From: Rob Coli rc...@palominodb.com
 Date: Mon, Jul 2, 2012 at 11:19 PM
 Subject: Re: cassandra on re-Start
 To: user@cassandra.apache.org
 
 
 On Mon, Jul 2, 2012 at 5:43 AM, puneet loya puneetl...@gmail.com wrote:
  When I restarted the system, it showed that the keyspace does not exist.
 
  It is not even letting me create the keyspace with the same name again.
 
 Paste the error you get.
 
 =Rob
 
 --
 =Robert Coli
 AIMGTALK - rc...@palominodb.com
 YAHOO - rcoli.palominob
 SKYPE - rcoli_palominodb
 
 The name of the keyspace is DA.
 On trying to create the keyspace, it gives an exception; I am getting a
 TTransportException when creating the keyspace.
 
 The previous keyspace 'DA' that was created still exists, because when I
 checked the folders, the folder named 'DA' is still there, but I cannot
 access it.
 
 
 Cheers,
 Puneet



Re: Upgrade for Cassandra 0.8.4 to 1.+

2012-07-05 Thread aaron morton
Consult the NEWS.txt file for help on upgrading 
https://github.com/apache/cassandra/blob/trunk/NEWS.txt

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 6/07/2012, at 2:52 AM, rohit bhatia wrote:

 http://cassandra.apache.org/ says 1.1.2
 
 On Thu, Jul 5, 2012 at 7:46 PM, Raj N raj.cassan...@gmail.com wrote:
 Hi experts,
 I am planning to upgrade from 0.8.4 to 1.+. What's the latest stable
 version?
 
 Thanks
 -Rajesh



Re: Composite key in thrift java api

2012-07-05 Thread aaron morton
  I would really prefer to do it in Cassandra itself,
See 
https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/marshal/CompositeType.java
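
(For reference, CompositeType serializes each component as a two-byte
big-endian length, the component's raw bytes, and a single end-of-component
byte, 0 for an exact value. A minimal sketch of building such a key by hand,
under that reading of the encoding:

import java.nio.ByteBuffer;

public static ByteBuffer compose(ByteBuffer... components) {
    int size = 0;
    for (ByteBuffer c : components)
        size += 2 + c.remaining() + 1; // length prefix + bytes + end-of-component
    ByteBuffer out = ByteBuffer.allocate(size);
    for (ByteBuffer c : components) {
        out.putShort((short) c.remaining()); // 2-byte big-endian length
        out.put(c.duplicate());              // component bytes, source untouched
        out.put((byte) 0);                   // end-of-component byte
    }
    out.flip();
    return out;
}

Concatenating the raw buffers without the length prefixes and end-of-component
bytes produces exactly the "Not enough bytes to read value of component 0"
error quoted above.)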

Cheers


-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 6/07/2012, at 10:40 AM, Leonid Ilyevsky wrote:

 I need to create a ByteBuffer instance containing the proper composite key, 
 based on the values of the components of the key. I am going to use it for 
 update operation.
 I tried simply concatenating the buffers corresponding to the components,
 but I am not sure this is correct, because I am getting an exception that
 comes from the server:
  
 InvalidRequestException(why:Not enough bytes to read value of component 0)
  
 In the server log I see this:
  
 org.apache.thrift.transport.TTransportException: Cannot read. Remote side has 
 closed. Tried to read 4 bytes, but only got 0 bytes. (This is often 
 indicative of an internal error on the server side. Please check your server 
 logs.)
  
 (I believe here when it says “server side” it actually means client, because 
 it is the server’s log).
  
 Seems like the buffer that my client sends is too short.  I suspect there is 
 a way in thrift to do it properly, but I don’t know how.
 Looks like Hector has a Composite class that maybe can help, but at this 
 point I would really prefer to do it in Cassandra itself, without Hector.
  
 Thanks!
  
 Leonid
  
 



Re: Finding bottleneck of a cluster

2012-07-05 Thread aaron morton
 12G Heap,
 1600Mb Young gen, 
Is a bit higher than the normal recommendation. 1600MB young gen can cause some 
extra ParNew pauses. 

 128 Concurrent writer
 threads
Unless you are on SSD this is too many.
 
 1) Is using JDK 1.7 any way detrimental to cassandra?
as far as I know it's not fully certified, thanks for trying it :)

 2) What is the max write operation qps that should be expected. Is the
 netflix benchmark also applicable for counter incrmenting tasks?
Counters use a different write path than normal writes and are a bit slower. 

To benchmark, get a single node and work out the max throughput. Then multiply
by the number of nodes and divide by the RF to get a rough idea (for example,
20,000 qps on a single node across 8 nodes at RF 2 suggests roughly 80,000 qps
for the cluster).

 the cpu
 idle time is around 30%, cassandra is not disk bound(insignificant
 read operations and cpu's iowait is around 0.05%) 
Wait until compaction kicks in and has to handle all your inserts.

 The os load is around 16-20 and the average write latency is 3ms.
 tpstats do not show any significant pending tasks.
The node is overloaded. What is the write latency for a single thread doing a
single increment against a node that has no other traffic? The latency for a
request is the time spent working plus the time spent waiting; once you reach
the max throughput, the time spent waiting increases. The SEDA architecture is
designed to limit the time spent working.

At this point suddenly, Several nodes start dropping several
 Mutation messages. There are also lots of pending
The cluster is overwhelmed. 

  Almost all the new threads seem to be named
 pool-2-thread-*.
These are client connection threads. 

 My guess is that this might be due to the 128 Writer threads not being
 able to perform more writes.(
Yes.
https://github.com/apache/cassandra/blob/trunk/conf/cassandra.yaml#L214
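
A sketch of the corresponding cassandra.yaml setting; per the 8 * number of
cores rule of thumb quoted in Rohit's reply below, 128 matches a 16-core box:

# cassandra.yaml (excerpt)
concurrent_writes: 128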

Work out the latency for a single client single node, then start adding 
replication, nodes and load. When the latency increases you are getting to the 
max throughput for that config.

Hope that helps

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 5/07/2012, at 6:49 PM, rohit bhatia wrote:

 Our Cassandra cluster consists of 8 nodes (16 cores, 32G RAM, 12G heap,
 1600MB young gen, Cassandra 1.0.5, JDK 1.7, 128 concurrent writer
 threads). The replication factor is 2 with 10 column families, and we
 serve counter-incrementing, write-intensive tasks (CL=ONE).
 
 I am trying to figure out the bottleneck,
 
 1) Is using JDK 1.7 any way detrimental to cassandra?
 
 2) What is the max write operation qps that should be expected? Is the
 Netflix benchmark also applicable for counter-incrementing tasks?

 http://techblog.netflix.com/2011/11/benchmarking-cassandra-scalability-on.html
 
 3) At around 50,000 qps for the cluster (~12,500 qps per node), the CPU
 idle time is around 30%; Cassandra is not disk bound (insignificant
 read operations, CPU iowait around 0.05%) and is not swapping
 (around 15 GB of RAM is free or inactive). The average GC pause
 times for ParNew are 100 ms, occurring every second, so Cassandra spends
 10% of its time stuck in the stop-the-world collector.
 The OS load is around 16-20 and the average write latency is 3 ms.
 tpstats does not show any significant pending tasks.
 
 At this point, several nodes suddenly start dropping Mutation
 messages. There are also lots of pending MutationStage and
 ReplicateOnWriteStage tasks in tpstats.
 The number of threads in the Java process increases to around 25,000
 from the usual 300-400. Almost all the new threads seem to be named
 pool-2-thread-*.
 The OS load jumps to around 30-40, and the write request latency starts
 spiking to more than 500 ms (even to several tens of seconds sometimes).
 Even the local write latency increases fourfold, from 50 microseconds
 to 200 microseconds. This happens across all the nodes, within around
 2-3 minutes.
 My guess is that this might be due to the 128 writer threads not being
 able to perform more writes (though with an average local write latency
 of 100-150 microseconds, each thread should be able to serve 10,000
 qps, and with 128 writer threads, a node should be able to serve
 1,280,000 qps).
 Could there be any other reason for this? What else should I monitor,
 since system.log does not seem to say anything conclusive before
 messages are dropped?
 
 
 
 Thanks
 Rohit



Re: batch_mutate

2012-07-05 Thread aaron morton
 Does it mean that the popular use case is when we need to update multiple 
 column families using the same key?
Yes. 

 Shouldn’t we design our space in such a way that those columns live in the 
 same column family?

Design a model where the data for common queries is stored in one row+CF. You
can also take the workload into consideration: e.g. things that are updated
frequently often live together, and things that are updated infrequently often
live together.

cheers
 
-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 6/07/2012, at 3:16 AM, Leonid Ilyevsky wrote:

 I actually found an answer to my first question at 
 http://wiki.apache.org/cassandra/API. So I got it wrong: actually the outer 
 key is the key in the table, and the inner key is the table name (this was 
 somewhat counter-intuitive). Does it mean that the popular use case is when 
 we need to update multiple column families using the same key? Shouldn’t we 
 design our space in such a way that those columns live in the same column 
 family?
  
 From: Leonid Ilyevsky [mailto:lilyev...@mooncapital.com] 
 Sent: Thursday, July 05, 2012 10:39 AM
 To: 'user@cassandra.apache.org'
 Subject: batch_mutate
  
 My current way of inserting rows one by one is too slow (I use cql3 prepared 
 statements) , so I want to try batch_mutate.
  
 Could anybody give me more details about the interface? In the javadoc it 
 says:
  
 public void batch_mutate(java.util.Map<java.nio.ByteBuffer, java.util.Map<java.lang.String, java.util.List<Mutation>>> mutation_map,
                          ConsistencyLevel consistency_level)
               throws InvalidRequestException, UnavailableException, TimedOutException, org.apache.thrift.TException
 Description copied from interface: Cassandra.Iface
 Mutate many columns or super columns for many row keys. See also: Mutation.
 mutation_map maps key to column family to a list of Mutation objects to take
 place at that scope.
  
  
 I need to understand the meaning of the elements of mutation_map parameter.
 My guess is, the key in the outer map is columnfamily name, is this correct?
 The key in the inner map is, probably, a key to the columnfamily (it is 
 somewhat confusing that it is String while the outer key is ByteBuffer, I 
 wonder what is the rational). If this is correct, how should I do it if my 
 key is a composite one. Does anybody have an example?
  
 Thanks,
  
 Leonid
  
 



Re: Composite Slice Query returning non-sliced data

2012-07-05 Thread aaron morton
 #2 has the Composite Column and #1 does not.
They are both strings. 

All column names *must* be of the same type. What was your CF definition ?

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 6/07/2012, at 7:26 AM, Sunit Randhawa wrote:

 Hello,
 
 I have 2 Columns for a 'RowKey' as below:
 
 #1: set CF['RowKey']['1000']='A=1,B=2';
 #2: set CF['RowKey']['1000:C1']='A=2,B=3';
 
 #2 has the composite column and #1 does not.
 
 Now when I execute the composite slice query by 1000 and C1, I get
 both of the columns above.
 
 I am hoping to get #2 only, since I am specifically providing C1 as
 the start and finish of the composite range with
 Composite.ComponentEquality.EQUAL.
 
 
 I am not sure if this is by design.
 
 Thanks,
 Sunit.



Re: Composite Slice Query returning non-sliced data

2012-07-05 Thread Sunit Randhawa
HI Aaron,

It is

create column family CF
with comparator =
'CompositeType(org.apache.cassandra.db.marshal.Int32Type,org.apache.cassandra.db.marshal.UTF8Type,org.apache.cassandra.db.marshal.UTF8Type)'
and key_validation_class = UTF8Type
and default_validation_class = UTF8Type;

This is allowing me to insert column names of different types.

Thanks,
Sunit.
On Thu, Jul 5, 2012 at 4:24 PM, aaron morton aa...@thelastpickle.com wrote:
 #2 has the Composite Column and #1 does not.

 They are both strings.

 All column names *must* be of the same type. What was your CF definition ?

 Cheers

 -
 Aaron Morton
 Freelance Developer
 @aaronmorton
 http://www.thelastpickle.com

 On 6/07/2012, at 7:26 AM, Sunit Randhawa wrote:

 Hello,

 I have 2 Columns for a 'RowKey' as below:

 #1: set CF['RowKey']['1000']='A=1,B=2';
 #2: set CF['RowKey']['1000:C1']='A=2,B=3';

 #2 has the Composite Column and #1 does not.

 Now when I execute the composite slice query by 1000 and C1, I get
 both of the columns above.

 I was hoping to get only #2, since I am specifically providing C1 as
 the start and finish of the composite range with
 Composite.ComponentEquality.EQUAL.


 I am not sure if this is by design.

 Thanks,
 Sunit.
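
For what it's worth, under this comparator a one-component name like
'1000' is itself a valid composite (a prefix of '1000:C1'), which is why
the cli accepts both. A minimal, untested Hector sketch of the usual
prefix-slice pattern follows; the serializer and class names are
recalled from the Hector 1.x API rather than verified, and the column
family, row key, and component values are taken from this thread.

import me.prettyprint.cassandra.serializers.CompositeSerializer;
import me.prettyprint.cassandra.serializers.IntegerSerializer;
import me.prettyprint.cassandra.serializers.StringSerializer;
import me.prettyprint.hector.api.Keyspace;
import me.prettyprint.hector.api.beans.AbstractComposite.ComponentEquality;
import me.prettyprint.hector.api.beans.ColumnSlice;
import me.prettyprint.hector.api.beans.Composite;
import me.prettyprint.hector.api.factory.HFactory;
import me.prettyprint.hector.api.query.QueryResult;
import me.prettyprint.hector.api.query.SliceQuery;

public class CompositeSliceSketch {
    public static ColumnSlice<Composite, String> slice(Keyspace keyspace) {
        // Start of the range: pin the first component to 1000 and the
        // second to "C1", both with EQUAL.
        Composite start = new Composite();
        start.addComponent(1000, IntegerSerializer.get());
        start.addComponent("C1", StringSerializer.get(),
                ComponentEquality.EQUAL);

        // Finish of the range: the same prefix, but GREATER_THAN_EQUAL on
        // the last component so the slice covers every name that begins
        // with (1000, "C1").
        Composite finish = new Composite();
        finish.addComponent(1000, IntegerSerializer.get());
        finish.addComponent("C1", StringSerializer.get(),
                ComponentEquality.GREATER_THAN_EQUAL);

        SliceQuery<String, Composite, String> query =
                HFactory.createSliceQuery(keyspace, StringSerializer.get(),
                        new CompositeSerializer(), StringSerializer.get());
        query.setColumnFamily("CF");
        query.setKey("RowKey");
        query.setRange(start, finish, false, 100);

        QueryResult<ColumnSlice<Composite, String>> result = query.execute();
        return result.get();
    }
}

With both components pinned like this, the bare one-component name
(1000) should sort before the start of the range and stay out of the
slice; if it still comes back, comparing the range endpoints against the
stored composite bytes is the next thing to check.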




Re: Finding bottleneck of a cluster

2012-07-05 Thread rohit bhatia
On Fri, Jul 6, 2012 at 4:47 AM, aaron morton aa...@thelastpickle.com wrote:
 12G Heap,
 1600Mb Young gen,

 Is a bit higher than the normal recommendation. 1600MB young gen can cause
 some extra ParNew pauses.
Thanks for the heads up, I'll try tinkering with this.


 128 Concurrent writer
 threads

 Unless you are on SSD this is too many.

I mean concurrent_writes
(http://www.datastax.com/docs/0.8/configuration/node_configuration#concurrent-writes);
this is not the memtable flush queue writers setting.
The suggested value is 8 * number of cores (16) = 128 itself.

 1) Is using JDK 1.7 in any way detrimental to Cassandra?

 as far as I know it's not fully certified, thanks for trying it :)

 2) What is the max write operation qps that should be expected? Is the
 Netflix benchmark also applicable for counter incrementing tasks?

 Counters use a different write path than normal writes and are a bit slower.

 To benchmark, get a single node and work out the max throughput. Then
 multiply by the number of nodes and divide by the RF to get a rough idea.
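
As a worked example with purely illustrative numbers (the single-node
figure is an assumption to be measured, not a given): if one node tops
out at 40,000 increments/s, the rough cluster estimate would be

40,000 increments/s * 8 nodes / RF 2 = 160,000 increments/s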

 the cpu
 idle time is around 30%, cassandra is not disk bound (insignificant
 read operations and cpu's iowait is around 0.05%)

 Wait until compaction kicks in on top of handling all your inserts.

 The os load is around 16-20 and the average write latency is 3ms.
 tpstats do not show any significant pending tasks.

 The node is overloaded. What is the write latency for a single thread doing
 a single increment against a node that has no other traffic? The latency
 for a request is the time spent working plus the time spent waiting; once you
 reach the max throughput, the time spent waiting increases. The SEDA
 architecture is designed to limit the time spent working.

At this point, several nodes suddenly start dropping
 Mutation messages. There are also lots of pending

 The cluster is overwhelmed.

  Almost all the new threads seem to be named
 pool-2-thread-*.

 These are client connection threads.

 My guess is that this might be due to the 128 Writer threads not being
 able to perform more writes.(

 Yes.
 https://github.com/apache/cassandra/blob/trunk/conf/cassandra.yaml#L214

 Work out the latency for a single client single node, then start adding
 replication, nodes and load. When the latency increases you are getting to
 the max throughput for that config.

Also, as mentioned in my second mail, I am seeing messages like "Total
time for which application threads were stopped: 16.7663710 seconds".
If a node pauses for this long, it might then be overwhelmed by the
hints stored at other nodes. This can further cause the node to wait
on/drop a lot of client connection threads. I'll look into what is
causing these non-GC pauses. Thanks for the help.
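
To make those pauses visible, the standard HotSpot options below print
the stopped time for every safepoint (these are the flags that produce
the "Total time for which application threads were stopped" lines), and
the safepoint statistics help attribute pauses that are not caused by
GC. The log path is illustrative:

-XX:+PrintGCApplicationStoppedTime
-XX:+PrintGCDetails -XX:+PrintGCDateStamps
-XX:+PrintSafepointStatistics -XX:PrintSafepointStatisticsCount=1
-Xloggc:/var/log/cassandra/gc.log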


 Hope that helps

 -
 Aaron Morton
 Freelance Developer
 @aaronmorton
 http://www.thelastpickle.com

 On 5/07/2012, at 6:49 PM, rohit bhatia wrote:

 Our Cassandra cluster consists of 8 nodes (16 cores, 32G RAM, 12G heap,
 1600MB young gen, Cassandra 1.0.5, JDK 1.7, 128 concurrent writer
 threads). The replication factor is 2 with 10 column families, and we
 service counter-incrementing, write-intensive tasks (CL=ONE).

 I am trying to figure out the bottleneck,

 1) Is using JDK 1.7 in any way detrimental to Cassandra?

 2) What is the max write operation qps that should be expected? Is the
 Netflix benchmark also applicable for counter incrementing tasks?

 http://techblog.netflix.com/2011/11/benchmarking-cassandra-scalability-on.html

 3) At around 50,000 qps for the cluster (~12,500 qps per node), the CPU
 idle time is around 30%, Cassandra is not disk bound (insignificant
 read operations and the CPU's iowait is around 0.05%) and is not swapping
 its memory (around 15 GB of RAM is free or inactive). The average GC pause
 time for ParNew is 100ms, occurring every second, so Cassandra spends
 10% of its time stuck in the stop-the-world collector.
 The os load is around 16-20 and the average write latency is 3ms.
 tpstats do not show any significant pending tasks.

At this point, several nodes suddenly start dropping
 Mutation messages. There are also lots of pending
 MutationStage and ReplicateOnWriteStage tasks in tpstats.
 The number of threads in the Java process increases to around 25,000
 from the usual 300-400. Almost all the new threads seem to be named
 pool-2-thread-*.
 The OS load jumps to around 30-40, and the write request latency starts
 spiking to more than 500ms (even to several tens of seconds sometimes).
 Even the local write latency increases fourfold to 200 microseconds
 from 50 microseconds. This happens across all the nodes within around
 2-3 minutes.
 My guess is that this might be due to the 128 writer threads not being
 able to perform more writes (though with an average local write latency
 of 100-150 microseconds, each thread should be able to serve 10,000
 qps, and with 128 writer threads, should be able to serve 1,280,000 qps
 per node)
 Could there be any other reason for this? What else should I monitor
 since system.log do not 

Re: Finding bottleneck of a cluster

2012-07-05 Thread rohit bhatia
On Fri, Jul 6, 2012 at 9:44 AM, rohit bhatia rohit2...@gmail.com wrote:
 On Fri, Jul 6, 2012 at 4:47 AM, aaron morton aa...@thelastpickle.com wrote:
 12G Heap,
 1600Mb Young gen,

 Is a bit higher than the normal recommendation. 1600MB young gen can cause
 some extra ParNew pauses.
 Thanks for the heads up, I'll try tinkering with this.


 128 Concurrent writer
 threads

 Unless you are on SSD this is too many.

 I mean concurrent_writes
 (http://www.datastax.com/docs/0.8/configuration/node_configuration#concurrent-writes);
 this is not the memtable flush queue writers setting.
 The suggested value is 8 * number of cores (16) = 128 itself.

 1) Is using JDK 1.7 in any way detrimental to Cassandra?

 as far as I know it's not fully certified, thanks for trying it :)

 2) What is the max write operation qps that should be expected? Is the
 Netflix benchmark also applicable for counter incrementing tasks?

 Counters use a different write path than normal writes and are a bit slower.

 To benchmark, get a single node and work out the max throughput. Then
 multiply by the number of nodes and divide by the RF to get a rough idea.

 the cpu
 idle time is around 30%, cassandra is not disk bound (insignificant
 read operations and cpu's iowait is around 0.05%)

 Wait until compaction kicks in on top of handling all your inserts.

 The os load is around 16-20 and the average write latency is 3ms.
 tpstats do not show any significant pending tasks.

 The node is overloaded. What is the write latency for a single thread doing
 a single increment against a node that has no other traffic? The latency
 for a request is the time spent working plus the time spent waiting; once you
 reach the max throughput, the time spent waiting increases. The SEDA
 architecture is designed to limit the time spent working.
The write latency I reported is the total latency of a client's request
as reported by DataStax OpsCenter. At minimum this is 0.5ms.
In contrast, the local write request latencies as reported by cfstats
are around 50 microseconds but jump to 150 microseconds during the
crash.



At this point, several nodes suddenly start dropping
 Mutation messages. There are also lots of pending

 The cluster is overwhelmed.

  Almost all the new threads seem to be named
 pool-2-thread-*.

 These are client connection threads.

 My guess is that this might be due to the 128 Writer threads not being
 able to perform more writes.(

 Yes.
 https://github.com/apache/cassandra/blob/trunk/conf/cassandra.yaml#L214

 Work out the latency for a single client single node, then start adding
 replication, nodes and load. When the latency increases you are getting to
 the max throughput for that config.

 Also, as mentioned in my second mail, I am seeing messages like "Total
 time for which application threads were stopped: 16.7663710 seconds".
 If a node pauses for this long, it might then be overwhelmed by the
 hints stored at other nodes. This can further cause the node to wait
 on/drop a lot of client connection threads. I'll look into what is
 causing these non-GC pauses. Thanks for the help.


 Hope that helps

 -
 Aaron Morton
 Freelance Developer
 @aaronmorton
 http://www.thelastpickle.com

 On 5/07/2012, at 6:49 PM, rohit bhatia wrote:

 Our Cassandra cluster consists of 8 nodes (16 cores, 32G RAM, 12G heap,
 1600MB young gen, Cassandra 1.0.5, JDK 1.7, 128 concurrent writer
 threads). The replication factor is 2 with 10 column families, and we
 service counter-incrementing, write-intensive tasks (CL=ONE).

 I am trying to figure out the bottleneck,

 1) Is using JDK 1.7 in any way detrimental to Cassandra?

 2) What is the max write operation qps that should be expected? Is the
 Netflix benchmark also applicable for counter incrementing tasks?

 http://techblog.netflix.com/2011/11/benchmarking-cassandra-scalability-on.html

 3) At around 50,000 qps for the cluster (~12,500 qps per node), the CPU
 idle time is around 30%, Cassandra is not disk bound (insignificant
 read operations and the CPU's iowait is around 0.05%) and is not swapping
 its memory (around 15 GB of RAM is free or inactive). The average GC pause
 time for ParNew is 100ms, occurring every second, so Cassandra spends
 10% of its time stuck in the stop-the-world collector.
 The os load is around 16-20 and the average write latency is 3ms.
 tpstats do not show any significant pending tasks.

At this point, several nodes suddenly start dropping
 Mutation messages. There are also lots of pending
 MutationStage and ReplicateOnWriteStage tasks in tpstats.
 The number of threads in the Java process increases to around 25,000
 from the usual 300-400. Almost all the new threads seem to be named
 pool-2-thread-*.
 The OS load jumps to around 30-40, and the write request latency starts
 spiking to more than 500ms (even to several tens of seconds sometimes).
 Even the local write latency increases fourfold to 200 microseconds
 from 50 microseconds. This happens across all the nodes within around
 2-3 minutes.
 My guess is that this might 

Multiple keyspace question

2012-07-05 Thread Ben Kaehne
Good evening,

I have read in a few discussions that multiple keyspaces are bad, but to
what extent?

We have some reasonably powerful machines and are looking to host
an additional 2 keyspaces (currently we have 1) within our Cassandra
cluster (of 3 nodes, using RF=3).

At what point does adding extra keyspaces start becoming an issue? Is there
anything special we should be considering or watching out for as we
implement this?

I cannot imagine that all Cassandra users out there are running one
massive keyspace, and at the same time cannot imagine that all Cassandra
users have multiple clusters just to host different keyspaces.

Regards.

-- 
-Ben


Re: GC freeze just after repair session

2012-07-05 Thread rohit bhatia
@ravi, you can increase the young gen size, keep a high tenuring
threshold, or increase the survivor ratio.
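
Concretely, against the 8G/800M heap quoted further down in this
thread, that direction might look something like the flags below; the
values are purely illustrative starting points, not tested
recommendations:

-Xmn1600M                   (double the current 800M young gen)
-XX:MaxTenuringThreshold=8  (let objects survive more ParNew cycles)
-XX:SurvivorRatio=16        (up from the default of 8)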


On Fri, Jul 6, 2012 at 4:03 AM, aaron morton aa...@thelastpickle.com wrote:
 Ideally we would like to collect maximum garbage from ParNew itself, during
 compactions. What are the steps to take towards achieving this?

 I'm not sure what you are asking.

 Cheers

 -
 Aaron Morton
 Freelance Developer
 @aaronmorton
 http://www.thelastpickle.com

 On 5/07/2012, at 6:56 PM, Ravikumar Govindarajan wrote:

 We have modified maxTenuringThreshold from 1 to 5. Maybe it is causing
 problems. We will change it back to 1 and see how the system behaves.

 concurrent_compactors=8. We will reduce this, as our system won't be able
 to handle this many compactions at the same time anyway. We think it will
 also ease GC to some extent.

 Ideally we would like to collect maximum garbage from ParNew itself, during
 compactions. What are the steps to take towards achieving this?

 On Wed, Jul 4, 2012 at 4:07 PM, aaron morton aa...@thelastpickle.com
 wrote:

 It *may* have been compaction from the repair, but it's not a big CF.

 I would look at the logs to see how much data was transferred to the node.
 Was there a compaction going on while the GC storm was happening? Do you
 have a lot of secondary indexes?

 If you think it is correlated with compaction, you can try reducing
 concurrent_compactors

 Cheers

 -
 Aaron Morton
 Freelance Developer
 @aaronmorton
 http://www.thelastpickle.com

 On 3/07/2012, at 6:33 PM, Ravikumar Govindarajan wrote:

 Recently, we faced a severe freeze [around 30-40 mins] on one of our
 servers. There were many mutations/reads dropped. The issue happened just
 after a routine nodetool repair for the below CF completed [1.0.7, NTS,
 DC1:3,DC2:2]

 Column Family: MsgIrtConv
 SSTable count: 12
 Space used (live): 17426379140
 Space used (total): 17426379140
 Number of Keys (estimate): 122624
 Memtable Columns Count: 31180
 Memtable Data Size: 81950175
 Memtable Switch Count: 31
 Read Count: 8074156
 Read Latency: 15.743 ms.
 Write Count: 2172404
 Write Latency: 0.037 ms.
 Pending Tasks: 0
 Bloom Filter False Postives: 1258
 Bloom Filter False Ratio: 0.03598
 Bloom Filter Space Used: 498672
 Key cache capacity: 20
 Key cache size: 20
 Key cache hit rate: 0.9965579513062582
 Row cache: disabled
 Compacted row minimum size: 51
 Compacted row maximum size: 89970660
 Compacted row mean size: 226626


 Our heap config is as follows

 -Xms8G -Xmx8G -Xmn800M -XX:+HeapDumpOnOutOfMemoryError -XX:+UseParNewGC
 -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:SurvivorRatio=8
 -XX:MaxTenuringThreshold=5 -XX:CMSInitiatingOccupancyFraction=75
 -XX:+UseCMSInitiatingOccupancyOnly

 from yaml
 in_memory_compaction_limit=64
 compaction_throughput_mb_sec=8
 multi_threaded_compaction=false

  INFO [AntiEntropyStage:1] 2012-06-29 09:21:26,085 AntiEntropyService.java
 (line 762) [repair #2b6fcbf0-c1f9-11e1--2ea8811bfbff] MsgIrtConv is
 fully synced
  INFO [AntiEntropySessions:8] 2012-06-29 09:21:26,085
 AntiEntropyService.java (line 698) [repair
 #2b6fcbf0-c1f9-11e1--2ea8811bfbff] session completed successfully
  INFO [CompactionExecutor:857] 2012-06-29 09:21:31,219 CompactionTask.java
 (line 221) Compacted to
 [/home/sas/system/data/ZMail/MsgIrtConv-hc-858-Data.db,].  47,907,012 to
 40,554,059 (~84% of original) bytes for 4,564 keys at 6.252080MB/s.  Time:
 6,186ms.

 After this, the logs were completely filled with GC [ParNew/CMS]. ParNew
 ran every 3 seconds, while CMS ran approximately every 30 seconds,
 continuously for 40 minutes.

  INFO [ScheduledTasks:1] 2012-06-29 09:23:39,921 GCInspector.java (line
 122) GC for ParNew: 776 ms for 2 collections, 2901990208 used; max is
 8506048512
  INFO [ScheduledTasks:1] 2012-06-29 09:23:42,265 GCInspector.java (line
 122) GC for ParNew: 2028 ms for 2 collections, 3831282056 used; max is
 8506048512

 .

  INFO [ScheduledTasks:1] 2012-06-29 10:07:53,884 GCInspector.java (line
 122) GC for ParNew: 817 ms for 2 collections, 2808685768 used; max is
 8506048512
  INFO [ScheduledTasks:1] 2012-06-29 10:07:55,632 GCInspector.java (line
 122) GC for ParNew: 1165 ms for 3 collections, 3264696776 used; max is
 8506048512
  INFO [ScheduledTasks:1] 2012-06-29 10:07:57,773 GCInspector.java (line
 122) GC for ParNew: 1444 ms for 3 collections, 4234372296 used; max is
 8506048512
  INFO [ScheduledTasks:1] 2012-06-29 10:07:59,387 GCInspector.java (line
 122) GC for ParNew: 1153 ms for 2 collections, 4910279080 used; max is
 8506048512
  INFO [ScheduledTasks:1] 2012-06-29 10:08:00,389 GCInspector.java (line
 122) GC for ParNew: 697 ms for 2 collections, 4873857072 used; max is
 8506048512
  INFO [ScheduledTasks:1] 2012-06-29 10:08:01,443 GCInspector.java (line
 122) GC for ParNew: 726 ms for 2 collections, 4941511184 used; max is
 8506048512

 After this, the node became stable and was back up and running. Any