Re: Rhombus - A time-series object store for Cassandra

2013-07-14 Thread Ananth Gundabattula
Thanks for the pointer Aaron.

Regards,
Ananth

On 15-Jul-2013, at 8:30 AM, aaron morton 
aa...@thelastpickle.com wrote:

For those following along at home, recently another project in this space was 
announced https://github.com/deanhiller/databus

Cheers

-
Aaron Morton
Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 13/07/2013, at 4:01 PM, Ananth Gundabattula 
agundabatt...@threatmetrix.com wrote:

Hello Rob,

Thanks for the pointer. I have a couple of queries:

How does this project compare to the KairosDB project on GitHub? (For one,
I see that Rhombus supports multi-column queries, which is cool, whereas
KairosDB/OpenTSDB do not seem to have such a feature, although the tags
could be used to achieve something similar.)

Are there any roll-ups performed automatically by Rhombus?

Can we control the TTL of the data being inserted?

I am looking at some of the time-series-based projects for production use,
preferably running on top of Cassandra, and was wondering if Rhombus can be
seen as a pure time-series-optimized schema or something more than that.

Regards,
Ananth




On 7/12/13 7:15 AM, Rob Righter 
rob.righ...@pardot.com wrote:

Hello,

Just wanted to share a project that we have been working on. It's a
time-series object store for Cassandra. We tried to generalize the
common use cases for storing time-series data in Cassandra and
automatically handle the denormalization, indexing, and wide row
sharding. It currently exists as a Java Library. We have it deployed
as a web service in a Dropwizard app server with a REST style
interface. The plan is to eventually release that Dropwizard app too.

The project and explanation are available on GitHub at:
https://github.com/Pardot/Rhombus

I would love to hear feedback.

Many Thanks,
Rob




Re: Rhombus - A time-series object store for Cassandra

2013-07-12 Thread Ananth Gundabattula
Hello Rob,

Thanks for the pointer. I have a couple of queries:

How does this project compare to the KairosDB project on GitHub? (For one,
I see that Rhombus supports multi-column queries, which is cool, whereas
KairosDB/OpenTSDB do not seem to have such a feature, although the tags
could be used to achieve something similar.)

Are there any roll-ups performed automatically by Rhombus?

Can we control the TTL of the data being inserted?

I am looking at some of the time-series-based projects for production use,
preferably running on top of Cassandra, and was wondering if Rhombus can be
seen as a pure time-series-optimized schema or something more than that.

Regards,
Ananth 




On 7/12/13 7:15 AM, Rob Righter rob.righ...@pardot.com wrote:

Hello,

Just wanted to share a project that we have been working on. It's a
time-series object store for Cassandra. We tried to generalize the
common use cases for storing time-series data in Cassandra and
automatically handle the denormalization, indexing, and wide row
sharding. It currently exists as a Java Library. We have it deployed
as a web service in a Dropwizard app server with a REST style
interface. The plan is to eventually release that Dropwizard app too.

The project and explanation are available on GitHub at:
https://github.com/Pardot/Rhombus

I would love to hear feedback.

Many Thanks,
Rob



Re: Migrating data from 2 node cluster to a 3 node cluster

2013-07-09 Thread Ananth Gundabattula
Hello everybody,

The thread below makes me wonder: does RF matter when using sstableloader?
My assumption was that sstableloader will take care of RF when the streaming is
done, but I just wanted to cross-check. We are currently moving data from an RF=1
cluster to an RF=3 cluster using the sstableloader tool. We will of course be running
repair on the destination nodes, but I was wondering how the following issue would be
resolved by a repair if my understanding is wrong.

If the above assumption is wrong, then since we are using sstableloader, which
streams the relevant parts of each table to the destination cluster, the destination
cluster will only get one copy of each row (because the origin RF = 1)? If that is the
case, how will a repair resolve things when a data chunk from an empty replica is used
as the chosen replica to perform the repair (as it is possible that the two nearest
neighbours are empty in the first place)?
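
For context, what we are running is roughly the following (the host names and paths
here are illustrative placeholders, not our actual ones):

  # stream one table's sstables from a node of the RF=1 cluster into the new cluster
  sstableloader -d dest-node1,dest-node2,dest-node3 /var/lib/cassandra/data/mykeyspace/mytable/

  # afterwards, on each destination node
  nodetool repair -pr mykeyspace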

Regards,
Ananth



From: aaron morton aa...@thelastpickle.com
Reply-To: user@cassandra.apache.org
Date: Tuesday, July 9, 2013 3:24 PM
To: user@cassandra.apache.org
Subject: Re: Migrating data from 2 node cluster to a 3 node cluster

Without vnodes the initial_token is stored in the yaml file, as well as the 
system LocationInfo CF.

With vnodes the only place the tokens are stored is in the system KS. So moving
a node without its system KS will cause it to generate new tokens, which will
mean data is moved around.
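
If you want to see what a node actually has stored, something like this should show it
(from memory, so check the exact output and column names on your version):

nodetool info                             # single-token node: shows the one token, which should match the yaml
cqlsh> SELECT tokens FROM system.local;   # vnode node: the full token set lives here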

Cheers

-
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 9/07/2013, at 11:23 AM, sankalp kohli 
kohlisank...@gmail.com wrote:

Leaving the system keyspaces behind is OK if you are not using vnodes. 

Why is it different for vnodes?


On Mon, Jul 8, 2013 at 3:37 PM, aaron morton 
aa...@thelastpickle.com wrote:
This might work for user created keyspaces but might not work for system 
keyspace
Leaving the system keyspaces behind is OK if you are not using vnodes.

Cheers

-
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 9/07/2013, at 10:03 AM, sankalp kohli 
kohlisank...@gmail.com wrote:

If RF=N or RF>N, you can just copy all SSTables to all nodes, watching out for 
name collisions.

This might work for user created keyspaces but might not work for system 
keyspace


On Mon, Jul 8, 2013 at 2:07 PM, Robert Coli 
rc...@eventbrite.com wrote:
On Fri, Jul 5, 2013 at 7:54 PM, srmore 
comom...@gmail.com wrote:
RF of the old and new cluster is the same, RF=3. Keyspaces and schema info are also the 
same.

You have a cluster where RF=3 and N=2? Does it.. work?

What are the tokens of old and new nodes?
tokens for old cluster ( 2-node )

If RF=N or RF>N, you can just copy all SSTables to all nodes, watching out for 
name collisions.

=Rob







Re: Errors while upgrading from 1.1.10 version to 1.2.4 version

2013-06-30 Thread Ananth Gundabattula
Thanks for the pointer Fabien.


From: Fabien Rousseau fab...@yakaz.com
Reply-To: user@cassandra.apache.org
Date: Friday, June 28, 2013 6:35 PM
To: user@cassandra.apache.org
Subject: Re: Errors while upgrading from 1.1.10 version to 1.2.4 version

Hello,

Have a look at : https://issues.apache.org/jira/browse/CASSANDRA-5476


2013/6/28 Ananth Gundabattula 
agundabatt...@threatmetrix.com
Hello Everybody,

We were performing an upgrade of our cluster from version 1.1.10 to 1.2.4. We 
tested the upgrade process in a QA environment and found no issues. However, on 
the production node, we faced loads of errors and had to abort the upgrade 
process.

I was wondering how we ran into such a situation. The main difference between 
the QA environment and the production environment is the replication factor: 
in QA, RF=1, and in production, RF=3.

Example stack traces, as seen on the other nodes, are here: 
http://pastebin.com/fSnMAd8q

The other observation is that the node being upgraded is a seed node in the 1.1.10 
cluster. We aborted right after the first node gave the above issues. Does this mean 
that application downtime will be required if we go for a rolling upgrade of a live 
cluster from version 1.1.10 to 1.2.4?

Regards,
Ananth







--
Fabien Rousseau

www.yakaz.com


Errors while upgrading from 1.1.10 version to 1.2.4 version

2013-06-27 Thread Ananth Gundabattula
Hello Everybody,

We were performing an upgrade of our cluster from version 1.1.10 to 1.2.4. We 
tested the upgrade process in a QA environment and found no issues. However, on 
the production node, we faced loads of errors and had to abort the upgrade 
process.

I was wondering how we ran into such a situation. The main difference between 
the QA environment and the production environment is the replication factor: 
in QA, RF=1, and in production, RF=3.

Example stack traces, as seen on the other nodes, are here: 
http://pastebin.com/fSnMAd8q

The other observation is that the node being upgraded is a seed node in the 1.1.10 
cluster. We aborted right after the first node gave the above issues. Does this mean 
that application downtime will be required if we go for a rolling upgrade of a live 
cluster from version 1.1.10 to 1.2.4?

Regards,
Ananth






Re: Upgrade from 1.1.10 to 1.2.4

2013-06-24 Thread Ananth Gundabattula
Hello Rob,

I ran into the stack trace when the situation was:

num_tokens unset (by this I mean not specifying anything) and
initial_token set to some value.

I was initially under the impression that specifying num_tokens would
override the initial_token value and hence left num_tokens blank. I was able
to get past that exception only when num_tokens was specified with a value
of 1.

Regards,
Ananth







On 6/25/13 3:27 AM, Robert Coli rc...@eventbrite.com wrote:

On Sun, Jun 23, 2013 at 2:31 AM, Ananth Gundabattula
agundabatt...@threatmetrix.com wrote:
 Looks like the cause of the error was because of not specifying num_tokens
 in the cassandra.yaml file. I was under the impression that setting a value
 of num_tokens will override the initial_token value. Looks like we need to
 set num_tokens to 1 to get around this error. Not specifying anything causes
 the above error.

My understanding is that the 1.2.x behavior here is :

1) initial_token set, num_tokens set => cassandra picks the num_tokens
value, ignores initial_token
2) initial_token unset, num_tokens unset => cassandra (until 2.0) picks
a single token via range bisection
3) initial_token unset, num_tokens set => cassandra uses num_tokens
number of vnodes
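
In cassandra.yaml terms, the usual vnode setup (case 3) is something like the sketch
below; the 256 is just an example value, not a recommendation:

# cassandra.yaml
num_tokens: 256
# initial_token:        <-- left unset/blank

# a single-token node (what you ended up with) would instead be:
# num_tokens: 1
# initial_token: <the node's token>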

Are you saying this is not the behavior you saw?

=Rob



Re: Upgrade from 1.1.10 to 1.2.4

2013-06-23 Thread Ananth Gundabattula

Looks like the cause of the error was not specifying num_tokens in 
the cassandra.yaml file. I was under the impression that setting a value of 
num_tokens would override the initial_token value. Looks like we need to set 
num_tokens to 1 to get around this error. Not specifying anything causes the 
above error.

Regards,
Ananth


From: Ananth Gundabattula agundabatt...@threatmetrix.com
Reply-To: user@cassandra.apache.org
Date: Sunday, June 23, 2013 1:25 PM
To: user@cassandra.apache.org
Subject: Upgrade from 1.1.10 to 1.2.4

Hello everybody,

I am trying to perform a rolling upgrade from 1.1.10 to 1.2.4 (with two 
patches on top of 1.2.4, https://issues.apache.org/jira/browse/CASSANDRA-5554 and 
https://issues.apache.org/jira/browse/CASSANDRA-5418, as those issues might affect us in 
production).

I was wondering if anyone was able to perform a successful rolling upgrade from 
1.1.10 to 1.2.4?

I tried both a rolling upgrade while the other nodes were on version 1.1.10, and also 
bringing up just the new-version Cassandra node while all other nodes in the cluster 
were shut down. The 1.1.10 nodes see the 1.2.4 node as up, but the 1.2.4 node crashes 
a few seconds after startup.

I see the following exception in the logs when the 1.2.4 node starts up.

……
 INFO 03:03:26,399 Log replay complete, 13 replayed mutations
 INFO 03:03:26,631 Cassandra version: 1.2.4-SNAPSHOT
 INFO 03:03:26,631 Thrift API version: 19.35.0
 INFO 03:03:26,632 CQL supported versions: 2.0.0,3.0.1 (default: 3.0.1)
 INFO 03:03:26,660 Starting up server gossip
 INFO 03:03:26,671 Enqueuing flush of Memtable-local@1284117703(253/253 
serialized/live bytes, 9 ops)
 INFO 03:03:26,672 Writing Memtable-local@1284117703(253/253 serialized/live 
bytes, 9 ops)
 INFO 03:03:26,676 Completed flushing 
/data/cassandra/data/system/local/system-local-ib-4-Data.db (250 bytes) for 
commitlog position ReplayPosition(segmentId=1371956606055, position=50387)
 INFO 03:03:26,684 Compacting 
[SSTableReader(path='/data/cassandra/data/system/local/system-local-ib-3-Data.db'),
 
SSTableReader(path='/data/cassandra/data/system/local/system-local-ib-2-Data.db'),
 
SSTableReader(path='/data/cassandra/data/system/local/system-local-ib-4-Data.db'),
 
SSTableReader(path='/data/cassandra/data/system/local/system-local-ib-1-Data.db')]
 INFO 03:03:26,706 Compacted 4 sstables to 
[/data/cassandra/data/system/local/system-local-ib-5,].  852 bytes to 457 (~53% 
of original) in 19ms = 0.022938MB/s.  4 total rows, 1 unique.  Row merge counts 
were {1:0, 2:0, 3:0, 4:1, }
 INFO 03:03:26,769 Starting Messaging Service on port 7000
ERROR 03:03:26,842 Exception encountered during startup
java.lang.NullPointerException
at 
org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:716)
at 
org.apache.cassandra.service.StorageService.initServer(StorageService.java:542)
at 
org.apache.cassandra.service.StorageService.initServer(StorageService.java:439)
at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:323)
at 
org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:411)
at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:454)
java.lang.NullPointerException
at 
org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:716)
at 
org.apache.cassandra.service.StorageService.initServer(StorageService.java:542)
at 
org.apache.cassandra.service.StorageService.initServer(StorageService.java:439)
at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:323)
at 
org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:411)
at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:454)
Exception encountered during startup: null
ERROR 03:03:26,848 Exception in thread Thread[StorageServiceShutdownHook,5,main]
java.lang.NullPointerException
at 
org.apache.cassandra.service.StorageService.stopRPCServer(StorageService.java:321)
at 
org.apache.cassandra.service.StorageService$1.runMayThrow(StorageService.java:507)
at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
at java.lang.Thread.run(Thread.java:722)


Regards,
Ananth




Merge two clusters into one - renaming an existing cluster

2013-06-23 Thread Ananth Gundabattula
Hello Everybody,

I am trying to merge two clusters into a single cluster ( the rationale being 
easier administration apart from better load balancing etc)

The plan is to rename one cluster (QAPERF1) to the same name as the second 
cluster (QAPERF2), then alter cassandra-topology.properties to make them appear as 
different DCs, and finally alter the replication settings and rebuild the nodes, of 
course after changing the seeds. It has been made sure that the schema is the same 
across the two clusters. This is a test on Apache Cassandra 1.2.4.

To rename the existing cluster, I have followed the instructions here: 
http://wiki.apache.org/cassandra/FAQ#clustername_mismatch

I get the following when restarting the first node after the cluster name change (the 
other nodes are yet to be restarted). It looks like the name change has not taken 
effect in spite of completing the flush as mentioned in the wiki.

ERROR [main] 2013-06-24 04:44:35,812 CassandraDaemon.java (line 222) Fatal 
exception during initialization
org.apache.cassandra.exceptions.ConfigurationException: Saved cluster name 
QAPERF1 != configured name QAPERF2
at org.apache.cassandra.db.SystemTable.checkHealth(SystemTable.java:447)
at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:218)
at 
org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:411)
at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:454)

While reverting, I changed the configuration file back to the old cluster name, and 
now I get this exception.

ERROR [main] 2013-06-24 04:48:34,746 CassandraDaemon.java (line 428) Exception 
encountered during startup
java.util.NoSuchElementException
at java.util.ArrayList$Itr.next(ArrayList.java:794)
at org.apache.cassandra.db.SystemTable.upgradeSystemData(SystemTable.java:164)
at org.apache.cassandra.db.SystemTable.finishStartup(SystemTable.java:98)
at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:317)
at 
org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:411)
at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:454)

Can experts please advise what the best way is to rename a cluster on version 1.2.4? 
Thanks for your time.
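
For reference, the workaround I have seen suggested for 1.2 is roughly the following; I
am assuming the cluster name lives in the cluster_name column of system.local, which I
have not verified on my version:

cqlsh> UPDATE system.local SET cluster_name = 'QAPERF2' WHERE key = 'local';

# then flush so the change is persisted, update cassandra.yaml, and restart
nodetool flush system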

Regards,
Ananth


Upgrade from 1.1.10 to 1.2.4

2013-06-22 Thread Ananth Gundabattula
Hello everybody,

I am trying to perform a rolling upgrade from 1.1.10 to 1.2.4 (with two 
patches on top of 1.2.4, https://issues.apache.org/jira/browse/CASSANDRA-5554 and 
https://issues.apache.org/jira/browse/CASSANDRA-5418, as those issues might affect us in 
production).

I was wondering if anyone was able to perform a successful rolling upgrade from 
1.1.10 to 1.2.4?

I tried both a rolling upgrade while the other nodes were on version 1.1.10, and also 
bringing up just the new-version Cassandra node while all other nodes in the cluster 
were shut down. The 1.1.10 nodes see the 1.2.4 node as up, but the 1.2.4 node crashes 
a few seconds after startup.

I see the following exception in the logs when the 1.2.4 node starts up.

……
 INFO 03:03:26,399 Log replay complete, 13 replayed mutations
 INFO 03:03:26,631 Cassandra version: 1.2.4-SNAPSHOT
 INFO 03:03:26,631 Thrift API version: 19.35.0
 INFO 03:03:26,632 CQL supported versions: 2.0.0,3.0.1 (default: 3.0.1)
 INFO 03:03:26,660 Starting up server gossip
 INFO 03:03:26,671 Enqueuing flush of Memtable-local@1284117703(253/253 
serialized/live bytes, 9 ops)
 INFO 03:03:26,672 Writing Memtable-local@1284117703(253/253 serialized/live 
bytes, 9 ops)
 INFO 03:03:26,676 Completed flushing 
/data/cassandra/data/system/local/system-local-ib-4-Data.db (250 bytes) for 
commitlog position ReplayPosition(segmentId=1371956606055, position=50387)
 INFO 03:03:26,684 Compacting 
[SSTableReader(path='/data/cassandra/data/system/local/system-local-ib-3-Data.db'),
 
SSTableReader(path='/data/cassandra/data/system/local/system-local-ib-2-Data.db'),
 
SSTableReader(path='/data/cassandra/data/system/local/system-local-ib-4-Data.db'),
 
SSTableReader(path='/data/cassandra/data/system/local/system-local-ib-1-Data.db')]
 INFO 03:03:26,706 Compacted 4 sstables to 
[/data/cassandra/data/system/local/system-local-ib-5,].  852 bytes to 457 (~53% 
of original) in 19ms = 0.022938MB/s.  4 total rows, 1 unique.  Row merge counts 
were {1:0, 2:0, 3:0, 4:1, }
 INFO 03:03:26,769 Starting Messaging Service on port 7000
ERROR 03:03:26,842 Exception encountered during startup
java.lang.NullPointerException
at 
org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:716)
at 
org.apache.cassandra.service.StorageService.initServer(StorageService.java:542)
at 
org.apache.cassandra.service.StorageService.initServer(StorageService.java:439)
at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:323)
at 
org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:411)
at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:454)
java.lang.NullPointerException
at 
org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:716)
at 
org.apache.cassandra.service.StorageService.initServer(StorageService.java:542)
at 
org.apache.cassandra.service.StorageService.initServer(StorageService.java:439)
at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:323)
at 
org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:411)
at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:454)
Exception encountered during startup: null
ERROR 03:03:26,848 Exception in thread Thread[StorageServiceShutdownHook,5,main]
java.lang.NullPointerException
at 
org.apache.cassandra.service.StorageService.stopRPCServer(StorageService.java:321)
at 
org.apache.cassandra.service.StorageService$1.runMayThrow(StorageService.java:507)
at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
at java.lang.Thread.run(Thread.java:722)


Regards,
Ananth




What is the effect of reducing the thrift message sizes on GC

2013-06-18 Thread Ananth Gundabattula
We are currently running on 1.1.10 and planning to migrate to a higher
version, 1.2.4.

The question pertains to tweaking all the knobs to reduce GC-related issues
(we have been fighting a lot of really bad GC issues on 1.1.10 and have met with
little success so far on 1.1.10).

Taking into consideration that GC tuning is a black art, I was wondering if we
can have some good effect on the GC by tweaking the following settings:

thrift_framed_transport_size_in_mb and thrift_max_message_length_in_mb
Our system has very short tables (both in number of columns and in data sizes)
but with millions/billions of rows in each column family. The typical
number of columns in each column family is 4. The typical lookup involves
specifying the row key and fetching one column most of the time. The
writes are also similar, except for one keyspace where the number of columns
is 50 but the data size per column is very small.

Assuming we can tweak the config values:

  thrift_framed_transport_size_in_mb
  thrift_max_message_length_in_mb

to lower values in the above context, I was wondering if it would help the GC
be invoked less often if the thrift settings reflected our data model's reads and
writes.

For example: what is the impact on the GC of reducing the above config values to,
say, 1 MB rather than 15 or 16?
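
For concreteness, the change I have in mind in cassandra.yaml would be something like
the sketch below (the current values are what I believe the defaults to be, so please
correct me if they differ on 1.1.10):

# defaults, as I understand them
# thrift_framed_transport_size_in_mb: 15
# thrift_max_message_length_in_mb: 16

# the experiment I am considering
thrift_framed_transport_size_in_mb: 1
thrift_max_message_length_in_mb: 2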

Thanks a lot for your inputs and thoughts.


Regards,
Ananth


Re: What is the effect of reducing the thrift message sizes on GC

2013-06-18 Thread Ananth Gundabattula
Thanks Aaron for the insight.

One quick question:

The buffers are not pre-allocated, but once they are allocated they are
not returned. So it's only an issue if you have lots of clients connecting
and reading a lot of data.
So to understand you correctly, the buffer is allocated per client
connection, remains for the life of the JVM, and is reused for each
request?
If that is the case, then I am presuming there is not much gain from playing
around with this config with respect to optimizing for GCs.

reduce bloom filters, index intervals ...
Well, we have tried all the configs as advised below (and others like key
cache sizes etc.) and hit a dead end, and that is the reason for the 1.2.4
move. Thanks for all your thoughts and advice on this.


Regards,
Ananth 



On 6/18/13 5:56 PM, aaron morton aa...@thelastpickle.com wrote:

 thrift_framed_transport_size_in_mb and thrift_max_message_length_in_mb
These control the max size of a buffer allocated by thrift when processing
requests / responses. The buffers are not pre-allocated, but once they
are allocated they are not returned. So it's only an issue if you have lots
of clients connecting and reading a lot of data.

 Our system is a very short column (both in number of columns and data sizes)
 tables but having millions/billions of rows in each column family.
If you have over 500 million rows per node you may be running into issues
with the bloom filters and index samples.

This typically looks like the heap usage does not reduce after CMS
compaction has completed.

Ensure the bloom_filter_fp_chance on the CFs is set to 0.01 for size
tiered compaction and 0.1 for levelled compaction. If you need to change
it, run nodetool upgradesstables.

Then consider increasing the index_interval in the yaml file, see the
comments. 

Note that v 1.2 moves the bloom filters off heap, so if you upgrade to
1.2 it will probably resolve your issues.
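
Something along these lines, from memory, so double-check the exact syntax on 1.1
(the keyspace / CF names are obviously placeholders):

# cassandra-cli
use YourKeyspace;
update column family YourCF with bloom_filter_fp_chance = 0.01;

# then re-write the existing sstables so the new value takes effect
nodetool upgradesstables YourKeyspace YourCF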

Cheers

-
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 18/06/2013, at 7:30 PM, Ananth Gundabattula
agundabatt...@threatmetrix.com wrote:

 We are currently running on 1.1.10 and planning to migrate to a higher
 version 1.2.4.
 
 The question pertains to tweaking all the knobs to reduce GC related issues
 ( we have been fighting a lot of really bad GC issues on 1.1.10 and met with
 little success all the way using 1.1.10)
 
 Taking into consideration GC tuning is a black art, I was wondering if we
 can have some good effect on the GC by tweaking the following settings:
 
 thrift_framed_transport_size_in_mb and thrift_max_message_length_in_mb
 
 Our system is a very short column (both in number of columns and data sizes)
 tables but having millions/billions of rows in each column family. The typical
 number of columns in each column family is 4. The typical lookup involves
 specifying the row key and fetching one column most of the times. The
 writes are also similar except for one keyspace where the number of columns
 are 50 but very small data sizes per column.
 
 Assuming we can tweak the config values :
 
   thrift_framed_transport_size_in_mb
   thrift_max_message_length_in_mb
 
 to lower values in the above context, I was wondering if it helps in the GC
 being invoked less if the thrift settings reflect our data model reads
 and writes ?
 
 For example: What is the impact by reducing the above config values on the
 GC to say 1 mb rather than say 15 or 16 ?
 
 Thanks a lot for your inputs and thoughts.
 
 
 Regards,
 Ananth




Re: Query regarding SSTable timestamps and counts

2012-11-20 Thread Ananth Gundabattula
Thanks a lot Aaron and Edward.

The mail thread clarifies some things for me.

For letting others know on this thread, running upgradesstables did
decrease our bloom filter false positive ratios a lot. (upgradesstables
was run not to upgrade from one Cassandra version to a higher Cassandra
version, but because of all the node movement we had done to upgrade our
cluster in a staggered way, with aborted attempts in between; I
understand that upgradesstables was not strictly required for the high
bloom filter false positive rates we were seeing.)


Regards,
Ananth


On Wed, Nov 21, 2012 at 9:45 AM, Edward Capriolo edlinuxg...@gmail.com wrote:

 On Tue, Nov 20, 2012 at 5:23 PM, aaron morton aa...@thelastpickle.com
 wrote:
  My understanding of the compaction process was that since data files keep
  continuously merging we should not have data files with very old last
  modified timestamps
 
  It is perfectly OK to have very old SSTables.
 
  But performing an upgradesstables did decrease the number of data files and
  removed all the data files with the old timestamps.
 
  upgradesstables re-writes every sstable to have the same contents in the
  newest format.
 
  Cheers
 
  -
  Aaron Morton
  Freelance Cassandra Developer
  New Zealand
 
  @aaronmorton
  http://www.thelastpickle.com
 
  On 19/11/2012, at 4:57 PM, Ananth Gundabattula agundabatt...@gmail.com
  wrote:
 
  Hello Aaron,
 
  Thanks a lot for the reply.
 
  Looks like the documentation is confusing. Here is the link I am referring to:
  http://www.datastax.com/docs/1.1/operations/tuning#tuning-compaction
 
 
  It does not disable compaction.
  As per the above url, After running a major compaction, automatic minor
  compactions are no longer triggered, frequently requiring you to manually
  run major compactions on a routine basis. (Just before the heading Tuning
  Column Family compression in the above link)
 
  With respect to the replies below :
 
 
  it creates one big file, which will not be compacted until there are (by
  default) 3 other very big files.
  This is for the minor compaction, and major compaction should theoretically
  result in one large file irrespective of the number of data files initially?
 
 This is not something you have to worry about. Unless you are seeing
  1,000's of files using the default compaction.
 
  Well my worry has been because of the large amount of node movement we have
  done in the ring. We started off with 6 nodes and increased the capacity to
  12 with disproportionate increases every time, which resulted in a lot of
  cleaning of data folders except system, running repair and then a cleanup,
  with an aborted attempt in between.
 
  There were some data.db files older than 2 weeks that were not
  modified since then. My understanding of the compaction process was that
  since data files keep continuously merging, we should not have data files
  with very old last-modified timestamps (assuming there is a good amount of
  writes to the table continuously). I did not have a sure way of telling
  whether everything is alright with the compaction by looking at the
  last-modified timestamps of all the data.db files.
 
 What are the compaction issues you are having ?
  Your replies confirm that the timestamps should not be an issue to worry
  about. So I guess I should not be calling them as issues any more.  But
  performing an upgradesstables did decrease the number of data files and
  removed all the data files with the old timestamps.
 
 
 
  Regards,
  Ananth
 
 
  On Mon, Nov 19, 2012 at 6:54 AM, aaron morton aa...@thelastpickle.com
  wrote:
 
  As per datastax documentation, a manual compaction forces the admin to
  start compaction manually and disables the automated compaction
 (atleast for
  major compactions but not minor compactions )
 
  It does not disable compaction.
  it creates one big file, which will not be compacted until there are (by
  default) 3 other very big files.
 
 
  1. Does a nodetool stop compaction also force the admin to manually run
  major compaction ( I.e. disable automated major compactions ? )
 
  No.
  Stop just stops the current compaction.
  Nothing is disabled.
 
  2. Can a node restart reset the automated major compaction if a node
 gets
  into a manual mode compaction for whatever reason ?
 
  Major compaction is not automatic. It is the manual nodetool compact
  command.
  Automatic (minor) compaction is controlled by min_compaction_threshold
 and
  max_compaction_threshold (for the default compaction strategy).
 
  3. What is the ideal  number of SSTables for a table in a keyspace ( I
  mean are there any indicators as to whether my compaction is alright or
 not
  ? )
 
  This is not something you have to worry about.
  Unless you are seeing 1,000's of files using the default compaction.
 
   For example, I have seen SSTables on the disk more than 10 days old
  wherein there were other SSTables belonging to the same table but much

Re: Query regarding SSTable timestamps and counts

2012-11-18 Thread Ananth Gundabattula
Hello Aaron,

Thanks a lot for the reply.

Looks like the documentation is confusing. Here is the link I am referring
to:  http://www.datastax.com/docs/1.1/operations/tuning#tuning-compaction


 It does not disable compaction.
As per the above url,  After running a major compaction, automatic minor
compactions are no longer triggered, frequently requiring you to manually
run major compactions on a routine basis. ( Just before the heading Tuning
Column Family compression in the above link)

With respect to the replies below :


 it creates one big file, which will not be compacted until there are (by
default) 3 other very big files.
This is for the minor compaction and major compaction
should theoretically result in one large file irrespective of the number of
data files initially?

This is not something you have to worry about. Unless you are seeing
1,000's of files using the default compaction.

Well my worry has been because of the large amount of node movement we
have done in the ring. We started off with 6 nodes and increased the
capacity to 12 with disproportionate increases every time, which resulted in
a lot of cleaning of data folders except system, running repair and then a
cleanup, with an aborted attempt in between.

There were some data.db files older than 2 weeks that were not
modified since then. My understanding of the compaction process was that
since data files keep continuously merging, we should not have data files
with very old last-modified timestamps (assuming there is a good amount of
writes to the table continuously). I did not have a sure way of telling
whether everything is alright with the compaction by looking at the
last-modified timestamps of all the data.db files.

What are the compaction issues you are having ?
Your replies confirm that the timestamps should not be an issue to worry
about. So I guess I should not be calling them as issues any more.  But
performing an upgradesstables did decrease the number of data files and
removed all the data files with the old timestamps.



Regards,
Ananth


On Mon, Nov 19, 2012 at 6:54 AM, aaron morton aa...@thelastpickle.com wrote:

 As per datastax documentation, a manual compaction forces the admin to
 start compaction manually and disables the automated compaction (atleast
 for major compactions but not minor compactions )

 It does not disable compaction.
 it creates one big file, which will not be compacted until there are (by
 default) 3 other very big files.


 1. Does a nodetool stop compaction also force the admin to manually run
 major compaction ( I.e. disable automated major compactions ? )

 No.
 Stop just stops the current compaction.
 Nothing is disabled.

 2. Can a node restart reset the automated major compaction if a node gets
 into a manual mode compaction for whatever reason ?

 Major compaction is not automatic. It is the manual nodetool compact
 command.
 Automatic (minor) compaction is controlled by min_compaction_threshold and
 max_compaction_threshold (for the default compaction strategy).
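
 If you want to change them without touching the schema, nodetool can do it at runtime;
 roughly (the keyspace / CF names are placeholders, and 4 / 32 are the usual defaults):

 nodetool setcompactionthreshold MyKeyspace MyCF 4 32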

 3. What is the ideal  number of SSTables for a table in a keyspace ( I
 mean are there any indicators as to whether my compaction is alright or not
 ? )

 This is not something you have to worry about.
 Unless you are seeing 1,000's of files using the default compaction.

  For example, I have seen SSTables on the disk more than 10 days old
 wherein there were other SSTables belonging to the same table but much
 younger than the older SSTables (

 No problems.

 4. Does a upgradesstables fix any compaction issues ?

 What are the compaction issues you are having ?


 Cheers

 -
 Aaron Morton
 Freelance Cassandra Developer
 New Zealand

 @aaronmorton
 http://www.thelastpickle.com

 On 18/11/2012, at 1:18 AM, Ananth Gundabattula agundabatt...@gmail.com
 wrote:


 We have a cluster  running cassandra 1.1.4. On this cluster,

 1. We had to move the nodes around a bit  when we were adding new nodes
 (there was quite a good amount of node movement )

 2. We had to stop compactions during some of the days to save some disk
  space on some of the nodes when they were running very very low on disk
 spaces. (via nodetool stop COMPACTION)


 As per datastax documentation, a manual compaction forces the admin to
 start compaction manually and disables the automated compaction (atleast
 for major compactions but not minor compactions )


 Here are the questions I have regarding compaction:

 1. Does a nodetool stop compaction also force the admin to manually run
 major compaction ( I.e. disable automated major compactions ? )

 2. Can a node restart reset the automated major compaction if a node gets
 into a manual mode compaction for whatever reason ?

 3. What is the ideal  number of SSTables for a table in a keyspace ( I
 mean are there any indicators as to whether my compaction is alright or not
 ? )  . For example, I have seen SSTables on the disk more than 10 days old
 wherein there were other SSTables belonging

Re: read request distribution

2012-11-12 Thread Ananth Gundabattula
Hi all,

As an unrelated observation on the readings below, it looks like all 3
nodes own 100% of the data. This confuses me a bit. We have a 12-node
cluster with RF=3, but the effective ownership is shown as 8.33%.

So here is my question: how is the ownership calculated? Is the replication
factor considered in the ownership calculation? (If yes, then 8.33% ownership
on our cluster seems wrong to me; if not, 100% ownership for a node in a 3-node
cluster seems wrong to me.) Am I missing something in the calculation?
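
The arithmetic I am working from: with 12 nodes and RF=3, each node's primary range is
1/12 ≈ 8.33% of the ring; if replicas were counted, I would expect roughly 3/12 = 25%
per node, and for a 3-node RF=3 cluster 3/3 = 100%.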

Regards,
Ananth

On Fri, Nov 9, 2012 at 4:37 PM, Wei Zhu wz1...@yahoo.com wrote:

 Hi All,
 I am doing a benchmark on a Cassandra. I have a three node cluster with
 RF=3. I generated 6M rows with sequence  number from 1 to 6m, so the rows
 should be evenly distributed among the three nodes disregarding the
 replicates.
 I am doing a benchmark with read only requests, I generate read request
 for randomly generated keys from 1 to 6M. Oddly, nodetool cfstats, reports
 that one node has only half the requests as the other one and the third
 node sits in the middle. So the ratio is like 2:3:4. The node with the most
 read requests actually has the smallest latency and the one with the least
 read requests reports the largest latency. The difference is pretty big,
 the fastest is almost double the slowest.
 All three nodes have the exactly the same hardware and the data size on
 each node are the same since the RF is three and all of them have the
 complete data. I am using Hector as client and the random read request are
 in millions. I can't think of a reasonable explanation.  Can someone please
 shed some lights?

 Thanks.
 -Wei



Re: configure KeyCache to use non-heap memory?

2012-09-05 Thread Ananth Gundabattula
Hello Aaron,

Thanks a lot for the response. I raised a request: 
https://issues.apache.org/jira/browse/CASSANDRA-4619

Here is the nodetool dump: (from one of the two nodes in the cluster)

Token: 0
Gossip active: true
Thrift active: true
Load : 147.64 GB
Generation No: 1346635362
Uptime (seconds) : 182707
Heap Memory (MB) : 4884.33 / 8032.00
Data Center  : datacenter1
Rack : rack1
Exceptions   : 0
Key Cache: size 777651120 (bytes), capacity 777651120 (bytes), 44354999 
hits, 98275175 requests, 0.451 recent hit rate, 14400 save period in seconds
Row Cache: size 0 (bytes), capacity 0 (bytes), 0 hits, 0 requests, NaN 
recent hit rate, 0 save period in seconds
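
For reference, the capacity above works out to roughly 741 MB. I believe it is governed
by the key_cache_size_in_mb setting in cassandra.yaml, as in the sketch below; the exact
value configured on our nodes is an assumption on my part:

# cassandra.yaml - illustrative only
key_cache_size_in_mb: 741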


The number of rows in the 2-node cluster is 74+ million.



Regards,
Ananth




From: aaron morton aa...@thelastpickle.com
Reply-To: user@cassandra.apache.org
Date: Wednesday, September 5, 2012 11:33 AM
To: user@cassandra.apache.org
Subject: Re: configure KeyCache to use non-heap memory?

Is there any way I can configure KeyCache to use non-heap memory?
No.
You could add a feature request here 
https://issues.apache.org/jira/browse/CASSANDRA

Could you post some stats on the current key cache size and hit rate ? (from 
nodetool info)
It would be interesting to know how many keys it contains Vs the number of rows 
on the box and the hit rate.

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 4/09/2012, at 3:01 PM, Ananth Gundabattula 
agundabatt...@threatmetrix.com wrote:


Is there any way I can configure KeyCache to use non-heap memory?

We have large-memory nodes (~96 GB of memory per node) and are effectively using only 
8 GB configured for heap (to avoid GC issues because of a large heap).

We have constraints with respect to:

 1.  Row cache models don't reflect our data query patterns, and hence we can only 
optimize the key cache.
 2.  We are time-constrained in changing our schema to be more NoSQL-specific.


Regards,
Ananth



configure KeyCache to use non-heap memory?

2012-09-03 Thread Ananth Gundabattula

Is there any way I can configure KeyCache to use non-heap memory?

We have large-memory nodes (~96 GB of memory per node) and are effectively using only 
8 GB configured for heap (to avoid GC issues because of a large heap).

We have constraints with respect to:

 1.  Row cache models don't reflect our data query patterns, and hence we can only 
optimize the key cache.
 2.  We are time-constrained in changing our schema to be more NoSQL-specific.


Regards,
Ananth