Re: slow read

2012-03-05 Thread ruslan usifov
2012/3/5 Jeesoo Shin bsh...@gmail.com

 Hi all.

 I have very SLOW READ here. :-(
 I made a cluster with three nodes (aws xlarge, replication = 3)
 Cassandra version is 1.0.6
 I have inserted 1,000,000 rows. (standard column)
 Each row has 200 columns.
 Each column has a 16-byte key and a 512-byte value.

 I used Hector createSliceQuery to get one column in a row.
 This basic query (random row, fixed column) is issued from multiple
 threads hitting Cassandra.

 I only get up to 140 requests per second. Is this all I can get for reads?
 Or am I doing something wrong?
 Interestingly, when I request rows which don't exist, it goes up to
 1600 per second.


You must test read performance with a parallel test (i.e. multiple threads). The
faster result for non-existent rows is due to the bloom filter.




 Any insight you can share will be extremely helpful.
 Thank you.

 Regards,
 Jeesoo.
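
A minimal sketch of the kind of parallel read test being discussed, using Hector (the cluster, keyspace, CF and column names below are illustrative placeholders, not taken from the post):

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import me.prettyprint.cassandra.serializers.StringSerializer;
import me.prettyprint.hector.api.Keyspace;
import me.prettyprint.hector.api.factory.HFactory;
import me.prettyprint.hector.api.query.SliceQuery;

public class ReadStress {
    public static void main(String[] args) throws InterruptedException {
        final Keyspace ks = HFactory.createKeyspace("MyKeyspace",
                HFactory.getOrCreateCluster("TestCluster", "127.0.0.1:9160"));
        final StringSerializer se = StringSerializer.get();
        final AtomicLong done = new AtomicLong();
        int threads = 160;
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.submit(new Runnable() {
                public void run() {
                    for (int i = 0; i < 10000; i++) {
                        // random row, fixed column, as in the test described above
                        String key = "row" + (long) (Math.random() * 1000000);
                        SliceQuery<String, String, String> q =
                                HFactory.createSliceQuery(ks, se, se, se);
                        q.setColumnFamily("Standard1").setKey(key).setColumnNames("col42");
                        q.execute();
                        done.incrementAndGet();
                    }
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.HOURS);
        System.out.println("reads completed: " + done.get());
    }
}

Timing the whole run and dividing the completed count by the elapsed seconds gives the aggregate rq/s figure discussed in this thread.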



Re: Secondary indexes don't go away after metadata change

2012-03-05 Thread aaron morton
The secondary index CFs are marked as no longer required / marked as 
compacted. Under 1.x they would then be deleted reasonably quickly, and 
definitely deleted after a restart. 

Is there a zero-length .Compacted file there? 

 Also, when adding a new node to the ring the new node will build indexes for 
 the ones that supposedly don’t exist any longer.  Is this supposed to happen? 
  Would this have happened if I had deleted the old SSTables from the 
 previously existing nodes?
Check you have a consistent schema using describe cluster in the CLI. And check 
the schema is what you think it is using show schema. 

Another trick is to do a snapshot. Only the files in use are included in the 
snapshot. 
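
For example (paths illustrative):

nodetool -h 127.0.0.1 snapshot
# only SSTables still in use get hard-linked into the snapshot:
ls /var/lib/cassandra/data/Keyspace1/snapshots/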

Hope that helps. 
 
-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 2/03/2012, at 2:53 AM, Frisch, Michael wrote:

 I have a few column families that I decided to get rid of the secondary 
 indexes on.  I see that there aren’t any new index SSTables being created, 
 but all of the old ones remain (some from as far back as September).  Is it 
 safe to just delete them when the node is offline?  Should I run clean-up or 
 scrub?
  
 Also, when adding a new node to the ring the new node will build indexes for 
 the ones that supposedly don’t exist any longer.  Is this supposed to happen? 
  Would this have happened if I had deleted the old SSTables from the 
 previously existing nodes?
  
 The nodes in question have either been upgraded from v0.8.1 -> v1.0.2 
 (scrubbed at this time) -> v1.0.6 or from v1.0.2 -> v1.0.6.  The secondary 
 index was dropped when the nodes were version 1.0.6.  The new node added was 
 also 1.0.6.
  
 - Mike



Re: slow read

2012-03-05 Thread Jeesoo Shin
Thank you for the reply. :)
Yes, I did use multiple threads.
160 and 320 threads gave me the same result.

On 3/5/12, ruslan usifov ruslan.usi...@gmail.com wrote:
 2012/3/5 Jeesoo Shin bsh...@gmail.com

 Hi all.

 I have very SLOW READ here. :-(
 I made a cluster with three nodes (aws xlarge, replication = 3)
 Cassandra version is 1.0.6
 I have inserted 1,000,000 rows. (standard column)
 Each row has 200 columns.
 Each column has a 16-byte key and a 512-byte value.

 I used Hector createSliceQuery to get one column in a row.
 This basic query (random row, fixed column) is issued from multiple
 threads hitting Cassandra.

 I only get up to 140 requests per second. Is this all I can get for reads?
 Or am I doing something wrong?
 Interestingly, when I request rows which don't exist, it goes up to
 1600 per second.


 You must test read performance with a parallel test (i.e. multiple threads). The
 faster result for non-existent rows is due to the bloom filter.




 Any insight you can share will be extremely helpful.
 Thank you.

 Regards,
 Jeesoo.




Re: can't find rows

2012-03-05 Thread aaron morton
I am guessing a lot here, but I would check if auto_bootstrap is enabled. It is 
by default. 

When a new node joins reads are not directed to it until it is marked as UP 
(writes are sent to it as it is joining). So reads should continue to go to the 
original UP node.

Sounds like it's all running now. If you see it again can you provide some 
detailed steps such as the type of query and the CL level. 

Cheers
 
-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 2/03/2012, at 7:35 AM, Casey Deccio wrote:

 On Thu, Mar 1, 2012 at 9:33 AM, aaron morton aa...@thelastpickle.com wrote:
 What RF were you using and had you been running repair regularly ? 
 
 
 RF 1 *sigh*.  Waiting until I have more/better resources to use RF > 1.  
 Hopefully soon.
 
 In the mean time... Oddly (to me), when I removed the most recently added 
 node, all my rows re-appeared, but were only up-to-date as of 10 days ago 
 (a few days before I added the node).  None of the supercolumns since then 
 show up.  But when I look at the sstable files on the different nodes, I see 
 large files with timestamps in between that date and today's, which makes me 
 think the data is still there.  Also, if I re-add the troublesome new node 
 (not having run cleanup), all rows are again inaccessible until I again 
 decommission it.
 
 Casey



Re: Schema change causes exception when adding data

2012-03-05 Thread aaron morton
I don't have a lot of Hector experience but it sounds like the way to go. 

The CLI and cqlsh will take care of this. 

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 2/03/2012, at 10:12 AM, Tharindu Mathew wrote:

 There are 2. I'd like to wait till there is one before I insert the value.
 
 Going through the code, calling client.describe_schema_versions() seems to 
 give a good answer to this. And I discovered that if I wait till there is 
 only 1 version, I will not get this error.
 
 Is this the best practice if I want to check this programmatically?
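
A rough sketch of that check against the raw Thrift API, which is what Hector wraps; 'client' is assumed to be an already connected Cassandra.Client:

import java.util.List;
import java.util.Map;
import org.apache.cassandra.thrift.Cassandra;

// Block until all reachable nodes report a single schema version.
static void waitForSchemaAgreement(Cassandra.Client client) throws Exception {
    while (true) {
        Map<String, List<String>> versions = client.describe_schema_versions();
        versions.remove("UNREACHABLE"); // nodes that didn't answer
        if (versions.size() <= 1)
            return;
        Thread.sleep(200);
    }
}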
 
 On Thu, Mar 1, 2012 at 11:15 PM, aaron morton aa...@thelastpickle.com wrote:
 use describe cluster in the CLI to see how many schema versions there are. 
 
 Cheers
 
 -
 Aaron Morton
 Freelance Developer
 @aaronmorton
 http://www.thelastpickle.com
 
 On 2/03/2012, at 12:25 AM, Tharindu Mathew wrote:
 
 
 
 On Thu, Mar 1, 2012 at 11:47 AM, Tharindu Mathew mcclou...@gmail.com wrote:
 Jeremiah,
 
 Thanks for the reply.
 
 This is what we have been doing, but it's not reliable as we don't know a 
 definite time by which the schema will have replicated. Is there any way I can 
 know for sure that changes have propagated?
 
 Then I can block the insertion of data until then.
 
 
 On Thu, Mar 1, 2012 at 4:33 AM, Jeremiah Jordan 
 jeremiah.jor...@morningstar.com wrote:
 The error is that the specified column family doesn't exist.  If you connect 
 with the CLI and describe the keyspace does it show up?  Also, after adding 
 a new column family programmatically you can't use it immediately, you have 
 to wait for it to propagate.  You can use calls to describe schema to do so; 
 keep calling it until every node is on the same schema.
 
  
 
 -Jeremiah
 
  
 
 From: Tharindu Mathew [mailto:mcclou...@gmail.com] 
 Sent: Wednesday, February 29, 2012 8:27 AM
 To: user
 Subject: Schema change causes exception when adding data
 
  
 
 Hi,
 
 I have a 3 node cluster and I'm dynamically updating a keyspace with a new 
 column family. Then, when I try to write records to it I get the following 
 exception shown at [1].
 
 How do I avoid this? I'm using Hector with the default consistency level of 
 QUORUM. Cassandra version is 0.7.8. Replication factor is 1.
 
 How can I solve my problem?
 
 [1] -
 me.prettyprint.hector.api.exceptions.HInvalidRequestException: 
 InvalidRequestException(why:unconfigured columnfamily proxySummary)
   at me.prettyprint.cassandra.service.ExceptionsTranslatorImpl.translate(ExceptionsTranslatorImpl.java:42)
   at me.prettyprint.cassandra.service.KeyspaceServiceImpl$10.execute(KeyspaceServiceImpl.java:397)
   at me.prettyprint.cassandra.service.KeyspaceServiceImpl$10.execute(KeyspaceServiceImpl.java:383)
   at me.prettyprint.cassandra.service.Operation.executeAndSetResult(Operation.java:101)
   at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:156)
   at me.prettyprint.cassandra.service.KeyspaceServiceImpl.operateWithFailover(KeyspaceServiceImpl.java:129)
   at me.prettyprint.cassandra.service.KeyspaceServiceImpl.multigetSlice(KeyspaceServiceImpl.java:401)
   at me.prettyprint.cassandra.model.thrift.ThriftMultigetSliceQuery$1.doInKeyspace(ThriftMultigetSliceQuery.java:67)
   at me.prettyprint.cassandra.model.thrift.ThriftMultigetSliceQuery$1.doInKeyspace(ThriftMultigetSliceQuery.java:59)
   at me.prettyprint.cassandra.model.KeyspaceOperationCallback.doInKeyspaceAndMeasure(KeyspaceOperationCallback.java:20)
   at me.prettyprint.cassandra.model.ExecutingKeyspace.doExecute(ExecutingKeyspace.java:72)
   at me.prettyprint.cassandra.model.thrift.ThriftMultigetSliceQuery.execute(ThriftMultigetSliceQuery.java:58)
 
 
 
 -- 
 Regards,
 
 Tharindu
 
  
 
 blog: http://mackiemathew.com/
 
  
 
 
 
 
 -- 
 Regards,
 
 Tharindu
 
 blog: http://mackiemathew.com/
 
 
 
 
 -- 
 Regards,
 
 Tharindu
 
 blog: http://mackiemathew.com/
 
 
 
 
 
 -- 
 Regards,
 
 Tharindu
 
 blog: http://mackiemathew.com/
 



Re: slow read

2012-03-05 Thread ruslan usifov
And is the sum of rq/s across all threads 160?

2012/3/5 Jeesoo Shin bsh...@gmail.com

 Thank you for the reply. :)
 Yes, I did use multiple threads.
 160 and 320 threads gave me the same result.

 On 3/5/12, ruslan usifov ruslan.usi...@gmail.com wrote:
  2012/3/5 Jeesoo Shin bsh...@gmail.com
 
  Hi all.
 
  I have very SLOW READ here. :-(
  I made a cluster with three nodes (aws xlarge, replication = 3)
  Cassandra version is 1.0.6
  I have inserted 1,000,000 rows. (standard column)
  Each row has 200 columns.
  Each column has a 16-byte key and a 512-byte value.
 
  I used Hector createSliceQuery to get one column in a row.
  This basic query (random row, fixed column) is issued from multiple
  threads hitting Cassandra.
 
  I only get up to 140 requests per second. Is this all I can get for reads?
  Or am I doing something wrong?
  Interestingly, when I request rows which don't exist, it goes up to
  1600 per second.
 
 
  You must test read performance with a parallel test (i.e. multiple threads). The
  faster result for non-existent rows is due to the bloom filter.
 
 
 
 
  Any insight you can share will be extremely helpful.
  Thank you.
 
  Regards,
  Jeesoo.
 
 



Re: composite types in CQL

2012-03-05 Thread aaron morton
It's not currently supported in CQL 
https://issues.apache.org/jira/browse/CASSANDRA-3761

You can do it using the CLI, see the online help. 
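
For comparison, a CLI sketch of the same composite idea (names illustrative, mirroring the UTF8Type/LongType pair from the Pycassa snippet quoted below):

create column family Timeseries
  with key_validation_class = 'CompositeType(UTF8Type,LongType)'
  and comparator = UTF8Type
  and default_validation_class = UTF8Type;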

Cheers


-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 2/03/2012, at 10:39 AM, Bayle Shanks wrote:

 hi, i'm wondering how to do composite data
 storage types in CQL. I am trying to mimic the Composite Types
 functionality of the Pycassa client:
 
 http://pycassa.github.com/pycassa/assorted/composite_types.html
 
 
 In short, in Pycassa you can do something like:
 
 ---
 
 itemTimeCompositeType = CompositeType(UTF8Type(), LongType())
 
 pycassa.system_manager.SystemManager().create_column_family(
  keyspaceName,
  columnFamilyName, 
  key_validation_class=itemTimeCompositeType
 )
 
 ...
 
 columnFamily.insert(self._makeKey(item, time_as_integer), {field : value})
 
 ---
 
 
 and then your primary key for this column family is a pair of a string
 and an integer. This is important because i am using
 ByteOrderedPartitioner and doing range scans among keys which share
 the same string 'item' but have different values for the integer.
 
 My motivation is that i am trying to port
 https://github.com/bshanks/cassandra-timeseries-py to Ruby and i
 thought i might try CQL.
 
 thanks,
  bayle



Re: Test Data creation in Cassandra

2012-03-05 Thread aaron morton
try tools/stress in the source distribution. 
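
For example, something along these lines (exact flags vary between versions; check the tool's help output):

# insert 1,000,000 rows, then read them back
tools/stress/bin/stress -d 127.0.0.1 -n 1000000
tools/stress/bin/stress -d 127.0.0.1 -n 1000000 -o read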

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 3/03/2012, at 6:01 AM, A J wrote:

 What is the best way to create millions of rows of test data in Cassandra?
 
 I would like to have some script where I first insert say 100 rows in
 a CF. Then reinsert the same data on 'server side' with new unique
 key. That will make it 200 rows. Then continue the exercise a few
 times till I get a lot of records.
 I don't care if the column names and values are identical between the
 different rows. Just a lot of records generated for a few seed
 records.
 
 The rows are very fat. So I don't want to use any client side
 scripting that would push individual or batched rows to cassandra.
 
 Thanks for any tips.



RE: cli question

2012-03-05 Thread Rishabh Agrawal
I faced the same issue some time back. The solution that fit my bill is as follows:

CREATE COLUMN FAMILY aaa
with comparator = 'CompositeType(UTF8Type,UTF8Type)'
and default_validation_class = 'UTF8Type'
and key_validation_class = 'CompositeType(UTF8Type,UTF8Type,UTF8Type)';

Notice I have specified three datatypes (validators) in key_validation_class 
under CompositeType.

Now if I have to insert with key aaa:bbb:ccc it will work smoothly and even if 
I wish to insert with just aaa:bbb it will work just fine.
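
For example, with the definition above (values illustrative):

set aaa['aaa:bbb:ccc']['col1'] = 'val1';
get aaa['aaa:bbb:ccc'];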

Do let me know if it solves your problem.

Regards
RIshabh Agrawal


From: Tamar Fraenkel [mailto:ta...@tok-media.com]
Sent: Monday, March 05, 2012 1:19 PM
To: cassandra-u...@incubator.apache.org
Subject: cli question

Hi!
I have a CF with the following definition:

CREATE COLUMN FAMILY a_b_indx
with comparator = 'CompositeType(LongType,UUIDType)'
and default_validation_class = 'UTF8Type'
and key_validation_class = 'CompositeType(UTF8Type,UTF8Type)';

Where the key may be a composite of the following two strings: 'AAA' and 
'BBB:CCC'
Notice, that the second string has ':' in it.
I try to query for rows I know exist in the CF but can't.
I tried those and many more :)

  *   get  a_b_indx ['AAA:BBB:CCC'];
  *   get  a_b_indx ['AAA:BBB\:CCC'];
  *   get  a_b_indx [utf8('AAA'):utf8('BBB:CCC')];

Is it possible? Does anyone know how?

Thanks,

Tamar Fraenkel
Senior Software Engineer, TOK Media

ta...@tok-media.commailto:ta...@tok-media.com
Tel:   +972 2 6409736
Mob:  +972 54 8356490
Fax:   +972 2 5612956







running two rings on the same subnet

2012-03-05 Thread Tamar Fraenkel
Hi!
I have a Cassandra  cluster with two nodes

nodetool ring -h localhost
Address         DC          Rack    Status State   Load        Owns    Token
                                                                       85070591730234615865843651857942052864
10.0.0.19       datacenter1 rack1   Up     Normal  488.74 KB   50.00%  0
10.0.0.28       datacenter1 rack1   Up     Normal  504.63 KB   50.00%  85070591730234615865843651857942052864

I want to create a second ring with the same name but two different nodes.
Using the token generation tool I get the same tokens, since they are derived
from the number of nodes in a ring.

My question is like this:
Let's say I create two new VMs, with IPs: 10.0.0.31 and 10.0.0.11
*In 10.0.0.31 cassandra.yaml I will set*
initial_token: 0
seeds: 10.0.0.31
listen_address: 10.0.0.31
rpc_address: 0.0.0.0

*In 10.0.0.11 cassandra.yaml I will set*
initial_token: 85070591730234615865843651857942052864
seeds: 10.0.0.31
listen_address: 10.0.0.11
rpc_address: 0.0.0.0

*Would the rings be separate?*

Thanks,

*Tamar Fraenkel *
Senior Software Engineer, TOK Media


ta...@tok-media.com
Tel:   +972 2 6409736
Mob:  +972 54 8356490
Fax:   +972 2 5612956

Re: cli question

2012-03-05 Thread Tamar Fraenkel
Thanks!
I decided to just replace all : with ^ and I can simply run:
get  a_b_indx ['AAA:BBB^CCC'];


*Tamar Fraenkel *
Senior Software Engineer, TOK Media


ta...@tok-media.com
Tel:   +972 2 6409736
Mob:  +972 54 8356490
Fax:   +972 2 5612956





On Mon, Mar 5, 2012 at 11:58 AM, Rishabh Agrawal 
rishabh.agra...@impetus.co.in wrote:

  I faced the same issue some time back. The solution that fit my bill is as
 follows:



 CREATE COLUMN FAMILY aaa

 with comparator = 'CompositeType(UTF8Type,UTF8Type)'

 and default_validation_class = 'UTF8Type'

 and key_validation_class =
  'CompositeType(UTF8Type,UTF8Type,UTF8Type)';



 Notice I have specified three datatypes (validators) in
 key_validation_class under CompositeType.



 Now if I have to insert with key aaa:bbb:ccc it will work smoothly and
 even if I wish to insert with just aaa:bbb it will work just fine.



 Do let me know if it solves your problem.



 Regards

 RIshabh Agrawal





 *From:* Tamar Fraenkel [mailto:ta...@tok-media.com]
 *Sent:* Monday, March 05, 2012 1:19 PM
 *To:* cassandra-u...@incubator.apache.org
 *Subject:* cli question



 Hi!
 I have a CF with the following definition:



 CREATE COLUMN FAMILY a_b_indx

 with comparator = 'CompositeType(LongType,UUIDType)'

 and default_validation_class = 'UTF8Type'

 and key_validation_class = 'CompositeType(UTF8Type,UTF8Type)';



 Where the key may be a composite of the following two strings: 'AAA' and
 'BBB:CCC'

 Notice, that the second string has ':' in it.

 I try to query for rows I know exist in the CF but can't.

 I tried those and many more :)

- get  a_b_indx ['AAA:BBB:CCC'];
- get  a_b_indx ['AAA:BBB\:CCC'];
- get  a_b_indx [utf8('AAA'):utf8('BBB:CCC')];



 Is it possible? Does anyone know how?



 Thanks,


   *Tamar Fraenkel *
 Senior Software Engineer, TOK Media



 ta...@tok-media.com
 Tel:   +972 2 6409736
 Mob:  +972 54 8356490
 Fax:   +972 2 5612956








Re: Maximum Row Size in Cassandra : Potential Bottleneck

2012-03-05 Thread aaron morton
 Is there any way in which the writes can be made pretty slow on different 
 nodes? Ideally I would like data to be written on one node and eventually 
 replicated across the other nodes. I don't really need a real-time update, so I can 
 pretty much live with slow writes.
Replicating inside the mutation request is a core feature of Cassandra. 

You can hack something by disabling gossip on a node and doing the 
inserts on it (at CL ONE). Re-enable gossip and let HH send the data to the 
other nodes, or disable HH and use repair to distribute the changes. HH will be 
less resource intensive. 
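
A sketch of that sequence with nodetool (host name illustrative):

nodetool -h node1 disablegossip   # node1 drops out of gossip
# ... run the bulk inserts against node1 at CL ONE ...
nodetool -h node1 enablegossip    # HH can then replay the writes to the other replicas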

 1250.188: [Full GC [PSYoungGen: 76825K->0K(571648K)] [PSOldGen: 
 7356362K->2356764K(7569408K)] 7433188K->2356764K(8141056K) [PSPermGen: 
 32579K->32579K(61056K)], 8.1019330 secs] [Times: user=8.10 sys=0.00, 
 real=8.10 secs]
What JVM are you using and what are the JVM options? 
(The info I can find about PSYoungGen suggests it's pretty old.)

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 5/03/2012, at 1:13 AM, Shubham Srivastava wrote:

 Tried all the possible options and nothing actually seems to work.
 
 Was trying to get insight into where exactly the problem arises when writes are 
 done on one node and reads on another. I found that GC gets triggered when 
 writes are done on the other node through RowMutationVerbHandler.
 
 Settings
 
 1.I tested this on two node setup with RF:2 and Read CL:1. 
 2.Also heap is of 8G and Xmn:800M on both the nodes with 4Cores.
 3.I am using concurrent_write:32 as default (8 * core). 
 4. -Dcassandra.compaction.priority=1
 5.No explicit GC settings commented the one in solandra-env.sh
 6.in_memory_compaction_limit_in_mb: 1
 7.read repair:0.1
 8.concurrent_compactors: 1
 
 
 Is there any way in which the writes can be made pretty slow on different 
 nodes? Ideally I would like data to be written on one node and eventually 
 replicated across the other nodes. I don't really need a real-time update, so I can 
 pretty much live with slow writes.
 
 Sharing the cassandra and GC logs as below
 
 
 
 
 DEBUG [MutationStage:5] 2012-03-04 17:07:52,014 RowMutationVerbHandler.java 
 (line 56) RowMutation(keyspace='L', 
 key='686f74656c737e30efbfbf7365617263686669656c64efbfbf64657320', 
 modifications=[ColumnFamily(TI [357025:false:4@1330861061454,])]) applied.  
 Sending response to 14541197@/10.86.29.21
 DEBUG [MutationStage:5] 2012-03-04 17:07:52,014 RowMutationVerbHandler.java 
 (line 44) Applying RowMutation(keyspace='L', 
 key='686f74656c737e30efbfbf686f74656c666163696c69747964657461696cefbfbf49726f6e696e672053657276696365',
  modifications=[ColumnFamily(TI [357025:false:2@1330861061454,])])
 DEBUG [MutationStage:5] 2012-03-04 17:07:52,014 Table.java (line 387) 
 applying mutation of row 
 686f74656c737e30efbfbf686f74656c666163696c69747964657461696cefbfbf49726f6e696e672053657276696365
 DEBUG [MutationStage:5] 2012-03-04 17:07:52,014 RowMutationVerbHandler.java 
 (line 56) RowMutation(keyspace='L', 
 key='686f74656c737e30efbfbf686f74656c666163696c69747964657461696cefbfbf49726f6e696e672053657276696365',
  modifications=[ColumnFamily(TI [357025:false:2@1330861061454,])]) applied.  
 Sending response to 14541198@/10.86.29.21
 DEBUG [MutationStage:5] 2012-03-04 17:07:52,014 RowMutationVerbHandler.java 
 (line 44) Applying RowMutation(keyspace='L', 
 key='686f74656c737e30efbfbf686f74656c666163696c697479efbfbf4c61756e647279205365727669636573',
  modifications=[ColumnFamily(TI [357025:false:2@1330861061454,])])
 DEBUG [MutationStage:5] 2012-03-04 17:07:52,014 Table.java (line 387) 
 applying mutation of row 
 686f74656c737e30efbfbf686f74656c666163696c697479efbfbf4c61756e647279205365727669636573
 DEBUG [MutationStage:5] 2012-03-04 17:07:52,014 RowMutationVerbHandler.java 
 (line 56) RowMutation(keyspace='L', 
 key='686f74656c737e30efbfbf686f74656c666163696c697479efbfbf4c61756e647279205365727669636573',
  modifications=[ColumnFamily(TI [357025:false:2@1330861061454,])]) applied.  
 Sending response to 14541199@/10.86.29.21
 DEBUG [248896865@qtp-1257398760-2] 2012-03-04 17:07:51,572 
 SolrIndexReader.java (line 928) getCoreCacheKey() - start
 DEBUG [MutationStage:32] 2012-03-04 17:07:51,426 RowMutationVerbHandler.java 
 (line 56) RowMutation(keyspace='L', 
 key='686f74656c737e30efbfbf636f756e74727953636f7265', 
 modifications=[ColumnFamily(FC [356996:false:4@1330861059172,])]) applied.  
 Sending response to 14532390@/10.86.29.21
 DEBUG [1568261127@qtp-1257398760-14] 2012-03-04 17:07:51,425 
 SolrIndexSearcher.java (line 557) doc(int, SetString) - start
 DEBUG [MutationStage:2] 2012-03-04 17:07:51,425 Table.java (line 387) 
 applying mutation of row 686f74656c737e30efbfbf6e616d65efbfbf6c61
 DEBUG [1568261127@qtp-1257398760-14] 2012-03-04 17:07:52,015 
 SolrIndexSearcher.java (line 557) doc(int, SetString) - start
 DEBUG [1861954021@qtp-1257398760-11] 2012-03-04 17:07:51,425 
 StorageProxy.java (line 696) Read: 0 ms.
  INFO 

Mutation Dropped Messages

2012-03-05 Thread Tiwari, Dushyant
Hi All,

While benchmarking Cassandra I found Mutation Dropped messages in the logs.  
Now I know this is a good old question. It will be really great if someone can 
provide a check list to recover when such a thing happens. I am looking for 
answers of the following questions  -


1.   Which parameters to tune in the config files? - Especially looking for 
heavy writes

2.   What is the difference between TimedOutException and silently dropping 
mutation messages while operating on a CL of QUORUM.


Regards,
Dushyant



Re: slow read

2012-03-05 Thread aaron morton
Where is the client running from ? 

To see if a node is keeping up with requests look at nodetool tpstats, and check if 
the read stage is backing up. 

To see how long a read takes, use nodetool cfstats and look at the read 
latency. (This is the latency of a read on that node, not cluster wide.)

To see how long a read takes cluster wide, use the StorageProxyMBean via 
JConsole. 
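
For example:

nodetool -h 127.0.0.1 tpstats   # a growing Pending count on ReadStage means reads are backing up
nodetool -h 127.0.0.1 cfstats   # per-CF read latency on this node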

Hope that helps. 

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 5/03/2012, at 10:46 PM, ruslan usifov wrote:

 And is the sum of rq/s across all threads 160? 
 
 2012/3/5 Jeesoo Shin bsh...@gmail.com
 Thank you for the reply. :)
 Yes, I did use multiple threads.
 160 and 320 threads gave me the same result.
 
 On 3/5/12, ruslan usifov ruslan.usi...@gmail.com wrote:
  2012/3/5 Jeesoo Shin bsh...@gmail.com
 
  Hi all.
 
  I have very SLOW READ here. :-(
  I made a cluster with three nodes (aws xlarge, replication = 3)
  Cassandra version is 1.0.6
  I have inserted 1,000,000 rows. (standard column)
  Each row has 200 columns.
  Each column has a 16-byte key and a 512-byte value.
 
  I used Hector createSliceQuery to get one column in a row.
  This basic query (random row, fixed column) is issued from multiple
  threads hitting Cassandra.
 
  I only get up to 140 requests per second. Is this all I can get for reads?
  Or am I doing something wrong?
  Interestingly, when I request rows which don't exist, it goes up to
  1600 per second.
 
 
  You must test read performance with a parallel test (i.e. multiple threads). The
  faster result for non-existent rows is due to the bloom filter.
 
 
 
 
  Any insight you can share will be extremely helpful.
  Thank you.
 
  Regards,
  Jeesoo.
 
 
 



Re: running two rings on the same subnet

2012-03-05 Thread aaron morton
 Would the rings be separate?
Yes. 
But I would recommend you give them different cluster names. It's a good 
protection against nodes accidentally joining the wrong cluster. 
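
For example, in each cluster's cassandra.yaml (names illustrative):

# nodes of the first cluster
cluster_name: 'ClusterA'
seeds: "10.0.0.19"

# nodes of the second cluster
cluster_name: 'ClusterB'
seeds: "10.0.0.31"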

cheers
-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 5/03/2012, at 11:06 PM, Tamar Fraenkel wrote:

 Hi!
 I have a Cassandra  cluster with two nodes
 
 nodetool ring -h localhost
 Address         DC          Rack    Status State   Load        Owns    Token
                                                                        85070591730234615865843651857942052864
 10.0.0.19       datacenter1 rack1   Up     Normal  488.74 KB   50.00%  0
 10.0.0.28       datacenter1 rack1   Up     Normal  504.63 KB   50.00%  85070591730234615865843651857942052864
 
 I want to create a second ring with the same name but two different nodes.
 Using the token generation tool I get the same tokens, since they are derived from the number 
 of nodes in a ring.
 
 My question is like this:
 Let's say I create two new VMs, with IPs: 10.0.0.31 and 10.0.0.11
 In 10.0.0.31 cassandra.yaml I will set
 initial_token: 0
 seeds: 10.0.0.31
 listen_address: 10.0.0.31
 rpc_address: 0.0.0.0
 
 In 10.0.0.11 cassandra.yaml I will set
 initial_token: 85070591730234615865843651857942052864
 seeds: 10.0.0.31
 listen_address: 10.0.0.11
 rpc_address: 0.0.0.0 
 
 Would the rings be separate?
 
 Thanks,
 
 Tamar Fraenkel 
 Senior Software Engineer, TOK Media 
 
 
 ta...@tok-media.com
 Tel:   +972 2 6409736 
 Mob:  +972 54 8356490 
 Fax:   +972 2 5612956 
 
 
 






Re: running two rings on the same subnet

2012-03-05 Thread Hontvári József Levente

  
  
You have to use PropertyFileSnitch and NetworkTopologyStrategy to
create a multi-datacenter setup with two circles. You can start
reading from this page:
http://www.datastax.com/docs/1.0/cluster_architecture/replication#about-replica-placement-strategy

Moreover all tokens must be unique (even across datacenters),
although - from pure curiosity - I wonder what is the rationale
behind this.

By the way, can someone enlighten me about the first line in the
output of the nodetool. Obviously it contains a token, but nothing
else. It seems like a formatting glitch, but maybe it has a role. 

On 2012.03.05. 11:06, Tamar Fraenkel wrote:

 Hi!
 I have a Cassandra cluster with two nodes

 nodetool ring -h localhost
 Address         DC          Rack    Status State   Load        Owns    Token
                                                                        85070591730234615865843651857942052864
 10.0.0.19       datacenter1 rack1   Up     Normal  488.74 KB   50.00%  0
 10.0.0.28       datacenter1 rack1   Up     Normal  504.63 KB   50.00%  85070591730234615865843651857942052864

 I want to create a second ring with the same name but two different nodes.
 Using the token generation tool I get the same tokens, since they are derived from the number of nodes in a ring.

 My question is like this:
 Let's say I create two new VMs, with IPs: 10.0.0.31 and 10.0.0.11
 In 10.0.0.31 cassandra.yaml I will set
 initial_token: 0
 seeds: "10.0.0.31"
 listen_address: 10.0.0.31
 rpc_address: 0.0.0.0

 In 10.0.0.11 cassandra.yaml I will set
 initial_token: 85070591730234615865843651857942052864
 seeds: "10.0.0.31"
 listen_address: 10.0.0.11
 rpc_address: 0.0.0.0

 Would the rings be separate?

 Thanks,

 Tamar Fraenkel
 Senior Software Engineer, TOK Media

 ta...@tok-media.com
 Tel:  +972 2 6409736
 Mob:  +972 54 8356490
 Fax:  +972 2 5612956

Re: Mutation Dropped Messages

2012-03-05 Thread aaron morton
 1.   Which parameters to tune in the config files? – Especially looking 
 for heavy writes
The node is overloaded. It may be because there are not enough nodes, or the 
node is under temporary stress such as GC or repair. 
If you have spare IO / CPU capacity you could increase concurrent_writes to 
increase throughput on the write stage. You then need to ensure the commit log 
and, to a lesser degree, the data volumes can keep up. 
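
For example, in cassandra.yaml (8 * number of cores is the usual starting point, as noted elsewhere in this digest; treat the value as something to benchmark rather than a recommendation):

concurrent_writes: 64    # e.g. for 8 cores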

 2.   What is the difference between TimedOutException and silently 
 dropping mutation messages while operating on a CL of QUORUM.
TimedOutException means CL nodes did not respond to the coordinator before 
rpc_timeout. Dropping messages happens when a message is removed from the queue 
in a thread pool after rpc_timeout has occurred. It is a feature of the 
architecture, and correct behaviour under stress. 
Inconsistencies created by dropped messages are repaired via reads at high CL, 
HH (in 1.0+), Read Repair or Anti Entropy.

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 5/03/2012, at 11:32 PM, Tiwari, Dushyant wrote:

 Hi All,
  
 While benchmarking Cassandra I found “Mutation Dropped” messages in the logs. 
  Now I know this is a good old question. It will be really great if someone 
 can provide a check list to recover when such a thing happens. I am looking 
 for answers of the following questions  -
  
 1.   Which parameters to tune in the config files? – Especially looking 
 for heavy writes
 2.   What is the difference between TimedOutException and silently 
 dropping mutation messages while operating on a CL of QUORUM.
  
  
 Regards,
 Dushyant



Re: running two rings on the same subnet

2012-03-05 Thread aaron morton
Do you want to create two separate clusters or a single cluster with two data 
centres? 

If it's the latter, token selection is discussed here: 
http://www.datastax.com/docs/1.0/install/cluster_init#token-gen-cassandra
 
 Moreover all tokens must be unique (even across datacenters), although - from 
 pure curiosity - I wonder what is the rationale behind this.
Otherwise data is not evenly distributed.

 By the way, can someone enlighten me about the first line in the output of 
 the nodetool. Obviously it contains a token, but nothing else. It seems like 
 a formatting glitch, but maybe it has a role. 
It's the exclusive lower bound token for the first node in the ring. This also 
happens to be the token for the last node in the ring. 

In your setup 
10.0.0.19 owns (85070591730234615865843651857942052864+1) to 0
10.0.0.28 owns  (0 + 1) to 85070591730234615865843651857942052864

(does not imply primary replica, just used to map keys to nodes.)
 


-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 5/03/2012, at 11:38 PM, Hontvári József Levente wrote:

 You have to use PropertyFileSnitch and NetworkTopologyStrategy to create a 
 multi-datacenter setup with two circles. You can start reading from this page:
 http://www.datastax.com/docs/1.0/cluster_architecture/replication#about-replica-placement-strategy
 
 Moreover all tokens must be unique (even across datacenters), although - from 
 pure curiosity - I wonder what is the rationale behind this.
 
 By the way, can someone enlighten me about the first line in the output of 
 the nodetool. Obviously it contains a token, but nothing else. It seems like 
 a formatting glitch, but maybe it has a role. 
 
 On 2012.03.05. 11:06, Tamar Fraenkel wrote:
 
 Hi!
 I have a Cassandra  cluster with two nodes
 
 nodetool ring -h localhost
 Address         DC          Rack    Status State   Load        Owns    Token
                                                                        85070591730234615865843651857942052864
 10.0.0.19       datacenter1 rack1   Up     Normal  488.74 KB   50.00%  0
 10.0.0.28       datacenter1 rack1   Up     Normal  504.63 KB   50.00%  85070591730234615865843651857942052864
 
 I want to create a second ring with the same name but two different nodes.
 Using the token generation tool I get the same tokens, since they are derived from the 
 number of nodes in a ring.
 
 My question is like this:
 Let's say I create two new VMs, with IPs: 10.0.0.31 and 10.0.0.11
 In 10.0.0.31 cassandra.yaml I will set
 initial_token: 0
 seeds: 10.0.0.31
 listen_address: 10.0.0.31
 rpc_address: 0.0.0.0
 
 In 10.0.0.11 cassandra.yaml I will set
 initial_token: 85070591730234615865843651857942052864
 seeds: 10.0.0.31
 listen_address: 10.0.0.11
 rpc_address: 0.0.0.0 
 
 Would the rings be separate?
 
 Thanks,
 
 Tamar Fraenkel 
 Senior Software Engineer, TOK Media 
 
 
 ta...@tok-media.com
 Tel:   +972 2 6409736 
 Mob:  +972 54 8356490 
 Fax:   +972 2 5612956 
 
 
 
 



Re: how stable is 1.0 these days?

2012-03-05 Thread Viktor Jevdokimov
1.0.7 is very stable: weeks in a high-load production environment without any
exception. 1.0.8 should be even more stable; check CHANGES.txt for what was
fixed.


2012/3/2 Marcus Eriksson krum...@gmail.com

 beware of https://issues.apache.org/jira/browse/CASSANDRA-3820 though if
 you have many keys per node

 other than that, yep, it seems solid

 /Marcus


 On Wed, Feb 29, 2012 at 6:20 PM, Thibaut Britz 
 thibaut.br...@trendiction.com wrote:

 Thanks!

 We will test it on our test cluster in the coming weeks and hopefully put
 it into production on our 200 node main cluster. :)

 Thibaut

 On Wed, Feb 29, 2012 at 5:52 PM, Edward Capriolo 
 edlinuxg...@gmail.comwrote:

 On Wed, Feb 29, 2012 at 10:35 AM, Thibaut Britz
 thibaut.br...@trendiction.com wrote:
  Any more feedback on larger deployments of 1.0.*?
 
  We are eager to try out the new features in production, but don't want
 to
  run into bugs as on former 0.7 and 0.8 versions.
 
  Thanks,
  Thibaut
 
 
 
  On Tue, Jan 31, 2012 at 6:59 AM, Ben Coverston 
 ben.covers...@datastax.com
  wrote:
 
  I'm not sure what Carlo is referring to, but generally if you have
 done,
  thousands of migrations you can end up in a situation where the
 migrations
  take a long time to replay, and there are some race conditions that
 can be
  problematic in the case where there are thousands of migrations that
 may
  need to be replayed while a node is bootstrapped. If you get into this
  situation it can be fixed by copying migrations from a known good
 schema to
  the node that you are trying to bootstrap.
 
  Generally I would advise against frequent schema updates. Unlike rows
 in
  column families the schema itself is designed to be relatively static.
 
  On Mon, Jan 30, 2012 at 2:14 PM, Jim Newsham jnews...@referentia.com
 
  wrote:
 
 
  Could you also elaborate for creating/dropping column families?
  We're
  currently working on moving to 1.0 and using dynamically created
 tables, so
  I'm very interested in what issues we might encounter.
 
  So far the only thing I've encountered (with 1.0.7 + hector 1.0-2) is
  that dropping a cf may sometimes fail with UnavailableException.  I
 think
  this happens when the cf is busy being compacted.  When I
 sleep/retry within
  a loop it eventually succeeds.
 
  Thanks,
  Jim
 
 
  On 1/26/2012 7:32 AM, Pierre-Yves Ritschard wrote:
 
  Can you elaborate on the composite types instabilities ? is this
  specific to hector as the radim's posts suggests ?
  These one liner answers are quite stressful :)
 
  On Thu, Jan 26, 2012 at 1:28 PM, Carlo Pirescarlopi...@gmail.com
   wrote:
 
  If you need to use composite types and create/drop column families
 on
  the
  fly you must be prepared to instabilities.
 
 
 
 
 
  --
  Ben Coverston
  DataStax -- The Apache Cassandra Company
 
 

 I would call 1.0.7 rock fricken solid. Incredibly stable. It has been
 that way since I updated to 0.8.8  really. TBs of data, billions of
 requests a day, and thanks to JAMM, memtable type auto-tuning, and
 other enhancements I rarely, if ever, find a node in a state where it
 requires a restart. My clusters are beast-ing.

 There are always bugs in software, but coming from a guy who ran
 cassandra 0.6.1, administration on my Cassandra cluster is like a
 vacation now.






Re: Huge amount of empty files in data directory.

2012-03-05 Thread Viktor Jevdokimov
After running Cassandra for 2 years in production on Windows servers,
starting from 0.7 beta2 up to 1.0.7 we have moved to Linux and forgot all
the hell we had on Windows. Having JNA, off-heap row cache and normally
working MMAP on Linux, you get much better performance and stability
compared to Windows, and less maintenance.

2012/3/1 Henrik Schröder skro...@gmail.com

 Great, thanks!


 /Henrik


 On Thu, Mar 1, 2012 at 13:08, Sylvain Lebresne sylv...@datastax.comwrote:

 It's a bug, namely: https://issues.apache.org/jira/browse/CASSANDRA-3616
 You'd want to upgrade.

 --
 Sylvain

 On Thu, Mar 1, 2012 at 1:01 PM, Henrik Schröder skro...@gmail.com
 wrote:
  Hi,
 
  We're running Cassandra 1.0.6 on Windows, and noticed that the amount of
  files in the datadirectory just keeps growing. We have about 60GB of
 data
  per node, we do a major compaction about once a week, but after
 compaction
  there's a lot of 0-byte temp files and old files that are kept for some
  reason. After 50 days of uptime there was around 5 files in each
  datadirectory, but when we restarted a server it deleted all the
 unnecessary
  files and it shrunk down to about 200 files.
 
  We're running without compression, and with the regular compaction
 strategy,
  not leveldb. I don't remember seeing this behaviour in older versions of
  Cassandra, shouldn't it delete temp files while running? Is it possible
 to
  force it to delete temp files while running? Is this fixed in a later
  version? Or do we have to periodically restart servers to clean up the
  datadirectories?
 
 
  /Henrik Schröder





Re: running two rings on the same subnet

2012-03-05 Thread Tamar Fraenkel
I want two separate clusters.
*Tamar Fraenkel *
Senior Software Engineer, TOK Media


ta...@tok-media.com
Tel:   +972 2 6409736
Mob:  +972 54 8356490
Fax:   +972 2 5612956





On Mon, Mar 5, 2012 at 12:48 PM, aaron morton aa...@thelastpickle.comwrote:

 Do you want to create two separate clusters or a single cluster with two
 data centres ?

 If it's the later, token selection is discussed here
 http://www.datastax.com/docs/1.0/install/cluster_init#token-gen-cassandra


 Moreover all tokens must be unique (even across datacenters), although -
 from pure curiosity - I wonder what is the rationale behind this.

 Otherwise data is not evenly distributed.

 By the way, can someone enlighten me about the first line in the output of
 the nodetool. Obviously it contains a token, but nothing else. It seems
 like a formatting glitch, but maybe it has a role.

 It's the exclusive lower bound token for the first node in the ring. This
 also happens to be the token for the last node in the ring.

 In your setup
 10.0.0.19 owns (85070591730234615865843651857942052864+1) to 0
 10.0.0.28 owns  (0 + 1) to 85070591730234615865843651857942052864

 (does not imply primary replica, just used to map keys to nodes.)



   -
 Aaron Morton
 Freelance Developer
 @aaronmorton
 http://www.thelastpickle.com

 On 5/03/2012, at 11:38 PM, Hontvári József Levente wrote:

  You have to use PropertyFileSnitch and NetworkTopologyStrategy to create
 a multi-datacenter setup with two circles. You can start reading from this
 page:

 http://www.datastax.com/docs/1.0/cluster_architecture/replication#about-replica-placement-strategy

 Moreover all tokens must be unique (even across datacenters), although -
 from pure curiosity - I wonder what is the rationale behind this.

 By the way, can someone enlighten me about the first line in the output of
 the nodetool. Obviously it contains a token, but nothing else. It seems
 like a formatting glitch, but maybe it has a role.

 On 2012.03.05. 11:06, Tamar Fraenkel wrote:

 Hi!
 I have a Cassandra  cluster with two nodes

 nodetool ring -h localhost
 Address         DC          Rack    Status State   Load        Owns    Token
                                                                        85070591730234615865843651857942052864
 10.0.0.19       datacenter1 rack1   Up     Normal  488.74 KB   50.00%  0
 10.0.0.28       datacenter1 rack1   Up     Normal  504.63 KB   50.00%  85070591730234615865843651857942052864

 I want to create a second ring with the same name but two different
 nodes.
 Using the token generation tool I get the same tokens, since they are derived from the
 number of nodes in a ring.

 My question is like this:
 Let's say I create two new VMs, with IPs: 10.0.0.31 and 10.0.0.11
 *In 10.0.0.31 cassandra.yaml I will set*
 initial_token: 0
 seeds: 10.0.0.31
 listen_address: 10.0.0.31
 rpc_address: 0.0.0.0

  *In 10.0.0.11 cassandra.yaml I will set*
 initial_token: 85070591730234615865843651857942052864
 seeds: 10.0.0.31
 listen_address: 10.0.0.11
 rpc_address: 0.0.0.0

  *Would the rings be separate?*

  Thanks,

  *Tamar Fraenkel *
 Senior Software Engineer, TOK Media



 ta...@tok-media.com
 Tel:   +972 2 6409736
 Mob:  +972 54 8356490
 Fax:   +972 2 5612956







Rationale behind incrementing all tokens by one in a different datacenter (was: running two rings on the same subnet)

2012-03-05 Thread Hontvári József Levente

I am thinking about the frequent example:

dc1 - node1: 0
dc1 - node2: large...number

dc2 - node1: 1
dc2 - node2: large...number + 1

In theory using the same tokens in dc2 as in dc1 would not significantly 
affect key distribution: only the two keys on the border would move to 
the next node, which is not much. However it seems that there 
is an unexplained requirement (at least I could not find an 
explanation) that all nodes must have a unique token, even if they are 
put into different circles by NetworkTopologyStrategy.





On 2012.03.05. 11:48, aaron morton wrote:
Moreover all tokens must be unique (even across datacenters), although 
- from pure curiosity - I wonder what is the rationale behind this.

Otherwise data is not evenly distributed.


Adding a second datacenter

2012-03-05 Thread David Koblas
Everything that I've read about data centers focuses on setting things 
up at the beginning of time.


I have the following situation:

10 machines in a datacenter (DC1), with replication factor of 2.

I want to set up a second data center (DC2) with the following 
configuration:

  20 machines with a replication factor of 4

What I've found is that if I initially start adding things, the first 
machine to join the network attempts to replicate all of the data from 
DC1 and fills up its disk drive.  I've played with setting the 
strategy_options to have a replication factor of 0; then I can bring up 
all 20 machines in DC2 but then start getting a huge number of read 
errors from reads on DC1.


Is there a simple cookbook on how to add a second DC?  I'm currently 
trying to set the replication factor to 1 and do a repair, but that 
doesn't feel like the right approach.


Thanks,





Re: Adding a second datacenter

2012-03-05 Thread Jeremiah Jordan
You need to make sure your clients are reading using LOCAL_* settings so 
that they don't try to get data from the other data center.  But you 
shouldn't get errors while replication_factor is 0.  Once you change the 
replication factor to 4, you should get missing data if you are using 
LOCAL_* for reading.


What version are you using?

See the IRC logs at the begining of this JIRA discussion thread for some 
info:


https://issues.apache.org/jira/browse/CASSANDRA-3483

But you should be able to:
1. Set dc2:0 in the replication factor.
2. Set bootstrap to false on the new nodes.
3. Start all of the new nodes.
4. Change the replication factor to dc2:4.
5. Run repair on the nodes in dc2.

Once the repairs finish you should be able to start using DC2.  You are 
still going to need a bunch of extra space because the repair is going 
to get you a couple copies of the data.
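
A hedged CLI sketch of steps 1, 4 and 5 above (keyspace and DC names are illustrative, and the strategy_options syntax differs slightly between CLI versions):

update keyspace MyKeyspace
  with placement_strategy = 'NetworkTopologyStrategy'
  and strategy_options = {DC1 : 2, DC2 : 0};

and later, once the DC2 nodes are running:

update keyspace MyKeyspace with strategy_options = {DC1 : 2, DC2 : 4};

followed by nodetool repair on each node in DC2.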


Once 1.1 comes out it will have new nodetool commands for making this a 
little nicer per CASSANDRA-3483


-Jeremiah


On 03/05/2012 09:42 AM, David Koblas wrote:
Everything that I've read about data centers focuses on setting things 
up at the beginning of time.


I have the following situation:

10 machines in a datacenter (DC1), with replication factor of 2.

I want to set up a second data center (DC2) with the following 
configuration:

  20 machines with a replication factor of 4

What I've found is that if I initially start adding things, the first 
machine to join the network attempts to replicate all of the data from 
DC1 and fills up its disk drive.  I've played with setting the 
strategy_options to have a replication factor of 0; then I can bring up 
all 20 machines in DC2 but then start getting a huge number of read 
errors from reads on DC1.


Is there a simple cookbook on how to add a second DC?  I'm currently 
trying to set the replication factor to 1 and do a repair, but that 
doesn't feel like the right approach.


Thanks,





Division by zero

2012-03-05 Thread Vanger

After upgrading from version 1.0.1 to 1.0.8 we started to get an exception:

ERROR [http-8095-1 WideEntityServiceImpl.java:142] - get: key1 - 
{type=RANGE, start=0, end=9223372036854775807, orderDesc=false, limit=1}
me.prettyprint.hector.api.exceptions.HCassandraInternalException: 
Cassandra encountered an internal error processing this request: 
TApplicationError type: 6 message:Internal error processing get_slice
  at me.prettyprint.cassandra.service.ExceptionsTranslatorImpl.translate(ExceptionsTranslatorImpl.java:31)
  at me.prettyprint.cassandra.service.KeyspaceServiceImpl$7.execute(KeyspaceServiceImpl.java:285)
  at me.prettyprint.cassandra.service.KeyspaceServiceImpl$7.execute(KeyspaceServiceImpl.java:268)
  at me.prettyprint.cassandra.service.Operation.executeAndSetResult(Operation.java:101)
  at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:233)
  at me.prettyprint.cassandra.service.KeyspaceServiceImpl.operateWithFailover(KeyspaceServiceImpl.java:131)
  at me.prettyprint.cassandra.service.KeyspaceServiceImpl.getSlice(KeyspaceServiceImpl.java:289)
  at me.prettyprint.cassandra.model.thrift.ThriftSliceQuery$1.doInKeyspace(ThriftSliceQuery.java:53)
  at me.prettyprint.cassandra.model.thrift.ThriftSliceQuery$1.doInKeyspace(ThriftSliceQuery.java:49)
  at me.prettyprint.cassandra.model.KeyspaceOperationCallback.doInKeyspaceAndMeasure(KeyspaceOperationCallback.java:20)
  at me.prettyprint.cassandra.model.ExecutingKeyspace.doExecute(ExecutingKeyspace.java:85)
  at me.prettyprint.cassandra.model.thrift.ThriftSliceQuery.execute(ThriftSliceQuery.java:48)



I have already created an issue in JIRA with a more detailed 
description:

https://issues.apache.org/jira/browse/CASSANDRA-4000

Any ideas?

Thanks.


Re: Rationale behind incrementing all tokens by one in a different datacenter (was: running two rings on the same subnet)

2012-03-05 Thread Jeremiah Jordan
There is a requirement that all nodes have a unique token.  There is 
still one global cluster/ring that each node needs to be unique on.  The 
logically separate rings that NetworkTopologyStrategy puts them into are 
hidden from the rest of the code.
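
The usual workaround is exactly the +1 offset from the example above. A tiny sketch of generating RandomPartitioner tokens that stay unique across DCs (the offset is arbitrary, it just has to differ per DC):

import java.math.BigInteger;

public class TokenGen {
    public static void main(String[] args) {
        int nodesPerDc = 2;
        BigInteger range = BigInteger.valueOf(2).pow(127);
        for (int dc = 0; dc < 2; dc++) {          // dc index doubles as the offset
            for (int i = 0; i < nodesPerDc; i++) {
                BigInteger token = range.multiply(BigInteger.valueOf(i))
                        .divide(BigInteger.valueOf(nodesPerDc))
                        .add(BigInteger.valueOf(dc));
                System.out.println("dc" + (dc + 1) + " node" + (i + 1) + ": " + token);
            }
        }
    }
}

For two nodes per DC this prints 0 and 85070591730234615865843651857942052864 for dc1, and the same values plus one for dc2, matching the example at the top of this thread.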


-Jeremiah

On 03/05/2012 05:13 AM, Hontvári József Levente wrote:

I am thinking about the frequent example:

dc1 - node1: 0
dc1 - node2: large...number

dc2 - node1: 1
dc2 - node2: large...number + 1

In theory using the same tokens in dc2 as in dc1 would not 
significantly affect key distribution: only the two keys on 
the border would move to the next node, which is not much. However it 
seems that there is an unexplained requirement (at least I could not 
find an explanation) that all nodes must have a unique token, even if 
they are put into different circles by NetworkTopologyStrategy.





On 2012.03.05. 11:48, aaron morton wrote:
Moreover all tokens must be unique (even across datacenters), 
although - from pure curiosity - I wonder what is the rationale 
behind this.

Otherwise data is not evenly distributed.


Re: how stable is 1.0 these days?

2012-03-05 Thread Thibaut Britz
Thanks for the feedback. I will certainly execute scrub after the update.


On Mon, Mar 5, 2012 at 11:55 AM, Viktor Jevdokimov vjevdoki...@gmail.comwrote:

 1.0.7 is very stable: weeks in a high-load production environment without
 any exception. 1.0.8 should be even more stable; check CHANGES.txt for what
 was fixed.


 2012/3/2 Marcus Eriksson krum...@gmail.com

 beware of https://issues.apache.org/jira/browse/CASSANDRA-3820 though if
 you have many keys per node

 other than that, yep, it seems solid

 /Marcus


 On Wed, Feb 29, 2012 at 6:20 PM, Thibaut Britz 
 thibaut.br...@trendiction.com wrote:

 Thanks!

 We will test it on our test cluster in the coming weeks and hopefully
 put it into production on our 200 node main cluster. :)

 Thibaut

 On Wed, Feb 29, 2012 at 5:52 PM, Edward Capriolo 
 edlinuxg...@gmail.comwrote:

 On Wed, Feb 29, 2012 at 10:35 AM, Thibaut Britz
 thibaut.br...@trendiction.com wrote:
  Any more feedback on larger deployments of 1.0.*?
 
  We are eager to try out the new features in production, but don't
 want to
  run into bugs as on former 0.7 and 0.8 versions.
 
  Thanks,
  Thibaut
 
 
 
  On Tue, Jan 31, 2012 at 6:59 AM, Ben Coverston 
 ben.covers...@datastax.com
  wrote:
 
  I'm not sure what Carlo is referring to, but generally if you have
 done,
  thousands of migrations you can end up in a situation where the
 migrations
  take a long time to replay, and there are some race conditions that
 can be
  problematic in the case where there are thousands of migrations that
 may
  need to be replayed while a node is bootstrapped. If you get into
 this
  situation it can be fixed by copying migrations from a known good
 schema to
  the node that you are trying to bootstrap.
 
  Generally I would advise against frequent schema updates. Unlike
 rows in
  column families the schema itself is designed to be relatively
 static.
 
  On Mon, Jan 30, 2012 at 2:14 PM, Jim Newsham 
 jnews...@referentia.com
  wrote:
 
 
  Could you also elaborate for creating/dropping column families?
  We're
  currently working on moving to 1.0 and using dynamically created
 tables, so
  I'm very interested in what issues we might encounter.
 
  So far the only thing I've encountered (with 1.0.7 + hector 1.0-2)
 is
  that dropping a cf may sometimes fail with UnavailableException.  I
 think
  this happens when the cf is busy being compacted.  When I
 sleep/retry within
  a loop it eventually succeeds.
 
  Thanks,
  Jim
 
 
  On 1/26/2012 7:32 AM, Pierre-Yves Ritschard wrote:
 
  Can you elaborate on the composite types instabilities ? is this
  specific to hector as the radim's posts suggests ?
  These one liner answers are quite stressful :)
 
  On Thu, Jan 26, 2012 at 1:28 PM, Carlo Pirescarlopi...@gmail.com
   wrote:
 
  If you need to use composite types and create/drop column
 families on
  the
  fly you must be prepared to instabilities.
 
 
 
 
 
  --
  Ben Coverston
  DataStax -- The Apache Cassandra Company
 
 

 I would call 1.0.7 rock fricken solid. Incredibly stable. It has been
 that way since I updated to 0.8.8  really. TBs of data, billions of
 requests a day, and thanks to JAMM, memtable type auto-tuning, and
 other enhancements I rarely, if ever, find a node in a state where it
 requires a restart. My clusters are beast-ing.

 There are always bugs in software, but coming from a guy who ran
 cassandra 0.6.1, administration on my Cassandra cluster is like a
 vacation now.







Re: Adding a second datacenter

2012-03-05 Thread David Koblas

Jeremiah,

Thanks!

I'm running 1.0.8, two interesting things to note:

- I don't have sufficient disk space to handle the straight bump to a 
replication factor of 4, so I think I'm going to have to do it one by 
one (1,2,3 and 4) with a bunch of cleanups in between.


- Also, using LOCAL_QUORUM doesn't work: since my application has a 
hard response time limit, my read speed ends up being the speed of 
the slowest node.  What I want is LOCAL_ONE, which doesn't exist in the 
API (unless I missed something).


Yes, CASSANDRA-3483 is really what I'm looking for.

--david

On 3/5/12 8:02 AM, Jeremiah Jordan wrote:
You need to make sure your clients are reading using LOCAL_* settings 
so that they don't try to get data from the other data center.  But 
you shouldn't get errors while replication_factor is 0.  Once you 
change the replication factor to 4, you should get missing data if you 
are using LOCAL_* for reading.


What version are you using?

See the IRC logs at the begining of this JIRA discussion thread for 
some info:


https://issues.apache.org/jira/browse/CASSANDRA-3483

But you should be able to:
1. Set dc2:0 in the replication factor.
2. Set bootstrap to false on the new nodes.
3. Start all of the new nodes.
4. Change the replication factor to dc2:4.
5. Run repair on the nodes in dc2.

Once the repairs finish you should be able to start using DC2.  You 
are still going to need a bunch of extra space because the repair is 
going to get you a couple copies of the data.


Once 1.1 comes out it will have new nodetool commands for making this 
a little nicer per CASSANDRA-3483


-Jeremiah


On 03/05/2012 09:42 AM, David Koblas wrote:
Everything that I've read about data centers focuses on setting 
things up at the beginning of time.


I have the following situation:

10 machines in a datacenter (DC1), with replication factor of 2.

I want to set up a second data center (DC2) with the following 
configuration:

  20 machines with a replication factor of 4

What I've found is that if I just start adding nodes, the first 
machine to join the network attempts to replicate all of the data 
from DC1 and fills up its disk drive.  I've played with setting the 
strategy_options to a replication factor of 0 for DC2; then I can bring 
up all 20 machines in DC2, but I start getting a huge number of 
errors from reads on DC1.


Is there a simple cookbook on how to add a second DC?  I'm currently 
trying to set the replication factor to 1 and do a repair, but that 
doesn't feel like the right approach.


Thanks,





Re: Issue with nodetool clearsnapshot

2012-03-05 Thread aaron morton
 It seems that instead of removing the snapshot, clearsnapshot moved the data 
 files from the snapshot directory to the parent directory and the size of the 
 data for that keyspace has doubled.
That is not possible; the only code there deletes files in the 
snapshot. 

Note that the snapshot contains hard links to the files in the data dir. 
Deleting / clearing the snapshot will not free the files in the data dir 
if they are still in use. 
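
To see why, here is a small standalone Java sketch of the hard link 
behaviour - nothing Cassandra specific, the file names are made up:

import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class HardLinkDemo {
    public static void main(String[] args) throws Exception {
        Path data = Paths.get("data", "Standard1-g-1-Data.db");      // "live" sstable
        Path snap = Paths.get("snapshots", "Standard1-g-1-Data.db"); // snapshot entry
        Files.createDirectories(data.getParent());
        Files.createDirectories(snap.getParent());
        Files.write(data, new byte[] { 1, 2, 3 });

        Files.createLink(snap, data); // hard link: same inode, no extra data blocks
        Files.delete(snap);           // clearing the "snapshot" removes one link only

        // The live file is untouched; disk blocks are freed only when the
        // last remaining link is deleted.
        System.out.println(Files.exists(data)); // prints: true
    }
}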

  Many of the files are looking like duplicates.
 
 in Keyspace1 directory
 156987786084 Jan 21 03:18 Standard1-g-7317-Data.db
 156987786084 Mar  4 01:33 Standard1-g-8850-Data.db
Under 0.8.x files are not immediately deleted. Did the data directory contain 
zero size -Compacted files with the same number ?
  
Cheers


-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 5/03/2012, at 11:50 PM, B R wrote:

 Version 0.8.9
 
 We run a 2 node cluster with RF=2. We ran a scrub and after that ran the 
 clearsnapshot to remove the backup snapshot created by scrub. It seems that 
 instead of removing the snapshot, clearsnapshot moved the data files from the 
 snapshot directory to the parent directory and the size of the data for that 
 keyspace has doubled. Many of the files are looking like duplicates.
 
 in Keyspace1 directory
 156987786084 Jan 21 03:18 Standard1-g-7317-Data.db
 156987786084 Mar  4 01:33 Standard1-g-8850-Data.db
 118211555728 Jan 31 12:50 Standard1-g-7968-Data.db
 118211555728 Mar  3 22:58 Standard1-g-8840-Data.db
 116902342895 Feb 25 02:04 Standard1-g-8832-Data.db
 116902342895 Mar  3 22:10 Standard1-g-8836-Data.db
 93788425710 Feb 21 04:20 Standard1-g-8791-Data.db
 93788425710 Mar  4 00:29 Standard1-g-8845-Data.db
 .
 
 Even though the nodetool ring command shows the correct data size for the 
 node, the du -sh on the keyspace directory gives double the size.
 
 Can you guide us to proceed from this situation ?
 
 Thanks.



Re: running two rings on the same subnet

2012-03-05 Thread aaron morton
Create nodes that do not share seeds, and give the clusters different names as 
a safety measure. 

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 6/03/2012, at 12:04 AM, Tamar Fraenkel wrote:

 I want two separate clusters.
 Tamar Fraenkel 
 Senior Software Engineer, TOK Media 
 
 
 ta...@tok-media.com
 Tel:   +972 2 6409736 
 Mob:  +972 54 8356490 
 Fax:   +972 2 5612956 
 
 
 
 
 
 On Mon, Mar 5, 2012 at 12:48 PM, aaron morton aa...@thelastpickle.com wrote:
 Do you want to create two separate clusters or a single cluster with two data 
 centres ? 
 
 If it's the later, token selection is discussed here 
 http://www.datastax.com/docs/1.0/install/cluster_init#token-gen-cassandra
  
 Moreover all tokens must be unique (even across datacenters), although - 
 out of pure curiosity - I wonder what the rationale behind this is.
 Otherwise data is not evenly distributed.
 
 By the way, can someone enlighten me about the first line in the output of 
 nodetool? Obviously it contains a token, but nothing else. It seems like 
 a formatting glitch, but maybe it has a role. 
 It's the exclusive lower bound token for the first node in the ring. This 
 also happens to be the token for the last node in the ring. 
 
 In your setup 
 10.0.0.19 owns (85070591730234615865843651857942052864+1) to 0
 10.0.0.28 owns  (0 + 1) to 85070591730234615865843651857942052864
 
 (does not imply primary replica, just used to map keys to nodes.)
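
 (For reference, the tokengentool arithmetic for RandomPartitioner is just 
 token_i = i * 2**127 / N. A quick sketch that reproduces the two tokens 
 above:)

import java.math.BigInteger;

public class TokenGen {
    public static void main(String[] args) {
        int nodes = 2;                                        // nodes in the ring
        BigInteger ringSize = BigInteger.valueOf(2).pow(127); // RandomPartitioner space
        for (int i = 0; i < nodes; i++) {
            // Tokens are spaced evenly around the ring.
            System.out.println(BigInteger.valueOf(i)
                    .multiply(ringSize)
                    .divide(BigInteger.valueOf(nodes)));
        }
        // Prints 0 and 85070591730234615865843651857942052864.
    }
}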
  
 
 
 -
 Aaron Morton
 Freelance Developer
 @aaronmorton
 http://www.thelastpickle.com
 
 On 5/03/2012, at 11:38 PM, Hontvári József Levente wrote:
 
 You have to use PropertyFileSnitch and NetworkTopologyStrategy to create a 
 multi-datacenter setup with two circles. You can start reading from this 
 page:
 http://www.datastax.com/docs/1.0/cluster_architecture/replication#about-replica-placement-strategy
 
 Moreover all tokens must be unique (even across datacenters), although - 
 out of pure curiosity - I wonder what the rationale behind this is.
 
 By the way, can someone enlighten me about the first line in the output of 
 nodetool? Obviously it contains a token, but nothing else. It seems like 
 a formatting glitch, but maybe it has a role. 
 
 On 2012.03.05. 11:06, Tamar Fraenkel wrote:
 Hi!
 I have a Cassandra cluster with two nodes:
 
 nodetool ring -h localhost
 Address     DC          Rack   Status  State   Load       Owns     Token
                                                                    85070591730234615865843651857942052864
 10.0.0.19   datacenter1 rack1  Up      Normal  488.74 KB  50.00%   0
 10.0.0.28   datacenter1 rack1  Up      Normal  504.63 KB  50.00%   85070591730234615865843651857942052864
 
 I want to create a second ring with the same name but two different nodes.
 Using tokengentool I get the same tokens, as they are determined by the 
 number of nodes in a ring.
 
 My question is like this:
 Lets say I create two new VMs, with IPs: 10.0.0.31 and 10.0.0.11
 In 10.0.0.31 cassandra.yaml I will set
 initial_token: 0
 seeds: 10.0.0.31
 listen_address: 10.0.0.31
 rpc_address: 0.0.0.0
 
 In 10.0.0.11 cassandra.yaml I will set
 initial_token: 85070591730234615865843651857942052864
 seeds: 10.0.0.31
 listen_address: 10.0.0.11
 rpc_address: 0.0.0.0 
 
 Would the rings be separate?
 
 Thanks,
 
 Tamar Fraenkel 
 Senior Software Engineer, TOK Media 
 
 
 
 ta...@tok-media.com
 Tel:   +972 2 6409736 
 Mob:  +972 54 8356490 
 Fax:   +972 2 5612956 
 
 
 
 
 
 



Re: Division by zero

2012-03-05 Thread aaron morton
(Commented in the ticket as well)
What is the error in the server log ? 

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 6/03/2012, at 5:04 AM, Vanger wrote:

 After upgrading from version 1.0.1 to 1.0.8 we started to get this exception:
 
 ERROR [http-8095-1 WideEntityServiceImpl.java:142] - get: key1 - {type=RANGE, 
 start=0, end=9223372036854775807, orderDesc=false, limit=1}
 me.prettyprint.hector.api.exceptions.HCassandraInternalException: Cassandra 
 encountered an internal error processing this request: TApplicationError 
 type: 6 message:Internal error processing get_slice
 at 
 me.prettyprint.cassandra.service.ExceptionsTranslatorImpl.translate(ExceptionsTranslatorImpl.java:31)
 at 
 me.prettyprint.cassandra.service.KeyspaceServiceImpl$7.execute(KeyspaceServiceImpl.java:285)
 at 
 me.prettyprint.cassandra.service.KeyspaceServiceImpl$7.execute(KeyspaceServiceImpl.java:268)
 at 
 me.prettyprint.cassandra.service.Operation.executeAndSetResult(Operation.java:101)
 at 
 me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:233)
 at 
 me.prettyprint.cassandra.service.KeyspaceServiceImpl.operateWithFailover(KeyspaceServiceImpl.java:131)
 at 
 me.prettyprint.cassandra.service.KeyspaceServiceImpl.getSlice(KeyspaceServiceImpl.java:289)
 at 
 me.prettyprint.cassandra.model.thrift.ThriftSliceQuery$1.doInKeyspace(ThriftSliceQuery.java:53)
 at 
 me.prettyprint.cassandra.model.thrift.ThriftSliceQuery$1.doInKeyspace(ThriftSliceQuery.java:49)
 at 
 me.prettyprint.cassandra.model.KeyspaceOperationCallback.doInKeyspaceAndMeasure(KeyspaceOperationCallback.java:20)
 at 
 me.prettyprint.cassandra.model.ExecutingKeyspace.doExecute(ExecutingKeyspace.java:85)
 at 
 me.prettyprint.cassandra.model.thrift.ThriftSliceQuery.execute(ThriftSliceQuery.java:48)
 
 
 I have already (perhaps too soon?) created an issue in JIRA with a more 
 detailed description:
 https://issues.apache.org/jira/browse/CASSANDRA-4000
 
 Any ideas?
 
 Thanks.



Re: Mutation Dropped Messages

2012-03-05 Thread aaron morton
 I increased the size of the cluster also the concurrent_writes parameter. 
 Still there is a node which keeps on dropping the mutation messages.
Ensure all the nodes have the same spec, and the nodes have the same config. In 
a virtual environment consider moving the node.

 Is this due to some improper load balancing? 
What does nodetool ring say and what sort of queries (and RF and CL) are you 
sending.
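
If the imbalance turns out to be client side, you can also tell Hector 
which balancing policy to use instead of relying on the default. A sketch 
(host and cluster names are placeholders; the policy class is the one 
shipped in Hector 1.0-x, check your version):

import me.prettyprint.cassandra.connection.RoundRobinBalancingPolicy;
import me.prettyprint.cassandra.service.CassandraHostConfigurator;
import me.prettyprint.hector.api.Cluster;
import me.prettyprint.hector.api.factory.HFactory;

public class BalancedClient {
    public static void main(String[] args) {
        // List every node, not just one, so the pool has hosts to rotate over.
        CassandraHostConfigurator chc = new CassandraHostConfigurator(
                "node1:9160,node2:9160,node3:9160");
        chc.setLoadBalancingPolicy(new RoundRobinBalancingPolicy());
        Cluster cluster = HFactory.getOrCreateCluster("TestCluster", chc);
    }
}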

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 6/03/2012, at 3:58 AM, Tiwari, Dushyant wrote:

 Hey Aaron,
  
 I increased the size of the cluster and also the concurrent_writes parameter. 
 Still there is one node which keeps on dropping mutation messages; the other 
 nodes are not dropping any. I am using the Hector API and have done nothing 
 for load balancing so far, just provided the host:port of the nodes in the 
 CassandraHostConfigurator. Is this due to some improper load balancing? Also, 
 the physical host where the node runs is under relatively heavier load than 
 the other nodes' hosts. What can I do to improve?
 PS: The node is a seed of the cluster.
  
 Thanks,
 Dushyant
  
 From: aaron morton [mailto:aa...@thelastpickle.com] 
 Sent: Monday, March 05, 2012 4:15 PM
 To: user@cassandra.apache.org
 Subject: Re: Mutation Dropped Messages
  
 1.   Which parameters to tune in the config files? – Especially looking 
 for heavy writes
 The node is overloaded. It may be because there are not enough nodes, or the 
 node is under temporary stress such as GC or repair. 
 If you have spare IO / CPU capacity you could increase concurrent_writes to 
 increase throughput on the write stage. You then need to ensure the commit 
 log and, to a lesser degree, the data volumes can keep up. 
  
 2.   What is the difference between TimedOutException and silently 
 dropping mutation messages while operating on a CL of QUORUM.
 TimedOutException means CL nodes did not respond to the coordinator before 
 rpc_timeout. Dropping messages happens when a message is removed from the 
 queue in a thread pool after rpc_timeout has occurred. It is a feature of 
 the architecture, and correct behaviour under stress. 
 Inconsistencies created by dropped messages are repaired via reads at high 
 CL, hinted handoff (in 1.0+), read repair or anti-entropy repair.
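
 On the client side that means a TimedOutException is not a failed write - 
 the mutation may still be applied - so retries should be idempotent. A 
 hedged Hector sketch (the CF and column names are placeholders):

import me.prettyprint.cassandra.serializers.StringSerializer;
import me.prettyprint.hector.api.Keyspace;
import me.prettyprint.hector.api.exceptions.HTimedOutException;
import me.prettyprint.hector.api.factory.HFactory;
import me.prettyprint.hector.api.mutation.Mutator;

public class RetryOnTimeout {
    // 'keyspace' is assumed to be an already initialised Hector Keyspace.
    static void insertWithRetry(Keyspace keyspace, String key) {
        StringSerializer ss = StringSerializer.get();
        for (int attempt = 0; attempt < 3; attempt++) {
            try {
                Mutator<String> m = HFactory.createMutator(keyspace, ss);
                // Re-applying the same column is safe, so a retry cannot corrupt data.
                m.insert(key, "MyCF", HFactory.createStringColumn("col", "value"));
                return;
            } catch (HTimedOutException e) {
                // CL replicas did not ack before rpc_timeout; back off and retry.
                try {
                    Thread.sleep(100L << attempt);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    return;
                }
            }
        }
    }
}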
  
 Cheers
  
 -
 Aaron Morton
 Freelance Developer
 @aaronmorton
 http://www.thelastpickle.com
  
 On 5/03/2012, at 11:32 PM, Tiwari, Dushyant wrote:
 
 
 Hi All,
  
 While benchmarking Cassandra I found “Mutation Dropped” messages in the logs. 
  Now I know this is a good old question. It will be really great if someone 
 can provide a check list to recover when such a thing happens. I am looking 
 for answers of the following questions  -
  
 1.   Which parameters to tune in the config files? – Especially looking 
 for heavy writes
 2.   What is the difference between TimedOutException and silently 
 dropping mutation messages while operating on a CL of QUORUM.
  
  
 Regards,
 Dushyant



Re: Issue with nodetool clearsnapshot

2012-03-05 Thread B R
Hi Aaron,

1) Since you mentioned hard links, I would like to add that our data
directory itself is a symlink. Could that be causing an issue?

2) Yes, there are 0-byte files with the same numbers
in the Keyspace1 directory:
0 Mar  4 01:33 Standard1-g-7317-Compacted
0 Mar  3 22:58 Standard1-g-7968-Compacted
0 Mar  3 23:10 Standard1-g-8778-Compacted
0 Mar  3 23:47 Standard1-g-8782-Compacted
...

I restarted the node, it went about deleting the files, and the disk
space was released. Can this be done using nodetool, without restarting?

Thanks.

On Mon, Mar 5, 2012 at 10:59 PM, aaron morton aa...@thelastpickle.comwrote:

   It seems that instead of removing the snapshot, clearsnapshot moved the
 data files from the snapshot directory to the parent directory and the size
 of the data for that keyspace has doubled.

  That is not possible; the only code there deletes files in the
 snapshot.

 Note that the snapshot contains hard links to the files in the data dir.
 Deleting / clearing the snapshot will not free the files in the data
 dir if they are still in use.

  Many of the files are looking like duplicates.

 in Keyspace1 directory
 156987786084 Jan 21 03:18 Standard1-g-7317-Data.db
 156987786084 Mar  4 01:33 Standard1-g-8850-Data.db

 Under 0.8.x files are not immediately deleted. Did the data directory
 contain zero size -Compacted files with the same number ?

 Cheers


   -
 Aaron Morton
 Freelance Developer
 @aaronmorton
 http://www.thelastpickle.com

 On 5/03/2012, at 11:50 PM, B R wrote:

 Version 0.8.9

 We run a 2 node cluster with RF=2. We ran a scrub and after that ran the
 clearsnapshot to remove the backup snapshot created by scrub. It seems that
 instead of removing the snapshot, clearsnapshot moved the data files from
 the snapshot directory to the parent directory and the size of the data for
 that keyspace has doubled. Many of the files are looking like duplicates.

 in Keyspace1 directory
 156987786084 Jan 21 03:18 Standard1-g-7317-Data.db
 156987786084 Mar  4 01:33 Standard1-g-8850-Data.db
 118211555728 Jan 31 12:50 Standard1-g-7968-Data.db
 118211555728 Mar  3 22:58 Standard1-g-8840-Data.db
 116902342895 Feb 25 02:04 Standard1-g-8832-Data.db
 116902342895 Mar  3 22:10 Standard1-g-8836-Data.db
 93788425710 Feb 21 04:20 Standard1-g-8791-Data.db
 93788425710 Mar  4 00:29 Standard1-g-8845-Data.db
 .

 Even though the nodetool ring command shows the correct data size for the
 node, the du -sh on the keyspace directory gives double the size.

 Can you guide us to proceed from this situation ?

 Thanks.





hector connection pool

2012-03-05 Thread Daning Wang
I just got this error: All host pools marked down. Retry burden pushed
out to client. in a few clients recently; the clients could not recover and
we had to restart the client applications. We are using Hector 0.8.0.3.

At that time we ran a compaction for a CF, which took several hours, and the
server was busy. But I think the client should recover once the server load
goes down.

Any bug reported about this? I searched but could not find one.

Thanks,

Daning


RE: Secondary indexes don't go away after metadata change

2012-03-05 Thread Frisch, Michael
Thank you very much for your response.  It is true that the older, previously 
existing nodes are not snapshotting the indexes that I had removed.  I'll go 
ahead and just delete those SSTables from the data directory.  They may be 
around still because they were created back when we used 0.8.

The more troubling issue is with adding new nodes to the cluster though.  It 
built indexes for column families that have had all indexes dropped weeks or 
months in the past.  It also will snapshot the index SSTables that it created.  
The index files are non-empty as well, some are hundreds of megabytes.

All nodes have the same schema, none list themselves as having the rows 
indexed.  I cannot drop the indexes via the CLI either because it says that 
they don't exist.  It's quite perplexing.

- Mike


From: aaron morton [mailto:aa...@thelastpickle.com]
Sent: Monday, March 05, 2012 3:58 AM
To: user@cassandra.apache.org
Subject: Re: Secondary indexes don't go away after metadata change

The secondary index CFs are marked as no longer required / marked as 
compacted. Under 1.x they would then be deleted reasonably quickly, and 
definitely deleted after a restart.

Is there a zero length .Compacted file there ?

Also, when adding a new node to the ring the new node will build indexes for 
the ones that supposedly don't exist any longer.  Is this supposed to happen?  
Would this have happened if I had deleted the old SSTables from the previously 
existing nodes?
Check you have a consistent schema using describe cluster in the CLI. And check 
the schema is what you think it is using show schema.
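
You can make the same check from code as well. A hedged Hector sketch that 
lists every column that still claims an index (the cluster and host names 
are placeholders):

import me.prettyprint.cassandra.service.CassandraHostConfigurator;
import me.prettyprint.hector.api.Cluster;
import me.prettyprint.hector.api.ddl.ColumnDefinition;
import me.prettyprint.hector.api.ddl.ColumnFamilyDefinition;
import me.prettyprint.hector.api.ddl.KeyspaceDefinition;
import me.prettyprint.hector.api.factory.HFactory;

public class ListIndexes {
    public static void main(String[] args) {
        Cluster cluster = HFactory.getOrCreateCluster("TestCluster",
                new CassandraHostConfigurator("node1:9160"));
        for (KeyspaceDefinition ks : cluster.describeKeyspaces()) {
            for (ColumnFamilyDefinition cf : ks.getCfDefs()) {
                for (ColumnDefinition col : cf.getColumnMetadata()) {
                    if (col.getIndexName() != null) {
                        // Any hit here means the schema still carries the index.
                        System.out.println(ks.getName() + "." + cf.getName()
                                + " -> " + col.getIndexName());
                    }
                }
            }
        }
    }
}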

Another trick is to do a snapshot. Only the files in use are included in 
the snapshot.

Hope that helps.

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 2/03/2012, at 2:53 AM, Frisch, Michael wrote:


I have a few column families that I decided to get rid of the secondary indexes 
on.  I see that there aren't any new index SSTables being created, but all of 
the old ones remain (some from as far back as September).  Is it safe to just 
delete them when the node is offline?  Should I run clean-up or scrub?

Also, when adding a new node to the ring the new node will build indexes for 
the ones that supposedly don't exist any longer.  Is this supposed to happen?  
Would this have happened if I had deleted the old SSTables from the previously 
existing nodes?

The nodes in question have either been upgraded from v0.8.1 -> v1.0.2 (scrubbed 
at this time) -> v1.0.6 or from v1.0.2 -> v1.0.6.  The secondary index was 
dropped when the nodes were version 1.0.6.  The new node added was also 1.0.6.

- Mike



Cassandra cache patterns with tiny and wide rows

2012-03-05 Thread Maciej Miklas
I've asked this question already on stackoverflow but got no answer - I
will try again:


My use case expects a heavy read load - there are two possible model design
strategies:

   1. Tiny rows with row cache: each row is small enough to fit into RAM,
   and all of its columns are cached. Read access should be fast.

   2. Wide rows with key cache: a row with a large number of columns is
   too big for the row cache, and access to a column subset requires an
   HDD seek.

As I understand it, wide rows are a good design pattern. But we would need
to disable the row cache - so what is the benefit of such a wide row (at
least for read access)?

Which approach is better, 1 or 2?
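
To make option 2 concrete, this is roughly how I read a column subset from 
a wide row with Hector (a sketch - the CF name, key and comparator are 
placeholders for my real model):

import me.prettyprint.cassandra.serializers.LongSerializer;
import me.prettyprint.cassandra.serializers.StringSerializer;
import me.prettyprint.hector.api.Keyspace;
import me.prettyprint.hector.api.beans.ColumnSlice;
import me.prettyprint.hector.api.factory.HFactory;
import me.prettyprint.hector.api.query.SliceQuery;

public class WideRowRead {
    // 'keyspace' is assumed to be an already initialised Hector Keyspace.
    static ColumnSlice<Long, String> readSlice(Keyspace keyspace, String rowKey) {
        SliceQuery<String, Long, String> q = HFactory.createSliceQuery(
                keyspace, StringSerializer.get(), LongSerializer.get(),
                StringSerializer.get());
        q.setColumnFamily("WideCF");
        q.setKey(rowKey);
        // Only this contiguous range of columns is read, not the whole row:
        // start, finish, reversed, limit.
        q.setRange(0L, 999L, false, 100);
        return q.execute().get();
    }
}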


Re: hector connection pool

2012-03-05 Thread Maciej Miklas
Have you tried changing
me.prettyprint.cassandra.service.CassandraHostConfigurator#retryDownedHostsDelayInSeconds
?

Hector will ping downed hosts every xx seconds and recover the connections.
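
Concretely, something like this (the delay, host and cluster names are just 
examples):

import me.prettyprint.cassandra.service.CassandraHostConfigurator;
import me.prettyprint.hector.api.Cluster;
import me.prettyprint.hector.api.factory.HFactory;

public class PoolRecoveryConfig {
    public static void main(String[] args) {
        CassandraHostConfigurator chc =
                new CassandraHostConfigurator("node1:9160,node2:9160");
        chc.setRetryDownedHosts(true);             // keep probing hosts marked down
        chc.setRetryDownedHostsDelayInSeconds(10); // probe every 10 seconds
        Cluster cluster = HFactory.getOrCreateCluster("TestCluster", chc);
    }
}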

Regards,
Maciej

On Mon, Mar 5, 2012 at 8:13 PM, Daning Wang dan...@netseer.com wrote:

 I just got this error: All host pools marked down. Retry burden pushed
 out to client. in a few clients recently; the clients could not recover and
 we had to restart the client applications. We are using Hector 0.8.0.3.

 At that time we ran a compaction for a CF, which took several hours, and the
 server was busy. But I think the client should recover once the server load
 goes down.

 Any bug reported about this? I searched but could not find one.

 Thanks,

 Daning




Re: Cassandra cache patterns with tiny and wide rows

2012-03-05 Thread Viktor Jevdokimov
It depends on how large the data set is - specifically the hot data
compared to the available RAM - what heavy read load means, and what
the latency requirements are.


2012/3/6 Maciej Miklas mac.mik...@googlemail.com

 I've asked this question already on stackoverflow but got no answer - I
 will try again:


 My use case expects a heavy read load - there are two possible model design
 strategies:

    1. Tiny rows with row cache: each row is small enough to fit into RAM,
    and all of its columns are cached. Read access should be fast.

    2. Wide rows with key cache: a row with a large number of columns is
    too big for the row cache, and access to a column subset requires an
    HDD seek.

 As I understand it, wide rows are a good design pattern. But we would
 need to disable the row cache - so what is the benefit of such a wide row
 (at least for read access)?

 Which approach is better, 1 or 2?



Re: running two rings on the same subnet

2012-03-05 Thread Tamar Fraenkel
Works..

But during the night my setup encountered a problem.
I have two VMs in my cluster (running on VMware ESXi).
Each VM has 1 GB memory and two virtual disks of 16 GB.
They are running on a small server with 4 CPUs (2.66 GHz) and 4 GB memory
(together with two other VMs).
I put the Cassandra data on the second disk of each machine.
The VMs are running Ubuntu 11.10 and Cassandra 1.0.7.

I left them running overnight, and this morning when I came in:
On one node Cassandra was down, and the last thing in the system.log is:

 INFO [CompactionExecutor:150] 2012-03-06 00:55:04,821 CompactionTask.java
(line 113) Compacting
[SSTableReader(path='/opt/cassandra/data/tok/tk_vertical_tag_story_indx-hc-1243-Data.db'),
SSTableReader(path='/opt/cassandra/data/tok/tk_vertical_tag_story_indx-hc-1245-Data.db'),
SSTableReader(path='/opt/cassandra/data/tok/tk_vertical_tag_story_indx-hc-1242-Data.db'),
SSTableReader(path='/opt/cassandra/data/tok/tk_vertical_tag_story_indx-hc-1244-Data.db')]
 INFO [CompactionExecutor:150] 2012-03-06 00:55:07,919 CompactionTask.java
(line 221) Compacted to
[/opt/cassandra/data/tok/tk_vertical_tag_story_indx-hc-1246-Data.db,].
 32,424,771 to 26,447,685 (~81% of original) bytes for 58,938 keys at
8.144165MB/s.  Time: 3,097ms.


The other node was using all its CPU and I had to restart it.
After that, I can see that the last lines in its system.log say that the
other node is down...

 INFO [FlushWriter:142] 2012-03-06 00:55:02,418 Memtable.java (line 246)
Writing Memtable-tk_vertical_tag_story_indx@1365852701(1122169/25154556
serialized/live bytes, 21173 ops)
 INFO [FlushWriter:142] 2012-03-06 00:55:02,742 Memtable.java (line 283)
Completed flushing
/opt/cassandra/data/tok/tk_vertical_tag_story_indx-hc-1244-Data.db (2075930
bytes)
 INFO [GossipTasks:1] 2012-03-06 08:02:18,584 Gossiper.java (line 818)
InetAddress /10.0.0.31 is now dead.

How can I trace why that happened?
Also, I brought Cassandra up on both nodes. They both spent a long time
replaying commit logs, but now they seem to be running.
Any idea how to debug or improve my setup?
Thanks,
Tamar



*Tamar Fraenkel *
Senior Software Engineer, TOK Media


ta...@tok-media.com
Tel:   +972 2 6409736
Mob:  +972 54 8356490
Fax:   +972 2 5612956





On Mon, Mar 5, 2012 at 7:30 PM, aaron morton aa...@thelastpickle.comwrote:

 Create nodes that do not share seeds, and give the clusters different
 names as a safety measure.

 Cheers

   -
 Aaron Morton
 Freelance Developer
 @aaronmorton
 http://www.thelastpickle.com

 On 6/03/2012, at 12:04 AM, Tamar Fraenkel wrote:

 I want two separate clusters.
 *Tamar Fraenkel *
 Senior Software Engineer, TOK Media



 ta...@tok-media.com
 Tel:   +972 2 6409736
 Mob:  +972 54 8356490
 Fax:   +972 2 5612956





 On Mon, Mar 5, 2012 at 12:48 PM, aaron morton aa...@thelastpickle.comwrote:

 Do you want to create two separate clusters or a single cluster with two
 data centres ?

 If it's the later, token selection is discussed here
 http://www.datastax.com/docs/1.0/install/cluster_init#token-gen-cassandra


  Moreover all tokens must be unique (even across datacenters), although -
  out of pure curiosity - I wonder what the rationale behind this is.

 Otherwise data is not evenly distributed.

  By the way, can someone enlighten me about the first line in the output
  of nodetool? Obviously it contains a token, but nothing else. It seems
  like a formatting glitch, but maybe it has a role.

 It's the exclusive lower bound token for the first node in the ring. This
 also happens to be the token for the last node in the ring.

 In your setup
 10.0.0.19 owns (85070591730234615865843651857942052864+1) to 0
 10.0.0.28 owns  (0 + 1) to 85070591730234615865843651857942052864

 (does not imply primary replica, just used to map keys to nodes.)



   -
 Aaron Morton
 Freelance Developer
 @aaronmorton
 http://www.thelastpickle.com

 On 5/03/2012, at 11:38 PM, Hontvári József Levente wrote:

  You have to use PropertyFileSnitch and NetworkTopologyStrategy to create
 a multi-datacenter setup with two circles. You can start reading from this
 page:

 http://www.datastax.com/docs/1.0/cluster_architecture/replication#about-replica-placement-strategy

  Moreover all tokens must be unique (even across datacenters), although -
  out of pure curiosity - I wonder what the rationale behind this is.

  By the way, can someone enlighten me about the first line in the output
  of nodetool? Obviously it contains a token, but nothing else. It seems
  like a formatting glitch, but maybe it has a role.

 On 2012.03.05. 11:06, Tamar Fraenkel wrote:

 Hi!
 I have a Cassandra cluster with two nodes:

  nodetool ring -h localhost
  Address     DC          Rack   Status  State   Load       Owns     Token
                                                                     85070591730234615865843651857942052864
  10.0.0.19   datacenter1 rack1  Up      Normal  488.74 KB  50.00%   0
  10.0.0.28   datacenter1 rack1  Up      Normal  504.63 KB  50.00%