Re: Read time gets worse during dynamic snitch reset
On Tue, Apr 12, 2011 at 12:26 AM, aaron morton aa...@thelastpickle.com wrote:

> The reset interval clears the latency tracked for each node so a bad node will be read from again. The scores for each node are then updated every 100ms (default) using the last 100 responses from a node. How long does the bad performance last for?

Only a few seconds, but there are a lot of read requests during this time.

> What CL are you reading at? At QUORUM with RF 4 the read request will be sent to 3 nodes, ordered by proximity and wellness according to the dynamic snitch. (For background, see the recent discussion on the dynamic snitch: http://www.mail-archive.com/user@cassandra.apache.org/msg12089.html)

I am reading at CL ONE, with read_repair_chance=0.33, RackInferringSnitch, and keys_cached = rows_cached = 0.

> You can take a look at the weights and timings used by the DynamicSnitch in JConsole under o.a.c.db.DynamicEndpointSnitch. Also at DEBUG log level you will be able to see which nodes the request is sent to.

Everything looks OK. The weights are around 3 for the nodes in the same data center and around 5 for the others. I will turn on DEBUG logging to see if I can find more info.

> My guess is the DynamicSnitch is doing the right thing and the slowdown is a node with a problem getting back into the list of nodes used for your read. It's then moved down the list as its bad performance is noticed.

Looking at the DynamicSnitch MBean I don't see any problems with any of the nodes. My guess is that during the reset time there are reads that are sent to the other data center.

> Hope that helps
> Aaron

Shimi

> On 12 Apr 2011, at 01:28, shimi wrote:
>
>> I finally upgraded from 0.6.x to 0.7.4. The nodes have been running the new version for several days across 2 data centers. I noticed that the read time on some of the nodes increases by 50-60x every ten minutes. There was no indication in the logs of anything happening at the same time. The only thing I know of that runs every 10 minutes is the dynamic snitch reset, so I changed dynamic_snitch_reset_interval_in_ms to 20 minutes, and now I have the problem once every 20 minutes.
>>
>> I am running all nodes with:
>>
>> replica_placement_strategy: org.apache.cassandra.locator.NetworkTopologyStrategy
>> strategy_options:
>>   DC1 : 2
>>   DC2 : 2
>> replication_factor: 4
>>
>> (DC1 and DC2 are taken from the IPs.) Is anyone familiar with this kind of behavior?
>>
>> Shimi
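The reset behaviour described in this thread can be sketched in a few lines. This is a toy model, not Cassandra's actual implementation: latency samples are kept per node (last 100 responses), a node's score is its average latency, and a reset clears all history, so a slow node momentarily scores as well as any other and receives reads again until new samples accumulate.

```python
from collections import defaultdict, deque

class DynamicSnitchSketch:
    """Toy model of the reset behaviour (not Cassandra's code): keep the
    last 100 latency samples per node, score a node by its average
    latency, and let reset() clear all history."""
    WINDOW = 100

    def __init__(self):
        self.samples = defaultdict(lambda: deque(maxlen=self.WINDOW))

    def record(self, node, latency_ms):
        self.samples[node].append(latency_ms)

    def score(self, node):
        s = self.samples[node]
        return sum(s) / len(s) if s else 0.0  # no history: best possible score

    def reset(self):
        self.samples.clear()

snitch = DynamicSnitchSketch()
for _ in range(100):
    snitch.record("fast-node", 2.0)
    snitch.record("slow-node", 120.0)

print(snitch.score("slow-node"))  # 120.0 -> deprioritised for reads
snitch.reset()
print(snitch.score("slow-node"))  # 0.0 -> read from again until new samples arrive
```

This illustrates why a few seconds of degraded reads right after each reset interval is consistent with a genuinely slow node being retried.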
Re: Cassandra constantly logs nodes which don't exist anymore
2011/4/12 aaron morton aa...@thelastpickle.com:

> In JConsole go to o.a.c.db.HintedHandoffManager and try the deleteHintsForEndpoint operation. This is also called when a token is removed from the ring, or when a node is decommissioned. What process did you use to reconfigure the cluster?

I decommissioned a node, then restarted all nodes in the cluster step by step. When I repeat the restart operation twice, this log entry disappears.
Re: problems getting started with Cassandra Ruby
Hello Mark,

Disable verbose mode (-w or $VERBOSE) in Ruby. Or you can clean up the Ruby thrift library yourself.

2011/4/12 Mark Lilback mlilb...@stat.wvu.edu:

> I'm trying to connect to Cassandra from a Ruby script. I'm using rvm, and made a clean install of Ruby 1.9.2 and then did gem install cassandra. When I run a script that just contains require 'cassandra/0.7', I get the output below. Any suggestion on what I need to do to get rid of these warnings?
>
> /Users/admin/.rvm/gems/ruby-1.9.2-p180/gems/thrift-0.5.0/lib/thrift/server/nonblocking_server.rb:80: warning: `*' interpreted as argument prefix
> /Users/admin/.rvm/gems/ruby-1.9.2-p180/gems/thrift-0.5.0/lib/thrift_native.bundle: warning: method redefined; discarding old skip
> /Users/admin/.rvm/gems/ruby-1.9.2-p180/gems/thrift-0.5.0/lib/thrift/protocol/base_protocol.rb:235: warning: previous definition of skip was here
> (snip)
>
> --
> Mark Lilback
> West Virginia University Department of Statistics
> mlilb...@stat.wvu.edu

--
w3m
Questions about the nodetool ring.
I have 3 Cassandra 0.7.4 nodes in a cluster, and I get these ring stats:

[root@yun-phy2 apache-cassandra-0.7.4]# bin/nodetool -h 192.168.1.28 -p 8090 ring
Address       Status  State   Load       Owns    Token
                                                 109028275973926493413574716008500203721
192.168.1.25  Up      Normal  157.25 MB  69.92%  57856537434773737201679995572503935972
192.168.1.27  Up      Normal  201.71 MB  24.28%  99165710459060760249270263771474737125
192.168.1.28  Up      Normal  68.12 MB   5.80%   109028275973926493413574716008500203721

The load and owns vary on each node; is this normal? And is there a way to balance the three nodes? Thanks.

--
Dikang Gu
0086 - 18611140205
Re: Questions about the nodetool ring.
This is normal when you just add single nodes. When no token is assigned, the new node takes a portion of the ring from the most heavily loaded node. As a consequence, the nodes will be out of balance. In other words, if you doubled the number of nodes you would not have this problem.

The best way to rebalance the cluster is to generate new tokens and use the nodetool move <new-token> command to rebalance the nodes, one at a time. After rebalancing you can run cleanup so the nodes get rid of data they are no longer responsible for.

Links:
http://wiki.apache.org/cassandra/Operations#Range_changes
http://wiki.apache.org/cassandra/Operations#Moving_or_Removing_nodes
http://www.datastax.com/docs/0.7/operations/clustering#adding-capacity
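Generating evenly spaced tokens can be done with a short script. This is a sketch that assumes the default RandomPartitioner, whose token space is [0, 2**127): token i for an N-node ring is i * 2**127 / N.

```python
# Evenly spaced initial tokens for a RandomPartitioner ring
# (token space is the integers [0, 2**127)).
def balanced_tokens(node_count):
    return [i * 2**127 // node_count for i in range(node_count)]

for i, token in enumerate(balanced_tokens(3)):
    print(f"node {i}: {token}")
# node 0: 0
# node 1: 56713727820156410577229101238628035242
# node 2: 113427455640312821154458202477256070485
```

For 3 nodes this reproduces the tokens quoted elsewhere in this thread; each token is then assigned with nodetool move, one node at a time.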
Re: Questions about the nodetool ring.
The 3 nodes were added to the cluster at the same time, so I'm not sure why the data vary.

I calculated the tokens and got:

node 0: 0
node 1: 56713727820156410577229101238628035242
node 2: 113427455640312821154458202477256070485

So should I set these tokens on the three nodes? And while I execute the nodetool move commands, can the Cassandra servers serve front-end requests at the same time? Is the data safe? Thanks.

--
Dikang Gu
0086 - 18611140205
Re: Questions about the nodetool ring.
After the nodetool move, I got this:

[root@server3 apache-cassandra-0.7.4]# bin/nodetool -h 10.18.101.213 ring
Address        Status  State   Load      Owns    Token
                                                 113427455640312821154458202477256070485
10.18.101.211  ?       Normal  82.31 MB  33.33%  0
10.18.101.212  ?       Normal  84.24 MB  33.33%  56713727820156410577229101238628035242
10.18.101.213  Up      Normal  54.44 MB  33.33%  113427455640312821154458202477256070485

Is this correct? Why is the status "?"? Thanks.

--
Dikang Gu
0086 - 18611140205
Re: Timeout during stress test
A couple of hits here, one from Jonathan and some previous discussions on the user list: http://www.google.co.nz/search?q=cassandra+iostat

Same here for cfhistograms: http://www.google.co.nz/search?q=cassandra+cfhistograms

cfhistograms includes information on the number of sstables read during recent requests. As your initial cfstats showed 236 sstables, I thought it may be useful to see if a high number of sstables is being accessed per read.

70 requests per second is slow against a 6 node cluster where each node has 12 cores and 96GB of RAM. Something is not right.

Aaron

On 12 Apr 2011, at 17:11, mcasandra wrote:

> aaron morton wrote:
>> You'll need to provide more information; from the TP stats the read stage could not keep up. If the node is not CPU bound then it is probably IO bound. What sort of read? How many columns was it asking for? How many columns do the rows have? Was the test asking for different rows? How many requests per second did it get up to? What do the io stats look like? What does nodetool cfhistograms say?
>
> It's a simple read of 1M rows with one column of average size 200K. Got around 70 requests per second. Not sure how to interpret the iostat output with things happening asynchronously in Cassandra. Can you give a little description of how to interpret it? I have posted the output of cfstats. Does cfhistograms provide better info?
>
> --
> View this message in context: http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Timeout-during-stress-test-tp6262430p6263859.html
> Sent from the cassandra-u...@incubator.apache.org mailing list archive at Nabble.com.
Re: Questions about the nodetool ring.
When you do a move, the node is decommissioned and bootstrapped. During the autobootstrap process the node will not receive reads until bootstrapping is complete. I assume the node will also be unavailable during the decommission phase; someone correct me if I'm wrong.

The ring distribution looks better now. The "?" I get all the time too, and if you run ring against different hosts, the question marks probably appear in different places. I'm not sure if it means there is a problem; I haven't taken those question marks too seriously.
Cassandra 0.6.3 error: Connection refused to host: 127.0.0.1
Hi All,

I have migrated my server to CentOS 5.5. Everything is up, but I'm facing a little issue. I have two Cassandra nodes:

10.0.0.4  cassandra2
10.0.0.3  cassandra1

I am using OpenJDK with Cassandra. We are facing the following error when using nodetool, but only on one server (cassandra2). The hosts file is also pasted below. Please let me know how I can fix this issue.

sh nodetool -h 10.0.0.3 ring
Error connecting to remote JMX agent!
java.rmi.ConnectException: Connection refused to host: 127.0.0.1; nested exception is:

sh nodetool -h 10.0.0.4 ring
Address   Status  Load      Range                                     Ring
                            129069858893052904163677015069685590304
10.0.0.3  Up      10.02 GB  104465788091875410298027059042850717029  |--|
10.0.0.4  Up      9.98 GB   129069858893052904163677015069685590304  |--|

Hosts file:

# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1   localhost.localdomain localhost
10.0.0.4    cassandra2.pringit.com
#::1        localhost6.localdomain6 localhost6

--
S.Ali Ahsan
Senior System Engineer
e-Business (Pvt) Ltd
49-C Jail Road, Lahore, P.O. Box 676, Lahore 54000, Pakistan
Tel: +92 (0)42 3758 7140 Ext. 128
Mobile: +92 (0)345 831 8769
Fax: +92 (0)42 3758 0027
Email: ali.ah...@panasiangroup.com
www.ebusiness-pg.com
www.panasiangroup.com
repair never completes with finished successfully
There are a few other threads related to problems with nodetool repair in 0.7.4. However, I'm not seeing any errors; I'm just never getting a message that the repair completed successfully.

In my production and test clusters (with just a few MB of data) the nodetool repair prompt never returns, and the last entry in cassandra.log is always something like:

#TreeRequest manual-repair-f739ca7a-bef8-4683-b249-09105f6719d9, /10.46.108.102, (DFS,main) completed successfully: 1 outstanding

But I don't see a message, even hours later, that the 1 outstanding request finished successfully. Anyone else experiencing this? These are physical server nodes in local data centers, not EC2.
Re: Read time gets worse during dynamic snitch reset
Something feels odd. From Peter's nice write-up of the dynamic snitch (http://www.mail-archive.com/user@cassandra.apache.org/msg12092.html), the RackInferringSnitch (and the PropertyFileSnitch) derive from AbstractNetworkTopologySnitch and should...

> In the case of the NetworkTopologyStrategy, it inherits the implementation in AbstractNetworkTopologySnitch which sorts by AbstractNetworkTopologySnitch.compareEndpoints(), which:
> (1) Always prefers itself to any other node. So "myself" is always closest, no matter what.
> (2) Else, always prefers a node in the same rack to a node in a different rack.
> (3) Else, always prefers a node in the same DC to a node in a different DC.

AFAIK the (data) request should be going to the local DC even after the DynamicSnitch has reset the scores, because the underlying RackInferringSnitch should prefer local nodes.

Just for fun, check that the rack and DC assignments are what you thought, using the operations on the o.a.c.db.EndpointSnitchInfo bean in JConsole. Pass in the IP address of the nodes in each DC. If possible, can you provide some info on the IPs in each DC?

Aaron
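The three proximity rules quoted above can be expressed as a sort key. This is an illustrative sketch only, not Cassandra's code; the topology dict mapping each address to a (dc, rack) pair is a hypothetical stand-in for what a snitch such as RackInferringSnitch derives from the address octets.

```python
# Sketch of the proximity ordering described above:
# (1) prefer myself, (2) then my rack, (3) then my data center.
def sort_by_proximity(me, endpoints, topology):
    my_dc, my_rack = topology[me]

    def key(ep):
        dc, rack = topology[ep]
        return (
            ep != me,                               # (1) myself first
            not (dc == my_dc and rack == my_rack),  # (2) then same rack
            dc != my_dc,                            # (3) then same DC
        )

    return sorted(endpoints, key=key)

topology = {
    "10.0.1.1": ("DC1", "rack1"),
    "10.0.1.2": ("DC1", "rack1"),
    "10.0.2.1": ("DC1", "rack2"),
    "10.1.1.1": ("DC2", "rack1"),
}
print(sort_by_proximity("10.0.1.1", list(topology), topology))
# ['10.0.1.1', '10.0.1.2', '10.0.2.1', '10.1.1.1']
```

Under this ordering the remote-DC node always sorts last, which is why reads landing in the other data center after a snitch reset would be surprising.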
Re: repair never completes with finished successfully
I've seen this. To fix it, try a nodetool compact and then a repair.

--
Karl
Re: Strange readRepairChance in server logs
Bug in the CLI; created / fixed: https://issues.apache.org/jira/browse/CASSANDRA-2458. Use 70 for now.

Thanks
Aaron

On 12 Apr 2011, at 20:46, Héctor Izquierdo Seliva wrote:

> Hi everyone. I've changed the read repair chance of one of my column families from cassandra-cli with the following entry:
>
> update column family cf with read_repair_chance = 0.7
>
> I expected to see readRepairChance=0.7 in the server log. Instead I saw readRepairChance=0.006999. Should I use read_repair_chance = 70 instead of 0.7?
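For what it's worth, the odd logged value is consistent with the CLI bug: the input appears to be treated as a percentage and divided by 100 before being stored, so 0.7 becomes roughly 0.007, which floating point can render as 0.006999... (the value seen in the server log). A minimal sketch of that arithmetic:

```python
# Sketch of the arithmetic behind the CLI bug: read_repair_chance input
# treated as a percentage and divided by 100 before storage.
requested = 0.7
stored = requested / 100  # the buggy percentage conversion
assert 0.0069 < stored < 0.0071  # ~0.007, logged as 0.006999...
```

This also explains why passing 70 is the suggested workaround until the fix ships: 70 / 100 gives the intended 0.7.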
Re: Questions about the nodetool ring.
If you are seeing different views of the ring from different nodes, you may have some sickness: http://www.datastax.com/docs/0.7/troubleshooting/index#view-of-ring-differs-between-some-nodes

The "?" in the ring output happens when one node does not know if the other is alive or dead. This could be due to the corrupt gossip state described in the link above.

During a move, the node will decommission and stop taking requests for the range it was responsible for, but other nodes in the cluster will take its place. Once it starts bootstrapping it will start accepting writes but not reads. The cluster stays online for all token ranges.

Dikang, did you allow the first move to complete before starting the second?

Aaron
Re: repair never completes with finished successfully
There is no "Repair session" message either. It just starts with a message like:

INFO [manual-repair-2af33a51-f46a-4ba2-b1fb-ead5159dc723] 2011-04-10 14:00:59,051 AntiEntropyService.java (line 770) Waiting for repair requests: [#TreeRequest manual-repair-2af33a51-f46a-4ba2-b1fb-ead5159dc723, /10.46.108.101, (DFS,main), #TreeRequest manual-repair-2af33a51-f46a-4ba2-b1fb-ead5159dc723, /10.47.108.100, (DFS,main), #TreeRequest manual-repair-2af33a51-f46a-4ba2-b1fb-ead5159dc723, /10.47.108.102, (DFS,main), #TreeRequest manual-repair-2af33a51-f46a-4ba2-b1fb-ead5159dc723, /10.47.108.101, (DFS,main)]

netstats:

Mode: Normal
Not sending any streams.
Not receiving any streams.
Pool Name   Active  Pending  Completed
Commands    n/a     0        150846
Responses   n/a     0        443183

One node in our cluster still has unreadable rows, where the reads trip up every time for certain sstables (you've probably seen my earlier threads regarding that). My suspicion is that the bloom filter read on the node with the corrupt sstables never reports back to the repair, causing it to hang.

What would be great is a scrub tool that ignores unreadable/unserializable rows! :)

On Apr 12, 2011, at 2:15 PM, aaron morton wrote:

> Do you see a message starting "Repair session" and ending with "completed successfully"? Or do you see any streaming activity using nodetool netstats? Repair can hang if a neighbour dies and fails to send a requested stream. It will time out after 24 hours (I think).
>
> Aaron
Re: Strange readRepairChance in server logs
Thanks Aaron!
Cassandra monitoring tool
Hi everyone. Looking for ways to monitor Cassandra with Zabbix, I could not find anything that was really usable, until I found mention of a nice class by smeet. I have based my modification on his work and now I give it back to the community. Here's the project URL: http://code.google.com/p/simple-cassandra-monitoring/

It lets you get statistics for any Keyspace/ColumnFamily you want. To start it, just build the jar and launch it using your Cassandra installation's lib folder as the classpath. The first parameter is the node host name. The second parameter is a comma separated list of KS:CF values. For example: java -cp blablabla localhost ks1:cf1,ks1:cf2. Then point curl to http://localhost:9090/ks1/cf1 and some basic stats will be displayed. You can also point to http://localhost:9090/nodeinfo to get some info about the server.

If you have any suggestion or improvement you would like to see, please contact me and I will be glad to work on it. Right now it's a bit rough, but it gets the job done. Thanks for your time!
quick repair tool question
Does a repair just compare the existing data from sstables on the node being repaired, or will it figure out which data this node should have and copy it in?

I'm trying to refresh all the data for a given node (without reassigning the token), starting with an emptied-out data directory. I tried nodetool move, but if I give it the same token it was previously assigned, it doesn't seem to trigger a decommission/bootstrap. Thanks.
Re: quick repair tool question
I think I answered the question myself. The data is streaming in from other replicas even though the node's data dir was emptied out (system dir was left alone). I'm not sure if this is the kosher way to rebuild the sstable data, but it seemed to work. /var/lib/cassandra/data # /opt/cassandra/bin/nodetool -h $HOSTNAME -p 35014 netstats Mode: Normal Not sending any streams. Streaming from: /10.46.108.100 DFS: /var/lib/cassandra/data/DFS/main-f-85-Data.db/(101772144,192460041),(192460041,267088244) progress=0/165316100 - 0% DFS: /var/lib/cassandra/data/DFS/main-f-86-Data.db/(118410757,194489915),(194489915,247653739) progress=0/129242982 - 0% DFS: /var/lib/cassandra/data/DFS/main-f-40-Data.db/(4823893695,4850323665),(4850323665,7818579650) progress=0/2994685955 - 0% DFS: /var/lib/cassandra/data/DFS/main-f-89-Data.db/(0,707948),(707948,2011040) progress=0/2011040 - 0% DFS: /var/lib/cassandra/data/DFS/main-f-70-Data.db/(778069440,1015544852),(1015544852,1200443249) progress=0/422373809 - 0% DFS: /var/lib/cassandra/data/DFS/main-f-71-Data.db/(119366025,132069485),(132069485,156787816) progress=0/37421791 - 0% Streaming from: /10.47.108.100 DFS: /var/lib/cassandra/data/DFS/main-f-365-Data.db/(0,24748050),(126473995,170409694) progress=0/68683749 - 0% DFS: /var/lib/cassandra/data/DFS/main-f-367-Data.db/(0,935041),(935041,2238133) progress=0/2238133 - 0% DFS: /var/lib/cassandra/data/DFS/main-f-366-Data.db/(0,4608808),(37713613,46884920) progress=0/13780115 - 0% DFS: /var/lib/cassandra/data/DFS/main-f-242-Data.db/(0,1057203157),(3307900143,4339490352) progress=0/2088793366 - 0% DFS: /var/lib/cassandra/data/DFS/main-f-352-Data.db/(0,19422069),(81246761,122537002) progress=0/60712310 - 0% DFS: /var/lib/cassandra/data/DFS/main-f-225-Data.db/(0,1580865981),(4540941750,6024843721) progress=0/3064767952 - 0% DFS: /var/lib/cassandra/data/DFS/main-f-349-Data.db/(0,21720053),(54115405,71716716) progress=0/39321364 - 0% DFS: 
/var/lib/cassandra/data/DFS/main-f-364-Data.db/(0,72606213),(175419693,238159626) progress=0/135346146 - 0%
DFS: /var/lib/cassandra/data/DFS/main-f-363-Data.db/(0,1184983783),(3458591846,4556646617) progress=0/2283038554 - 0%
DFS: /var/lib/cassandra/data/DFS/main-f-368-Data.db/(0,756228),(756228,1626647) progress=0/1626647 - 0%
DFS: /var/lib/cassandra/data/DFS/main-f-361-Data.db/(48074007,78009236) progress=0/29935229 - 0%
DFS: /var/lib/cassandra/data/DFS/main-f-226-Data.db/(0,3111952321),(8592898278,11484622800) progress=0/6003676843 - 0%
Pool Name  Active  Pending  Completed
Commands   n/a     0        5765
Responses  n/a     0        9811
Cassandra 2 DC deployment
Hi experts, We are planning to deploy Cassandra in 2 datacenters. Let's assume there are 3 nodes, RF=3: 2 nodes in one DC and 1 node in the 2nd DC. Under normal operations we would read and write at QUORUM. What we want, though, is that if we lose the datacenter which has 2 nodes, DC1 in this case, we downgrade our consistency to ONE. Basically I am saying that whenever there is a partition, prefer availability over consistency. To do this we plan to catch UnavailableException and take corrective action: try QUORUM under normal circumstances, and if unavailable, try ONE. My questions: Do you guys see any flaws with this approach? What happens when DC1 comes back up and we start reading/writing at QUORUM again? Will we read stale data in this case? Thanks -Raj
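The try-QUORUM-then-fall-back-to-ONE idea can be sketched in client-agnostic Python. The names `UnavailableException`, `read_at`, `QUORUM`, and `ONE` below are stand-ins for whatever your driver (pycassa, Hector, Pelops, ...) actually provides, not a real API:

```python
# Sketch of the "QUORUM, fall back to ONE" read described above.
QUORUM, ONE = "QUORUM", "ONE"

class UnavailableException(Exception):
    """Raised when too few replicas are alive to satisfy the consistency level."""

def read_with_fallback(read_at, key):
    try:
        # Normal case: prefer consistency.
        return read_at(key, QUORUM)
    except UnavailableException:
        # Partition (e.g. the two-node DC is down): prefer availability,
        # accepting possibly stale data.
        return read_at(key, ONE)

# Toy backend simulating the DC1 outage: QUORUM fails, ONE succeeds.
def flaky_read(key, consistency):
    if consistency == QUORUM:
        raise UnavailableException("2 of 3 replicas unreachable")
    return {"key": key, "served_at": consistency}

print(read_with_fallback(flaky_read, "user:42"))
```

The same wrapper works for writes; the trade-off is exactly the staleness question raised above, since a write accepted at ONE during the partition is only on one replica until hinted handoff or read repair catches up.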
Re: Cassandra 2 DC deployment
When the down data center comes back up, the QUORUM reads will trigger read repair, so you will get valid data. Besides that, hinted handoff will take care of getting data replicated to a previously down node. Your example is a little unrealistic because you could theoretically lose the DC with only one node, in which case CL.ONE would work every time. But if you have more than one node, you have to decide whether your application can tolerate getting NULL for a read if the write hasn't propagated from the responsible node to the replica. Disclaimer: I'm a Cassandra novice.
Re: cassandra 0.6.3 error Connection refused to host: 127.0.0.1;
Can anyone please help? On 04/12/2011 04:07 PM, Ali Ahsan wrote: Hi All, I have migrated my server to CentOS 5.5. Everything is up, but I am facing a small issue. I have two Cassandra nodes: 10.0.0.4 cassandra2, 10.0.0.3 cassandra1. I am using OpenJDK with Cassandra. We are facing the following error when using nodetool, but only on one server, cassandra2. The hosts file is also pasted below. Please let me know how I can fix this issue.
sh nodetool -h 10.0.0.3 ring
Error connecting to remote JMX agent! java.rmi.ConnectException: Connection refused to host: 127.0.0.1; nested exception is: ---
sh nodetool -h 10.0.0.4 ring
Address   Status  Load      Range                                    Ring
                            129069858893052904163677015069685590304
10.0.0.3  Up      10.02 GB  104465788091875410298027059042850717029  |--|
10.0.0.4  Up      9.98 GB   129069858893052904163677015069685590304  |--|
Hosts file
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1 localhost.localdomain localhost
10.0.0.4 cassandra2.pringit.com
#::1 localhost6.localdomain6 localhost6
-- S.Ali Ahsan Senior System Engineer e-Business (Pvt) Ltd 49-C Jail Road, Lahore, P.O. Box 676 Lahore 54000, Pakistan Tel: +92 (0)42 3758 7140 Ext. 128 Mobile: +92 (0)345 831 8769 Fax: +92 (0)42 3758 0027 Email: ali.ah...@panasiangroup.com www.ebusiness-pg.com www.panasiangroup.com Confidentiality: This e-mail and any attachments may be confidential and/or privileged. If you are not a named recipient, please notify the sender immediately and do not disclose the contents to another person, use it for any purpose, or store or copy the information in any medium. Internet communications cannot be guaranteed to be timely, secure, error- or virus-free. We do not accept liability for any errors or omissions.
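For background on the error above (an assumption worth checking, not a confirmed diagnosis for this thread): nodetool talks to Cassandra over JMX/RMI, and "Connection refused to host: 127.0.0.1" often means the RMI stub is advertising the loopback address instead of the node's real IP. A commonly suggested workaround is to pin the advertised hostname in the JVM options used to start Cassandra:

```shell
# In the script that builds Cassandra's startup JVM options
# (cassandra.in.sh / cassandra-env.sh, depending on version).
# Use the node's own reachable IP, e.g. 10.0.0.3 for cassandra1.
JVM_OPTS="$JVM_OPTS -Djava.rmi.server.hostname=10.0.0.3"
```

Also double-check that the hosts file maps each node's hostname to its real IP rather than to 127.0.0.1.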
Re: Cassandra monitoring tool
Thanks for sharing this info. I am getting the following error; can you please be more specific about how I can run this? java -cp /home/ali/apache-cassandra-0.6.3/lib/simple-cassandra-monitoring-1.0.jar 127.0.0.1 ks1:cf1,ks1:cf2 Exception in thread main java.lang.NoClassDefFoundError: 127/0/0/1 Caused by: java.lang.ClassNotFoundException: 127.0.0.1 at java.net.URLClassLoader$1.run(URLClassLoader.java:217) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:205) at java.lang.ClassLoader.loadClass(ClassLoader.java:321) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294) at java.lang.ClassLoader.loadClass(ClassLoader.java:266) Could not find the main class: 127.0.0.1. Program will exit. OR java -jar /home/ali/apache-cassandra-0.6.3/lib/simple-cassandra-monitoring-1.0.jar localhost ks1:cf1,ks1:cf2 Failed to load Main-Class manifest attribute from /home/ali/apache-cassandra-0.6.3/lib/simple-cassandra-monitoring-1.0.jar On 04/12/2011 07:26 PM, Héctor Izquierdo Seliva wrote: Hi everyone. Looking for ways to monitor Cassandra with Zabbix, I could not find anything that was really usable, until I found mention of a nice class by smeet. I have based my modification on his work and now I give it back to the community. Here's the project URL: http://code.google.com/p/simple-cassandra-monitoring/ It lets you get statistics for any Keyspace/ColumnFamily you want. To start it, just build the jar and launch it using your Cassandra installation's lib folder as the classpath. The first parameter is the node host name. The second parameter is a comma-separated list of KS:CF values. For example: java -cp blablabla localhost ks1:cf1,ks1:cf2. Then point curl at http://localhost:9090/ks1/cf1 and some basic stats will be displayed. You can also point at http://localhost:9090/nodeinfo to get some info about the server.
If you have any suggestion or improvement you would like to see, please contact me and I will be glad to work on it. Right now it's a bit rough, but it gets the job done. Thanks for your time!
pycassa timeouts resolved by killing a random node in the ring
Interesting issue this morning. My apps started throwing a bunch of pycassa timeouts all of a sudden. The ring looked perfect. No load issues anywhere, and no errors in the logs. The site was basically down, so I got desperate and whacked a random node in the ring. As soon as gossip saw it go down, the timeouts went away. Thinking that was kinda crazy, I started the node back up. As soon as it rejoined the ring, pycassa started timing out again. I then killed another random node, far away from the first node I killed, and the timeouts stopped again. Started it back up, and the timeouts started again when it rejoined the ring. Repeated this process once more just to make sure I wasn't insane, and the same result happened. Killing any single node, anywhere in the ring, fixes my timeouts. Actively able to repro this. I am having to just keep one node down right now so the site doesn't break. Desperate for any suggestions or advice on this. Using pycassa 1.0.7. Timeout is set to 15 seconds, with 3 retries. Reads and writes are in quorum. 27 nodes in the ring, with an RF of 3. Thanks, Jason
Re: Timeout during stress test
Here is what cfhistograms looks like. I don't really understand what this means; I will try to read up on it. I also see %util in iostat continuously at 90%. Not sure if this is caused by extra reads by Cassandra. It seems unusual.
[root@dsdb4 ~]# nodetool -h `hostname` cfhistograms StressKeyspace StressStandard
StressKeyspace/StressStandard histograms
Offset  SSTables  Write Latency  Read Latency  Row Size  Column Count
1       45720     0              0             0         498857
2       0         0              0             0         0
3       0         0              0             0         0
4       0         0              0             0         0
5       0         0              0             0         0
6       0         0              1             0         0
7       0         0              1             0         0
8       0         0              0             0         0
10      0         0              0             0         0
12      0         0              0             0         0
14      0         0              0             0         0
17      0         1              0             0         0
20      0         2              0             0         0
24      0         1              0             0         0
29      0         6              0             0         0
35      0         68             0             0         0
42      0         509            0             0         0
50      0         1128           0             0         0
60      0         1449           0             0         0
72      0         789            0             0         0
86      0         400            0             0         0
103     0         319            0             0         0
124     0         388            0             0         0
149     0         456            0             0         0
179     0         519            0             0         0
215     0         262            0             0         0
258     0         194            0             0         0
310     0         48             0             0         0
372     0         5              0             0         0
446     0         1              0             0         0
535     0         0              0             0         0
642     0         0              0             0         0
770     0         1              0             0         0
924     0         1              0             0         0
1109    0         0              0             0         0
1331    0         1              0             0         0
1597    0         0              0             0
1916    1         0              0             0
2299    0         0              0             0
2759    0         0              0             0
3311    0         0              0             0
3973    1         0              0             0
4768    5         0              0             0
5722    19        0              0             0
6866    46        0              0             0
8239    102       0              0             0
9887    226       0              0             0
11864   368       0
RE: batch_mutate failed: out of sequence response
[I wrote this Apr 10, 2011 at 12:09 but my message seems to have gotten lost along the way.] I use Pelops (the 1.0-0.7.x build from the GitHub Maven repo) and have occasionally seen this message (under load or during GC). I have a test app running in two separate single-threaded processes doing a slow trickle insert into a single Cassandra 0.7.4 node, all on the same box (Mac OS X). This had been running off and on for over a week with no exceptions, and I just saw this same error about two hours ago. Both client processes experienced it at about the same time, and it seemed related to a GC/compaction on the Cassandra instance. I'm guessing that it is either actually a read timeout on the clients, or (less likely) that somehow the Cassandra instance mixed up the two responses. On Fri, Apr 8 2011 at 07:28, Dan Washusen d...@reactive.org wrote: Dan Hendry mentioned that he sees these errors. Is he also using Pelops? From his comment about retrying I'd assume not... -- Dan Washusen On Thursday, 7 April 2011 at 7:39 PM, Héctor Izquierdo Seliva wrote: On Wed, 06-04-2011 at 21:04 -0500, Jonathan Ellis wrote: "out of sequence response" is Thrift's way of saying "I got a response for request Y when I expected request X". My money is on using a single connection from multiple threads. Don't do that. I'm not using Thrift directly, and my application is single-threaded, so I guess this is Pelops' fault somehow. Since I managed to tame memory consumption the problem has not appeared again, but it always happened during a stop-the-world GC. Could it be that the message was sent instead of being dropped by the server when the client assumed it had timed out?
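As Jonathan's quoted advice says, the usual culprit is two threads interleaving request/response pairs on one shared Thrift connection. One common fix is one connection per thread, which `threading.local` makes easy. A minimal sketch; `make_connection` is a hypothetical stand-in for your client's real connect call:

```python
import threading

_local = threading.local()

def make_connection():
    # Stand-in for opening a real Thrift socket; here just a unique object.
    return object()

def get_connection():
    # Lazily create, then reuse, a private connection for the calling thread.
    if not hasattr(_local, "conn"):
        _local.conn = make_connection()
    return _local.conn

# Demonstration: four threads each get their own connection, so request and
# response ordering can never be interleaved across threads.
seen = []
def worker():
    seen.append(get_connection())

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(len({id(c) for c in seen}))  # 4 distinct connections
```

Connection pools in the real client libraries (pycassa's `ConnectionPool`, Pelops' pools) exist to solve the same problem; the point is simply that a single raw connection must never be shared across threads.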
Re: Cassandra monitoring tool
On Tue, 12-04-2011 at 21:24 +0500, Ali Ahsan wrote: Thanks for sharing this info. I am getting the following error; can you please be more specific about how I can run this? Hi Ali. You should run it like this: java -cp /home/ali/apache-cassandra-0.6.3/lib/* com.google.code.scm.CassandraMonitoring localhost ks1:cf1,ks2:cf2,etc I forgot to mention it has been coded against 0.7.x, and I'm not sure it will work on 0.6.x. I'll try to add support for both 0.6.x and the new 0.8.x version as soon as possible.
forced index creation?
hi, just deployed a new keyspace on 0.7.4 and added the following column family: create column family applications with comparator=UTF8Type and column_metadata=[ {column_name: app_name, validation_class: UTF8Type}, {column_name: app_uri, validation_class: UTF8Type, index_type: KEYS}, {column_name: app_id, validation_class: UTF8Type} ]; I then proceeded to add two new rows of data to it. When I try to query the secondary index on app_uri, my query with phpcassa fails. On the same CF in a different cluster it works fine. When comparing the CF between clusters, I see there's a difference: a "Built indexes:" line shows up when I run "describe keyspace foobar;". Column Metadata: Column Name: app_name (app_name) Validation Class: org.apache.cassandra.db.marshal.UTF8Type Column Name: app_id (app_id) Validation Class: org.apache.cassandra.db.marshal.UTF8Type Column Name: app_uri (app_uri) Validation Class: org.apache.cassandra.db.marshal.UTF8Type Index Type: KEYS Checking out a bit further: get applications where 'app_uri' = 'get-test'; --- RowKey: 9d699733-9afe-4a41-83ca-c60d040dacc0 get applications where 'app_id' = '9d699733-9afe-4a41-83ca-c60d040dacc0'; No indexed columns present in index clause with operator EQ So I can see that the secondary indexes are working. Question 1: Has "Built indexes" been removed from the describe keyspace output? Or have I done something wrong? Question 2: Is there a way to force secondary index creation? -- Sasha Dolgy sasha.do...@gmail.com
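For question 2, one workaround sometimes used (a sketch against the 0.7-era cassandra-cli; syntax and behavior may differ on your version, and taking a snapshot first is prudent) is to drop and re-add the index definition: update the column family with app_uri's metadata minus index_type, then update it again with index_type: KEYS, which should cause the index to be built over the existing data:

```
update column family applications with comparator = UTF8Type and column_metadata = [
    {column_name: app_name, validation_class: UTF8Type},
    {column_name: app_uri,  validation_class: UTF8Type},
    {column_name: app_id,   validation_class: UTF8Type}
];
update column family applications with comparator = UTF8Type and column_metadata = [
    {column_name: app_name, validation_class: UTF8Type},
    {column_name: app_uri,  validation_class: UTF8Type, index_type: KEYS},
    {column_name: app_id,   validation_class: UTF8Type}
];
```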
Re: Cassandra monitoring tool
On 04/12/2011 10:42 PM, Héctor Izquierdo Seliva wrote: I forgot to mention it has been coded against 0.7.x, and I'm not sure it will work on 0.6.x. I'll try to add support for both 0.6.x and the new 0.8.x version as soon as possible. I think this error is because of 0.6.3? Exception in thread main java.io.IOException: Failed to retrieve RMIServer stub: javax.naming.CommunicationException [Root exception is java.rmi.ConnectIOException: error during JRMP connection establishment; nested exception is: java.io.EOFException] at javax.management.remote.rmi.RMIConnector.connect(RMIConnector.java:342) at javax.management.remote.JMXConnectorFactory.connect(JMXConnectorFactory.java:267) at com.google.code.scm.CassandraMonitoring.start(CassandraMonitoring.java:58) at com.google.code.scm.CassandraMonitoring.main(CassandraMonitoring.java:190) Caused by: javax.naming.CommunicationException [Root exception is java.rmi.ConnectIOException: error during JRMP connection establishment; nested exception is: java.io.EOFException] at com.sun.jndi.rmi.registry.RegistryContext.lookup(RegistryContext.java:118) at com.sun.jndi.toolkit.url.GenericURLContext.lookup(GenericURLContext.java:203) at javax.naming.InitialContext.lookup(InitialContext.java:409) at javax.management.remote.rmi.RMIConnector.findRMIServerJNDI(RMIConnector.java:1902) at javax.management.remote.rmi.RMIConnector.findRMIServer(RMIConnector.java:1871) at javax.management.remote.rmi.RMIConnector.connect(RMIConnector.java:276) ... 3 more Caused by: java.rmi.ConnectIOException: error during JRMP connection establishment; nested exception is: java.io.EOFException at sun.rmi.transport.tcp.TCPChannel.createConnection(TCPChannel.java:304) at sun.rmi.transport.tcp.TCPChannel.newConnection(TCPChannel.java:202) at sun.rmi.server.UnicastRef.newCall(UnicastRef.java:340) at sun.rmi.registry.RegistryImpl_Stub.lookup(Unknown Source) at com.sun.jndi.rmi.registry.RegistryContext.lookup(RegistryContext.java:114) ...
8 more Caused by: java.io.EOFException at java.io.DataInputStream.readByte(DataInputStream.java:267) at sun.rmi.transport.tcp.TCPChannel.createConnection(TCPChannel.java:246) ... 12 more
Cassandra node's replication factor two with random partition non Bootstrap node problem
Hi All, I have two Cassandra nodes. If the bootstrapped node goes down, my service remains alive; but if my non-bootstrap (master) node goes down, my live site goes down as well. I am using Cassandra 0.6.3. Can anyone elaborate on this problem?
Re: Cassandra monitoring tool
I'm not sure. Are you running it on the same host as the Cassandra node? On Tue, 12-04-2011 at 22:54 +0500, Ali Ahsan wrote: I think this error is because of 0.6.3?
Ec2Snitch + NetworkTopologyStrategy if only in one region?
Hi, I'm getting closer to committing to Cassandra, and now I'm on to system/IT issues and questions. I'm in the Amazon EC2 cloud. I previously used this forum to discover the best practice for disk layouts (large instance + the two ephemeral disks in RAID0 for data + root volume for everything else). Now I'm hoping to confirm bits and pieces of things I've read about snitch/replication strategies. I was thinking of using endpoint_snitch: org.apache.cassandra.locator.Ec2Snitch placement_strategy='org.apache.cassandra.locator.NetworkTopologyStrategy' (for people hitting this from the mailing list or Google, I feel obligated to note that the former setting is in cassandra.yaml, and the latter is an option on a keyspace). But I'm only in one region. Is using the Amazon snitch/NetworkTopologyStrategy overkill given that everything I have is in one DC (I believe region==DC and availability_zone==rack)? I'm using multiple availability zones for some level of redundancy; I'm just not yet at the point of using multiple regions. If someday I move to using multiple regions, would that change the answer? Thanks! -- Will Oberman Civic Science, Inc. 3030 Penn Avenue., First Floor Pittsburgh, PA 15201 (M) 412-480-7835 (E) ober...@civicscience.com
Re: Cassandra monitoring tool
Yes, same host. I will test this with my developer team and let you know more about it. On 04/12/2011 11:14 PM, Héctor Izquierdo Seliva wrote: I'm not sure. Are you running it on the same host as the Cassandra node?
Re: Lot of pending tasks for writes
Can someone please help? -- View this message in context: http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Lot-of-pending-tasks-for-writes-tp6263462p6266213.html Sent from the cassandra-u...@incubator.apache.org mailing list archive at Nabble.com.
flush_largest_memtables_at messages in 7.4
I am using Cassandra 0.7.4 and getting these messages: "Heap is 0.7802529021498031 full. You may need to reduce memtable and/or cache sizes. Cassandra will now flush up to the two largest memtables to free up memory. Adjust flush_largest_memtables_at threshold in cassandra.yaml if you don't want Cassandra to do this automatically." How do I verify that I need to adjust any thresholds? And how do I calculate the correct value? When I got this message, only reads were occurring.
create keyspace StressKeyspace with replication_factor = 3 and placement_strategy = 'org.apache.cassandra.locator.SimpleStrategy';
use StressKeyspace;
drop column family StressStandard;
create column family StressStandard with comparator = UTF8Type and keys_cached = 100 and memtable_flush_after = 1440 and memtable_throughput = 128;
nodetool -h dsdb4 tpstats
Pool Name              Active  Pending  Completed
ReadStage              32      281      456598
RequestResponseStage   0       0        797237
MutationStage          0       0        499205
ReadRepairStage        0       0        149077
GossipStage            0       0        217227
AntiEntropyStage       0       0        0
MigrationStage         0       0        201
MemtablePostFlusher    0       0        1842
StreamStage            0       0        0
FlushWriter            0       0        1841
FILEUTILS-DELETE-POOL  0       0        3670
MiscStage              0       0        0
FlushSorter            0       0        0
InternalResponseStage  0       0        0
HintedHandoff          0       0        15
cfstats
Keyspace: StressKeyspace
Read Count: 460988
Read Latency: 38.07654727454945 ms.
Write Count: 499205
Write Latency: 0.007409593253272703 ms.
Pending Tasks: 0
Column Family: StressStandard
SSTable count: 9
Space used (live): 247408645485
Space used (total): 247408645485
Memtable Columns Count: 0
Memtable Data Size: 0
Memtable Switch Count: 1878
Read Count: 460989
Read Latency: 28.237 ms.
Write Count: 499205
Write Latency: NaN ms.
Pending Tasks: 0
Key cache capacity: 100
Key cache size: 299862
Key cache hit rate: 0.6031833150384193
Row cache: disabled
Compacted row minimum size: 219343
Compacted row maximum size: 5839588
Compacted row mean size: 497474
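For reference, the emergency thresholds this log message refers to live in cassandra.yaml. A sketch of the 0.7-era defaults (verify against your own config and version; the values below are illustrative, not taken from this poster's setup):

```
# cassandra.yaml emergency memory valves (0.7-era defaults)
flush_largest_memtables_at: 0.75   # flush the biggest memtables when heap is this full
reduce_cache_sizes_at: 0.85        # also shrink key/row caches when heap is this full
reduce_cache_capacity_to: 0.6      # ...down to this fraction of their configured capacity
```

The message fired at 0.78 heap usage, i.e. just past the first threshold, so the question becomes why the heap is that full during a read-only workload (cache sizes, compaction, and per-CF memtable settings are the usual suspects).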
Re: Ec2Snitch + NetworkTopologyStrategy if only in one region?
NTS is overkill in the sense that it doesn't really benefit you in a single DC, but if you think you may expand to another DC in the future it's much simpler if you were already using NTS than first migrating to NTS (changing strategy is painful). I can't think of any downsides to using NTS in a single-DC environment, so that's the safe option. -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of DataStax, the source for professional Cassandra support http://www.datastax.com
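Concretely, with Ec2Snitch the region is reported as the data center name (e.g. us-east) and the availability zone as the rack, so a single-region NTS keyspace could be declared roughly like this in cassandra-cli (a sketch; `MyKeyspace` is a placeholder, the DC name must match what the snitch actually reports for your nodes, and the strategy_options syntax varies slightly across versions):

```
create keyspace MyKeyspace
    with placement_strategy = 'org.apache.cassandra.locator.NetworkTopologyStrategy'
    and strategy_options = [{us-east: 3}];
```

Expanding to a second region later then becomes a matter of adding another DC entry to strategy_options rather than migrating strategies.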
help
Re: Ec2Snitch + NetworkTopologyStrategy if only in one region?
Excellent to know! (And yes, I figure I'll expand someday, so I'm glad I found this out before digging a hole.) The other issue I've been pondering is a normal column family of encoded objects (in my case JSON) vs. a super column. Based on my use case, things I've read, etc., right now I'm coming down on normal + encoded. will On Tue, Apr 12, 2011 at 2:57 PM, Jonathan Ellis jbel...@gmail.com wrote: NTS is overkill in the sense that it doesn't really benefit you in a single DC, but if you think you may expand to another DC in the future it's much simpler if you were already using NTS than first migrating to NTS (changing strategy is painful). I can't think of any downsides to using NTS in a single-DC environment, so that's the safe option. -- Will Oberman Civic Science, Inc. 3030 Penn Avenue., First Floor Pittsburgh, PA 15201 (M) 412-480-7835 (E) ober...@civicscience.com
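The "normal CF + JSON-encoded objects" approach mentioned above can be sketched like this: serialize the whole object into one column value instead of spreading its fields across a super column. The `store` dict and the column name "json" are toy stand-ins for a real column family, not any client's API:

```python
import json

store = {}  # row_key -> {column_name: column_value}, standing in for a CF

def put_object(row_key, obj):
    # One column named "json" holds the entire encoded object; sort_keys
    # makes the encoding deterministic.
    store[row_key] = {"json": json.dumps(obj, sort_keys=True)}

def get_object(row_key):
    return json.loads(store[row_key]["json"])

put_object("user:1", {"name": "will", "zips": [15201]})
print(get_object("user:1"))
```

The trade-off versus super columns is that the blob is opaque to Cassandra: reads and writes are all-or-nothing per object, but you avoid super-column limitations (e.g. subcolumns could not be indexed or read individually without deserializing the whole super column in that era).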
Re: help
http://wiki.apache.org/cassandra/FAQ#unsubscribe Is this what you're looking for? Joaquin Casares DataStax Software Engineer/Support On Tue, Apr 12, 2011 at 2:03 PM, Denis Kirpichenkov den.doki.kirpichen...@gmail.com wrote:
Re: Help on decommission
How long has it been in Leaving status? Is the cluster under stress-test load while you are doing the decommission? On Apr 12, 2011, at 6:53 PM, Baskar Duraikannu wrote: I have set up a 4-node cluster for testing. When I set up the cluster, I assigned initial tokens in such a way that each node gets 25% of the load, and then started the nodes with autobootstrap=false. After all nodes were up, I loaded data using the stress test tool with a replication factor of 3. As part of my testing, I am trying to remove one of the nodes using nodetool decommission, but the node seems to be stuck in Leaving status. How do I check whether it is doing any work at all? Please help.
[root@localhost bin]# ./nodetool -h 10.140.22.25 ring
Address       Status  State    Load       Owns    Token
                                                  127605887595351923798765477786913079296
10.140.22.66  Up      Leaving  119.41 MB  25.00%  0
10.140.22.42  Up      Normal   116.23 MB  25.00%  42535295865117307932921825928971026432
10.140.22.28  Up      Normal   119.93 MB  25.00%  85070591730234615865843651857942052864
10.140.22.25  Up      Normal   116.21 MB  25.00%  127605887595351923798765477786913079296
[root@localhost bin]# ./nodetool -h 10.140.22.66 netstats
Mode: Leaving: streaming data to other nodes
Streaming to: /10.140.22.42
/var/lib/cassandra/data/Keyspace1/Standard1-f-1-Data.db/(0,120929157) progress=120929157/120929157 - 100%
/var/lib/cassandra/data/Keyspace1/Standard1-f-2-Data.db/(0,3361291) progress=0/3361291 - 0%
Not receiving any streams.
Pool Name  Active  Pending  Completed
Commands   n/a     0        17
Responses  n/a     0        108109
[root@usnynyc1cass02 bin]# ./nodetool -h 10.140.22.42 netstats
Mode: Normal
Not sending any streams.
Streaming from: /10.140.22.66
Keyspace1: /var/lib/cassandra/data/Keyspace1/Standard1-f-2-Data.db/(0,3361291) progress=0/3361291 - 0%
Pool Name  Active  Pending  Completed
Commands   n/a     0        11
Responses  n/a     0        107879
Regards, Baskar
Re: flush_largest_memtables_at messages in 7.4
Your JVM heap has reached 78%, so Cassandra automatically flushes its memtables. You need to explain more about your configuration: 32 or 64 bit OS, what is max heap, how much RAM installed? If this happens under stress test conditions it's probably understandable. You should look into graphing your memory usage, or use jconsole to graph the heap during your tests.

On Apr 12, 2011, at 8:36 PM, mcasandra wrote:

I am using cassandra 7.4 and getting these messages:

Heap is 0.7802529021498031 full. You may need to reduce memtable and/or cache sizes. Cassandra will now flush up to the two largest memtables to free up memory. Adjust flush_largest_memtables_at threshold in cassandra.yaml if you don't want Cassandra to do this automatically

How do I verify that I need to adjust any thresholds? And how do I calculate the correct value? When I got this message only reads were occurring.

create keyspace StressKeyspace
    with replication_factor = 3
    and placement_strategy = 'org.apache.cassandra.locator.SimpleStrategy';
use StressKeyspace;
drop column family StressStandard;
create column family StressStandard
    with comparator = UTF8Type
    and keys_cached = 100
    and memtable_flush_after = 1440
    and memtable_throughput = 128;

nodetool -h dsdb4 tpstats
Pool Name              Active  Pending  Completed
ReadStage              32      281      456598
RequestResponseStage   0       0        797237
MutationStage          0       0        499205
ReadRepairStage        0       0        149077
GossipStage            0       0        217227
AntiEntropyStage       0       0        0
MigrationStage         0       0        201
MemtablePostFlusher    0       0        1842
StreamStage            0       0        0
FlushWriter            0       0        1841
FILEUTILS-DELETE-POOL  0       0        3670
MiscStage              0       0        0
FlushSorter            0       0        0
InternalResponseStage  0       0        0
HintedHandoff          0       0        15

cfstats
Keyspace: StressKeyspace
    Read Count: 460988
    Read Latency: 38.07654727454945 ms.
    Write Count: 499205
    Write Latency: 0.007409593253272703 ms.
    Pending Tasks: 0
        Column Family: StressStandard
        SSTable count: 9
        Space used (live): 247408645485
        Space used (total): 247408645485
        Memtable Columns Count: 0
        Memtable Data Size: 0
        Memtable Switch Count: 1878
        Read Count: 460989
        Read Latency: 28.237 ms.
        Write Count: 499205
        Write Latency: NaN ms.
        Pending Tasks: 0
        Key cache capacity: 100
        Key cache size: 299862
        Key cache hit rate: 0.6031833150384193
        Row cache: disabled
        Compacted row minimum size: 219343
        Compacted row maximum size: 5839588
        Compacted row mean size: 497474

-- View this message in context: http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/flush-largest-memtables-at-messages-in-7-4-tp6266221p6266221.html Sent from the cassandra-u...@incubator.apache.org mailing list archive at Nabble.com.
json2sstable
Hi, I am trying to run json2sstable with the following command but am receiving the error below.

json2sstable -K testks -c testcf output.json /var/lib/cassandra/data/testks/testcf-f-1-Data.db
Importing 321 keys...
java.lang.NullPointerException
    at org.apache.cassandra.tools.SSTableImport.addColumnsToCF(SSTableImport.java:136)
    at org.apache.cassandra.tools.SSTableImport.addToSuperCF(SSTableImport.java:173)
    at org.apache.cassandra.tools.SSTableImport.importUnsorted(SSTableImport.java:228)
    at org.apache.cassandra.tools.SSTableImport.importJson(SSTableImport.java:197)
    at org.apache.cassandra.tools.SSTableImport.main(SSTableImport.java:421)
ERROR: null

Did I do anything wrong here? Thanks!
Re: Help on decommission
No. I stopped the stress test before issuing the decommission command, so it was not under ANY load. I waited for over an hour and nothing changed. Then I turned on DEBUG in log4j-server.properties and restarted the Cassandra process. As soon as I restarted, the decommissioned node left the cluster and everything was back to normal. Have you seen this behaviour before?

From: Jonathan Colby
Sent: Tuesday, April 12, 2011 3:15 PM
To: user@cassandra.apache.org
Subject: Re: Help on decommission

how long has it been in Leaving status? Is the cluster under stress test load while you are doing the decommission?

On Apr 12, 2011, at 6:53 PM, Baskar Duraikannu wrote:

I have set up a 4 node cluster for testing. When I set up the cluster, I assigned initial tokens so that each node gets 25% of the load, and then started the nodes with autobootstrap=false. After all nodes were up, I loaded data using the stress test tool with a replication factor of 3. As part of my testing, I am trying to remove one of the nodes using nodetool decommission, but the node seems to be stuck in the Leaving state. How do I check whether it is doing any work at all? Please help.

[nodetool ring and netstats output snipped; quoted in full earlier in the thread]

Regards, Baskar
Re: flush_largest_memtables_at messages in 7.4
64 bit 12 core 96 GB RAM -- View this message in context: http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/flush-largest-memtables-at-messages-in-7-4-tp6266221p6266400.html Sent from the cassandra-u...@incubator.apache.org mailing list archive at Nabble.com.
Re: Cassandra 2 DC deployment
I think this is reasonable assuming you have enough backhaul to perform reads across DCs if read requests hit DC2 (with one copy of the data) or one replica in DC1 is down. Moreover, since you clearly stated that you would prefer availability over consistency, you should be prepared for stale reads :) On Tue, Apr 12, 2011 at 8:12 AM, Raj N raj.cassan...@gmail.com wrote: Hi experts, We are planning to deploy Cassandra in 2 datacenters. Let's assume there are 3 nodes, RF=3, 2 nodes in the 1st DC and 1 node in the 2nd DC. Under normal operations, we would read and write at QUORUM. What we want to do though is if we lose the datacenter which has 2 nodes, DC1 in this case, we want to downgrade our consistency to ONE. Basically I am saying that whenever there is a partition, prefer availability over consistency. In order to do this we plan to catch UnavailableException and take corrective action. So try QUORUM under normal circumstances, and if unavailable, try ONE. My questions: Do you guys see any flaws with this approach? What happens when DC1 comes back up and we start reading/writing at QUORUM again? Will we read stale data in this case? Thanks -Raj -- Narendra Sharma Solution Architect *http://www.persistentsys.com* *http://narendrasharma.blogspot.com/*
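The fallback Raj describes can be sketched as follows. This is an illustration only: the UnavailableException class and the generic client.get call are stand-ins, not any particular Thrift client's API.

```python
class UnavailableException(Exception):
    """Stand-in for the Thrift UnavailableException."""

def read_with_fallback(client, key):
    # Try the strong read first; if too few replicas are reachable,
    # accept a possibly stale read at ONE (availability over consistency).
    try:
        return client.get(key, consistency="QUORUM")
    except UnavailableException:
        return client.get(key, consistency="ONE")
```

The same pattern would apply to writes. As discussed elsewhere in the thread, be careful that QUORUM failures caused by anything other than a real DC partition don't silently degrade your consistency guarantees.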
Update the Keyspace replication factor online
Hi, What operations will be executed (and what is the associated overhead) when the keyspace replication factor is changed online, in a multi-datacenter setup with NetworkTopologyStrategy? I checked the wiki and the archive of the mailing list and found this, but it is not very complete. http://wiki.apache.org/cassandra/Operations

Replication factor is not really intended to be changed in a live cluster either, but increasing it may be done if you (a) use ConsistencyLevel.QUORUM or ALL (depending on your existing replication factor) to make sure that a replica that actually has the data is consulted, (b) are willing to accept downtime while anti-entropy repair runs (see below), or (c) are willing to live with some clients potentially being told no data exists if they read from the new replica location(s) until repair is done.

More specifically, in this scenario: {DC1:1, DC2:1} -> {DC2:1, DC3:1}
1. Can this be done online without shutting down the cluster? I thought there is an update keyspace command in the cassandra-cli.
2. If so, what operations will be executed? Will new replicas be created in new locations (in DC3) and existing replicas be deleted in old locations (in DC1)?
3. Or will they be updated only with reads at ConsistencyLevel.QUORUM or ALL, or nodetool repair?

Thanks! Yudong
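On question 1: cassandra-cli does have an update keyspace command in 0.7. A sketch of what the invocation might look like (keyspace name is hypothetical, and the exact syntax is worth double-checking with `help update keyspace;` for your version). Note this only changes the schema metadata; actually materializing replicas in DC3 and removing them from DC1 still requires nodetool repair and nodetool cleanup:

```
[default@unknown] update keyspace MyKeyspace
    with placement_strategy = 'org.apache.cassandra.locator.NetworkTopologyStrategy'
    and strategy_options = [{DC2:1, DC3:1}];
```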
Errors when starting Cassandra
Hi All, I am getting the following errors when I am trying to start Cassandra:

Error occurred during initialization of VM
Could not reserve enough space for object heap

I am using cassandra 0.7.3.

uname -a
Linux hostname 2.6.18-164.11.1.el5 #1 SMP Wed Jan 20 07:32:21 EST 2010 x86_64 x86_64 x86_64 GNU/Linux

Please suggest. Thanks, Anurag
Re: Cassandra node's replication factor two with random partition non Bootstrap node problem
I have two Cassandra nodes. If my bootstrapped node goes down my service remains alive, but if my non-bootstrap (master) node goes down my live site goes down as well. I am using Cassandra 0.6.3; can anyone elaborate on this problem?

Assuming your RF is 2 (not 1), and that you are reading at consistency level ONE (not QUORUM, which would be 2 in the case of RF=2), single-node failures should be tolerated. In order for people to help you'd have to specify some more information. For example, your site goes down - but what is the actual error condition w.r.t. Cassandra? What is the error reported by the Cassandra client (and which client is it)? I'm not sure what you mean w.r.t. bootstrap/master etc. All nodes should be entirely equal, with the exception of nodes that are marked as seed nodes. But seed nodes going down should not cause reads and writes to fail. -- / Peter Schuller
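For reference, the quorum sizes mentioned here fall out of the standard formula; a small sketch:

```python
# QUORUM for replication factor rf is floor(rf/2) + 1. With RF=2 a
# quorum needs both replicas, so any single-node failure blocks QUORUM
# operations while CL.ONE keeps working.
def quorum(rf):
    return rf // 2 + 1

print(quorum(2))  # 2: no single-node failure tolerated at QUORUM
print(quorum(3))  # 2: one replica can be down
```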
Re: Errors when starting Cassandra
I was able to resolve this by changing the heap size Thanks Anurag On Tue, Apr 12, 2011 at 1:38 PM, Anurag Gujral anurag.guj...@gmail.comwrote: Hi All, I am getting the following errors when I am trying to start cassandra . Error occurred during initialization of VM Could not reserve enough space for object heap I am using cassandra 0.7.3 uname -a Linux hostname 2.6.18-164.11.1.el5 #1 SMP Wed Jan 20 07:32:21 EST 2010 x86_64 x86_64 x86_64 GNU/Linux Please Suggest Thanks Anurag
Re: Lot of pending tasks for writes
I am just running a simple test on a 6 node cassandra cluster: 4 GB heap, 96 GB RAM and 12 cores per host. I am inserting 1M rows with avg col size of 250k. I keep getting Dropped mutation messages in the logs. Not sure how to troubleshoot or tune it.

Average col size of 250k - that sounds to me like you're almost certainly going to be bottlenecking on disk I/O. Saturating the active slots in the mutation stage and building up pending is consistent with simply writing faster than writes can be handled. At first I was skeptical and figured maybe something was wrong, but upon re-reading and spotting your 250k column size - it's really easy to have a stress client saturate nodes with data sizes that large. The first thing I would do is to just look at what's going on on the system. For example, just run iostat -x -k 1 on the machines and see whether you're completely disk bound or not. I suspect you are, and that the effects you're seeing are simply the result of that. However that would depend on how many mutations per second you're actually sending. But if you're using out-of-the-box stress.py without rate limiting and using a column size of 250k, I am not at all surprised that you're easily able to saturate your nodes. -- / Peter Schuller
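To put rough numbers on why this workload is disk-bound (back-of-envelope only, ignoring compaction, which rewrites the data again; the RF=3 is taken from the keyspace definition quoted in the flush_largest_memtables_at thread and assumed to apply here):

```python
# 1M rows of ~250 KB each, replicated 3 ways, is on the order of three
# quarters of a terabyte written cluster-wide before any compaction
# I/O is counted.
rows = 1_000_000
col_size = 250 * 1024   # bytes, per the reported average column size
rf = 3                  # replication factor (assumed)
total_bytes = rows * col_size * rf
print(total_bytes / 10**9)  # 768.0 (GB)
```

Spread over 6 nodes that is over 100 GB per node, which an unthrottled client can push far faster than commodity disks absorb it.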
Re: flush_largest_memtables_at messages in 7.4
Heap is 0.7802529021498031 full. You may need to reduce memtable and/or cache sizes Cassandra will now flush up to the two largest memtables to free up memory. Adjust flush_largest_memtables_at threshold in cassandra.yaml if you don't want Cassandra to do this automatically How do I verify that I need to adjust any thresholds? And how to calculate correct value? Is this on the same cluster/nodes that you're doing your 250k column stresses (the other thread)? In any case, for typical cases there is: http://www.datastax.com/docs/0.7/operations/tuning -- / Peter Schuller
Re: Cassandra 2 DC deployment
When the down data center comes back up, the Quorum reads will result in a read-repair, so you will get valid data. Besides that, hinted handoff will take care of getting data replicated to a previously down node. *Eventually* though, but yes. I.e., there would be no expectation to instantly go back to full consistency once it goes back up. Also, I would argue that it's useful to consider this: If you're implementing automatic fallback to ONE whenever QUORUM fails; consider all cases where this might happen for reasons *other* than there being a legitimate partition of the DC:s. For example, some random networking issues causing fewer nodes to be up etc. A valid question is: If you simply do automatic fallback whenever QUORUM fails anyway, are you significantly increasing consistency with respect to ONE anyway? In some cases yes, but just be sure you know what you're doing... Keep in mind that when all nodes are up and all is working well, CL.ONE doesn't mean that writes won't be replicated to all nodes. It just means that only one is *required* - and same for reads. If you have some situation whereby you normally want the strict requirement that a read subsequent to a write sees the written data, that doesn't sound very compatible with automatically falling back to CL.ONE... Anyways, those are my off-the-cuff thoughts - maybe it doesn't apply in the situation in question. -- / Peter Schuller
Re: Errors when starting Cassandra
I was able to resolve this by changing the heap size And that is the preferred solution. While adjusting stuff like the kernel overcommit settings might allow the JVM to start, there is no reason ever to have a heap size larger than what physical memory on the server can actually sustain. So decreasing heap size is the appropriate course of action. -- / Peter Schuller
Re: flush_largest_memtables_at messages in 7.4
Yes -- View this message in context: http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/flush-largest-memtables-at-messages-in-7-4-tp6266221p6266726.html Sent from the cassandra-u...@incubator.apache.org mailing list archive at Nabble.com.
Re: Lot of pending tasks for writes
It does appear that I am IO bound. Disks show about 90% util.

Well, also pay attention to the average queue size column. If there are constantly more requests waiting to be serviced than you have platters, you're almost certainly I/O bound. The utilization number can be a bit flaky sometimes, although 90% isn't too far below 100% to be attributed to inexactness in the kernel's measurements.

What are my options then? Is cassandra not suitable for columns of this size?

It depends. Cassandra is a log-structured database, meaning that all writes are sequential and you are going to be doing background compactions that imply re-reading and re-writing data. This optimization makes sense in particular for smaller values where the cost of doing sequential I/O is a lot less than seek-bound I/O, but it is less relevant for large values. The main cost of background compactions is the extra reading and writing of data that happens. If your workload is full of huge values, then the only significant cost *is* the sequential I/O. So in that sense, background compaction becomes more expensive relative to the theoretical optimum than it does for small values. It depends on details of the access pattern, but I'd say that (1) for very large values, Cassandra's advantages become less pronounced in terms of local storage on each node, although the clustering capabilities remain relevant, and that (2) depending on the details of the use-case, Cassandra *may* not be terribly suitable.

I am running stress code from hector which doesn't sound like it gives the ability to limit operations per sec. I am inserting 1M rows and then reading. Have not been able to do it in parallel because of io issues.

stress.py doesn't support any throttling, except very very indirectly by limiting the total number of threads. In a situation like this I think you need to look at what your target traffic is going to be like.
Throwing un-throttled traffic at the cluster like stress.py does is not indicative of normal traffic patterns. For typical use-cases with small columns this is still handled well, but when you are both unthrottled *and* throwing huge columns at it, there is no expectation that this is handled very well. So, for large values like this I recommend figuring out what the actual expected sustained amount of writes is, and then benchmarking that. Using stress.py out-of-the-box is not giving you much relevant information, other than the known fact that throwing huge-column traffic at Cassandra without throttling is not handled very gracefully. But that said, when using un-throttled benchmarking like stress.py - at any time where you're throwing more traffic at the cluster than it can handle, it is *fully expected* that you will see the 'active' stages be saturated and a build-up of 'pending' operations. This is the expected result of submitting a greater number of requests per second than can be processed - in pretty much any system. You queue up to some degree, and eventually you start having to drop or fail requests. The unique thing about large columns is that it becomes a lot easier to saturate a node with a single (or few) stress.py clients than it is when stressing with a more normal type of load. The extra cost of dealing with large values is higher in Cassandra than it is in stress.py; so suddenly a single stress.py client can easily saturate lots of nodes simply because you can so trivially write data at very high throughput by upping the column sizes. -- / Peter Schuller
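Since stress.py offers no throttling, here is a minimal sketch of the kind of rate limiter one could wrap around a stress client. Everything here (class name, injectable clock) is illustrative, not part of any Cassandra tool:

```python
import time

class Throttle:
    """Permit at most ops_per_sec operations; caller retries when denied."""

    def __init__(self, ops_per_sec, clock=time.monotonic):
        self.interval = 1.0 / ops_per_sec
        self.clock = clock          # injectable for testing
        self.next_allowed = clock()

    def try_acquire(self):
        now = self.clock()
        if now >= self.next_allowed:
            # Schedule the next permitted operation one interval later.
            self.next_allowed = max(self.next_allowed + self.interval, now)
            return True
        return False
```

A write loop would call try_acquire() before each insert (sleeping briefly when it returns False), keeping sustained load at the target rate rather than at whatever the client can generate.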
Re: flush_largest_memtables_at messages in 7.4
Yes Without checking I don't know the details of the memtable threshold calculations enough to be sure whether large columns are somehow causing the size estimations to be ineffective (off hand I would expect the reverse since the overhead of the Java object structures become much less significant); but if this is not the case, then this particular problem should be a matter of adjusting heap size according to your memtable thresholds. I.e., increase heap size and/or decrease memtable flush thresholds. -- / Peter Schuller
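A rough way to sanity-check heap versus memtable settings along the lines Peter suggests. The overhead multiplier here is an assumption (Java object overhead on top of the serialized memtable_throughput figure, often cited as several-fold for 0.7-era Cassandra), not a documented constant:

```python
# Estimate in-heap memtable footprint: serialized throughput threshold
# times the number of actively written CFs times a Java-overhead fudge
# factor (assumed value, tune to your own measurements).
def memtable_heap_mb(throughput_mb, active_cfs, overhead=8):
    return throughput_mb * active_cfs * overhead

# One CF at the 128 MB memtable_throughput threshold -> ~1 GB of heap
# before key caches, compaction, and request handling are counted.
print(memtable_heap_mb(128, 1))  # 1024
```

If the estimate approaches heap_size * flush_largest_memtables_at, the emergency flush is expected behaviour; either grow the heap or lower the memtable thresholds.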
Re: CLI does not list data after upgrading to 0.7.4
I'm running into the same issue with 0.7.4. You don't need to specify lexicaluuid, seems any valid key type will work- it just needs to fit with your data (ascii, bytes, etc). On Sun, Apr 10, 2011 at 7:13 PM, Patrick Julien pjul...@gmail.com wrote: put in an assumption first, so from cassandra-cli, do: assume aCF KEYS as lexicaluuid; then do your list On Sun, Apr 10, 2011 at 10:03 PM, Wenjun Che wen...@openf.in wrote: It is happening on clean 0.7.4 server as well. Here is how to reproduce: 1. create a CF with UUID as row key 2. add some data 3. list CF always returns Input length = 1 I figured out one way to fix this: run 'assume CF keys as lexicaluuid;. This issue does not happen to CLI of 0.7.0 or earlier, even running against 0.7.4 server. On Sat, Apr 9, 2011 at 5:53 PM, aaron morton aa...@thelastpickle.com wrote: Just tested the 0.7.4 cli against an clean 0.7.4 server and list worked. If I restart the server while the cli is connected i get... [default@dev] list data; Using default limit of 100 null Aaron On 8 Apr 2011, at 17:23, Wenjun Che wrote: Hello I just upgraded a 1-node setup from rc2 to 0.7.4 and ran scrub without any error. Now 'list CF' in CLI does not return any data as followings: list User; Using default limit of 100 Input length = 1 I don't see any errors or exceptions in the log. If I run CLi from 0.7.0 against 0.7.4 server, I am getting data. Thanks -- Aaron Turner http://synfin.net/ Twitter: @synfinatic http://tcpreplay.synfin.net/ - Pcap editing and replay tools for Unix Windows Those who would give up essential Liberty, to purchase a little temporary Safety, deserve neither Liberty nor Safety. -- Benjamin Franklin carpe diem quam minimum credula postero
Re: Cassandra Database Modeling
Yes for interactive == real time queries. Hadoop based techniques are for non time critical queries, but they do have greater analytical capabilities. particle_pairs: 1) Yes and no and sort of. Under the hood the get_slice api call will be used by your client library to pull back chunks of (ordered) columns. Most client libraries abstract away the chunking for you. 2) If you are using a packed structure like JSON then no, Cassandra will have no idea what you've put in the columns other than bytes. It really depends on how much data you have per pair, but generally it's easier to pull back more data than try to get exactly what you need. Downside is you have to update all the data. 3) No, you would need to update all the data for the pair. I was assuming most of the data was written once, and that your simulation had something like a stop-the-world phase between time slices where state was dumped and then read to start the next interval. You could either read it first, or we can come up with something else. distance_cf 1) the query would return a list of columns, which have a name and value (as well as a timestamp and ttl). 2) depends on the client library, if using python go for https://github.com/pycassa/pycassa It will return objects 3) returning millions of columns is going to be slow, would also be slow using a RDBMS. Creating millions of objects in python is going to be slow. You would need to have a better idea of what queries you will actually want to run to see if it's *too* slow. If it is, one approach is to store the particles at the same distance in the same column, so you need to read fewer columns. Again depends on how your sim works. Time complexity depends on the number of columns read. Finding a row will not be O(1) as it may have to read from several files. Writes are more constant than reads. But remember, you can have a lot of io and cpu power in your cluster.
Best advice is to jump in and see if the data model works for you at a small single node scale; most performance issues can be solved. Aaron

On 12 Apr 2011, at 15:34, csharpplusproject wrote:

Hi Aaron, Yes, of course it helps, I am starting to get a flavor of Cassandra -- thank you very much! First of all, by 'interactive' queries, are you referring to 'real-time' queries? (meaning, where experiment data is 'streaming', data needs to be stored and following that, the query needs to be run in real time)?

Looking at the design of the particle pairs: - key: experiment_id.time_interval - column name: pair_id - column value: distance, angle, other data packed together as JSON or some other format. A couple of questions: (1) Will a query such as pairID[ experiment_id.time_interval ] basically return an array of all pairIDs for the experiment, where each item is a 'packed' JSON? (2) Would it be possible, rather than returning the whole JSON object per every pairID, to get (say) only the distance? (3) Would it be possible to easily update certain pairIDs with new values (for example, update pairIDs = {2389, 93434} with new distance values)?

Looking at the design of the distance CF (for example): this is VERY INTERESTING. Basically you are suggesting a design that will save the actual distance between each pair of particles, and will allow queries where we can find all pairIDs (for an experiment, in a time_interval) that meet a certain distance criteria. VERY, VERY INTERESTING! A couple of questions: (1) Will a query such as distanceCF[ experiment_id.time_interval ] basically return an array of all 'zero_padded_distance.pair_id' elements for the experiment? (2) In such a case, will I get (presumably) a python list where every item is a string (which I will need to process)? (3) Given the fact that we're doing a slice on millions of columns (?), any idea how fast such an operation would be?

Just to make sure I understand, is it true that in both situations, the query complexity is basically O(1) since it's simply a HASH? Thank you for all of your help! Shalom.

-Original Message- From: aaron morton aa...@thelastpickle.com Reply-to: user@cassandra.apache.org To: user@cassandra.apache.org Subject: Re: Cassandra Database Modeling Date: Tue, 12 Apr 2011 10:43:42 +1200

The tricky part here is the level of flexibility you want for the querying. In general you will want to denormalise to support the read queries. If your queries are not interactive you may be able to use Hadoop / Pig / Hive e.g. http://www.datastax.com/products/brisk In which case you can probably have a simpler data model where you spend less effort supporting the queries. But it sounds like you need interactive queries as part of the experiment. You could store the data per pair in a standard CF (lets call it the pair cf) as follows: - key: experiment_id.time_interval - column name: pair_id - column
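One detail worth making concrete from the distance-CF idea: column names compare as strings under an ASCII/UTF8 comparator, so the distance must be zero-padded to a fixed width for a slice range to follow numeric order. A sketch (field width and separator are arbitrary choices of mine):

```python
# Build 'zero_padded_distance.pair_id' column names whose lexicographic
# order matches numeric distance order, so get_slice over a name range
# selects a distance band.
def distance_column(distance, pair_id, width=12):
    return f"{distance:0{width}d}.{pair_id}"

cols = [distance_column(d, p) for d, p in [(512, "p1"), (33, "p2"), (4096, "p3")]]
print(sorted(cols))  # p2 (33) sorts before p1 (512) before p3 (4096)
```

Without the padding, "4096" would sort before "512" as a string and the slice query would return the wrong pairs.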
Re: CL.ONE reads / RR / badness_threshold interaction
To now answer my own question, the critical points that are different from what I said earlier are: that CL.ONE does prefer *one* node (which one depending on snitch) and that RR uses digests (which are not mentioned on the wiki page [1]) instead of comparing raw requests. I updated it to mention digest queries with a link to another page to explain what that is, and why they are used. I am assuming that RR digests save on bandwidth, but to generate the digest with a row cache miss the same number of disk seeks are required (my nemesis is disk io). Yes. It's only a bandwidth optimization. So to increase pinny-ness I'll further reduce RR chance and set a badness threshold. Thanks all. Just be aware that, assuming I am not missing something, while this will indeed give you better cache locality under normal circumstances - once that closest node does go down, traffic will then go to a node which will have potentially zero cache hit rate on that data since all reads up to that point were taken by the node that just went down. So it's not an obvious win depending. -- / Peter Schuller
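A tiny model of why digests save bandwidth only: each replica still reads the full row from disk in order to hash it, and the coordinator compares its data read against the returned digests, triggering read repair on mismatch. This is illustrative code, not Cassandra's implementation (though MD5 is, to my understanding, what it historically used):

```python
import hashlib

def row_digest(row_bytes):
    # The replica must read the row either way; only this hash goes
    # over the wire for a digest query.
    return hashlib.md5(row_bytes).digest()

def needs_read_repair(data_response, digest_responses):
    d = row_digest(data_response)
    return any(d != other for other in digest_responses)
```

The disk seeks per replica are unchanged, which is exactly the point made above about digests not helping when disk I/O is the bottleneck.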
Re: CL.ONE reads / RR / badness_threshold interaction
On 04/12/2011 06:27 PM, Peter Schuller wrote: So to increase pinny-ness I'll further reduce RR chance and set a badness threshold. Thanks all. Just be aware that, assuming I am not missing something, while this will indeed give you better cache locality under normal circumstances - once that closest node does go down, traffic will then go to a node which will have potentially zero cache hit rate on that data since all reads up to that point were taken by the node that just went down. So it's not an obvious win depending.

Yeah, there's less than great behaviour when nodes are restarted or otherwise go down with this configuration. Probably still preferable for my current situation. Others' mileage may vary. http://img27.imageshack.us/img27/85/cacherestart.png
Re: quick repair tool question
On 04/12/2011 11:11 AM, Jonathan Colby wrote: I'm not sure if this is the kosher way to rebuild the sstable data, but it seemed to work. http://wiki.apache.org/cassandra/Operations#Handling_failure Option #3.
Re: flush_largest_memtables_at messages in 7.4
One thing I am noticing is that the cache hit rate is very low even though my key cache size is 1M and I have less than 1M rows. Not sure why there are so many cache misses?

Keyspace: StressKeyspace
    Read Count: 162506
    Read Latency: 45.22479006928975 ms.
    Write Count: 247180
    Write Latency: 0.011610943442026053 ms.
    Pending Tasks: 0
        Column Family: StressStandard
        SSTable count: 184
        Space used (live): 99616537894
        Space used (total): 99616537894
        Memtable Columns Count: 351
        Memtable Data Size: 171716049
        Memtable Switch Count: 543
        Read Count: 162507
        Read Latency: 317.892 ms.
        Write Count: 247180
        Write Latency: 0.006 ms.
        Pending Tasks: 0
        Key cache capacity: 100
        Key cache size: 256013
        Key cache hit rate: 0.33801452784503633
        Row cache: disabled
        Compacted row minimum size: 182786
        Compacted row maximum size: 5839588
        Compacted row mean size: 537470

-- View this message in context: http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/flush-largest-memtables-at-messages-in-7-4-tp6267234p6267234.html Sent from the cassandra-u...@incubator.apache.org mailing list archive at Nabble.com.
Re: Remove call vs. delete mutation
Is there anybody else that might see a problem with just using delete mutations instead of remove calls? I'm thinking about changing a Cassandra client to always use delete mutations when removing objects, that way the delete/remove call interface can be kept the same: 1- the delete/remove client call would always support all features: single-key/column, multi-column and slice range deletes. 2- it could be used in the same way regardless of embedding the calls into batch mutations or removing a single column/key I'd like to hear some more thoughts about this change not causing the Cassandra server to take a much higher CPU toll just because decoding mutations is much less optimized than straight removes or something like that...(I don't think so but...). In other words, if I do 1000 inserts or 1000 single-delete mutations, would the Cassandra server see much of a difference? Cheers, Josep M. On Mon, Apr 11, 2011 at 3:49 PM, aaron morton aa...@thelastpickle.com wrote: AFAIK both follow the same path internally. Aaron On 12 Apr 2011, at 06:47, Josep Blanquer wrote: All, From a thrift client perspective using Cassandra, there are currently 2 options for deleting keys/columns/subcolumns: 1- One can use the remove call: which only takes a column path so you can only delete 'one thing' at a time (an entire key, an entire supercolumn, a column or a subcolumn) 2- A delete mutation: which is more flexible as it allows to delete a list of columns an even a slice range of them within a single call. The question I have is: is there a noticeable difference in performance between issuing a remove call, or a mutation with a single delete? In other words, why would I use the remove call if it's much less flexible than the mutation? ...or another way to put it: is the remove call just there for backwards compatibility and will be superseded by the delete mutations in the future? Cheers, Josep M.
Exception on cassandra startup 0.7.4
Hello, I've been running a single node cluster (0.7.4 built from the SVN tag, running on JDK 1.6.0_21 on Ubuntu 10.10) for testing purposes. After running fine for a couple of weeks, I got the error below on startup. It sounded like the error which is supposed to be fixed by the nodetool scrub command, but since I can't run the scrub command without starting up the instance, and the instance won't start, this wasn't any use. Also, I'm fairly certain that the keyspaces in this node have only been written by 0.7.4 code. Since it was just a test node, I just blew away the data directory. Had I been thinking, I would have saved it off so I could duplicate the issue. If I can provide any other information, please let me know. Thank you, Paul

paul@host:~/apps/cassandra-svn/bin$ ./cassandra -f
 INFO 20:57:44,344 Logging initialized
 INFO 20:57:44,357 Heap size: 3051814912/3052863488
 INFO 20:57:44,358 JNA not found. Native methods will be disabled.
 INFO 20:57:44,365 Loading settings from file:/home/paul/apps/cassandra-svn/conf/cassandra.yaml
 INFO 20:57:44,474 DiskAccessMode 'auto' determined to be mmap, indexAccessMode is mmap
 INFO 20:57:44,593 Opening /home/paul/apps/cassandra/node1/data/system/Schema-f-378
 INFO 20:57:44,606 Opening /home/paul/apps/cassandra/node1/data/system/Schema-f-379
 INFO 20:57:44,608 Opening /home/paul/apps/cassandra/node1/data/system/Schema-f-377
 INFO 20:57:44,618 Opening /home/paul/apps/cassandra/node1/data/system/Migrations-f-377
 INFO 20:57:44,620 Opening /home/paul/apps/cassandra/node1/data/system/Migrations-f-378
 INFO 20:57:44,622 Opening /home/paul/apps/cassandra/node1/data/system/Migrations-f-379
 INFO 20:57:44,627 Opening /home/paul/apps/cassandra/node1/data/system/LocationInfo-f-29
 INFO 20:57:44,629 Opening /home/paul/apps/cassandra/node1/data/system/LocationInfo-f-30
 INFO 20:57:44,631 Opening /home/paul/apps/cassandra/node1/data/system/LocationInfo-f-31
 INFO 20:57:44,674 Loading schema version debf273e-631f-11e0-ac72-e700f669bcfc
 INFO 20:57:44,883 Opening /home/paul/apps/cassandra/node1/data/DaisyWorksKS/User-f-1
 INFO 20:57:44,886 Opening /home/paul/apps/cassandra/node1/data/DaisyWorksKS/User-f-2
 INFO 20:57:44,892 Opening /home/paul/apps/cassandra/node1/data/DaisyWorksTest/User-f-11
 INFO 20:57:44,895 Opening /home/paul/apps/cassandra/node1/data/DaisyWorksTest/Product-f-10
 INFO 20:57:44,908 Creating new commitlog segment /home/paul/apps/cassandra/node1/commitlog/CommitLog-1302569864908.log
 INFO 20:57:44,916 Replaying /home/paul/apps/cassandra/node1/commitlog/CommitLog-1302379611027.log, /home/paul/apps/cassandra/node1/commitlog/CommitLog-1302567818267.log, /home/paul/apps/cassandra/node1/commitlog/CommitLog-1302567841352.log, /home/paul/apps/cassandra/node1/commitlog/CommitLog-1302567871659.log, /home/paul/apps/cassandra/node1/commitlog/CommitLog-1302568152030.log, /home/paul/apps/cassandra/node1/commitlog/CommitLog-1302569289258.log
 INFO 20:57:44,937 Finished reading /home/paul/apps/cassandra/node1/commitlog/CommitLog-1302379611027.log
ERROR 20:57:44,937 Exception encountered during startup.
java.io.IOError: java.io.EOFException
    at org.apache.cassandra.io.util.ColumnIterator.deserializeNext(ColumnSortedMap.java:246)
    at org.apache.cassandra.io.util.ColumnIterator.next(ColumnSortedMap.java:262)
    at org.apache.cassandra.io.util.ColumnIterator.next(ColumnSortedMap.java:223)
    at java.util.concurrent.ConcurrentSkipListMap.buildFromSorted(ConcurrentSkipListMap.java:1493)
    at java.util.concurrent.ConcurrentSkipListMap.init(ConcurrentSkipListMap.java:1443)
    at org.apache.cassandra.db.SuperColumnSerializer.deserialize(SuperColumn.java:363)
    at org.apache.cassandra.db.SuperColumnSerializer.deserialize(SuperColumn.java:311)
    at org.apache.cassandra.db.ColumnFamilySerializer.deserializeColumns(ColumnFamilySerializer.java:129)
    at org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:120)
    at org.apache.cassandra.db.RowMutation$RowMutationSerializer.deserialize(RowMutation.java:380)
    at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:253)
    at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:156)
    at org.apache.cassandra.service.AbstractCassandraDaemon.setup(AbstractCassandraDaemon.java:173)
    at org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:314)
    at org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:79)
Caused by: java.io.EOFException
    at java.io.DataInputStream.readFully(DataInputStream.java:180)
    at java.io.DataInputStream.readFully(DataInputStream.java:152)
    at org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:320)
    at org.apache.cassandra.utils.ByteBufferUtil.readWithShortLength(ByteBufferUtil.java:289)
    at
Re: repair never completes with finished successfully
Ah, unreadable rows, and in the validation compaction no less. Makes a little more sense now. Anyone help with the EOF when deserializing columns? Is the fix to run scrub or drop the sstable? Here's a theory, AES is trying to... 1) Create TreeRequests that specify a range we want to validate. 2) Send the TreeRequests to the local node and a neighbour. 3) Process each TreeRequest by running a validation compaction (CompactionManager.doValidationCompaction in your prev stacks). 4) When both TreeRequests return, work out the differences and then stream data if needed. Perhaps step 3 is not completing because of errors like http://www.mail-archive.com/user@cassandra.apache.org/msg12196.html If the row is spread over multiple sstables we can skip the row in one sstable. However, if it's in a single sstable, PrecompactedRow will raise an IOError if there is a problem. This is not what is in the linked error stack, which shows a row being skipped; just a hunch we could check out. Do you see any IOErrors (not exceptions) in the logs, or exceptions with doValidationCompaction in the stack? For a tree request on the node you start the repair on you should see these logs... 1) Waiting for repair requests... 2) One of Stored local tree or Stored remote tree (depending on which returns first) at DEBUG level 3) Queuing comparison If we do not have the 3rd log then we did not get a reply from either the local or the remote node. Aaron On 13 Apr 2011, at 00:57, Jonathan Colby wrote: There is no Repair session message either. 
It just starts with a message like: INFO [manual-repair-2af33a51-f46a-4ba2-b1fb-ead5159dc723] 2011-04-10 14:00:59,051 AntiEntropyService.java (line 770) Waiting for repair requests: [#TreeRequest manual-repair-2af33a51-f46a-4ba2-b1fb-ead5159dc723, /10.46.108.101, (DFS,main), #TreeRequest manual-repair-2af33a51-f46a-4ba2-b1fb-ead5159dc723, /10.47.108.100, (DFS,main), #TreeRequest manual-repair-2af33a51-f46a-4ba2-b1fb-ead5159dc723, /10.47.108.102, (DFS,main), #TreeRequest manual-repair-2af33a51-f46a-4ba2-b1fb-ead5159dc723, /10.47.108.101, (DFS,main)] NETSTATS: Mode: Normal Not sending any streams. Not receiving any streams. Pool Name Active Pending Completed Commands n/a 0 150846 Responses n/a 0 443183 One node in our cluster still has unreadable rows, where the reads trip up every time for certain sstables (you've probably seen my earlier threads regarding that). My suspicion is that the bloom filter read on the node with the corrupt sstables is never reporting back to the repair, thereby causing it to hang. What would be great is a scrub tool that ignores unreadable/unserializable rows! : ) On Apr 12, 2011, at 2:15 PM, aaron morton wrote: Do you see a message starting Repair session and ending with completed successfully? Or do you see any streaming activity using nodetool netstats? Repair can hang if a neighbour dies and fails to send a requested stream. It will time out after 24 hours (I think). Aaron On 12 Apr 2011, at 23:39, Karl Hiramoto wrote: On 12/04/2011 13:31, Jonathan Colby wrote: There are a few other threads related to problems with nodetool repair in 0.7.4. However I'm not seeing any errors, just never getting a message that the repair completed successfully. 
In my production and test cluster (with just a few MB of data) the nodetool repair command never returns, and the last entry in cassandra.log is always something like: #TreeRequest manual-repair-f739ca7a-bef8-4683-b249-09105f6719d9, /10.46.108.102, (DFS,main) completed successfully: 1 outstanding But I don't see a message, even hours later, that the 1 outstanding request finished successfully. Anyone else experience this? These are physical server nodes in local data centers, not EC2. I've seen this. To fix it, try a nodetool compact and then repair. -- Karl
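Aaron's four steps amount to a Merkle-style comparison: each replica hashes its rows per range, the coordinator compares both sets of hashes, and only mismatched ranges are streamed. A toy sketch of steps 3-4 (illustrative only, not Cassandra's actual AntiEntropyService code; `buckets` is a stand-in for the tree's token ranges):

```python
import hashlib

def range_hashes(rows, buckets=8):
    # Hash each row into one of `buckets` ranges -- a simplified stand-in
    # for the Merkle tree that a validation compaction builds per TreeRequest.
    digests = [hashlib.md5() for _ in range(buckets)]
    for key in sorted(rows):
        digests[hash(key) % buckets].update(repr((key, rows[key])).encode())
    return [d.hexdigest() for d in digests]

def ranges_to_stream(local_rows, remote_rows, buckets=8):
    # Step 4: once both "trees" are back, only mismatched ranges need streaming.
    local = range_hashes(local_rows, buckets)
    remote = range_hashes(remote_rows, buckets)
    return [i for i, (a, b) in enumerate(zip(local, remote)) if a != b]
```

The failure mode discussed above fits this picture: if step 3 dies with an IOError on one replica, that replica's tree never arrives, the comparison in step 4 never runs, and the repair appears to hang -- consistent with the missing "Queuing comparison" log line.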
Re: cassandra 0.6.3 error Connection refused to host: 127.0.0.1;
Can you connect from the local machine using 127.0.0.1? Are you running any sort of firewall? Check you can connect from the node to the JMX port (8080 by default) using telnet. Aaron On 13 Apr 2011, at 04:25, Ali Ahsan wrote: Can anyone guide me on this issue? On 04/12/2011 04:07 PM, Ali Ahsan wrote: Hi All, I have migrated my server to CentOS 5.5. Everything is up, but I am facing a small issue. I have two Cassandra nodes: 10.0.0.4 cassandra2 10.0.0.3 cassandra1 I am using OpenJDK with Cassandra. We are facing the following error when using nodetool, but only on one server, cassandra2. The hosts file is also pasted below. Please let me know how I can fix this issue. - sh nodetool -h 10.0.0.3 ring Error connecting to remote JMX agent! java.rmi.ConnectException: Connection refused to host: 127.0.0.1; nested exception is: --- sh nodetool -h 10.0.0.4 ring Address Status Load Range Ring 129069858893052904163677015069685590304 10.0.0.3 Up 10.02 GB 104465788091875410298027059042850717029 |--| 10.0.0.4 Up 9.98 GB 129069858893052904163677015069685590304 |--| Hosts file # Do not remove the following line, or various programs # that require network functionality will fail. 127.0.0.1 localhost.localdomain localhost 10.0.0.4 cassandra2.pringit.com #::1 localhost6.localdomain6 localhost6 -- S.Ali Ahsan Senior System Engineer e-Business (Pvt) Ltd 49-C Jail Road, Lahore, P.O. Box 676 Lahore 54000, Pakistan Tel: +92 (0)42 3758 7140 Ext. 128 Mobile: +92 (0)345 831 8769 Fax: +92 (0)42 3758 0027 Email: ali.ah...@panasiangroup.com www.ebusiness-pg.com www.panasiangroup.com
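For background: "Connection refused to host: 127.0.0.1" usually means the RMI stub handed back by the JMX agent carries an address that resolves to loopback on the failing node, typically because /etc/hosts maps the node's own hostname onto the 127.0.0.1 line; fixing the hosts file (or setting java.rmi.server.hostname in the JVM options) is the usual cure. Aaron's telnet check can also be scripted; this is a generic TCP reachability probe against the JMX port (8080 by default), nothing Cassandra-specific:

```python
import socket

def port_open(host, port, timeout=2.0):
    # True if a plain TCP connect to host:port succeeds within `timeout`.
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
        sock.close()
        return True
    except (socket.error, socket.timeout):
        return False
```

Something like `port_open("10.0.0.3", 8080)` separates a firewall or bind problem (the connect itself fails) from the RMI-redirect problem (the connect succeeds but nodetool still fails with the loopback address).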
Re: forced index creation?
Built indexes are there for me [default@unknown] describe keyspace Keyspace1; Keyspace: Keyspace1: Replication Strategy: org.apache.cassandra.locator.SimpleStrategy Replication Factor: 1 Column Families: ColumnFamily: Indexed1 default_validation_class: org.apache.cassandra.db.marshal.LongType Columns sorted by: org.apache.cassandra.db.marshal.UTF8Type Row cache size / save period in seconds: 0.0/0 Key cache size / save period in seconds: 20.0/14400 Memtable thresholds: 0.145312498/31/1440 (millions of ops/minutes/MB) GC grace seconds: 864000 Compaction min/max thresholds: 4/32 Read repair chance: 1.0 Built indexes: [Indexed1.birthdate_idx] Column Metadata: Column Name: birthdate Validation Class: org.apache.cassandra.db.marshal.LongType Index Name: birthdate_idx Index Type: KEYS When the index is created existing data is indexed async, and any new data is indexed as part of the write. Not sure how to force/check things though. Can you turn logging up to DEBUG and compare the requests between the two clusters ? Aaron On 13 Apr 2011, at 05:46, Sasha Dolgy wrote: hi, just deployed a new keyspace on 0.7.4 and added the following column family: create column family applications with comparator=UTF8Type and column_metadata=[ {column_name: app_name, validation_class: UTF8Type}, {column_name: app_uri, validation_class: UTF8Type,index_type: KEYS}, {column_name: app_id, validation_class: UTF8Type} ]; I then proceeded to add two new rows of data to it. When i try and query the secondary index on app_uri, my query with phpcassa fails. on the same CF in a different cluster, it works fine. 
when comparing the CF between the clusters, I see there's a difference: --- Built indexes: --- shows up when I run -- describe keyspace foobar; Column Metadata: Column Name: app_name (app_name) Validation Class: org.apache.cassandra.db.marshal.UTF8Type Column Name: app_id (app_id) Validation Class: org.apache.cassandra.db.marshal.UTF8Type Column Name: app_uri (app_uri) Validation Class: org.apache.cassandra.db.marshal.UTF8Type Index Type: KEYS Checking out a bit further: get applications where 'app_uri' = 'get-test'; --- RowKey: 9d699733-9afe-4a41-83ca-c60d040dacc0 get applications where 'app_id' = '9d699733-9afe-4a41-83ca-c60d040dacc0'; No indexed columns present in index clause with operator EQ So... I can see that the secondary indexes are working. Question 1: Has Built indexes been removed from the describe keyspace output? Or have I done something wrong? Question 2: Is there a way to force secondary index creation? -- Sasha Dolgy sasha.do...@gmail.com
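One way to attack question 2 on 0.7.x is to re-declare the column metadata with the CLI's update column family, which should kick off the asynchronous index build for any column newly marked index_type: KEYS. A sketch using the CF from this thread (index_name is my addition, not something from the original message; syntax per the 0.7 cassandra-cli):

```
update column family applications with comparator = UTF8Type
  and column_metadata = [
    {column_name: app_name, validation_class: UTF8Type},
    {column_name: app_uri,  validation_class: UTF8Type,
     index_type: KEYS, index_name: app_uri_idx},
    {column_name: app_id,   validation_class: UTF8Type}
  ];

describe keyspace foobar;
```

After the build finishes, 'Built indexes' in the describe output should list the index; until then, indexed queries against existing data may come back empty.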
Re: Ec2Snitch + NetworkTopologyStrategy if only in one region?
If you can use standard + encoded I would go with that. Aaron On 13 Apr 2011, at 07:07, William Oberman wrote: Excellent to know! (and yes, I figure I'll expand someday, so I'm glad I found this out before digging a hole). The other issue I've been pondering is a normal column family of encoded objects (in my case JSON) vs. a super column. Based on my use case, things I've read, etc... right now I'm coming down on normal + encoded. will On Tue, Apr 12, 2011 at 2:57 PM, Jonathan Ellis jbel...@gmail.com wrote: NTS is overkill in the sense that it doesn't really benefit you in a single DC, but if you think you may expand to another DC in the future it's much simpler if you were already using NTS than first migrating to NTS (changing strategy is painful). I can't think of any downsides to using NTS in a single-DC environment, so that's the safe option. On Tue, Apr 12, 2011 at 1:15 PM, William Oberman ober...@civicscience.com wrote: Hi, I'm getting closer to committing to Cassandra, and now I'm onto system/IT issues and questions. I'm in the Amazon EC2 cloud. I previously used this forum to discover the best practice for disk layouts (large instance + the two ephemeral disks in RAID0 for data + root volume for everything else). Now I'm hoping to confirm bits and pieces of things I've read about for snitch/replication strategies. I was thinking of using endpoint_snitch: org.apache.cassandra.locator.Ec2Snitch placement_strategy='org.apache.cassandra.locator.NetworkTopologyStrategy' (for people hitting this from the mailing list or google, I feel obligated to note that the former setting is in cassandra.yaml, and the latter is an option on a keyspace). But, I'm only in one region. Is using the Amazon snitch/NetworkTopologyStrategy overkill given everything I have is in one DC (I believe region==DC and availability_zone==rack)? I'm using multiple availability zones for some level of redundancy, I'm just not yet at the point where I'm using multiple regions. 
If someday I move to using multiple regions, would that change the answer? Thanks! -- Will Oberman Civic Science, Inc. 3030 Penn Avenue., First Floor Pittsburgh, PA 15201 (M) 412-480-7835 (E) ober...@civicscience.com -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of DataStax, the source for professional Cassandra support http://www.datastax.com
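For readers hitting this from the archive, the two settings under discussion can be sketched like this for a single-region cluster (the keyspace name MyKS and the replica count 3 are hypothetical; "us-east" is the DC name Ec2Snitch derives from the region, with the availability zone as the rack):

```
# cassandra.yaml, per node:
endpoint_snitch: org.apache.cassandra.locator.Ec2Snitch

# cassandra-cli, per keyspace:
create keyspace MyKS
  with placement_strategy = 'org.apache.cassandra.locator.NetworkTopologyStrategy'
  and strategy_options = [{us-east:3}];
```

If a second region is added later, only the strategy_options need extending (e.g. adding a us-west entry), which is exactly why starting with NTS avoids a painful strategy migration.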
Re: json2sstable
Reading the code, it looks like it could not find a subColumns item for the row in the JSON file. The target CF is a super CF; is the data from a super CF? Aaron On 13 Apr 2011, at 07:24, Steven Teo wrote: Hi, I am trying to run json2sstable with the following command but am receiving the below error. json2sstable -K testks -c testcf output.json /var/lib/cassandra/data/testks/testcf-f-1-Data.db Importing 321 keys... java.lang.NullPointerException at org.apache.cassandra.tools.SSTableImport.addColumnsToCF(SSTableImport.java:136) at org.apache.cassandra.tools.SSTableImport.addToSuperCF(SSTableImport.java:173) at org.apache.cassandra.tools.SSTableImport.importUnsorted(SSTableImport.java:228) at org.apache.cassandra.tools.SSTableImport.importJson(SSTableImport.java:197) at org.apache.cassandra.tools.SSTableImport.main(SSTableImport.java:421) ERROR: null Is there anything I did wrong here? Thanks!
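The NPE fires where addColumnsToCF looks up a subColumns array, so for a super CF each supercolumn entry in the input JSON must carry one. Illustratively only, with made-up hex names and timestamps -- the exact field names and encoding are whatever sstable2json emits in your build, so dumping a known-good super CF with sstable2json is the safest way to see the required shape:

```
{
  "726f776b6579": {
    "7375706572636f6c": {
      "deletedAt": -9223372036854775808,
      "subColumns": [
        ["636f6c6e616d65", "76616c7565", 1302500000000, false]
      ]
    }
  }
}
```

Data dumped from a standard CF (a flat list of columns per row key, with no subColumns level) will trip exactly this NullPointerException when imported into a super CF.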
Re: Update the Keyspace replication factor online
Are you changing the replication factor or moving nodes? To change the RF you need to repair, and then once all repairing is done run cleanup to remove the old data. You can move whole nodes by moving all their data with them, assigning a new IP, and updating the topology file if used. Aaron On 13 Apr 2011, at 07:56, Yudong Gao wrote: Hi, What operations will be executed (and what is the associated overhead) when the Keyspace replication factor is changed online, in a multi-datacenter setup with NetworkTopologyStrategy? I checked the wiki and the archive of the mailing list and found this, but it is not very complete. http://wiki.apache.org/cassandra/Operations Replication factor is not really intended to be changed in a live cluster either, but increasing it may be done if you (a) use ConsistencyLevel.QUORUM or ALL (depending on your existing replication factor) to make sure that a replica that actually has the data is consulted, (b) are willing to accept downtime while anti-entropy repair runs (see below), or (c) are willing to live with some clients potentially being told no data exists if they read from the new replica location(s) until repair is done. More specifically, in this scenario: {DC1:1, DC2:1} -> {DC2:1, DC3:1} 1. Can this be done online without shutting down the cluster? I thought there is an update keyspace command in the cassandra-cli. 2. If so, what operations will be executed? Will new replicas be created in new locations (in DC3) and existing replicas be deleted in old locations (in DC1)? 3. Or will they be updated only with reads at ConsistencyLevel.QUORUM or ALL, or nodetool repair? Thanks! Yudong
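Aaron's procedure, spelled out as a command sketch (keyspace and host names are hypothetical; the metadata change itself is online via the CLI, but the data only moves once repair runs, and cleanup should wait until all repairs have finished):

```
# cassandra-cli: change the strategy options on the live keyspace
update keyspace MyKS
  with placement_strategy = 'org.apache.cassandra.locator.NetworkTopologyStrategy'
  and strategy_options = [{DC2:1, DC3:1}];

# shell, on each node in turn: pull in the data the node is now a replica for
nodetool -h node1.example.com repair

# shell, on each node, only after every repair has completed:
nodetool -h node1.example.com cleanup
```

Until repair completes, reads served by the new replica locations (DC3 here) can return no data, which is what the wiki's caveats (a)-(c) are about.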
Re: flush_largest_memtables_at messages in 7.4
One thing I am noticing is that the cache hit rate is very low even though my key cache size is 1M and I have less than 1M rows. Not sure why there are so many cache misses? The key cache should be strictly LRU for read-only workloads. For write/read workloads it may not be strictly LRU because compaction causes key cache migration. In your case: Key cache capacity: 100 Key cache size: 256013 Key cache hit rate: 0.33801452784503633 So you have only 256k in the cache. Have you run for long enough after enabling it for it to actually be fully populated? -- / Peter Schuller
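To make "strictly LRU" and the hit-rate arithmetic concrete, here is a toy LRU key cache with hit-rate accounting -- purely illustrative, not Cassandra's implementation (which, as noted above, deviates from LRU when compaction migrates cache entries):

```python
from collections import OrderedDict

class KeyCache:
    """Toy LRU cache tracking a hit rate like the JMX 'Key cache hit rate'."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()
        self.hits = 0
        self.requests = 0

    def get(self, key):
        self.requests += 1
        if key in self.entries:
            self.hits += 1
            self.entries.move_to_end(key)   # refresh recency on a hit
            return self.entries[key]
        return None                          # a miss

    def put(self, key, value):
        self.entries[key] = value
        self.entries.move_to_end(key)
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the least recently used

    @property
    def hit_rate(self):
        return self.hits / self.requests if self.requests else 0.0
```

A cache that has not yet been fully populated behaves exactly like Peter describes: every first touch of a key is a miss, so the measured hit rate stays low until the working set has cycled through at least once.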
Re: Cassandra Database Modeling
Aaron, Thank you so much for your help. It is greatly appreciated! Looking at the design of the particle pairs: - key: expriement_id.time_interval - column name: pair_id - column value: distance, angle, other data packed together as JSON or some other format You wrote that retrieving millions of columns (I will have about 10,000,000 particle pairs) would be slow. You are also right that the retrieval of millions of columns into Python won't be fast. Suppose my desired query is to get all particle pairs on time interval [ Tn..T(n+1) ] where the distance between the two particles is smaller than X and the angle between the two particles is greater than Y. In such a query (as the above), given the fact that retrieving millions of columns could be slow, would it be best to, say, 'concatenate' all values for all particle pairs for a given 'expriement_id.time_interval' into one column? If data is stored in this way, I will be getting from Cassandra a binary string / JSON object that I will have to 'unpack' in my application. Is this a recommended approach? Are there better approaches? Is there a limit to the size that can be stored in one 'cell' (by 'cell' I mean the intersection between a key and a data column)? Is there a limit to the size of data of one key? one data column? Thanks in advance for any help / guidance. -Original Message- From: aaron morton aa...@thelastpickle.com Reply-to: user@cassandra.apache.org To: user@cassandra.apache.org Subject: Re: Cassandra Database Modeling Date: Wed, 13 Apr 2011 10:14:21 +1200 Yes for interactive == real time queries. Hadoop based techniques are for non time critical queries, but they do have greater analytical capabilities. particle_pairs: 1) Yes and no and sort of. Under the hood the get_slice api call will be used by your client library to pull back chunks of (ordered) columns. Most client libraries abstract away the chunking for you. 
2) If you are using a packed structure like JSON then no, Cassandra will have no idea what you've put in the columns other than bytes. It really depends on how much data you have per pair, but generally it's easier to pull back more data than try to get exactly what you need. Downside is you have to update all the data. 3) No, you would need to update all the data for the pair. I was assuming most of the data was written once, and that your simulation had something like a stop-the-world phase between time slices where state was dumped and then read to start the next interval. You could either read it first, or we can come up with something else. distance_cf 1) the query would return a list of columns, which have a name and value (as well as a timestamp and ttl). 2) depends on the client library; if using python go for https://github.com/pycassa/pycassa It will return objects 3) returning millions of columns is going to be slow; it would also be slow using an RDBMS. Creating millions of objects in Python is going to be slow. You would need to have a better idea of what queries you will actually want to run to see if it's *too* slow. If it is, one approach is to store the particles at the same distance in the same column, so you need to read fewer columns. Again depends on how your sim works. Time complexity depends on the number of columns read. Finding a row will not be O(1) as it may have to read from several files. Writes are more constant than reads. But remember, you can have a lot of io and cpu power in your cluster. Best advice is to jump in and see if the data model works for you at a small single node scale; most performance issues can be solved. Aaron On 12 Apr 2011, at 15:34, csharpplusproject wrote: Hi Aaron, Yes, of course it helps, I am starting to get a flavor of Cassandra -- thank you very much! First of all, by 'interactive' queries, are you referring to 'real-time' queries? 
(meaning, where experiments data is 'streaming', data needs to be stored and following that, the query needs to be run in real time)? Looking at the design of the particle pairs: - key: expriement_id.time_interval - column name: pair_id - column value: distance, angle, other data packed together as JSON or some other format A couple of questions: (1) Will a query such as pairID[ expriement_id.time_interval ] basically return an array of all pairIDs for the experiment, where each item is a 'packed' JSON? (2) Would it be possible, rather than returning the whole JSON object per every pairID, to get (say) only the distance? (3) Would it be possible to easily update certain 'pairIDs' with new values (for example, update pairIDs = {2389, 93434} with new distance values)? Looking at the design of the distance CF (for example): this is VERY INTERESTING. basically you are suggesting a design that will save the actual distance between each pair of particles, and will allow queries where we can find all pairIDs (for an experiment, on time_interval) that meet a certain distance criteria.
Re: Cassandra Database Modeling
Is there a limit to the size that can be stored in one 'cell' (by 'cell' I mean the intersection between a key and a data column)? Is there a limit to the size of data of one key? one data column? http://wiki.apache.org/cassandra/CassandraLimitations The data of cassandra are partitioned by the row key; therefore, if you want to put all pairs into the same row, you should consider the disk size.
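The reason the distance CF design supports "distance smaller than X" queries is that zero-padding the distance makes the lexicographic column order match the numeric order, so a single column slice from the empty start up to the padded bound returns exactly the qualifying pairs. A small sketch of the naming scheme (the 12/4 padding widths are arbitrary choices, and the in-memory filter below stands in for what would be a get_slice range query in Cassandra):

```python
def distance_column(distance, pair_id, width=12, precision=4):
    # Zero-padded so that string sort order == numeric sort order.
    return "%0*.*f.%s" % (width, precision, distance, pair_id)

def pairs_within(columns, max_distance, width=12, precision=4):
    # Emulates a column slice from '' up to the padded bound; in Cassandra
    # this would be a get_slice with start='' and finish=the padded bound.
    bound = "%0*.*f" % (width, precision, max_distance)
    return [name for name in sorted(columns) if name[:width] <= bound]
```

Because the comparator orders column names as strings, an unpadded "12.0" would sort before "5.25"; the padding is what makes the range slice meaningful.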
Re: Exception on cassandra startup 0.7.4
This is a problem reading the commitlog, which is not something scrub can help with. Looks like there is bad data in /home/paul/apps/cassandra/node1/commitlog/CommitLog-1302567818267.log. Somehow it's corrupt in a way that the checksum is ok. (Which sounds like https://issues.apache.org/jira/browse/CASSANDRA-2128 but that was fixed for 0.7.2.) Quick fix to get up and running again would be to just remove that file. (Any data in it will be missing, of course.) Longer term you should gzip it (and your system keyspace, so we get the schema too) and attach it to a ticket so we can take a closer look. On Tue, Apr 12, 2011 at 7:22 PM, Paul Lorenz plor...@gmail.com wrote: Hello, I've been running a single node cluster (0.7.4 built from the SVN tag, running on JDK 1.6.0_21 on Ubuntu 10.10) for testing purposes. After running fine for a couple of weeks, I got the error below on startup. (snip)
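Jonathan's quick fix can be scripted. This sketch quarantines the suspect segment instead of deleting it and gzips it for the ticket; the paths here are demo stand-ins (on a real node, COMMITLOG_DIR is whatever commitlog_directory in cassandra.yaml points at, and the node must be stopped first), and it creates a dummy file so the demo is self-contained:

```shell
# Demo stand-ins; on a real node, point COMMITLOG_DIR at the directory
# named by commitlog_directory in cassandra.yaml, with the node stopped.
COMMITLOG_DIR=/tmp/commitlog-demo
BAD_SEGMENT=CommitLog-1302567818267.log

mkdir -p "$COMMITLOG_DIR/quarantine"
touch "$COMMITLOG_DIR/$BAD_SEGMENT"   # dummy stand-in for the corrupt segment

# Move the bad segment aside rather than deleting it, then gzip it so it
# can be attached to a JIRA ticket along with the system keyspace.
mv "$COMMITLOG_DIR/$BAD_SEGMENT" "$COMMITLOG_DIR/quarantine/"
gzip -f "$COMMITLOG_DIR/quarantine/$BAD_SEGMENT"
```

Restarting after this skips replay of the removed segment; any mutations it held are lost, which is the trade-off Jonathan notes above.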
Re: Cassandra Database Modeling
Steven, Thank you. You wrote: The data of cassandra are partitioned by the row key; therefore, if you want to put all pairs into the same row, you should consider the disk size Can you please explain why the disk size is / might be a problem? Thanks, Shalom. -Original Message- From: Steven Yen-Liang Su xpste...@gmail.com Reply-to: user@cassandra.apache.org To: user@cassandra.apache.org Subject: Re: Cassandra Database Modeling Date: Wed, 13 Apr 2011 12:16:00 +0800 Is there a limit to the size that can be stored in one 'cell' (by 'cell' I mean the intersection between a key and a data column)? is there a limit to the size of data of one key? one data column? http://wiki.apache.org/cassandra/CassandraLimitations The data of cassandra are partitioned by the row key; therefore, if you want to put all pairs into the same row, you should consider the disk size. Thanks in advance for any help / guidance. -Original Message- From: aaron morton aa...@thelastpickle.com Reply-to: user@cassandra.apache.org To: user@cassandra.apache.org Subject: Re: Cassandra Database Modeling Date: Wed, 13 Apr 2011 10:14:21 +1200 Yes for interactive == real time queries. Hadoop based techniques are non time critical queries, but they do have greater analytical capabilities. particle_pairs: 1) Yes and no and sort of. Under the hood the get_slice api call will be used by your client library to pull back chunks of (ordered) columns. Most client libraries abstract away the chunking for you. 2) If you are using a packed structure like JSON then no, Cassandra will have no idea what you've put in the columns other than bytes . It really depends on how much data you have per pair, but generally it's easier to pull back more data than try to get exactly what you need. Downside is you have to update all the data. 3) No, you would need to update all the data for the pair. 
I was assuming most of the data was written once, and that your simulation had something like a stop-the-world phase between time slices where state was dumped and then read to start the next interval. You could either read it first, or we can come up with something else.

distance_cf:

1) The query would return a list of columns, which have a name and value (as well as a timestamp and TTL).

2) Depends on the client library; if using Python, go for https://github.com/pycassa/pycassa It will return objects.

3) Returning millions of columns is going to be slow; it would also be slow using an RDBMS. Creating millions of objects in Python is going to be slow. You would need a better idea of what queries you will actually want to run to see if it's *too* slow. If it is, one approach is to store the particles at the same distance in the same column, so you need to read fewer columns. Again, depends on how your sim works.

Time complexity depends on the number of columns read. Finding a row will not be O(1) as it may have to read from several files. Writes are more constant than reads. But remember, you can have a lot of IO and CPU power in your cluster.

Best advice is to jump in and see if the data model works for you at a small single-node scale; most performance issues can be solved.

Aaron

On 12 Apr 2011, at 15:34, csharpplusproject wrote:

Hi Aaron, yes, of course it helps, I am starting to get a flavor of Cassandra -- thank you very much!

First of all, by 'interactive' queries, are you referring to 'real-time' queries (meaning, where experiment data is 'streaming', data needs to be stored and, following that, the query needs to be run in real time)?
Looking at the design of the particle pairs:

- key: experiment_id.time_interval
- column name: pair_id
- column value: distance, angle, other data packed together as JSON or some other format

A couple of questions:

(1) Will a query such as pairID[ experiment_id.time_interval ] basically return an array of all pairIDs for the experiment, where each item is a 'packed' JSON?

(2) Would it be possible, rather than returning the whole JSON object per every pairID, to get (say) only the distance?

(3) Would it be possible to easily update certain 'pairIDs' with new values (for example, update pairIDs = {2389, 93434} with new distance values)?
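On Steven's disk-size point earlier in the thread: with the layout above, every pair for an experiment/time-interval lands in a single row, so row size grows linearly with the pair count. A rough back-of-the-envelope sketch (the 100-byte value and 30-byte per-column overhead figures are assumptions for illustration, not measured numbers):

```python
# Rough estimate of the on-disk footprint of one wide row holding all pairs.
# The per-pair sizes below are illustrative assumptions, not measured values.

def estimate_row_size(num_pairs, value_bytes=100, column_overhead_bytes=30):
    """Approximate bytes for one row containing num_pairs columns."""
    return num_pairs * (value_bytes + column_overhead_bytes)

for n in (10_000, 1_000_000, 100_000_000):
    mb = estimate_row_size(n) / 1024 ** 2
    print(f"{n:>11,} pairs -> roughly {mb:,.0f} MB in a single row")
```

At the high end a single row reaches into the tens of gigabytes, which is where the CassandraLimitations page linked above becomes relevant.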
Re: json2sstable
The data is a custom JSON; it seems like I may have got the structure wrong. How should the import JSON look?

Steven Teo

On 13-Apr-2011, at 10:43 AM, aaron morton wrote:

Reading the code, it looks like it could not find a subColumns item for the row in the JSON file. The target CF is a super CF; is the data from a super CF?

Aaron

On 13 Apr 2011, at 07:24, Steven Teo wrote:

Hi, I am trying to run json2sstable with the following command but am receiving the below error.

json2sstable -K testks -c testcf output.json /var/lib/cassandra/data/testks/testcf-f-1-Data.db

Importing 321 keys...
java.lang.NullPointerException
	at org.apache.cassandra.tools.SSTableImport.addColumnsToCF(SSTableImport.java:136)
	at org.apache.cassandra.tools.SSTableImport.addToSuperCF(SSTableImport.java:173)
	at org.apache.cassandra.tools.SSTableImport.importUnsorted(SSTableImport.java:228)
	at org.apache.cassandra.tools.SSTableImport.importJson(SSTableImport.java:197)
	at org.apache.cassandra.tools.SSTableImport.main(SSTableImport.java:421)
ERROR: null

Anything I did wrongly here? Thanks!
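For what it's worth, json2sstable for a super column family expects the same shape that sstable2json produces: each row key maps to an object of super columns, and each super column must carry a subColumns list (a missing subColumns is what the NullPointerException above trips on). A hedged sketch of that shape follows — the key and column names here are made up, and the exact field names are easiest to confirm by running sstable2json on an existing super-CF sstable from the same version:

```json
{
  "row1": {
    "supercol1": {
      "deletedAt": -9223372036854775808,
      "subColumns": [
        ["subcol1", "value1", 1302665640000000],
        ["subcol2", "value2", 1302665640000000]
      ]
    }
  }
}
```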
Re: flush_largest_memtables_at messages in 7.4
Does it really matter how long cassandra has been running? I thought it will keep keys of 1M at least. Regarding your previous question about queue size in iostat I see it ranging from 114-300.
Re: flush_largest_memtables_at messages in 7.4
Does it really matter how long cassandra has been running? I thought it will keep keys of 1M at least.

It will keep up to the limit, and it will save caches periodically and reload them on start. But the cache needs to be populated by traffic first. If you wrote a bunch of data, enabled the row cache, and began reading, you have to first wait for population of the cache prior to looking at cache locality. Note that the saving of caches is periodic, and if you were constantly restarting nodes during testing maybe it never got saved with the full set of keys.

Regarding your previous question about queue size in iostat I see it ranging from 114-300.

Saturated.

--
/ Peter Schuller
Error while startup - latest trunk build
Hi, I am getting the following exception while starting a Cassandra trunk build. Am I missing any configuration options? Please help.

Thanks, Shariq.

Stack trace:

~/work/cassandra-trunk$ ./bin/cassandra -f
INFO 11:04:07,864 Logging initialized
INFO 11:04:07,877 Heap size: 1893728256/1893728256
INFO 11:04:07,878 JNA not found. Native methods will be disabled.
INFO 11:04:07,885 Loading settings from file:/home/shariq/work/cassandra-trunk/conf/cassandra.yaml
INFO 11:04:08,003 DiskAccessMode 'auto' determined to be mmap, indexAccessMode is mmap
INFO 11:04:08,083 Global memtable threshold is enabled at 602MB
INFO 11:04:08,136 reading saved cache /var/lib/cassandra/saved_caches/system-IndexInfo-KeyCache
INFO 11:04:08,145 Opening /var/lib/cassandra/data/system/IndexInfo-f-5
INFO 11:04:08,163 reading saved cache /var/lib/cassandra/saved_caches/system-Schema-KeyCache
INFO 11:04:08,165 Opening /var/lib/cassandra/data/system/Schema-f-57
INFO 11:04:08,169 Opening /var/lib/cassandra/data/system/Schema-f-59
INFO 11:04:08,171 Opening /var/lib/cassandra/data/system/Schema-f-58
INFO 11:04:08,176 Opening /var/lib/cassandra/data/system/Migrations-f-58
INFO 11:04:08,177 Opening /var/lib/cassandra/data/system/Migrations-f-57
INFO 11:04:08,178 Opening /var/lib/cassandra/data/system/Migrations-f-59
INFO 11:04:08,182 reading saved cache /var/lib/cassandra/saved_caches/system-LocationInfo-KeyCache
INFO 11:04:08,185 Opening /var/lib/cassandra/data/system/LocationInfo-f-46
INFO 11:04:08,188 Opening /var/lib/cassandra/data/system/LocationInfo-f-47
INFO 11:04:08,191 Opening /var/lib/cassandra/data/system/LocationInfo-f-45
INFO 11:04:08,236 Loading schema version 33ac001b-60fc-11e0-8f89-e700f669bcfc
ERROR 11:04:08,463 Exception encountered during startup.
java.lang.RuntimeException: org.apache.cassandra.config.ConfigurationException: SimpleStrategy requires a replication_factor strategy option.
	at org.apache.cassandra.db.Table.init(Table.java:277)
	at org.apache.cassandra.db.Table.open(Table.java:109)
	at org.apache.cassandra.service.AbstractCassandraDaemon.setup(AbstractCassandraDaemon.java:160)
	at org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:314)
	at org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:80)
Caused by: org.apache.cassandra.config.ConfigurationException: SimpleStrategy requires a replication_factor strategy option.
	at org.apache.cassandra.locator.SimpleStrategy.validateOptions(SimpleStrategy.java:75)
	at org.apache.cassandra.locator.AbstractReplicationStrategy.createReplicationStrategy(AbstractReplicationStrategy.java:262)
	at org.apache.cassandra.db.Table.createReplicationStrategy(Table.java:327)
	at org.apache.cassandra.db.Table.init(Table.java:273)
	... 4 more
Exception encountered during startup.
java.lang.RuntimeException: org.apache.cassandra.config.ConfigurationException: SimpleStrategy requires a replication_factor strategy option.
	at org.apache.cassandra.db.Table.init(Table.java:277)
	at org.apache.cassandra.db.Table.open(Table.java:109)
	at org.apache.cassandra.service.AbstractCassandraDaemon.setup(AbstractCassandraDaemon.java:160)
	at org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:314)
	at org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:80)
Caused by: org.apache.cassandra.config.ConfigurationException: SimpleStrategy requires a replication_factor strategy option.
	at org.apache.cassandra.locator.SimpleStrategy.validateOptions(SimpleStrategy.java:75)
	at org.apache.cassandra.locator.AbstractReplicationStrategy.createReplicationStrategy(AbstractReplicationStrategy.java:262)
	at org.apache.cassandra.db.Table.createReplicationStrategy(Table.java:327)
	at org.apache.cassandra.db.Table.init(Table.java:273)
	... 4 more
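The error suggests a schema migration issue: on trunk at that time, replication_factor moved from a top-level keyspace attribute into strategy_options, so a keyspace defined the old way fails validation when the node loads its schema. On a disposable dev instance, clearing the old data and recreating the schema is the quickest way out. For reference, a hedged sketch of a definition that satisfies the new validation, in cassandra-cli syntax of that era (keyspace name is hypothetical, and the exact syntax may vary between trunk revisions):

```
create keyspace MyKeyspace
  with placement_strategy = 'org.apache.cassandra.locator.SimpleStrategy'
  and strategy_options = [{replication_factor:1}];
```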
Re: quick repair tool question
Cool! And I thought I made that one up myself : )

On Apr 13, 2011, at 2:13 AM, Chris Burroughs wrote:

On 04/12/2011 11:11 AM, Jonathan Colby wrote:

I'm not sure if this is the kosher way to rebuild the sstable data, but it seemed to work.

http://wiki.apache.org/cassandra/Operations#Handling_failure Option #3.
Re: repair never completes with finished successfully
Great tips. I will investigate further with your suggestions in mind. Hopefully the problem has gone away since I pulled in fresh data on the node with problems.

On Apr 13, 2011, at 3:54 AM, aaron morton wrote:

Ah, unreadable rows, and in the validation compaction no less. Makes a little more sense now.

Anyone help with the EOF when deserializing columns? Is the fix to run scrub or drop the sstable?

Here's a theory. AES is trying to...

1) Create TreeRequests that specify a range we want to validate.
2) Send TreeRequests to the local node and a neighbour.
3) Process each TreeRequest by running a validation compaction (CompactionManager.doValidationCompaction in your prev stacks).
4) When both TreeRequests return, work out the differences and then stream data if needed.

Perhaps step 3 is not completing because of errors like http://www.mail-archive.com/user@cassandra.apache.org/msg12196.html

If the row is over multiple sstables we can skip the row in one sstable. However, if it's in a single sstable, PrecompactedRow will raise an IOError if there is a problem. This is not what is in the linked error stack, which shows a row being skipped; just a hunch we could check out.

Do you see any IOErrors (not exceptions) in the logs, or exceptions with doValidationCompaction in the stack?

For a tree request on the node you start compaction on you should see these logs...

1) Waiting for repair requests...
2) One of "Stored local tree" or "Stored remote tree", depending on which returns first (at DEBUG level)
3) Queueing comparison

If we do not have the 3rd log then we did not get a reply from either local or remote.

Aaron

On 13 Apr 2011, at 00:57, Jonathan Colby wrote:

There is no Repair session message either.
It just starts with a message like:

INFO [manual-repair-2af33a51-f46a-4ba2-b1fb-ead5159dc723] 2011-04-10 14:00:59,051 AntiEntropyService.java (line 770) Waiting for repair requests: [#TreeRequest manual-repair-2af33a51-f46a-4ba2-b1fb-ead5159dc723, /10.46.108.101, (DFS,main), #TreeRequest manual-repair-2af33a51-f46a-4ba2-b1fb-ead5159dc723, /10.47.108.100, (DFS,main), #TreeRequest manual-repair-2af33a51-f46a-4ba2-b1fb-ead5159dc723, /10.47.108.102, (DFS,main), #TreeRequest manual-repair-2af33a51-f46a-4ba2-b1fb-ead5159dc723, /10.47.108.101, (DFS,main)]

NETSTATS:
Mode: Normal
Not sending any streams.
Not receiving any streams.
Pool Name    Active  Pending  Completed
Commands        n/a        0     150846
Responses       n/a        0     443183

One node in our cluster still has unreadable rows, where the reads trip up every time for certain sstables (you've probably seen my earlier threads regarding that). My suspicion is that the bloom filter read on the node with the corrupt sstables is never reporting back to the repair, thereby causing it to hang.

What would be great is a scrub tool that ignores unreadable/unserializable rows! : )

On Apr 12, 2011, at 2:15 PM, aaron morton wrote:

Do you see a message starting "Repair session" and ending with "completed successfully"? Or do you see any streaming activity using nodetool netstats?

Repair can hang if a neighbour dies and fails to send a requested stream. It will timeout after 24 hours (I think).

Aaron

On 12 Apr 2011, at 23:39, Karl Hiramoto wrote:

On 12/04/2011 13:31, Jonathan Colby wrote:

There are a few other threads related to problems with nodetool repair in 0.7.4. However, I'm not seeing any errors, just never getting a message that the repair completed successfully.
In my production and test clusters (with just a few MB of data), the repair nodetool prompt never returns and the last entry in cassandra.log is always something like:

#TreeRequest manual-repair-f739ca7a-bef8-4683-b249-09105f6719d9, /10.46.108.102, (DFS,main) completed successfully: 1 outstanding

But I don't see a message, even hours later, that the 1 outstanding request finished successfully. Anyone else experience this? These are physical server nodes in local data centers, not EC2.

I've seen this. To fix it, try a nodetool compact, then repair.

--
Karl
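Aaron's three-log handshake earlier in this thread boils down to a simple gate: the comparison is queued only once both the local and the remote tree responses have been stored, so a replica that never answers (for example because validation compaction died on a corrupt row) leaves the repair waiting indefinitely. A toy model of that gating (plain Python; the class and method names are mine, not Cassandra's):

```python
class TreeRequestTracker:
    """Toy model: queue the tree comparison only after both replies arrive."""

    def __init__(self):
        self.local_tree = None
        self.remote_tree = None
        self.comparison_queued = False  # mirrors the "Queueing comparison" log

    def on_tree(self, source, tree):
        # Mirrors the "Stored local tree" / "Stored remote tree" logs.
        if source == "local":
            self.local_tree = tree
        else:
            self.remote_tree = tree
        if self.local_tree is not None and self.remote_tree is not None:
            self.comparison_queued = True


tracker = TreeRequestTracker()
tracker.on_tree("local", {"range-a": "hash-1"})
# If the remote replica never responds, the repair appears to hang here.
assert not tracker.comparison_queued
tracker.on_tree("remote", {"range-a": "hash-2"})
assert tracker.comparison_queued  # both trees stored, so comparison is queued
```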