Re: How to use join_ring=false?

2011-03-22 Thread Jason Harvey
Gah! Thx :)

Jason

On Mar 21, 10:34 pm, Chris Goffinet c...@chrisgoffinet.com wrote:
 -Dcassandra.join_ring=false

 -Chris

 On Mar 21, 2011, at 10:32 PM, Jason Harvey wrote:

  I set join_ring=false in my java opts:
  -Djoin_ring=false

  However, when the node started up, it joined the ring. Is there
  something I am missing? Using 0.7.4

  Thanks,
  Jason
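A minimal sketch of why the unprefixed flag has no effect, assuming the server looks the setting up under the exact name cassandra.join_ring and treats a missing value as "true" (that default is an assumption, though it matches the behaviour reported above). The helper names parse_system_properties and should_join_ring are made up for illustration; the real lookup happens inside the JVM.

```python
def parse_system_properties(jvm_args):
    """Collect -Dname=value arguments into a dict, roughly as a JVM exposes them."""
    props = {}
    for arg in jvm_args:
        if arg.startswith("-D") and "=" in arg:
            name, value = arg[2:].split("=", 1)
            props[name] = value
    return props


def should_join_ring(props):
    # Assumption: an absent property falls back to "true", i.e. join the ring.
    return props.get("cassandra.join_ring", "true").lower() == "true"


# "-Djoin_ring=false" sets a property nobody reads, so the node still joins.
print(should_join_ring(parse_system_properties(["-Djoin_ring=false"])))
print(should_join_ring(parse_system_properties(["-Dcassandra.join_ring=false"])))
```

With the first argument list the lookup falls through to the default; only the prefixed name changes the outcome.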


Re: stress.py bug?

2011-03-22 Thread Sheng Chen
I am just wondering: why do the stress test tools (Python, Java) need more
threads?
Is the bottleneck of a single thread in the client, or in the server?
Thanks.

Sean

2011/3/22 Ryan King r...@twitter.com

 On Mon, Mar 21, 2011 at 4:02 AM, pob peterob...@gmail.com wrote:
  Hi,
  I'm inserting data from client node with stress.py to cluster of 6 nodes.
  They are all on 1Gbps network, max real throughput of network is 930Mbps
  (after measurement).
  python stress.py -c 1 -S 17  -d{6nodes}  -l3 -e QUORUM
   --operation=insert -i 1 -n 50 -t100
  The problem is stress.py show up it does avg ~750ops/sec what is 127MB/s,
  but the real throughput of network is ~116MB/s.

 You may need more concurrency in order to saturate your network.

 -ryan



SSL Streaming

2011-03-22 Thread Sasha Dolgy
Hi,

Is there documentation available anywhere that describes how one can
use org.apache.cassandra.security.streaming.* ?   After the EC2 posts
yesterday, one question I was asked was about the security of data
being shifted between nodes.  Is it done in clear text, or
encrypted..?  I haven't seen anything to suggest that it's encrypted,
but see in the source that security.streaming does leverage SSL ...

Thanks in advance for some pointers to documentation.

Also, for anyone who is using SSL .. how much of a performance impact
have you noticed?  Is it minimal or significant?

-- 
Sasha Dolgy
sasha.do...@gmail.com


Re: stress.py bug?

2011-03-22 Thread Maki Watanabe
A client thread needs to wait for each response, while the server can
handle multiple requests simultaneously.
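The point can be put as a rough upper bound, sketched here under the assumption that every client thread issues one request at a time and blocks until the reply arrives. The function name and the 2 ms round trip are hypothetical, not measurements from this thread.

```python
def max_client_ops_per_sec(threads, round_trip_ms):
    """Upper bound on request rate when each thread blocks on its response."""
    return threads * 1000 / round_trip_ms


# With a hypothetical 2 ms round trip, one thread cannot exceed 500 ops/sec,
# no matter how much spare capacity the cluster has.
for threads in (1, 10, 100):
    print(threads, max_client_ops_per_sec(threads, 2))
```

More threads raise the ceiling until the server, the network, or the client machine itself becomes the limit.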

2011/3/22 Sheng Chen chensheng2...@gmail.com:
 I am just wondering, why the stress test tools (python, java) need more
 threads ?
 Is the bottleneck of a single thread in the client, or in the server?
 Thanks.
 Sean

 2011/3/22 Ryan King r...@twitter.com

 On Mon, Mar 21, 2011 at 4:02 AM, pob peterob...@gmail.com wrote:
  Hi,
  I'm inserting data from client node with stress.py to cluster of 6
  nodes.
  They are all on 1Gbps network, max real throughput of network is 930Mbps
  (after measurement).
  python stress.py -c 1 -S 17  -d{6nodes}  -l3 -e QUORUM
   --operation=insert -i 1 -n 50 -t100
  The problem is stress.py show up it does avg ~750ops/sec what is
  127MB/s,
  but the real throughput of network is ~116MB/s.

 You may need more concurrency in order to saturate your network.

 -ryan





-- 
w3m


Re: cassandra nodes with mixed hard disk sizes

2011-03-22 Thread buddhasystem

aaron morton wrote:
 
 
 Also a node is be responsible for storing it's token range and acting as a
 replica for other token ranges. So reducing the token range may not have a
 dramatic affect on the storage requirements. 
 

Aaron,

is there a way to configure wimpy nodes such that the replicas are
elsewhere?


--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/cassandra-nodes-with-mixed-hard-disk-sizes-tp6194071p6195543.html
Sent from the cassandra-u...@incubator.apache.org mailing list archive at 
Nabble.com.


Re: cassandra nodes with mixed hard disk sizes

2011-03-22 Thread Daniel Doubleday

On Mar 22, 2011, at 5:09 AM, aaron morton wrote:

 1) You should use nodes with the same capacity (CPU, RAM, HDD), cassandra 
 assumes they are all equal. 

Care to elaborate? While equal nodes will certainly make life easier, I would 
have thought that the dynamic snitch would take care of performance differences 
and that manual assignment of token ranges can yield any data distribution. 
Obviously, if a node has twice as much data it will probably get twice the load. 
But if that is no problem ...

Where does Cassandra assume that all nodes are equal?

Cheers Daniel


 
 2) Not sure what exactly would happen. Am guessing either the node would 
 shutdown or writes would eventually block, probably the former. If the node 
 was up read performance may suffer (if there were more writes been sent in). 
 If you really want to know more let me know and I may find time to dig into 
 it. 
 
 Also a node is be responsible for storing it's token range and acting as a 
 replica for other token ranges. So reducing the token range may not have a 
 dramatic affect on the storage requirements. 
 
 Hope that helps. 
 Aaron
 
 On 22 Mar 2011, at 09:50, Jonathan Colby wrote:
 
 
 This is a two part question ...
 
 1. If you have cassandra nodes with different sized hard disks,  how do you 
 deal with assigning the token ring such that the nodes with larger disks get 
 more data?   In other words, given equally distributed token ranges, when 
 the smaller disk nodes run out of space, the larger disk nodes with still 
 have unused capacity.Or is installing a mixed hardware cluster a no-no?
 
 2. What happens when a cassandra node runs out of disk space for its data 
 files?  Does it continue serving the data while not accepting new data?  Or 
 does the node break and require manual intervention?
 
 This info has alluded me elsewhere.
 Jon
 



Re: cassandra nodes with mixed hard disk sizes

2011-03-22 Thread Aaron Morton
Not that I know of.
Aaron

On 22/03/2011, at 10:45 PM, buddhasystem potek...@bnl.gov wrote:

 
 aaron morton wrote:
 
 
 Also a node is be responsible for storing it's token range and acting as a
 replica for other token ranges. So reducing the token range may not have a
 dramatic affect on the storage requirements. 
 
 
 Aaron,
 
 is there a way to configure wimpy nodes such that the replicas are
 elsewhere?
 
 


Re: cassandra nodes with mixed hard disk sizes

2011-03-22 Thread Aaron Morton
My assumption comes from not seeing anything in the code that explicitly 
supports nodes of different specs (I also think I saw it somewhere ages ago). 
AFAIK the dynamic snitch is there to detect nodes with temporarily reduced 
throughput and to try to reduce the read load on them. 

I may be wrong on this, so anyone else feel free to jump in. Here are some 
issues to consider...

- Keyspace memory requirements are global; all nodes must have enough memory to 
support the CFs.
- During node moves, additions or deletions a node's token range may increase; 
nodes with less total space than others would make this more complicated.
- During a write the mutation is sent to all replicas, so a weak node that is a 
replica for a strong and busy node will be asked to store data from the strong 
node.
- Read repair reads from all replicas.
- When strong nodes that replicate to a weak node are compacting or repairing, 
the dynamic snitch may order them lower than the weak node, potentially 
increasing read requests on the weak one.
- Down time for a strong node (or a cluster partition) may result in increased 
read traffic to a weak node if all up replicas are needed to achieve the CL.
- Nodes store their own token range and the token ranges of RF-1 other nodes.

Overall, when a node goes down other nodes need to be able to handle the 
potential extra load (connections, reads, storing HH). If you have some weak 
and some strong nodes there is a chance of the weak nodes being overwhelmed, 
which may reduce the availability of your cluster.
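The last bullet can be made concrete with a small sketch. It assumes the simplest ring placement, where each range is stored on its owning node and on the next RF-1 nodes around the ring, so a node holds its own range plus the ranges of the RF-1 nodes before it. Other placement strategies behave differently, and the function name and the example ring are made up.

```python
from fractions import Fraction


def stored_fraction(range_fractions, rf):
    """Fraction of the data set held by each node on a simple ring.

    range_fractions[i] is the share of the token space owned by node i.
    """
    n = len(range_fractions)
    return [
        sum(range_fractions[(i - k) % n] for k in range(rf))
        for i in range(n)
    ]


equal = [Fraction(1, 4)] * 4
skewed = [Fraction(1, 10)] + [Fraction(3, 10)] * 3  # node 0 is the weak node

print(stored_fraction(equal, 3))
print(stored_fraction(skewed, 3))
```

In this example, cutting the weak node's own range from 1/4 to 1/10 of the ring only lowers what it stores from 3/4 to 7/10 of the data set, because it still replicates two full-sized neighbours.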

Hope that helps.
Aaron

On 22/03/2011, at 10:54 PM, Daniel Doubleday daniel.double...@gmx.net wrote:

 
 On Mar 22, 2011, at 5:09 AM, aaron morton wrote:
 1) You should use nodes with the same capacity (CPU, RAM, HDD), cassandra 
 assumes they are all equal. 
 
 Care to elaborate? While equal node will certainly make life easier I would 
 have thought that  dynamic snitch would take care of performance differences 
 and manual assignment of token ranges can yield to any data distribution. 
 Obviously if a node has twices  as much data will probably get twice the 
 load. But if that is no problem ...
 
 Where does cassandra assume that all are equal?  
 
 Cheers Daniel
 
 
 
 2) Not sure what exactly would happen. Am guessing either the node would 
 shutdown or writes would eventually block, probably the former. If the node 
 was up read performance may suffer (if there were more writes been sent in). 
 If you really want to know more let me know and I may find time to dig into 
 it. 
 
 Also a node is be responsible for storing it's token range and acting as a 
 replica for other token ranges. So reducing the token range may not have a 
 dramatic affect on the storage requirements. 
 
 Hope that helps. 
 Aaron
 
 On 22 Mar 2011, at 09:50, Jonathan Colby wrote:
 
 
 This is a two part question ...
 
 1. If you have cassandra nodes with different sized hard disks,  how do you 
 deal with assigning the token ring such that the nodes with larger disks 
 get more data?   In other words, given equally distributed token ranges, 
 when the smaller disk nodes run out of space, the larger disk nodes with 
 still have unused capacity.Or is installing a mixed hardware cluster a 
 no-no?
 
 2. What happens when a cassandra node runs out of disk space for its data 
 files?  Does it continue serving the data while not accepting new data?  Or 
 does the node break and require manual intervention?
 
 This info has alluded me elsewhere.
 Jon
 
 



Re: cassandra nodes with mixed hard disk sizes

2011-03-22 Thread Aaron Morton
Should be: not that I know of without making code changes.
Aaron

On 22/03/2011, at 11:05 PM, Aaron Morton aa...@thelastpickle.com wrote:

 Not that I know of.
 Aaron
 
 On 22/03/2011, at 10:45 PM, buddhasystem potek...@bnl.gov wrote:
 
 
 aaron morton wrote:
 
 
 Also a node is be responsible for storing it's token range and acting as a
 replica for other token ranges. So reducing the token range may not have a
 dramatic affect on the storage requirements. 
 
 
 Aaron,
 
 is there a way to configure wimpy nodes such that the replicas are
 elsewhere?
 
 


Deleting old SSTables

2011-03-22 Thread Jonathan Colby
According to the Wiki Page on compaction:  once compaction is finished, the old 
SSTable files may be deleted*

* http://wiki.apache.org/cassandra/MemtableSSTable

I thought the old SSTables would be deleted automatically, but this wiki page 
got me thinking otherwise.

The question is: if it is true that old SSTables must be manually deleted, how 
can one safely identify which SSTables can be deleted?

Jon







Re: Can the Cassandra to be hosted, with all your features and performance, on Microsoft Azure ?

2011-03-22 Thread FernandoVM
Hi,

 contrib/py_stress is the easiest way to shake out any issues with your
 install and get a benchmark.
 There is also https://github.com/brianfrankcooper/YCSB but I would go with
 py_stress until it stops been useful.

Very good, thanks!


 Note: These are abstract benchmarks to be used for entertainment purposes
 only, the performance and scaling of your application may vary. :)

Yes, of course. I just want a baseline to compare Cassandra performance
between Azure and a local server. I need to know more about this because
of the details of the Azure architecture: the compute instances will not
store data on a local disk, but in the blob storage service with a local
cache. I want to see the impact of this. :)

Thanks...

[]'s
FernandoVM

On Tue, Mar 22, 2011 at 12:33 AM, aaron morton aa...@thelastpickle.com wrote:
 contrib/py_stress is the easiest way to shake out any issues with your
 install and get a benchmark.
 There is also https://github.com/brianfrankcooper/YCSB but I would go with
 py_stress until it stops been useful.
 Note: These are abstract benchmarks to be used for entertainment purposes
 only, the performance and scaling of your application may vary. :)

 Aaron
 On 22 Mar 2011, at 07:24, FernandoVM wrote:

 There are any benchmark that I can apply after install Cassandra on
 Azure to check performance/scalability issues?


 []'s
 FernandoVM

 On Sun, Mar 13, 2011 at 10:16 PM, aaron morton aa...@thelastpickle.com
 wrote:

 If it works like all the other virtual machine hosts then yes it can be

 hosted.

 Performance can always be less on a virtual machine though.

 See http://wiki.apache.org/cassandra/CloudConfig

 Aaron

 On 14 Mar 2011, at 13:09, FernandoVM wrote:

 Hello friends,


      Anyone know if the Cassandra can be hosted, with all your

 features and performance, on Microsoft Azure?


 []'s

 FernandoVM





 --

 []'s
 FernandoVM





-- 

[]'s
FernandoVM


Changing memtable_throughput_in_mb on a running system

2011-03-22 Thread Jonathan Colby
It seems some settings like memtable_throughput_in_mb  are Keyspace-specific 
(at least with 0.7.4).

How can these settings best be changed on a running cluster?

PS - preferably by a sysadmin using nodetool or cassandra-cli

Thanks!
Jon

Re: EC2 - 2 regions

2011-03-22 Thread Michael Rüger
Thanks Milind for sharing!

As Sasha already asked, EC2 sends data across regions over the
internet without any encryption. So you may want to consider tunneling the
traffic through ssh.

I don't know how to do that with Cassandra. Does anyone?

Regards, mike

On Tue, Mar 22, 2011 at 5:29 AM, Milind Parikh milindpar...@gmail.com wrote:

 Patch is attached... I don't have access to Jira.

 A cautionery note: This is NOT a general solution and is not intended as
 such. It could be included as a part of larger patch. I will explain in the
 limitation sections about why it is not a general solution; as I find time.

 Regards
 Milind

 On Mon, Mar 21, 2011 at 11:42 PM, Jeremy Hanna jeremy.hanna1...@gmail.com
  wrote:

 Sorry if I was presumptuous earlier.  I created a ticket so that the patch
 could be submitted and reviewed - that is if it can be generalized so that
 it works across regions and doesn't adversely affect the common case.
 https://issues.apache.org/jira/browse/CASSANDRA-2362

 On Mar 21, 2011, at 10:41 PM, Jeremy Hanna wrote:

  Sorry if I was presumptuous earlier.  I created a ticket so that the
 patch could be submitted and reviewed - that is if it can be generalized so
 that it works across regions and doesn't adversely affect the common case.
  https://issues.apache.org/jira/browse/CASSANDRA-2362
  
  On Mar 21, 2011, at 12:20 PM, Jeremy Hanna wrote:
 
  I talked to Matt Dennis in the channel about it and I think everyone
 would like to make sure that cassandra works great across multiple regions.
  He sounded like he didn't know why it wouldn't work after having looked at
 the patches.  I would like to try it both ways - with and without the
 patches later today if I can and I'd like to help out with getting it
 working out of the box.
 
  Thanks for the investigative work and documentation Milind!
 
  Jeremy
 
  On Mar 21, 2011, at 12:12 PM, Dave Viner wrote:
 
  Hi Milind,
 
  Great work here.  Can you provide the patch against the 2 files?
 
  Perhaps there's some way to incorporate it into the trunk of cassandra
 so that this is feasible (in a future release) without patching the source
 code.
 
  Dave Viner
 
 
  On Mon, Mar 21, 2011 at 9:41 AM, A J s5a...@gmail.com wrote:
  Thanks for sharing the document, Milind !
  Followed the instructions and it worked for me.
 
  On Mon, Mar 21, 2011 at 5:01 AM, Milind Parikh 
 milindpar...@gmail.com wrote:
  Here's the document on Cassandra (0.7.4) across EC2 regions. Clearly
 this is
  work in progress but wanted to share what I have. PDF is the
 working
  copy.
 
 
 
 https://docs.google.com/document/d/175duUNIx7m5mCDa2sjXVI04ekyMa5bdiWdu-AFgisaY/edit?hl=en
 
  On Sun, Mar 20, 2011 at 7:49 PM, aaron morton 
 aa...@thelastpickle.com
  wrote:
 
  Recent discussion on the dev list
  http://www.mail-archive.com/dev@cassandra.apache.org/msg01832.html
  Aaron
  On 19 Mar 2011, at 06:46, A J wrote:
 
  Just to add, all the telnet (port 7000) and cassandra-cli (port
 9160)
  connections are done using the public DNS (that goes like
  ec2-.compute.amazonaws.com)
 
  On Fri, Mar 18, 2011 at 1:37 PM, A J s5a...@gmail.com wrote:
 
  I am able to telnet from one region to another on 7000 port without
 
  issues. (I get the expected Connected to .Escape character is
 
  '^]'.)
 
  Also I am able to execute cassandra client on 9160 port from one
 
  region to another without issues (this is when I run cassandra
 
  separately on each region without forming a cluster).
 
  So I think the ports 7000 and 9160 are not the issue.
 
 
 
  On Fri, Mar 18, 2011 at 1:26 PM, Dave Viner davevi...@gmail.com
 wrote:
 
  From the us-west instance, are you able to connect to the us-east
 instance
 
  using telnet on port 7000 and 9160?
 
  If not, then you need to open those ports for communication (via
 your
 
  Security Group)
 
  Dave Viner
 
  On Fri, Mar 18, 2011 at 10:20 AM, A J s5a...@gmail.com wrote:
 
  Thats exactly what I am doing.
 
  I was able to do the first two scenarios without any issues (i.e. 2
 
  nodes in same availability zone. Followed by an additional node in a
 
  different zone but same region)
 
  I am stuck at the third scenario of separate regions.
 
  (I did read the Cassandra nodes on EC2 in two different regions not
 
  communicating thread but it did not seem to end with resolution)
 
 
  On Fri, Mar 18, 2011 at 1:15 PM, Dave Viner davevi...@gmail.com
 wrote:
 
  Hi AJ,
 
  I'd suggest getting to a multi-region cluster step-by-step.  First,
 get
 
  2
 
  nodes running in the same availability zone.  Make sure that works
 
  properly.
 
  Second, add a node in a separate availability zone, but in the same
 
  region.
 
  Make sure that's working properly.  Third, add a node that's in a
 
  separate
 
  region.
 
  Taking it step-by-step will ensure that any issues are specific to
 the
 
  region-to-region communication, rather than intra-zone connectivity
 or
 
  cassandra cluster configuration.
 
  Dave Viner
 
  On Fri, Mar 18, 2011 at 8:34 AM, A 

Re: EC2 - 2 regions

2011-03-22 Thread Jeremy Hanna
Milind,

Thank you for attaching the patch here, but it would be really nice if you 
could create a jira account so you could participate in the discussion on the 
ticket and put the patch on there - that is the way people license their 
contributions with the apache 2 license.  You just need to create an account 
with the public jira linked off of the ticket at the top.

It's understandable that it wouldn't necessarily be a general solution now - but 
it's a start to understanding what would need to be done so that, if possible, 
something general could be derived.  I'm just trying to help get the discussion 
started so it could be something that people could do out of the box.  Not only 
that, but also so that it could be tested and evolve with the codebase so that 
people could know that it is hardened and used by others.

Any limitations would be nice to note when you attach the patch to the ticket.

Thanks so much for your work on this!

Jeremy
  
On Mar 21, 2011, at 11:29 PM, Milind Parikh wrote:

 Patch is attached... I don't have access to Jira.
  
 A cautionery note: This is NOT a general solution and is not intended as 
 such. It could be included as a part of larger patch. I will explain in the 
 limitation sections about why it is not a general solution; as I find time. 
 
 Regards
 Milind
  
 On Mon, Mar 21, 2011 at 11:42 PM, Jeremy Hanna jeremy.hanna1...@gmail.com 
 wrote:
 Sorry if I was presumptuous earlier.  I created a ticket so that the patch 
 could be submitted and reviewed - that is if it can be generalized so that it 
 works across regions and doesn't adversely affect the common case.
 https://issues.apache.org/jira/browse/CASSANDRA-2362
 
 On Mar 21, 2011, at 10:41 PM, Jeremy Hanna wrote:
 
  Sorry if I was presumptuous earlier.  I created a ticket so that the patch 
  could be submitted and reviewed - that is if it can be generalized so that 
  it works across regions and doesn't adversely affect the common case.
  https://issues.apache.org/jira/browse/CASSANDRA-2362
 
  On Mar 21, 2011, at 12:20 PM, Jeremy Hanna wrote:
 
  I talked to Matt Dennis in the channel about it and I think everyone would 
  like to make sure that cassandra works great across multiple regions.  He 
  sounded like he didn't know why it wouldn't work after having looked at 
  the patches.  I would like to try it both ways - with and without the 
  patches later today if I can and I'd like to help out with getting it 
  working out of the box.
 
  Thanks for the investigative work and documentation Milind!
 
  Jeremy
 
  On Mar 21, 2011, at 12:12 PM, Dave Viner wrote:
 
  Hi Milind,
 
  Great work here.  Can you provide the patch against the 2 files?
 
  Perhaps there's some way to incorporate it into the trunk of cassandra so 
  that this is feasible (in a future release) without patching the source 
  code.
 
  Dave Viner
 
 
  On Mon, Mar 21, 2011 at 9:41 AM, A J s5a...@gmail.com wrote:
  Thanks for sharing the document, Milind !
  Followed the instructions and it worked for me.
 
  On Mon, Mar 21, 2011 at 5:01 AM, Milind Parikh milindpar...@gmail.com 
  wrote:
  Here's the document on Cassandra (0.7.4) across EC2 regions. Clearly 
  this is
  work in progress but wanted to share what I have. PDF is the working
  copy.
 
 
  https://docs.google.com/document/d/175duUNIx7m5mCDa2sjXVI04ekyMa5bdiWdu-AFgisaY/edit?hl=en
 
  On Sun, Mar 20, 2011 at 7:49 PM, aaron morton aa...@thelastpickle.com
  wrote:
 
  Recent discussion on the dev list
  http://www.mail-archive.com/dev@cassandra.apache.org/msg01832.html
  Aaron
  On 19 Mar 2011, at 06:46, A J wrote:
 
  Just to add, all the telnet (port 7000) and cassandra-cli (port 9160)
  connections are done using the public DNS (that goes like
  ec2-.compute.amazonaws.com)
 
  On Fri, Mar 18, 2011 at 1:37 PM, A J s5a...@gmail.com wrote:
 
  I am able to telnet from one region to another on 7000 port without
 
  issues. (I get the expected Connected to .Escape character is
 
  '^]'.)
 
  Also I am able to execute cassandra client on 9160 port from one
 
  region to another without issues (this is when I run cassandra
 
  separately on each region without forming a cluster).
 
  So I think the ports 7000 and 9160 are not the issue.
 
 
 
  On Fri, Mar 18, 2011 at 1:26 PM, Dave Viner davevi...@gmail.com wrote:
 
  From the us-west instance, are you able to connect to the us-east 
  instance
 
  using telnet on port 7000 and 9160?
 
  If not, then you need to open those ports for communication (via your
 
  Security Group)
 
  Dave Viner
 
  On Fri, Mar 18, 2011 at 10:20 AM, A J s5a...@gmail.com wrote:
 
  Thats exactly what I am doing.
 
  I was able to do the first two scenarios without any issues (i.e. 2
 
  nodes in same availability zone. Followed by an additional node in a
 
  different zone but same region)
 
  I am stuck at the third scenario of separate regions.
 
  (I did read the Cassandra nodes on EC2 in two different regions not
 
  communicating thread but it 

Re: Deleting old SSTables

2011-03-22 Thread sridhar basam
Force a GC to remove the unused sstables. Use something like jconsole or, from
the command line, jmap -histo:live <pid>. You would run the jmap command as the
cassandra user or root. jmap will also print a histogram of the live objects in
the heap, if you choose to look at it.

 Sridhar

On Tue, Mar 22, 2011 at 8:30 AM, Jonathan Colby jonathan.co...@gmail.com wrote:

 According to the Wiki Page on compaction:  once compaction is finished, the
 old SSTable files may be deleted*

 * http://wiki.apache.org/cassandra/MemtableSSTable

 I thought the old SSTables would be deleted automatically, but this wiki
 page got me thinking otherwise.

 Question is,  if it is true that old SSTables must be manually deleted, how
 can one safely identify which SSTables can be deleted??

 Jon








Ec2Snitch Other snitches...

2011-03-22 Thread Sasha Dolgy
Hi Everyone,

Can the Ec2Snitch be enabled by adjusting the parameter in the
cassandra.yaml and restarting the node?

More to the point, I suppose the question I'm after is: can the snitch be
changed ad hoc (with a node restart), or once it's changed from
SimpleSnitch to Ec2Snitch is that it?  What influence does one node
have over the Gossiper?  If I'm not happy with the Ec2Snitch, I want
to change to PropertyFileSnitch ... but I have 8 nodes at the
moment ... I don't want to create additional headaches...

-- 
Sasha Dolgy
sasha.do...@gmail.com


Re: Deleting old SSTables

2011-03-22 Thread Jonathan Ellis
From the next paragraph of the same wiki page:

SSTables that are obsoleted by a compaction are deleted asynchronously
when the JVM performs a GC. You can force a GC from jconsole if
necessary, but Cassandra will force one itself if it detects that it
is low on space. A compaction marker is also added to obsolete
sstables so they can be deleted on startup if the server does not
perform a GC before being restarted.
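For anyone who just wants to see which files are waiting for that cleanup, here is a minimal sketch. It assumes the compaction marker is a sibling file that shares the SSTable's name prefix and ends in "-Compacted" next to the "-Data.db" file. That naming, the function name and the sample file names are assumptions for illustration, so check a real data directory before relying on it.

```python
def obsolete_data_files(filenames):
    """Return the -Data.db files that have a matching -Compacted marker."""
    names = set(filenames)
    obsolete = []
    for name in sorted(names):
        if name.endswith("-Data.db"):
            # Assumed marker name: same prefix, "Compacted" instead of "Data.db".
            marker = name[: -len("Data.db")] + "Compacted"
            if marker in names:
                obsolete.append(name)
    return obsolete


listing = [
    "Standard1-f-1-Data.db",
    "Standard1-f-1-Index.db",
    "Standard1-f-1-Compacted",
    "Standard1-f-2-Data.db",
]
print(obsolete_data_files(listing))
```

The sketch only reads a listing (for example the result of os.listdir on a column family directory); the deletion itself is left to Cassandra, as described above.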

On Tue, Mar 22, 2011 at 8:30 AM, Jonathan Colby
jonathan.co...@gmail.com wrote:
 According to the Wiki Page on compaction:  once compaction is finished, the 
 old SSTable files may be deleted*

 * http://wiki.apache.org/cassandra/MemtableSSTable

 I thought the old SSTables would be deleted automatically, but this wiki page 
 got me thinking otherwise.

 Question is,  if it is true that old SSTables must be manually deleted, how 
 can one safely identify which SSTables can be deleted??

 Jon









-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


Re: Deleting old SSTables

2011-03-22 Thread Jonathan Colby
doooh.  thanks!
On Mar 22, 2011, at 3:27 PM, Jonathan Ellis wrote:

 From the next paragraph of the same wiki page:
 
 SSTables that are obsoleted by a compaction are deleted asynchronously
 when the JVM performs a GC. You can force a GC from jconsole if
 necessary, but Cassandra will force one itself if it detects that it
 is low on space. A compaction marker is also added to obsolete
 sstables so they can be deleted on startup if the server does not
 perform a GC before being restarted.
 
 On Tue, Mar 22, 2011 at 8:30 AM, Jonathan Colby
 jonathan.co...@gmail.com wrote:
 According to the Wiki Page on compaction:  once compaction is finished, the 
 old SSTable files may be deleted*
 
 * http://wiki.apache.org/cassandra/MemtableSSTable
 
 I thought the old SSTables would be deleted automatically, but this wiki 
 page got me thinking otherwise.
 
 Question is,  if it is true that old SSTables must be manually deleted, how 
 can one safely identify which SSTables can be deleted??
 
 Jon
 
 
 
 
 
 
 
 
 
 -- 
 Jonathan Ellis
 Project Chair, Apache Cassandra
 co-founder of DataStax, the source for professional Cassandra support
 http://www.datastax.com



Re: SSL Streaming (#1567)

2011-03-22 Thread Sasha Dolgy
I see now that this is too new:
https://issues.apache.org/jira/browse/CASSANDRA-1567 and that it's
scheduled for the 0.8 release.

Is it right to assume the following from the accepted patch:

1.  keystore and truststore passwords are kept in clear text in the
cassandra.yaml ?
2.  It's all or nothing when it comes to inter-node communication over
SSL?  Meaning, nodes that are part of the ring that aren't configured
will start to fail if the configuration isn't changed?
3.  I only want to encrypt data from region 1 <--> region 2 where a
vpn is not possible... data communication in the same rack for
example, is on a private network and shouldn't be encrypted (except
when it's ec2 ... I think it should be encrypted).  This is not
possible at the moment ... is there a plan for the future?

I do appreciate any feedback and don't mean this to come across in a
negative way.  Just trying to understand how far off it is from being
compliant in a security sense...

-sd


On Tue, Mar 22, 2011 at 9:21 AM, Sasha Dolgy sdo...@gmail.com wrote:
 Hi,

 Is there documentation available anywhere that describes how one can
 use org.apache.cassandra.security.streaming.* ?   After the EC2 posts
 yesterday, one question I was asked was about the security of data
 being shifted between nodes.  Is it done in clear text, or
 encrypted..?  I haven't seen anything to suggest that it's encrypted,
 but see in the source that security.streaming does leverage SSL ...

 Thanks in advance for some pointers to documentation.

 Also, for anyone who is using SSL .. how much of a performance impact
 have you noticed?  Is it minimal or significant?


Meaning of TotalReadLatencyMicros and TotalWriteLatencyMicrosStatistics

2011-03-22 Thread Jonathan Colby
Hi -

On our recently live cassandra cluster of 5 nodes, we've noticed that the 
latency readings, especially Reads have gone up drastically. 

TotalReadLatencyMicros  5413483
TotalWriteLatencyMicros 1811824


I understand these are in microseconds, but what meaning do they have for the 
performance of the cluster?   In other words, what do these numbers actually 
measure?

In our case, it looks like we have  a read latency of 5.4 seconds, which is 
very troubling if I interpret this correctly.

Are reads really taking an average of 5 seconds to complete??





Re: Meaning of TotalReadLatencyMicros and TotalWriteLatencyMicrosStatistics

2011-03-22 Thread Ching-Cheng Chen
Just as the name says, it's the total number of microseconds spent on read
operations so far.

It is not an average.
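To get an average out of the total you also need the number of operations it was accumulated over. A minimal sketch, assuming a matching operation counter is read from the same place as the total; the function name and the count of 10000 reads below are hypothetical, only the 5413483 figure comes from this thread.

```python
def average_latency_ms(total_latency_micros, operation_count):
    """Average latency per operation in milliseconds, from a running total."""
    if operation_count == 0:
        return 0.0
    return total_latency_micros / operation_count / 1000.0


# 5413483 microseconds in total, spread over a hypothetical 10000 reads.
print(average_latency_ms(5413483, 10000))
```

Because both numbers only ever grow, sampling them twice and dividing the differences gives the average over the interval between the two samples rather than since startup.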

Regards,

Chen

Senior Developer, EvidentSoftware(Leaders in Monitoring of NoSQL  JAVA )

http://www.evidentsoftware.com

On Tue, Mar 22, 2011 at 11:11 AM, Jonathan Colby
jonathan.co...@gmail.com wrote:

 Hi -

 On our recently live cassandra cluster of 5 nodes, we've noticed that the
 latency readings, especially Reads have gone up drastically.

 TotalReadLatencyMicros  5413483
 TotalWriteLatencyMicros 1811824


 I understand these are in microseconds, but what meaning do they have for
 the performance of the cluster?   In other words what do these numbers
 actually measure.

 In our case, it looks like we have  a read latency of 5.4 seconds, which is
 very troubling if I interpret this correctly.

 Are reads really taking an average of 5 seconds to complete??






Re: nodetool repair takes forever

2011-03-22 Thread Robert Coli
On Mon, Mar 21, 2011 at 8:33 PM, A J s5a...@gmail.com wrote:
 I am trying to estimate the time it will take to rebuild a node. After
 loading reasonable data,
 ...
 For some reason, the repair command runs forever. I just have 3G of
 data per node but still the repair is running for more than an hour !

What version of cassandra are you running?

=Rob


Re: nodetool repair takes forever

2011-03-22 Thread A J
0.7.4

On Tue, Mar 22, 2011 at 11:49 AM, Robert Coli rc...@digg.com wrote:
 On Mon, Mar 21, 2011 at 8:33 PM, A J s5a...@gmail.com wrote:
 I am trying to estimate the time it will take to rebuild a node. After
 loading reasonable data,
 ...
 For some reason, the repair command runs forever. I just have 3G of
 data per node but still the repair is running for more than an hour !

 What version of cassandra are you running?

 =Rob



Re: Ec2Snitch Other snitches...

2011-03-22 Thread Robert Coli
On Tue, Mar 22, 2011 at 7:19 AM, Sasha Dolgy sdo...@gmail.com wrote:
 More, I suppose the question I'm after is, can the snitch method be
 adjusted adhoc (with node restart) or once it's changed from
 SimpleSnitch to Ec2Snitch that's it?

You can change Snitches on a cluster with data on it, as long as you
are very careful about what you are doing and you are in a particular
case which you are probably not in if you want to change your Snitch.

The snitch meaningfully determines replica placement strategy, and in
general when changing snitches you need the replica placement strategy
to stay exactly the same. Unfortunately the point of changing a snitch
is usually.. changing your replica placement strategy. Simplest case
is if the replica placement strategy actually stays the same, like for
example when Digg replaced its custom version of the
PropertyFileSnitch with SimpleSnitch in prep for going single-DC,
because we weren't actually using the functionality of PFS. In that
case, I simply generated a set of input which hashed correctly such
that I had one piece of input per node. I then verified the topology
based on this input before and after changing my snitch, and got the
same results both times, confirming that my change of the Snitch was a
no-op.
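
The before/after check described above can be sketched roughly as follows. This
is not the script used at Digg; it is a minimal illustration under stated
assumptions: the ring, the keys, and the MD5-based token function are all
hypothetical stand-ins, and `replicas` implements only a simple clockwise walk.
The idea is just to compute the replica list for a set of keys under the old and
the new placement and confirm that nothing differs.

```python
import bisect
import hashlib


def token_for(key, ring_size=2 ** 127):
    # MD5-derived token; only a stand-in for the cluster's real partitioner.
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16) % ring_size


def replicas(ring, key, rf):
    """ring: list of (token, endpoint) sorted by token.

    Returns the rf endpoints found walking clockwise from the key's token.
    """
    tokens = [t for t, _ in ring]
    start = bisect.bisect_left(tokens, token_for(key)) % len(ring)
    return [ring[(start + i) % len(ring)][1] for i in range(min(rf, len(ring)))]


def changed_keys(place_before, place_after, keys):
    """Keys whose replica list differs between the two placement functions."""
    return [k for k in keys if place_before(k) != place_after(k)]


# Hypothetical three-node ring and sample keys.
ring = [(0, "10.0.0.1"), (2 ** 125, "10.0.0.2"), (2 ** 126, "10.0.0.3")]
keys = ["key%d" % i for i in range(100)]

before = lambda k: replicas(ring, k, 2)
after = lambda k: replicas(ring, k, 2)  # swap in the new snitch's placement here

print("keys that would move:", changed_keys(before, after, keys))
```

An empty result means the snitch change leaves placement untouched for the keys
tested; anything else lists the keys whose replicas would move.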

A less simple, but still tractable case is if the topology changes
such that one or more replicas is different but at least one is still
the same. In this case, repair would be likely to repair.. most.. of
your data. But honestly if you have to change strategy that much (and
are not running IP-partitioned counts, which make this operation much
more difficult) you probably just want to dump and reload your data
into a new cluster which has the topology and snitch you want.

=Rob


Re: Advice on mmap related swapping issue

2011-03-22 Thread Adi
On Tue, Mar 22, 2011 at 3:44 PM, ruslan usifov ruslan.usi...@gmail.com wrote:



 2011/3/22 Adi adi.pan...@gmail.com

 I have been going through the mailing list and compiling suggestions to
 address the swapping due to mmap issue.

 1) Use JNA (done but)
 Are these steps also required:
 - Start Cassandra with CAP_IPC_LOCK (or as root). (not done)


 And what is CAP_IPC_LOCK?


I saw that suggestion in
https://issues.apache.org/jira/browse/CASSANDRA-1214.

I do not yet know how to run cassandra or a java process with that
privilege, still researching and hoping my sysadmin knows better.

http://www.lids.org/lids-howto/node50.html

   - Allow locking of shared memory segments
   - Allow mlock and mlockall (which doesn't really have anything to do with
   IPC)
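
On Linux, a process without CAP_IPC_LOCK can still lock memory up to its
RLIMIT_MEMLOCK limit, so that limit is worth checking before reaching for root.
A minimal sketch that prints the limit for the current process; it assumes a
Unix-like system where Python's `resource` module exposes RLIMIT_MEMLOCK, and it
reports the limit of the shell it is run from, not of an already running
Cassandra process.

```python
import resource


def memlock_limits():
    """Return the (soft, hard) locked-memory limits of this process, in bytes."""
    return resource.getrlimit(resource.RLIMIT_MEMLOCK)


def describe(limit):
    """Human-readable form of a single limit value."""
    return "unlimited" if limit == resource.RLIM_INFINITY else "%d bytes" % limit


soft, hard = memlock_limits()
print("max locked memory: soft=%s hard=%s" % (describe(soft), describe(hard)))
```

If the soft limit is much smaller than the JVM heap, a call to mlockall from an
unprivileged process is likely to fail, which would be consistent with the
swapping still being seen after installing JNA.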


Re: Can the Cassandra to be hosted, with all your features and performance, on Microsoft Azure ?

2011-03-22 Thread aaron morton
Sounds interesting, please let the community know your findings. 

Aaron

On 23 Mar 2011, at 01:31, FernandoVM wrote:

 Hi,
 
 contrib/py_stress is the easiest way to shake out any issues with your
 install and get a benchmark.
 There is also https://github.com/brianfrankcooper/YCSB but I would go with
 py_stress until it stops being useful.
 
 Very good, thank's.. !
 
 
 Note: These are abstract benchmarks to be used for entertainment purposes
 only, the performance and scaling of your application may vary. :)
 
 Yes, of course. I just want a parameter to evaluate Cassandra
 performance between Azure and a local server. I need to know more about
 this because of Azure architecture details. The compute instances will
 not store data on a local disk, but in the blob storage service with
 a local cache. I want to see the impact of this. :)
 
 Thanks...
 
 []'s
 FernandoVM
 
 On Tue, Mar 22, 2011 at 12:33 AM, aaron morton aa...@thelastpickle.com 
 wrote:
 contrib/py_stress is the easiest way to shake out any issues with your
 install and get a benchmark.
 There is also https://github.com/brianfrankcooper/YCSB but I would go with
 py_stress until it stops being useful.
 Note: These are abstract benchmarks to be used for entertainment purposes
 only, the performance and scaling of your application may vary. :)
 
 Aaron
 On 22 Mar 2011, at 07:24, FernandoVM wrote:
 
 Is there any benchmark that I can apply after installing Cassandra on
 Azure to check performance/scalability issues?
 
 
 []'s
 FernandoVM
 
 On Sun, Mar 13, 2011 at 10:16 PM, aaron morton aa...@thelastpickle.com
 wrote:
 
 If it works like all the other virtual machine hosts then yes it can be
 
 hosted.
 
 Performance can always be less on a virtual machine though.
 
 See http://wiki.apache.org/cassandra/CloudConfig
 
 Aaron
 
 On 14 Mar 2011, at 13:09, FernandoVM wrote:
 
 Hello friends,
 
 
  Does anyone know if Cassandra can be hosted, with all its
 
 features and performance, on Microsoft Azure?
 
 
 []'s
 
 FernandoVM
 
 
 
 
 
 --
 
 []'s
 FernandoVM
 
 
 
 
 
 -- 
 
 []'s
 FernandoVM



Re: Clearsnapshot Problem

2011-03-22 Thread aaron morton
AFAIK upgrading from 0.6.2 to 0.6.12 should be a straightforward rolling 
restart. Do check the changes.txt file first, and if you have a test env, test it 
there. (The large gap in versions makes me a little nervous).

If you feel it's reproducible (even sometimes) can you create a jira ticket ?  
https://issues.apache.org/jira/browse/CASSANDRA

Windows gets less loving than *nix so any help is appreciated. 

Thanks
Aaron



On 23 Mar 2011, at 06:48, s p wrote:

 Thanks. The problem is intermittent meaning we have separate CA cluster 
 environments: In some cases there is no problem running a snapshot followed 
 by a later clear snapshot (or for that matter physical delete of the snapshot 
 file). 
 When I stop Cassandra the snapshot file can be deleted. As soon as Cassandra 
 is started the file is locked.
 
 While testing the above I monitored things with sysinternals (Handle, 
 ProcessExplorer, ProcMon). None of them show any open file handles/locks.
 
 Am I missing something? I see your suggestion of going to 0.6.12. Can the 
 existing cluster (0.6.2) be upgraded node by node?
  
 -S
 
 On Mon, Mar 21, 2011 at 11:45 PM, aaron morton aa...@thelastpickle.com 
 wrote:
 There have been some issues with deleting files on Windows, but I cannot find a 
 reference to it happening for snapshots.
 
 If you restart the node can you delete the snapshot?
 
 Longer term can you upgrade to 0.6.12 and let us know if it happens again? 
 Any fix will be against that version.
 
 Hope that helps.
 Aaron
 
 
 On 22 Mar 2011, at 08:11, s p wrote:
 
  I'm running a 3-way CA cluster (0.6.2) on Windows 2008 (jre 1.6.24) 64-bit. 
  Things are running fine except when trying to remove old snapshot files. 
  When running clearsnapshot I get an error msg like below.  I can't remove 
  any daily snapshot files. When trying to delete the actual snapshot file via os 
  cmd I get access denied. File used by another process. Seems CA or 
  JRE is sitting on the file?
 
  Feedback much appreciated.
 
 
  C:\Cassandra\
  Exception in thread main java.io.IOException: Failed to delete c:\cassandra\data\data\ks_SnapshotTest\snapshots\1300731301822-ks_SnapshotTest\cf_SnapshotTest-135-Data.db
  at org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:47)
  at org.apache.cassandra.io.util.FileUtils.deleteDir(FileUtils.java:189)
  at org.apache.cassandra.io.util.FileUtils.deleteDir(FileUtils.java:184)
  at org.apache.cassandra.io.util.FileUtils.deleteDir(FileUtils.java:184)
  at org.apache.cassandra.db.Table.clearSnapshot(Table.java:274)
  at org.apache.cassandra.service.StorageService.clearSnapshot(StorageService.java:1023)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
  at java.lang.reflect.Method.invoke(Unknown Source)
  at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(Unknown Source)
  at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(Unknown Source)
  at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(Unknown Source)
  at com.sun.jmx.mbeanserver.PerInterface.invoke(Unknown Source)
  at com.sun.jmx.mbeanserver.MBeanSupport.invoke(Unknown Source)
  at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(Unknown Source)
  at com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(Unknown Source)
  at javax.management.remote.rmi.RMIConnectionImpl.doOperation(Unknown Source)
  at javax.management.remote.rmi.RMIConnectionImpl.access$200(Unknown Source)
  at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(Unknown Source)
  at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(Unknown Source)
  at javax.management.remote.rmi.RMIConnectionImpl.invoke(Unknown Source)
  at sun.reflect.GeneratedMethodAccessor17.invoke(Unknown Source)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
  at java.lang.reflect.Method.invoke(Unknown Source)
  at sun.rmi.server.UnicastServerRef.dispatch(Unknown Source)
  at sun.rmi.transport.Transport$1.run(Unknown Source)
  at java.security.AccessController.doPrivileged(Native Method)
  at sun.rmi.transport.Transport.serviceCall(Unknown Source)
  at sun.rmi.transport.tcp.TCPTransport.handleMessages(Unknown Source)
  at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(Unknown Source)
  at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(Unknown Source)
  at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
  at 

Re: moving data from single node cassandra

2011-03-22 Thread Robert Coli
On Sun, Mar 20, 2011 at 4:42 PM, aaron morton aa...@thelastpickle.com wrote:
 When compacting it will use the path with the greatest free space. When 
 compaction completes successfully the files will lose their temporary status 
 and that will be their new home.
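
 The "greatest free space" rule can be illustrated with a few lines of Python.
 This is only a sketch of the selection idea, not Cassandra's own code; the
 directory list is hypothetical.

```python
import shutil


def pick_data_directory(directories):
    """Return the directory whose filesystem currently has the most free bytes."""
    return max(directories, key=lambda d: shutil.disk_usage(d).free)


# Hypothetical stand-in for the entries of data_file_directories.
print(pick_data_directory(["."]))
```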

 On 18 Mar 2011, at 14:10, John Lewis wrote:

 | data_file_directories makes it seem as though cassandra can use more than 
 one location for sstable storage. Does anyone know how it splits up the data 
 between partitions? I am trying to plan for just about every worst case 
 scenario I can right now, and I want to know if I can change the config to 
 open up some secondary storage for a compaction if needed.

Standard disclaimer whenever anyone mentions using multi data_file_directories :

Multiple deploys have experienced lose as a result of this
configuration. No one, to my knowledge, has experienced win. You
probably don't want to use this feature for its primary effect, the
only time is when you need to work around some other issue.

=Rob


Re: Deleting old SSTables

2011-03-22 Thread buddhasystem
Jonathan,

for all of us who just tinker with test clusters, building confidence in the
product, it would be nice to be able to do the same with nodetool, without
jconsole. Just my 0.5 penny.  Thanks.


Jonathan Ellis-3 wrote:
 
 From the next paragraph of the same wiki page:
 
 SSTables that are obsoleted by a compaction are deleted asynchronously
 when the JVM performs a GC. You can force a GC from jconsole if
 necessary, but Cassandra will force one itself if it detects that it
 is low on space. A compaction marker is also added to obsolete
 sstables so they can be deleted on startup if the server does not
 perform a GC before being restarted.
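
 To see which SSTables are already marked as obsolete without deleting anything
 by hand, one could list data files that have a compaction marker next to them.
 A minimal read-only sketch; the naming convention used here (a marker file whose
 name is the data file's name with "-Data.db" replaced by "-Compacted") is an
 assumption and should be checked against the files actually present in the data
 directory.

```python
import os


def obsolete_sstables(data_dir, data_suffix="-Data.db", marker_suffix="-Compacted"):
    """List data files in data_dir that have a matching compaction marker file.

    The suffixes are assumptions about the on-disk naming convention.
    """
    names = set(os.listdir(data_dir))
    result = []
    for name in sorted(names):
        if name.endswith(data_suffix):
            marker = name[:-len(data_suffix)] + marker_suffix
            if marker in names:
                result.append(name)
    return result
```

 The function only reports; removal is best left to Cassandra itself, either at
 GC time or on the next startup as described above.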
 
 On Tue, Mar 22, 2011 at 8:30 AM, Jonathan Colby
 <jonathan.co...@gmail.com> wrote:
 > According to the Wiki Page on compaction:  once compaction is
 > finished, the old SSTable files may be deleted*
 >
 > * http://wiki.apache.org/cassandra/MemtableSSTable
 >
 > I thought the old SSTables would be deleted automatically, but this
 > wiki page got me thinking otherwise.
 >
 > Question is,  if it is true that old SSTables must be manually
 > deleted, how can one safely identify which SSTables can be deleted??
 >
 > Jon
 
 
 
 -- 
 Jonathan Ellis
 Project Chair, Apache Cassandra
 co-founder of DataStax, the source for professional Cassandra support
 http://www.datastax.com
 


--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Deleting-old-SSTables-tp6196113p6198172.html
Sent from the cassandra-u...@incubator.apache.org mailing list archive at 
Nabble.com.


Re: 0.7.2 choking on a 5 MB column

2011-03-22 Thread Jonathan Ellis
I'm writing a row with about 45k columns.

On Tue, Mar 22, 2011 at 7:39 PM, buddhasystem potek...@bnl.gov wrote:
 I'm writing a row with about 45k columns. Most of them are quite small, and
 there are a few of 2 MB and one of 5 MB. The write procedure times out.
 Total data load is 9 MB.

 What would be the cause?






-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


0.7.4 problems .. snitch?

2011-03-22 Thread Sasha Dolgy
Hi there,

Installed a new 4 node 0.7.4 cluster on ec2.  Brought up the first
node without issue with Ec2Snitch configured in the cassandra.yaml.

Brought up a second node, with the first node defined as the seed.  No
visible issues.  Nodes 3 & 4, however, are giving me problems as shown in the
output below.  Initially, I -did not- define tokens. When node 3 came
up, I had this error, so I went and manually moved the tokens and did
a nodetool move/repair/cleanup before getting on to node 4.

The tokens for the 4 nodes:

0
1909554714494251628118265338228798
56713727820156410577229101238628035242
170141183460469231731687303715884105726

So now, when the 4th node comes online, with its token set in the
cassandra.yaml (the first one I did it for, because of the errors I saw
with node 3) ... everything goes well at first, in joining the ring,
etc. and then I see the following error in the system.log:

:~$  INFO [HintedHandoff:1] 2011-03-23 00:37:24,298
HintedHandOffManager.java (line 304) Started hinted handoff for
endpoint /10.0.0.2
 INFO [HintedHandoff:1] 2011-03-23 00:37:24,298
HintedHandOffManager.java (line 360) Finished hinted handoff of 0 rows
to endpoint /10.0.0.2
 INFO [GossipStage:2] 2011-03-23 00:37:55,381 StorageService.java
(line 702) Node /10.0.0.2 state jump to bootstrap
ERROR [GossipStage:2] 2011-03-23 00:37:55,381
DebuggableThreadPoolExecutor.java (line 103) Error in
ThreadPoolExecutor
java.lang.RuntimeException: Bootstrap Token collision between
/10.0.0.3 and /10.0.0.2 (token 1909554714494251628118265338228798
    at org.apache.cassandra.locator.TokenMetadata.addBootstrapToken(TokenMetadata.java:143)
    at org.apache.cassandra.service.StorageService.handleStateBootstrap(StorageService.java:706)
    at org.apache.cassandra.service.StorageService.onChange(StorageService.java:648)
    at org.apache.cassandra.gms.Gossiper.doNotifications(Gossiper.java:772)
    at org.apache.cassandra.gms.Gossiper.applyApplicationStateLocally(Gossiper.java:737)
    at org.apache.cassandra.gms.Gossiper.applyStateLocally(Gossiper.java:679)
    at org.apache.cassandra.gms.GossipDigestAck2VerbHandler.doVerb(GossipDigestAck2VerbHandler.java:60)
    at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:72)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
ERROR [GossipStage:2] 2011-03-23 00:37:55,382
AbstractCassandraDaemon.java (line 112) Fatal exception in thread
Thread[GossipStage:2,5,main]
java.lang.RuntimeException: Bootstrap Token collision between
/10.0.0.3 and /10.0.0.2 (token 1909554714494251628118265338228798
    at org.apache.cassandra.locator.TokenMetadata.addBootstrapToken(TokenMetadata.java:143)
    at org.apache.cassandra.service.StorageService.handleStateBootstrap(StorageService.java:706)
    at org.apache.cassandra.service.StorageService.onChange(StorageService.java:648)
    at org.apache.cassandra.gms.Gossiper.doNotifications(Gossiper.java:772)
    at org.apache.cassandra.gms.Gossiper.applyApplicationStateLocally(Gossiper.java:737)
    at org.apache.cassandra.gms.Gossiper.applyStateLocally(Gossiper.java:679)
    at org.apache.cassandra.gms.GossipDigestAck2VerbHandler.doVerb(GossipDigestAck2VerbHandler.java:60)
    at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:72)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)

:~$  INFO [GossipStage:3] 2011-03-23 00:38:24,859 StorageService.java
(line 745) Nodes /10.0.0.2 and /10.0.0.3 have the same token
1909554714494251628118265338228798.  /10.0.0.2 is the new owner
 WARN [GossipStage:3] 2011-03-23 00:38:24,859 TokenMetadata.java (line
115) Token 1909554714494251628118265338228798 changing ownership
from /10.0.0.3 to /10.0.0.2

:~$ nodetool -h 10.0.0.1 -p 9090 ring
Address         Status State   Load            Owns    Token
                                                       170141183460469231731687303715884105726
10.0.0.1        Up     Normal  99.31 KB        0.00%   0
10.0.0.2        Up     Normal  122.67 KB       11.22%  1909554714494251628118265338228798
10.0.0.4        Up     Normal  103.75 KB       88.78%  170141183460469231731687303715884105726
:~$


Should I be a bit more hands-off with the Ec2Snitch?  Now I have
3 nodes, with 1 having a duplicate token.
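
For comparison with the tokens listed above: evenly spaced initial tokens are
commonly computed as i * 2**127 / N for node i of N. A minimal sketch, assuming
the cluster uses RandomPartitioner (the token range in the ring output suggests
it, but that is an assumption):

```python
def balanced_tokens(node_count, ring_size=2 ** 127):
    """Evenly spaced initial tokens for node_count nodes on a ring of ring_size."""
    return [i * ring_size // node_count for i in range(node_count)]


for node, token in enumerate(balanced_tokens(4), start=1):
    print("node %d: %d" % (node, token))
```

Giving each node its own value from such a list as initial_token before its
first start keeps two nodes from ever claiming the same token, which is the
condition the "Bootstrap Token collision" error above is reporting.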

-- 
Sasha Dolgy
sasha.do...@gmail.com


Re: 0.7.2 choking on a 5 MB column

2011-03-22 Thread buddhasystem
Jonathan, wide rows have been discussed. I thought that the limit on number
of columns is way bigger than 45k. What can one expect in reality?



Re: Error connection to remote JMX agent! on nodetool

2011-03-22 Thread Maki Watanabe
How do you define your Keyspace?
As you may know, in Cassandra, replication (factor) is defined as the
attribute of Keyspace.
And what do you mean by:
 However replication never happened.
 I can't get data I set at other node.

What did you do on cassandra, and what did you get in response?

maki


2011/3/23 ko...@vivinavi.com ko...@vivinavi.com:
 Hi Sasha
 Thank you so much for your advice.
 I changed JMX_PORT from 10036 to 8080 in cassandra-env.sh.
 Now nodetool ring is working as following.

 # nodetool --host **.**.254.54 ring
 Address         Status State   Load      Owns    Token
                                                  31247585259092561925693111230676487333
 **.**.254.53    Up     Normal  51.3 KB   84.50%  4871825541058236750403047111542070004
 **.**.254.54    Up     Normal  66.71 KB  15.50%  31247585259092561925693111230676487333

 Then it seems I could set data on the other node with cassandra-cli --host (other
 node IP) --port 9160. (Currently only 2 nodes.)
 However, replication never happened.
 I can't get the data I set from the other node.
 I don't know what's wrong.
 (I thought replication starts when cassandra -p restarts.)
 Please advise me on how to start replication.
 Thank you in advance for your advice.


 (2011/03/18 23:38), Sasha Dolgy wrote:

 You need to specify the -jmxport with nodetool
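
 A quick way to confirm which port the JMX agent is actually listening on is to
 try a plain TCP connection to each candidate port. A minimal sketch; the host
 and the two port numbers below are only the values mentioned in this thread and
 should be replaced with your own.

```python
import socket


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Ports mentioned in this thread; the host is a placeholder.
for port in (8080, 10036):
    print(port, "open" if port_open("127.0.0.1", port) else "refused or unreachable")
```

 "Connection refused" from nodetool is what one would expect when it is pointed
 at a port nothing listens on, so whichever port reports open here is the one to
 pass to nodetool.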

 On Mar 19, 2011 2:48 AM, ko...@vivinavi.com ko...@vivinavi.com wrote:
 Hi everyone

 I am still new to Cassandra and Thrift.
 But anyway Cassandra 0.7.4 and Thrift 0.5.0 are working on Java 1.6.0.18 on
 Debian 5.0.7 on a single node.
 Then I had to try and check multi-node on 2 servers.
 (JMX_PORT=10036 in /etc/cassandra-env.sh)
 I modified /etc/cassandra/cassandra.yaml as follows.
 auto_bootstrap: false -> true
 seeds: - 127.0.0.1 -> added global IP addresses of the 2 servers (incl. own server)
 listen_address: localhost -> own global IP address (or own host name in
 /etc/hosts)
 rpc_address: localhost -> 0.0.0.0
 I ran the master server and then the slave server.
 netstat -nl shows the following on both servers.
 Proto Recv-Q Send-Q Local Address Foreign Address State
 tcp 0 0 0.0.0.0:9160 0.0.0.0:* LISTEN
 tcp 0 0 0.0.0.0:10036 0.0.0.0:* LISTEN
 tcp 0 0 **.**.**.**:7000 0.0.0.0:* LISTEN

 However it seems Cassandra doesn't work.
 Because I can't get any data from Cluster (always null, data is broken?)
 So I checked the nodetool (nodetool --host IP ring).
 The nodetool had errors as following.
 Error connection to remote JMX agent!
 java.io.IOException: Failed to retrieve RMIServer stub:
 javax.naming.ServiceUnavailableException [Root exception is
 java.rmi.ConnectException: Connection refused to host: **.**.**.**;
 nested exception is:
 java.net.ConnectException: Connection refused]
 at javax.management.remote.rmi.RMIConnector.connect(RMIConnector.java:342)
 at javax.management.remote.JMXConnectorFactory.connect(JMXConnectorFactory.java:267)
 at org.apache.cassandra.tools.NodeProbe.connect(NodeProbe.java:137)
 at org.apache.cassandra.tools.NodeProbe.<init>(NodeProbe.java:107)
 at org.apache.cassandra.tools.NodeCmd.main(NodeCmd.java:511)
 Caused by: javax.naming.ServiceUnavailableException [Root exception is
 java.rmi.ConnectException: Connection refused to host: **.**.**.**;
 nested exception is:
 java.net.ConnectException: Connection refused]
 at com.sun.jndi.rmi.registry.RegistryContext.lookup(RegistryContext.java:118)
 at com.sun.jndi.toolkit.url.GenericURLContext.lookup(GenericURLContext.java:203)
 at javax.naming.InitialContext.lookup(InitialContext.java:409)
 at javax.management.remote.rmi.RMIConnector.findRMIServerJNDI(RMIConnector.java:1902)
 at javax.management.remote.rmi.RMIConnector.findRMIServer(RMIConnector.java:1871)
 at javax.management.remote.rmi.RMIConnector.connect(RMIConnector.java:276)
 ... 4 more
 Caused by: java.rmi.ConnectException: Connection refused to host:
 **.**.**.**; nested exception is:
 java.net.ConnectException: Connection refused
 at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:619)
 at sun.rmi.transport.tcp.TCPChannel.createConnection(TCPChannel.java:216)
 at sun.rmi.transport.tcp.TCPChannel.newConnection(TCPChannel.java:202)
 at sun.rmi.server.UnicastRef.newCall(UnicastRef.java:340)
 at sun.rmi.registry.RegistryImpl_Stub.lookup(Unknown Source)
 at com.sun.jndi.rmi.registry.RegistryContext.lookup(RegistryContext.java:114)
 ... 9 more
 Caused by: java.net.ConnectException: Connection refused
 at java.net.PlainSocketImpl.socketConnect(Native Method)
 at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:310)
 at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:176)
 at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:163)
 at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:384)
 at java.net.Socket.connect(Socket.java:546)
 at java.net.Socket.connect(Socket.java:495)
 at java.net.Socket.<init>(Socket.java:392)
 at java.net.Socket.<init>(Socket.java:206)
 at

 

Re: 0.7.2 choking on a 5 MB column

2011-03-22 Thread buddhasystem
I see. I'm doing something even more drastic then, because I'm only inserting
one row in this case, and just use cf.insert(), without batch mutator. It
didn't occur to me that was a bad idea.

So I take it, this method will fail. Hmm.
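
One thing that could be tried is splitting the row into several smaller inserts
instead of sending all 9 MB in one request. A minimal sketch, with assumptions
stated up front: `insert_fn` is a hypothetical stand-in for whatever client call
is in use (for example a pycassa-style `cf.insert(key, columns)`), the 1 MB
batch limit is arbitrary, and whether this avoids the timeout depends on the
actual cause, which this thread does not settle.

```python
def chunk_columns(columns, max_bytes=1024 * 1024):
    """Yield dicts of columns whose combined name+value size stays within max_bytes.

    A single column larger than max_bytes still gets a batch of its own.
    """
    batch, size = {}, 0
    for name, value in columns.items():
        item_size = len(name) + len(value)
        if batch and size + item_size > max_bytes:
            yield batch
            batch, size = {}, 0
        batch[name] = value
        size += item_size
    if batch:
        yield batch


def insert_in_chunks(insert_fn, key, columns, max_bytes=1024 * 1024):
    """Call insert_fn(key, batch) once per batch; return the number of calls."""
    calls = 0
    for batch in chunk_columns(columns, max_bytes):
        insert_fn(key, batch)
        calls += 1
    return calls
```

Usage would look like `insert_in_chunks(cf.insert, "rowkey", all_columns)`. Note
that the row is then no longer written in a single operation, so a reader may
briefly see only part of it.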




Re: EC2 - 2 regions

2011-03-22 Thread Milind Parikh
@aj
are you sure that all ports are accessible from all nodes?

@sasha
I think that being able to have the semantics of address aNAT address can
enable security from a different perspective.  Describing an overlay network will
take long here. But that may solve your security concerns over the internet.

/***
sent from my android...please pardon occasional typos as I respond @ the
speed of thought
/

On Mar 22, 2011 11:00 AM, Sasha Dolgy sdo...@gmail.com wrote:

there are some other knock on issues too.  the SSL work that has been
done would also have to be changed ...

-sd


On Tue, Mar 22, 2011 at 6:58 PM, A J s5a...@gmail.com wrote:
 Milind,
 Among the limitation you...
--
Sasha Dolgy
sasha.do...@gmail.com