Re: Install Cassandra on EC2
Hi Eldad,

Check out http://wiki.apache.org/cassandra/CloudConfig
There are a few ways listed there, including a step-by-step guide.

Dave Viner

On Wed, Aug 3, 2011 at 7:49 AM, Eldad Yamin elda...@gmail.com wrote:

Thanks! But I'd prefer to learn how to install it first. If you have any good references, please share; I didn't find any, not even a general installation guide for an EC2 or regular machine. I'm also going to try to install Solandra; I hope that Whirr will support it in the near future.

On Wed, Aug 3, 2011 at 5:43 PM, John Conwell j...@iamjohn.me wrote:

One thing you might want to look at is the Apache Whirr project (which is awesome, by the way!). It automagically handles spinning up a cluster of resources on EC2 (or Rackspace, for that matter), installing and configuring Cassandra, and starting it.

One thing to be aware of if you go this route: by default in the yaml file, all data is written under the /var folder. But on a server started by Whirr, that folder only has something like 4 GB. Most of the disk space is under the /mnt folder. So you'll either need to change which folders are mounted on which drives (not sure if you can; I'm sure you could), or change the yaml file to point to the /mnt folder.

On Wed, Aug 3, 2011 at 6:28 AM, Eldad Yamin elda...@gmail.com wrote:

Hi, is there any manual or important notes I should know before I try to install Cassandra on EC2? Thanks!

--
Thanks, John C
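John's workaround can be expressed as a cassandra.yaml fragment. This is only a sketch: the directory names under /mnt are assumptions, and the keys shown are the 0.7-era ones, so check them against your Whirr-installed yaml before relying on it.

```yaml
# Hypothetical fragment: point Cassandra's storage at the large /mnt
# ephemeral disk instead of the small /var volume that Whirr defaults to.
data_file_directories:
    - /mnt/cassandra/data
commitlog_directory: /mnt/cassandra/commitlog
saved_caches_directory: /mnt/cassandra/saved_caches
```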
Re: LB scenario
AJ,

One issue I found with using a load balancer in front of Cassandra nodes is that a single node might become bogged down by compaction, or other actions unrelated to the client. If the load balancer does not pick this up in time, it might route client requests to the node that is temporarily overloaded. In practice, I've found it better for the client to keep a pool of connections and retry as needed against distinct nodes, rather than use a load balancer.

HTH,
Dave Viner

On Tue, Apr 5, 2011 at 9:51 AM, A J s5a...@gmail.com wrote:

Can someone comment on this? Or is the question too vague? Thanks.

On Wed, Mar 30, 2011 at 3:58 PM, A J s5a...@gmail.com wrote:

Does the following load-balancing scenario look reasonable with Cassandra? I will not be having any app servers. http://dl.dropbox.com/u/7258508/2011-03-30_1542.png Thanks.
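Dave's client-side approach can be sketched in a few lines. This is a generic illustration, not a real Cassandra client API: `request_fn` stands in for whatever call your driver actually makes against a node.

```python
import random

class NodePool:
    """Minimal sketch of client-side failover: try each node at most once,
    in random order, instead of routing through a load balancer."""

    def __init__(self, nodes):
        self.nodes = list(nodes)

    def execute(self, request_fn):
        # Any exception from request_fn(node) counts as a node failure
        # (overloaded, compacting, down) and triggers a retry against a
        # *different* node, which a slow-reacting load balancer might not do.
        last_error = None
        for node in random.sample(self.nodes, len(self.nodes)):
            try:
                return request_fn(node)
            except Exception as exc:
                last_error = exc
        raise last_error

pool = NodePool(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
```

A real client would also want per-node connection reuse and backoff, but the core idea is just: distinct nodes, bounded retries.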
Re: what kind of bug?
I saw this once when my servers ran out of file descriptors; it caused totally weird problems. Make sure all nodes in the cluster are listening on the gossip port (7000 by default). Also check out
http://www.datastax.com/docs/0.7/troubleshooting/index#view-of-ring-differs-between-some-nodes or
http://www.datastax.com/docs/0.6/troubleshooting/index#view-of-ring-differs-between-some-nodes
depending on your version.

On Wed, Mar 23, 2011 at 11:39 AM, Aaron Morton aa...@thelastpickle.com wrote:

First thing is to check the logs on host 1. Check the view of the ring from all other nodes in the cluster: do they think nodes 2 and 3 are also down? Then confirm all nodes have the same config for the listen port, and that all nodes can telnet to the listen port of the other nodes. I'm guessing the insert fails for some inserts because you are working at Quorum and your replication factor is less than 5.

Aaron

On 23/03/2011, at 11:31 PM, pob peterob...@gmail.com wrote:

Hello, what kind of bug is it? If I do nodetool -h host1 ring, the output is:

Address  Status  State   Load     Owns    Token
                                          141784319550391026443072753096570088105
1.174    Up      Normal  4.14 GB  16.67%  0
1.173    Down    Normal  4.07 GB  16.67%  28356863910078205288614550619314017621
1.172    Down    Normal  4.1 GB   16.67%  56713727820156410577229101238628035242
1.179    Up      Normal  4.05 GB  16.67%  85070591730234615865843651857942052863
1.175    Up      Normal  4.13 GB  16.67%  113427455640312821154458202477256070484
1.177    Up      Normal  4.12 GB  16.67%  141784319550391026443072753096570088105

but if I do nodetool -h host3 ring, the output is:

Address  Status  State   Load     Owns    Token
                                          141784319550391026443072753096570088105
1.174    Up      Normal  4.14 GB  16.67%  0
1.173    Up      Normal  4.07 GB  16.67%  28356863910078205288614550619314017621
1.172    Up      Normal  4.1 GB   16.67%  56713727820156410577229101238628035242
1.179    Up      Normal  4.05 GB  16.67%  85070591730234615865843651857942052863
1.175    Up      Normal  4.13 GB  16.67%  113427455640312821154458202477256070484
1.177    Up      Normal  4.12 GB  16.67%  141784319550391026443072753096570088105

Some nodes see other nodes as Down, and it's impossible to do a correct insert. Cassandra is running on the nodes that are marked Down. Any ideas? Thanks.

Best,
Peter
Re: EC2 - 2 regions
Hi AJ,

I'd suggest getting to a multi-region cluster step by step. First, get 2 nodes running in the same availability zone; make sure that works properly. Second, add a node in a separate availability zone, but in the same region; make sure that's working properly. Third, add a node that's in a separate region. Taking it step by step will ensure that any issues are specific to the region-to-region communication, rather than intra-zone connectivity or Cassandra cluster configuration.

Dave Viner

On Fri, Mar 18, 2011 at 8:34 AM, A J s5a...@gmail.com wrote:

Hello, I am trying to set up a Cassandra cluster across regions. For testing I am keeping it simple and just having one node in US-EAST (say ec2-1-2-3-4.compute-1.amazonaws.com) and one node in US-WEST (say ec2-2-2-3-4.us-west-1.compute.amazonaws.com). Using Cassandra 0.7.4.

The one in the east region is the seed node and has the values:

auto_bootstrap: false
seeds: ec2-1-2-3-4.compute-1.amazonaws.com
listen_address: ec2-1-2-3-4.compute-1.amazonaws.com
rpc_address: 0.0.0.0

The one in the west region is a non-seed and has the values:

auto_bootstrap: true
seeds: ec2-1-2-3-4.compute-1.amazonaws.com
listen_address: ec2-2-2-3-4.us-west-1.compute.amazonaws.com
rpc_address: 0.0.0.0

I first fire up the seed node (east region instance) and it comes up without issues. When I fire up the non-seed node (west region instance), it fails after some time with the error:

DEBUG 15:09:08,844 Created HHOM instance, registered MBean.
 INFO 15:09:08,844 Joining: getting load information
 INFO 15:09:08,845 Sleeping 9 ms to wait for load information...
DEBUG 15:09:09,822 attempting to connect to ec2-1-2-3-4.compute-1.amazonaws.com/1.2.3.4
DEBUG 15:09:10,825 Disseminating load info ...
DEBUG 15:10:10,826 Disseminating load info ...
DEBUG 15:10:38,845 ... got load info
 INFO 15:10:38,845 Joining: getting bootstrap token
ERROR 15:10:38,847 Exception encountered during startup.
java.lang.RuntimeException: No other nodes seen! Unable to bootstrap
	at org.apache.cassandra.dht.BootStrapper.getBootstrapSource(BootStrapper.java:164)
	at org.apache.cassandra.dht.BootStrapper.getBalancedToken(BootStrapper.java:146)
	at org.apache.cassandra.dht.BootStrapper.getBootstrapToken(BootStrapper.java:141)
	at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:450)
	at org.apache.cassandra.service.StorageService.initServer(StorageService.java:404)
	at org.apache.cassandra.service.AbstractCassandraDaemon.setup(AbstractCassandraDaemon.java:192)
	at org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:314)
	at org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:79)

The seed node seems to somewhat acknowledge the non-seed node:

attempting to connect to /2.2.3.4
attempting to connect to /10.170.190.31

Can you suggest how I can fix it? (I did see a few threads on a similar issue but did not really follow the chain.)

Thanks,
AJ
Re: EC2 - 2 regions
From the us-west instance, are you able to connect to the us-east instance using telnet on ports 7000 and 9160? If not, then you need to open those ports for communication (via your Security Group).

Dave Viner

On Fri, Mar 18, 2011 at 10:20 AM, A J s5a...@gmail.com wrote:

That's exactly what I am doing. I was able to do the first two scenarios without any issues (i.e. 2 nodes in the same availability zone, followed by an additional node in a different zone but the same region). I am stuck at the third scenario of separate regions. (I did read the "Cassandra nodes on EC2 in two different regions not communicating" thread, but it did not seem to end with a resolution.)

On Fri, Mar 18, 2011 at 1:15 PM, Dave Viner davevi...@gmail.com wrote:

Hi AJ, I'd suggest getting to a multi-region cluster step by step. First, get 2 nodes running in the same availability zone; make sure that works properly. Second, add a node in a separate availability zone, but in the same region; make sure that's working properly. Third, add a node that's in a separate region. Taking it step by step will ensure that any issues are specific to the region-to-region communication, rather than intra-zone connectivity or Cassandra cluster configuration.

Dave Viner

On Fri, Mar 18, 2011 at 8:34 AM, A J s5a...@gmail.com wrote:

Hello, I am trying to set up a Cassandra cluster across regions. For testing I am keeping it simple and just having one node in US-EAST (say ec2-1-2-3-4.compute-1.amazonaws.com) and one node in US-WEST (say ec2-2-2-3-4.us-west-1.compute.amazonaws.com).
[rest of quoted message trimmed; the configuration, stack trace, and question are identical to the earlier "EC2 - 2 regions" message]
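The connectivity check Dave describes can be scripted rather than eyeballed with telnet; the hostname in the comment below is just the placeholder from this thread.

```python
import socket

def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# From the us-west instance, e.g.:
# for port in (7000, 9160):  # gossip / Thrift
#     print(port, port_open("ec2-1-2-3-4.compute-1.amazonaws.com", port))
```

If 7000 is blocked, you will see exactly the "No other nodes seen!" symptom, since gossip never connects.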
Re: cassandra as user-profile data store
Hi Dave,

Glad to hear others are using it in this fashion! Are you using Tyler's suggested strategy for user-profile data: one CF that stores the timeline, with rows of user ids and TimeUUID columns for each data-collection time, then some post-processing with Hadoop over the timelines for each user to build a profile? Are you on 0.7 or 0.6.x?

Dave Viner

On Tue, Mar 1, 2011 at 1:31 AM, Dave Gardner dave.gard...@visualdna.com wrote:

Dave,

Tyler's answer already covers CFs etc. We are using Cassandra to store user profile data for exactly the sort of use case you describe. We don't yet store _all_ the data in Cassandra; currently we are focusing on the stuff we need available for real-time access. We use Hadoop to analyse the profiles from within Cassandra.

Dave

On 23 February 2011 23:21, Dave Viner davevi...@gmail.com wrote:

[rest of quoted message trimmed; it is the original "cassandra as user-profile data store" post, reproduced in full later in this digest]
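The timeline layout restated here (row key = user id, column name = TimeUUID) can be sketched with plain Python structures. This is only an illustration of the shape of the data, not a real Cassandra client call; the CF name "UserTimeline" is made up.

```python
import uuid
from collections import OrderedDict

# "UserTimeline" sketch: one row per user, one column per observation.
# uuid1() embeds a timestamp, which is what Cassandra's TimeUUID
# comparator sorts columns by.
timeline = {}  # row key -> ordered columns

def record_event(user_id, event):
    row = timeline.setdefault(user_id, OrderedDict())
    row[uuid.uuid1()] = event  # column name = TimeUUID, value = payload

record_event("user42", "visited site-y")
record_event("user42", "from ip 1.2.3.4")

# A periodic Hadoop job would scan each row in time order and emit the
# derived profile ("interested in cars", ...) into a second CF.
events = list(timeline["user42"].values())
```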
Re: Cassandra nodes on EC2 in two different regions not communicating
Another possibility is this: why not set up 2 nodes in 1 region in 1 AZ, and get that to work. Then open a third node in the same region, but a different AZ, and get that to work. Then, once you have that working, open a fourth node in a different region and get that to work. A piecemeal approach seems like it would be beneficial here.

Dave Viner

On Thu, Feb 24, 2011 at 6:11 AM, Daniel van Ham Colchete daniel.colch...@gmail.com wrote:

Himanshi, my bad, try this for iptables:

# SNAT outgoing connections
iptables -t nat -A POSTROUTING -p tcp --dport 7000 -d 175.41.143.192 -j SNAT --to-source INTERNALIP

As for tcpdump, the argument for the -i option is the interface name (eth0, cassth0, etc.), not the IP. So it should be

tcpdump -i cassth0 -n port 7000

or

tcpdump -i eth0 -n port 7000

I'm assuming your main network card is eth0, but that should be the case. Does it work?

Best,
Daniel

On Thu, Feb 24, 2011 at 9:27 AM, Himanshi Sharma himanshi.sha...@tcs.com wrote:

Thanks Daniel. But the SNAT command is not working, and when I try tcpdump it gives:

[root@ip-10-136-75-201 ~]# tcpdump -i 50.18.60.117 -n port 7000
tcpdump: Invalid adapter index

Not able to figure out what this is?

Thanks,
Himanshi

From: Daniel van Ham Colchete daniel.colch...@gmail.com
To: user@cassandra.apache.org
Date: 02/24/2011 04:27 PM
Subject: Re: Cassandra nodes on EC2 in two different regions not communicating

Himanshi, you could try adding your public IP address to an internal interface and DNAT the packets to it. This shouldn't give you any problems with your normal traffic. Tell Cassandra to listen on the public IPs and it should work. The Linux commands would be:

# Create an internal interface using bridge-utils
brctl addbr cassth0
# Add the IP
ip addr add dev cassth0 50.18.60.117/32
# DNAT incoming connections
iptables -t nat -A PREROUTING -p tcp --dport 7000 -d INTERNALIP -j DNAT --to-destination 50.18.60.117
# SNAT outgoing connections
iptables -t nat -A OUTPUT -p tcp --dport 7000 -d 175.41.143.192 -j SNAT --to-source INTERNALIP

This should work since Amazon will re-SNAT your outgoing packets to your public IP again, so the other Cassandra instance will see your public IP as your source address. I didn't test this setup here, but it should work unless I forgot some small detail. If you need to troubleshoot, use the command

tcpdump -i INTERFACE -n port 7000

where INTERFACE is your public interface or your cassth0. Please let me know if it worked.

Best regards,
Daniel Colchete

On Thu, Feb 24, 2011 at 4:04 AM, Himanshi Sharma himanshi.sha...@tcs.com wrote:

Giving the private IP to rpc_address gives the same exception, and keeping it blank and providing the public IP to listen_address also fails. I tried keeping both blank and did telnet on 7000, and I get the following output:

[root@ip-10-166-223-150 bin]# telnet 122.248.193.37 7000
Trying 122.248.193.37...
Connected to 122.248.193.37.
Escape character is '^]'.

Similarly from the other machine:

[root@ip-10-136-75-201 bin]# telnet 184.72.22.87 7000
Trying 184.72.22.87...
Connected to 184.72.22.87.
Escape character is '^]'.

-----Dave Viner wrote:-----
To: user@cassandra.apache.org
From: Dave Viner davevi...@gmail.com
Date: 02/24/2011 11:59AM
Cc: Himanshi Sharma himanshi.sha...@tcs.com
Subject: Re: Cassandra nodes on EC2 in two different regions not communicating

Try using the private ipv4 address in the rpc_address field, and the public ipv4 (NOT the Elastic IP) in the listen_address. If that fails, go back to an empty rpc_address and start up Cassandra. Then, from the other node, please telnet to port 7000 on the first node, and show the output of that session in your reply. I haven't actually constructed a cross-region cluster, nor have I used v0.7, but this really sounds like it should be easy.

On Wed, Feb 23, 2011 at 10:22 PM, Himanshi Sharma himanshi.sha...@tcs.com wrote:

Hi Dave, I tried with the public IPs. If I mention the public IP in the rpc_address field, Cassandra gives the same exception, but if I leave it blank then Cassandra runs; but again nodetool with the ring option doesn't show the node in the other region.

Thanks,
Himanshi

-----Dave Viner wrote:-----
To: user@cassandra.apache.org
From: Dave Viner davevi...@gmail.com
Date: 02/24/2011 10:43AM
Subject: Re: Cassandra nodes on EC2 in two different regions not communicating

That looks like it's not an issue of communicating between nodes. It appears that the node cannot bind to the address on the localhost that you're asking for:

java.net.BindException: Cannot assign requested address

I think the issue is that the Elastic IP address is not actually
Re: Cassandra nodes on EC2 in two different regions not communicating
Try using the IP address, not the DNS name, in cassandra.yaml. If you can telnet from one node to the other on port 7000, and both nodes have the other node in their config, it should work.

Dave Viner

On Wed, Feb 23, 2011 at 1:43 AM, Himanshi Sharma himanshi.sha...@tcs.com wrote:

Yes, they do. I have specified the Public DNS in the seeds field of each node in cassandra.yaml... not able to figure out what the problem is.

From: Sasha Dolgy sdo...@gmail.com
To: user@cassandra.apache.org
Date: 02/23/2011 02:56 PM
Subject: Re: Cassandra nodes on EC2 in two different regions not communicating

Did you define the other host in the cassandra.yaml, on both servers? They need to know about each other.

On Wed, Feb 23, 2011 at 10:16 AM, Himanshi Sharma himanshi.sha...@tcs.com wrote:

Thanks Dave, but I am able to telnet to the other instances on port 7000, and when I run

./nodetool --host ec2-50-18-60-117.us-west-1.compute.amazonaws.com ring

I can see only one node. Do we need to configure anything else in cassandra.yaml or cassandra-env.sh?

From: Dave Viner davevi...@gmail.com
To: user@cassandra.apache.org
Cc: Himanshi Sharma himanshi.sha...@tcs.com
Date: 02/23/2011 11:36 AM
Subject: Re: Cassandra nodes on EC2 in two different regions not communicating

If you log in to one of the nodes, can you telnet to port 7000 on the other node? If not, then it's almost certainly a firewall/Security Group issue. You can find out the security groups for any node by logging in and then running:

% curl http://169.254.169.254/latest/meta-data/security-groups

Assuming that both nodes are in the same security group, ensure that the SG is configured to allow other members of the SG to communicate on port 7000 with each other.

HTH,
Dave Viner

On Tue, Feb 22, 2011 at 8:59 PM, Himanshi Sharma himanshi.sha...@tcs.com wrote:

Hi, I am new to Cassandra. I'm running Cassandra on EC2. I configured a Cassandra cluster on two instances in different regions. But when I try the nodetool command with the ring option, I am getting only a single node. How do I make these two nodes communicate with each other? I have already opened the required ports, i.e. 7000, 8080, 9160, in the respective security groups. Please help me with this.

Regards,
Himanshi Sharma

=-=-= Notice: The information contained in this e-mail message and/or attachments to it may contain confidential or privileged information. If you are not the intended recipient, any dissemination, use, review, distribution, printing or copying of the information contained in this e-mail message and/or attachments to it are strictly prohibited. If you have received this communication in error, please notify us by reply e-mail or telephone and immediately and permanently delete the message and any attachments. Thank you

--
Sasha Dolgy
sasha.do...@gmail.com
cassandra as user-profile data store
Hi all,

I'm wondering if anyone has used Cassandra as a datastore for a user-profile service. I'm thinking of applications like behavioral targeting, where there are lots and lots of users (10s to 100s of millions), and lots and lots of data about them intermixed in, say, weblogs (probably TBs worth). The idea would be to use Cassandra as a datastore for distributed parallel processing of the TBs of files (say on Hadoop). Then the resulting user profiles would be queryable quickly.

Anyone know of that sort of application of Cassandra? I'm trying to puzzle out just what the column family might look like. It seems like a mix of time-oriented information (user x visits site y at time z), location information (user x appeared from IP x.y.z.a, which is geo-location 31.20309, 120.10923), and derived information (because user x visited site y 15 times within a 10-day window, user x must be interested in buying a car).

I don't have specifics as yet... just some general thoughts. But this feels like a Cassandra-type problem. (A user profile can have lots of columns per user, but the exact columns might differ from user to user... very scalable, etc.)

Thanks,
Dave Viner
Re: Cassandra nodes on EC2 in two different regions not communicating
That looks like it's not an issue of communicating between nodes. It appears that the node cannot bind to the address on the localhost that you're asking for:

java.net.BindException: Cannot assign requested address

I think the issue is that the Elastic IP address is not actually an IP address that's on the localhost, so the daemon cannot bind to that IP. Instead of using the EIP, use the local IP address for the rpc_address (I think that's what you need, since that is what Thrift will bind to). The listen_address should then be the IP address that is routable from the other node. I would first try with the actual public IP address (not the Elastic IP). Once you get that to work, shut down the cluster, change the listen_address to the EIP, boot up and try again.

Dave Viner

On Wed, Feb 23, 2011 at 8:54 PM, Himanshi Sharma himanshi.sha...@tcs.com wrote:

Hey Dave, sorry, I forgot to mention the non-seed configuration. For the first node, in us-west, it's as below, i.e. its own Elastic IP:

listen_address: 50.18.60.117
rpc_address: 50.18.60.117

and for the second node, in ap-southeast-1, it's as below, i.e. again its own Elastic IP:

listen_address: 175.41.143.192
rpc_address: 175.41.143.192

Thanks,
Himanshi

From: Dave Viner davevi...@gmail.com
To: user@cassandra.apache.org
Date: 02/23/2011 11:01 PM
Subject: Re: Cassandra nodes on EC2 in two different regions not communicating

Internal EC2 IPs (10.xxx.xxx.xxx) work across availability zones (e.g., from us-east-1a to us-east-1b) but do not work across regions (e.g., us-east to us-west). Across regions, you must use the public IP address assigned by Amazon. Himanshi, when you log into one node and telnet to port 7000 on the other node, which IP address do you use: the 10.x address or the public IP address? And what is the seed/non-seed configuration in both cassandra.yaml files?

Dave Viner

On Wed, Feb 23, 2011 at 8:12 AM, Frank LoVecchio fr...@isidorey.com wrote:

The internal Amazon IP address is what you will want to use so you don't have to go through DNS anyway; not sure if this works from US-East to US-West, but it does make things quicker between zones, e.g. us-east-1a to us-east-1b.

[rest of quoted thread trimmed; it repeats the earlier messages in this thread]
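Dave's suggestion, applied to one of Himanshi's nodes, would look roughly like the cassandra.yaml fragment below. This is a sketch: the angle-bracket values are placeholders for the instance's own addresses, not real settings from the thread.

```yaml
# Sketch per the advice above: gossip on the address routable from the
# other region, Thrift on the local address.
listen_address: <public-ipv4-of-this-node>   # NOT the Elastic IP
rpc_address: <private-10.x-address>          # what Thrift binds to locally
```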
Re: Cassandra nodes on EC2 in two different regions not communicating
Try using the private ipv4 address in the rpc_address field, and the public ipv4 (NOT the Elastic IP) in the listen_address. If that fails, go back to an empty rpc_address and start up Cassandra. Then, from the other node, please telnet to port 7000 on the first node, and show the output of that session in your reply. I haven't actually constructed a cross-region cluster, nor have I used v0.7, but this really sounds like it should be easy.

[rest of quoted thread trimmed; it repeats the earlier messages in this thread]
quick shout-out to the riptano/datastax folks!
Just a quick shout-out to the riptano folks on becoming part of/forming DataStax! Congrats!
Re: Upgrading from 0.6 to 0.7.0
I agree. I am running a 0.6 cluster and would like to upgrade to 0.7. But I cannot simply stop my existing nodes. I need a way to load a new cluster - either on the same machines or new machines - with the existing data. I think my overall preference would be to upgrade the cluster to 0.7 running on a new port (or new set of machines), then have a tiny translation service on the old port which did whatever translation is required from the 0.6 protocol to the 0.7 protocol. Then I would upgrade my clients once to the 0.7 protocol and also change their connection parameters to the new 0.7 cluster. But I'd be open to anything... just need a way to upgrade without having to turn everything off, do the upgrade, then turn everything back on. I am not able to do that in my production environment (for business reasons). Docs on alternatives other than turn off, upgrade, turn on would be fantastic. Dave Viner On Fri, Jan 21, 2011 at 1:01 PM, Aaron Morton aa...@thelastpickle.com wrote: Yup, you can use diff ports, and you can give them different cluster names and different seed lists. After you upgrade the second cluster partition, the data should repair across, either via RR or the HHs that were stored while the first partition was down. Easiest thing would be to run nodetool repair, then a cleanup to remove any leftover data. AFAIK file formats are compatible. But drain the nodes before upgrading to clear the log. Can you test this on a non production system? Aaron (we really need to write some upgrade docs :)) On 21/01/2011, at 10:42 PM, Dave Gardner dave.gard...@imagini.net wrote: What about executing writes against both clusters during the changeover? Interested in this topic because we're currently thinking about the same thing - how to upgrade to 0.7 without any interruption.
Dave On 21 January 2011 09:20, Daniel Josefsson jid...@gmail.com wrote: No, what I'm thinking of is having two clusters (0.6 and 0.7) running on different ports so they can't find each other. Or isn't that configurable? Then, when I have the two clusters, I could upgrade all of the clients to run against the new cluster, and finally upgrade the rest of the Cassandra nodes. I don't know how the new cluster would cope with having new data in the old cluster when they are upgraded though. /Daniel 2011/1/20 Aaron Morton aa...@thelastpickle.com I'm not sure if you're suggesting running a mixed mode cluster there, but AFAIK the changes to the internode protocol prohibit this. The nodes will probably see each other via gossip, but the way the messages define their purpose (their verb handler) has been changed. Out of interest, which is more painful, stopping the cluster and upgrading it or upgrading your client code? Aaron On 21/01/2011, at 12:35 AM, Daniel Josefsson jid...@gmail.com wrote: In our case our replication factor is more than half the number of nodes in the cluster. Would it be possible to do the following: - Upgrade half of them - Change Thrift Port and inter-server port (is this the storage_port?) - Start them up - Upgrade clients one by one - Upgrade the rest of the servers Or might we get some kind of data collision when still writing to the old cluster as the new storage is being used? /Daniel
Re: Cassandra automatic startup script on ubuntu
You can also use the apt-get repository version, which installs the startup script. On http://wiki.apache.org/cassandra/CloudConfig, see the Cassandra Basic Setup section. It applies to any debian based machine, not just cloud instances. HTH Dave Viner On Thu, Jan 20, 2011 at 9:11 AM, Donal Zang zan...@ihep.ac.cn wrote: On 20/01/2011 17:51, Sébastien Druon wrote: Hello! I am using cassandra on a ubuntu machine and installed it from the binary found on the cassandra home page. However, I did not find any scripts to start it up at boot time. Where can I find this kind of script? Thanks a lot in advance Sebastien Hi, this is what I do; you can add the watchdog to rc.local:

$ cat watchdog
#!/bin/bash
#
# This script checks every $INTERVAL seconds
# whether cassandra is working well,
# and restarts it if necessary
# by donal 2010-01-11
#
PORT=9160
INTERVAL=2
CASSANDRA=/opt/cassandra

check() {
    netstat -tln | grep LISTEN | grep :$1
    if [ $? != 0 ]; then
        echo restarting cassandra
        $CASSANDRA/bin/stop-server
        sleep 1
        $CASSANDRA/bin/start-server
    fi
}

while true
do
    check $PORT
    sleep $INTERVAL
done
Re: Do you have a site in production environment with Cassandra? What client do you use?
Perl using the thrift interface directly. On Sat, Jan 15, 2011 at 6:10 AM, Daniel Lundin d...@eintr.org wrote: python + pycassa scala + Hector On Fri, Jan 14, 2011 at 6:24 PM, Ertio Lew ertio...@gmail.com wrote: Hey, If you have a site in a production environment or are considering one, what is the client that you use to interact with Cassandra? I know that there are several clients available out there according to the language you use, but I would love to know what clients are being used widely in production environments and are best to work with (support most required features, for performance). Also preferably tell about the technology stack for your applications. Any suggestions or comments appreciated. Thanks Ertio
anyone using Cassandra as an analytics/data warehouse?
Does anyone use Cassandra to power an analytics or data warehouse implementation? As a concrete example, one could imagine Cassandra storing data for something that reports on page-views on a website. The basic notions might be simple (url as row-key and columns as timeuuids of viewers). But how would one store things like ip-geolocation to set of pages viewed? Or hour-of-day to pages viewed? Also, how would one do a query like: - tell me how many page views occurred between 12/01/2010 and 12/31/2010? - tell me how many page views occurred between 12/01/2010 and 12/31/2010 from the US? - tell me how many page views occurred between 12/01/2010 and 12/31/2010 from the US in the 9th hour of the day (in gmt)? Time slicing and dimension slicing seem like they might be very challenging (especially since the windows of time would not be known in advance). Thanks Dave Viner
Re: anyone using Cassandra as an analytics/data warehouse?
Hi Peter, Thanks. These are great ideas. One comment tho. I'm actually not as worried about the performance of logging into the system, and more speculating/imagining about querying out of the system. Most traditional data warehouses have a cube or a star schema or something similar. I'm trying to imagine how one might use Cassandra in situations where that sort of design has historically been applied. But, I want to make sure I understand your suggestion A. Is it something like this? A Column Family with the row key being the Unix time divided by 60x60 and a column key of... pretty much anything unique: LogCF[hour-day-in-epoch-seconds][timeuuid] = 1 where 'hour-day-in-epoch-seconds' is something like the first second of the given hour of the day, so 01/04/2011 19:00:00 (in epoch seconds: 1294167600); 'timeuuid' is a TimeUUID from cassandra, and '1' is the value of the entry. Then look at the current row every hour to actually compile the numbers, and store the count in the same Column Family: LogCF[hour-day-in-epoch-seconds][total] = x where 'x' is the count of timeuuid columns in the row? Is that what you're envisioning in Option A? Thanks Dave Viner On Tue, Jan 4, 2011 at 6:38 PM, Peter Harrison cheetah...@gmail.com wrote: Okay, here are two ways to handle this; both are quite different from each other. A) This approach does not depend on counters. You simply have a Column Family with the row key being the Unix time divided by 60x60 and a column key of... pretty much anything unique. Then have another process look at the current row every hour to actually compile the numbers, and store the count in the same Column Family. This will solve the first and third use cases, as it is just a matter of looking at the right rows. The second case will require a similar index, but one which includes a country code appended to the row key. The downside here is that you are storing lots of data on individual requests and retaining it.
If you don't want the detailed data you might add a second process to purge the detail every hour. B) There is a counter feature added to the latest versions of Cassandra. I have not used them, but they should be able to be used to achieve the same effect without a second process cleaning up every hour. Also means it is more of a real time system so you can see how many requests in the hour you are currently in. Basically you have to design your approach based on the query you will be doing. Don't get too hung up on traditional data structures and queries as they have little relationship to a Cassandra approach. On Wed, Jan 5, 2011 at 2:34 PM, Dave Viner davevi...@gmail.com wrote: Does anyone use Cassandra to power an analytics or data warehouse implementation? As a concrete example, one could imagine Cassandra storing data for something that reports on page-views on a website. The basic notions might be simple (url as row-key and columns as timeuuids of viewers). But, how would one store things like ip-geolocation to set of pages viewed? Or hour-of-day to pages viewed? Also, how would one do a query like - tell me how many page views occurred between 12/01/2010 and 12/31/2010? - tell me how many page views occurred between 12/01/2010 and 12/31/2010 from the US? - tell me how many page views occurred between 12/01/2010 and 12/31/2010 from the US in the 9th hour of the day (in gmt)? Time slicing and dimension slicing seems like it might be very challenging (especially since the windows of time would not be known in advance). Thanks Dave Viner
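Peter's Option A can be sketched with plain Python dicts standing in for the column family. The LogCF layout and the 1294167600 example bucket come from the thread; everything else here is illustrative, and a real implementation would write through a Thrift client such as pycassa rather than a dict.

```python
import uuid

# Toy stand-in for a Cassandra column family: row_key -> {column: value}.
# Row key = first second of the hour (epoch seconds // 3600 * 3600).
log_cf = {}

def record_page_view(epoch_seconds):
    """Write one page view into the hour bucket for its timestamp."""
    bucket = (epoch_seconds // 3600) * 3600
    # Any unique column name works; a TimeUUID is the usual choice.
    log_cf.setdefault(bucket, {})[str(uuid.uuid1())] = 1

def rollup(bucket):
    """Hourly job: count the detail columns, store the total in the same row."""
    row = log_cf.get(bucket, {})
    row['total'] = sum(v for k, v in row.items() if k != 'total')

# Simulate three views during the 19:00 hour of 01/04/2011 (epoch 1294167600),
# then run the hourly rollup for that bucket.
for _ in range(3):
    record_page_view(1294167600 + 42)
rollup(1294167600)
print(log_cf[1294167600]['total'])  # 3
```

A date-range query ("views between 12/01 and 12/31") then becomes a read of the 'total' column across the hour-bucket row keys in that range, and the per-country variant just uses a country-prefixed row key as Peter describes.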
Re: Does Cassandra run better on Amazon EC2 or Rackspace cloud servers?
Since it's all pay-for-use, you could build your system on both, then do whatever stress testing you want. The cassandra part of your app should be unchanged between different cloud providers. Personally, I'm using EC2 and don't have any complaints. Dave Viner On Mon, Jan 3, 2011 at 3:49 PM, Ryan King r...@twitter.com wrote: On Mon, Jan 3, 2011 at 3:04 PM, Cassy Andra cassandral...@gmail.com wrote: My company is looking to develop a software prototype based off Cassandra in the cloud. We expect to run 5 - 10 NoSQL servers for the prototype. I've read online (Jonathan Ellis was pretty vocal about this) that EC2 has some I/O issues. Is the general consensus to run Cassandra on EC2 or Rackspace? What are the pros + cons? I don't know about RAX cloud, but Joe Stump of SimpleGeo did some benchmarks of ec2 io performance: http://stu.mp/2009/12/disk-io-and-throughput-benchmarks-on-amazons-ec2.html -ryan
Re: Virtual IP / hardware load balancing for cassandra nodes
You can put a Cassandra cluster behind a load balancer. One thing to be cautious of is the health check. Just because the node is listening on port 9160 doesn't mean that it's healthy to serve requests. It is required, but not sufficient. The real test is the JMX values. Dave Viner On Mon, Dec 20, 2010 at 6:25 AM, Jonathan Colby jonathan.co...@gmail.com wrote: I was unable to find an example or documentation on my question. I'd like to know the best way to group a cluster of cassandra nodes behind a virtual ip. For example, can cassandra nodes be placed behind a Citrix Netscaler hardware load balancer? I can't imagine it being a problem, but in doing so would you break any cassandra functionality? The goal is to have the application talk to a single virtual ip and be directed to a random node in the cluster. I heard a little about adding the node addresses to Hector's load-balancing mechanism, but this doesn't seem too robust or easy to maintain. Thanks in advance.
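The client-side alternative to a virtual IP, a pool that retries against distinct nodes so an unhealthy-but-still-listening node doesn't absorb traffic, can be sketched as below. The node list and the `send` callable are invented placeholders; a real client would make a Thrift call and catch the client library's own exceptions.

```python
import random

def request_with_failover(nodes, send, retries=3):
    """Try the request against up to `retries` distinct nodes.

    `send` stands in for the real Thrift call; it should raise on a node
    that is down or bogged down (e.g. mid-compaction).
    """
    failures = []
    for node in random.sample(nodes, min(retries, len(nodes))):
        try:
            return send(node)
        except Exception as exc:  # real code: catch the client's error types
            failures.append((node, exc))
    raise RuntimeError('all nodes failed: %r' % failures)

# Usage sketch with a fake transport: one node always errors, one succeeds.
def fake_send(node):
    if node == '10.0.0.1':
        raise IOError('node overloaded')
    return 'ok from %s' % node

print(request_with_failover(['10.0.0.1', '10.0.0.2'], fake_send))
```

Because the retry targets are distinct nodes, a single slow node costs one failed attempt rather than a blackout, which is the back-off behavior a dumb TCP health check can't give you.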
Re: Cassandra Monitoring
How does mx4j compare with the earlier jmx-to-rest bridge listed in the operations page: JMX-to-REST bridge available at http://code.google.com/p/polarrose-jmx-rest-bridge; Thanks Dave Viner On Sun, Dec 19, 2010 at 7:01 AM, Ran Tavory ran...@gmail.com wrote: FYI, I just added an mx4j section to the bottom of this page http://wiki.apache.org/cassandra/Operations On Sun, Dec 19, 2010 at 4:30 PM, Jonathan Ellis jbel...@gmail.com wrote: mx4j? https://issues.apache.org/jira/browse/CASSANDRA-1068 On Sun, Dec 19, 2010 at 8:36 AM, Peter Schuller peter.schul...@infidyne.com wrote: How / what are you monitoring? Best practices someone? I recently set up monitoring using the cassandra-munin-plugins (https://github.com/jamesgolick/cassandra-munin-plugins). However, due to various little details that wasn't too fun to integrate properly with munin-node-configure and automated configuration management. A problem is also the starting of a JVM for each use of jmxquery, which can become a problem with many column families. I like your web server idea. Something persistent that can sit there and do the JMX acrobatics, and expose something more easily consumed for stuff like munin/zabbix/etc. It would be pretty nice to have that out of the box with Cassandra, though I expect that would be considered bloat. :) -- / Peter Schuller -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com -- /Ran
Re: Cassandra Monitoring
Can you share the code for run_column_family_stores.sh ? On Sun, Dec 19, 2010 at 6:14 PM, Edward Capriolo edlinuxg...@gmail.com wrote: On Sun, Dec 19, 2010 at 2:01 PM, Ran Tavory ran...@gmail.com wrote: Mx4j is in-process, same jvm; you just need to throw mx4j-tools.jar in the lib before you start Cassandra. jmx-to-rest runs in a separate jvm. It also has a nice HTML interface with which you can look into any running host. On Sunday, December 19, 2010, Dave Viner davevi...@gmail.com wrote: How does mx4j compare with the earlier jmx-to-rest bridge listed in the operations page: JMX-to-REST bridge available at http://code.google.com/p/polarrose-jmx-rest-bridge Thanks Dave Viner On Sun, Dec 19, 2010 at 7:01 AM, Ran Tavory ran...@gmail.com wrote: FYI, I just added an mx4j section to the bottom of this page http://wiki.apache.org/cassandra/Operations On Sun, Dec 19, 2010 at 4:30 PM, Jonathan Ellis jbel...@gmail.com wrote: mx4j? https://issues.apache.org/jira/browse/CASSANDRA-1068 On Sun, Dec 19, 2010 at 8:36 AM, Peter Schuller peter.schul...@infidyne.com wrote: How / what are you monitoring? Best practices someone? I recently set up monitoring using the cassandra-munin-plugins (https://github.com/jamesgolick/cassandra-munin-plugins). However, due to various little details that wasn't too fun to integrate properly with munin-node-configure and automated configuration management. A problem is also the starting of a JVM for each use of jmxquery, which can become a problem with many column families. I like your web server idea. Something persistent that can sit there and do the JMX acrobatics, and expose something more easily consumed for stuff like munin/zabbix/etc. It would be pretty nice to have that out of the box with Cassandra, though I expect that would be considered bloat.
:) -- / Peter Schuller -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com -- /Ran There is a lot of overhead on your monitoring station in kicking up so many JMX connections. There can also be nat/hostname problems for remote JMX. My solution is to execute JMX over the nagios remote plugin executor (NRPE). command[run_column_family_stores]=/usr/lib64/nagios/plugins/run_column_family_stores.sh $ARG1$ $ARG2$ $ARG3$ $ARG4$ $ARG5$ $ARG6$ Maybe not as fancy as a rest-jmx bridge, but it solves most of the RMI issues involved in pulling stats over JMX.
Re: Facebook messaging and choice of HBase over Cassandra - what can we learn?
I don't know the details of operation of HBase, so I can't speak on that point. But I do know that Facebook hired Jonathan Gray, former CTO of Streamy, who is a huge HBase contributor. Streamy ended in Mar 2010 - although I'm not sure when he went to work for Facebook. He presented on HBase at the Hadoop conference in October in NYC: http://mpouttuclarke.wordpress.com/2010/10/18/notes-from-hadoop-world-2010-nyc/ Again, I don't know the chronology (whether he was hired before the decision to use hbase or after). But I know that Jonathan is a fantastically smart (and extremely nice) guy and I'm sure he could make HBase bend to his will at any point. Dave Viner On Sun, Nov 21, 2010 at 4:16 PM, Todd Lipcon t...@lipcon.org wrote: On Sun, Nov 21, 2010 at 2:06 PM, Edward Ribeiro edward.ribe...@gmail.com wrote: Also I believe saying HBase is consistent is not true. This can happen: write to region server - region server acknowledges client - write to WAL - region server fails = write lost. I wonder how facebook will reconcile that. :) Are you sure about that? Client writes to WAL before ack user? According to these posts[1][2], if writing the record to the WAL fails the whole operation must be considered a failure, so it would be nonsense to acknowledge clients before writing the lifeline. I hope some cloudera guy explains this... [only jumping in because info was requested - those who know me know that I think Cassandra is a very interesting architecture and a better fit for many applications than HBase] You can operate the commit log in two different modes in HBase. One mode is deferred log flush, where the region server appends but does not sync() the commit log to HDFS on every write, but rather on a periodic basis (eg once a second). This is similar to the innodb_flush_log_at_trx_commit=2 option in MySQL for example.
This has slightly better performance obviously since the writer doesn't need to wait on the commit, but as you noted there's a window where a write may be acknowledged but then lost. This is an issue of *durability* more so than consistency. In the other mode of operation (default in recent versions of HBase) we do not acknowledge a write until it has been pushed to the OS buffer on the entire pipeline of log replicas. Obviously this is slower, but it results in no lost data regardless of any machine failures. Additionally, concurrent readers do not see written data until these same properties have been satisfied. So this mode is 100% consistent and 100% durable. In practice, this affects latency significantly since it adds two extra round trips to each write, but system throughput is only reduced by 20-30% since the commits are pipelined (see HDFS-895 for gory details). I believe Cassandra has similar tuning options about whether to sync every commit to the log or only do so periodically. If you're interested in learning more, feel free to reference this documentation: http://hbase.apache.org/docs/r0.89.20100726/acid-semantics.html Besides that, you know that the WAL is written to HDFS, which takes care of replication and fault tolerance, right? Of course, even so, there's a window of inconsistency before the HLog is flushed to disk, but I don't think you can dismiss this as not consistent. At most, you may classify it as eventually consistent. :) [1] http://www.larsgeorge.com/2009/10/hbase-architecture-101-storage.html [2] http://www.larsgeorge.com/2010/01/hbase-architecture-101-write-ahead-log.html E. Ribeiro
Re: Cold boot performance problems
Has anyone found solid step-by-step docs on how to raid0 the ephemeral disks in ec2 for use by Cassandra? On Fri, Oct 8, 2010 at 12:11 PM, Jason Horman jhor...@gmail.com wrote: We are currently using EBS with 4 volumes striped with LVM. Wow, we didn't realize you could raid the ephemeral disks. I thought the opinion for Cassandra though was that the ephemeral disks were dangerous. We have lost a few machines over the past year, but replicas hopefully prevent real trouble. How about the sharding strategies? Is it worth it to investigate sharding out via multiple keyspaces? Would order preserving partitioning help group indexes better for users? On Fri, Oct 8, 2010 at 1:53 PM, Jonathan Ellis jbel...@gmail.com wrote: Two things that can help: In 0.6.5, enable the dynamic snitch with -Dcassandra.dynamic_snitch_enabled=true -Dcassandra.dynamic_snitch=cassandra.dynamic_snitch_enabled which, if you are doing a rolling restart, will let other nodes route around the slow node (at CL.ONE) until it's warmed up (by the read repairs in the background). In 0.6.6, we've added save/load of the Cassandra caches: https://issues.apache.org/jira/browse/CASSANDRA-1417 Finally: we recommend using raid0 ephemeral disks on EC2 with L or XL instance sizes for better i/o performance. (Corey Hulen has some numbers at http://www.coreyhulen.org/?p=326.) On Fri, Oct 8, 2010 at 12:36 PM, Jason Horman jhor...@gmail.com wrote: We are experiencing very slow performance on Amazon EC2 after a cold boot. 10-20 tps. After the cache is primed things are much better, but it would be nice if users who aren't in cache didn't experience such slow performance. Before dumping a bunch of config I just had some general questions. We are using uuid keys, 40m of them, and the random partitioner. Typical access pattern is reading 200-300 keys in a single web request. Are uuid keys going to be painful b/c they are so random?
Should we be using less random keys, maybe with a shard prefix (01-80), and make sure that our tokens group user data together on the cluster (via the order preserving partitioner)? Would the order preserving partitioner be a better option in the sense that it would group a single user's data to a single set of machines (if we added a prefix to the uuid)? Is there any benefit to doing sharding of our own via Keyspaces: 01-80 keyspaces to split up the data files? (We already have 80 mysql shards we are migrating from, so doing this wouldn't be terrible implementation wise.) Should a goal be to get the data/index files as small as possible? Is there a size at which they become problematic? (Amazon EC2/EBS fyi)
- Via more servers
- Via more cassandra instances on the same server
- Via manual sharding by keyspace
- Via manual sharding by columnfamily
Thanks, -- -jason horman -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com -- -jason
Re: Advice on settings
Also, as a note related to EC2, choose whether you want to be in multiple availability zones. The highest performance possible is to be in a single AZ, as all those machines will have *very* high speed interconnects. But, individual AZs also can suffer outages. You can distribute your instances across, say, 2 AZs, and then use a RackAwareStrategy to force replication to put at least 1 copy of the data into the other AZ. Also, it's easiest to stay within a single Region (in EC2-speak). This allows you to use the internal IP addresses for Gossip and Thrift connections - which means you do not pay inbound-outbound fees for the data xfer. HTH, Dave Viner On Thu, Oct 7, 2010 at 10:26 AM, B. Todd Burruss bburr...@real.com wrote: if you are updating columns quite rapidly, you will scatter the columns over many sstables as you update them over time. this means that a read of a specific column will require looking at more sstables to find the data. performing a compaction (using nodetool) will merge the sstables into one making your reads more performant. of course the more columns, the more scattering around, the more I/O. to your point about sharing the data around. adding more machines is always a good thing to spread the load - you add RAM, CPU, and persistent storage to the cluster. there probably is some point where enough machines creates a lot of network traffic, but 10 or 20 machines shouldn't be an issue. don't worry about trying to hit a node that has the data unless your machines are connected across slow network links. On 10/07/2010 12:48 AM, Dave Gardner wrote: Hi all We're rolling out a Cassandra cluster on EC2 and I've got a couple if questions about settings. I'm interested to hear what other people have experienced with different values and generally seek advice. *gcgraceseconds* Currently we configure one setting for all CFs. We experimented with this a bit during testing, including changing from the default (10 days) to 3 hours. 
Our use case involves lots of rewriting of the columns for any given key. We probably rewrite around 5 million per day. We are thinking of setting this to around 3 days for production so that we don't have old copies of data hanging around. Is there anything obviously wrong with this? Out of curiosity, would there be any performance issues if we had this set to 30 days? My understanding is that it would only affect the amount of disk space used. However Ben Black suggests here that the cleanup will actually only impact data deleted through the API: http://comments.gmane.org/gmane.comp.db.cassandra.user/4437 In this case, I guess that we need not worry too much about the setting since we are actually updating, never deleting. Is this the case? *Replication factor* Our use case is many more writes than reads, but when we do have reads they're random (we're not currently using hadoop to read entire CFs). I'm wondering what sort of level of RF to have for a cluster. We currently have 12 nodes and RF=4. To improve read performance I'm thinking of upping the number of nodes and keeping RF at 4. My understanding is that this means we're sharing the data around more. However it also means a client read to a random node has less chance of actually connecting to one of the nodes with the data on. I'm assuming this is fine. What sort of RFs do others use? With a huge cluster like the recently mentioned 400 node US govt cluster, what sort of RF is sane? On a similar note (read perf), I'm guessing that reading at a weak consistency level will bring gains. Gleaned from this slide amongst other places: http://www.slideshare.net/mobile/benjaminblack/introduction-to-cassandra-replication-and-consistency#13 Is this true, or will read repair still hammer disks in all the machines with the data on? Again I guess it's better to have a low RF so there are fewer copies of the data to inspect when doing read repair. Will this result in better read performance? Thanks dave
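The point about a random client connection landing on a replica can be made concrete. Assuming uniformly distributed tokens (a back-of-envelope sketch that ignores token imbalance), a randomly chosen node holds a given key with probability RF/N:

```python
def replica_hit_chance(rf, nodes):
    """Probability that a uniformly chosen node is a replica for a given key,
    assuming an evenly balanced ring."""
    return float(rf) / nodes

# 12 nodes at RF=4: one chance in three that the contacted node has the data.
# Growing to 24 nodes while keeping RF=4 halves that chance, so more reads
# pay an extra network hop to reach a replica.
print(replica_hit_chance(4, 12))
print(replica_hit_chance(4, 24))
```

This is only about which node coordinates the read; correctness is unaffected, since any node can proxy the request to the replicas.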
Re: Dazed and confused with Cassandra on EC2 ...
Hi Jedd, I'm using Cassandra on EC2 as well - so I'm quite interested. Just to clarify your post - it sounds like you have 4 questions/issues:
1. Writes have slowed down significantly. What's the logical explanation? And what are the logical solutions/options to solve it?
2. You grew from 2 nodes to 4, but the original 2 nodes have 200GB and the 2 new ones have 40 GB. What's the recommended practice for rebalancing (i.e., when should you do it), what's the actual procedure, and what's the expected impact of it?
3. Cassandra nodes disappear. (I'm not quite clear what this means.)
4. You took a machine offline without decommissioning it from the cluster. Now the machine is gone, but the other nodes (in Gossip logs) report that they are still looking for it. How do you stop nodes from looking for a removed node?
I'm not trying to put words in your mouth - but I want to make sure that I understand what you're asking about (because I have similar ec2-related thoughts). Let me know if this is an accurate summary. Dave Viner On Fri, Sep 17, 2010 at 7:41 AM, Jedd Rashbrooke jedd.rashbro...@imagini.net wrote: Howdi, I've just landed in an experiment to get Cassandra going, fed by PHP via Thrift via Hadoop, all running on EC2. I've been lurking a bit on the list for a couple of weeks, mostly reading any threads with the word 'performance' in them. Few people have anything polite to say about EC2, but I want to just throw out some observations and get some feedback on whether what I'm seeing is even approaching any kind of normal. My background is mostly *nix and networking, with half-way decent understanding of DB's -- but Cassandra, Hadoop, Thrift and EC2 are all fairly new to me. We're using a four-node decently-specced (m2.2xlarge, if you're EC2-aware) cluster - 32GB, 4-core, if you're not :) I'm using Ubuntu with the Deb packages for Cassandra and Hadoop, and some fairly conservative tweaks to things like JVM memory (bumping them up to 4GB, then 16GB).
One of our insert jobs - a mapper only process - was running pretty fast a few days ago. Somewhere around a million lines of input, split into a dozen files, inserting via a Hadoop job in about a half hour. Happy times. This was when the cluster was modestly sized - 20-50GB. It's now about 200GB, and performance has dropped by an order of magnitude - perhaps 5-6 hours to do the same amount of work, using the same codebase and the same input data. I've read that reads slow down as the DB grows, but had an expectation that writes would be consistently snappy. How surprising is this performance drop given the DB growth? My 4-node cluster started off as a 2-node - and now nodetool ring suggests the two original nodes are 200GB each, and the newer two are 40GB. Is this normal? Would a rebalance likely improve performance substantially? My feeling is that it would be expensive to perform. EC2 seems to get a bad rap, and we're feeling quite a bit of pain, which is sad given the (on paper) spec of the machines, and the cost - over US$3k/month for the cluster. I've split Cassandra commitlog, Cassandra data, hadoop(hdfs) and tmp onto separate 'spindles' - observations so far suggest late '90's disk IO speed (15MB max sustained writes, one machine, one disk to another), and consistently inconsistent performance (identical machine next to it running the same task at the same time was getting 28MB) over several hours. Cassandra nodes seems to disappear too easily - even with just one core (out of four) maxed out with a jsvc task, minimal disk or network activity, the machine feels very sluggish. Tailing the cassandra logs hints that it's doing hinted handoffs and occasionally compaction tasks. I've never seen this kind of behaviour - and suspect this is more a feature of EC2. Gossip now seems to be pining the loss of an older machine (that I stupidly took offline briefly - EC2 gave it a new IP address when it came back). 
There's nothing in the storage-conf to refer to the old address, all 4 Cassandra daemons have been re-started several times since, but gossip occasionally (a day later) says that it is looking for it - and more worrying that it is 'now part of the cluster'. I'm unsure if this is just an irritation or part of the underlying problem. What I'm going to do next is to try importing some data into a local machine - it's just time-consuming to pull in our S3 data - and see if I can fake up to around the same capacity and watch for performance degradation. I'm also toying with the idea of going from 4 to 8 nodes, but I'm clueless on whether / how much this would help. As I say, though, I'm keen on anyone else's observations on my observations - I'm painfully aware that I'm juggling a lot of unknown factors at the moment. cheers, Jedd.
Re: Monitoring with Cacti
I haven't tried cacti, but I'm using CloudKick as an external service for monitoring Cassandra. It's super easy to get set up. Happy to share my setup if that'd help. It doesn't currently monitor JMX information, but it does offer some basic checks like thread pool and column family stats - https://support.cloudkick.com/Cassandra_Checks. Dave Viner On Fri, Sep 10, 2010 at 8:31 PM, Edward Capriolo edlinuxg...@gmail.com wrote: On Fri, Sep 10, 2010 at 7:29 PM, aaron morton aa...@thelastpickle.com wrote: Am going through the rather painful process of trying to monitor cassandra using Cacti (it's what we use at work). At the moment it feels like a losing battle :) Does anyone know of some cacti resources for monitoring the JVM or Cassandra metrics other than... mysql-cacti-templates http://code.google.com/p/mysql-cacti-templates/ - provides templates and data sources that require ssh and can monitor JVM heap and a few things. Cassandra-cacti-m6 http://www.jointhegrid.com/cassandra/cassandra-cacti-m6.jsp Coded for version 0.6*; have made some changes to stop it looking for stats that no longer exist. Missing some metrics I think, but it's probably the best bet so far. If I get it working I'll contribute it back to them. Most of the problems were probably down to how much effort it takes to set up cacti. jmxterm http://www.cyclopsgroup.org/projects/jmxterm/ allows for command line access to JMX. I started down the path of writing a cacti data source to use this just to see how it worked. Looks like a lot of work. Thanks for any advice. Aaron Setting up cacti is easy the second time, and the third time :) As for cassandra-cacti-m6 (I am the author): unfortunately, I have been fighting the jmx switcharoo battle for about 3 years now, with hadoop/hbase/cassandra/hornetq/vserver. In a nutshell there is ALWAYS work involved. First, because as you noticed attributes get changed/removed/added/renamed. Second, it takes a human to logically group things together.
For example, say you have two items, cache hits and cache misses. You really do not want two separate graphs that will scale independently. You want one slick stacked graph, with nice colors, and you want a CDEF to calculate the cache hit percentage by dividing one into the other and show that at the bottom. If you want to make a 0.7 branch of cassandra-cacti-m6 I would love the help. We are not on 0.7 yet so I have not had the time just to go out and make graphs for a version we are not using yet :) but if you come up with patches they are happily accepted. Edward
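The CDEF idea above is just arithmetic over the two JMX counters. A minimal sketch of that calculation (illustration only, not part of cassandra-cacti-m6 or the Cacti CDEF syntax):

```python
# Illustrative sketch: combine the two JMX counters mentioned above
# (cache hits and cache misses) into the single hit-percentage figure
# a Cacti CDEF would display at the bottom of the stacked graph.
def cache_hit_percentage(hits, misses):
    """Return the hit rate as a percentage; 0.0 when no requests were seen."""
    total = hits + misses
    if total == 0:
        return 0.0
    return 100.0 * hits / total

print(cache_hit_percentage(750, 250))  # 75.0
```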
Re: Cassandra HAProxy
FWIW - we've been using HAProxy in front of a cassandra cluster in production and haven't run into any problems yet. It sounds like our cluster is tiny in comparison to Anthony M's cluster. But I just wanted to mention that others out there are doing the same. One thing in this thread that I thought was interesting is Ben's initial comment that the presence of the proxy precludes clients properly backing off from nodes returning errors. I think it would be very cool if someone implemented a mechanism for haproxy to detect the error nodes and then enable it to drop those nodes from the rotation. I'd be happy to help with this, as I know how it works with haproxy and standard web servers or other tcp servers. But, I'm not sure how to make it work with Cassandra, since, as Ben points out, it can return valid tcp responses (that say error-condition) on the standard port. Dave Viner On Sun, Aug 29, 2010 at 4:48 PM, Anthony Molinaro antho...@alumni.caltech.edu wrote: On Sun, Aug 29, 2010 at 12:20:10PM -0700, Benjamin Black wrote: On Sun, Aug 29, 2010 at 11:04 AM, Anthony Molinaro antho...@alumni.caltech.edu wrote: I don't know, it seems to tax our setup of 39 extra large ec2 nodes, it's also closer to 24000 reqs/sec at peak since there are different tables (2 tables for each read and 2 for each write) Could you clarify what you mean here? On the face of it, this performance seems really poor given the number and size of nodes. As you say I would expect to achieve much better performance given the node size, but if you go back and look through some of the issues we've seen over time, you'll find we've been hit with nodes being too small, having too few nodes to deal with request volume, having OOMs, having bad sstables, having the ring appear different to different nodes, and several other problems. 
Many of the I/O problems presented themselves as MessageDeserializer pool backups (although we stopped having these since Jonathan was by and suggested a row cache of about 1GB, thanks Riptano!). We currently have mystery OOMs which are probably caused by GC storms during compactions (although usually the nodes restart and compact fine, so who knows). I also regularly watch nodes go away for 30 seconds or so (logs show node goes dead, then comes back to life a few seconds later). I've sort of given up worrying about these, as we are in the process of moving this cluster to our own machines in a colo, so I figure I should wait until they are moved, and see how the new machines do before I worry more about performance. -Anthony -- Anthony Molinaro antho...@alumni.caltech.edu
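The setup Dave describes at the top of this thread could look something like the fragment below. This is a hypothetical minimal haproxy.cfg sketch, not the configuration anyone in the thread actually posted; as Ben points out, a plain TCP health check only proves the port accepts connections and cannot see Thrift-level error responses:

```
# Hypothetical haproxy.cfg fragment fronting Cassandra's Thrift port.
# The TCP "check" only detects a dead/unreachable port; it cannot
# detect nodes that accept connections but return Thrift errors.
listen cassandra
    bind *:9160
    mode tcp
    balance roundrobin
    server cass1 10.0.0.1:9160 check inter 2000 fall 3
    server cass2 10.0.0.2:9160 check inter 2000 fall 3
```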
Re: Thrift + PHP: help!
I am a user of the perl api - so I'd like to lurk in case there are things that can benefit both perl and php. Dave Viner On Wed, Aug 18, 2010 at 1:35 PM, Gabriel Sosa sosagabr...@gmail.com wrote: I would like to help with this too! On Wed, Aug 18, 2010 at 5:15 PM, Bas Kok bakot...@gmail.com wrote: I have some experience in this area and would be happy to help out as well. -- Bas On Wed, Aug 18, 2010 at 8:26 PM, Dave Gardner dave.gard...@imagini.net wrote: I'm happy to assist. Having a robust PHP implementation would help us greatly. Dave On Wednesday, August 18, 2010, Jeremy Hanna jeremy.hanna1...@gmail.com wrote: As Jonathan mentioned in his keynote at the Cassandra Summit, the thrift + php has some bugs and is maintainerless right now. Is there anyone out there in the Cassandra community that is adept at PHP that could help out with the thrift + php work? It would benefit all who use Cassandra with PHP. Bryan Duxbury, a thrift developer/committer, said if someone really wanted to have a go at making thrift php robust, i would assist them heavily. Please respond to this thread or ask Bryan in the channel. -- Gabriel Sosa Si buscas resultados distintos, no hagas siempre lo mismo. - Einstein
Re: error using get_range_slice with random partitioner
Funny you should ask... I just went through the same exercise. You must use Cassandra 0.6.4. Otherwise you will get duplicate keys. However, here is a snippet of perl that you can use.

use Data::Dumper;   # for reporting errors below

our $WANTED_COLUMN_NAME = 'mycol';

get_key_to_one_column_map('myKeySpace', 'myColFamily', 'mySuperCol', QUORUM, \%map);

sub get_key_to_one_column_map {
    my ($keyspace, $column_family_name, $super_column_name, $consistency_level, $returned_keys) = @_;
    my ($socket, $transport, $protocol, $client, $result, $predicate, $column_parent, $keyrange);

    $column_parent = new Cassandra::ColumnParent();
    $column_parent->{'column_family'} = $column_family_name;
    $column_parent->{'super_column'}  = $super_column_name;

    $keyrange = new Cassandra::KeyRange({
        'start_key' => '', 'end_key' => '', 'count' => 10
    });

    $predicate = new Cassandra::SlicePredicate();
    $predicate->{'column_names'} = [$WANTED_COLUMN_NAME];

    eval {
        $socket    = new Thrift::Socket($CASSANDRA_HOST, $CASSANDRA_PORT);
        $transport = new Thrift::BufferedTransport($socket, 1024, 1024);
        $protocol  = new Thrift::BinaryProtocol($transport);
        $client    = new Cassandra::CassandraClient($protocol);
        $transport->open();

        my ($next_start_key, $one_res, $iteration, $have_more, $value, $local_count, $previous_start_key);
        $iteration = 0;
        $have_more = 1;
        while ($have_more == 1) {
            $iteration++;
            $result = undef;
            $result = $client->get_range_slices($keyspace, $column_parent, $predicate, $keyrange, $consistency_level);

            # on success, $result is an array of objects.
            if (scalar(@$result) == 1) {
                # we only got 1 result... check to see if it's the
                # same key as the start key... if so, we're done.
                if ($result->[0]->{'key'} eq $keyrange->{'start_key'}) {
                    $have_more = 0;
                    last;
                }
            }

            # check to see if we are starting with some value;
            # if so, we throw away the first result.
            if ($keyrange->{'start_key'}) {
                shift(@$result);
            }
            if (scalar(@$result) == 0) {
                $have_more = 0;
                last;
            }

            $previous_start_key = $keyrange->{'start_key'};
            $local_count = 0;
            for (my $r = 0; $r < scalar(@$result); $r++) {
                $one_res = $result->[$r];
                $next_start_key = $one_res->{'key'};
                $keyrange->{'start_key'} = $next_start_key;
                if (!exists($returned_keys->{$next_start_key})) {
                    $have_more = 1;
                    $local_count++;
                }
                next if (scalar(@{ $one_res->{'columns'} }) == 0);
                $value = undef;
                for (my $i = 0; $i < scalar(@{ $one_res->{'columns'} }); $i++) {
                    if ($one_res->{'columns'}->[$i]->{'column'}->{'name'} eq $WANTED_COLUMN_NAME) {
                        $value = $one_res->{'columns'}->[$i]->{'column'}->{'value'};
                        if (!exists($returned_keys->{$next_start_key})) {
                            $returned_keys->{$next_start_key} = $value;
                        } else {
                            # NOTE: prior to Cassandra 0.6.4, get_range_slices returns duplicates sometimes.
                            #warn "Found second value for key [$next_start_key] was [" . $returned_keys->{$next_start_key} . "] now [$value]!";
                        }
                    }
                }
                $have_more = 1;
            } # end results loop

            if ($keyrange->{'start_key'} eq $previous_start_key) {
                $have_more = 0;
            }
        } # end while() loop

        $transport->close();
    };
    if ($@) {
        warn "Problem with Cassandra: " . Dumper($@);
    }

    # cleanup
    undef $client;
    undef $protocol;
    undef $transport;
    undef $socket;
}

HTH Dave Viner On Fri, Aug 6, 2010 at 7:45 AM, Adam Crain adam.cr...@greenenergycorp.com wrote: Thomas, That was indeed the source of the problem. I naively assumed that the token range would help me avoid retrieving duplicate rows. If you iterate over the keys, how do you avoid retrieving duplicate keys? I tried this morning and I seem to get odd results. Maybe this is just a consequence of the random partitioner. I really don't care about the order of the iteration, but it is important that I see each key only once and that I see all keys. -Adam -----Original Message----- From: th.hel...@gmail.com on behalf of Thomas Heller Sent: Fri 8/6/2010 7:27 AM To: user@cassandra.apache.org Subject: Re: error using get_range_slice
Re: Please need help with Munin: Cassandra Munin plugin problem
Is your code posted somewhere such that others could try it? On Thu, Jul 29, 2010 at 5:57 AM, Miriam Allalouf miriam.allal...@gmail.com wrote: Hi, Please, can someone help us with Munin?? Thanks, Miriam On Mon, Jul 26, 2010 at 1:58 PM, osishkin osishkin osish...@gmail.com wrote: Hi, I'm trying to use Munin to monitor cassandra. I've seen other people using munin here, so I hope someone ran into this problem. The default plugins are working, so this is definitely a problem with the cassandra plugin. I keep getting errors such as: Exception in thread main java.lang.NoClassDefFoundError: javax.management.remote.JMXConnector at org.munin.JMXQuery.disconnect(Unknown Source) at org.munin.JMXQuery.main(Unknown Source) Plugin compactions_bytes exited with status 256. Exception in thread main java.lang.NoClassDefFoundError: javax.management.ObjectName at org.munin.Configuration$FieldProperties.set(Unknown Source) at org.munin.Configuration.parseString(Unknown Source) at org.munin.Configuration.parse(Unknown Source) at org.munin.JMXQuery.main(Unknown Source) Plugin jvm_memory exited with status 256. However when I call the plugin directly from my console (from /etc/munin/plugins) it works. So there must be something very basic I'm missing here. I'm using RHEL 5 with IBM jre 1.6. Anyone encountered a similar problem? I apologize for writing on an issue that's not purely cassandra here. Thank you
iterating over all rows keys gets duplicate key returns
Hi all, I'm having a strange result in trying to iterate over all row keys for a particular column family. The iteration works, but I see the same row key returned multiple times during the iteration. I'm using cassandra 0.6.3, and I've put the code in use at http://pastebin.com/zz5xJQ8f Using get_range_slices() and a keyrange with incrementing start_key's, shouldn't I get an enumeration of the keys such that each key appears only once? In iterating 1000 times, I was given the same rows 8322 times. Somehow it seems like something is amiss in how I'm performing the iteration over the keys. Any suggestions on how I can properly iterate? Thanks Dave Viner
Re: iterating over all rows keys gets duplicate key returns
Just as a followup, here's what seems to be the resolution: 1. 0.6.4 should fix this problem. 2. Using OPP as the DHT should solve it as well. 3. Prior to 0.6.4, when using RandomPartitioner as the DHT, there's no good way to guarantee that you see *all* row keys for a column family. Strategies tried: A. iterate over the keys returned until the start_key is identical to the last key returned. When start_key == last key returned, exit. - fails since duplicate keys can appear anywhere, even as the last key returned. B. iterate over keys returned, adding the keys to a hash table. When an iteration returns no new keys, assume that all keys have been seen and exit. - this also fails, since a particular result set can be full of duplicates, but the iteration has not traversed the entire row-key spectrum. Dave Viner On Wed, Jul 28, 2010 at 3:48 PM, Rob Coli rc...@digg.com wrote: On 7/28/10 2:43 PM, Dave Viner wrote: Hi all, I'm having a strange result in trying to iterate over all row keys for a particular column family. The iteration works, but I see the same row key returned multiple times during the iteration. I'm using cassandra 0.6.3, and I've put the code in use at For those not playing along on IRC, this was determined to be caused by : http://issues.apache.org/jira/browse/CASSANDRA-1042 Which is fixed in 0.6.4. =Rob
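Strategy B's "seen" set can still be used to de-duplicate what you report, even though (as noted above) it cannot prove you have seen every key pre-0.6.4. A sketch of that pagination-with-dedup loop, where fake_get_range_slices is a stand-in for the real Thrift call and the key list is invented to show the duplicate behavior:

```python
# Sketch of strategy B: page through keys, resuming from the last key
# returned, and collect them in a set so duplicates (as pre-0.6.4
# get_range_slices produced under RandomPartitioner) appear only once.
def fake_get_range_slices(start_key, count):
    keys = ["a", "b", "b", "c", "c", "d"]  # duplicates, as the bug produced
    if start_key:
        keys = keys[keys.index(start_key) + 1:]  # resume after start_key
    return keys[:count]

def iterate_all_keys(page_size=3):
    seen = set()
    start_key = ""
    while True:
        batch = fake_get_range_slices(start_key, page_size)
        if not batch:
            break
        seen.update(batch)
        start_key = batch[-1]  # resume from the last key returned
    return seen

print(sorted(iterate_all_keys()))
```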
Re: Quick Poll: Server names
I've seen used several... names of children of employees of the company names of streets near office names of diseases (lead to very hard to spell names after a while, but was quite educational for most developers) names of characters from famous books (e.g., lord of the rings, asimov novels, etc) On Tue, Jul 27, 2010 at 7:54 AM, uncle mantis uncleman...@gmail.com wrote: I will be naming my servers after insect family names. What do you all use for yours? If this is something that is too off topic please contact a moderator. Regards, Michael
Re: non blocking Cassandra with Tornado
FWIW - I think this is actually more of a question about Thrift than about Cassandra. If I understand you correctly, you're looking for an async client. Cassandra lives on the other side of the thrift service. So, you need a client that can speak Thrift asynchronously. You might check out the new async Thrift client in Java for inspiration: http://blog.rapleaf.com/dev/2010/06/23/fully-async-thrift-client-in-java/ Or, even better, port the Thrift async client to work for python and other languages. Dave Viner On Tue, Jul 27, 2010 at 8:44 AM, Peter Schuller peter.schul...@infidyne.com wrote: The idea is rather than calling a cassandra client function like get_slice(), call the send_get_slice() then have a non blocking wait on the socket thrift is using, then call recv_get_slice(). (disclaimer: I've never used tornado) Without looking at the generated thrift code, this sounds dangerous. What happens if send_get_slice() blocks? What happens if recv_get_slice() has to block because you didn't happen to receive the response in one packet? Normally you're either doing blocking code or callback oriented reactive code. It sounds like you're trying to use blocking calls in a non-blocking context under the assumption that readable data on the socket means the entire response is readable, and that the socket being writable means that the entire request can be written without blocking. This might seem to work and you may not block, or block only briefly. Until, for example, a TCP connection stalls and your entire event loop hangs due to a blocking read. Apologies if I'm misunderstanding what you're trying to do. -- / Peter Schuller
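Peter's warning can be made concrete with a toy version of the send / wait-readable / recv split, using a local socketpair instead of a real Thrift client (none of the names below are Thrift API):

```python
# Minimal illustration of the send_xxx / wait-readable / recv_xxx split
# described above, using a local socketpair. "Readable" only guarantees
# that *some* bytes arrived -- the response may come in several chunks,
# so a real client must buffer until it has a complete Thrift frame.
import selectors
import socket

client, server = socket.socketpair()
sel = selectors.DefaultSelector()
sel.register(client, selectors.EVENT_READ)

server.sendall(b"partial-re")   # server sends only part of a "response"

events = sel.select(timeout=1)  # non-blocking wait, as in an event loop
assert events, "socket should be readable"
chunk = client.recv(4096)       # may be less than the full message
print(chunk)

sel.unregister(client)
client.close()
server.close()
```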
Re: SV: How to stop cassandra server, installed from debian/ubuntupackage
Yes... if you're using debian cassandra you can do: /etc/init.d/cassandra stop On Mon, Jul 26, 2010 at 8:05 AM, Lee Parker l...@socialagency.com wrote: Which debian/ubuntu packages are you using? I am using the ones that are maintained by Eric Evans and the init.d script stops the server correctly. Lee Parker On Mon, Jul 26, 2010 at 9:22 AM, miche...@hermanus.cc wrote: This is how I have been doing it: pkill cassandra, then I do a netstat -anp | grep 8080, look for the java service id running, and kill that java id. --Original Message-- From: Thorvaldsson Justus To: 'user@cassandra.apache.org' ReplyTo: user@cassandra.apache.org Subject: SV: How to stop cassandra server, installed from debian/ubuntu package Sent: Jul 26, 2010 4:14 PM I use standard close, CTRL C, I don't run it as daemon Dunno but think it works fine =) -----Original Message----- From: o...@notrly.com [mailto:o...@notrly.com] Sent: 26 July 2010 15:52 To: user@cassandra.apache.org Subject: How to stop cassandra server, installed from debian/ubuntu package Hi, this might be a dumb question, but I was wondering how do i stop the cassandra server.. I installed it using the debian package, so i start cassandra by running /etc/init.d/cassandra. I looked at the script and tried /etc/init.d/cassandra stop, but it looks like it just tries to start cassandra again, so i get the port in use exception. Thanks Sent via my BlackBerry from Vodacom - let your email find you!
Re: Design questions/Schema help
AFAIK, atomic increments are not available. There recently has been quite a bit of discussion about them. So, you might search the archives. Dave Viner On Mon, Jul 26, 2010 at 7:02 PM, Mark static.void@gmail.com wrote: On 7/26/10 6:06 PM, Dave Viner wrote: I'd love to hear others' opinions here... but here are my 2 cents. With Cassandra, you need to think of the queries - which you've pretty much done. For the most popular queries, you could do something like: <ColumnFamily Name="QueriesCounted" CompareWith="UTF8Type" /> And then access it as: key-space.QueriesCounted['query-foo-bar'] = $count; This makes it easy to get the count for any particular query. I'm not sure the best way to store the top counts idea. Perhaps a secondary process which iterates over all the queries, sorts the query values by count, and then stores them into another ColumnFamily. You could use the same idea for the last query (session ids by query): <ColumnFamily Name="QueriesRecorded" CompareWith="UTF8Type" ColumnType="Super" CompareSubcolumnsWith="TimeUUIDType" /> And then access it as: key-space.QueriesRecorded['query-foo-bar'][timeuuid] = session-id; Actually, if you used that idea (queries-recorded), you could generate the counts and aggregates from that directly in a hadoop post-processing step... But perhaps others will have better ideas. If you haven't read http://arin.me/blog/wtf-is-a-supercolumn-cassandra-data-model, go read it now. It won't answer your question directly, but will describe the process of modeling a blog in cassandra so you can get a sense of the process. Dave Viner On Mon, Jul 26, 2010 at 4:46 PM, Mark static.void@gmail.com wrote: We are thinking about using Cassandra to store our search logs. Can someone point me in the right direction/lend some guidance on design? I am new to Cassandra and I am having trouble wrapping my head around some of these new concepts. My brain keeps wanting to go back to a RDBMS design. 
We will be storing the user query, # of hits returned and their session id. We would like to be able to answer the following questions. - What are the n most popular queries and their counts within the last x (mins/hours/days/etc). Basically the most popular searches within a given time range. - What is the most popular query within the last x where hits = 0. Same as above but with an extra where clause - For session id x give me all their other queries - What are all the session ids that searched for 'foos' We accomplish the above functionality w/ MySQL using 2 tables. One for the raw search log information and the other to keep the aggregate/running counts of queries. Would this sort of ad-hoc querying be better implemented using Hadoop + Hive? If so, should I be storing all this information in Cassandra then using Hadoop to retrieve it? Thanks for your suggestions Perhaps a secondary process which iterates over all the queries, sorts the query values by count, and then stores them into another ColumnFamily. - I was trying to avoid this. Is there some sort of atomic increment feature available? I guess I could do the same thing we are currently doing which is... a) store full query details into table A b) query table B for aggregate count of query 'foo' then store count + 1
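The "secondary process" aggregation suggested in this thread amounts to a scan-and-count over the raw log. A sketch, with an in-memory list of (timestamp, query, session_id) tuples standing in for rows scanned out of a QueriesRecorded-style column family (the data and names here are invented for illustration):

```python
# Sketch of the secondary aggregation process: count queries seen since a
# given timestamp and return the top N, as the "most popular queries in
# the last x" question requires.
from collections import Counter

log = [
    (1000, "cassandra backup", "s1"),
    (1010, "thrift php",       "s2"),
    (1020, "cassandra backup", "s3"),
    (1030, "cassandra backup", "s1"),
]

def top_queries(rows, since, n=2):
    counts = Counter(q for ts, q, sid in rows if ts >= since)
    return counts.most_common(n)

print(top_queries(log, since=1005))
```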
Re: Cassandra Chef recipe and EC2 snitch
You don't need the ec2snitch necessarily. AFAIK, It's meant to be a better way of detecting where your ec2 instances are. But, unless you're popping instances all the time, I don't think it's worth it. Check out the step-by-step guide on that same page. Pure EC2 api calls to setup your cluster. You can also use rackaware-ness in EC2. Just add in the PropertyFile endpoint and put your rack file in /etc/cassandra/rack.properties. Dave Viner On Thu, Jul 22, 2010 at 10:08 AM, Allan Carroll alla...@gmail.com wrote: Hi all, I'm setting up a new cluster on EC2 for the first time and looking at the wiki cloud setup page (http://wiki.apache.org/cassandra/CloudConfig). There's a chef recipe linked there that mentions an ec2snitch. The link doesn't seem to go where it says it does. Does anyone know where those resources have gone or are they no longer available? Thanks -Allan
Re: Suggestion for the storage.conf
Added: http://wiki.apache.org/cassandra/StorageConfiguration On Mon, Jul 19, 2010 at 2:55 AM, Dimitry Lvovsky dimi...@reviewpro.com wrote: I think it would be a good idea to add a bit more explanation to storage-conf.xml/wiki regarding the replication factor. It caused some confusion until we dug around the mail archive to realize that our UnavailableExceptions were caused by our incorrect assumption: RF=1 does NOT mean that this node's data will be replicated to one other node -- but rather that the data will only exist on one node :-s. -- Dimitry Lvovsky ReviewPro www.reviewpro.com
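The confusion above comes down to ReplicationFactor counting total copies, not extra copies. A sketch of rack-unaware-style placement (replicas are the RF consecutive nodes clockwise from the primary on the ring; the node names are invented):

```python
# Sketch of why ReplicationFactor=1 means "one copy total, no extra
# replica": with RackUnawareStrategy-style placement, a key's replicas
# are the RF consecutive nodes clockwise from its primary on the ring.
def replicas_for(primary_index, ring, rf):
    return [ring[(primary_index + i) % len(ring)] for i in range(rf)]

ring = ["node-a", "node-b", "node-c", "node-d"]
print(replicas_for(1, ring, rf=1))  # RF=1: the data lives on one node only
print(replicas_for(1, ring, rf=3))
```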
Re: Cassandra benchmarking on Rackspace Cloud
This may be too much work... but you might consider building an Amazon EC2 AMI of your nodes. This would let others quickly boot up your nodes and run the stress test against it. I know you mentioned that you're using Rackspace Cloud. I'm not super familiar with the internals of RSCloud, but perhaps they have something similar? This feels like the kind of problem that might be easier for someone else to set up and quickly test. (The beauty of the virtual server - quick setup and quick tear down) Dave Viner On Mon, Jul 19, 2010 at 10:24 AM, Peter Schuller peter.schul...@infidyne.com wrote: I ran this test previously on the cloud, with similar results:

nodes  reads/sec
1      24,000
2      21,000
3      21,000
4      21,000
5      21,000
6      21,000

In fact, I ran it twice out of disbelief (on different nodes the second time) to essentially identical results. Something other than cassandra just *has* to be fishy here unless there is some kind of bug causing communication with nodes that should not be involved. It really sounds like there is a hidden bottleneck somewhere. You already mention that you've run multiple test clients so that the client is not a bottleneck. What about bandwidth? I could imagine bandwidth adding up a bit given those request rates. Is it possible all the nodes are communicating with each other via some bottleneck (like 100 mbit)? What does the load look like when you observe the nodes during bottlenecking? How much bandwidth is each machine pushing (ifstat, nload, etc); is Cassandra obviously CPU bound or does it look idle? Presumably Cassandra is not perfectly concurrent and you may not saturate 8 cores under this load necessarily, but as you add more and more nodes and still only reach 21k/sec you should come past a point where you're not even saturating a single core... *Something* else is probably going on. -- / Peter Schuller
Re: A very short summary on Cassandra for a book
I am no expert... but parts seem accurate, parts not. Cassandra stores four or five dimension associated arrays not sure what you're counting as a dimension of the associated array, but here are the 2 associative array-like syntaxes: ColumnFamily[row-key][column-name] = value1 ColumnFamily[row-key][super-column-name][column-name] = value2 The first dimension is fixed on creation of the database but the rest can be infinitely large I don't understand this sentence. The definition of a ColumnFamily is set by the configuration file (storage-conf.xml). If you change it, and restart a node, that node will use the new definition of the CF. It is true that the number of columns can be large. I have no idea if it's actually infinite - but more or less. Also, it's probably not precise to call it a database, since that tends to invoke images of things like MySQL, Oracle, Postgres, etc. Inserts are super fast and can happen to any database server in the cluster. Yes, this is true. However, the system is append only there so there is no in-place update operation like increment The first part is not quite true. There is appending, but there is no increment that's guaranteed universal. Cassandra is eventually consistent. So atomic increment doesn't really work in the eventual world. But, more precisely, one can add, update, change, modify, delete rows, columns, and values at any time from any node. Also sorting happens on insert time Yes, I believe this is true. Dave Viner On Thu, Jul 15, 2010 at 4:26 PM, Karoly Negyesi chx1...@gmail.com wrote: Hi, I am writing a scalability chapter in a book and I need to mention Apache Cassandra although it's just a mention. Still I would not like to be sloppy and would like to get verification whether my summary is accurate. Cassandra stores four or five dimension associated arrays. The first dimension is fixed on creation of the database but the rest can be infinitely large. 
Inserts are super fast and can happen to any database server in the cluster. However, the system is append only there so there is no in-place update operation like increment. Also sorting happens on insert time. Thanks Karoly Negyesi
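The two addressing forms Dave spells out in his reply can be modeled as nested maps, which also shows where the "four or five dimensions" in the book draft come from (keyspace, row key, optional super column, column name). Illustration only, using invented keys and values:

```python
# The two addressing forms from the thread, modeled as nested dicts:
#   ColumnFamily[row-key][column-name] = value1
#   ColumnFamily[row-key][super-column-name][column-name] = value2
standard_cf = {"row1": {"colA": "value1"}}                # standard CF
super_cf = {"row1": {"superX": {"colA": "value2"}}}       # super CF

print(standard_cf["row1"]["colA"])
print(super_cf["row1"]["superX"]["colA"])
```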
Re: Elastic Load Balancing Cassandra
I haven't used ELB, but I've setup HAProxy to do it... appears to work well so far. Dave Viner On Tue, Jul 13, 2010 at 3:30 PM, Brian Helfrich helfrich9...@gmail.com wrote: Hi, has anyone been able to load balance a Cassandra cluster with an AWS Elastic Load Balancer? I've setup an ELB with the obvious settings (namely, --listener lb-port=9160,instance-port=9160,protocol=TCP) but clients simply hang trying to load records from the ELB hostname:9160. Thanks, --Brian.
Re: Is anyone using version 0.7 schema update API
Check out step 4 of this page: https://wiki.fourkitchens.com/display/PF/Using+Cassandra+with+PHP ./compiler/cpp/thrift -gen php ../PATH-TO-CASSANDRA/interface/cassandra.thrift That is how to compile the thrift client from the cassandra bindings. Just replace the php with the language of your choosing. According to http://wiki.apache.org/thrift/, Thrift has generators for C++, C#, Erlang, Haskell, Java, Objective C/Cocoa, OCaml, Perl, PHP, Python, Ruby, and Squeak HTH Dave Viner On Tue, Jul 13, 2010 at 6:05 PM, GH gavan.h...@gmail.com wrote: To be honest I do not know how to regenerate the bindings, I will look into that. Following your email, I went on and took the unit test code and created a client. Given that this code works I am guessing that the thrift bindings are in place and it is more that the client code does not support the new functions yet. I might be off track and don't know if there it is appropriate for someone as new to this as I am to make changes to the client and submit them (especially if some one else is already doing that). I could do that, if it helped the group. On Wed, Jul 14, 2010 at 2:12 AM, Benjamin Black b...@b3k.us wrote: I updated the Ruby client to 0.7, but I am not a Cassandra committer (and not much of a Java guy), so haven't touched the Java client. Is there more to it than regenerating Thrift bindings? On Tue, Jul 13, 2010 at 1:42 AM, GH gavan.h...@gmail.com wrote: They are not complicated, its more that they are not in the package that they should be in. I assume the client package exposes the functionality of the server and it does not have the ability to manage the tables in the database that to me seems to be extremely limiting. When I did not see that code in place I assume that it is not complete or that I have not got the right code drop. From your comments it sounds like you don't support the Java client code base in line with the ruby code. Which I think is limiting but is just the way it is. 
On Tue, Jul 13, 2010 at 8:53 AM, Benjamin Black b...@b3k.us wrote: I guess I don't understand what is so complicated about the schema management calls that numerous examples are needed. On Mon, Jul 12, 2010 at 4:43 AM, GH gavan.h...@gmail.com wrote: Hi, My problem is that I cannot locate Java equivalents to the api calls you present in the ruby files you have presented. They are not visible in the java client packages I have (my copy of trunk is not that old). I located the code below in some of the unit test code files; this code will have to be refactored to create a test. This is all I could find and it seems that there must be better client examples than this. I expected to see client code in the org.apache.cassandra.cli package but there was nothing there. (I searched all of the code for calls to these APIs in the end) Where should I be looking to get proper Java code samples ? Regards Gavan Here is what I was about to refactor...

TSocket socket = new TSocket(DatabaseDescriptor.getListenAddress().getHostName(),
                             DatabaseDescriptor.getRpcPort());
TTransport transport;
transport = socket;
TBinaryProtocol binaryProtocol = new TBinaryProtocol(transport, false, false);
Cassandra.Client cassandraClient = new Cassandra.Client(binaryProtocol);
transport.open();
thriftClient = cassandraClient;

Set<String> keyspaces = thriftClient.describe_keyspaces();
if (!keyspaces.contains(KEYSPACE)) {
    List<CfDef> cfDefs = new ArrayList<CfDef>();
    thriftClient.system_add_keyspace(new KsDef(KEYSPACE,
        "org.apache.cassandra.locator.RackUnawareStrategy", 1, cfDefs));
}
thriftClient.set_keyspace(KEYSPACE);
CfDef cfDef = new CfDef(KEYSPACE, COLUMN_FAMILY);
try {
    thriftClient.system_add_column_family(cfDef);
} catch (InvalidRequestException e) {
    throw new RuntimeException(e);
}

On Mon, Jul 12, 2010 at 4:34 PM, Benjamin Black b...@b3k.us wrote: http://github.com/fauna/cassandra/tree/master/lib/cassandra/0.7/ Unclear to me what problems you are experiencing. 
On Sun, Jul 11, 2010 at 2:27 PM, GH gavan.h...@gmail.com wrote: Hi Dop, Do you have any code on dynamically creating KeySpace and ColumnFamily. Currently I was all but creating a new client to do that which seems to be the wrong way. If you have something that works that will put me on the right track I hope. Gavan On Mon, Jul 12, 2010 at 2:41 AM, Dop Sun su...@dopsun.com wrote: Based on the current source code at head, moving from 0.6.x to 0.7 means some code changes on the client side (in addition to server-side changes, like storage-conf.xml). Something like: 1. New Clock class instead
Re: How to add a new Keyspace?
Here are my notes on how to make schema changes in 0.6: # Empty the commitlog with nodetool drain. = NOTE while this is running, the node will not accept writes. # Shutdown Cassandra and verify that there is no remaining data in the commitlog. = HOW to verify? # Delete the sstable files (-Data.db, -Index.db, and -Filter.db) for any CFs removed, and rename the files for any CFs that were renamed. # Make necessary changes to your storage-conf.xml. # Start Cassandra back up and your edits should take effect. Related URLs: http://wiki.apache.org/cassandra/FAQ#modify_cf_config http://www.mail-archive.com/user@cassandra.apache.org/msg02498.html HTH Dave Viner On Wed, Jul 7, 2010 at 11:39 PM, Peter Schuller peter.schul...@infidyne.com wrote: If I want to add a new Keyspace, does it mean I have to distribute my storage-conf.xml to whole nodes? and restart whole nodes? I *think* that is the case in Cassandra 0.6, but I'll let someone else comment. In trunk/upcoming 7 there are live schema upgrades that propagate through the cluster: http://wiki.apache.org/cassandra/LiveSchemaUpdates -- / Peter Schuller
Re: Property file snitch for Cassandra?
After more investigation, as well as a bunch of trial and error, here's what seems to be happening. 1. The rack.properties file key values (the stuff before the =) must match the toString() method of the InetAddress object for the host. 2. (In EC2) the InetAddress of a node *other* than the one you are on will have no hostname, but will have an IP address. 3. (In EC2) the InetAddress of the node on which the rack.properties is read will be the internal name + '/' + IP address. So, the solution that *appears* to work for EC2 (without touching DNS) is to have 2 lines in the rack.properties file for each host: /10.196.35.64=DC1:RAC1 ip-10-196-35-64.ec2.internal/10.196.35.64=DC1:RAC1 The first line is used by the *other* nodes in the cluster to identify this server as belonging to DC1 and RAC1. The second line is used by *this* node to identify itself as in DC1 and RAC1. I've not yet proven to myself that this is accurate, but it definitely stops the error messages and, from looking at the code, it seems like it should work. Is this correct? Thanks Dave Viner On Wed, Jul 7, 2010 at 6:22 PM, Eric Evans eev...@rackspace.com wrote: Let's move this to the user@ list... On Wed, 2010-07-07 at 16:32 -0700, Dave Viner wrote: After starting up my cluster, I see this one of the system.log : ERROR [GMFD:1] 2010-07-07 23:27:46,044 PropertyFileEndPointSnitch.java (line 91) Could not find end point information for /10.202.159.32, will use default. However, I definitely have that IP listed in the /etc/cassandra/rack.properties file on that machine: # grep 10.202.159.32 /etc/cassandra/rack.properties 10.202.159.32\:7000=DC1:RAC1 # The \:7000 is from the sample file and description at http://svn.apache.org/repos/asf/cassandra/tags/cassandra-0.6.3/contrib/property_snitch/README.txt . Is there some other format I should be using? -- Eric Evans eev...@rackspace.com
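The matching rule Dave describes hinges on java.net.InetAddress.toString() producing "hostname/ip" when a hostname is known (the local node) and "/ip" when it is not (remote nodes). A sketch of that lookup, with the formatting helper invented for illustration:

```python
# Sketch of the property-file lookup described above: the same host needs
# an entry under both InetAddress string forms -- "/ip" (how remote nodes
# see it) and "hostname/ip" (how the node sees itself).
rack_properties = {
    "/10.196.35.64": "DC1:RAC1",
    "ip-10-196-35-64.ec2.internal/10.196.35.64": "DC1:RAC1",
}

def inetaddress_string(ip, hostname=None):
    """Mimic java.net.InetAddress.toString(): 'hostname/ip' or '/ip'."""
    return "%s/%s" % (hostname or "", ip)

remote_key = inetaddress_string("10.196.35.64")  # no hostname resolved
local_key = inetaddress_string("10.196.35.64", "ip-10-196-35-64.ec2.internal")
print(rack_properties[remote_key], rack_properties[local_key])
```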
Backing up the data stored in cassandra
Hi all, What is the recommended strategy for backing up the data stored inside cassandra? I realized that Cass. is a distributed database, and with a decent replication factor, backups are already done in some sense. But, as a relatively new user, I'm always concerned that the data is only within the system and not stored *anywhere* else. In an earlier email in the list, the recommendation was: Until tickets 193 and 520 are done, the easiest thing is to copy all the sstables from the other nodes that have replicas for the ranges it is responsible for (e.g. for replication factor of 3 on rack unaware partitioner, the nodes before it and the node after it on the right would suffice), and then run nodeprobe cleanup to clear out the excess. Is this still the recommended approach? If I backed up the files in DataDirectories/*, is it possible to restore a node using those files? (That is, bring up a new node, copy the backed up files from the crashed node onto the new node, then have the new node join the cluster?) Thanks Dave Viner
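The file-level backup the question describes is essentially "copy each column family's sstable files out of the data directory". A self-contained sketch using temp directories and invented filenames (in practice you would flush or snapshot the node first so the sstables are complete):

```python
# Sketch of backing up a node's sstable files: copy the -Data.db /
# -Index.db / -Filter.db files (0.6-era naming) from DataDirectories to a
# backup location. The temp dirs and fake files just make this runnable.
import os
import shutil
import tempfile

def backup_sstables(data_dir, backup_dir):
    copied = []
    for name in sorted(os.listdir(data_dir)):
        if name.endswith(("-Data.db", "-Index.db", "-Filter.db")):
            shutil.copy2(os.path.join(data_dir, name), backup_dir)
            copied.append(name)
    return copied

data_dir = tempfile.mkdtemp()
backup_dir = tempfile.mkdtemp()
for fake in ("Users-1-Data.db", "Users-1-Index.db", "Users-1-Filter.db", "CommitLog-123.log"):
    open(os.path.join(data_dir, fake), "w").close()

print(backup_sstables(data_dir, backup_dir))
```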