Re: no other nodes seen on priam cluster
Glad you got it going! There is a REST call you can make to Priam telling it to double the cluster size (/v1/cassconfig/double_ring). It will pre-fill all the SimpleDB entries for when the nodes come online; you then change the number of nodes on the autoscale group. Now that Priam supports C* 1.2 with vnodes, increasing the cluster size in an ad-hoc manner might be just around the corner.

Instaclustr has some predefined cluster sizes (Free, Basic, Professional and Enterprise); these are loosely based on estimated performance and storage capacity. You can also create a custom cluster where you define the number of nodes (minimum of 4) and the instance type according to your requirements. For pricing on those check out https://www.instaclustr.com/pricing/per-instance; we base our pricing on estimated support and throughput requirements.

Cheers

Ben

Instaclustr | www.instaclustr.com | @instaclustr
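For reference, a minimal sketch of the expansion flow Ben describes, assuming Priam listens on port 8080 under the /Priam/REST context (as in the get_seeds command quoted later in this thread) and that the ASG is the dmp_cluster-useast1b group from the logs below; the sizes and the AWS CLI invocation are assumptions for illustration, and endpoint behaviour may differ between Priam versions:

    # Tell Priam to pre-fill SimpleDB entries for a doubled ring.
    # Endpoint path is from Ben's message; host/port are assumed from
    # the get_seeds example elsewhere in this thread.
    curl http://127.0.0.1:8080/Priam/REST/v1/cassconfig/double_ring

    # Then double the node count on the autoscaling group (e.g. 2 -> 4).
    # Uses the modern AWS CLI; name and sizes are illustrative.
    aws autoscaling update-auto-scaling-group \
        --auto-scaling-group-name dmp_cluster-useast1b \
        --min-size 4 --max-size 4 --desired-capacity 4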
Re: no other nodes seen on priam cluster
Thanks a lot Ben. Actually, I managed to make it work by erasing the SimpleDB data Priam uses to keep track of instances... I had pulled the last commit from the repo, not sure if it helped or not. But your message made me curious about something... How do you add more Cassandra nodes on the fly? Just update the autoscale properties? I saw instaclustr.com changes the instance type as the number of nodes increases (not sure why the price also becomes higher per instance in this case). I am guessing Priam uses the data backed up to S3 to restore a node's data on another instance, right?

[]s
Re: no other nodes seen on priam cluster
Off the top of my head, I would check to make sure the Autoscaling Group you created is restricted to a single Availability Zone. Also, Priam sets the number of EC2 instances it expects based on the maximum instance count you set on your scaling group (it did this last time I checked a few months ago; its behaviour may have changed). So I would make sure the desired, min and max instance counts for your scaling group are all the same, make sure your ASG is restricted to a single availability zone (e.g. us-east-1b), and then (if you are able to and there is no data in your cluster) delete all the SimpleDB entries Priam has created and possibly also clear out the cassandra data directory. (A sketch of these steps follows the quoted log below.) Other than that, I see you've raised it as an issue on the Priam project page, so see what they say ;)

Cheers

Ben

On Thu, Feb 28, 2013 at 3:40 AM, Marcelo Elias Del Valle mvall...@gmail.com wrote:

One additional important piece of info: I checked here and the seeds seem really different on each node. The command

echo `curl http://127.0.0.1:8080/Priam/REST/v1/cassconfig/get_seeds`

returns ip2 on the first node and ip1,ip1 on the second node. Any idea why? It's probably what is causing cassandra to die, right?

2013/2/27 Marcelo Elias Del Valle mvall...@gmail.com

Hello Ben, thanks for the willingness to help.

2013/2/27 Ben Bromhead b...@instaclustr.com

Have you added the priam java agent to cassandra's JVM arguments (e.g. -javaagent:$CASS_HOME/lib/priam-cass-extensions-1.1.15.jar) and does the web container running priam have permissions to write to the cassandra config directory? Also, what do the priam logs say?

I put the priam log of the first node below. Yes, I have added priam-cass-extensions to the java args, and Priam IS actually writing to the cassandra dir.

If you want to get up and running quickly with cassandra, AWS and priam, check out www.instaclustr.com. We deploy Cassandra under your AWS account and you have full root access to the nodes if you want to explore and play around + there is a free tier which is great for experimenting and trying Cassandra out.

That sounded really great. I am not sure if it would apply to our case (will consider it though), but some partners would have a great benefit from it, for sure! I will send your link to them.
What priam says:

2013-02-27 14:14:58.0614 INFO pool-2-thread-1 com.netflix.priam.utils.SystemUtils Calling URL API: http://169.254.169.254/latest/meta-data/public-hostname returns: ec2-174-129-59-107.compute-1.amazonaws.com
2013-02-27 14:14:58.0615 INFO pool-2-thread-1 com.netflix.priam.utils.SystemUtils Calling URL API: http://169.254.169.254/latest/meta-data/public-ipv4 returns: 174.129.59.107
2013-02-27 14:14:58.0618 INFO pool-2-thread-1 com.netflix.priam.utils.SystemUtils Calling URL API: http://169.254.169.254/latest/meta-data/instance-id returns: i-88b32bfb
2013-02-27 14:14:58.0618 INFO pool-2-thread-1 com.netflix.priam.utils.SystemUtils Calling URL API: http://169.254.169.254/latest/meta-data/instance-type returns: c1.medium
2013-02-27 14:14:59.0614 INFO pool-2-thread-1 com.netflix.priam.defaultimpl.PriamConfiguration REGION set to us-east-1, ASG Name set to dmp_cluster-useast1b
2013-02-27 14:14:59.0746 INFO pool-2-thread-1 com.netflix.priam.defaultimpl.PriamConfiguration appid used to fetch properties is: dmp_cluster
2013-02-27 14:14:59.0843 INFO pool-2-thread-1 org.quartz.simpl.SimpleThreadPool Job execution threads will use class loader of thread: pool-2-thread-1
2013-02-27 14:14:59.0861 INFO pool-2-thread-1 org.quartz.core.SchedulerSignalerImpl Initialized Scheduler Signaller of type: class org.quartz.core.SchedulerSignalerImpl
2013-02-27 14:14:59.0862 INFO pool-2-thread-1 org.quartz.core.QuartzScheduler Quartz Scheduler v.1.7.3 created.
2013-02-27 14:14:59.0864 INFO pool-2-thread-1 org.quartz.simpl.RAMJobStore RAMJobStore initialized.
2013-02-27 14:14:59.0864 INFO pool-2-thread-1 org.quartz.impl.StdSchedulerFactory Quartz scheduler 'DefaultQuartzScheduler' initialized from default resource file in Quartz package: 'quartz.properties'
2013-02-27 14:14:59.0864 INFO pool-2-thread-1 org.quartz.impl.StdSchedulerFactory Quartz scheduler version: 1.7.3
2013-02-27 14:14:59.0864 INFO pool-2-thread-1 org.quartz.core.QuartzScheduler JobFactory set to: com.netflix.priam.scheduler.GuiceJobFactory@1b6a1c4
2013-02-27 14:15:00.0239 INFO pool-2-thread-1 com.netflix.priam.aws.AWSMembership Querying Amazon returned following instance in the ASG: us-east-1b -- i-8eb32bfd,i-88b32bfb
2013-02-27 14:15:01.0470 INFO Timer-0 org.quartz.utils.UpdateChecker New update(s) found: 1.8.5 [http://www.terracotta.org/kit/reflector?kitID=defaultpageID=QuartzChangeLog]
2013-02-27 14:15:10.0925 INFO pool-2-thread-1 com.netflix.priam.identity.InstanceIdentity Found dead instances: i-d49a0da7
2013-02-27 14:15:11.0397
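A minimal sketch of the reset steps Ben describes above, assuming the modern AWS CLI and assuming Priam's instance registry lives in a SimpleDB domain named InstanceIdentity (that domain name is an assumption; list the domains first to confirm what Priam actually created). Only do the wipe steps on a cluster with no data:

    # Pin desired/min/max to the same value and restrict to one AZ.
    # ASG name, AZ and size are taken from this thread; adjust to yours.
    aws autoscaling update-auto-scaling-group \
        --auto-scaling-group-name dmp_cluster-useast1b \
        --availability-zones us-east-1b \
        --min-size 2 --max-size 2 --desired-capacity 2

    # Wipe Priam's SimpleDB state. "InstanceIdentity" is an assumed
    # domain name; confirm with list-domains before deleting.
    aws sdb list-domains
    aws sdb delete-domain --domain-name InstanceIdentity

    # Clear the cassandra data directory (default path; adjust if customised).
    sudo rm -rf /var/lib/cassandra/data/*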
Re: no other nodes seen on priam cluster
Hi Marcelo

A few questions: Have you added the priam java agent to cassandra's JVM arguments (e.g. -javaagent:$CASS_HOME/lib/priam-cass-extensions-1.1.15.jar) and does the web container running priam have permissions to write to the cassandra config directory? Also, what do the priam logs say?

If you want to get up and running quickly with cassandra, AWS and priam, check out www.instaclustr.com. We deploy Cassandra under your AWS account and you have full root access to the nodes if you want to explore and play around + there is a free tier which is great for experimenting and trying Cassandra out.

Cheers

Ben

On Wed, Feb 27, 2013 at 6:09 AM, Marcelo Elias Del Valle mvall...@gmail.com wrote:

Hello,

I am using cassandra 1.2.1 and I am trying to set up a Priam cluster on AWS with two nodes. However, I can't get both nodes up and running because of a weird error (at least to me). When I start both nodes, they are able to connect to each other and do some communication. However, after some seconds, I just see "java.lang.RuntimeException: No other nodes seen!", so they disconnect and die. I tried testing all the ports (7000, 9160 and 7199) between both nodes and there is no firewall. On the second node, before the above exception, I get a broken pipe, as shown below. Any hint?

DEBUG 18:54:31,776 attempting to connect to /10.224.238.170
DEBUG 18:54:32,402 Reseting version for /10.224.238.170
DEBUG 18:54:32,778 Connection version 6 from /10.224.238.170
DEBUG 18:54:32,779 Upgrading incoming connection to be compressed
DEBUG 18:54:32,779 Max version for /10.224.238.170 is 6
DEBUG 18:54:32,779 Setting version 6 for /10.224.238.170
DEBUG 18:54:32,780 set version for /10.224.238.170 to 6
DEBUG 18:54:33,455 Disseminating load info ...
DEBUG 18:54:59,082 Reseting version for /10.224.238.170
DEBUG 18:55:00,405 error writing to /10.224.238.170
java.io.IOException: Broken pipe
    at sun.nio.ch.FileDispatcher.write0(Native Method)
    at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:29)
    at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:72)
    at sun.nio.ch.IOUtil.write(IOUtil.java:43)
    at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:334)
    at java.nio.channels.Channels.writeFullyImpl(Channels.java:59)
    at java.nio.channels.Channels.writeFully(Channels.java:81)
    at java.nio.channels.Channels.access$000(Channels.java:47)
    at java.nio.channels.Channels$1.write(Channels.java:155)
    at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
    at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
    at org.xerial.snappy.SnappyOutputStream.flush(SnappyOutputStream.java:272)
    at java.io.DataOutputStream.flush(DataOutputStream.java:106)
    at org.apache.cassandra.net.OutboundTcpConnection.writeConnected(OutboundTcpConnection.java:189)
    at org.apache.cassandra.net.OutboundTcpConnection.run(OutboundTcpConnection.java:143)
DEBUG 18:55:01,405 attempting to connect to /10.224.238.170
DEBUG 18:55:01,461 Started replayAllFailedBatches
DEBUG 18:55:01,462 forceFlush requested but everything is clean in batchlog
DEBUG 18:55:01,463 Finished replayAllFailedBatches
INFO 18:55:01,472 JOINING: schema complete, ready to bootstrap
DEBUG 18:55:01,473 ... got ring + schema info
INFO 18:55:01,473 JOINING: getting bootstrap token
ERROR 18:55:01,475 Exception encountered during startup
java.lang.RuntimeException: No other nodes seen! Unable to bootstrap. If you intended to start a single-node cluster, you should make sure your broadcast_address (or listen_address) is listed as a seed. Otherwise, you need to determine why the seed being contacted has no knowledge of the rest of the cluster. Usually, this can be solved by giving all nodes the same seed list.

and on the first node:

DEBUG 18:54:30,833 Disseminating load info ...
DEBUG 18:54:31,532 Connection version 6 from /10.242.139.159
DEBUG 18:54:31,533 Upgrading incoming connection to be compressed
DEBUG 18:54:31,534 Max version for /10.242.139.159 is 6
DEBUG 18:54:31,534 Setting version 6 for /10.242.139.159
DEBUG 18:54:31,534 set version for /10.242.139.159 to 6
DEBUG 18:54:31,542 Reseting version for /10.242.139.159
DEBUG 18:54:31,791 Connection version 6 from /10.242.139.159
DEBUG 18:54:31,792 Upgrading incoming connection to be compressed
DEBUG 18:54:31,792 Max version for /10.242.139.159 is 6
DEBUG 18:54:31,792 Setting version 6 for /10.242.139.159
DEBUG 18:54:31,793 set version for /10.242.139.159 to 6
INFO 18:54:32,414 Node /10.242.139.159 is now part of the cluster
DEBUG 18:54:32,415 Resetting pool for /10.242.139.159
DEBUG 18:54:32,415 removing expire time for endpoint : /10.242.139.159
INFO 18:54:32,415 InetAddress /10.242.139.159 is now UP
DEBUG 18:54:32,789 attempting to connect to ec2-75-101-233-115.compute-1.amazonaws.com/10.242.139.159
DEBUG 18:54:58,840 Started replayAllFailedBatches
DEBUG
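Two quick checks related to the exchange above, as a hedged sketch: the javaagent line Ben asks about (placement in cassandra-env.sh is an assumption; the jar path is from his example), a loop comparing each node's Priam-served seed list (mismatched seed lists are exactly what the bootstrap error complains about; ip1/ip2 are the placeholders used earlier in this thread), and a connectivity probe on the ports Marcelo mentions:

    # In cassandra-env.sh (assumed location for JVM options):
    JVM_OPTS="$JVM_OPTS -javaagent:$CASS_HOME/lib/priam-cass-extensions-1.1.15.jar"

    # Compare the seed list Priam serves on each node; they should agree.
    for host in ip1 ip2; do
        echo -n "$host seeds: "
        curl -s "http://$host:8080/Priam/REST/v1/cassconfig/get_seeds"
        echo
    done

    # Verify inter-node connectivity on the ports discussed above
    # (7000 storage, 7199 JMX, 9160 Thrift), run from the other node.
    for port in 7000 7199 9160; do
        nc -zv ip2 $port
    done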