Re: Starting up hadoop cluster programmatically
The JSch source doesn't have any references to proxyHost, so I'm guessing these properties don't have any effect. However, the JSch home page does mention connecting through an HTTP proxy. Perhaps you could ask on the JSch list how to achieve this? Cheers, Tom

On Wed, Jan 19, 2011 at 6:40 AM, Andrei Savu savu.and...@gmail.com wrote: I don't think it's possible to open SSH connections over an HTTP/HTTPS company proxy. I believe you need a SOCKS proxy but unfortunately, as far as I know, JSch does not support proxy connections. I believe that you need to find a way to have direct SSH access to the Amazon cloud. Maybe there is a simpler way. You should wait for more replies.

-original message- Subject: Starting up hadoop cluster programmatically From: praveen.pe...@nokia.com Date: 19/01/2011 16:33

Hi all, I am trying to create a cluster dynamically using a Java program. I wrote a program similar to HadoopServiceController in the trunk code. I also added the proxy information as system properties (since I am inside my company network). I saw the cluster get created but the configuration fails. I am sure I am missing the proxy details somewhere else. I set the proxy details as follows:

System.setProperty("http.proxyHost", "xxx.xxx.com");
System.setProperty("http.proxyPort", );
System.setProperty("https.proxyHost", "xxx.xxx.com");
System.setProperty("https.proxyPort", );

Here is the output:

INFO: Starting up cluster...
Jan 18, 2011 3:29:02 PM org.apache.whirr.cluster.actions.BootstrapClusterAction doAction
INFO: Bootstrapping cluster
Jan 18, 2011 3:29:03 PM org.apache.whirr.cluster.actions.BootstrapClusterAction buildTemplate
INFO: Configuring template
Jan 18, 2011 3:29:08 PM org.apache.whirr.cluster.actions.BootstrapClusterAction$1 call
INFO: Starting 1 node(s) with roles [tt, dn]
Jan 18, 2011 3:29:08 PM org.apache.whirr.cluster.actions.BootstrapClusterAction buildTemplate
INFO: Configuring template
Jan 18, 2011 3:29:11 PM org.apache.whirr.cluster.actions.BootstrapClusterAction$1 call
INFO: Starting 1 node(s) with roles [jt, nn]
problem applying options to node(556284): org.jclouds.ssh.SshException: root@184.106.155.148:22: Error connecting to session.
    at org.jclouds.ssh.jsch.JschSshClient.propagate(JschSshClient.java:250)
    at org.jclouds.ssh.jsch.JschSshClient.connect(JschSshClient.java:200)
    at org.jclouds.compute.util.ComputeUtils.runCallablesOnNode(ComputeUtils.java:202)
    at org.jclouds.compute.util.ComputeUtils.runOptionsOnNode(ComputeUtils.java:151)
    at org.jclouds.compute.util.ComputeUtils$1.call(ComputeUtils.java:116)
    at org.jclouds.compute.util.ComputeUtils$1.call(ComputeUtils.java:112)
    at org.jclouds.compute.strategy.impl.EncodeTagIntoNameRunNodesAndAddToSetStrategy$1.call(EncodeTagIntoNameRunNodesAndAddToSetStrategy.java:93)
    at org.jclouds.compute.strategy.impl.EncodeTagIntoNameRunNodesAndAddToSetStrategy$1.call(EncodeTagIntoNameRunNodesAndAddToSetStrategy.java:86)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)
Caused by: com.jcraft.jsch.JSchException: java.net.ConnectException: Connection timed out
    at com.jcraft.jsch.Util.createSocket(Util.java:386)
    at com.jcraft.jsch.Session.connect(Session.java:182)
    at com.jcraft.jsch.Session.connect(Session.java:150)
    at org.jclouds.ssh.jsch.JschSshClient.newSession(JschSshClient.java:245)
    at org.jclouds.ssh.jsch.JschSshClient.connect(JschSshClient.java:184)
    ... 11 more
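[Editor's note] A short sketch of why the http.proxy* settings above can't work. Those properties only affect java.net HTTP(S) URL connections; JSch opens raw TCP sockets, so SSH traffic ignores them entirely. The JVM-wide socksProxyHost/socksProxyPort properties are what plain java.net sockets honor for SOCKS (proxy.example.com below is a hypothetical host):

```java
// http.proxyHost / https.proxyHost only influence HTTP(S) URL connections,
// so setting them has no effect on JSch's SSH sessions. For a SOCKS proxy,
// the standard JVM properties are socksProxyHost / socksProxyPort.
public class ProxySettings {
    public static void main(String[] args) {
        System.setProperty("socksProxyHost", "proxy.example.com"); // hypothetical
        System.setProperty("socksProxyPort", "1080");
        System.out.println(System.getProperty("socksProxyHost") + ":"
                + System.getProperty("socksProxyPort"));
    }
}
```

Note that JSch itself does ship ProxyHTTP and ProxySOCKS5 classes (set via Session.setProxy), but whether the jclouds JschSshClient of this era wires them in is a separate question, which is what the reply above suggests raising on the JSch list.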
Re: Running Mapred jobs after launching cluster
You don't need to add anything to the classpath, but you do need to use the configuration in the org.apache.whirr.service.Cluster object to populate your Hadoop Configuration object so that your code knows which cluster to connect to. See the getConfiguration() method in HadoopServiceController for how to do this. Cheers, Tom

On Thu, Jan 27, 2011 at 12:21 PM, praveen.pe...@nokia.com wrote: Hello all, I wrote a Java class HadoopLauncher that is very similar to HadoopServiceController. I was successfully able to launch a cluster programmatically from my application using Whirr. Now I want to copy files to HDFS and also run a job programmatically. When I copy a file to HDFS it's copying to the local file system, not HDFS. Here is the code I used:

Configuration conf = new Configuration();
FileSystem hdfs = FileSystem.get(conf);
hdfs.copyFromLocalFile(false, true, new Path(localFilePath), new Path(hdfsFileDirectory));

Do I need to add anything else to the classpath so the Hadoop libraries know that they need to talk to the dynamically launched cluster? When running Whirr from the command line I know it uses HADOOP_CONF_DIR to find the Hadoop config files, but when doing the same from Java I am wondering how to solve this issue. Praveen
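[Editor's note] A minimal sketch of the idea behind getConfiguration(): a default-constructed Configuration resolves fs.default.name to the local filesystem, which is why the copy above lands locally. The client configuration has to be pointed at the launched cluster's namenode and jobtracker. The property names below are the Hadoop 0.20-era keys, and the ports are conventional defaults, not values from this thread; real code would read the addresses from the org.apache.whirr.service.Cluster object as HadoopServiceController does. Plain java.util.Properties stands in for Hadoop's Configuration so the sketch is self-contained:

```java
import java.util.Properties;

public class ClusterConf {
    // Build a client-side configuration pointing at the launched cluster.
    // "nn"/"jt" would come from Cluster.getInstanceMatching(...) in real code;
    // 8020/8021 are the customary HDFS/JobTracker ports (assumed here).
    public static Properties clientConf(String namenode, String jobtracker) {
        Properties conf = new Properties();
        conf.setProperty("fs.default.name", "hdfs://" + namenode + ":8020/");
        conf.setProperty("mapred.job.tracker", jobtracker + ":8021");
        return conf;
    }

    public static void main(String[] args) {
        Properties conf = clientConf("nn.example.com", "jt.example.com");
        System.out.println(conf.getProperty("fs.default.name"));
    }
}
```

With a Hadoop Configuration populated this way, FileSystem.get(conf) returns the remote HDFS instead of the local filesystem.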
Re: Error while running cassandra using whirr
On Fri, Jan 28, 2011 at 12:53 AM, Ashish paliwalash...@gmail.com wrote: Folks, I followed the instructions from http://www.philwhln.com/quickly-launch-a-cassandra-cluster-on-amazon-ec2 using the whirr-0.2.0-incubating stable release.

I suggest trying with 0.3.0 (out soon, or available from svn now, as the blog outlines) since the Cassandra code has changed quite a bit since 0.2.0.

Instances are launched, but at the end it displays an error while connecting. The following exception is printed:

Authorizing firewall
Running configuration script
Exception in thread main java.io.IOException: org.jclouds.compute.RunScriptOnNodesException: error runScript on filtered nodes options(RunScriptOptions [overridingCredentials=true, runAsRoot=true])
Execution failures: 0 error[s]
Node failures:
1) SshException on node us-east-1/i-17f3497b: org.jclouds.ssh.SshException: ec2-user@50.16.165.161:22: Error connecting to session.
    at org.jclouds.ssh.jsch.JschSshClient.propagate(JschSshClient.java:250)
    at org.jclouds.ssh.jsch.JschSshClient.connect(JschSshClient.java:204)
    at org.jclouds.compute.internal.BaseComputeService$4.call(BaseComputeService.java:375)
    at org.jclouds.compute.internal.BaseComputeService$4.call(BaseComputeService.java:364)
    at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
    at java.util.concurrent.FutureTask.run(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)
Caused by: com.jcraft.jsch.JSchException: Auth fail
    at com.jcraft.jsch.Session.connect(Session.java:452)
    at com.jcraft.jsch.Session.connect(Session.java:150)
    at org.jclouds.ssh.jsch.JschSshClient.newSession(JschSshClient.java:245)
    at org.jclouds.ssh.jsch.JschSshClient.connect(JschSshClient.java:184)
    ... 7 more

The Hadoop cluster runs fine with the same Whirr release. My configuration file is plain and simple, picked from the blog with the real values substituted.
Am I missing anything here? Thanks, Ashish
Re: Running Mapred jobs after launching cluster
On Fri, Jan 28, 2011 at 12:06 PM, praveen.pe...@nokia.com wrote: Thanks Tom. I think I got it working with my own driver so I will go with it for now (unless that proves to be a bad option). BTW, could you tell me how to stick with one Hadoop version while launching a cluster? I have hadoop-0.20.2 in my classpath but it looks like Whirr gets the latest Hadoop from the repository. Since the latest version may differ over time, I would like to stick to one version so that a Hadoop version mismatch won't happen.

You do need to make sure that the versions are the same. See the Hadoop integration tests, which specify the version of Hadoop to use in their POM.

Also, what jar files are necessary for launching a cluster using Java? Currently I have the CLI version of the jar file but that's way too large since it has everything in it.

You need Whirr's core and Hadoop jars, as well as their dependencies. If you look at the POMs in the source code they will tell you the dependencies. Cheers Tom

Thanks Praveen -Original Message- From: ext Tom White [mailto:tom.e.wh...@gmail.com] Sent: Friday, January 28, 2011 2:12 PM To: whirr-user@incubator.apache.org Subject: Re: Running Mapred jobs after launching cluster

On Fri, Jan 28, 2011 at 6:28 AM, praveen.pe...@nokia.com wrote: Thanks Tom. Could you elaborate a little more on the second option? What is the HADOOP_CONF_DIR here, after launching the cluster?

~/.whirr/cluster-name

When you said run in a new process, did you mean using the command-line Whirr tool?

I meant that you could launch Whirr using the CLI, or Java. Then run the job in another process, with HADOOP_CONF_DIR set. The MR jobs you are running I assume can be run against an arbitrary cluster, so you should be able to point them at a cluster started by Whirr. Tom

I may finally end up writing my own driver for running external mapred jobs so I can have more control, but I was just curious to know if option #2 is better than writing my own driver.
Praveen -Original Message- From: ext Tom White [mailto:t...@cloudera.com] Sent: Thursday, January 27, 2011 4:01 PM To: whirr-user@incubator.apache.org Subject: Re: Running Mapred jobs after launching cluster

If they implement the Tool interface then you can set configuration on them. Failing that, you could set HADOOP_CONF_DIR and run them in a new process. Cheers, Tom

On Thu, Jan 27, 2011 at 12:52 PM, praveen.pe...@nokia.com wrote: Hmm... I am running some map reduce jobs written by me, but some of them are in external libraries (e.g. Mahout) which I don't have control over. Since I can't modify the code in external libraries, is there any other way to make this work? Praveen

-Original Message- From: ext Tom White [mailto:tom.e.wh...@gmail.com] Sent: Thursday, January 27, 2011 3:42 PM To: whirr-user@incubator.apache.org Subject: Re: Running Mapred jobs after launching cluster

You don't need to add anything to the classpath, but you do need to use the configuration in the org.apache.whirr.service.Cluster object to populate your Hadoop Configuration object so that your code knows which cluster to connect to. See the getConfiguration() method in HadoopServiceController for how to do this. Cheers, Tom

On Thu, Jan 27, 2011 at 12:21 PM, praveen.pe...@nokia.com wrote: Hello all, I wrote a Java class HadoopLauncher that is very similar to HadoopServiceController. I was successfully able to launch a cluster programmatically from my application using Whirr. Now I want to copy files to HDFS and also run a job programmatically. When I copy a file to HDFS it's copying to the local file system, not HDFS. Here is the code I used:

Configuration conf = new Configuration();
FileSystem hdfs = FileSystem.get(conf);
hdfs.copyFromLocalFile(false, true, new Path(localFilePath), new Path(hdfsFileDirectory));

Do I need to add anything else to the classpath so the Hadoop libraries know that they need to talk to the dynamically launched cluster?
When running Whirr from the command line I know it uses HADOOP_CONF_DIR to find the Hadoop config files, but when doing the same from Java I am wondering how to solve this issue. Praveen
[ANNOUNCE] Apache Whirr 0.3.0-incubating released
The Apache Whirr team is pleased to announce the release of Whirr 0.3.0-incubating from the Apache Incubator. Apache Whirr is a set of libraries for running cloud services such as Apache Hadoop, HBase, ZooKeeper, and Cassandra.

The release is available here: http://www.apache.org/dyn/closer.cgi/incubator/whirr/

The full change log is available here: https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=1230&version=12315487

We welcome your help and feedback. For more information on how to report problems, and to get involved, visit the project website at http://incubator.apache.org/whirr/

Thanks to everyone who contributed to this release. The Apache Whirr Team
Re: Whirr and HBase
Try removing the CDH lines. I don't think that this combination works yet. Tom

On Feb 1, 2011 7:53 AM, Paolo Castagna castagna.li...@googlemail.com wrote: Andrei Savu wrote: Could you share the recipe? I want to try to replicate the issue on my computer.

whirr.cluster-name=myhbase
whirr.instance-templates=1 zk+nn+jt+hbase-master,3 dn+tt+hbase-regionserver
whirr.hadoop-install-runurl=cloudera/cdh/install
whirr.hadoop-configure-runurl=cloudera/cdh/post-configure
whirr.provider=ec2
whirr.identity=
whirr.credential=
# See also: http://aws.amazon.com/ec2/instance-types/
# t1.micro, m1.small, m1.large, m1.xlarge, m2.xlarge, m2.2xlarge, m2.4xlarge, c1.medium, c1.xlarge, cc1.4xlarge
# whirr.hardware-id=m1.large
# Ubuntu 10.04 LTS Lucid. See also: http://alestic.com/
# whirr.image-id=eu-west-1/ami-0d9ca979
# If you choose a different location, make sure whirr.image-id is updated too
# whirr.location-id=eu-west-1
#whirr.hardware-id=m1.large
#whirr.location-id=us-east-1
#whirr.image-id=us-east-1/ami-f8f40591
whirr.hardware-id=m1.xlarge
whirr.image-id=us-east-1/ami-da0cf8b3
whirr.location-id=us-east-1
whirr.private-key-file=${sys:user.home}/.ssh/whirr
whirr.public-key-file=${sys:user.home}/.ssh/whirr.pub

I made several attempts (you can see them commented out). In the last one I was using m1.xlarge with us-east-1/ami-da0cf8b3. Paolo

On Tue, Feb 1, 2011 at 5:32 PM, Paolo Castagna castagna.li...@googlemail.com wrote: Hi, I am trying to run a small HBase cluster using Whirr 0.3.0-incubating and, since that does not start the HBase master or does not install Hadoop correctly, Whirr from trunk.
When I run it from trunk with a recipe very similar to the one provided in the recipes folder, I see these errors in the whirr.log:

2011-02-01 15:11:58,484 DEBUG [jclouds.compute] (user thread 9) stderr from runscript as ubuntu@50.16.158.231
+ [[ hbase != \h\b\a\s\e ]]
+ HBASE_HOME=/usr/local/hbase-0.89.20100924
+ HBASE_CONF_DIR=/usr/local/hbase-0.89.20100924/conf
+ update_repo
+ which dpkg
+ sudo apt-get update
+ install_hbase
+ id hadoop
+ useradd hadoop
useradd: group hadoop exists - if you want to add this user to that group, use -g
[...]
2011-02-01 15:12:26,370 DEBUG [jclouds.compute] (user thread 2) stderr from computeserv as ubuntu@50.16.158.231
+ HBASE_VERSION=hbase-0.89.20100924
+ [[ hbase != \h\b\a\s\e ]]
+ HBASE_HOME=/usr/local/hbase-0.89.20100924
+ HBASE_CONF_DIR=/usr/local/hbase-0.89.20100924/conf
+ configure_hbase
+ case $CLOUD_PROVIDER in
+ MOUNT=/mnt
+ mkdir -p /mnt/hbase
+ chown hadoop:hadoop /mnt/hbase
chown: invalid user: `hadoop:hadoop'

Is this a known problem? Paolo
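[Editor's note] Applying Tom's suggestion to Paolo's recipe gives a sketch like the following: the same properties with only the two cloudera/cdh lines removed, so the stock Whirr HBase/Hadoop scripts run instead. All values are from the recipe quoted above; credentials are elided as in the original:

```properties
whirr.cluster-name=myhbase
whirr.instance-templates=1 zk+nn+jt+hbase-master,3 dn+tt+hbase-regionserver
whirr.provider=ec2
whirr.identity=
whirr.credential=
whirr.hardware-id=m1.xlarge
whirr.image-id=us-east-1/ami-da0cf8b3
whirr.location-id=us-east-1
whirr.private-key-file=${sys:user.home}/.ssh/whirr
whirr.public-key-file=${sys:user.home}/.ssh/whirr.pub
```

The failure above ("chown: invalid user: `hadoop:hadoop'") is consistent with the CDH post-configure script expecting a user that the non-CDH install path never created, which is why dropping the CDH lines is the first thing to try.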
Re: Whirr supports redhat?
I used the API guide and curl to get the information for server and flavour IDs. There is probably a better way that I don't know about, though. Cheers, Tom

On Fri, Feb 18, 2011 at 4:15 PM, praveen.pe...@nokia.com wrote: Thanks Tom. I will let the list know how it goes. Where do I get the information about what the whirr.image-id property should be? Praveen

On Feb 18, 2011, at 7:12 PM, ext Tom White tom.e.wh...@gmail.com wrote: Hi Praveen, I haven't tried Hadoop on Cloud Servers with Red Hat, but the scripts do support RPM-based systems (like Amazon's Linux AMI). Please let the list know if you get it working with this combination, and consider contributing a configuration recipe. Cheers, Tom

On Fri, Feb 18, 2011 at 2:46 PM, praveen.pe...@nokia.com wrote: Does Whirr support spawning a Hadoop cluster on Red Hat on Rackspace? I could not find any documentation related to Red Hat. Currently I am using Ubuntu and it works great, but we need to use Red Hat ultimately. Thanks Praveen
Re: image-id to specify a m1.large hardware in EC2
Hi Patricio, In the past I've used:

whirr.hardware-id=m1.large
whirr.image-id=us-east-1/ami-da0cf8b3
whirr.location-id=us-east-1

Hope that helps. Tom

2011/4/13 Patricio Echagüe patric...@gmail.com: Hi all, I need to create m1.large EC2 nodes and was wondering if someone knows what image-id to specify in the properties file for Whirr in order to do that. whirr.image-id=ami-2a1fec43 (apparently the default) doesn't work for m1.large hardware: with the previous default AMI, if I set whirr.hardware-id=m1.large it throws an exception saying that it is an incompatible type for the AMI. Currently the default is m1.small, and when I run the benchmark TestDFSIO all my datanodes are timing out. So my thought here is that perhaps a hardware type of m1.large or bigger can help. Any help will be much appreciated. Thanks
Re: Custom install and config functions when calling Whirr from code
Hi John, The functions directory itself needs to be on the classpath. You can achieve this by including it in your application JAR (like the Whirr service JARs do), or by adding it to the application classpath (like the bin/whirr script does). Hope that helps. Cheers, Tom

On Thu, May 26, 2011 at 3:29 PM, John Conwell j...@iamjohn.me wrote: I have customized the install and config functions for Cassandra and put these two files in the functions folder, and it works great from the command-line whirr utility. But when launching a cluster via the Service.launchCluster() method, how do you specify a custom function file for either install or configuration? Thanks, John C
Re: How to use OtherAction?
On Thu, Jun 2, 2011 at 3:06 PM, Andrei Savu savu.and...@gmail.com wrote: I understand. Tom should be able to tell us more about the intended usage scenario for OtherAction.

The OtherAction call was just to cover the case where new events are added and not explicitly exposed in ClusterActionHandlerSupport. It's not currently used.

I just want to add that in 0.5.0 we've added support for remote script execution to the CLI: https://issues.apache.org/jira/browse/WHIRR-173 ... and I believe what you need is a similar mechanism available in the core API.

By adding a new EXECUTE_ACTION, and extending ClusterActionHandlerSupport to expose before/afterExecute methods? Tom -- Andrei Savu

On Fri, Jun 3, 2011 at 12:59 AM, John Conwell j...@iamjohn.me wrote: Well, the reason I want to know is twofold. First, I'm using the Whirr core API to spin up and provision multiple clusters, but on one of my clusters I need to push out a shell script to execute after the configure action happens. Basically, my code needs to generate a bunch of stuff, which it then uses to create the shell script dynamically, and I want to use OtherAction to push that shell script out to the instances in the cluster to execute. Second, I just like to know how generic extension points work, because I like to come up with interesting ways to integrate APIs into what I'm working on. Thanks, John

On Thu, Jun 2, 2011 at 2:48 PM, Andrei Savu savu.and...@gmail.com wrote: What's your use case? Maybe there is another way. As far as I know, no service is using the OtherAction event. -- Andrei Savu / andreisavu.ro

On Fri, Jun 3, 2011 at 12:46 AM, John Conwell j...@iamjohn.me wrote: In looking at ClusterActionHandlerSupport, I notice the before/after OtherAction event. Is this functional? How can I trigger this event? -- Thanks, John C
[ANNOUNCE] Apache Whirr 0.5.0-incubating released
The Apache Whirr team is pleased to announce the release of Whirr 0.5.0-incubating from the Apache Incubator. This is the fifth incubating release of Apache Whirr, a set of libraries for running cloud services such as Apache Hadoop, HBase, ZooKeeper, and Cassandra.

The release is available here: http://www.apache.org/dyn/closer.cgi/incubator/whirr/

The full change log is available here: https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12316248&styleName=Html&projectId=1230

We welcome your help and feedback. For more information on how to report problems, and to get involved, visit the project website at http://incubator.apache.org/whirr/ The Apache Whirr Team
Re: set up two cassandra clusters?
Do the clusters have different names? Can you supply the stack trace you're getting from whirr.log? Cheers, Tom

On Tue, Jun 7, 2011 at 4:45 PM, Khanh Nguyen nguyen.h.kh...@gmail.com wrote: Hi, I want to launch another Cassandra cluster on EC2 but I keep getting an exception like this:

Exception in thread main java.lang.IllegalStateException: The permission '209.6.54.22/32-1-9160-9160' has already been authorized on the specified group

Essentially, I am trying to launch two clusters. One uses RandomPartitioner and the other uses ByteOrderedPartitioner. Thanks. Cheers, -k
Re: hadoop security and ssh proxy
The proxy is not used for security (which would be better provided by a firewall), but to make the datanode addresses resolve correctly for the client. Without the proxy, the datanodes return their internal addresses, which are not routable by the client (which typically runs in an external network). I agree that it would be better if we could replace the proxy with something better, such as https://issues.apache.org/jira/browse/WHIRR-81.

On Tue, Jun 14, 2011 at 9:26 AM, John Conwell j...@iamjohn.me wrote: I get the whole "security is a good thing" thing, but could someone give me a description as to why, when Whirr configures Hadoop, it sets up the SSH proxy to disallow all comms to the data/task nodes except via the name node over the proxy? If I'm running on EC2, won't correctly setting up security groups give me enough security? The reason I ask is that I'm using Whirr through its API to automate... well... all the cool things Whirr does. But the key point is automation. After a Hadoop cluster is up and running I'd like the program to kick off a Hadoop job, and monitor jobs and tasks. But that means my program has to launch hadoop-proxy.sh somehow, capture the PID of the process, kick off my Hadoop job, then when done, kill the process via the PID. The whole calling a shell script, capturing the PID, persisting it, and killing it, all through my Java automation, just seems a bit duct-tape-and-baling-wire'ish.

You can run the proxy from Java via HadoopProxy, which handles all these details for you.

So I'm trying to figure out why we have the whole hadoop-proxy.sh thing in the first place (specifically within the context of EC2). -- Thanks, John C

Cheers, Tom
Re: Is Service.launchCluster thread safe?
On Tue, Jun 14, 2011 at 3:46 PM, John Conwell j...@iamjohn.me wrote: So as an FYI, I just tested using the Whirr API to start multiple clusters at the same time using Futures, and it (seems to) work great. Really cuts down on the time to ramp up a set of clusters (like 4 or more). Yay!

Great. This would be a nice thing to put on the wiki or in the docs as a usage example. Would you like to do this? Thanks, Tom

On Fri, Jun 10, 2011 at 11:38 AM, Andrei Savu savu.and...@gmail.com wrote: John, I don't think we've checked this. Could you open an issue? We should at least update the docs. To be safe you should create multiple instances, one for each thread. I don't think you need to worry about the amount of memory used. In 0.5.0 we've done some performance improvements that are relevant to you. Sent from my phone. Cheers, -- Andrei

On Jun 10, 2011 8:38 PM, John Conwell j...@iamjohn.me wrote: In 0.4.0, is Service.launchCluster(ClusterSpec) thread safe? Through the API I'm spinning up 3 clusters, one after another, and I'd like to change the code to launch them in parallel, but wanted to check if this is threadsafe. -- Thanks, John C
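[Editor's note] The Futures pattern described above can be sketched as follows. The launchCluster method here is a stand-in for Service.launchCluster(ClusterSpec) (its name and the sleep are illustrative only); per Andrei's advice, real code would create one Service instance per thread:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelLaunch {
    // Stand-in for Service.launchCluster(ClusterSpec); the sleep simulates
    // provisioning time so the parallelism is visible in wall-clock terms.
    static String launchCluster(String specName) throws InterruptedException {
        Thread.sleep(100);
        return specName + " running";
    }

    public static void main(String[] args) throws Exception {
        List<String> specs = Arrays.asList("hadoop", "cassandra", "zookeeper");
        ExecutorService pool = Executors.newFixedThreadPool(specs.size());
        List<Future<String>> futures = new ArrayList<Future<String>>();
        for (final String spec : specs) {
            futures.add(pool.submit(new Callable<String>() {
                public String call() throws Exception {
                    return launchCluster(spec); // one Service per thread in real code
                }
            }));
        }
        // All launches proceed concurrently; get() blocks until each is up.
        for (Future<String> f : futures) {
            System.out.println(f.get());
        }
        pool.shutdown();
    }
}
```

Three sequential launches take the sum of their provisioning times; submitted this way they take roughly the maximum.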
Re: hadoop security and ssh proxy
On Wed, Jun 15, 2011 at 10:18 AM, John Conwell j...@iamjohn.me wrote: Ok, that makes sense. Thanks for the clarification. It is definitely unwieldy when trying to integrate Whirr's API into another API to wrap spinning up Hadoop clusters, and getting it to work without any manual steps.

Agreed, but it is possible - see the Hadoop integration tests, which are an example of spinning up a Hadoop cluster from Java in a completely automated fashion. Tom

On Tue, Jun 14, 2011 at 5:13 PM, Tom White tom.e.wh...@gmail.com wrote: The proxy is not used for security (which would be better provided by a firewall), but to make the datanode addresses resolve correctly for the client. Without the proxy, the datanodes return their internal addresses, which are not routable by the client (which typically runs in an external network). I agree that it would be better if we could replace the proxy with something better, such as https://issues.apache.org/jira/browse/WHIRR-81.

On Tue, Jun 14, 2011 at 9:26 AM, John Conwell j...@iamjohn.me wrote: I get the whole "security is a good thing" thing, but could someone give me a description as to why, when Whirr configures Hadoop, it sets up the SSH proxy to disallow all comms to the data/task nodes except via the name node over the proxy? If I'm running on EC2, won't correctly setting up security groups give me enough security? The reason I ask is that I'm using Whirr through its API to automate... well... all the cool things Whirr does. But the key point is automation. After a Hadoop cluster is up and running I'd like the program to kick off a Hadoop job, and monitor jobs and tasks. But that means my program has to launch hadoop-proxy.sh somehow, capture the PID of the process, kick off my Hadoop job, then when done, kill the process via the PID. The whole calling a shell script, capturing the PID, persisting it, and killing it, all through my Java automation, just seems a bit duct-tape-and-baling-wire'ish.

You can run the proxy from Java via HadoopProxy, which handles all these details for you.

So I'm trying to figure out why we have the whole hadoop-proxy.sh thing in the first place (specifically within the context of EC2). -- Thanks, John C

Cheers, Tom -- Thanks, John C
Re: Is Service.launchCluster thread safe?
There's a page at src/site/xdoc/api-guide.xml, which might be a good place for this. Thanks! Tom

On Wed, Jun 15, 2011 at 10:34 AM, John Conwell j...@iamjohn.me wrote: Yeah, I can do that. What section would you want it in? Maybe a new section on using Whirr via its API?

On Tue, Jun 14, 2011 at 5:20 PM, Tom White tom.e.wh...@gmail.com wrote: On Tue, Jun 14, 2011 at 3:46 PM, John Conwell j...@iamjohn.me wrote: So as an FYI, I just tested using the Whirr API to start multiple clusters at the same time using Futures, and it (seems to) work great. Really cuts down on the time to ramp up a set of clusters (like 4 or more). Yay!

Great. This would be a nice thing to put on the wiki or in the docs as a usage example. Would you like to do this? Thanks, Tom

On Fri, Jun 10, 2011 at 11:38 AM, Andrei Savu savu.and...@gmail.com wrote: John, I don't think we've checked this. Could you open an issue? We should at least update the docs. To be safe you should create multiple instances, one for each thread. I don't think you need to worry about the amount of memory used. In 0.5.0 we've done some performance improvements that are relevant to you. Sent from my phone. Cheers, -- Andrei

On Jun 10, 2011 8:38 PM, John Conwell j...@iamjohn.me wrote: In 0.4.0, is Service.launchCluster(ClusterSpec) thread safe? Through the API I'm spinning up 3 clusters, one after another, and I'd like to change the code to launch them in parallel, but wanted to check if this is threadsafe. -- Thanks, John C
Re: Execute a cmd via jclouds SshClient that requires sudo privs on VM started by Whirr
You could write your own predicate that does a cast. See ClusterController.runningInGroup() for something similar. Cheers, Tom

On Thu, Jun 16, 2011 at 9:47 AM, John Conwell j...@iamjohn.me wrote: I pulled the code from RunScriptCommand as an example, and I think I'm good in that respect. I'm having issues with runScriptOnNodesMatching() when I want to run the script on only one node in the group. I have the node ID of the target node, but I'm not sure how to create a predicate that targets just one node based on the ID. NodePredicates.withIds() returns a Predicate<ComputeMetadata>, but runScriptOnNodesMatching takes a Predicate<NodeMetadata>. The jclouds wiki states "Individual commands are executed against a specific node's id", but that doesn't really explain how to do this.

On Wed, Jun 15, 2011 at 1:21 PM, Andrei Savu savu.and...@gmail.com wrote: Take a look at the RunScriptCommand class. You need to call the function somehow like this:

StatementBuilder builder = new StatementBuilder();
builder.addStatements(
    Statements.appendFile("/tmp/my.cfg", lines),
    exec(getFileContent(scriptPath))
);
controller.runScriptOnNodesMatching(spec, condition, builder);

-- Andrei

On Wed, Jun 15, 2011 at 8:38 PM, John Conwell j...@iamjohn.me wrote: I looked at computeService.runScriptOnNodesMatching a while ago and couldn't make much sense of how to use the API correctly (for uploading a script and running it). Is there a good unit test that shows how to do this?

On Tue, Jun 14, 2011 at 2:03 AM, Andrei Savu savu.and...@gmail.com wrote: You could also try to use computeService.runScriptOnNodesMatching and upload the file using an AppendFile jclouds statement, together with the credentials from the cluster spec file. This approach is similar to what RunScriptCommand is doing. -- Andrei Savu / andreisavu.ro

On Mon, Jun 13, 2011 at 11:16 PM, John Conwell j...@iamjohn.me wrote: The AMI is us-east-1/ami-da0cf8b3 and the OS is the default that Whirr installs, Ubuntu 10.04 methinks.
On Mon, Jun 13, 2011 at 1:13 PM, Andrei Savu savu.and...@gmail.com wrote: That should work. What AMI and OS are you using?

On Jun 13, 2011 10:22 PM, John Conwell j...@iamjohn.me wrote: Which user? Whirr creates a user called ubuntu, and it also creates a user that is specified via whirr.cluster-user. I tried the user that is specified via whirr.cluster-user and that didn't work.

On Mon, Jun 13, 2011 at 12:16 PM, Andrei Savu savu.and...@gmail.com wrote: The user created by Whirr should be able to do sudo without requesting a password.

On Jun 13, 2011 8:55 PM, John Conwell j...@iamjohn.me wrote: I've got a cluster that gets started by Whirr. After it's running, I need to create a config file and copy it up to a specific folder on the VM. I'm using SshClient.put() to copy the file up to the /tmp directory on my VM with no security issues. But then I need to copy the file to a different folder, via the SshClient.exec() method. But the cp command requires sudo because the user Whirr created for me doesn't have privs to copy to the required folder. Also, I can't specify a password with the sudo command because the connection was made using X.509 certificates. So how can I execute remote SSH commands that require sudo privs? -- Thanks, John C
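[Editor's note] The cast-style predicate Tom suggests for the single-node case can be sketched as follows. A minimal Predicate interface and metadata classes stand in for the jclouds/Guava types so the example is self-contained; with jclouds you would use com.google.common.base.Predicate and the real ComputeMetadata/NodeMetadata hierarchy, and withId corresponds to NodePredicates.withIds:

```java
public class NodeIdPredicate {
    // Minimal stand-ins for com.google.common.base.Predicate and the
    // jclouds metadata hierarchy (NodeMetadata extends ComputeMetadata).
    interface Predicate<T> { boolean apply(T input); }
    static class ComputeMetadata {
        private final String id;
        ComputeMetadata(String id) { this.id = id; }
        String getId() { return id; }
    }
    static class NodeMetadata extends ComputeMetadata {
        NodeMetadata(String id) { super(id); }
    }

    // Equivalent of NodePredicates.withIds(id): matches on node ID.
    static Predicate<ComputeMetadata> withId(final String id) {
        return new Predicate<ComputeMetadata>() {
            public boolean apply(ComputeMetadata input) {
                return id.equals(input.getId());
            }
        };
    }

    // Narrow Predicate<ComputeMetadata> to the Predicate<NodeMetadata> that
    // runScriptOnNodesMatching wants: every NodeMetadata is a
    // ComputeMetadata, so delegating is safe.
    static Predicate<NodeMetadata> forNodes(final Predicate<ComputeMetadata> p) {
        return new Predicate<NodeMetadata>() {
            public boolean apply(NodeMetadata input) { return p.apply(input); }
        };
    }

    public static void main(String[] args) {
        Predicate<NodeMetadata> target = forNodes(withId("us-east-1/i-17f3497b"));
        System.out.println(target.apply(new NodeMetadata("us-east-1/i-17f3497b")));
        System.out.println(target.apply(new NodeMetadata("us-east-1/i-other")));
    }
}
```

Passing such a predicate to runScriptOnNodesMatching restricts execution to the single node with the given ID, which is what the thread above is after.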
Re: Run commands via whirr
Looks great - I've often wanted something like this. I think adding whirr run-cmd would be the way to add this, since then it's integrated with the whirr command. Thanks, Tom

On Wed, Aug 24, 2011 at 2:06 AM, Karel Vervaeke ka...@outerthought.org wrote: I'd be happy to do the jira dance.

On Tue, Aug 23, 2011 at 6:35 PM, Andrei Savu savu.and...@gmail.com wrote: Looks good! How about adding this as a script in bin/, e.g. bin/whirr-cmd? -- Andrei

On Tue, Aug 23, 2011 at 8:50 AM, Karel Vervaeke ka...@outerthought.org wrote: I got bored with writing little scripts when using whirr run-script so I hacked this up. Maybe it's useful to someone. Usage examples:

whirrcmd --cluster=recipes/mycluster.properties "sudo /usr/bin/jps status"
whirrcmd --cluster=recipes/mycluster.properties --roles=hbase-master "sudo /usr/bin/jps status"

Perhaps it'd be better to add it to whirr (e.g. whirr run-script --command=... or whirr run-cmd ...), but I don't know if it's useful enough. Here's the ugly bit:

whirrcmd() {
  local whirr_args tmpfile=$(mktemp --suffix .sh)
  whirr_args=("${@:1:$#-1}")
  cmd_arg=${@:$#}
  cat > "$tmpfile" <<EOF
#!/bin/bash
$cmd_arg
EOF
  whirr run-script "${whirr_args[@]}" --script="$tmpfile"
  rm "$tmpfile"
}

-- Karel Vervaeke http://outerthought.org/ Open Source Content Applications Makers of Kauri, Daisy CMS and Lily