Hi Rob,

I had to do all those steps you talked about. In particular, at bootstrap I run a Bash script stored on S3 like this:

    --core-key-value, giraph.zkList=localhost:2181, --mapred-key-value, mapreduce.job.counters.limit=1200

Then, in the steps configuration, I start by setting up Giraph and ZooKeeper by calling two Bash scripts (as two separate steps):

    s3://elasticmapreduce/libs/script-runner/script-runner.jar s3://mybucket/install_giraph.sh
    s3://elasticmapreduce/libs/script-runner/script-runner.jar s3://mybucket/install_zookeeper.sh

install_giraph.sh does this:

    hadoop dfs -copyToLocal s3://mybucket/giraph.tar.gz /home/hadoop
    tar -xzvf /home/hadoop/giraph.tar.gz -C /home/hadoop

and install_zookeeper.sh does this:

    hadoop dfs -copyToLocal s3://data.clipesebandas/binaries/zookeeper.tar.gz /home/hadoop
    tar -xzvf /home/hadoop/zookeeper.tar.gz -C /home/hadoop
    /home/hadoop/zookeeper/bin/zkServer.sh start

Finally, I run my Giraph algorithm in another step like this:

    /home/hadoop/giraph.jar org.giraph.MyGraphAlgorithm /user/hadoop/input_graph, /user/hadoop/built_graph 20 1

Perhaps some steps, like the ZooKeeper configuration, are not needed, since this configuration is based on Giraph 0.1.

Hope this helps.

Cheers
Gustavo

On Mon, Nov 11, 2013 at 12:43 PM, Rob Vesse <rve...@dotnetrdf.org> wrote:

> Hi All
>
> I've been looking around for any documentation about running Giraph on
> Amazon Elastic Map Reduce (EMR) and didn't turn up anything particularly
> useful.
>
> It looks like the only real requirements to run on EMR are to add
> Bootstrap actions to the Job Flow configuration to apply the relevant
> Hadoop configuration settings e.g. increasing max map tasks. After that it
> looks like I should just need to use a standard Custom JAR launch step to
> launch the Giraph Runner with appropriate arguments for my Giraph program.
>
> Before I start trying to do this and incurring EC2 costs does anyone have
> experience of running Giraph applications on EMR that they are willing to
> share? Any suggestions, tips, common pitfalls etc I should be aware of?
>
> Cheers,
>
> Rob
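
Both install scripts described above follow the same copy-then-extract pattern, so they could share one helper. What follows is only a minimal sketch, not the scripts actually used: the DRY_RUN environment variable and the function names run and install_tarball are hypothetical and were added for illustration. With DRY_RUN=1 the commands are printed instead of executed, which allows a quick local check before incurring EC2 costs.

```shell
#!/usr/bin/env bash
# Sketch only: DRY_RUN, run and install_tarball are hypothetical names,
# not part of the original install scripts.
set -euo pipefail

# Execute a command, or just print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Copy a tarball from S3 to a local directory and unpack it there,
# using the same two commands as the original scripts.
install_tarball() {
  local src="$1" dest="$2"
  run hadoop dfs -copyToLocal "$src" "$dest"
  run tar -xzvf "$dest/$(basename "$src")" -C "$dest"
}

# Only act when a source and a destination are given on the command line.
if [ "$#" -ge 2 ]; then
  install_tarball "$1" "$2"
fi
```

Called with s3://mybucket/giraph.tar.gz and /home/hadoop as arguments, this runs the same two commands as install_giraph.sh; the ZooKeeper variant would still need the zkServer.sh start line afterwards.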