My apologies, I actually run "bin/storm supervisor &" at the end on that
second machine. That was a bad copy and paste.

On Tue, Sep 9, 2014 at 8:10 AM, Vikas Agarwal <[email protected]> wrote:

> >>*I launch another basic EC2 CentOS machine and do the exact same
> commands except that I don't install zookeeper and I only run "bin/storm
> nimbus &" at the end*<<
>
> Why would you run nimbus on two machines?
>
> Yes, by default Hortonworks provides one supervisor node per machine, so I
> am trying to add a new EC2 machine with the supervisor role. It has worked to
> some extent but is still not fully functional. I am still testing its linear
> scalability.
>
>
> On Tue, Sep 9, 2014 at 5:15 PM, Stephen Hartzell <
> [email protected]> wrote:
>
>> Vikas,
>>
>>
>>   I've tried to use the HortonWorks distribution, but that only provides
>> one supervisor and nimbus on one virtual machine. I'm excited to hear that
>> you have storm working on AWS EC2 machines because that is exactly what I
>> am trying to do! Right now we're still in the development stage, so all we
>> are trying to do is to have one worker machine connect to one nimbus
>> machine. So far we haven't gotten this to work.
>>
>>   Although it might be lengthy, let me go ahead and post the commands I'm
>> using to set up the nimbus machine.
>>
>>
>> *I launch a basic EC2 CentOS machine with ports 6627 and 8080 open to TCP
>> connections (its public IP is 54.68.149.181)*
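>>
>> *(A quick way to confirm the security group really exposes those ports is
>> to probe them from another machine; this assumes nc is installed, e.g. "sudo
>> yum install nc". A "Connection refused" just means nothing is listening yet,
>> while a timeout usually means the security group is still blocking the
>> port.)*
>>
>> nc -zv 54.68.149.181 8080     *# UI port*
>>
>> nc -zv 54.68.149.181 6627     *# nimbus thrift port*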
>>
>> sudo yum update
>>
>> sudo yum install libuuid-devel gcc gcc-c++ kernel-devel
>>
>> *# Install zeromq*
>>
>> wget http://download.zeromq.org/zeromq-2.1.7.tar.gz
>>
>> tar -zxvf zeromq-2.1.7.tar.gz
>>
>> cd zeromq-2.1.7
>>
>> ./configure
>>
>> make
>>
>> sudo make install
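>>
>> *(Optional, but on CentOS the freshly built libzmq lands in /usr/local/lib,
>> which the dynamic linker may not search by default; if jzmq or the workers
>> later fail to find libzmq.so, refreshing the linker cache, and if needed
>> adding /usr/local/lib to /etc/ld.so.conf.d/, usually fixes it.)*
>>
>> sudo ldconfig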
>>
>> cd ~
>>
>> *# Install jzmq*
>>
>> sudo yum install git java-devel libtool
>>
>> export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.x.x86_64    *#
>> JDK required for jzmq in the configuration stage*
>>
>> cd $JAVA_HOME
>>
>> sudo ln -s include include                                         *#
>> JDK include directory has headers required for jzmq*
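>>
>> *(If that symlink step fails, the jzmq configure script only needs to find
>> jni.h under $JAVA_HOME/include, which a full JDK install should already
>> provide; a quick sanity check:)*
>>
>> ls $JAVA_HOME/include/jni.h     *# should exist; if not, JAVA_HOME is probably pointing at a JRE instead of the JDK*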
>>
>> cd ~
>>
>> git clone https://github.com/nathanmarz/jzmq.git
>>
>> cd jzmq/src/
>>
>> CLASSPATH=.:./.:$CLASSPATH
>>
>> touch classdist_noinst.stamp
>>
>> javac -d . org/zeromq/ZMQ.java org/zeromq/ZMQException.java
>> org/zeromq/ZMQQueue.java org/zeromq/ZMQForwarder.java
>> org/zeromq/ZMQStreamer.java
>>
>> cd ~/jzmq
>>
>> ./autogen.sh
>>
>> ./configure
>>
>> make
>>
>> sudo make install
>>
>> cd ~
>>
>> *# Download zookeeper*
>>
>> wget
>> http://mirror.cc.columbia.edu/pub/software/apache/zookeeper/zookeeper-3.4.6/zookeeper-3.4.6.tar.gz
>>
>> tar -zxvf zookeeper-3.4.6.tar.gz
>>
>> cd zookeeper-3.4.6
>>
>> vi conf/zoo.cfg
>>
>>
>> tickTime=2000
>>
>> dataDir=/tmp/zookeeper
>>
>> clientPort=2181
>>
>>
>> mkdir /tmp/zookeeper
>> bin/zkServer.sh start
>> bin/zkCli.sh -server 54.68.149.181:2181
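>>
>> *(Two quick checks that zookeeper really came up and is reachable on the
>> public address; the "ruok" four-letter command should answer "imok".)*
>>
>> bin/zkServer.sh status
>>
>> echo ruok | nc 54.68.149.181 2181     *# expect "imok"*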
>>
>>
>> *# Download storm and modify configuration file (changes over
>> default.yaml shown in bold)*
>>
>> https://www.dropbox.com/s/fl4kr7w0oc8ihdw/storm-0.8.2.zip
>>
>> unzip storm-0.8.2.zip
>>
>> cd storm-0.8.2
>>
>> vi conf/storm.yaml
>>
>>
>>> # Licensed to the Apache Software Foundation (ASF) under one
>>> # or more contributor license agreements.  See the NOTICE file
>>> # distributed with this work for additional information
>>> # regarding copyright ownership.  The ASF licenses this file
>>> # to you under the Apache License, Version 2.0 (the
>>> # "License"); you may not use this file except in compliance
>>> # with the License.  You may obtain a copy of the License at
>>> #
>>> # http://www.apache.org/licenses/LICENSE-2.0
>>> #
>>> # Unless required by applicable law or agreed to in writing, software
>>> # distributed under the License is distributed on an "AS IS" BASIS,
>>> # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
>>> # See the License for the specific language governing permissions and
>>> # limitations under the License.
>>>
>>> ########### These all have default values as shown
>>> ########### Additional configuration goes into storm.yaml
>>>
>>> java.library.path: "/usr/local/lib:/opt/local/lib:/usr/lib:*/home/ec2-user*"
>>>
>>> ### storm.* configs are general configurations
>>> # the local dir is where jars are kept
>>> storm.local.dir: "storm-local"
>>> storm.zookeeper.servers:
>>>     - "*54.68.149.181*"
>>> storm.zookeeper.port: 2181
>>> storm.zookeeper.root: "/storm"
>>> storm.zookeeper.session.timeout: 20000
>>> storm.zookeeper.connection.timeout: 15000
>>> storm.zookeeper.retry.times: 5
>>> storm.zookeeper.retry.interval: 1000
>>> storm.zookeeper.retry.intervalceiling.millis: 30000
>>> storm.cluster.mode: "distributed" # can be distributed or local
>>> storm.local.mode.zmq: false
>>> storm.thrift.transport: "backtype.storm.security.auth.SimpleTransportPlugin"
>>> storm.messaging.transport: "backtype.storm.messaging.zmq"
>>>
>>> ### nimbus.* configs are for the master
>>> nimbus.host: "*54.68.149.181*"
>>> nimbus.thrift.port: 6627
>>> nimbus.childopts: "-Xmx1024m"
>>> nimbus.task.timeout.secs: 30
>>> nimbus.supervisor.timeout.secs: 60
>>> nimbus.monitor.freq.secs: 10
>>> nimbus.cleanup.inbox.freq.secs: 600
>>> nimbus.inbox.jar.expiration.secs: 3600
>>> nimbus.task.launch.secs: 120
>>> nimbus.reassign: true
>>> nimbus.file.copy.expiration.secs: 600
>>> nimbus.topology.validator: "backtype.storm.nimbus.DefaultTopologyValidator"
>>>
>>> ### ui.* configs are for the master
>>> ui.port: 8080
>>> ui.childopts: "-Xmx768m"
>>>
>>> logviewer.port: 8000
>>> logviewer.childopts: "-Xmx128m"
>>> logviewer.appender.name: "A1"
>>>
>>> drpc.port: 3772
>>> drpc.worker.threads: 64
>>> drpc.queue.size: 128
>>> drpc.invocations.port: 3773
>>> drpc.request.timeout.secs: 600
>>> drpc.childopts: "-Xmx768m"
>>>
>>> transactional.zookeeper.root: "/transactional"
>>> transactional.zookeeper.servers: null
>>> transactional.zookeeper.port: null
>>>
>>> ### supervisor.* configs are for node supervisors
>>> # Define the amount of workers that can be run on this machine. Each worker is assigned a port to use for communication
>>> supervisor.slots.ports:
>>>     - 6700
>>>     - 6701
>>>     - 6702
>>>     - 6703
>>> supervisor.childopts: "-Xmx256m"
>>> #how long supervisor will wait to ensure that a worker process is started
>>> supervisor.worker.start.timeout.secs: 120
>>> #how long between heartbeats until supervisor considers that worker dead and tries to restart it
>>> supervisor.worker.timeout.secs: 30
>>> #how frequently the supervisor checks on the status of the processes it's monitoring and restarts if necessary
>>> supervisor.monitor.frequency.secs: 3
>>> #how frequently the supervisor heartbeats to the cluster state (for nimbus)
>>> supervisor.heartbeat.frequency.secs: 5
>>> supervisor.enable: true
>>>
>>> ### worker.* configs are for task workers
>>> worker.childopts: "-Xmx768m"
>>> worker.heartbeat.frequency.secs: 1
>>>
>>> task.heartbeat.frequency.secs: 3
>>> task.refresh.poll.secs: 10
>>>
>>> zmq.threads: 1
>>> zmq.linger.millis: 5000
>>> zmq.hwm: 0
>>>
>>> storm.messaging.netty.server_worker_threads: 1
>>> storm.messaging.netty.client_worker_threads: 1
>>> storm.messaging.netty.buffer_size: 5242880 #5MB buffer
>>> storm.messaging.netty.max_retries: 30
>>> storm.messaging.netty.max_wait_ms: 1000
>>> storm.messaging.netty.min_wait_ms: 100
>>>
>>> ### topology.* configs are for specific executing storms
>>> topology.enable.message.timeouts: true
>>> topology.debug: false
>>> topology.optimize: true
>>> topology.workers: 1
>>> topology.acker.executors: null
>>> topology.tasks: null
>>> # maximum amount of time a message has to complete before it's considered failed
>>> topology.message.timeout.secs: 30
>>> topology.skip.missing.kryo.registrations: false
>>> topology.max.task.parallelism: null
>>> topology.max.spout.pending: null
>>> topology.state.synchronization.timeout.secs: 60
>>> topology.stats.sample.rate: 0.05
>>> topology.builtin.metrics.bucket.size.secs: 60
>>> topology.fall.back.on.java.serialization: true
>>> topology.worker.childopts: null
>>> topology.executor.receive.buffer.size: 1024 #batched
>>> topology.executor.send.buffer.size: 1024 #individual messages
>>> topology.receiver.buffer.size: 8 # setting it too high causes a lot of problems (heartbeat thread gets starved, throughput plummets)
>>> topology.transfer.buffer.size: 1024 # batched
>>> topology.tick.tuple.freq.secs: null
>>> topology.worker.shared.thread.pool.size: 4
>>> topology.disruptor.wait.strategy: "com.lmax.disruptor.BlockingWaitStrategy"
>>> topology.spout.wait.strategy: "backtype.storm.spout.SleepSpoutWaitStrategy"
>>> topology.sleep.spout.wait.strategy.time.ms: 1
>>> topology.error.throttle.interval.secs: 10
>>> topology.max.error.report.per.interval: 5
>>> topology.kryo.factory: "backtype.storm.serialization.DefaultKryoFactory"
>>> topology.tuple.serializer: "backtype.storm.serialization.types.ListDelegateSerializer"
>>> topology.trident.batch.emit.interval.millis: 500
>>>
>>> dev.zookeeper.path: "/tmp/dev-storm-zookeeper"
>>
>> bin/storm nimbus &
>>
>> bin/storm supervisor &
>>
>> bin/storm ui &
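>>
>> *(Before moving to the second machine, it is worth confirming the daemons
>> actually came up and that nimbus is listening on its thrift port; the
>> process names below are what Storm 0.8.x typically shows in jps.)*
>>
>> jps                              *# expect nimbus, supervisor and core (the ui)*
>>
>> sudo netstat -tlnp | grep 6627   *# nimbus thrift port should be listening*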
>>
>> *I launch another basic EC2 CentOS machine and do the exact same commands
>> except that I don't install zookeeper and I only run "bin/storm nimbus &"
>> at the end*
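>>
>> *(For reference, a minimal sketch of what conf/storm.yaml on that second,
>> supervisor-only machine might contain; the local dir, library path and slot
>> ports are only examples and should match wherever things are installed on
>> that box.)*
>>
>> storm.zookeeper.servers:
>>     - "54.68.149.181"
>>
>> nimbus.host: "54.68.149.181"
>>
>> storm.local.dir: "storm-local"
>>
>> java.library.path: "/usr/local/lib:/opt/local/lib:/usr/lib:/home/ec2-user"
>>
>> supervisor.slots.ports:
>>     - 6700
>>     - 6701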
>>
>> Any thoughts would be greatly appreciated. Thanks so much for all of your
>> help. I'm sure someone else has done this before!
>>
>> On Mon, Sep 8, 2014 at 11:14 PM, Vikas Agarwal <[email protected]>
>> wrote:
>>
>>> Although implementing the Storm cluster manually would be a really nice
>>> learning exercise, I would suggest using the HortonWorks distribution, which
>>> comes with Storm as an OOTB solution, and you can configure everything from
>>> the Ambari UI. We are using Storm on Amazon EC2 machines, though it is right
>>> now in the beta stage. We are going to move to production in the coming 2-3
>>> months.
>>>
>>>
>>> On Tue, Sep 9, 2014 at 5:53 AM, Stephen Hartzell <
>>> [email protected]> wrote:
>>>
>>>> All,
>>>>
>>>>   I implemented the suggestions given by Parth and Harsha. I am now
>>>> using the default.yaml but I changed the storm.zookeeper.servers to the
>>>> nimbus machine's ip address: 54.68.149.181. I also changed the nimbus.host
>>>> to 54.68.149.181. I also opened up port 6627. Now, the UI web page gives
>>>> the following error: org.apache.thrift7.transport.TTransportException:
>>>> java.net.ConnectException: Connection refused
>>>>
>>>> You should be able to see the error it gives by going to the web page
>>>> yourself at: http://54.68.149.181:8080. I am only using this account
>>>> to test and see if I can even get storm to work, so these machines are only
>>>> for testing. Perhaps someone could tell me what the storm.yaml file should
>>>> look like for this setup?
>>>>
>>>> -Thanks, Stephen
>>>>
>>>> On Mon, Sep 8, 2014 at 7:41 PM, Stephen Hartzell <
>>>> [email protected]> wrote:
>>>>
>>>>> I'm getting kind of confused by the storm.yaml file. Should I be using
>>>>> the default.yaml and just modify the zookeeper and nimbus ip, or should I
>>>>> use a brand new storm.yaml?
>>>>>
>>>>> My nimbus machine has the ip address: 54.68.149.181. My zookeeper is
>>>>> on the nimbus machine. What should the storm.yaml look like on my worker
>>>>> and nimbus machine? Will the storm.yaml be the same on my worker and 
>>>>> nimbus
>>>>> machine? I am not trying to do anything fancy, I am just trying to get a
>>>>> very basic cluster up and running.
>>>>>
>>>>> -Thanks, Stephen
>>>>>
>>>>> On Mon, Sep 8, 2014 at 7:00 PM, Stephen Hartzell <
>>>>> [email protected]> wrote:
>>>>>
>>>>>>   All, thanks so much for your help. I cannot tell you how much I
>>>>>> appreciate it. I'm going to try out your suggestions and keep banging my
>>>>>> head against the wall :D. I've spent an enormous amount of time trying to
>>>>>> get this to work. I'll let you know what happens after I try to implement
>>>>>> your suggestions. It would be really cool if someone had a tutorial that
>>>>>> detailed this part. (I'll make it myself if I ever get this to work!) It
>>>>>> seems like trying to get a two-machine cluster setup on AWS would be a 
>>>>>> VERY
>>>>>> common use-case. I've read and watched everything I can on the topic and
>>>>>> nothing got it working for me!
>>>>>>
>>>>>> On Mon, Sep 8, 2014 at 6:54 PM, Parth Brahmbhatt <
>>>>>> [email protected]> wrote:
>>>>>>
>>>>>>> The worker connects to the thrift port and not the ui port. You need
>>>>>>> to open port 6627, or whatever value is set in storm.yaml via the
>>>>>>> property “nimbus.thrift.port”.
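>>>>>>>
>>>>>>> (For example, "grep nimbus.thrift.port conf/storm.yaml" on the nimbus
>>>>>>> machine shows the value actually in use, and "nc -zv 54.68.149.181 6627"
>>>>>>> from the worker is one quick way to confirm that port is reachable.)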
>>>>>>>
>>>>>>> Based on the configuration that you have posted so far, it seems your
>>>>>>> nimbus host has nimbus, ui, and supervisor working because you actually
>>>>>>> have zookeeper running locally on that host. As Harsha pointed out, you
>>>>>>> need to change it to the public ip instead of the loopback interface.
>>>>>>>
>>>>>>> Thanks
>>>>>>> Parth
>>>>>>>
>>>>>>>
>>>>>>> On Sep 8, 2014, at 3:42 PM, Harsha <[email protected]> wrote:
>>>>>>>
>>>>>>> storm.zookeeper.servers:
>>>>>>>      - "127.0.0.1"
>>>>>>> nimbus.host: "127.0.0.1" (*127.0.0.1 causes it to bind to the loopback
>>>>>>> interface; instead use either your public ip or 0.0.0.0*)
>>>>>>> storm.local.dir: /tmp/storm (*I recommend moving this to a different
>>>>>>> folder, probably /home/storm; /tmp/storm will get deleted if your
>>>>>>> machine is restarted*)
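>>>>>>>
>>>>>>> (A sketch of that change, assuming an "ec2-user" account; the exact path
>>>>>>> is only an example:
>>>>>>>     sudo mkdir -p /home/storm
>>>>>>>     sudo chown ec2-user:ec2-user /home/storm
>>>>>>> and then in storm.yaml:
>>>>>>>     storm.local.dir: "/home/storm"
>>>>>>> )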
>>>>>>>
>>>>>>> make sure your zookeeper is also listening on 0.0.0.0 or the public ip,
>>>>>>> not 127.0.0.1.
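>>>>>>>
>>>>>>> (One way to verify, if net-tools is installed on the nimbus box:
>>>>>>>     sudo netstat -tlnp | grep 2181
>>>>>>> should show 0.0.0.0:2181 or the public ip, not 127.0.0.1:2181.)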
>>>>>>>
>>>>>>> "No, I cannot ping my host which has a public ip address of
>>>>>>> 54.68.149.181"
>>>>>>> you are not able to reach this ip from the worker node but are able to
>>>>>>> access the UI using it?
>>>>>>> -Harsha
>>>>>>>
>>>>>>> On Mon, Sep 8, 2014, at 03:34 PM, Stephen Hartzell wrote:
>>>>>>>
>>>>>>> Harsha,
>>>>>>>
>>>>>>>   The storm.yaml on the host machine looks like this:
>>>>>>>
>>>>>>> storm.zookeeper.servers:
>>>>>>>      - "127.0.0.1"
>>>>>>>
>>>>>>>
>>>>>>> nimbus.host: "127.0.0.1"
>>>>>>>
>>>>>>> storm.local.dir: /tmp/storm
>>>>>>>
>>>>>>>
>>>>>>>   The storm.yaml on the worker machine looks like this:
>>>>>>>
>>>>>>> storm.zookeeper.servers:
>>>>>>>      - "54.68.149.181"
>>>>>>>
>>>>>>>
>>>>>>> nimbus.host: "54.68.149.181"
>>>>>>>
>>>>>>> storm.local.dir: /tmp/storm
>>>>>>>
>>>>>>> No, I cannot ping my host which has a public ip address of
>>>>>>> 54.68.149.181 although I can connect to the UI web page when it is 
>>>>>>> hosted.
>>>>>>> I don't know how I would go about connecting to zookeeper on the nimbus
>>>>>>> host.
>>>>>>> -Thanks, Stephen
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Sep 8, 2014 at 6:28 PM, Harsha <[email protected]> wrote:
>>>>>>>
>>>>>>>
>>>>>>> Are there any errors in the worker machine's supervisor logs? Are you
>>>>>>> using the same storm.yaml for both machines, and are you able to
>>>>>>> ping your nimbus host or connect to zookeeper on the nimbus host?
>>>>>>> -Harsha
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Sep 8, 2014, at 03:24 PM, Stephen Hartzell wrote:
>>>>>>>
>>>>>>> Harsha,
>>>>>>>
>>>>>>>   Thanks so much for getting back to me. I will check the logs,
>>>>>>> but I don't seem to get any error messages. I have a nimbus AWS machine
>>>>>>> with zookeeper on it and a worker AWS machine.
>>>>>>>
>>>>>>> On the nimbus machine I start the zookeeper and then I run:
>>>>>>>
>>>>>>> bin/storm nimbus &
>>>>>>> bin/storm supervisor &
>>>>>>> bin/storm ui
>>>>>>>
>>>>>>> On the worker machine I run:
>>>>>>> bin/storm supervisor
>>>>>>>
>>>>>>> When I go to the UI page, I only see 1 supervisor (the one on the
>>>>>>> nimbus machine). So apparently, the worker machine isn't "registering" 
>>>>>>> with
>>>>>>> the nimbus machine.
>>>>>>>
>>>>>>> On Mon, Sep 8, 2014 at 6:16 PM, Harsha <[email protected]> wrote:
>>>>>>>
>>>>>>>
>>>>>>> Hi Stephen,
>>>>>>>         What are the issues you are seeing?
>>>>>>> "How do worker machines "know" how to connect to nimbus? Is it in
>>>>>>> the storm configuration file"
>>>>>>>
>>>>>>> Yes. Make sure the supervisor (worker) and nimbus nodes are able
>>>>>>> to connect to your zookeeper cluster.
>>>>>>> Check your logs under storm_inst/logs/ for any errors when you try
>>>>>>> to start nimbus or supervisors.
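>>>>>>>
>>>>>>> (On a 0.8.x/0.9.x install the per-daemon logs usually sit under the storm
>>>>>>> directory itself, so on the worker something like
>>>>>>>     tail -f logs/supervisor.log
>>>>>>> should show zookeeper connection attempts or "Connection refused" errors
>>>>>>> if it cannot reach the nimbus/zookeeper host.)
>>>>>>>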
>>>>>>> If you are installing it manually, try following these steps if you
>>>>>>> have not done so already:
>>>>>>>
>>>>>>> http://www.michael-noll.com/tutorials/running-multi-node-storm-cluster/
>>>>>>> -Harsha
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Sep 8, 2014, at 03:01 PM, Stephen Hartzell wrote:
>>>>>>>
>>>>>>> All,
>>>>>>>
>>>>>>> I would greatly appreciate any help that anyone is willing to offer. I've
>>>>>>> been trying to set up a storm cluster on AWS for a few weeks now on CentOS
>>>>>>> EC2 machines. So far, I haven't been able to get a cluster built. I can
>>>>>>> get a supervisor and nimbus to run on a single machine, but I can't figure
>>>>>>> out how to get another worker to connect to nimbus. How do worker machines
>>>>>>> "know" how to connect to nimbus? Is it in the storm configuration file?
>>>>>>> I've gone through many tutorials and the official documentation, but this
>>>>>>> point doesn't seem to be covered anywhere in sufficient detail for a new
>>>>>>> guy like me.
>>>>>>>
>>>>>>>   Some of you may be tempted to point me toward storm-deploy, but I
>>>>>>> spent four days trying to get that to work before I gave up. I'm hitting
>>>>>>> Issue #58 on github. Following the instructions exactly, and other
>>>>>>> tutorials, on a brand new AWS machine fails. So I gave up on storm-deploy
>>>>>>> and decided to try to set up a cluster manually. Thanks in advance to
>>>>>>> anyone willing to offer me any inputs you can!
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>>
>>> --
>>> Regards,
>>> Vikas Agarwal
>>> 91 – 9928301411
>>>
>>> InfoObjects, Inc.
>>> Execution Matters
>>> http://www.infoobjects.com
>>> 2041 Mission College Boulevard, #280
>>> Santa Clara, CA 95054
>>> +1 (408) 988-2000 Work
>>> +1 (408) 716-2726 Fax
>>>
>>>
>>
>
>
> --
> Regards,
> Vikas Agarwal
> 91 – 9928301411
>
> InfoObjects, Inc.
> Execution Matters
> http://www.infoobjects.com
> 2041 Mission College Boulevard, #280
> Santa Clara, CA 95054
> +1 (408) 988-2000 Work
> +1 (408) 716-2726 Fax
>
>
