Re: Problem initializing pipes in HamaStreaming

Martin Illecker Fri, 27 Sep 2013 03:32:01 -0700

Hi Roman,

then you haven't started HDFS yet? (start-dfs.sh)


Are you able to access the HDFS NameNode web interface?
http://localhost:50070/dfshealth.jsp
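
If a browser is not handy, a small Python check along these lines would show whether anything is listening on the relevant ports. This is only a sketch: the helper name port_is_open is made up for illustration, 50070 is the web UI port from the URL above, and 40000 is the bsp.master.address port from the hama-site.xml quoted further down.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts and unresolvable hosts.
        return False


if __name__ == "__main__":
    # Ports taken from this thread; adjust them if your setup differs.
    for name, port in [("namenode web UI", 50070), ("BSP master", 40000)]:
        state = "reachable" if port_is_open("localhost", port) else "NOT reachable"
        print("%s (localhost:%d): %s" % (name, port, state))
```

A closed port here only says that nothing is listening; it does not say why the daemon is down, so the daemon logs are still the next place to look.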

Your HDFS should contain the following files:

$ hadoop fs -ls /tmp/PyStreaming/
Found 8 items
-rw-r--r--   279 2013-09-27 12:19 /tmp/PyStreaming/BSP.py
-rw-r--r--   5159 2013-09-27 12:19 /tmp/PyStreaming/BSPPeer.py
-rw-r--r--   379 2013-09-27 12:19 /tmp/PyStreaming/BSPRunner.py
-rw-r--r--   970 2013-09-27 12:19 /tmp/PyStreaming/BinaryProtocol.py
-rw-r--r--   299 2013-09-27 12:19 /tmp/PyStreaming/BspJobConfiguration.py
-rw-r--r--   557 2013-09-27 12:19 /tmp/PyStreaming/HelloWorldBSP.py
-rw-r--r--   5570 2013-09-27 12:19 /tmp/PyStreaming/KMeansBSP.py
-rw-r--r--   326 2013-09-27 12:19 /tmp/PyStreaming/README
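
To compare your own listing against the one above without eyeballing it, a sketch like the following could be used. It assumes the path is the last whitespace-separated field on each line of the `hadoop fs -ls` output; the function name missing_files is hypothetical.

```python
import posixpath

# File names taken from the listing above.
EXPECTED = {
    "BSP.py", "BSPPeer.py", "BSPRunner.py", "BinaryProtocol.py",
    "BspJobConfiguration.py", "HelloWorldBSP.py", "KMeansBSP.py", "README",
}


def missing_files(listing, directory="/tmp/PyStreaming/", expected=EXPECTED):
    """Return the expected file names that do not appear in the listing text."""
    found = set()
    for line in listing.splitlines():
        fields = line.split()
        # Skip blank lines and the "Found N items" header; keep path entries only.
        if fields and fields[-1].startswith(directory):
            found.add(posixpath.basename(fields[-1]))
    return sorted(set(expected) - found)
```

Pass it the text printed by the command above; an empty list means every file is in place.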

Without the default file system in hama-site.xml, it will not work.
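
A quick way to confirm that the property is really set is to parse the file. The sketch below assumes the usual configuration/property/name/value layout shown in the hama-site.xml quoted below; the helper names read_properties and check_default_fs are hypothetical.

```python
import xml.etree.ElementTree as ET


def read_properties(xml_text):
    """Map each <property> name to its value in a Hadoop-style *-site.xml."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def check_default_fs(xml_text):
    """Return the fs.default.name value, or None if it is missing or empty."""
    return read_properties(xml_text).get("fs.default.name") or None
```

For example, check_default_fs(open("hama/conf/hama-site.xml", "rb").read()) should return hdfs://localhost/ for the pseudo-distributed configuration, and None if the property is absent.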

Martin


2013/9/27 Roman Shapovalov <[email protected]>

> Martin,
>
> if I set the default file system to hdfs://localhost/, I get a connection
> error:
>
> 13/09/27 14:04:11 INFO ipc.Client: Retrying connect to server:
> localhost/127.0.0.1:40000. Already tried 0 time(s); retry policy is
> RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1
> SECONDS)
>
> (and 10 times like that, then I get a java.net.ConnectException).
>
> I am attaching my hama-site.xml (as it was before I added the default fs
> property). I had only added the bsp.master.address property to switch
> to pseudo-distributed mode (PDM).
>
> Roman
>
> On Fri, Sep 27, 2013 at 4:20 AM, Martin Illecker <[email protected]>
> wrote:
> > Hi Roman!
> >
> > Did you set up the default filesystem in hama-site.xml?
> >
> > Please submit your hama-site.xml configuration.
> >
> > Martin
> >
> >
> > hama-site.xml - pseudo-distributed mode
> >
> > <configuration>
> >
> >     <property>
> >         <name>bsp.master.address</name>
> >         <value>localhost:40000</value>
> >         <description>The address of the bsp master server. Either the
> >             literal string "local" or a host:port for distributed mode
> >         </description>
> >     </property>
> >
> >     <property>
> >         <name>fs.default.name</name>
> >         <value>hdfs://localhost/</value>
> >         <description>
> >             The name of the default file system. Either the literal string
> >             "local" or a host:port for HDFS.
> >         </description>
> >     </property>
> >
> >     <property>
> >         <name>hama.zookeeper.quorum</name>
> >         <value>localhost</value>
> >         <description>Comma separated list of servers in the ZooKeeper Quorum.
> >             For example, "host1.mydomain.com,host2.mydomain.com,host3.mydomain.com".
> >             By default this is set to localhost for local and pseudo-distributed modes
> >             of operation. For a fully-distributed setup, this should be set to a full
> >             list of ZooKeeper quorum servers. If HAMA_MANAGES_ZK is set in hama-env.sh
> >             this is the list of servers which we will start/stop zookeeper on.
> >         </description>
> >     </property>
> >
> > </configuration>
> >
> >
> > On 27.09.2013 at 09:32, Roman Shapovalov <[email protected]> wrote:
> >
> >> Edward,
> >>
> >> Yes, I did. See the logs in my previous message.
> >>
> >> Roman
> >>
> >> On Fri, Sep 27, 2013 at 7:15 AM, Edward J. Yoon <[email protected]>
> wrote:
> >>> Have you tried to run in pseudo-distributed mode?
> >>>
> >>> On Fri, Sep 27, 2013 at 5:47 AM, Roman Shapovalov
> >>> <[email protected]> wrote:
> >>>> Martin,
> >>>>
> >>>> Thanks for such verbose instructions.
> >>>>
> >>>>> You can find all Hama configuration files in the *conf* folder.
> >>>>
> >>>> OK, I thought Edward meant Hadoop configs specifically.
> >>>> I have only added the JAVA_HOME variable there; otherwise they are
> >>>> the defaults.
> >>>>
> >>>>> You should also find task logs in your *temp* folder.
> >>>>
> >>>> I found the folder, but there were no .log files in the attempt*
> >>>> folders (in both modes).
> >>>>
> >>>>> Normally you should find it in *hama/logs/tasklogs*.
> >>>>
> >>>> They appear in the pseudo-distributed mode only (which also fails).
> >>>> See the attached file.
> >>>>
> >>>>> By the way do you have python3.2 installed? :-)
> >>>>
> >>>> Yes. "python" links to Python 2.6, but I pass "python3.2" as an
> >>>> interpreter, which links to the correct version.
> >>>>
> >>>>
> >>>> Roman
> >>>>
> >>>> On Thu, Sep 26, 2013 at 4:03 PM, Martin Illecker <
> [email protected]> wrote:
> >>>>> Hi Roman,
> >>>>>
> >>>>> if you are running Hama in local mode, it will not use HDFS anyway.
> >>>>>
> >>>>> You can find all Hama configuration files in the *conf* folder.
> >>>>>
> >>>>> $ll hama/conf/
> >>>>> total 56
> >>>>> -rwxr-xr-x groomservers*
> >>>>> -rwxr-xr-x hama-default.xml*
> >>>>> -rwxr-xr-x hama-env.sh*
> >>>>> -rwxr-xr-x hama-site.xml*
> >>>>> -rwxr-xr-x log4j.properties*
> >>>>>
> >>>>> You should probably set up the Pseudo Distributed Mode [1] in
> >>>>> hama-site.xml.
> >>>>>
> >>>>> But the task log would be very interesting.
> >>>>>
> >>>>> Normally you should find it in *hama/logs/tasklogs*.
> >>>>> e.g., hama/logs/tasklogs/job_201309262134_0001/attempt_201309262134_0001_000000_0.log
> >>>>>
> >>>>> You should also find task logs in your *temp* folder.
> >>>>> But this location will depend on your operating system,
> >>>>> e.g., on OS X:
> >>>>> /private/tmp/hadoop-YOURUSER/bsp/local/groomServer/attempt_201309262134_0001_000000_0/work/tasklogs/
> >>>>>
> >>>>> By the way do you have python3.2 installed? :-)
> >>>>> $ python --version
> >>>>> Python 3.2.5
> >>>>> $ python3.2 --version
> >>>>> Python 3.2.5
> >>>>>
> >>>>> May I ask which operating system you use?
> >>>>>
> >>>>> Martin
> >>>>>
> >>>>> [1] http://wiki.apache.org/hama/GettingStarted#Pseudo_Distributed_Mode
> >>>>>
> >>>>>
> >>>>>
> >>>>> 2013/9/26 Roman Shapovalov <[email protected]>
> >>>>>
> >>>>>> Hi Edward,
> >>>>>>
> >>>>>> Could you please be more specific? (Sorry, I am new to this stuff)
> >>>>>>
> >>>>>> I am running Hama in local mode. The logs/ directory is empty, and I did
> >>>>>> not find any logs in HDFS either.
> >>>>>>
> >>>>>> And where can I find the Hadoop configuration?
> >>>>>>
> >>>>>> Thank you,
> >>>>>> Roman
> >>>>>>
> >>>>>> On Thu, Sep 26, 2013 at 12:05 PM, Edward J. Yoon <
> [email protected]>
> >>>>>> wrote:
> >>>>>>> Hi,
> >>>>>>>
> >>>>>>> That's strange. Can you attach your namenode logs and Hadoop
> >>>>>>> configurations?
> >>>>>>>
> >>>>>>> On Thu, Sep 26, 2013 at 11:03 PM, Roman Shapovalov
> >>>>>>> <[email protected]> wrote:
> >>>>>>>> Hi again,
> >>>>>>>>
> >>>>>>>> I have updated both Hama (from the trunk) and Streaming (from Martin's
> >>>>>>>> GitHub), and checked that the patches have been applied, but I keep
> >>>>>>>> getting the same error (full log for local configuration is attached).
> >>>>>>>>
> >>>>>>>> Another thing may be relevant: I keep the default Hadoop libraries in
> >>>>>>>> lib/. If I replace them as the tutorial says, some classes cannot be
> >>>>>>>> found even if I run pure Hama (which works perfectly with default
> >>>>>>>> libs). I don't know if it is important.
> >>>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>> Roman
> >>>>>>>>
> >>>>>>>> On Tue, Sep 24, 2013 at 9:22 AM, Martin Illecker <
> [email protected]>
> >>>>>> wrote:
> >>>>>>>>> Hi Roman,
> >>>>>>>>>
> >>>>>>>>> sorry for the inconvenience!
> >>>>>>>>> The problem has been reported [1] and will be fixed in the trunk
> >>>>>>>>> shortly.
> >>>>>>>>>
> >>>>>>>>> [1] https://issues.apache.org/jira/browse/HAMA-805
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> 2013/9/23 Edward J. Yoon <[email protected]>
> >>>>>>>>>
> >>>>>>>>>> This looks like a bug in DistCacheUtils.
> >>>>>>>>>>
> >>>>>>>>>> Thanks for your report. I'll look at it tomorrow.
> >>>>>>>>>>
> >>>>>>>>>> On Mon, Sep 23, 2013 at 11:52 PM, Roman Shapovalov
> >>>>>>>>>> <[email protected]> wrote:
> >>>>>>>>>>> Hello all,
> >>>>>>>>>>>
> >>>>>>>>>>> I am trying to use Hama Streaming.
> >>>>>>>>>>> I have successfully installed Hama (the Pi example works).
> >>>>>>>>>>> I am following this tutorial:
> >>>>>>>>>>> http://wiki.apache.org/hama/HamaStreaming
> >>>>>>>>>>>
> >>>>>>>>>>> When I try to run the distributed HelloWorld in the local
> >>>>>>>>>>> configuration, I get the following error:
> >>>>>>>>>>>
> >>>>>>>>>>> $ bin/hama pipes -streaming true -bspTasks 3 -interpreter python3.2 \
> >>>>>>>>>>>     -cachefiles /tmp/PyStreaming/*.py -output /tmp/pystream-out/ \
> >>>>>>>>>>>     -program /tmp/PyStreaming/BSPRunner.py -programArgs HelloWorldBSP
> >>>>>>>>>>>
> >>>>>>>>>>> 13/09/23 18:03:50 INFO pipes.Submitter: Streaming enabled!
> >>>>>>>>>>> 13/09/23 18:03:50 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
> >>>>>>>>>>> 13/09/23 18:03:50 WARN bsp.BSPJobClient: No job jar file set.  User classes may not be found. See BSPJob#setJar(String) or check Your jar file.
> >>>>>>>>>>> 13/09/23 18:03:50 INFO bsp.BSPJobClient: Running job: job_localrunner_0001
> >>>>>>>>>>> 13/09/23 18:03:50 INFO bsp.LocalBSPRunner: Setting up a new barrier for 3 tasks!
> >>>>>>>>>>> 13/09/23 18:03:50 ERROR bsp.LocalBSPRunner: Exception during BSP execution!
> >>>>>>>>>>> java.lang.NullPointerException
> >>>>>>>>>>>    at org.apache.hama.pipes.PipesBSP.setup(PipesBSP.java:44)
> >>>>>>>>>>>    at org.apache.hama.bsp.LocalBSPRunner$BSPRunner.run(LocalBSPRunner.java:255)
> >>>>>>>>>>>    at org.apache.hama.bsp.LocalBSPRunner$BSPRunner.call(LocalBSPRunner.java:286)
> >>>>>>>>>>>    at org.apache.hama.bsp.LocalBSPRunner$BSPRunner.call(LocalBSPRunner.java:211)
> >>>>>>>>>>>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> >>>>>>>>>>>    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> >>>>>>>>>>>    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
> >>>>>>>>>>>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> >>>>>>>>>>>    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> >>>>>>>>>>>    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> >>>>>>>>>>>    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> >>>>>>>>>>>    at java.lang.Thread.run(Thread.java:662)
> >>>>>>>>>>> [output cropped]
> >>>>>>>>>>>
> >>>>>>>>>>> When I switch to pseudo-distributed mode, the job fails too (after a
> >>>>>>>>>>> minute of execution):
> >>>>>>>>>>>
> >>>>>>>>>>> 13/09/23 18:46:34 INFO pipes.Submitter: Streaming enabled!
> >>>>>>>>>>> 13/09/23 18:46:34 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
> >>>>>>>>>>> 13/09/23 18:46:34 WARN bsp.BSPJobClient: No job jar file set.  User classes may not be found. See BSPJob#setJar(String) or check Your jar file.
> >>>>>>>>>>> 13/09/23 18:46:34 INFO bsp.BSPJobClient: Running job: job_201309231846_0001
> >>>>>>>>>>> 13/09/23 18:47:40 INFO bsp.BSPJobClient: Job failed.
> >>>>>>>>>>>
> >>>>>>>>>>> The task log contains errors:
> >>>>>>>>>>>
> >>>>>>>>>>> 13/09/23 18:46:37 INFO ipc.Server: Starting Socket Reader #1 for port 43475
> >>>>>>>>>>> 13/09/23 18:46:37 INFO ipc.Server: IPC Server Responder: starting
> >>>>>>>>>>> 13/09/23 18:46:37 INFO ipc.Server: IPC Server listener on 43475: starting
> >>>>>>>>>>> 13/09/23 18:46:37 INFO message.HadoopMessageManagerImpl:  BSPPeer address:localhost.localdomain port:43475
> >>>>>>>>>>> 13/09/23 18:46:37 INFO ipc.Server: IPC Server handler 0 on 43475: starting
> >>>>>>>>>>> 13/09/23 18:46:37 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
> >>>>>>>>>>> 13/09/23 18:46:37 INFO sync.ZKSyncClient: Initializing ZK Sync Client
> >>>>>>>>>>> 13/09/23 18:46:37 INFO sync.ZooKeeperSyncClientImpl: Start connecting to Zookeeper! At localhost.localdomain/127.0.0.1:43475
> >>>>>>>>>>> 13/09/23 18:46:37 ERROR bsp.BSPTask: Error running bsp setup and bsp function.
> >>>>>>>>>>> java.lang.NullPointerException
> >>>>>>>>>>>    at java.io.File.<init>(File.java:222)
> >>>>>>>>>>>    at org.apache.hama.pipes.PipesApplication.setupCommand(PipesApplication.java:130)
> >>>>>>>>>>>    at org.apache.hama.pipes.PipesApplication.start(PipesApplication.java:257)
> >>>>>>>>>>>    at org.apache.hama.pipes.PipesBSP.setup(PipesBSP.java:44)
> >>>>>>>>>>>    at org.apache.hama.bsp.BSPTask.runBSP(BSPTask.java:176)
> >>>>>>>>>>>    at org.apache.hama.bsp.BSPTask.run(BSPTask.java:146)
> >>>>>>>>>>>    at org.apache.hama.bsp.GroomServer$BSPPeerChild.main(GroomServer.java:1246)
> >>>>>>>>>>> [output cropped]
> >>>>>>>>>>>
> >>>>>>>>>>> I use the latest trunk version of Hama, Python 3.2.5 and Hadoop
> >>>>>>>>>>> 2.0.0-cdh4.1.1.
> >>>>>>>>>>>
> >>>>>>>>>>> Please help me to figure out the problem.
> >>>>>>>>>>>
> >>>>>>>>>>> Thanks in advance,
> >>>>>>>>>>> Roman
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> --
> >>>>>>>>>> Best Regards, Edward J. Yoon
> >>>>>>>>>> @eddieyoon
> >>>>>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> --
> >>>>>>> Best Regards, Edward J. Yoon
> >>>>>>> @eddieyoon
> >>>>>>
> >>>
> >>>
> >>>
> >>> --
> >>> Best Regards, Edward J. Yoon
> >>> @eddieyoon
> >
>
