Enjoy!

On Fri, Sep 27, 2013 at 11:31 PM, Roman Shapovalov <[email protected]> wrote:
> It works now!
>
> I just needed to add more jars to the lib/ folder, i.e. hdfs and protobuf.
> You might want to refresh the Hadoop Installation section in the
> tutorial. In this case, the list of files I copied:
>
> protobuf-java-2.4.0a.jar
> guava-11.0.2.jar
> hadoop-auth-2.0.0-cdh4.1.1.jar
> hadoop-common-2.0.0-cdh4.1.1-tests.jar
> hadoop-common-2.0.0-cdh4.1.1.jar
> hadoop-core-2.0.0-mr1-cdh4.1.1.jar
> hadoop-hdfs-2.0.0-cdh4.1.1-tests.jar
> hadoop-hdfs-2.0.0-cdh4.1.1.jar
> hadoop-test-2.0.0-mr1-cdh4.1.1.jar
>
> Thank you guys for the help! You are the best!
>
> Roman
>
> On Fri, Sep 27, 2013 at 10:15 AM, Edward J. Yoon <[email protected]>
> wrote:
>> :-)
>>
>> Please check whether the HDFS daemons are running with the following command.
>>
>> % ps -ef | grep java
>>
>> or use the web UI.
>>
>> If everything is OK, then start Hama -
>> http://wiki.apache.org/hama/GettingStarted#Pseudo_Distributed_Mode
>>
>> --
>> Best Regards, Edward J. Yoon
>> @eddieyoon
>>
>> On 2013. 9. 27., at 10:59 PM, Roman Shapovalov
>> <[email protected]> wrote:
>>
>>>> If so, use hostname or localhost instead of 0.0.0.0.
>>>
>>> Done.
>>>
>>>> You can check the Hama cluster status via web UI at http://localhost:40013/
>>>
>>> It does not load at all for either non-default fs.default.name value.
>>>
>>>
>>> Also, I changed Hadoop libs in the libs/ folder, since the default
>>> ones were of the version 1.2 (it may have caused the IPC version
>>> mismatch error). Besides the core and test jars (as listed in the
>>> tutorial), I added some more (it seems in 2.0 they decoupled core to
>>> several jars). Does it make sense?
>>>
>>> It still does not connect, but with a new error (from the bspmaster log):
>>>
>>> 2013-09-27 17:48:46,437 ERROR org.apache.hama.bsp.BSPMaster: Can't get
>>> connection to Hadoop Namenode!
>>> java.io.IOException: No FileSystem for scheme: hdfs
>>> at
>>> org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2206)
>>>
>>> Roman
>>>
>>> On Fri, Sep 27, 2013 at 9:40 AM, Edward J. Yoon <[email protected]>
>>> wrote:
>>>> So, did you set the value of fs.default.name property to
>>>> hdfs://0.0.0.0:8020? If so, use hostname or localhost instead of 0.0.0.0.
>>>>
>>>> You can check the Hama cluster status via web UI at http://localhost:40013/
>>>>
>>>> --
>>>> Best Regards, Edward J. Yoon
>>>> @eddieyoon
>>>>
>>>> On 2013. 9. 27., at 10:13 PM, Roman Shapovalov
>>>> <[email protected]> wrote:
>>>>
>>>>> I've set filesystem to "hdfs://0.0.0.0:8020".
>>>>>
>>>>>> Please check the bspmaster log.
>>>>>
>>>>> There is the following error:
>>>>> 2013-09-27 17:01:32,697 ERROR org.apache.hama.bsp.BSPMaster: Can't get
>>>>> connection to Hadoop Namenode!
>>>>> org.apache.hadoop.ipc.RemoteException: Server IPC version 7 cannot
>>>>> communicate with client version 4
>>>>>
>>>>> The full log is attached.
>>>>>
>>>>> Roman
>>>>>
>>>>>
>>>>> On Fri, Sep 27, 2013 at 8:59 AM, Edward J. Yoon <[email protected]>
>>>>> wrote:
>>>>>> CDH's default dfs port is 8020.
>>>>>>
>>>>>>>> May there be the problem with permissions for HDFS access?
>>>>>>
>>>>>> Please check the bspmaster log. I think your problem is a configuration
>>>>>> issue.
>>>>>>
>>>>>> See http://wiki.apache.org/hama/GettingStarted#Pseudo_Distributed_Mode
>>>>>>
>>>>>> --
>>>>>> Best Regards, Edward J. Yoon
>>>>>> @eddieyoon
>>>>>>
>>>>>> On 2013. 9. 27., at 9:54 PM, Martin Illecker <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> Please have a look in your hadoop configuration *hadoop-site.xml* [1].
>>>>>>>
>>>>>>> Try setting the same default filesystem in *hama-site.xml*.
>>>>>>>
>>>>>>> <property>
>>>>>>>   <name>fs.default.name</name>
>>>>>>>   <value>hdfs://localhost:54310</value>
>>>>>>> </property>
>>>>>>>
>>>>>>> [1] http://wiki.apache.org/hadoop/GettingStartedWithHadoop
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> 2013/9/27 Roman Shapovalov <[email protected]>
>>>>>>>
>>>>>>>> Edward,
>>>>>>>>
>>>>>>>>> I've added our own DistCacheUtils class
>>>>>>>>
>>>>>>>> I use the current version of it.
>>>>>>>>
>>>>>>>> I attach the current console output (do you call it the bspmaster
>>>>>>>> log?), with DEBUG logs. Is there a way to get even more verbose output?
>>>>>>>> May there be the problem with permissions for HDFS access?
>>>>>>>>
>>>>>>>> Roman
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Fri, Sep 27, 2013 at 7:22 AM, Edward J. Yoon <[email protected]>
>>>>>>>> wrote:
>>>>>>>>> If there's an HDFS connection error, you'll see the error logs in
>>>>>>>>> the bspmaster log.
>>>>>>>>>
>>>>>>>>> If it's not a connection error, ...
>>>>>>>>>
>>>>>>>>> It may be related to HDFS API usage. To fix the issue of
>>>>>>>>> compatibility with HDFS 2.0, I've added our own DistCacheUtils
>>>>>>>>> class[1] including setLocalFiles and addLocalFiles methods which set
>>>>>>>>> the cache configurations directly.
>>>>>>>>>
>>>>>>>>> 1. http://svn.apache.org/repos/asf/hama/trunk/core/src/main/java/org/apache/hama/util/DistCacheUtils.java
>>>>>>>>>
>>>>>>>>> On Fri, Sep 27, 2013 at 8:17 PM, Roman Shapovalov
>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>> It seems Streaming could not find the Python files, since it
>>>>>>>>>>> searched for them in the local file system.
>>>>>>>>>>
>>>>>>>>>> It works if I specify references to the local files. However, if I
>>>>>>>>>> set hdfs://localhost/ as a file system, I keep getting the connection
>>>>>>>>>> error. May the port number matter?
>>>>>>>>>>
>>>>>>>>>> Roman
>>>>>>>>>>
>>>>>>>>>> On Fri, Sep 27, 2013 at 6:55 AM, Roman Shapovalov
>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>> Martin,
>>>>>>>>>>>
>>>>>>>>>>>> then you haven't started hdfs?
>>>>>>>>>>>
>>>>>>>>>>> I have not started it manually, but it has been active:
>>>>>>>>>>>
>>>>>>>>>>> NameNode '0.0.0.0:8020' (active)
>>>>>>>>>>> Started:Wed Sep 25 18:54:42 EDT 2013
>>>>>>>>>>>
>>>>>>>>>>>> Your hdfs should contain the following files:
>>>>>>>>>>>
>>>>>>>>>>> It does.
>>>>>>>>>>>
>>>>>>>>>>>> Without the default file system in hama-site.xml, it will not work.
>>>>>>>>>>>
>>>>>>>>>>> Well, at least Hama (without streaming) worked, using the local file
>>>>>>>>>>> system.
>>>>>>>>>>> It seems Streaming could not find the Python files, since it
>>>>>>>>>>> searched for them in the local file system.
>>>>>>>>>>>
>>>>>>>>>>> Roman
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Sep 27, 2013 at 6:30 AM, Martin Illecker
>>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>>> Hi Roman,
>>>>>>>>>>>>
>>>>>>>>>>>> then you haven't started hdfs? (start-dfs.sh)
>>>>>>>>>>>>
>>>>>>>>>>>> Are you able to access the hdfs namenode?
>>>>>>>>>>>> http://localhost:50070/dfshealth.jsp
>>>>>>>>>>>>
>>>>>>>>>>>> Your hdfs should contain the following files:
>>>>>>>>>>>>
>>>>>>>>>>>> $ hadoop fs -ls /tmp/PyStreaming/
>>>>>>>>>>>> Found 8 items
>>>>>>>>>>>> -rw-r--r--  279 2013-09-27 12:19 /tmp/PyStreaming/BSP.py
>>>>>>>>>>>> -rw-r--r-- 5159 2013-09-27 12:19 /tmp/PyStreaming/BSPPeer.py
>>>>>>>>>>>> -rw-r--r--  379 2013-09-27 12:19 /tmp/PyStreaming/BSPRunner.py
>>>>>>>>>>>> -rw-r--r--  970 2013-09-27 12:19 /tmp/PyStreaming/BinaryProtocol.py
>>>>>>>>>>>> -rw-r--r--  299 2013-09-27 12:19 /tmp/PyStreaming/BspJobConfiguration.py
>>>>>>>>>>>> -rw-r--r--  557 2013-09-27 12:19 /tmp/PyStreaming/HelloWorldBSP.py
>>>>>>>>>>>> -rw-r--r-- 5570 2013-09-27 12:19 /tmp/PyStreaming/KMeansBSP.py
>>>>>>>>>>>> -rw-r--r--  326 2013-09-27 12:19 /tmp/PyStreaming/README
>>>>>>>>>>>>
>>>>>>>>>>>> Without the default file system in hama-site.xml, it will not work.
>>>>>>>>>>>>
>>>>>>>>>>>> Martin
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> 2013/9/27 Roman Shapovalov <[email protected]>
>>>>>>>>>>>>
>>>>>>>>>>>>> Martin,
>>>>>>>>>>>>>
>>>>>>>>>>>>> if I set default file system to hdfs://localhost/, I get the
>>>>>>>>>>>>> connection error:
>>>>>>>>>>>>>
>>>>>>>>>>>>> 13/09/27 14:04:11 INFO ipc.Client: Retrying connect to server:
>>>>>>>>>>>>> localhost/127.0.0.1:40000. Already tried 0 time(s); retry policy is
>>>>>>>>>>>>> RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
>>>>>>>>>>>>>
>>>>>>>>>>>>> (and 10 times like that, then I get a java.net.ConnectException).
>>>>>>>>>>>>>
>>>>>>>>>>>>> I attach the hama-site.xml (as it was before adding the default fs
>>>>>>>>>>>>> property). I had only added the bsp.master.address property to
>>>>>>>>>>>>> switch to the PDM.
>>>>>>>>>>>>> >>>>>>>>>>>>> Roman >>>>>>>>>>>>> >>>>>>>>>>>>> On Fri, Sep 27, 2013 at 4:20 AM, Martin Illecker >>>>>>>>>>>>> <[email protected] >>>>>>>>> >>>>>>>>>>>>> wrote: >>>>>>>>>>>>>> Hi Roman! >>>>>>>>>>>>>> >>>>>>>>>>>>>> Did you setup the default filesystem in hama-site.xml? >>>>>>>>>>>>>> >>>>>>>>>>>>>> Please submit your hama-site.xml configuration. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Martin >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> hama-site.xml - pseudo-distributed mode >>>>>>>>>>>>>> >>>>>>>>>>>>>> <configuration> >>>>>>>>>>>>>> >>>>>>>>>>>>>> <property> >>>>>>>>>>>>>> <name>bsp.master.address</name> >>>>>>>>>>>>>> <value>localhost:40000</value> >>>>>>>>>>>>>> <description>The address of the bsp master server. Either >>>>>>>> the >>>>>>>>>>>>>> literal string "local" or a host:port for distributed >>>>>>>> mode >>>>>>>>>>>>>> </description> >>>>>>>>>>>>>> </property> >>>>>>>>>>>>>> >>>>>>>>>>>>>> <property> >>>>>>>>>>>>>> <name>fs.default.name</name> >>>>>>>>>>>>>> <value>hdfs://localhost/</value> >>>>>>>>>>>>>> <description> >>>>>>>>>>>>>> The name of the default file system. Either the literal >>>>>>>>>>>>> string >>>>>>>>>>>>>> "local" or a host:port for HDFS. >>>>>>>>>>>>>> </description> >>>>>>>>>>>>>> </property> >>>>>>>>>>>>>> >>>>>>>>>>>>>> <property> >>>>>>>>>>>>>> <name>hama.zookeeper.quorum</name> >>>>>>>>>>>>>> <value>localhost</value> >>>>>>>>>>>>>> <description>Comma separated list of servers in the >>>>>>>> ZooKeeper >>>>>>>>>>>>> Quorum. >>>>>>>>>>>>>> For example, "host1.mydomain.com,host2.mydomain.com, >>>>>>>>>>>>> host3.mydomain.com". >>>>>>>>>>>>>> By default this is set to localhost for local and >>>>>>>>>>>>> pseudo-distributed modes >>>>>>>>>>>>>> of operation. For a fully-distributed setup, this >>>>>>>> should be >>>>>>>>>>>>> set to a full >>>>>>>>>>>>>> list of ZooKeeper quorum servers. 
If HAMA_MANAGES_ZK >>>>>>>> is set >>>>>>>>>>>>> in hama-env.sh >>>>>>>>>>>>>> this is the list of servers which we will start/stop >>>>>>>>>>>>> zookeeper on. >>>>>>>>>>>>>> </description> >>>>>>>>>>>>>> </property> >>>>>>>>>>>>>> >>>>>>>>>>>>>> </configuration> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Am 27.09.2013 um 09:32 schrieb Roman Shapovalov < >>>>>>>>>>>>> [email protected]>: >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Edward, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Yes, I did. See the logs in my previous message. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Roman >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On Fri, Sep 27, 2013 at 7:15 AM, Edward J. Yoon < >>>>>>>> [email protected]> >>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>> Have you tried to run in pseudo-distributed mode? >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On Fri, Sep 27, 2013 at 5:47 AM, Roman Shapovalov >>>>>>>>>>>>>>>> <[email protected]> wrote: >>>>>>>>>>>>>>>>> Martin, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Thanks for such verbose instructions. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> You can find all Hama configuration files in the *conf* >>>>>>>>>>>>>>>>>> folder. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> OK, I thought Edward meant Hadoop configs specifically. >>>>>>>>>>>>>>>>> I have only added JAVA_HOME variable there, otherwise they are >>>>>>>>>>>>> default. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> You should also find task logs in your *temp* folder. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> I found the folder, but there were no .log files in the >>>>>>>>>>>>>>>>> attempt* >>>>>>>>>>>>>>>>> folders (in both modes). >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Normally you should find it in *hama/logs/tasklogs*. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> They appear in the pseudo-distributed mode only (which also >>>>>>>> fails). >>>>>>>>>>>>>>>>> See the attached file. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> By the way do you have python3.2 installed? :-) >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Yes. 
"python" links to Python 2.6, but I pass "python3.2" as >>>>>>>>>>>>>>>>> an >>>>>>>>>>>>>>>>> interpreter, which links to the correct version. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Roman >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On Thu, Sep 26, 2013 at 4:03 PM, Martin Illecker < >>>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>>>>>>>> Hi Roman, >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> if you are running Hama in local mode, it will not use HDFS >>>>>>>> anyway. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> You can find all Hama configuration files in the *conf* >>>>>>>>>>>>>>>>>> folder. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> $ll hama/conf/ >>>>>>>>>>>>>>>>>> total 56 >>>>>>>>>>>>>>>>>> -rwxr-xr-x groomservers* >>>>>>>>>>>>>>>>>> -rwxr-xr-x hama-default.xml* >>>>>>>>>>>>>>>>>> -rwxr-xr-x hama-env.sh* >>>>>>>>>>>>>>>>>> -rwxr-xr-x hama-site.xml* >>>>>>>>>>>>>>>>>> -rwxr-xr-x log4j.properties* >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Probably you should setup the Pseudo Distributed Mode [1] in >>>>>>>>>>>>> hama-site.xml. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> But the task log would be very interesting. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Normally you should find it in *hama/logs/tasklogs*. >>>>>>>>>>>>>>>>>> e.g., >>>>>>>>>>>>> >>>>>>>> hama/logs/tasklogs/job_201309262134_0001/attempt_201309262134_0001_000000_0.log >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> You should also find task logs in your *temp* folder. >>>>>>>>>>>>>>>>>> But this location will depend on your operation system. >>>>>>>>>>>>>>>>>> e.g., in OSX >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>> /private/tmp/hadoop-YOURUSER/bsp/local/groomServer/attempt_201309262134_0001_000000_0/work/tasklogs/ >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> By the way do you have python3.2 installed? :-) >>>>>>>>>>>>>>>>>> $ python --version >>>>>>>>>>>>>>>>>> Python 3.2.5 >>>>>>>>>>>>>>>>>> $ python3.2 --version >>>>>>>>>>>>>>>>>> Python 3.2.5 >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> May I ask which operation system do you use? 
>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Martin >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> [1] >>>>>>>>>>>>> http://wiki.apache.org/hama/GettingStarted#Pseudo_Distributed_Mode >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> 2013/9/26 Roman Shapovalov <[email protected]> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Hi Edward, >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Could you please be more specific? (Sorry, I am new to this >>>>>>>> stuff) >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> I run Hama in local mode. The logs/ directory is empty, and >>>>>>>>>>>>>>>>>>> I >>>>>>>> did >>>>>>>>>>>>> not >>>>>>>>>>>>>>>>>>> find any logs in HDFS as well. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> And where can I find the Hadoop configuration? >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Thank you, >>>>>>>>>>>>>>>>>>> Roman >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> On Thu, Sep 26, 2013 at 12:05 PM, Edward J. Yoon < >>>>>>>>>>>>> [email protected]> >>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>> Hi, >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> That's strange. Can you attach your namenode logs and >>>>>>>>>>>>>>>>>>>> hadoop >>>>>>>>>>>>>>>>>>> configurations? >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> On Thu, Sep 26, 2013 at 11:03 PM, Roman Shapovalov >>>>>>>>>>>>>>>>>>>> <[email protected]> wrote: >>>>>>>>>>>>>>>>>>>>> Hi again, >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> I have updated both Hama (from the trunk) and Streaming >>>>>>>> (from >>>>>>>>>>>>> Martin's >>>>>>>>>>>>>>>>>>>>> github), and checked that patches have been applied, but I >>>>>>>> keep >>>>>>>>>>>>>>>>>>>>> getting the same error (full log for local configuration >>>>>>>>>>>>>>>>>>>>> is >>>>>>>>>>>>> attached). >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Another thing may be relevant: I keep the default Hadoop >>>>>>>>>>>>> libraries in >>>>>>>>>>>>>>>>>>>>> lib/. 
If I replace them as the tutorial says, some classes >>>>>>>> cannot >>>>>>>>>>>>> be >>>>>>>>>>>>>>>>>>>>> found even if I run pure Hama (which works perfectly with >>>>>>>> default >>>>>>>>>>>>>>>>>>>>> libs). I don't know if it is important. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>>> Roman >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> On Tue, Sep 24, 2013 at 9:22 AM, Martin Illecker < >>>>>>>>>>>>> [email protected]> >>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>>>> Hi Roman, >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> sorry for inconvenience! >>>>>>>>>>>>>>>>>>>>>> The problem has been reported [1] and will be fixed >>>>>>>> shortly to >>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>> trunk. >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> [1] https://issues.apache.org/jira/browse/HAMA-805 >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> 2013/9/23 Edward J. Yoon <[email protected]> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> This looks like a bug of DistCacheUtils. >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Thanks for your report. I'll look at it tomorrow. >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> On Mon, Sep 23, 2013 at 11:52 PM, Roman Shapovalov >>>>>>>>>>>>>>>>>>>>>>> <[email protected]> wrote: >>>>>>>>>>>>>>>>>>>>>>>> Hello all, >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> I try to use Hama Streaming. >>>>>>>>>>>>>>>>>>>>>>>> I have successfully installed Hama (the Pi example >>>>>>>> works). 
>>>>>>>>>>>>>>>>>>>>>>>> I follow this tutorial: >>>>>>>>>>>>>>>>>>>>>>>> http://wiki.apache.org/hama/HamaStreaming >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> When I try to run the distributed HelloWorld in the >>>>>>>>>>>>>>>>>>>>>>>> local >>>>>>>>>>>>>>>>>>>>>>>> configuration, I get the following error: >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> $ bin/hama pipes -streaming true -bspTasks 3 >>>>>>>>>>>>>>>>>>>>>>>> -interpreter >>>>>>>>>>>>> python3.2 >>>>>>>>>>>>>>>>>>>>>>>> -cachefiles /tmp/PyStreaming/*.py -output >>>>>>>> /tmp/pystream-out/ >>>>>>>>>>>>>>>>>>> -program >>>>>>>>>>>>>>>>>>>>>>>> /tmp/PyStreaming/BSPRunner.py -programArgs >>>>>>>>>>>>>>>>>>>>>>>> HelloWorldBSP >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> 13/09/23 18:03:50 INFO pipes.Submitter: Streaming >>>>>>>> enabled! >>>>>>>>>>>>>>>>>>>>>>>> 13/09/23 18:03:50 WARN util.NativeCodeLoader: Unable to >>>>>>>> load >>>>>>>>>>>>>>>>>>>>>>>> native-hadoop library for your platform... using >>>>>>>> builtin-java >>>>>>>>>>>>>>>>>>> classes >>>>>>>>>>>>>>>>>>>>>>>> where applicable >>>>>>>>>>>>>>>>>>>>>>>> 13/09/23 18:03:50 WARN bsp.BSPJobClient: No job jar >>>>>>>>>>>>>>>>>>>>>>>> file >>>>>>>> set. >>>>>>>>>>>>> User >>>>>>>>>>>>>>>>>>>>>>>> classes may not be found. See BSPJob#setJar(String) or >>>>>>>> check >>>>>>>>>>>>> Your >>>>>>>>>>>>>>>>>>> jar >>>>>>>>>>>>>>>>>>>>>>>> file. >>>>>>>>>>>>>>>>>>>>>>>> 13/09/23 18:03:50 INFO bsp.BSPJobClient: Running job: >>>>>>>>>>>>>>>>>>>>>>> job_localrunner_0001 >>>>>>>>>>>>>>>>>>>>>>>> 13/09/23 18:03:50 INFO bsp.LocalBSPRunner: Setting up a >>>>>>>> new >>>>>>>>>>>>> barrier >>>>>>>>>>>>>>>>>>> for >>>>>>>>>>>>>>>>>>>>>>> 3 tasks! >>>>>>>>>>>>>>>>>>>>>>>> 13/09/23 18:03:50 ERROR bsp.LocalBSPRunner: Exception >>>>>>>> during >>>>>>>>>>>>> BSP >>>>>>>>>>>>>>>>>>>>>>> execution! 
>>>>>>>>>>>>>>>>>>>>>>>> java.lang.NullPointerException >>>>>>>>>>>>>>>>>>>>>>>> at >>>>>>>> org.apache.hama.pipes.PipesBSP.setup(PipesBSP.java:44) >>>>>>>>>>>>>>>>>>>>>>>> at >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>> org.apache.hama.bsp.LocalBSPRunner$BSPRunner.run(LocalBSPRunner.java:255) >>>>>>>>>>>>>>>>>>>>>>>> at >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>> org.apache.hama.bsp.LocalBSPRunner$BSPRunner.call(LocalBSPRunner.java:286) >>>>>>>>>>>>>>>>>>>>>>>> at >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>> org.apache.hama.bsp.LocalBSPRunner$BSPRunner.call(LocalBSPRunner.java:211) >>>>>>>>>>>>>>>>>>>>>>>> at >>>>>>>>>>>>>>>>>>> >>>>>>>> java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) >>>>>>>>>>>>>>>>>>>>>>>> at >>>>>>>> java.util.concurrent.FutureTask.run(FutureTask.java:138) >>>>>>>>>>>>>>>>>>>>>>>> at >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441) >>>>>>>>>>>>>>>>>>>>>>>> at >>>>>>>>>>>>>>>>>>> >>>>>>>> java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) >>>>>>>>>>>>>>>>>>>>>>>> at >>>>>>>> java.util.concurrent.FutureTask.run(FutureTask.java:138) >>>>>>>>>>>>>>>>>>>>>>>> at >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) >>>>>>>>>>>>>>>>>>>>>>>> at >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) >>>>>>>>>>>>>>>>>>>>>>>> at java.lang.Thread.run(Thread.java:662) >>>>>>>>>>>>>>>>>>>>>>>> [output cropped] >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> When I turn to the pseudo-distributed mode, job fails >>>>>>>>>>>>>>>>>>>>>>>> too >>>>>>>>>>>>> (after a >>>>>>>>>>>>>>>>>>>>>>>> minute of execution): >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> 13/09/23 18:46:34 INFO 
pipes.Submitter: Streaming >>>>>>>> enabled! >>>>>>>>>>>>>>>>>>>>>>>> 13/09/23 18:46:34 WARN util.NativeCodeLoader: Unable to >>>>>>>> load >>>>>>>>>>>>>>>>>>>>>>>> native-hadoop library for your platform... using >>>>>>>> builtin-java >>>>>>>>>>>>>>>>>>> classes >>>>>>>>>>>>>>>>>>>>>>>> where applicable >>>>>>>>>>>>>>>>>>>>>>>> 13/09/23 18:46:34 WARN bsp.BSPJobClient: No job jar >>>>>>>>>>>>>>>>>>>>>>>> file >>>>>>>> set. >>>>>>>>>>>>> User >>>>>>>>>>>>>>>>>>>>>>>> classes may not be found. See BSPJob#setJar(String) or >>>>>>>> check >>>>>>>>>>>>> Your >>>>>>>>>>>>>>>>>>> jar >>>>>>>>>>>>>>>>>>>>>>>> file. >>>>>>>>>>>>>>>>>>>>>>>> 13/09/23 18:46:34 INFO bsp.BSPJobClient: Running job: >>>>>>>>>>>>>>>>>>>>>>> job_201309231846_0001 >>>>>>>>>>>>>>>>>>>>>>>> 13/09/23 18:47:40 INFO bsp.BSPJobClient: Job failed. >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> Task log contains errors: >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> 13/09/23 18:46:37 INFO ipc.Server: Starting Socket >>>>>>>> Reader #1 >>>>>>>>>>>>> for >>>>>>>>>>>>>>>>>>> port >>>>>>>>>>>>>>>>>>>>>>> 43475 >>>>>>>>>>>>>>>>>>>>>>>> 13/09/23 18:46:37 INFO ipc.Server: IPC Server >>>>>>>>>>>>>>>>>>>>>>>> Responder: >>>>>>>>>>>>> starting >>>>>>>>>>>>>>>>>>>>>>>> 13/09/23 18:46:37 INFO ipc.Server: IPC Server listener >>>>>>>>>>>>>>>>>>>>>>>> on >>>>>>>>>>>>> 43475: >>>>>>>>>>>>>>>>>>> starting >>>>>>>>>>>>>>>>>>>>>>>> 13/09/23 18:46:37 INFO >>>>>>>>>>>>>>>>>>>>>>>> message.HadoopMessageManagerImpl: >>>>>>>>>>>>> BSPPeer >>>>>>>>>>>>>>>>>>>>>>>> address:localhost.localdomain port:43475 >>>>>>>>>>>>>>>>>>>>>>>> 13/09/23 18:46:37 INFO ipc.Server: IPC Server handler 0 >>>>>>>> on >>>>>>>>>>>>> 43475: >>>>>>>>>>>>>>>>>>>>>>> starting >>>>>>>>>>>>>>>>>>>>>>>> 13/09/23 18:46:37 WARN util.NativeCodeLoader: Unable to >>>>>>>> load >>>>>>>>>>>>>>>>>>>>>>>> native-hadoop library for your platform... 
using >>>>>>>> builtin-java >>>>>>>>>>>>>>>>>>> classes >>>>>>>>>>>>>>>>>>>>>>>> where applicable >>>>>>>>>>>>>>>>>>>>>>>> 13/09/23 18:46:37 INFO sync.ZKSyncClient: Initializing >>>>>>>> ZK Sync >>>>>>>>>>>>>>>>>>> Client >>>>>>>>>>>>>>>>>>>>>>>> 13/09/23 18:46:37 INFO sync.ZooKeeperSyncClientImpl: >>>>>>>> Start >>>>>>>>>>>>>>>>>>> connecting >>>>>>>>>>>>>>>>>>>>>>>> to Zookeeper! At localhost.localdomain/127.0.0.1:43475 >>>>>>>>>>>>>>>>>>>>>>>> 13/09/23 18:46:37 ERROR bsp.BSPTask: Error running bsp >>>>>>>> setup >>>>>>>>>>>>> and bsp >>>>>>>>>>>>>>>>>>>>>>> function. >>>>>>>>>>>>>>>>>>>>>>>> java.lang.NullPointerException >>>>>>>>>>>>>>>>>>>>>>>> at java.io.File.<init>(File.java:222) >>>>>>>>>>>>>>>>>>>>>>>> at >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>> org.apache.hama.pipes.PipesApplication.setupCommand(PipesApplication.java:130) >>>>>>>>>>>>>>>>>>>>>>>> at >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>> org.apache.hama.pipes.PipesApplication.start(PipesApplication.java:257) >>>>>>>>>>>>>>>>>>>>>>>> at >>>>>>>> org.apache.hama.pipes.PipesBSP.setup(PipesBSP.java:44) >>>>>>>>>>>>>>>>>>>>>>>> at >>>>>>>> org.apache.hama.bsp.BSPTask.runBSP(BSPTask.java:176) >>>>>>>>>>>>>>>>>>>>>>>> at org.apache.hama.bsp.BSPTask.run(BSPTask.java:146) >>>>>>>>>>>>>>>>>>>>>>>> at >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>> org.apache.hama.bsp.GroomServer$BSPPeerChild.main(GroomServer.java:1246) >>>>>>>>>>>>>>>>>>>>>>>> [output cropped] >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> I use the latest trunk version of Hama, Python 3.2.5 >>>>>>>>>>>>>>>>>>>>>>>> and >>>>>>>> Hadoop >>>>>>>>>>>>>>>>>>>>>>> 2.0.0-cdh4.1.1. >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> Please help me to figure out the problem. 
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Thanks in advance,
>>>>>>>>>>>>>>>>>>>>>>>> Roman

--
Best Regards, Edward J. Yoon
@eddieyoon
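
For anyone reproducing the fix from this thread: the last blocker was missing Hadoop client jars in Hama's lib/ folder. Below is a minimal Python sketch, not part of Hama, that reports which of the jars Roman listed are absent from a given directory. The jar names are copied from his message and are specific to CDH 4.1.1. The function name `missing_jars` and the default directory "lib" are assumptions made for illustration.

```python
from pathlib import Path
import sys

# Jar names as listed in Roman's final message (Hadoop 2.0.0-cdh4.1.1).
# Other Hadoop versions will need a different list.
REQUIRED_JARS = [
    "protobuf-java-2.4.0a.jar",
    "guava-11.0.2.jar",
    "hadoop-auth-2.0.0-cdh4.1.1.jar",
    "hadoop-common-2.0.0-cdh4.1.1-tests.jar",
    "hadoop-common-2.0.0-cdh4.1.1.jar",
    "hadoop-core-2.0.0-mr1-cdh4.1.1.jar",
    "hadoop-hdfs-2.0.0-cdh4.1.1-tests.jar",
    "hadoop-hdfs-2.0.0-cdh4.1.1.jar",
    "hadoop-test-2.0.0-mr1-cdh4.1.1.jar",
]


def missing_jars(lib_dir, required=REQUIRED_JARS):
    """Return the required jar names that are not present in lib_dir.

    A directory that does not exist is treated as empty, so every
    required jar is reported as missing.
    """
    present = {p.name for p in Path(lib_dir).glob("*.jar")}
    return [name for name in required if name not in present]


if __name__ == "__main__":
    # Hypothetical default: run from the Hama installation directory.
    lib_dir = sys.argv[1] if len(sys.argv) > 1 else "lib"
    missing = missing_jars(lib_dir)
    if missing:
        print("Missing from %s:" % lib_dir)
        for name in missing:
            print("  " + name)
    else:
        print("All listed jars are present in %s" % lib_dir)
```

Run it with the path to Hama's lib/ folder as the only argument. It only checks file names; it does not verify that the jar versions match the running HDFS, which is what the "Server IPC version 7 cannot communicate with client version 4" error earlier in the thread points to.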
