Hi Telles,

It looks correct. Did you put hdfs-site.xml into your HADOOP_CONF_DIR (such as ~/.samza/conf)?
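For example, something along these lines should make the cluster configs visible to the Samza scripts (a sketch; the paths are examples, adjust to your layout):

    # point the job runner at the directory holding your cluster configs
    export HADOOP_CONF_DIR=$HOME/.samza/conf
    cp /etc/hadoop/conf/core-site.xml /etc/hadoop/conf/hdfs-site.xml $HOME/.samza/conf/

"No FileSystem for scheme: hdfs" usually means the HDFS configs (or the hadoop-hdfs jar) are not on the classpath when run-job.sh submits the application.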
Fang, Yan
[email protected]
+1 (206) 849-4108

On Mon, Aug 11, 2014 at 1:02 PM, Telles Nobrega <[email protected]> wrote:

> Hi Yan Fang,
>
> I was able to deploy the file to HDFS, and I can see it on all my nodes, but
> when I tried running I got this error:
>
> Exception in thread "main" java.io.IOException: No FileSystem for scheme: hdfs
>         at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2421)
>         at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2428)
>         at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:88)
>         at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2467)
>         at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2449)
>         at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:367)
>         at org.apache.hadoop.fs.Path.getFileSystem(Path.java:287)
>         at org.apache.samza.job.yarn.ClientHelper.submitApplication(ClientHelper.scala:111)
>         at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:55)
>         at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:48)
>         at org.apache.samza.job.JobRunner.run(JobRunner.scala:62)
>         at org.apache.samza.job.JobRunner$.main(JobRunner.scala:37)
>         at org.apache.samza.job.JobRunner.main(JobRunner.scala)
>
> This is my yarn.package.path config:
>
> yarn.package.path=hdfs://telles-master-samza:50070/samza-job-package-0.7.0-dist.tar.gz
>
> Thanks in advance
>
> On Mon, Aug 11, 2014 at 3:00 PM, Yan Fang <[email protected]> wrote:
>
> > Hi Telles,
> >
> > In terms of "I tried pushing the tar file to HDFS but I got an error from
> > hadoop saying that it couldn't find the core-site.xml file", I guess you set
> > the HADOOP_CONF_DIR variable and made it point to ~/.samza/conf. You can
> > either 1) make the HADOOP_CONF_DIR point to the directory where your conf
> > files are, such as /etc/hadoop/conf, or 2) copy the config files to
> > ~/.samza/conf. Thank you,
> >
> > Cheers,
> >
> > Fang, Yan
> > [email protected]
> > +1 (206) 849-4108
> >
> > On Mon, Aug 11, 2014 at 7:40 AM, Chris Riccomini <
> > [email protected]> wrote:
> >
> > > Hey Telles,
> > >
> > > To get YARN working with the HTTP file system, you need to follow the
> > > instructions at:
> > >
> > > http://samza.incubator.apache.org/learn/tutorials/0.7.0/run-in-multi-node-yarn.html
> > >
> > > in the "Set Up Http Filesystem for YARN" section.
> > >
> > > You shouldn't need to compile anything (no Gradle, which is what your
> > > stack trace is showing). This setup should be done on all of the NMs,
> > > since they will be the ones downloading your job's package (from
> > > yarn.package.path).
> > >
> > > Cheers,
> > > Chris
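(For reference, the "Set Up Http Filesystem for YARN" step in that tutorial amounts to registering Samza's HttpFileSystem in each NM's core-site.xml, roughly like the sketch below; the tutorial also lists the scala and samza-yarn jars you need to drop into YARN's classpath, which is what the ClassNotFoundException further down this thread points to:

    <property>
      <name>fs.http.impl</name>
      <value>org.apache.samza.util.hadoop.HttpFileSystem</value>
    </property>

)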
> > > On 8/9/14 9:44 PM, "Telles Nobrega" <[email protected]> wrote:
> > >
> > > > Hi again, I tried installing the scala libs but the Http problem still
> > > > occurs. I realised that I need to compile incubator-samza on the machines
> > > > where I'm going to run the jobs, but the compilation fails with this huge
> > > > message:
> > > >
> > > > #
> > > > # There is insufficient memory for the Java Runtime Environment to continue.
> > > > # Native memory allocation (malloc) failed to allocate 3946053632 bytes for committing reserved memory.
> > > > # An error report file with more information is saved as:
> > > > # /home/ubuntu/incubator-samza/samza-kafka/hs_err_pid2506.log
> > > > Could not write standard input into: Gradle Worker 13.
> > > > java.io.IOException: Broken pipe
> > > >         at java.io.FileOutputStream.writeBytes(Native Method)
> > > >         at java.io.FileOutputStream.write(FileOutputStream.java:345)
> > > >         at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
> > > >         at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
> > > >         at org.gradle.process.internal.streams.ExecOutputHandleRunner.run(ExecOutputHandleRunner.java:53)
> > > >         at org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecutorImpl$1.run(DefaultExecutorFactory.java:66)
> > > >         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> > > >         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> > > >         at java.lang.Thread.run(Thread.java:744)
> > > > Process 'Gradle Worker 13' finished with non-zero exit value 1
> > > > org.gradle.process.internal.ExecException: Process 'Gradle Worker 13' finished with non-zero exit value 1
> > > >         at org.gradle.process.internal.DefaultExecHandle$ExecResultImpl.assertNormalExitValue(DefaultExecHandle.java:362)
> > > >         at org.gradle.process.internal.DefaultWorkerProcess.onProcessStop(DefaultWorkerProcess.java:89)
> > > >         at org.gradle.process.internal.DefaultWorkerProcess.access$000(DefaultWorkerProcess.java:33)
> > > >         at org.gradle.process.internal.DefaultWorkerProcess$1.executionFinished(DefaultWorkerProcess.java:55)
> > > >         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > > >         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> > > >         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > > >         at java.lang.reflect.Method.invoke(Method.java:606)
> > > >         at org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
> > > >         at org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
> > > >         at org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:81)
> > > >         at org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:30)
> > > >         at org.gradle.messaging.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93)
> > > >         at com.sun.proxy.$Proxy46.executionFinished(Unknown Source)
> > > >         at org.gradle.process.internal.DefaultExecHandle.setEndStateInfo(DefaultExecHandle.java:212)
> > > >         at org.gradle.process.internal.DefaultExecHandle.finished(DefaultExecHandle.java:309)
> > > >         at org.gradle.process.internal.ExecHandleRunner.completed(ExecHandleRunner.java:108)
> > > >         at org.gradle.process.internal.ExecHandleRunner.run(ExecHandleRunner.java:88)
> > > >         at org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecutorImpl$1.run(DefaultExecutorFactory.java:66)
> > > >         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> > > >         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> > > >         at java.lang.Thread.run(Thread.java:744)
> > > > OpenJDK 64-Bit Server VM warning: INFO: os::commit_memory(0x000000070a6c0000, 3946053632, 0) failed; error='Cannot allocate memory' (errno=12)
> > > > #
> > > > # There is insufficient memory for the Java Runtime Environment to continue.
> > > > # Native memory allocation (malloc) failed to allocate 3946053632 bytes for committing reserved memory.
> > > > # An error report file with more information is saved as:
> > > > # /home/ubuntu/incubator-samza/samza-kafka/hs_err_pid2518.log
> > > > Could not write standard input into: Gradle Worker 14.
> > > > java.io.IOException: Broken pipe
> > > >         at java.io.FileOutputStream.writeBytes(Native Method)
> > > >         at java.io.FileOutputStream.write(FileOutputStream.java:345)
> > > >         at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
> > > >         at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
> > > >         at org.gradle.process.internal.streams.ExecOutputHandleRunner.run(ExecOutputHandleRunner.java:53)
> > > >         at org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecutorImpl$1.run(DefaultExecutorFactory.java:66)
> > > >         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> > > >         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> > > >         at java.lang.Thread.run(Thread.java:744)
> > > > Process 'Gradle Worker 14' finished with non-zero exit value 1
> > > > org.gradle.process.internal.ExecException: Process 'Gradle Worker 14' finished with non-zero exit value 1
> > > >         at org.gradle.process.internal.DefaultExecHandle$ExecResultImpl.assertNormalExitValue(DefaultExecHandle.java:362)
> > > >         at org.gradle.process.internal.DefaultWorkerProcess.onProcessStop(DefaultWorkerProcess.java:89)
> > > >         at org.gradle.process.internal.DefaultWorkerProcess.access$000(DefaultWorkerProcess.java:33)
> > > >         at org.gradle.process.internal.DefaultWorkerProcess$1.executionFinished(DefaultWorkerProcess.java:55)
> > > >         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > > >         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> > > >         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > > >         at java.lang.reflect.Method.invoke(Method.java:606)
> > > >         at org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
> > > >         at org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
> > > >         at org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:81)
> > > >         at org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:30)
> > > >         at org.gradle.messaging.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93)
> > > >         at com.sun.proxy.$Proxy46.executionFinished(Unknown Source)
> > > >         at org.gradle.process.internal.DefaultExecHandle.setEndStateInfo(DefaultExecHandle.java:212)
> > > >         at org.gradle.process.internal.DefaultExecHandle.finished(DefaultExecHandle.java:309)
> > > >         at org.gradle.process.internal.ExecHandleRunner.completed(ExecHandleRunner.java:108)
> > > >         at org.gradle.process.internal.ExecHandleRunner.run(ExecHandleRunner.java:88)
> > > >         at org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecutorImpl$1.run(DefaultExecutorFactory.java:66)
> > > >         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> > > >         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> > > >         at java.lang.Thread.r
> > > >
> > > > Do I need more memory on my machines? Each already has 4GB. I really
> > > > need to have this running. I'm not sure which way is best, http or hdfs;
> > > > which one do you suggest, and how can I solve my problem in each case?
> > > >
> > > > Thanks in advance and sorry for bothering this much.
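(A note on the build failure above: each Gradle test worker is trying to reserve a roughly 4 GB heap, which a 4 GB box cannot grant. If you do end up compiling, capping the JVM sizes should get past this; a sketch, assuming the stock Gradle build, so the exact settings may differ:

    # cap the Gradle JVM itself
    export GRADLE_OPTS="-Xmx512m"
    # the forked test workers get their own JVMs; cap them in build.gradle:
    #   tasks.withType(Test) { maxHeapSize = '1g' }
    ./gradlew clean build

But as Chris notes above, you shouldn't need to compile anything on the cluster at all.)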
> > > > On 10 Aug 2014, at 00:20, Telles Nobrega <[email protected]> wrote:
> > > >
> > > >> Hi Chris, now I have the tar file on my RM machine, and the yarn path
> > > >> points to it. I changed the core-site.xml to use HttpFileSystem instead
> > > >> of HDFS, and now it is failing with:
> > > >>
> > > >> Application application_1407640485281_0001 failed 2 times due to AM
> > > >> Container for appattempt_1407640485281_0001_000002 exited with
> > > >> exitCode: -1000 due to: java.lang.ClassNotFoundException: Class
> > > >> org.apache.samza.util.hadoop.HttpFileSystem not found
> > > >>
> > > >> I think I can solve this just by installing the scala files from the samza
> > > >> tutorial, can you confirm that?
> > > >>
> > > >> On 09 Aug 2014, at 08:34, Telles Nobrega <[email protected]> wrote:
> > > >>
> > > >>> Hi Chris,
> > > >>>
> > > >>> I think the problem is that I forgot to update the yarn.job.package.
> > > >>> I will try again to see if it works now.
> > > >>>
> > > >>> I have one more question: how can I stop (from the command line) the jobs
> > > >>> running in my topology? For the experiment that I will run, I need to
> > > >>> run the same job at 4-minute intervals, so I need to kill it, clean
> > > >>> the kafka topics, and rerun.
> > > >>>
> > > >>> Thanks in advance.
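(On stopping jobs from the command line: the usual route is to kill the YARN application and then remove the job's topics; a sketch, assuming stock YARN tooling and a Kafka version whose topic deletion works, with names and IDs below as examples:

    # find the application id, then kill it
    yarn application -list
    yarn application -kill application_1407640485281_0001

    # delete the job's topics (requires delete.topic.enable=true on the brokers)
    bin/kafka-topics.sh --zookeeper zk-host:2181 --delete --topic my-input-topic

)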
> > > >>> On 08 Aug 2014, at 12:41, Chris Riccomini
> > > >>> <[email protected]> wrote:
> > > >>>
> > > >>>> Hey Telles,
> > > >>>>
> > > >>>>>> Do I need to have the job folder on each machine in my cluster?
> > > >>>>
> > > >>>> No, you should not need to do this. There are two ways to deploy your
> > > >>>> tarball to the YARN grid. One is to put it in HDFS, and the other is to
> > > >>>> put it on an HTTP server. The link to running a Samza job in a multi-node
> > > >>>> YARN cluster describes how to do both (either HTTP server or HDFS).
> > > >>>>
> > > >>>> In both cases, once the tarball is put on the HTTP/HDFS server(s), you
> > > >>>> must update yarn.package.path to point to it. From there, the YARN NM
> > > >>>> should download it for you automatically when you start your job.
> > > >>>>
> > > >>>> * Can you send along a paste of your job config?
> > > >>>>
> > > >>>> Cheers,
> > > >>>> Chris
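(Concretely, the HDFS route amounts to something like this; a sketch where host, port, and paths are examples, and note the URL must use the NameNode's RPC port, not the web UI port:

    # upload the job package to HDFS
    hdfs dfs -put ./samza-job-package/target/samza-job-package-0.7.0-dist.tar.gz /samza/

    # then point the job config at it:
    #   yarn.package.path=hdfs://namenode-host:8020/samza/samza-job-package-0.7.0-dist.tar.gz

)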
> > > >>>> On 8/8/14 8:04 AM, "Claudio Martins" <[email protected]> wrote:
> > > >>>>
> > > >>>>> Hi Telles, it looks to me like you forgot to update the
> > > >>>>> "yarn.package.path" attribute in your config file for the task.
> > > >>>>>
> > > >>>>> - Claudio Martins
> > > >>>>> Head of Engineering
> > > >>>>> MobileAware USA Inc. / www.mobileaware.com
> > > >>>>> office: +1 617 986 5060 / mobile: +1 617 480 5288
> > > >>>>> linkedin: www.linkedin.com/in/martinsclaudio
> > > >>>>>
> > > >>>>> On Fri, Aug 8, 2014 at 10:55 AM, Telles Nobrega <[email protected]> wrote:
> > > >>>>>
> > > >>>>>> Hi,
> > > >>>>>>
> > > >>>>>> this is my first time trying to run a job in a multi-node
> > > >>>>>> environment. I have the cluster set up, and I can see in the GUI that
> > > >>>>>> all nodes are working. Do I need to have the job folder on each
> > > >>>>>> machine in my cluster?
> > > >>>>>>
> > > >>>>>> The first time, I tried running with the job on the namenode machine,
> > > >>>>>> and it failed, saying:
> > > >>>>>>
> > > >>>>>> Application application_1407509228798_0001 failed 2 times due to AM
> > > >>>>>> Container for appattempt_1407509228798_0001_000002 exited with exitCode:
> > > >>>>>> -1000 due to: File
> > > >>>>>> file:/home/ubuntu/alarm-samza/samza-job-package/target/samza-job-package-0.7.0-dist.tar.gz
> > > >>>>>> does not exist
> > > >>>>>>
> > > >>>>>> So I copied the folder to each machine in my cluster and got this error:
> > > >>>>>>
> > > >>>>>> Application application_1407509228798_0002 failed 2 times due to AM
> > > >>>>>> Container for appattempt_1407509228798_0002_000002 exited with exitCode:
> > > >>>>>> -1000 due to: Resource
> > > >>>>>> file:/home/ubuntu/alarm-samza/samza-job-package/target/samza-job-package-0.7.0-dist.tar.gz
> > > >>>>>> changed on src filesystem (expected 1407509168000, was 1407509434000)
> > > >>>>>>
> > > >>>>>> What am I missing?
> > > >>>>>>
> > > >>>>>> p.s.: I followed this tutorial
> > > >>>>>> <https://github.com/yahoo/samoa/wiki/Executing-SAMOA-with-Apache-Samza>
> > > >>>>>> and this one
> > > >>>>>> <http://samza.incubator.apache.org/learn/tutorials/0.7.0/run-in-multi-node-yarn.html>
> > > >>>>>> to set up the cluster.
> > > >>>>>>
> > > >>>>>> Help is much appreciated.
> > > >>>>>>
> > > >>>>>> Thanks in advance.
> > > >>>>>>
> > > >>>>>> --
> > > >>>>>> ------------------------------------------
> > > >>>>>> Telles Mota Vidal Nobrega
> > > >>>>>> M.sc. Candidate at UFCG
> > > >>>>>> B.sc. in Computer Science at UFCG
> > > >>>>>> Software Engineer at OpenStack Project - HP/LSD-UFCG
>
> --
> ------------------------------------------
> Telles Mota Vidal Nobrega
> M.sc. Candidate at UFCG
> B.sc. in Computer Science at UFCG
> Software Engineer at OpenStack Project - HP/LSD-UFCG
