Hi Telles,

Regarding "*I tried pushing the tar file to HDFS but I got an error from hadoop saying that it couldn't find core-site.xml file*": I guess you set the HADOOP_CONF_DIR variable and made it point to ~/.samza/conf. You can either 1) make HADOOP_CONF_DIR point to the directory where your conf files actually are, such as /etc/hadoop/conf, or 2) copy the config files into ~/.samza/conf.
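A minimal sketch of both options, assuming a stock Hadoop layout under /etc/hadoop/conf (adjust the paths to your install):

    # Option 1: point HADOOP_CONF_DIR at the directory that already holds
    # core-site.xml, hdfs-site.xml, etc.
    export HADOOP_CONF_DIR=/etc/hadoop/conf

    # Option 2: leave HADOOP_CONF_DIR at ~/.samza/conf and copy the configs in
    mkdir -p ~/.samza/conf
    cp /etc/hadoop/conf/core-site.xml /etc/hadoop/conf/hdfs-site.xml ~/.samza/conf/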
Thank you.

Cheers,
Fang, Yan
[email protected]
+1 (206) 849-4108

On Mon, Aug 11, 2014 at 7:40 AM, Chris Riccomini <[email protected]> wrote:

> Hey Telles,
>
> To get YARN working with the HTTP file system, you need to follow the
> instructions on:
>
> http://samza.incubator.apache.org/learn/tutorials/0.7.0/run-in-multi-node-yarn.html
>
> in the "Set Up Http Filesystem for YARN" section.
>
> You shouldn't need to compile anything (no Gradle, which is what your
> stack trace is showing). This setup should be done for all of the NMs,
> since they will be the ones downloading your job's package (from
> yarn.package.path).
>
> Cheers,
> Chris
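(For reference, the "Set Up Http Filesystem for YARN" step Chris mentions boils down to two things on every NM: registering the http:// scheme in core-site.xml, roughly as sketched below, and putting the jar that provides HttpFileSystem, plus the Scala library it depends on, on the YARN classpath. This is a sketch; the linked tutorial is authoritative.)

    <!-- core-site.xml on each NodeManager -->
    <property>
      <name>fs.http.impl</name>
      <value>org.apache.samza.util.hadoop.HttpFileSystem</value>
    </property>

The missing-classpath half of that setup is exactly what the ClassNotFoundException further down this thread points at.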
> On 8/9/14 9:44 PM, "Telles Nobrega" <[email protected]> wrote:
>
> >Hi again. I tried installing the Scala libs but the HTTP problem still
> >occurs. I realised that I need to compile incubator-samza on the machines
> >where I'm going to run the jobs, but the compilation fails with this huge
> >message:
> >
> >#
> ># There is insufficient memory for the Java Runtime Environment to continue.
> ># Native memory allocation (malloc) failed to allocate 3946053632 bytes for committing reserved memory.
> ># An error report file with more information is saved as:
> ># /home/ubuntu/incubator-samza/samza-kafka/hs_err_pid2506.log
> >Could not write standard input into: Gradle Worker 13.
> >java.io.IOException: Broken pipe
> >    at java.io.FileOutputStream.writeBytes(Native Method)
> >    at java.io.FileOutputStream.write(FileOutputStream.java:345)
> >    at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
> >    at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
> >    at org.gradle.process.internal.streams.ExecOutputHandleRunner.run(ExecOutputHandleRunner.java:53)
> >    at org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecutorImpl$1.run(DefaultExecutorFactory.java:66)
> >    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> >    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> >    at java.lang.Thread.run(Thread.java:744)
> >Process 'Gradle Worker 13' finished with non-zero exit value 1
> >org.gradle.process.internal.ExecException: Process 'Gradle Worker 13' finished with non-zero exit value 1
> >    at org.gradle.process.internal.DefaultExecHandle$ExecResultImpl.assertNormalExitValue(DefaultExecHandle.java:362)
> >    at org.gradle.process.internal.DefaultWorkerProcess.onProcessStop(DefaultWorkerProcess.java:89)
> >    at org.gradle.process.internal.DefaultWorkerProcess.access$000(DefaultWorkerProcess.java:33)
> >    at org.gradle.process.internal.DefaultWorkerProcess$1.executionFinished(DefaultWorkerProcess.java:55)
> >    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> >    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> >    at java.lang.reflect.Method.invoke(Method.java:606)
> >    at org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
> >    at org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
> >    at org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:81)
> >    at org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:30)
> >    at org.gradle.messaging.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93)
> >    at com.sun.proxy.$Proxy46.executionFinished(Unknown Source)
> >    at org.gradle.process.internal.DefaultExecHandle.setEndStateInfo(DefaultExecHandle.java:212)
> >    at org.gradle.process.internal.DefaultExecHandle.finished(DefaultExecHandle.java:309)
> >    at org.gradle.process.internal.ExecHandleRunner.completed(ExecHandleRunner.java:108)
> >    at org.gradle.process.internal.ExecHandleRunner.run(ExecHandleRunner.java:88)
> >    at org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecutorImpl$1.run(DefaultExecutorFactory.java:66)
> >    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> >    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> >    at java.lang.Thread.run(Thread.java:744)
> >OpenJDK 64-Bit Server VM warning: INFO: os::commit_memory(0x000000070a6c0000, 3946053632, 0) failed; error='Cannot allocate memory' (errno=12)
> >#
> ># There is insufficient memory for the Java Runtime Environment to continue.
> ># Native memory allocation (malloc) failed to allocate 3946053632 bytes for committing reserved memory.
> ># An error report file with more information is saved as:
> ># /home/ubuntu/incubator-samza/samza-kafka/hs_err_pid2518.log
> >Could not write standard input into: Gradle Worker 14.
> >java.io.IOException: Broken pipe
> >    [stack trace identical to Gradle Worker 13's above]
> >Process 'Gradle Worker 14' finished with non-zero exit value 1
> >org.gradle.process.internal.ExecException: Process 'Gradle Worker 14' finished with non-zero exit value 1
> >    [stack trace identical to the ExecException trace above]
> >
> >Do I need more memory on my machines? Each already has 4 GB. I really
> >need to have this running. I'm not sure which way is best, HTTP or HDFS;
> >which one do you suggest, and how can I solve my problem in each case?
> >
> >Thanks in advance, and sorry for bothering you this much.
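(On the HTTP-versus-HDFS question: both are supported, and the choice mostly shows up as the URL scheme in yarn.package.path. A sketch of the two variants; the hosts, ports, and paths are placeholders:)

    # HTTP: any web server that can serve the tarball; needs the
    # HttpFileSystem setup from the tutorial on every NM
    yarn.package.path=http://web-host.example:8000/samza-job-package-0.7.0-dist.tar.gz

    # HDFS: upload the tarball first (e.g. hadoop fs -put), and make sure the
    # Hadoop confs are visible, per the HADOOP_CONF_DIR note at the top
    yarn.package.path=hdfs://namenode.example:8020/samza/samza-job-package-0.7.0-dist.tar.gz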
> >On 10 Aug 2014, at 00:20, Telles Nobrega <[email protected]> wrote:
> >
> >> Hi Chris, I now have the tar file on my RM machine, and the yarn path
> >> points to it. I changed core-site.xml to use HttpFileSystem instead of
> >> HDFS, and now it fails with:
> >>
> >> Application application_1407640485281_0001 failed 2 times due to AM
> >> Container for appattempt_1407640485281_0001_000002 exited with
> >> exitCode: -1000 due to: java.lang.ClassNotFoundException: Class
> >> org.apache.samza.util.hadoop.HttpFileSystem not found
> >>
> >> I think I can solve this just by installing the Scala files from the
> >> Samza tutorial; can you confirm that?
> >>
> >> On 09 Aug 2014, at 08:34, Telles Nobrega <[email protected]> wrote:
> >>
> >>> Hi Chris,
> >>>
> >>> I think the problem is that I forgot to update the yarn.job.package.
> >>> I will try again to see if it works now.
> >>>
> >>> I have one more question: how can I stop (from the command line) the
> >>> jobs running in my topology? For the experiment I will run, I need to
> >>> run the same job at 4-minute intervals, so I need to kill it, clean
> >>> the Kafka topics, and rerun.
> >>>
> >>> Thanks in advance.
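(A sketch of that kill-and-clean cycle. The application id and topic name are placeholders, and topic deletion in Kafka 0.8.x was immature, so many setups use fresh topic names per run instead:)

    # Find the application id of the running job, then kill it
    yarn application -list
    yarn application -kill application_1407640485281_0001

    # Attempt to delete the job's topics between runs (works reliably only
    # where delete support is enabled; otherwise recreate under new names)
    bin/kafka-topics.sh --zookeeper zk-host.example:2181 --delete --topic my-input-topic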
> >>> On 08 Aug 2014, at 12:41, Chris Riccomini <[email protected]> wrote:
> >>>
> >>>> Hey Telles,
> >>>>
> >>>>>> Do I need to have the job folder on each machine in my cluster?
> >>>>
> >>>> No, you should not need to do this. There are two ways to deploy your
> >>>> tarball to the YARN grid. One is to put it in HDFS, and the other is to
> >>>> put it on an HTTP server. The link to running a Samza job in a
> >>>> multi-node YARN cluster describes how to do both (either HTTP server
> >>>> or HDFS).
> >>>>
> >>>> In both cases, once the tarball is put on the HTTP/HDFS server(s), you
> >>>> must update yarn.package.path to point to it. From there, the YARN NM
> >>>> should download it for you automatically when you start your job.
> >>>>
> >>>> * Can you send along a paste of your job config?
> >>>>
> >>>> Cheers,
> >>>> Chris
> >>>>
> >>>> On 8/8/14 8:04 AM, "Claudio Martins" <[email protected]> wrote:
> >>>>
> >>>>> Hi Telles, it looks to me that you forgot to update the
> >>>>> "yarn.package.path" attribute in your config file for the task.
> >>>>>
> >>>>> - Claudio Martins
> >>>>> Head of Engineering
> >>>>> MobileAware USA Inc. / www.mobileaware.com
> >>>>> office: +1 617 986 5060 / mobile: +1 617 480 5288
> >>>>> linkedin: www.linkedin.com/in/martinsclaudio
> >>>>>
> >>>>> On Fri, Aug 8, 2014 at 10:55 AM, Telles Nobrega <[email protected]> wrote:
> >>>>>
> >>>>>> Hi,
> >>>>>>
> >>>>>> this is my first time trying to run a job in a multi-node
> >>>>>> environment. I have the cluster set up, and I can see in the GUI
> >>>>>> that all nodes are working.
> >>>>>>
> >>>>>> Do I need to have the job folder on each machine in my cluster?
> >>>>>>
> >>>>>> - The first time I tried running with the job on the namenode
> >>>>>> machine, it failed saying:
> >>>>>>
> >>>>>> Application application_1407509228798_0001 failed 2 times due to AM
> >>>>>> Container for appattempt_1407509228798_0001_000002 exited with
> >>>>>> exitCode: -1000 due to: File
> >>>>>> file:/home/ubuntu/alarm-samza/samza-job-package/target/samza-job-package-0.7.0-dist.tar.gz
> >>>>>> does not exist
> >>>>>>
> >>>>>> So I copied the folder to each machine in my cluster and got this
> >>>>>> error:
> >>>>>>
> >>>>>> Application application_1407509228798_0002 failed 2 times due to AM
> >>>>>> Container for appattempt_1407509228798_0002_000002 exited with
> >>>>>> exitCode: -1000 due to: Resource
> >>>>>> file:/home/ubuntu/alarm-samza/samza-job-package/target/samza-job-package-0.7.0-dist.tar.gz
> >>>>>> changed on src filesystem (expected 1407509168000, was 1407509434000)
> >>>>>>
> >>>>>> What am I missing?
> >>>>>>
> >>>>>> p.s.: I followed this tutorial
> >>>>>> <https://github.com/yahoo/samoa/wiki/Executing-SAMOA-with-Apache-Samza>
> >>>>>> and this one
> >>>>>> <http://samza.incubator.apache.org/learn/tutorials/0.7.0/run-in-multi-node-yarn.html>
> >>>>>> to set up the cluster.
> >>>>>>
> >>>>>> Help is much appreciated.
> >>>>>>
> >>>>>> Thanks in advance.
> >>>>>>
> >>>>>> --
> >>>>>> ------------------------------------------
> >>>>>> Telles Mota Vidal Nobrega
> >>>>>> M.Sc. Candidate at UFCG
> >>>>>> B.Sc. in Computer Science at UFCG
> >>>>>> Software Engineer at OpenStack Project - HP/LSD-UFCG
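(Tying the thread together: the file: URLs in these two errors mean yarn.package.path still pointed at a local path, so each NM looked on its own disk, and the "changed on src filesystem" message is most likely YARN's localizer noticing that the copies on different machines had different modification times. Publishing the tarball once, e.g. over HTTP as Chris suggests, avoids both. A quick sketch; the port is arbitrary:)

    # On the machine with the built tarball: serve its directory over HTTP
    # (Python 2 one-liner; any static file server works)
    cd /home/ubuntu/alarm-samza/samza-job-package/target
    python -m SimpleHTTPServer 8000

    # then set yarn.package.path to
    # http://<this-host>:8000/samza-job-package-0.7.0-dist.tar.gz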
