Hi,

I got Samza running a job in local mode with the property:

job.factory.class=org.apache.samza.job.local.ThreadJobFactory

Now I am trying to get it running on multiple machines. I have followed the steps in this guide:

https://github.com/apache/samza/blob/master/docs/learn/tutorials/versioned/run-in-multi-node-yarn.md

I see the node up and running. I have created a tar.gz file with the contents of the bin and lib folders that were running locally, and published it for YARN on a local Apache2 web server. The properties file looks like this:

task.class=samzafroga.job1
job.name=samzafroga.job1
job.factory.class=org.apache.samza.job.yarn.YarnJobFactory
yarn.package.path=http://192.168.15.92/jobs/samzajob1.tar.gz

systems.kafka.samza.factory=org.apache.samza.system.kafka.KafkaSystemFactory
systems.kafka.consumer.zookeeper.connect=broker01:2181
systems.kafka.producer.bootstrap.servers=broker01:9092

task.inputs=kafka.syslog

serializers.registry.json.class=org.apache.samza.serializers.JsonSerdeFactory
serializers.registry.string.class=org.apache.samza.serializers.StringSerdeFactory
systems.kafka.streams.syslog.samza.msg.serde=string
systems.kafka.streams.samzaout.samza.msg.serde=string

When I run the same command that was working in local mode:

bin/run-job.sh --config-factory=org.apache.samza.config.factories.PropertiesConfigFactory --config-path=file://$PWD/job1.properties

I see the following exception:

java version "1.7.0_75"
OpenJDK Runtime Environment (IcedTea 2.5.4) (7u75-2.5.4-2)
OpenJDK 64-Bit Server VM (build 24.75-b04, mixed mode)
/usr/lib/jvm/java-7-openjdk-amd64/bin/java -Dlog4j.configuration=file:bin/log4j-console.xml -Dsamza.log.dir=/opt/jobs -Djava.io.tmpdir=/opt/jobs/tmp -Xmx768M -XX:+PrintGCDateStamps -Xloggc:/opt/jobs/gc.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=10241024 -d64 -cp /opt/hadoop/conf:/opt/jobs/lib/samzafroga-0.0.1-jar-with-dependencies.jar org.apache.samza.job.JobRunner --config-factory=org.apache.samza.config.factories.PropertiesConfigFactory --config-path=file:///opt/jobs/job1.properties
log4j: reset attribute= "false".
log4j: Threshold ="null".
log4j: Level value for root is [INFO].
log4j: root level set to INFO
log4j: Class name: [org.apache.log4j.ConsoleAppender]
log4j: Parsing layout of class: "org.apache.log4j.PatternLayout"
log4j: Setting property [conversionPattern] to [%d{dd MMM yyyy HH:mm:ss} %5p %c{1} - %m%n].
log4j: Adding appender named [consoleAppender] to category [root].
log4j: Class name: [org.apache.log4j.RollingFileAppender]
log4j: Setting property [append] to [false].
log4j: Setting property [file] to [out/learning.log].
log4j: Parsing layout of class: "org.apache.log4j.PatternLayout"
log4j: Setting property [conversionPattern] to [%d{ABSOLUTE} %-5p [%c{1}] %m%n].
log4j: setFile called: out/learning.log, false
log4j: setFile ended
log4j: Adding appender named [fileAppender] to category [root].
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/conf/Configuration
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:191)
    at org.apache.samza.job.JobRunner.run(JobRunner.scala:56)
    at org.apache.samza.job.JobRunner$.main(JobRunner.scala:37)
    at org.apache.samza.job.JobRunner.main(JobRunner.scala)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.conf.Configuration
    at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
    ... 5 more

I guess there is a problem with the job package, but I am not sure how to solve it.

Thanks,

Jordi
________________________________
Jordi Blasi Uribarri
Área I+D+i
jbl...@nextel.es
Oficina Bilbao
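
P.S. In case it helps narrow this down: the class that cannot be found is org.apache.hadoop.conf.Configuration, and the -cp argument in the output above lists only /opt/hadoop/conf and the jar-with-dependencies. One check I can think of is whether that jar contains the class at all. Below is a minimal sketch in Python, assuming the jar is an ordinary zip archive. The helper name jar_contains_class is made up for illustration and is not part of Samza or Hadoop.

```python
import zipfile


def jar_contains_class(jar_path, class_name):
    """Return True if the jar holds a .class entry for the given class name.

    Hypothetical helper for illustration only.
    """
    # A jar is a zip archive; class a.b.C is stored as the entry a/b/C.class.
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()
```

Calling jar_contains_class("/opt/jobs/lib/samzafroga-0.0.1-jar-with-dependencies.jar", "org.apache.hadoop.conf.Configuration") would return False if the Hadoop classes were never bundled into the jar, which would be consistent with the ClassNotFoundException above.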