Hey Yan,

Would you be up for contributing a tutorial page that describes this? This is really useful information. Our docs are just simple .md files in the main code base.

Regarding step (3), is the hdfs-site.xml put into the conf folder for the NM boxes, or on the client side (where run-job.sh is run)?

Cheers,
Chris

On 3/11/14 10:07 AM, "Yan Fang" <[email protected]> wrote:

>Hi Sonali,
>
>The way I make Samza run with HDFS is following:
>
>1. include hdfs jar in Samza jar tar.gz.
>2. you may also want to make sure the hadoop-common.jar has the same
>version as your hdfs jar. Otherwise, you may have configuration error
>popping out.
>3. then put hdfs-site.xml to conf folder, the same folder as the
>yarn-site.xml
>4. all other steps are not changed.
>
>Hope this will help. Thank you.
>
>Cheers,
>
>Fang, Yan
>[email protected]
>+1 (206) 849-4108
>
>
>On Tue, Mar 11, 2014 at 9:25 AM, Chris Riccomini
><[email protected]> wrote:
>
>> Hey Sonali,
>>
>> I believe that you need to make sure that the HDFS jar is in your .tar.gz
>> file, as you've said.
>>
>> If that doesn't work, you might need to define this setting in
>> core-site.xml on the machine you're running run-job.sh on:
>>
>> <property>
>>   <name>fs.hdfs.impl</name>
>>   <value>org.apache.hadoop.hdfs.DistributedFileSystem</value>
>>   <description>The FileSystem for hdfs: uris.</description>
>> </property>
>>
>>
>> You might also need to configure your NodeManagers to have the HDFS file
>> system impl as well.
>>
>> I've never run Samza with HDFS, so I'm guessing here. Perhaps someone else
>> on the list has been successful with this?
>>
>> Cheers,
>> Chris
>>
>> On 3/10/14 3:59 PM, "[email protected]"
>> <[email protected]> wrote:
>>
>> >Hello,
>> >
>> >I fixed this by starting from scratch with gradlew. But now when I run my
>> >job it throws this error:
>> >Exception in thread "main" java.io.IOException: No FileSystem for scheme: hdfs
>> >    at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2421)
>> >    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2428)
>> >    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:88)
>> >    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2467)
>> >    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2449)
>> >    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:367)
>> >    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:287)
>> >    at org.apache.samza.job.yarn.ClientHelper.submitApplication(ClientHelper.scala:111)
>> >    at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:55)
>> >    at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:48)
>> >    at org.apache.samza.job.JobRunner.run(JobRunner.scala:100)
>> >    at org.apache.samza.job.JobRunner$.main(JobRunner.scala:75)
>> >    at org.apache.samza.job.JobRunner.main(JobRunner.scala)
>> >
>> >I looked at the samza job tar.gz and it doesn't have a Hadoop-hdfs jar.
>> >Is that why I get this error?
>> >
>> >Thanks,
>> >Sonali
>> >
>> >From: Parthasarathy, Sonali
>> >Sent: Monday, March 10, 2014 11:25 AM
>> >To: [email protected]
>> >Subject: Failed to package using mvn
>> >
>> >Hi,
>> >
>> >When I tried to do a mvn clean package of my hello-samza project, I get
>> >the following error. Has anyone seen this before?
>> >
>> >[ERROR] Failed to execute goal on project samza-wikipedia: Could not
>> >resolve dependencies for project samza:samza-wikipedia:jar:0.7.0: Could
>> >not find artifact org.apache.samza:samza-kv_2.10:jar:0.7.0 in
>> >apache-releases (https://repository.apache.org/content/groups/public) ->
>> >[Help 1]
>> >[ERROR]
>> >[ERROR] To see the full stack trace of the errors, re-run Maven with the
>> >-e switch.
>> >[ERROR] Re-run Maven using the -X switch to enable full debug logging.
>> >[ERROR]
>> >[ERROR] For more information about the errors and possible solutions,
>> >please read the following articles:
>> >[ERROR] [Help 1]
>> >http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException
>> >[ERROR]
>> >[ERROR] After correcting the problems, you can resume the build with the
>> >command
>> >[ERROR] mvn <goals> -rf :samza-wikipedia
>> >
>> >Thanks,
>> >Sonali
>> >
>> >Sonali Parthasarathy
>> >R&D Developer, Data Insights
>> >Accenture Technology Labs
>> >703-341-7432
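
Steps 1 and 2 in Yan's list (the HDFS jar is in the job tar.gz, and it matches the hadoop-common version) can be checked before submitting the job. The script below is only a rough sketch, not part of Samza or Hadoop: it assumes the jars follow the usual Maven naming (`hadoop-hdfs-<version>.jar`, `hadoop-common-<version>.jar`), and the function names and the script name in the usage line are made up for illustration.

```python
import re
import sys
import tarfile

# Assumption: jars use Maven-style names such as hadoop-hdfs-2.2.0.jar.
JAR_PATTERN = re.compile(r"(hadoop-hdfs|hadoop-common)-(\d[\w.-]*)\.jar")


def hadoop_jar_versions(member_names):
    """Map 'hadoop-hdfs' and 'hadoop-common' to the set of versions found."""
    versions = {"hadoop-hdfs": set(), "hadoop-common": set()}
    for name in member_names:
        basename = name.rsplit("/", 1)[-1]
        match = JAR_PATTERN.fullmatch(basename)
        if match:
            versions[match.group(1)].add(match.group(2))
    return versions


def check_package(member_names):
    """Return a list of problems; an empty list means both checks passed."""
    versions = hadoop_jar_versions(member_names)
    problems = []
    for artifact, found in versions.items():
        if not found:
            problems.append(f"{artifact} jar is missing from the package")
    hdfs, common = versions["hadoop-hdfs"], versions["hadoop-common"]
    if hdfs and common and hdfs != common:
        problems.append(
            f"version mismatch: hadoop-hdfs {sorted(hdfs)} "
            f"vs hadoop-common {sorted(common)}"
        )
    return problems


def check_tarball(path):
    """Run the checks against the file names inside a .tar.gz job package."""
    with tarfile.open(path, "r:gz") as archive:
        return check_package(archive.getnames())


if __name__ == "__main__" and len(sys.argv) > 1:
    for problem in check_tarball(sys.argv[1]):
        print(problem)
```

Saved under a hypothetical name such as `check_package.py`, it would be run as `python check_package.py path/to/job-package.tar.gz`. It prints nothing when both jars are present with the same version. It does not check hdfs-site.xml or core-site.xml, which still need to be in place as described in the thread.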
