Hey Yan,

Would you be up for contributing a tutorial page that describes this? This
is really useful information. Our docs are just simple .md files in the
main code base.

Regarding step (3), is hdfs-site.xml put into the conf folder on the NM
(NodeManager) boxes, or on the client side (where run-job.sh is run)?
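
For steps (1) and (2) in Yan's mail below, a quick sanity check of the job
package could look something like this Python sketch. It is only an
illustration: the function names are made up for this example, and it assumes
the jars follow the usual hadoop-hdfs-<version>.jar and
hadoop-common-<version>.jar naming.

```python
import re
import tarfile

# Assumed jar naming: hadoop-hdfs-<version>.jar / hadoop-common-<version>.jar.
JAR_PATTERN = re.compile(r"(hadoop-(?:hdfs|common))-(\d[\w.\-]*?)\.jar$")


def hadoop_jar_versions(member_names):
    """Map each artifact name to the set of versions found in the archive."""
    versions = {"hadoop-hdfs": set(), "hadoop-common": set()}
    for name in member_names:
        # Skip test and source jars so they are not mistaken for versions.
        if name.endswith(("-tests.jar", "-sources.jar")):
            continue
        match = JAR_PATTERN.search(name)
        if match:
            versions[match.group(1)].add(match.group(2))
    return versions


def check_package(member_names):
    """Return a list of problems; an empty list means both checks passed."""
    versions = hadoop_jar_versions(member_names)
    problems = [f"{artifact} jar is missing"
                for artifact, found in versions.items() if not found]
    hdfs, common = versions["hadoop-hdfs"], versions["hadoop-common"]
    if hdfs and common and hdfs != common:
        problems.append(
            f"version mismatch: hadoop-hdfs {sorted(hdfs)} "
            f"vs hadoop-common {sorted(common)}"
        )
    return problems


def check_tarball(path):
    """Run the checks against the member names of a .tar.gz job package."""
    with tarfile.open(path, "r:gz") as tar:
        return check_package(tar.getnames())
```

Calling check_tarball() with the path to your job's .tar.gz returns an empty
list when both jars are present with the same version, and a list of problem
descriptions otherwise.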

Cheers,
Chris

On 3/11/14 10:07 AM, "Yan Fang" <[email protected]> wrote:

>Hi Sonali,
>
>This is how I got Samza running with HDFS:
>
>1. Include the HDFS jar in the Samza job tar.gz.
>2. Make sure hadoop-common.jar has the same version as your HDFS jar.
>Otherwise, you may see configuration errors.
>3. Put hdfs-site.xml in the conf folder, the same folder as
>yarn-site.xml.
>4. All other steps are unchanged.
>
>Hope this will help. Thank you.
>
>Cheers,
>
>Fang, Yan
>[email protected]
>+1 (206) 849-4108
>
>
>On Tue, Mar 11, 2014 at 9:25 AM, Chris Riccomini
><[email protected]> wrote:
>
>> Hey Sonali,
>>
>> I believe that you need to make sure that the HDFS jar is in your
>> .tar.gz file, as you've said.
>>
>> If that doesn't work, you might need to define this setting in
>> core-site.xml on the machine you're running run-job.sh on:
>>
>> <property>
>>   <name>fs.hdfs.impl</name>
>>   <value>org.apache.hadoop.hdfs.DistributedFileSystem</value>
>>   <description>The FileSystem for hdfs: uris.</description>
>> </property>
>>
>>
>> You might also need to configure your NodeManagers to have the HDFS file
>> system impl as well.
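
To confirm that the property above is actually present in a given
core-site.xml, a small Python sketch along these lines might help. The
function names are placeholders for this example, not part of Hadoop or
Samza, and it only inspects the one file you point it at (it does not
reproduce Hadoop's full configuration loading).

```python
import xml.etree.ElementTree as ET


def _lookup(root, name):
    """Find the <value> of the <property> whose <name> matches, else None."""
    for prop in root.iter("property"):
        if prop.findtext("name", default="").strip() == name:
            value = prop.findtext("value")
            return value.strip() if value is not None else None
    return None


def hadoop_property(xml_text, name):
    """Look up a property in the text of a Hadoop *-site.xml document."""
    return _lookup(ET.fromstring(xml_text), name)


def hadoop_property_from_file(path, name):
    """Look up a property in a Hadoop *-site.xml file on disk."""
    return _lookup(ET.parse(path).getroot(), name)
```

For example, hadoop_property_from_file(path, "fs.hdfs.impl") should return
org.apache.hadoop.hdfs.DistributedFileSystem when the setting is in place,
and None when the property is absent from that file.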
>>
>> I've never run Samza with HDFS, so I'm guessing here. Perhaps someone
>> else on the list has been successful with this?
>>
>> Cheers,
>> Chris
>>
>> On 3/10/14 3:59 PM, "[email protected]"
>> <[email protected]> wrote:
>>
>> >Hello,
>> >
>> >I fixed this by starting from scratch with gradlew. But now when I run
>> >my job it throws this error:
>> >Exception in thread "main" java.io.IOException: No FileSystem for scheme: hdfs
>> >        at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2421)
>> >        at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2428)
>> >        at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:88)
>> >        at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2467)
>> >        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2449)
>> >        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:367)
>> >        at org.apache.hadoop.fs.Path.getFileSystem(Path.java:287)
>> >        at org.apache.samza.job.yarn.ClientHelper.submitApplication(ClientHelper.scala:111)
>> >        at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:55)
>> >        at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:48)
>> >        at org.apache.samza.job.JobRunner.run(JobRunner.scala:100)
>> >        at org.apache.samza.job.JobRunner$.main(JobRunner.scala:75)
>> >        at org.apache.samza.job.JobRunner.main(JobRunner.scala)
>> >
>> >I looked at the samza job tar.gz and it doesn't have a Hadoop-hdfs jar.
>> >Is that why I get this error?
>> >
>> >Thanks,
>> >Sonali
>> >
>> >From: Parthasarathy, Sonali
>> >Sent: Monday, March 10, 2014 11:25 AM
>> >To: [email protected]
>> >Subject: Failed to package using mvn
>> >
>> >Hi,
>> >
>> >When I tried to do a mvn clean package of my hello-samza project, I get
>> >the following error. Has anyone seen this before?
>> >
>> >[ERROR] Failed to execute goal on project samza-wikipedia: Could not
>> >resolve dependencies for project samza:samza-wikipedia:jar:0.7.0: Could
>> >not find artifact org.apache.samza:samza-kv_2.10:jar:0.7.0 in
>> >apache-releases (https://repository.apache.org/content/groups/public) ->
>> >[Help 1]
>> >[ERROR]
>> >[ERROR] To see the full stack trace of the errors, re-run Maven with the
>> >-e switch.
>> >[ERROR] Re-run Maven using the -X switch to enable full debug logging.
>> >[ERROR]
>> >[ERROR] For more information about the errors and possible solutions,
>> >please read the following articles:
>> >[ERROR] [Help 1]
>> >http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException
>> >[ERROR]
>> >[ERROR] After correcting the problems, you can resume the build with the
>> >command
>> >[ERROR]   mvn <goals> -rf :samza-wikipedia
>> >
>> >Thanks,
>> >Sonali
>> >
>> >Sonali Parthasarathy
>> >R&D Developer, Data Insights
>> >Accenture Technology Labs
>> >703-341-7432
>> >
>>
>>
