Re: How to read a multipart s3 file?

2014-05-06 Thread Andre Kuhnen
Try using s3n instead of s3. On 06/05/2014 21:19, kamatsuoka ken...@gmail.com wrote: I have a Spark app that writes out a file, s3://mybucket/mydir/myfile.txt. Behind the scenes, the S3 driver creates a bunch of files like s3://mybucket/mydir/myfile.txt/part-, as well as the block
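The suggestion above amounts to changing the URI scheme before reading or writing. A minimal sketch (the bucket and path are the ones from the question; the helper name `toS3n` is hypothetical):

```scala
// Rewrite an s3:// URI to the s3n:// (native) scheme. With s3n, each S3
// object is a single plain file, rather than the block-based layout the
// legacy s3:// filesystem produces (the part- files described above).
def toS3n(path: String): String =
  if (path.startsWith("s3://")) "s3n://" + path.stripPrefix("s3://")
  else path // leave non-s3 paths (hdfs://, file://, ...) untouched

println(toS3n("s3://mybucket/mydir/myfile.txt"))
// prints "s3n://mybucket/mydir/myfile.txt"
```

The rewritten path can then be passed to `sc.textFile(...)` as usual.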

Re: Lease Exception hadoop 2.4

2014-05-04 Thread Andre Kuhnen
in multiple mappers/reduce partitioners? Mayur Rustagi Ph: +1 (760) 203 3257 http://www.sigmoidanalytics.com @mayur_rustagi https://twitter.com/mayur_rustagi On Sun, May 4, 2014 at 5:30 PM, Andre Kuhnen andrekuh...@gmail.com wrote: Please, can anyone give some feedback? Thanks. Hello, I am getting

Re: Lease Exception hadoop 2.4

2014-05-04 Thread Andre Kuhnen
(org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException): No lease on Any ideas? Thanks. 2014-05-04 11:53 GMT-03:00 Andre Kuhnen andrekuh...@gmail.com: Thanks Mayur, the only thing that my code is doing is: read from s3, and saveAsTextFile on hdfs. Like I said, everything is written correctly, but at the end
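The job described above is just a read-then-write pipeline. A minimal sketch of it, assuming the s3n scheme suggested elsewhere in the thread (the bucket name and output path are illustrative, not from the original post):

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Read lines from S3 and write them straight back out to HDFS.
// This is the whole job; the LeaseExpiredException appears at the
// end of the write despite the output being complete.
val sc = new SparkContext(new SparkConf().setAppName("s3-to-hdfs"))
val lines = sc.textFile("s3n://mybucket/input")
lines.saveAsTextFile("hdfs:///user/andre/output")
sc.stop()
```

With a job this simple, a mismatch between the Spark build and the HDFS version on the cluster is a plausible culprit, which is what the rest of the thread explores.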

Re: Lease Exception hadoop 2.4

2014-05-04 Thread Andre Kuhnen
I think I forgot to rsync the slaves with the new compiled jar; I will give it a try as soon as possible. On 04/05/2014 21:35, Andre Kuhnen andrekuh...@gmail.com wrote: I compiled spark with SPARK_HADOOP_VERSION=2.4.0 sbt/sbt assembly, fixed the s3 dependencies, but I am still getting
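The fix being described is: rebuild the assembly against Hadoop 2.4, then ship the same jar to every worker. A sketch of those two steps (the rsync paths and the use of conf/slaves are illustrative; adjust to your layout):

```shell
# From the Spark source root: build the assembly against Hadoop 2.4,
# as described in the message above.
SPARK_HADOOP_VERSION=2.4.0 sbt/sbt assembly

# Copy the rebuilt assembly to every worker listed in conf/slaves,
# so driver and executors run the same build. Forgetting this step
# is exactly the mistake the message suspects.
for host in $(cat conf/slaves); do
  rsync -av assembly/target/ "$host:$PWD/assembly/target/"
done
```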

Lease Exception hadoop 2.4

2014-05-03 Thread Andre Kuhnen
Hello, I am getting this warning after upgrading to Hadoop 2.4, when I try to write something to HDFS. The content is written correctly, but I do not like this warning. Do I have to compile Spark with Hadoop 2.4? WARN TaskSetManager: Loss was due to org.apache.hadoop.ipc.RemoteException

MultipleOutputs IdentityReducer

2014-04-25 Thread Andre Kuhnen
Hello, I am trying to write multiple files with Spark, but I cannot find a way to do it. Here is the idea. val rddKeyValue : RDD[(String, String)] = rddlines.map(line => createKeyValue(line)) Now I would like to save this as keyname.txt with all the values inside the file. I tried to use this
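A common way to get one output file per key in Spark is to pair `saveAsHadoopFile` with Hadoop's `MultipleTextOutputFormat`. A sketch under that assumption (the class name `KeyBasedOutput` and the output path are illustrative; `rddKeyValue` is the RDD from the question):

```scala
import org.apache.hadoop.io.NullWritable
import org.apache.hadoop.mapred.lib.MultipleTextOutputFormat

// Route each (key, value) record to a file named after its key,
// e.g. key "foo" -> file "foo.txt" inside the output directory.
class KeyBasedOutput extends MultipleTextOutputFormat[Any, Any] {
  override def generateFileNameForKeyValue(key: Any, value: Any, name: String): String =
    key.toString + ".txt"

  // Suppress the key in the file body so only the values are written.
  override def generateActualKey(key: Any, value: Any): Any =
    NullWritable.get()
}

// Usage with the RDD from the question (output path is illustrative):
rddKeyValue.saveAsHadoopFile(
  "hdfs:///user/andre/by-key",
  classOf[String], classOf[String],
  classOf[KeyBasedOutput])
```

Note that all values for a given key must pass through the same partition for this to produce one file per key, so a `partitionBy` on the key beforehand is usually needed.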