First you create the file:

    final File outputFile = new File(outputPath);

Then you write to it:

    Files.append(counts + "\n", outputFile, Charset.defaultCharset());

Cheers

On Fri, Aug 14, 2015 at 4:38 PM, Mohit Anchlia <mohitanch...@gmail.com> wrote:

> I thought prefix meant the output path? What's the purpose of prefix, and
> where do I specify the path if not in prefix?
>
> On Fri, Aug 14, 2015 at 4:36 PM, Ted Yu <yuzhih...@gmail.com> wrote:
>
>> Please take a look at JavaPairDStream.scala:
>>
>>     def saveAsHadoopFiles[F <: OutputFormat[_, _]](
>>         prefix: String,
>>         suffix: String,
>>         keyClass: Class[_],
>>         valueClass: Class[_],
>>         outputFormatClass: Class[F]) {
>>
>> Did you intend to use outputPath as the prefix?
>>
>> Cheers
>>
>> On Fri, Aug 14, 2015 at 1:36 PM, Mohit Anchlia <mohitanch...@gmail.com> wrote:
>>
>>> Spark 1.3
>>>
>>> Code:
>>>
>>>     wordCounts.foreachRDD(new Function2<JavaPairRDD<String, Integer>, Time, Void>() {
>>>       @Override
>>>       public Void call(JavaPairRDD<String, Integer> rdd, Time time) throws IOException {
>>>         String counts = "Counts at time " + time + " " + rdd.collect();
>>>         System.out.println(counts);
>>>         System.out.println("Appending to " + outputFile.getAbsolutePath());
>>>         Files.append(counts + "\n", outputFile, Charset.defaultCharset());
>>>         return null;
>>>       }
>>>     });
>>>
>>>     wordCounts.saveAsHadoopFiles(outputPath, "txt", Text.class, Text.class,
>>>         TextOutputFormat.class);
>>>
>>> What do I need to check in namenode?
>>> I see 0 byte files like this:
>>>
>>>     drwxr-xr-x   - ec2-user supergroup  0 2015-08-13 15:45 /tmp/out-1439495124000.txt
>>>     drwxr-xr-x   - ec2-user supergroup  0 2015-08-13 15:45 /tmp/out-1439495125000.txt
>>>     drwxr-xr-x   - ec2-user supergroup  0 2015-08-13 15:45 /tmp/out-1439495126000.txt
>>>     drwxr-xr-x   - ec2-user supergroup  0 2015-08-13 15:45 /tmp/out-1439495127000.txt
>>>     drwxr-xr-x   - ec2-user supergroup  0 2015-08-13 15:45 /tmp/out-1439495128000.txt
>>>
>>> However, I also wrote the data to a file on the local file system for
>>> verification, and there I do see the data:
>>>
>>>     $ ls -ltr !$
>>>     ls -ltr /tmp/out
>>>     -rw-r--r-- 1 yarn yarn 5230 Aug 13 15:45 /tmp/out
>>>
>>> On Fri, Aug 14, 2015 at 6:15 AM, Ted Yu <yuzhih...@gmail.com> wrote:
>>>
>>>> Which Spark release are you using?
>>>>
>>>> Can you show us a snippet of your code?
>>>>
>>>> Have you checked the namenode log?
>>>>
>>>> Thanks
>>>>
>>>> On Aug 13, 2015, at 10:21 PM, Mohit Anchlia <mohitanch...@gmail.com> wrote:
>>>>
>>>> I was able to get this working by using an alternative method; however, I
>>>> only see 0 byte files in hadoop. I've verified that the output does exist
>>>> in the logs, but it's missing from hdfs.
>>>>
>>>> On Thu, Aug 13, 2015 at 10:49 AM, Mohit Anchlia <mohitanch...@gmail.com> wrote:
>>>>
>>>>> I have this call trying to save to hdfs 2.6:
>>>>>
>>>>>     wordCounts.saveAsNewAPIHadoopFiles("prefix", "txt");
>>>>>
>>>>> but I am getting the following:
>>>>>
>>>>>     java.lang.RuntimeException: class scala.runtime.Nothing$ not
>>>>>     org.apache.hadoop.mapreduce.OutputFormat
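[Editor's note] One detail worth flagging in the listing above: the `d` in `drwxr-xr-x` means each `/tmp/out-<timestamp>.txt` entry is a directory, not a file. saveAsHadoopFiles writes one directory per batch interval, with the actual data in `part-NNNNN` files inside it, and directories always list with size 0, so the part files are what to inspect. A minimal local-filesystem sketch of that check, using only `java.nio` (checking a real HDFS path would need the HDFS client instead; the `part-NNNNN` names below are illustrative):

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

public class EmptyFileScan {

    // Walks the tree under root and returns every regular file whose size is zero.
    static List<Path> emptyFiles(Path root) throws IOException {
        List<Path> hits = new ArrayList<>();
        try (Stream<Path> walk = Files.walk(root)) {
            walk.filter(Files::isRegularFile).forEach(p -> {
                try {
                    if (Files.size(p) == 0) {
                        hits.add(p);
                    }
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        // Simulate one per-batch output directory: an empty part file and a
        // non-empty one.
        Path dir = Files.createTempDirectory("out-batch");
        Files.createFile(dir.resolve("part-00000"));                 // 0 bytes
        Files.write(dir.resolve("part-00001"), "data\n".getBytes()); // has data
        System.out.println(emptyFiles(dir).size() + " empty part file(s)");
    }
}
```

Running this prints `1 empty part file(s)`: only the zero-length part file is reported, while the directory entry itself (which would show as `drwxr-xr-x 0` in a listing) is filtered out by `isRegularFile`.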
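[Editor's note] The append tip at the top of the thread uses Guava's `Files.append`; the same local-file append is available from the JDK alone via `java.nio.file.Files.write` with the `APPEND` open option. A small self-contained sketch (the temp path and the counts strings are made-up placeholders, not values from the thread):

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class AppendDemo {

    // Appends each line to the file (creating it if needed) and returns the
    // total number of lines in the file afterwards.
    static int appendAll(Path outputFile, String... lines) throws IOException {
        for (String counts : lines) {
            Files.write(outputFile, (counts + "\n").getBytes(StandardCharsets.UTF_8),
                        StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        }
        return Files.readAllLines(outputFile, StandardCharsets.UTF_8).size();
    }

    public static void main(String[] args) throws IOException {
        // One append per batch, mirroring the foreachRDD callback in the thread.
        Path outputFile = Files.createTempFile("append-demo", ".txt");
        int total = appendAll(outputFile,
                              "Counts at time 1000 ms [(a,1)]",
                              "Counts at time 2000 ms [(b,2)]");
        System.out.println(total); // prints 2
    }
}
```

Unlike the Guava call, `CREATE` + `APPEND` also handles the file-creation step, so the separate `new File(outputPath)` setup is not needed.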