Ha! Not the prettiest thing, but it'll do. The CrunchTool trait also has a done() method, so you can also do

pcol.write(to.textFile(outputPath))
done()

On Fri, Jun 20, 2014 at 2:32 PM, Daniel Siegmann <[email protected]> wrote:

> Got it to work like so:
>
> read(from.textFile(inputPath)).write(to.textFile(outputPath)).native.getPipeline().done()
>
> Is that the correct way?
>
> Thanks for the help, I have a running word count example now. :-)
>
>
> On Fri, Jun 20, 2014 at 4:34 PM, Josh Wills <[email protected]> wrote:
>
>> You need to manually call run() or done() to execute the pipeline if
>> you're not materializing the output. The user guide will be useful for the
>> basic concepts, even though it focuses on the Java API.
>> On Jun 20, 2014 1:27 PM, "Daniel Siegmann" <[email protected]>
>> wrote:
>>
>>> Thanks Josh! The thrift and protobuf defs were what I was missing. I'm
>>> able to compile and run the code now. I also updated to Scrunch 0.10.0.
>>>
>>> Any idea why it might not write the output? If I have
>>>
>>> countWords(args(0)).materialize.foreach(line => println(s"**** $line"))
>>>
>>> I get all my output, but
>>>
>>> countWords(args(0)).write(to.textFile(args(1)))
>>>
>>> Doesn't even create the output directory, even though I see this in my
>>> logs
>>>
>>> 14/06/20 16:17:47 INFO impl.FileTargetImpl: Will write output files to
>>> new path:
>>> /var/folders/th/7vf9rjqd1955jnwnzg3x9ym40000gn/T/1403295466563-1/wordcounts
>>>
>>> No exceptions or anything. I'm probably missing something obvious. :-(
>>>
>>>
>>> On Thu, Jun 19, 2014 at 6:03 PM, Josh Wills <[email protected]> wrote:
>>>
>>>> Here you go: https://github.com/jwills/scrunch-demo
>>>>
>>>> Did this w/Maven; you'll have to forgive me as my SBT-fu isn't great.
>>>> It looks like vanilla Hadoop 1.x doesn't include any thrift/protobuf
>>>> dependencies that Scrunch expects to be present at compile-time; I added
>>>> them as provided dependencies in this example and then verified that I
>>>> could run the -job.jar that I built w/mvn package under Hadoop 1.0.3.
>>>>
>>>> J
>>>>
>>>>
>>>> On Thu, Jun 19, 2014 at 2:33 PM, Daniel Siegmann <
>>>> [email protected]> wrote:
>>>>
>>>>> Hi Josh, thanks for the reply.
>>>>>
>>>>>> Which version of Hadoop are you looking to compile against?
>>>>>
>>>>> I think any 1.x version will suffice (our production cluster is MapR).
>>>>>
>>>>> The Spotify comparison is interesting. Too bad they didn't evaluate
>>>>> Scoobi as well. Thanks for the info.
>>>>>
>>>>
>>>>
>>>> --
>>>> Director of Data Science
>>>> Cloudera <http://www.cloudera.com>
>>>> Twitter: @josh_wills <http://twitter.com/josh_wills>
>>>>
>>>
>>>
>>> --
>>> Daniel Siegmann, Software Developer
>>> Velos
>>> Accelerating Machine Learning
>>>
>>> 440 NINTH AVENUE, 11TH FLOOR, NEW YORK, NY 10001
>>> E: [email protected] W: www.velos.io
>>>
>>
>
>
> --
> Daniel Siegmann, Software Developer
> Velos
> Accelerating Machine Learning
>
> 440 NINTH AVENUE, 11TH FLOOR, NEW YORK, NY 10001
> E: [email protected] W: www.velos.io
>

--
Director of Data Science
Cloudera <http://www.cloudera.com>
Twitter: @josh_wills <http://twitter.com/josh_wills>
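
For anyone reading this thread later: the behaviour discussed above (write() alone produces no output until run() or done() is called) is typical of lazily planned pipelines. Below is a rough, self-contained toy sketch of that idea in plain Scala. It is not the Crunch or Scrunch API; ToyPipeline, its sink field, and its methods are hypothetical names made up purely for illustration.

```scala
import scala.collection.mutable.ListBuffer

// Toy model of a lazily planned pipeline. NOT the Crunch/Scrunch API:
// ToyPipeline and everything on it are hypothetical, for illustration only.
final class ToyPipeline {
  // Steps that have been planned but not yet executed.
  private val pending = ListBuffer.empty[() => Unit]

  // Stands in for an output directory; stays empty until done() runs.
  val sink = ListBuffer.empty[String]

  // Registers a write as a deferred step; nothing is executed here.
  def write(lines: Seq[String]): ToyPipeline = {
    pending += (() => { sink ++= lines; () })
    this
  }

  // Executes every planned step in order, then clears the plan.
  def done(): Unit = {
    pending.foreach(step => step())
    pending.clear()
  }
}

object ToyPipelineDemo {
  def main(args: Array[String]): Unit = {
    val p = new ToyPipeline
    p.write(Seq("hello 1", "world 1"))
    println(p.sink.isEmpty) // true: write() only planned the work
    p.done()
    println(p.sink.size)    // 2: done() ran the planned write
  }
}
```

In this toy model, materializing a result would correspond to forcing the planned steps as a side effect of asking for the data, which would be consistent with the materialize variant earlier in the thread producing output without an explicit done().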
