Re: Scrunch example project with SBT?

Daniel Siegmann Fri, 20 Jun 2014 14:33:07 -0700

Got it to work like so:

read(from.textFile(inputPath)).write(to.textFile(outputPath)).native.getPipeline().done()


Is that the correct way?

Thanks for the help, I have a running word count example now. :-)



On Fri, Jun 20, 2014 at 4:34 PM, Josh Wills <[email protected]> wrote:

> You need to manually call run() or done() to execute the pipeline if
> you're not materializing the output. The user guide will be useful for the
> basic concepts, even though it focuses on the Java API.
> On Jun 20, 2014 1:27 PM, "Daniel Siegmann" <[email protected]>
> wrote:
>
>> Thanks Josh! The thrift and protobuf defs were what I was missing. I'm
>> able to compile and run the code now. I also updated to Scrunch 0.10.0.
>>
>> Any idea why it might not write the output? If I have
>>
>> countWords(args(0)).materialize.foreach(line => println(s"**** $line"))
>>
>> I get all my output, but
>>
>> countWords(args(0)).write(to.textFile(args(1)))
>>
>> doesn't even create the output directory, even though I see this in my
>> logs:
>>
>> 14/06/20 16:17:47 INFO impl.FileTargetImpl: Will write output files to
>> new path:
>> /var/folders/th/7vf9rjqd1955jnwnzg3x9ym40000gn/T/1403295466563-1/wordcounts
>>
>> No exceptions or anything. I'm probably missing something obvious. :-(
>>
>>
>> On Thu, Jun 19, 2014 at 6:03 PM, Josh Wills <[email protected]> wrote:
>>
>>> Here you go: https://github.com/jwills/scrunch-demo
>>>
>>> Did this w/Maven; you'll have to forgive me as my SBT-fu isn't great. It
>>> looks like vanilla Hadoop 1.x doesn't include any thrift/protobuf
>>> dependencies that Scrunch expects to be present at compile-time; I added
>>> them as provided dependencies in this example and then verified that I
>>> could run the -job.jar that I built w/mvn package under Hadoop 1.0.3.
>>>
>>> J
>>>
>>>
>>> On Thu, Jun 19, 2014 at 2:33 PM, Daniel Siegmann <
>>> [email protected]> wrote:
>>>
>>>> Hi Josh, thanks for the reply.
>>>>
>>>>> Which version of Hadoop are you looking to compile against?
>>>>
>>>> I think any 1.x version will suffice (our production cluster is MapR).
>>>>
>>>> The Spotify comparison is interesting. Too bad they didn't evaluate
>>>> Scoobi as well. Thanks for the info.
>>>>
>>>
>>>
>>>
>>> --
>>> Director of Data Science
>>> Cloudera <http://www.cloudera.com>
>>> Twitter: @josh_wills <http://twitter.com/josh_wills>
>>>
>>
>>
>>
>> --
>> Daniel Siegmann, Software Developer
>> Velos
>> Accelerating Machine Learning
>>
>> 440 NINTH AVENUE, 11TH FLOOR, NEW YORK, NY 10001
>> E: [email protected] W: www.velos.io
>>
>


-- 
Daniel Siegmann, Software Developer
Velos
Accelerating Machine Learning

440 NINTH AVENUE, 11TH FLOOR, NEW YORK, NY 10001
E: [email protected] W: www.velos.io
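
Note on the run()/done() point quoted above: the write call only registers the output with the pipeline plan, and nothing is written until the pipeline is executed. The snippet below is a toy model of that deferred behaviour in plain Scala. It is not the Crunch or Scrunch API; ToyPipeline, its write and done methods, and the sink buffer are hypothetical names made up for illustration.

```scala
import scala.collection.mutable.ArrayBuffer

// Toy model of a lazily executed pipeline. NOT the Crunch/Scrunch API:
// ToyPipeline, write, done and sink are hypothetical names.
class ToyPipeline {
  // Work that has been planned but not yet executed.
  private val pending = ArrayBuffer.empty[() => Unit]

  // Stands in for the output directory.
  val sink = ArrayBuffer.empty[String]

  // Registers a write step; nothing is executed here.
  def write(lines: Seq[String]): Unit = {
    val step: () => Unit = () => { sink ++= lines; () }
    pending += step
  }

  // Executes every registered step once, then clears the plan.
  def done(): Unit = {
    pending.foreach(step => step())
    pending.clear()
  }
}

object ToyDemo {
  def main(args: Array[String]): Unit = {
    val p = new ToyPipeline
    p.write(Seq("hello\t1", "world\t1"))
    println(p.sink.size) // prints 0: the write is only planned
    p.done()
    println(p.sink.size) // prints 2: done() executed the plan
  }
}
```

In this toy, forgetting done() leaves the sink empty without any error, which matches the symptom described in the thread (no output directory, no exceptions).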
