Thanks all for the info provided.
One of the things I noticed is that both the map and reduce functions take
a function that is applied to every element
(map: Return a new distributed dataset formed by passing each element of
the source through a function *func*).

Q1: All the examples I have seen so far use a very simple function as func, e.g.
(line => line.split(" ")). Are there any examples or cases where a more complex
function is needed? If so, what is the syntax for this?
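
To make Q1 concrete, here is a minimal sketch of the kind of "more complex"
func I have in mind. It uses a plain Scala List instead of an RDD so that it
runs without a cluster; my assumption is that RDD.map accepts functions in the
same way. The names sampleLines and normalize are made up for illustration.

```scala
// Hypothetical sample data standing in for the lines of a text file.
val sampleLines = List("Hello, World!", "  spark AND scala  ")

// A multi-statement function literal: the braces hold several steps,
// and the last expression is the value produced for each element.
val cleaned = sampleLines.map { line =>
  val trimmed = line.trim
  val lower = trimmed.toLowerCase
  lower.replaceAll("[^a-z ]", "")
}

// The same logic as a named method, passed to map by name.
def normalize(line: String): String = {
  val lower = line.trim.toLowerCase
  lower.replaceAll("[^a-z ]", "")
}
val cleanedByName = sampleLines.map(normalize)
```

Both forms should give the same result; the named method seems preferable once
the logic grows or needs to be reused.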

Q2: What is the difference between map and flatMap? When should I use which?
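
A minimal sketch of the two calls side by side, again on a plain Scala List
rather than an RDD (an assumption made to keep it runnable locally; the sample
data is made up):

```scala
// Hypothetical input: two short lines of text.
val twoLines = List("to be", "or not")

// map: exactly one output element per input element,
// so splitting each line yields one array of words per line.
val mapped: List[Array[String]] = twoLines.map(line => line.split(" "))

// flatMap: each input element may produce zero or more outputs,
// and the results are flattened into a single list of words.
val flattened: List[String] = twoLines.flatMap(line => line.split(" "))
```

Here mapped keeps one entry per line (two arrays), while flattened holds the
four individual words, which is the shape the word count needs.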

Q3: reduceByKey((a, b) => a + b) -> Here again, this example was used in
the word count sample. I understand that it takes the values of the (K, V)
pairs and performs the function on them, e.g. +, but what do a and b
represent? What if my value is not an integer?
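
To illustrate Q3, here is a rough sketch with non-integer values. It does not
call Spark: reduceByKeyLocal is a hypothetical stand-in I wrote on top of plain
Scala collections (groupBy followed by reduce), assuming it mirrors what
reduceByKey does per key. The sample data is made up.

```scala
// Stand-in for reduceByKey on a local collection: group the values by key,
// then combine each group with a function of two values of the same type.
def reduceByKeyLocal[K, V](kvs: List[(K, V)])(f: (V, V) => V): Map[K, V] =
  kvs.groupBy(_._1).map { case (k, group) => k -> group.map(_._2).reduce(f) }

// String values: a and b are two values that share the same key.
val pairs = List(("fruit", "apple"), ("veg", "kale"), ("fruit", "pear"))
val joined = reduceByKeyLocal(pairs)((a, b) => a + "," + b)

// Tuple values: combine (sum, count) pairs field by field.
val scores = List(("a", (3.0, 1)), ("a", (5.0, 1)), ("b", (4.0, 1)))
val totals = reduceByKeyLocal(scores)((a, b) => (a._1 + b._1, a._2 + b._2))
```

In this sketch a and b are simply two values with the same key, and the
function returns one value of that same type. On a real cluster the order in
which values are combined is not guaranteed, so the function should be
associative and commutative; the string join above only has a predictable
order because it runs on a local List.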

Thanks
Eran



On Tue, Feb 4, 2014 at 1:19 PM, Akhil Das <[email protected]> wrote:

> From the Spark download page, you can download a prebuilt package. If you
> download the source package, build it against the Hadoop version that you have.
>
> You can open Spark's interactive shell in standalone local mode by issuing
> the ./spark-shell command inside the Spark directory:
>
> akhld@akhldz:/data/spark-0.8.0$ ./spark-shell
>
> Now you can run a word count example in the shell, taking input from HDFS
> and writing output back to HDFS.
>
> scala> var file =
> sc.textFile("hdfs://bigmaster:54310/sampledata/textbook.txt")
>
> scala> var count = file.flatMap(line => line.split(" ")).map(word =>
> (word, 1)).reduceByKey(_+_)
>
> scala> count.saveAsTextFile("hdfs://bigmaster:54310/sampledata/wcout")
>
> You may find similar information over here:
> http://sprism.blogspot.in/2012/11/lightning-fast-wordcount-using-spark.html
>
>
> On Tue, Feb 4, 2014 at 4:45 PM, goi cto <[email protected]> wrote:
>
>> Hi,
>> I am a newbie with Spark and Scala and am trying to find my way around.
>> I am looking for resources to learn more (if possible, by example) on how
>> to program with map & reduce functions.
>> Any good recommendations?
>> (I did the getting started guides on the site but still don't feel
>> comfortable with that)...
>>
>> --
>> Eran | CTO
>>
>
>
>
> --
> Thanks
> Best Regards
>



-- 
Eran | CTO
