Re: Spark or MR, Scala or Java?

2014-11-23 Thread Sanjay Subramanian
Thanks a ton, Ashish. sanjay From: Ashish Rangole To: Sanjay Subramanian Cc: Krishna Sankar; Sean Owen; Guillermo Ortiz; user Sent: Sunday, November 23, 2014 11:03 AM Subject: Re: Spark or MR, Scala or Java? This being a very broad topic, a discussion can quickly get subjective

Re: Spark or MR, Scala or Java?

2014-11-23 Thread Krishna Sankar
A very timely article http://rahulkavale.github.io/blog/2014/11/16/scrap-your-map-reduce/ Cheers P.S: Now reply to ALL. On Sun, Nov 23, 2014 at 7:16 PM, Krishna Sankar wrote: > Good point. > On the positive side, whether we choose the most efficient mechanism in > Scala might not be as importan

Re: Spark or MR, Scala or Java?

2014-11-23 Thread Krishna Sankar
Good point. On the positive side, whether we choose the most efficient mechanism in Scala might not be as important, as the Spark framework mediates the distributed computation. Even if there is some declarative part in Spark, we can still choose an inefficient computation path that is not apparent

Re: Spark or MR, Scala or Java?

2014-11-23 Thread Ognen Duzlevski
On Sun, Nov 23, 2014 at 1:03 PM, Ashish Rangole wrote: > Java or Scala : I knew Java already yet I learnt Scala when I came across > Spark. As others have said, you can get started with a little bit of Scala > and learn more as you progress. Once you have started using Scala for a few > weeks you

Re: Spark or MR, Scala or Java?

2014-11-23 Thread Ashish Rangole
They love SQL so I have to educate > them using Hive, Presto, Impala... so the question is what is your task or > tasks? > > > Sorry, a long non-technical answer to your question... > > Make sense? > > sanjay > > > ---------- > *From:* Krishna

Re: Spark or MR, Scala or Java?

2014-11-23 Thread Sanjay Subramanian
14 4:53 PM Subject: Re: Spark or MR, Scala or Java? Adding to already interesting answers: - "Is there any case where MR is better than Spark? I don't know in which cases I should use Spark rather than MR. When is MR faster than Spark?" - Many. MR would be better (am n

Re: Spark or MR, Scala or Java?

2014-11-22 Thread Soumya Simanta
Thanks Sean. Adding user@spark.apache.org again. On Sat, Nov 22, 2014 at 9:35 PM, Sean Owen wrote: > On Sun, Nov 23, 2014 at 2:20 AM, Soumya Simanta > wrote: > > Is the MapReduce API "simpler" or the implementation? Almost every Spark > > presentation has a slide that shows 100+ lines of Hado

Re: Spark or MR, Scala or Java?

2014-11-22 Thread Krishna Sankar
Adding to already interesting answers: - "Is there any case where MR is better than Spark? I don't know in which cases I should use Spark rather than MR. When is MR faster than Spark?" - Many. MR would be better (am not saying faster ;o)) for - Very large dataset, - Multistage ma

Re: Spark or MR, Scala or Java?

2014-11-22 Thread Sean Owen
MapReduce is simpler and narrower, which also means it is generally lighter weight, with less to know and configure, and runs more predictably. If you have a job that is truly just a few maps, with maybe one reduce, MR will likely be more efficient. Until recently its shuffle has been more develope

Re: Spark or MR, Scala or Java?

2014-11-22 Thread Denny Lee
Just to add some more stuff - there are various scenarios where traditional Hadoop makes more sense than Spark. For example, if you have a long running processing job in which you do not want to utilize too many resources of the cluster. Another example could be that you want to run a distributed e

RE: Spark or MR, Scala or Java?

2014-11-22 Thread Ashic Mahtab
Date: Sat, 22 Nov 2014 16:34:04 +0100 > Subject: Spark or MR, Scala or Java? > From: konstt2...@gmail.com > To: user@spark.apache.org > > Hello, > > I'm a newbie with Spark but I've been working with Hadoop for a while. > I have two questions. > > Is th

Spark or MR, Scala or Java?

2014-11-22 Thread Guillermo Ortiz
Hello, I'm a newbie with Spark but I've been working with Hadoop for a while. I have two questions. Is there any case where MR is better than Spark? I don't know in which cases I should use Spark rather than MR. When is MR faster than Spark? The other question is, I know Java, is it worth it to learn Sca