Check out PySpark. No Scala required.

On Friday, October 17, 2014, Adaryl "Bob" Wakefield, MBA <adaryl.wakefi...@hotmail.com> wrote:
> “The only problem with Spark adoption is the steep learning curve of Scala, and understanding the API properly.”
>
> This is why I’m looking for reasons to avoid Spark. In my mind, it’s one more thing to have to master and doesn’t really have anything to offer that can’t be done with other tools that are already inside my skillset. I spoke with some software engineers recently and basically the discussion boiled down to if you need to master Java or Scala go with Java. Three months into Java I don’t want to stop that and start learning Scala.
>
> B.
>
> *From:* kartik saxena <kartik....@gmail.com>
> *Sent:* Friday, October 17, 2014 1:12 PM
> *To:* user@hadoop.apache.org
> *Subject:* Re: Spark vs Tez
>
> I did a performance benchmark during my summer internship. I am currently a grad student. Can't reveal much about the specific project but Spark is still faster than around 4-5th iteration of Tez of the same query/dataset. By Iteration I mean utilizing the "hot-container" property of Apache Tez. See latest release of Tez and some hortonworks tutorials on their website.
>
> The only problem with Spark adoption is the steep learning curve of Scala, and understanding the API properly.
>
> Thanks
>
> On Fri, Oct 17, 2014 at 11:06 AM, Adaryl "Bob" Wakefield, MBA <adaryl.wakefi...@hotmail.com> wrote:
>
>> Does anybody have any performance figures on how Spark stacks up against Tez? If you don’t have figures, does anybody have an opinion? Spark seems so popular but I’m not really seeing why.
>> B.

--
Russell Jurney twitter.com/rjurney russell.jur...@gmail.com datasyndrome.com