RE: Need advice for Spark newbie

2015-02-26 Thread Steve Nunez
Hi Vikram, There was a recent presentation at Strata that you might find useful: Hive on Spark is Blazing Fast .. Or Is It? http://www.slideshare.net/hortonworks/hive-on-spark-is-blazing-fast-or-is-it-final Generally those conclusions mirror my own observations: on large data sets, Hive

Re: Surprising Spark SQL benchmark

2014-10-31 Thread Steve Nunez
To be fair, we (Spark community) haven’t been any better, for example this benchmark: https://databricks.com/blog/2014/10/10/spark-petabyte-sort.html For which no details or code have been released to allow others to reproduce it. I would encourage anyone doing a Spark benchmark in

Re: Breaking the previous large-scale sort record with Spark

2014-10-10 Thread Steve Nunez
Great stuff. Wonderful to see such progress in so short a time. How about some links to code and instructions so that these benchmarks can be reproduced? Regards, - Steve From: Debasish Das debasish.da...@gmail.com Date: Friday, October 10, 2014 at 8:17 To: Matei Zaharia

Re: Issues with HDP 2.4.0.2.1.3.0-563

2014-08-04 Thread Steve Nunez
I don’t think there is an hwx profile, but there probably should be. - Steve From: Patrick Wendell pwend...@gmail.com Date: Monday, August 4, 2014 at 10:08 To: Ron's Yahoo! zlgonza...@yahoo.com Cc: Ron's Yahoo! zlgonza...@yahoo.com.invalid, Steve Nunez snu...@hortonworks.com, u

Re: Issues with HDP 2.4.0.2.1.3.0-563

2014-08-04 Thread Steve Nunez
purist but just that I am not sure these are things that the project can meaningfully bother with. It makes sense to set vendor repos in the pom for convenience, and makes sense to run smoke tests in Jenkins against particular versions. $0.02 Sean On Mon, Aug 4, 2014 at 6:21 PM, Steve Nunez snu

'Proper' Build Tool

2014-07-28 Thread Steve Nunez
Gents, It seems that until recently, building via sbt was a documented process in the 0.9 overview: http://spark.apache.org/docs/0.9.0/ The section on building mentions using sbt/sbt assembly. However in the latest overview: http://spark.apache.org/docs/latest/index.html There's no mention of

Re: Working Formula for Hive 0.13?

2014-07-28 Thread Steve Nunez
On Mon, Jul 28, 2014 at 1:32 PM, Steve Nunez snu...@hortonworks.com wrote: So, do we have a short-term fix until Hive 0.14 comes out? Perhaps adding the hive-exec jar to the spark-project repo? It doesn't look like there's a release date schedule for 0.14. On 7/28/14, 10:50

Re: No such file or directory errors running tests

2014-07-27 Thread Steve Nunez
Whilst we're on this topic, I'd be interested to see if you get hive failures. I'm trying to build on a Mac using HDP and seem to be getting failures related to Parquet. I'll know for sure once I get in tomorrow and confirm with engineering, but this is likely because the version of Hive is