Build spark failed with maven

2015-02-10 Thread Yi Tian
Hi all, I got an ERROR when I built the Spark master branch with Maven (commit: 2d1e916730492f5d61b97da6c483d3223ca44315): [INFO] [INFO] [INFO] Building Spark Project Catalyst 1.3.0-SNAPSHOT [INFO] -

Spark Summit East - March 18-19 - NYC

2015-02-10 Thread Scott walent
The inaugural Spark Summit East, an event to bring the Apache Spark community together, will be in New York City on March 18, 2015. We are excited about the growth of Spark and to bring the event to the east coast. At Spark Summit East you can look forward to hearing from Matei Zaharia, Databricks

RE: Using CUDA within Spark / boosting linear algebra

2015-02-10 Thread Ulanov, Alexander
Thanks, Evan! It seems that ticket was marked as duplicate though the original one discusses slightly different topic. I was able to link netlib with MKL from BIDMat binaries. Indeed, MKL is statically linked inside a 60MB library. |A*B size | BIDMat MKL | Breeze+Netlib-MKL from BIDMat| Breez

Re: renaming SchemaRDD -> DataFrame

2015-02-10 Thread Reynold Xin
It's a good point. I will update the documentation to say that this is not meant to be subclassed externally. On Tue, Feb 10, 2015 at 12:10 PM, Koert Kuipers wrote: > thanks matei its good to know i can create them like that > > reynold, yeah somehow the words sql gets me going :) sorry... > ye

Re: renaming SchemaRDD -> DataFrame

2015-02-10 Thread Koert Kuipers
thanks matei its good to know i can create them like that reynold, yeah somehow the words sql gets me going :) sorry... yeah agreed that you need new transformations to preserve the schema info. i misunderstood and thought i had to implement the bunch but that is clearly not necessary as matei ind

Re: renaming SchemaRDD -> DataFrame

2015-02-10 Thread Reynold Xin
Koert, Don't get too hung up on the name SQL. This is exactly what you want: a collection with record-like objects with field names and runtime types. Almost all of the 40 methods are transformations for structured data, such as aggregation on a field, or filtering on a field. If all you have is

Re: renaming SchemaRDD -> DataFrame

2015-02-10 Thread Matei Zaharia
You're not really supposed to subclass DataFrame; instead you can make it from an RDD of Rows and a schema (e.g. with SQLContext.applySchema). Actually the Spark SQL data source API supports that too (org.apache.spark.sql.sources). Think of DataFrame as a container for structured data, not as a

Re: renaming SchemaRDD -> DataFrame

2015-02-10 Thread Koert Kuipers
so i understand the success of spark.sql. besides the fact that anything with the words SQL in its name will have thousands of developers running towards it because of the familiarity, there is also a genuine need for a generic RDD that holds record-like objects, with field names and runtime types.

new committer criteria

2015-02-10 Thread Imran Rashid
Hi all, We've been considering changing criteria for being a committer (http://s.apache.org/VFw), but I don't think there are any conclusions yet. I had proposed eliminating (or at least weakening) this requirement: > ...have contributed at least one major component where they have taken an > "o

FYI: Prof John Canny is giving a talk on "Machine Learning at the limit" in SF Big Analytics Meetup

2015-02-10 Thread Chester Chen
Just in case you are in San Francisco, we are having a meetup by Prof John Canny http://www.meetup.com/SF-Big-Analytics/events/220427049/ Chester

Re: Keep or remove Debian packaging in Spark?

2015-02-10 Thread Mark Hamstra
Yeah, I'm fine with that. On Mon, Feb 9, 2015 at 10:09 PM, Patrick Wendell wrote: > Mark was involved in adding this code (IIRC) and has also been the > most active in maintaining it. So I'd be interested in hearing his > thoughts on that proposal. Mark - would you be okay deprecating this > and

Re: Keep or remove Debian packaging in Spark?

2015-02-10 Thread jay vyas
@patrick @nate good idea, might as well join forces... right now in bigtop we already have - packaging of both deb and rpm versions of spark in bigtop, + - puppet recipes which work for standalone deployment, + - curation of e2e vagrant tests + bigpetstore-spark, for automated testing spark in b

Batch prediction for ALS

2015-02-10 Thread Debasish Das
Hi, Will it be possible to merge this PR to 1.3? https://github.com/apache/spark/pull/3098 The batch prediction API in ALS will be useful for those of us who want to cross-validate on prec@k and MAP... Thanks. Deb

Re: Unit tests

2015-02-10 Thread Iulian Dragoș
Thanks, Josh, I missed that PR. On Mon, Feb 9, 2015 at 7:45 PM, Josh Rosen wrote: > Hi Iulian, > > I think the AkkaUtilsSuite failure that you observed has been fixed in > https://issues.apache.org/jira/browse/SPARK-5548 / > https://github.com/apache/spark/pull/4343 > > On February 9, 2015 at 5:4

Spark On HPC Podcast

2015-02-10 Thread Brock Palen
Sorry to pollute the list. I am one half of the HPC podcast www.rce-cast.com and we are looking to feature Spark on the show. We are looking for a developer or two who can answer questions to educate the research community about Spark. Please contact me off list. It takes about an hour over the

Re: Powered by Spark: Concur

2015-02-10 Thread Paolo Platter
Thank you! Paolo Sent from my Windows Phone From: Patrick Wendell Sent: 10/02/2015 08:59 To: Paolo Platter Cc: Denny Lee; Matei Zaharia

Re: Powered by Spark: Concur

2015-02-10 Thread Patrick Wendell
Thanks Paolo - I've fixed it. On Mon, Feb 9, 2015 at 11:10 PM, Paolo Platter wrote: > Hi, > > I checked the powered by wiki too and Agile Labs should be Agile Lab. The > link is wrong too, it should be www.agilelab.it. > The description is correct. > > Thanks a lot > > Paolo > > Sent from my