Re: SPARK-13843 and future of streaming backends

2016-03-18 Thread Marcelo Vanzin
Hi Steve, thanks for the write up. On Fri, Mar 18, 2016 at 3:12 AM, Steve Loughran wrote: > If you want a separate project, eg. SPARK-EXTRAS, then it *generally* needs > to go through incubation. While normally it's the incubator PMC which > sponsors/oversees the

RE: Fwd: DF creation

2016-03-18 Thread Diwakar Dhanuskodi
Import sqlContext.implicits._ before using df (). Sent from Samsung Mobile. Original message From: satyajit vegesna Date: 19/03/2016 06:00 (GMT+05:30) To: u...@spark.apache.org, dev@spark.apache.org Cc: Subject: Fwd: DF creation Hi , I am

Re: df.dtypes -> pyspark.sql.types

2016-03-18 Thread Reynold Xin
We probably should have the alias. Is this still a problem on master branch? On Wed, Mar 16, 2016 at 9:40 AM, Ruslan Dautkhanov wrote: > Running following: > > #fix schema for gaid which should not be Double >> from pyspark.sql.types import * >> customSchema = StructType()

Re: [POWERED BY] Please add our organization

2016-03-18 Thread Jean-Baptiste Onofré
Just a detail (but important): "leverage Apache Spark" (not "leverage Spark"). My $0.01. Regards JB On 03/16/2016 05:18 PM, Craig Lukasik wrote: Name: Zaloni's Bedrock & Mica URL: http://www.zaloni.com/products/ Description: Zaloni's data lake management platform (Bedrock) and

Re: graceful shutdown in external data sources

2016-03-18 Thread Steve Loughran
On 16 Mar 2016, at 23:43, Dan Burkert wrote: After further thought, I think following both of your suggestions (adding a shutdown hook and making the threads non-daemon) may have the result I'm looking for. I'll check and see if there are other

Re: SPARK-13843 and future of streaming backends

2016-03-18 Thread Imran Rashid
On Fri, Mar 18, 2016 at 3:15 PM, Shane Curcuru wrote: > Question: why was the code removed from the Spark repo? What's the harm > in keeping it available here? Assuming the Spark PMC has no plan on releasing the code, why would we keep it in our codebase? It only makes

CfP 11th Workshop on Virtualization in High-Performance Cloud Computing (VHPC '16)

2016-03-18 Thread VHPC 16
CALL FOR PAPERS 11th Workshop on Virtualization in High-Performance Cloud Computing (VHPC '16) held in conjunction with the International Supercomputing Conference - High Performance, June 19-23, 2016, Frankfurt, Germany.

Re: [POWERED BY] Please add our organization

2016-03-18 Thread Sean Owen
(Good practice indeed but the general idea is to reference "Apache Spark" after which it's reasonable to reference "Spark", and this wiki does.) On Thu, Mar 17, 2016 at 8:13 PM, Jean-Baptiste Onofré wrote: > Just a detail (but important): "leverage Apache Spark" (not "leverage

Re: SPARK-13843 and future of streaming backends

2016-03-18 Thread Jean-Baptiste Onofré
Hi Marcelo, I quickly discussed this with Reynold this morning. I share your concerns. I fully understand that it's painful for users to wait for a Spark release to include a fix in streaming backends as it's not really related. It makes sense to provide backends "outside" of ASF, especially

Re: SPARK-13843 and future of streaming backends

2016-03-18 Thread Mattmann, Chris A (3980)
Hi Marcelo, Thanks for your reply. As a committer on the project, you *can* VETO code. For sure. Unfortunately you don’t have a binding vote on adding new PMC members/committers, and/or on releasing the software, but do have the ability to VETO. That said, if that’s not your intent, sorry for

Re: SPARK-13843 and future of streaming backends

2016-03-18 Thread Luciano Resende
On Fri, Mar 18, 2016 at 10:07 AM, Marcelo Vanzin wrote: > Hi Steve, thanks for the write up. > > On Fri, Mar 18, 2016 at 3:12 AM, Steve Loughran > wrote: > > If you want a separate project, eg. SPARK-EXTRAS, then it *generally* > needs to go through

Re: SPARK-13843 and future of streaming backends

2016-03-18 Thread Cody Koeninger
Why would a PMC vote be necessary on every code deletion? There was a Jira and pull request discussion about the submodules that have been removed so far. https://issues.apache.org/jira/browse/SPARK-13843 There's another ongoing one about Kafka specifically

Re: Spark 1.6.1 Hadoop 2.6 package on S3 corrupt?

2016-03-18 Thread Michael Armbrust
Patrick reuploaded the artifacts, so it should be fixed now. On Mar 16, 2016 5:48 PM, "Nicholas Chammas" wrote: > Looks like the other packages may also be corrupt. I’m getting the same > error for the Spark 1.6.1 / Hadoop 2.4 package. > > >

Re: Spark 1.6.1 Hadoop 2.6 package on S3 corrupt?

2016-03-18 Thread Jakob Odersky
I just experienced the issue; however, retrying the download a second time worked. Could it be that there is some load balancer/cache in front of the archive and some nodes still serve the corrupt packages? On Fri, Mar 18, 2016 at 8:00 AM, Nicholas Chammas wrote: > I'm

Fwd: DF creation

2016-03-18 Thread satyajit vegesna
Hi, I am trying to create a separate val reference to object DATA (as shown below), case class data(name:String,age:String) Creation of this object is done separately and the reference to the object is stored into val data. I use val samplerdd = sc.parallelize(Seq(data)) to create an RDD.

Re: Spark 1.6.1 Hadoop 2.6 package on S3 corrupt?

2016-03-18 Thread Nicholas Chammas
I just retried the Spark 1.6.1 / Hadoop 2.6 download and got a corrupt ZIP file. Jakob, are you sure the ZIP unpacks correctly for you? Is it the same Spark 1.6.1/Hadoop 2.6 package you had a success with? On Fri, Mar 18, 2016 at 6:11 PM Jakob Odersky wrote: > I just

Re: SPARK-13843 and future of streaming backends

2016-03-18 Thread Cody Koeninger
> Or, as Cody Koeninger suggests, having a spark-extras project in the ASF with > a focus on extras with their own support channel. To be clear, I didn't suggest that and don't think that's the best solution. I said to the people who want things done that way, which committer is going to step

Re: SPARK-13843 and future of streaming backends

2016-03-18 Thread Marcelo Vanzin
On Fri, Mar 18, 2016 at 2:12 PM, chrismattmann wrote: > So, my comment here is that any code *cannot* be removed from an Apache > project if there is a VETO issued which so far I haven't seen, though maybe > Marcelo can clarify that. No, my intention was not to veto the

Re: Spark 1.6.1 Hadoop 2.6 package on S3 corrupt?

2016-03-18 Thread Jakob Odersky
I just realized you're using a different download site. Sorry for the confusion, the link I get for a direct download of Spark 1.6.1 / Hadoop 2.6 is http://d3kbcqa49mib13.cloudfront.net/spark-1.6.1-bin-hadoop2.6.tgz On Fri, Mar 18, 2016 at 3:20 PM, Nicholas Chammas
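
For readers following the corrupt-download thread, one way to tell whether a downloaded .tgz is intact is to read it end to end rather than rely on a single unpack attempt. The following is only a minimal sketch using the Python standard library. The function name tgz_is_readable is hypothetical and not part of Spark or any tool mentioned in the thread. It checks structural integrity only, and it does not prove that the file matches what was published.

```python
import gzip
import tarfile
import zlib


def tgz_is_readable(path, chunk_size=1 << 20):
    """Return True if the gzip stream and tar headers at `path` read cleanly.

    Hypothetical helper for illustration; it checks structure only,
    not that the contents match what was published.
    """
    try:
        # Pass 1: decompress the whole gzip stream. The gzip module checks
        # the CRC32 and length trailer when it reaches the end of the stream,
        # so truncated or bit-flipped downloads fail here.
        with gzip.open(path, "rb") as stream:
            while stream.read(chunk_size):
                pass
        # Pass 2: walk the tar member headers to catch a malformed archive.
        with tarfile.open(path, "r:gz") as archive:
            for _member in archive:
                pass
    except (OSError, EOFError, zlib.error, tarfile.TarError):
        return False
    return True
```

Calling tgz_is_readable("spark-1.6.1-bin-hadoop2.6.tgz") on a local copy returns False for a truncated or damaged file. If a trusted checksum for the package is available, comparing against it is the stronger check.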