Re: [ANNOUNCE] Apache Spark 2.0.0-preview release

2016-05-26 Thread Sean Owen
I still don't see any artifacts in maven -- did it publish?

http://search.maven.org/#search%7Cga%7C1%7Cg%3A%22org.apache.spark%22

On Wed, May 25, 2016 at 10:30 AM, Reynold Xin  wrote:
> Yup I have published it to maven. Will post the link in a bit.
>
> One thing is that for developers, it might be better to use the nightly
> snapshot because that one probably has fewer bugs than the preview one.
>
>
> On Wednesday, May 25, 2016, Daniel Darabos  wrote:
>>
>> Awesome, thanks! It's very helpful for preparing for the migration. Do you
>> plan to push 2.0.0-preview to Maven too? (I for one would appreciate the
>> convenience.)




Creation of SparkML Estimators in Java broken?

2016-05-26 Thread Benjii519
Hello, 

Let me preface this with the fact that I am completely new to Spark and
Scala, so I may be missing something basic. 

I have been looking at implementing a clustering algorithm on top of SparkML
using Java, and ran into immediate problems. As a sanity check, I went to
the Java API example, but encountered the same behavior: I am unable to set
parameters on a Java-defined Estimator.

Focusing on the JavaDeveloperApiExample, as I trust that more than my code,
I encounter the exception pasted at the end of this post.

Digging around the Spark code, it looks like adding parameters through Java
is broken because the Scala params implementation uses reflection to
determine the valid parameters. This works fine for the Scala Estimators, as
they appear to declare their implementation-specific params in a trait. In
the Java case, the params belong to a generic base class, and reflection on
the params won't find anything to populate (they are all defined on the
Estimator class). Therefore, when I try to set a parameter on the estimator,
validation fails because the parameter is treated as unknown.
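
For contrast, here is a minimal sketch of the Scala-side convention described
above (the trait and names are illustrative, not the example's actual code):
each param is a member of a trait mixed into the estimator, so the
reflection-based discovery sees the estimator itself as the param's parent.

  import org.apache.spark.ml.param.{IntParam, Params}

  // Illustrative sketch: the param lives on the estimator via a mixed-in
  // trait, so Params can discover it by reflection and shouldOwn() accepts it.
  trait HasMaxIterSketch extends Params {
    val maxIter: IntParam =
      new IntParam(this, "maxIter", "maximum number of iterations")
    setDefault(maxIter -> 10)
    def getMaxIter: Int = $(maxIter)
  }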

Any feedback / suggestions? Is this a known issue? 

Thanks! 

Exception in thread "main" java.lang.IllegalArgumentException: requirement
failed: Param myJavaLogReg_d3e770dacdc9__maxIter does not belong to
myJavaLogReg_d3e770dacdc9.
    at scala.Predef$.require(Predef.scala:233)
    at org.apache.spark.ml.param.Params$class.shouldOwn(params.scala:740)
    at org.apache.spark.ml.param.Params$class.set(params.scala:618)
    at org.apache.spark.ml.PipelineStage.set(Pipeline.scala:43)
    at org.apache.spark.ml.param.Params$class.set(params.scala:604)
    at org.apache.spark.ml.PipelineStage.set(Pipeline.scala:43)
    at org.apache.spark.examples.ml.MyJavaLogisticRegression.setMaxIter(JavaDeveloperApiExample.java:144)
    at org.apache.spark.examples.ml.MyJavaLogisticRegression.init(JavaDeveloperApiExample.java:139)
    at org.apache.spark.examples.ml.MyJavaLogisticRegression.<init>(JavaDeveloperApiExample.java:111)
    at org.apache.spark.examples.ml.JavaDeveloperApiExample.main(JavaDeveloperApiExample.java:68)







Re: Dataset reduceByKey

2016-05-26 Thread Reynold Xin
Here's a ticket: https://issues.apache.org/jira/browse/SPARK-15598



On Fri, May 20, 2016 at 12:35 AM, Reynold Xin  wrote:

> Andres - this is great feedback. Let me think about it a little bit more
> and reply later.
>
>
> On Thu, May 19, 2016 at 11:12 AM, Andres Perez  wrote:
>
>> Hi all,
>>
>> We were in the process of porting an RDD program to one which uses
>> Datasets. Most things were easy to transition, but one hole in
>> functionality we found was the ability to reduce a Dataset by key,
>> something akin to PairRDDFunctions.reduceByKey. Our first attempt of adding
>> the functionality ourselves involved creating a KeyValueGroupedDataset and
>> calling reduceGroups to get the reduced Dataset.
>>
>>   class RichPairDataset[K, V: ClassTag](val ds: Dataset[(K, V)]) {
>>     def reduceByKey(func: (V, V) => V)(implicit e1: Encoder[K], e2: Encoder[V], e3: Encoder[(K, V)]): Dataset[(K, V)] =
>>       ds.groupByKey(_._1)
>>         .reduceGroups { (tup1, tup2) => (tup1._1, func(tup1._2, tup2._2)) }
>>         .map { case (k, (_, v)) => (k, v) }
>>   }
>>
>> Note that the function passed into .reduceGroups takes in the full
>> key-value pair. It'd be nicer to pass in a function that maps just the
>> values, i.e. reduceGroups(func). This would require the ability to modify
>> the values of the KeyValueGroupedDataset (which is returned by the
>> .groupByKey call on a Dataset). Such a function (e.g.,
>> KeyValueGroupedDataset.mapValues(func: V => U)) does not currently exist.
>>
>> The more important issue, however, is the inefficiency of .reduceGroups.
>> The function does not support partial aggregation (reducing map-side), and
>> as a result requires shuffling all the data in the Dataset. A more
>> efficient alternative that we explored involved creating a Dataset
>> from the KeyValueGroupedDataset by creating an Aggregator and passing it as
>> a TypedColumn to KeyValueGroupedDataset's .agg function. Unfortunately, the
>> Aggregator necessitated the creation of a zero to create a valid monoid.
>> However, the zero is dependent on the reduce function. The zero for a
>> function such as addition on Ints would be different from the zero for
>> taking the minimum over Ints, for example. The Aggregator requires that we
>> not break the rule of reduce(a, zero) == a. To do this we had to create an
>> Aggregator with a buffer type that stores the value along with a null flag
>> (using Scala's nice Option syntax yielded some mysterious errors that I
>> haven't worked through yet, unfortunately), used by the zero element to
>> signal that it should not participate in the reduce function.
>>
>> -Andy
>>
>
>
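
Below is a minimal sketch of the null-flag buffer approach Andres describes,
built on Spark's typed Aggregator so the reduce can happen map-side before
the shuffle. It is not code from this thread; the class name, encoders and
structure are illustrative assumptions.

  import org.apache.spark.sql.{Encoder, Encoders}
  import org.apache.spark.sql.expressions.Aggregator

  // The buffer carries a "seen" flag, so zero never has to be a true
  // identity element for func; it simply opts out of the reduce.
  class ReduceAggregator[V](func: (V, V) => V)(implicit vEnc: Encoder[V])
    extends Aggregator[V, (Boolean, V), V] {

    def zero: (Boolean, V) = (false, null.asInstanceOf[V])

    def reduce(b: (Boolean, V), a: V): (Boolean, V) =
      if (b._1) (true, func(b._2, a)) else (true, a)

    def merge(b1: (Boolean, V), b2: (Boolean, V)): (Boolean, V) =
      if (!b1._1) b2
      else if (!b2._1) b1
      else (true, func(b1._2, b2._2))

    def finish(reduction: (Boolean, V)): V = reduction._2

    def bufferEncoder: Encoder[(Boolean, V)] =
      Encoders.tuple(Encoders.scalaBoolean, vEnc)
    def outputEncoder: Encoder[V] = vEnc
  }

Applied to the grouped values (for example through KeyValueGroupedDataset.agg
with the aggregator's toColumn), this supports partial aggregation, unlike
reduceGroups.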


Re: changed behavior for csv datasource and quoting in spark 2.0.0-SNAPSHOT

2016-05-26 Thread Reynold Xin
Yup - but the reason we did the null handling that way was for Python,
which also affects Scala.


On Thu, May 26, 2016 at 4:17 PM, Koert Kuipers  wrote:

> ok, thanks for creating ticket.
>
> just to be clear: my example was in scala
>
> On Thu, May 26, 2016 at 7:07 PM, Reynold Xin  wrote:
>
>> This is unfortunately due to the way we handle default values in
>> Python. I agree it doesn't follow the principle of least astonishment.
>>
>> Maybe the best thing to do here is to put the actual default values in
>> the Python API for csv (and json, parquet, etc), rather than using None in
>> Python. This would require us to duplicate default values twice (once in
>> data source options, and another in the Python API), but that's probably OK
>> given they shouldn't change all the time.
>>
>> Ticket https://issues.apache.org/jira/browse/SPARK-15585
>>
>>
>>
>>
>> On Thu, May 26, 2016 at 3:35 PM, Koert Kuipers  wrote:
>>
>>> in spark 1.6.1 we used:
>>>  sqlContext.read
>>>   .format("com.databricks.spark.csv")
>>>   .option("delimiter", "~")
>>>   .option("quote", null)
>>>
>>> this effectively turned off quoting, which is a necessity for certain
>>> data formats where quoting is not supported and "\"" is a valid character
>>> itself in the data.
>>>
>>> in spark 2.0.0-SNAPSHOT we did the same thing:
>>>  sqlContext.read
>>>   .format("csv")
>>>   .option("delimiter", "~")
>>>   .option("quote", null)
>>>
>>> but this did not work: we got weird blowups where spark was trying to
>>> parse thousands of lines as if they were one record. the reason was that a
>>> (valid) quote character ("\"") was present in the data. for example
>>> a~b"c~d
>>>
>>> as it turns out setting quote to null does not turn off quoting anymore.
>>> instead it means to use the default quote character.
>>>
>>> does anyone know how to turn off quoting now?
>>>
>>> our current workaround is:
>>>  sqlContext.read
>>>   .format("csv")
>>>   .option("delimiter", "~")
>>>   .option("quote", "☃")
>>>
>>> (we assume there are no unicode snowmen in our data...)
>>>
>>>
>>>
>>
>


Re: changed behavior for csv datasource and quoting in spark 2.0.0-SNAPSHOT

2016-05-26 Thread Koert Kuipers
ok, thanks for creating ticket.

just to be clear: my example was in scala

On Thu, May 26, 2016 at 7:07 PM, Reynold Xin  wrote:

> This is unfortunately due to the way we handle default values in
> Python. I agree it doesn't follow the principle of least astonishment.
>
> Maybe the best thing to do here is to put the actual default values in the
> Python API for csv (and json, parquet, etc), rather than using None in
> Python. This would require us to duplicate default values twice (once in
> data source options, and another in the Python API), but that's probably OK
> given they shouldn't change all the time.
>
> Ticket https://issues.apache.org/jira/browse/SPARK-15585
>
>
>
>
> On Thu, May 26, 2016 at 3:35 PM, Koert Kuipers  wrote:
>
>> in spark 1.6.1 we used:
>>  sqlContext.read
>>   .format("com.databricks.spark.csv")
>>   .option("delimiter", "~")
>>   .option("quote", null)
>>
>> this effectively turned off quoting, which is a necessity for certain
>> data formats where quoting is not supported and "\"" is a valid character
>> itself in the data.
>>
>> in spark 2.0.0-SNAPSHOT we did the same thing:
>>  sqlContext.read
>>   .format("csv")
>>   .option("delimiter", "~")
>>   .option("quote", null)
>>
>> but this did not work: we got weird blowups where spark was trying to
>> parse thousands of lines as if they were one record. the reason was that a
>> (valid) quote character ("\"") was present in the data. for example
>> a~b"c~d
>>
>> as it turns out setting quote to null does not turn off quoting anymore.
>> instead it means to use the default quote character.
>>
>> does anyone know how to turn off quoting now?
>>
>> our current workaround is:
>>  sqlContext.read
>>   .format("csv")
>>   .option("delimiter", "~")
>>   .option("quote", "☃")
>>
>> (we assume there are no unicode snowmen in our data...)
>>
>>
>>
>


changed behavior for csv datasource and quoting in spark 2.0.0-SNAPSHOT

2016-05-26 Thread Koert Kuipers
in spark 1.6.1 we used:
 sqlContext.read
  .format("com.databricks.spark.csv")
  .option("delimiter", "~")
  .option("quote", null)

this effectively turned off quoting, which is a necessity for certain data
formats where quoting is not supported and "\"" is a valid character itself
in the data.

in spark 2.0.0-SNAPSHOT we did the same thing:
 sqlContext.read
  .format("csv")
  .option("delimiter", "~")
  .option("quote", null)

but this did not work: we got weird blowups where spark was trying to parse
thousands of lines as if they were one record. the reason was that a (valid)
quote character ("\"") was present in the data. for example
a~b"c~d

as it turns out setting quote to null does not turn off quoting anymore.
instead it means to use the default quote character.

does anyone know how to turn off quoting now?

our current workaround is:
 sqlContext.read
  .format("csv")
  .option("delimiter", "~")
  .option("quote", "☃")

(we assume there are no unicode snowmen in our data...)
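
One alternative worth checking: the 2.x csv option docs describe setting the
quote to an empty string (rather than null) as the way to turn quoting off.
This is an assumption about the snapshot build, not something verified in
this thread:

 sqlContext.read
  .format("csv")
  .option("delimiter", "~")
  .option("quote", "")  // empty string, not null, is documented to disable quoting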


Re: JDBC Dialect for saving DataFrame into Vertica Table

2016-05-26 Thread Reynold Xin
It's probably a good idea to have the Vertica dialect too, since it doesn't
seem like it'd be too difficult to maintain. It is not going to be as
performant as the native Vertica data source, but it is going to be much
lighter weight.


On Thu, May 26, 2016 at 3:09 PM, Mohammed Guller 
wrote:

> Vertica also provides a Spark connector. It was not GA the last time I
> looked at it, but available on the Vertica community site. Have you tried
> using the Vertica Spark connector instead of the JDBC driver?
>
>
>
> Mohammed
>
> Author: Big Data Analytics with Spark
> 
>
>
>
> *From:* Aaron Ilovici [mailto:ailov...@wayfair.com]
> *Sent:* Thursday, May 26, 2016 8:08 AM
> *To:* u...@spark.apache.org; dev@spark.apache.org
> *Subject:* JDBC Dialect for saving DataFrame into Vertica Table
>
>
>
> I am attempting to write a DataFrame of Rows to Vertica via
> DataFrameWriter's jdbc function in the following manner:
>
>
>
> dataframe.write().mode(SaveMode.Append).jdbc(url, table, properties);
>
>
>
> This works when there are no NULL values in any of the Rows in my
> DataFrame. However, when there are rows containing NULLs, I get the
> following error:
>
>
>
> ERROR Executor: Exception in task 0.0 in stage 3.0 (TID 24)
>
> java.sql.SQLFeatureNotSupportedException: [Vertica][JDBC](10220) Driver
> not capable.
>
> at com.vertica.exceptions.ExceptionConverter.toSQLException(Unknown
> Source)
>
> at
> com.vertica.jdbc.common.SPreparedStatement.checkTypeSupported(Unknown
> Source)
>
> at com.vertica.jdbc.common.SPreparedStatement.setNull(Unknown Source)
>
>
>
> This appears to be Spark's attempt to set a null value in a
> PreparedStatement, but Vertica does not understand the type upon executing
> the transaction. I see in JdbcDialects.scala that there are dialects for
> MySQL, Postgres, DB2, MsSQLServer, Derby, and Oracle.
>
>
>
> 1 - Would writing a dialect for Vertica alleviate the issue, by setting a
> 'NULL' in a type that Vertica would understand?
>
> 2 - What would be the best way to do this without a Spark patch? Scala,
> Java, make a jar and call 'JdbcDialects.registerDialect(VerticaDialect)'
> once created?
>
> 3 - Where would one find the proper mapping between Spark DataTypes and
> Vertica DataTypes? I don't see 'NULL' handling for any of the dialects,
> only the base case 'case _ => None' - is None mapped to the proper NULL
> type elsewhere?
>
>
>
> My environment: Spark 1.6, Vertica Driver 7.2.2, Java 1.7
>
>
>
> I would be happy to create a Jira and submit a pull request with the
> VerticaDialect once I figure this out.
>
>
>
> Thank you for any insight on this,
>
>
>
> *AARON ILOVICI*
> Software Engineer
>
> Marketing Engineering
>
> *WAYFAIR*
> 4 Copley Place
> Boston, MA 02116
>
> (617) 532-6100 x1231
> ailov...@wayfair.com
>
>
>
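
A minimal sketch of what the Vertica dialect discussed above might look like
(the object name and type mappings below are illustrative assumptions, not a
tested implementation):

  import java.sql.Types
  import org.apache.spark.sql.jdbc.{JdbcDialect, JdbcDialects, JdbcType}
  import org.apache.spark.sql.types._

  // Sketch of a Vertica dialect: override only the Catalyst-to-JDBC type
  // mappings the Vertica driver rejects; everything else falls back to
  // Spark's common JDBC mappings.
  case object VerticaDialect extends JdbcDialect {

    override def canHandle(url: String): Boolean =
      url.startsWith("jdbc:vertica")

    override def getJDBCType(dt: DataType): Option[JdbcType] = dt match {
      case StringType  => Some(JdbcType("VARCHAR(65000)", Types.VARCHAR))
      case BooleanType => Some(JdbcType("BOOLEAN", Types.BOOLEAN))
      case _           => None  // fall back to the default mappings
    }
  }

  // Registered once before calling DataFrameWriter.jdbc -- no Spark patch
  // needed:
  //   JdbcDialects.registerDialect(VerticaDialect)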


RE: JDBC Dialect for saving DataFrame into Vertica Table

2016-05-26 Thread Mohammed Guller
Vertica also provides a Spark connector. It was not GA the last time I looked 
at it, but available on the Vertica community site. Have you tried using the 
Vertica Spark connector instead of the JDBC driver?

Mohammed
Author: Big Data Analytics with 
Spark

From: Aaron Ilovici [mailto:ailov...@wayfair.com]
Sent: Thursday, May 26, 2016 8:08 AM
To: u...@spark.apache.org; dev@spark.apache.org
Subject: JDBC Dialect for saving DataFrame into Vertica Table

I am attempting to write a DataFrame of Rows to Vertica via DataFrameWriter's 
jdbc function in the following manner:

dataframe.write().mode(SaveMode.Append).jdbc(url, table, properties);

This works when there are no NULL values in any of the Rows in my DataFrame. 
However, when there are rows containing NULLs, I get the following error:

ERROR Executor: Exception in task 0.0 in stage 3.0 (TID 24)
java.sql.SQLFeatureNotSupportedException: [Vertica][JDBC](10220) Driver not 
capable.
at com.vertica.exceptions.ExceptionConverter.toSQLException(Unknown Source)
at com.vertica.jdbc.common.SPreparedStatement.checkTypeSupported(Unknown 
Source)
at com.vertica.jdbc.common.SPreparedStatement.setNull(Unknown Source)

This appears to be Spark's attempt to set a null value in a PreparedStatement, 
but Vertica does not understand the type upon executing the transaction. I see 
in JdbcDialects.scala that there are dialects for MySQL, Postgres, DB2, 
MsSQLServer, Derby, and Oracle.

1 - Would writing a dialect for Vertica alleviate the issue, by setting a 'NULL' 
in a type that Vertica would understand?
2 - What would be the best way to do this without a Spark patch? Scala, Java, 
make a jar and call 'JdbcDialects.registerDialect(VerticaDialect)' once created?
3 - Where would one find the proper mapping between Spark DataTypes and Vertica 
DataTypes? I don't see 'NULL' handling for any of the dialects, only the base 
case 'case _ => None' - is None mapped to the proper NULL type elsewhere?

My environment: Spark 1.6, Vertica Driver 7.2.2, Java 1.7

I would be happy to create a Jira and submit a pull request with the 
VerticaDialect once I figure this out.

Thank you for any insight on this,

AARON ILOVICI
Software Engineer
Marketing Engineering


WAYFAIR
4 Copley Place
Boston, MA 02116
(617) 532-6100 x1231
ailov...@wayfair.com




How to access the off-heap representation of cached data in Spark 2.0

2016-05-26 Thread jpivar...@gmail.com
Following up on an earlier thread, I would like to access the off-heap
representation of cached data in Spark 2.0 in order to see how Spark might be
linked to physics software written in C and C++.
I'm willing to do exploration on my own, but could somebody point me to a
place to start? I have downloaded the 2.0 preview and created a persisted
Dataset:
import scala.util.Random

case class Muon(px: Double, py: Double) {
  def pt = Math.sqrt(px*px + py*py)
}

val rdd = sc.parallelize(0 until 1 map { x =>
  Muon(Random.nextGaussian, Random.nextGaussian)
}, 10)
val df = rdd.toDF
val ds = df.as[Muon]
ds.persist()
So I have a Dataset in memory, and if I understand the blog articles
correctly, it's in off-heap memory (sun.misc.Unsafe). Is there any way I
could get a pointer to that data that I could explore with BridJ? Any hints
on how it's stored? Like, could I get started through some Djinni calls or
something?
Thanks!
-- Jim





Re: LiveListenerBus with started and stopped flags? Why both?

2016-05-26 Thread Shixiong(Ryan) Zhu
Just to prevent restarting LiveListenerBus. The internal thread cannot
be restarted.
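
In other words, a single flag cannot distinguish "never started" from
"started and then stopped". A rough sketch of the pattern (not the actual
LiveListenerBus code):

  import java.util.concurrent.atomic.AtomicBoolean

  // Sketch: start() may run exactly once; stop() is only legal after start()
  // and is idempotent. The backing thread is never restarted.
  class OnceStartStop {
    private val started = new AtomicBoolean(false)
    private val stopped = new AtomicBoolean(false)

    def start(): Unit = {
      if (!started.compareAndSet(false, true)) {
        throw new IllegalStateException("already started")
      }
      // ... start the internal listener thread exactly once ...
    }

    def stop(): Unit = {
      if (!started.get) {
        throw new IllegalStateException("cannot stop before start")
      }
      if (stopped.compareAndSet(false, true)) {
        // ... signal the listener thread to drain events and exit ...
      }
    }
  }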

On Wed, May 25, 2016 at 12:59 PM, Jacek Laskowski  wrote:

> Hi,
>
> I'm wondering why LiveListenerBus has two AtomicBoolean flags [1]?
> Could it not have just one, say started? Why does Spark have to check
> the stopped state?
>
> [1]
> https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/LiveListenerBus.scala#L49-L51
>
> Pozdrawiam,
> Jacek Laskowski
> 
> https://medium.com/@jaceklaskowski/
> Mastering Apache Spark http://bit.ly/mastering-apache-spark
> Follow me at https://twitter.com/jaceklaskowski
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
> For additional commands, e-mail: dev-h...@spark.apache.org
>
>


[RESULT][VOTE] Removing module maintainer process

2016-05-26 Thread Matei Zaharia
Thanks everyone for voting. With only +1 votes, the vote passes, so I'll update 
the contributor wiki appropriately.

+1 votes:

Matei Zaharia (binding)
Mridul Muralidharan (binding)
Andrew Or (binding)
Sean Owen (binding)
Nick Pentreath (binding)
Tom Graves (binding)
Imran Rashid (binding)
Holden Karau
Owen O'Malley

No 0 or -1 votes.

Matei


> On May 24, 2016, at 12:27 PM, Owen O'Malley  wrote:
> 
> +1 (non-binding)
> 
> I think this is an important step to improve Spark as an Apache project.
> 
> .. Owen
> 
> On Mon, May 23, 2016 at 11:18 AM, Holden Karau  wrote:
> +1 non-binding (as a contributor, anything which speeds things up is worth a
> try, and git blame is a good enough substitute for the list when figuring out 
> who to ping on a PR).
> 
> 
> On Monday, May 23, 2016, Imran Rashid  wrote:
> +1 (binding)
> 
> On Mon, May 23, 2016 at 8:13 AM, Tom Graves  wrote:
> +1 (binding)
> 
> Tom
> 
> 
> On Sunday, May 22, 2016 7:34 PM, Matei Zaharia  wrote:
> 
> 
> It looks like the discussion thread on this has only had positive replies, so 
> I'm going to call a VOTE. The proposal is to remove the maintainer process in 
> https://cwiki.apache.org/confluence/display/SPARK/Committers#Committers-ReviewProcessandMaintainers
>  given that it doesn't seem to have had a huge impact on the project, and it 
> can unnecessarily create friction in contributing. We already have +1s from 
> Mridul, Tom, Andrew Or and Imran on that thread.
> 
> I'll leave the VOTE open for 48 hours, until 9 PM EST on May 24, 2016.
> 
> Matei
> -
> To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
> For additional commands, e-mail: dev-h...@spark.apache.org
> 
> 
> 
> 
> 
> -- 
> Cell : 425-233-8271 
> Twitter: https://twitter.com/holdenkarau 
> 



Spark Job Execution halts during shuffle...

2016-05-26 Thread Priya Ch
Hello Team,


 I am trying to join 2 RDDs, where one is of size 800 MB and the other is
190 MB. During the join step, my job halts and I don't see any progress in
the execution.

This is the message I see on console -

INFO spark.MapOutputTrackerMasterEndPoint: Asked to send map output
locations for shuffle 0 to :4
INFO spark.MapOutputTrackerMasterEndPoint: Asked to send map output
locations for shuffle 1 to :4

After these messages, I don't see any progress. I am using Spark 1.6.0 with
the YARN scheduler (running in YARN client mode). My cluster configuration is
a 3-node cluster (1 master and 2 slaves). Each slave has 1 TB of hard disk
space, 300 GB of memory, and 32 cores.

HDFS block size is 128 MB.

Thanks,
Padma Ch


Merging two datafiles

2016-05-26 Thread dvlpr
Hi everyone,
I am doing some research on Spark and have a question: can we merge or
combine two data files and two index files of different jobs (on the same
RDD)? Please give me some ideas.

Thank you!







Re: Labeling Jiras

2016-05-26 Thread Steve Loughran

> On 25 May 2016, at 23:08, Sean Owen  wrote:
> 
> Yeah I think using labels is fine -- just not if they're for someone's
> internal purpose. I don't have a problem with using meaningful labels
> if they're meaningful to everyone. In fact, I'd rather be using labels
> rather than "umbrella" JIRAs.
> 
> Labels I have removed as unuseful are ones like "patch" or "important"
> or "bug". "big-endian" sounds useful. The only downside is that,
> inevitably, a label won't be consistently applied. But such is life.
> 

Labels are good in JIRA for things that span components, even transient
tagging for events like "hackathon". They don't scale to personal/team use in
the ASF; that's what Google spreadsheets are better for.

Now, what would be nice there would be for some spreadsheet plugin to pull JIRA 
status into a spreadsheet




Re: [ANNOUNCE] Apache Spark 2.0.0-preview release

2016-05-26 Thread Gurvinder Singh
On 05/26/2016 02:38 AM, Matei Zaharia wrote:
> Just wondering, what is the main use case for the Docker images -- to
> develop apps locally or to deploy a cluster? 
I use Docker images for both development and deployment on a production
cluster, as it makes sure I have the correct versions of Java and Spark.
> If the image is really just
> a script to download a certain package name from a mirror, it may be
> okay to create an official one, though it does seem tricky to make it
> properly use the right mirror.
I don't think that's an issue, as you will publish a Docker image which will
already have Spark baked in from whichever mirror you choose. The mirror
issue only arises when people want to build their own image from the
published Dockerfile; then they can change the mirror if they prefer.

Here is the link to the current Spark Dockerfile
(https://gist.github.com/gurvindersingh/8308d46995a58303b90e4bc2fc46e343)
that I use as a base; from it I can start the master and workers as I like.

- Gurvinder
> 
> Matei
> 
>> On May 25, 2016, at 6:05 PM, Luciano Resende  wrote:
>>
>>
>>
>> On Wed, May 25, 2016 at 2:34 PM, Sean Owen  wrote:
>>
>> I don't think the project would bless anything but the standard
>> release artifacts since only those are voted on. People are free to
>> maintain whatever they like and even share it, as long as it's clear
>> it's not from the Apache project.
>>
>>
>> +1
>>
>>
>> -- 
>> Luciano Resende
>> http://twitter.com/lresende1975
>> http://lresende.blogspot.com/
> 

