Re: Should python-2 be supported in Spark 3.0?

2018-09-15 Thread Erik Erlandson
In case this didn't make it onto this thread: There is a 3rd option, which is to deprecate Py2 for Spark-3.0, and remove it entirely on a later 3.x release. On Sat, Sep 15, 2018 at 11:09 AM, Erik Erlandson wrote: > On a separate dev@spark thread, I raised a question of whether or not to >

Re: Python friendly API for Spark 3.0

2018-09-15 Thread Jules Damji
+1. I think phasing out any feature or supported language as it reaches EOL is a better strategy, where possible, than a quick drop. With enough advance warning, it can gradually be dropped in 3.x; of course, there are exceptions. Cheers, Jules > On Sep 15,

from_csv

2018-09-15 Thread Maxim Gekk
Hi All, I would like to propose a new function, from_csv(), for parsing columns containing strings in CSV format. Here is my PR: https://github.com/apache/spark/pull/22379 A use case is loading a dataset from external storage, a DBMS, or systems like Kafka, where CSV content was dumped as one of
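As a rough illustration of the idea in the proposal above (turning a column of CSV-formatted strings into typed fields), here is a minimal sketch in plain Python using only the standard library. It is not the code from the PR and does not use Spark; the helper name parse_csv_value, the (name, converter) schema shape, and the sample records are all assumptions made for this example.

```python
import csv
from io import StringIO


def parse_csv_value(value, schema, sep=","):
    """Parse one CSV-formatted string into a dict keyed by field name.

    Hypothetical helper for illustration only. `schema` is a list of
    (field_name, converter) pairs, an assumed shape, not Spark's schema type.
    Returns None for a null, empty, or malformed value.
    """
    if value is None:
        return None
    # csv.reader handles quoting, so a separator inside quotes stays in the field.
    row = next(csv.reader(StringIO(value), delimiter=sep), None)
    if row is None or len(row) != len(schema):
        return None
    return {name: convert(field) for (name, convert), field in zip(schema, row)}


# Assumed sample data: each element stands in for one value of a string column.
schema = [("id", int), ("name", str), ("score", float)]
records = ["1,alice,3.5", "2,bob,4.0"]
parsed = [parse_csv_value(r, schema) for r in records]
```

The sep argument stands in for the kind of per-call options a CSV parser typically takes (delimiter, quoting, and so on).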

Re: Python friendly API for Spark 3.0

2018-09-15 Thread Maciej Szymkiewicz
There is no need to ditch Python 2. There are basically two options: - Use stub files and limit yourself to Python 3 support only. Python 3 users benefit from type hints, Python 2 users don't, but no core functionality is affected. This is the approach I've used with

Re: Python friendly API for Spark 3.0

2018-09-15 Thread Alexander Shorin
When is the Apache Spark 3.0 release due? Will it be tomorrow or sometime in the middle of 2019? I think we shouldn't care much about Python 2.x today, since quite soon its support turns into a pumpkin. For today's projects, I hope nobody takes support of 2.7 into account unless there is

Re: Python friendly API for Spark 3.0

2018-09-15 Thread Maciej Szymkiewicz
For reference, I raised the question of Python 2 support before: http://apache-spark-developers-list.1001551.n3.nabble.com/Future-of-the-Python-2-support-td20094.html On Sat, 15 Sep 2018 at 15:14, Alexander Shorin wrote: > What's the release due for Apache Spark 3.0? Will it be tomorrow or >

Re: Python friendly API for Spark 3.0

2018-09-15 Thread Reynold Xin
we can also declare python 2 as deprecated and drop it in 3.x, not necessarily 3.0. -- excuse the brevity and lower case due to wrist injury On Sat, Sep 15, 2018 at 10:33 AM Erik Erlandson wrote: > I am probably splitting hairs too finely, but I was considering the > difference between

Re: Should python-2 be supported in Spark 3.0?

2018-09-15 Thread Nicholas Chammas
As Reynold pointed out, we don't have to drop Python 2 support right off the bat. We can just deprecate it with Spark 3.0, which would allow us to actually drop it at a later 3.x release. On Sat, Sep 15, 2018 at 2:09 PM Erik Erlandson wrote: > On a separate dev@spark thread, I raised a question

Re: Python friendly API for Spark 3.0

2018-09-15 Thread Erik Erlandson
I am probably splitting hairs too finely, but I was considering the difference between improvements to the jvm-side (py4j and the scala/java code) that would make it easier to write the python layer ("python-friendly api"), and actual improvements to the python layers ("friendly python api").

Re: Python friendly API for Spark 3.0

2018-09-15 Thread Leif Walsh
Hey there, Here’s something I proposed recently that’s in this space. https://issues.apache.org/jira/plugins/servlet/mobile#issue/SPARK-24258 It’s motivated by working with a user who wanted to do some custom statistics for which they could write the numpy code, and knew in what dimensions they

Should python-2 be supported in Spark 3.0?

2018-09-15 Thread Erik Erlandson
On a separate dev@spark thread, I raised a question of whether or not to support python 2 in Apache Spark, going forward into Spark 3.0. Python-2 is going EOL at the end of 2019. The upcoming release of Spark 3.0 is an opportunity to make breaking

Re: from_csv

2018-09-15 Thread Reynold Xin
makes sense - i'd make this as consistent with to_json / from_json as possible. how would this work in sql? i.e. how would passing options in work? -- excuse the brevity and lower case due to wrist injury On Sat, Sep 15, 2018 at 2:58 AM Maxim Gekk wrote: > Hi All, > > I would like to propose

RE: Support STS to run in k8s deployment with spark deployment mode as cluster

2018-09-15 Thread Garlapati, Suryanarayana (Nokia - IN/Bangalore)
Hi, Following is the bug to track the same. https://issues.apache.org/jira/browse/SPARK-25442 Regards Surya From: Garlapati, Suryanarayana (Nokia - IN/Bangalore) Sent: Sunday, September 16, 2018 10:15 AM To: dev@spark.apache.org; Ilan Filonenko Cc: u...@spark.apache.org; Imandi, Srinivas

Support STS to run in k8s deployment with spark deployment mode as cluster

2018-09-15 Thread Garlapati, Suryanarayana (Nokia - IN/Bangalore)
Hi All, I would like to propose the following changes to support running the STS in k8s deployments with spark deployment mode as cluster. PR: https://github.com/apache/spark/pull/22433 Can you please review and provide comments? Regards Surya