Re: SPIP: Spark on Kubernetes

2017-08-30 Thread Reynold Xin
This has passed, hasn't it? On Tue, Aug 15, 2017 at 5:33 PM Anirudh Ramanathan wrote: > Spark on Kubernetes effort has been developed separately in a fork, and > linked back from the Apache Spark project as an experimental backend >

Re: Time window on Processing Time

2017-08-30 Thread madhu phatak
Hi, That's great. Thanks a lot. On Wed, Aug 30, 2017 at 10:44 AM, Tathagata Das wrote: > Yes, it can be! There is a sql function called current_timestamp() which > is self-explanatory. So I believe you should be able to do something like > > import
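The quoted reply suggests stamping each row with the time it is processed and then grouping by a time window over that stamp. As a rough illustration of that idea only (this is not the code from the truncated message, and it does not use Spark), here is a minimal plain-Python sketch. The names `window_start` and `count_by_processing_window`, the injectable `clock`, and the window size are all hypothetical choices made for this example.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

# Fixed reference point used to align window boundaries.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def window_start(ts, size):
    """Return the start of the tumbling window of length `size` that contains `ts`."""
    # Floor the elapsed time since the epoch to a whole number of windows.
    whole_windows = (ts - EPOCH) // size
    return EPOCH + whole_windows * size


def count_by_processing_window(records, size, clock=None):
    """Stamp each record with the current (processing) time and count records per window.

    `clock` is a hypothetical hook that returns the current time; it plays the
    role that current_timestamp() plays in the quoted reply.
    """
    if clock is None:
        clock = lambda: datetime.now(timezone.utc)
    counts = Counter()
    for _ in records:
        processing_time = clock()  # time of processing, not a time carried by the record
        counts[window_start(processing_time, size)] += 1
    return dict(counts)
```

Calling `count_by_processing_window(events, timedelta(minutes=1))` buckets events by the minute in which they were handled, regardless of any timestamp carried inside the events themselves.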

Re: SPIP: Spark on Kubernetes

2017-08-30 Thread vaquar khan
+1 (non-binding) Regards, Vaquar khan On Mon, Aug 28, 2017 at 5:09 PM, Erik Erlandson wrote: > > In addition to the engineering & software aspects of the native Kubernetes > community project, we have also worked at building out the community, with > the goal of providing

Re: Are there multiple processes out there running JIRA <-> Github maintenance tasks?

2017-08-30 Thread Josh Rosen
I think that's because https://issues.apache.org/jira/browse/SPARK-21728 was re-opened in JIRA and had a new PR associated with it, so the bot did the temporary issue re-assignment in order to be able to transition the issue status from "reopened" to "in progress". On Wed, Aug 30, 2017 at 1:18 PM

Re: Welcoming Saisai (Jerry) Shao as a committer

2017-08-30 Thread Joseph Bradley
Congrats! On Aug 29, 2017 9:55 AM, "Felix Cheung" wrote: > Congrats! > > -- > *From:* Wenchen Fan > *Sent:* Tuesday, August 29, 2017 9:21:38 AM > *To:* Kevin Yu > *Cc:* Meisam Fathi; dev > *Subject:* Re: Welcoming

Re: [VOTE] [SPIP] SPARK-15689: Data Source API V2

2017-08-30 Thread Ryan Blue
Maybe I'm missing something, but the high-level proposal consists of: Goals, Non-Goals, and Proposed API. What is there to discuss other than the details of the API that's being proposed? I think the goals make sense, but goals alone aren't enough to approve a SPIP. On Wed, Aug 30, 2017 at 2:46

Re: [VOTE] [SPIP] SPARK-15689: Data Source API V2

2017-08-30 Thread James Baker
I guess I was more suggesting that by coding up the powerful mode as the API, it becomes easy for someone to layer an easy mode beneath it to enable simpler datasources to be integrated (and that simple mode should be the out of scope thing). Taking a small step back here, one of the places

Re: [VOTE] [SPIP] SPARK-15689: Data Source API V2

2017-08-30 Thread Reynold Xin
Sure that's good to do (and as discussed earlier a good compromise might be to expose an interface for the source to decide which part of the logical plan they want to accept). To me everything is about cost vs benefit. In my mind, the biggest issue with the existing data source API is backward

Re: [VOTE] [SPIP] SPARK-15689: Data Source API V2

2017-08-30 Thread Reynold Xin
So we seem to be getting into a cycle of discussing more about the details of APIs than the high level proposal. The details of APIs are important to debate, but those belong more in code reviews. One other important thing is that we should avoid API design by committee. While it is extremely

Re: Are there multiple processes out there running JIRA <-> Github maintenance tasks?

2017-08-30 Thread Marcelo Vanzin
I'm still seeing some odd behavior. I just deleted my repo's branch for https://github.com/apache/spark/pull/19013 and the script seems to have done some update to the bug, since I got a bunch of e-mails. On Mon, Aug 28, 2017 at 2:34 PM, Josh Rosen wrote: > This should

Re: [VOTE] [SPIP] SPARK-15689: Data Source API V2

2017-08-30 Thread Ryan Blue
-1 (non-binding) Sometimes it takes a VOTE thread to get people to actually read and comment, so thanks for starting this one… but there’s still discussion happening on the prototype API, which hasn’t been updated. I’d like to see the proposal shaped by the ongoing discussion so that we have a

Re: Updates on migration guides

2017-08-30 Thread linguin . m . s
+1 On 2017/08/31 at 0:02, Dongjoon Hyun wrote: > +1 > >> On Wed, Aug 30, 2017 at 7:54 AM, Xiao Li wrote: >> Hi, Devs, >> >> Many questions from the open source community are actually caused by the >> behavior changes we made in each release. So far,

Re: Updates on migration guides

2017-08-30 Thread Nick Pentreath
MLlib has tried quite hard to ensure the migration guide is up to date for each release. I think generally we catch all breaking and most major behavior changes. On Wed, 30 Aug 2017 at 17:02, Dongjoon Hyun wrote: > +1 > > On Wed, Aug 30, 2017 at 7:54 AM, Xiao Li

Updates on migration guides

2017-08-30 Thread Xiao Li
Hi, Devs, Many questions from the open source community are actually caused by the behavior changes we made in each release. So far, the migration guides (e.g., https://spark.apache.org/docs/latest/sql-programming-guide.html#migration-guide) were not being properly updated. In the last few

Re: [VOTE] [SPIP] SPARK-15689: Data Source API V2

2017-08-30 Thread Reynold Xin
That might be good to do, but it seems orthogonal to this effort itself. It would be a completely different interface. On Wed, Aug 30, 2017 at 1:10 PM Wenchen Fan wrote: > OK I agree with it, how about we add a new interface to push down the > query plan, based on the

Re: [VOTE] [SPIP] SPARK-15689: Data Source API V2

2017-08-30 Thread Wenchen Fan
OK I agree with it, how about we add a new interface to push down the query plan, based on the current framework? We can mark the query-plan-push-down interface as unstable, to save the effort of designing a stable representation of query plan and maintaining forward compatibility. On Wed, Aug

Apache Spark Streaming / Spark SQL Job logs

2017-08-30 Thread Chetan Khatri
Hey Spark Dev, Can anyone suggest sample Spark Streaming / Spark SQL job logs to download? I want to play with log analytics. Thanks

Fwd: [jira] [Commented] (SPARK-21728) Allow SparkSubmit to use logging

2017-08-30 Thread Jacek Laskowski
Hi, I think that's the code change (by Marcelo Vanzin) that changed how logging works; as of now it seems not to load conf/log4j.properties by default. Can anyone explain how it's supposed to work in 2.3? I could not figure it out from the code and conf/log4j.properties is not picked up