Re: Time window on Processing Time

2017-08-29 Thread Tathagata Das
Yes, it can be! There is a SQL function called current_timestamp() which is self-explanatory. So I believe you should be able to do something like import org.apache.spark.sql.functions._ ds.withColumn("processingTime", current_timestamp()) .groupBy(window("processingTime", "1 minute")) .count
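
A minimal runnable sketch of what this reply suggests, assuming the built-in rate source as a stand-in for the real streaming input (the source, app name, and console sink below are placeholder choices, not part of the original answer):

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions._

    val spark = SparkSession.builder.appName("processing-time-window").getOrCreate()
    import spark.implicits._

    // Any streaming source works; the rate source keeps the sketch self-contained.
    val ds = spark.readStream.format("rate").load()

    // Stamp each row with the wall-clock time at which it is processed,
    // then aggregate over one-minute processing-time windows.
    val counts = ds
      .withColumn("processingTime", current_timestamp())
      .groupBy(window($"processingTime", "1 minute"))
      .count()

    counts.writeStream
      .outputMode("complete")
      .format("console")
      .start()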

Re: [VOTE] [SPIP] SPARK-15689: Data Source API V2

2017-08-29 Thread James Baker
I'll just focus on the one-by-one thing for now - it's the thing that blocks me the most. I think the place where we're most confused here is on the cost of determining whether I can push down a filter. For me, in order to work out whether I can push down a filter or satisfy a sort, I might hav
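
To make the "one-by-one" concern concrete, here is an illustrative sketch of the two API shapes being contrasted in this thread; the trait names and signatures are invented for the example and are not the proposed Data Source V2 API:

    import org.apache.spark.sql.sources.Filter

    // "One by one": the source is asked about each filter separately, so an
    // accurate answer may require re-checking metadata or re-planning per call.
    trait IncrementalPushdownSketch {
      def tryPushFilter(filter: Filter): Boolean // true if the source will evaluate it
    }

    // "Fully specified": the source sees all candidate filters at once and
    // returns the residual filters Spark must still apply, so it can make one
    // global decision about what it can satisfy.
    trait BatchPushdownSketch {
      def pushFilters(filters: Array[Filter]): Array[Filter]
    }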

Re: [VOTE] [SPIP] SPARK-15689: Data Source API V2

2017-08-29 Thread Wenchen Fan
Hi James, Thanks for your feedback! I think your concerns are all valid, but we need to make a tradeoff here. > Explicitly here, what I'm looking for is a convenient mechanism to accept a fully specified set of arguments The problem with this approach is: 1) if we wanna add more arguments in the
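
The evolvability concern can be sketched in code; the fully specified buildScan shape below mirrors the existing V1 PrunedFilteredScan, while the mix-in trait names are invented purely for illustration:

    import org.apache.spark.rdd.RDD
    import org.apache.spark.sql.Row
    import org.apache.spark.sql.sources.Filter

    // Style A: a single fully specified entry point. Supporting a new kind of
    // pushdown (a limit, a sort, ...) means changing this signature and
    // breaking every existing implementation.
    trait FullySpecifiedScanSketch {
      def buildScan(requiredColumns: Array[String], filters: Array[Filter]): RDD[Row]
    }

    // Style B: optional capabilities as separate mix-ins. A new pushdown is a
    // new trait; sources that do not implement it are left untouched.
    trait SupportsFilterPushdownSketch {
      def pushFilters(filters: Array[Filter]): Array[Filter] // filters Spark must still apply
    }
    trait SupportsLimitPushdownSketch {
      def pushLimit(limit: Int): Boolean // true if the source enforces the limit
    }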

Re: [VOTE] [SPIP] SPARK-15689: Data Source API V2

2017-08-29 Thread James Baker
Yeah, for sure. With the stable representation - agree that in the general case this is pretty intractable, it restricts the modifications that you can do in the future too much. That said, it shouldn't be as hard if you restrict yourself to the parts of the plan which are supported by the data

Re: Welcoming Saisai (Jerry) Shao as a committer

2017-08-29 Thread Felix Cheung
Congrats! From: Wenchen Fan Sent: Tuesday, August 29, 2017 9:21:38 AM To: Kevin Yu Cc: Meisam Fathi; dev Subject: Re: Welcoming Saisai (Jerry) Shao as a committer Congratulations, Saisai! On 29 Aug 2017, at 10:38 PM, Kevin Yu <keviny...@gmail.com> wrote:

Re: [build system] tonight's downtime

2017-08-29 Thread shane knapp
alright, we're back up! On Tue, Aug 29, 2017 at 9:13 AM, shane knapp wrote: > ok, we were up for a little bit, but had to take the webserver down > due to a failed disk in the RAID array. > > given that this was our only hardware casualty, i will happily give up > 25 mins for the array to rebuild

Re: Welcoming Saisai (Jerry) Shao as a committer

2017-08-29 Thread Wenchen Fan
Congratulations, Saisai! > On 29 Aug 2017, at 10:38 PM, Kevin Yu wrote: > > Congratulations, Jerry! > > On Tue, Aug 29, 2017 at 6:35 AM, Meisam Fathi > wrote: > Congratulations, Jerry! > > Thanks, > Meisam > > On Tue, Aug 29, 2017 at 1:13 AM Wang, Carson

Re: [build system] tonight's downtime

2017-08-29 Thread shane knapp
ok, we were up for a little bit, but had to take the webserver down due to a failed disk in the RAID array. given that this was our only hardware casualty, i will happily give up 25 mins for the array to rebuild itself. i'll post updates as they come. On Mon, Aug 28, 2017 at 10:30 PM, shane knap

Re: Welcoming Saisai (Jerry) Shao as a committer

2017-08-29 Thread Kevin Yu
Congratulations, Jerry! On Tue, Aug 29, 2017 at 6:35 AM, Meisam Fathi wrote: > Congratulations, Jerry! > > Thanks, > Meisam > > On Tue, Aug 29, 2017 at 1:13 AM Wang, Carson > wrote: > >> Congratulations, Saisai! >> >> >> -Original Message- >> From: Matei Zaharia [mailto:matei.zaha...@gm

Re: Welcoming Saisai (Jerry) Shao as a committer

2017-08-29 Thread Meisam Fathi
Congratulations, Jerry! Thanks, Meisam On Tue, Aug 29, 2017 at 1:13 AM Wang, Carson wrote: > Congratulations, Saisai! > > > -Original Message- > From: Matei Zaharia [mailto:matei.zaha...@gmail.com] > Sent: Tuesday, August 29, 2017 9:29 AM > To: dev > Cc: Saisai Shao > Subject: Welcomi

Re: [VOTE] [SPIP] SPARK-15689: Data Source API V2

2017-08-29 Thread Reynold Xin
James, Thanks for the comment. I think you just pointed out a trade-off between expressiveness and API simplicity, compatibility and evolvability. For the max expressiveness, we'd want the ability to expose full query plans, and let the data source decide which part of the query plan can be pushed

Fwd: spark dataframe jdbc Amazon RDS problem

2017-08-29 Thread 刘虓
+dev -- Forwarded message -- From: 刘虓 Date: 2017-08-27 1:02 GMT+08:00 Subject: Re: spark dataframe jdbc Amazon RDS problem To: user my code is here: from pyspark.sql import SparkSession spark = SparkSession.builder.getOrCreate() mysql_jdbc_url = 'mydb/test' table = "test" props
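
The preview shows a PySpark attempt; here is a minimal sketch of an equivalent JDBC read in Scala, with placeholder host, database, table, and credentials (note that the url option normally needs a full jdbc:mysql://host:port/db form rather than just 'mydb/test'):

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder.appName("jdbc-rds-read").getOrCreate()

    // All connection details below are placeholders; the MySQL connector jar
    // must be on the driver and executor classpath.
    val df = spark.read
      .format("jdbc")
      .option("url", "jdbc:mysql://my-rds-host:3306/mydb")
      .option("dbtable", "test")
      .option("user", "myuser")
      .option("password", "mypassword")
      .load()

    df.show()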

RE: Welcoming Saisai (Jerry) Shao as a committer

2017-08-29 Thread Wang, Carson
Congratulations, Saisai! -Original Message- From: Matei Zaharia [mailto:matei.zaha...@gmail.com] Sent: Tuesday, August 29, 2017 9:29 AM To: dev Cc: Saisai Shao Subject: Welcoming Saisai (Jerry) Shao as a committer Hi everyone, The PMC recently voted to add Saisai (Jerry) Shao as a co