Re: code freeze and branch cut for Apache Spark 2.4

2018-08-29 Thread Wenchen Fan
A few updates on this thread: We still have a blocking issue, the repartition correctness bug: https://github.com/apache/spark/pull/22112 It's close to merging. There are a few PRs to fix Scala 2.12 issues. I think they will keep coming up and we don't need to block Spark 2.4 on this. All other

Re: [DISCUSS] move away from python doctests

2018-08-29 Thread Maciej Szymkiewicz
Hi Imran, On Wed, 29 Aug 2018 at 22:26, Imran Rashid wrote: > Hi Li, > > yes that makes perfect sense. That more-or-less is the same as my view, > though I framed it differently. I guess in that case, I'm really asking: > > Can pyspark changes please be accompanied by more unit tests, and not

Re: SPIP: Executor Plugin (SPARK-24918)

2018-08-29 Thread Mridul Muralidharan
+1. I left a couple of comments in NiharS's PR, but this is very useful to have in Spark! Regards, Mridul On Fri, Aug 3, 2018 at 10:00 AM Imran Rashid wrote: > > I'd like to propose adding a plugin API for Executors, primarily for > instrumentation and debugging >

Re: [DISCUSS] move away from python doctests

2018-08-29 Thread Imran Rashid
Hi Li, yes that makes perfect sense. That more-or-less is the same as my view, though I framed it differently. I guess in that case, I'm really asking: Can pyspark changes please be accompanied by more unit tests, and not assume we're getting coverage from doctests? Imran On Wed, Aug 29,

Re: Joining DataFrames derived from the same source yields confusing/incorrect results

2018-08-29 Thread Tomasz Gawęda
Hi, The tweet linked on the issue suggests some Spark error, but I didn't dig into it to find the root cause. At the least, it's quite confusing behaviour. Best regards, Tomek On 29.08.2018 at 6:44 PM, Nicholas Chammas wrote: Dunno if I made a silly mistake, but I wanted to bring some attention

Re: [DISCUSS] move away from python doctests

2018-08-29 Thread Li Jin
Hi Imran, My understanding is that doctests and unittests are orthogonal - doctests are used to make sure docstring examples are correct and are not meant to replace unittests. Functionalities are covered by unit tests to ensure correctness and doctests are used to test the docstring, not the
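To make the distinction in this message concrete, here is a minimal sketch in plain Python (not PySpark code). The names `normalize`, `make_names`, and `NormalizeTest` are hypothetical and chosen only for illustration. The doctest checks that the docstring example is correct, while the unit tests cover behaviour and reuse a shared helper, which is awkward to do from inside a docstring.

```python
import doctest
import unittest


def normalize(name):
    """Lower-case a column name and trim surrounding whitespace.

    The example below is a doctest: it verifies the documentation itself.

    >>> normalize("  User_ID ")
    'user_id'
    """
    return name.strip().lower()


def make_names():
    # Hypothetical shared fixture: an ordinary helper that several
    # unit tests can call, something a docstring cannot easily share.
    return ["  User_ID ", "AGE", "city  "]


class NormalizeTest(unittest.TestCase):
    def test_all_names_normalized(self):
        # Functional coverage lives here, not in the docstring.
        self.assertEqual(
            [normalize(n) for n in make_names()],
            ["user_id", "age", "city"],
        )

    def test_idempotent(self):
        # Normalizing twice should give the same result as once.
        for n in make_names():
            self.assertEqual(normalize(normalize(n)), normalize(n))
```

Saved as a module, the docstring example can be checked with `python -m doctest` and the test class with `python -m unittest`; the two runs are independent, which is the sense in which the two kinds of test are orthogonal.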

[DISCUSS] move away from python doctests

2018-08-29 Thread Imran Rashid
Hi, I'd like to propose that we move away from such heavy reliance on doctests in Python, and move towards more traditional unit tests. The main reason is that it's hard to share test code in doctests. For example, I was just looking at

Joining DataFrames derived from the same source yields confusing/incorrect results

2018-08-29 Thread Nicholas Chammas
Dunno if I made a silly mistake, but I wanted to bring some attention to this issue in case there was something serious going on here that might affect the upcoming release. https://issues.apache.org/jira/plugins/servlet/mobile#issue/SPARK-25150 Nick

Re: [VOTE] SPARK 2.3.2 (RC5)

2018-08-29 Thread chenliang613
Hi, Is there any new progress? Will RC6 start soon? Regards, Liang Saisai Shao wrote > There's still another one, SPARK-25114. > > I will wait for several days in case some other blockers come up. > > Thanks > Saisai > > > > Wenchen Fan > cloud0fan@ > wrote on Wed, Aug 15, 2018 at 10:19 AM: > >> SPARK-25051 is