I agree, we should not work around the test case but rather understand
and fix the root cause.
The closure cleaner should have nulled out the references and allowed it
to be serialized.
Regards,
Mridul
On Sun, Aug 5, 2018 at 8:38 PM Wenchen Fan wrote:
>
> It seems to me that the closure cleaner fail
Answering one of the missed questions:
> I am not sure how you were planning to expose the state key groups at the
API level and if it would be transparent.
I was thinking about introducing a new configuration: it may look like adding
an unnecessary configuration, but I thought it would help elasticity
("adap
Great help from the community!
On Sun, Aug 5, 2018 at 6:17 PM Xiao Li wrote:
> FYI, the new hints have been merged. They will be available in the
> upcoming release (Spark 2.4).
>
> *John Zhuge*, thanks for your work! Really appreciate it! Please submit
> more PRs and help the community improve
https://issues.apache.org/jira/browse/SPARK-24940
The PR has been merged to 2.4.0.
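(For reference, a rough sketch of how the new hints from SPARK-24940 could be
used once 2.4 is out; treat the exact syntax as illustrative and check the 2.4
docs:)

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder().master("local[*]").getOrCreate()
    import spark.implicits._

    Seq((1, "a"), (2, "b")).toDF("id", "value").createOrReplaceTempView("src")

    // REPARTITION shuffles to the requested number of partitions,
    // COALESCE narrows to it without a shuffle.
    spark.sql("SELECT /*+ REPARTITION(100) */ * FROM src")
    spark.sql("SELECT /*+ COALESCE(10) */ * FROM src")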
On Sun, Aug 5, 2018 at 6:06 PM Koert Kuipers wrote:
> lukas,
> what is the jira ticket for this? i would like to follow its activity.
> thanks!
> koert
>
> On Wed, Jul 25, 2018 at 5:32 PM, lukas nalezenec wrote
It seems to me that the closure cleaner fails to clean up something. The
failed test case defines a serializable class inside the test case, and the
class doesn't refer to anything in the outer class. Ideally it should be
serializable after the closure is cleaned up.
This is somehow a very weird way to d
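(A rough, hypothetical sketch of the shape of such a test, just to make the
discussion concrete; the names are made up, and whether the map below actually
hits a NotSerializableException depends on the Scala version and how well the
closure cleaner strips the outer reference:)

    import org.apache.spark.SparkContext
    import org.scalatest.FunSuite

    class ClosureCleanerSketchSuite extends FunSuite {
      test("locally defined serializable class used in a closure") {
        // Defined inside the test and referencing nothing from the enclosing
        // suite, so ideally it serializes once the outer pointer is nulled.
        case class MyData(value: Int)
        val sc = new SparkContext("local[2]", "closure-cleaner-sketch")
        try {
          val sum = sc.parallelize(1 to 10).map(i => MyData(i).value).sum()
          assert(sum == 55)
        } finally {
          sc.stop()
        }
      }
    }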
FYI, the new hints have been merged. They will be available in the upcoming
release (Spark 2.4).
*John Zhuge*, thanks for your work! Really appreciate it! Please submit
more PRs and help the community improve Spark. : )
Xiao
2018-08-05 21:06 GMT-04:00 Koert Kuipers :
> lukas,
> what is the jira
lukas,
what is the jira ticket for this? i would like to follow its activity.
thanks!
koert
On Wed, Jul 25, 2018 at 5:32 PM, lukas nalezenec wrote:
> Hi,
> Yes, this feature is planned - Spark should soon be able to repartition
> output by size.
> Lukas
>
>
> On Wed, Jul 25, 2018 at 23:26, user F
Hi Assaf,
No idea (I don't remember ever wondering about it before), but why
not do this (untested):
import org.apache.spark.sql.SparkSession

trait MySparkTestTrait {
  // <-- are you sure you don't need master?
  lazy val spark: SparkSession = SparkSession.builder().getOrCreate()
  import spark.implicits._
}
Wouldn't that import work?
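(If it compiles, usage would presumably look something like the following.
FunSuite is an assumption, and a master is assumed to be set elsewhere, per
the "are you sure you don't need master?" note above. Note the implicits still
need to be imported locally: an import inside the trait body is only in scope
within that body.)

    import org.scalatest.FunSuite

    class MyQuerySuite extends FunSuite with MySparkTestTrait {
      test("toDS works with the shared session") {
        import spark.implicits._  // re-import locally; the trait's import does not propagate
        val ds = Seq(1, 2, 3).toDS()
        assert(ds.count() == 3)
      }
    }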
Makes sense; not sure if closure cleaning is related to the last one, for
example, or to the others. The last one is a bit weird, unless I am missing
something about the LegacyAccumulatorWrapper logic.
Stavros
On Sun, Aug 5, 2018 at 10:23 PM, Sean Owen wrote:
> Yep that's what I did. There are more fail
Yep that's what I did. There are more failures with different resolutions.
I'll open a JIRA and PR and ping you, to make sure that the changes are all
reasonable, and not an artifact of missing something about closure cleaning
in 2.12.
In the meantime having a 2.12 build up and running for master
Hi Sean,
I ran a quick build, so the failing tests seem to be:
- SPARK-17644: After one stage is aborted for too many failed
attempts, subsequent stages still behave correctly on fetch failures
*** FAILED ***
A job with one fetch failure should eventually succeed
(DAGSchedulerSuite.scala:2422)
Hi all,
I have been playing a bit with SQLImplicits and noticed that it is an
abstract class. I was wondering why that is, since it has no constructor.
Because it is an abstract class, a test trait cannot extend it and still
be a trait.
Consider the following:
trait MySp
Shane et al - could we get a test job in Jenkins to test the Scala 2.12
build? I don't think I have the access or expertise for it, though I could
probably copy and paste a job. I think we just need to clone the, say,
master Maven Hadoop 2.7 job, and add two steps: run
"./dev/change-scala-version.s
Yes, it's a reasonable argument that putting N more external integration
modules on the default spark-submit classpath might bring in more
third-party dependencies that clash or something. I think the convenience
factor isn't a big deal; users can also just write a dependency on said
module in thei
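(For instance, with sbt that would be a one-liner; the module and version
below are just an example:)

    // build.sbt: pull the integration module in explicitly rather than relying
    // on it being on the default spark-submit classpath.
    libraryDependencies += "org.apache.spark" %% "spark-sql-kafka-0-10" % "2.3.1"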
"coalesce" looks like working: I misunderstood it as an efficient version
of "repartition" which does shuffle, so expected it would trigger shuffle.
My proposal would be covered as using "coalesce": thanks Joseph for
correction. Let me abandon the proposal.
We may still miss for now is documentati
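(For anyone following along, a minimal illustration of the difference,
assuming a local session:)

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder().master("local[4]").getOrCreate()
    val df = spark.range(0, 1000, 1, 100)   // start with 100 partitions

    df.coalesce(10).rdd.getNumPartitions    // 10: narrow dependency, no shuffle
    df.repartition(10).rdd.getNumPartitions // 10: but via a full shuffle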