[GitHub] spark pull request: [SPARK-13801][SQL] DataFrame.col should return...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11632#issuecomment-200196298 Merged build finished. Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13801][SQL] DataFrame.col should return...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11632#issuecomment-200196305 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/53899/ Test FAILed.
[GitHub] spark pull request: [SPARK-13801][SQL] DataFrame.col should return...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11632#issuecomment-200196180 **[Test build #53899 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53899/consoleFull)** for PR 11632 at commit [`7963244`](https://github.com/apache/spark/commit/7963244f064c4950781020b929f4fab5feedb2fe).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14069][SQL] Improve SparkStatusTracker ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11888#issuecomment-200196073 **[Test build #53884 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53884/consoleFull)** for PR 11888 at commit [`fe80390`](https://github.com/apache/spark/commit/fe8039013efd2e4504168dff264f725fb56665a9).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark pull request: [HOTFIX][SQL]Don't stop ContinuousQuery in qui...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11909#issuecomment-200195453 **[Test build #53902 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53902/consoleFull)** for PR 11909 at commit [`b2b0378`](https://github.com/apache/spark/commit/b2b03782d7614732ea0484d03a7cf393c46447b7).
[GitHub] spark pull request: SPARK-14091 [core] Consider improving performa...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11911#issuecomment-200195242 Can one of the admins verify this patch?
[GitHub] spark pull request: [HOTFIX][SQL]Don't stop ContinuousQuery in qui...
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/11909#issuecomment-200194626 retest this please
[GitHub] spark pull request: [SPARK-13656][SQL] Remove spark.sql.parquet.ca...
Github user maropu commented on the pull request: https://github.com/apache/spark/pull/11576#issuecomment-200194463 @yhuai oh, I got you. I'll close this for now and keep an eye on this.
[GitHub] spark pull request: [SPARK-13656][SQL] Remove spark.sql.parquet.ca...
Github user maropu closed the pull request at: https://github.com/apache/spark/pull/11576
[GitHub] spark pull request: SPARK-14091 [core] Consider improving performa...
GitHub user rajeshbalamohan opened a pull request: https://github.com/apache/spark/pull/11911

SPARK-14091 [core] Consider improving performance of SparkContext.get…

## What changes were proposed in this pull request?

Currently SparkContext.getCallSite() makes a call to Utils.getCallSite():

    private[spark] def getCallSite(): CallSite = {
      val callSite = Utils.getCallSite()
      CallSite(
        Option(getLocalProperty(CallSite.SHORT_FORM)).getOrElse(callSite.shortForm),
        Option(getLocalProperty(CallSite.LONG_FORM)).getOrElse(callSite.longForm)
      )
    }

However, in some places Utils.withDummyCallSite(sc) is invoked to avoid expensive thread dumps within getCallSite(). But Utils.getCallSite() is evaluated earlier, so the thread dumps are computed anyway. This has an impact when lots of RDDs are created (e.g. it spends close to 3-7 seconds when 1000+ RDDs are present, which can be significant when the entire query runtime is on the order of 10-20 seconds).

Creating this JIRA to consider evaluating getCallSite only when needed.

## How was this patch tested?

No new test cases were added. The following standalone test was tried out manually. Also, built the entire Spark binary and tried a few SQL queries from TPC-DS and TPC-H on a multi-node cluster.

    def run(): Unit = {
      val conf = new SparkConf()
      val sc = new SparkContext("local[1]", "test-context", conf)
      val start: Long = System.currentTimeMillis()
      val confBroadcast = sc.broadcast(new SerializableConfiguration(new Configuration()))
      Utils.withDummyCallSite(sc) {
        //Large tables end up creating 5500 RDDs
        for (i <- 1 to 5000) {
          val testRDD = new HadoopRDD(sc, confBroadcast, None, null,
            classOf[NullWritable], classOf[Writable], 10)
        }
      }
      val end: Long = System.currentTimeMillis()
      println("Time taken : " + (end - start))
    }

    def main(args: Array[String]): Unit = {
      run()
    }

…CallSite() (rbalamohan)

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/rajeshbalamohan/spark SPARK-14091

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/11911.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #11911

commit dba630b854d6fdb298f8ef7ed25acf497f0eeebe
Author: Rajesh Balamohan
Date: 2016-03-23T04:57:01Z

    SPARK-14091 [core] Consider improving performance of SparkContext.getCallSite() (rbalamohan)
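The idea in the pull request above, evaluating the call site only when it is needed, can be illustrated with a small standalone sketch. This is not the patch itself: `CallSite` here is a simplified stand-in, and `LazyCallSiteDemo`, `computeCallSite`, `eager`, and `deferred` are hypothetical names invented for the illustration. It relies only on `Option.getOrElse` taking its default by name and on `lazy val` being evaluated at most once.

```scala
// Simplified stand-in for Spark's CallSite; an assumption made for this sketch.
final case class CallSite(shortForm: String, longForm: String)

object LazyCallSiteDemo {
  // Counts how often the expensive stack walk actually runs.
  var expensiveCalls = 0

  // Hypothetical stand-in for Utils.getCallSite(): walks the current thread's stack.
  def computeCallSite(): CallSite = {
    expensiveCalls += 1
    val frames = Thread.currentThread().getStackTrace
    CallSite(
      frames.headOption.map(_.getMethodName).getOrElse("unknown"),
      frames.mkString("\n"))
  }

  // Eager variant: the stack walk happens even when both overrides are set.
  def eager(shortOverride: Option[String], longOverride: Option[String]): CallSite = {
    val cs = computeCallSite()
    CallSite(shortOverride.getOrElse(cs.shortForm), longOverride.getOrElse(cs.longForm))
  }

  // Deferred variant: getOrElse takes its argument by name, so the lazy val
  // is only forced when at least one override is missing.
  def deferred(shortOverride: Option[String], longOverride: Option[String]): CallSite = {
    lazy val cs = computeCallSite()
    CallSite(shortOverride.getOrElse(cs.shortForm), longOverride.getOrElse(cs.longForm))
  }
}
```

With both overrides present, `deferred` returns the same value as `eager` without touching the stack; with either one missing it falls back to a single stack walk.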
[GitHub] spark pull request: [SPARK-14085] [SQL] Star Expansion for Hash
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/11904#discussion_r57112678

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
@@ -414,6 +399,22 @@ class Analyzer(
     }
     /**
+     * Build a project list for Project/Aggregate and expand the star if possible
+     */
+    private def buildExpandedProjectList(
+        exprs: Seq[NamedExpression],
+        plan: UnaryNode): Seq[NamedExpression] = {
+      exprs.flatMap {
+        // Using Dataframe/Dataset API: testData2.groupBy($"a", $"b").agg($"*")
+        case s: Star => s.expand(plan.child, resolver)
+        // Using SQL API without running ResolveAlias: SELECT * FROM testData2 group by a, b
+        case UnresolvedAlias(s: Star, _) => expandStarExpression(s, plan.child) :: Nil
--- End diff --

I think it should be `s.expand(plan.child, resolver)`. What you do here just gets rid of the outer `UnresolvedAlias` and resolves the star in the next round.
[GitHub] spark pull request: [SPARK-14085] [SQL] Star Expansion for Hash
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/11904#discussion_r57112464

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
@@ -414,6 +399,22 @@ class Analyzer(
     }
     /**
+     * Build a project list for Project/Aggregate and expand the star if possible
+     */
+    private def buildExpandedProjectList(
+        exprs: Seq[NamedExpression],
+        plan: UnaryNode): Seq[NamedExpression] = {
+      exprs.flatMap {
+        // Using Dataframe/Dataset API: testData2.groupBy($"a", $"b").agg($"*")
+        case s: Star => s.expand(plan.child, resolver)
+        // Using SQL API without running ResolveAlias: SELECT * FROM testData2 group by a, b
+        case UnresolvedAlias(s: Star, _) => expandStarExpression(s, plan.child) :: Nil
--- End diff --

do we need this case? I think the next case can also handle it.
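The difference the reviewer points at, expanding an aliased star right away versus only stripping the alias and leaving the star for a later pass, can be shown with a toy model. The types below (`Expr`, `Column`, `Star`, `UnresolvedAlias`) are deliberately simplified stand-ins and do not match Catalyst's real classes or signatures; `expandInOnePass` and `stripAliasOnly` are hypothetical names used only for this sketch.

```scala
object StarExpansionDemo {
  // Toy expression tree; an assumption for illustration, not Catalyst's API.
  sealed trait Expr
  final case class Column(name: String) extends Expr
  case object Star extends Expr
  final case class UnresolvedAlias(child: Expr) extends Expr

  // Expands a bare star and an aliased star in a single pass over the
  // project list, replacing each with the child's output columns.
  def expandInOnePass(exprs: Seq[Expr], childOutput: Seq[String]): Seq[Expr] =
    exprs.flatMap {
      case Star                  => childOutput.map(Column(_))
      case UnresolvedAlias(Star) => childOutput.map(Column(_))
      case other                 => Seq(other)
    }

  // Only removes the outer alias; the star survives and has to be
  // expanded by a subsequent pass.
  def stripAliasOnly(exprs: Seq[Expr]): Seq[Expr] =
    exprs.flatMap {
      case UnresolvedAlias(Star) => Seq(Star)
      case other                 => Seq(other)
    }
}
```

In this model both routes end with the same columns; the second simply needs one more round to get there.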
[GitHub] spark pull request: [SPARK-12443][SQL] encoderFor should support D...
Github user cloud-fan commented on the pull request: https://github.com/apache/spark/pull/10399#issuecomment-200192093 cc @marmbrus for final sign off.
[GitHub] spark pull request: [SPARK-14073][Streaming][test-maven]Move flume...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11895#issuecomment-200192040 **[Test build #53901 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53901/consoleFull)** for PR 11895 at commit [`4352499`](https://github.com/apache/spark/commit/435249984ded6e5f56c464de1d7c519a18a8d5e5).
[GitHub] spark pull request: [SPARK-14073][Streaming][test-maven]Move flume...
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/11895#issuecomment-200191973 retest this please
[GitHub] spark pull request: [SPARK-13642][Yarn][1.6-backport] Properly han...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11690#issuecomment-200191931 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-13642][Yarn][1.6-backport] Properly han...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11690#issuecomment-200191932 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/53898/ Test PASSed.
[GitHub] spark pull request: [SPARK-13642][Yarn][1.6-backport] Properly han...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11690#issuecomment-200191855 **[Test build #53898 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53898/consoleFull)** for PR 11690 at commit [`b75927e`](https://github.com/apache/spark/commit/b75927e0a5a14da091157b746d0244482c0a54e7).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14061][SQL] implement CreateMap
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11879#issuecomment-200191244 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/53878/ Test PASSed.
[GitHub] spark pull request: [SPARK-14061][SQL] implement CreateMap
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11879#issuecomment-200191243 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-12443][SQL] encoderFor should support D...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10399#issuecomment-200191181 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-12443][SQL] encoderFor should support D...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10399#issuecomment-200191182 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/53885/ Test PASSed.
[GitHub] spark pull request: [SPARK-12443][SQL] encoderFor should support D...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10399#issuecomment-200191047 **[Test build #53885 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53885/consoleFull)** for PR 10399 at commit [`de22b50`](https://github.com/apache/spark/commit/de22b50aeae20cdaf013c8e5ea985bb30338553b).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14061][SQL] implement CreateMap
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11879#issuecomment-200191109 **[Test build #53878 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53878/consoleFull)** for PR 11879 at commit [`a4b6827`](https://github.com/apache/spark/commit/a4b682765957a71e72b76515bd837657fc9569c0).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14073][Streaming][test-maven]Move flume...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11895#issuecomment-200190262 **[Test build #53896 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53896/consoleFull)** for PR 11895 at commit [`4352499`](https://github.com/apache/spark/commit/435249984ded6e5f56c464de1d7c519a18a8d5e5).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
   * `public final class JavaFlumeEventCount`
   * `class SparkSink extends AbstractSink with Logging with Configurable`
   * `class FlumeInputDStream[T: ClassTag](`
   * `class SparkFlumeEvent() extends Externalizable`
   * `class FlumeEventServer(receiver: FlumeReceiver) extends AvroSourceProtocol`
   * `class FlumeReceiver(`
   * `class CompressionChannelPipelineFactory extends ChannelPipelineFactory`
   * `class FlumeUtils(object):`
[GitHub] spark pull request: [SPARK-14073][Streaming][test-maven]Move flume...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11895#issuecomment-200190277 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/53896/ Test FAILed.
[GitHub] spark pull request: [SPARK-14073][Streaming][test-maven]Move flume...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11895#issuecomment-200190275 Merged build finished. Test FAILed.
[GitHub] spark pull request: [HOTFIX][SQL]Don't stop ContinuousQuery in qui...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11909#issuecomment-200190070 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/53881/ Test PASSed.
[GitHub] spark pull request: [HOTFIX][SQL]Don't stop ContinuousQuery in qui...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11909#issuecomment-200190065 Merged build finished. Test PASSed.
[GitHub] spark pull request: [HOTFIX][SQL]Don't stop ContinuousQuery in qui...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11909#issuecomment-200189554 **[Test build #53881 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53881/consoleFull)** for PR 11909 at commit [`b2b0378`](https://github.com/apache/spark/commit/b2b03782d7614732ea0484d03a7cf393c46447b7).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14014] [SQL] Replace existing catalog w...
Github user yhuai commented on the pull request: https://github.com/apache/spark/pull/11836#issuecomment-200187640 @andrewor14 Let's create a jira for that ignored test :)
[GitHub] spark pull request: [SPARK-13995][SQL] Extract correct IsNotNull c...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11809#issuecomment-200186748 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-13995][SQL] Extract correct IsNotNull c...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11809#issuecomment-200186752 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/53883/ Test FAILed.
[GitHub] spark pull request: [SPARK-13995][SQL] Extract correct IsNotNull c...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11809#issuecomment-200186256 **[Test build #53883 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53883/consoleFull)** for PR 11809 at commit [`56ca15f`](https://github.com/apache/spark/commit/56ca15fa348d0488ca689f9fec2dd912d0625fc4).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
   * `case class Cast(child: Expression, dataType: DataType) extends UnaryExpression with NullIntolerant`
   * `case class UnaryMinus(child: Expression) extends UnaryExpression`
   * `case class UnaryPositive(child: Expression)`
   * `case class Abs(child: Expression)`
   * `case class Add(left: Expression, right: Expression) extends BinaryArithmetic with NullIntolerant`
   * `case class Subtract(left: Expression, right: Expression)`
   * `case class Multiply(left: Expression, right: Expression)`
   * `case class Divide(left: Expression, right: Expression)`
   * `case class Remainder(left: Expression, right: Expression)`
   * `case class Pmod(left: Expression, right: Expression) extends BinaryArithmetic with NullIntolerant`
   * `case class EqualTo(left: Expression, right: Expression)`
   * `case class LessThan(left: Expression, right: Expression)`
   * `case class LessThanOrEqual(left: Expression, right: Expression)`
   * `case class GreaterThan(left: Expression, right: Expression)`
   * `case class GreaterThanOrEqual(left: Expression, right: Expression)`
[GitHub] spark pull request: [SPARK-14014] [SQL] Replace existing catalog w...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11836#issuecomment-200186068 **[Test build #2665 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2665/consoleFull)** for PR 11836 at commit [`5ea8469`](https://github.com/apache/spark/commit/5ea8469aafd347a7d1e69077de8d31a8f0167b25).
[GitHub] spark pull request: [SPARK-14014] [SQL] Replace existing catalog w...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11836#issuecomment-200186037 **[Test build #2663 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2663/consoleFull)** for PR 11836 at commit [`5ea8469`](https://github.com/apache/spark/commit/5ea8469aafd347a7d1e69077de8d31a8f0167b25).
[GitHub] spark pull request: [SPARK-14014] [SQL] Replace existing catalog w...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11836#issuecomment-200186105 **[Test build #53900 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53900/consoleFull)** for PR 11836 at commit [`5ea8469`](https://github.com/apache/spark/commit/5ea8469aafd347a7d1e69077de8d31a8f0167b25).
[GitHub] spark pull request: [SPARK-14014] [SQL] Replace existing catalog w...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11836#issuecomment-200186044 **[Test build #2664 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2664/consoleFull)** for PR 11836 at commit [`5ea8469`](https://github.com/apache/spark/commit/5ea8469aafd347a7d1e69077de8d31a8f0167b25).
[GitHub] spark pull request: [SPARK-14014] [SQL] Replace existing catalog w...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11836#issuecomment-200186012 **[Test build #2662 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2662/consoleFull)** for PR 11836 at commit [`5ea8469`](https://github.com/apache/spark/commit/5ea8469aafd347a7d1e69077de8d31a8f0167b25).
[GitHub] spark pull request: [SPARK-14014] [SQL] Replace existing catalog w...
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/11836#issuecomment-200185921 OK, now this should pass tests...
[GitHub] spark pull request: [SPARK-14014] [SQL] Replace existing catalog w...
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/11836#discussion_r57111259 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveContextSuite.scala --- @@ -0,0 +1,38 @@ +/* +* Licensed to the Apache Software Foundation (ASF) under one or more +* contributor license agreements. See the NOTICE file distributed with +* this work for additional information regarding copyright ownership. +* The ASF licenses this file to You under the Apache License, Version 2.0 +* (the "License"); you may not use this file except in compliance with +* the License. You may obtain a copy of the License at +* +*http://www.apache.org/licenses/LICENSE-2.0 +* +* Unless required by applicable law or agreed to in writing, software +* distributed under the License is distributed on an "AS IS" BASIS, +* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +* See the License for the specific language governing permissions and +* limitations under the License. +*/ + +package org.apache.spark.sql.hive + +import org.apache.spark.SparkFunSuite +import org.apache.spark.sql.hive.test.TestHive + + +class HiveContextSuite extends SparkFunSuite { + + test("HiveContext can access `spark.sql.*` configs") { +// Avoid creating another SparkContext in the same JVM +val sc = TestHive.sparkContext +val hiveContext = new HiveContext(sc) --- End diff -- using `TestHive` doesn't work either. I'm just going to ignore the test again...
[GitHub] spark pull request: [SPARK-13801][SQL] DataFrame.col should return...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11632#issuecomment-200185578 **[Test build #53899 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53899/consoleFull)** for PR 11632 at commit [`7963244`](https://github.com/apache/spark/commit/7963244f064c4950781020b929f4fab5feedb2fe).
[GitHub] spark pull request: [SPARK-13642][Yarn][1.6-backport] Properly han...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11690#issuecomment-200185580 **[Test build #53898 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53898/consoleFull)** for PR 11690 at commit [`b75927e`](https://github.com/apache/spark/commit/b75927e0a5a14da091157b746d0244482c0a54e7).
[GitHub] spark pull request: [SPARK-13642][Yarn][1.6-backport] Properly han...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11690#issuecomment-200185429 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/53897/ Test FAILed.
[GitHub] spark pull request: [SPARK-13642][Yarn][1.6-backport] Properly han...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11690#issuecomment-200185425 **[Test build #53897 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53897/consoleFull)** for PR 11690 at commit [`ac7f73d`](https://github.com/apache/spark/commit/ac7f73d9b068c0998ea884b2b901f7438c977f3c). * This patch **fails Scala style tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-13642][Yarn][1.6-backport] Properly han...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11690#issuecomment-200185428 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-14073][Streaming][test-maven]Move flume...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11895#issuecomment-200185058 **[Test build #53896 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53896/consoleFull)** for PR 11895 at commit [`4352499`](https://github.com/apache/spark/commit/435249984ded6e5f56c464de1d7c519a18a8d5e5).
[GitHub] spark pull request: [SPARK-13642][Yarn][1.6-backport] Properly han...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11690#issuecomment-200185062 **[Test build #53897 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53897/consoleFull)** for PR 11690 at commit [`ac7f73d`](https://github.com/apache/spark/commit/ac7f73d9b068c0998ea884b2b901f7438c977f3c).
[GitHub] spark pull request: [SPARK-13642][Yarn][1.6-backport] Properly han...
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/11690#discussion_r57110793 --- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala --- @@ -117,6 +119,20 @@ private[spark] class ApplicationMaster( private var delegationTokenRenewerOption: Option[AMDelegationTokenRenewer] = None + if (SystemUtils.IS_OS_UNIX) { +// Register signal handler for signal "TERM", "INT" and "HUP". For the cases where AM receive a +// signal and stop, from RM's aspect this application needs to be reattempted, rather than mark +// as success. +class AMSignalHandler(name: String) extends SignalHandler { --- End diff -- I think a modifier is not allowed here. ``` [error] /Users/sshao/projects/apache-spark/yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala:126: illegal start of statement (no modifiers allowed here) [error] private class AMSignalHandler(name: String) extends SignalHandler { [error] ^ [error] one error found ```
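For readers unfamiliar with the compiler error quoted above, here is a minimal Scala sketch of the rule it reflects: a class declared inside a block (such as the body of an `if`) is a local definition, and local definitions cannot carry access modifiers like `private`. The names `LocalClassDemo`, `Handler`, `LoggingHandler`, and `register` are hypothetical and only illustrate the point; `Handler` is an assumed stand-in for `SignalHandler`, not the code in the pull request.

```scala
object LocalClassDemo {
  // Hypothetical stand-in for a signal handler interface, so the sketch is self-contained.
  trait Handler {
    def handle(signal: String): String
  }

  def register(enabled: Boolean): Option[Handler] = {
    if (enabled) {
      // Writing `private class LoggingHandler ...` at this point is rejected with
      // "illegal start of statement (no modifiers allowed here)", because the class
      // is local to the enclosing block. Without the modifier it compiles, and the
      // class is already invisible outside the block.
      class LoggingHandler(name: String) extends Handler {
        override def handle(signal: String): String = s"$name handled $signal"
      }
      Some(new LoggingHandler("demo"))
    } else {
      None
    }
  }

  def main(args: Array[String]): Unit = {
    // Exercise the local class through the trait it implements.
    println(register(enabled = true).map(_.handle("TERM")))
  }
}
```

Since a block-local class cannot be referenced from outside its block anyway, dropping `private` loses nothing in terms of encapsulation.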
[GitHub] spark pull request: [SPARK-14073][Streaming][test-maven]Move flume...
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/11895#issuecomment-200184646 retest this please
[GitHub] spark pull request: [SPARK-13801][SQL] DataFrame.col should return...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11632#issuecomment-200184579 **[Test build #53894 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53894/consoleFull)** for PR 11632 at commit [`c1a26f9`](https://github.com/apache/spark/commit/c1a26f922c01643ae1bcbec3b8648531cbdfaeb9). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-13801][SQL] DataFrame.col should return...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11632#issuecomment-200184598 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-13801][SQL] DataFrame.col should return...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11632#issuecomment-200184600 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/53894/ Test FAILed.
[GitHub] spark pull request: [SPARK-14014] [SQL] Replace existing catalog w...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11836#issuecomment-200184059 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-14014] [SQL] Replace existing catalog w...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11836#issuecomment-200184062 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/53856/ Test FAILed.
[GitHub] spark pull request: [SPARK-14073][Streaming][test-maven]Move flume...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11895#issuecomment-200183975 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-14073][Streaming][test-maven]Move flume...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11895#issuecomment-200183970 **[Test build #53887 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53887/consoleFull)** for PR 11895 at commit [`4352499`](https://github.com/apache/spark/commit/435249984ded6e5f56c464de1d7c519a18a8d5e5). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `public final class JavaFlumeEventCount ` * `class SparkSink extends AbstractSink with Logging with Configurable ` * `class FlumeInputDStream[T: ClassTag](` * `class SparkFlumeEvent() extends Externalizable ` * `class FlumeEventServer(receiver: FlumeReceiver) extends AvroSourceProtocol ` * `class FlumeReceiver(` * ` class CompressionChannelPipelineFactory extends ChannelPipelineFactory ` * `class FlumeUtils(object):`
[GitHub] spark pull request: [SPARK-14073][Streaming][test-maven]Move flume...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11895#issuecomment-200183976 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/53887/ Test FAILed.
[GitHub] spark pull request: [SPARK-14014] [SQL] Replace existing catalog w...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11836#issuecomment-200183941 **[Test build #53856 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53856/consoleFull)** for PR 11836 at commit [`e552558`](https://github.com/apache/spark/commit/e5525581d6b92b4306076fae75a7321fe346e650). * This patch **fails from timeout after a configured wait of \`250m\`**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-3000] Allow MemoryStore blocks to be dr...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11874#issuecomment-200180743 **[Test build #53895 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53895/consoleFull)** for PR 11874 at commit [`ccdd20c`](https://github.com/apache/spark/commit/ccdd20cd7e29a6a8d0d2592494fd0d19292b3702).
[GitHub] spark pull request: [SPARK-3000] Allow MemoryStore blocks to be dr...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/11874#issuecomment-200179763 Jenkins, retest this please.
[GitHub] spark pull request: [SPARK-13734][SPARKR] Added histogram function
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/11569#discussion_r57109811 --- Diff: R/pkg/R/functions.R --- @@ -2638,3 +2638,81 @@ setMethod("sort_array", jc <- callJStatic("org.apache.spark.sql.functions", "sort_array", x@jc, asc) column(jc) }) + +#' This function computes a histogram for a given SparkR Column. +#' +#' @name histogram +#' @title Histogram +#' @param nbins the number of bins (optional). The default is 10. +#' @param df the DataFrame containing the Column to build the histogram from. +#' @param colname the name of the column to build the histogram from. +#' @return a data.frame with the histogram statistics, i.e., counts and centroids. +#' @examples \dontrun{ +#' +#' # Create a DataFrame from the Iris dataset +#' irisDF <- createDataFrame(sqlContext, iris) +#' +#' # Compute histogram statistics +#' histData <- histogram(df, "colname"Sepal_Length", nbins = 12) +#' +#' # Once SparkR has computed the histogram statistics, it would be very easy to +#' # render the histogram using R's visualization packages such as ggplot2. +#' +#' } +setMethod("histogram", + signature(df = "DataFrame"), + function(df, colname, nbins = 10) { --- End diff -- Some other functions here take the `Column` type; you might want to support both `character` and `Column` for colname.
[GitHub] spark pull request: [SPARK-13801][SQL] DataFrame.col should return...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11632#issuecomment-200176635 **[Test build #53894 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53894/consoleFull)** for PR 11632 at commit [`c1a26f9`](https://github.com/apache/spark/commit/c1a26f922c01643ae1bcbec3b8648531cbdfaeb9).
[GitHub] spark pull request: [SPARK-13579][build][wip] Stop building the ma...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11796#issuecomment-200175738 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-14014] [SQL] Replace existing catalog w...
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/11836#discussion_r57109756 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveContextSuite.scala --- @@ -0,0 +1,38 @@ +/* +* Licensed to the Apache Software Foundation (ASF) under one or more +* contributor license agreements. See the NOTICE file distributed with +* this work for additional information regarding copyright ownership. +* The ASF licenses this file to You under the Apache License, Version 2.0 +* (the "License"); you may not use this file except in compliance with +* the License. You may obtain a copy of the License at +* +*http://www.apache.org/licenses/LICENSE-2.0 +* +* Unless required by applicable law or agreed to in writing, software +* distributed under the License is distributed on an "AS IS" BASIS, +* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +* See the License for the specific language governing permissions and +* limitations under the License. +*/ + +package org.apache.spark.sql.hive + +import org.apache.spark.SparkFunSuite +import org.apache.spark.sql.hive.test.TestHive + + +class HiveContextSuite extends SparkFunSuite { + + test("HiveContext can access `spark.sql.*` configs") { +// Avoid creating another SparkContext in the same JVM +val sc = TestHive.sparkContext +val hiveContext = new HiveContext(sc) --- End diff -- Since there is already a global singleton, this probably will not work because of the Derby connection limitation (within a JVM, we can have only a single connection to a given Derby database).
[GitHub] spark pull request: [SPARK-13579][build][wip] Stop building the ma...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11796#issuecomment-200175744 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/53875/ Test FAILed.
[GitHub] spark pull request: [SPARK-13579][build][wip] Stop building the ma...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11796#issuecomment-200175193 **[Test build #53875 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53875/consoleFull)** for PR 11796 at commit [`6c4ed0a`](https://github.com/apache/spark/commit/6c4ed0a2ea0b3ae968b4d5a8a81555ea3e4cbcf4). * This patch **fails PySpark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14014] [SQL] Replace existing catalog w...
Github user yhuai commented on the pull request: https://github.com/apache/spark/pull/11836#issuecomment-200174534 I tried `hive/test` locally and I got the following error at `HiveContextSuite`. ``` 21:43:03.057 ERROR DataNucleus.Datastore.Schema: Failed initialising database. Unable to open a test connection to the given database. JDBC url = jdbc:derby:;databaseName=metastore_db;create=true, username = APP. Terminating connection pool (set lazyInit to true if you expect to start your database after your app). Original Exception: -- java.sql.SQLException: Failed to start database 'metastore_db' with class loader org.apache.spark.sql.hive.client.IsolatedClientLoader$$anon$1@25b50060, see the next exception for details. at org.apache.derby.impl.jdbc.SQLExceptionFactory40.getSQLException(Unknown Source) at org.apache.derby.impl.jdbc.Util.newEmbedSQLException(Unknown Source) at org.apache.derby.impl.jdbc.Util.seeNextException(Unknown Source) at org.apache.derby.impl.jdbc.EmbedConnection.bootDatabase(Unknown Source) at org.apache.derby.impl.jdbc.EmbedConnection.(Unknown Source) at org.apache.derby.impl.jdbc.EmbedConnection40.(Unknown Source) at org.apache.derby.jdbc.Driver40.getNewEmbedConnection(Unknown Source) at org.apache.derby.jdbc.InternalDriver.connect(Unknown Source) at org.apache.derby.jdbc.Driver20.connect(Unknown Source) at org.apache.derby.jdbc.AutoloadedDriver.connect(Unknown Source) at java.sql.DriverManager.getConnection(DriverManager.java:571) at java.sql.DriverManager.getConnection(DriverManager.java:187) at com.jolbox.bonecp.BoneCP.obtainRawInternalConnection(BoneCP.java:361) at com.jolbox.bonecp.BoneCP.(BoneCP.java:416) at com.jolbox.bonecp.BoneCPDataSource.getConnection(BoneCPDataSource.java:120) at org.datanucleus.store.rdbms.ConnectionFactoryImpl$ManagedConnectionImpl.getConnection(ConnectionFactoryImpl.java:501) at org.datanucleus.store.rdbms.RDBMSStoreManager.(RDBMSStoreManager.java:298) at 
sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:526) at org.datanucleus.plugin.NonManagedPluginRegistry.createExecutableExtension(NonManagedPluginRegistry.java:631) at org.datanucleus.plugin.PluginManager.createExecutableExtension(PluginManager.java:301) at org.datanucleus.NucleusContext.createStoreManagerForProperties(NucleusContext.java:1187) at org.datanucleus.NucleusContext.initialise(NucleusContext.java:356) at org.datanucleus.api.jdo.JDOPersistenceManagerFactory.freezeConfiguration(JDOPersistenceManagerFactory.java:775) at org.datanucleus.api.jdo.JDOPersistenceManagerFactory.createPersistenceManagerFactory(JDOPersistenceManagerFactory.java:333) at org.datanucleus.api.jdo.JDOPersistenceManagerFactory.getPersistenceManagerFactory(JDOPersistenceManagerFactory.java:202) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at javax.jdo.JDOHelper$16.run(JDOHelper.java:1965) at java.security.AccessController.doPrivileged(Native Method) at javax.jdo.JDOHelper.invoke(JDOHelper.java:1960) at javax.jdo.JDOHelper.invokeGetPersistenceManagerFactoryOnImplementation(JDOHelper.java:1166) at javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:808) at javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:701) at org.apache.hadoop.hive.metastore.ObjectStore.getPMF(ObjectStore.java:365) at org.apache.hadoop.hive.metastore.ObjectStore.getPersistenceManager(ObjectStore.java:394) at 
org.apache.hadoop.hive.metastore.ObjectStore.initialize(ObjectStore.java:291) at org.apache.hadoop.hive.metastore.ObjectStore.setConf(ObjectStore.java:258) at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73) at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133) at org.apache.hadoop.hive.metastore.RawStoreProxy.(RawStoreProxy.java:57) at org.apache.hadoop.hive.metastore.RawStoreProxy.getProxy(RawStoreProxy.java:66) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newRawStore(HiveMetaStore.java:593) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:571) at
[GitHub] spark pull request: [SPARK-13734][SPARKR] Added histogram function
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/11569#discussion_r57109660 --- Diff: R/pkg/R/functions.R --- @@ -2638,3 +2638,81 @@ setMethod("sort_array", jc <- callJStatic("org.apache.spark.sql.functions", "sort_array", x@jc, asc) column(jc) }) + +#' This function computes a histogram for a given SparkR Column. +#' +#' @name histogram +#' @title Histogram +#' @param nbins the number of bins (optional). The default is 10. +#' @param df the DataFrame containing the Column to build the histogram from. +#' @param colname the name of the column to build the histogram from. +#' @return a data.frame with the histogram statistics, i.e., counts and centroids. +#' @examples \dontrun{ +#' +#' # Create a DataFrame from the Iris dataset +#' irisDF <- createDataFrame(sqlContext, iris) +#' +#' # Compute histogram statistics +#' histData <- histogram(irisDF, colname = "Sepal_Length", nbins = 12) +#' +#' # Once SparkR has computed the histogram statistics, it would be very easy to +#' # render the histogram using R's visualization packages such as ggplot2. +#' +#' } +setMethod("histogram", + signature(df = "DataFrame"), + function(df, colname, nbins = 10) { +# Validate nbins +if (nbins < 2) { + stop("The number of bins must be a positive integer number greater than 1.") --- End diff -- From the message, it sounds like `nbins == 2` is OK?
[GitHub] spark pull request: [SPARK-14073][Streaming][test-maven]Move flume...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11895#issuecomment-200174226 **[Test build #53882 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53882/consoleFull)** for PR 11895 at commit [`4352499`](https://github.com/apache/spark/commit/435249984ded6e5f56c464de1d7c519a18a8d5e5). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `public final class JavaFlumeEventCount ` * `class SparkSink extends AbstractSink with Logging with Configurable ` * `class FlumeInputDStream[T: ClassTag](` * `class SparkFlumeEvent() extends Externalizable ` * `class FlumeEventServer(receiver: FlumeReceiver) extends AvroSourceProtocol ` * `class FlumeReceiver(` * ` class CompressionChannelPipelineFactory extends ChannelPipelineFactory ` * `class FlumeUtils(object):`
[GitHub] spark pull request: [SPARK-14073][Streaming][test-maven]Move flume...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11895#issuecomment-200174265 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/53882/ Test FAILed.
[GitHub] spark pull request: [SPARK-14073][Streaming][test-maven]Move flume...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11895#issuecomment-200174262 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-13817][SQL][MINOR] Renames Dataset.newD...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11889#issuecomment-200174008 **[Test build #53893 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53893/consoleFull)** for PR 11889 at commit [`019660d`](https://github.com/apache/spark/commit/019660dfff2ecc9d440a99147085df3151080358).
[GitHub] spark pull request: [SPARK-13734][SPARKR] Added histogram function
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/11569#discussion_r57109481 --- Diff: R/pkg/R/functions.R --- @@ -2638,3 +2638,81 @@ setMethod("sort_array", jc <- callJStatic("org.apache.spark.sql.functions", "sort_array", x@jc, asc) column(jc) }) + +#' This function computes a histogram for a given SparkR Column. +#' +#' @name histogram +#' @title Histogram +#' @param nbins the number of bins (optional). The default is 10. +#' @param df the DataFrame containing the Column to build the histogram from. +#' @param colname the name of the column to build the histogram from. +#' @return a data.frame with the histogram statistics, i.e., counts and centroids. +#' @examples \dontrun{ --- End diff -- please add @rdname
[GitHub] spark pull request: [SPARK-13734][SPARKR] Added histogram function
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/11569#discussion_r57109461 --- Diff: R/pkg/R/functions.R --- @@ -2638,3 +2638,81 @@ setMethod("sort_array", jc <- callJStatic("org.apache.spark.sql.functions", "sort_array", x@jc, asc) column(jc) }) + +#' This function computes a histogram for a given SparkR Column. +#' +#' @name histogram +#' @title Histogram +#' @param nbins the number of bins (optional). The default is 10. +#' @param df the DataFrame containing the Column to build the histogram from. +#' @param colname the name of the column to build the histogram from. +#' @return a data.frame with the histogram statistics, i.e., counts and centroids. +#' @examples \dontrun{ +#' +#' # Create a DataFrame from the Iris dataset +#' irisDF <- createDataFrame(sqlContext, iris) +#' +#' # Compute histogram statistics +#' histData <- histogram(irisDF, colname = "Sepal_Length", nbins = 12) +#' +#' # Once SparkR has computed the histogram statistics, it would be very easy to +#' # render the histogram using R's visualization packages such as ggplot2. --- End diff -- I think it would be great to include a code example showing how to plug this into ggplot2.
[GitHub] spark pull request: [SPARK-14085] [SQL] Star Expansion for Hash
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11904#issuecomment-200174009 **[Test build #53892 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53892/consoleFull)** for PR 11904 at commit [`4368a6c`](https://github.com/apache/spark/commit/4368a6ce88619dd99e20f73393710ae7e16a4951).
[GitHub] spark pull request: [SPARK-13734][SPARKR] Added histogram function
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/11569#discussion_r57109425 --- Diff: R/pkg/R/functions.R --- @@ -2638,3 +2638,81 @@ setMethod("sort_array", jc <- callJStatic("org.apache.spark.sql.functions", "sort_array", x@jc, asc) column(jc) }) + +#' This function computes a histogram for a given SparkR Column. +#' +#' @name histogram +#' @title Histogram +#' @param nbins the number of bins (optional). The default is 10. +#' @param df the DataFrame containing the Column to build the histogram from. +#' @param colname the name of the column to build the histogram from. +#' @return a data.frame with the histogram statistics, i.e., counts and centroids. +#' @examples \dontrun{ +#' +#' # Create a DataFrame from the Iris dataset +#' irisDF <- createDataFrame(sqlContext, iris) +#' +#' # Compute histogram statistics +#' histData <- histogram(irisDF, colname = "Sepal_Length", nbins = 12) +#' +#' # Once SparkR has computed the histogram statistics, it would be very easy to +#' # render the histogram using R's visualization packages such as ggplot2. +#' +#' } +setMethod("histogram", + signature(df = "DataFrame"), + function(df, colname, nbins = 10) { --- End diff -- Is it possible to specify the type of `colname` in `signature()`?
[GitHub] spark pull request: [SPARK-14085] [SQL] Star Expansion for Hash
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/11904#discussion_r57109380 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -439,6 +440,16 @@ class Analyzer( case s: Star => s.expand(child, resolver) case o => o :: Nil }) +case p: Murmur3Hash if containsStar(p.children) => --- End diff -- @cloud-fan Now there are only four cases. I did not add a new function `replaceAllChildren`, to keep the code changes small. Please let me know if you think we still need it. Thanks!
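The star expansion discussed in the comment above can be illustrated with a toy model. This is only a minimal sketch and not Catalyst code: `Expr`, `Attr`, `Star`, `Hash`, and `StarExpansion` are hypothetical stand-ins for the real expression classes, and `childOutput` is assumed to be the list of columns produced by the child plan.

```scala
// Toy model only: hypothetical stand-ins, not the Catalyst classes from the diff.
sealed trait Expr
case class Attr(name: String) extends Expr
case object Star extends Expr
case class Hash(children: Seq[Expr]) extends Expr

object StarExpansion {
  // True if any argument is a star.
  def containsStar(args: Seq[Expr]): Boolean = args.contains(Star)

  // Replace each star with the child's output columns; keep other arguments as-is.
  def expandArgs(args: Seq[Expr], childOutput: Seq[Attr]): Seq[Expr] =
    args.flatMap {
      case Star  => childOutput
      case other => Seq(other)
    }

  // Mirrors the shape of the guarded case in the diff: only rewrite a hash
  // expression whose arguments actually contain a star.
  def resolve(e: Expr, childOutput: Seq[Attr]): Expr = e match {
    case Hash(children) if containsStar(children) =>
      Hash(expandArgs(children, childOutput))
    case other => other
  }
}
```

With child output columns `a` and `b`, `StarExpansion.resolve(Hash(Seq(Star)), ...)` yields a `Hash` over `a` and `b`, while an expression without a star is returned unchanged.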
[GitHub] spark pull request: [SPARK-13579][build][wip] Stop building the ma...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11796#issuecomment-200172353 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-13579][build][wip] Stop building the ma...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11796#issuecomment-200172354 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/53871/ Test FAILed.
[GitHub] spark pull request: [SPARK-14085] [SQL] Star Expansion for Hash an...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/11904#issuecomment-200172180 that sounds reasonable
[GitHub] spark pull request: [SPARK-13801][SQL] DataFrame.col should return...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11632#issuecomment-200171820 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/53891/ Test FAILed.
[GitHub] spark pull request: [SPARK-13579][build][wip] Stop building the ma...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11796#issuecomment-200171893 **[Test build #53871 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53871/consoleFull)** for PR 11796 at commit [`0339263`](https://github.com/apache/spark/commit/03392634cfe27064100479ea92fa0d4d465f3aab). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-13801][SQL] DataFrame.col should return...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11632#issuecomment-200171798 **[Test build #53891 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53891/consoleFull)** for PR 11632 at commit [`fcebfb6`](https://github.com/apache/spark/commit/fcebfb6cbbc8c0a1d8036c5481276db2dbb13c5b). * This patch **fails to build**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-13801][SQL] DataFrame.col should return...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11632#issuecomment-200171815 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-14085] [SQL] Star Expansion for Hash an...
Github user gatorsmile commented on the pull request: https://github.com/apache/spark/pull/11904#issuecomment-200170932 Maybe we can restrict it to `hash` only, with fewer code changes. Thanks!
[GitHub] spark pull request: [SPARK-13801][SQL] DataFrame.col should return...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11632#issuecomment-200170397 **[Test build #53891 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53891/consoleFull)** for PR 11632 at commit [`fcebfb6`](https://github.com/apache/spark/commit/fcebfb6cbbc8c0a1d8036c5481276db2dbb13c5b).
[GitHub] spark pull request: [SPARK-14085] [SQL] Star Expansion for Hash an...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/11904#issuecomment-200170290 hash makes a lot of sense. greatest and least are kind of funny since i'd imagine columns are of different types...
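The concern about `greatest` and `least` over a star can be made concrete with a toy check. This is a sketch under stated assumptions, not Spark's actual type-checking or coercion logic: `ColType`, `IntCol`, `StrCol`, `TypedColumn`, and `CommonTypeCheck` are hypothetical names, and the rule shown (all expanded columns must share one type) is a deliberate simplification.

```scala
// Toy model only: hypothetical types, not Spark's DataType hierarchy.
sealed trait ColType
case object IntCol extends ColType
case object StrCol extends ColType
case class TypedColumn(name: String, colType: ColType)

object CommonTypeCheck {
  // Simplified rule (an assumption for illustration): a comparison over several
  // columns is accepted only when every column has the same type.
  def commonType(cols: Seq[TypedColumn]): Either[String, ColType] =
    cols.map(_.colType).distinct match {
      case Seq()  => Left("no columns")
      case Seq(t) => Right(t)
      case ts     => Left(s"mixed types: ${ts.mkString(", ")}")
    }
}
```

Under this simplified rule, a star that expands to columns of a single type is accepted, whereas a typical table with both integer and string columns is rejected, which is why star expansion is a less natural fit for `greatest` and `least` than for `hash`.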
[GitHub] spark pull request: [SPARK-13957] [SQL] Support Group By Ordinal i...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11846#issuecomment-200170223 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/53876/ Test PASSed.
[GitHub] spark pull request: [SPARK-13957] [SQL] Support Group By Ordinal i...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11846#issuecomment-200170222 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-11940][PYSPARK] Python API for ml.clust...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10242#issuecomment-200170035 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-13957] [SQL] Support Group By Ordinal i...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11846#issuecomment-200170083 **[Test build #53876 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53876/consoleFull)** for PR 11846 at commit [`74a16be`](https://github.com/apache/spark/commit/74a16be1ca2bfaf74a184eacbb7cf0f1fa53049b). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-11940][PYSPARK] Python API for ml.clust...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10242#issuecomment-200170036 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/53888/ Test PASSed.
[GitHub] spark pull request: [SPARK-11940][PYSPARK] Python API for ml.clust...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10242#issuecomment-200169823 **[Test build #53888 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53888/consoleFull)** for PR 10242 at commit [`d6c9078`](https://github.com/apache/spark/commit/d6c90781153ed72802f722abbbc6dd4cb2ff1cd8). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14025][STREAMING][WEBUI] Fix streaming ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11845#issuecomment-200168135 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-14025][STREAMING][WEBUI] Fix streaming ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11845#issuecomment-200168137 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/53869/ Test PASSed.
[GitHub] spark pull request: [SPARK-14085] [SQL] Star Expansion for Hash an...
Github user gatorsmile commented on the pull request: https://github.com/apache/spark/pull/11904#issuecomment-200168042 ok. How about the other three? `hash`, `greatest` and `least`? Should we support them for star expansion?
[GitHub] spark pull request: [SPARK-14025][STREAMING][WEBUI] Fix streaming ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11845#issuecomment-200167891 **[Test build #53869 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53869/consoleFull)** for PR 11845 at commit [`0761e5f`](https://github.com/apache/spark/commit/0761e5f5485fff1f99c0b8b7dc6bfe8f8f5a0b82). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14085] [SQL] Star Expansion for Hash an...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/11904#issuecomment-200167271 It's pretty strange to do this -- who's actually doing it for concat?
[GitHub] spark pull request: [SPARK-14014] [SQL] Replace existing catalog w...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11836#issuecomment-200166489 **[Test build #53890 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53890/consoleFull)** for PR 11836 at commit [`e552558`](https://github.com/apache/spark/commit/e5525581d6b92b4306076fae75a7321fe346e650).