[GitHub] spark issue #17264: [SPARK-19923][SQL] Remove unnecessary type conversions p...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17264 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #17264: [SPARK-19923][SQL] Remove unnecessary type conversions p...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17264 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74396/ Test PASSed.
[GitHub] spark issue #17264: [SPARK-19923][SQL] Remove unnecessary type conversions p...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17264 **[Test build #74396 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74396/testReport)** for PR 17264 at commit [`6468fde`](https://github.com/apache/spark/commit/6468fde7a9b726843e505b029cb5f7ac865690fe).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #17232: [SPARK-18112] [SQL] Support reading data from Hive 2.1 m...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17232 Merged build finished. Test PASSed.
[GitHub] spark issue #17232: [SPARK-18112] [SQL] Support reading data from Hive 2.1 m...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17232 **[Test build #74395 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74395/testReport)** for PR 17232 at commit [`ace4f02`](https://github.com/apache/spark/commit/ace4f0224bf67d9143b07e8a9ca610568cc49ffb).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #17232: [SPARK-18112] [SQL] Support reading data from Hive 2.1 m...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17232 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74395/ Test PASSed.
[GitHub] spark issue #17264: [SPARK-19923][SQL] Remove unnecessary type conversions p...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/17264 I did a quick check of the performance changes:

```java
public class TestGenericUDF extends GenericUDF {

  @Override
  public ObjectInspector initialize(ObjectInspector[] objectInspectors) throws UDFArgumentException {
    return PrimitiveObjectInspectorFactory.javaLongObjectInspector;
  }

  @Override
  public Object evaluate(DeferredObject[] args) throws HiveException {
    final long a1 = PrimitiveObjectInspectorFactory.javaLongObjectInspector.get(args[0].get());
    final long a2 = PrimitiveObjectInspectorFactory.javaLongObjectInspector.get(args[1].get());
    final long a3 = PrimitiveObjectInspectorFactory.javaLongObjectInspector.get(args[2].get());
    final long a4 = PrimitiveObjectInspectorFactory.javaLongObjectInspector.get(args[3].get());
    return a1 + a2 + a3 + a4;
  }
}
```

```
$ ./bin/spark-shell --master local[1] --conf spark.sql.shuffle.partitions=1 -v

scala> sql("CREATE TEMPORARY FUNCTION testUdf AS 'hivemall.ftvec.TestGenericUDF'")

scala> :paste
def timer[R](block: => R): R = {
  val t0 = System.nanoTime()
  val result = block
  val t1 = System.nanoTime()
  println("Elapsed time: " + ((t1 - t0 + 0.0) / 1000000000.0) + "s")
  result
}

scala> spark.range(300).createOrReplaceTempView("t")

scala> timer { sql("SELECT testUdf(id, id, id, id) FROM t").queryExecution.executedPlan.execute().foreach(x => {}) }
```

```
# performance w/ this patch
Elapsed time: 1.901269167s

# performance w/o this patch
Elapsed time: 0.492860666s
```
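The `timer` helper in the comment above measures wall-clock time with `System.nanoTime`, which returns nanoseconds. A self-contained sketch of the same idea in plain Scala (no Spark required; the names here are mine, not from the PR) that makes the unit conversion explicit:

```scala
object TimerExample {
  // Run a block once and return its result along with wall-clock seconds.
  def timer[R](block: => R): (R, Double) = {
    val t0 = System.nanoTime()
    val result = block
    val t1 = System.nanoTime()
    // nanoTime returns nanoseconds; divide by 1e9 to get seconds
    val seconds = (t1 - t0) / 1e9
    (result, seconds)
  }

  def main(args: Array[String]): Unit = {
    val (sum, elapsed) = timer { (1L to 1000000L).sum }
    println(s"sum=$sum elapsedSeconds=$elapsed")
  }
}
```

Note that a single timed run like this is sensitive to JIT warm-up; for stable numbers a benchmark would normally run the block a few times first and discard the early measurements.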
[GitHub] spark issue #17264: [SPARK-19923][SQL] Remove unnecessary type conversions p...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17264 **[Test build #74396 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74396/testReport)** for PR 17264 at commit [`6468fde`](https://github.com/apache/spark/commit/6468fde7a9b726843e505b029cb5f7ac865690fe).
[GitHub] spark issue #17264: [SPARK-19923][SQL] Remove unnecessary type conversions p...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/17264 I'm not sure this makes sense, so could you check? cc: @hvanhovell
[GitHub] spark pull request #17264: [SPARK-19923][SQL] Remove unnecessary type conver...
GitHub user maropu opened a pull request: https://github.com/apache/spark/pull/17264

[SPARK-19923][SQL] Remove unnecessary type conversions per call in Hive

## What changes were proposed in this pull request?

This PR removes unnecessary type conversions per call in Hive: https://github.com/apache/spark/blob/master/sql/hive/src/main/scala/org/apache/spark/sql/hive/hiveUDFs.scala#L116

## How was this patch tested?

Existing tests

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/maropu/spark SPARK-19923

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/17264.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #17264

commit 6468fde7a9b726843e505b029cb5f7ac865690fe
Author: Takeshi Yamamuro
Date: 2017-03-12T06:19:20Z

    Remove unnecessary type conversions per call
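For context on what "removing a type conversion per call" buys: a UDF's evaluation path runs once per row, so any lookup that does not depend on the row can be resolved once and reused. A minimal, hypothetical Scala illustration of that hoisting (the names below are invented for illustration and are not Spark's actual internals):

```scala
// Hypothetical sketch: hoisting a per-call conversion lookup out of the hot path.
object ConversionExample {
  type Converter = Any => Long

  // Stand-in for an expensive lookup that does not depend on the row.
  def lookupConverter(): Converter =
    (v: Any) => v.asInstanceOf[Number].longValue // unwrap a boxed value to a primitive

  // Before: resolves the converter on every call.
  def evalPerCall(arg: Any): Long = lookupConverter()(arg)

  // After: resolves the converter once at construction time and reuses it.
  private val cached: Converter = lookupConverter()
  def evalCached(arg: Any): Long = cached(arg)

  def main(args: Array[String]): Unit = {
    // Both forms compute the same result; only the lookup cost differs.
    println(evalPerCall(21L) + evalCached(21L)) // prints 42
  }
}
```

The cached form does the same work per row minus the repeated lookup, which is exactly the kind of saving that only shows up when the function is called once per row over a large input.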
[GitHub] spark issue #17232: [SPARK-18112] [SQL] Support reading data from Hive 2.1 m...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17232 **[Test build #74395 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74395/testReport)** for PR 17232 at commit [`ace4f02`](https://github.com/apache/spark/commit/ace4f0224bf67d9143b07e8a9ca610568cc49ffb).
[GitHub] spark issue #17260: [SPARK-19921] [SQL] [TEST] Enable end-to-end testing usi...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/17260 cc @cloud-fan @yhuai @sameeragarwal
[GitHub] spark pull request #16330: [SPARK-18817][SPARKR][SQL] change derby log outpu...
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/16330#discussion_r105548957

--- Diff: R/pkg/inst/tests/testthat/test_sparkSQL.R ---
```
@@ -2897,6 +2898,27 @@ test_that("Collect on DataFrame when NAs exists at the top of a timestamp column
   expect_equal(class(ldf3$col3), c("POSIXct", "POSIXt"))
 })

+compare_list <- function(list1, list2) {
+  # get testthat to show the diff by first making the 2 lists equal in length
+  expect_equal(length(list1), length(list2))
+  l <- max(length(list1), length(list2))
```
--- End diff --

Got it - that sounds good
[GitHub] spark issue #17232: [SPARK-18112] [SQL] Support reading data from Hive 2.1 m...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/17232 To show the impact of `boolean hasFollowingStatsTask`, we first need to deliver the fix in `VersionSuite.scala`. The PR has been submitted: https://github.com/apache/spark/pull/17260
[GitHub] spark pull request #17232: [SPARK-18112] [SQL] Support reading data from Hiv...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/17232#discussion_r105548919

--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala ---
```
@@ -94,6 +94,10 @@ private[spark] class HiveExternalCatalog(conf: SparkConf, hadoopConf: Configurat
     try {
       body
     } catch {
+      case i: InvocationTargetException if isClientException(i.getTargetException) =>
+        val e = i.getTargetException
+        throw new AnalysisException(
+          e.getClass.getCanonicalName + ": " + e.getMessage, cause = Some(e))
```
--- End diff --

This fix needs to be backported to the previous releases. Will submit a separate one for this.
[GitHub] spark pull request #17209: [SPARK-19853][SS] uppercase kafka topics fail whe...
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/17209#discussion_r105548820

--- Diff: external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaSourceSuite.scala ---
```
@@ -606,6 +607,36 @@ class KafkaSourceSuite extends KafkaSourceTest {
     assert(query.exception.isEmpty)
   }

+  for ((optionKey, optionValue, answer) <- Seq(
+    (STARTING_OFFSETS_OPTION_KEY, "earLiEst", EarliestOffsetRangeLimit),
+    (ENDING_OFFSETS_OPTION_KEY, "laTest", LatestOffsetRangeLimit),
+    (STARTING_OFFSETS_OPTION_KEY, """{"topic-A":{"0":23}}""",
+      SpecificOffsetRangeLimit(Map(new TopicPartition("topic-A", 0) -> 23))))) {
+    test(s"test offsets containing uppercase characters (${answer.getClass.getSimpleName})") {
+      val offset = getKafkaOffsetRangeLimit(
+        Map(optionKey -> optionValue),
+        optionKey,
+        answer
+      )
+
+      assert(offset == answer)
```
--- End diff --

nit: `==` => `===`
[GitHub] spark pull request #17209: [SPARK-19853][SS] uppercase kafka topics fail whe...
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/17209#discussion_r105548818

--- Diff: external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaSourceSuite.scala ---
```
@@ -606,6 +607,36 @@ class KafkaSourceSuite extends KafkaSourceTest {
     assert(query.exception.isEmpty)
   }

+  for ((optionKey, optionValue, answer) <- Seq(
```
--- End diff --

nit: move the `for` loop into the `test`. No need to create many tests here.
[GitHub] spark pull request #17209: [SPARK-19853][SS] uppercase kafka topics fail whe...
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/17209#discussion_r105548819

--- Diff: external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaSourceSuite.scala ---
```
@@ -606,6 +607,36 @@ class KafkaSourceSuite extends KafkaSourceTest {
     assert(query.exception.isEmpty)
   }

+  for ((optionKey, optionValue, answer) <- Seq(
+    (STARTING_OFFSETS_OPTION_KEY, "earLiEst", EarliestOffsetRangeLimit),
+    (ENDING_OFFSETS_OPTION_KEY, "laTest", LatestOffsetRangeLimit),
+    (STARTING_OFFSETS_OPTION_KEY, """{"topic-A":{"0":23}}""",
+      SpecificOffsetRangeLimit(Map(new TopicPartition("topic-A", 0) -> 23))))) {
+    test(s"test offsets containing uppercase characters (${answer.getClass.getSimpleName})") {
+      val offset = getKafkaOffsetRangeLimit(
+        Map(optionKey -> optionValue),
+        optionKey,
+        answer
+      )
+
+      assert(offset == answer)
+    }
+  }
+
+  for ((optionKey, answer) <- Seq(
```
--- End diff --

Same as above
[GitHub] spark pull request #17209: [SPARK-19853][SS] uppercase kafka topics fail whe...
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/17209#discussion_r105548749

--- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaSourceProvider.scala ---
```
@@ -128,18 +123,18 @@ private[kafka010] class KafkaSourceProvider extends DataSourceRegister
       .map { k => k.drop(6).toString -> parameters(k) }
       .toMap

-    val startingRelationOffsets =
-      caseInsensitiveParams.get(STARTING_OFFSETS_OPTION_KEY).map(_.trim.toLowerCase) match {
-        case Some("earliest") => EarliestOffsetRangeLimit
-        case Some(json) => SpecificOffsetRangeLimit(JsonUtils.partitionOffsets(json))
-        case None => EarliestOffsetRangeLimit
+    val startingRelationOffsets = KafkaSourceProvider.getKafkaOffsetRangeLimit(
+      caseInsensitiveParams, STARTING_OFFSETS_OPTION_KEY, EarliestOffsetRangeLimit) match {
+      case earliest @ EarliestOffsetRangeLimit => earliest
```
--- End diff --

`startingRelationOffsets` won't be `latest` since it's checked in `validateBatchOptions`. Why not just:

```Scala
val startingRelationOffsets = KafkaSourceProvider.getKafkaOffsetRangeLimit(
  caseInsensitiveParams, STARTING_OFFSETS_OPTION_KEY, EarliestOffsetRangeLimit)
assert(startingRelationOffsets != LatestOffsetRangeLimit)
```
[GitHub] spark pull request #17209: [SPARK-19853][SS] uppercase kafka topics fail whe...
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/17209#discussion_r105548762

--- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaSourceProvider.scala ---
```
@@ -388,34 +383,34 @@ private[kafka010] class KafkaSourceProvider extends DataSourceRegister
   private def validateBatchOptions(caseInsensitiveParams: Map[String, String]) = {
     // Batch specific options
-    caseInsensitiveParams.get(STARTING_OFFSETS_OPTION_KEY).map(_.trim.toLowerCase) match {
-      case Some("earliest") => // good to go
-      case Some("latest") =>
+    KafkaSourceProvider.getKafkaOffsetRangeLimit(
+      caseInsensitiveParams, STARTING_OFFSETS_OPTION_KEY, EarliestOffsetRangeLimit) match {
+      case EarliestOffsetRangeLimit => // good to go
+      case LatestOffsetRangeLimit =>
         throw new IllegalArgumentException("starting offset can't be latest " +
           "for batch queries on Kafka")
-      case Some(json) => (SpecificOffsetRangeLimit(JsonUtils.partitionOffsets(json)))
-        .partitionOffsets.foreach {
+      case specific: SpecificOffsetRangeLimit =>
+        specific.partitionOffsets.foreach {
           case (tp, off) if off == KafkaOffsetRangeLimit.LATEST =>
             throw new IllegalArgumentException(s"startingOffsets for $tp can't " +
               "be latest for batch queries on Kafka")
           case _ => // ignore
         }
-      case _ => // default to earliest
     }

-    caseInsensitiveParams.get(ENDING_OFFSETS_OPTION_KEY).map(_.trim.toLowerCase) match {
-      case Some("earliest") =>
+    KafkaSourceProvider.getKafkaOffsetRangeLimit(
+      caseInsensitiveParams, ENDING_OFFSETS_OPTION_KEY, LatestOffsetRangeLimit) match {
+      case EarliestOffsetRangeLimit =>
         throw new IllegalArgumentException("ending offset can't be earliest " +
           "for batch queries on Kafka")
-      case Some("latest") => // good to go
-      case Some(json) => (SpecificOffsetRangeLimit(JsonUtils.partitionOffsets(json)))
-        .partitionOffsets.foreach {
+      case LatestOffsetRangeLimit => // good to go
+      case specific: SpecificOffsetRangeLimit =>
```
--- End diff --

nit: `case SpecificOffsetRangeLimit(partitionOffsets) =>`
[GitHub] spark pull request #17209: [SPARK-19853][SS] uppercase kafka topics fail whe...
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/17209#discussion_r105548753

--- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaSourceProvider.scala ---
```
@@ -128,18 +123,18 @@ private[kafka010] class KafkaSourceProvider extends DataSourceRegister
       .map { k => k.drop(6).toString -> parameters(k) }
       .toMap

-    val startingRelationOffsets =
-      caseInsensitiveParams.get(STARTING_OFFSETS_OPTION_KEY).map(_.trim.toLowerCase) match {
-        case Some("earliest") => EarliestOffsetRangeLimit
-        case Some(json) => SpecificOffsetRangeLimit(JsonUtils.partitionOffsets(json))
-        case None => EarliestOffsetRangeLimit
+    val startingRelationOffsets = KafkaSourceProvider.getKafkaOffsetRangeLimit(
+      caseInsensitiveParams, STARTING_OFFSETS_OPTION_KEY, EarliestOffsetRangeLimit) match {
+      case earliest @ EarliestOffsetRangeLimit => earliest
+      case specific @ SpecificOffsetRangeLimit(_) => specific
+      case _ => EarliestOffsetRangeLimit
     }

-    val endingRelationOffsets =
-      caseInsensitiveParams.get(ENDING_OFFSETS_OPTION_KEY).map(_.trim.toLowerCase) match {
-        case Some("latest") => LatestOffsetRangeLimit
-        case Some(json) => SpecificOffsetRangeLimit(JsonUtils.partitionOffsets(json))
-        case None => LatestOffsetRangeLimit
+    val endingRelationOffsets = KafkaSourceProvider.getKafkaOffsetRangeLimit(caseInsensitiveParams,
+      ENDING_OFFSETS_OPTION_KEY, LatestOffsetRangeLimit) match {
+      case latest @ LatestOffsetRangeLimit => latest
```
--- End diff --

Same as above
[GitHub] spark pull request #17209: [SPARK-19853][SS] uppercase kafka topics fail whe...
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/17209#discussion_r105548760

--- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaSourceProvider.scala ---
```
@@ -388,34 +383,34 @@ private[kafka010] class KafkaSourceProvider extends DataSourceRegister
   private def validateBatchOptions(caseInsensitiveParams: Map[String, String]) = {
     // Batch specific options
-    caseInsensitiveParams.get(STARTING_OFFSETS_OPTION_KEY).map(_.trim.toLowerCase) match {
-      case Some("earliest") => // good to go
-      case Some("latest") =>
+    KafkaSourceProvider.getKafkaOffsetRangeLimit(
+      caseInsensitiveParams, STARTING_OFFSETS_OPTION_KEY, EarliestOffsetRangeLimit) match {
+      case EarliestOffsetRangeLimit => // good to go
+      case LatestOffsetRangeLimit =>
         throw new IllegalArgumentException("starting offset can't be latest " +
           "for batch queries on Kafka")
-      case Some(json) => (SpecificOffsetRangeLimit(JsonUtils.partitionOffsets(json)))
-        .partitionOffsets.foreach {
+      case specific: SpecificOffsetRangeLimit =>
```
--- End diff --

nit: `case SpecificOffsetRangeLimit(partitionOffsets) =>`
[GitHub] spark issue #17263: [SPARK-19922][ML] small speedups to findSynonyms
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17263 Merged build finished. Test PASSed.
[GitHub] spark issue #17263: [SPARK-19922][ML] small speedups to findSynonyms
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17263 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74394/ Test PASSed.
[GitHub] spark issue #17263: [SPARK-19922][ML] small speedups to findSynonyms
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17263 **[Test build #74394 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74394/testReport)** for PR 17263 at commit [`63b7ae8`](https://github.com/apache/spark/commit/63b7ae8246f53a16dfbaf3763f73feb8488a1566).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #17263: [SPARK-19922][ML] small speedups to findSynonyms
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17263 **[Test build #74394 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74394/testReport)** for PR 17263 at commit [`63b7ae8`](https://github.com/apache/spark/commit/63b7ae8246f53a16dfbaf3763f73feb8488a1566).
[GitHub] spark issue #17109: [SPARK-19740][MESOS]Add support in Spark to pass arbitra...
Github user yanji84 commented on the issue: https://github.com/apache/spark/pull/17109 @mgummelt comments addressed, please take another look
[GitHub] spark issue #17263: [SPARK-19922][ML] small speedups to findSynonyms
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17263 Merged build finished. Test FAILed.
[GitHub] spark issue #17263: [SPARK-19922][ML] small speedups to findSynonyms
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17263 **[Test build #74393 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74393/testReport)** for PR 17263 at commit [`f22f47f`](https://github.com/apache/spark/commit/f22f47f5b341f930b42ccea507a3697c0953abc1).
* This patch **fails to build**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #17263: [SPARK-19922][ML] small speedups to findSynonyms
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17263 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74393/ Test FAILed.
[GitHub] spark issue #17258: [SPARK-19807][Web UI]Add reason for cancellation when a ...
Github user ajbozarth commented on the issue: https://github.com/apache/spark/pull/17258 Thanks, I like this change but I think the reason string could be simpler, like `"killed via the Web UI"`
[GitHub] spark issue #17263: [SPARK-19922][ML] small speedups to findSynonyms
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17263 **[Test build #74393 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74393/testReport)** for PR 17263 at commit [`f22f47f`](https://github.com/apache/spark/commit/f22f47f5b341f930b42ccea507a3697c0953abc1).
[GitHub] spark pull request #17263: [SPARK-19922][ML] small speedups to findSynonyms
GitHub user Krimit opened a pull request: https://github.com/apache/spark/pull/17263

[SPARK-19922][ML] small speedups to findSynonyms

Currently generating synonyms using a large model (I've tested with 3m words) is very slow. These efficiencies have sped things up for us by ~17%. I wasn't sure if such small changes were worthy of a JIRA, but the guidelines seemed to suggest that that is the preferred approach.

## What changes were proposed in this pull request?

Address a few small issues in the findSynonyms logic:
1) remove usage of ``Array.fill`` to zero out the ``cosineVec`` array. The default float value in Scala and Java is 0.0f, so explicitly setting the values to zero is not needed
2) use Floats throughout. The conversion to Doubles before building the ``priorityQueue`` is totally superfluous, since all the similarity computations are done using Floats anyway. Creating a second large array just serves to put extra strain on the GC
3) convert the slow ``for (i <- cosVec.indices)`` to an ugly, but faster, ``while`` loop

These efficiencies are really only apparent when working with a large model.

## How was this patch tested?

Existing unit tests + some in-house tests to time the difference.

cc @jkbradley @MLNick @srowen

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/Krimit/spark fasterFindSynonyms

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/17263.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #17263

commit f22f47f5b341f930b42ccea507a3697c0953abc1 Author: Asher Krim Date: 2017-03-12T01:19:24Z

small speedups to findSynonyms

Currently generating synonyms using a model with 3m words is painfully slow. These efficiencies have sped things up by more than 17%.
Address a few issues in the findSynonyms logic:
1) no need to zero out the cosineVec array each time, since the default value for float arrays is 0.0f. This should offer some nice speedups
2) use floats throughout. The conversion to Doubles before building the priorityQueue is totally superfluous, since all the computations are done using floats anyway
3) convert the slow for (i <- cosVec.indices), which combines a Scala closure with a Range, to an ugly but faster while loop
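The three optimizations read more concretely as code. Below is a minimal, self-contained Scala sketch of the pattern described in the PR; the names (`wordVectors`, `cosVec`) echo the PR description but this is not the actual `Word2VecModel` implementation, and normalization is omitted for brevity, so `cosVec` holds raw dot products rather than true cosine similarities.

```scala
// Toy flattened row-major word vectors for 4 words of dimension 2:
// word0=(1,0), word1=(0,1), word2=(1,1), word3=(-1,0)
val numWords = 4
val vectorSize = 2
val wordVectors = Array(1f, 0f, 0f, 1f, 1f, 1f, -1f, 0f)

def cosines(query: Array[Float]): Array[Float] = {
  // (1) a freshly allocated Float array is already zero-initialized,
  //     so Array.fill(numWords)(0.0f) would do redundant work
  val cosVec = new Array[Float](numWords)
  // (3) a while loop avoids the Range + closure overhead
  //     of for (i <- cosVec.indices)
  var i = 0
  while (i < numWords) {
    var dot = 0f
    var j = 0
    while (j < vectorSize) {
      dot += wordVectors(i * vectorSize + j) * query(j)
      j += 1
    }
    // (2) stay in Float end to end; no intermediate Double array
    //     is built before ranking the candidates
    cosVec(i) = dot
    i += 1
  }
  cosVec
}

println(cosines(Array(1f, 0f)).mkString(", "))
```

With a 3m-word vocabulary the avoided zeroing pass and the avoided Float-to-Double copy each save a full O(vocabulary) sweep per query, which is where the reported ~17% comes from.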
[GitHub] spark issue #17109: [SPARK-19740][MESOS]Add support in Spark to pass arbitra...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17109 Merged build finished. Test PASSed.
[GitHub] spark issue #17109: [SPARK-19740][MESOS]Add support in Spark to pass arbitra...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17109 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74392/ Test PASSed.
[GitHub] spark issue #17109: [SPARK-19740][MESOS]Add support in Spark to pass arbitra...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17109 **[Test build #74392 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74392/testReport)** for PR 17109 at commit [`737acf0`](https://github.com/apache/spark/commit/737acf07ceea8f4bc92b9eaa8c572af19b2e0b88). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #17109: [SPARK-19740][MESOS]Add support in Spark to pass arbitra...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17109 **[Test build #74392 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74392/testReport)** for PR 17109 at commit [`737acf0`](https://github.com/apache/spark/commit/737acf07ceea8f4bc92b9eaa8c572af19b2e0b88).
[GitHub] spark pull request #16330: [SPARK-18817][SPARKR][SQL] change derby log outpu...
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/16330#discussion_r105546348 --- Diff: R/pkg/inst/tests/testthat/test_sparkSQL.R --- @@ -2897,6 +2898,27 @@ test_that("Collect on DataFrame when NAs exists at the top of a timestamp column expect_equal(class(ldf3$col3), c("POSIXct", "POSIXt")) }) +compare_list <- function(list1, list2) { + # get testthat to show the diff by first making the 2 lists equal in length + expect_equal(length(list1), length(list2)) + l <- max(length(list1), length(list2)) --- End diff -- here's what it looks like ``` 1. Failure: No extra files are created in SPARK_HOME by starting session and making calls (@test_sparkSQL.R#2917) length(list1) not equal to length(list2). 1/1 mismatches [1] 22 - 23 == -1 2. Failure: No extra files are created in SPARK_HOME by starting session and making calls (@test_sparkSQL.R#2917) sort(list1, na.last = TRUE) not equal to sort(list2, na.last = TRUE). 3/23 mismatches x[21]: "unit-tests.out" y[21]: "spark-warehouse" x[22]: "WINDOWS.md" y[22]: "unit-tests.out" x[23]: NA y[23]: "WINDOWS.md" ```
[GitHub] spark pull request #16330: [SPARK-18817][SPARKR][SQL] change derby log outpu...
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/16330#discussion_r105545762 --- Diff: R/pkg/inst/tests/testthat/test_sparkSQL.R --- @@ -2897,6 +2898,27 @@ test_that("Collect on DataFrame when NAs exists at the top of a timestamp column expect_equal(class(ldf3$col3), c("POSIXct", "POSIXt")) }) +compare_list <- function(list1, list2) { + # get testthat to show the diff by first making the 2 lists equal in length + expect_equal(length(list1), length(list2)) + l <- max(length(list1), length(list2)) + length(list1) <- l + length(list2) <- l + expect_equal(sort(list1, na.last = TRUE), sort(list2, na.last = TRUE)) +} + +# This should always be the last test in this test file. +test_that("No extra files are created in SPARK_HOME by starting session and making calls", { + # Check that it is not creating any extra file. + # Does not check the tempdir which would be cleaned up after. + filesAfter <- list.files(path = file.path(Sys.getenv("SPARK_HOME"), "R"), all.files = TRUE) + + expect_true(length(sparkHomeFileBefore) > 0) + compare_list(sparkHomeFileBefore, filesBefore) --- End diff -- I'm trying to catch a few things with this - will add some comment on. for instance, 1) what's created by calling `sparkR.session(enableHiveSupport = F)` (every tests except test_sparkSQL.R) 2) what's created by calling `sparkR.session(enableHiveSupport = T)` (test_sparkSQL.R) this unfortunately doesn't quite work as expected - it should have failed actually instead of passing - because we are running Scala tests before and they have caused spark-warehouse and metastore_db to be created already, before any R code is run. reworking that now.
[GitHub] spark pull request #16330: [SPARK-18817][SPARKR][SQL] change derby log outpu...
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/16330#discussion_r105545724 --- Diff: core/src/main/scala/org/apache/spark/api/r/RRDD.scala --- @@ -127,6 +127,13 @@ private[r] object RRDD { sparkConf.setExecutorEnv(name.toString, value.toString) } +if (sparkEnvirMap.containsKey("spark.r.sql.default.derby.dir") && --- End diff -- well, in revisiting this I thought it would be easier to minimize the impact by making this R only. it would be much easier if we make the derby log going to tmp always for all lang binding
[GitHub] spark pull request #16330: [SPARK-18817][SPARKR][SQL] change derby log outpu...
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/16330#discussion_r105545712 --- Diff: R/pkg/inst/tests/testthat/test_sparkSQL.R --- @@ -2897,6 +2898,27 @@ test_that("Collect on DataFrame when NAs exists at the top of a timestamp column expect_equal(class(ldf3$col3), c("POSIXct", "POSIXt")) }) +compare_list <- function(list1, list2) { + # get testthat to show the diff by first making the 2 lists equal in length + expect_equal(length(list1), length(list2)) + l <- max(length(list1), length(list2)) --- End diff -- the idea is to show enough information from the log without having to rerun the check manually. the first check will show the numeric values but it wouldn't say how exactly they are different. the next check (or moved to compare_list() here) will get testthat to dump the delta too, but first it must set the 2 lists into the same size etc.. in fact, all of these are well tested in "Check masked functions" test in test_context.R, just duplicated here.
[GitHub] spark pull request #16330: [SPARK-18817][SPARKR][SQL] change derby log outpu...
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/16330#discussion_r105545530 --- Diff: R/pkg/inst/tests/testthat/test_sparkSQL.R --- @@ -2897,6 +2898,27 @@ test_that("Collect on DataFrame when NAs exists at the top of a timestamp column expect_equal(class(ldf3$col3), c("POSIXct", "POSIXt")) }) +compare_list <- function(list1, list2) { + # get testthat to show the diff by first making the 2 lists equal in length + expect_equal(length(list1), length(list2)) + l <- max(length(list1), length(list2)) --- End diff -- The lengths should be equal if we get to this line ? Or am I missing something ?
[GitHub] spark pull request #16330: [SPARK-18817][SPARKR][SQL] change derby log outpu...
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/16330#discussion_r105545521 --- Diff: core/src/main/scala/org/apache/spark/api/r/RRDD.scala --- @@ -127,6 +127,13 @@ private[r] object RRDD { sparkConf.setExecutorEnv(name.toString, value.toString) } +if (sparkEnvirMap.containsKey("spark.r.sql.default.derby.dir") && --- End diff -- Its a little awkward that this is set in RRDD. Is there a more general place we can set this in across languages / runtimes (i.e. for Python / Scala as well) ? @cloud-fan @gatorsmile Any thoughts on this ?
[GitHub] spark pull request #16330: [SPARK-18817][SPARKR][SQL] change derby log outpu...
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/16330#discussion_r105545556 --- Diff: R/pkg/inst/tests/testthat/test_sparkSQL.R --- @@ -2897,6 +2898,27 @@ test_that("Collect on DataFrame when NAs exists at the top of a timestamp column expect_equal(class(ldf3$col3), c("POSIXct", "POSIXt")) }) +compare_list <- function(list1, list2) { + # get testthat to show the diff by first making the 2 lists equal in length + expect_equal(length(list1), length(list2)) + l <- max(length(list1), length(list2)) + length(list1) <- l + length(list2) <- l + expect_equal(sort(list1, na.last = TRUE), sort(list2, na.last = TRUE)) +} + +# This should always be the last test in this test file. +test_that("No extra files are created in SPARK_HOME by starting session and making calls", { + # Check that it is not creating any extra file. + # Does not check the tempdir which would be cleaned up after. + filesAfter <- list.files(path = file.path(Sys.getenv("SPARK_HOME"), "R"), all.files = TRUE) + + expect_true(length(sparkHomeFileBefore) > 0) + compare_list(sparkHomeFileBefore, filesBefore) --- End diff -- I'm not sure what we are checking by having both `sparkHomeFilesBefore` and `filesBefore` -- Wouldn't just one of them do the job and if not can we add a comment here ?
[GitHub] spark issue #16596: [SPARK-19237][SPARKR][WIP] R should check for java when ...
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/16596 @felixcheung Any update on this ? Looking through the list of PRs I thought this might be a good one to add to a CRAN submission
[GitHub] spark issue #16330: [SPARK-18817][SPARKR][SQL] change derby log output to te...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16330 Merged build finished. Test PASSed.
[GitHub] spark issue #16330: [SPARK-18817][SPARKR][SQL] change derby log output to te...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16330 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74391/ Test PASSed.
[GitHub] spark issue #16330: [SPARK-18817][SPARKR][SQL] change derby log output to te...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16330 **[Test build #74391 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74391/testReport)** for PR 16330 at commit [`e5b69ca`](https://github.com/apache/spark/commit/e5b69ca67230525c5819c52b581023475a7d7e5c). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #17260: [SPARK-19921] [SQL] [TEST] Enable end-to-end testing usi...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17260 Merged build finished. Test PASSed.
[GitHub] spark issue #17260: [SPARK-19921] [SQL] [TEST] Enable end-to-end testing usi...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17260 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74389/ Test PASSed.
[GitHub] spark issue #17260: [SPARK-19921] [SQL] [TEST] Enable end-to-end testing usi...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17260 **[Test build #74389 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74389/testReport)** for PR 17260 at commit [`e0887d0`](https://github.com/apache/spark/commit/e0887d0568eb04392801f0d44901b4bb1d555cf6). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #16330: [SPARK-18817][SPARKR][SQL] change derby log output to te...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16330 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74388/ Test PASSed.
[GitHub] spark issue #16330: [SPARK-18817][SPARKR][SQL] change derby log output to te...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16330 Merged build finished. Test PASSed.
[GitHub] spark issue #17171: [SPARK-19830] [SQL] Add parseTableSchema API to ParserIn...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17171 Merged build finished. Test PASSed.
[GitHub] spark issue #16330: [SPARK-18817][SPARKR][SQL] change derby log output to te...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16330 **[Test build #74388 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74388/testReport)** for PR 16330 at commit [`8062ee1`](https://github.com/apache/spark/commit/8062ee1e953b2d4393a983c20ed80ab29d8aeffc). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #17171: [SPARK-19830] [SQL] Add parseTableSchema API to ParserIn...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17171 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74390/ Test PASSed.
[GitHub] spark issue #17171: [SPARK-19830] [SQL] Add parseTableSchema API to ParserIn...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17171 **[Test build #74390 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74390/testReport)** for PR 17171 at commit [`22b7db8`](https://github.com/apache/spark/commit/22b7db8bc013d5dcd23c3ef0f45483c47ea66b98). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #17262: [SPARK-17262][SQL] Fixed missing closing bracket spark/s...
Github user elviento commented on the issue: https://github.com/apache/spark/pull/17262 closed
[GitHub] spark pull request #17262: [SPARK-17262][SQL] Fixed missing closing bracket ...
Github user elviento closed the pull request at: https://github.com/apache/spark/pull/17262
[GitHub] spark issue #17262: [SPARK-17262][SQL] Fixed missing closing bracket spark/s...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/17262 @elviento this pull request is completely wrong. Close it
[GitHub] spark issue #17262: [SPARK-17262][SQL] Fixed missing closing bracket spark/s...
Github user elviento commented on the issue: https://github.com/apache/spark/pull/17262 it merges into branch-2.0 - added missing bracket in DataFrameSuite.scala line 1704.
[GitHub] spark issue #17262: [SPARK-17261][SQL] Fixed missing closing bracket spark/s...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/17262 @elviento close this
[GitHub] spark pull request #17261: [SPARK-17261][SQL] Fixed missing closing bracket ...
Github user elviento closed the pull request at: https://github.com/apache/spark/pull/17261
[GitHub] spark issue #17262: [SPARK-17261][SQL] Fixed missing closing bracket spark/s...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/17262 Can you close it please?
[GitHub] spark issue #17262: [SPARK-17261][SQL] Fixed missing closing bracket spark/s...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17262 Can one of the admins verify this patch?
[GitHub] spark pull request #17262: [SPARK-17261][SQL] Fixed missing closing bracket ...
GitHub user elviento opened a pull request: https://github.com/apache/spark/pull/17262 [SPARK-17261][SQL] Fixed missing closing bracket spark/sql/DataFrameSuite.scala

## What changes were proposed in this pull request?

Fixed a missing closing bracket at line 1704 of DataFrameSuite.scala in branch-2.0, found during ./dev/make-distribution.sh:

    /spark/sql/core/target/scala-2.11/test-classes...
    /spark/sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala:1704: Missing closing brace `}' assumed here
    [error] }
    [error] ^
    [error] one error found
    [error] Compile failed at Mar 11, 2017 2:36:12 PM [0.610s]

## How was this patch tested?

Tested: $SPARK_SRC/spark/sql/core/target/scala-2.11/test-classes...
Successful build: $SPARK_SRC/dev/make-distribution.sh --tgz -Psparkr -Phadoop-2.7 -Phive -Phive-thriftserver -Pyarn

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/elviento/spark fix-dataframesuite

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/17262.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #17262

commit f4594900d86bb39358ff19047dfa8c1e4b78aa6b
Author: Andrew Mills
Date: 2016-09-26T20:41:10Z

    [Docs] Update spark-standalone.md to fix link

    Corrected a link to the configuration.html page; it was pointing to a page that does not exist (configurations.html). Documentation change, verified in preview.

    Author: Andrew Mills
    Closes #15244 from ammills01/master.
    (cherry picked from commit 00be16df642317137f17d2d7d2887c41edac3680)
    Signed-off-by: Andrew Or

commit 98bbc4410181741d903a703eac289408cb5b2c5e
Author: Josh Rosen
Date: 2016-09-27T21:14:27Z

    [SPARK-17618] Guard against invalid comparisons between UnsafeRow and other formats

    This patch ports changes from #15185 to Spark 2.x. That patch fixed a correctness bug in Spark 1.6.x caused by an invalid `equals()` comparison between an `UnsafeRow` and another row of a different format. Spark 2.x is not affected by that specific correctness bug, but it can still reap the error-prevention benefits of that patch's changes, which modify `UnsafeRow.equals()` to throw an IllegalArgumentException if it is called with an object that is not an `UnsafeRow`.

    Author: Josh Rosen
    Closes #15265 from JoshRosen/SPARK-17618-master.
    (cherry picked from commit 2f84a686604b298537bfd4d087b41594d2aa7ec6)
    Signed-off-by: Josh Rosen

commit 2cd327ef5e4c3f6b8468ebb2352479a1686b7888
Author: Liang-Chi Hsieh
Date: 2016-09-27T23:00:39Z

    [SPARK-17056][CORE] Fix a wrong assert regarding unroll memory in MemoryStore

    There is an assert in MemoryStore's putIteratorAsValues method which is used to check that unroll memory is not released too much. This assert looks wrong. Tested with Jenkins.

    Author: Liang-Chi Hsieh
    Closes #14642 from viirya/fix-unroll-memory.
    (cherry picked from commit e7bce9e1876de6ee975ccc89351db58119674aef)
    Signed-off-by: Josh Rosen

commit 1b02f8820ddaf3f2a0e7acc9a7f27afc20683cca
Author: Josh Rosen
Date: 2016-09-28T07:59:00Z

    [SPARK-17666] Ensure that RecordReaders are closed by data source file scans (backport)

    This is a branch-2.0 backport of #15245. It addresses a potential cause of resource leaks in data source file scans. As reported in [SPARK-17666](https://issues.apache.org/jira/browse/SPARK-17666), tasks which do not fully consume their input may leak file handles or network connections (e.g. S3 connections). Spark's `NewHadoopRDD` uses a TaskContext callback to [close its record readers](https://github.com/apache/spark/blame/master/core/src/main/scala/org/apache/spark/rdd/NewHadoopRDD.scala#L208), but the new data source file scans only close record readers once their iterators are fully consumed. This patch adds `close()` methods to `RecordReaderIterator` and `HadoopFileLinesReader` and modifies all six implementations of `FileFormat.buildReader()` to register TaskContext task-completion callbacks that guarantee cleanup is eventually performed. Tested manually for now.

    Author: Josh Rosen
    Closes #15271 from JoshRosen/SPARK-17666-backport.

commit 4d73d5cd82ebc980f996c78f9afb8a97418ab7ab
Author: hyukjinkwon
Date: 2016-09-28T10:19:04Z

    [MINOR][PYSPARK][DOCS] Fix examples in PySpark documentation

    T
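The defensive-equality pattern described in the SPARK-17618 commit above (refusing to compare rows of different formats instead of silently returning false) can be sketched in plain Scala. This is a minimal illustrative sketch: the `Row`/`BinaryRow` class names and fields are invented for the example and are not Spark's actual `UnsafeRow` implementation.

```scala
// A plain row format and a binary row format that must never be
// compared to each other via equals(). Illustrative only.
class Row(val values: Array[Int])

class BinaryRow(values: Array[Int]) extends Row(values) {
  override def equals(other: Any): Boolean = other match {
    // Same format: compare contents field by field.
    case that: BinaryRow => values.sameElements(that.values)
    // Different row format: this is a logic error in the caller,
    // not a legitimate "not equal" result, so fail loudly.
    case _: Row =>
      throw new IllegalArgumentException(
        s"Cannot compare BinaryRow to ${other.getClass.getName}")
    case _ => false
  }
  override def hashCode: Int = java.util.Arrays.hashCode(values)
}
```

The key design choice mirrors the commit message: silently returning `false` for a cross-format comparison had caused a correctness bug, so the guard turns that silent wrong answer into an immediate `IllegalArgumentException`.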
[GitHub] spark issue #17261: Branch 2.0
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17261 Can one of the admins verify this patch?
[GitHub] spark pull request #17261: Branch 2.0
GitHub user elviento opened a pull request: https://github.com/apache/spark/pull/17261 Branch 2.0

## What changes were proposed in this pull request?

Fixed a missing closing bracket at line 1704 of DataFrameSuite.scala in branch-2.0, found during ./dev/make-distribution.sh:

    [warn] Pruning sources from previous analysis, due to incompatible CompileSetup.
    [info] Compiling 174 Scala sources and 19 Java sources to /Users/wesleyfabella/Documents/projects/RLE-CloudTeam/spark/sql/core/target/scala-2.11/test-classes...
    [error] /Users/wesleyfabella/Documents/projects/RLE-CloudTeam/spark/sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala:1704: Missing closing brace `}' assumed here
    [error] }
    [error] ^
    [error] one error found
    [error] Compile failed at Mar 11, 2017 2:36:12 PM [0.610s]

## How was this patch tested?

Tested: $SPARK_SRC/spark/sql/core/target/scala-2.11/test-classes...
Successful build: $SPARK_SRC/dev/make-distribution.sh --tgz -Psparkr -Phadoop-2.7 -Phive -Phive-thriftserver -Pyarn

Please review http://spark.apache.org/contributing.html before opening a pull request.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/elviento/spark branch-2.0

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/17261.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #17261

commit f4594900d86bb39358ff19047dfa8c1e4b78aa6b
Author: Andrew Mills
Date: 2016-09-26T20:41:10Z

    [Docs] Update spark-standalone.md to fix link

    Corrected a link to the configuration.html page; it was pointing to a page that does not exist (configurations.html). Documentation change, verified in preview.

    Author: Andrew Mills
    Closes #15244 from ammills01/master.
    (cherry picked from commit 00be16df642317137f17d2d7d2887c41edac3680)
    Signed-off-by: Andrew Or

commit 98bbc4410181741d903a703eac289408cb5b2c5e
Author: Josh Rosen
Date: 2016-09-27T21:14:27Z

    [SPARK-17618] Guard against invalid comparisons between UnsafeRow and other formats

    This patch ports changes from #15185 to Spark 2.x. That patch fixed a correctness bug in Spark 1.6.x caused by an invalid `equals()` comparison between an `UnsafeRow` and another row of a different format. Spark 2.x is not affected by that specific correctness bug, but it can still reap the error-prevention benefits of that patch's changes, which modify `UnsafeRow.equals()` to throw an IllegalArgumentException if it is called with an object that is not an `UnsafeRow`.

    Author: Josh Rosen
    Closes #15265 from JoshRosen/SPARK-17618-master.
    (cherry picked from commit 2f84a686604b298537bfd4d087b41594d2aa7ec6)
    Signed-off-by: Josh Rosen

commit 2cd327ef5e4c3f6b8468ebb2352479a1686b7888
Author: Liang-Chi Hsieh
Date: 2016-09-27T23:00:39Z

    [SPARK-17056][CORE] Fix a wrong assert regarding unroll memory in MemoryStore

    There is an assert in MemoryStore's putIteratorAsValues method which is used to check that unroll memory is not released too much. This assert looks wrong. Tested with Jenkins.

    Author: Liang-Chi Hsieh
    Closes #14642 from viirya/fix-unroll-memory.
    (cherry picked from commit e7bce9e1876de6ee975ccc89351db58119674aef)
    Signed-off-by: Josh Rosen

commit 1b02f8820ddaf3f2a0e7acc9a7f27afc20683cca
Author: Josh Rosen
Date: 2016-09-28T07:59:00Z

    [SPARK-17666] Ensure that RecordReaders are closed by data source file scans (backport)

    This is a branch-2.0 backport of #15245. It addresses a potential cause of resource leaks in data source file scans. As reported in [SPARK-17666](https://issues.apache.org/jira/browse/SPARK-17666), tasks which do not fully consume their input may leak file handles or network connections (e.g. S3 connections). Spark's `NewHadoopRDD` uses a TaskContext callback to [close its record readers](https://github.com/apache/spark/blame/master/core/src/main/scala/org/apache/spark/rdd/NewHadoopRDD.scala#L208), but the new data source file scans only close record readers once their iterators are fully consumed. This patch adds `close()` methods to `RecordReaderIterator` and `HadoopFileLinesReader` and modifies all six implementations of `FileFormat.buildReader()` to register TaskContext task-completion callbacks that guarantee cleanup is eventually performed. Tested manually for now.

    Author: Josh Rosen
    Closes #
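The cleanup pattern described in the SPARK-17666 commit above (an iterator that owns a resource exposes `close()`, and the caller registers it as a task-completion callback so the resource is released even if the iterator is never fully consumed) can be sketched in self-contained Scala. The class names here (`ClosingIterator`, `TaskScope`) are invented for illustration; Spark's actual classes are `RecordReaderIterator` and `TaskContext`.

```scala
import java.io.Closeable
import scala.collection.mutable.ArrayBuffer

// An iterator that wraps a resource and knows how to release it.
class ClosingIterator[T](underlying: Iterator[T], resource: Closeable)
    extends Iterator[T] with Closeable {
  private var closed = false
  def hasNext: Boolean = !closed && underlying.hasNext
  def next(): T = underlying.next()
  // Idempotent close, safe to call from a completion callback.
  def close(): Unit = if (!closed) { closed = true; resource.close() }
}

// A minimal stand-in for TaskContext's completion-callback registry:
// callbacks run when the task finishes, whether or not the iterator
// was drained.
class TaskScope {
  private val callbacks = ArrayBuffer.empty[() => Unit]
  def addCompletionCallback(f: () => Unit): Unit = callbacks += f
  def complete(): Unit = callbacks.foreach(_.apply())
}
```

The point of the design is that relying on "close when the iterator is exhausted" leaks the resource whenever a task stops early (e.g. a `limit` query), whereas the completion callback fires unconditionally.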
[GitHub] spark issue #16330: [SPARK-18817][SPARKR][SQL] change derby log output to te...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16330 **[Test build #74391 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74391/testReport)** for PR 16330 at commit [`e5b69ca`](https://github.com/apache/spark/commit/e5b69ca67230525c5819c52b581023475a7d7e5c).
[GitHub] spark issue #17260: [SPARK-19921] [SQL] [TEST] Enable end-to-end testing usi...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17260 **[Test build #74389 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74389/testReport)** for PR 17260 at commit [`e0887d0`](https://github.com/apache/spark/commit/e0887d0568eb04392801f0d44901b4bb1d555cf6).
[GitHub] spark issue #17171: [SPARK-19830] [SQL] Add parseTableSchema API to ParserIn...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17171 **[Test build #74390 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74390/testReport)** for PR 17171 at commit [`22b7db8`](https://github.com/apache/spark/commit/22b7db8bc013d5dcd23c3ef0f45483c47ea66b98).
[GitHub] spark issue #16373: [SPARK-18961][SQL] Support `SHOW TABLE EXTENDED ... PART...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/16373 Will review this today. :)
[GitHub] spark issue #17171: [SPARK-19830] [SQL] Add parseTableSchema API to ParserIn...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/17171 retest this please
[GitHub] spark pull request #17260: [SPARK-19921] [SQL] [TEST] Enable end-to-end test...
GitHub user gatorsmile opened a pull request: https://github.com/apache/spark/pull/17260 [SPARK-19921] [SQL] [TEST] Enable end-to-end testing using different Hive metastore versions.

### What changes were proposed in this pull request?

To improve the quality of Spark SQL across different Hive metastore versions, this PR enables end-to-end testing using different versions. It allows the test cases in sql/hive to pass the existing Hive client to create a SparkSession.
- Since Derby does not allow concurrent connections, the pre-built Hive clients use a different database from the TestHive built-in 1.2.1 client.
- Since our test cases in sql/hive can only create a single Spark context in the same JVM, the newly created SparkSession shares the same Spark context with the existing TestHive SparkSession.

### How was this patch tested?

Fixed the existing test cases.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/gatorsmile/spark versionSuite

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/17260.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #17260

commit e0887d0568eb04392801f0d44901b4bb1d555cf6
Author: Xiao Li
Date: 2017-03-11T19:04:07Z

    fix.
[GitHub] spark issue #16330: [SPARK-18817][SPARKR][SQL] change derby log output to te...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16330 **[Test build #74388 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74388/testReport)** for PR 16330 at commit [`8062ee1`](https://github.com/apache/spark/commit/8062ee1e953b2d4393a983c20ed80ab29d8aeffc).
[GitHub] spark pull request #17259: Branch 2.0
Github user elviento closed the pull request at: https://github.com/apache/spark/pull/17259
[GitHub] spark pull request #17259: Branch 2.0
GitHub user elviento opened a pull request: https://github.com/apache/spark/pull/17259 Branch 2.0

## What changes were proposed in this pull request?

Missing closing bracket '}' at line 1704, found during the mvn build:

    diff --git a/sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala
    index 6a9279f..3967d07 100644
    --- a/sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala
    +++ b/sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala
    @@ -1701,4 +1701,5 @@ class DataFrameSuite extends QueryTest with SharedSQLContext {
         assert(e3.message.contains(
           "Cannot have map type columns in DataFrame which calls set operations"))
       }
    +  }
     }

## How was this patch tested?

Cloned the branch, applied the above fix, then successfully compiled using ./dev/make-distribution.sh.

Please review http://spark.apache.org/contributing.html before opening a pull request.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/apache/spark branch-2.0

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/17259.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #17259

commit 8a58f2e8ec413591ec00da1e37b91b1bf49e4d1d
Author: Sameer Agarwal
Date: 2016-09-26T20:21:08Z

    [SPARK-17652] Fix confusing exception message while reserving capacity

    This minor patch fixes a confusing exception message while reserving additional capacity in the vectorized parquet reader. Tested with existing unit tests.

    Author: Sameer Agarwal
    Closes #15225 from sameeragarwal/error-msg.
    (cherry picked from commit 7c7586aef9243081d02ea5065435234b5950ab66)
    Signed-off-by: Yin Huai

commit f4594900d86bb39358ff19047dfa8c1e4b78aa6b
Author: Andrew Mills
Date: 2016-09-26T20:41:10Z

    [Docs] Update spark-standalone.md to fix link

    Corrected a link to the configuration.html page; it was pointing to a page that does not exist (configurations.html). Documentation change, verified in preview.

    Author: Andrew Mills
    Closes #15244 from ammills01/master.
    (cherry picked from commit 00be16df642317137f17d2d7d2887c41edac3680)
    Signed-off-by: Andrew Or

commit 98bbc4410181741d903a703eac289408cb5b2c5e
Author: Josh Rosen
Date: 2016-09-27T21:14:27Z

    [SPARK-17618] Guard against invalid comparisons between UnsafeRow and other formats

    This patch ports changes from #15185 to Spark 2.x. That patch fixed a correctness bug in Spark 1.6.x caused by an invalid `equals()` comparison between an `UnsafeRow` and another row of a different format. Spark 2.x is not affected by that specific correctness bug, but it can still reap the error-prevention benefits of that patch's changes, which modify `UnsafeRow.equals()` to throw an IllegalArgumentException if it is called with an object that is not an `UnsafeRow`.

    Author: Josh Rosen
    Closes #15265 from JoshRosen/SPARK-17618-master.
    (cherry picked from commit 2f84a686604b298537bfd4d087b41594d2aa7ec6)
    Signed-off-by: Josh Rosen

commit 2cd327ef5e4c3f6b8468ebb2352479a1686b7888
Author: Liang-Chi Hsieh
Date: 2016-09-27T23:00:39Z

    [SPARK-17056][CORE] Fix a wrong assert regarding unroll memory in MemoryStore

    There is an assert in MemoryStore's putIteratorAsValues method which is used to check that unroll memory is not released too much. This assert looks wrong. Tested with Jenkins.

    Author: Liang-Chi Hsieh
    Closes #14642 from viirya/fix-unroll-memory.
    (cherry picked from commit e7bce9e1876de6ee975ccc89351db58119674aef)
    Signed-off-by: Josh Rosen

commit 1b02f8820ddaf3f2a0e7acc9a7f27afc20683cca
Author: Josh Rosen
Date: 2016-09-28T07:59:00Z

    [SPARK-17666] Ensure that RecordReaders are closed by data source file scans (backport)

    This is a branch-2.0 backport of #15245. It addresses a potential cause of resource leaks in data source file scans. As reported in [SPARK-17666](https://issues.apache.org/jira/browse/SPARK-17666), tasks which do not fully consume their input may cause file handles / network connections
[GitHub] spark issue #16006: [SPARK-18580] [DStreams] [external/kafka-0-10] Use spark...
Github user omuravskiy commented on the issue: https://github.com/apache/spark/pull/16006 Yes
[GitHub] spark issue #17258: [SPARK-19807][Web UI]Add reason for cancellation when a ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17258 Can one of the admins verify this patch?
[GitHub] spark pull request #17258: [SPARK-19807][Web UI]Add reason for cancellation ...
GitHub user shaolinliu opened a pull request: https://github.com/apache/spark/pull/17258 [SPARK-19807][Web UI] Add reason for cancellation when a stage is killed using web UI

## What changes were proposed in this pull request?

When a user kills a stage using the web UI (on the Stages page), StagesTab.handleKillRequest asks SparkContext to cancel the stage without giving a reason. SparkContext has cancelStage(stageId: Int, reason: String), which Spark could use to pass this information along for monitoring/debugging purposes.

## How was this patch tested?

Manual tests.

Please review http://spark.apache.org/contributing.html before opening a pull request.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/shaolinliu/spark SPARK-19807

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/17258.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #17258

commit f43d1d689800d4d6fabdd7c3c4a85065f93bc34c
Author: lvdongr
Date: 2017-03-11T11:37:00Z

    [SPARK-19807][Web UI] Add reason for cancellation when a stage is killed using web UI
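The change this PR proposes (passing an explicit, human-readable reason to `cancelStage` instead of cancelling without one) can be sketched as follows. `cancelStage(stageId: Int, reason: String)` is the SparkContext method named in the PR description; the `StubContext` class and the handler shape below are invented for illustration and are not Spark's actual StagesTab code.

```scala
// Stand-in for SparkContext that records the last cancellation request,
// so the reason-plumbing can be demonstrated without a running cluster.
class StubContext {
  var lastCancellation: Option[(Int, String)] = None
  def cancelStage(stageId: Int, reason: String): Unit =
    lastCancellation = Some((stageId, reason))
}

// Illustrative kill-request handler: instead of cancelling silently,
// it supplies a reason that then shows up in monitoring/debugging output.
def handleKillRequest(sc: StubContext, stageId: Int): Unit =
  sc.cancelStage(stageId, s"Stage $stageId was killed from the Web UI")
```

The benefit is purely observational: downstream listeners and the UI can now display why a stage ended, rather than a bare cancellation.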
[GitHub] spark issue #17242: [SPARK-19902][SQL] Support more expression canonicalizat...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/17242 @cloud-fan I would like to defer the optimization part to another PR, if possible.
[GitHub] spark issue #17257: [DOCS][SS] fix structured streaming python example
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17257 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74387/ Test PASSed.
[GitHub] spark issue #17257: [DOCS][SS] fix structured streaming python example
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17257 Merged build finished. Test PASSed.
[GitHub] spark issue #17257: [DOCS][SS] fix structured streaming python example
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17257 **[Test build #74387 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74387/testReport)** for PR 17257 at commit [`cd82690`](https://github.com/apache/spark/commit/cd8269022e3066d07c6fc480305ccef89efd0993).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #17251: [SPARK-19910][SQL] `stack` should not reject NULL values...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17251 Merged build finished. Test PASSed.
[GitHub] spark issue #17251: [SPARK-19910][SQL] `stack` should not reject NULL values...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17251 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74385/ Test PASSed.
[GitHub] spark issue #17251: [SPARK-19910][SQL] `stack` should not reject NULL values...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17251 **[Test build #74385 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74385/testReport)** for PR 17251 at commit [`0cd5d88`](https://github.com/apache/spark/commit/0cd5d88609a2e36459498a86caec5046d9ebe2b1).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #17254: [SPARK-19917][SQL] qualified partition path stored in cat...
Github user windpiger commented on the issue: https://github.com/apache/spark/pull/17254 cc @cloud-fan @gatorsmile
[GitHub] spark issue #17254: [SPARK-19917][SQL] qualified partition path stored in cat...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17254 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74384/
[GitHub] spark issue #17254: [SPARK-19917][SQL] qualified partition path stored in cat...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17254 Merged build finished. Test PASSed.
[GitHub] spark issue #17254: [SPARK-19917][SQL] qualified partition path stored in cat...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17254 **[Test build #74384 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74384/testReport)** for PR 17254 at commit [`36a3463`](https://github.com/apache/spark/commit/36a34632dbb000799c35727c00d1542d4bb1ce00). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #17256: [SPARK-19919][SQL] Defer throwing the exception for empt...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17256 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74386/
[GitHub] spark issue #17256: [SPARK-19919][SQL] Defer throwing the exception for empt...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17256 Merged build finished. Test FAILed.
[GitHub] spark issue #17256: [SPARK-19919][SQL] Defer throwing the exception for empt...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17256 **[Test build #74386 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74386/testReport)** for PR 17256 at commit [`9d91da1`](https://github.com/apache/spark/commit/9d91da124e0723adee7744a64999ea1c07acfe66). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #15435: [SPARK-17139][ML] Add model summary for MultinomialLogis...
Github user WeichenXu123 commented on the issue: https://github.com/apache/spark/pull/15435 Done. cc @sethah @jkbradley thanks!
[GitHub] spark issue #17257: [DOCS][SS] fix structured streaming python example
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17257 **[Test build #74387 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74387/testReport)** for PR 17257 at commit [`cd82690`](https://github.com/apache/spark/commit/cd8269022e3066d07c6fc480305ccef89efd0993).
[GitHub] spark pull request #17257: [DOCS][SS] fix structured streaming python exampl...
GitHub user uncleGen opened a pull request: https://github.com/apache/spark/pull/17257 [DOCS][SS] fix structured streaming python example

## What changes were proposed in this pull request?

- SS python example: `TypeError: 'xxx' object is not callable`
- some other doc issues.

## How was this patch tested?

Jenkins.

You can merge this pull request into a Git repository by running: $ git pull https://github.com/uncleGen/spark docs-ss-python Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/17257.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #17257

commit cd8269022e3066d07c6fc480305ccef89efd0993 Author: uncleGen Date: 2017-03-11T09:27:40Z fix structured streaming python example code
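The PR summary above does not show the failing snippet, but a common cause of `TypeError: 'xxx' object is not callable` in PySpark example code is invoking a property (such as `DataFrame.isStreaming`) as if it were a method. A minimal, self-contained sketch of that mistake is below; `FakeDataFrame` is a hypothetical stand-in, not PySpark code:

```python
# Illustrative only: PySpark's DataFrame.isStreaming is a property, not a
# method. Calling its result as a function raises the "not callable"
# TypeError mentioned in the PR description. Simulated with a stand-in class.

class FakeDataFrame:
    @property
    def isStreaming(self):
        return True

df = FakeDataFrame()

# Correct: access as a property.
assert df.isStreaming is True

# Incorrect: df.isStreaming() calls the returned bool, which raises
# TypeError (e.g. "'bool' object is not callable").
try:
    df.isStreaming()
    error_message = ""
except TypeError as e:
    error_message = str(e)

print(error_message)
```

This kind of property-vs-method confusion is a frequent source of doc-example breakage when an API exposes read-only state as a Python property.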
[GitHub] spark issue #17256: [SPARK-19919][SQL] Defer throwing the exception for empt...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17256 **[Test build #74386 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74386/testReport)** for PR 17256 at commit [`9d91da1`](https://github.com/apache/spark/commit/9d91da124e0723adee7744a64999ea1c07acfe66).
[GitHub] spark issue #17255: [SPARK-19918][SQL] Use TextFileFormat in implementation o...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/17255 cc @cloud-fan, @joshrosen and @NathanHowell could you take a look and see if it makes sense when you have some time?
[GitHub] spark issue #17256: [SPARK-19919][SQL] Defer throwing the exception for empt...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/17256 retest this please