[GitHub] spark issue #15769: [SPARK-18191][CORE] Port RDD API to use commit protocol
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15769 **[Test build #68330 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68330/consoleFull)** for PR 15769 at commit [`9380f91`](https://github.com/apache/spark/commit/9380f91281587867d8a630199c01c4263bc2e197). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #15047: [SPARK-17495] [SQL] Add Hash capability semantically equ...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/15047 @tejasapatil It sounds like the test case coverage is limited. It does not cover all the data types, right?
[GitHub] spark issue #15769: [SPARK-18191][CORE] Port RDD API to use commit protocol
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/15769 @mridulm I've added most of your comments. Thank you!
[GitHub] spark issue #15805: [SPARK-18346][SQL] TRUNCATE TABLE should fail if no part...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15805 Merged build finished. Test PASSed.
[GitHub] spark issue #15805: [SPARK-18346][SQL] TRUNCATE TABLE should fail if no part...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15805 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/68321/ Test PASSed.
[GitHub] spark issue #15775: [SPARK-18280][Core]Fix potential deadlock in `Standalone...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15775 Merged build finished. Test PASSed.
[GitHub] spark issue #15805: [SPARK-18346][SQL] TRUNCATE TABLE should fail if no part...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15805 **[Test build #68321 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68321/consoleFull)** for PR 15805 at commit [`1b81c12`](https://github.com/apache/spark/commit/1b81c12617fc8d5cd8948e633592d5d306191655). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #15775: [SPARK-18280][Core]Fix potential deadlock in `Standalone...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15775 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/68320/ Test PASSed.
[GitHub] spark issue #15775: [SPARK-18280][Core]Fix potential deadlock in `Standalone...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15775 **[Test build #68320 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68320/consoleFull)** for PR 15775 at commit [`d9c5626`](https://github.com/apache/spark/commit/d9c56269833eaa62dbd48e2d9520cb996d617730). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #15798: [SPARK-18262][BUILD][SQL][WIP] JSON.org license is now C...
Github user rxin commented on the issue: https://github.com/apache/spark/pull/15798 SGTM. Thanks for doing this.
[GitHub] spark issue #15637: [SPARK-18000] [SQL] Aggregation function for computing b...
Github user wzhfy commented on the issue: https://github.com/apache/spark/pull/15637 Point query of countMinSketch can only help us on estimation for predicates like col=1.
[GitHub] spark issue #15637: [SPARK-18000] [SQL] Aggregation function for computing b...
Github user wzhfy commented on the issue: https://github.com/apache/spark/pull/15637 With (value, freq) pairs, i.e. equi-width histogram, given a predicate col<100 and col>50, we can know exactly how many records are in that range, which is the estimated rowcount after the filter. And we can also know the exact ndv after the filter, which is important for following estimation, e.g., in estimation of output row count of agg(group by).
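The range estimate described in the comment above can be illustrated with a minimal sketch. It assumes the complete set of (value, freq) pairs for a column is held in an in-memory `Map`; `ValueFreqEstimate`, `RangeStats` and `estimateOpenRange` are hypothetical names for illustration only and are not part of Spark or of this pull request.

```scala
// Hypothetical helper, not Spark code: exact row count and number of distinct
// values (ndv) for an open range predicate, given every (value, freq) pair.
object ValueFreqEstimate {
  final case class RangeStats(rowCount: Long, ndv: Int)

  def estimateOpenRange(pairs: Map[Long, Long], lower: Long, upper: Long): RangeStats = {
    // Keep only the distinct values that satisfy lower < col < upper.
    val inRange = pairs.filter { case (value, _) => value > lower && value < upper }
    // Row count is the sum of frequencies; ndv is the number of surviving keys.
    RangeStats(rowCount = inRange.values.sum, ndv = inRange.size)
  }

  def main(args: Array[String]): Unit = {
    // Made-up sample data: value -> frequency.
    val pairs = Map(10L -> 4L, 60L -> 3L, 75L -> 2L, 100L -> 5L)
    val stats = estimateOpenRange(pairs, lower = 50L, upper = 100L)
    println(s"rows=${stats.rowCount}, ndv=${stats.ndv}") // prints rows=5, ndv=2
  }
}
```

The result is exact only because every distinct value is stored; a summary structure that answers point queries alone would not give the ndv of the range.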
[GitHub] spark issue #15172: [SPARK-13331] AES support for over-the-wire encryption
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15172 Merged build finished. Test PASSed.
[GitHub] spark issue #15172: [SPARK-13331] AES support for over-the-wire encryption
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15172 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/68319/ Test PASSed.
[GitHub] spark issue #15172: [SPARK-13331] AES support for over-the-wire encryption
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15172 **[Test build #68319 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68319/consoleFull)** for PR 15172 at commit [`c50e088`](https://github.com/apache/spark/commit/c50e088b19a49fd95dd9ae60c11c8b1c81e59b73). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #15797: [SPARK-17990][SPARK-18302][SQL] correct several p...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/15797#discussion_r86936905 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/interface.scala --- @@ -171,6 +171,10 @@ case class CatalogTable( throw new AnalysisException(s"table $identifier did not specify database") } + def location: String = storage.locationUri.getOrElse { --- End diff -- Like the other functions in this file, add the function descriptions too?
[GitHub] spark pull request #15797: [SPARK-17990][SPARK-18302][SQL] correct several p...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/15797#discussion_r86936775 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/catalog/ExternalCatalogSuite.scala --- @@ -591,25 +673,25 @@ abstract class ExternalCatalogSuite extends SparkFunSuite with BeforeAndAfterEac val catalog = newBasicCatalog() val db = catalog.getDatabase("db1") val table = CatalogTable( - identifier = TableIdentifier("my_table", Some("db1")), --- End diff -- Could we keep the original names unchanged? Now, the table/database names allow `alphanumeric characters and underscores`. Thus, I think they are named using underscore on purpose?
[GitHub] spark pull request #15797: [SPARK-17990][SPARK-18302][SQL] correct several p...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/15797#discussion_r86935972 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningUtils.scala --- @@ -451,51 +444,6 @@ object PartitioningUtils { } } - // - // The following string escaping code is mainly copied from Hive (o.a.h.h.common.FileUtils). - // - - val charToEscape = { -val bitSet = new java.util.BitSet(128) - -/** - * ASCII 01-1F are HTTP control characters that need to be escaped. - * \u000A and \u000D are \n and \r, respectively. - */ -val clist = Array( - '\u0001', '\u0002', '\u0003', '\u0004', '\u0005', '\u0006', '\u0007', '\u0008', '\u0009', - '\n', '\u000B', '\u000C', '\r', '\u000E', '\u000F', '\u0010', '\u0011', '\u0012', '\u0013', - '\u0014', '\u0015', '\u0016', '\u0017', '\u0018', '\u0019', '\u001A', '\u001B', '\u001C', - '\u001D', '\u001E', '\u001F', '"', '#', '%', '\'', '*', '/', ':', '=', '?', '\\', '\u007F', - '{', '[', ']', '^') - -clist.foreach(bitSet.set(_)) - -if (Shell.WINDOWS) { - Array(' ', '<', '>', '|').foreach(bitSet.set(_)) -} - -bitSet - } - - def needsEscaping(c: Char): Boolean = { -c >= 0 && c < charToEscape.size() && charToEscape.get(c) - } - - def escapePathName(path: String): String = { -val builder = new StringBuilder() -path.foreach { c => - if (needsEscaping(c)) { -builder.append('%') -builder.append(f"${c.asInstanceOf[Int]}%02X") - } else { -builder.append(c) - } -} - -builder.toString() - } - def unescapePathName(path: String): String = { --- End diff -- If `escapePathName` is moved, I think we also need to move `unescapePathName` to the same file?
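Why the two functions belong together may be easier to see as a round trip. The following is a self-contained sketch under stated assumptions: `PathEscaping` is a hypothetical object name, the escaped character set mirrors the list in the diff above without the Windows-only additions, and the body of `unescapePathName` is an illustrative inverse written for this example, since the diff shows only its signature.

```scala
// Hypothetical standalone object; not the layout used in Spark.
object PathEscaping {
  // ASCII control characters plus the punctuation listed in the diff above.
  private val charToEscape: java.util.BitSet = {
    val bitSet = new java.util.BitSet(128)
    (1 to 31).foreach(i => bitSet.set(i))
    "\"#%'*/:=?\\{[]^".foreach(c => bitSet.set(c.toInt))
    bitSet.set(127) // DEL
    bitSet
  }

  def needsEscaping(c: Char): Boolean = charToEscape.get(c.toInt)

  // Replace each character that needs escaping with '%' and two hex digits.
  def escapePathName(path: String): String = {
    val builder = new StringBuilder()
    path.foreach { c =>
      if (needsEscaping(c)) builder.append('%').append(f"${c.toInt}%02X")
      else builder.append(c)
    }
    builder.toString()
  }

  // Assumed inverse: decode every well-formed %XX sequence, copy the rest.
  def unescapePathName(path: String): String = {
    val sb = new StringBuilder()
    var i = 0
    while (i < path.length) {
      val c = path.charAt(i)
      val code =
        if (c == '%' && i + 2 < path.length) {
          try Integer.parseInt(path.substring(i + 1, i + 3), 16)
          catch { case _: NumberFormatException => -1 }
        } else -1
      if (code >= 0) { sb.append(code.toChar); i += 3 }
      else { sb.append(c); i += 1 }
    }
    sb.toString()
  }

  def main(args: Array[String]): Unit = {
    val raw = "a=b/c:d"
    val escaped = escapePathName(raw)
    println(escaped)                   // prints a%3Db%2Fc%3Ad
    println(unescapePathName(escaped)) // prints a=b/c:d
  }
}
```

Because one function is only meaningful as the inverse of the other, keeping them in the same file makes it harder for the escaped character set and the decoder to drift apart.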
[GitHub] spark issue #15742: [SPARK-16808][Core] History Server main page does not ho...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15742 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/68318/ Test PASSed.
[GitHub] spark issue #15742: [SPARK-16808][Core] History Server main page does not ho...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15742 Merged build finished. Test PASSed.
[GitHub] spark issue #15797: [SPARK-17990][SPARK-18302][SQL] correct several partitio...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15797 **[Test build #68329 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68329/consoleFull)** for PR 15797 at commit [`dddee47`](https://github.com/apache/spark/commit/dddee47b6f75da53beb1f99f1d2c6839444b0dd7).
[GitHub] spark issue #15742: [SPARK-16808][Core] History Server main page does not ho...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15742 **[Test build #68318 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68318/consoleFull)** for PR 15742 at commit [`a7e380d`](https://github.com/apache/spark/commit/a7e380d1a59ef31db2225dd37f21df49edd3a094). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #15797: [SPARK-17990][SPARK-18302][SQL] correct several partitio...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/15797 retest this please
[GitHub] spark issue #15637: [SPARK-18000] [SQL] Aggregation function for computing b...
Github user rxin commented on the issue: https://github.com/apache/spark/pull/15637 Weird - github didn't email me notifications. Why do you care about all (value, freq) pairs? All you want should just be some summary statistics that can tell you given a value, what the estimated freq is, isn't it?
[GitHub] spark issue #15805: [SPARK-18346][SQL] TRUNCATE TABLE should fail if no part...
Github user rxin commented on the issue: https://github.com/apache/spark/pull/15805 LGTM pending Jenkins.
[GitHub] spark issue #15805: [SPARK-18346][SQL] TRUNCATE TABLE should fail if no part...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15805 **[Test build #68328 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68328/consoleFull)** for PR 15805 at commit [`0c18e27`](https://github.com/apache/spark/commit/0c18e279f29986263d6057ed0d76112412510be5).
[GitHub] spark issue #15797: [SPARK-17990][SPARK-18302][SQL] correct several partitio...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15797 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/68322/ Test FAILed.
[GitHub] spark issue #15797: [SPARK-17990][SPARK-18302][SQL] correct several partitio...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15797 Merged build finished. Test FAILed.
[GitHub] spark issue #15797: [SPARK-17990][SPARK-18302][SQL] correct several partitio...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15797 **[Test build #68322 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68322/consoleFull)** for PR 15797 at commit [`dddee47`](https://github.com/apache/spark/commit/dddee47b6f75da53beb1f99f1d2c6839444b0dd7). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #15717: [SPARK-17910][SQL] Allow users to update the comment of ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15717 **[Test build #68327 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68327/consoleFull)** for PR 15717 at commit [`fda6d3a`](https://github.com/apache/spark/commit/fda6d3a70bba1ddccafb5c13a87aacdfd1c80547).
[GitHub] spark issue #15751: [SPARK-18246][SQL] Throws an exception before execution ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15751 **[Test build #68326 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68326/consoleFull)** for PR 15751 at commit [`bae8db8`](https://github.com/apache/spark/commit/bae8db89fef3f25478894eee962b027d8b426b01).
[GitHub] spark issue #15806: [SPARK-18345][STRUCTURED STREAMING] Structured Streaming...
Github user brkyvz commented on the issue: https://github.com/apache/spark/pull/15806 Hi @oza, thank you for this patch, however I'm not sure if this patch fixes anything. What was your problem, what didn't work for you?
[GitHub] spark pull request #15806: [SPARK-18345][STRUCTURED STREAMING] Structured St...
Github user brkyvz commented on a diff in the pull request: https://github.com/apache/spark/pull/15806#discussion_r86933542 --- Diff: examples/src/main/scala/org/apache/spark/examples/sql/streaming/StructuredNetworkWordCount.scala --- @@ -68,6 +70,8 @@ object StructuredNetworkWordCount { val query = wordCounts.writeStream .outputMode("complete") .format("console") + .option("checkpointLocation", --- End diff -- we don't support checkpoint recovery for console sinks though
[GitHub] spark pull request #15806: [SPARK-18345][STRUCTURED STREAMING] Structured St...
Github user brkyvz commented on a diff in the pull request: https://github.com/apache/spark/pull/15806#discussion_r86933461 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/streaming/StreamingQueryManager.scala --- @@ -219,10 +219,11 @@ class StreamingQueryManager private[sql] (sparkSession: SparkSession) { } // If offsets have already been created, we trying to resume a query. - if (!recoverFromCheckpointLocation) { + if (recoverFromCheckpointLocation) { val checkpointPath = new Path(checkpointLocation, "offsets") val fs = checkpointPath.getFileSystem(df.sparkSession.sessionState.newHadoopConf()) if (fs.exists(checkpointPath)) { + // Currently, offset recovery from checkpoint is not supported. --- End diff -- why not? it is supported
[GitHub] spark issue #15751: [SPARK-18246][SQL] Throws an exception before execution ...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/15751 retest this please
[GitHub] spark issue #15804: [SPARK-18342] Make rename failures fatal in HDFSBackedSt...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15804 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/68317/ Test PASSed.
[GitHub] spark issue #15804: [SPARK-18342] Make rename failures fatal in HDFSBackedSt...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15804 Merged build finished. Test PASSed.
[GitHub] spark issue #15804: [SPARK-18342] Make rename failures fatal in HDFSBackedSt...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15804 **[Test build #68317 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68317/consoleFull)** for PR 15804 at commit [`e0c164b`](https://github.com/apache/spark/commit/e0c164b86ce7cff0750615c16b2ee41905bfb875). * This patch passes all tests. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `class RenameReturnsFalseFileSystem extends RawLocalFileSystem `
[GitHub] spark issue #15047: [SPARK-17495] [SQL] Add Hash capability semantically equ...
Github user tejasapatil commented on the issue: https://github.com/apache/spark/pull/15047 @gatorsmile : I have tests in `HiveHasherSuite` that compare the computed values against expected ones. Initially I had thought about generating random input and calling Hive's original hash function to compare the results, but I later dropped that because it would have added a dependency on Hive. See https://github.com/apache/spark/pull/15047#issuecomment-247516360
[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/9 Merged build finished. Test FAILed.
[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/9 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/68325/ Test FAILed.
[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/9 **[Test build #68325 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68325/consoleFull)** for PR 9 at commit [`8516a2c`](https://github.com/apache/spark/commit/8516a2c6d8875cceee49c19f8d70fb71bd2b9225). * This patch **fails MiMa tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #15805: [SPARK-18346][SQL] TRUNCATE TABLE should fail if no part...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/15805 LGTM except a minor comment and test pending Jenkins
[GitHub] spark pull request #15805: [SPARK-18346][SQL] TRUNCATE TABLE should fail if ...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/15805#discussion_r86932312

    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/tables.scala ---
    @@ -324,38 +324,45 @@ case class TruncateTableCommand(
       override def run(spark: SparkSession): Seq[Row] = {
         val catalog = spark.sessionState.catalog
         val table = catalog.getTableMetadata(tableName)
    -    val tableIdentwithDB = table.identifier.quotedString
    +    val tableIdentWithDB = table.identifier.quotedString
         if (table.tableType == CatalogTableType.EXTERNAL) {
           throw new AnalysisException(
    -        s"Operation not allowed: TRUNCATE TABLE on external tables: $tableIdentwithDB")
    +        s"Operation not allowed: TRUNCATE TABLE on external tables: $tableIdentWithDB")
         }
         if (table.tableType == CatalogTableType.VIEW) {
           throw new AnalysisException(
    -        s"Operation not allowed: TRUNCATE TABLE on views: $tableIdentwithDB")
    +        s"Operation not allowed: TRUNCATE TABLE on views: $tableIdentWithDB")
         }
         if (table.partitionColumnNames.isEmpty && partitionSpec.isDefined) {
           throw new AnalysisException(
             s"Operation not allowed: TRUNCATE TABLE ... PARTITION is not supported " +
    -        s"for tables that are not partitioned: $tableIdentwithDB")
    +        s"for tables that are not partitioned: $tableIdentWithDB")
         }
         if (partitionSpec.isDefined) {
           DDLUtils.verifyPartitionProviderIsHive(spark, table, "TRUNCATE TABLE ... PARTITION")
         }
    +
    +    val partCols = table.partitionColumnNames
         val locations =
    -      if (table.partitionColumnNames.isEmpty) {
    +      if (partCols.isEmpty) {
             Seq(table.storage.locationUri)
           } else {
    -        // Here we diverge from Hive when the given partition spec contains all partition columns
    -        // but no partition is matched: Hive will throw an exception and we just do nothing.
             val normalizedSpec = partitionSpec.map { spec =>
               PartitioningUtils.normalizePartitionSpec(
                 spec,
    -            table.partitionColumnNames,
    +            partCols,
                 table.identifier.quotedString,
                 spark.sessionState.conf.resolver)
             }
    -        catalog.listPartitions(table.identifier, normalizedSpec).map(_.storage.locationUri)
    +        val parts =
    --- End diff --

Nit: How about `partLocations`?
[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/9 **[Test build #68325 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68325/consoleFull)** for PR 9 at commit [`8516a2c`](https://github.com/apache/spark/commit/8516a2c6d8875cceee49c19f8d70fb71bd2b9225).
[GitHub] spark issue #15717: [SPARK-17910][SQL] Allow users to update the comment of ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15717 **[Test build #68324 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68324/consoleFull)** for PR 15717 at commit [`cba5bbd`](https://github.com/apache/spark/commit/cba5bbdb58b7510672b1392a9ec29021577998c6).
[GitHub] spark issue #15657: [DO NOT MERGE] Test partition
Github user rxin commented on the issue: https://github.com/apache/spark/pull/15657 What's this about? Close it?
[GitHub] spark issue #15805: [SPARK-18346][SQL] TRUNCATE TABLE should fail if no part...
Github user rxin commented on the issue: https://github.com/apache/spark/pull/15805 LGTM otherwise.
[GitHub] spark pull request #15563: [SPARK-16759][CORE] Add a configuration property ...
Github user weiqingy commented on a diff in the pull request: https://github.com/apache/spark/pull/15563#discussion_r86930677

    --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
    @@ -2523,6 +2524,8 @@ private[spark] class CallerContext(
        taskId: Option[Long] = None,
        taskAttemptNumber: Option[Int] = None) extends Logging {
    +  val upstreamCallerContextStr =
    +    if (upstreamCallerContext.isDefined) s"_${upstreamCallerContext.get}" else ""
    --- End diff --

Yes, I have updated the PR to use this.
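The suffix built in the diff above follows a common `Option` pattern. The reviewer's actual suggestion is not shown in this thread, so the following is only a minimal sketch of equivalent ways to write it; `UpstreamSuffix` and the `suffix*` helpers are hypothetical names, not part of Spark.

```scala
object UpstreamSuffix {
  // Form used in the diff: explicit isDefined / get.
  def suffixIf(upstream: Option[String]): String =
    if (upstream.isDefined) s"_${upstream.get}" else ""

  // Equivalent form with map / getOrElse, which avoids calling .get.
  def suffixMap(upstream: Option[String]): String =
    upstream.map(u => s"_$u").getOrElse("")

  // Equivalent form with fold: the first argument is the empty case.
  def suffixFold(upstream: Option[String]): String =
    upstream.fold("")(u => s"_$u")

  def main(args: Array[String]): Unit = {
    for (in <- Seq(Some("job1"), None)) {
      println(s"$in -> '${suffixIf(in)}' '${suffixMap(in)}' '${suffixFold(in)}'")
    }
  }
}
```

All three helpers should return the same string for the same input; which one reads best is a matter of taste.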
[GitHub] spark issue #15806: [SPARK-18345][STRUCTURED STREAMING] Structured Streaming...
Github user rxin commented on the issue: https://github.com/apache/spark/pull/15806 cc @brkyvz
[GitHub] spark pull request #15627: [SPARK-18099][YARN] Fail if same files added to d...
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/15627#discussion_r86930468

    --- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala ---
    @@ -600,10 +600,14 @@ private[spark] class Client(
             val (_, localizedPath) = distribute(file, resType = resType)
             if (addToClasspath) {
               if (localizedPath != null) {
    -          cachedSecondaryJarLinks += localizedPath
    +            cachedSecondaryJarLinks += localizedPath
               }
             } else {
    -          require(localizedPath !=null)
    +          if (localizedPath != null) {
    --- End diff --

I guess the condition here should be `localizedPath == null`?
[GitHub] spark pull request #15563: [SPARK-16759][CORE] Add a configuration property ...
Github user weiqingy commented on a diff in the pull request: https://github.com/apache/spark/pull/15563#discussion_r86930281

    --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
    @@ -2551,6 +2551,7 @@ private[util] object CallerContext extends Logging {
      */
     private[spark] class CallerContext(
       from: String,
    +  upstreamCallerContext: Option[String] = None,
    --- End diff --

I agree with you that new parameters should be added as the last parameters so that no client code is broken. It's just that in this case there are three callers in total and they are all under my control. The new optional parameter will be used much more frequently than the other optional parameters, so I think I should change the parameter order before the class is used more widely. If I put the new parameter last, users (including the existing three) would have to pass many `None`s as parameters.
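To illustrate the trade-off discussed above, here is a minimal sketch under stated assumptions: `SketchCallerContext` is a hypothetical, simplified stand-in for the class in the diff (only a few of its parameters are reproduced), and the values passed are made up. It shows a positional call that benefits from the new parameter coming second, and a named-argument call that does not depend on parameter order.

```scala
// Hypothetical, simplified stand-in for the class discussed above.
class SketchCallerContext(
    val from: String,
    val upstreamCallerContext: Option[String] = None,
    val taskId: Option[Long] = None,
    val taskAttemptNumber: Option[Int] = None)

object SketchCallerContextDemo {
  // Positional call: no filler arguments are needed because the new
  // parameter sits directly after `from`.
  val positional = new SketchCallerContext("TASK", Some("upstream"))

  // Named call: works wherever the parameter sits in the list, and
  // skips the defaulted parameters that are not mentioned.
  val named = new SketchCallerContext(
    "TASK", taskId = Some(7L), upstreamCallerContext = Some("upstream"))

  def main(args: Array[String]): Unit = {
    println(positional.upstreamCallerContext == named.upstreamCallerContext)
  }
}
```

Named arguments are one way for callers to avoid long runs of `None`; whether that is preferable to reordering the parameters is a judgment call for the PR author and reviewers.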
[GitHub] spark pull request #15563: [SPARK-16759][CORE] Add a configuration property ...
Github user weiqingy commented on a diff in the pull request: https://github.com/apache/spark/pull/15563#discussion_r86930303

    --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
    @@ -2591,6 +2595,16 @@ private[spark] class CallerContext(
           }
         }
       }
    +
    +  def prepareContext(context: String): String = {
    +    // The default max size of Hadoop caller context is 128
    +    lazy val len = SparkHadoopUtil.get.conf.getInt("hadoop.caller.context.max.size", 128)
    +    if (context == null || context.length <= len) {
    +      context
    +    } else {
    +      context.substring(0, len)
    --- End diff --

Yes. Done.
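The truncation logic in the diff above can be tried in isolation. This is a minimal sketch, assuming the limit is passed in as a plain parameter instead of being read from the Hadoop configuration; `CallerContextTruncation`, `DefaultMaxSize`, and the `maxSize` parameter are hypothetical names introduced for illustration.

```scala
object CallerContextTruncation {
  // Hypothetical stand-in for the value the diff reads from
  // "hadoop.caller.context.max.size" (default 128 per the diff comment).
  val DefaultMaxSize = 128

  // Returns the context unchanged when it is null or short enough,
  // otherwise keeps only the first `maxSize` characters.
  def prepareContext(context: String, maxSize: Int = DefaultMaxSize): String =
    if (context == null || context.length <= maxSize) context
    else context.substring(0, maxSize)

  def main(args: Array[String]): Unit = {
    println(prepareContext("SPARK_TASK_AppId_app-1", maxSize = 10))
    println(prepareContext("x" * 200).length)
  }
}
```

Note that truncating at a fixed character count can cut a field in the middle, so anything parsing the context downstream should tolerate an incomplete last field.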
[GitHub] spark issue #15668: [SPARK-18137][SQL]Fix RewriteDistinctAggregates Unresolv...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15668 **[Test build #68323 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68323/consoleFull)** for PR 15668 at commit [`6e58167`](https://github.com/apache/spark/commit/6e58167153234f940f71dbe95be8efa0818e6287).
[GitHub] spark issue #11105: [SPARK-12469][CORE] Data Property accumulators for Spark
Github user rxin commented on the issue: https://github.com/apache/spark/pull/11105 Is there a way for this to have just one class and have both values available always?
[GitHub] spark pull request #15746: [SPARK-18239][SPARKR] Gradient Boosted Tree for R
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/15746#discussion_r86929250

    --- Diff: R/pkg/R/mllib.R ---
    @@ -1863,5 +1889,209 @@ print.summary.RandomForestRegressionModel <- function(x, ...) {
     #' @export
     #' @note print.summary.RandomForestClassificationModel since 2.1.0
     print.summary.RandomForestClassificationModel <- function(x, ...) {
    -  print.summary.randomForest(x)
    +  print.summary.treeEnsemble(x)
    +}
    +
    +#' Gradient Boosted Tree Model for Regression and Classification
    +#'
    +#' \code{spark.gbt} fits a Gradient Boosted Tree Regression model or Classification model on a
    +#' SparkDataFrame. Users can call \code{summary} to get a summary of the fitted
    +#' Gradient Boosted Tree model, \code{predict} to make predictions on new data, and
    +#' \code{write.ml}/\code{read.ml} to save/load fitted models.
    +#' For more details, see
    +#' \href{http://spark.apache.org/docs/latest/ml-classification-regression.html#gradient-boosted-tree-regression}{
    +#' GBT Regression} and
    +#' \href{http://spark.apache.org/docs/latest/ml-classification-regression.html#gradient-boosted-tree-classifier}{
    +#' GBT Classification}
    +#'
    +#' @param data a SparkDataFrame for training.
    +#' @param formula a symbolic description of the model to be fitted. Currently only a few formula
    +#'operators are supported, including '~', ':', '+', and '-'.
    +#' @param type type of model, one of "regression" or "classification", to fit
    +#' @param maxDepth Maximum depth of the tree (>= 0).
    +#' @param maxBins Maximum number of bins used for discretizing continuous features and for choosing
    +#'how to split on features at each node. More bins give higher granularity. Must be
    +#'>= 2 and >= number of categories in any categorical feature.
    +#' @param maxIter Param for maximum number of iterations (>= 0).
    +#' @param stepSize Param for Step size to be used for each iteration of optimization.
    +#' @param lossType Loss function which GBT tries to minimize.
    +#' For classification, must be "logistic". For regression, must be one of
    +#' "squared" (L2) and "absolute" (L1), default is "squared".
    +#' @param seed integer seed for random number generation.
    +#' @param subsamplingRate Fraction of the training data used for learning each decision tree, in
    +#'range (0, 1].
    +#' @param minInstancesPerNode Minimum number of instances each child must have after split. If a
    +#'split causes the left or right child to have fewer than
    +#'minInstancesPerNode, the split will be discarded as invalid. Should be
    +#'>= 1.
    +#' @param minInfoGain Minimum information gain for a split to be considered at a tree node.
    +#' @param checkpointInterval Param for set checkpoint interval (>= 1) or disable checkpoint (-1).
    +#' @param maxMemoryInMB Maximum memory in MB allocated to histogram aggregation.
    +#' @param cacheNodeIds If FALSE, the algorithm will pass trees to executors to match instances with
    +#' nodes. If TRUE, the algorithm will cache node IDs for each instance. Caching
    +#' can speed up training of deeper trees. Users can set how often should the
    +#' cache be checkpointed or disable it by setting checkpointInterval.
    +#' @param ... additional arguments passed to the method.
    +#' @aliases spark.gbt,SparkDataFrame,formula-method
    +#' @return \code{spark.gbt} returns a fitted Gradient Boosted Tree model.
    +#' @rdname spark.gbt
    +#' @name spark.gbt
    +#' @export
    +#' @examples
    +#' \dontrun{
    +#' # fit a Gradient Boosted Tree Regression Model
    +#' df <- createDataFrame(longley)
    +#' model <- spark.gbt(df, Employed ~ ., type = "regression", maxDepth = 5, maxBins = 16)
    +#'
    +#' # get the summary of the model
    +#' summary(model)
    +#'
    +#' # make predictions
    +#' predictions <- predict(model, df)
    +#'
    +#' # save and load the model
    +#' path <- "path/to/model"
    +#' write.ml(model, path)
    +#' savedModel <- read.ml(path)
    +#' summary(savedModel)
    +#'
    +#' # fit a Gradient Boosted Tree Classification Model
    +#' # label must be binary - Only binary classification is supported for GBT.
    +#' df <- createDataFrame(iris[iris$Species != "virginica", ])
    +#' model <- spark.gbt(df, Species ~ Petal_Length + Petal_Width, "classification")
    +#'
    +#' # numeric label is also supported
    +#' iris2 <- iris[iris$Species != "virginica", ]
    +#' iris2$NumericSpecies <- ifelse(iris2$Species == "setosa", 0, 1)
    +#' df <- createDataFrame(iris2)
    +#' model <- spark.gbt(df, NumericSpecies ~ ., type =
[GitHub] spark issue #15668: [SPARK-18137][SQL]Fix RewriteDistinctAggregates Unresolv...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15668 Merged build finished. Test PASSed.
[GitHub] spark issue #15668: [SPARK-18137][SQL]Fix RewriteDistinctAggregates Unresolv...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15668 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/68315/ Test PASSed.
[GitHub] spark issue #15668: [SPARK-18137][SQL]Fix RewriteDistinctAggregates Unresolv...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15668 **[Test build #68315 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68315/consoleFull)** for PR 15668 at commit [`67fc72d`](https://github.com/apache/spark/commit/67fc72dc1e816566cd23234e52592e09e48fbe2c). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #15148: [SPARK-5992][ML] Locality Sensitive Hashing
Github user Yunni commented on the issue: https://github.com/apache/spark/pull/15148

@jkbradley I agree with most of your comments above, and I would like to suggest the following:

- I would recommend a more intuitive name like `HyperplaneProjection` instead of `PStableHashing` if we adopt the LSH function @sethah suggested.
- `x.toDense.values.zip(y.toDense.values).map(pair => pair._1 == pair._2).sum / x.size` is AND-amplification. I think we should use OR-amplification here. I have already made a pull request to fix the issue in #15800.
- I think for MinHash, multi-probing NN search is either single probing or a full scan.
- Here is my reference for multi-probing: http://www.cs.princeton.edu/cass/papers/mplsh_vldb07.pdf

@sethah @karlhigley Now I see your LSH function for Euclidean distance is the AND-amplification of what I have implemented.

- Do you have any reference for compound AND/OR-amplification? As far as I can see, it does not always work without assumptions on the distance threshold and sensitivity, for example, `(0.6, 0.4)` => `(0.426, 0.098)` for `L = 4, d = 4`, and `(0.8, 0.2)` => `(0.678, 0.000)` for `L = 10, d = 10`.
- For the schema of `transform()`, I think we should either add a generic type for the output column in the LSH class or change the output type to `Array[Vector]`. I would recommend the latter because (1) it is very easy to explode the array to get what @sethah suggested, and (2) the type of the output column still needs to be Spark SQL compatible, so it cannot be fully generic anyway.
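The probability pairs quoted in the comment above are consistent with the standard AND-then-OR construction, where a per-function collision probability `p` becomes `1 - (1 - p^d)^L`. The following is a minimal sketch of that arithmetic, assuming `d` is the number of hash functions combined with AND inside each table and `L` is the number of tables combined with OR (the comment does not define them explicitly); `LshAmplification` and its method names are hypothetical.

```scala
object LshAmplification {
  // AND-amplification: all d hash functions must collide.
  def and(p: Double, d: Int): Double = math.pow(p, d)

  // OR-amplification: at least one of l tables must collide.
  def or(p: Double, l: Int): Double = 1.0 - math.pow(1.0 - p, l)

  // Compound scheme: AND over d functions per table, then OR over l tables.
  def andThenOr(p: Double, d: Int, l: Int): Double = or(and(p, d), l)

  def main(args: Array[String]): Unit = {
    // (p1, p2, L, d) pairs taken from the comment above.
    for ((p1, p2, l, d) <- Seq((0.6, 0.4, 4, 4), (0.8, 0.2, 10, 10))) {
      val a1 = andThenOr(p1, d, l)
      val a2 = andThenOr(p2, d, l)
      println(f"L = $l, d = $d: ($p1%.1f, $p2%.1f) => ($a1%.4f, $a2%.4f)")
    }
  }
}
```

Under these assumptions the computed values agree with the figures in the comment up to the last printed digit, which suggests the comment truncated rather than rounded. In both examples the probability for the near pair drops as well (0.6 to about 0.43, 0.8 to about 0.68), which is the concern raised about choosing `L` and `d` without knowing the distance threshold.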
[GitHub] spark pull request #15746: [SPARK-18239][SPARKR] Gradient Boosted Tree for R
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/15746#discussion_r86928504

    --- Diff: R/pkg/R/mllib.R ---
    @@ -1828,13 +1849,13 @@ setMethod("summary", signature(object = "RandomForestRegressionModel"),
     #' @note summary(RandomForestClassificationModel) since 2.1.0
     setMethod("summary", signature(object = "RandomForestClassificationModel"),
               function(object) {
    -            ans <- summary.randomForest(object)
    +            ans <- summary.treeEnsemble(object)
                 class(ans) <- "summary.RandomForestClassificationModel"
                 ans
               })
     # Prints the summary of Random Forest Regression Model
    -print.summary.randomForest <- function(x) {
    +print.summary.treeEnsemble <- function(x) {
    --- End diff --

opened SPARK-18348
[GitHub] spark issue #15790: [SPARK-18264][SPARKR] update vignettes for CRAN release ...
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/15790 opened SPARK-18347
[GitHub] spark issue #15797: [SPARK-17990][SPARK-18302][SQL] correct several partitio...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15797 **[Test build #68322 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68322/consoleFull)** for PR 15797 at commit [`dddee47`](https://github.com/apache/spark/commit/dddee47b6f75da53beb1f99f1d2c6839444b0dd7).
[GitHub] spark issue #15806: [SPARK-18345][STRUCTURED STREAMING] Structured Streaming...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15806 Can one of the admins verify this patch?
[GitHub] spark pull request #15806: [SPARK-18345][STRUCTURED STREAMING] Structured St...
GitHub user oza opened a pull request: https://github.com/apache/spark/pull/15806

[SPARK-18345][STRUCTURED STREAMING] Structured Streaming quick examples fails with default configuration

## What changes were proposed in this pull request?

This PR fixes a failure of the Structured Streaming quick example under the default configuration, which was caused by a failed HDFS connection.

* Use the local filesystem as checkpointLocation.
* Fix a wrong branch condition for recoverFromCheckpointLocation in StreamingQueryManager.

## How was this patch tested?

I tested this manually by running the Structured Streaming quick example with the default configuration. With this fix, it works without configuring HDFS.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/oza/spark SPARK-18345

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/15806.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #15806

commit 30e5364d902dd48cc154de4d8b23183477a3b5c4
Author: Tsuyoshi Ozawa
Date: 2016-11-08T05:34:31Z

    Fix to launch streaming quick example with default configuration
[GitHub] spark issue #15805: [SPARK-18346][SQL] TRUNCATE TABLE should fail if no part...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15805 **[Test build #68321 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68321/consoleFull)** for PR 15805 at commit [`1b81c12`](https://github.com/apache/spark/commit/1b81c12617fc8d5cd8948e633592d5d306191655).
[GitHub] spark issue #15805: [SPARK-18346][SQL] TRUNCATE TABLE should fail if no part...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/15805 cc @gatorsmile
[GitHub] spark pull request #15805: [SPARK-18346][SQL] TRUNCATE TABLE should fail if ...
GitHub user cloud-fan opened a pull request: https://github.com/apache/spark/pull/15805

[SPARK-18346][SQL] TRUNCATE TABLE should fail if no partition is matched for the given non-partial partition spec

## What changes were proposed in this pull request?

A follow-up of https://github.com/apache/spark/pull/15688

## How was this patch tested?

Updated the test in `DDLSuite`.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/cloud-fan/spark truncate

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/15805.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

This closes #15805

commit 1b81c12617fc8d5cd8948e633592d5d306191655
Author: Wenchen Fan
Date: 2016-11-08T05:12:25Z

TRUNCATE TABLE should fail if no partition is matched for the given non-partial partition spec
[GitHub] spark issue #15775: [SPARK-18280][Core]Fix potential deadlock in `Standalone...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15775 **[Test build #68320 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68320/consoleFull)** for PR 15775 at commit [`d9c5626`](https://github.com/apache/spark/commit/d9c56269833eaa62dbd48e2d9520cb996d617730).
[GitHub] spark issue #15775: [SPARK-18280][Core]Fix potential deadlock in `Standalone...
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/15775 retest this please
[GitHub] spark issue #15172: [SPARK-13331] AES support for over-the-wire encryption
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15172 **[Test build #68319 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68319/consoleFull)** for PR 15172 at commit [`c50e088`](https://github.com/apache/spark/commit/c50e088b19a49fd95dd9ae60c11c8b1c81e59b73).
[GitHub] spark issue #15172: [SPARK-13331] AES support for over-the-wire encryption
Github user cjjnjust commented on the issue: https://github.com/apache/spark/pull/15172 Build #68313 looks weird: the details in the link show that it passed all tests.
[GitHub] spark issue #15742: [SPARK-16808][Core] History Server main page does not ho...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15742 **[Test build #68318 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68318/consoleFull)** for PR 15742 at commit [`a7e380d`](https://github.com/apache/spark/commit/a7e380d1a59ef31db2225dd37f21df49edd3a094).
[GitHub] spark issue #15797: [SPARK-17990][SPARK-18302][SQL] correct several partitio...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15797 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/68314/ Test FAILed.
[GitHub] spark issue #15797: [SPARK-17990][SPARK-18302][SQL] correct several partitio...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15797 Merged build finished. Test FAILed.
[GitHub] spark issue #15797: [SPARK-17990][SPARK-18302][SQL] correct several partitio...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15797 **[Test build #68314 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68314/consoleFull)** for PR 15797 at commit [`aa2536f`](https://github.com/apache/spark/commit/aa2536fd954910c418681cda1eb0c6164228e4ec).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #15751: [SPARK-18246][SQL] Throws an exception before execution ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15751 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/68316/ Test FAILed.
[GitHub] spark issue #15751: [SPARK-18246][SQL] Throws an exception before execution ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15751 **[Test build #68316 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68316/consoleFull)** for PR 15751 at commit [`bae8db8`](https://github.com/apache/spark/commit/bae8db89fef3f25478894eee962b027d8b426b01).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #15751: [SPARK-18246][SQL] Throws an exception before execution ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15751 Merged build finished. Test FAILed.
[GitHub] spark issue #15804: [SPARK-18342] Make rename failures fatal in HDFSBackedSt...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15804 **[Test build #68317 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68317/consoleFull)** for PR 15804 at commit [`e0c164b`](https://github.com/apache/spark/commit/e0c164b86ce7cff0750615c16b2ee41905bfb875).
[GitHub] spark issue #15804: [SPARK-18342] Make rename failures fatal in HDFSBackedSt...
Github user brkyvz commented on the issue: https://github.com/apache/spark/pull/15804 cc @tdas
[GitHub] spark pull request #15804: [SPARK-18342] Make rename failures fatal in HDFSB...
GitHub user brkyvz opened a pull request: https://github.com/apache/spark/pull/15804

[SPARK-18342] Make rename failures fatal in HDFSBackedStateStore

## What changes were proposed in this pull request?

If the rename operation in the state store fails (`fs.rename` returns `false`), the StateStore should throw an exception and have the task retry. Currently, a failed rename has no immediate effect during execution. Later, however, snapshot operations fail, and any attempt at recovery (after an executor failure or from a checkpoint) fails as well.

## How was this patch tested?

Unit test

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/brkyvz/spark rename-state

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/15804.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

This closes #15804

commit 4c2d1fd2766750b94a15fa8febd110f27c9f2514
Author: Burak Yavuz
Date: 2016-11-08T03:45:23Z

fix rename

commit e0c164b86ce7cff0750615c16b2ee41905bfb875
Author: Burak Yavuz
Date: 2016-11-08T04:42:26Z

added test
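Editorial illustration (not part of the original message and not the code from this PR): a minimal Java sketch of the pattern the description names, where a rename that reports failure by returning `false` is turned into an exception so the caller cannot silently continue. The class `RenameOrFail`, the method `renameOrThrow`, the exception type, and the file names are all hypothetical; the `BiPredicate` merely stands in for a boolean-returning call such as `fs.rename`, so the sketch runs without Hadoop or Spark on the classpath.

```java
import java.util.function.BiPredicate;

// Hypothetical helper, not Spark code: shows the "a false return is fatal" pattern.
final class RenameOrFail {

    // Unchecked, so the failure propagates to the caller instead of being ignored.
    static final class RenameFailedException extends RuntimeException {
        RenameFailedException(String message) {
            super(message);
        }
    }

    // `rename` stands in for any rename call that signals failure by returning
    // false rather than throwing (an assumption made for this sketch).
    static void renameOrThrow(BiPredicate<String, String> rename, String src, String dst) {
        if (!rename.test(src, dst)) {
            throw new RenameFailedException("Failed to rename " + src + " to " + dst);
        }
    }

    public static void main(String[] args) {
        // A rename that succeeds returns normally.
        renameOrThrow((src, dst) -> true, "1.delta.tmp", "1.delta");

        // A rename that reports failure now surfaces as an exception.
        try {
            renameOrThrow((src, dst) -> false, "2.delta.tmp", "2.delta");
        } catch (RenameFailedException e) {
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```

Failing at the point of the rename keeps the error next to its cause, rather than letting it show up later as a failed snapshot or a failed recovery, which is the symptom the description reports.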
[GitHub] spark issue #15627: [SPARK-18099][YARN] Fail if same files added to distribu...
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/15627 @kishorvpatil @tgravescs It seems this PR breaks the functionality of `--files` and `--archives`. Using `--files` or `--archives` with files that are not also included in `--jars` doesn't work.
[GitHub] spark issue #15803: [SPARK-18298][Web UI]change gmt time to local zone time ...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/15803 If I remember correctly, doesn't a UI change require a screenshot in the PR description?
[GitHub] spark issue #15172: [SPARK-13331] AES support for over-the-wire encryption
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15172 Merged build finished. Test PASSed.
[GitHub] spark issue #15172: [SPARK-13331] AES support for over-the-wire encryption
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15172 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/68311/ Test PASSed.
[GitHub] spark issue #15172: [SPARK-13331] AES support for over-the-wire encryption
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15172 **[Test build #68311 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68311/consoleFull)** for PR 15172 at commit [`7071ca6`](https://github.com/apache/spark/commit/7071ca69c9237c3294804d3c73ef37d1d8ab312d).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #15172: [SPARK-13331] AES support for over-the-wire encryption
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15172 Merged build finished. Test FAILed.
[GitHub] spark issue #15172: [SPARK-13331] AES support for over-the-wire encryption
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15172 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/68313/ Test FAILed.
[GitHub] spark issue #15172: [SPARK-13331] AES support for over-the-wire encryption
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15172 **[Test build #68313 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68313/consoleFull)** for PR 15172 at commit [`5ff19a5`](https://github.com/apache/spark/commit/5ff19a5215122345e9adc436a08357832687d64d).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #15745: [SPARK-18207][SQL] Fix a compilation error due to HashEx...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15745 **[Test build #68310 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68310/consoleFull)** for PR 15745 at commit [`a2b2408`](https://github.com/apache/spark/commit/a2b240808fc449c919c85ebe2eb840ca131ab8a4).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #15751: [SPARK-18246][SQL] Throws an exception before execution ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15751 **[Test build #68316 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68316/consoleFull)** for PR 15751 at commit [`bae8db8`](https://github.com/apache/spark/commit/bae8db89fef3f25478894eee962b027d8b426b01).
[GitHub] spark issue #15668: [SPARK-18137][SQL]Fix RewriteDistinctAggregates Unresolv...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/15668 LGTM, pending jenkins
[GitHub] spark issue #15779: [SPARK-17748][ML] Minor cleanups to one-pass linear regr...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15779 Merged build finished. Test PASSed.
[GitHub] spark issue #15779: [SPARK-17748][ML] Minor cleanups to one-pass linear regr...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15779 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/68312/ Test PASSed.
[GitHub] spark issue #15779: [SPARK-17748][ML] Minor cleanups to one-pass linear regr...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15779 **[Test build #68312 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68312/consoleFull)** for PR 15779 at commit [`72a9302`](https://github.com/apache/spark/commit/72a9302908ce22052095df4620bddc1f91352a97).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #15668: [SPARK-18137][SQL]Fix RewriteDistinctAggregates Unresolv...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15668 **[Test build #68315 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68315/consoleFull)** for PR 15668 at commit [`67fc72d`](https://github.com/apache/spark/commit/67fc72dc1e816566cd23234e52592e09e48fbe2c).
[GitHub] spark pull request #15668: [SPARK-18137][SQL]Fix RewriteDistinctAggregates U...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/15668#discussion_r86919161

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/RewriteDistinctAggregates.scala ---
@@ -115,9 +115,22 @@ object RewriteDistinctAggregates extends Rule[LogicalPlan] {
     }

     // Extract distinct aggregate expressions.
-    val distinctAggGroups = aggExpressions
-      .filter(_.isDistinct)
-      .groupBy(_.aggregateFunction.children.toSet)
+    val distinctAggGroups = aggExpressions.filter(_.isDistinct).groupBy {
--- End diff --

nit:
```
...groupBy { e =>
  ...
}
```
[GitHub] spark issue #15802: [SPARK-18338][SQL] Fix test case initialization order un...
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/15802 Seems it does not work with sbt?
[GitHub] spark issue #15668: [SPARK-18137][SQL]Fix RewriteDistinctAggregates Unresolv...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/15668 add to whitelist