[GitHub] spark issue #18841: [SPARK-21635][SQL] ACOS(2) and ASIN(2) should be null
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18841 Merged build finished. Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #18843: [SPARK-21595] Separate thresholds for buffering and spil...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18843 Merged build finished. Test FAILed.
[GitHub] spark issue #18843: [SPARK-21595] Separate thresholds for buffering and spil...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18843 **[Test build #80236 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80236/testReport)** for PR 18843 at commit [`8e3bfb7`](https://github.com/apache/spark/commit/8e3bfb7715e366a64da6add80253373af7d07915).
* This patch **fails due to an unknown error code, -9**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #18841: [SPARK-21635][SQL] ACOS(2) and ASIN(2) should be null
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18841 **[Test build #80235 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80235/testReport)** for PR 18841 at commit [`ebafff0`](https://github.com/apache/spark/commit/ebafff0b6bc641d6d8cd0437271244d8d11fa2e3).
* This patch **fails due to an unknown error code, -9**.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
  * `abstract class UnaryArcExpression(f: Double => Double, name: String)`
  * `case class Acos(child: Expression) extends UnaryArcExpression(math.acos, "ACOS")`
  * `case class Asin(child: Expression) extends UnaryArcExpression(math.asin, "ASIN")`
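For context on the semantics this PR targets: `java.lang.Math.acos` (which backs Spark's `Acos`) returns NaN for arguments outside [-1, 1], and SPARK-21635 proposes surfacing SQL NULL instead. A minimal Python sketch of the intended behavior; the `sql_acos` helper is hypothetical, not Spark code:

```python
import math

def sql_acos(x):
    # Hypothetical model of SPARK-21635's intent: an out-of-domain input
    # (where Java's Math.acos would return NaN) yields SQL NULL (None).
    if x is None or x < -1.0 or x > 1.0:
        return None
    return math.acos(x)

print(sql_acos(2.0))  # None
print(sql_acos(1.0))  # 0.0
```

Note that Python's own `math.acos(2.0)` raises `ValueError` rather than returning NaN, so the domain guard here does double duty.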
[GitHub] spark issue #18843: [SPARK-21595] Separate thresholds for buffering and spil...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18843 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80236/ Test FAILed.
[GitHub] spark issue #18841: [SPARK-21635][SQL] ACOS(2) and ASIN(2) should be null
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18841 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80235/ Test FAILed.
[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18779 **[Test build #80237 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80237/testReport)** for PR 18779 at commit [`791bc33`](https://github.com/apache/spark/commit/791bc330f8573345e0b5890b43b25259b854bc3d).
[GitHub] spark issue #18829: [SPARK-21620][WEB-UI][CORE]Add metrics url in spark web ...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/18829 I also don't see value in exposing these machine-readable endpoints in a user interface?
[GitHub] spark issue #18839: [SPARK-21634][SQL] Change OneRowRelation from a case obj...
Github user rxin commented on the issue: https://github.com/apache/spark/pull/18839 cc @gatorsmile
[GitHub] spark issue #18710: [SPARK][Docs] Added note on meaning of position to subst...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/18710 Ping @maclockard let's get this one finished
[GitHub] spark issue #18783: [SPARK-21254] [WebUI] History UI performance fixes
Github user srowen commented on the issue: https://github.com/apache/spark/pull/18783 Merged to master
[GitHub] spark pull request #18783: [SPARK-21254] [WebUI] History UI performance fixe...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/18783
[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/18779#discussion_r131328048
--- Diff: sql/core/src/test/resources/sql-tests/inputs/group-by-ordinal.sql ---
@@ -52,8 +52,18 @@ select count(a), a from (select 1 as a) tmp group by 2 having a > 0;
 
 -- mixed cases: group-by ordinals and aliases
 select a, a AS k, count(b) from data group by k, 1;
 
--- turn of group by ordinal
+-- turn off group by ordinal
 set spark.sql.groupByOrdinal=false;
 
 -- can now group by negative literal
 select sum(b) from data group by -1;
+
+select 4, b from data group by 1, 2;
+
+set spark.sql.groupByOrdinal=true;
+
+select 4, b from data group by 1, 2;
+
+-- SPARK-21580 ints in aggregation expressions are taken as group-by ordinal
+select 3, 4, sum(b) from data group by 1, 2;
+select 3 as c, 4 as d, sum(b) from data group by c, d;
--- End diff --
Do we need to add these query tests? They actually have no effect in testing this bug.
[GitHub] spark issue #18413: [SPARK-21205][SQL] pmod(number, 0) should be null.
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/18413 retest this please
[GitHub] spark pull request #18838: [SPARK-21632] There is no need to make attempts f...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/18838#discussion_r131326239
--- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
@@ -294,9 +294,14 @@ private[spark] object Utils extends Logging {
       }
       try {
         dir = new File(root, namePrefix + "-" + UUID.randomUUID.toString)
-        if (dir.exists() || !dir.mkdirs()) {
+        if (!dir.mkdirs()) {
           dir = null
         }
+        if (dir.exists()) {
--- End diff --
I guess it throws NPE if we failed to create a dir?
[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/18779#discussion_r131328405
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameAggregateSuite.scala ---
@@ -557,4 +557,22 @@ class DataFrameAggregateSuite extends QueryTest with SharedSQLContext {
     }
     assert(e.message.contains("aggregate functions are not allowed in GROUP BY"))
   }
+
+  test("SPARK-21580 ints in aggregation expressions are taken as group-by ordinal.") {
--- End diff --
Please also add an order-by test. Maybe add it to `DataFrameSuite`.
[GitHub] spark issue #18668: [SPARK-21451][SQL]get `spark.hadoop.*` properties from s...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18668 **[Test build #80239 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80239/testReport)** for PR 18668 at commit [`ee47742`](https://github.com/apache/spark/commit/ee47742666a5e2d14fa4fd3e89df924542737305).
[GitHub] spark issue #18841: [SPARK-21635][SQL] ACOS(2) and ASIN(2) should be null
Github user wangyum commented on the issue: https://github.com/apache/spark/pull/18841 retest this please
[GitHub] spark pull request #18838: [SPARK-21632] There is no need to make attempts f...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/18838#discussion_r131326197
--- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
@@ -294,9 +294,14 @@ private[spark] object Utils extends Logging {
       }
       try {
         dir = new File(root, namePrefix + "-" + UUID.randomUUID.toString)
-        if (dir.exists() || !dir.mkdirs()) {
--- End diff --
I think we intentionally try to return a non-existent, newly created directory. This doesn't look like a hot path, so performance is not a big deal here.
[GitHub] spark issue #18841: [SPARK-21635][SQL] ACOS(2) and ASIN(2) should be null
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18841 **[Test build #80238 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80238/testReport)** for PR 18841 at commit [`ebafff0`](https://github.com/apache/spark/commit/ebafff0b6bc641d6d8cd0437271244d8d11fa2e3).
[GitHub] spark pull request #18838: [SPARK-21632] There is no need to make attempts f...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/18838#discussion_r131328303
--- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
@@ -294,9 +294,14 @@ private[spark] object Utils extends Logging {
       }
       try {
         dir = new File(root, namePrefix + "-" + UUID.randomUUID.toString)
-        if (dir.exists() || !dir.mkdirs()) {
+        if (!dir.mkdirs()) {
           dir = null
         }
+        if (dir.exists()) {
+          logInfo(s"$dir exists,can't create a new directory.")
+          dir = null
+          return dir.getCanonicalFile
--- End diff --
And here it looks like it always throws an NPE.
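The review above turns on `File.mkdirs` semantics: it returns false both when creation fails and when the path already exists, so the original single check `dir.exists() || !dir.mkdirs()` already covers the "directory exists" case. An illustrative sketch of that single-attempt approach (not Spark's code; `create_unique_dir` is a hypothetical helper):

```python
import os
import tempfile
import uuid

def create_unique_dir(root, prefix="spark"):
    # Mirrors the shape of Utils.createDirectory's per-attempt check:
    # one failed creation attempt means the path is unusable, matching
    # `if (dir.exists() || !dir.mkdirs()) dir = null` in the Scala original.
    path = os.path.join(root, "%s-%s" % (prefix, uuid.uuid4().hex))
    try:
        os.makedirs(path)  # like File.mkdirs, fails if the path already exists
    except OSError:
        return None        # the Scala code sets dir = null here and retries
    return path

root = tempfile.mkdtemp()
d = create_unique_dir(root)
print(d is not None and os.path.isdir(d))  # True
```

Because the name embeds a fresh UUID, a collision with an existing path is vanishingly rare, which is why the reviewers see no performance concern in the combined check.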
[GitHub] spark issue #18413: [SPARK-21205][SQL] pmod(number, 0) should be null.
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18413 **[Test build #80240 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80240/testReport)** for PR 18413 at commit [`da037c8`](https://github.com/apache/spark/commit/da037c810a8c121d7075b741478419ffb77202d8).
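For reference, SPARK-21205 (the PR under test above) proposes that `pmod(n, 0)` evaluate to SQL NULL rather than fail. A hedged Python model of that positive-modulus semantics; the `pmod` helper below is an illustration, not Spark's implementation:

```python
def pmod(n, m):
    # Hypothetical model of Spark SQL's pmod with the SPARK-21205 behavior:
    # a zero modulus yields NULL (None) instead of a division-by-zero error.
    if m == 0:
        return None
    # Positive modulus: shift a possibly negative remainder into range.
    return ((n % m) + m) % m

print(pmod(7, 3))   # 1
print(pmod(-7, 3))  # 2
print(pmod(7, 0))   # None
```

The double-mod idiom matters in languages like Java/Scala where `%` can return a negative remainder; Python's `%` already tracks the divisor's sign, so the extra step is a no-op here but keeps the sketch faithful to the JVM formulation.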
[GitHub] spark issue #18668: [SPARK-21451][SQL]get `spark.hadoop.*` properties from s...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18668 Merged build finished. Test FAILed.
[GitHub] spark issue #18668: [SPARK-21451][SQL]get `spark.hadoop.*` properties from s...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18668 **[Test build #80239 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80239/testReport)** for PR 18668 at commit [`ee47742`](https://github.com/apache/spark/commit/ee47742666a5e2d14fa4fd3e89df924542737305).
* This patch **fails Scala style tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #18668: [SPARK-21451][SQL]get `spark.hadoop.*` properties from s...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18668 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80239/ Test FAILed.
[GitHub] spark pull request #18668: [SPARK-21451][SQL]get `spark.hadoop.*` properties...
Github user yaooqinn commented on a diff in the pull request: https://github.com/apache/spark/pull/18668#discussion_r131329214
--- Diff: sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLCLIDriver.scala ---
@@ -134,6 +135,16 @@ private[hive] object SparkSQLCLIDriver extends Logging {
       // Hive 1.2 + not supported in CLI
       throw new RuntimeException("Remote operations not supported")
     }
+    // Respect the configurations set by --hiveconf from the command line
+    // (based on Hive's CliDriver).
+    val hiveConfFromCmd = sessionState.getOverriddenConfigurations.entrySet().asScala
+    val newHiveConf = hiveConfFromCmd.map { kv =>
+      // If the same property is configured by spark.hadoop.xxx, we ignore it and
+      // obey settings from spark properties
+      val k = kv.getKey
+      val v = sys.props.getOrElseUpdate(SPARK_HADOOP_PROP_PREFIX + k, kv.getValue)
--- End diff --
I checked the whole project; `newClientForExecution` is only used at [HiveThriftServer2.scala#L58](https://github.com/apache/spark/blob/master/sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/HiveThriftServer2.scala#L58) and [HiveThriftServer2.scala#L86](https://github.com/apache/spark/blob/master/sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/HiveThriftServer2.scala#L86).
[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...
Github user 10110346 commented on a diff in the pull request: https://github.com/apache/spark/pull/18779#discussion_r131329942
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameAggregateSuite.scala ---
@@ -557,4 +557,22 @@ class DataFrameAggregateSuite extends QueryTest with SharedSQLContext {
     }
     assert(e.message.contains("aggregate functions are not allowed in GROUP BY"))
   }
+
+  test("SPARK-21580 ints in aggregation expressions are taken as group-by ordinal.") {
--- End diff --
ok, thanks
[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...
Github user 10110346 commented on a diff in the pull request: https://github.com/apache/spark/pull/18779#discussion_r131329917
--- Diff: sql/core/src/test/resources/sql-tests/inputs/group-by-ordinal.sql ---
@@ -52,8 +52,18 @@ select count(a), a from (select 1 as a) tmp group by 2 having a > 0;
 
 -- mixed cases: group-by ordinals and aliases
 select a, a AS k, count(b) from data group by k, 1;
 
--- turn of group by ordinal
+-- turn off group by ordinal
 set spark.sql.groupByOrdinal=false;
 
 -- can now group by negative literal
 select sum(b) from data group by -1;
+
+select 4, b from data group by 1, 2;
+
+set spark.sql.groupByOrdinal=true;
+
+select 4, b from data group by 1, 2;
+
+-- SPARK-21580 ints in aggregation expressions are taken as group-by ordinal
+select 3, 4, sum(b) from data group by 1, 2;
+select 3 as c, 4 as d, sum(b) from data group by c, d;
--- End diff --
You mean these test cases are not necessary?
[GitHub] spark issue #18668: [SPARK-21451][SQL]get `spark.hadoop.*` properties from s...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18668 **[Test build #80241 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80241/testReport)** for PR 18668 at commit [`a2b23f3`](https://github.com/apache/spark/commit/a2b23f3f7206458b7e61c92db85ead9c6f03f6ef).
[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/18779#discussion_r131330713
--- Diff: sql/core/src/test/resources/sql-tests/inputs/group-by-ordinal.sql ---
@@ -52,8 +52,18 @@ select count(a), a from (select 1 as a) tmp group by 2 having a > 0;
 
 -- mixed cases: group-by ordinals and aliases
 select a, a AS k, count(b) from data group by k, 1;
 
--- turn of group by ordinal
+-- turn off group by ordinal
 set spark.sql.groupByOrdinal=false;
 
 -- can now group by negative literal
 select sum(b) from data group by -1;
+
+select 4, b from data group by 1, 2;
+
+set spark.sql.groupByOrdinal=true;
+
+select 4, b from data group by 1, 2;
+
+-- SPARK-21580 ints in aggregation expressions are taken as group-by ordinal
+select 3, 4, sum(b) from data group by 1, 2;
+select 3 as c, 4 as d, sum(b) from data group by c, d;
--- End diff --
Because I ran them without the `transform` -> `resolveOperators` change, and there is no error. Do you get a different result?
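The thread above hinges on how GROUP BY ordinals work: with `spark.sql.groupByOrdinal` enabled, an integer literal in GROUP BY is read as a 1-based position into the select list, which is exactly how SPARK-21580's integer aggregation expressions get misread. A hedged sketch of that resolution rule; `resolve_group_by` is a hypothetical model, not Spark's analyzer:

```python
def resolve_group_by(select_list, group_by, group_by_ordinal=True):
    # Hypothetical model of ordinal resolution: with spark.sql.groupByOrdinal
    # enabled, an in-range integer literal in GROUP BY is replaced by the
    # select-list expression at that 1-based position; otherwise it is kept
    # and grouped as a plain literal.
    resolved = []
    for item in group_by:
        if group_by_ordinal and isinstance(item, int) and 1 <= item <= len(select_list):
            resolved.append(select_list[item - 1])
        else:
            resolved.append(item)
    return resolved

# select a, count(b) from data group by 1  ->  group by a
print(resolve_group_by(["a", "count(b)"], [1]))                          # ['a']
print(resolve_group_by(["a", "count(b)"], [1], group_by_ordinal=False))  # [1]
```

This also explains the `group by -1` test in the quoted .sql file: a negative literal is never a valid ordinal, so it only parses once the ordinal feature is turned off.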
[GitHub] spark issue #18838: [SPARK-21632] There is no need to make attempts for crea...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/18838 This changes the logic and I'm not clear why it's an improvement.
[GitHub] spark issue #18800: [SPARK-21330][SQL] Bad partitioning does not allow to re...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/18800 Merged to master/2.2/2.1
[GitHub] spark pull request #18742: [Spark-21542][ML][Python]Python persistence helpe...
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/18742#discussion_r131331793
--- Diff: python/pyspark/ml/util.py ---
@@ -61,32 +66,89 @@ def _randomUID(cls):
 
 @inherit_doc
-class MLWriter(object):
+class BaseReadWrite(object):
+    """
+    Base class for MLWriter and MLReader. Stores information about the SparkContext
+    and SparkSession.
+
+    .. versionadded:: 2.3.0
+    """
+
+    def __init__(self):
+        self._sparkSession = None
+
+    def context(self, sqlContext):
+        """
+        Sets the Spark SQLContext to use for saving/loading.
+
+        .. note:: Deprecated in 2.1 and will be removed in 3.0, use session instead.
+        """
+        raise NotImplementedError("Read/Write is not yet implemented for type: %s" % type(self))
+
+    def session(self, sparkSession):
+        """
+        Sets the Spark Session to use for saving/loading.
+        """
+        self._sparkSession = sparkSession
+        return self
+
+    @property
+    def sparkSession(self):
+        """
+        Returns the user-specified Spark Session or the default.
+        """
+        if self._sparkSession is None:
+            self._sparkSession = SparkSession.builder.getOrCreate()
+        return self._sparkSession
+
+    @property
+    def sc(self):
+        """
+        Returns the underlying `SparkContext`.
+        """
+        return self.sparkSession.sparkContext
+
+
+@inherit_doc
+class MLWriter(BaseReadWrite):
     """
     Utility class that can save ML instances.
 
     .. versionadded:: 2.0.0
     """
 
+    def __init__(self):
+        super(MLWriter, self).__init__()
+        self.shouldOverwrite = False
+
+    def _handleOverwrite(self, path):
+        from pyspark.ml.wrapper import JavaWrapper
+
+        _java_obj = JavaWrapper._new_java_obj("org.apache.spark.ml.util.FileSystemOverwrite")
+        wrapper = JavaWrapper(_java_obj)
+        wrapper._call_java("handleOverwrite", path, True, self.sc._jsc.sc())
+
     def save(self, path):
         """Save the ML instance to the input path."""
-        raise NotImplementedError("MLWriter is not yet implemented for type: %s" % type(self))
-
-    def overwrite(self):
-        """Overwrites if the output path already exists."""
-        raise NotImplementedError("MLWriter is not yet implemented for type: %s" % type(self))
+        if self.shouldOverwrite:
+            self._handleOverwrite(path)
+        self.saveImpl(path)
 
-    def context(self, sqlContext):
+    def saveImpl(self, path):
         """
-        Sets the SQL context to use for saving.
-
-        .. note:: Deprecated in 2.1 and will be removed in 3.0, use session instead.
+        save() handles overwriting and then calls this method. Subclasses should override this
+        method to implement the actual saving of the instance.
         """
         raise NotImplementedError("MLWriter is not yet implemented for type: %s" % type(self))
 
+    def overwrite(self):
+        """Overwrites if the output path already exists."""
+        self.shouldOverwrite = True
+        return self
+
     def session(self, sparkSession):
--- End diff --
You can remove this instance of session since it is inherited.
[GitHub] spark pull request #18742: [Spark-21542][ML][Python]Python persistence helpe...
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/18742#discussion_r131332176 --- Diff: python/pyspark/ml/util.py --- @@ -283,3 +341,198 @@ def numFeatures(self): Returns the number of features the model was trained on. If unknown, returns -1 """ return self._call_java("numFeatures") + + +@inherit_doc +class DefaultParamsWritable(MLWritable): +""" +.. note:: DeveloperApi + +Helper trait for making simple `Params` types writable. If a `Params` class stores +all data as [[pyspark.ml.param.Param]] values, then extending this trait will provide +a default implementation of writing saved instances of the class. +This only handles simple [[pyspark.ml.param.Param]] types; e.g., it will not handle +[[pyspark.sql.Dataset]]. + +@see `DefaultParamsReadable`, the counterpart to this trait + +.. versionadded:: 2.3.0 +""" + +def write(self): +"""Returns a DefaultParamsWriter instance for this class.""" +if isinstance(self, Params): +return DefaultParamsWriter(self) +else: +raise TypeError("Cannot use DefautParamsWritable with type %s because it does not " + +" extend Params.", type(self)) + + +@inherit_doc +class DefaultParamsWriter(MLWriter): +""" +.. note:: DeveloperApi + +Class for writing Estimators and Transformers whose parameters are JSON-serializable. + +.. versionadded:: 2.3.0 +""" + +def __init__(self, instance): +super(DefaultParamsWriter, self).__init__() +self.instance = instance + +def saveImpl(self, path): +DefaultParamsWriter.save_metadata(self.instance, path, self.sc) + +@staticmethod +def save_metadata(instance, path, sc, extraMetadata=None, paramMap=None): +""" +Saves metadata + Params to: path + "/metadata" +- class +- timestamp +- sparkVersion +- uid +- paramMap +- (optionally, extra metadata) +@param extraMetadata Extra metadata to be saved at same level as uid, paramMap, etc. +@param paramMap If given, this is saved in the "paramMap" field. 
+""" +metadataPath = os.path.join(path, "metadata") +metadataJson = DefaultParamsWriter._get_metadata_to_save(instance, + sc, + extraMetadata, + paramMap) +sc.parallelize([metadataJson], 1).saveAsTextFile(metadataPath) + +@staticmethod +def _get_metadata_to_save(instance, sc, extraMetadata=None, paramMap=None): +""" +Helper for [[save_metadata()]] which extracts the JSON to save. +This is useful for ensemble models which need to save metadata for many sub-models. + +@see [[save_metadata()]] for details on what this includes. +""" +uid = instance.uid +cls = instance.__module__ + '.' + instance.__class__.__name__ +params = instance.extractParamMap() +jsonParams = {} +if paramMap is not None: +jsonParams = paramMap +else: +for p in params: +jsonParams[p.name] = params[p] +basicMetadata = {"class": cls, "timestamp": long(round(time.time() * 1000)), + "sparkVersion": sc.version, "uid": uid, "paramMap": jsonParams} +if extraMetadata is not None: +basicMetadata.update(extraMetadata) +return json.dumps(basicMetadata, separators=[',', ':']) + + +@inherit_doc +class DefaultParamsReadable(MLReadable): +""" +.. note:: DeveloperApi + +Helper trait for making simple `Params` types readable. If a `Params` class stores +all data as [[pyspark.ml.param.Param]] values, then extending this trait will provide --- End diff -- For python docs, follow other examples such as ```:py:class:`MLWriter` ```
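The metadata layout that save_metadata writes to path + "/metadata" can be illustrated as a plain function. This sketch builds the same JSON shape locally; the argument values in the usage below are hypothetical, and the real helper derives them from the instance and the SparkContext.

```python
import json
import time


def metadata_json(cls_name, uid, spark_version, param_map, extra=None):
    """Builds the JSON shape that save_metadata() writes to path/metadata
    (a sketch; the real helper pulls these values from the instance and sc)."""
    metadata = {
        "class": cls_name,
        "timestamp": int(round(time.time() * 1000)),
        "sparkVersion": spark_version,
        "uid": uid,
        "paramMap": param_map,
    }
    if extra is not None:
        metadata.update(extra)
    # Compact separators, matching the diff's json.dumps(..., separators=...)
    return json.dumps(metadata, separators=(',', ':'))
```

For example, `metadata_json("pyspark.ml.feature.Binarizer", "Binarizer_abc123", "2.3.0", {"threshold": 0.5})` (hypothetical values) yields a single compact JSON line suitable for `sc.parallelize([...], 1).saveAsTextFile(...)`.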
[GitHub] spark pull request #18742: [Spark-21542][ML][Python]Python persistence helpe...
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/18742#discussion_r131331807 --- Diff: python/pyspark/ml/util.py --- @@ -156,28 +218,23 @@ def write(self): @inherit_doc -class MLReader(object): +class MLReader(BaseReadWrite): """ Utility class that can load ML instances. .. versionadded:: 2.0.0 """ +def __init__(self): +super(MLReader, self).__init__() + def load(self, path): """Load the ML instance from the input path.""" raise NotImplementedError("MLReader is not yet implemented for type: %s" % type(self)) -def context(self, sqlContext): -""" -Sets the SQL context to use for loading. - -.. note:: Deprecated in 2.1 and will be removed in 3.0, use session instead. -""" -raise NotImplementedError("MLReader is not yet implemented for type: %s" % type(self)) - def session(self, sparkSession): --- End diff -- You can remove this instance of session since it is inherited.
[GitHub] spark pull request #18742: [Spark-21542][ML][Python]Python persistence helpe...
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/18742#discussion_r131332345 --- Diff: python/pyspark/ml/util.py --- @@ -283,3 +341,198 @@ def numFeatures(self): Returns the number of features the model was trained on. If unknown, returns -1 """ return self._call_java("numFeatures") + + +@inherit_doc +class DefaultParamsWritable(MLWritable): +""" +.. note:: DeveloperApi + +Helper trait for making simple `Params` types writable. If a `Params` class stores +all data as [[pyspark.ml.param.Param]] values, then extending this trait will provide +a default implementation of writing saved instances of the class. +This only handles simple [[pyspark.ml.param.Param]] types; e.g., it will not handle +[[pyspark.sql.Dataset]]. + +@see `DefaultParamsReadable`, the counterpart to this trait + +.. versionadded:: 2.3.0 +""" + +def write(self): +"""Returns a DefaultParamsWriter instance for this class.""" +if isinstance(self, Params): +return DefaultParamsWriter(self) +else: +raise TypeError("Cannot use DefautParamsWritable with type %s because it does not " + +" extend Params.", type(self)) + + +@inherit_doc +class DefaultParamsWriter(MLWriter): +""" +.. note:: DeveloperApi + +Class for writing Estimators and Transformers whose parameters are JSON-serializable. + +.. versionadded:: 2.3.0 +""" + +def __init__(self, instance): +super(DefaultParamsWriter, self).__init__() +self.instance = instance + +def saveImpl(self, path): +DefaultParamsWriter.save_metadata(self.instance, path, self.sc) + +@staticmethod +def save_metadata(instance, path, sc, extraMetadata=None, paramMap=None): --- End diff -- If this is going to be public, let's use camelCase to match style
[GitHub] spark issue #18789: SPARK-20433 Bump jackson from 2.6.5 to 2.6.7.1
Github user srowen commented on the issue: https://github.com/apache/spark/pull/18789 Ping @ash211 or should I take this over?
[GitHub] spark issue #18746: [SPARK-21633][ML][Python] UnaryTransformer in Python
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/18746 LGTM. Merging with master. Thanks @ajaysaini725!
[GitHub] spark pull request #18800: [SPARK-21330][SQL] Bad partitioning does not allo...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/18800
[GitHub] spark pull request #18746: [SPARK-21633][ML][Python] UnaryTransformer in Pyt...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/18746
[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...
Github user 10110346 commented on a diff in the pull request: https://github.com/apache/spark/pull/18779#discussion_r131334204 --- Diff: sql/core/src/test/resources/sql-tests/inputs/group-by-ordinal.sql --- @@ -52,8 +52,18 @@ select count(a), a from (select 1 as a) tmp group by 2 having a > 0; -- mixed cases: group-by ordinals and aliases select a, a AS k, count(b) from data group by k, 1; --- turn of group by ordinal +-- turn off group by ordinal set spark.sql.groupByOrdinal=false; -- can now group by negative literal select sum(b) from data group by -1; + +select 4, b from data group by 1, 2; + +set spark.sql.groupByOrdinal=true; + +select 4, b from data group by 1, 2; + +-- SPARK-21580 ints in aggregation expressions are taken as group-by ordinal +select 3, 4, sum(b) from data group by 1, 2; +select 3 as c, 4 as d, sum(b) from data group by c, d; --- End diff -- Oh, I see, thanks. They are not necessary; I will remove them. Also, these test cases already exist in the `DataFrameAggregateSuite`.
[GitHub] spark issue #18648: [SPARK-21428] Turn IsolatedClientLoader off while using ...
Github user yaooqinn commented on the issue: https://github.com/apache/spark/pull/18648 Ping @gatorsmile, could you help review this?
[GitHub] spark pull request #18810: [SPARK-21603][sql]The wholestage codegen will be ...
Github user eatoncys commented on a diff in the pull request: https://github.com/apache/spark/pull/18810#discussion_r131340166 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala --- @@ -356,6 +356,16 @@ class CodegenContext { private val placeHolderToComments = new mutable.HashMap[String, String] /** + * Returns the length of codegen function is too long or not + */ + def existTooLongFunction(): Boolean = { +classFunctions.exists { case (className, functions) => + functions.exists{ case (name, code) => +CodeFormatter.stripExtraNewLines(code).count(_ == '\n') > SQLConf.get.maxFunctionLength --- End diff -- @kiszk Because when the JVM flag -XX:+DontCompileHugeMethods is enabled, a method cannot be JIT-optimized once its bytecode exceeds 8000 bytes. Here I just estimate a function's size by its line count as a proxy for 8000 bytes of bytecode; maybe there are other, better ways.
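The heuristic in this diff (strip extra blank lines, then compare the newline count against a configured limit) can be sketched outside Spark. The default threshold below is a hypothetical stand-in for SQLConf's maxFunctionLength, not Spark's actual value.

```python
def exceeds_max_lines(code, max_lines=1500):
    """Sketch of the line-count heuristic discussed above: drop blank lines
    (like CodeFormatter.stripExtraNewLines) and count the remaining newlines.
    The max_lines default is a hypothetical stand-in for SQLConf's
    maxFunctionLength."""
    stripped = "\n".join(line for line in code.split("\n") if line.strip())
    return stripped.count("\n") > max_lines
```

Counting lines is only a proxy: as the comment notes, the real JIT limit is 8000 bytes of bytecode, which a line count can only approximate.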
[GitHub] spark pull request #16158: [SPARK-18724][ML] Add TuningSummary for TrainVali...
Github user MLnick commented on a diff in the pull request: https://github.com/apache/spark/pull/16158#discussion_r131340257 --- Diff: mllib/src/main/scala/org/apache/spark/ml/tuning/CrossValidator.scala --- @@ -133,7 +134,10 @@ class CrossValidator @Since("1.2.0") (@Since("1.4.0") override val uid: String) logInfo(s"Best cross-validation metric: $bestMetric.") val bestModel = est.fit(dataset, epm(bestIndex)).asInstanceOf[Model[_]] instr.logSuccess(bestModel) -copyValues(new CrossValidatorModel(uid, bestModel, metrics).setParent(this)) +val model = new CrossValidatorModel(uid, bestModel, metrics).setParent(this) +val summary = new TuningSummary(epm, metrics, bestIndex) +model.setSummary(Some(summary)) --- End diff -- Are there other obvious things that might go into the summary in future, that would make a `TuningSummary` class a better fit? Future support for, say, multiple metrics could simply extend the dataframe columns, so that is ok. But is there anything else you can think of?
[GitHub] spark issue #18123: [SPARK-20903] [ML] Word2Vec Skip-Gram + Negative Samplin...
Github user MLnick commented on the issue: https://github.com/apache/spark/pull/18123 ok to test
[GitHub] spark issue #17673: [SPARK-20372] [ML] Word2Vec Continuous Bag of Words mode...
Github user MLnick commented on the issue: https://github.com/apache/spark/pull/17673 ok to test
[GitHub] spark issue #18123: [SPARK-20903] [ML] Word2Vec Skip-Gram + Negative Samplin...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18123 **[Test build #80242 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80242/testReport)** for PR 18123 at commit [`7955181`](https://github.com/apache/spark/commit/79551816d74ef6cdb40e8450e29b742107613313).
[GitHub] spark issue #17673: [SPARK-20372] [ML] Word2Vec Continuous Bag of Words mode...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17673 **[Test build #80243 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80243/testReport)** for PR 17673 at commit [`feda8dc`](https://github.com/apache/spark/commit/feda8dce8c2832bd1a3c61a84bfac9a23629866a).
[GitHub] spark issue #18764: [SPARK-21306][ML] For branch 2.0, OneVsRest should suppo...
Github user facaiy commented on the issue: https://github.com/apache/spark/pull/18764 Test failures in pyspark.ml.tests with Python 2.6, but I don't have the environment.
[GitHub] spark issue #18841: [SPARK-21635][SQL] ACOS(2) and ASIN(2) should be null
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18841 **[Test build #80238 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80238/testReport)** for PR 18841 at commit [`ebafff0`](https://github.com/apache/spark/commit/ebafff0b6bc641d6d8cd0437271244d8d11fa2e3). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `abstract class UnaryArcExpression(f: Double => Double, name: String)` * `case class Acos(child: Expression) extends UnaryArcExpression(math.acos, \"ACOS\")` * `case class Asin(child: Expression) extends UnaryArcExpression(math.asin, \"ASIN\")`
[GitHub] spark issue #18841: [SPARK-21635][SQL] ACOS(2) and ASIN(2) should be null
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18841 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80238/ Test FAILed.
[GitHub] spark issue #18841: [SPARK-21635][SQL] ACOS(2) and ASIN(2) should be null
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18841 Merged build finished. Test FAILed.
[GitHub] spark issue #18764: [SPARK-21306][ML] For branch 2.0, OneVsRest should suppo...
Github user yanboliang commented on the issue: https://github.com/apache/spark/pull/18764 @facaiy No worries, I will take a look.
[GitHub] spark issue #18764: [SPARK-21306][ML] For branch 2.0, OneVsRest should suppo...
Github user facaiy commented on the issue: https://github.com/apache/spark/pull/18764 @yanboliang Thanks, Yanbo. I am not familiar with Python 2.6, which is quite outdated.
[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18779 **[Test build #80244 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80244/testReport)** for PR 18779 at commit [`2dc3610`](https://github.com/apache/spark/commit/2dc36107b33a989b18af39816eb87da8284abeba).
[GitHub] spark issue #18838: [SPARK-21632] There is no need to make attempts for crea...
Github user liu-zhaokun commented on the issue: https://github.com/apache/spark/pull/18838 @srowen Yes, I changed the logic because I think it is pointless to attempt to create a dir when a dir with the same name already exists.
[GitHub] spark issue #18838: [SPARK-21632] There is no need to make attempts for crea...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/18838 This introduces an NPE. It also changes what happens when the dir exists. I don't see a problem that this solves, so I'd close this.
[GitHub] spark pull request #18837: [Spark-20812][Mesos] Add secrets support to the d...
Github user skonto commented on a diff in the pull request: https://github.com/apache/spark/pull/18837#discussion_r131348687 --- Diff: resource-managers/mesos/pom.xml --- @@ -29,7 +29,7 @@ Spark Project Mesos mesos -1.0.0 +1.3.0-rc1 --- End diff -- 1.3.0 is out, I think, right? Should we depend on a stable release? Does this change anything?
[GitHub] spark issue #18842: [SPARK-21636] Several configurations which only are used...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/18842 I would suggest not removing such configurations even if they are only used in unit tests; some users may depend on them explicitly, and abruptly removing them would break compatibility.
[GitHub] spark issue #18842: [SPARK-21636] Several configurations which only are used...
Github user liu-zhaokun commented on the issue: https://github.com/apache/spark/pull/18842 @jerryshao OK, thanks.
[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/18779#discussion_r131349769 --- Diff: sql/core/src/test/resources/sql-tests/inputs/group-by-ordinal.sql --- @@ -52,8 +52,14 @@ select count(a), a from (select 1 as a) tmp group by 2 having a > 0; -- mixed cases: group-by ordinals and aliases select a, a AS k, count(b) from data group by k, 1; --- turn of group by ordinal +-- turn off group by ordinal set spark.sql.groupByOrdinal=false; -- can now group by negative literal select sum(b) from data group by -1; + +select 4, b from data group by 1, 2; + +set spark.sql.groupByOrdinal=true; + +select 4, b from data group by 1, 2; --- End diff -- Why do we need a change like this?
[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/18779#discussion_r131349806 --- Diff: sql/core/src/test/resources/sql-tests/inputs/order-by-ordinal.sql --- @@ -34,3 +34,8 @@ set spark.sql.orderByOrdinal=false; -- 0 is now a valid literal select * from data order by 0; select * from data sort by 0; + +select 4 as k, 5, b from data order by k, 2, 3; + +set spark.sql.orderByOrdinal=true; +select 4 as k, 5, b from data order by k, 2, 3; --- End diff -- Do we still need to keep this change?
[GitHub] spark issue #18838: [SPARK-21632] There is no need to make attempts for crea...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/18838 I think the original purpose is to avoid reusing the directory and always create a unique directory. The change here seems to alter those semantics, so I don't think it is a reasonable fix.
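The semantics jerryshao describes (always create a fresh, uniquely named directory, retrying on collision instead of reusing an existing path) can be sketched like this; the naming scheme and retry count below are illustrative, not Spark's implementation.

```python
import os
import uuid


def create_unique_dir(root, prefix="spark", max_attempts=10):
    """Always creates a fresh, uniquely named directory under root,
    retrying with a new random name on collision rather than reusing
    an existing directory. Names and retry count are illustrative."""
    for _ in range(max_attempts):
        candidate = os.path.join(root, "%s-%s" % (prefix, uuid.uuid4()))
        try:
            os.makedirs(candidate)  # raises OSError if the path already exists
            return candidate
        except OSError:
            continue  # collision: try a new random name
    raise IOError("Failed to create a unique directory under %s" % root)
```

Reusing an existing directory would silently mix files from a previous run, which is why the retry-on-collision approach creates a new path every time.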
[GitHub] spark pull request #18837: [Spark-20812][Mesos] Add secrets support to the d...
Github user skonto commented on a diff in the pull request: https://github.com/apache/spark/pull/18837#discussion_r131350894 --- Diff: docs/running-on-mesos.md --- @@ -479,6 +479,35 @@ See the [configuration page](configuration.html) for information on Spark config + spark.mesos.driver.secret.envkey + (none) + +If set, the contents of the secret referenced by +spark.mesos.driver.secret.name will be written to the provided +environment variable in the driver's process. + + + +spark.mesos.driver.secret.filename + (none) + +If set, the contents of the secret referenced by +spark.mesos.driver.secret.name will be written to the provided +file. Relative paths are relative to the container's work +directory. Absolute paths must already exist. Consult the Mesos Secret +protobuf for more information. + + + + spark.mesos.driver.secret.name --- End diff -- Is this the fully qualified path to the secret in the store?
[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...
Github user 10110346 commented on a diff in the pull request: https://github.com/apache/spark/pull/18779#discussion_r131351217
--- Diff: sql/core/src/test/resources/sql-tests/inputs/order-by-ordinal.sql ---
@@ -34,3 +34,8 @@
 set spark.sql.orderByOrdinal=false; -- 0 is now a valid literal
 select * from data order by 0;
 select * from data sort by 0;
+
+select 4 as k, 5, b from data order by k, 2, 3;
+
+set spark.sql.orderByOrdinal=true;
+select 4 as k, 5, b from data order by k, 2, 3;
--- End diff --
It is not necessary; I will remove it. Thanks.
[GitHub] spark issue #18838: [SPARK-21632] There is no need to make attempts for crea...
Github user liu-zhaokun commented on the issue: https://github.com/apache/spark/pull/18838 @jerryshao Thanks for your reply.
[GitHub] spark issue #18838: [SPARK-21632] There is no need to make attempts for crea...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/18838 Could we close this one?
[GitHub] spark pull request #18837: [Spark-20812][Mesos] Add secrets support to the d...
Github user skonto commented on a diff in the pull request: https://github.com/apache/spark/pull/18837#discussion_r131351705
--- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerUtils.scala ---
@@ -510,12 +510,20 @@ trait MesosSchedulerUtils extends Logging {
   }
   def mesosToTaskState(state: MesosTaskState): TaskState.TaskState = state match {
-    case MesosTaskState.TASK_STAGING | MesosTaskState.TASK_STARTING => TaskState.LAUNCHING
-    case MesosTaskState.TASK_RUNNING | MesosTaskState.TASK_KILLING => TaskState.RUNNING
+    case MesosTaskState.TASK_STAGING |
+         MesosTaskState.TASK_STARTING => TaskState.LAUNCHING
+    case MesosTaskState.TASK_RUNNING |
+         MesosTaskState.TASK_KILLING => TaskState.RUNNING
     case MesosTaskState.TASK_FINISHED => TaskState.FINISHED
-    case MesosTaskState.TASK_FAILED => TaskState.FAILED
+    case MesosTaskState.TASK_FAILED |
+         MesosTaskState.TASK_GONE |
+         MesosTaskState.TASK_GONE_BY_OPERATOR => TaskState.FAILED
--- End diff --
These are new to 1.3?
[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18779 **[Test build #80237 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80237/testReport)** for PR 18779 at commit [`791bc33`](https://github.com/apache/spark/commit/791bc330f8573345e0b5890b43b25259b854bc3d). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18779 Merged build finished. Test PASSed.
[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18779 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80237/ Test PASSed.
[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/18779#discussion_r131353213
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala ---
@@ -2023,4 +2023,11 @@ class DataFrameSuite extends QueryTest with SharedSQLContext {
     assert(df1.join(df2, $"t1.i" === $"t2.i").cache().count() == 1)
   }
 }
+
+  test("order-by ordinal.") {
+    val df = Seq((1, 1), (1, 2), (2, 1), (2, 2), (3, 1), (3, 2)).toDF("a", "b")
+    checkAnswer(
+      df.select(lit(7), 'a, 'b).orderBy(lit(1), lit(2), lit(3)),
+      Seq(Row(7, 1, 1), Row(7, 1, 2), Row(7, 2, 1), Row(7, 2, 2), Row(7, 3, 1), Row(7, 3, 2)))
--- End diff --
I've run this test. Even without `transform` -> `resolveOperators`, it still works. Can you check it again?
[GitHub] spark issue #18841: [SPARK-21635][SQL] ACOS(2) and ASIN(2) should be null
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18841 **[Test build #80245 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80245/testReport)** for PR 18841 at commit [`450e0c9`](https://github.com/apache/spark/commit/450e0c9840931443f747ef39e77c3f5d49255468).
[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18779 **[Test build #80246 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80246/testReport)** for PR 18779 at commit [`c1594c7`](https://github.com/apache/spark/commit/c1594c78b27a1beca446b5530ec2e88653619742).
[GitHub] spark issue #17673: [SPARK-20372] [ML] Word2Vec Continuous Bag of Words mode...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17673 **[Test build #80243 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80243/testReport)** for PR 17673 at commit [`feda8dc`](https://github.com/apache/spark/commit/feda8dce8c2832bd1a3c61a84bfac9a23629866a). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #17673: [SPARK-20372] [ML] Word2Vec Continuous Bag of Words mode...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17673 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80243/ Test PASSed.
[GitHub] spark issue #17673: [SPARK-20372] [ML] Word2Vec Continuous Bag of Words mode...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17673 Merged build finished. Test PASSed.
[GitHub] spark issue #18123: [SPARK-20903] [ML] Word2Vec Skip-Gram + Negative Samplin...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18123 **[Test build #80242 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80242/testReport)** for PR 18123 at commit [`7955181`](https://github.com/apache/spark/commit/79551816d74ef6cdb40e8450e29b742107613313). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #18123: [SPARK-20903] [ML] Word2Vec Skip-Gram + Negative Samplin...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18123 Merged build finished. Test PASSed.
[GitHub] spark issue #18123: [SPARK-20903] [ML] Word2Vec Skip-Gram + Negative Samplin...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18123 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80242/ Test PASSed.
[GitHub] spark pull request #18837: [Spark-20812][Mesos] Add secrets support to the d...
Github user skonto commented on a diff in the pull request: https://github.com/apache/spark/pull/18837#discussion_r131354746
--- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosClusterScheduler.scala ---
@@ -529,18 +560,54 @@ private[spark] class MesosClusterScheduler(
     val appName = desc.conf.get("spark.app.name")
+
     TaskInfo.newBuilder()
       .setTaskId(taskId)
       .setName(s"Driver for ${appName}")
       .setSlaveId(offer.offer.getSlaveId)
       .setCommand(buildDriverCommand(desc))
+      .setContainer(getContainerInfo(desc))
       .addAllResources(cpuResourcesToUse.asJava)
       .addAllResources(memResourcesToUse.asJava)
       .setLabels(MesosProtoUtils.mesosLabels(desc.conf.get(config.DRIVER_LABELS).getOrElse("")))
--- End diff --
Cleaner, I think, to do here:
val labels = MesosProtoUtils.mesosLabels(desc.conf.get(config.DRIVER_LABELS).getOrElse(""))
.setLabels(labels)
[GitHub] spark issue #18413: [SPARK-21205][SQL] pmod(number, 0) should be null.
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18413 **[Test build #80240 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80240/testReport)** for PR 18413 at commit [`da037c8`](https://github.com/apache/spark/commit/da037c810a8c121d7075b741478419ffb77202d8). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #18413: [SPARK-21205][SQL] pmod(number, 0) should be null.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18413 Merged build finished. Test PASSed.
[GitHub] spark issue #18413: [SPARK-21205][SQL] pmod(number, 0) should be null.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18413 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80240/ Test PASSed.
[GitHub] spark pull request #18413: [SPARK-21205][SQL] pmod(number, 0) should be null...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/18413
[GitHub] spark issue #18843: [SPARK-21595] Separate thresholds for buffering and spil...
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/18843 retest this please
[GitHub] spark issue #18668: [SPARK-21637][SPARK-21451][SQL]get `spark.hadoop.*` prop...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18668 **[Test build #80241 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80241/testReport)** for PR 18668 at commit [`a2b23f3`](https://github.com/apache/spark/commit/a2b23f3f7206458b7e61c92db85ead9c6f03f6ef). * This patch **fails PySpark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #18668: [SPARK-21637][SPARK-21451][SQL]get `spark.hadoop.*` prop...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18668 Merged build finished. Test FAILed.
[GitHub] spark issue #18668: [SPARK-21637][SPARK-21451][SQL]get `spark.hadoop.*` prop...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18668 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80241/ Test FAILed.
[GitHub] spark issue #18843: [SPARK-21595] Separate thresholds for buffering and spil...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18843 **[Test build #80247 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80247/testReport)** for PR 18843 at commit [`8e3bfb7`](https://github.com/apache/spark/commit/8e3bfb7715e366a64da6add80253373af7d07915).
[GitHub] spark issue #18622: [SPARK-21340] Bring pyspark BinaryClassificationMetrics ...
Github user jakecharland commented on the issue: https://github.com/apache/spark/pull/18622 +1 on getting this verified.
[GitHub] spark issue #18622: [SPARK-21340] Bring pyspark BinaryClassificationMetrics ...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/18622 @jakecharland that's just an automated message asking if it's OK to test. I'll kick it off.
[GitHub] spark issue #18622: [SPARK-21340] Bring pyspark BinaryClassificationMetrics ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18622 **[Test build #3879 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3879/testReport)** for PR 18622 at commit [`d953bac`](https://github.com/apache/spark/commit/d953bac8a3715bf318c0639ff0472a4946162164).
[GitHub] spark pull request #18810: [SPARK-21603][sql]The wholestage codegen will be ...
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/18810#discussion_r131359039
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala ---
@@ -356,6 +356,16 @@ class CodegenContext {
   private val placeHolderToComments = new mutable.HashMap[String, String]
  /**
   * Returns whether any codegen function is too long or not
   */
  def existTooLongFunction(): Boolean = {
    classFunctions.exists { case (className, functions) =>
      functions.exists { case (name, code) =>
        CodeFormatter.stripExtraNewLines(code).count(_ == '\n') > SQLConf.get.maxFunctionLength
--- End diff --
Got it. Thank you for your explanation. It would be good to add a comment for this reasoning. I have seen similar control at [here](github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala#L758-L768). Can we unify these control mechanisms into one?
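The check under review counts the lines of each generated function body (after collapsing blank lines) against a configurable limit. A hedged Python sketch of that line-counting idea (the helper names and the default threshold are illustrative, not Spark's actual values):

```python
import re


def strip_extra_new_lines(code: str) -> str:
    """Collapse runs of blank lines, mirroring the intent of
    CodeFormatter.stripExtraNewLines in the diff above."""
    return re.sub(r"\n\s*\n", "\n", code).strip()


def exists_too_long_function(functions: dict, max_lines: int) -> bool:
    """Return True if any generated function body, measured by newline
    count after stripping blank lines, exceeds max_lines."""
    return any(
        strip_extra_new_lines(body).count("\n") > max_lines
        for body in functions.values()
    )
```

The point of stripping blank lines first is that the threshold should measure actual statements, not formatting padding in the generated code.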
[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/18779#discussion_r131360533
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala ---
@@ -2023,4 +2023,11 @@ class DataFrameSuite extends QueryTest with SharedSQLContext {
     assert(df1.join(df2, $"t1.i" === $"t2.i").cache().count() == 1)
   }
 }
+
+  test("order-by ordinal.") {
+    val df = Seq((1, 1), (1, 2), (2, 1), (2, 2), (3, 1), (3, 2)).toDF("a", "b")
+    checkAnswer(
+      df.select(lit(7), 'a, 'b).orderBy(lit(1), lit(2), lit(3)),
+      Seq(Row(7, 1, 1), Row(7, 1, 2), Row(7, 2, 1), Row(7, 2, 2), Row(7, 3, 1), Row(7, 3, 2)))
--- End diff --
OK, I see. When we input a query like `df.select(lit(7), 'a, 'b).orderBy(lit(1), lit(2), lit(3))`, the query plan looks like:
Sort [7#22 ASC NULLS FIRST, a#5 ASC NULLS FIRST, b#6 ASC NULLS FIRST], true
+- Project [7 AS 7#22, a#5, b#6]
   +- Project [_1#2 AS a#5, _2#3 AS b#6]
      +- LocalRelation [_1#2, _2#3]
We have a `Project` below `Sort`. The ordinal `1` is replaced with the attribute `7#22`, so we won't get an int literal 7 here. That is why it passes. Can you write a test for ordinal order-by that shows different behavior? If not, I think ordinal order-by is safe from this bug, and we don't need to add this test.
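The ordinal resolution viirya describes, where an integer key in ORDER BY or GROUP BY is substituted with the same-position select-list expression, can be sketched outside Spark. A hedged Python toy model (this is not Spark's analyzer; names are illustrative), whose failure mode matches the SPARK-21580 bug: if the substitution is applied to integer *literals* that were never meant as ordinals, they silently become column references.

```python
def resolve_ordinals(select_list, keys, ordinal_enabled=True):
    """Replace an integer key `n` with the n-th select-list expression,
    mimicking ordinal resolution in ORDER BY / GROUP BY. When the
    feature is disabled, integers are left alone as plain literals."""
    if not ordinal_enabled:
        return list(keys)
    resolved = []
    for key in keys:
        if isinstance(key, int):
            if not 1 <= key <= len(select_list):
                raise ValueError(
                    f"position {key} is not in select list "
                    f"(size {len(select_list)})")
            resolved.append(select_list[key - 1])  # 1-based ordinal
        else:
            resolved.append(key)
    return resolved
```

For example, with select list `["k", "5", "b"]`, the keys `[1, 3]` resolve to `["k", "b"]`; with the feature off they stay `[1, 3]` and sort every row by the same constant.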
[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/18779#discussion_r131360956
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala ---
@@ -2023,4 +2023,11 @@ class DataFrameSuite extends QueryTest with SharedSQLContext {
     assert(df1.join(df2, $"t1.i" === $"t2.i").cache().count() == 1)
   }
 }
+
+  test("order-by ordinal.") {
+    val df = Seq((1, 1), (1, 2), (2, 1), (2, 2), (3, 1), (3, 2)).toDF("a", "b")
+    checkAnswer(
+      df.select(lit(7), 'a, 'b).orderBy(lit(1), lit(2), lit(3)),
+      Seq(Row(7, 1, 1), Row(7, 1, 2), Row(7, 2, 1), Row(7, 2, 2), Row(7, 3, 1), Row(7, 3, 2)))
--- End diff --
Hmm, maybe we can still keep this test for the case that someone changes the Sort/Project relationship in the future.
[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/18779#discussion_r131361398
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameAggregateSuite.scala ---
@@ -557,4 +557,22 @@ class DataFrameAggregateSuite extends QueryTest with SharedSQLContext {
     }
     assert(e.message.contains("aggregate functions are not allowed in GROUP BY"))
   }
+
+  test("SPARK-21580 ints in aggregation expressions are taken as group-by ordinal.") {
+    val df = Seq((1, 1), (1, 2), (2, 1), (2, 2), (3, 1), (3, 2)).toDF("a", "b")
+    checkAnswer(
+      df.groupBy(lit(3), lit(4)).agg(lit(6), lit(7), sum("b")),
+      Seq(Row(3, 4, 6, 7, 9)))
+    checkAnswer(
+      df.groupBy(lit(3), lit(4)).agg(lit(6), 'b, sum("b")),
+      Seq(Row(3, 4, 6, 1, 3), Row(3, 4, 6, 2, 6)))
+
+    df.createOrReplaceTempView("data")
+    checkAnswer(
+      spark.sql("select 3, 4, sum(b) from data group by 1, 2"),
+      Seq(Row(3, 4, 9)))
+    checkAnswer(
+      spark.sql("select 3 as c, 4 as d, sum(b) from data group by c, d"),
+      Seq(Row(3, 4, 9)))
--- End diff --
I've verified that the above tests will fail without this fix.
[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/18779 LGTM. cc @gatorsmile for final check.
[GitHub] spark issue #18749: [SPARK-21485][FOLLOWUP][SQL][DOCS] Describes examples an...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18749 **[Test build #80248 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80248/testReport)** for PR 18749 at commit [`974eab2`](https://github.com/apache/spark/commit/974eab27d77169c7bc7205595e3702b32865).
[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/18779 @10110346 Thanks for working on this! Sorry I confused you in my previous comments. The current changes look good to me.