[spark] branch branch-3.0 updated (5aaec8b -> 5c921bd)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git. from 5aaec8b [SPARK-34273][CORE] Do not reregister BlockManager when SparkContext is stopped add 5c921bd [SPARK-33867][SQL][3.0] Instant and LocalDate values aren't handled when generating SQL queries No new revisions were added by this update. Summary of changes: .../scala/org/apache/spark/sql/jdbc/JdbcDialects.scala | 10 ++ .../test/scala/org/apache/spark/sql/jdbc/JDBCSuite.scala | 14 ++ 2 files changed, 24 insertions(+) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (a7683af -> 72b7f8a)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from a7683af [SPARK-26346][BUILD][SQL] Upgrade Parquet to 1.11.1 add 72b7f8a [SPARK-34261][SQL] Avoid side effect if create exists temporary function No new revisions were added by this update. Summary of changes: .../spark/sql/execution/command/functions.scala| 4 .../spark/sql/hive/execution/HiveDDLSuite.scala| 24 +- 2 files changed, 27 insertions(+), 1 deletion(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.1 updated: [SPARK-33867][SQL] Instant and LocalDate values aren't handled when generating SQL queries
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch branch-3.1 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.1 by this push: new 4ca628eb [SPARK-33867][SQL] Instant and LocalDate values aren't handled when generating SQL queries 4ca628eb is described below commit 4ca628eb2f54c3e039867c5ccbb0cde7413c18e4 Author: Chircu AuthorDate: Thu Jan 28 11:58:20 2021 +0900 [SPARK-33867][SQL] Instant and LocalDate values aren't handled when generating SQL queries ### What changes were proposed in this pull request? When generating SQL queries, only the old date-time API types are handled for values in org.apache.spark.sql.jdbc.JdbcDialect#compileValue. If the new API is used (spark.sql.datetime.java8API.enabled=true), Instant and LocalDate values are not quoted and errors are thrown. The proposed change is to handle Instant and LocalDate values the same way that Timestamp and Date are. ### Why are the changes needed? In the current state, if an Instant is used in a filter, an exception is thrown. Example (dataset was read from PostgreSQL): dataset.filter(current_timestamp().gt(col(VALID_FROM))) Stacktrace (the T11 is from an instant formatted like yyyy-MM-dd'T'HH:mm:ss.SS'Z'): Caused by: org.postgresql.util.PSQLException: ERROR: syntax error at or near "T11" Position: 285 at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2103) at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1836) at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:257) at org.postgresql.jdbc2.AbstractJdbc2Statement. [...] ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Test added Closes #31148 from cristichircu/SPARK-33867. 
Lead-authored-by: Chircu Co-authored-by: Cristi Chircu Signed-off-by: Takeshi Yamamuro (cherry picked from commit 829f118f98ef0732c8dd784f06298465e47ee3a0) Signed-off-by: Takeshi Yamamuro --- .../scala/org/apache/spark/sql/jdbc/JdbcDialects.scala | 10 ++ .../test/scala/org/apache/spark/sql/jdbc/JDBCSuite.scala | 14 ++ 2 files changed, 24 insertions(+) diff --git a/sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcDialects.scala b/sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcDialects.scala index ead0a1a..6c72172 100644 --- a/sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcDialects.scala +++ b/sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcDialects.scala @@ -18,6 +18,7 @@ package org.apache.spark.sql.jdbc import java.sql.{Connection, Date, Timestamp} +import java.time.{Instant, LocalDate} import scala.collection.mutable.ArrayBuilder @@ -26,9 +27,11 @@ import org.apache.commons.lang3.StringUtils import org.apache.spark.annotation.{DeveloperApi, Since} import org.apache.spark.internal.Logging import org.apache.spark.sql.AnalysisException +import org.apache.spark.sql.catalyst.util.{DateFormatter, DateTimeUtils, TimestampFormatter} import org.apache.spark.sql.connector.catalog.TableChange import org.apache.spark.sql.connector.catalog.TableChange._ import org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils +import org.apache.spark.sql.internal.SQLConf import org.apache.spark.sql.types._ /** @@ -175,7 +178,14 @@ abstract class JdbcDialect extends Serializable with Logging{ def compileValue(value: Any): Any = value match { case stringValue: String => s"'${escapeSql(stringValue)}'" case timestampValue: Timestamp => "'" + timestampValue + "'" +case timestampValue: Instant => + val timestampFormatter = TimestampFormatter.getFractionFormatter( +DateTimeUtils.getZoneId(SQLConf.get.sessionLocalTimeZone)) + s"'${timestampFormatter.format(timestampValue)}'" case dateValue: Date => "'" + dateValue + "'" +case dateValue: LocalDate => + val 
dateFormatter = DateFormatter(DateTimeUtils.getZoneId(SQLConf.get.sessionLocalTimeZone)) + s"'${dateFormatter.format(dateValue)}'" case arrayValue: Array[Any] => arrayValue.map(compileValue).mkString(", ") case _ => value } diff --git a/sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCSuite.scala index b81824d..70f5508 100644 --- a/sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCSuite.scala +++ b/sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCSuite.scala @@ -19,6 +19,7 @@ package org.apache.spark.sql.jdbc import java.math.BigDecimal import java.sql.{Date, DriverManager, S
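The patch above teaches JdbcDialect#compileValue to format and quote the java.time types the same way it already quotes java.sql.Timestamp and java.sql.Date. As a rough illustration of that rule only (not Spark's actual code), here is a minimal Python sketch, with datetime/date standing in for Instant/LocalDate and all helper names invented for the example:

```python
from datetime import date, datetime, timezone

def escape_sql(value: str) -> str:
    """Double up single quotes, mirroring JdbcDialect.escapeSql."""
    return value.replace("'", "''")

def compile_value(value):
    """Turn a filter value into a SQL literal, quoting temporal types.

    Sketch of JdbcDialect#compileValue after SPARK-33867: the new
    java.time types (here datetime/date) are formatted and quoted just
    like the legacy Timestamp/Date values, instead of falling through
    to the default case unquoted.
    """
    if isinstance(value, str):
        return f"'{escape_sql(value)}'"
    if isinstance(value, datetime):           # Instant analogue; check before date,
        ts = value.astimezone(timezone.utc)   # since datetime subclasses date
        return f"'{ts.strftime('%Y-%m-%d %H:%M:%S.%f')}'"
    if isinstance(value, date):               # LocalDate analogue
        return f"'{value.isoformat()}'"
    if isinstance(value, (list, tuple)):
        return ", ".join(str(compile_value(v)) for v in value)
    return value                              # numbers etc. pass through

# Unquoted ISO text like 2021-01-28T11:58:20Z is what made PostgreSQL fail
# with: syntax error at or near "T11". Quoting the formatted value fixes it.
print(compile_value(date(2021, 1, 28)))   # '2021-01-28'
```

The key point, as in the Scala change, is that the temporal branch must produce a quoted string in the session's time zone rather than letting the value's default `toString` leak into the generated SQL.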
[spark] branch master updated (0dedf24 -> 829f118)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 0dedf24 [SPARK-34154][YARN] Extend LocalityPlacementStrategySuite's test with a timeout add 829f118 [SPARK-33867][SQL] Instant and LocalDate values aren't handled when generating SQL queries No new revisions were added by this update. Summary of changes: .../scala/org/apache/spark/sql/jdbc/JdbcDialects.scala | 10 ++ .../test/scala/org/apache/spark/sql/jdbc/JDBCSuite.scala | 14 ++ 2 files changed, 24 insertions(+) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated: [SPARK-33100][SQL][3.0] Ignore a semicolon inside a bracketed comment in spark-sql
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new e7d5344 [SPARK-33100][SQL][3.0] Ignore a semicolon inside a bracketed comment in spark-sql e7d5344 is described below commit e7d53449f198bd8c5ee97d58f285994e31ea2d1a Author: fwang12 AuthorDate: Fri Jan 8 10:44:12 2021 +0900 [SPARK-33100][SQL][3.0] Ignore a semicolon inside a bracketed comment in spark-sql ### What changes were proposed in this pull request? Currently, spark-sql cannot parse SQL statements that contain bracketed comments. The statement: ``` /* SELECT 'test'; */ SELECT 'test'; ``` would be split into two statements, the first one being `/* SELECT 'test'` and the second one `*/ SELECT 'test'`. An exception is then thrown because the first one is illegal. In this PR, we ignore the content of bracketed comments while splitting the SQL statements. Besides, we ignore comments without any content. NOTE: This backport comes from https://github.com/apache/spark/pull/29982 ### Why are the changes needed? spark-sql might split statements inside bracketed comments, which is not correct. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Added UT. Closes #31033 from turboFei/SPARK-33100. 
Authored-by: fwang12 Signed-off-by: Takeshi Yamamuro --- .../sql/hive/thriftserver/SparkSQLCLIDriver.scala | 50 ++ .../spark/sql/hive/thriftserver/CliSuite.scala | 23 ++ 2 files changed, 65 insertions(+), 8 deletions(-) diff --git a/sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLCLIDriver.scala b/sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLCLIDriver.scala index 6abb905..581aa68 100644 --- a/sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLCLIDriver.scala +++ b/sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLCLIDriver.scala @@ -518,15 +518,32 @@ private[hive] class SparkSQLCLIDriver extends CliDriver with Logging { // Note: [SPARK-31595] if there is a `'` in a double quoted string, or a `"` in a single quoted // string, the origin implementation from Hive will not drop the trailing semicolon as expected, // hence we refined this function a little bit. + // Note: [SPARK-33100] Ignore a semicolon inside a bracketed comment in spark-sql. 
private def splitSemiColon(line: String): JList[String] = { var insideSingleQuote = false var insideDoubleQuote = false -var insideComment = false +var insideSimpleComment = false +var bracketedCommentLevel = 0 var escape = false var beginIndex = 0 +var leavingBracketedComment = false +var isStatement = false val ret = new JArrayList[String] +def insideBracketedComment: Boolean = bracketedCommentLevel > 0 +def insideComment: Boolean = insideSimpleComment || insideBracketedComment +def statementInProgress(index: Int): Boolean = isStatement || (!insideComment && + index > beginIndex && !s"${line.charAt(index)}".trim.isEmpty) + for (index <- 0 until line.length) { + // Checks if we need to decrement a bracketed comment level; the last character '/' of + // bracketed comments is still inside the comment, so `insideBracketedComment` must keep true + // in the previous loop and we decrement the level here if needed. + if (leavingBracketedComment) { +bracketedCommentLevel -= 1 +leavingBracketedComment = false + } + if (line.charAt(index) == '\'' && !insideComment) { // take a look to see if it is escaped // See the comment above about SPARK-31595 @@ -549,21 +566,34 @@ private[hive] class SparkSQLCLIDriver extends CliDriver with Logging { // Sample query: select "quoted value --" //^^ avoids starting a comment if it's inside quotes. } else if (hasNext && line.charAt(index + 1) == '-') { - // ignore quotes and ; - insideComment = true + // ignore quotes and ; in simple comment + insideSimpleComment = true } } else if (line.charAt(index) == ';') { if (insideSingleQuote || insideDoubleQuote || insideComment) { // do not split } else { - // split, do not include ; itself - ret.add(line.substring(beginIndex, index)) + if (isStatement) { +// split, do not include ; itself +ret.add(line.substring(beginIndex, index)) + } beginIndex =
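The splitting logic in the diff above can be sketched outside of Spark. Below is a simplified Python rendition of the splitSemiColon idea (invented names, quote-escape handling omitted): semicolons inside quoted strings, `--` simple comments, and possibly nested `/* */` bracketed comments do not split, and fragments containing only comments or whitespace are dropped:

```python
def split_semicolon(line: str) -> list:
    """Split a line of SQL on ';', ignoring semicolons inside quoted strings,
    '--' simple comments, and (possibly nested) /* */ bracketed comments.

    A simplified sketch of SparkSQLCLIDriver.splitSemiColon after SPARK-33100.
    """
    statements = []
    begin = 0                    # start of the current statement
    in_single = in_double = in_simple_comment = False
    bracket_level = 0            # nesting depth of /* ... */ comments
    is_statement = False         # saw a non-comment, non-blank character
    i = 0
    while i < len(line):
        c = line[i]
        if in_simple_comment or bracket_level > 0:
            if in_simple_comment:
                pass                      # a simple comment runs to end of line
            elif line[i:i + 2] == "*/":
                bracket_level -= 1        # the trailing '/' stays inside the comment
                i += 1
            elif line[i:i + 2] == "/*":
                bracket_level += 1        # bracketed comments nest
                i += 1
        elif in_single:
            in_single = c != "'"          # a ';' in here never splits
        elif in_double:
            in_double = c != '"'
        elif c in ("'", '"'):
            in_single, in_double = c == "'", c == '"'
            is_statement = True
        elif line[i:i + 2] == "--":
            in_simple_comment = True
            i += 1
        elif line[i:i + 2] == "/*":
            bracket_level += 1
            i += 1
        elif c == ";":
            if is_statement:              # drop comment-only / blank fragments
                statements.append(line[begin:i].strip())
            begin, is_statement = i + 1, False
        elif not c.isspace():
            is_statement = True
        i += 1
    if is_statement:
        statements.append(line[begin:].strip())
    return statements

print(split_semicolon("/* SELECT 'test'; */ SELECT 'test';"))
# ["/* SELECT 'test'; */ SELECT 'test'"]
```

The comment itself stays in the emitted statement, as in Spark, where the SQL parser strips it later; the splitter's only job is to stop treating the `;` inside `/* ... */` as a statement boundary.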
[spark] branch branch-3.1 updated: [SPARK-33100][SQL][FOLLOWUP] Find correct bound of bracketed comment in spark-sql
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch branch-3.1 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.1 by this push: new 874c404 [SPARK-33100][SQL][FOLLOWUP] Find correct bound of bracketed comment in spark-sql 874c404 is described below commit 874c40429c5f99ab02cdfd928d9d7b6caaea16ea Author: fwang12 AuthorDate: Thu Jan 7 20:49:37 2021 +0900 [SPARK-33100][SQL][FOLLOWUP] Find correct bound of bracketed comment in spark-sql ### What changes were proposed in this pull request? This PR help find correct bound of bracketed comment in spark-sql. Here is the log for UT of SPARK-33100 in CliSuite before: ``` 2021-01-05 13:22:34.768 - stdout> spark-sql> /* SELECT 'test';*/ SELECT 'test'; 2021-01-05 13:22:41.523 - stderr> Time taken: 6.716 seconds, Fetched 1 row(s) 2021-01-05 13:22:41.599 - stdout> test 2021-01-05 13:22:41.6 - stdout> spark-sql> ;;/* SELECT 'test';*/ SELECT 'test'; 2021-01-05 13:22:41.709 - stdout> test 2021-01-05 13:22:41.709 - stdout> spark-sql> /* SELECT 'test';*/;; SELECT 'test'; 2021-01-05 13:22:41.902 - stdout> spark-sql> SELECT 'test'; -- SELECT 'test'; 2021-01-05 13:22:41.902 - stderr> Time taken: 0.129 seconds, Fetched 1 row(s) 2021-01-05 13:22:41.902 - stderr> Error in query: 2021-01-05 13:22:41.902 - stderr> mismatched input '' expecting {'(', 'ADD', 'ALTER', 'ANALYZE', 'CACHE', 'CLEAR', 'COMMENT', 'COMMIT', 'CREATE', 'DELETE', 'DESC', 'DESCRIBE', 'DFS', 'DROP', 'EXPLAIN', 'EXPORT', 'FROM', 'GRANT', 'IMPORT', 'INSERT', 'LIST', 'LOAD', 'LOCK', 'MAP', 'MERGE', 'MSCK', 'REDUCE', 'REFRESH', 'REPLACE', 'RESET', 'REVOKE', 'ROLLBACK', 'SELECT', 'SET', 'SHOW', 'START', 'TABLE', 'TRUNCATE', 'UNCACHE', 'UNLOCK', 'UPDATE', 'USE', 'VALUES', 'WITH'}(line 1, pos 19) 2021-01-05 13:22:42.006 - stderr> 2021-01-05 13:22:42.006 - stderr> == SQL == 2021-01-05 13:22:42.006 - stderr> /* SELECT 'test';*/ 2021-01-05 13:22:42.006 - stderr> 
---^^^ 2021-01-05 13:22:42.006 - stderr> 2021-01-05 13:22:42.006 - stderr> Time taken: 0.226 seconds, Fetched 1 row(s) 2021-01-05 13:22:42.006 - stdout> test ``` The root cause is that the insideBracketedComment flag is not accurate. For `/* comment */`, the last character `/` is not insideBracketedComment and it would be treated as the beginning of a statement. In this PR, this issue is fixed. ### Why are the changes needed? To fix the issue described above. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Existing UT Closes #31054 from turboFei/SPARK-33100-followup. Authored-by: fwang12 Signed-off-by: Takeshi Yamamuro (cherry picked from commit 7b06acc28b5c37da6c48bc44c3d921309d4ad3a8) Signed-off-by: Takeshi Yamamuro --- .../sql/hive/thriftserver/SparkSQLCLIDriver.scala | 24 +++--- .../spark/sql/hive/thriftserver/CliSuite.scala | 4 ++-- 2 files changed, 19 insertions(+), 9 deletions(-) diff --git a/sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLCLIDriver.scala b/sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLCLIDriver.scala index 9155eac..8606aaa 100644 --- a/sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLCLIDriver.scala +++ b/sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLCLIDriver.scala @@ -530,15 +530,24 @@ private[hive] class SparkSQLCLIDriver extends CliDriver with Logging { var bracketedCommentLevel = 0 var escape = false var beginIndex = 0 -var includingStatement = false +var leavingBracketedComment = false +var isStatement = false val ret = new JArrayList[String] def insideBracketedComment: Boolean = bracketedCommentLevel > 0 def insideComment: Boolean = insideSimpleComment || insideBracketedComment -def statementBegin(index: Int): Boolean = includingStatement || (!insideComment && +def statementInProgress(index: Int): Boolean = isStatement || (!insideComment && index > beginIndex && 
!s"${line.charAt(index)}".trim.isEmpty) for (index <- 0 until line.length) { + // Checks if we need to decrement a bracketed comment level; the last character '/' of + // bracketed comments is still inside the comment, so `insideBracketedComment` must keep true + // in the previous loop and we decrement the level here if needed. + if (leavingBracketedComment) { +bracketedCommentLevel -= 1 +leavingBracketedComment = false + } + if (line.charAt(index) == '\'' && !insideComment) {
[spark] branch master updated (d36cdd5 -> 7b06acc)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from d36cdd5 [SPARK-33933][SQL] Materialize BroadcastQueryStage first to avoid broadcast timeout in AQE add 7b06acc [SPARK-33100][SQL][FOLLOWUP] Find correct bound of bracketed comment in spark-sql No new revisions were added by this update. Summary of changes: .../sql/hive/thriftserver/SparkSQLCLIDriver.scala | 24 +++--- .../spark/sql/hive/thriftserver/CliSuite.scala | 4 ++-- 2 files changed, 19 insertions(+), 9 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.1 updated: [SPARK-33977][SQL][DOCS] Add doc for "'like any' and 'like all' operators"
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch branch-3.1 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.1 by this push: new 2b88afb [SPARK-33977][SQL][DOCS] Add doc for "'like any' and 'like all' operators" 2b88afb is described below commit 2b88afb65b23d4d06180ecd402aba9c7b0fc106a Author: gengjiaan AuthorDate: Wed Jan 6 21:14:45 2021 +0900 [SPARK-33977][SQL][DOCS] Add doc for "'like any' and 'like all' operators" ### What changes were proposed in this pull request? Add doc for 'like any' and 'like all' operators in sql-ref-syntax-qry-select-like.md ### Why are the changes needed? Make the usage of 'like any' and 'like all' known to more users ### Does this PR introduce _any_ user-facing change? Yes. https://user-images.githubusercontent.com/692303/103767385-dc1ffb80-5063-11eb-9529-89479531425f.png https://user-images.githubusercontent.com/692303/103767391-dde9bf00-5063-11eb-82ce-63bdd11593a1.png https://user-images.githubusercontent.com/692303/103767396-df1aec00-5063-11eb-8e81-a192e6c72431.png ### How was this patch tested? No tests Closes #31008 from beliefer/SPARK-33977. Lead-authored-by: gengjiaan Co-authored-by: beliefer Signed-off-by: Takeshi Yamamuro --- docs/sql-ref-syntax-qry-select-like.md | 60 +- 1 file changed, 59 insertions(+), 1 deletion(-) diff --git a/docs/sql-ref-syntax-qry-select-like.md b/docs/sql-ref-syntax-qry-select-like.md index 6211faa8..3604a9b 100644 --- a/docs/sql-ref-syntax-qry-select-like.md +++ b/docs/sql-ref-syntax-qry-select-like.md @@ -21,12 +21,14 @@ license: | ### Description -A LIKE predicate is used to search for a specific pattern. +A LIKE predicate is used to search for a specific pattern. This predicate also supports multiple patterns with quantifiers include `ANY`, `SOME` and `ALL`. 
### Syntax ```sql [ NOT ] { LIKE search_pattern [ ESCAPE esc_char ] | [ RLIKE | REGEXP ] regex_pattern } + +[ NOT ] { LIKE quantifiers ( search_pattern [ , ... ]) } ``` ### Parameters @@ -45,6 +47,10 @@ A LIKE predicate is used to search for a specific pattern. * **regex_pattern** Specifies a regular expression search pattern to be searched by the `RLIKE` or `REGEXP` clause. + +* **quantifiers** + +Specifies the predicate quantifiers include `ANY`, `SOME` and `ALL`. `ANY` or `SOME` means if one of the patterns matches the input, then return true; `ALL` means if all the patterns matches the input, then return true. ### Examples @@ -111,6 +117,58 @@ SELECT * FROM person WHERE name LIKE '%$_%' ESCAPE '$'; +---+--+---+ |500|Evan_W| 16| +---+--+---+ + +SELECT * FROM person WHERE name LIKE ALL ('%an%', '%an'); ++---+++ +| id|name| age| ++---+++ +|400| Dan| 50| ++---+++ + +SELECT * FROM person WHERE name LIKE ANY ('%an%', '%an'); ++---+--+---+ +| id| name|age| ++---+--+---+ +|400| Dan| 50| +|500|Evan_W| 16| ++---+--+---+ + +SELECT * FROM person WHERE name LIKE SOME ('%an%', '%an'); ++---+--+---+ +| id| name|age| ++---+--+---+ +|400| Dan| 50| +|500|Evan_W| 16| ++---+--+---+ + +SELECT * FROM person WHERE name NOT LIKE ALL ('%an%', '%an'); ++---+++ +| id|name| age| ++---+++ +|100|John| 30| +|200|Mary|null| +|300|Mike| 80| ++---+++ + +SELECT * FROM person WHERE name NOT LIKE ANY ('%an%', '%an'); ++---+--++ +| id| name| age| ++---+--++ +|100| John| 30| +|200| Mary|null| +|300| Mike| 80| +|500|Evan_W| 16| ++---+--++ + +SELECT * FROM person WHERE name NOT LIKE SOME ('%an%', '%an'); ++---+--++ +| id| name| age| ++---+--++ +|100| John| 30| +|200| Mary|null| +|300| Mike| 80| +|500|Evan_W| 16| ++---+--++ ``` ### Related Statements - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
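For readers who want to check the `ANY`/`SOME`/`ALL` semantics against the example tables above, here is a small Python emulation (an illustration only: it ignores SQL's NULL three-valued logic and the `ESCAPE` clause, and the function names are invented):

```python
import re

def sql_like(value: str, pattern: str) -> bool:
    """Evaluate a SQL LIKE pattern: '%' matches any run, '_' one character."""
    regex = "".join(
        ".*" if ch == "%" else "." if ch == "_" else re.escape(ch)
        for ch in pattern
    )
    return re.fullmatch(regex, value) is not None

def like_any(value: str, patterns) -> bool:
    """LIKE ANY / LIKE SOME: true if at least one pattern matches."""
    return any(sql_like(value, p) for p in patterns)

def like_all(value: str, patterns) -> bool:
    """LIKE ALL: true only if every pattern matches."""
    return all(sql_like(value, p) for p in patterns)

# Names from the doc's `person` table
names = ["John", "Mary", "Mike", "Dan", "Evan_W"]
print([n for n in names if like_all(n, ["%an%", "%an"])])   # ['Dan']
print([n for n in names if like_any(n, ["%an%", "%an"])])   # ['Dan', 'Evan_W']
```

This reproduces the doc's result sets: `Evan_W` matches `'%an%'` but not `'%an'`, so it passes `LIKE ANY`/`SOME` but fails `LIKE ALL`.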
[spark] branch master updated: [SPARK-33977][SQL][DOCS] Add doc for "'like any' and 'like all' operators"
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 6788304 [SPARK-33977][SQL][DOCS] Add doc for "'like any' and 'like all' operators" 6788304 is described below commit 6788304240c416d173ebdb3d544f3361c6b9fe8e Author: gengjiaan AuthorDate: Wed Jan 6 21:14:45 2021 +0900 [SPARK-33977][SQL][DOCS] Add doc for "'like any' and 'like all' operators" ### What changes were proposed in this pull request? Add doc for 'like any' and 'like all' operators in sql-ref-syntax-qry-select-like.md ### Why are the changes needed? Make the usage of 'like any' and 'like all' known to more users ### Does this PR introduce _any_ user-facing change? Yes. https://user-images.githubusercontent.com/692303/103767385-dc1ffb80-5063-11eb-9529-89479531425f.png https://user-images.githubusercontent.com/692303/103767391-dde9bf00-5063-11eb-82ce-63bdd11593a1.png https://user-images.githubusercontent.com/692303/103767396-df1aec00-5063-11eb-8e81-a192e6c72431.png ### How was this patch tested? No tests Closes #31008 from beliefer/SPARK-33977. Lead-authored-by: gengjiaan Co-authored-by: beliefer Signed-off-by: Takeshi Yamamuro --- docs/sql-ref-syntax-qry-select-like.md | 60 +- 1 file changed, 59 insertions(+), 1 deletion(-) diff --git a/docs/sql-ref-syntax-qry-select-like.md b/docs/sql-ref-syntax-qry-select-like.md index 6211faa8..3604a9b 100644 --- a/docs/sql-ref-syntax-qry-select-like.md +++ b/docs/sql-ref-syntax-qry-select-like.md @@ -21,12 +21,14 @@ license: | ### Description -A LIKE predicate is used to search for a specific pattern. +A LIKE predicate is used to search for a specific pattern. This predicate also supports multiple patterns with quantifiers include `ANY`, `SOME` and `ALL`. 
### Syntax ```sql [ NOT ] { LIKE search_pattern [ ESCAPE esc_char ] | [ RLIKE | REGEXP ] regex_pattern } + +[ NOT ] { LIKE quantifiers ( search_pattern [ , ... ]) } ``` ### Parameters @@ -45,6 +47,10 @@ A LIKE predicate is used to search for a specific pattern. * **regex_pattern** Specifies a regular expression search pattern to be searched by the `RLIKE` or `REGEXP` clause. + +* **quantifiers** + +Specifies the predicate quantifiers include `ANY`, `SOME` and `ALL`. `ANY` or `SOME` means if one of the patterns matches the input, then return true; `ALL` means if all the patterns matches the input, then return true. ### Examples @@ -111,6 +117,58 @@ SELECT * FROM person WHERE name LIKE '%$_%' ESCAPE '$'; +---+--+---+ |500|Evan_W| 16| +---+--+---+ + +SELECT * FROM person WHERE name LIKE ALL ('%an%', '%an'); ++---+++ +| id|name| age| ++---+++ +|400| Dan| 50| ++---+++ + +SELECT * FROM person WHERE name LIKE ANY ('%an%', '%an'); ++---+--+---+ +| id| name|age| ++---+--+---+ +|400| Dan| 50| +|500|Evan_W| 16| ++---+--+---+ + +SELECT * FROM person WHERE name LIKE SOME ('%an%', '%an'); ++---+--+---+ +| id| name|age| ++---+--+---+ +|400| Dan| 50| +|500|Evan_W| 16| ++---+--+---+ + +SELECT * FROM person WHERE name NOT LIKE ALL ('%an%', '%an'); ++---+++ +| id|name| age| ++---+++ +|100|John| 30| +|200|Mary|null| +|300|Mike| 80| ++---+++ + +SELECT * FROM person WHERE name NOT LIKE ANY ('%an%', '%an'); ++---+--++ +| id| name| age| ++---+--++ +|100| John| 30| +|200| Mary|null| +|300| Mike| 80| +|500|Evan_W| 16| ++---+--++ + +SELECT * FROM person WHERE name NOT LIKE SOME ('%an%', '%an'); ++---+--++ +| id| name| age| ++---+--++ +|100| John| 30| +|200| Mary|null| +|300| Mike| 80| +|500|Evan_W| 16| ++---+--++ ``` ### Related Statements - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated: [SPARK-34012][SQL][3.0] Keep behavior consistent when conf `spark.sql.legacy.parser.havingWithoutGroupByAsWhere` is true with migration guide
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new aaa3dcc [SPARK-34012][SQL][3.0] Keep behavior consistent when conf `spark.sql.legacy.parser.havingWithoutGroupByAsWhere` is true with migration guide aaa3dcc is described below commit aaa3dcc2c9effde3dd3b4bbe04f7c06e299294cb Author: angerszhu AuthorDate: Wed Jan 6 20:57:03 2021 +0900 [SPARK-34012][SQL][3.0] Keep behavior consistent when conf `spark.sql.legacy.parser.havingWithoutGroupByAsWhere` is true with migration guide ### What changes were proposed in this pull request? In https://github.com/apache/spark/pull/22696 we made HAVING without GROUP BY mean a global aggregate. Since HAVING was previously treated as a Filter, that caused many analysis errors, so in https://github.com/apache/spark/pull/28294 we used `UnresolvedHaving` instead of `Filter` to solve the problem; however, this broke the original legacy behavior of treating `SELECT 1 FROM range(10) HAVING true` as `SELECT 1 FROM range(10) WHERE true`. This PR fixes the issue and adds a UT. NOTE: This backport comes from https://github.com/apache/spark/pull/31039 ### Why are the changes needed? To keep the behavior consistent with the migration guide. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? added UT Closes #31049 from AngersZh/SPARK-34012-3.0. 
Authored-by: angerszhu Signed-off-by: Takeshi Yamamuro --- .../spark/sql/catalyst/parser/AstBuilder.scala | 6 ++- .../test/resources/sql-tests/inputs/group-by.sql | 10 .../resources/sql-tests/results/group-by.sql.out | 63 +- 3 files changed, 77 insertions(+), 2 deletions(-) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala index 938976e..2fcee5a 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala @@ -723,7 +723,11 @@ class AstBuilder(conf: SQLConf) extends SqlBaseBaseVisitor[AnyRef] with Logging val withProject = if (aggregationClause == null && havingClause != null) { if (conf.getConf(SQLConf.LEGACY_HAVING_WITHOUT_GROUP_BY_AS_WHERE)) { // If the legacy conf is set, treat HAVING without GROUP BY as WHERE. -withHavingClause(havingClause, createProject()) +val predicate = expression(havingClause.booleanExpression) match { + case p: Predicate => p + case e => Cast(e, BooleanType) +} +Filter(predicate, createProject()) } else { // According to SQL standard, HAVING without GROUP BY means global aggregate. 
withHavingClause(havingClause, Aggregate(Nil, namedExpressions, withFilter)) diff --git a/sql/core/src/test/resources/sql-tests/inputs/group-by.sql b/sql/core/src/test/resources/sql-tests/inputs/group-by.sql index fedf03d..3f5f556 100644 --- a/sql/core/src/test/resources/sql-tests/inputs/group-by.sql +++ b/sql/core/src/test/resources/sql-tests/inputs/group-by.sql @@ -86,6 +86,16 @@ SELECT 1 FROM range(10) HAVING MAX(id) > 0; SELECT id FROM range(10) HAVING id > 0; +SET spark.sql.legacy.parser.havingWithoutGroupByAsWhere=true; + +SELECT 1 FROM range(10) HAVING true; + +SELECT 1 FROM range(10) HAVING MAX(id) > 0; + +SELECT id FROM range(10) HAVING id > 0; + +SET spark.sql.legacy.parser.havingWithoutGroupByAsWhere=false; + -- Test data CREATE OR REPLACE TEMPORARY VIEW test_agg AS SELECT * FROM VALUES (1, true), (1, false), diff --git a/sql/core/src/test/resources/sql-tests/results/group-by.sql.out b/sql/core/src/test/resources/sql-tests/results/group-by.sql.out index 50eb2a9..e5b7058 100644 --- a/sql/core/src/test/resources/sql-tests/results/group-by.sql.out +++ b/sql/core/src/test/resources/sql-tests/results/group-by.sql.out @@ -1,5 +1,5 @@ -- Automatically generated by SQLQueryTestSuite --- Number of queries: 56 +-- Number of queries: 61 -- !query @@ -278,6 +278,67 @@ grouping expressions sequence is empty, and '`id`' is not an aggregate function. -- !query +SET spark.sql.legacy.parser.havingWithoutGroupByAsWhere=true +-- !query schema +struct +-- !query output +spark.sql.legacy.parser.havingWithoutGroupByAsWheretrue + + +-- !query +SELECT 1 FROM range(10) HAVING true +-- !query schema +struct<1:int> +-- !query output +1 +1 +1 +1 +1 +1 +1 +1 +1 +1 + + +-- !query +SELECT 1 FROM range(10) HAVING MAX(id) > 0 +-- !query schema +struct<> +-- !query output +org.apache.spark.sql.AnalysisException + +Aggregate/Window/Generate expressions are not valid in where clause of the
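The two interpretations that the legacy conf toggles can be contrasted with a toy evaluator (a sketch of the semantics only, with an invented function name, not Spark's planner; note that real Spark rejects aggregates such as `MAX(id)` in legacy mode with an AnalysisException, which this sketch does not model):

```python
def having_no_group_by(rows, predicate, legacy):
    """SELECT 1 FROM rows HAVING <predicate>, under both readings of
    HAVING without GROUP BY (the behaviors contrasted in SPARK-34012).

    legacy=True mirrors spark.sql.legacy.parser.havingWithoutGroupByAsWhere:
    the predicate acts like WHERE and is applied per row. legacy=False
    follows the SQL standard: the whole input forms one global aggregate
    group, the predicate is evaluated once, and at most one row comes out.
    """
    if legacy:
        return [1 for row in rows if predicate([row])]   # HAVING as WHERE
    group = list(rows)
    return [1] if predicate(group) else []               # global aggregate

always_true = lambda group: True                         # HAVING true
print(len(having_no_group_by(range(10), always_true, legacy=True)))   # 10
print(len(having_no_group_by(range(10), always_true, legacy=False)))  # 1
```

This is exactly the difference the new group-by.sql tests pin down: with the legacy conf set, `SELECT 1 FROM range(10) HAVING true` returns ten rows of `1`; under the standard reading it returns a single row.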
[spark] branch branch-2.4 updated: [SPARK-34012][SQL][2.4] Keep behavior consistent when conf `spark.sql.legacy.parser.havingWithoutGroupByAsWhere` is true with migration guide
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch branch-2.4 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-2.4 by this push: new d442146 [SPARK-34012][SQL][2.4] Keep behavior consistent when conf `spark.sql.legacy.parser.havingWithoutGroupByAsWhere` is true with migration guide d442146 is described below commit d442146964a981dd7f074c4954f7fed2752124e8 Author: angerszhu AuthorDate: Wed Jan 6 20:54:47 2021 +0900 [SPARK-34012][SQL][2.4] Keep behavior consistent when conf `spark.sql.legacy.parser.havingWithoutGroupByAsWhere` is true with migration guide ### What changes were proposed in this pull request? In https://github.com/apache/spark/pull/22696 we made HAVING without GROUP BY mean a global aggregate. Since HAVING was previously treated as a Filter, that caused many analysis errors, so in https://github.com/apache/spark/pull/28294 we used `UnresolvedHaving` instead of `Filter` to solve the problem; however, this broke the original legacy behavior of treating `SELECT 1 FROM range(10) HAVING true` as `SELECT 1 FROM range(10) WHERE true`. This PR fixes the issue and adds a UT. NOTE: This backport comes from #31039 ### Why are the changes needed? To keep the behavior consistent with the migration guide. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? added UT Closes #31050 from AngersZh/SPARK-34012-2.4. 
Authored-by: angerszhu Signed-off-by: Takeshi Yamamuro --- .../spark/sql/catalyst/parser/AstBuilder.scala | 6 ++- .../test/resources/sql-tests/inputs/group-by.sql | 10 .../resources/sql-tests/results/group-by.sql.out | 60 +- 3 files changed, 74 insertions(+), 2 deletions(-) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala index 90e7d1c..4c4e4f1 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala @@ -467,7 +467,11 @@ class AstBuilder(conf: SQLConf) extends SqlBaseBaseVisitor[AnyRef] with Logging val withProject = if (aggregation == null && having != null) { if (conf.getConf(SQLConf.LEGACY_HAVING_WITHOUT_GROUP_BY_AS_WHERE)) { // If the legacy conf is set, treat HAVING without GROUP BY as WHERE. -withHaving(having, createProject()) +val predicate = expression(having) match { + case p: Predicate => p + case e => Cast(e, BooleanType) +} +Filter(predicate, createProject()) } else { // According to SQL standard, HAVING without GROUP BY means global aggregate. 
withHaving(having, Aggregate(Nil, namedExpressions, withFilter)) diff --git a/sql/core/src/test/resources/sql-tests/inputs/group-by.sql b/sql/core/src/test/resources/sql-tests/inputs/group-by.sql index 433db71..0c40a8c 100644 --- a/sql/core/src/test/resources/sql-tests/inputs/group-by.sql +++ b/sql/core/src/test/resources/sql-tests/inputs/group-by.sql @@ -80,3 +80,13 @@ SELECT 1 FROM range(10) HAVING true; SELECT 1 FROM range(10) HAVING MAX(id) > 0; SELECT id FROM range(10) HAVING id > 0; + +SET spark.sql.legacy.parser.havingWithoutGroupByAsWhere=true; + +SELECT 1 FROM range(10) HAVING true; + +SELECT 1 FROM range(10) HAVING MAX(id) > 0; + +SELECT id FROM range(10) HAVING id > 0; + +SET spark.sql.legacy.parser.havingWithoutGroupByAsWhere=false; diff --git a/sql/core/src/test/resources/sql-tests/results/group-by.sql.out b/sql/core/src/test/resources/sql-tests/results/group-by.sql.out index f9d1ee8..d23a58a 100644 --- a/sql/core/src/test/resources/sql-tests/results/group-by.sql.out +++ b/sql/core/src/test/resources/sql-tests/results/group-by.sql.out @@ -1,5 +1,5 @@ -- Automatically generated by SQLQueryTestSuite --- Number of queries: 30 +-- Number of queries: 35 -- !query 0 @@ -275,3 +275,61 @@ struct<> -- !query 29 output org.apache.spark.sql.AnalysisException grouping expressions sequence is empty, and '`id`' is not an aggregate function. Wrap '()' in windowing function(s) or wrap '`id`' in first() (or first_value) if you don't care which value you get.; + + +-- !query 30 +SET spark.sql.legacy.parser.havingWithoutGroupByAsWhere=true +-- !query 30 schema +struct +-- !query 30 output +spark.sql.legacy.parser.havingWithoutGroupByAsWheretrue + + +-- !query 31 +SELECT 1 FROM range(10) HAVING true +-- !query 31 schema +struct<1:int> +-- !query 31 output +1 +1 +1 +1 +1 +1 +1 +1 +1 +1 + + +-- !query 32 +SELECT 1 FROM range(10) HAVING MAX(id) > 0 +-- !query 32 schema +struct<> +-- !query 32 output +java.lang
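The legacy-conf branch in the diff above no longer calls `withHaving`; instead it builds a plain `Filter`, wrapping any HAVING expression that is not already a predicate in a cast to boolean. A minimal self-contained sketch of that wrapping logic, using stand-in case classes rather than Spark's actual Catalyst types (the names `GreaterThan`, `IntLiteral`, `Cast` here are illustrative):

```scala
// Stand-in expression types; Spark's real Catalyst classes are far richer.
sealed trait Expression
trait Predicate extends Expression
case class GreaterThan(attr: String, value: Int) extends Predicate
case class IntLiteral(value: Int) extends Expression
case class Cast(child: Expression, dataType: String) extends Expression

// Mirrors the legacy branch of the patch: a HAVING expression that is
// already a Predicate is used as-is; anything else is cast to boolean
// before it becomes the Filter condition.
def toFilterCondition(having: Expression): Expression = having match {
  case p: Predicate => p
  case e            => Cast(e, "boolean")
}
```

So `HAVING id > 0` filters on the predicate directly, while a non-boolean expression like `HAVING 1` is first cast to boolean, matching the `Cast(e, BooleanType)` fallback in the diff.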
[spark] branch branch-2.4 updated (45e19bb -> 3e6a6b7)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch branch-2.4 in repository https://gitbox.apache.org/repos/asf/spark.git. from 45e19bb [SPARK-33911][SQL][DOCS][2.4] Update the SQL migration guide about changes in `HiveClientImpl` add 3e6a6b7 [SPARK-33935][SQL][2.4] Fix CBO cost function No new revisions were added by this update. Summary of changes: .../sql/catalyst/optimizer/CostBasedJoinReorder.scala | 13 + .../spark/sql/catalyst/optimizer/JoinReorderSuite.scala | 15 +++ .../optimizer/StarJoinCostBasedReorderSuite.scala | 8 3 files changed, 24 insertions(+), 12 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.1 updated: [SPARK-34012][SQL] Keep behavior consistent when conf `spark.sql.legacy.parser.havingWithoutGroupByAsWhere` is true with migration guide
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch branch-3.1 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.1 by this push: new d729158 [SPARK-34012][SQL] Keep behavior consistent when conf `spark.sql.legacy.parser.havingWithoutGroupByAsWhere` is true with migration guide d729158 is described below commit d7291582ebaf815f89474c76d8a35b49172b1ecf Author: angerszhu AuthorDate: Wed Jan 6 08:48:24 2021 +0900 [SPARK-34012][SQL] Keep behavior consistent when conf `spark.sql.legacy.parser.havingWithoutGroupByAsWhere` is true with migration guide
### What changes were proposed in this pull request?
In https://github.com/apache/spark/pull/22696 we added support for HAVING without GROUP BY, which is treated as a global aggregate. But since we previously treated HAVING as a `Filter`, that approach caused many analysis errors; after https://github.com/apache/spark/pull/28294 we use `UnresolvedHaving` instead of `Filter` to solve that problem, but it broke the original behavior of treating `SELECT 1 FROM range(10) HAVING true` as `SELECT 1 FROM range(10) WHERE true`. This PR fixes this issue and adds a UT.
### Why are the changes needed?
Keep behavior consistent with the migration guide.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Added UT. Closes #31039 from AngersZh/SPARK-25780-Follow-up. 
Authored-by: angerszhu Signed-off-by: Takeshi Yamamuro (cherry picked from commit e279ed304475a6d5a9fbf739fe9ed32ef58171cb) Signed-off-by: Takeshi Yamamuro --- .../spark/sql/catalyst/parser/AstBuilder.scala | 6 ++- .../test/resources/sql-tests/inputs/group-by.sql | 10 .../resources/sql-tests/results/group-by.sql.out | 63 +- 3 files changed, 77 insertions(+), 2 deletions(-) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala index a22383c..9d74ac9 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala @@ -714,7 +714,11 @@ class AstBuilder extends SqlBaseBaseVisitor[AnyRef] with SQLConfHelper with Logg val withProject = if (aggregationClause == null && havingClause != null) { if (conf.getConf(SQLConf.LEGACY_HAVING_WITHOUT_GROUP_BY_AS_WHERE)) { // If the legacy conf is set, treat HAVING without GROUP BY as WHERE. -withHavingClause(havingClause, createProject()) +val predicate = expression(havingClause.booleanExpression) match { + case p: Predicate => p + case e => Cast(e, BooleanType) +} +Filter(predicate, createProject()) } else { // According to SQL standard, HAVING without GROUP BY means global aggregate. 
withHavingClause(havingClause, Aggregate(Nil, namedExpressions, withFilter)) diff --git a/sql/core/src/test/resources/sql-tests/inputs/group-by.sql b/sql/core/src/test/resources/sql-tests/inputs/group-by.sql index 81e2204..6ee1014 100644 --- a/sql/core/src/test/resources/sql-tests/inputs/group-by.sql +++ b/sql/core/src/test/resources/sql-tests/inputs/group-by.sql @@ -86,6 +86,16 @@ SELECT 1 FROM range(10) HAVING MAX(id) > 0; SELECT id FROM range(10) HAVING id > 0; +SET spark.sql.legacy.parser.havingWithoutGroupByAsWhere=true; + +SELECT 1 FROM range(10) HAVING true; + +SELECT 1 FROM range(10) HAVING MAX(id) > 0; + +SELECT id FROM range(10) HAVING id > 0; + +SET spark.sql.legacy.parser.havingWithoutGroupByAsWhere=false; + -- Test data CREATE OR REPLACE TEMPORARY VIEW test_agg AS SELECT * FROM VALUES (1, true), (1, false), diff --git a/sql/core/src/test/resources/sql-tests/results/group-by.sql.out b/sql/core/src/test/resources/sql-tests/results/group-by.sql.out index 75bda87..cc07cd6 100644 --- a/sql/core/src/test/resources/sql-tests/results/group-by.sql.out +++ b/sql/core/src/test/resources/sql-tests/results/group-by.sql.out @@ -1,5 +1,5 @@ -- Automatically generated by SQLQueryTestSuite --- Number of queries: 57 +-- Number of queries: 62 -- !query @@ -278,6 +278,67 @@ grouping expressions sequence is empty, and '`id`' is not an aggregate function. -- !query +SET spark.sql.legacy.parser.havingWithoutGroupByAsWhere=true +-- !query schema +struct +-- !query output +spark.sql.legacy.parser.havingWithoutGroupByAsWheretrue + + +-- !query +SELECT 1 FROM range(10) HAVING true +-- !query schema +struct<1:int> +-- !query output +1 +1 +1 +1 +1 +1 +1 +1 +1 +1 + + +-- !query +SELECT 1 FROM range(10) HAVING MAX(id) > 0 +-- !query schema +struct<> +-- !query output +org.apache.spark.sql.AnalysisException + +Aggregate/Window/Generate expressions ar
[spark] branch master updated (171db85 -> e279ed3)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 171db85 [SPARK-33874][K8S][FOLLOWUP] Handle long lived sidecars - clean up logging add e279ed3 [SPARK-34012][SQL] Keep behavior consistent when conf `spark.sql.legacy.parser.havingWithoutGroupByAsWhere` is true with migration guide No new revisions were added by this update. Summary of changes: .../spark/sql/catalyst/parser/AstBuilder.scala | 6 ++- .../test/resources/sql-tests/inputs/group-by.sql | 10 .../resources/sql-tests/results/group-by.sql.out | 63 +- 3 files changed, 77 insertions(+), 2 deletions(-)
[spark] branch master updated (a071826 -> f252a93)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from a071826 [SPARK-33100][SQL] Ignore a semicolon inside a bracketed comment in spark-sql add f252a93 [SPARK-33935][SQL] Fix CBO cost function No new revisions were added by this update. Summary of changes: .../catalyst/optimizer/CostBasedJoinReorder.scala | 13 +- .../optimizer/joinReorder/JoinReorderSuite.scala | 15 + .../StarJoinCostBasedReorderSuite.scala| 8 +- .../approved-plans-v1_4/q13.sf100/explain.txt | 132 ++--- .../approved-plans-v1_4/q13.sf100/simplified.txt | 34 +- .../approved-plans-v1_4/q17.sf100/explain.txt | 194 +++ .../approved-plans-v1_4/q17.sf100/simplified.txt | 130 ++--- .../approved-plans-v1_4/q18.sf100/explain.txt | 158 +++--- .../approved-plans-v1_4/q18.sf100/simplified.txt | 50 +- .../approved-plans-v1_4/q19.sf100/explain.txt | 368 ++--- .../approved-plans-v1_4/q19.sf100/simplified.txt | 116 ++--- .../approved-plans-v1_4/q24a.sf100/explain.txt | 118 ++--- .../approved-plans-v1_4/q24a.sf100/simplified.txt | 34 +- .../approved-plans-v1_4/q24b.sf100/explain.txt | 118 ++--- .../approved-plans-v1_4/q24b.sf100/simplified.txt | 34 +- .../approved-plans-v1_4/q25.sf100/explain.txt | 194 +++ .../approved-plans-v1_4/q25.sf100/simplified.txt | 130 ++--- .../approved-plans-v1_4/q33.sf100/explain.txt | 264 +- .../approved-plans-v1_4/q33.sf100/simplified.txt | 58 +-- .../approved-plans-v1_4/q52.sf100/explain.txt | 138 ++--- .../approved-plans-v1_4/q52.sf100/simplified.txt | 26 +- .../approved-plans-v1_4/q55.sf100/explain.txt | 134 ++--- .../approved-plans-v1_4/q55.sf100/simplified.txt | 26 +- .../approved-plans-v1_4/q72.sf100/explain.txt | 264 +- .../approved-plans-v1_4/q72.sf100/simplified.txt | 150 +++--- .../approved-plans-v1_4/q81.sf100/explain.txt | 570 ++--- .../approved-plans-v1_4/q81.sf100/simplified.txt | 142 ++--- .../approved-plans-v1_4/q91.sf100/explain.txt | 306 +-- 
.../approved-plans-v1_4/q91.sf100/simplified.txt | 62 +-- .../approved-plans-v2_7/q18a.sf100/explain.txt | 306 +-- .../approved-plans-v2_7/q18a.sf100/simplified.txt | 54 +- .../approved-plans-v2_7/q72.sf100/explain.txt | 264 +- .../approved-plans-v2_7/q72.sf100/simplified.txt | 150 +++--- 33 files changed, 2386 insertions(+), 2374 deletions(-)
[spark] branch branch-3.1 updated: [SPARK-33100][SQL] Ignore a semicolon inside a bracketed comment in spark-sql
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch branch-3.1 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.1 by this push: new f702a95 [SPARK-33100][SQL] Ignore a semicolon inside a bracketed comment in spark-sql f702a95 is described below commit f702a95e81e4b3318dec701d5a8eb2898bbd8ff6 Author: fwang12 AuthorDate: Tue Jan 5 15:55:30 2021 +0900 [SPARK-33100][SQL] Ignore a semicolon inside a bracketed comment in spark-sql
### What changes were proposed in this pull request?
Currently, spark-sql cannot parse SQL statements that contain bracketed comments. The statement:
```
/* SELECT 'test'; */ SELECT 'test';
```
would be split into two statements: the first one, `/* SELECT 'test'`, and the second one, `*/ SELECT 'test'`. An exception is then thrown because the first one is illegal. In this PR, we ignore the content inside bracketed comments while splitting the SQL statements. Besides, we ignore comments without any content.
### Why are the changes needed?
spark-sql might split statements inside bracketed comments, which is not correct.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Added UT. Closes #29982 from turboFei/SPARK-33110. 
Lead-authored-by: fwang12 Co-authored-by: turbofei Signed-off-by: Takeshi Yamamuro (cherry picked from commit a071826f72cd717a58bf37b877f805490f7a147f) Signed-off-by: Takeshi Yamamuro --- .../sql/hive/thriftserver/SparkSQLCLIDriver.scala | 40 +- .../spark/sql/hive/thriftserver/CliSuite.scala | 23 + 2 files changed, 55 insertions(+), 8 deletions(-) diff --git a/sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLCLIDriver.scala b/sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLCLIDriver.scala index f2fd373..9155eac 100644 --- a/sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLCLIDriver.scala +++ b/sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLCLIDriver.scala @@ -522,14 +522,22 @@ private[hive] class SparkSQLCLIDriver extends CliDriver with Logging { // Note: [SPARK-31595] if there is a `'` in a double quoted string, or a `"` in a single quoted // string, the origin implementation from Hive will not drop the trailing semicolon as expected, // hence we refined this function a little bit. + // Note: [SPARK-33100] Ignore a semicolon inside a bracketed comment in spark-sql. 
private def splitSemiColon(line: String): JList[String] = { var insideSingleQuote = false var insideDoubleQuote = false -var insideComment = false +var insideSimpleComment = false +var bracketedCommentLevel = 0 var escape = false var beginIndex = 0 +var includingStatement = false val ret = new JArrayList[String] +def insideBracketedComment: Boolean = bracketedCommentLevel > 0 +def insideComment: Boolean = insideSimpleComment || insideBracketedComment +def statementBegin(index: Int): Boolean = includingStatement || (!insideComment && + index > beginIndex && !s"${line.charAt(index)}".trim.isEmpty) + for (index <- 0 until line.length) { if (line.charAt(index) == '\'' && !insideComment) { // take a look to see if it is escaped @@ -553,21 +561,33 @@ private[hive] class SparkSQLCLIDriver extends CliDriver with Logging { // Sample query: select "quoted value --" //^^ avoids starting a comment if it's inside quotes. } else if (hasNext && line.charAt(index + 1) == '-') { - // ignore quotes and ; - insideComment = true + // ignore quotes and ; in simple comment + insideSimpleComment = true } } else if (line.charAt(index) == ';') { if (insideSingleQuote || insideDoubleQuote || insideComment) { // do not split } else { - // split, do not include ; itself - ret.add(line.substring(beginIndex, index)) + if (includingStatement) { +// split, do not include ; itself +ret.add(line.substring(beginIndex, index)) + } beginIndex = index + 1 + includingStatement = false } } else if (line.charAt(index) == '\n') { -// with a new line the inline comment should end. +// with a new line the inline simple comment should end. if (!escape) { - insideComment = false + insideSimpleComment = false +} + } else if (line.charAt(index) == '/' && !insideSimpleComment) { +
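The patched `splitSemiColon` above tracks a nesting level for bracketed comments so that semicolons inside `/* ... */` no longer split statements. A simplified, self-contained sketch of the same idea (no escape handling, no `--` line comments, and the function name `splitStatements` is illustrative, not Spark's):

```scala
// Split a line of SQL on semicolons, ignoring semicolons that appear
// inside single/double quotes or inside (possibly nested) /* ... */
// bracketed comments. A comment stays attached to its statement.
def splitStatements(line: String): List[String] = {
  var insideSingleQuote = false
  var insideDoubleQuote = false
  var bracketedCommentLevel = 0
  var beginIndex = 0
  var index = 0
  val out = scala.collection.mutable.ListBuffer.empty[String]
  while (index < line.length) {
    val c = line.charAt(index)
    val hasNext = index + 1 < line.length
    if (c == '\'' && bracketedCommentLevel == 0 && !insideDoubleQuote) {
      insideSingleQuote = !insideSingleQuote
    } else if (c == '"' && bracketedCommentLevel == 0 && !insideSingleQuote) {
      insideDoubleQuote = !insideDoubleQuote
    } else if (c == '/' && hasNext && line.charAt(index + 1) == '*' &&
               !insideSingleQuote && !insideDoubleQuote) {
      bracketedCommentLevel += 1   // comment opens; quotes inside are ignored
      index += 1                   // consume the '*' as well
    } else if (c == '*' && hasNext && line.charAt(index + 1) == '/' &&
               bracketedCommentLevel > 0) {
      bracketedCommentLevel -= 1   // comment closes
      index += 1                   // consume the '/' as well
    } else if (c == ';' && !insideSingleQuote && !insideDoubleQuote &&
               bracketedCommentLevel == 0) {
      val stmt = line.substring(beginIndex, index).trim
      if (stmt.nonEmpty) out += stmt   // drop empty statements, as the patch does
      beginIndex = index + 1
    }
    index += 1
  }
  val tail = line.substring(beginIndex).trim
  if (tail.nonEmpty) out += tail
  out.toList
}
```

With this, the example from the PR description, `/* SELECT 'test'; */ SELECT 'test';`, yields a single statement with the comment still attached, instead of two broken fragments.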
[spark] branch master updated (a7d3fcd -> a071826)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from a7d3fcd [SPARK-34000][CORE] Fix stageAttemptToNumSpeculativeTasks java.util.NoSuchElementException add a071826 [SPARK-33100][SQL] Ignore a semicolon inside a bracketed comment in spark-sql No new revisions were added by this update. Summary of changes: .../sql/hive/thriftserver/SparkSQLCLIDriver.scala | 40 +- .../spark/sql/hive/thriftserver/CliSuite.scala | 23 + 2 files changed, 55 insertions(+), 8 deletions(-)
[spark] branch master updated (65a9ac2 -> 10b6466)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 65a9ac2 [SPARK-30027][SQL] Support codegen for aggregate filters in HashAggregateExec add 10b6466 [SPARK-33084][CORE][SQL] Add jar support ivy path No new revisions were added by this update. Summary of changes: .../main/scala/org/apache/spark/SparkContext.scala | 45 --- .../org/apache/spark/deploy/SparkSubmit.scala | 8 +- .../apache/spark/deploy/worker/DriverWrapper.scala | 16 +-- .../spark/{deploy => util}/DependencyUtils.scala | 137 - .../scala/org/apache/spark/SparkContextSuite.scala | 116 + .../org/apache/spark/deploy/SparkSubmitSuite.scala | 2 +- .../spark/deploy/SparkSubmitUtilsSuite.scala | 14 ++- .../org/apache/spark/util/DependencyUtils.scala| 60 + docs/sql-ref-syntax-aux-resource-mgmt-add-jar.md | 16 ++- .../apache/spark/sql/internal/SessionState.scala | 30 +++-- sql/core/src/test/resources/SPARK-33084.jar| Bin 0 -> 6322 bytes .../scala/org/apache/spark/sql/SQLQuerySuite.scala | 54 .../spark/sql/hive/HiveSessionStateBuilder.scala | 9 +- .../sql/hive/client/IsolatedClientLoader.scala | 1 + .../spark/sql/hive/execution/HiveQuerySuite.scala | 17 +++ 15 files changed, 475 insertions(+), 50 deletions(-) rename core/src/main/scala/org/apache/spark/{deploy => util}/DependencyUtils.scala (54%) create mode 100644 core/src/test/scala/org/apache/spark/util/DependencyUtils.scala create mode 100644 sql/core/src/test/resources/SPARK-33084.jar
[spark] branch master updated (f62e957 -> 7466031)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from f62e957 [SPARK-33873][CORE][TESTS] Test all compression codecs with encrypted spilling add 7466031 [SPARK-32106][SQL] Implement script transform in sql/core No new revisions were added by this update. Summary of changes: .../spark/sql/catalyst/parser/AstBuilder.scala | 52 ++- .../sql/catalyst/parser/PlanParserSuite.scala | 113 ++- .../apache/spark/sql/execution/SparkPlanner.scala | 1 + .../execution/SparkScriptTransformationExec.scala | 91 ++ .../spark/sql/execution/SparkSqlParser.scala | 115 --- .../spark/sql/execution/SparkStrategies.scala | 14 + .../test/resources/sql-tests/inputs/transform.sql | 195 +++ .../resources/sql-tests/results/transform.sql.out | 357 + .../org/apache/spark/sql/SQLQueryTestSuite.scala | 5 +- .../execution/SparkScriptTransformationSuite.scala | 102 ++ .../execution/HiveScriptTransformationExec.scala | 2 + 11 files changed, 982 insertions(+), 65 deletions(-) create mode 100644 sql/core/src/main/scala/org/apache/spark/sql/execution/SparkScriptTransformationExec.scala create mode 100644 sql/core/src/test/resources/sql-tests/inputs/transform.sql create mode 100644 sql/core/src/test/resources/sql-tests/results/transform.sql.out create mode 100644 sql/core/src/test/scala/org/apache/spark/sql/execution/SparkScriptTransformationSuite.scala
[spark] branch master updated (1339168 -> 3c8be39)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 1339168 [SPARK-33756][SQL] Make BytesToBytesMap's MapIterator idempotent add 3c8be39 [SPARK-33850][SQL][FOLLOWUP] Improve and cleanup the test code No new revisions were added by this update. Summary of changes: .../scala/org/apache/spark/sql/ExplainSuite.scala | 25 -- 1 file changed, 9 insertions(+), 16 deletions(-)
[spark] branch branch-3.0 updated: [SPARK-33677][SQL] Skip LikeSimplification rule if pattern contains any escapeChar
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new ea7c2a1 [SPARK-33677][SQL] Skip LikeSimplification rule if pattern contains any escapeChar ea7c2a1 is described below commit ea7c2a15a02d8c8cf3e3f5a1260da76829d59596 Author: luluorta AuthorDate: Tue Dec 8 20:45:25 2020 +0900 [SPARK-33677][SQL] Skip LikeSimplification rule if pattern contains any escapeChar
### What changes were proposed in this pull request?
The `LikeSimplification` rule does not work correctly for many patterns that contain escape characters, for example: `SELECT s LIKE 'm%aca' ESCAPE '%' FROM t` `SELECT s LIKE 'maacaa' ESCAPE 'a' FROM t` For simplicity, this PR simply skips the rule if `pattern` contains any `escapeChar`.
### Why are the changes needed?
Results can be corrupted.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Added unit test. Closes #30625 from luluorta/SPARK-33677. 
Authored-by: luluorta Signed-off-by: Takeshi Yamamuro (cherry picked from commit 99613cd5815b2de12274027dee0c0a6c0c57bd95) Signed-off-by: Takeshi Yamamuro --- .../spark/sql/catalyst/optimizer/expressions.scala | 18 +--- .../optimizer/LikeSimplificationSuite.scala| 48 ++ .../scala/org/apache/spark/sql/SQLQuerySuite.scala | 14 +++ 3 files changed, 74 insertions(+), 6 deletions(-) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala index 7773f5c..7cbeb47 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala @@ -525,27 +525,33 @@ object LikeSimplification extends Rule[LogicalPlan] { private val equalTo = "([^_%]*)".r def apply(plan: LogicalPlan): LogicalPlan = plan transformAllExpressions { -case Like(input, Literal(pattern, StringType), escapeChar) => +case l @ Like(input, Literal(pattern, StringType), escapeChar) => if (pattern == null) { // If pattern is null, return null value directly, since "col like null" == null. Literal(null, BooleanType) } else { -val escapeStr = String.valueOf(escapeChar) pattern.toString match { - case startsWith(prefix) if !prefix.endsWith(escapeStr) => + // There are three different situations when pattern containing escapeChar: + // 1. pattern contains invalid escape sequence, e.g. 'm\aca' + // 2. pattern contains escaped wildcard character, e.g. 'ma\%ca' + // 3. pattern contains escaped escape character, e.g. 'ma\\ca' + // Although there are patterns can be optimized if we handle the escape first, we just + // skip this rule if pattern contains any escapeChar for simplicity. 
+ case p if p.contains(escapeChar) => l + case startsWith(prefix) => StartsWith(input, Literal(prefix)) case endsWith(postfix) => EndsWith(input, Literal(postfix)) // 'a%a' pattern is basically same with 'a%' && '%a'. // However, the additional `Length` condition is required to prevent 'a' match 'a%a'. - case startsAndEndsWith(prefix, postfix) if !prefix.endsWith(escapeStr) => + case startsAndEndsWith(prefix, postfix) => And(GreaterThanOrEqual(Length(input), Literal(prefix.length + postfix.length)), And(StartsWith(input, Literal(prefix)), EndsWith(input, Literal(postfix - case contains(infix) if !infix.endsWith(escapeStr) => + case contains(infix) => Contains(input, Literal(infix)) case equalTo(str) => EqualTo(input, Literal(str)) - case _ => Like(input, Literal.create(pattern, StringType), escapeChar) + case _ => l } } } diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/LikeSimplificationSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/LikeSimplificationSuite.scala index 436f62e..1812dce 100644 --- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/LikeSimplificationSuite.scala +++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/LikeSimplificationSuite.scala @@ -116,4 +116,52 @@ class LikeSimplificationSuite extends PlanTest { val optimized2 = Optimize.execute(originalQuery2.analyze) comparePlans(optimized2, originalQuery2.analyze) } + +
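The guard added in the diff above (`case p if p.contains(escapeChar) => l`) bails out of the rewrite whenever the pattern contains the escape character; only escape-free patterns are turned into prefix/suffix/substring matches. A self-contained sketch of that behavior — the regexes mirror Spark's, but the result is a descriptive string instead of a Catalyst expression, and the `startsAndEndsWith` case is omitted:

```scala
// Same regexes LikeSimplification uses to recognize trivially
// rewritable LIKE patterns (no '_' or '%' wildcards inside the literal part).
val startsWith = "([^_%]+)%".r
val endsWith   = "%([^_%]+)".r
val contains   = "%([^_%]+)%".r
val equalTo    = "([^_%]*)".r

// Returns a description of the simplified expression, or None when the
// rule must be skipped (escape char present, or no simple shape matches).
def simplifyLike(pattern: String, escapeChar: Char): Option[String] =
  if (pattern.contains(escapeChar)) {
    None  // skip the rule entirely, as the patch does
  } else {
    pattern match {
      case startsWith(prefix) => Some(s"StartsWith($prefix)")
      case endsWith(postfix)  => Some(s"EndsWith($postfix)")
      case contains(infix)    => Some(s"Contains($infix)")
      case equalTo(str)       => Some(s"EqualTo($str)")
      case _                  => None
    }
  }
```

So `'abc%'` with escape `\` still becomes a starts-with check, while `'m%aca' ESCAPE '%'` is left for the regular LIKE evaluation path, which understands escapes.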
[spark] branch master updated (031c5ef -> 99613cd)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 031c5ef [SPARK-33679][SQL] Enable spark.sql.adaptive.enabled by default add 99613cd [SPARK-33677][SQL] Skip LikeSimplification rule if pattern contains any escapeChar No new revisions were added by this update. Summary of changes: .../spark/sql/catalyst/optimizer/expressions.scala | 18 +--- .../optimizer/LikeSimplificationSuite.scala| 48 ++ .../scala/org/apache/spark/sql/SQLQuerySuite.scala | 14 +++ 3 files changed, 74 insertions(+), 6 deletions(-)
[spark] branch branch-3.1 updated: [SPARK-33677][SQL] Skip LikeSimplification rule if pattern contains any escapeChar
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch branch-3.1 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.1 by this push: new 54a73ab [SPARK-33677][SQL] Skip LikeSimplification rule if pattern contains any escapeChar 54a73ab is described below commit 54a73ab7bb062b0fad3b8925d03bb4dca9fdc17a Author: luluorta AuthorDate: Tue Dec 8 20:45:25 2020 +0900 [SPARK-33677][SQL] Skip LikeSimplification rule if pattern contains any escapeChar
### What changes were proposed in this pull request?
The `LikeSimplification` rule does not work correctly for many patterns that contain escape characters, for example: `SELECT s LIKE 'm%aca' ESCAPE '%' FROM t` `SELECT s LIKE 'maacaa' ESCAPE 'a' FROM t` For simplicity, this PR simply skips the rule if `pattern` contains any `escapeChar`.
### Why are the changes needed?
Results can be corrupted.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Added unit test. Closes #30625 from luluorta/SPARK-33677. 
Authored-by: luluorta Signed-off-by: Takeshi Yamamuro (cherry picked from commit 99613cd5815b2de12274027dee0c0a6c0c57bd95) Signed-off-by: Takeshi Yamamuro --- .../spark/sql/catalyst/optimizer/expressions.scala | 18 +--- .../optimizer/LikeSimplificationSuite.scala| 48 ++ .../scala/org/apache/spark/sql/SQLQuerySuite.scala | 14 +++ 3 files changed, 74 insertions(+), 6 deletions(-) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala index 1b1e2ad..b2fc334 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala @@ -543,27 +543,33 @@ object LikeSimplification extends Rule[LogicalPlan] { private val equalTo = "([^_%]*)".r def apply(plan: LogicalPlan): LogicalPlan = plan transformAllExpressions { -case Like(input, Literal(pattern, StringType), escapeChar) => +case l @ Like(input, Literal(pattern, StringType), escapeChar) => if (pattern == null) { // If pattern is null, return null value directly, since "col like null" == null. Literal(null, BooleanType) } else { -val escapeStr = String.valueOf(escapeChar) pattern.toString match { - case startsWith(prefix) if !prefix.endsWith(escapeStr) => + // There are three different situations when pattern containing escapeChar: + // 1. pattern contains invalid escape sequence, e.g. 'm\aca' + // 2. pattern contains escaped wildcard character, e.g. 'ma\%ca' + // 3. pattern contains escaped escape character, e.g. 'ma\\ca' + // Although there are patterns can be optimized if we handle the escape first, we just + // skip this rule if pattern contains any escapeChar for simplicity. 
+ case p if p.contains(escapeChar) => l + case startsWith(prefix) => StartsWith(input, Literal(prefix)) case endsWith(postfix) => EndsWith(input, Literal(postfix)) // 'a%a' pattern is basically same with 'a%' && '%a'. // However, the additional `Length` condition is required to prevent 'a' match 'a%a'. - case startsAndEndsWith(prefix, postfix) if !prefix.endsWith(escapeStr) => + case startsAndEndsWith(prefix, postfix) => And(GreaterThanOrEqual(Length(input), Literal(prefix.length + postfix.length)), And(StartsWith(input, Literal(prefix)), EndsWith(input, Literal(postfix - case contains(infix) if !infix.endsWith(escapeStr) => + case contains(infix) => Contains(input, Literal(infix)) case equalTo(str) => EqualTo(input, Literal(str)) - case _ => Like(input, Literal.create(pattern, StringType), escapeChar) + case _ => l } } } diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/LikeSimplificationSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/LikeSimplificationSuite.scala index 436f62e..1812dce 100644 --- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/LikeSimplificationSuite.scala +++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/LikeSimplificationSuite.scala @@ -116,4 +116,52 @@ class LikeSimplificationSuite extends PlanTest { val optimized2 = Optimize.execute(originalQuery2.analyze) comparePlans(optimized2, originalQuery2.analyze) } + +
[spark] branch master updated (9273d42 -> cf4ad21)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 9273d42 [SPARK-33045][SQL][FOLLOWUP] Support built-in function like_any and fix StackOverflowError issue add cf4ad21 [SPARK-33503][SQL] Refactor SortOrder class to allow multiple childrens No new revisions were added by this update. Summary of changes: .../spark/sql/catalyst/analysis/Analyzer.scala | 2 +- .../apache/spark/sql/catalyst/dsl/package.scala| 4 ++-- .../spark/sql/catalyst/expressions/SortOrder.scala | 10 + .../spark/sql/catalyst/parser/AstBuilder.scala | 2 +- .../main/scala/org/apache/spark/sql/Column.scala | 8 +++ .../sql/execution/AliasAwareOutputExpression.scala | 6 + .../sql/execution/joins/SortMergeJoinExec.scala| 9 .../apache/spark/sql/execution/PlannerSuite.scala | 26 ++ 8 files changed, 46 insertions(+), 21 deletions(-)
[spark] branch master updated (feda729 -> 4851453)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from feda729 [SPARK-33567][SQL] DSv2: Use callback instead of passing Spark session and v2 relation for refreshing cache add 4851453 [MINOR] Spelling bin core docs external mllib repl No new revisions were added by this update. Summary of changes: bin/docker-image-tool.sh | 2 +- .../org/apache/spark/ui/static/spark-dag-viz.js| 2 +- .../resources/org/apache/spark/ui/static/utils.js | 2 +- .../apache/spark/ExecutorAllocationManager.scala | 4 +- .../org/apache/spark/api/java/JavaPairRDD.scala| 4 +- .../org/apache/spark/api/java/JavaRDDLike.scala| 2 +- .../org/apache/spark/api/python/PythonRDD.scala| 6 +- .../org/apache/spark/deploy/JsonProtocol.scala | 2 +- .../org/apache/spark/deploy/SparkSubmit.scala | 2 +- .../spark/deploy/history/FsHistoryProvider.scala | 2 +- .../apache/spark/deploy/history/HybridStore.scala | 2 +- .../scala/org/apache/spark/executor/Executor.scala | 4 +- .../org/apache/spark/metrics/MetricsConfig.scala | 2 +- .../spark/metrics/sink/PrometheusServlet.scala | 6 +- .../org/apache/spark/rdd/DoubleRDDFunctions.scala | 2 +- .../org/apache/spark/rdd/OrderedRDDFunctions.scala | 4 +- core/src/main/scala/org/apache/spark/rdd/RDD.scala | 2 +- .../spark/resource/TaskResourceRequest.scala | 2 +- .../org/apache/spark/rpc/netty/NettyRpcEnv.scala | 4 +- .../scheduler/BarrierJobAllocationFailed.scala | 4 +- .../org/apache/spark/scheduler/DAGScheduler.scala | 8 +- .../org/apache/spark/scheduler/HealthTracker.scala | 4 +- .../apache/spark/scheduler/TaskSetManager.scala| 2 +- .../apache/spark/security/CryptoStreamUtils.scala | 2 +- .../org/apache/spark/storage/BlockManager.scala| 4 +- .../spark/storage/BlockManagerMasterEndpoint.scala | 2 +- .../org/apache/spark/ui/jobs/AllJobsPage.scala | 2 +- .../scala/org/apache/spark/ui/jobs/JobPage.scala | 2 +- .../org/apache/spark/util/ClosureCleaner.scala | 
2 +- .../main/scala/org/apache/spark/util/Utils.scala | 22 ++-- .../apache/spark/util/io/ChunkedByteBuffer.scala | 2 +- .../shuffle/sort/UnsafeShuffleWriterSuite.java | 10 +- .../java/test/org/apache/spark/JavaAPISuite.java | 2 +- .../scala/org/apache/spark/CheckpointSuite.scala | 12 +-- .../org/apache/spark/ContextCleanerSuite.scala | 10 +- .../spark/ExecutorAllocationManagerSuite.scala | 2 +- .../test/scala/org/apache/spark/FileSuite.scala| 2 +- .../org/apache/spark/benchmark/BenchmarkBase.scala | 2 +- .../deploy/history/FsHistoryProviderSuite.scala| 4 +- .../apache/spark/deploy/master/MasterSuite.scala | 2 +- .../apache/spark/deploy/worker/WorkerSuite.scala | 2 +- .../org/apache/spark/executor/ExecutorSuite.scala | 2 +- .../io/FileCommitProtocolInstantiationSuite.scala | 4 +- .../spark/metrics/InputOutputMetricsSuite.scala| 2 +- .../netty/NettyBlockTransferServiceSuite.scala | 2 +- .../apache/spark/rdd/PairRDDFunctionsSuite.scala | 34 +++--- .../test/scala/org/apache/spark/rdd/RDDSuite.scala | 2 +- .../apache/spark/resource/ResourceUtilsSuite.scala | 2 +- .../apache/spark/rpc/netty/NettyRpcEnvSuite.scala | 2 +- .../apache/spark/scheduler/DAGSchedulerSuite.scala | 6 +- .../spark/scheduler/ReplayListenerSuite.scala | 2 +- .../scheduler/SchedulerIntegrationSuite.scala | 8 +- .../spark/scheduler/SparkListenerSuite.scala | 6 +- .../spark/scheduler/TaskSetManagerSuite.scala | 6 +- .../spark/status/AppStatusListenerSuite.scala | 2 +- .../apache/spark/storage/BlockManagerSuite.scala | 4 +- .../org/apache/spark/util/JsonProtocolSuite.scala | 8 +- .../org/apache/spark/util/SizeEstimatorSuite.scala | 2 +- docs/_plugins/include_example.rb | 4 +- docs/building-spark.md | 2 +- docs/configuration.md | 2 +- docs/css/main.css | 4 +- docs/graphx-programming-guide.md | 4 +- docs/ml-migration-guide.md | 2 +- docs/mllib-clustering.md | 2 +- docs/mllib-data-types.md | 2 +- docs/monitoring.md | 6 +- docs/running-on-kubernetes.md | 4 +- docs/running-on-mesos.md | 2 +- 
docs/running-on-yarn.md| 2 +- docs/sparkr.md | 2 +- docs/sql-data-sources-jdbc.md | 2 +- docs/sql-migration-guide.md| 6 +- docs/sql-ref-syntax-aux-conf-mgmt-set-timezone.md | 2 +- docs/sql-ref-syntax-ddl-create-table
[spark] branch master updated: [SPARK-33570][SQL][TESTS] Set the proper version of gssapi plugin automatically for MariaDBKrbIntegrationSuite
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new cf98a76  [SPARK-33570][SQL][TESTS] Set the proper version of gssapi plugin automatically for MariaDBKrbIntegrationSuite
cf98a76 is described below

commit cf98a761de677c733f3c33230e1c63ddb785d5c5
Author: Kousuke Saruta
AuthorDate: Sat Nov 28 23:38:11 2020 +0900

[SPARK-33570][SQL][TESTS] Set the proper version of gssapi plugin automatically for MariaDBKrbIntegrationSuite

### What changes were proposed in this pull request?

This PR changes mariadb_docker_entrypoint.sh to set the proper version of mariadb-plugin-gssapi-server automatically, derived from the installed version of mariadb-server. It also makes it possible to use an arbitrary docker image by setting the environment variable `MARIADB_DOCKER_IMAGE_NAME`.

### Why are the changes needed?

For `MariaDBKrbIntegrationSuite`, the version of `mariadb-plugin-gssapi-server` is currently pinned to `10.5.5` in `mariadb_docker_entrypoint.sh`, but that version is no longer available in the official apt repository, so `MariaDBKrbIntegrationSuite` doesn't pass at the moment. It seems that only the three most recent versions of each major version are available; currently those are `10.5.6`, `10.5.7`, and `10.5.8`. Further, the release cycle of MariaDB is very rapid (1-2 months), so I don't think it's a good idea to pin `mariadb-plugin-gssapi-server` to a specific version.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Confirmed that `MariaDBKrbIntegrationSuite` passes with the following commands.
```
$ build/sbt -Pdocker-integration-tests -Phive -Phive-thriftserver package "testOnly org.apache.spark.sql.jdbc.MariaDBKrbIntegrationSuite"
```

In this case, we can see which version of `mariadb-plugin-gssapi-server` is going to be installed in the following container log message.

```
Installing mariadb-plugin-gssapi-server=1:10.5.8+maria~focal
```

Or, we can set `MARIADB_DOCKER_IMAGE_NAME` to pick a specific version of MariaDB.

```
$ MARIADB_DOCKER_IMAGE_NAME=mariadb:10.5.6 build/sbt -Pdocker-integration-tests -Phive -Phive-thriftserver package "testOnly org.apache.spark.sql.jdbc.MariaDBKrbIntegrationSuite"
```

```
Installing mariadb-plugin-gssapi-server=1:10.5.6+maria~focal
```

Closes #30515 from sarutak/fix-MariaDBKrbIntegrationSuite.

Authored-by: Kousuke Saruta
Signed-off-by: Takeshi Yamamuro
---
 .../src/test/resources/mariadb_docker_entrypoint.sh         |  4 +++-
 .../apache/spark/sql/jdbc/MariaDBKrbIntegrationSuite.scala  | 12 +---
 2 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/external/docker-integration-tests/src/test/resources/mariadb_docker_entrypoint.sh b/external/docker-integration-tests/src/test/resources/mariadb_docker_entrypoint.sh
index 97c00a9..ab7d967 100755
--- a/external/docker-integration-tests/src/test/resources/mariadb_docker_entrypoint.sh
+++ b/external/docker-integration-tests/src/test/resources/mariadb_docker_entrypoint.sh
@@ -18,7 +18,9 @@
 dpkg-divert --add /bin/systemctl && ln -sT /bin/true /bin/systemctl
 apt update
-apt install -y mariadb-plugin-gssapi-server=1:10.5.5+maria~focal
+GSSAPI_PLUGIN=mariadb-plugin-gssapi-server=$(dpkg -s mariadb-server | sed -n "s/^Version: \(.*\)/\1/p")
+echo "Installing $GSSAPI_PLUGIN"
+apt install -y "$GSSAPI_PLUGIN"
 echo "gssapi_keytab_path=/docker-entrypoint-initdb.d/mariadb.keytab" >> /etc/mysql/mariadb.conf.d/auth_gssapi.cnf
 echo "gssapi_principal_name=mariadb/__ip_address_replace_m...@example.com" >> /etc/mysql/mariadb.conf.d/auth_gssapi.cnf
 docker-entrypoint.sh mysqld
diff --git a/external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/MariaDBKrbIntegrationSuite.scala b/external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/MariaDBKrbIntegrationSuite.scala
index adee2be..59a6f53 100644
--- a/external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/MariaDBKrbIntegrationSuite.scala
+++ b/external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/MariaDBKrbIntegrationSuite.scala
@@ -24,15 +24,21 @@
 import com.spotify.docker.client.messages.{ContainerConfig, HostConfig}
 import org.apache.spark.sql.execution.datasources.jdbc.connection.SecureConnectionProvider
 import org.apache.spark.tags.DockerTest
+/**
+ * To run this test suite for a specific version (e.g., mariadb:10.5.8):
+ * {{{
+ *   MARIADB_DOCKER_IMAGE_NAME=mariadb:10.5.8
+ *     ./build/sbt -Pdocker-integration-tests
+ *     "testOnly org.apache.spark.s
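The core of the entrypoint change is deriving the gssapi plugin version from the installed mariadb-server package instead of hard-coding it. Below is a minimal sketch of that sed extraction; the real script pipes `dpkg -s mariadb-server`, but here a canned dpkg-style record stands in so the example runs on any machine (the version string is just the one from the log message above).

```shell
# Sketch of the version-derivation step from mariadb_docker_entrypoint.sh.
# A canned dpkg record replaces the real `dpkg -s mariadb-server` output.
dpkg_output='Package: mariadb-server
Status: install ok installed
Version: 1:10.5.8+maria~focal'

# Keep only the Version field, exactly as the entrypoint's sed invocation does.
version=$(printf '%s\n' "$dpkg_output" | sed -n "s/^Version: \(.*\)/\1/p")

GSSAPI_PLUGIN="mariadb-plugin-gssapi-server=$version"
echo "Installing $GSSAPI_PLUGIN"
# prints: Installing mariadb-plugin-gssapi-server=1:10.5.8+maria~focal
```

Pinning to an exact plugin version is what broke the suite when `10.5.5` left the apt repository; deriving the version this way keeps the plugin in lockstep with whatever mariadb-server the image ships.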
[spark] branch master updated (1bd897c -> 0592181)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git.

from 1bd897c  [SPARK-32918][SHUFFLE] RPC implementation to support control plane coordination for push-based shuffle
 add 0592181  [SPARK-33479][DOC][FOLLOWUP] DocSearch: Support filtering search results by version

No new revisions were added by this update.

Summary of changes:
 docs/_config.yml | 13 +
 1 file changed, 9 insertions(+), 4 deletions(-)
[spark] branch master updated (56a8510 -> 4267ca9)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git.

from 56a8510  [SPARK-33304][R][SQL] Add from_avro and to_avro functions to SparkR
 add 4267ca9  [SPARK-33479][DOC] Make the API Key of DocSearch configurable

No new revisions were added by this update.

Summary of changes:
 docs/_config.yml          | 12
 docs/_layouts/global.html |  8 +---
 2 files changed, 13 insertions(+), 7 deletions(-)
[spark] branch master updated (fbfc0bf -> 9a4c790)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git.

from fbfc0bf  [SPARK-33464][INFRA] Add/remove (un)necessary cache and restructure GitHub Actions yaml
 add 9a4c790  [SPARK-33354][SQL] New explicit cast syntax rules in ANSI mode

No new revisions were added by this update.

Summary of changes:
 docs/sql-ref-ansi-compliance.md                     |  21 +
 .../spark/sql/catalyst/expressions/Cast.scala       | 118 ++-
 .../spark/sql/catalyst/expressions/CastSuite.scala  | 850 +++--
 .../org/apache/spark/sql/sources/InsertSuite.scala  |  41 +
 4 files changed, 635 insertions(+), 395 deletions(-)
[spark] branch branch-3.0 updated: [MINOR][GRAPHX][3.0] Correct typos in the sub-modules: graphx, external, and examples
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new 26c0404  [MINOR][GRAPHX][3.0] Correct typos in the sub-modules: graphx, external, and examples
26c0404 is described below

commit 26c0404214563bb558662e68ea73357c4f4021ed
Author: Josh Soref
AuthorDate: Tue Nov 17 15:25:42 2020 +0900

[MINOR][GRAPHX][3.0] Correct typos in the sub-modules: graphx, external, and examples

### What changes were proposed in this pull request?

This PR intends to fix typos in the sub-modules: graphx, external, and examples. Split per holdenk https://github.com/apache/spark/pull/30323#issuecomment-725159710

NOTE: The misspellings have been reported at https://github.com/jsoref/spark/commit/706a726f87a0bbf5e31467fae9015218773db85b#commitcomment-44064356

Backport of #30326

### Why are the changes needed?

Misspelled words make it harder to read and understand content.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

No testing was performed.

Closes #30342 from jsoref/branch-3.0-30326.
Authored-by: Josh Soref
Signed-off-by: Takeshi Yamamuro
---
 .../apache/spark/examples/streaming/JavaCustomReceiver.java  |  2 +-
 .../spark/examples/streaming/JavaNetworkWordCount.java       |  2 +-
 .../examples/streaming/JavaRecoverableNetworkWordCount.java  |  2 +-
 .../spark/examples/streaming/JavaSqlNetworkWordCount.java    |  2 +-
 examples/src/main/python/ml/train_validation_split.py        |  2 +-
 examples/src/main/python/sql/arrow.py                        |  4 ++--
 .../main/python/streaming/recoverable_network_wordcount.py   |  2 +-
 examples/src/main/python/streaming/sql_network_wordcount.py  |  2 +-
 .../org/apache/spark/examples/streaming/CustomReceiver.scala |  2 +-
 .../apache/spark/examples/streaming/NetworkWordCount.scala   |  2 +-
 .../examples/streaming/RecoverableNetworkWordCount.scala     |  2 +-
 .../spark/examples/streaming/SqlNetworkWordCount.scala       |  2 +-
 .../spark/examples/streaming/StatefulNetworkWordCount.scala  |  2 +-
 .../apache/spark/sql/jdbc/DockerJDBCIntegrationSuite.scala   |  2 +-
 .../spark/sql/kafka010/KafkaContinuousSourceSuite.scala      |  4 ++--
 .../spark/sql/kafka010/KafkaMicroBatchSourceSuite.scala      | 12 ++--
 .../org/apache/spark/sql/kafka010/KafkaRelationSuite.scala   |  4 ++--
 .../scala/org/apache/spark/sql/kafka010/KafkaTestUtils.scala |  4 ++--
 .../org/apache/spark/streaming/kafka010/KafkaRDDSuite.scala  |  2 +-
 .../spark/examples/streaming/JavaKinesisWordCountASL.java    |  2 +-
 .../main/python/examples/streaming/kinesis_wordcount_asl.py  |  2 +-
 .../spark/examples/streaming/KinesisWordCountASL.scala       |  6 +++---
 .../spark/streaming/kinesis/KinesisUtilsPythonHelper.scala   |  2 +-
 .../scala/org/apache/spark/graphx/lib/PageRankSuite.scala    |  6 +++---
 24 files changed, 37 insertions(+), 37 deletions(-)

diff --git a/examples/src/main/java/org/apache/spark/examples/streaming/JavaCustomReceiver.java b/examples/src/main/java/org/apache/spark/examples/streaming/JavaCustomReceiver.java
index 47692ec..f84a197 100644
--- a/examples/src/main/java/org/apache/spark/examples/streaming/JavaCustomReceiver.java
+++ b/examples/src/main/java/org/apache/spark/examples/streaming/JavaCustomReceiver.java
@@ -67,7 +67,7 @@ public class JavaCustomReceiver extends Receiver {
     JavaStreamingContext ssc = new JavaStreamingContext(sparkConf, new Duration(1000));

     // Create an input stream with the custom receiver on target ip:port and count the
-    // words in input stream of \n delimited text (eg. generated by 'nc')
+    // words in input stream of \n delimited text (e.g. generated by 'nc')
     JavaReceiverInputDStream lines = ssc.receiverStream(
       new JavaCustomReceiver(args[0], Integer.parseInt(args[1])));
     JavaDStream words = lines.flatMap(x -> Arrays.asList(SPACE.split(x)).iterator());

diff --git a/examples/src/main/java/org/apache/spark/examples/streaming/JavaNetworkWordCount.java b/examples/src/main/java/org/apache/spark/examples/streaming/JavaNetworkWordCount.java
index b217672..d56134b 100644
--- a/examples/src/main/java/org/apache/spark/examples/streaming/JavaNetworkWordCount.java
+++ b/examples/src/main/java/org/apache/spark/examples/streaming/JavaNetworkWordCount.java
@@ -57,7 +57,7 @@ public final class JavaNetworkWordCount {
     JavaStreamingContext ssc = new JavaStreamingContext(sparkConf, Durations.seconds(1));

     // Create a JavaReceiverInputDStream on target ip:port and count the
-    // words in input stream of \n delimited text (eg. generated by 'nc')
+    // words in input stream of \n delimited text (e.g. generated by 'nc')
     // Note that no duplication in stor
[spark] branch branch-3.0 updated: [MINOR][GRAPHX][3.0] Correct typos in the sub-modules: graphx, external, and examples
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new 26c0404 [MINOR][GRAPHX][3.0] Correct typos in the sub-modules: graphx, external, and examples 26c0404 is described below commit 26c0404214563bb558662e68ea73357c4f4021ed Author: Josh Soref AuthorDate: Tue Nov 17 15:25:42 2020 +0900 [MINOR][GRAPHX][3.0] Correct typos in the sub-modules: graphx, external, and examples ### What changes were proposed in this pull request? This PR intends to fix typos in the sub-modules: graphx, external, and examples. Split per holdenk https://github.com/apache/spark/pull/30323#issuecomment-725159710 NOTE: The misspellings have been reported at https://github.com/jsoref/spark/commit/706a726f87a0bbf5e31467fae9015218773db85b#commitcomment-44064356 Backport of #30326 ### Why are the changes needed? Misspelled words make it harder to read / understand content. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? No testing was performed Closes #30342 from jsoref/branch-3.0-30326. 
Authored-by: Josh Soref Signed-off-by: Takeshi Yamamuro --- .../apache/spark/examples/streaming/JavaCustomReceiver.java | 2 +- .../spark/examples/streaming/JavaNetworkWordCount.java | 2 +- .../examples/streaming/JavaRecoverableNetworkWordCount.java | 2 +- .../spark/examples/streaming/JavaSqlNetworkWordCount.java| 2 +- examples/src/main/python/ml/train_validation_split.py| 2 +- examples/src/main/python/sql/arrow.py| 4 ++-- .../main/python/streaming/recoverable_network_wordcount.py | 2 +- examples/src/main/python/streaming/sql_network_wordcount.py | 2 +- .../org/apache/spark/examples/streaming/CustomReceiver.scala | 2 +- .../apache/spark/examples/streaming/NetworkWordCount.scala | 2 +- .../examples/streaming/RecoverableNetworkWordCount.scala | 2 +- .../spark/examples/streaming/SqlNetworkWordCount.scala | 2 +- .../spark/examples/streaming/StatefulNetworkWordCount.scala | 2 +- .../apache/spark/sql/jdbc/DockerJDBCIntegrationSuite.scala | 2 +- .../spark/sql/kafka010/KafkaContinuousSourceSuite.scala | 4 ++-- .../spark/sql/kafka010/KafkaMicroBatchSourceSuite.scala | 12 ++-- .../org/apache/spark/sql/kafka010/KafkaRelationSuite.scala | 4 ++-- .../scala/org/apache/spark/sql/kafka010/KafkaTestUtils.scala | 4 ++-- .../org/apache/spark/streaming/kafka010/KafkaRDDSuite.scala | 2 +- .../spark/examples/streaming/JavaKinesisWordCountASL.java| 2 +- .../main/python/examples/streaming/kinesis_wordcount_asl.py | 2 +- .../spark/examples/streaming/KinesisWordCountASL.scala | 6 +++--- .../spark/streaming/kinesis/KinesisUtilsPythonHelper.scala | 2 +- .../scala/org/apache/spark/graphx/lib/PageRankSuite.scala| 6 +++--- 24 files changed, 37 insertions(+), 37 deletions(-) diff --git a/examples/src/main/java/org/apache/spark/examples/streaming/JavaCustomReceiver.java b/examples/src/main/java/org/apache/spark/examples/streaming/JavaCustomReceiver.java index 47692ec..f84a197 100644 --- a/examples/src/main/java/org/apache/spark/examples/streaming/JavaCustomReceiver.java +++ 
b/examples/src/main/java/org/apache/spark/examples/streaming/JavaCustomReceiver.java @@ -67,7 +67,7 @@ public class JavaCustomReceiver extends Receiver { JavaStreamingContext ssc = new JavaStreamingContext(sparkConf, new Duration(1000)); // Create an input stream with the custom receiver on target ip:port and count the -// words in input stream of \n delimited text (eg. generated by 'nc') +// words in input stream of \n delimited text (e.g. generated by 'nc') JavaReceiverInputDStream lines = ssc.receiverStream( new JavaCustomReceiver(args[0], Integer.parseInt(args[1]))); JavaDStream words = lines.flatMap(x -> Arrays.asList(SPACE.split(x)).iterator()); diff --git a/examples/src/main/java/org/apache/spark/examples/streaming/JavaNetworkWordCount.java b/examples/src/main/java/org/apache/spark/examples/streaming/JavaNetworkWordCount.java index b217672..d56134b 100644 --- a/examples/src/main/java/org/apache/spark/examples/streaming/JavaNetworkWordCount.java +++ b/examples/src/main/java/org/apache/spark/examples/streaming/JavaNetworkWordCount.java @@ -57,7 +57,7 @@ public final class JavaNetworkWordCount { JavaStreamingContext ssc = new JavaStreamingContext(sparkConf, Durations.seconds(1)); // Create a JavaReceiverInputDStream on target ip:port and count the -// words in input stream of \n delimited text (eg. generated by 'nc') +// words in input stream of \n delimited text (e.g. generated by 'nc') // Note that no duplication in stor
[spark] branch branch-3.0 updated: [MINOR][GRAPHX][3.0] Correct typos in the sub-modules: graphx, external, and examples
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new 26c0404 [MINOR][GRAPHX][3.0] Correct typos in the sub-modules: graphx, external, and examples 26c0404 is described below commit 26c0404214563bb558662e68ea73357c4f4021ed Author: Josh Soref AuthorDate: Tue Nov 17 15:25:42 2020 +0900 [MINOR][GRAPHX][3.0] Correct typos in the sub-modules: graphx, external, and examples ### What changes were proposed in this pull request? This PR intends to fix typos in the sub-modules: graphx, external, and examples. Split per holdenk https://github.com/apache/spark/pull/30323#issuecomment-725159710 NOTE: The misspellings have been reported at https://github.com/jsoref/spark/commit/706a726f87a0bbf5e31467fae9015218773db85b#commitcomment-44064356 Backport of #30326 ### Why are the changes needed? Misspelled words make it harder to read / understand content. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? No testing was performed Closes #30342 from jsoref/branch-3.0-30326. 
Authored-by: Josh Soref Signed-off-by: Takeshi Yamamuro --- .../apache/spark/examples/streaming/JavaCustomReceiver.java | 2 +- .../spark/examples/streaming/JavaNetworkWordCount.java | 2 +- .../examples/streaming/JavaRecoverableNetworkWordCount.java | 2 +- .../spark/examples/streaming/JavaSqlNetworkWordCount.java| 2 +- examples/src/main/python/ml/train_validation_split.py| 2 +- examples/src/main/python/sql/arrow.py| 4 ++-- .../main/python/streaming/recoverable_network_wordcount.py | 2 +- examples/src/main/python/streaming/sql_network_wordcount.py | 2 +- .../org/apache/spark/examples/streaming/CustomReceiver.scala | 2 +- .../apache/spark/examples/streaming/NetworkWordCount.scala | 2 +- .../examples/streaming/RecoverableNetworkWordCount.scala | 2 +- .../spark/examples/streaming/SqlNetworkWordCount.scala | 2 +- .../spark/examples/streaming/StatefulNetworkWordCount.scala | 2 +- .../apache/spark/sql/jdbc/DockerJDBCIntegrationSuite.scala | 2 +- .../spark/sql/kafka010/KafkaContinuousSourceSuite.scala | 4 ++-- .../spark/sql/kafka010/KafkaMicroBatchSourceSuite.scala | 12 ++-- .../org/apache/spark/sql/kafka010/KafkaRelationSuite.scala | 4 ++-- .../scala/org/apache/spark/sql/kafka010/KafkaTestUtils.scala | 4 ++-- .../org/apache/spark/streaming/kafka010/KafkaRDDSuite.scala | 2 +- .../spark/examples/streaming/JavaKinesisWordCountASL.java| 2 +- .../main/python/examples/streaming/kinesis_wordcount_asl.py | 2 +- .../spark/examples/streaming/KinesisWordCountASL.scala | 6 +++--- .../spark/streaming/kinesis/KinesisUtilsPythonHelper.scala | 2 +- .../scala/org/apache/spark/graphx/lib/PageRankSuite.scala| 6 +++--- 24 files changed, 37 insertions(+), 37 deletions(-) diff --git a/examples/src/main/java/org/apache/spark/examples/streaming/JavaCustomReceiver.java b/examples/src/main/java/org/apache/spark/examples/streaming/JavaCustomReceiver.java index 47692ec..f84a197 100644 --- a/examples/src/main/java/org/apache/spark/examples/streaming/JavaCustomReceiver.java +++ 
b/examples/src/main/java/org/apache/spark/examples/streaming/JavaCustomReceiver.java @@ -67,7 +67,7 @@ public class JavaCustomReceiver extends Receiver { JavaStreamingContext ssc = new JavaStreamingContext(sparkConf, new Duration(1000)); // Create an input stream with the custom receiver on target ip:port and count the -// words in input stream of \n delimited text (eg. generated by 'nc') +// words in input stream of \n delimited text (e.g. generated by 'nc') JavaReceiverInputDStream lines = ssc.receiverStream( new JavaCustomReceiver(args[0], Integer.parseInt(args[1]))); JavaDStream words = lines.flatMap(x -> Arrays.asList(SPACE.split(x)).iterator()); diff --git a/examples/src/main/java/org/apache/spark/examples/streaming/JavaNetworkWordCount.java b/examples/src/main/java/org/apache/spark/examples/streaming/JavaNetworkWordCount.java index b217672..d56134b 100644 --- a/examples/src/main/java/org/apache/spark/examples/streaming/JavaNetworkWordCount.java +++ b/examples/src/main/java/org/apache/spark/examples/streaming/JavaNetworkWordCount.java @@ -57,7 +57,7 @@ public final class JavaNetworkWordCount { JavaStreamingContext ssc = new JavaStreamingContext(sparkConf, Durations.seconds(1)); // Create a JavaReceiverInputDStream on target ip:port and count the -// words in input stream of \n delimited text (eg. generated by 'nc') +// words in input stream of \n delimited text (e.g. generated by 'nc') // Note that no duplication in stor
[spark] branch branch-3.0 updated: [MINOR][GRAPHX][3.0] Correct typos in the sub-modules: graphx, external, and examples
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new 26c0404 [MINOR][GRAPHX][3.0] Correct typos in the sub-modules: graphx, external, and examples 26c0404 is described below commit 26c0404214563bb558662e68ea73357c4f4021ed Author: Josh Soref AuthorDate: Tue Nov 17 15:25:42 2020 +0900 [MINOR][GRAPHX][3.0] Correct typos in the sub-modules: graphx, external, and examples ### What changes were proposed in this pull request? This PR intends to fix typos in the sub-modules: graphx, external, and examples. Split per holdenk https://github.com/apache/spark/pull/30323#issuecomment-725159710 NOTE: The misspellings have been reported at https://github.com/jsoref/spark/commit/706a726f87a0bbf5e31467fae9015218773db85b#commitcomment-44064356 Backport of #30326 ### Why are the changes needed? Misspelled words make it harder to read / understand content. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? No testing was performed Closes #30342 from jsoref/branch-3.0-30326. 
Authored-by: Josh Soref Signed-off-by: Takeshi Yamamuro --- .../apache/spark/examples/streaming/JavaCustomReceiver.java | 2 +- .../spark/examples/streaming/JavaNetworkWordCount.java | 2 +- .../examples/streaming/JavaRecoverableNetworkWordCount.java | 2 +- .../spark/examples/streaming/JavaSqlNetworkWordCount.java| 2 +- examples/src/main/python/ml/train_validation_split.py| 2 +- examples/src/main/python/sql/arrow.py| 4 ++-- .../main/python/streaming/recoverable_network_wordcount.py | 2 +- examples/src/main/python/streaming/sql_network_wordcount.py | 2 +- .../org/apache/spark/examples/streaming/CustomReceiver.scala | 2 +- .../apache/spark/examples/streaming/NetworkWordCount.scala | 2 +- .../examples/streaming/RecoverableNetworkWordCount.scala | 2 +- .../spark/examples/streaming/SqlNetworkWordCount.scala | 2 +- .../spark/examples/streaming/StatefulNetworkWordCount.scala | 2 +- .../apache/spark/sql/jdbc/DockerJDBCIntegrationSuite.scala | 2 +- .../spark/sql/kafka010/KafkaContinuousSourceSuite.scala | 4 ++-- .../spark/sql/kafka010/KafkaMicroBatchSourceSuite.scala | 12 ++-- .../org/apache/spark/sql/kafka010/KafkaRelationSuite.scala | 4 ++-- .../scala/org/apache/spark/sql/kafka010/KafkaTestUtils.scala | 4 ++-- .../org/apache/spark/streaming/kafka010/KafkaRDDSuite.scala | 2 +- .../spark/examples/streaming/JavaKinesisWordCountASL.java| 2 +- .../main/python/examples/streaming/kinesis_wordcount_asl.py | 2 +- .../spark/examples/streaming/KinesisWordCountASL.scala | 6 +++--- .../spark/streaming/kinesis/KinesisUtilsPythonHelper.scala | 2 +- .../scala/org/apache/spark/graphx/lib/PageRankSuite.scala| 6 +++--- 24 files changed, 37 insertions(+), 37 deletions(-) diff --git a/examples/src/main/java/org/apache/spark/examples/streaming/JavaCustomReceiver.java b/examples/src/main/java/org/apache/spark/examples/streaming/JavaCustomReceiver.java index 47692ec..f84a197 100644 --- a/examples/src/main/java/org/apache/spark/examples/streaming/JavaCustomReceiver.java +++ 
b/examples/src/main/java/org/apache/spark/examples/streaming/JavaCustomReceiver.java @@ -67,7 +67,7 @@ public class JavaCustomReceiver extends Receiver { JavaStreamingContext ssc = new JavaStreamingContext(sparkConf, new Duration(1000)); // Create an input stream with the custom receiver on target ip:port and count the -// words in input stream of \n delimited text (eg. generated by 'nc') +// words in input stream of \n delimited text (e.g. generated by 'nc') JavaReceiverInputDStream lines = ssc.receiverStream( new JavaCustomReceiver(args[0], Integer.parseInt(args[1]))); JavaDStream words = lines.flatMap(x -> Arrays.asList(SPACE.split(x)).iterator()); diff --git a/examples/src/main/java/org/apache/spark/examples/streaming/JavaNetworkWordCount.java b/examples/src/main/java/org/apache/spark/examples/streaming/JavaNetworkWordCount.java index b217672..d56134b 100644 --- a/examples/src/main/java/org/apache/spark/examples/streaming/JavaNetworkWordCount.java +++ b/examples/src/main/java/org/apache/spark/examples/streaming/JavaNetworkWordCount.java @@ -57,7 +57,7 @@ public final class JavaNetworkWordCount { JavaStreamingContext ssc = new JavaStreamingContext(sparkConf, Durations.seconds(1)); // Create a JavaReceiverInputDStream on target ip:port and count the -// words in input stream of \n delimited text (eg. generated by 'nc') +// words in input stream of \n delimited text (e.g. generated by 'nc') // Note that no duplication in stor
[spark] branch master updated (9ab0f82 -> f5e3302)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 9ab0f82 [SPARK-23499][MESOS] Support for priority queues in Mesos scheduler add f5e3302 [SPARK-33399][SQL] Normalize output partitioning and sort order with respect to aliases to avoid unneeded exchange/sort nodes No new revisions were added by this update. Summary of changes: .../sql/execution/AliasAwareOutputExpression.scala | 32 +- .../approved-plans-v1_4/q2.sf100/explain.txt | 169 ++- .../approved-plans-v1_4/q2.sf100/simplified.txt| 97 +- .../approved-plans-v1_4/q23a.sf100/explain.txt | 782 +++--- .../approved-plans-v1_4/q23a.sf100/simplified.txt | 155 +-- .../approved-plans-v1_4/q23b.sf100/explain.txt | 1132 ++-- .../approved-plans-v1_4/q23b.sf100/simplified.txt | 241 +++-- .../approved-plans-v1_4/q95.sf100/explain.txt | 350 +++--- .../approved-plans-v1_4/q95.sf100/simplified.txt | 82 +- .../apache/spark/sql/execution/PlannerSuite.scala | 164 +++ 10 files changed, 1718 insertions(+), 1486 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
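The idea behind SPARK-33399 can be sketched outside Spark itself. When a Project introduces `SELECT a AS b`, a child already hash-partitioned by `a` also satisfies a required distribution on `b`, so the planner can skip inserting an extra Exchange. The following is a minimal illustrative sketch, not Spark's actual `AliasAwareOutputExpression` API; the function names and the string-based expression model are assumptions for illustration.

```python
# Hypothetical sketch of alias-aware partitioning normalization (SPARK-33399).
# Expressions are modeled as plain strings; alias_map records aliases
# introduced by a Project node, mapping original expression -> alias name.

def normalize(exprs, alias_map):
    """Rewrite partitioning expressions through project aliases."""
    return [alias_map.get(e, e) for e in exprs]

def needs_exchange(child_partitioning, required, alias_map):
    """Decide whether a shuffle Exchange is needed: compare the required
    distribution against the child's partitioning after normalization."""
    return normalize(child_partitioning, alias_map) != list(required)
```

With this normalization, a requirement on the alias `b` is recognized as already satisfied by a child partitioned on `a`, avoiding a redundant exchange/sort.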
[spark] branch master updated (a70a2b0 -> 82a21d2)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from a70a2b0 [SPARK-33439][INFRA] Use SERIAL_SBT_TESTS=1 for SQL modules add 82a21d2 [SPARK-33433][SQL] Change Aggregate max rows to 1 if grouping is empty No new revisions were added by this update. Summary of changes: .../catalyst/plans/logical/basicLogicalOperators.scala | 8 +++- .../sql/catalyst/optimizer/LimitPushdownSuite.scala| 18 ++ 2 files changed, 25 insertions(+), 1 deletion(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
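The reasoning behind SPARK-33433 is that a global aggregate (one with no grouping expressions) always produces exactly one output row, so the logical plan's `maxRows` can be tightened to 1, which in turn enables optimizations such as limit pushdown. A minimal sketch of that rule follows; it is not Spark's Catalyst code, and the function name and signature are assumptions for illustration.

```python
# Hypothetical sketch of the max-rows rule for Aggregate (SPARK-33433).

def aggregate_max_rows(grouping_exprs, child_max_rows):
    """Upper bound on an Aggregate's output row count.

    With no grouping expressions the aggregate is global and emits
    exactly one row; otherwise it emits at most one row per distinct
    group, bounded above by the child's row count (None = unknown).
    """
    if not grouping_exprs:
        return 1
    return child_max_rows
```

For example, `SELECT count(*) FROM t` yields one row regardless of the size of `t`, while `SELECT k, count(*) FROM t GROUP BY k` is only bounded by the child's row count.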
[spark] branch branch-2.4 updated: [MINOR][GRAPHX][2.4] Correct typos in the sub-modules: graphx, external, and examples
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch branch-2.4 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-2.4 by this push: new 1e177c7 [MINOR][GRAPHX][2.4] Correct typos in the sub-modules: graphx, external, and examples 1e177c7 is described below commit 1e177c73a26967b1effc1c8ba59c2fd57b52951f Author: Josh Soref AuthorDate: Thu Nov 12 21:02:27 2020 +0900 [MINOR][GRAPHX][2.4] Correct typos in the sub-modules: graphx, external, and examples ### What changes were proposed in this pull request? This PR intends to fix typos in the sub-modules: graphx, external, and examples. Split per holdenk https://github.com/apache/spark/pull/30323#issuecomment-725159710 NOTE: The misspellings have been reported at https://github.com/jsoref/spark/commit/706a726f87a0bbf5e31467fae9015218773db85b#commitcomment-44064356 Backport of #30326 ### Why are the changes needed? Misspelled words make it harder to read / understand content. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? No testing was performed Closes #30343 from jsoref/branch-2.4-30326. 
Authored-by: Josh Soref Signed-off-by: Takeshi Yamamuro --- .../apache/spark/examples/streaming/JavaCustomReceiver.java | 2 +- .../spark/examples/streaming/JavaNetworkWordCount.java | 2 +- .../examples/streaming/JavaRecoverableNetworkWordCount.java | 2 +- .../spark/examples/streaming/JavaSqlNetworkWordCount.java| 2 +- examples/src/main/python/ml/train_validation_split.py| 2 +- .../main/python/streaming/recoverable_network_wordcount.py | 2 +- examples/src/main/python/streaming/sql_network_wordcount.py | 2 +- .../org/apache/spark/examples/streaming/CustomReceiver.scala | 2 +- .../apache/spark/examples/streaming/NetworkWordCount.scala | 2 +- .../examples/streaming/RecoverableNetworkWordCount.scala | 2 +- .../spark/examples/streaming/SqlNetworkWordCount.scala | 2 +- .../spark/examples/streaming/StatefulNetworkWordCount.scala | 2 +- .../apache/spark/sql/jdbc/DockerJDBCIntegrationSuite.scala | 2 +- .../spark/sql/kafka010/KafkaContinuousSourceSuite.scala | 4 ++-- .../spark/sql/kafka010/KafkaMicroBatchSourceSuite.scala | 12 ++-- .../org/apache/spark/sql/kafka010/KafkaRelationSuite.scala | 4 ++-- .../scala/org/apache/spark/sql/kafka010/KafkaTestUtils.scala | 4 ++-- .../spark/examples/streaming/JavaKinesisWordCountASL.java| 2 +- .../main/python/examples/streaming/kinesis_wordcount_asl.py | 2 +- .../spark/examples/streaming/KinesisWordCountASL.scala | 6 +++--- .../org/apache/spark/streaming/kinesis/KinesisUtils.scala| 2 +- .../scala/org/apache/spark/graphx/lib/PageRankSuite.scala| 6 +++--- 22 files changed, 34 insertions(+), 34 deletions(-) diff --git a/examples/src/main/java/org/apache/spark/examples/streaming/JavaCustomReceiver.java b/examples/src/main/java/org/apache/spark/examples/streaming/JavaCustomReceiver.java index 47692ec..f84a197 100644 --- a/examples/src/main/java/org/apache/spark/examples/streaming/JavaCustomReceiver.java +++ b/examples/src/main/java/org/apache/spark/examples/streaming/JavaCustomReceiver.java @@ -67,7 +67,7 @@ public class JavaCustomReceiver 
extends Receiver { JavaStreamingContext ssc = new JavaStreamingContext(sparkConf, new Duration(1000)); // Create an input stream with the custom receiver on target ip:port and count the -// words in input stream of \n delimited text (eg. generated by 'nc') +// words in input stream of \n delimited text (e.g. generated by 'nc') JavaReceiverInputDStream lines = ssc.receiverStream( new JavaCustomReceiver(args[0], Integer.parseInt(args[1]))); JavaDStream words = lines.flatMap(x -> Arrays.asList(SPACE.split(x)).iterator()); diff --git a/examples/src/main/java/org/apache/spark/examples/streaming/JavaNetworkWordCount.java b/examples/src/main/java/org/apache/spark/examples/streaming/JavaNetworkWordCount.java index b217672..d56134b 100644 --- a/examples/src/main/java/org/apache/spark/examples/streaming/JavaNetworkWordCount.java +++ b/examples/src/main/java/org/apache/spark/examples/streaming/JavaNetworkWordCount.java @@ -57,7 +57,7 @@ public final class JavaNetworkWordCount { JavaStreamingContext ssc = new JavaStreamingContext(sparkConf, Durations.seconds(1)); // Create a JavaReceiverInputDStream on target ip:port and count the -// words in input stream of \n delimited text (eg. generated by 'nc') +// words in input stream of \n delimited text (e.g. generated by 'nc') // Note that no duplication in storage level only for running locally. // Replication necessary in distributed scenario for fault tolerance. JavaReceiverInputDStream li
[spark] branch branch-2.4 updated: [MINOR][GRAPHX][2.4] Correct typos in the sub-modules: graphx, external, and examples
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch branch-2.4 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-2.4 by this push: new 1e177c7 [MINOR][GRAPHX][2.4] Correct typos in the sub-modules: graphx, external, and examples 1e177c7 is described below commit 1e177c73a26967b1effc1c8ba59c2fd57b52951f Author: Josh Soref AuthorDate: Thu Nov 12 21:02:27 2020 +0900 [MINOR][GRAPHX][2.4] Correct typos in the sub-modules: graphx, external, and examples ### What changes were proposed in this pull request? This PR intends to fix typos in the sub-modules: graphx, external, and examples. Split per holdenk https://github.com/apache/spark/pull/30323#issuecomment-725159710 NOTE: The misspellings have been reported at https://github.com/jsoref/spark/commit/706a726f87a0bbf5e31467fae9015218773db85b#commitcomment-44064356 Backport of #30326 ### Why are the changes needed? Misspelled words make it harder to read / understand content. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? No testing was performed Closes #30343 from jsoref/branch-2.4-30326. 
Authored-by: Josh Soref Signed-off-by: Takeshi Yamamuro --- .../apache/spark/examples/streaming/JavaCustomReceiver.java | 2 +- .../spark/examples/streaming/JavaNetworkWordCount.java | 2 +- .../examples/streaming/JavaRecoverableNetworkWordCount.java | 2 +- .../spark/examples/streaming/JavaSqlNetworkWordCount.java| 2 +- examples/src/main/python/ml/train_validation_split.py| 2 +- .../main/python/streaming/recoverable_network_wordcount.py | 2 +- examples/src/main/python/streaming/sql_network_wordcount.py | 2 +- .../org/apache/spark/examples/streaming/CustomReceiver.scala | 2 +- .../apache/spark/examples/streaming/NetworkWordCount.scala | 2 +- .../examples/streaming/RecoverableNetworkWordCount.scala | 2 +- .../spark/examples/streaming/SqlNetworkWordCount.scala | 2 +- .../spark/examples/streaming/StatefulNetworkWordCount.scala | 2 +- .../apache/spark/sql/jdbc/DockerJDBCIntegrationSuite.scala | 2 +- .../spark/sql/kafka010/KafkaContinuousSourceSuite.scala | 4 ++-- .../spark/sql/kafka010/KafkaMicroBatchSourceSuite.scala | 12 ++-- .../org/apache/spark/sql/kafka010/KafkaRelationSuite.scala | 4 ++-- .../scala/org/apache/spark/sql/kafka010/KafkaTestUtils.scala | 4 ++-- .../spark/examples/streaming/JavaKinesisWordCountASL.java| 2 +- .../main/python/examples/streaming/kinesis_wordcount_asl.py | 2 +- .../spark/examples/streaming/KinesisWordCountASL.scala | 6 +++--- .../org/apache/spark/streaming/kinesis/KinesisUtils.scala| 2 +- .../scala/org/apache/spark/graphx/lib/PageRankSuite.scala| 6 +++--- 22 files changed, 34 insertions(+), 34 deletions(-) diff --git a/examples/src/main/java/org/apache/spark/examples/streaming/JavaCustomReceiver.java b/examples/src/main/java/org/apache/spark/examples/streaming/JavaCustomReceiver.java index 47692ec..f84a197 100644 --- a/examples/src/main/java/org/apache/spark/examples/streaming/JavaCustomReceiver.java +++ b/examples/src/main/java/org/apache/spark/examples/streaming/JavaCustomReceiver.java @@ -67,7 +67,7 @@ public class JavaCustomReceiver 
extends Receiver { JavaStreamingContext ssc = new JavaStreamingContext(sparkConf, new Duration(1000)); // Create an input stream with the custom receiver on target ip:port and count the -// words in input stream of \n delimited text (eg. generated by 'nc') +// words in input stream of \n delimited text (e.g. generated by 'nc') JavaReceiverInputDStream lines = ssc.receiverStream( new JavaCustomReceiver(args[0], Integer.parseInt(args[1]))); JavaDStream words = lines.flatMap(x -> Arrays.asList(SPACE.split(x)).iterator()); diff --git a/examples/src/main/java/org/apache/spark/examples/streaming/JavaNetworkWordCount.java b/examples/src/main/java/org/apache/spark/examples/streaming/JavaNetworkWordCount.java index b217672..d56134b 100644 --- a/examples/src/main/java/org/apache/spark/examples/streaming/JavaNetworkWordCount.java +++ b/examples/src/main/java/org/apache/spark/examples/streaming/JavaNetworkWordCount.java @@ -57,7 +57,7 @@ public final class JavaNetworkWordCount { JavaStreamingContext ssc = new JavaStreamingContext(sparkConf, Durations.seconds(1)); // Create a JavaReceiverInputDStream on target ip:port and count the -// words in input stream of \n delimited text (eg. generated by 'nc') +// words in input stream of \n delimited text (e.g. generated by 'nc') // Note that no duplication in storage level only for running locally. // Replication necessary in distributed scenario for fault tolerance. JavaReceiverInputDStream li
[spark] branch branch-2.4 updated: [MINOR][GRAPHX][2.4] Correct typos in the sub-modules: graphx, external, and examples
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch branch-2.4 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-2.4 by this push: new 1e177c7 [MINOR][GRAPHX][2.4] Correct typos in the sub-modules: graphx, external, and examples 1e177c7 is described below commit 1e177c73a26967b1effc1c8ba59c2fd57b52951f Author: Josh Soref AuthorDate: Thu Nov 12 21:02:27 2020 +0900 [MINOR][GRAPHX][2.4] Correct typos in the sub-modules: graphx, external, and examples ### What changes were proposed in this pull request? This PR intends to fix typos in the sub-modules: graphx, external, and examples. Split per holdenk https://github.com/apache/spark/pull/30323#issuecomment-725159710 NOTE: The misspellings have been reported at https://github.com/jsoref/spark/commit/706a726f87a0bbf5e31467fae9015218773db85b#commitcomment-44064356 Backport of #30326 ### Why are the changes needed? Misspelled words make it harder to read / understand content. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? No testing was performed Closes #30343 from jsoref/branch-2.4-30326. 
Authored-by: Josh Soref Signed-off-by: Takeshi Yamamuro --- .../apache/spark/examples/streaming/JavaCustomReceiver.java | 2 +- .../spark/examples/streaming/JavaNetworkWordCount.java | 2 +- .../examples/streaming/JavaRecoverableNetworkWordCount.java | 2 +- .../spark/examples/streaming/JavaSqlNetworkWordCount.java| 2 +- examples/src/main/python/ml/train_validation_split.py| 2 +- .../main/python/streaming/recoverable_network_wordcount.py | 2 +- examples/src/main/python/streaming/sql_network_wordcount.py | 2 +- .../org/apache/spark/examples/streaming/CustomReceiver.scala | 2 +- .../apache/spark/examples/streaming/NetworkWordCount.scala | 2 +- .../examples/streaming/RecoverableNetworkWordCount.scala | 2 +- .../spark/examples/streaming/SqlNetworkWordCount.scala | 2 +- .../spark/examples/streaming/StatefulNetworkWordCount.scala | 2 +- .../apache/spark/sql/jdbc/DockerJDBCIntegrationSuite.scala | 2 +- .../spark/sql/kafka010/KafkaContinuousSourceSuite.scala | 4 ++-- .../spark/sql/kafka010/KafkaMicroBatchSourceSuite.scala | 12 ++-- .../org/apache/spark/sql/kafka010/KafkaRelationSuite.scala | 4 ++-- .../scala/org/apache/spark/sql/kafka010/KafkaTestUtils.scala | 4 ++-- .../spark/examples/streaming/JavaKinesisWordCountASL.java| 2 +- .../main/python/examples/streaming/kinesis_wordcount_asl.py | 2 +- .../spark/examples/streaming/KinesisWordCountASL.scala | 6 +++--- .../org/apache/spark/streaming/kinesis/KinesisUtils.scala| 2 +- .../scala/org/apache/spark/graphx/lib/PageRankSuite.scala| 6 +++--- 22 files changed, 34 insertions(+), 34 deletions(-) diff --git a/examples/src/main/java/org/apache/spark/examples/streaming/JavaCustomReceiver.java b/examples/src/main/java/org/apache/spark/examples/streaming/JavaCustomReceiver.java index 47692ec..f84a197 100644 --- a/examples/src/main/java/org/apache/spark/examples/streaming/JavaCustomReceiver.java +++ b/examples/src/main/java/org/apache/spark/examples/streaming/JavaCustomReceiver.java @@ -67,7 +67,7 @@ public class JavaCustomReceiver 
extends Receiver { JavaStreamingContext ssc = new JavaStreamingContext(sparkConf, new Duration(1000)); // Create an input stream with the custom receiver on target ip:port and count the -// words in input stream of \n delimited text (eg. generated by 'nc') +// words in input stream of \n delimited text (e.g. generated by 'nc') JavaReceiverInputDStream lines = ssc.receiverStream( new JavaCustomReceiver(args[0], Integer.parseInt(args[1]))); JavaDStream words = lines.flatMap(x -> Arrays.asList(SPACE.split(x)).iterator()); diff --git a/examples/src/main/java/org/apache/spark/examples/streaming/JavaNetworkWordCount.java b/examples/src/main/java/org/apache/spark/examples/streaming/JavaNetworkWordCount.java index b217672..d56134b 100644 --- a/examples/src/main/java/org/apache/spark/examples/streaming/JavaNetworkWordCount.java +++ b/examples/src/main/java/org/apache/spark/examples/streaming/JavaNetworkWordCount.java @@ -57,7 +57,7 @@ public final class JavaNetworkWordCount { JavaStreamingContext ssc = new JavaStreamingContext(sparkConf, Durations.seconds(1)); // Create a JavaReceiverInputDStream on target ip:port and count the -// words in input stream of \n delimited text (eg. generated by 'nc') +// words in input stream of \n delimited text (e.g. generated by 'nc') // Note that no duplication in storage level only for running locally. // Replication necessary in distributed scenario for fault tolerance. JavaReceiverInputDStream li
[spark] branch branch-2.4 updated: [MINOR][GRAPHX][2.4] Correct typos in the sub-modules: graphx, external, and examples
[spark] branch branch-2.4 updated: [MINOR][GRAPHX][2.4] Correct typos in the sub-modules: graphx, external, and examples
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a commit to branch branch-2.4
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-2.4 by this push:
     new 1e177c7  [MINOR][GRAPHX][2.4] Correct typos in the sub-modules: graphx, external, and examples
1e177c7 is described below

commit 1e177c73a26967b1effc1c8ba59c2fd57b52951f
Author: Josh Soref
AuthorDate: Thu Nov 12 21:02:27 2020 +0900

    [MINOR][GRAPHX][2.4] Correct typos in the sub-modules: graphx, external, and examples

    ### What changes were proposed in this pull request?

    This PR intends to fix typos in the sub-modules: graphx, external, and examples. Split per holdenk https://github.com/apache/spark/pull/30323#issuecomment-725159710

    NOTE: The misspellings have been reported at https://github.com/jsoref/spark/commit/706a726f87a0bbf5e31467fae9015218773db85b#commitcomment-44064356

    Backport of #30326

    ### Why are the changes needed?

    Misspelled words make it harder to read / understand content.

    ### Does this PR introduce _any_ user-facing change?

    No

    ### How was this patch tested?

    No testing was performed

    Closes #30343 from jsoref/branch-2.4-30326.

Authored-by: Josh Soref
Signed-off-by: Takeshi Yamamuro
---
 .../apache/spark/examples/streaming/JavaCustomReceiver.java  | 2 +-
 .../spark/examples/streaming/JavaNetworkWordCount.java       | 2 +-
 .../examples/streaming/JavaRecoverableNetworkWordCount.java  | 2 +-
 .../spark/examples/streaming/JavaSqlNetworkWordCount.java    | 2 +-
 examples/src/main/python/ml/train_validation_split.py        | 2 +-
 .../main/python/streaming/recoverable_network_wordcount.py   | 2 +-
 examples/src/main/python/streaming/sql_network_wordcount.py  | 2 +-
 .../org/apache/spark/examples/streaming/CustomReceiver.scala | 2 +-
 .../apache/spark/examples/streaming/NetworkWordCount.scala   | 2 +-
 .../examples/streaming/RecoverableNetworkWordCount.scala     | 2 +-
 .../spark/examples/streaming/SqlNetworkWordCount.scala       | 2 +-
 .../spark/examples/streaming/StatefulNetworkWordCount.scala  | 2 +-
 .../apache/spark/sql/jdbc/DockerJDBCIntegrationSuite.scala   | 2 +-
 .../spark/sql/kafka010/KafkaContinuousSourceSuite.scala      | 4 ++--
 .../spark/sql/kafka010/KafkaMicroBatchSourceSuite.scala      | 12 ++--
 .../org/apache/spark/sql/kafka010/KafkaRelationSuite.scala   | 4 ++--
 .../scala/org/apache/spark/sql/kafka010/KafkaTestUtils.scala | 4 ++--
 .../spark/examples/streaming/JavaKinesisWordCountASL.java    | 2 +-
 .../main/python/examples/streaming/kinesis_wordcount_asl.py  | 2 +-
 .../spark/examples/streaming/KinesisWordCountASL.scala       | 6 +++---
 .../org/apache/spark/streaming/kinesis/KinesisUtils.scala    | 2 +-
 .../scala/org/apache/spark/graphx/lib/PageRankSuite.scala    | 6 +++---
 22 files changed, 34 insertions(+), 34 deletions(-)

diff --git a/examples/src/main/java/org/apache/spark/examples/streaming/JavaCustomReceiver.java b/examples/src/main/java/org/apache/spark/examples/streaming/JavaCustomReceiver.java
index 47692ec..f84a197 100644
--- a/examples/src/main/java/org/apache/spark/examples/streaming/JavaCustomReceiver.java
+++ b/examples/src/main/java/org/apache/spark/examples/streaming/JavaCustomReceiver.java
@@ -67,7 +67,7 @@ public class JavaCustomReceiver extends Receiver {
     JavaStreamingContext ssc = new JavaStreamingContext(sparkConf, new Duration(1000));

     // Create an input stream with the custom receiver on target ip:port and count the
-    // words in input stream of \n delimited text (eg. generated by 'nc')
+    // words in input stream of \n delimited text (e.g. generated by 'nc')
     JavaReceiverInputDStream lines = ssc.receiverStream(
       new JavaCustomReceiver(args[0], Integer.parseInt(args[1])));
     JavaDStream words = lines.flatMap(x -> Arrays.asList(SPACE.split(x)).iterator());

diff --git a/examples/src/main/java/org/apache/spark/examples/streaming/JavaNetworkWordCount.java b/examples/src/main/java/org/apache/spark/examples/streaming/JavaNetworkWordCount.java
index b217672..d56134b 100644
--- a/examples/src/main/java/org/apache/spark/examples/streaming/JavaNetworkWordCount.java
+++ b/examples/src/main/java/org/apache/spark/examples/streaming/JavaNetworkWordCount.java
@@ -57,7 +57,7 @@ public final class JavaNetworkWordCount {
     JavaStreamingContext ssc = new JavaStreamingContext(sparkConf, Durations.seconds(1));

     // Create a JavaReceiverInputDStream on target ip:port and count the
-    // words in input stream of \n delimited text (eg. generated by 'nc')
+    // words in input stream of \n delimited text (e.g. generated by 'nc')
     // Note that no duplication in storage level only for running locally.
     // Replication necessary in distributed scenario for fault tolerance.
     JavaReceiverInputDStream li
[spark] branch branch-3.0 updated (2eadedc -> 5ee76e6)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git.

 from 2eadedc  [SPARK-33408][K8S][R][3.0] Use R 3.6.3 in K8s R image
  add 5ee76e6  [MINOR][DOC] spark.executor.memoryOverhead is not cluster-mode only

No new revisions were added by this update.

Summary of changes:
 core/src/main/scala/org/apache/spark/internal/config/package.scala | 4 ++--
 docs/configuration.md | 7 +++
 2 files changed, 5 insertions(+), 6 deletions(-)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (6d31dae -> 4335af0)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

 from 6d31dae  [SPARK-33386][SQL] Accessing array elements in ElementAt/Elt/GetArrayItem should failed if index is out of bound
  add 4335af0  [MINOR][DOC] spark.executor.memoryOverhead is not cluster-mode only

No new revisions were added by this update.

Summary of changes:
 core/src/main/scala/org/apache/spark/internal/config/package.scala | 4 ++--
 docs/configuration.md | 7 +++
 2 files changed, 5 insertions(+), 6 deletions(-)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (318a173 -> 9d58a2f)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

 from 318a173  [SPARK-33402][CORE] Jobs launched in same second have duplicate MapReduce JobIDs
  add 9d58a2f  [MINOR][GRAPHX] Correct typos in the sub-modules: graphx, external, and examples

No new revisions were added by this update.

Summary of changes:
 .../apache/spark/examples/streaming/JavaCustomReceiver.java  | 2 +-
 .../spark/examples/streaming/JavaNetworkWordCount.java       | 2 +-
 .../examples/streaming/JavaRecoverableNetworkWordCount.java  | 2 +-
 .../spark/examples/streaming/JavaSqlNetworkWordCount.java    | 2 +-
 examples/src/main/python/ml/train_validation_split.py        | 2 +-
 examples/src/main/python/sql/arrow.py                        | 4 ++--
 .../main/python/streaming/recoverable_network_wordcount.py   | 2 +-
 examples/src/main/python/streaming/sql_network_wordcount.py  | 2 +-
 .../org/apache/spark/examples/streaming/CustomReceiver.scala | 2 +-
 .../apache/spark/examples/streaming/NetworkWordCount.scala   | 2 +-
 .../examples/streaming/RecoverableNetworkWordCount.scala     | 2 +-
 .../spark/examples/streaming/SqlNetworkWordCount.scala       | 2 +-
 .../spark/examples/streaming/StatefulNetworkWordCount.scala  | 2 +-
 .../apache/spark/sql/jdbc/DockerJDBCIntegrationSuite.scala   | 2 +-
 .../test/scala/org/apache/spark/sql/jdbc/v2/V2JDBCTest.scala | 2 +-
 .../spark/sql/kafka010/KafkaContinuousSourceSuite.scala      | 4 ++--
 .../spark/sql/kafka010/KafkaMicroBatchSourceSuite.scala      | 12 ++--
 .../org/apache/spark/sql/kafka010/KafkaRelationSuite.scala   | 4 ++--
 .../scala/org/apache/spark/sql/kafka010/KafkaTestUtils.scala | 4 ++--
 .../org/apache/spark/streaming/kafka010/KafkaRDDSuite.scala  | 2 +-
 .../spark/examples/streaming/JavaKinesisWordCountASL.java    | 2 +-
 .../main/python/examples/streaming/kinesis_wordcount_asl.py  | 2 +-
 .../spark/examples/streaming/KinesisWordCountASL.scala       | 6 +++---
 .../spark/streaming/kinesis/KinesisUtilsPythonHelper.scala   | 2 +-
 .../scala/org/apache/spark/graphx/lib/PageRankSuite.scala    | 6 +++---
 25 files changed, 38 insertions(+), 38 deletions(-)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated: [MINOR][GRAPHX] Correct typos in the sub-modules: graphx, external, and examples
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 9d58a2f  [MINOR][GRAPHX] Correct typos in the sub-modules: graphx, external, and examples
9d58a2f is described below

commit 9d58a2f0f0f308a03830bf183959a4743a77b78a
Author: Josh Soref
AuthorDate: Thu Nov 12 08:29:22 2020 +0900

    [MINOR][GRAPHX] Correct typos in the sub-modules: graphx, external, and examples

    ### What changes were proposed in this pull request?

    This PR intends to fix typos in the sub-modules: graphx, external, and examples. Split per holdenk https://github.com/apache/spark/pull/30323#issuecomment-725159710

    NOTE: The misspellings have been reported at https://github.com/jsoref/spark/commit/706a726f87a0bbf5e31467fae9015218773db85b#commitcomment-44064356

    ### Why are the changes needed?

    Misspelled words make it harder to read / understand content.

    ### Does this PR introduce _any_ user-facing change?

    No

    ### How was this patch tested?

    No testing was performed

    Closes #30326 from jsoref/spelling-graphx.

Authored-by: Josh Soref
Signed-off-by: Takeshi Yamamuro
---
 .../apache/spark/examples/streaming/JavaCustomReceiver.java  | 2 +-
 .../spark/examples/streaming/JavaNetworkWordCount.java       | 2 +-
 .../examples/streaming/JavaRecoverableNetworkWordCount.java  | 2 +-
 .../spark/examples/streaming/JavaSqlNetworkWordCount.java    | 2 +-
 examples/src/main/python/ml/train_validation_split.py        | 2 +-
 examples/src/main/python/sql/arrow.py                        | 4 ++--
 .../main/python/streaming/recoverable_network_wordcount.py   | 2 +-
 examples/src/main/python/streaming/sql_network_wordcount.py  | 2 +-
 .../org/apache/spark/examples/streaming/CustomReceiver.scala | 2 +-
 .../apache/spark/examples/streaming/NetworkWordCount.scala   | 2 +-
 .../examples/streaming/RecoverableNetworkWordCount.scala     | 2 +-
 .../spark/examples/streaming/SqlNetworkWordCount.scala       | 2 +-
 .../spark/examples/streaming/StatefulNetworkWordCount.scala  | 2 +-
 .../apache/spark/sql/jdbc/DockerJDBCIntegrationSuite.scala   | 2 +-
 .../test/scala/org/apache/spark/sql/jdbc/v2/V2JDBCTest.scala | 2 +-
 .../spark/sql/kafka010/KafkaContinuousSourceSuite.scala      | 4 ++--
 .../spark/sql/kafka010/KafkaMicroBatchSourceSuite.scala      | 12 ++--
 .../org/apache/spark/sql/kafka010/KafkaRelationSuite.scala   | 4 ++--
 .../scala/org/apache/spark/sql/kafka010/KafkaTestUtils.scala | 4 ++--
 .../org/apache/spark/streaming/kafka010/KafkaRDDSuite.scala  | 2 +-
 .../spark/examples/streaming/JavaKinesisWordCountASL.java    | 2 +-
 .../main/python/examples/streaming/kinesis_wordcount_asl.py  | 2 +-
 .../spark/examples/streaming/KinesisWordCountASL.scala       | 6 +++---
 .../spark/streaming/kinesis/KinesisUtilsPythonHelper.scala   | 2 +-
 .../scala/org/apache/spark/graphx/lib/PageRankSuite.scala    | 6 +++---
 25 files changed, 38 insertions(+), 38 deletions(-)

diff --git a/examples/src/main/java/org/apache/spark/examples/streaming/JavaCustomReceiver.java b/examples/src/main/java/org/apache/spark/examples/streaming/JavaCustomReceiver.java
index 47692ec..f84a197 100644
--- a/examples/src/main/java/org/apache/spark/examples/streaming/JavaCustomReceiver.java
+++ b/examples/src/main/java/org/apache/spark/examples/streaming/JavaCustomReceiver.java
@@ -67,7 +67,7 @@ public class JavaCustomReceiver extends Receiver {
     JavaStreamingContext ssc = new JavaStreamingContext(sparkConf, new Duration(1000));

     // Create an input stream with the custom receiver on target ip:port and count the
-    // words in input stream of \n delimited text (eg. generated by 'nc')
+    // words in input stream of \n delimited text (e.g. generated by 'nc')
     JavaReceiverInputDStream lines = ssc.receiverStream(
       new JavaCustomReceiver(args[0], Integer.parseInt(args[1])));
     JavaDStream words = lines.flatMap(x -> Arrays.asList(SPACE.split(x)).iterator());

diff --git a/examples/src/main/java/org/apache/spark/examples/streaming/JavaNetworkWordCount.java b/examples/src/main/java/org/apache/spark/examples/streaming/JavaNetworkWordCount.java
index b217672..d56134b 100644
--- a/examples/src/main/java/org/apache/spark/examples/streaming/JavaNetworkWordCount.java
+++ b/examples/src/main/java/org/apache/spark/examples/streaming/JavaNetworkWordCount.java
@@ -57,7 +57,7 @@ public final class JavaNetworkWordCount {
     JavaStreamingContext ssc = new JavaStreamingContext(sparkConf, Durations.seconds(1));

     // Create a JavaReceiverInputDStream on target ip:port and count the
-    // words in input stream of \n delimited text (eg. generated by 'nc')
+    // words in input stream of \n delimited text (e.g. generated by 'nc')
    // N
[spark] branch branch-2.4 updated: [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a commit to branch branch-2.4
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-2.4 by this push:
     new fece4a3  [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark
fece4a3 is described below

commit fece4a3a36e23c7b99d6cb64e0c4484c9e17235f
Author: Takeshi Yamamuro
AuthorDate: Wed Nov 11 15:24:05 2020 +0900

    [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark

    ### What changes were proposed in this pull request?

    This PR intends to fix the behaviour of query filters in `TPCDSQueryBenchmark`. We can use an option `--query-filter` for selecting TPCDS queries to run, e.g., `--query-filter q6,q8,q13`. But, the current master has a weird behaviour about the option. For example, if we pass `--query-filter q6` so as to run the TPCDS q6 only, `TPCDSQueryBenchmark` runs `q6` and `q6-v2.7` because the `filterQueries` method does not respect the name suffix. So, there is no way now to run the TPCDS q6 only.

    ### Why are the changes needed?

    Bugfix.

    ### Does this PR introduce _any_ user-facing change?

    No.

    ### How was this patch tested?

    Manually checked.

    Closes #30324 from maropu/FilterBugInTPCDSQueryBenchmark.

Authored-by: Takeshi Yamamuro
Signed-off-by: Takeshi Yamamuro
(cherry picked from commit 4b367976a877adb981f65d546e1522fdf30d0731)
Signed-off-by: Takeshi Yamamuro
---
 .../execution/benchmark/TPCDSQueryBenchmark.scala | 21 ++---
 1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala b/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala
index fccee97..1f8b057 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala
@@ -90,11 +90,16 @@ object TPCDSQueryBenchmark extends Logging {
     }
   }

-  def filterQueries(
+  private def filterQueries(
       origQueries: Seq[String],
-      args: TPCDSQueryBenchmarkArguments): Seq[String] = {
-    if (args.queryFilter.nonEmpty) {
-      origQueries.filter(args.queryFilter.contains)
+      queryFilter: Set[String],
+      nameSuffix: String = ""): Seq[String] = {
+    if (queryFilter.nonEmpty) {
+      if (nameSuffix.nonEmpty) {
+        origQueries.filter { name => queryFilter.contains(s"$name$nameSuffix") }
+      } else {
+        origQueries.filter(queryFilter.contains)
+      }
     } else {
       origQueries
     }
@@ -117,6 +122,7 @@ object TPCDSQueryBenchmark extends Logging {
       "q91", "q92", "q93", "q94", "q95", "q96", "q97", "q98", "q99")

     // This list only includes TPC-DS v2.7 queries that are different from v1.4 ones
+    val nameSuffixForQueriesV2_7 = "-v2.7"
     val tpcdsQueriesV2_7 = Seq(
       "q5a", "q6", "q10a", "q11", "q12", "q14", "q14a", "q18a", "q20",
       "q22", "q22a", "q24", "q27a", "q34", "q35", "q35a", "q36a", "q47", "q49",
@@ -124,8 +130,9 @@ object TPCDSQueryBenchmark extends Logging {
       "q80a", "q86a", "q98")

     // If `--query-filter` defined, filters the queries that this option selects
-    val queriesV1_4ToRun = filterQueries(tpcdsQueries, benchmarkArgs)
-    val queriesV2_7ToRun = filterQueries(tpcdsQueriesV2_7, benchmarkArgs)
+    val queriesV1_4ToRun = filterQueries(tpcdsQueries, benchmarkArgs.queryFilter)
+    val queriesV2_7ToRun = filterQueries(tpcdsQueriesV2_7, benchmarkArgs.queryFilter,
+      nameSuffix = nameSuffixForQueriesV2_7)

     if ((queriesV1_4ToRun ++ queriesV2_7ToRun).isEmpty) {
       throw new RuntimeException(
@@ -135,6 +142,6 @@ object TPCDSQueryBenchmark extends Logging {
     val tableSizes = setupTables(benchmarkArgs.dataLocation)
     runTpcdsQueries(queryLocation = "tpcds", queries = queriesV1_4ToRun, tableSizes)
     runTpcdsQueries(queryLocation = "tpcds-v2.7.0", queries = queriesV2_7ToRun, tableSizes,
-      nameSuffix = "-v2.7")
+      nameSuffix = nameSuffixForQueriesV2_7)
   }
 }

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
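[Editor's note] The `filterQueries` fix in the SPARK-33417 patch above is small enough to illustrate on its own. The following is a hedged Python sketch (the real method is Scala, in `TPCDSQueryBenchmark`; the function and variable names here merely mirror it) showing why `--query-filter q6` now selects only the v1.4 `q6`, while the v2.7 variant must be requested explicitly as `q6-v2.7`:

```python
def filter_queries(orig_queries, query_filter, name_suffix=""):
    """Keep only the queries selected by --query-filter.

    A non-empty name_suffix means each candidate's *reported* name is
    query + suffix (e.g. "q6" + "-v2.7"), so the filter is matched
    against that full name rather than the bare query id.
    """
    if not query_filter:
        return orig_queries  # no filter: run everything
    if name_suffix:
        return [q for q in orig_queries if q + name_suffix in query_filter]
    return [q for q in orig_queries if q in query_filter]


v1_4 = ["q6", "q8"]
v2_7 = ["q6", "q14a"]  # v2.7 variants are reported with a "-v2.7" suffix

# "--query-filter q6" now selects only the v1.4 q6:
print(filter_queries(v1_4, {"q6"}))                            # ['q6']
print(filter_queries(v2_7, {"q6"}, name_suffix="-v2.7"))       # []

# "--query-filter q6-v2.7" selects only the v2.7 variant:
print(filter_queries(v2_7, {"q6-v2.7"}, name_suffix="-v2.7"))  # ['q6']
```

Before the patch, both calls were filtered against the bare query id, so `q6` matched in both lists and there was no way to run only one variant.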
[spark] branch branch-3.0 updated (4a1c143 -> 577dbb9)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git.

 from 4a1c143  [SPARK-9][PYTHON] Pyspark application will hang due to non Exception error
  add 577dbb9  [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark

No new revisions were added by this update.

Summary of changes:
 .../execution/benchmark/TPCDSQueryBenchmark.scala | 21 ++---
 1 file changed, 14 insertions(+), 7 deletions(-)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-2.4 updated: [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark

This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a commit to branch branch-2.4
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-2.4 by this push:
     new fece4a3  [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark

fece4a3 is described below

commit fece4a3a36e23c7b99d6cb64e0c4484c9e17235f
Author: Takeshi Yamamuro
AuthorDate: Wed Nov 11 15:24:05 2020 +0900

    [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark

    ### What changes were proposed in this pull request?

    This PR intends to fix the behaviour of query filters in `TPCDSQueryBenchmark`. We can use an option `--query-filter` for selecting TPCDS queries to run, e.g., `--query-filter q6,q8,q13`. But, the current master has a weird behaviour about the option. For example, if we pass `--query-filter q6` so as to run the TPCDS q6 only, `TPCDSQueryBenchmark` runs `q6` and `q6-v2.7` because the `filterQueries` method does not respect the name suffix. So, there is no way now to run the TPCDS q6 only.

    ### Why are the changes needed?

    Bugfix.

    ### Does this PR introduce _any_ user-facing change?

    No.

    ### How was this patch tested?

    Manually checked.

    Closes #30324 from maropu/FilterBugInTPCDSQueryBenchmark.

Authored-by: Takeshi Yamamuro
Signed-off-by: Takeshi Yamamuro
(cherry picked from commit 4b367976a877adb981f65d546e1522fdf30d0731)
Signed-off-by: Takeshi Yamamuro
---
 .../execution/benchmark/TPCDSQueryBenchmark.scala | 21 ++++++++++++++-------
 1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala b/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala
index fccee97..1f8b057 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala
@@ -90,11 +90,16 @@ object TPCDSQueryBenchmark extends Logging {
     }
   }
 
-  def filterQueries(
+  private def filterQueries(
      origQueries: Seq[String],
-      args: TPCDSQueryBenchmarkArguments): Seq[String] = {
-    if (args.queryFilter.nonEmpty) {
-      origQueries.filter(args.queryFilter.contains)
+      queryFilter: Set[String],
+      nameSuffix: String = ""): Seq[String] = {
+    if (queryFilter.nonEmpty) {
+      if (nameSuffix.nonEmpty) {
+        origQueries.filter { name => queryFilter.contains(s"$name$nameSuffix") }
+      } else {
+        origQueries.filter(queryFilter.contains)
+      }
     } else {
       origQueries
     }
@@ -117,6 +122,7 @@ object TPCDSQueryBenchmark extends Logging {
       "q91", "q92", "q93", "q94", "q95", "q96", "q97", "q98", "q99")
 
     // This list only includes TPC-DS v2.7 queries that are different from v1.4 ones
+    val nameSuffixForQueriesV2_7 = "-v2.7"
     val tpcdsQueriesV2_7 = Seq(
       "q5a", "q6", "q10a", "q11", "q12", "q14", "q14a", "q18a", "q20",
       "q22", "q22a", "q24", "q27a", "q34", "q35", "q35a", "q36a", "q47", "q49",
@@ -124,8 +130,9 @@ object TPCDSQueryBenchmark extends Logging {
       "q80a", "q86a", "q98")
 
     // If `--query-filter` defined, filters the queries that this option selects
-    val queriesV1_4ToRun = filterQueries(tpcdsQueries, benchmarkArgs)
-    val queriesV2_7ToRun = filterQueries(tpcdsQueriesV2_7, benchmarkArgs)
+    val queriesV1_4ToRun = filterQueries(tpcdsQueries, benchmarkArgs.queryFilter)
+    val queriesV2_7ToRun = filterQueries(tpcdsQueriesV2_7, benchmarkArgs.queryFilter,
+      nameSuffix = nameSuffixForQueriesV2_7)
 
     if ((queriesV1_4ToRun ++ queriesV2_7ToRun).isEmpty) {
       throw new RuntimeException(
@@ -135,6 +142,6 @@ object TPCDSQueryBenchmark extends Logging {
     val tableSizes = setupTables(benchmarkArgs.dataLocation)
     runTpcdsQueries(queryLocation = "tpcds", queries = queriesV1_4ToRun, tableSizes)
     runTpcdsQueries(queryLocation = "tpcds-v2.7.0", queries = queriesV2_7ToRun, tableSizes,
-      nameSuffix = "-v2.7")
+      nameSuffix = nameSuffixForQueriesV2_7)
   }
 }
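The patched `filterQueries` above is identical on every branch; the sketch below (a standalone illustration, not the actual Spark source file, with hypothetical object and query names) shows how the new `nameSuffix` parameter makes `--query-filter` matching suffix-aware, so that `q6` no longer also selects `q6-v2.7`:

```scala
// Sketch of the suffix-aware filtering introduced by SPARK-33417.
// A v2.7 query is displayed as e.g. "q6-v2.7", so the filter must
// contain the suffixed name for it to be selected.
object FilterQueriesSketch {
  def filterQueries(
      origQueries: Seq[String],
      queryFilter: Set[String],
      nameSuffix: String = ""): Seq[String] = {
    if (queryFilter.nonEmpty) {
      if (nameSuffix.nonEmpty) {
        // Match against the suffixed display name, not the bare base name.
        origQueries.filter { name => queryFilter.contains(s"$name$nameSuffix") }
      } else {
        origQueries.filter(queryFilter.contains)
      }
    } else {
      origQueries  // no filter given: run everything
    }
  }

  def main(args: Array[String]): Unit = {
    val v1_4 = Seq("q6", "q8", "q13")
    val v2_7 = Seq("q6", "q14a")
    // `--query-filter q6` now selects only the v1.4 q6 ...
    println(filterQueries(v1_4, Set("q6")))               // List(q6)
    println(filterQueries(v2_7, Set("q6"), "-v2.7"))      // List()
    // ... while `--query-filter q6-v2.7` selects only the v2.7 variant.
    println(filterQueries(v2_7, Set("q6-v2.7"), "-v2.7")) // List(q6)
  }
}
```

Before this change both calls passed the whole `benchmarkArgs` object and ignored the suffix, which is why `--query-filter q6` ran both `q6` and `q6-v2.7`.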
[spark] branch branch-3.0 updated: [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark

This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new 577dbb9  [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark

577dbb9 is described below

commit 577dbb96835f13f4cd92ea4caab9e6dece00be50
Author: Takeshi Yamamuro
AuthorDate: Wed Nov 11 15:24:05 2020 +0900

    [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark

    ### What changes were proposed in this pull request?

    This PR intends to fix the behaviour of query filters in `TPCDSQueryBenchmark`. We can use an option `--query-filter` for selecting TPCDS queries to run, e.g., `--query-filter q6,q8,q13`. But, the current master has a weird behaviour about the option. For example, if we pass `--query-filter q6` so as to run the TPCDS q6 only, `TPCDSQueryBenchmark` runs `q6` and `q6-v2.7` because the `filterQueries` method does not respect the name suffix. So, there is no way now to run the TPCDS q6 only.

    ### Why are the changes needed?

    Bugfix.

    ### Does this PR introduce _any_ user-facing change?

    No.

    ### How was this patch tested?

    Manually checked.

    Closes #30324 from maropu/FilterBugInTPCDSQueryBenchmark.

Authored-by: Takeshi Yamamuro
Signed-off-by: Takeshi Yamamuro
(cherry picked from commit 4b367976a877adb981f65d546e1522fdf30d0731)
Signed-off-by: Takeshi Yamamuro
---
 .../execution/benchmark/TPCDSQueryBenchmark.scala | 21 ++++++++++++++-------
 1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala b/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala
index 7bbf079..43bc7c1 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala
@@ -98,11 +98,16 @@ object TPCDSQueryBenchmark extends SqlBasedBenchmark {
     }
   }
 
-  def filterQueries(
+  private def filterQueries(
      origQueries: Seq[String],
-      args: TPCDSQueryBenchmarkArguments): Seq[String] = {
-    if (args.queryFilter.nonEmpty) {
-      origQueries.filter(args.queryFilter.contains)
+      queryFilter: Set[String],
+      nameSuffix: String = ""): Seq[String] = {
+    if (queryFilter.nonEmpty) {
+      if (nameSuffix.nonEmpty) {
+        origQueries.filter { name => queryFilter.contains(s"$name$nameSuffix") }
+      } else {
+        origQueries.filter(queryFilter.contains)
+      }
     } else {
       origQueries
     }
@@ -125,6 +130,7 @@ object TPCDSQueryBenchmark extends SqlBasedBenchmark {
       "q91", "q92", "q93", "q94", "q95", "q96", "q97", "q98", "q99")
 
     // This list only includes TPC-DS v2.7 queries that are different from v1.4 ones
+    val nameSuffixForQueriesV2_7 = "-v2.7"
     val tpcdsQueriesV2_7 = Seq(
       "q5a", "q6", "q10a", "q11", "q12", "q14", "q14a", "q18a", "q20",
       "q22", "q22a", "q24", "q27a", "q34", "q35", "q35a", "q36a", "q47", "q49",
@@ -132,8 +138,9 @@ object TPCDSQueryBenchmark extends SqlBasedBenchmark {
       "q80a", "q86a", "q98")
 
     // If `--query-filter` defined, filters the queries that this option selects
-    val queriesV1_4ToRun = filterQueries(tpcdsQueries, benchmarkArgs)
-    val queriesV2_7ToRun = filterQueries(tpcdsQueriesV2_7, benchmarkArgs)
+    val queriesV1_4ToRun = filterQueries(tpcdsQueries, benchmarkArgs.queryFilter)
+    val queriesV2_7ToRun = filterQueries(tpcdsQueriesV2_7, benchmarkArgs.queryFilter,
+      nameSuffix = nameSuffixForQueriesV2_7)
 
     if ((queriesV1_4ToRun ++ queriesV2_7ToRun).isEmpty) {
       throw new RuntimeException(
@@ -143,6 +150,6 @@ object TPCDSQueryBenchmark extends SqlBasedBenchmark {
     val tableSizes = setupTables(benchmarkArgs.dataLocation)
     runTpcdsQueries(queryLocation = "tpcds", queries = queriesV1_4ToRun, tableSizes)
     runTpcdsQueries(queryLocation = "tpcds-v2.7.0", queries = queriesV2_7ToRun, tableSizes,
-      nameSuffix = "-v2.7")
+      nameSuffix = nameSuffixForQueriesV2_7)
   }
 }
[spark] branch master updated (6d5d030 -> 4b36797)

This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

 from 6d5d030  [SPARK-33414][SQL] Migrate SHOW CREATE TABLE command to use UnresolvedTableOrView to resolve the identifier
  add 4b36797  [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark

No new revisions were added by this update.

Summary of changes:
 .../execution/benchmark/TPCDSQueryBenchmark.scala | 21 ++++++++++++++-------
 1 file changed, 14 insertions(+), 7 deletions(-)
[spark] branch branch-2.4 updated: [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch branch-2.4 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-2.4 by this push: new fece4a3 [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark fece4a3 is described below commit fece4a3a36e23c7b99d6cb64e0c4484c9e17235f Author: Takeshi Yamamuro AuthorDate: Wed Nov 11 15:24:05 2020 +0900 [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark ### What changes were proposed in this pull request? This PR intends to fix the behaviour of query filters in `TPCDSQueryBenchmark`. We can use an option `--query-filter` for selecting TPCDS queries to run, e.g., `--query-filter q6,q8,q13`. But, the current master has a weird behaviour about the option. For example, if we pass `--query-filter q6` so as to run the TPCDS q6 only, `TPCDSQueryBenchmark` runs `q6` and `q6-v2.7` because the `filterQueries` method does not respect the name suffix. So, there is no way now to run the TPCDS q6 only. ### Why are the changes needed? Bugfix. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Manually checked. Closes #30324 from maropu/FilterBugInTPCDSQueryBenchmark. 
Authored-by: Takeshi Yamamuro Signed-off-by: Takeshi Yamamuro (cherry picked from commit 4b367976a877adb981f65d546e1522fdf30d0731) Signed-off-by: Takeshi Yamamuro --- .../execution/benchmark/TPCDSQueryBenchmark.scala | 21 ++--- 1 file changed, 14 insertions(+), 7 deletions(-) diff --git a/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala b/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala index fccee97..1f8b057 100644 --- a/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala +++ b/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala @@ -90,11 +90,16 @@ object TPCDSQueryBenchmark extends Logging { } } - def filterQueries( + private def filterQueries( origQueries: Seq[String], - args: TPCDSQueryBenchmarkArguments): Seq[String] = { -if (args.queryFilter.nonEmpty) { - origQueries.filter(args.queryFilter.contains) + queryFilter: Set[String], + nameSuffix: String = ""): Seq[String] = { +if (queryFilter.nonEmpty) { + if (nameSuffix.nonEmpty) { +origQueries.filter { name => queryFilter.contains(s"$name$nameSuffix") } + } else { +origQueries.filter(queryFilter.contains) + } } else { origQueries } @@ -117,6 +122,7 @@ object TPCDSQueryBenchmark extends Logging { "q91", "q92", "q93", "q94", "q95", "q96", "q97", "q98", "q99") // This list only includes TPC-DS v2.7 queries that are different from v1.4 ones +val nameSuffixForQueriesV2_7 = "-v2.7" val tpcdsQueriesV2_7 = Seq( "q5a", "q6", "q10a", "q11", "q12", "q14", "q14a", "q18a", "q20", "q22", "q22a", "q24", "q27a", "q34", "q35", "q35a", "q36a", "q47", "q49", @@ -124,8 +130,9 @@ object TPCDSQueryBenchmark extends Logging { "q80a", "q86a", "q98") // If `--query-filter` defined, filters the queries that this option selects -val queriesV1_4ToRun = filterQueries(tpcdsQueries, benchmarkArgs) -val queriesV2_7ToRun = filterQueries(tpcdsQueriesV2_7, benchmarkArgs) +val 
queriesV1_4ToRun = filterQueries(tpcdsQueries, benchmarkArgs.queryFilter) +val queriesV2_7ToRun = filterQueries(tpcdsQueriesV2_7, benchmarkArgs.queryFilter, + nameSuffix = nameSuffixForQueriesV2_7) if ((queriesV1_4ToRun ++ queriesV2_7ToRun).isEmpty) { throw new RuntimeException( @@ -135,6 +142,6 @@ object TPCDSQueryBenchmark extends Logging { val tableSizes = setupTables(benchmarkArgs.dataLocation) runTpcdsQueries(queryLocation = "tpcds", queries = queriesV1_4ToRun, tableSizes) runTpcdsQueries(queryLocation = "tpcds-v2.7.0", queries = queriesV2_7ToRun, tableSizes, - nameSuffix = "-v2.7") + nameSuffix = nameSuffixForQueriesV2_7) } } - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated: [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new 577dbb9 [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark 577dbb9 is described below commit 577dbb96835f13f4cd92ea4caab9e6dece00be50 Author: Takeshi Yamamuro AuthorDate: Wed Nov 11 15:24:05 2020 +0900 [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark ### What changes were proposed in this pull request? This PR intends to fix the behaviour of query filters in `TPCDSQueryBenchmark`. We can use an option `--query-filter` for selecting TPCDS queries to run, e.g., `--query-filter q6,q8,q13`. But, the current master has a weird behaviour about the option. For example, if we pass `--query-filter q6` so as to run the TPCDS q6 only, `TPCDSQueryBenchmark` runs `q6` and `q6-v2.7` because the `filterQueries` method does not respect the name suffix. So, there is no way now to run the TPCDS q6 only. ### Why are the changes needed? Bugfix. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Manually checked. Closes #30324 from maropu/FilterBugInTPCDSQueryBenchmark. 
Authored-by: Takeshi Yamamuro Signed-off-by: Takeshi Yamamuro (cherry picked from commit 4b367976a877adb981f65d546e1522fdf30d0731) Signed-off-by: Takeshi Yamamuro --- .../execution/benchmark/TPCDSQueryBenchmark.scala | 21 ++--- 1 file changed, 14 insertions(+), 7 deletions(-) diff --git a/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala b/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala index 7bbf079..43bc7c1 100644 --- a/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala +++ b/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala @@ -98,11 +98,16 @@ object TPCDSQueryBenchmark extends SqlBasedBenchmark { } } - def filterQueries( + private def filterQueries( origQueries: Seq[String], - args: TPCDSQueryBenchmarkArguments): Seq[String] = { -if (args.queryFilter.nonEmpty) { - origQueries.filter(args.queryFilter.contains) + queryFilter: Set[String], + nameSuffix: String = ""): Seq[String] = { +if (queryFilter.nonEmpty) { + if (nameSuffix.nonEmpty) { +origQueries.filter { name => queryFilter.contains(s"$name$nameSuffix") } + } else { +origQueries.filter(queryFilter.contains) + } } else { origQueries } @@ -125,6 +130,7 @@ object TPCDSQueryBenchmark extends SqlBasedBenchmark { "q91", "q92", "q93", "q94", "q95", "q96", "q97", "q98", "q99") // This list only includes TPC-DS v2.7 queries that are different from v1.4 ones +val nameSuffixForQueriesV2_7 = "-v2.7" val tpcdsQueriesV2_7 = Seq( "q5a", "q6", "q10a", "q11", "q12", "q14", "q14a", "q18a", "q20", "q22", "q22a", "q24", "q27a", "q34", "q35", "q35a", "q36a", "q47", "q49", @@ -132,8 +138,9 @@ object TPCDSQueryBenchmark extends SqlBasedBenchmark { "q80a", "q86a", "q98") // If `--query-filter` defined, filters the queries that this option selects -val queriesV1_4ToRun = filterQueries(tpcdsQueries, benchmarkArgs) -val queriesV2_7ToRun = filterQueries(tpcdsQueriesV2_7, 
benchmarkArgs) +val queriesV1_4ToRun = filterQueries(tpcdsQueries, benchmarkArgs.queryFilter) +val queriesV2_7ToRun = filterQueries(tpcdsQueriesV2_7, benchmarkArgs.queryFilter, + nameSuffix = nameSuffixForQueriesV2_7) if ((queriesV1_4ToRun ++ queriesV2_7ToRun).isEmpty) { throw new RuntimeException( @@ -143,6 +150,6 @@ object TPCDSQueryBenchmark extends SqlBasedBenchmark { val tableSizes = setupTables(benchmarkArgs.dataLocation) runTpcdsQueries(queryLocation = "tpcds", queries = queriesV1_4ToRun, tableSizes) runTpcdsQueries(queryLocation = "tpcds-v2.7.0", queries = queriesV2_7ToRun, tableSizes, - nameSuffix = "-v2.7") + nameSuffix = nameSuffixForQueriesV2_7) } } - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (6d5d030 -> 4b36797)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 6d5d030 [SPARK-33414][SQL] Migrate SHOW CREATE TABLE command to use UnresolvedTableOrView to resolve the identifier add 4b36797 [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark No new revisions were added by this update. Summary of changes: .../execution/benchmark/TPCDSQueryBenchmark.scala | 21 ++--- 1 file changed, 14 insertions(+), 7 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-2.4 updated: [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch branch-2.4 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-2.4 by this push: new fece4a3 [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark fece4a3 is described below commit fece4a3a36e23c7b99d6cb64e0c4484c9e17235f Author: Takeshi Yamamuro AuthorDate: Wed Nov 11 15:24:05 2020 +0900 [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark ### What changes were proposed in this pull request? This PR intends to fix the behaviour of query filters in `TPCDSQueryBenchmark`. We can use an option `--query-filter` for selecting TPCDS queries to run, e.g., `--query-filter q6,q8,q13`. But, the current master has a weird behaviour about the option. For example, if we pass `--query-filter q6` so as to run the TPCDS q6 only, `TPCDSQueryBenchmark` runs `q6` and `q6-v2.7` because the `filterQueries` method does not respect the name suffix. So, there is no way now to run the TPCDS q6 only. ### Why are the changes needed? Bugfix. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Manually checked. Closes #30324 from maropu/FilterBugInTPCDSQueryBenchmark. 
Authored-by: Takeshi Yamamuro Signed-off-by: Takeshi Yamamuro (cherry picked from commit 4b367976a877adb981f65d546e1522fdf30d0731) Signed-off-by: Takeshi Yamamuro --- .../execution/benchmark/TPCDSQueryBenchmark.scala | 21 ++--- 1 file changed, 14 insertions(+), 7 deletions(-) diff --git a/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala b/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala index fccee97..1f8b057 100644 --- a/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala +++ b/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala @@ -90,11 +90,16 @@ object TPCDSQueryBenchmark extends Logging { } } - def filterQueries( + private def filterQueries( origQueries: Seq[String], - args: TPCDSQueryBenchmarkArguments): Seq[String] = { -if (args.queryFilter.nonEmpty) { - origQueries.filter(args.queryFilter.contains) + queryFilter: Set[String], + nameSuffix: String = ""): Seq[String] = { +if (queryFilter.nonEmpty) { + if (nameSuffix.nonEmpty) { +origQueries.filter { name => queryFilter.contains(s"$name$nameSuffix") } + } else { +origQueries.filter(queryFilter.contains) + } } else { origQueries } @@ -117,6 +122,7 @@ object TPCDSQueryBenchmark extends Logging { "q91", "q92", "q93", "q94", "q95", "q96", "q97", "q98", "q99") // This list only includes TPC-DS v2.7 queries that are different from v1.4 ones +val nameSuffixForQueriesV2_7 = "-v2.7" val tpcdsQueriesV2_7 = Seq( "q5a", "q6", "q10a", "q11", "q12", "q14", "q14a", "q18a", "q20", "q22", "q22a", "q24", "q27a", "q34", "q35", "q35a", "q36a", "q47", "q49", @@ -124,8 +130,9 @@ object TPCDSQueryBenchmark extends Logging { "q80a", "q86a", "q98") // If `--query-filter` defined, filters the queries that this option selects -val queriesV1_4ToRun = filterQueries(tpcdsQueries, benchmarkArgs) -val queriesV2_7ToRun = filterQueries(tpcdsQueriesV2_7, benchmarkArgs) +val 
queriesV1_4ToRun = filterQueries(tpcdsQueries, benchmarkArgs.queryFilter) +val queriesV2_7ToRun = filterQueries(tpcdsQueriesV2_7, benchmarkArgs.queryFilter, + nameSuffix = nameSuffixForQueriesV2_7) if ((queriesV1_4ToRun ++ queriesV2_7ToRun).isEmpty) { throw new RuntimeException( @@ -135,6 +142,6 @@ object TPCDSQueryBenchmark extends Logging { val tableSizes = setupTables(benchmarkArgs.dataLocation) runTpcdsQueries(queryLocation = "tpcds", queries = queriesV1_4ToRun, tableSizes) runTpcdsQueries(queryLocation = "tpcds-v2.7.0", queries = queriesV2_7ToRun, tableSizes, - nameSuffix = "-v2.7") + nameSuffix = nameSuffixForQueriesV2_7) } } - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated: [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new 577dbb9 [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark 577dbb9 is described below commit 577dbb96835f13f4cd92ea4caab9e6dece00be50 Author: Takeshi Yamamuro AuthorDate: Wed Nov 11 15:24:05 2020 +0900 [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark ### What changes were proposed in this pull request? This PR intends to fix the behaviour of query filters in `TPCDSQueryBenchmark`. We can use an option `--query-filter` for selecting TPCDS queries to run, e.g., `--query-filter q6,q8,q13`. But, the current master has a weird behaviour about the option. For example, if we pass `--query-filter q6` so as to run the TPCDS q6 only, `TPCDSQueryBenchmark` runs `q6` and `q6-v2.7` because the `filterQueries` method does not respect the name suffix. So, there is no way now to run the TPCDS q6 only. ### Why are the changes needed? Bugfix. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Manually checked. Closes #30324 from maropu/FilterBugInTPCDSQueryBenchmark. 
Authored-by: Takeshi Yamamuro Signed-off-by: Takeshi Yamamuro (cherry picked from commit 4b367976a877adb981f65d546e1522fdf30d0731) Signed-off-by: Takeshi Yamamuro --- .../execution/benchmark/TPCDSQueryBenchmark.scala | 21 ++--- 1 file changed, 14 insertions(+), 7 deletions(-) diff --git a/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala b/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala index 7bbf079..43bc7c1 100644 --- a/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala +++ b/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala @@ -98,11 +98,16 @@ object TPCDSQueryBenchmark extends SqlBasedBenchmark { } } - def filterQueries( + private def filterQueries( origQueries: Seq[String], - args: TPCDSQueryBenchmarkArguments): Seq[String] = { -if (args.queryFilter.nonEmpty) { - origQueries.filter(args.queryFilter.contains) + queryFilter: Set[String], + nameSuffix: String = ""): Seq[String] = { +if (queryFilter.nonEmpty) { + if (nameSuffix.nonEmpty) { +origQueries.filter { name => queryFilter.contains(s"$name$nameSuffix") } + } else { +origQueries.filter(queryFilter.contains) + } } else { origQueries } @@ -125,6 +130,7 @@ object TPCDSQueryBenchmark extends SqlBasedBenchmark { "q91", "q92", "q93", "q94", "q95", "q96", "q97", "q98", "q99") // This list only includes TPC-DS v2.7 queries that are different from v1.4 ones +val nameSuffixForQueriesV2_7 = "-v2.7" val tpcdsQueriesV2_7 = Seq( "q5a", "q6", "q10a", "q11", "q12", "q14", "q14a", "q18a", "q20", "q22", "q22a", "q24", "q27a", "q34", "q35", "q35a", "q36a", "q47", "q49", @@ -132,8 +138,9 @@ object TPCDSQueryBenchmark extends SqlBasedBenchmark { "q80a", "q86a", "q98") // If `--query-filter` defined, filters the queries that this option selects -val queriesV1_4ToRun = filterQueries(tpcdsQueries, benchmarkArgs) -val queriesV2_7ToRun = filterQueries(tpcdsQueriesV2_7, 
benchmarkArgs) +val queriesV1_4ToRun = filterQueries(tpcdsQueries, benchmarkArgs.queryFilter) +val queriesV2_7ToRun = filterQueries(tpcdsQueriesV2_7, benchmarkArgs.queryFilter, + nameSuffix = nameSuffixForQueriesV2_7) if ((queriesV1_4ToRun ++ queriesV2_7ToRun).isEmpty) { throw new RuntimeException( @@ -143,6 +150,6 @@ object TPCDSQueryBenchmark extends SqlBasedBenchmark { val tableSizes = setupTables(benchmarkArgs.dataLocation) runTpcdsQueries(queryLocation = "tpcds", queries = queriesV1_4ToRun, tableSizes) runTpcdsQueries(queryLocation = "tpcds-v2.7.0", queries = queriesV2_7ToRun, tableSizes, - nameSuffix = "-v2.7") + nameSuffix = nameSuffixForQueriesV2_7) } } - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (6d5d030 -> 4b36797)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 6d5d030 [SPARK-33414][SQL] Migrate SHOW CREATE TABLE command to use UnresolvedTableOrView to resolve the identifier add 4b36797 [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark No new revisions were added by this update. Summary of changes: .../execution/benchmark/TPCDSQueryBenchmark.scala | 21 ++--- 1 file changed, 14 insertions(+), 7 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-2.4 updated: [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch branch-2.4 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-2.4 by this push: new fece4a3 [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark fece4a3 is described below commit fece4a3a36e23c7b99d6cb64e0c4484c9e17235f Author: Takeshi Yamamuro AuthorDate: Wed Nov 11 15:24:05 2020 +0900 [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark ### What changes were proposed in this pull request? This PR intends to fix the behaviour of query filters in `TPCDSQueryBenchmark`. We can use an option `--query-filter` for selecting TPCDS queries to run, e.g., `--query-filter q6,q8,q13`. But, the current master has a weird behaviour about the option. For example, if we pass `--query-filter q6` so as to run the TPCDS q6 only, `TPCDSQueryBenchmark` runs `q6` and `q6-v2.7` because the `filterQueries` method does not respect the name suffix. So, there is no way now to run the TPCDS q6 only. ### Why are the changes needed? Bugfix. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Manually checked. Closes #30324 from maropu/FilterBugInTPCDSQueryBenchmark. 
Authored-by: Takeshi Yamamuro Signed-off-by: Takeshi Yamamuro (cherry picked from commit 4b367976a877adb981f65d546e1522fdf30d0731) Signed-off-by: Takeshi Yamamuro --- .../execution/benchmark/TPCDSQueryBenchmark.scala | 21 ++--- 1 file changed, 14 insertions(+), 7 deletions(-) diff --git a/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala b/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala index fccee97..1f8b057 100644 --- a/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala +++ b/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala @@ -90,11 +90,16 @@ object TPCDSQueryBenchmark extends Logging { } } - def filterQueries( + private def filterQueries( origQueries: Seq[String], - args: TPCDSQueryBenchmarkArguments): Seq[String] = { -if (args.queryFilter.nonEmpty) { - origQueries.filter(args.queryFilter.contains) + queryFilter: Set[String], + nameSuffix: String = ""): Seq[String] = { +if (queryFilter.nonEmpty) { + if (nameSuffix.nonEmpty) { +origQueries.filter { name => queryFilter.contains(s"$name$nameSuffix") } + } else { +origQueries.filter(queryFilter.contains) + } } else { origQueries } @@ -117,6 +122,7 @@ object TPCDSQueryBenchmark extends Logging { "q91", "q92", "q93", "q94", "q95", "q96", "q97", "q98", "q99") // This list only includes TPC-DS v2.7 queries that are different from v1.4 ones +val nameSuffixForQueriesV2_7 = "-v2.7" val tpcdsQueriesV2_7 = Seq( "q5a", "q6", "q10a", "q11", "q12", "q14", "q14a", "q18a", "q20", "q22", "q22a", "q24", "q27a", "q34", "q35", "q35a", "q36a", "q47", "q49", @@ -124,8 +130,9 @@ object TPCDSQueryBenchmark extends Logging { "q80a", "q86a", "q98") // If `--query-filter` defined, filters the queries that this option selects -val queriesV1_4ToRun = filterQueries(tpcdsQueries, benchmarkArgs) -val queriesV2_7ToRun = filterQueries(tpcdsQueriesV2_7, benchmarkArgs) +val 
queriesV1_4ToRun = filterQueries(tpcdsQueries, benchmarkArgs.queryFilter) +val queriesV2_7ToRun = filterQueries(tpcdsQueriesV2_7, benchmarkArgs.queryFilter, + nameSuffix = nameSuffixForQueriesV2_7) if ((queriesV1_4ToRun ++ queriesV2_7ToRun).isEmpty) { throw new RuntimeException( @@ -135,6 +142,6 @@ object TPCDSQueryBenchmark extends Logging { val tableSizes = setupTables(benchmarkArgs.dataLocation) runTpcdsQueries(queryLocation = "tpcds", queries = queriesV1_4ToRun, tableSizes) runTpcdsQueries(queryLocation = "tpcds-v2.7.0", queries = queriesV2_7ToRun, tableSizes, - nameSuffix = "-v2.7") + nameSuffix = nameSuffixForQueriesV2_7) } } - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated: [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new 577dbb9 [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark 577dbb9 is described below commit 577dbb96835f13f4cd92ea4caab9e6dece00be50 Author: Takeshi Yamamuro AuthorDate: Wed Nov 11 15:24:05 2020 +0900 [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark ### What changes were proposed in this pull request? This PR intends to fix the behaviour of query filters in `TPCDSQueryBenchmark`. We can use an option `--query-filter` for selecting TPCDS queries to run, e.g., `--query-filter q6,q8,q13`. But, the current master has a weird behaviour about the option. For example, if we pass `--query-filter q6` so as to run the TPCDS q6 only, `TPCDSQueryBenchmark` runs `q6` and `q6-v2.7` because the `filterQueries` method does not respect the name suffix. So, there is no way now to run the TPCDS q6 only. ### Why are the changes needed? Bugfix. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Manually checked. Closes #30324 from maropu/FilterBugInTPCDSQueryBenchmark. 
Authored-by: Takeshi Yamamuro
Signed-off-by: Takeshi Yamamuro
(cherry picked from commit 4b367976a877adb981f65d546e1522fdf30d0731)
Signed-off-by: Takeshi Yamamuro
---
 .../execution/benchmark/TPCDSQueryBenchmark.scala | 21 ++++++++++++++-------
 1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala b/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala
index 7bbf079..43bc7c1 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala
@@ -98,11 +98,16 @@ object TPCDSQueryBenchmark extends SqlBasedBenchmark {
     }
   }

-  def filterQueries(
+  private def filterQueries(
       origQueries: Seq[String],
-      args: TPCDSQueryBenchmarkArguments): Seq[String] = {
-    if (args.queryFilter.nonEmpty) {
-      origQueries.filter(args.queryFilter.contains)
+      queryFilter: Set[String],
+      nameSuffix: String = ""): Seq[String] = {
+    if (queryFilter.nonEmpty) {
+      if (nameSuffix.nonEmpty) {
+        origQueries.filter { name => queryFilter.contains(s"$name$nameSuffix") }
+      } else {
+        origQueries.filter(queryFilter.contains)
+      }
     } else {
       origQueries
     }
@@ -125,6 +130,7 @@ object TPCDSQueryBenchmark extends SqlBasedBenchmark {
     "q91", "q92", "q93", "q94", "q95", "q96", "q97", "q98", "q99")

     // This list only includes TPC-DS v2.7 queries that are different from v1.4 ones
+    val nameSuffixForQueriesV2_7 = "-v2.7"
     val tpcdsQueriesV2_7 = Seq(
       "q5a", "q6", "q10a", "q11", "q12", "q14", "q14a", "q18a", "q20",
       "q22", "q22a", "q24", "q27a", "q34", "q35", "q35a", "q36a", "q47", "q49",
@@ -132,8 +138,9 @@ object TPCDSQueryBenchmark extends SqlBasedBenchmark {
       "q80a", "q86a", "q98")

     // If `--query-filter` defined, filters the queries that this option selects
-    val queriesV1_4ToRun = filterQueries(tpcdsQueries, benchmarkArgs)
-    val queriesV2_7ToRun = filterQueries(tpcdsQueriesV2_7, benchmarkArgs)
+    val queriesV1_4ToRun = filterQueries(tpcdsQueries, benchmarkArgs.queryFilter)
+    val queriesV2_7ToRun = filterQueries(tpcdsQueriesV2_7, benchmarkArgs.queryFilter,
+      nameSuffix = nameSuffixForQueriesV2_7)

     if ((queriesV1_4ToRun ++ queriesV2_7ToRun).isEmpty) {
       throw new RuntimeException(
@@ -143,6 +150,6 @@ object TPCDSQueryBenchmark extends SqlBasedBenchmark {
     val tableSizes = setupTables(benchmarkArgs.dataLocation)
     runTpcdsQueries(queryLocation = "tpcds", queries = queriesV1_4ToRun, tableSizes)
     runTpcdsQueries(queryLocation = "tpcds-v2.7.0", queries = queriesV2_7ToRun, tableSizes,
-      nameSuffix = "-v2.7")
+      nameSuffix = nameSuffixForQueriesV2_7)
   }
 }
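The fix above can be tried in isolation. The sketch below mirrors the suffix-aware `filterQueries` from the patch in a self-contained object so its effect on `--query-filter` matching is easy to see; the object name `QueryFilterSketch` and the `main` driver are illustrative additions, not part of the Spark source.

```scala
// Minimal sketch of the suffix-aware query filter introduced by SPARK-33417.
// Before the fix, a filter of Set("q6") matched both "q6" (v1.4) and "q6"
// in the v2.7 list, so both "q6" and "q6-v2.7" benchmarks ran.
object QueryFilterSketch {

  // Keep only the queries whose *suffixed* name (e.g. "q6-v2.7") appears in
  // the filter; with an empty suffix, plain names are matched as before.
  def filterQueries(
      origQueries: Seq[String],
      queryFilter: Set[String],
      nameSuffix: String = ""): Seq[String] = {
    if (queryFilter.nonEmpty) {
      if (nameSuffix.nonEmpty) {
        origQueries.filter { name => queryFilter.contains(s"$name$nameSuffix") }
      } else {
        origQueries.filter(queryFilter.contains)
      }
    } else {
      origQueries  // no filter given: run everything
    }
  }

  def main(args: Array[String]): Unit = {
    val v1_4 = Seq("q5", "q6", "q8")
    val v2_7 = Seq("q6", "q10a")

    // `--query-filter q6` now selects only the v1.4 variant...
    println(filterQueries(v1_4, Set("q6")))               // List(q6)
    println(filterQueries(v2_7, Set("q6"), "-v2.7"))      // List()
    // ...while `--query-filter q6-v2.7` selects only the v2.7 variant.
    println(filterQueries(v2_7, Set("q6-v2.7"), "-v2.7")) // List(q6)
  }
}
```

With this, `q6` and `q6-v2.7` become distinct filter keys, which is exactly the behaviour the PR description asks for.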
[spark] branch master updated (6d5d030 -> 4b36797)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

from 6d5d030  [SPARK-33414][SQL] Migrate SHOW CREATE TABLE command to use UnresolvedTableOrView to resolve the identifier
 add 4b36797  [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark

No new revisions were added by this update.

Summary of changes:
 .../execution/benchmark/TPCDSQueryBenchmark.scala | 21 ++++++++++++++-------
 1 file changed, 14 insertions(+), 7 deletions(-)
[spark] branch master updated (6fa80ed -> 4634694)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

from 6fa80ed  [SPARK-7][SQL] Support subexpression elimination in branches of conditional expressions
 add 4634694  [SPARK-33404][SQL] Fix incorrect results in `date_trunc` expression

No new revisions were added by this update.

Summary of changes:
 .../spark/sql/catalyst/util/DateTimeUtils.scala    |  6 ++--
 .../sql/catalyst/util/DateTimeUtilsSuite.scala     | 34 +++---
 2 files changed, 28 insertions(+), 12 deletions(-)
[spark] branch branch-3.0 updated (c157fa3 -> a418495)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git.

from c157fa3  [SPARK-33372][SQL] Fix InSet bucket pruning
 add a418495  [SPARK-33397][YARN][DOC] Fix generating md to html for available-patterns-for-shs-custom-executor-log-url

No new revisions were added by this update.

Summary of changes:
 docs/running-on-yarn.md | 30 +++++++++++++++---------------
 1 file changed, 15 insertions(+), 15 deletions(-)
[spark] branch master updated (090962c -> 036c11b)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

from 090962c  [SPARK-33251][PYTHON][DOCS] Migration to NumPy documentation style in ML (pyspark.ml.*)
 add 036c11b  [SPARK-33397][YARN][DOC] Fix generating md to html for available-patterns-for-shs-custom-executor-log-url

No new revisions were added by this update.

Summary of changes:
 docs/running-on-yarn.md | 30 +++++++++++++++---------------
 1 file changed, 15 insertions(+), 15 deletions(-)