[spark] branch master updated: [SPARK-36849][SQL] Migrate UseStatement to v2 command framework

2021-10-10 Thread wenchen
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 73edf31  [SPARK-36849][SQL] Migrate UseStatement to v2 command framework
73edf31 is described below

commit 73edf31c63608c6416594a4c8f4a087f10dcd7a2
Author: dohongdayi 
AuthorDate: Mon Oct 11 13:37:55 2021 +0800

[SPARK-36849][SQL] Migrate UseStatement to v2 command framework

What changes were proposed in this pull request?
Migrate `UseStatement` to v2 command framework, add `SetNamespaceCommand`

Why are the changes needed?
Migrate to the standard V2 framework

Does this PR introduce any user-facing change?
no

How was this patch tested?
existing tests

Closes #34127 from dohongdayi/use_branch.

Lead-authored-by: dohongdayi 
Co-authored-by: Herbert Liao 
Signed-off-by: Wenchen Fan 
---
 .../apache/spark/sql/catalyst/parser/SqlBase.g4|  3 +-
 .../sql/catalyst/analysis/ResolveCatalogs.scala|  9 --
 .../spark/sql/catalyst/parser/AstBuilder.scala |  4 +--
 .../sql/catalyst/plans/logical/statements.scala|  5 
 .../sql/catalyst/plans/logical/v2Commands.scala| 11 
 .../spark/sql/execution/SparkSqlParser.scala   |  8 ++
 .../execution/command/SetNamespaceCommand.scala| 33 ++
 .../datasources/v2/DataSourceV2Strategy.scala  |  6 ++--
 .../spark/sql/connector/DataSourceV2SQLSuite.scala |  6 
 9 files changed, 61 insertions(+), 24 deletions(-)
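
For orientation, the sketch below approximates what a v2 runnable command for USE can look like, built only on the public CatalogManager API. It is a hedged illustration and deliberately simplified, not the exact SetNamespaceCommand added by this commit; the class name and the catalog-vs-namespace handling here are assumptions.

```scala
// Hedged sketch only: approximates a v2 "USE" command; the real SetNamespaceCommand may differ.
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.execution.command.LeafRunnableCommand

case class SetNamespaceSketch(nameParts: Seq[String]) extends LeafRunnableCommand {
  override def run(sparkSession: SparkSession): Seq[Row] = {
    val catalogManager = sparkSession.sessionState.catalogManager
    if (nameParts.length == 1 && catalogManager.isCatalogRegistered(nameParts.head)) {
      // A single-part name that matches a registered catalog switches the current catalog.
      catalogManager.setCurrentCatalog(nameParts.head)
    } else {
      // Otherwise treat the name as a namespace in the current catalog.
      catalogManager.setCurrentNamespace(nameParts.toArray)
    }
    Seq.empty[Row]
  }
}
```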

diff --git a/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4 b/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4
index 886810e..3cd39a9 100644
--- a/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4
+++ b/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4
@@ -106,7 +106,8 @@ singleTableSchema
 statement
     : query                                                            #statementDefault
     | ctes? dmlStatementNoWith                                         #dmlStatement
-    | USE NAMESPACE? multipartIdentifier                               #use
+    | USE multipartIdentifier                                          #use
+    | USE NAMESPACE multipartIdentifier                                #useNamespace
     | SET CATALOG (identifier | STRING)                                #setCatalog
     | CREATE namespace (IF NOT EXISTS)? multipartIdentifier
         (commentSpec |
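
In SQL terms, the split grammar keeps both forms parseable. A rough Scala-shell illustration follows; the `testcat` catalog name is hypothetical and assumes a v2 catalog has been configured via spark.sql.catalog.testcat.

```scala
// Illustration only: both statements still parse after the #use / #useNamespace split.
// Assumes a v2 catalog named "testcat" has been configured via spark.sql.catalog.testcat.
spark.sql("USE testcat.ns1")        // #use: may switch the current catalog and namespace
spark.sql("USE NAMESPACE ns1")      // #useNamespace: sets the namespace in the current catalog
spark.sql("SELECT current_catalog(), current_database()").show(false)
```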
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveCatalogs.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveCatalogs.scala
index e9204ad..efc1ab2 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveCatalogs.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveCatalogs.scala
@@ -82,15 +82,6 @@ class ResolveCatalogs(val catalogManager: CatalogManager)
         convertTableProperties(c),
         writeOptions = c.writeOptions,
         orCreate = c.orCreate)
-
-    case UseStatement(isNamespaceSet, nameParts) =>
-      if (isNamespaceSet) {
-        SetCatalogAndNamespace(catalogManager, None, Some(nameParts))
-      } else {
-        val CatalogAndNamespace(catalog, ns) = nameParts
-        val namespace = if (ns.nonEmpty) Some(ns) else None
-        SetCatalogAndNamespace(catalogManager, Some(catalog.name()), namespace)
-      }
   }
 
   object NonSessionCatalogAndTable {
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
index 6a24a9d..1968142 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
@@ -3565,11 +3565,11 @@ class AstBuilder extends SqlBaseBaseVisitor[AnyRef] with SQLConfHelper with Logg
   }
 
   /**
-   * Create a [[UseStatement]] logical plan.
+   * Create a [[SetCatalogAndNamespace]] command.
    */
   override def visitUse(ctx: UseContext): LogicalPlan = withOrigin(ctx) {
     val nameParts = visitMultipartIdentifier(ctx.multipartIdentifier)
-    UseStatement(ctx.NAMESPACE != null, nameParts)
+    SetCatalogAndNamespace(UnresolvedDBObjectName(nameParts, isNamespace = true))
   }
 
   /**
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statements.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statements.scala
index 0373c25..c502981 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statements.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statements.scala

[spark] branch master updated: [SPARK-36645][SQL] Aggregate (Min/Max/Count) push down for Parquet

2021-10-10 Thread viirya
This is an automated email from the ASF dual-hosted git repository.

viirya pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 128168d  [SPARK-36645][SQL] Aggregate (Min/Max/Count) push down for Parquet
128168d is described below

commit 128168d8c4019a1e10a9f1be734868524f6a09f0
Author: Huaxin Gao 
AuthorDate: Sun Oct 10 22:20:09 2021 -0700

[SPARK-36645][SQL] Aggregate (Min/Max/Count) push down for Parquet

### What changes were proposed in this pull request?
Push down Min/Max/Count to Parquet with the following restrictions:

- Nested types such as Array, Map, or Struct are not pushed down.
- Timestamp is not pushed down because the INT96 sort order is undefined and Parquet does not return statistics for INT96.
- If the aggregate column is a partition column, only Count is pushed down; Min and Max are not, because Parquet does not return min/max statistics for partition columns.
- If a file happens to lack statistics for the aggregate columns, Spark throws an exception.
- Currently, Min/Max/Count are not pushed down when a filter or GROUP BY is involved; this restriction will be lifted when the filter or GROUP BY is on a partition column (https://issues.apache.org/jira/browse/SPARK-36646 and https://issues.apache.org/jira/browse/SPARK-36647).

### Why are the changes needed?
Since Parquet stores min, max, and count statistics, we want to take advantage of this information and push Min/Max/Count down to the Parquet layer for better performance.

### Does this PR introduce _any_ user-facing change?
Yes, `SQLConf.PARQUET_AGGREGATE_PUSHDOWN_ENABLED` was added. If set to true, Min/Max/Count are pushed down to Parquet.

### How was this patch tested?
new test suites

Closes #33639 from huaxingao/parquet_agg.

Authored-by: Huaxin Gao 
Signed-off-by: Liang-Chi Hsieh 
---
 .../org/apache/spark/sql/internal/SQLConf.scala|  10 +
 .../org/apache/spark/sql/types/StructType.scala|   2 +-
 .../datasources/parquet/ParquetUtils.scala | 227 +
 .../execution/datasources/v2/FileScanBuilder.scala |   2 +-
 .../v2/parquet/ParquetPartitionReaderFactory.scala | 123 -
 .../datasources/v2/parquet/ParquetScan.scala   |  37 +-
 .../v2/parquet/ParquetScanBuilder.scala|  96 +++-
 .../scala/org/apache/spark/sql/FileScanSuite.scala |   2 +-
 .../parquet/ParquetAggregatePushDownSuite.scala| 518 +
 9 files changed, 984 insertions(+), 33 deletions(-)
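
A hedged usage sketch follows; the path and schema are made up, and it assumes the v2 Parquet reader is in effect (i.e. parquet is not listed in spark.sql.sources.useV1SourceList), since the pushdown is implemented in the v2 scan builder.

```scala
import org.apache.spark.sql.functions.{count, max, min}

// Illustrative only: enable the new config and run an aggregate over a parquet table.
spark.conf.set("spark.sql.parquet.aggregatePushdown", "true")
spark.conf.set("spark.sql.sources.useV1SourceList", "")  // assumption: use the v2 reader

spark.range(0, 10000).selectExpr("id", "id % 7 AS g")
  .write.mode("overwrite").parquet("/tmp/parquet_agg_pushdown_demo")

val agg = spark.read.parquet("/tmp/parquet_agg_pushdown_demo")
  .agg(min("id"), max("id"), count("id"))
agg.explain()  // when pushdown applies, the scan node should report the pushed aggregates
agg.show()
```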

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
index 6443dfd..98aad1c 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
@@ -853,6 +853,14 @@ object SQLConf {
       .checkValue(threshold => threshold >= 0, "The threshold must not be negative.")
       .createWithDefault(10)
 
+  val PARQUET_AGGREGATE_PUSHDOWN_ENABLED = buildConf("spark.sql.parquet.aggregatePushdown")
+    .doc("If true, MAX/MIN/COUNT without filter and group by will be pushed" +
+      " down to Parquet for optimization. MAX/MIN/COUNT for complex types and timestamp" +
+      " can't be pushed down")
+    .version("3.3.0")
+    .booleanConf
+    .createWithDefault(false)
+
   val PARQUET_WRITE_LEGACY_FORMAT = buildConf("spark.sql.parquet.writeLegacyFormat")
     .doc("If true, data will be written in a way of Spark 1.4 and earlier. For example, decimal " +
       "values will be written in Apache Parquet's fixed-length byte array format, which other " +
@@ -3660,6 +3668,8 @@ class SQLConf extends Serializable with Logging {
   def parquetFilterPushDownInFilterThreshold: Int =
     getConf(PARQUET_FILTER_PUSHDOWN_INFILTERTHRESHOLD)
 
+  def parquetAggregatePushDown: Boolean = getConf(PARQUET_AGGREGATE_PUSHDOWN_ENABLED)
+
   def orcFilterPushDown: Boolean = getConf(ORC_FILTER_PUSHDOWN_ENABLED)
 
   def isOrcSchemaMergingEnabled: Boolean = getConf(ORC_SCHEMA_MERGING_ENABLED)
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/types/StructType.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/types/StructType.scala
index c9862cb..50b197f 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/types/StructType.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/types/StructType.scala
@@ -115,7 +115,7 @@ case class StructType(fields: Array[StructField]) extends DataType with Seq[Stru
   def names: Array[String] = fieldNames
 
   private lazy val fieldNamesSet: Set[String] = fieldNames.toSet
-  private lazy val nameToField: Map[String, StructField] = fields.map(f => f.name -> f).toMap
+  private[sql] lazy val nameToField: Map[String, StructField] = fields.map(f => f.name -> f).toMap

[spark] branch master updated: [SPARK-36943][SQL] Improve readability of missing column error message

2021-10-10 Thread wenchen
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new d83abfc  [SPARK-36943][SQL] Improve readability of missing column error message
d83abfc is described below

commit d83abfc530d8dcc117e0123820b181c98d9f46f6
Author: Karen Feng 
AuthorDate: Mon Oct 11 12:48:03 2021 +0800

[SPARK-36943][SQL] Improve readability of missing column error message

### What changes were proposed in this pull request?

Improves the quality of the error message encountered by users when they attempt to access a column that does not exist.
Removes the jargon term "resolve" and sorts the suggestions by probability.

### Why are the changes needed?

Improves the user experience

### Does this PR introduce _any_ user-facing change?

Yes:

Before:
```
cannot resolve 'foo' given input columns [bar, baz, froo]
```
After:
```
Column 'foo' does not exist. Did you mean one of the following? [bar, baz, froo]
```
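
For context, a minimal way to trigger the message from user code; the column names mirror the example above and are illustrative.

```scala
// Illustration only: selecting a non-existent column surfaces the MISSING_COLUMN error class.
import org.apache.spark.sql.AnalysisException
import spark.implicits._

val df = Seq((1, 2, 3)).toDF("bar", "baz", "froo")
try {
  df.select("foo").collect()
} catch {
  case e: AnalysisException =>
    // With this change the message reads along the lines of:
    // Column 'foo' does not exist. Did you mean one of the following? [bar, baz, froo]
    println(e.getMessage)
}
```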

### How was this patch tested?

Unit tests

Closes #34202 from karenfeng/improve-error-msg-missing-col.

Authored-by: Karen Feng 
Signed-off-by: Wenchen Fan 
---
 core/src/main/resources/error/error-classes.json   |  2 +-
 .../org/apache/spark/SparkThrowableSuite.scala |  4 +-
 python/pyspark/pandas/tests/test_indexops_spark.py |  4 +-
 python/pyspark/sql/tests/test_utils.py |  2 +-
 .../sql/catalyst/analysis/CheckAnalysis.scala  | 11 --
 .../spark/sql/catalyst/util/StringUtils.scala  |  8 
 .../sql/catalyst/analysis/AnalysisErrorSuite.scala | 35 -
 .../sql/catalyst/analysis/AnalysisSuite.scala  | 16 +---
 .../spark/sql/catalyst/analysis/AnalysisTest.scala | 31 +++
 .../catalyst/analysis/ResolveSubquerySuite.scala   | 25 +++-
 .../catalyst/analysis/V2WriteAnalysisSuite.scala   |  4 +-
 .../results/columnresolution-negative.sql.out  |  8 ++--
 .../resources/sql-tests/results/group-by.sql.out   |  2 +-
 .../sql-tests/results/join-lateral.sql.out |  8 ++--
 .../sql-tests/results/natural-join.sql.out |  2 +-
 .../test/resources/sql-tests/results/pivot.sql.out |  4 +-
 .../results/postgreSQL/aggregates_part1.sql.out|  2 +-
 .../sql-tests/results/postgreSQL/join.sql.out  | 16 
 .../results/postgreSQL/select_having.sql.out   |  2 +-
 .../results/postgreSQL/select_implicit.sql.out |  4 +-
 .../sql-tests/results/postgreSQL/union.sql.out |  2 +-
 .../sql-tests/results/query_regex_column.sql.out   | 16 
 .../negative-cases/invalid-correlation.sql.out |  2 +-
 .../sql-tests/results/table-aliases.sql.out|  2 +-
 .../udf/postgreSQL/udf-aggregates_part1.sql.out|  2 +-
 .../results/udf/postgreSQL/udf-join.sql.out| 16 
 .../udf/postgreSQL/udf-select_having.sql.out   |  2 +-
 .../udf/postgreSQL/udf-select_implicit.sql.out |  4 +-
 .../sql-tests/results/udf/udf-group-by.sql.out |  2 +-
 .../sql-tests/results/udf/udf-pivot.sql.out|  4 +-
 .../apache/spark/sql/DataFrameFunctionsSuite.scala | 24 
 .../org/apache/spark/sql/DataFrameSuite.scala  |  3 +-
 .../spark/sql/DataFrameWindowFunctionsSuite.scala  |  3 +-
 .../scala/org/apache/spark/sql/DatasetSuite.scala  | 21 +-
 .../scala/org/apache/spark/sql/SQLQuerySuite.scala |  7 ++--
 .../scala/org/apache/spark/sql/SubquerySuite.scala |  3 +-
 .../test/scala/org/apache/spark/sql/UDFSuite.scala |  3 +-
 .../spark/sql/connector/DataSourceV2SQLSuite.scala | 45 --
 .../apache/spark/sql/execution/SQLViewSuite.scala  | 12 +++---
 .../sql/execution/datasources/csv/CSVSuite.scala   |  7 ++--
 .../sql/execution/datasources/json/JsonSuite.scala |  7 ++--
 .../org/apache/spark/sql/sources/InsertSuite.scala |  7 ++--
 .../apache/spark/sql/hive/HiveParquetSuite.scala   |  7 ++--
 43 files changed, 250 insertions(+), 141 deletions(-)

diff --git a/core/src/main/resources/error/error-classes.json b/core/src/main/resources/error/error-classes.json
index d270f0e..301f8d0 100644
--- a/core/src/main/resources/error/error-classes.json
+++ b/core/src/main/resources/error/error-classes.json
@@ -90,7 +90,7 @@
     "message" : [ "Key %s does not exist." ]
   },
   "MISSING_COLUMN" : {
-    "message" : [ "cannot resolve '%s' given input columns: [%s]" ],
+    "message" : [ "Column '%s' does not exist. Did you mean one of the following? [%s]" ],
     "sqlState" : "42000"
   },
   "MISSING_STATIC_PARTITION_COLUMN" : {
diff --git a/core/src/test/scala/org/apache/spark/SparkThrowableSuite.scala b/core/src/test/scala/org/apache/spark/SparkThrowableSuite.scala
index 5af55af..1cd3ba3 100644
--- a/core/src/test/scala/org/apache/spark/SparkThrowableSuite.scala
+++ b/core/src/test/scala/org/apache/spark/SparkThrowableSuite.scala

[spark] branch master updated: [SPARK-36963][SQL] Add max_by/min_by to sql.functions

2021-10-10 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 13b0251  [SPARK-36963][SQL] Add max_by/min_by to sql.functions
13b0251 is described below

commit 13b02512db6e1d735994d79b92c9783e78f5f745
Author: Ruifeng Zheng 
AuthorDate: Mon Oct 11 09:47:04 2021 +0900

[SPARK-36963][SQL] Add max_by/min_by to sql.functions

### What changes were proposed in this pull request?
Add max_by/min_by to sql.functions

### Why are the changes needed?
for convenience

### Does this PR introduce _any_ user-facing change?
yes, new methods are added

### How was this patch tested?
existing test suites and added test suites

Closes #34229 from zhengruifeng/functions_add_max_min_by.

Authored-by: Ruifeng Zheng 
Signed-off-by: Hyukjin Kwon 
---
 .../src/main/scala/org/apache/spark/sql/functions.scala  | 16 
 .../org/apache/spark/sql/DataFrameAggregateSuite.scala   | 10 ++
 2 files changed, 26 insertions(+)
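
A short usage sketch of the new helpers follows; the data is illustrative, shaped like the courseSales fixture used in the test suite.

```scala
import org.apache.spark.sql.functions.{col, max_by, min_by}
import spark.implicits._

// Illustrative data: (course, year, earnings).
val courseSales = Seq(
  ("dotNET", 2012, 10000), ("dotNET", 2013, 48000),
  ("Java", 2012, 20000), ("Java", 2013, 30000)
).toDF("course", "year", "earnings")

// The year with the highest and lowest earnings per course.
courseSales.groupBy("course")
  .agg(max_by(col("year"), col("earnings")), min_by(col("year"), col("earnings")))
  .show()
```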

diff --git a/sql/core/src/main/scala/org/apache/spark/sql/functions.scala b/sql/core/src/main/scala/org/apache/spark/sql/functions.scala
index 7bca29f..b32c1f8 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/functions.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/functions.scala
@@ -674,6 +674,14 @@ object functions {
   def max(columnName: String): Column = max(Column(columnName))
 
   /**
+   * Aggregate function: returns the value associated with the maximum value of ord.
+   *
+   * @group agg_funcs
+   * @since 3.3.0
+   */
+  def max_by(e: Column, ord: Column): Column = withAggregateFunction { MaxBy(e.expr, ord.expr) }
+
+  /**
    * Aggregate function: returns the average of the values in a group.
    * Alias for avg.
    *
@@ -708,6 +716,14 @@
   def min(columnName: String): Column = min(Column(columnName))
 
   /**
+   * Aggregate function: returns the value associated with the minimum value of ord.
+   *
+   * @group agg_funcs
+   * @since 3.3.0
+   */
+  def min_by(e: Column, ord: Column): Column = withAggregateFunction { MinBy(e.expr, ord.expr) }
+
+  /**
    * Aggregate function: returns the approximate `percentile` of the numeric column `col` which
    * is the smallest value in the ordered `col` values (sorted from least to greatest) such that
    * no more than `percentage` of `col` values is less than the value or equal to that value.
diff --git a/sql/core/src/test/scala/org/apache/spark/sql/DataFrameAggregateSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/DataFrameAggregateSuite.scala
index 1f8638c..c3076c5 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/DataFrameAggregateSuite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/DataFrameAggregateSuite.scala
@@ -876,6 +876,11 @@ class DataFrameAggregateSuite extends QueryTest
     checkAnswer(yearOfMaxEarnings, Row("dotNET", 2013) :: Row("Java", 2013) :: Nil)
 
     checkAnswer(
+      courseSales.groupBy("course").agg(max_by(col("year"), col("earnings"))),
+      Row("dotNET", 2013) :: Row("Java", 2013) :: Nil
+    )
+
+    checkAnswer(
       sql("SELECT max_by(x, y) FROM VALUES (('a', 10)), (('b', 50)), (('c', 20)) AS tab(x, y)"),
       Row("b") :: Nil
     )
@@ -932,6 +937,11 @@ class DataFrameAggregateSuite extends QueryTest
     checkAnswer(yearOfMinEarnings, Row("dotNET", 2012) :: Row("Java", 2012) :: Nil)
 
     checkAnswer(
+      courseSales.groupBy("course").agg(min_by(col("year"), col("earnings"))),
+      Row("dotNET", 2012) :: Row("Java", 2012) :: Nil
+    )
+
+    checkAnswer(
       sql("SELECT min_by(x, y) FROM VALUES (('a', 10)), (('b', 50)), (('c', 20)) AS tab(x, y)"),
       Row("a") :: Nil
     )




[spark] branch branch-3.2 updated: [SPARK-36900][TESTS][BUILD] Increase test memory to 6g for Java 11

2021-10-10 Thread srowen
This is an automated email from the ASF dual-hosted git repository.

srowen pushed a commit to branch branch-3.2
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.2 by this push:
 new 29ebfdc  [SPARK-36900][TESTS][BUILD] Increase test memory to 6g for Java 11
29ebfdc is described below

commit 29ebfdcdff74af72c6900fa0856ada3ab07f8de1
Author: Sean Owen 
AuthorDate: Sun Oct 10 18:08:37 2021 -0500

[SPARK-36900][TESTS][BUILD] Increase test memory to 6g for Java 11

### What changes were proposed in this pull request?

Increase JVM test memory from 4g to 6g.

### Why are the changes needed?

Running tests under Java 11 consistently fails on a few tests without more memory. The tests do legitimately use a lot of memory, I believe. It's not entirely clear why memory usage differs under Java 11, but it also seems fine to simply give the tests comfortably more heap for now.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Existing tests, run manually with Java 11.

Closes #34214 from srowen/SPARK-36900.

Authored-by: Sean Owen 
Signed-off-by: Sean Owen 
(cherry picked from commit 6ed13147c99b2f652748b716c70dd1937230cafd)
Signed-off-by: Sean Owen 
---
 pom.xml| 6 +++---
 project/SparkBuild.scala   | 4 ++--
 resource-managers/kubernetes/integration-tests/pom.xml | 2 +-
 sql/catalyst/pom.xml   | 2 +-
 sql/core/pom.xml   | 2 +-
 sql/hive/pom.xml   | 2 +-
 6 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/pom.xml b/pom.xml
index d9c10ee..bd8ede6 100644
--- a/pom.xml
+++ b/pom.xml
@@ -2640,7 +2640,7 @@
 
   -Xss128m
   -Xms4g
-  -Xmx4g
+  -Xmx6g
   -XX:MaxMetaspaceSize=2g
   -XX:ReservedCodeCacheSize=${CodeCacheSize}
 
@@ -2690,7 +2690,7 @@
   **/*Suite.java
 
 
${project.build.directory}/surefire-reports
--ea -Xmx4g -Xss4m -XX:MaxMetaspaceSize=2g -XX:ReservedCodeCacheSize=${CodeCacheSize} -Dio.netty.tryReflectionSetAccessible=true
+-ea -Xmx6g -Xss4m -XX:MaxMetaspaceSize=2g -XX:ReservedCodeCacheSize=${CodeCacheSize} -Dio.netty.tryReflectionSetAccessible=true
 
 
-  -da -Xmx4g -XX:ReservedCodeCacheSize=${CodeCacheSize} -Dio.netty.tryReflectionSetAccessible=true
+  -da -Xmx6g -XX:ReservedCodeCacheSize=${CodeCacheSize} -Dio.netty.tryReflectionSetAccessible=true
 
   
   




[spark] branch master updated: [SPARK-36900][TESTS][BUILD] Increase test memory to 6g for Java 11

2021-10-10 Thread srowen
This is an automated email from the ASF dual-hosted git repository.

srowen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 6ed1314  [SPARK-36900][TESTS][BUILD] Increase test memory to 6g for Java 11
6ed1314 is described below

commit 6ed13147c99b2f652748b716c70dd1937230cafd
Author: Sean Owen 
AuthorDate: Sun Oct 10 18:08:37 2021 -0500

[SPARK-36900][TESTS][BUILD] Increase test memory to 6g for Java 11

### What changes were proposed in this pull request?

Increase JVM test memory from 4g to 6g.

### Why are the changes needed?

Running tests under Java 11 consistently fails on a few tests without more memory. The tests do legitimately use a lot of memory, I believe. It's not entirely clear why memory usage differs under Java 11, but it also seems fine to simply give the tests comfortably more heap for now.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Existing tests, run manually with Java 11.

Closes #34214 from srowen/SPARK-36900.

Authored-by: Sean Owen 
Signed-off-by: Sean Owen 
---
 pom.xml| 6 +++---
 project/SparkBuild.scala   | 4 ++--
 resource-managers/kubernetes/integration-tests/pom.xml | 2 +-
 sql/catalyst/pom.xml   | 2 +-
 sql/core/pom.xml   | 2 +-
 sql/hive/pom.xml   | 2 +-
 6 files changed, 9 insertions(+), 9 deletions(-)
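
The excerpt below shows only the pom.xml side; for reference, the sbt side in project/SparkBuild.scala would be along these lines (a hedged sketch; the exact option list and placement are assumptions, not quoted from the commit).

```scala
// Hedged sketch: raise the test JVM heap from 4g to 6g on the sbt side as well.
// Exact surrounding settings in project/SparkBuild.scala are assumptions.
Test / javaOptions ++= Seq("-Xmx6g", "-Xss4m", "-XX:MaxMetaspaceSize=2g")
```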

diff --git a/pom.xml b/pom.xml
index 5c6d3a8..1ef33e1 100644
--- a/pom.xml
+++ b/pom.xml
@@ -2657,7 +2657,7 @@
 
   -Xss128m
   -Xms4g
-  -Xmx4g
+  -Xmx6g
   -XX:MaxMetaspaceSize=2g
   -XX:ReservedCodeCacheSize=${CodeCacheSize}
 
@@ -2707,7 +2707,7 @@
   **/*Suite.java
 
 
${project.build.directory}/surefire-reports
--ea -Xmx4g -Xss4m -XX:MaxMetaspaceSize=2g -XX:ReservedCodeCacheSize=${CodeCacheSize} -Dio.netty.tryReflectionSetAccessible=true
+-ea -Xmx6g -Xss4m -XX:MaxMetaspaceSize=2g -XX:ReservedCodeCacheSize=${CodeCacheSize} -Dio.netty.tryReflectionSetAccessible=true
 
 
-  -da -Xmx4g -XX:ReservedCodeCacheSize=${CodeCacheSize} -Dio.netty.tryReflectionSetAccessible=true
+  -da -Xmx6g -XX:ReservedCodeCacheSize=${CodeCacheSize} -Dio.netty.tryReflectionSetAccessible=true
 
   
   
