[spark] branch master updated (ffe3fc9 -> b9b5562)

2021-12-01 Thread wenchen
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from ffe3fc9  [SPARK-37514][PYTHON] Remove workarounds due to older pandas
 add b9b5562  [SPARK-37494][SQL] Unify v1 and v2 options output of `SHOW CREATE TABLE` command

No new revisions were added by this update.

Summary of changes:
 .../spark/sql/execution/command/tables.scala   |  7 +++---
 .../sql-tests/results/show-create-table.sql.out|  2 +-
 .../apache/spark/sql/ShowCreateTableSuite.scala| 26 ++
 .../org/apache/spark/sql/jdbc/JDBCSuite.scala  |  4 ++--
 4 files changed, 33 insertions(+), 6 deletions(-)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated: [SPARK-37514][PYTHON] Remove workarounds due to older pandas

2021-12-01 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new ffe3fc9  [SPARK-37514][PYTHON] Remove workarounds due to older pandas
ffe3fc9 is described below

commit ffe3fc9d23967e41092cf67539aa7f0d77b9eb75
Author: Takuya UESHIN 
AuthorDate: Thu Dec 2 10:51:05 2021 +0900

[SPARK-37514][PYTHON] Remove workarounds due to older pandas

### What changes were proposed in this pull request?

Removes workarounds due to older pandas.

### Why are the changes needed?

Now that the minimum supported pandas version has been upgraded to `1.0.5`,
we can remove the workarounds that let the pandas API on Spark run with older pandas.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Modified existing tests to remove workarounds for older pandas.

Closes #34772 from ueshin/issues/SPARK-37514/older_pandas.

Authored-by: Takuya UESHIN 
Signed-off-by: Hyukjin Kwon 
---
 python/pyspark/pandas/frame.py |  29 +-
 python/pyspark/pandas/generic.py   |  42 +-
 python/pyspark/pandas/groupby.py   |  26 +-
 python/pyspark/pandas/indexes/multi.py |  36 +-
 python/pyspark/pandas/namespace.py |  11 +-
 python/pyspark/pandas/plot/matplotlib.py   |  47 +--
 .../pandas/tests/data_type_ops/test_boolean_ops.py |   8 +-
 .../pandas/tests/data_type_ops/test_num_ops.py |  27 +-
 python/pyspark/pandas/tests/indexes/test_base.py   | 252 +---
 .../pyspark/pandas/tests/indexes/test_category.py  |   2 +-
 .../tests/plot/test_frame_plot_matplotlib.py   |   7 +-
 .../pandas/tests/plot/test_frame_plot_plotly.py|   5 -
 .../tests/plot/test_series_plot_matplotlib.py  |   7 +-
 .../pandas/tests/plot/test_series_plot_plotly.py   |   5 -
 python/pyspark/pandas/tests/test_dataframe.py  | 383 +-
 .../pandas/tests/test_dataframe_conversion.py  |  14 +-
 .../pandas/tests/test_dataframe_spark_io.py|  28 +-
 python/pyspark/pandas/tests/test_expanding.py  | 128 +-
 python/pyspark/pandas/tests/test_groupby.py| 130 ++
 python/pyspark/pandas/tests/test_indexing.py   |   6 -
 python/pyspark/pandas/tests/test_numpy_compat.py   |  31 +-
 .../pandas/tests/test_ops_on_diff_frames.py|  45 +--
 .../tests/test_ops_on_diff_frames_groupby.py   |   1 -
 .../test_ops_on_diff_frames_groupby_expanding.py   |  39 +-
 python/pyspark/pandas/tests/test_reshape.py|   7 +-
 python/pyspark/pandas/tests/test_series.py | 442 -
 .../pyspark/pandas/tests/test_series_conversion.py |   5 -
 python/pyspark/pandas/tests/test_stats.py  |  27 +-
 28 files changed, 508 insertions(+), 1282 deletions(-)

diff --git a/python/pyspark/pandas/frame.py b/python/pyspark/pandas/frame.py
index edfb62e..de36531 100644
--- a/python/pyspark/pandas/frame.py
+++ b/python/pyspark/pandas/frame.py
@@ -20,7 +20,6 @@ A wrapper class for Spark DataFrame to behave similar to pandas DataFrame.
 """
 from collections import OrderedDict, defaultdict, namedtuple
 from collections.abc import Mapping
-from distutils.version import LooseVersion
 import re
 import warnings
 import inspect
@@ -58,10 +57,7 @@ from pandas.tseries.frequencies import DateOffset, to_offset
 if TYPE_CHECKING:
 from pandas.io.formats.style import Styler
 
-if LooseVersion(pd.__version__) >= LooseVersion("0.24"):
-    from pandas.core.dtypes.common import infer_dtype_from_object
-else:
-    from pandas.core.dtypes.common import _get_dtype_from_object as infer_dtype_from_object
+from pandas.core.dtypes.common import infer_dtype_from_object
 from pandas.core.accessor import CachedAccessor
 from pandas.core.dtypes.inference import is_sequence
 from pyspark import StorageLevel
@@ -3128,17 +3124,9 @@ defaultdict(<class 'list'>, {'col..., 'col...})]
         psdf.index.name = verify_temp_column_name(psdf, "__index_name__")
         return_types = [psdf.index.dtype] + list(psdf.dtypes)
 
-        if LooseVersion(pd.__version__) < LooseVersion("0.24"):
-
-            @no_type_check
-            def pandas_at_time(pdf) -> ps.DataFrame[return_types]:
-                return pdf.at_time(time, asof).reset_index()
-
-        else:
-
-            @no_type_check
-            def pandas_at_time(pdf) -> ps.DataFrame[return_types]:
-                return pdf.at_time(time, asof, axis).reset_index()
+        @no_type_check
+        def pandas_at_time(pdf) -> ps.DataFrame[return_types]:
+            return pdf.at_time(time, asof, axis).reset_index()
 
         # apply_batch will remove the index of the pandas-on-Spark DataFrame and attach
         # a default index, which will never be used. So use "distributed" index as a dummy
@@ -12103,17 +12091,14 @@ defaultd

[spark] branch master updated (40b239c -> f97de30)

2021-12-01 Thread huaxingao
This is an automated email from the ASF dual-hosted git repository.

huaxingao pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 40b239c  [SPARK-37326][SQL][FOLLOWUP] Fix the test for Java 11
 add f97de30  [SPARK-37496][SQL] Migrate ReplaceTableAsSelectStatement to v2 command

No new revisions were added by this update.

Summary of changes:
 .../sql/catalyst/analysis/ResolveCatalogs.scala| 12 
 .../spark/sql/catalyst/parser/AstBuilder.scala | 11 ---
 .../sql/catalyst/plans/logical/statements.scala| 23 ---
 .../sql/catalyst/plans/logical/v2Commands.scala| 27 ++---
 .../sql/connector/catalog/CatalogV2Util.scala  | 10 ---
 .../spark/sql/catalyst/parser/DDLParserSuite.scala | 22 +++---
 .../org/apache/spark/sql/DataFrameWriter.scala | 31 +++-
 .../org/apache/spark/sql/DataFrameWriterV2.scala   | 34 --
 .../catalyst/analysis/ResolveSessionCatalog.scala  | 27 +++--
 .../execution/datasources/v2/CreateTableExec.scala |  8 +
 .../datasources/v2/DataSourceV2Strategy.scala  | 14 -
 .../datasources/v2/WriteToDataSourceV2Exec.scala   | 12 +---
 .../connector/V2CommandsCaseSensitivitySuite.scala | 18 +++-
 13 files changed, 112 insertions(+), 137 deletions(-)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated (38115cb -> 40b239c)

2021-12-01 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 38115cb  [SPARK-37501][SQL] CREATE/REPLACE TABLE should qualify location for v2 command
 add 40b239c  [SPARK-37326][SQL][FOLLOWUP] Fix the test for Java 11

No new revisions were added by this update.

Summary of changes:
 .../org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala  | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch branch-3.2 updated (46fc98c -> 87af2fd)

2021-12-01 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch branch-3.2
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 46fc98c  [SPARK-37460][DOCS] Add the description of ALTER DATABASE SET LOCATION
 add 87af2fd  [SPARK-37480][K8S][DOC][3.2] Sync Kubernetes configuration to latest in running-on-k8s.md

No new revisions were added by this update.

Summary of changes:
 docs/running-on-kubernetes.md | 172 +++---
 1 file changed, 162 insertions(+), 10 deletions(-)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[GitHub] [spark-website] srowen closed pull request #370: Fix the link of SPARK-24554 in Spark 3.1.1 release note

2021-12-01 Thread GitBox


srowen closed pull request #370:
URL: https://github.com/apache/spark-website/pull/370


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark-website] branch asf-site updated: Fix the link of SPARK-24554 in Spark 3.1.1 release note

2021-12-01 Thread srowen
This is an automated email from the ASF dual-hosted git repository.

srowen pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/spark-website.git


The following commit(s) were added to refs/heads/asf-site by this push:
 new 81ab561  Fix the link of SPARK-24554 in Spark 3.1.1 release note
81ab561 is described below

commit 81ab561dcb9da5983674a9f8b090e54908cc97e7
Author: Hyukjin Kwon 
AuthorDate: Wed Dec 1 08:52:50 2021 -0600

Fix the link of SPARK-24554 in Spark 3.1.1 release note

SPARK-24554 should link to https://issues.apache.org/jira/browse/SPARK-24554, but it currently links to https://issues.apache.org/jira/browse/SPARK-33748. This PR fixes the link.

Author: Hyukjin Kwon 

Closes #370 from HyukjinKwon/typo-jira.
---
 releases/_posts/2021-03-02-spark-release-3-1-1.md | 2 +-
 site/releases/spark-release-3-1-1.html| 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/releases/_posts/2021-03-02-spark-release-3-1-1.md b/releases/_posts/2021-03-02-spark-release-3-1-1.md
index 2126301..59010ea 100644
--- a/releases/_posts/2021-03-02-spark-release-3-1-1.md
+++ b/releases/_posts/2021-03-02-spark-release-3-1-1.md
@@ -222,7 +222,7 @@ Please read the migration guides for each component: [Spark Core](https://spark.
 - Support getCheckpointDir method in PySpark SparkContext ([SPARK-33017](https://issues.apache.org/jira/browse/SPARK-33017))
 - Support to fill nulls for missing columns in unionByName ([SPARK-32798](https://issues.apache.org/jira/browse/SPARK-32798))
 - Update cloudpickle to v1.5.0 ([SPARK-32094](https://issues.apache.org/jira/browse/SPARK-32094))
-- Add MapType support for PySpark with Arrow ([SPARK-24554](https://issues.apache.org/jira/browse/SPARK-33748))
+- Add MapType support for PySpark with Arrow ([SPARK-24554](https://issues.apache.org/jira/browse/SPARK-24554))
 - DataStreamReader.table and DataStreamWriter.toTable ([SPARK-33836](https://issues.apache.org/jira/browse/SPARK-33836))
 
 **Changes of behavior**
diff --git a/site/releases/spark-release-3-1-1.html b/site/releases/spark-release-3-1-1.html
index 8fafba3..b0c9786 100644
--- a/site/releases/spark-release-3-1-1.html
+++ b/site/releases/spark-release-3-1-1.html
@@ -424,7 +424,7 @@
   <li>Support getCheckpointDir method in PySpark SparkContext (<a href="https://issues.apache.org/jira/browse/SPARK-33017">SPARK-33017</a>)</li>
   <li>Support to fill nulls for missing columns in unionByName (<a href="https://issues.apache.org/jira/browse/SPARK-32798">SPARK-32798</a>)</li>
   <li>Update cloudpickle to v1.5.0 (<a href="https://issues.apache.org/jira/browse/SPARK-32094">SPARK-32094</a>)</li>
-  <li>Add MapType support for PySpark with Arrow (<a href="https://issues.apache.org/jira/browse/SPARK-33748">SPARK-24554</a>)</li>
+  <li>Add MapType support for PySpark with Arrow (<a href="https://issues.apache.org/jira/browse/SPARK-24554">SPARK-24554</a>)</li>
   <li>DataStreamReader.table and DataStreamWriter.toTable (<a href="https://issues.apache.org/jira/browse/SPARK-33836">SPARK-33836</a>)</li>
 
 

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated: [SPARK-37501][SQL] CREATE/REPLACE TABLE should qualify location for v2 command

2021-12-01 Thread wenchen
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 38115cb  [SPARK-37501][SQL] CREATE/REPLACE TABLE should qualify location for v2 command
38115cb is described below

commit 38115cb907ec93151382260cda327330e78ca340
Author: PengLei 
AuthorDate: Wed Dec 1 22:04:35 2021 +0800

[SPARK-37501][SQL] CREATE/REPLACE TABLE should qualify location for v2 command

### What changes were proposed in this pull request?
1. Rename the method `makeQualifiedNamespacePath` to `makeQualifiedDBObjectPath` in `CatalogUtils`, so that it applies not only to databases/namespaces but also to tables.
2. Overload `makeQualifiedDBObjectPath` to accept more parameter types.
3. In `CreateTableExec`, handle conversion of the `location` table property.
4. Also handle the `REPLACE TABLE` command (see the SQL sketch after this list).
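
For illustration, a minimal SQL sketch of the user-visible effect (the catalog name `testcat`, namespace, and paths here are hypothetical; the resulting URI scheme depends on the configured filesystem):

```sql
-- With this change, v2 CREATE/REPLACE TABLE qualifies a relative location against the
-- warehouse/Hadoop configuration, e.g. '/tmp/ns_test' is stored as 'file:/tmp/ns_test'
-- on a local filesystem, matching v1 behavior.
CREATE TABLE testcat.ns.tbl (id INT) USING parquet LOCATION '/tmp/ns_test';
REPLACE TABLE testcat.ns.tbl (id INT) USING parquet LOCATION '/tmp/ns_test2';
DESCRIBE TABLE EXTENDED testcat.ns.tbl;  -- Location shows the fully qualified URI
```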

### Why are the changes needed?
Keep v1 and v2 behavior consistent; see the discussion at [#comments](https://github.com/apache/spark/pull/34719#discussion_r758156938).

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Existing test cases.

Closes #34758 from Peng-Lei/qualify-location.

Authored-by: PengLei 
Signed-off-by: Wenchen Fan 
---
 .../sql/catalyst/catalog/ExternalCatalogUtils.scala  | 10 +-
 .../spark/sql/catalyst/catalog/SessionCatalog.scala  |  2 +-
 .../datasources/v2/DataSourceV2Strategy.scala| 20 +++-
 .../spark/sql/connector/DataSourceV2SQLSuite.scala   | 13 +++--
 4 files changed, 28 insertions(+), 17 deletions(-)

diff --git 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/ExternalCatalogUtils.scala
 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/ExternalCatalogUtils.scala
index 4b0e676..67c57ec 100644
--- 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/ExternalCatalogUtils.scala
+++ 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/ExternalCatalogUtils.scala
@@ -259,7 +259,7 @@ object CatalogUtils {
 new Path(str).toUri
   }
 
-  def makeQualifiedNamespacePath(
+  def makeQualifiedDBObjectPath(
   locationUri: URI,
   warehousePath: String,
   hadoopConf: Configuration): URI = {
@@ -271,6 +271,14 @@ object CatalogUtils {
 }
   }
 
+  def makeQualifiedDBObjectPath(
+  warehouse: String,
+  location: String,
+  hadoopConf: Configuration): String = {
+val nsPath = makeQualifiedDBObjectPath(stringToURI(location), warehouse, 
hadoopConf)
+URIToString(nsPath)
+  }
+
   def makeQualifiedPath(path: URI, hadoopConf: Configuration): URI = {
 val hadoopPath = new Path(path)
 val fs = hadoopPath.getFileSystem(hadoopConf)
diff --git 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala
 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala
index 610a683..60f68fb 100644
--- 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala
+++ 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala
@@ -252,7 +252,7 @@ class SessionCatalog(
   }
 
   private def makeQualifiedDBPath(locationUri: URI): URI = {
-CatalogUtils.makeQualifiedNamespacePath(locationUri, conf.warehousePath, 
hadoopConf)
+CatalogUtils.makeQualifiedDBObjectPath(locationUri, conf.warehousePath, 
hadoopConf)
   }
 
   def dropDatabase(db: String, ignoreIfNotExists: Boolean, cascade: Boolean): 
Unit = {
diff --git 
a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Strategy.scala
 
b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Strategy.scala
index f64c1ee..fc44f70 100644
--- 
a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Strategy.scala
+++ 
b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Strategy.scala
@@ -94,11 +94,9 @@ class DataSourceV2Strategy(session: SparkSession) extends 
Strategy with Predicat
 session.sharedState.cacheManager.uncacheQuery(session, v2Relation, cascade 
= true)
   }
 
-  private def makeQualifiedNamespacePath(location: String): String = {
-val warehousePath = session.sharedState.conf.get(WAREHOUSE_PATH)
-val nsPath = CatalogUtils.makeQualifiedNamespacePath(
-  CatalogUtils.stringToURI(location), warehousePath, 
session.sharedState.hadoopConf)
-CatalogUtils.URIToString(nsPath)
+  private def makeQualifiedDBObjectPath(location: String): String = {
+
CatalogUtils.makeQualifiedDBObjectPath(session.sharedState.conf.get(WAREHOUSE_PATH),
+  location, session.sharedState.hadoopConf)
   }
 
   override def apply(plan: LogicalPlan): Seq[SparkPlan] = plan match {
@@ -167,8 +165,9 @@ class DataSourceV

[spark] branch branch-3.2 updated: [SPARK-37460][DOCS] Add the description of ALTER DATABASE SET LOCATION

2021-12-01 Thread wenchen
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch branch-3.2
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.2 by this push:
 new 46fc98c  [SPARK-37460][DOCS] Add the description of ALTER DATABASE SET LOCATION
46fc98c is described below

commit 46fc98cbac1f06ce7fb11d042a041ea0f17e8843
Author: Yuto Akutsu 
AuthorDate: Wed Dec 1 20:13:55 2021 +0800

[SPARK-37460][DOCS] Add the description of ALTER DATABASE SET LOCATION

### What changes were proposed in this pull request?

Added a description of the `ALTER DATABASE SET LOCATION` command to `sql-ref-syntax-ddl-alter-database.md`.

### Why are the changes needed?

This command is available but was not documented anywhere.
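
For quick reference, the newly documented command (example adapted from the doc diff below; the path is illustrative):

```sql
-- Change the default parent directory for new tables in the database.
ALTER DATABASE inventory SET LOCATION 'file:/temp/spark-warehouse/new_inventory.db';

-- Verify the new location.
DESCRIBE DATABASE EXTENDED inventory;
```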

### Does this PR introduce _any_ user-facing change?

Yes, docs changes.

### How was this patch tested?

`SKIP_API=1 bundle exec jekyll build`
https://user-images.githubusercontent.com/87687356/143523751-4cc32ad1-390c-491b-a6ea-bf1664535c28.png
https://user-images.githubusercontent.com/87687356/143523757-aa741d74-0e51-4e17-8768-9da7eb86a7d8.png

Closes #34718 from yutoacts/SPARK-37460.

Authored-by: Yuto Akutsu 
Signed-off-by: Wenchen Fan 
(cherry picked from commit 5924961e4ac308d3910b8430c1e72dea275073a9)
Signed-off-by: Wenchen Fan 
---
 docs/sql-ref-syntax-ddl-alter-database.md | 54 +--
 1 file changed, 45 insertions(+), 9 deletions(-)

diff --git a/docs/sql-ref-syntax-ddl-alter-database.md 
b/docs/sql-ref-syntax-ddl-alter-database.md
index fbc454e..0ac0038 100644
--- a/docs/sql-ref-syntax-ddl-alter-database.md
+++ b/docs/sql-ref-syntax-ddl-alter-database.md
@@ -21,25 +21,47 @@ license: |
 
 ### Description
 
-You can alter metadata associated with a database by setting `DBPROPERTIES`.  
The specified property
-values override any existing value with the same property name. Please note 
that the usage of 
-`SCHEMA` and `DATABASE` are interchangeable and one can be used in place of 
the other. An error message
-is issued if the database is not found in the system. This command is mostly 
used to record the metadata
-for a database and may be used for auditing purposes.
+`ALTER DATABASE` statement changes the properties or location of a database. 
Please note that the usage of
+`DATABASE`, `SCHEMA` and `NAMESPACE` are interchangeable and one can be used 
in place of the others. An error message
+is issued if the database is not found in the system.
 
-### Syntax
+### ALTER PROPERTIES
+`ALTER DATABASE SET DBPROPERTIES` statement changes the properties associated 
with a database.
+The specified property values override any existing value with the same 
property name. 
+This command is mostly used to record the metadata for a database and may be 
used for auditing purposes.
+
+#### Syntax
 
 ```sql
-ALTER { DATABASE | SCHEMA } database_name
-SET DBPROPERTIES ( property_name = property_value [ , ... ] )
+ALTER { DATABASE | SCHEMA | NAMESPACE } database_name
+SET { DBPROPERTIES | PROPERTIES } ( property_name = property_value [ , ... 
] )
 ```
 
-### Parameters
+#### Parameters
 
 * **database_name**
 
 Specifies the name of the database to be altered.
 
+### ALTER LOCATION
+`ALTER DATABASE SET LOCATION` statement changes the default parent-directory 
where new tables will be added 
+for a database. Please note that it does not move the contents of the 
database's current directory to the newly 
+specified location or change the locations associated with any 
tables/partitions under the specified database 
+(available since Spark 3.0.0 with the Hive metastore version 3.0.0 and later).
+
+#### Syntax
+
+```sql
+ALTER { DATABASE | SCHEMA | NAMESPACE } database_name
+SET LOCATION 'new_location'
+```
+
+#### Parameters
+
+* **database_name**
+
+  Specifies the name of the database to be altered.
+
 ### Examples
 
 ```sql
@@ -59,6 +81,20 @@ DESCRIBE DATABASE EXTENDED inventory;
 | Location|   file:/temp/spark-warehouse/inventory.db|
 |   Properties|((Edit-date,01/01/2001), (Edited-by,John))|
 +-+--+
+
+-- Alters the database to set a new location.
+ALTER DATABASE inventory SET LOCATION 'file:/temp/spark-warehouse/new_inventory.db';
+
+-- Verify that a new location is set.
+DESCRIBE DATABASE EXTENDED inventory;
++-+---+
+|database_description_item| database_description_value|
++-+---+
+|Database Name|  inventory|
+|  Description|   |
+| Location|file:/temp/spark-warehouse/new_inventory.db|
+|   Properties| ((Edit-date,01/01/

[spark] branch master updated (710120a -> 5924961)

2021-12-01 Thread wenchen
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 710120a  [SPARK-37508][SQL] Add CONTAINS() string function
 add 5924961  [SPARK-37460][DOCS] Add the description of ALTER DATABASE SET LOCATION

No new revisions were added by this update.

Summary of changes:
 docs/sql-ref-syntax-ddl-alter-database.md | 54 +--
 1 file changed, 45 insertions(+), 9 deletions(-)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[GitHub] [spark-website] HyukjinKwon opened a new pull request #370: Fix the link of SPARK-24554 in Spark 3.1.1 release note

2021-12-01 Thread GitBox


HyukjinKwon opened a new pull request #370:
URL: https://github.com/apache/spark-website/pull/370


   SPARK-24554 should link to https://issues.apache.org/jira/browse/SPARK-24554, but it currently links to https://issues.apache.org/jira/browse/SPARK-33748. This PR fixes the link.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated: [SPARK-37508][SQL] Add CONTAINS() string function

2021-12-01 Thread maxgekk
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 710120a  [SPARK-37508][SQL] Add CONTAINS() string function
710120a is described below

commit 710120a499d6082bcec6b65ad1f8dbe4789f4bd9
Author: Angerszh 
AuthorDate: Wed Dec 1 12:57:22 2021 +0300

[SPARK-37508][SQL] Add CONTAINS() string function

### What changes were proposed in this pull request?
Add `CONTAINS` string function.

| function| arguments | Returns |
|---|---|---|
| CONTAINS(left, right) | left: String, right: String | Returns a BOOLEAN. The value is True if right is found inside left. Returns NULL if either input expression is NULL. Otherwise, returns False. |

### Why are the changes needed?
`contains()` is a common convenience function supported by a number of database systems:

- https://cloud.google.com/bigquery/docs/reference/standard-sql/functions-and-operators#contains_substr
- https://docs.snowflake.com/en/sql-reference/functions/contains.html

Support of the function can make the migration from other systems to Spark 
SQL easier.

### Does this PR introduce _any_ user-facing change?
Users can use `contains(left, right)`:

| Left   |  Right  |  Return |
|--|:-:|--:|
| null |  "Spark SQL" | null |
| "Spark SQL" |  null | null |
| null |  null | null |
| "Spark SQL" |  "Spark" | true |
| "Spark SQL" |  "k SQL" | true |
| "Spark SQL" | "SPARK" | false |

### How was this patch tested?
Added UT

Closes #34761 from AngersZh/SPARK-37508.

Authored-by: Angerszh 
Signed-off-by: Max Gekk 
---
 .../sql/catalyst/analysis/FunctionRegistry.scala   |  1 +
 .../catalyst/expressions/stringExpressions.scala   | 17 
 .../expressions/StringExpressionsSuite.scala   |  9 
 .../sql-functions/sql-expression-schema.md |  3 +-
 .../sql-tests/inputs/string-functions.sql  | 10 -
 .../results/ansi/string-functions.sql.out  | 50 +-
 .../sql-tests/results/string-functions.sql.out | 50 +-
 7 files changed, 136 insertions(+), 4 deletions(-)

diff --git 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala
 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala
index 0668460..b2788f8 100644
--- 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala
+++ 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala
@@ -455,6 +455,7 @@ object FunctionRegistry {
 expression[Ascii]("ascii"),
 expression[Chr]("char", true),
 expression[Chr]("chr"),
+expression[Contains]("contains"),
 expression[Base64]("base64"),
 expression[BitLength]("bit_length"),
 expression[Length]("char_length", true),
diff --git 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala
 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala
index 2b997da..959c834 100755
--- 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala
+++ 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala
@@ -465,6 +465,23 @@ abstract class StringPredicate extends BinaryExpression
 /**
  * A function that returns true if the string `left` contains the string 
`right`.
  */
+@ExpressionDescription(
+  usage = """
+_FUNC_(expr1, expr2) - Returns a boolean value if expr2 is found inside 
expr1.
+Returns NULL if either input expression is NULL.
+  """,
+  examples = """
+Examples:
+  > SELECT _FUNC_('Spark SQL', 'Spark');
+   true
+  > SELECT _FUNC_('Spark SQL', 'SPARK');
+   false
+  > SELECT _FUNC_('Spark SQL', null);
+   NULL
+  """,
+  since = "3.3.0",
+  group = "string_funcs"
+)
 case class Contains(left: Expression, right: Expression) extends 
StringPredicate {
   override def compare(l: UTF8String, r: UTF8String): Boolean = l.contains(r)
   override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
diff --git 
a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/StringExpressionsSuite.scala
 
b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/StringExpressionsSuite.scala
index 823ce77..443a94b 100644
--- 
a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/StringExpressionsSuite.scala
+++ 
b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/StringExpressionsSuite.scala
@@ -1019,4 +1019,13 @@ class StringExpressionsSuite extends SparkFunSuite with 
ExpressionEvalHelper {
 

[spark] branch master updated (eaa1358 -> 654cd97)

2021-12-01 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from eaa1358  [SPARK-37389][SQL][FOLLOWUP] SET command should not parse comments
 add 654cd97  [SPARK-37511][PYTHON] Introduce TimedeltaIndex to pandas API on Spark

No new revisions were added by this update.

Summary of changes:
 .../source/reference/pyspark.pandas/indexing.rst   |   7 ++
 python/pyspark/pandas/__init__.py  |   2 +
 python/pyspark/pandas/data_type_ops/base.py|   4 +
 .../data_type_ops/{udt_ops.py => timedelta_ops.py} |   7 +-
 python/pyspark/pandas/indexes/__init__.py  |   1 +
 python/pyspark/pandas/indexes/base.py  |  14 ++-
 python/pyspark/pandas/indexes/timedelta.py | 100 +
 python/pyspark/pandas/internal.py  |  11 +++
 python/pyspark/pandas/missing/indexes.py   |  18 
 python/pyspark/pandas/tests/indexes/test_base.py   |  41 -
 python/pyspark/pandas/typedef/typehints.py |   6 ++
 11 files changed, 205 insertions(+), 6 deletions(-)
 copy python/pyspark/pandas/data_type_ops/{udt_ops.py => timedelta_ops.py} (89%)
 create mode 100644 python/pyspark/pandas/indexes/timedelta.py

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch branch-3.1 updated: [SPARK-37389][SQL][FOLLOWUP] SET command should not parse comments

2021-12-01 Thread wenchen
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch branch-3.1
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.1 by this push:
 new 17c36a0  [SPARK-37389][SQL][FOLLOWUP] SET command should not parse comments
17c36a0 is described below

commit 17c36a04c77a26194b5a5909dd540059af7be876
Author: Wenchen Fan 
AuthorDate: Wed Dec 1 16:33:52 2021 +0800

[SPARK-37389][SQL][FOLLOWUP] SET command should not parse comments

This PR is a follow-up of https://github.com/apache/spark/pull/34668 to fix a breaking change. The SET command accepts a wildcard value which may contain an unclosed comment, e.g. `/path/to/*`, and we shouldn't fail on it.

This PR fixes it by skipping the unclosed-comment check when parsing a SET command.

fix a breaking change

no, the breaking change is not released yet.

new tests

Closes #34763 from cloud-fan/set.

Authored-by: Wenchen Fan 
Signed-off-by: Wenchen Fan 
(cherry picked from commit eaa135870a30fb89c2f1087991328a6f72a1860c)
Signed-off-by: Wenchen Fan 
---
 .../apache/spark/sql/catalyst/parser/SqlBase.g4|  2 +-
 .../spark/sql/catalyst/parser/ParseDriver.scala|  6 ++-
 .../spark/sql/execution/SparkSqlParser.scala   | 57 --
 .../spark/sql/execution/SparkSqlParserSuite.scala  |  9 
 4 files changed, 45 insertions(+), 29 deletions(-)

diff --git 
a/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4 
b/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4
index 7463ce2..f7c0f0e 100644
--- 
a/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4
+++ 
b/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4
@@ -251,7 +251,7 @@ statement
 | SET TIME ZONE timezone=(STRING | LOCAL)  
#setTimeZone
 | SET TIME ZONE .*?
#setTimeZone
 | SET configKey EQ configValue 
#setQuotedConfiguration
-| SET configKey (EQ .*?)?  
#setQuotedConfiguration
+| SET configKey (EQ .*?)?  
#setConfiguration
 | SET .*? EQ configValue   
#setQuotedConfiguration
 | SET .*?  
#setConfiguration
 | RESET configKey  
#resetQuotedConfiguration
diff --git 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/ParseDriver.scala
 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/ParseDriver.scala
index dc3c0cd..444268a 100644
--- 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/ParseDriver.scala
+++ 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/ParseDriver.scala
@@ -338,7 +338,11 @@ case class UnclosedCommentProcessor(
   }
 
   override def exitSingleStatement(ctx: SqlBaseParser.SingleStatementContext): 
Unit = {
-checkUnclosedComment(tokenStream, command)
+// SET command uses a wildcard to match anything, and we shouldn't parse 
the comments, e.g.
+// `SET myPath =/a/*`.
+if (!ctx.statement().isInstanceOf[SqlBaseParser.SetConfigurationContext]) {
+  checkUnclosedComment(tokenStream, command)
+}
   }
 
   /** check `has_unclosed_bracketed_comment` to find out the unclosed 
bracketed comment. */
diff --git 
a/sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala 
b/sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala
index 72dd007..b67c603 100644
--- 
a/sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala
+++ 
b/sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala
@@ -68,43 +68,46 @@ class SparkSqlAstBuilder extends AstBuilder {
* character in the raw string.
*/
   override def visitSetConfiguration(ctx: SetConfigurationContext): 
LogicalPlan = withOrigin(ctx) {
-remainder(ctx.SET.getSymbol).trim match {
-  case configKeyValueDef(key, value) =>
-SetCommand(Some(key -> Option(value.trim)))
-  case configKeyDef(key) =>
-SetCommand(Some(key -> None))
-  case s if s == "-v" =>
-SetCommand(Some("-v" -> None))
-  case s if s.isEmpty =>
-SetCommand(None)
-  case _ => throw new ParseException("Expected format is 'SET', 'SET key', 
or " +
-"'SET key=value'. If you want to include special characters in key, or 
include semicolon " +
-"in value, please use quotes, e.g., SET `ke y`=`v;alue`.", ctx)
+if (ctx.configKey() != null) {
+  val keyStr = ctx.configKey().getText
+  if (ctx.EQ() != null) {
+remainder(ctx.EQ().getSymbol).trim match {
+  case configValueDef(valueStr) => SetC

[spark] branch branch-3.2 updated: [SPARK-37389][SQL][FOLLOWUP] SET command should not parse comments

2021-12-01 Thread wenchen
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch branch-3.2
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.2 by this push:
 new ad5ac3a  [SPARK-37389][SQL][FOLLOWUP] SET command should not parse comments
ad5ac3a is described below

commit ad5ac3a22337b04fdd3413d148873f0d1077a0ca
Author: Wenchen Fan 
AuthorDate: Wed Dec 1 16:33:52 2021 +0800

[SPARK-37389][SQL][FOLLOWUP] SET command should not parse comments

### What changes were proposed in this pull request?

This PR is a follow-up of https://github.com/apache/spark/pull/34668 to fix a breaking change. The SET command accepts a wildcard value which may contain an unclosed comment, e.g. `/path/to/*`, and we shouldn't fail on it.

This PR fixes it by skipping the unclosed-comment check when parsing a SET command.
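
A minimal sketch of the kind of statement this fix keeps working (the config key is hypothetical, modeled on the `SET myPath =/a/*` example in the parser comment in the diff below):

```sql
-- The value contains '/*', which the unclosed-comment check previously rejected.
SET myPath=/path/to/*;

-- Reading the value back is unaffected.
SET myPath;
```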

### Why are the changes needed?

fix a breaking change

### Does this PR introduce _any_ user-facing change?

no, the breaking change is not released yet.

### How was this patch tested?

new tests

Closes #34763 from cloud-fan/set.

Authored-by: Wenchen Fan 
Signed-off-by: Wenchen Fan 
(cherry picked from commit eaa135870a30fb89c2f1087991328a6f72a1860c)
Signed-off-by: Wenchen Fan 
---
 .../apache/spark/sql/catalyst/parser/SqlBase.g4|  2 +-
 .../spark/sql/catalyst/parser/ParseDriver.scala|  6 ++-
 .../spark/sql/errors/QueryParsingErrors.scala  |  6 +--
 .../spark/sql/execution/SparkSqlParser.scala   | 49 --
 .../spark/sql/execution/SparkSqlParserSuite.scala  |  9 
 5 files changed, 44 insertions(+), 28 deletions(-)

diff --git 
a/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4 
b/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4
index 1aa89ac..319536c 100644
--- 
a/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4
+++ 
b/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4
@@ -254,7 +254,7 @@ statement
 | SET TIME ZONE timezone=(STRING | LOCAL)  
#setTimeZone
 | SET TIME ZONE .*?
#setTimeZone
 | SET configKey EQ configValue 
#setQuotedConfiguration
-| SET configKey (EQ .*?)?  
#setQuotedConfiguration
+| SET configKey (EQ .*?)?  
#setConfiguration
 | SET .*? EQ configValue   
#setQuotedConfiguration
 | SET .*?  
#setConfiguration
 | RESET configKey  
#resetQuotedConfiguration
diff --git 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/ParseDriver.scala
 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/ParseDriver.scala
index 7dcf41b..c6633b1 100644
--- 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/ParseDriver.scala
+++ 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/ParseDriver.scala
@@ -347,7 +347,11 @@ case class UnclosedCommentProcessor(
   }
 
   override def exitSingleStatement(ctx: SqlBaseParser.SingleStatementContext): 
Unit = {
-checkUnclosedComment(tokenStream, command)
+// SET command uses a wildcard to match anything, and we shouldn't parse 
the comments, e.g.
+// `SET myPath =/a/*`.
+if (!ctx.statement().isInstanceOf[SqlBaseParser.SetConfigurationContext]) {
+  checkUnclosedComment(tokenStream, command)
+}
   }
 
   /** check `has_unclosed_bracketed_comment` to find out the unclosed 
bracketed comment. */
diff --git 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala
 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala
index 86fd41f..b6a8163 100644
--- 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala
+++ 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala
@@ -328,7 +328,7 @@ object QueryParsingErrors {
 new ParseException(errorClass = "DUPLICATE_KEY", messageParameters = 
Array(key), ctx)
   }
 
-  def unexpectedFomatForSetConfigurationError(ctx: SetConfigurationContext): 
Throwable = {
+  def unexpectedFomatForSetConfigurationError(ctx: ParserRuleContext): 
Throwable = {
 new ParseException(
   s"""
  |Expected format is 'SET', 'SET key', or 'SET key=value'. If you want 
to include
@@ -338,13 +338,13 @@ object QueryParsingErrors {
   }
 
   def invalidPropertyKeyForSetQuotedConfigurationError(
-  keyCandidate: String, valueStr: String, ctx: 
SetQuotedConfigurationContext): Throwable = {
+  keyCandidate: String, valueStr: String, ctx: ParserRuleContext): 

[spark] branch master updated (ec47c3c -> eaa1358)

2021-12-01 Thread wenchen
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from ec47c3c  [SPARK-37513][SQL][DOC] date +/- interval with only day-time fields returns different data type between Spark3.2 and Spark3.1
 add eaa1358  [SPARK-37389][SQL][FOLLOWUP] SET command should not parse comments

No new revisions were added by this update.

Summary of changes:
 .../apache/spark/sql/catalyst/parser/SqlBase.g4|  2 +-
 .../spark/sql/catalyst/parser/ParseDriver.scala|  6 ++-
 .../spark/sql/errors/QueryParsingErrors.scala  |  6 +--
 .../spark/sql/execution/SparkSqlParser.scala   | 49 --
 .../spark/sql/execution/SparkSqlParserSuite.scala  |  9 
 5 files changed, 44 insertions(+), 28 deletions(-)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch branch-3.2 updated: [SPARK-37513][SQL][DOC] date +/- interval with only day-time fields returns different data type between Spark3.2 and Spark3.1

2021-12-01 Thread wenchen
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch branch-3.2
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.2 by this push:
 new c836e93  [SPARK-37513][SQL][DOC] date +/- interval with only day-time fields returns different data type between Spark3.2 and Spark3.1
c836e93 is described below

commit c836e93306d1816a0232b69ef83d86c2782688e3
Author: Jiaan Geng 
AuthorDate: Wed Dec 1 16:19:50 2021 +0800

[SPARK-37513][SQL][DOC] date +/- interval with only day-time fields returns different data type between Spark3.2 and Spark3.1

### What changes were proposed in this pull request?
The SQL shown below previously returned the date type; now it returns the timestamp type.
`select date '2011-11-11' + interval 12 hours;`
`select date '2011-11-11' - interval 12 hours;`

The basic reason is:
In Spark3.1

https://github.com/apache/spark/blob/75cac1fe0a46dbdf2ad5b741a3a49c9ab618cdce/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala#L338
In Spark3.2

https://github.com/apache/spark/blob/ceae41ba5cafb479cdcfc9a6a162945646a68f05/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala#L376

Because Spark 3.2 has already been released, we add a migration guide for it.

### Why are the changes needed?
Provide a migration guide for the behavior difference between Spark 3.1 and Spark 3.2.

### Does this PR introduce _any_ user-facing change?
'No'.
Just modify the docs.

### How was this patch tested?
No need.

Closes #34766 from beliefer/SPARK-37513.

Authored-by: Jiaan Geng 
Signed-off-by: Wenchen Fan 
(cherry picked from commit ec47c3c4394b2410a277e7f7105cf896c28b2ed4)
Signed-off-by: Wenchen Fan 
---
 docs/sql-migration-guide.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/docs/sql-migration-guide.md b/docs/sql-migration-guide.md
index 9f51c75..6fcc059 100644
--- a/docs/sql-migration-guide.md
+++ b/docs/sql-migration-guide.md
@@ -103,6 +103,8 @@ license: |
 
   - In Spark 3.2, create/alter view will fail if the input query output 
columns contain auto-generated alias. This is necessary to make sure the query 
output column names are stable across different spark versions. To restore the 
behavior before Spark 3.2, set 
`spark.sql.legacy.allowAutoGeneratedAliasForView` to `true`.
 
+  - In Spark 3.2, date +/- interval with only day-time fields such as `date 
'2011-11-11' + interval 12 hours` returns timestamp. In Spark 3.1 and earlier, 
the same expression returns date. To restore the behavior before Spark 3.2, you 
can use `cast` to convert timestamp as date.
+
 ## Upgrading from Spark SQL 3.0 to 3.1
 
   - In Spark 3.1, statistical aggregation function includes `std`, `stddev`, 
`stddev_samp`, `variance`, `var_samp`, `skewness`, `kurtosis`, `covar_samp`, 
`corr` will return `NULL` instead of `Double.NaN` when `DivideByZero` occurs 
during expression evaluation, for example, when `stddev_samp` applied on a 
single element set. In Spark version 3.0 and earlier, it will return 
`Double.NaN` in such case. To restore the behavior before Spark 3.1, you can 
set `spark.sql.legacy.statisticalAggrega [...]

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated: [SPARK-37513][SQL][DOC] date +/- interval with only day-time fields returns different data type between Spark3.2 and Spark3.1

2021-12-01 Thread wenchen
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new ec47c3c  [SPARK-37513][SQL][DOC] date +/- interval with only day-time fields returns different data type between Spark3.2 and Spark3.1
ec47c3c is described below

commit ec47c3c4394b2410a277e7f7105cf896c28b2ed4
Author: Jiaan Geng 
AuthorDate: Wed Dec 1 16:19:50 2021 +0800

[SPARK-37513][SQL][DOC] date +/- interval with only day-time fields returns different data type between Spark3.2 and Spark3.1

### What changes were proposed in this pull request?
The SQL shown below previously returned the date type; now it returns the timestamp type.
`select date '2011-11-11' + interval 12 hours;`
`select date '2011-11-11' - interval 12 hours;`
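
Concretely, a sketch of the behavior difference and the workaround suggested by the migration note:

```sql
-- Spark 3.1 and earlier: the expression returns a DATE.
-- Spark 3.2: the expression returns a TIMESTAMP.
SELECT date '2011-11-11' + interval 12 hours;

-- To restore the pre-3.2 behavior, cast the result back to date.
SELECT CAST(date '2011-11-11' + interval 12 hours AS DATE);
```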

The basic reason is:
In Spark3.1

https://github.com/apache/spark/blob/75cac1fe0a46dbdf2ad5b741a3a49c9ab618cdce/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala#L338
In Spark3.2

https://github.com/apache/spark/blob/ceae41ba5cafb479cdcfc9a6a162945646a68f05/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala#L376

Because Spark 3.2 has already been released, we add a migration guide for it.

### Why are the changes needed?
Provide a migration guide for the behavior difference between Spark 3.1 and Spark 3.2.

### Does this PR introduce _any_ user-facing change?
'No'.
Just modify the docs.

### How was this patch tested?
No need.

Closes #34766 from beliefer/SPARK-37513.

Authored-by: Jiaan Geng 
Signed-off-by: Wenchen Fan 
---
 docs/sql-migration-guide.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/docs/sql-migration-guide.md b/docs/sql-migration-guide.md
index 12d9cd4..c15f55d 100644
--- a/docs/sql-migration-guide.md
+++ b/docs/sql-migration-guide.md
@@ -133,6 +133,8 @@ license: |
 
   - In Spark 3.2, create/alter view will fail if the input query output 
columns contain auto-generated alias. This is necessary to make sure the query 
output column names are stable across different spark versions. To restore the 
behavior before Spark 3.2, set 
`spark.sql.legacy.allowAutoGeneratedAliasForView` to `true`.
 
+  - In Spark 3.2, date +/- interval with only day-time fields such as `date 
'2011-11-11' + interval 12 hours` returns timestamp. In Spark 3.1 and earlier, 
the same expression returns date. To restore the behavior before Spark 3.2, you 
can use `cast` to convert timestamp as date.
+
 ## Upgrading from Spark SQL 3.0 to 3.1
 
   - In Spark 3.1, statistical aggregation function includes `std`, `stddev`, 
`stddev_samp`, `variance`, `var_samp`, `skewness`, `kurtosis`, `covar_samp`, 
`corr` will return `NULL` instead of `Double.NaN` when `DivideByZero` occurs 
during expression evaluation, for example, when `stddev_samp` applied on a 
single element set. In Spark version 3.0 and earlier, it will return 
`Double.NaN` in such case. To restore the behavior before Spark 3.1, you can 
set `spark.sql.legacy.statisticalAggrega [...]

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated (ce1f97f -> 3d9c588)

2021-12-01 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from ce1f97f  [SPARK-37326][SQL] Support TimestampNTZ in CSV data source
 add 3d9c588  [SPARK-37458][SS] Remove unnecessary SerializeFromObject from the plan of foreachBatch

No new revisions were added by this update.

Summary of changes:
 .../streaming/sources/ForeachBatchSink.scala   | 10 +--
 .../streaming/sources/ForeachBatchSinkSuite.scala  | 76 ++
 2 files changed, 79 insertions(+), 7 deletions(-)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org