[jira] [Created] (SPARK-29939) Add a conf for CompressionCodec for Ser/Deser of MapOutputStatus

2019-11-17 Thread Xiao Li (Jira)
Xiao Li created SPARK-29939:
---

 Summary: Add a conf for CompressionCodec for Ser/Deser of 
MapOutputStatus
 Key: SPARK-29939
 URL: https://issues.apache.org/jira/browse/SPARK-29939
 Project: Spark
  Issue Type: Improvement
  Components: Spark Core
Affects Versions: 3.0.0
Reporter: Xiao Li
Assignee: wuyi


All the other compression codecs have a conf. Could we add one for this too? See 
the examples:

https://github.com/apache/spark/blob/1b575ef5d1b8e3e672b2fca5c354d6678bd78bd1/core/src/main/scala/org/apache/spark/serializer/SerializerManager.scala#L67-L73
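
For illustration, a minimal sketch of what such a conf could look like, following 
the pattern of the SerializerManager example linked above (the conf name and 
default below are assumptions, not a final choice):

{code}
import org.apache.spark.internal.config.ConfigBuilder

// Hypothetical entry in the org.apache.spark.internal.config package object,
// mirroring the other compression-codec confs:
private[spark] val MAP_STATUS_COMPRESSION_CODEC =
  ConfigBuilder("spark.shuffle.mapStatus.compression.codec")
    .doc("The codec used to compress serialized MapOutputStatus. By default, " +
      "Spark provides four codecs: lz4, lzf, snappy, and zstd.")
    .stringConf
    .createWithDefault("zstd")
{code}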






[jira] [Created] (SPARK-29883) Improve error messages when function name is an alias

2019-11-13 Thread Xiao Li (Jira)
Xiao Li created SPARK-29883:
---

 Summary: Improve error messages when function name is an alias
 Key: SPARK-29883
 URL: https://issues.apache.org/jira/browse/SPARK-29883
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 3.0.0
Reporter: Xiao Li


 

There is a general issue in error messages when the function name is just an 
alias of the actual built-in function. For example, every is an alias of 
bool_and in Spark 3.0:
{code:java}
cannot resolve 'every('true')' due to data type mismatch: Input to function 
'every' should have been boolean, but it's [string].; line 1 pos 7 
{code}
{code:java}
cannot resolve 'bool_and('true')' due to data type mismatch: Input to function 
'bool_and' should have been boolean, but it's [string].; line 1 pos 7{code}
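
To reproduce, a minimal sketch, assuming a Spark 3.0 session named {{spark}}:

{code}
// Both calls resolve to the same built-in aggregate; the error message
// reports whichever name was used in the query.
spark.sql("SELECT every('true')").show()     // error mentions 'every'
spark.sql("SELECT bool_and('true')").show()  // error mentions 'bool_and'
{code}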






[jira] [Updated] (SPARK-29396) Extend Spark plugin interface to driver

2019-11-10 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-29396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li updated SPARK-29396:

Labels: release-notes  (was: )

> Extend Spark plugin interface to driver
> ---
>
> Key: SPARK-29396
> URL: https://issues.apache.org/jira/browse/SPARK-29396
> Project: Spark
>  Issue Type: New Feature
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Marcelo Masiero Vanzin
>Priority: Major
>  Labels: release-notes
>
> Spark provides an extension API for people to implement executor plugins, 
> added in SPARK-24918 and later extended in SPARK-28091.
> That API does not offer any functionality for doing similar things on the 
> driver side, though. As a consequence of that, there is not a good way for 
> the executor plugins to get information or communicate in any way with the 
> Spark driver.
> I've been playing with such an improved API for developing some new 
> functionality. I'll file a few child bugs for the work to get the changes in.
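
For illustration, a sketch of what a combined driver/executor plugin could look 
like. The names below follow the {{org.apache.spark.api.plugin}} API that this 
work eventually introduced in Spark 3.0; treat the details as illustrative:

{code}
import java.util.{Collections, Map => JMap}

import org.apache.spark.SparkContext
import org.apache.spark.api.plugin.{DriverPlugin, ExecutorPlugin, PluginContext, SparkPlugin}

// Enabled with: spark.plugins=com.example.MyPlugin
class MyPlugin extends SparkPlugin {
  override def driverPlugin(): DriverPlugin = new DriverPlugin {
    override def init(sc: SparkContext, ctx: PluginContext): JMap[String, String] = {
      // Runs on the driver at startup; the returned map is passed to executors.
      Collections.emptyMap()
    }
    override def receive(message: AnyRef): AnyRef = {
      // Handles messages sent from the executor plugins via ctx.ask/send.
      message
    }
  }
  override def executorPlugin(): ExecutorPlugin = new ExecutorPlugin {}
}
{code}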






[jira] [Updated] (SPARK-28091) Extend Spark metrics system with user-defined metrics using executor plugins

2019-11-10 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li updated SPARK-28091:

Labels: release-notes  (was: )

> Extend Spark metrics system with user-defined metrics using executor plugins
> 
>
> Key: SPARK-28091
> URL: https://issues.apache.org/jira/browse/SPARK-28091
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Luca Canali
>Assignee: Luca Canali
>Priority: Minor
>  Labels: release-notes
> Fix For: 3.0.0
>
>
> This proposes to improve Spark instrumentation by adding a hook for 
> user-defined metrics, extending Spark's Dropwizard/Codahale metrics system.
> The original motivation of this work was to add instrumentation for S3 
> filesystem access metrics in Spark jobs. Currently, [[ExecutorSource]] 
> instruments HDFS and local filesystem metrics. Rather than extending the code 
> there, this JIRA proposes to add a metrics plugin system that is more 
> flexible and of general use.
> Context: The Spark metrics system provides a large variety of metrics that 
> are useful to monitor and troubleshoot Spark workloads. A typical workflow is 
> to sink the metrics to a storage system and build dashboards on top of that.
> Highlights:
>  * The metrics plugin system makes it easy to implement instrumentation for 
> S3 access by Spark jobs.
>  * The metrics plugin system allows for easy extension of how Spark collects 
> HDFS-related workload metrics. This is currently done using the Hadoop 
> Filesystem GetAllStatistics method, which is deprecated in recent versions of 
> Hadoop. Recent versions of the Hadoop Filesystem recommend the method 
> GetGlobalStorageStatistics instead, which also provides several additional 
> metrics. GetGlobalStorageStatistics is not available in Hadoop 2.7 (it was 
> introduced in Hadoop 2.8). A metrics plugin for Spark would offer an easy way 
> to "opt in" to such new API calls for those deploying suitable Hadoop 
> versions.
>  * We also have the use case of adding Hadoop filesystem monitoring for a 
> custom Hadoop-compliant filesystem in use in our organization (EOS using the 
> XRootD protocol). The metrics plugin infrastructure makes this easy to do. 
> Others may have similar use cases.
>  * More generally, this method makes it straightforward to plug filesystem 
> and other metrics into the Spark monitoring system. Future work on plugin 
> implementations can extend monitoring to measure usage of external resources 
> (OS, filesystem, network, accelerator cards, etc.) that might not normally be 
> considered general enough for inclusion in the Apache Spark code base, but 
> that can nevertheless be useful for specialized use cases, tests, or 
> troubleshooting.
> Implementation:
> The proposed implementation builds on the Executor Plugin work of SPARK-24918 
> and on recent work extending Spark executor metrics, such as SPARK-25228.
> Tests and examples:
> This has so far been manually tested running Spark on YARN and K8S clusters, 
> in particular for monitoring S3 and for extending HDFS instrumentation with 
> the Hadoop Filesystem "GetGlobalStorageStatistics" metrics. An executor 
> metrics plugin example and the code used for testing are available.
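
A minimal sketch of the kind of plugin this enables, registering a user-defined 
gauge into the Dropwizard/Codahale registry. The interface and method names 
follow the plugin API that Spark 3.0 eventually shipped; the metric itself is a 
placeholder:

{code}
import com.codahale.metrics.Gauge
import org.apache.spark.api.plugin.{ExecutorPlugin, PluginContext}

class FsMetricsExecutorPlugin extends ExecutorPlugin {
  override def init(ctx: PluginContext, extraConf: java.util.Map[String, String]): Unit = {
    // The registered metric flows through Spark's configured metrics sinks
    // like any built-in executor metric.
    ctx.metricRegistry().register("s3BytesRead", new Gauge[Long] {
      // A real plugin would read this from the Hadoop FileSystem statistics,
      // e.g. getGlobalStorageStatistics on Hadoop 2.8+.
      override def getValue: Long = 0L
    })
  }
}
{code}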






[jira] [Resolved] (SPARK-29759) LocalShuffleReaderExec.outputPartitioning should use the corrected attributes

2019-11-06 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-29759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li resolved SPARK-29759.
-
Fix Version/s: 3.0.0
   Resolution: Fixed

> LocalShuffleReaderExec.outputPartitioning should use the corrected attributes
> -
>
> Key: SPARK-29759
> URL: https://issues.apache.org/jira/browse/SPARK-29759
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Wenchen Fan
>Assignee: Wenchen Fan
>Priority: Major
> Fix For: 3.0.0
>
>







[jira] [Resolved] (SPARK-29752) make AdaptiveQueryExecSuite more robust

2019-11-06 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-29752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li resolved SPARK-29752.
-
Fix Version/s: 3.0.0
   Resolution: Fixed

> make AdaptiveQueryExecSuite more robust
> ---
>
> Key: SPARK-29752
> URL: https://issues.apache.org/jira/browse/SPARK-29752
> Project: Spark
>  Issue Type: Test
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Wenchen Fan
>Assignee: Wenchen Fan
>Priority: Major
> Fix For: 3.0.0
>
>







[jira] [Updated] (SPARK-29768) nondeterministic expression fails column pruning

2019-11-05 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-29768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li updated SPARK-29768:

   Fix Version/s: (was: 3.0.0)
Target Version/s: 3.0.0

> nondeterministic expression fails column pruning
> 
>
> Key: SPARK-29768
> URL: https://issues.apache.org/jira/browse/SPARK-29768
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.4.4
>Reporter: yucai
>Assignee: Wenchen Fan
>Priority: Major
>
> A nondeterministic expression like monotonically_increasing_id fails column 
> pruning:
> {code}
> spark.range(10).selectExpr("id as key", "id * 2 as value").
>   write.format("parquet").save("/tmp/source")
> spark.range(10).selectExpr("id as key", "id * 3 as s1", "id * 5 as s2").
>   write.format("parquet").save("/tmp/target")
> val sourceDF = spark.read.parquet("/tmp/source")
> val targetDF = spark.read.parquet("/tmp/target").
>   withColumn("row_id", monotonically_increasing_id())
> sourceDF.join(targetDF, "key").select("key", "row_id").explain()
> {code}
> Spark reads all columns from targetDF, but actually we only need the `key` 
> column.
> {code}
> scala> sourceDF.join(targetDF, "key").select("key", "row_id").explain()
> == Physical Plan ==
> *(2) Project [key#78L, row_id#88L]
> +- *(2) BroadcastHashJoin [key#78L], [key#82L], Inner, BuildLeft
>:- BroadcastExchange HashedRelationBroadcastMode(List(input[0, bigint, 
> true]))
>:  +- *(1) Project [key#78L]
>: +- *(1) Filter isnotnull(key#78L)
>:+- *(1) FileScan parquet [key#78L] Batched: true, Format: 
> Parquet, Location: InMemoryFileIndex[file:/tmp/source], PartitionFilters: [], 
> PushedFilters: [IsNotNull(key)], ReadSchema: struct
>+- *(2) Filter isnotnull(key#82L)
>   +- *(2) Project [key#82L, monotonically_increasing_id() AS row_id#88L]
>  +- *(2) FileScan parquet [key#82L,s1#83L,s2#84L] Batched: true, 
> Format: Parquet, Location: InMemoryFileIndex[file:/tmp/target], 
> PartitionFilters: [], PushedFilters: [], ReadSchema: 
> struct
> {code}






[jira] [Updated] (SPARK-29768) nondeterministic expression fails column pruning

2019-11-05 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-29768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li updated SPARK-29768:

Fix Version/s: 3.0.0

> nondeterministic expression fails column pruning
> 
>
> Key: SPARK-29768
> URL: https://issues.apache.org/jira/browse/SPARK-29768
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.4.4
>Reporter: yucai
>Assignee: Wenchen Fan
>Priority: Major
> Fix For: 3.0.0
>
>
> A nondeterministic expression like monotonically_increasing_id fails column 
> pruning:
> {code}
> spark.range(10).selectExpr("id as key", "id * 2 as value").
>   write.format("parquet").save("/tmp/source")
> spark.range(10).selectExpr("id as key", "id * 3 as s1", "id * 5 as s2").
>   write.format("parquet").save("/tmp/target")
> val sourceDF = spark.read.parquet("/tmp/source")
> val targetDF = spark.read.parquet("/tmp/target").
>   withColumn("row_id", monotonically_increasing_id())
> sourceDF.join(targetDF, "key").select("key", "row_id").explain()
> {code}
> Spark reads all columns from targetDF, but actually we only need the `key` 
> column.
> {code}
> scala> sourceDF.join(targetDF, "key").select("key", "row_id").explain()
> == Physical Plan ==
> *(2) Project [key#78L, row_id#88L]
> +- *(2) BroadcastHashJoin [key#78L], [key#82L], Inner, BuildLeft
>:- BroadcastExchange HashedRelationBroadcastMode(List(input[0, bigint, 
> true]))
>:  +- *(1) Project [key#78L]
>: +- *(1) Filter isnotnull(key#78L)
>:+- *(1) FileScan parquet [key#78L] Batched: true, Format: 
> Parquet, Location: InMemoryFileIndex[file:/tmp/source], PartitionFilters: [], 
> PushedFilters: [IsNotNull(key)], ReadSchema: struct
>+- *(2) Filter isnotnull(key#82L)
>   +- *(2) Project [key#82L, monotonically_increasing_id() AS row_id#88L]
>  +- *(2) FileScan parquet [key#82L,s1#83L,s2#84L] Batched: true, 
> Format: Parquet, Location: InMemoryFileIndex[file:/tmp/target], 
> PartitionFilters: [], PushedFilters: [], ReadSchema: 
> struct
> {code}






[jira] [Assigned] (SPARK-29768) nondeterministic expression fails column pruning

2019-11-05 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-29768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li reassigned SPARK-29768:
---

Assignee: Wenchen Fan

> nondeterministic expression fails column pruning
> 
>
> Key: SPARK-29768
> URL: https://issues.apache.org/jira/browse/SPARK-29768
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.4.4
>Reporter: yucai
>Assignee: Wenchen Fan
>Priority: Major
>
> A nondeterministic expression like monotonically_increasing_id fails column 
> pruning:
> {code}
> spark.range(10).selectExpr("id as key", "id * 2 as value").
>   write.format("parquet").save("/tmp/source")
> spark.range(10).selectExpr("id as key", "id * 3 as s1", "id * 5 as s2").
>   write.format("parquet").save("/tmp/target")
> val sourceDF = spark.read.parquet("/tmp/source")
> val targetDF = spark.read.parquet("/tmp/target").
>   withColumn("row_id", monotonically_increasing_id())
> sourceDF.join(targetDF, "key").select("key", "row_id").explain()
> {code}
> Spark reads all columns from targetDF, but actually we only need the `key` 
> column.
> {code}
> scala> sourceDF.join(targetDF, "key").select("key", "row_id").explain()
> == Physical Plan ==
> *(2) Project [key#78L, row_id#88L]
> +- *(2) BroadcastHashJoin [key#78L], [key#82L], Inner, BuildLeft
>:- BroadcastExchange HashedRelationBroadcastMode(List(input[0, bigint, 
> true]))
>:  +- *(1) Project [key#78L]
>: +- *(1) Filter isnotnull(key#78L)
>:+- *(1) FileScan parquet [key#78L] Batched: true, Format: 
> Parquet, Location: InMemoryFileIndex[file:/tmp/source], PartitionFilters: [], 
> PushedFilters: [IsNotNull(key)], ReadSchema: struct
>+- *(2) Filter isnotnull(key#82L)
>   +- *(2) Project [key#82L, monotonically_increasing_id() AS row_id#88L]
>  +- *(2) FileScan parquet [key#82L,s1#83L,s2#84L] Batched: true, 
> Format: Parquet, Location: InMemoryFileIndex[file:/tmp/target], 
> PartitionFilters: [], PushedFilters: [], ReadSchema: 
> struct
> {code}






[jira] [Updated] (SPARK-29743) sample should set needCopyResult to true if its child is

2019-11-04 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-29743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li updated SPARK-29743:

Priority: Blocker  (was: Major)

> sample should set needCopyResult to true if its child is
> 
>
> Key: SPARK-29743
> URL: https://issues.apache.org/jira/browse/SPARK-29743
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.4.0
>Reporter: Wenchen Fan
>Assignee: Wenchen Fan
>Priority: Blocker
>  Labels: correctness
>







[jira] [Updated] (SPARK-29743) sample should set needCopyResult to true if its child is

2019-11-04 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-29743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li updated SPARK-29743:

Affects Version/s: (was: 2.4.0)
   2.3.4
   2.4.4

> sample should set needCopyResult to true if its child is
> 
>
> Key: SPARK-29743
> URL: https://issues.apache.org/jira/browse/SPARK-29743
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.3.4, 2.4.4
>Reporter: Wenchen Fan
>Assignee: Wenchen Fan
>Priority: Blocker
>  Labels: correctness
>







[jira] [Updated] (SPARK-29449) Add tooltip to Spark WebUI

2019-10-12 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-29449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li updated SPARK-29449:

Description: 
The initial effort was made in 
https://issues.apache.org/jira/browse/SPARK-2384. This umbrella Jira tracks 
the progress of adding tooltips to all the WebUI pages for better usability.

 

  was:This umbrella Jira tracks the progress of adding tooltips to all the 
WebUI pages for better usability.


> Add tooltip to Spark WebUI
> --
>
> Key: SPARK-29449
> URL: https://issues.apache.org/jira/browse/SPARK-29449
> Project: Spark
>  Issue Type: Umbrella
>  Components: Web UI
>Affects Versions: 3.0.0
>Reporter: Xiao Li
>Priority: Major
>
> The initial effort was made in 
> https://issues.apache.org/jira/browse/SPARK-2384. This umbrella Jira tracks 
> the progress of adding tooltips to all the WebUI pages for better usability.
>  






[jira] [Updated] (SPARK-29019) Improve tooltip information in JDBC/ODBC Server tab

2019-10-12 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-29019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li updated SPARK-29019:

Parent: SPARK-29449
Issue Type: Sub-task  (was: Improvement)

> Improve tooltip information in JDBC/ODBC Server tab
> ---
>
> Key: SPARK-29019
> URL: https://issues.apache.org/jira/browse/SPARK-29019
> Project: Spark
>  Issue Type: Sub-task
>  Components: Web UI
>Affects Versions: 3.0.0
>Reporter: Pablo Langa Blanco
>Assignee: Pablo Langa Blanco
>Priority: Trivial
> Fix For: 3.0.0
>
>
> Some of the columns of the JDBC/ODBC Server tab in the Web UI are hard to 
> understand.
> We documented them in SPARK-28373, but I think it is better to have some 
> tooltips in the SQL statistics table to explain the columns.
> More information is in the pull request.






[jira] [Updated] (SPARK-29323) Add tooltip for The Executors Tab's column names in the Spark history server Page

2019-10-12 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-29323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li updated SPARK-29323:

Parent: SPARK-29449
Issue Type: Sub-task  (was: Improvement)

> Add tooltip for The Executors Tab's column names in the Spark history server 
> Page
> -
>
> Key: SPARK-29323
> URL: https://issues.apache.org/jira/browse/SPARK-29323
> Project: Spark
>  Issue Type: Sub-task
>  Components: Web UI
>Affects Versions: 3.0.0
>Reporter: liucht-inspur
>Assignee: liucht-inspur
>Priority: Trivial
> Fix For: 3.0.0
>
> Attachments: image-2019-10-04-09-42-14-174.png
>
>
> On the Executors tab of the Spark history server page, the Summary section 
> shows a header row for the table, but its formatting is irregular.
> Some column names have tooltips, such as Storage Memory, Task Time (GC 
> Time), Input, Shuffle Read, Shuffle Write, and Blacklisted, but some columns 
> still have no tooltip: RDD Blocks, Disk Used, Cores, Active Tasks, Failed 
> Tasks, Complete Tasks, and Total Tasks. Oddly, in the Executors section 
> below, all of these column names do have tooltips.






[jira] [Created] (SPARK-29449) Add tooltip to Spark WebUI

2019-10-12 Thread Xiao Li (Jira)
Xiao Li created SPARK-29449:
---

 Summary: Add tooltip to Spark WebUI
 Key: SPARK-29449
 URL: https://issues.apache.org/jira/browse/SPARK-29449
 Project: Spark
  Issue Type: Umbrella
  Components: Web UI
Affects Versions: 3.0.0
Reporter: Xiao Li


This umbrella Jira tracks the progress of adding tooltips to all the WebUI 
pages for better usability.






[jira] [Updated] (SPARK-27986) Support Aggregate Expressions with filter

2019-10-09 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-27986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li updated SPARK-27986:

Target Version/s: 3.0.0

> Support Aggregate Expressions with filter
> -
>
> Key: SPARK-27986
> URL: https://issues.apache.org/jira/browse/SPARK-27986
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Yuming Wang
>Priority: Major
>
> An aggregate expression represents the application of an aggregate function 
> across the rows selected by a query. An aggregate function reduces multiple 
> inputs to a single output value, such as the sum or average of the inputs. 
> The syntax of an aggregate expression is one of the following:
> {noformat}
> aggregate_name (expression [ , ... ] [ order_by_clause ] ) [ FILTER ( WHERE 
> filter_clause ) ]
> aggregate_name (ALL expression [ , ... ] [ order_by_clause ] ) [ FILTER ( 
> WHERE filter_clause ) ]
> aggregate_name (DISTINCT expression [ , ... ] [ order_by_clause ] ) [ FILTER 
> ( WHERE filter_clause ) ]
> aggregate_name ( * ) [ FILTER ( WHERE filter_clause ) ]
> aggregate_name ( [ expression [ , ... ] ] ) WITHIN GROUP ( order_by_clause ) 
> [ FILTER ( WHERE filter_clause ) ]{noformat}
> [https://www.postgresql.org/docs/current/sql-expressions.html#SYNTAX-AGGREGATES]
>  
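
For illustration, a hypothetical query once FILTER is supported (table and 
column names are made up):

{code}
// Count all rows and, in the same pass, only those matching a predicate:
spark.sql("""
  SELECT
    count(*)                              AS all_rows,
    count(*) FILTER (WHERE status = 'ok') AS ok_rows
  FROM events
""").show()
{code}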






[jira] [Commented] (SPARK-27986) Support Aggregate Expressions with filter

2019-10-09 Thread Xiao Li (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-27986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16947822#comment-16947822
 ] 

Xiao Li commented on SPARK-27986:
-

There is a blog post describing this feature too: 
[https://blog.jooq.org/2014/12/30/the-awesome-postgresql-9-4-sql2003-filter-clause-for-aggregate-functions/]

> Support Aggregate Expressions with filter
> -
>
> Key: SPARK-27986
> URL: https://issues.apache.org/jira/browse/SPARK-27986
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Yuming Wang
>Priority: Major
>
> An aggregate expression represents the application of an aggregate function 
> across the rows selected by a query. An aggregate function reduces multiple 
> inputs to a single output value, such as the sum or average of the inputs. 
> The syntax of an aggregate expression is one of the following:
> {noformat}
> aggregate_name (expression [ , ... ] [ order_by_clause ] ) [ FILTER ( WHERE 
> filter_clause ) ]
> aggregate_name (ALL expression [ , ... ] [ order_by_clause ] ) [ FILTER ( 
> WHERE filter_clause ) ]
> aggregate_name (DISTINCT expression [ , ... ] [ order_by_clause ] ) [ FILTER 
> ( WHERE filter_clause ) ]
> aggregate_name ( * ) [ FILTER ( WHERE filter_clause ) ]
> aggregate_name ( [ expression [ , ... ] ] ) WITHIN GROUP ( order_by_clause ) 
> [ FILTER ( WHERE filter_clause ) ]{noformat}
> [https://www.postgresql.org/docs/current/sql-expressions.html#SYNTAX-AGGREGATES]
>  






[jira] [Commented] (SPARK-27986) Support Aggregate Expressions with filter

2019-10-09 Thread Xiao Li (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-27986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16947816#comment-16947816
 ] 

Xiao Li commented on SPARK-27986:
-

cc [~beliefer]

> Support Aggregate Expressions with filter
> -
>
> Key: SPARK-27986
> URL: https://issues.apache.org/jira/browse/SPARK-27986
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Yuming Wang
>Priority: Major
>
> An aggregate expression represents the application of an aggregate function 
> across the rows selected by a query. An aggregate function reduces multiple 
> inputs to a single output value, such as the sum or average of the inputs. 
> The syntax of an aggregate expression is one of the following:
> {noformat}
> aggregate_name (expression [ , ... ] [ order_by_clause ] ) [ FILTER ( WHERE 
> filter_clause ) ]
> aggregate_name (ALL expression [ , ... ] [ order_by_clause ] ) [ FILTER ( 
> WHERE filter_clause ) ]
> aggregate_name (DISTINCT expression [ , ... ] [ order_by_clause ] ) [ FILTER 
> ( WHERE filter_clause ) ]
> aggregate_name ( * ) [ FILTER ( WHERE filter_clause ) ]
> aggregate_name ( [ expression [ , ... ] ] ) WITHIN GROUP ( order_by_clause ) 
> [ FILTER ( WHERE filter_clause ) ]{noformat}
> [https://www.postgresql.org/docs/current/sql-expressions.html#SYNTAX-AGGREGATES]
>  






[jira] [Resolved] (SPARK-29366) Subqueries created for DPP are not printed in EXPLAIN FORMATTED

2019-10-08 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-29366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li resolved SPARK-29366.
-
Fix Version/s: 3.0.0
 Assignee: Dilip Biswal
   Resolution: Fixed

> Subqueries created for DPP are not printed in EXPLAIN FORMATTED
> ---
>
> Key: SPARK-29366
> URL: https://issues.apache.org/jira/browse/SPARK-29366
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.4.4
>Reporter: Dilip Biswal
>Assignee: Dilip Biswal
>Priority: Major
> Fix For: 3.0.0
>
>
> The subquery expressions introduced by DPP are not printed in the newer 
> explain.






[jira] [Updated] (SPARK-11150) Dynamic partition pruning

2019-10-04 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-11150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li updated SPARK-11150:

Labels: release-notes  (was: )

> Dynamic partition pruning
> -
>
> Key: SPARK-11150
> URL: https://issues.apache.org/jira/browse/SPARK-11150
> Project: Spark
>  Issue Type: New Feature
>  Components: SQL
>Affects Versions: 1.5.1, 1.6.0, 2.0.0, 2.1.2, 2.2.1, 2.3.0
>Reporter: Younes
>Assignee: Wei Xue
>Priority: Major
>  Labels: release-notes
> Fix For: 3.0.0
>
> Attachments: image-2019-10-04-11-20-02-616.png
>
>
> Implements dynamic partition pruning by adding a dynamic-partition-pruning 
> filter if there is a partitioned table and a filter on the dimension table. 
> The filter is then planned using a heuristic approach:
>  # As a broadcast relation if it is a broadcast hash join. The broadcast 
> relation will then be transformed into a reused broadcast exchange by the 
> {{ReuseExchange}} rule; or
>  # As a subquery duplicate if the estimated benefit of partition table scan 
> being saved is greater than the estimated cost of the extra scan of the 
> duplicated subquery; otherwise
>  # As a bypassed condition ({{true}}).
>  Below shows a basic example of DPP.
> !image-2019-10-04-11-20-02-616.png|width=521,height=225!
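
For illustration, a minimal star-schema query where DPP can apply. Table and 
column names are made up; {{fact}} is assumed to be partitioned by {{day}}:

{code}
// The filter on the dimension table is turned into a runtime filter on the
// fact table's partition column, so only the surviving `day` partitions of
// `fact` are scanned.
spark.sql("""
  SELECT f.metric
  FROM fact f JOIN dim d ON f.day = d.day
  WHERE d.region = 'EU'
""").explain()
{code}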






[jira] [Updated] (SPARK-11150) Dynamic partition pruning

2019-10-04 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-11150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li updated SPARK-11150:

Description: 
Implements dynamic partition pruning by adding a dynamic-partition-pruning 
filter if there is a partitioned table and a filter on the dimension table. The 
filter is then planned using a heuristic approach:
 # As a broadcast relation if it is a broadcast hash join. The broadcast 
relation will then be transformed into a reused broadcast exchange by the 
{{ReuseExchange}} rule; or
 # As a subquery duplicate if the estimated benefit of partition table scan 
being saved is greater than the estimated cost of the extra scan of the 
duplicated subquery; otherwise
 # As a bypassed condition ({{true}}).

 Below shows a basic example of DPP.

!image-2019-10-04-11-20-02-616.png|width=521,height=225!

  was:
Implements dynamic partition pruning by adding a dynamic-partition-pruning 
filter if there is a partitioned table and a filter on the dimension table. The 
filter is then planned using a heuristic approach:
 # As a broadcast relation if it is a broadcast hash join. The broadcast 
relation will then be transformed into a reused broadcast exchange by the 
{{ReuseExchange}} rule; or
 # As a subquery duplicate if the estimated benefit of partition table scan 
being saved is greater than the estimated cost of the extra scan of the 
duplicated subquery; otherwise
 # As a bypassed condition ({{true}}).

 Below is an example to show how it takes an effect

!image-2019-10-04-11-20-02-616.png|width=521,height=225!


> Dynamic partition pruning
> -
>
> Key: SPARK-11150
> URL: https://issues.apache.org/jira/browse/SPARK-11150
> Project: Spark
>  Issue Type: New Feature
>  Components: SQL
>Affects Versions: 1.5.1, 1.6.0, 2.0.0, 2.1.2, 2.2.1, 2.3.0
>Reporter: Younes
>Assignee: Wei Xue
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: image-2019-10-04-11-20-02-616.png
>
>
> Implements dynamic partition pruning by adding a dynamic-partition-pruning 
> filter if there is a partitioned table and a filter on the dimension table. 
> The filter is then planned using a heuristic approach:
>  # As a broadcast relation if it is a broadcast hash join. The broadcast 
> relation will then be transformed into a reused broadcast exchange by the 
> {{ReuseExchange}} rule; or
>  # As a subquery duplicate if the estimated benefit of partition table scan 
> being saved is greater than the estimated cost of the extra scan of the 
> duplicated subquery; otherwise
>  # As a bypassed condition ({{true}}).
>  Below shows a basic example of DPP.
> !image-2019-10-04-11-20-02-616.png|width=521,height=225!






[jira] [Updated] (SPARK-11150) Dynamic partition pruning

2019-10-04 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-11150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li updated SPARK-11150:

Description: 
Implements dynamic partition pruning by adding a dynamic-partition-pruning 
filter if there is a partitioned table and a filter on the dimension table. The 
filter is then planned using a heuristic approach:
 # As a broadcast relation if it is a broadcast hash join. The broadcast 
relation will then be transformed into a reused broadcast exchange by the 
{{ReuseExchange}} rule; or
 # As a subquery duplicate if the estimated benefit of partition table scan 
being saved is greater than the estimated cost of the extra scan of the 
duplicated subquery; otherwise
 # As a bypassed condition ({{true}}).

 Below is an example to show how it takes an effect

!image-2019-10-04-11-20-02-616.png|width=521,height=225!

  was:
Implements dynamic partition pruning by adding a dynamic-partition-pruning 
filter if there is a partitioned table and a filter on the dimension table. The 
filter is then planned using a heuristic approach:
 # As a broadcast relation if it is a broadcast hash join. The broadcast 
relation will then be transformed into a reused broadcast exchange by the 
{{ReuseExchange}} rule; or
 # As a subquery duplicate if the estimated benefit of partition table scan 
being saved is greater than the estimated cost of the extra scan of the 
duplicated subquery; otherwise
 # As a bypassed condition ({{true}}).

 

!image-2019-10-04-11-20-02-616.png|width=521,height=225!


> Dynamic partition pruning
> -
>
> Key: SPARK-11150
> URL: https://issues.apache.org/jira/browse/SPARK-11150
> Project: Spark
>  Issue Type: New Feature
>  Components: SQL
>Affects Versions: 1.5.1, 1.6.0, 2.0.0, 2.1.2, 2.2.1, 2.3.0
>Reporter: Younes
>Assignee: Wei Xue
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: image-2019-10-04-11-20-02-616.png
>
>
> Implements dynamic partition pruning by adding a dynamic-partition-pruning 
> filter if there is a partitioned table and a filter on the dimension table. 
> The filter is then planned using a heuristic approach:
>  # As a broadcast relation if it is a broadcast hash join. The broadcast 
> relation will then be transformed into a reused broadcast exchange by the 
> {{ReuseExchange}} rule; or
>  # As a subquery duplicate if the estimated benefit of partition table scan 
> being saved is greater than the estimated cost of the extra scan of the 
> duplicated subquery; otherwise
>  # As a bypassed condition ({{true}}).
>  Below is an example to show how it takes an effect
> !image-2019-10-04-11-20-02-616.png|width=521,height=225!






[jira] [Updated] (SPARK-11150) Dynamic partition pruning

2019-10-04 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-11150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li updated SPARK-11150:

Attachment: image-2019-10-04-11-20-02-616.png

> Dynamic partition pruning
> -
>
> Key: SPARK-11150
> URL: https://issues.apache.org/jira/browse/SPARK-11150
> Project: Spark
>  Issue Type: New Feature
>  Components: SQL
>Affects Versions: 1.5.1, 1.6.0, 2.0.0, 2.1.2, 2.2.1, 2.3.0
>Reporter: Younes
>Assignee: Wei Xue
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: image-2019-10-04-11-20-02-616.png
>
>
> Partitions are not pruned when joined on the partition columns.
> This is the same issue as HIVE-9152.
> Ex: 
> Select  from tab where partcol=1 will prune on value 1
> Select  from tab join dim on (dim.partcol=tab.partcol) where 
> dim.partcol=1 will scan all partitions.
> The tables are stored as Parquet.






[jira] [Updated] (SPARK-11150) Dynamic partition pruning

2019-10-04 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-11150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li updated SPARK-11150:

Description: 
Implements dynamic partition pruning by adding a dynamic-partition-pruning 
filter if there is a partitioned table and a filter on the dimension table. The 
filter is then planned using a heuristic approach:
 # As a broadcast relation if it is a broadcast hash join. The broadcast 
relation will then be transformed into a reused broadcast exchange by the 
{{ReuseExchange}} rule; or
 # As a subquery duplicate if the estimated benefit of partition table scan 
being saved is greater than the estimated cost of the extra scan of the 
duplicated subquery; otherwise
 # As a bypassed condition ({{true}}).

 

!image-2019-10-04-11-20-02-616.png!

  was:
Partitions are not pruned when joined on the partition columns.
This is the same issue as HIVE-9152.
Ex: 
Select  from tab where partcol=1 will prune on value 1
Select  from tab join dim on (dim.partcol=tab.partcol) where dim.partcol=1 
will scan all partitions.
The tables are stored as Parquet.


> Dynamic partition pruning
> -
>
> Key: SPARK-11150
> URL: https://issues.apache.org/jira/browse/SPARK-11150
> Project: Spark
>  Issue Type: New Feature
>  Components: SQL
>Affects Versions: 1.5.1, 1.6.0, 2.0.0, 2.1.2, 2.2.1, 2.3.0
>Reporter: Younes
>Assignee: Wei Xue
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: image-2019-10-04-11-20-02-616.png
>
>
> Implements dynamic partition pruning by adding a dynamic-partition-pruning 
> filter if there is a partitioned table and a filter on the dimension table. 
> The filter is then planned using a heuristic approach:
>  # As a broadcast relation if it is a broadcast hash join. The broadcast 
> relation will then be transformed into a reused broadcast exchange by the 
> {{ReuseExchange}} rule; or
>  # As a subquery duplicate if the estimated benefit of partition table scan 
> being saved is greater than the estimated cost of the extra scan of the 
> duplicated subquery; otherwise
>  # As a bypassed condition ({{true}}).
>  
> !image-2019-10-04-11-20-02-616.png!






[jira] [Updated] (SPARK-11150) Dynamic partition pruning

2019-10-04 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-11150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li updated SPARK-11150:

Description: 
Implements dynamic partition pruning by adding a dynamic-partition-pruning 
filter if there is a partitioned table and a filter on the dimension table. The 
filter is then planned using a heuristic approach:
 # As a broadcast relation if it is a broadcast hash join. The broadcast 
relation will then be transformed into a reused broadcast exchange by the 
{{ReuseExchange}} rule; or
 # As a subquery duplicate if the estimated benefit of partition table scan 
being saved is greater than the estimated cost of the extra scan of the 
duplicated subquery; otherwise
 # As a bypassed condition ({{true}}).

 

!image-2019-10-04-11-20-02-616.png|width=521,height=225!

  was:
Implements dynamic partition pruning by adding a dynamic-partition-pruning 
filter if there is a partitioned table and a filter on the dimension table. The 
filter is then planned using a heuristic approach:
 # As a broadcast relation if it is a broadcast hash join. The broadcast 
relation will then be transformed into a reused broadcast exchange by the 
{{ReuseExchange}} rule; or
 # As a subquery duplicate if the estimated benefit of partition table scan 
being saved is greater than the estimated cost of the extra scan of the 
duplicated subquery; otherwise
 # As a bypassed condition ({{true}}).

 

!image-2019-10-04-11-20-02-616.png!


> Dynamic partition pruning
> -
>
> Key: SPARK-11150
> URL: https://issues.apache.org/jira/browse/SPARK-11150
> Project: Spark
>  Issue Type: New Feature
>  Components: SQL
>Affects Versions: 1.5.1, 1.6.0, 2.0.0, 2.1.2, 2.2.1, 2.3.0
>Reporter: Younes
>Assignee: Wei Xue
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: image-2019-10-04-11-20-02-616.png
>
>
> Implements dynamic partition pruning by adding a dynamic-partition-pruning 
> filter if there is a partitioned table and a filter on the dimension table. 
> The filter is then planned using a heuristic approach:
>  # As a broadcast relation if it is a broadcast hash join. The broadcast 
> relation will then be transformed into a reused broadcast exchange by the 
> {{ReuseExchange}} rule; or
>  # As a subquery duplicate if the estimated benefit of partition table scan 
> being saved is greater than the estimated cost of the extra scan of the 
> duplicated subquery; otherwise
>  # As a bypassed condition ({{true}}).
>  
> !image-2019-10-04-11-20-02-616.png|width=521,height=225!






[jira] [Assigned] (SPARK-29350) Fix BroadcastExchange reuse in Dynamic Partition Pruning

2019-10-03 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-29350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li reassigned SPARK-29350:
---

Assignee: Wei Xue

> Fix BroadcastExchange reuse in Dynamic Partition Pruning
> 
>
> Key: SPARK-29350
> URL: https://issues.apache.org/jira/browse/SPARK-29350
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Wei Xue
>Assignee: Wei Xue
>Priority: Major
>
> Dynamic partition pruning filters are added as an in-subquery containing a 
> {{BroadcastExchange}} in a broadcast hash join. To ensure this new 
> {{BroadcastExchange}} can be reused, we need to make the {{ReuseExchange}} 
> rule visit in-subquery nodes.






[jira] [Resolved] (SPARK-29350) Fix BroadcastExchange reuse in Dynamic Partition Pruning

2019-10-03 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-29350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li resolved SPARK-29350.
-
Fix Version/s: 3.0.0
   Resolution: Fixed

> Fix BroadcastExchange reuse in Dynamic Partition Pruning
> 
>
> Key: SPARK-29350
> URL: https://issues.apache.org/jira/browse/SPARK-29350
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Wei Xue
>Assignee: Wei Xue
>Priority: Major
> Fix For: 3.0.0
>
>
> Dynamic partition pruning filters are added as an in-subquery containing a 
> {{BroadcastExchange}} in a broadcast hash join. To ensure this new 
> {{BroadcastExchange}} can be reused, we need to make the {{ReuseExchange}} 
> rule visit in-subquery nodes.






[jira] [Resolved] (SPARK-28476) Support ALTER DATABASE SET LOCATION

2019-09-29 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li resolved SPARK-28476.
-
Fix Version/s: 3.0.0
 Assignee: yuming.wang  (was: Weichen Xu)
   Resolution: Fixed

> Support ALTER DATABASE SET LOCATION
> ---
>
> Key: SPARK-28476
> URL: https://issues.apache.org/jira/browse/SPARK-28476
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Xiao Li
>Assignee: yuming.wang
>Priority: Major
> Fix For: 3.0.0
>
>
> We can support the syntax of ALTER (DATABASE|SCHEMA) database_name SET 
> LOCATION path
> Ref: [https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL]
>  
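
For illustration, the proposed statement in use (database name and path are 
placeholders; following the Hive semantics linked above, the new location is 
assumed to apply to tables created afterwards):

{code}
spark.sql("ALTER DATABASE mydb SET LOCATION 'hdfs://nn/warehouse/mydb_new.db'")
{code}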






[jira] [Commented] (SPARK-29038) SPIP: Support Spark Materialized View

2019-09-21 Thread Xiao Li (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-29038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16935119#comment-16935119
 ] 

Xiao Li commented on SPARK-29038:
-

Building it on Parquet does not perform well for incremental refresh, since 
Parquet does not support update/delete/merge. Also, Parquet does not guarantee 
ACID. Thus, I would suggest using a Delta-like data source to implement it. 

> SPIP: Support Spark Materialized View
> -
>
> Key: SPARK-29038
> URL: https://issues.apache.org/jira/browse/SPARK-29038
> Project: Spark
>  Issue Type: New Feature
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Lantao Jin
>Priority: Major
>
> A materialized view is an important approach in DBMSs to cache data in order 
> to accelerate queries. By creating a materialized view through SQL, the data 
> to be cached is very flexible and can be configured according to specific 
> usage scenarios. The Materialization Manager automatically updates the 
> cached data according to changes in the detail source tables, simplifying 
> the user's work. When a user submits a query, the Spark optimizer rewrites 
> the execution plan based on the available materialized views to determine 
> the optimal execution plan.
> Details in [design 
> doc|https://docs.google.com/document/d/1q5pjSWoTNVc9zsAfbNzJ-guHyVwPsEroIEP8Cca179A/edit?usp=sharing]






[jira] [Resolved] (SPARK-28819) Document CREATE OR REPLACE FUNCTION in SQL Reference

2019-09-20 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li resolved SPARK-28819.
-
Resolution: Duplicate

They are duplicates. 

> Document CREATE OR REPLACE FUNCTION in SQL Reference
> 
>
> Key: SPARK-28819
> URL: https://issues.apache.org/jira/browse/SPARK-28819
> Project: Spark
>  Issue Type: Sub-task
>  Components: Documentation
>Affects Versions: 2.4.3
>Reporter: ABHISHEK KUMAR GUPTA
>Priority: Major
>







[jira] [Commented] (SPARK-28816) Document ADD JAR statement in SQL Reference.

2019-09-20 Thread Xiao Li (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-28816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16934760#comment-16934760
 ] 

Xiao Li commented on SPARK-28816:
-

Any update?

> Document ADD JAR statement in SQL Reference.
> 
>
> Key: SPARK-28816
> URL: https://issues.apache.org/jira/browse/SPARK-28816
> Project: Spark
>  Issue Type: Sub-task
>  Components: Documentation, SQL
>Affects Versions: 2.4.3
>Reporter: Dilip Biswal
>Priority: Major
>







[jira] [Commented] (SPARK-28827) Document SELECT CURRENT_DATABASE in SQL Reference

2019-09-20 Thread Xiao Li (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-28827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16934757#comment-16934757
 ] 

Xiao Li commented on SPARK-28827:
-

CURRENT_DATABASE is just a function. We do not need to document it in a 
separate doc. 

> Document SELECT CURRENT_DATABASE in SQL Reference
> -
>
> Key: SPARK-28827
> URL: https://issues.apache.org/jira/browse/SPARK-28827
> Project: Spark
>  Issue Type: Sub-task
>  Components: Documentation
>Affects Versions: 2.4.3
>Reporter: ABHISHEK KUMAR GUPTA
>Priority: Major
>







[jira] [Resolved] (SPARK-28827) Document SELECT CURRENT_DATABASE in SQL Reference

2019-09-20 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li resolved SPARK-28827.
-
Resolution: Invalid

> Document SELECT CURRENT_DATABASE in SQL Reference
> -
>
> Key: SPARK-28827
> URL: https://issues.apache.org/jira/browse/SPARK-28827
> Project: Spark
>  Issue Type: Sub-task
>  Components: Documentation
>Affects Versions: 2.4.3
>Reporter: ABHISHEK KUMAR GUPTA
>Priority: Major
>







[jira] [Commented] (SPARK-28827) Document SELECT CURRENT_DATABASE in SQL Reference

2019-09-20 Thread Xiao Li (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-28827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16934756#comment-16934756
 ] 

Xiao Li commented on SPARK-28827:
-

Any update?

> Document SELECT CURRENT_DATABASE in SQL Reference
> -
>
> Key: SPARK-28827
> URL: https://issues.apache.org/jira/browse/SPARK-28827
> Project: Spark
>  Issue Type: Sub-task
>  Components: Documentation
>Affects Versions: 2.4.3
>Reporter: ABHISHEK KUMAR GUPTA
>Priority: Major
>







[jira] [Commented] (SPARK-28800) Document REPAIR TABLE statement in SQL Reference.

2019-09-20 Thread Xiao Li (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-28800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16934754#comment-16934754
 ] 

Xiao Li commented on SPARK-28800:
-

Any update?

> Document REPAIR TABLE statement in SQL Reference.
> -
>
> Key: SPARK-28800
> URL: https://issues.apache.org/jira/browse/SPARK-28800
> Project: Spark
>  Issue Type: Sub-task
>  Components: Documentation, SQL
>Affects Versions: 2.4.3
>Reporter: Dilip Biswal
>Priority: Major
>







[jira] [Commented] (SPARK-28794) Document CREATE TABLE in SQL Reference.

2019-09-20 Thread Xiao Li (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-28794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16934752#comment-16934752
 ] 

Xiao Li commented on SPARK-28794:
-

Any update on this?

> Document CREATE TABLE in SQL Reference.
> ---
>
> Key: SPARK-28794
> URL: https://issues.apache.org/jira/browse/SPARK-28794
> Project: Spark
>  Issue Type: Sub-task
>  Components: Documentation, SQL
>Affects Versions: 2.4.3
>Reporter: Dilip Biswal
>Priority: Major
>







[jira] [Resolved] (SPARK-28822) Document USE DATABASE in SQL Reference

2019-09-19 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li resolved SPARK-28822.
-
Fix Version/s: 3.0.0
 Assignee: Shivu Sondur
   Resolution: Fixed

> Document USE DATABASE in SQL Reference
> --
>
> Key: SPARK-28822
> URL: https://issues.apache.org/jira/browse/SPARK-28822
> Project: Spark
>  Issue Type: Sub-task
>  Components: Documentation
>Affects Versions: 2.4.3
>Reporter: ABHISHEK KUMAR GUPTA
>Assignee: Shivu Sondur
>Priority: Major
> Fix For: 3.0.0
>
>







[jira] [Resolved] (SPARK-28989) Add `spark.sql.ansi.enabled`

2019-09-18 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li resolved SPARK-28989.
-
Fix Version/s: 3.0.0
   Resolution: Fixed

> Add `spark.sql.ansi.enabled`
> 
>
> Key: SPARK-28989
> URL: https://issues.apache.org/jira/browse/SPARK-28989
> Project: Spark
>  Issue Type: New Feature
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Gengliang Wang
>Assignee: Gengliang Wang
>Priority: Major
> Fix For: 3.0.0
>
>
> Currently, there are new configurations for compatibility with ANSI SQL:
> * spark.sql.parser.ansi.enabled
> * spark.sql.decimalOperations.nullOnOverflow
> * spark.sql.failOnIntegralTypeOverflow
> To make it simple and straightforward, let's merge these configurations into 
> a single one, `spark.sql.ansi.enabled`. When the configuration is true, Spark 
> tries to conform to the ANSI SQL specification. It is disabled by default.
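
For illustration, a sketch of the merged flag in use; the overflow behavior 
shown is what the individual flags above controlled:

{code}
spark.conf.set("spark.sql.ansi.enabled", "true")

// Under ANSI mode, integral overflow raises an error instead of wrapping:
spark.sql("SELECT 2147483647 + 1").show()  // throws ArithmeticException
{code}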






[jira] [Updated] (SPARK-16452) basic INFORMATION_SCHEMA support

2019-09-18 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-16452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li updated SPARK-16452:

Target Version/s:   (was: 3.0.0)

> basic INFORMATION_SCHEMA support
> 
>
> Key: SPARK-16452
> URL: https://issues.apache.org/jira/browse/SPARK-16452
> Project: Spark
>  Issue Type: New Feature
>  Components: SQL
>Reporter: Reynold Xin
>Priority: Major
> Attachments: INFORMATION_SCHEMAsupport.pdf
>
>
> INFORMATION_SCHEMA is part of SQL92 support. This ticket proposes adding a 
> few tables, as defined in the SQL92 standard, to Spark SQL.






[jira] [Resolved] (SPARK-26022) PySpark Comparison with Pandas

2019-09-18 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-26022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li resolved SPARK-26022.
-
Target Version/s:   (was: 3.0.0)
  Resolution: Later

> PySpark Comparison with Pandas
> --
>
> Key: SPARK-26022
> URL: https://issues.apache.org/jira/browse/SPARK-26022
> Project: Spark
>  Issue Type: Documentation
>  Components: PySpark
>Affects Versions: 3.0.0
>Reporter: Xiao Li
>Assignee: Hyukjin Kwon
>Priority: Major
>
> It would be very nice if we could have a doc like 
> https://pandas.pydata.org/pandas-docs/stable/comparison_with_sql.html to show 
> the API differences between PySpark and Pandas. 
> Reference:
> https://www.kdnuggets.com/2016/01/python-data-science-pandas-spark-dataframe-differences.html






[jira] [Commented] (SPARK-26022) PySpark Comparison with Pandas

2019-09-18 Thread Xiao Li (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-26022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16932610#comment-16932610
 ] 

Xiao Li commented on SPARK-26022:
-

[https://github.com/databricks/koalas] aims to close this gap. 

> PySpark Comparison with Pandas
> --
>
> Key: SPARK-26022
> URL: https://issues.apache.org/jira/browse/SPARK-26022
> Project: Spark
>  Issue Type: Documentation
>  Components: PySpark
>Affects Versions: 3.0.0
>Reporter: Xiao Li
>Assignee: Hyukjin Kwon
>Priority: Major
>
> It would be very nice if we could have a doc like 
> https://pandas.pydata.org/pandas-docs/stable/comparison_with_sql.html to show 
> the API differences between PySpark and Pandas. 
> Reference:
> https://www.kdnuggets.com/2016/01/python-data-science-pandas-spark-dataframe-differences.html






[jira] [Resolved] (SPARK-28792) Document CREATE DATABASE statement in SQL Reference.

2019-09-17 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li resolved SPARK-28792.
-
Fix Version/s: 3.0.0
   Resolution: Fixed

> Document CREATE DATABASE statement in SQL Reference.
> 
>
> Key: SPARK-28792
> URL: https://issues.apache.org/jira/browse/SPARK-28792
> Project: Spark
>  Issue Type: Sub-task
>  Components: Documentation, SQL
>Affects Versions: 2.4.3
>Reporter: Dilip Biswal
>Assignee: Sharanabasappa G Keriwaddi
>Priority: Major
> Fix For: 3.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-28792) Document CREATE DATABASE statement in SQL Reference.

2019-09-17 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li reassigned SPARK-28792:
---

Assignee: Sharanabasappa G Keriwaddi

> Document CREATE DATABASE statement in SQL Reference.
> 
>
> Key: SPARK-28792
> URL: https://issues.apache.org/jira/browse/SPARK-28792
> Project: Spark
>  Issue Type: Sub-task
>  Components: Documentation, SQL
>Affects Versions: 2.4.3
>Reporter: Dilip Biswal
>Assignee: Sharanabasappa G Keriwaddi
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-28814) Document SET/RESET in SQL Reference.

2019-09-17 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li resolved SPARK-28814.
-
Fix Version/s: 3.0.0
   Resolution: Fixed

> Document SET/RESET in SQL Reference.
> 
>
> Key: SPARK-28814
> URL: https://issues.apache.org/jira/browse/SPARK-28814
> Project: Spark
>  Issue Type: Sub-task
>  Components: Documentation, SQL
>Affects Versions: 2.4.3
>Reporter: Dilip Biswal
>Assignee: Sharanabasappa G Keriwaddi
>Priority: Major
> Fix For: 3.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-28814) Document SET/RESET in SQL Reference.

2019-09-17 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li reassigned SPARK-28814:
---

Assignee: Sharanabasappa G Keriwaddi

> Document SET/RESET in SQL Reference.
> 
>
> Key: SPARK-28814
> URL: https://issues.apache.org/jira/browse/SPARK-28814
> Project: Spark
>  Issue Type: Sub-task
>  Components: Documentation, SQL
>Affects Versions: 2.4.3
>Reporter: Dilip Biswal
>Assignee: Sharanabasappa G Keriwaddi
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-27972) Move SQL migration guide to the top level

2019-09-15 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-27972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li resolved SPARK-27972.
-
Resolution: Duplicate

> Move SQL migration guide to the top level
> -
>
> Key: SPARK-27972
> URL: https://issues.apache.org/jira/browse/SPARK-27972
> Project: Spark
>  Issue Type: Improvement
>  Components: Documentation
>Affects Versions: 3.0.0
>Reporter: Xiao Li
>Priority: Major
>
> Currently, only SQL and MLlib have dedicated sections for documenting 
> behavior changes and breaking changes. We found these guides simplify the 
> upgrade experience for end users. 
> [https://spark.apache.org/docs/latest/sql-migration-guide.html]
> [https://spark.apache.org/docs/2.4.3/ml-guide.html#migration-guide]
> The other components can do similar things in the same doc. Here, we propose 
> to combine the migration guides and move them to the top level. All the 
> components can then document their behavior changes in the same PR that 
> introduces the changes. 



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-29092) EXPLAIN FORMATTED does not work well with DPP

2019-09-15 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-29092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li updated SPARK-29092:

Description: 
 
{code:java}
withSQLConf(SQLConf.DYNAMIC_PARTITION_PRUNING_ENABLED.key -> "true",
  SQLConf.DYNAMIC_PARTITION_PRUNING_REUSE_BROADCAST.key -> "false") {
  withTable("df1", "df2") {
spark.range(1000)
  .select(col("id"), col("id").as("k"))
  .write
  .partitionBy("k")
  .format(tableFormat)
  .mode("overwrite")
  .saveAsTable("df1")

spark.range(100)
  .select(col("id"), col("id").as("k"))
  .write
  .partitionBy("k")
  .format(tableFormat)
  .mode("overwrite")
  .saveAsTable("df2")

sql("EXPLAIN FORMATTED SELECT df1.id, df2.k FROM df1 JOIN df2 ON df1.k = 
df2.k AND df2.id < 2")
  .show(false)

sql("EXPLAIN EXTENDED SELECT df1.id, df2.k FROM df1 JOIN df2 ON df1.k = 
df2.k AND df2.id < 2")
  .show(false)
  }
}
{code}
The output of EXPLAIN EXTENDED is as expected.
{code:java}
== Physical Plan ==
*(2) Project [id#2721L, k#2724L]
+- *(2) BroadcastHashJoin [k#2722L], [k#2724L], Inner, BuildRight
   :- *(2) ColumnarToRow
   :  +- FileScan parquet default.df1[id#2721L,k#2722L] Batched: true, 
DataFilters: [], Format: Parquet, Location: 
PrunedInMemoryFileIndex[file:/Users/lixiao/IdeaProjects/spark/sql/core/spark-warehouse/org.apache...,
 PartitionFilters: [isnotnull(k#2722L), dynamicpruningexpression(k#2722L IN 
subquery2741)], PushedFilters: [], ReadSchema: struct
   :+- Subquery subquery2741, [id=#358]
   :   +- *(2) HashAggregate(keys=[k#2724L], functions=[], 
output=[k#2724L#2740L])
   :  +- Exchange hashpartitioning(k#2724L, 5), true, [id=#354]
   : +- *(1) HashAggregate(keys=[k#2724L], functions=[], 
output=[k#2724L])
   :+- *(1) Project [k#2724L]
   :   +- *(1) Filter (isnotnull(id#2723L) AND (id#2723L < 
2))
   :  +- *(1) ColumnarToRow
   : +- FileScan parquet 
default.df2[id#2723L,k#2724L] Batched: true, DataFilters: [isnotnull(id#2723L), 
(id#2723L < 2)], Format: Parquet, Location: 
PrunedInMemoryFileIndex[file:/Users/lixiao/IdeaProjects/spark/sql/core/spark-warehouse/org.apache...,
 PartitionFilters: [isnotnull(k#2724L)], PushedFilters: [IsNotNull(id), 
LessThan(id,2)], ReadSchema: struct
   +- BroadcastExchange HashedRelationBroadcastMode(List(input[0, bigint, 
true])), [id=#379]
  +- *(1) Project [k#2724L]
 +- *(1) Filter (isnotnull(id#2723L) AND (id#2723L < 2))
+- *(1) ColumnarToRow
   +- FileScan parquet default.df2[id#2723L,k#2724L] Batched: true, 
DataFilters: [isnotnull(id#2723L), (id#2723L < 2)], Format: Parquet, Location: 
PrunedInMemoryFileIndex[file:/Users/lixiao/IdeaProjects/spark/sql/core/spark-warehouse/org.apache...,
 PartitionFilters: [isnotnull(k#2724L)], PushedFilters: [IsNotNull(id), 
LessThan(id,2)], ReadSchema: struct

{code}
However, the output of the FileScan node in EXPLAIN FORMATTED does not show 
the effect of DPP.
{code:java}
* Project (9)
+- * BroadcastHashJoin Inner BuildRight (8)
   :- * ColumnarToRow (2)
   :  +- Scan parquet default.df1 (1)
   +- BroadcastExchange (7)
      +- * Project (6)
         +- * Filter (5)
            +- * ColumnarToRow (4)
               +- Scan parquet default.df2 (3)

(1) Scan parquet default.df1 
Output: [id#2716L, k#2717L]
{code}
 


[jira] [Updated] (SPARK-29092) EXPLAIN FORMATTED does not work well with DPP

2019-09-15 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-29092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li updated SPARK-29092:

Target Version/s: 3.0.0

> EXPLAIN FORMATTED does not work well with DPP
> -
>
> Key: SPARK-29092
> URL: https://issues.apache.org/jira/browse/SPARK-29092
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Xiao Li
>Priority: Major
>
>  
> {code:java}
> withSQLConf(SQLConf.DYNAMIC_PARTITION_PRUNING_ENABLED.key -> "true",
>   SQLConf.DYNAMIC_PARTITION_PRUNING_REUSE_BROADCAST.key -> "false") {
>   withTable("df1", "df2") {
> spark.range(1000)
>   .select(col("id"), col("id").as("k"))
>   .write
>   .partitionBy("k")
>   .format(tableFormat)
>   .mode("overwrite")
>   .saveAsTable("df1")
> spark.range(100)
>   .select(col("id"), col("id").as("k"))
>   .write
>   .partitionBy("k")
>   .format(tableFormat)
>   .mode("overwrite")
>   .saveAsTable("df2")
> sql("EXPLAIN FORMATTED SELECT df1.id, df2.k FROM df1 JOIN df2 ON df1.k = 
> df2.k AND df2.id < 2")
>   .show(false)
> sql("EXPLAIN EXTENDED SELECT df1.id, df2.k FROM df1 JOIN df2 ON df1.k = 
> df2.k AND df2.id < 2")
>   .show(false)
>   }
> }
> {code}
> The output of EXPLAIN EXTENDED is as expected.
> {code:java}
> == Physical Plan ==
> *(2) Project [id#2721L, k#2724L]
> +- *(2) BroadcastHashJoin [k#2722L], [k#2724L], Inner, BuildRight
>:- *(2) ColumnarToRow
>:  +- FileScan parquet default.df1[id#2721L,k#2722L] Batched: true, 
> DataFilters: [], Format: Parquet, Location: 
> PrunedInMemoryFileIndex[file:/Users/lixiao/IdeaProjects/spark/sql/core/spark-warehouse/org.apache...,
>  PartitionFilters: [isnotnull(k#2722L), dynamicpruningexpression(k#2722L IN 
> subquery2741)], PushedFilters: [], ReadSchema: struct
>:+- Subquery subquery2741, [id=#358]
>:   +- *(2) HashAggregate(keys=[k#2724L], functions=[], 
> output=[k#2724L#2740L])
>:  +- Exchange hashpartitioning(k#2724L, 5), true, [id=#354]
>: +- *(1) HashAggregate(keys=[k#2724L], functions=[], 
> output=[k#2724L])
>:+- *(1) Project [k#2724L]
>:   +- *(1) Filter (isnotnull(id#2723L) AND (id#2723L 
> < 2))
>:  +- *(1) ColumnarToRow
>: +- FileScan parquet 
> default.df2[id#2723L,k#2724L] Batched: true, DataFilters: 
> [isnotnull(id#2723L), (id#2723L < 2)], Format: Parquet, Location: 
> PrunedInMemoryFileIndex[file:/Users/lixiao/IdeaProjects/spark/sql/core/spark-warehouse/org.apache...,
>  PartitionFilters: [isnotnull(k#2724L)], PushedFilters: [IsNotNull(id), 
> LessThan(id,2)], ReadSchema: struct
>+- BroadcastExchange HashedRelationBroadcastMode(List(input[0, bigint, 
> true])), [id=#379]
>   +- *(1) Project [k#2724L]
>  +- *(1) Filter (isnotnull(id#2723L) AND (id#2723L < 2))
> +- *(1) ColumnarToRow
>+- FileScan parquet default.df2[id#2723L,k#2724L] Batched: 
> true, DataFilters: [isnotnull(id#2723L), (id#2723L < 2)], Format: Parquet, 
> Location: 
> PrunedInMemoryFileIndex[file:/Users/lixiao/IdeaProjects/spark/sql/core/spark-warehouse/org.apache...,
>  PartitionFilters: [isnotnull(k#2724L)], PushedFilters: [IsNotNull(id), 
> LessThan(id,2)], ReadSchema: struct
> {code}
> However, the output of the FileScan node in EXPLAIN FORMATTED does not show 
> the effect of DPP.
> {code:java}
> == Physical Plan ==
> * Project (9)
> +- * BroadcastHashJoin Inner BuildRight (8)
>    :- * ColumnarToRow (2)
>    :  +- Scan parquet default.df1 (1)
>    +- BroadcastExchange (7)
>       +- * Project (6)
>          +- * Filter (5)
>             +- * ColumnarToRow (4)
>                +- Scan parquet default.df2 (3)
> (1) Scan parquet default.df1 
> Output: [id#2716L, k#2717L]
> {code}
>  



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-29092) EXPLAIN FORMATTED does not work well with DPP

2019-09-15 Thread Xiao Li (Jira)
Xiao Li created SPARK-29092:
---

 Summary: EXPLAIN FORMATTED does not work well with DPP
 Key: SPARK-29092
 URL: https://issues.apache.org/jira/browse/SPARK-29092
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 3.0.0
Reporter: Xiao Li


 
{code:java}
withSQLConf(SQLConf.DYNAMIC_PARTITION_PRUNING_ENABLED.key -> "true",
  SQLConf.DYNAMIC_PARTITION_PRUNING_REUSE_BROADCAST.key -> "false") {
  withTable("df1", "df2") {
spark.range(1000)
  .select(col("id"), col("id").as("k"))
  .write
  .partitionBy("k")
  .format(tableFormat)
  .mode("overwrite")
  .saveAsTable("df1")

spark.range(100)
  .select(col("id"), col("id").as("k"))
  .write
  .partitionBy("k")
  .format(tableFormat)
  .mode("overwrite")
  .saveAsTable("df2")

sql("EXPLAIN FORMATTED SELECT df1.id, df2.k FROM df1 JOIN df2 ON df1.k = 
df2.k AND df2.id < 2")
  .show(false)

sql("EXPLAIN EXTENDED SELECT df1.id, df2.k FROM df1 JOIN df2 ON df1.k = 
df2.k AND df2.id < 2")
  .show(false)
  }
}
{code}
The output of EXPLAIN EXTENDED is as expected.
{code:java}

== Physical Plan ==
*(2) Project [id#2721L, k#2724L]
+- *(2) BroadcastHashJoin [k#2722L], [k#2724L], Inner, BuildRight
   :- *(2) ColumnarToRow
   :  +- FileScan parquet default.df1[id#2721L,k#2722L] Batched: true, 
DataFilters: [], Format: Parquet, Location: 
PrunedInMemoryFileIndex[file:/Users/lixiao/IdeaProjects/spark/sql/core/spark-warehouse/org.apache...,
 PartitionFilters: [isnotnull(k#2722L), dynamicpruningexpression(k#2722L IN 
subquery2741)], PushedFilters: [], ReadSchema: struct
   :+- Subquery subquery2741, [id=#358]
   :   +- *(2) HashAggregate(keys=[k#2724L], functions=[], 
output=[k#2724L#2740L])
   :  +- Exchange hashpartitioning(k#2724L, 5), true, [id=#354]
   : +- *(1) HashAggregate(keys=[k#2724L], functions=[], 
output=[k#2724L])
   :+- *(1) Project [k#2724L]
   :   +- *(1) Filter (isnotnull(id#2723L) AND (id#2723L < 
2))
   :  +- *(1) ColumnarToRow
   : +- FileScan parquet 
default.df2[id#2723L,k#2724L] Batched: true, DataFilters: [isnotnull(id#2723L), 
(id#2723L < 2)], Format: Parquet, Location: 
PrunedInMemoryFileIndex[file:/Users/lixiao/IdeaProjects/spark/sql/core/spark-warehouse/org.apache...,
 PartitionFilters: [isnotnull(k#2724L)], PushedFilters: [IsNotNull(id), 
LessThan(id,2)], ReadSchema: struct
   +- BroadcastExchange HashedRelationBroadcastMode(List(input[0, bigint, 
true])), [id=#379]
  +- *(1) Project [k#2724L]
 +- *(1) Filter (isnotnull(id#2723L) AND (id#2723L < 2))
+- *(1) ColumnarToRow
   +- FileScan parquet default.df2[id#2723L,k#2724L] Batched: true, 
DataFilters: [isnotnull(id#2723L), (id#2723L < 2)], Format: Parquet, Location: 
PrunedInMemoryFileIndex[file:/Users/lixiao/IdeaProjects/spark/sql/core/spark-warehouse/org.apache...,
 PartitionFilters: [isnotnull(k#2724L)], PushedFilters: [IsNotNull(id), 
LessThan(id,2)], ReadSchema: struct

{code}
However, the output of the FileScan node in EXPLAIN FORMATTED does not show 
the effect of DPP.
{code:java}
== Physical Plan ==
* Project (9)
+- * BroadcastHashJoin Inner BuildRight (8)
   :- * ColumnarToRow (2)
   :  +- Scan parquet default.df1 (1)
   +- BroadcastExchange (7)
      +- * Project (6)
         +- * Filter (5)
            +- * ColumnarToRow (4)
               +- Scan parquet default.df2 (3)

(1) Scan parquet default.df1 
Output: [id#2716L, k#2717L]
{code}
 



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-28372) Document Spark WEB UI

2019-09-14 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li resolved SPARK-28372.
-
Fix Version/s: 3.0.0
   Resolution: Fixed

> Document Spark WEB UI
> -
>
> Key: SPARK-28372
> URL: https://issues.apache.org/jira/browse/SPARK-28372
> Project: Spark
>  Issue Type: Umbrella
>  Components: Documentation, Web UI
>Affects Versions: 3.0.0
>Reporter: Xiao Li
>Priority: Major
> Fix For: 3.0.0
>
>
> Spark web UIs are used to monitor the status and resource consumption of 
> Spark applications and clusters. However, we do not have the corresponding 
> documentation, which makes the UIs hard for end users to use and understand. 



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-28373) Document JDBC/ODBC Server page

2019-09-14 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li resolved SPARK-28373.
-
Fix Version/s: 3.0.0
 Assignee: Pablo Langa Blanco
   Resolution: Fixed

> Document JDBC/ODBC Server page
> --
>
> Key: SPARK-28373
> URL: https://issues.apache.org/jira/browse/SPARK-28373
> Project: Spark
>  Issue Type: Sub-task
>  Components: Documentation, Web UI
>Affects Versions: 3.0.0
>Reporter: Xiao Li
>Assignee: Pablo Langa Blanco
>Priority: Major
> Fix For: 3.0.0
>
>
> !https://user-images.githubusercontent.com/5399861/60809590-9dcf2500-a1bd-11e9-826e-33729bb97daf.png|width=1720,height=503!
>  
> [https://github.com/apache/spark/pull/25062] added the new columns CLOSE TIME 
> and EXECUTION TIME. It is hard to understand the difference between them, so 
> we need to document them for end users.
>  



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-28134) Trigonometric Functions

2019-09-13 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li resolved SPARK-28134.
-
Resolution: Later

> Trigonometric Functions
> ---
>
> Key: SPARK-28134
> URL: https://issues.apache.org/jira/browse/SPARK-28134
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Yuming Wang
>Priority: Major
>
> ||Function (radians)||Function (degrees)||Description||
> |{{acos(_x_)}}|{{acosd(_x_)}}|inverse cosine|
> |{{asin(_x_)}}|{{asind(_x_)}}|inverse sine|
> |{{atan(_x_)}}|{{atand(_x_)}}|inverse tangent|
> |{{atan2(_y_, _x_)}}|{{atan2d(_y_, _x_)}}|inverse tangent of _y_/_x_|
> |{{cos(_x_)}}|{{cosd(_x_)}}|cosine|
> |{{cot(_x_)}}|{{cotd(_x_)}}|cotangent|
> |{{sin(_x_)}}|{{sind(_x_)}}|sine|
> |{{tan(_x_)}}|{{tand(_x_)}}|tangent|
>  
> [https://www.postgresql.org/docs/12/functions-math.html#FUNCTIONS-MATH-TRIG-TABLE]
>  
>  
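Spark already ships the radian-based functions in this table; the degree-based variants can be emulated with the existing {{radians}}/{{degrees}} functions, e.g. cosd(x) as cos(radians(x)). A minimal sketch:
{code:java}
spark.sql("SELECT cos(radians(60.0))").show()        // ~0.5, i.e. cosd(60)
spark.sql("SELECT degrees(atan2(1.0, 1.0))").show()  // ~45.0, i.e. atan2d(1, 1)
{code}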



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-28909) ANSI SQL: delete and update does not support in Spark

2019-09-13 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li resolved SPARK-28909.
-
Resolution: Duplicate

> ANSI SQL: delete and update does not support in Spark
> -
>
> Key: SPARK-28909
> URL: https://issues.apache.org/jira/browse/SPARK-28909
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: ABHISHEK KUMAR GUPTA
>Priority: Major
>
> delete and update are supported in PostgreSQL:
> create table emp_test(id int);
> insert into emp_test values(100);
> insert into emp_test values(200);
> select * from emp_test;
> *delete from emp_test where id=100;*
> select * from emp_test;
> *update emp_test set id=500 where id=200;*
> select * from emp_test;



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-28663) Aggregate Functions for Statistics

2019-09-13 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li resolved SPARK-28663.
-
Resolution: Later

> Aggregate Functions for Statistics
> --
>
> Key: SPARK-28663
> URL: https://issues.apache.org/jira/browse/SPARK-28663
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Yuming Wang
>Priority: Major
>
> ||Function||Argument Type||Return Type||Partial Mode||Description||
> |{{corr(_Y_, _X_)}}|{{double precision}}|{{double precision}}|Yes|correlation coefficient|
> |{{covar_pop(_Y_, _X_)}}|{{double precision}}|{{double precision}}|Yes|population covariance|
> |{{covar_samp(_Y_, _X_)}}|{{double precision}}|{{double precision}}|Yes|sample covariance|
> |{{regr_avgx(_Y_, _X_)}}|{{double precision}}|{{double precision}}|Yes|average of the independent variable ({{sum(_X_)/_N_}})|
> |{{regr_avgy(_Y_, _X_)}}|{{double precision}}|{{double precision}}|Yes|average of the dependent variable ({{sum(_Y_)/_N_}})|
> |{{regr_count(_Y_, _X_)}}|{{double precision}}|{{bigint}}|Yes|number of input rows in which both expressions are nonnull|
> |{{regr_intercept(_Y_, _X_)}}|{{double precision}}|{{double precision}}|Yes|y-intercept of the least-squares-fit linear equation determined by the (_X_, _Y_) pairs|
> |{{regr_r2(_Y_, _X_)}}|{{double precision}}|{{double precision}}|Yes|square of the correlation coefficient|
> |{{regr_slope(_Y_, _X_)}}|{{double precision}}|{{double precision}}|Yes|slope of the least-squares-fit linear equation determined by the (_X_, _Y_) pairs|
> |{{regr_sxx(_Y_, _X_)}}|{{double precision}}|{{double precision}}|Yes|{{sum(_X_^2) - sum(_X_)^2/_N_}} (“sum of squares” of the independent variable)|
> |{{regr_sxy(_Y_, _X_)}}|{{double precision}}|{{double precision}}|Yes|{{sum(_X_*_Y_) - sum(_X_) * sum(_Y_)/_N_}} (“sum of products” of independent times dependent variable)|
> |{{regr_syy(_Y_, _X_)}}|{{double precision}}|{{double precision}}|Yes|{{sum(_Y_^2) - sum(_Y_)^2/_N_}} (“sum of squares” of the dependent variable)|
> [https://www.postgresql.org/docs/11/functions-aggregate.html#FUNCTIONS-AGGREGATE-STATISTICS-TABLE]
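Spark already provides {{corr}}, {{covar_pop}}, and {{covar_samp}} from this table; the {{regr_*}} family is what this ticket tracked. A minimal sketch with the existing functions, assuming a SparkSession named {{spark}}:
{code:java}
import spark.implicits._

val df = Seq((1.0, 2.0), (2.0, 4.1), (3.0, 5.9)).toDF("x", "y")
// Pearson correlation, population covariance, and sample covariance of y vs. x.
df.selectExpr("corr(y, x)", "covar_pop(y, x)", "covar_samp(y, x)").show()
{code}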



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-28731) Support limit on recursive queries

2019-09-13 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li resolved SPARK-28731.
-
Resolution: Later

> Support limit on recursive queries
> --
>
> Key: SPARK-28731
> URL: https://issues.apache.org/jira/browse/SPARK-28731
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Peter Toth
>Priority: Minor
>
> Recursive queries should support LIMIT and stop recursion once the required 
> number of rows is reached.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-28661) Hypothetical-Set Aggregate Functions

2019-09-13 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li resolved SPARK-28661.
-
Resolution: Later

> Hypothetical-Set Aggregate Functions
> 
>
> Key: SPARK-28661
> URL: https://issues.apache.org/jira/browse/SPARK-28661
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Yuming Wang
>Priority: Major
>
> ||Function||Direct Argument Type(s)||Aggregated Argument Type(s)||Return 
> Type||Partial Mode||Description||
> |{{rank(_args_) WITHIN GROUP (ORDER BY sorted_args)}}|{{VARIADIC "any"}}|{{VARIADIC "any"}}|{{bigint}}|No|rank of the hypothetical row, with gaps for duplicate rows|
> |{{dense_rank(_args_) WITHIN GROUP (ORDER BY sorted_args)}}|{{VARIADIC "any"}}|{{VARIADIC "any"}}|{{bigint}}|No|rank of the hypothetical row, without gaps|
> |{{percent_rank(_args_) WITHIN GROUP (ORDER BY sorted_args)}}|{{VARIADIC "any"}}|{{VARIADIC "any"}}|{{double precision}}|No|relative rank of the hypothetical row, ranging from 0 to 1|
> |{{cume_dist(_args_) WITHIN GROUP (ORDER BY sorted_args)}}|{{VARIADIC "any"}}|{{VARIADIC "any"}}|{{double precision}}|No|relative rank of the hypothetical row, ranging from 1/_N_ to 1|
> [https://www.postgresql.org/docs/11/functions-aggregate.html#FUNCTIONS-HYPOTHETICAL-TABLE]
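Spark lacks the WITHIN GROUP hypothetical-set aggregates, but the same answer can be computed by unioning the hypothetical row in and ranking it. A sketch where the table {{t}}, its column {{v}}, and the hypothetical value 42 are all illustrative:
{code:java}
// rank(42) WITHIN GROUP (ORDER BY v), expressed with an ordinary window function.
spark.sql("""
  SELECT rnk FROM (
    SELECT is_hypo, rank() OVER (ORDER BY v) AS rnk
    FROM (SELECT v, false AS is_hypo FROM t
          UNION ALL
          SELECT 42 AS v, true AS is_hypo)
  ) WHERE is_hypo
""").show()
{code}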



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-28171) Difference of interval type conversion between SparkSQL and PostgreSQL

2019-09-13 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li resolved SPARK-28171.
-
Resolution: Later

> Difference of interval type conversion between SparkSQL and PostgreSQL
> --
>
> Key: SPARK-28171
> URL: https://issues.apache.org/jira/browse/SPARK-28171
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Zhu, Lipeng
>Priority: Major
>
> When calculating between timestamp and interval:
> {code:sql}
> select timestamp '2019-01-01 00:00:00' + interval '1 2:03:04' day to second, 
> timestamp '2019-01-01 00:00:00' + interval '-1 2:03:04' day to second
> {code}
>  * PostgreSQL return *2019-01-02 02:03:04*    *2018-12-31 02:03:04*
>  * SparkSQL return    *2019-01-02 02:03:04*    *2018-12-30 21:56:56*
> {code:sql}
> select timestamp '2019-01-01 00:00:00' + interval '1 -2:03:04' day to second, 
> timestamp '2019-01-01 00:00:00' + interval '-1 -2:03:04' day to second
> {code}
>  * PostgreSQL return *2019-01-01 21:56:56*    *2018-12-30 21:56:56*
>  * SparkSQL return    _*Interval string does not match day-time format of 'd 
> h:m:s.n': '1 -2:03:04'(line 1, pos 50)*_



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-28379) Correlated scalar subqueries must be aggregated

2019-09-13 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li resolved SPARK-28379.
-
Resolution: Later

> Correlated scalar subqueries must be aggregated
> ---
>
> Key: SPARK-28379
> URL: https://issues.apache.org/jira/browse/SPARK-28379
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Yuming Wang
>Priority: Major
>
> {code:sql}
> create or replace temporary view INT8_TBL as select * from
>   (values
> (123, 456),
> (123, 4567890123456789),
> (4567890123456789, 123),
> (4567890123456789, 4567890123456789),
> (4567890123456789, -4567890123456789))
>   as v(q1, q2);
> select * from
>   int8_tbl t1 left join
>   (select q1 as x, 42 as y from int8_tbl t2) ss
>   on t1.q2 = ss.x
> where
>   1 = (select 1 from int8_tbl t3 where ss.y is not null limit 1)
> order by 1,2;
> {code}
> PostgreSQL:
> {noformat}
> postgres=# select * from
> postgres-#   int8_tbl t1 left join
> postgres-#   (select q1 as x, 42 as y from int8_tbl t2) ss
> postgres-#   on t1.q2 = ss.x
> postgres-# where
> postgres-#   1 = (select 1 from int8_tbl t3 where ss.y is not null limit 1)
> postgres-# order by 1,2;
> q1|q2|x | y
> --+--+--+
>   123 | 4567890123456789 | 4567890123456789 | 42
>   123 | 4567890123456789 | 4567890123456789 | 42
>   123 | 4567890123456789 | 4567890123456789 | 42
>  4567890123456789 |  123 |  123 | 42
>  4567890123456789 |  123 |  123 | 42
>  4567890123456789 | 4567890123456789 | 4567890123456789 | 42
>  4567890123456789 | 4567890123456789 | 4567890123456789 | 42
>  4567890123456789 | 4567890123456789 | 4567890123456789 | 42
> (8 rows)
> {noformat}
> Spark SQL:
> {noformat}
> spark-sql> select * from
>  >   int8_tbl t1 left join
>  >   (select q1 as x, 42 as y from int8_tbl t2) ss
>  >   on t1.q2 = ss.x
>  > where
>  >   1 = (select 1 from int8_tbl t3 where ss.y is not null limit 1)
>  > order by 1,2;
> Error in query: Correlated scalar subqueries must be aggregated: GlobalLimit 1
> +- LocalLimit 1
>+- Project [1 AS 1#169]
>   +- Filter isnotnull(outer(y#167))
>  +- SubqueryAlias `t3`
> +- SubqueryAlias `int8_tbl`
>+- Project [q1#164L, q2#165L]
>   +- Project [col1#162L AS q1#164L, col2#163L AS q2#165L]
>  +- SubqueryAlias `v`
> +- LocalRelation [col1#162L, col2#163L]
> ;;
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-28316) Decimal precision issue

2019-09-13 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li resolved SPARK-28316.
-
Resolution: Later

> Decimal precision issue
> ---
>
> Key: SPARK-28316
> URL: https://issues.apache.org/jira/browse/SPARK-28316
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Yuming Wang
>Priority: Major
>
> Multiply check:
> {code:sql}
> -- Spark SQL
> spark-sql> select cast(-34338492.215397047 as decimal(38, 10)) * 
> cast(-34338492.215397047 as decimal(38, 10));
> 1179132047626883.596862
> -- PostgreSQL
> postgres=# select cast(-34338492.215397047 as numeric(38, 10)) * 
> cast(-34338492.215397047 as numeric(38, 10));
>?column?
> ---
>  1179132047626883.59686213585632020900
> (1 row)
> {code}
> Division check:
> {code:sql}
> -- Spark SQL
> spark-sql> select cast(93901.57763026 as decimal(38, 10)) / cast(4.31 as 
> decimal(38, 10));
> 21786.908963
> -- PostgreSQL
> postgres=# select cast(93901.57763026 as numeric(38, 10)) / cast(4.31 as 
> numeric(38, 10));
>   ?column?
> 
>  21786.908962937355
> (1 row)
> {code}
> POWER(10, LN(value)) check:
> {code:sql}
> -- Spark SQL
> spark-sql> SELECT CAST(POWER(cast('10' as decimal(38, 18)), 
> LN(ABS(round(cast(-24926804.04504742 as decimal(38, 10)),200 AS 
> decimal(38, 10));
> 107511333880051856
> -- PostgreSQL
> postgres=# SELECT CAST(POWER(cast('10' as numeric(38, 18)), 
> LN(ABS(round(cast(-24926804.04504742 as numeric(38, 10)),200 AS 
> numeric(38, 10));
>  power
> ---
>  107511333880052007.0414112467
> (1 row)
> {code}
> STDDEV and VARIANCE returns double type:
> {code:sql}
> -- Spark SQL
> spark-sql> create temporary view t1 as select * from values
>  >   (cast(-24926804.04504742 as decimal(38, 10))),
>  >   (cast(16397.038491 as decimal(38, 10))),
>  >   (cast(7799461.4119 as decimal(38, 10)))
>  >   as t1(t);
> spark-sql> SELECT STDDEV(t), VARIANCE(t) FROM t1;
> 1.7096528995154984E7  2.922913036821751E14
> -- PostgreSQL
> postgres=# SELECT STDDEV(t), VARIANCE(t)  from (values 
> (cast(-24926804.04504742 as decimal(38, 10))), (cast(16397.038491 as 
> decimal(38, 10))), (cast(7799461.4119 as decimal(38, 10 t1(t);
> stddev |   variance
> ---+--
>  17096528.99515498420743029415 | 292291303682175.094017569588
> (1 row)
> {code}
> EXP returns double type:
> {code:sql}
> -- Spark SQL
> spark-sql> select exp(cast(1.0 as decimal(31,30)));
> 2.718281828459045
> -- PostgreSQL
> postgres=# select exp(cast(1.0 as decimal(31,30)));
>exp
> --
>  2.718281828459045235360287471353
> (1 row)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-25411) Implement range partition in Spark

2019-09-13 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-25411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li updated SPARK-25411:

Parent: (was: SPARK-27764)
Issue Type: New Feature  (was: Sub-task)

> Implement range partition in Spark
> --
>
> Key: SPARK-25411
> URL: https://issues.apache.org/jira/browse/SPARK-25411
> Project: Spark
>  Issue Type: New Feature
>  Components: SQL
>Affects Versions: 2.3.0
>Reporter: Wang, Gang
>Priority: Major
> Attachments: range partition design doc.pdf
>
>
> In our production environment, there are some partitioned fact tables, which 
> are all quite huge. To accelerate join execution, we need to make them 
> bucketed as well. Then comes the problem: if the bucket number is large, 
> there may be too many files (file count = bucket number * partition count), 
> which may put pressure on HDFS. And if the bucket number is small, Spark will 
> launch an equal number of tasks to read/write it.
>  
> So, can we implement a new partition type that supports range values, just 
> like range partitioning in Oracle/MySQL 
> ([https://docs.oracle.com/cd/E17952_01/mysql-5.7-en/partitioning-range.html])?
>  Say, we can partition by a date column and make every two months a 
> partition, or partition by an integer column and make an interval of 1 a 
> partition.
>  
> Ideally, a feature like range partitioning should be implemented in Hive. 
> However, it has always been hard to update the Hive version in a production 
> environment, and it is much more lightweight and flexible if we implement it 
> in Spark.
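Not the table-level feature requested here, but Spark 2.3+ already exposes range-based repartitioning on the DataFrame side, which covers part of the use case. A sketch with illustrative names ({{df}}, {{order_date}}, and the output path are assumptions):
{code:java}
import org.apache.spark.sql.functions.col

// Split rows into 8 contiguous ranges of order_date before writing.
val ranged = df.repartitionByRange(8, col("order_date"))
ranged.write.format("parquet").mode("overwrite").save("/tmp/range_partitioned")
{code}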



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-25411) Implement range partition in Spark

2019-09-13 Thread Xiao Li (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-25411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16929690#comment-16929690
 ] 

Xiao Li commented on SPARK-25411:
-

I think this is very specific to the implementation of data sources. We can 
discuss it in the DSv2 design instead of as part of feature parity with 
PostgreSQL.

> Implement range partition in Spark
> --
>
> Key: SPARK-25411
> URL: https://issues.apache.org/jira/browse/SPARK-25411
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 2.3.0
>Reporter: Wang, Gang
>Priority: Major
> Attachments: range partition design doc.pdf
>
>
> In our production environment, there are some partitioned fact tables, which 
> are all quite huge. To accelerate join execution, we need to make them 
> bucketed as well. Then comes the problem: if the bucket number is large, 
> there may be too many files (file count = bucket number * partition count), 
> which may put pressure on HDFS. And if the bucket number is small, Spark will 
> launch an equal number of tasks to read/write it.
>  
> So, can we implement a new partition type that supports range values, just 
> like range partitioning in Oracle/MySQL 
> ([https://docs.oracle.com/cd/E17952_01/mysql-5.7-en/partitioning-range.html])?
>  Say, we can partition by a date column and make every two months a 
> partition, or partition by an integer column and make an interval of 1 a 
> partition.
>  
> Ideally, a feature like range partitioning should be implemented in Hive. 
> However, it has always been hard to update the Hive version in a production 
> environment, and it is much more lightweight and flexible if we implement it 
> in Spark.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-28669) System Information Functions

2019-09-13 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li resolved SPARK-28669.
-
Resolution: Later

> System Information Functions
> 
>
> Key: SPARK-28669
> URL: https://issues.apache.org/jira/browse/SPARK-28669
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Yuming Wang
>Priority: Major
>
> ||Name||Return Type||Description||
> |{{current_catalog}}|{{name}}|name of current database (called “catalog” in 
> the SQL standard)|
> |{{current_database()}}|{{name}}|name of current database|
> |{{current_query()}}|{{text}}|text of the currently executing query, as 
> submitted by the client (might contain more than one statement)|
> |{{current_role}}|{{name}}|equivalent to {{current_user}}|
> |{{current_schema[()]}}|{{name}}|name of current schema|
> |{{current_schemas(boolean)}}|{{name[]}}|names of schemas in search 
> path, optionally including implicit schemas|
> |{{current_user}}|{{name}}|user name of current execution context|
> |{{inet_client_addr()}}|{{inet}}|address of the remote connection|
> |{{inet_client_port()}}|{{int}}|port of the remote connection|
> |{{inet_server_addr()}}|{{inet}}|address of the local connection|
> |{{inet_server_port()}}|{{int}}|port of the local connection|
> |{{pg_backend_pid()}}|{{int}}|Process ID of the server process attached to 
> the current session|
> |{{pg_blocking_pids(int)}}|{{int[]}}|Process ID(s) that are blocking 
> specified server process ID from acquiring a lock|
> |{{pg_conf_load_time()}}|{{timestamp with time zone}}|configuration load time|
> |{{pg_current_logfile([text])}}|{{text}}|Primary log file name, or log in 
> the requested format, currently in use by the logging collector|
> |{{pg_my_temp_schema()}}|{{oid}}|OID of session's temporary schema, or 0 if 
> none|
> |{{pg_is_other_temp_schema(oid)}}|{{boolean}}|is schema another 
> session's temporary schema?|
> |{{pg_listening_channels()}}|{{setof text}}|channel names that the session is 
> currently listening on|
> |{{pg_notification_queue_usage()}}|{{double}}|fraction of the asynchronous 
> notification queue currently occupied (0-1)|
> |{{pg_postmaster_start_time()}}|{{timestamp with time zone}}|server start 
> time|
> |{{pg_safe_snapshot_blocking_pids(int)}}|{{int[]}}|Process ID(s) that 
> are blocking specified server process ID from acquiring a safe snapshot|
> |{{pg_trigger_depth()}}|{{int}}|current nesting level of PostgreSQL triggers 
> (0 if not called, directly or indirectly, from inside a trigger)|
> |{{session_user}}|{{name}}|session user name|
> |{{user}}|{{name}}|equivalent to {{current_user}}|
> Example:
> {code:sql}
> postgres=# SELECT pg_collation_for(description) FROM pg_description LIMIT 1;
>  pg_collation_for
> --
>  "default"
> (1 row)
> {code}
> https://www.postgresql.org/docs/10/functions-info.html
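Spark already covers a small part of this list; for example, the current database is available both in SQL and through the Catalog API:
{code:java}
spark.sql("SELECT current_database()").show()  // e.g. default
val db = spark.catalog.currentDatabase         // same information via the Catalog API
{code}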



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-28400) Add built-in Array Functions: array_upper

2019-09-13 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li resolved SPARK-28400.
-
Resolution: Later

> Add built-in Array Functions: array_upper
> -
>
> Key: SPARK-28400
> URL: https://issues.apache.org/jira/browse/SPARK-28400
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Peter Toth
>Priority: Major
>
> ||Function||Return Type||Description||Example||Result||
> |{{array_upper(anyarray, int)}}|int|returns upper bound 
> of the requested array dimension|array_upper(ARRAY[1,8,3,7], 1)|4|
> [https://www.postgresql.org/docs/current/functions-array.html]
>  
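Spark arrays are one-dimensional, so the existing {{size}} function already answers what {{array_upper(arr, 1)}} would; a minimal sketch:
{code:java}
spark.sql("SELECT size(array(1, 8, 3, 7))").show()  // 4
{code}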



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-28865) Table inheritance

2019-09-13 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li resolved SPARK-28865.
-
Resolution: Won't Fix

> Table inheritance
> -
>
> Key: SPARK-28865
> URL: https://issues.apache.org/jira/browse/SPARK-28865
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Yuming Wang
>Priority: Major
>
> PostgreSQL implements table inheritance, which can be a useful tool for 
> database designers. (SQL:1999 and later define a type inheritance feature, 
> which differs in many respects from the features described here.)
>  
> [https://www.postgresql.org/docs/11/ddl-inherit.html|https://www.postgresql.org/docs/9.5/ddl-inherit.html]
> [https://www.postgresql.org/docs/11/tutorial-inheritance.html]



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-28566) window functions should not be allowed in window definitions

2019-09-13 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li resolved SPARK-28566.
-
Resolution: Won't Fix

> window functions should not be allowed in window definitions
> 
>
> Key: SPARK-28566
> URL: https://issues.apache.org/jira/browse/SPARK-28566
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Dylan Guedes
>Priority: Major
>
> Currently, Spark allows the usage of window functions inside window 
> definitions, such as:
> {code:sql}
>  SELECT rank() OVER (ORDER BY rank() OVER (ORDER BY random()));{code}
> However, in PgSQL such behavior is not allowed:
> {code:sql}
> postgres=# SELECT rank() OVER (ORDER BY rank() OVER (ORDER BY random()));
> ERROR:  window functions are not allowed in window definitions
> LINE 1: SELECT rank() OVER (ORDER BY rank() OVER (ORDER BY random())...{code}



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-28795) Document CREATE VIEW statement in SQL Reference.

2019-09-13 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li resolved SPARK-28795.
-
Fix Version/s: 3.0.0
 Assignee: Aman Omer
   Resolution: Fixed

> Document CREATE VIEW statement in SQL Reference.
> 
>
> Key: SPARK-28795
> URL: https://issues.apache.org/jira/browse/SPARK-28795
> Project: Spark
>  Issue Type: Sub-task
>  Components: Documentation, SQL
>Affects Versions: 2.4.3
>Reporter: Dilip Biswal
>Assignee: Aman Omer
>Priority: Major
> Fix For: 3.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-28796) Document DROP DATABASE statement in SQL Reference.

2019-09-13 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li resolved SPARK-28796.
-
Fix Version/s: 3.0.0
 Assignee: Sandeep Katta
   Resolution: Fixed

> Document DROP DATABASE statement in SQL Reference.
> --
>
> Key: SPARK-28796
> URL: https://issues.apache.org/jira/browse/SPARK-28796
> Project: Spark
>  Issue Type: Sub-task
>  Components: Documentation, SQL
>Affects Versions: 2.4.3
>Reporter: Dilip Biswal
>Assignee: Sandeep Katta
>Priority: Major
> Fix For: 3.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-28828) Document REFRESH statement in SQL Reference.

2019-09-13 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li reassigned SPARK-28828:
---

Assignee: kevin yu

> Document REFRESH statement in SQL Reference.
> 
>
> Key: SPARK-28828
> URL: https://issues.apache.org/jira/browse/SPARK-28828
> Project: Spark
>  Issue Type: Sub-task
>  Components: Documentation, SQL
>Affects Versions: 2.4.3
>Reporter: jobit mathew
>Assignee: kevin yu
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-28828) Document REFRESH statement in SQL Reference.

2019-09-13 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li resolved SPARK-28828.
-
Fix Version/s: 3.0.0
   Resolution: Fixed

> Document REFRESH statement in SQL Reference.
> 
>
> Key: SPARK-28828
> URL: https://issues.apache.org/jira/browse/SPARK-28828
> Project: Spark
>  Issue Type: Sub-task
>  Components: Documentation, SQL
>Affects Versions: 2.4.3
>Reporter: jobit mathew
>Assignee: kevin yu
>Priority: Major
> Fix For: 3.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-29048) Query optimizer slow when using Column.isInCollection() with a large size collection

2019-09-12 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-29048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li resolved SPARK-29048.
-
Fix Version/s: 3.0.0
   Resolution: Fixed

> Query optimizer slow when using Column.isInCollection() with a large size 
> collection
> 
>
> Key: SPARK-29048
> URL: https://issues.apache.org/jira/browse/SPARK-29048
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 2.4.4
>Reporter: Weichen Xu
>Assignee: Weichen Xu
>Priority: Major
> Fix For: 3.0.0
>
>
> The query optimizer is slow when using Column.isInCollection() with a large 
> collection.
> The query optimizer takes a long time, and on the UI all I see is "Running 
> commands". This can take from tens of minutes to 11 hours depending on how 
> many values there are.
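For reference, a sketch of the API in question ({{df}} and the values are illustrative); the size of the collection is what drives the optimizer cost described above:
{code:java}
import org.apache.spark.sql.functions.col

val ids = (1 to 1000000).toSeq                           // a large collection
val filtered = df.filter(col("id").isInCollection(ids))  // slow to optimize before the fix
{code}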



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-27149) Support LOCALTIMESTAMP when ANSI mode enabled

2019-09-12 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-27149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li resolved SPARK-27149.
-
Resolution: Won't Fix

> Support LOCALTIMESTAMP when ANSI mode enabled
> -
>
> Key: SPARK-27149
> URL: https://issues.apache.org/jira/browse/SPARK-27149
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Takeshi Yamamuro
>Priority: Major
>
> LOCALTIMESTAMP is defined in the ANSI standard and should be supported;
> {code}
> postgres=# select LOCALTIMESTAMP;
>  timestamp  
> 
>  2019-03-13 16:53:42.086357
> {code}



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-28453) Support recursive view syntax

2019-09-12 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li resolved SPARK-28453.
-
Resolution: Won't Fix

> Support recursive view syntax
> -
>
> Key: SPARK-28453
> URL: https://issues.apache.org/jira/browse/SPARK-28453
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Peter Toth
>Priority: Minor
>
> PostgreSQL does support recursive view syntax:
> {noformat}
> CREATE RECURSIVE VIEW nums (n) AS
>   VALUES (1)
>   UNION ALL
>   SELECT n+1 FROM nums WHERE n < 5
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-28453) Support recursive view syntax

2019-09-12 Thread Xiao Li (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-28453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16928947#comment-16928947
 ] 

Xiao Li commented on SPARK-28453:
-

This is a very low priority task. Let us not spend time on it. 

> Support recursive view syntax
> -
>
> Key: SPARK-28453
> URL: https://issues.apache.org/jira/browse/SPARK-28453
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Peter Toth
>Priority: Minor
>
> PostgreSQL does support recursive view syntax:
> {noformat}
> CREATE RECURSIVE VIEW nums (n) AS
>   VALUES (1)
>   UNION ALL
>   SELECT n+1 FROM nums WHERE n < 5
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Closed] (SPARK-28298) Fully support char and varchar types

2019-09-12 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li closed SPARK-28298.
---

> Fully support char and varchar types
> 
>
> Key: SPARK-28298
> URL: https://issues.apache.org/jira/browse/SPARK-28298
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Zhu, Lipeng
>Priority: Major
>
> Executing the SQL below in Spark, the result is "abcdef". But the result in 
> other DBMSs is "abc" (I think this is more sensible).
> {code:sql}
> select cast("abcdef" as char(3));
> {code}
> And then I checked the source code; it seems char/varchar are only used 
> during DDL parsing.
> {code:java}
> /**
>  * Hive char type. Similar to other HiveStringType's, these datatypes should 
> only be used for
>  * parsing, and should NOT be used anywhere else. Any instance of these data 
> types should be
>  * replaced by a [[StringType]] before analysis.
>  */
> case class CharType(length: Int) extends HiveStringType {
>   override def simpleString: String = s"char($length)"
> }
> /**
>  * Hive varchar type. Similar to other HiveStringType's, these datatypes 
> should only be used for
>  * parsing, and should NOT be used anywhere else. Any instance of these data 
> types should be
>  * replaced by a [[StringType]] before analysis.
>  */
> case class VarcharType(length: Int) extends HiveStringType {
>   override def simpleString: String = s"varchar($length)"
> }
> {code}
> Is this behavior expected? 



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Closed] (SPARK-28448) Implement ILIKE operator

2019-09-12 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li closed SPARK-28448.
---

> Implement ILIKE operator
> 
>
> Key: SPARK-28448
> URL: https://issues.apache.org/jira/browse/SPARK-28448
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Yuming Wang
>Priority: Major
>
> The key word {{ILIKE}} can be used instead of {{LIKE}} to make the match 
> case-insensitive according to the active locale. This is not in the SQL 
> standard but is a PostgreSQL extension.
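Although {{ILIKE}} was not added, an equivalent match can be written today by lowering both sides (unlike PostgreSQL's {{ILIKE}}, this does not consult the active locale); table and column names are illustrative:
{code:java}
spark.sql("SELECT * FROM people WHERE lower(name) LIKE lower('%SMith%')").show()
{code}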



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-28298) Fully support char and varchar types

2019-09-12 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li resolved SPARK-28298.
-
Resolution: Won't Fix

> Fully support char and varchar types
> 
>
> Key: SPARK-28298
> URL: https://issues.apache.org/jira/browse/SPARK-28298
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Zhu, Lipeng
>Priority: Major
>
> Executing the SQL below in Spark, the result is "abcdef". But the result in 
> other DBMSs is "abc" (I think this is more sensible).
> {code:sql}
> select cast("abcdef" as char(3));
> {code}
> And then I checked the source code; it seems char/varchar are only used 
> during DDL parsing.
> {code:java}
> /**
>  * Hive char type. Similar to other HiveStringType's, these datatypes should 
> only be used for
>  * parsing, and should NOT be used anywhere else. Any instance of these data 
> types should be
>  * replaced by a [[StringType]] before analysis.
>  */
> case class CharType(length: Int) extends HiveStringType {
>   override def simpleString: String = s"char($length)"
> }
> /**
>  * Hive varchar type. Similar to other HiveStringType's, these datatypes 
> should only be used for
>  * parsing, and should NOT be used anywhere else. Any instance of these data 
> types should be
>  * replaced by a [[StringType]] before analysis.
>  */
> case class VarcharType(length: Int) extends HiveStringType {
>   override def simpleString: String = s"varchar($length)"
> }
> {code}
> Is this behavior expected? 



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-28448) Implement ILIKE operator

2019-09-12 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li resolved SPARK-28448.
-
Resolution: Won't Fix

> Implement ILIKE operator
> 
>
> Key: SPARK-28448
> URL: https://issues.apache.org/jira/browse/SPARK-28448
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Yuming Wang
>Priority: Major
>
> The key word {{ILIKE}} can be used instead of {{LIKE}} to make the match 
> case-insensitive according to the active locale. This is not in the SQL 
> standard but is a PostgreSQL extension.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Closed] (SPARK-27877) ANSI SQL: LATERAL derived table(T491)

2019-09-12 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-27877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li closed SPARK-27877.
---

> ANSI SQL: LATERAL derived table(T491)
> -
>
> Key: SPARK-27877
> URL: https://issues.apache.org/jira/browse/SPARK-27877
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Yuming Wang
>Priority: Major
>
> Subqueries appearing in {{FROM}} can be preceded by the key word {{LATERAL}}. 
> This allows them to reference columns provided by preceding {{FROM}} items. 
> (Without {{LATERAL}}, each subquery is evaluated independently and so cannot 
> cross-reference any other {{FROM}} item.)
> Table functions appearing in {{FROM}} can also be preceded by the key word 
> {{LATERAL}}, but for functions the key word is optional; the function's 
> arguments can contain references to columns provided by preceding {{FROM}} 
> items in any case.
> A {{LATERAL}} item can appear at top level in the {{FROM}} list, or within a 
> {{JOIN}} tree. In the latter case it can also refer to any items that are on 
> the left-hand side of a {{JOIN}} that it is on the right-hand side of.
> When a {{FROM}} item contains {{LATERAL}} cross-references, evaluation 
> proceeds as follows: for each row of the {{FROM}} item providing the 
> cross-referenced column(s), or set of rows of multiple {{FROM}} items 
> providing the columns, the {{LATERAL}} item is evaluated using that row or 
> row set's values of the columns. The resulting row(s) are joined as usual 
> with the rows they were computed from. This is repeated for each row or set 
> of rows from the column source table(s).
> A trivial example of {{LATERAL}} is
> {code:sql}
> SELECT * FROM foo, LATERAL (SELECT * FROM bar WHERE bar.id = foo.bar_id) ss;
> {code}
> *Feature ID*: T491
> [https://www.postgresql.org/docs/11/queries-table-expressions.html#QUERIES-FROM]
> [https://github.com/postgres/postgres/commit/5ebaaa49445eb1ba7b299bbea3a477d4e4c0430]
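Since the subquery in this trivial example only filters on the correlation 
key, it can already be written in Spark as a plain join; a sketch assuming 
{{foo}} and {{bar}} are registered tables or views:
{code:java}
// Equivalent of:
//   SELECT * FROM foo, LATERAL (SELECT * FROM bar WHERE bar.id = foo.bar_id) ss
// The correlated predicate becomes an ordinary join condition.
spark.sql("""
  SELECT *
  FROM foo
  JOIN bar ss ON ss.id = foo.bar_id
""").show()
{code}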



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-27877) ANSI SQL: LATERAL derived table(T491)

2019-09-12 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-27877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li resolved SPARK-27877.
-
Resolution: Won't Fix

> ANSI SQL: LATERAL derived table(T491)
> -
>
> Key: SPARK-27877
> URL: https://issues.apache.org/jira/browse/SPARK-27877
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Yuming Wang
>Priority: Major
>
> Subqueries appearing in {{FROM}} can be preceded by the key word {{LATERAL}}. 
> This allows them to reference columns provided by preceding {{FROM}} items. 
> (Without {{LATERAL}}, each subquery is evaluated independently and so cannot 
> cross-reference any other {{FROM}} item.)
> Table functions appearing in {{FROM}} can also be preceded by the key word 
> {{LATERAL}}, but for functions the key word is optional; the function's 
> arguments can contain references to columns provided by preceding {{FROM}} 
> items in any case.
> A {{LATERAL}} item can appear at top level in the {{FROM}} list, or within a 
> {{JOIN}} tree. In the latter case it can also refer to any items that are on 
> the left-hand side of a {{JOIN}} that it is on the right-hand side of.
> When a {{FROM}} item contains {{LATERAL}} cross-references, evaluation 
> proceeds as follows: for each row of the {{FROM}} item providing the 
> cross-referenced column(s), or set of rows of multiple {{FROM}} items 
> providing the columns, the {{LATERAL}} item is evaluated using that row or 
> row set's values of the columns. The resulting row(s) are joined as usual 
> with the rows they were computed from. This is repeated for each row or set 
> of rows from the column source table(s).
> A trivial example of {{LATERAL}} is
> {code:sql}
> SELECT * FROM foo, LATERAL (SELECT * FROM bar WHERE bar.id = foo.bar_id) ss;
> {code}
> *Feature ID*: T491
> [https://www.postgresql.org/docs/11/queries-table-expressions.html#QUERIES-FROM]
> [https://github.com/postgres/postgres/commit/5ebaaa49445eb1ba7b299bbea3a477d4e4c0430]



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Closed] (SPARK-27980) Ordered-Set Aggregate Functions

2019-09-12 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-27980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li closed SPARK-27980.
---

> Ordered-Set Aggregate Functions
> ---
>
> Key: SPARK-27980
> URL: https://issues.apache.org/jira/browse/SPARK-27980
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Yuming Wang
>Priority: Major
>
> ||Function||Direct Argument Type(s)||Aggregated Argument Type(s)||Return Type||Partial Mode||Description||
> |{{mode() WITHIN GROUP (ORDER BY sort_expression)}}| |any sortable type|same as sort expression|No|returns the most frequent input value (arbitrarily choosing the first one if there are multiple equally-frequent results)|
> |{{percentile_cont(_fraction_) WITHIN GROUP (ORDER BY sort_expression)}}|{{double precision}}|{{double precision}} or {{interval}}|same as sort expression|No|continuous percentile: returns a value corresponding to the specified fraction in the ordering, interpolating between adjacent input items if needed|
> |{{percentile_cont(_fractions_) WITHIN GROUP (ORDER BY sort_expression)}}|{{double precision[]}}|{{double precision}} or {{interval}}|array of sort expression's type|No|multiple continuous percentile: returns an array of results matching the shape of the _{{fractions}}_ parameter, with each non-null element replaced by the value corresponding to that percentile|
> |{{percentile_disc(_fraction_) WITHIN GROUP (ORDER BY sort_expression)}}|{{double precision}}|any sortable type|same as sort expression|No|discrete percentile: returns the first input value whose position in the ordering equals or exceeds the specified fraction|
> |{{percentile_disc(_fractions_) WITHIN GROUP (ORDER BY sort_expression)}}|{{double precision[]}}|any sortable type|array of sort expression's type|No|multiple discrete percentile: returns an array of results matching the shape of the _{{fractions}}_ parameter, with each non-null element replaced by the input value corresponding to that percentile|
> [https://www.postgresql.org/docs/11/functions-aggregate.html#FUNCTIONS-ORDEREDSET-TABLE]
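The {{WITHIN GROUP}} syntax aside, Spark already exposes exact and approximate 
percentiles as ordinary aggregate functions; a spark-shell sketch (the 
{{sales}} table and {{price}} column are hypothetical):
{code:java}
// Rough functional counterparts of percentile_cont / percentile_disc:
// `percentile` is exact and interpolates between adjacent inputs;
// `approx_percentile` trades exactness for speed on large inputs.
spark.sql("""
  SELECT percentile(price, 0.5)                    AS median,
         percentile(price, array(0.25, 0.5, 0.75)) AS quartiles,
         approx_percentile(price, 0.5)             AS median_approx
  FROM sales
""").show()
{code}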



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-28027) Missing some mathematical operators

2019-09-12 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li resolved SPARK-28027.
-
Resolution: Won't Fix

> Missing some mathematical operators
> ---
>
> Key: SPARK-28027
> URL: https://issues.apache.org/jira/browse/SPARK-28027
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Yuming Wang
>Priority: Major
>
> ||Operator||Description||Example||Result||
> |{{^}}|exponentiation (associates left to right)|{{2.0 ^ 3.0}}|{{8}}|
> |{{\|/}}|square root|{{\|/ 25.0}}|{{5}}|
> |{{\|\|/}}|cube root|{{\|\|/ 27.0}}|{{3}}|
> |{{\!}}|factorial|{{5 !}}|{{120}}|
> |{{\!\!}}|factorial (prefix operator)|{{!! 5}}|{{120}}|
> |{{@}}|absolute value|{{@ -5.0}}|{{5}}|
> |{{#}}|bitwise XOR|{{17 # 5}}|{{20}}|
> |{{<<}}|bitwise shift left|{{1 << 4}}|{{16}}|
> |{{>>}}|bitwise shift right|{{8 >> 2}}|{{2}}|
>  
>  Please note that we already have {{^}}, {{\!}} and {{\!\!}}, but they have 
> different meanings.
> [https://www.postgresql.org/docs/11/functions-math.html]
>  
> [https://www.vertica.com/docs/9.2.x/HTML/Content/Authoring/SQLReferenceManual/LanguageElements/Operators/BitwiseOperators.htm]
>  [https://docs.aws.amazon.com/redshift/latest/dg/r_OPERATOR_SYMBOLS.html]
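Most of these operations already exist in Spark under different spellings, 
which is part of the concern about operator semantics; a spark-shell sketch 
using Spark 3.0-era built-ins:
{code:java}
// PostgreSQL operator -> existing Spark SQL counterpart:
spark.sql("""
  SELECT power(2.0, 3.0)  AS exponentiation,  -- PostgreSQL: 2.0 ^ 3.0
         sqrt(25.0)       AS square_root,     -- PostgreSQL: |/ 25.0
         cbrt(27.0)       AS cube_root,       -- PostgreSQL: ||/ 27.0
         factorial(5)     AS fact,            -- PostgreSQL: 5 !
         abs(-5.0)        AS absolute,        -- PostgreSQL: @ -5.0
         17 ^ 5           AS bit_xor,         -- PostgreSQL: 17 # 5 (Spark's ^ is XOR)
         shiftleft(1, 4)  AS shl,             -- PostgreSQL: 1 << 4
         shiftright(8, 2) AS shr              -- PostgreSQL: 8 >> 2
""").show()
{code}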



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-27980) Ordered-Set Aggregate Functions

2019-09-12 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-27980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li resolved SPARK-27980.
-
Resolution: Won't Fix

> Ordered-Set Aggregate Functions
> ---
>
> Key: SPARK-27980
> URL: https://issues.apache.org/jira/browse/SPARK-27980
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Yuming Wang
>Priority: Major
>
> ||Function||Direct Argument Type(s)||Aggregated Argument Type(s)||Return Type||Partial Mode||Description||
> |{{mode() WITHIN GROUP (ORDER BY sort_expression)}}| |any sortable type|same as sort expression|No|returns the most frequent input value (arbitrarily choosing the first one if there are multiple equally-frequent results)|
> |{{percentile_cont(_fraction_) WITHIN GROUP (ORDER BY sort_expression)}}|{{double precision}}|{{double precision}} or {{interval}}|same as sort expression|No|continuous percentile: returns a value corresponding to the specified fraction in the ordering, interpolating between adjacent input items if needed|
> |{{percentile_cont(_fractions_) WITHIN GROUP (ORDER BY sort_expression)}}|{{double precision[]}}|{{double precision}} or {{interval}}|array of sort expression's type|No|multiple continuous percentile: returns an array of results matching the shape of the _{{fractions}}_ parameter, with each non-null element replaced by the value corresponding to that percentile|
> |{{percentile_disc(_fraction_) WITHIN GROUP (ORDER BY sort_expression)}}|{{double precision}}|any sortable type|same as sort expression|No|discrete percentile: returns the first input value whose position in the ordering equals or exceeds the specified fraction|
> |{{percentile_disc(_fractions_) WITHIN GROUP (ORDER BY sort_expression)}}|{{double precision[]}}|any sortable type|array of sort expression's type|No|multiple discrete percentile: returns an array of results matching the shape of the _{{fractions}}_ parameter, with each non-null element replaced by the input value corresponding to that percentile|
> [https://www.postgresql.org/docs/11/functions-aggregate.html#FUNCTIONS-ORDEREDSET-TABLE]



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-28027) Missing some mathematical operators

2019-09-12 Thread Xiao Li (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-28027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16928940#comment-16928940
 ] 

Xiao Li commented on SPARK-28027:
-

Here, we will only support {{^}}.

> Missing some mathematical operators
> ---
>
> Key: SPARK-28027
> URL: https://issues.apache.org/jira/browse/SPARK-28027
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Yuming Wang
>Priority: Major
>
> ||Operator||Description||Example||Result||
> |{{^}}|exponentiation (associates left to right)|{{2.0 ^ 3.0}}|{{8}}|
> |{{\|/}}|square root|{{\|/ 25.0}}|{{5}}|
> |{{\|\|/}}|cube root|{{\|\|/ 27.0}}|{{3}}|
> |{{\!}}|factorial|{{5 !}}|{{120}}|
> |{{\!\!}}|factorial (prefix operator)|{{!! 5}}|{{120}}|
> |{{@}}|absolute value|{{@ -5.0}}|{{5}}|
> |{{#}}|bitwise XOR|{{17 # 5}}|{{20}}|
> |{{<<}}|bitwise shift left|{{1 << 4}}|{{16}}|
> |{{>>}}|bitwise shift right|{{8 >> 2}}|{{2}}|
>  
>  Please note that we already have {{^}}, {{\!}} and {{\!\!}}, but they have 
> different meanings.
> [https://www.postgresql.org/docs/11/functions-math.html]
>  
> [https://www.vertica.com/docs/9.2.x/HTML/Content/Authoring/SQLReferenceManual/LanguageElements/Operators/BitwiseOperators.htm]
>  [https://docs.aws.amazon.com/redshift/latest/dg/r_OPERATOR_SYMBOLS.html]



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-28027) Missing some mathematical operators

2019-09-12 Thread Xiao Li (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-28027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16928942#comment-16928942
 ] 

Xiao Li commented on SPARK-28027:
-

OK, since {{^}} has different semantics, let us close it.

> Missing some mathematical operators
> ---
>
> Key: SPARK-28027
> URL: https://issues.apache.org/jira/browse/SPARK-28027
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Yuming Wang
>Priority: Major
>
> ||Operator||Description||Example||Result||
> |{{^}}|exponentiation (associates left to right)|{{2.0 ^ 3.0}}|{{8}}|
> |{{\|/}}|square root|{{\|/ 25.0}}|{{5}}|
> |{{\|\|/}}|cube root|{{\|\|/ 27.0}}|{{3}}|
> |{{\!}}|factorial|{{5 !}}|{{120}}|
> |{{\!\!}}|factorial (prefix operator)|{{!! 5}}|{{120}}|
> |{{@}}|absolute value|{{@ -5.0}}|{{5}}|
> |{{#}}|bitwise XOR|{{17 # 5}}|{{20}}|
> |{{<<}}|bitwise shift left|{{1 << 4}}|{{16}}|
> |{{>>}}|bitwise shift right|{{8 >> 2}}|{{2}}|
>  
>  Please note that we already have {{^}}, {{\!}} and {{\!\!}}, but they have 
> different meanings.
> [https://www.postgresql.org/docs/11/functions-math.html]
>  
> [https://www.vertica.com/docs/9.2.x/HTML/Content/Authoring/SQLReferenceManual/LanguageElements/Operators/BitwiseOperators.htm]
>  [https://docs.aws.amazon.com/redshift/latest/dg/r_OPERATOR_SYMBOLS.html]



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-29071) Upgrade Scala to 2.12.10

2019-09-12 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-29071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li resolved SPARK-29071.
-
Resolution: Duplicate

> Upgrade Scala to 2.12.10
> 
>
> Key: SPARK-29071
> URL: https://issues.apache.org/jira/browse/SPARK-29071
> Project: Spark
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: 3.0.0
>Reporter: Xiao Li
>Priority: Major
>
> Supposed to compile another 5-10% faster than the 2.12.8 we're on now:
>  * [https://github.com/scala/scala/releases/tag/v2.12.9]



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-29071) Upgrade Scala to 2.12.10

2019-09-12 Thread Xiao Li (Jira)
Xiao Li created SPARK-29071:
---

 Summary: Upgrade Scala to 2.12.10
 Key: SPARK-29071
 URL: https://issues.apache.org/jira/browse/SPARK-29071
 Project: Spark
  Issue Type: Improvement
  Components: Build
Affects Versions: 3.0.0
Reporter: Xiao Li


Supposed to compile another 5-10% faster than the 2.12.8 we're on now:
 * [https://github.com/scala/scala/releases/tag/v2.12.9]



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-29038) SPIP: Support Spark Materialized View

2019-09-11 Thread Xiao Li (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-29038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16927784#comment-16927784
 ] 

Xiao Li commented on SPARK-29038:
-

So far, the doc does not contain enough detail. It needs a comprehensive 
comparison with the corresponding features in other commercial databases. We 
also need to document how to implement them one by one.

Also, based on my understanding, a materialized view should not be 
memory-based; it has to be physically stored. Using the Spark cache could 
affect other memory-intensive queries, and any major feature built on the 
cache requires a memory manager.

I am not against this, but the effort required to support this feature is 
substantial.

> SPIP: Support Spark Materialized View
> -
>
> Key: SPARK-29038
> URL: https://issues.apache.org/jira/browse/SPARK-29038
> Project: Spark
>  Issue Type: New Feature
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Lantao Jin
>Priority: Major
>
> A materialized view is an important mechanism in DBMSs for caching data to 
> accelerate queries. By creating a materialized view through SQL, the data to 
> be cached can be chosen flexibly and configured for specific usage scenarios. 
> The Materialization Manager automatically updates the cached data as the 
> underlying source tables change, simplifying the user's work. When a user 
> submits a query, the Spark optimizer rewrites the execution plan against the 
> available materialized views to determine the optimal plan.
> Details in [design 
> doc|https://docs.google.com/document/d/1q5pjSWoTNVc9zsAfbNzJ-guHyVwPsEroIEP8Cca179A/edit?usp=sharing]
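Absent native support, the bookkeeping the SPIP proposes to automate looks 
roughly like the manual pattern below; a sketch only (table and column names 
hypothetical), not part of the proposal:
{code:java}
// Manual "materialization": persist an expensive aggregate as a table...
spark.sql("""
  CREATE TABLE sales_summary USING parquet AS
  SELECT region, sum(amount) AS total FROM sales GROUP BY region
""")
// ...point queries at the precomputed result instead of the base table...
spark.sql("SELECT total FROM sales_summary WHERE region = 'EMEA'").show()
// ...and rebuild it by hand whenever the sources change. A Materialization
// Manager plus optimizer rewrites would take over exactly these steps.
spark.sql("""
  INSERT OVERWRITE TABLE sales_summary
  SELECT region, sum(amount) FROM sales GROUP BY region
""")
{code}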



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-29038) SPIP: Support Spark Materialized View

2019-09-10 Thread Xiao Li (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-29038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16927274#comment-16927274
 ] 

Xiao Li commented on SPARK-29038:
-

https://www.bwdb2ug.org/Presentations/BWDUG_%20MQT.pps is a reference. It only 
shows the basic idea of how it works; the implementation details are complex.

> SPIP: Support Spark Materialized View
> -
>
> Key: SPARK-29038
> URL: https://issues.apache.org/jira/browse/SPARK-29038
> Project: Spark
>  Issue Type: New Feature
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Lantao Jin
>Priority: Major
>
> A materialized view is an important mechanism in DBMSs for caching data to 
> accelerate queries. By creating a materialized view through SQL, the data to 
> be cached can be chosen flexibly and configured for specific usage scenarios. 
> The Materialization Manager automatically updates the cached data as the 
> underlying source tables change, simplifying the user's work. When a user 
> submits a query, the Spark optimizer rewrites the execution plan against the 
> available materialized views to determine the optimal plan.
> Details in [design 
> doc|https://docs.google.com/document/d/1q5pjSWoTNVc9zsAfbNzJ-guHyVwPsEroIEP8Cca179A/edit?usp=sharing]



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-29038) SPIP: Support Spark Materialized View

2019-09-10 Thread Xiao Li (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-29038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16927268#comment-16927268
 ] 

Xiao Li commented on SPARK-29038:
-

We need to follow ANSI SQL if we plan to support materialized views. 
Materialized views are well-defined concepts in DBMSs.

> SPIP: Support Spark Materialized View
> -
>
> Key: SPARK-29038
> URL: https://issues.apache.org/jira/browse/SPARK-29038
> Project: Spark
>  Issue Type: New Feature
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Lantao Jin
>Priority: Major
>
> A materialized view is an important mechanism in DBMSs for caching data to 
> accelerate queries. By creating a materialized view through SQL, the data to 
> be cached can be chosen flexibly and configured for specific usage scenarios. 
> The Materialization Manager automatically updates the cached data as the 
> underlying source tables change, simplifying the user's work. When a user 
> submits a query, the Spark optimizer rewrites the execution plan against the 
> available materialized views to determine the optimal plan.
> Details in [design 
> doc|https://docs.google.com/document/d/1q5pjSWoTNVc9zsAfbNzJ-guHyVwPsEroIEP8Cca179A/edit?usp=sharing]



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-28831) Document CLEAR CACHE in SQL Reference

2019-09-09 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li resolved SPARK-28831.
-
Fix Version/s: 3.0.0
 Assignee: Huaxin Gao
   Resolution: Fixed

> Document CLEAR CACHE in SQL Reference
> -
>
> Key: SPARK-28831
> URL: https://issues.apache.org/jira/browse/SPARK-28831
> Project: Spark
>  Issue Type: Sub-task
>  Components: Documentation
>Affects Versions: 2.4.3
>Reporter: ABHISHEK KUMAR GUPTA
>Assignee: Huaxin Gao
>Priority: Major
> Fix For: 3.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-28773) NULL Handling

2019-09-09 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li resolved SPARK-28773.
-
Fix Version/s: 3.0.0
   Resolution: Fixed

> NULL Handling
> -
>
> Key: SPARK-28773
> URL: https://issues.apache.org/jira/browse/SPARK-28773
> Project: Spark
>  Issue Type: Sub-task
>  Components: Documentation
>Affects Versions: 3.0.0
>Reporter: Xiao Li
>Assignee: Dilip Biswal
>Priority: Major
> Fix For: 3.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-28773) NULL Handling

2019-09-09 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li reassigned SPARK-28773:
---

Assignee: Dilip Biswal  (was: Xiao Li)

> NULL Handling
> -
>
> Key: SPARK-28773
> URL: https://issues.apache.org/jira/browse/SPARK-28773
> Project: Spark
>  Issue Type: Sub-task
>  Components: Documentation
>Affects Versions: 3.0.0
>Reporter: Xiao Li
>Assignee: Dilip Biswal
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-28637) Thriftserver can not support interval type

2019-09-09 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li resolved SPARK-28637.
-
Fix Version/s: 3.0.0
 Assignee: Yuming Wang
   Resolution: Fixed

> Thriftserver can not support interval type
> --
>
> Key: SPARK-28637
> URL: https://issues.apache.org/jira/browse/SPARK-28637
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Yuming Wang
>Assignee: Yuming Wang
>Priority: Major
> Fix For: 3.0.0
>
>
> {code:sql}
> 0: jdbc:hive2://localhost:1> select interval '10-11' year to month;
> Error: java.lang.IllegalArgumentException: Unrecognized type name: interval 
> (state=,code=0)
> {code}
> {code:sql}
> spark-sql> select interval '10-11' year to month;
> interval 10 years 11 months
> {code}
> Thriftserver log:
> {noformat}
> java.lang.RuntimeException: java.lang.IllegalArgumentException: Unrecognized 
> type name: interval
>   at 
> org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:83)
>   at 
> org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36)
>   at 
> org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:63)
>   at 
> java.security.AccessController.doPrivileged(AccessController.java:770)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1746)
>   at 
> org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59)
>   at com.sun.proxy.$Proxy26.getResultSetMetadata(Unknown Source)
>   at 
> org.apache.hive.service.cli.CLIService.getResultSetMetadata(CLIService.java:436)
>   at 
> org.apache.hive.service.cli.thrift.ThriftCLIService.GetResultSetMetadata(ThriftCLIService.java:607)
>   at 
> org.apache.hive.service.cli.thrift.TCLIService$Processor$GetResultSetMetadata.getResult(TCLIService.java:1533)
>   at 
> org.apache.hive.service.cli.thrift.TCLIService$Processor$GetResultSetMetadata.getResult(TCLIService.java:1518)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:38)
>   at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
>   at 
> org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:53)
>   at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:310)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:819)
> Caused by: java.lang.IllegalArgumentException: Unrecognized type name: 
> interval
>   at org.apache.hive.service.cli.Type.getType(Type.java:169)
>   at 
> org.apache.hive.service.cli.TypeDescriptor.<init>(TypeDescriptor.java:53)
>   at 
> org.apache.hive.service.cli.ColumnDescriptor.<init>(ColumnDescriptor.java:53)
>   at org.apache.hive.service.cli.TableSchema.<init>(TableSchema.java:52)
>   at 
> org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$.getTableSchema(SparkExecuteStatementOperation.scala:314)
>   at 
> org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.resultSchema$lzycompute(SparkExecuteStatementOperation.scala:69)
>   at 
> org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.resultSchema(SparkExecuteStatementOperation.scala:64)
>   at 
> org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.getResultSetSchema(SparkExecuteStatementOperation.scala:158)
>   at 
> org.apache.hive.service.cli.operation.OperationManager.getOperationResultSetSchema(OperationManager.java:209)
>   at 
> org.apache.hive.service.cli.session.HiveSessionImpl.getResultSetMetadata(HiveSessionImpl.java:773)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78)
>   ... 18 more
> {noformat}
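Until this is fixed, a user-side workaround over the Thrift/JDBC path is to 
cast the interval to a string so Hive's type mapping never sees it; a sketch 
(shown via spark.sql for brevity, but the same statement works from beeline):
{code:java}
// CAST to string sidesteps "Unrecognized type name: interval" because the
// result schema the Thrift layer must describe is now a plain string column.
spark.sql("SELECT CAST(INTERVAL '10-11' YEAR TO MONTH AS STRING)").show(false)
{code}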



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-28542) Document Stages page

2019-09-08 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li reassigned SPARK-28542:
---

Assignee: Pablo Langa Blanco

> Document Stages page
> 
>
> Key: SPARK-28542
> URL: https://issues.apache.org/jira/browse/SPARK-28542
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL, Web UI
>Affects Versions: 3.0.0
>Reporter: Xiao Li
>Assignee: Pablo Langa Blanco
>Priority: Minor
> Fix For: 3.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-28912) MatchError exception in CheckpointWriteHandler

2019-09-07 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li updated SPARK-28912:

Fix Version/s: (was: 2.4.5)

> MatchError exception in CheckpointWriteHandler
> --
>
> Key: SPARK-28912
> URL: https://issues.apache.org/jira/browse/SPARK-28912
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.3.0, 2.3.2, 2.4.0, 2.4.1, 2.4.2, 2.4.3, 2.4.4
>Reporter: Aleksandr Kashkirov
>Assignee: Aleksandr Kashkirov
>Priority: Minor
> Fix For: 3.0.0
>
>
> Setting the checkpoint directory name to "checkpoint-" plus some digits 
> (e.g. "checkpoint-01") results in the following error:
> {code:java}
> Exception in thread "pool-32-thread-1" scala.MatchError: 
> 0523a434-0daa-4ea6-a050-c4eb3c557d8c (of class java.lang.String) 
>  at 
> org.apache.spark.streaming.Checkpoint$.org$apache$spark$streaming$Checkpoint$$sortFunc$1(Checkpoint.scala:121)
>  
>  at 
> org.apache.spark.streaming.Checkpoint$$anonfun$getCheckpointFiles$1.apply(Checkpoint.scala:132)
>  
>  at 
> org.apache.spark.streaming.Checkpoint$$anonfun$getCheckpointFiles$1.apply(Checkpoint.scala:132)
>  
>  at scala.math.Ordering$$anon$9.compare(Ordering.scala:200) 
>  at java.util.TimSort.countRunAndMakeAscending(TimSort.java:355) 
>  at java.util.TimSort.sort(TimSort.java:234) 
>  at java.util.Arrays.sort(Arrays.java:1438) 
>  at scala.collection.SeqLike$class.sorted(SeqLike.scala:648) 
>  at scala.collection.mutable.ArrayOps$ofRef.sorted(ArrayOps.scala:186) 
>  at scala.collection.SeqLike$class.sortWith(SeqLike.scala:601) 
>  at scala.collection.mutable.ArrayOps$ofRef.sortWith(ArrayOps.scala:186) 
>  at 
> org.apache.spark.streaming.Checkpoint$.getCheckpointFiles(Checkpoint.scala:132)
>  
>  at 
> org.apache.spark.streaming.CheckpointWriter$CheckpointWriteHandler.run(Checkpoint.scala:262)
>  
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  
>  at java.lang.Thread.run(Thread.java:748){code}
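The trace points at a pattern match over checkpoint file names with no 
fallback case; a simplified sketch of the failure mode (not the actual 
Checkpoint.scala code):
{code:java}
// The sort function extracts a batch time from names like
// "checkpoint-1567000000000". With a directory named "checkpoint-01",
// other names (such as UUID-named temp files) reach the match, and
// without a catch-all case the match throws scala.MatchError.
val CheckpointName = """checkpoint-(\d+)""".r

def sortKey(name: String): Long = name match {
  case CheckpointName(time) => time.toLong
  case _                    => Long.MinValue // defensive fallback avoids MatchError
}
{code}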



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-28935) Document SQL metrics for Details for Query Plan

2019-09-06 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li resolved SPARK-28935.
-
Fix Version/s: 3.0.0
 Assignee: Liang-Chi Hsieh
   Resolution: Fixed

> Document SQL metrics for Details for Query Plan
> ---
>
> Key: SPARK-28935
> URL: https://issues.apache.org/jira/browse/SPARK-28935
> Project: Spark
>  Issue Type: Sub-task
>  Components: Documentation
>Affects Versions: 3.0.0
>Reporter: Xiao Li
>Assignee: Liang-Chi Hsieh
>Priority: Major
> Fix For: 3.0.0
>
>
> [https://github.com/apache/spark/pull/25349] shows the query plans but it 
> does not describe the meaning of each metric in the plan. For end users, they 
> might not understand the meaning of the metrics we output. 
>  
> !https://user-images.githubusercontent.com/7322292/62421634-9d9c4980-b6d7-11e9-8e31-1e6ba9b402e8.png!



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-29002) Avoid changing SMJ to BHJ if the build side has a high ratio of empty partitions

2019-09-06 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-29002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li resolved SPARK-29002.
-
Fix Version/s: 3.0.0
 Assignee: Maryann Xue
   Resolution: Fixed

> Avoid changing SMJ to BHJ if the build side has a high ratio of empty 
> partitions
> 
>
> Key: SPARK-29002
> URL: https://issues.apache.org/jira/browse/SPARK-29002
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Maryann Xue
>Assignee: Maryann Xue
>Priority: Major
> Fix For: 3.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-28802) Document DESCRIBE DATABASE in SQL Reference.

2019-09-05 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li reassigned SPARK-28802:
---

Assignee: kevin yu

> Document DESCRIBE DATABASE in SQL Reference.
> 
>
> Key: SPARK-28802
> URL: https://issues.apache.org/jira/browse/SPARK-28802
> Project: Spark
>  Issue Type: Sub-task
>  Components: Documentation, SQL
>Affects Versions: 2.4.3
>Reporter: Dilip Biswal
>Assignee: kevin yu
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-28802) Document DESCRIBE DATABASE in SQL Reference.

2019-09-05 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li resolved SPARK-28802.
-
Fix Version/s: 3.0.0
   Resolution: Fixed

> Document DESCRIBE DATABASE in SQL Reference.
> 
>
> Key: SPARK-28802
> URL: https://issues.apache.org/jira/browse/SPARK-28802
> Project: Spark
>  Issue Type: Sub-task
>  Components: Documentation, SQL
>Affects Versions: 2.4.3
>Reporter: Dilip Biswal
>Assignee: kevin yu
>Priority: Major
> Fix For: 3.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-28830) Document UNCACHE TABLE in SQL Reference

2019-09-04 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li resolved SPARK-28830.
-
Fix Version/s: 3.0.0
 Assignee: Huaxin Gao
   Resolution: Fixed

> Document UNCACHE TABLE in SQL Reference
> ---
>
> Key: SPARK-28830
> URL: https://issues.apache.org/jira/browse/SPARK-28830
> Project: Spark
>  Issue Type: Sub-task
>  Components: Documentation
>Affects Versions: 2.4.3
>Reporter: ABHISHEK KUMAR GUPTA
>Assignee: Huaxin Gao
>Priority: Major
> Fix For: 3.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org


