[jira] [Commented] (SPARK-41270) Add Catalog tableExists and namespaceExists in Connect proto

2022-11-25 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-41270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17638805#comment-17638805
 ] 

Apache Spark commented on SPARK-41270:
--

User 'amaliujia' has created a pull request for this issue:
https://github.com/apache/spark/pull/38807

> Add Catalog tableExists and namespaceExists in Connect proto
> 
>
> Key: SPARK-41270
> URL: https://issues.apache.org/jira/browse/SPARK-41270
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect
>Affects Versions: 3.4.0
>Reporter: Rui Wang
>Priority: Major
>
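The requested messages boil down to boolean existence checks. As a hedged, plain-Python sketch of the intended semantics only (MockCatalog and its method names are invented for illustration; this is not the Connect proto nor the real Spark Catalog API):

```python
class MockCatalog:
    """Toy stand-in illustrating the semantics the proposed Connect
    messages would carry; not the actual Spark API."""

    def __init__(self, tables):
        # Fully qualified "namespace.table" names.
        self.tables = set(tables)

    def table_exists(self, name):
        # Mirrors a tableExists(name) -> bool check.
        return name in self.tables

    def namespace_exists(self, ns):
        # Mirrors a namespaceExists(ns) -> bool check: true if any
        # table lives under that namespace.
        return any(t.startswith(ns + ".") for t in self.tables)


cat = MockCatalog({"db1.users", "db1.orders"})
assert cat.table_exists("db1.users")
assert not cat.table_exists("db1.missing")
assert cat.namespace_exists("db1")
assert not cat.namespace_exists("db2")
```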




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-41270) Add Catalog tableExists and namespaceExists in Connect proto

2022-11-25 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-41270:


Assignee: Apache Spark

> Add Catalog tableExists and namespaceExists in Connect proto
> 
>
> Key: SPARK-41270
> URL: https://issues.apache.org/jira/browse/SPARK-41270
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect
>Affects Versions: 3.4.0
>Reporter: Rui Wang
>Assignee: Apache Spark
>Priority: Major
>







[jira] [Commented] (SPARK-41270) Add Catalog tableExists and namespaceExists in Connect proto

2022-11-25 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-41270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17638804#comment-17638804
 ] 

Apache Spark commented on SPARK-41270:
--

User 'amaliujia' has created a pull request for this issue:
https://github.com/apache/spark/pull/38807

> Add Catalog tableExists and namespaceExists in Connect proto
> 
>
> Key: SPARK-41270
> URL: https://issues.apache.org/jira/browse/SPARK-41270
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect
>Affects Versions: 3.4.0
>Reporter: Rui Wang
>Priority: Major
>







[jira] [Assigned] (SPARK-41270) Add Catalog tableExists and namespaceExists in Connect proto

2022-11-25 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-41270:


Assignee: (was: Apache Spark)

> Add Catalog tableExists and namespaceExists in Connect proto
> 
>
> Key: SPARK-41270
> URL: https://issues.apache.org/jira/browse/SPARK-41270
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect
>Affects Versions: 3.4.0
>Reporter: Rui Wang
>Priority: Major
>







[jira] [Created] (SPARK-41270) Add Catalog tableExists and namespaceExists in Connect proto

2022-11-25 Thread Rui Wang (Jira)
Rui Wang created SPARK-41270:


 Summary: Add Catalog tableExists and namespaceExists in Connect 
proto
 Key: SPARK-41270
 URL: https://issues.apache.org/jira/browse/SPARK-41270
 Project: Spark
  Issue Type: Sub-task
  Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang









[jira] [Created] (SPARK-41269) Move image matrix into version's workflow

2022-11-25 Thread Yikun Jiang (Jira)
Yikun Jiang created SPARK-41269:
---

 Summary: Move image matrix into version's workflow
 Key: SPARK-41269
 URL: https://issues.apache.org/jira/browse/SPARK-41269
 Project: Spark
  Issue Type: Bug
  Components: Project Infra
Affects Versions: 3.4.0
Reporter: Yikun Jiang









[jira] [Resolved] (SPARK-41255) RemoteSparkSession should be called SparkSession

2022-11-25 Thread Jira


 [ 
https://issues.apache.org/jira/browse/SPARK-41255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Herman van Hövell resolved SPARK-41255.
---
Fix Version/s: 3.4.0
 Assignee: Martin Grund
   Resolution: Fixed

> RemoteSparkSession should be called SparkSession
> 
>
> Key: SPARK-41255
> URL: https://issues.apache.org/jira/browse/SPARK-41255
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect
>Affects Versions: 3.4.0
>Reporter: Martin Grund
>Assignee: Martin Grund
>Priority: Major
> Fix For: 3.4.0
>
>







[jira] [Assigned] (SPARK-41268) Refactor "Column" for API Compatibility

2022-11-25 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-41268:


Assignee: Apache Spark

> Refactor "Column" for API Compatibility 
> 
>
> Key: SPARK-41268
> URL: https://issues.apache.org/jira/browse/SPARK-41268
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect
>Affects Versions: 3.4.0
>Reporter: Rui Wang
>Assignee: Apache Spark
>Priority: Major
>







[jira] [Assigned] (SPARK-41268) Refactor "Column" for API Compatibility

2022-11-25 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-41268:


Assignee: (was: Apache Spark)

> Refactor "Column" for API Compatibility 
> 
>
> Key: SPARK-41268
> URL: https://issues.apache.org/jira/browse/SPARK-41268
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect
>Affects Versions: 3.4.0
>Reporter: Rui Wang
>Priority: Major
>







[jira] [Commented] (SPARK-41268) Refactor "Column" for API Compatibility

2022-11-25 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-41268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17638776#comment-17638776
 ] 

Apache Spark commented on SPARK-41268:
--

User 'amaliujia' has created a pull request for this issue:
https://github.com/apache/spark/pull/38806

> Refactor "Column" for API Compatibility 
> 
>
> Key: SPARK-41268
> URL: https://issues.apache.org/jira/browse/SPARK-41268
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect
>Affects Versions: 3.4.0
>Reporter: Rui Wang
>Priority: Major
>







[jira] [Created] (SPARK-41268) Refactor "Column" for API Compatibility

2022-11-25 Thread Rui Wang (Jira)
Rui Wang created SPARK-41268:


 Summary: Refactor "Column" for API Compatibility 
 Key: SPARK-41268
 URL: https://issues.apache.org/jira/browse/SPARK-41268
 Project: Spark
  Issue Type: Sub-task
  Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang









[jira] [Commented] (SPARK-41267) Add unpivot / melt to SparkR

2022-11-25 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-41267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17638745#comment-17638745
 ] 

Apache Spark commented on SPARK-41267:
--

User 'zero323' has created a pull request for this issue:
https://github.com/apache/spark/pull/38804

> Add unpivot / melt to SparkR
> 
>
> Key: SPARK-41267
> URL: https://issues.apache.org/jira/browse/SPARK-41267
> Project: Spark
>  Issue Type: Improvement
>  Components: R, SQL
>Affects Versions: 3.4.0
>Reporter: Maciej Szymkiewicz
>Priority: Major
>
> Unpivot / melt operations have been implemented for Scala {{Dataset}} and 
> core Python {{{}DataFrame{}}}, but are missing from SparkR. We should add 
> these to achieve feature parity.






[jira] [Assigned] (SPARK-41267) Add unpivot / melt to SparkR

2022-11-25 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-41267:


Assignee: (was: Apache Spark)

> Add unpivot / melt to SparkR
> 
>
> Key: SPARK-41267
> URL: https://issues.apache.org/jira/browse/SPARK-41267
> Project: Spark
>  Issue Type: Improvement
>  Components: R, SQL
>Affects Versions: 3.4.0
>Reporter: Maciej Szymkiewicz
>Priority: Major
>
> Unpivot / melt operations have been implemented for Scala {{Dataset}} and 
> core Python {{{}DataFrame{}}}, but are missing from SparkR. We should add 
> these to achieve feature parity.






[jira] [Assigned] (SPARK-41267) Add unpivot / melt to SparkR

2022-11-25 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-41267:


Assignee: Apache Spark

> Add unpivot / melt to SparkR
> 
>
> Key: SPARK-41267
> URL: https://issues.apache.org/jira/browse/SPARK-41267
> Project: Spark
>  Issue Type: Improvement
>  Components: R, SQL
>Affects Versions: 3.4.0
>Reporter: Maciej Szymkiewicz
>Assignee: Apache Spark
>Priority: Major
>
> Unpivot / melt operations have been implemented for Scala {{Dataset}} and 
> core Python {{{}DataFrame{}}}, but are missing from SparkR. We should add 
> these to achieve feature parity.






[jira] [Commented] (SPARK-41267) Add unpivot / melt to SparkR

2022-11-25 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-41267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17638743#comment-17638743
 ] 

Apache Spark commented on SPARK-41267:
--

User 'zero323' has created a pull request for this issue:
https://github.com/apache/spark/pull/38804

> Add unpivot / melt to SparkR
> 
>
> Key: SPARK-41267
> URL: https://issues.apache.org/jira/browse/SPARK-41267
> Project: Spark
>  Issue Type: Improvement
>  Components: R, SQL
>Affects Versions: 3.4.0
>Reporter: Maciej Szymkiewicz
>Priority: Major
>
> Unpivot / melt operations have been implemented for Scala {{Dataset}} and 
> core Python {{{}DataFrame{}}}, but are missing from SparkR. We should add 
> these to achieve feature parity.






[jira] [Created] (SPARK-41267) Add unpivot / melt to SparkR

2022-11-25 Thread Maciej Szymkiewicz (Jira)
Maciej Szymkiewicz created SPARK-41267:
--

 Summary: Add unpivot / melt to SparkR
 Key: SPARK-41267
 URL: https://issues.apache.org/jira/browse/SPARK-41267
 Project: Spark
  Issue Type: Improvement
  Components: R, SQL
Affects Versions: 3.4.0
Reporter: Maciej Szymkiewicz


Unpivot / melt operations have been implemented for Scala {{Dataset}} and core 
Python {{{}DataFrame{}}}, but are missing from SparkR. We should add these to 
achieve feature parity.
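As a hedged illustration of the unpivot/melt semantics being requested (a toy plain-Python reshape, not the SparkR, Scala, or PySpark implementation; all names below are invented):

```python
def unpivot(rows, ids, value_cols, var_name="variable", value_name="value"):
    """Wide -> long reshape, mirroring the unpivot/melt semantics that
    already exist for Scala Dataset and PySpark DataFrame (toy version)."""
    out = []
    for row in rows:
        for col in value_cols:
            rec = {k: row[k] for k in ids}  # carry the id columns through
            rec[var_name] = col             # former column name
            rec[value_name] = row[col]      # former column value
            out.append(rec)
    return out


wide = [{"id": 1, "x": 10, "y": 20}]
long_rows = unpivot(wide, ids=["id"], value_cols=["x", "y"])
assert long_rows == [
    {"id": 1, "variable": "x", "value": 10},
    {"id": 1, "variable": "y", "value": 20},
]
```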






[jira] [Comment Edited] (SPARK-41236) The renamed field name cannot be recognized after group filtering

2022-11-25 Thread huldar chen (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-41236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17638626#comment-17638626
 ] 

huldar chen edited comment on SPARK-41236 at 11/25/22 5:53 PM:
---

If I fix it according to my idea, it will cause two closed Jira issues to 
reappear:

SPARK-31663 and SPARK-31519.

This may involve knowledge of the SQL standard, which I am not familiar with. 
I don't think I can fix this bug. :(
h4. [jingxiong 
zhong|https://issues.apache.org/jira/secure/ViewProfile.jspa?name=zhongjingxiong]


was (Author: huldar):
If I fix it according to my idea, it will cause 2 closed jira issues to 
reappear.

SPARK-31663 and SPARK-31663.

This may involve knowledge of the SQL standard, which I am not familiar with. 
I don't think I can fix this bug. :(
h4. [jingxiong 
zhong|https://issues.apache.org/jira/secure/ViewProfile.jspa?name=zhongjingxiong]

> The renamed field name cannot be recognized after group filtering
> -
>
> Key: SPARK-41236
> URL: https://issues.apache.org/jira/browse/SPARK-41236
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.2.0
>Reporter: jingxiong zhong
>Priority: Major
>
> {code:java}
> select collect_set(age) as age
> from db_table.table1
> group by name
> having size(age) > 1 
> {code}
> A simple SQL query: it works in Spark 2.4 but fails in Spark 3.2.0.
> Is this a bug or new standard behavior?
> h3. *like this:*
> {code:sql}
> create table db1.table1(age int, name string);
> insert into db1.table1 values(1, 'a');
> insert into db1.table1 values(2, 'b');
> insert into db1.table1 values(3, 'c');
> --then run sql like this 
> select collect_set(age) as age from db1.table1 group by name having size(age) 
> > 1 ;
> {code}
> h3. Stack Information
> org.apache.spark.sql.AnalysisException: cannot resolve 'age' given input 
> columns: [age]; line 4 pos 12;
> 'Filter (size('age, true) > 1)
> +- Aggregate [name#2], [collect_set(age#1, 0, 0) AS age#0]
>+- SubqueryAlias spark_catalog.db1.table1
>   +- HiveTableRelation [`db1`.`table1`, 
> org.apache.hadoop.hive.ql.io.orc.OrcSerde, Data Cols: [age#1, name#2], 
> Partition Cols: []]
>   at 
> org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:54)
>   at 
> org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$$nestedInanonfun$checkAnalysis$1$2.applyOrElse(CheckAnalysis.scala:179)
>   at 
> org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$$nestedInanonfun$checkAnalysis$1$2.applyOrElse(CheckAnalysis.scala:175)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformUpWithPruning$2(TreeNode.scala:535)
>   at 
> org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:82)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode.transformUpWithPruning(TreeNode.scala:535)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformUpWithPruning$1(TreeNode.scala:532)
>   at 
> org.apache.spark.sql.catalyst.trees.UnaryLike.mapChildren(TreeNode.scala:1128)
>   at 
> org.apache.spark.sql.catalyst.trees.UnaryLike.mapChildren$(TreeNode.scala:1127)
>   at 
> org.apache.spark.sql.catalyst.expressions.UnaryExpression.mapChildren(Expression.scala:467)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode.transformUpWithPruning(TreeNode.scala:532)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformUpWithPruning$1(TreeNode.scala:532)
>   at 
> org.apache.spark.sql.catalyst.trees.BinaryLike.mapChildren(TreeNode.scala:1154)
>   at 
> org.apache.spark.sql.catalyst.trees.BinaryLike.mapChildren$(TreeNode.scala:1153)
>   at 
> org.apache.spark.sql.catalyst.expressions.BinaryExpression.mapChildren(Expression.scala:555)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode.transformUpWithPruning(TreeNode.scala:532)
>   at 
> org.apache.spark.sql.catalyst.plans.QueryPlan.$anonfun$transformExpressionsUpWithPruning$1(QueryPlan.scala:181)
>   at 
> org.apache.spark.sql.catalyst.plans.QueryPlan.$anonfun$mapExpressions$1(QueryPlan.scala:193)
>   at 
> org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:82)
>   at 
> org.apache.spark.sql.catalyst.plans.QueryPlan.transformExpression$1(QueryPlan.scala:193)
>   at 
> org.apache.spark.sql.catalyst.plans.QueryPlan.recursiveTransform$1(QueryPlan.scala:204)
>   at 
> org.apache.spark.sql.catalyst.plans.QueryPlan.$anonfun$mapExpressions$4(QueryPlan.scala:214)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:323)
>   at 
> org.apache.spark.sql.catalyst.plans.QueryPlan.mapExpressions(QueryPlan.scala:214)
>   at 
> org.apache.spark.sql.catalyst.plans.QueryPlan.tr

[jira] [Commented] (SPARK-41114) Support local data for LocalRelation

2022-11-25 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-41114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17638734#comment-17638734
 ] 

Apache Spark commented on SPARK-41114:
--

User 'grundprinzip' has created a pull request for this issue:
https://github.com/apache/spark/pull/38803

> Support local data for LocalRelation
> 
>
> Key: SPARK-41114
> URL: https://issues.apache.org/jira/browse/SPARK-41114
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect
>Affects Versions: 3.4.0
>Reporter: Deng Ziming
>Assignee: Deng Ziming
>Priority: Minor
> Fix For: 3.4.0
>
>







[jira] [Commented] (SPARK-40539) PySpark readwriter API parity for Spark Connect

2022-11-25 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-40539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17638666#comment-17638666
 ] 

Apache Spark commented on SPARK-40539:
--

User 'grundprinzip' has created a pull request for this issue:
https://github.com/apache/spark/pull/38801

> PySpark readwriter API parity for Spark Connect
> ---
>
> Key: SPARK-40539
> URL: https://issues.apache.org/jira/browse/SPARK-40539
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect
>Affects Versions: 3.4.0
>Reporter: Martin Grund
>Assignee: Rui Wang
>Priority: Major
> Fix For: 3.4.0
>
>
> Spark Connect / PySpark ReadWriter parity.
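For context, this parity work mirrors PySpark's fluent reader/writer builders. A toy plain-Python sketch of that builder pattern (MockReader is invented for illustration and is not the Connect client API):

```python
class MockReader:
    """Toy fluent reader mirroring the DataFrameReader pattern that the
    parity work reproduces for Spark Connect (illustration only)."""

    def __init__(self):
        self.format_ = None
        self.options_ = {}

    def format(self, fmt):
        self.format_ = fmt
        return self  # return self to allow fluent chaining, as in PySpark

    def option(self, key, value):
        self.options_[key] = value
        return self

    def load(self, path):
        # A real client would build a Read relation in the query plan;
        # here we just return the collected parameters.
        return {"format": self.format_, "path": path, "options": self.options_}


plan = MockReader().format("csv").option("header", "true").load("/tmp/data")
assert plan == {"format": "csv", "path": "/tmp/data",
                "options": {"header": "true"}}
```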






[jira] [Created] (SPARK-41266) Spark does not parse timestamp strings when using the IN operator

2022-11-25 Thread Laurens Versluis (Jira)
Laurens Versluis created SPARK-41266:


 Summary: Spark does not parse timestamp strings when using the IN 
operator
 Key: SPARK-41266
 URL: https://issues.apache.org/jira/browse/SPARK-41266
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 3.2.1
 Environment: Windows 10, Spark 3.2.1 with Java 11
Reporter: Laurens Versluis


Likely affects more versions, tested only with 3.2.1.

 

Summary:

Spark will convert a timestamp string to a timestamp when using the equal 
operator (=), yet won't do this when using the IN operator.

 

Details:

While debugging why a query returned no results, we found that when the equals 
operator `=` is used in a WHERE clause against a TimestampType column, Spark 
converts the string literal to a timestamp before filtering.

However, with the IN operator (as in our query), it does not: it casts the 
column to a string instead. We expected consistent behavior, or at least that 
Spark would recognize that the IN clause operates on a TimestampType column 
and attempt to convert the literals to timestamps before falling back to 
string comparison.

 

*Minimal reproducible example:*

Suppose we have a one-line dataset with the following contents and schema:

 
{noformat}
+-------------------+
|starttime          |
+-------------------+
|2019-08-11 19:33:05|
+-------------------+
root
 |-- starttime: timestamp (nullable = true){noformat}
Then, firing the following queries, the IN-clause query that uses a timestamp 
string with timezone information returns no results:

 

 
{code:java}
// Works - Spark casts the argument to a string and the internal representation 
of the time seems to match it...
singleCol.filter("starttime IN ('2019-08-11 19:33:05')").show();
// Works
singleCol.filter("starttime = '2019-08-11 19:33:05'").show();
// Works
singleCol.filter("starttime = '2019-08-11T19:33:05Z'").show();
// Doesn't work
singleCol.filter("starttime IN ('2019-08-11T19:33:05Z')").show();
//Works
singleCol.filter("starttime IN (to_timestamp('2019-08-11T19:33:05Z'))").show(); 
{code}
 

We can see from the output that a cast to string is taking place:
{noformat}
[...] isnotnull(starttime#59),(cast(starttime#59 as string) = 2019-08-11 
19:33:05){noformat}
Since the = operator works, it would be consistent for operators such as IN 
to behave the same way.
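The mismatch can be illustrated in plain Python (a toy model of the observed behavior, not Spark's actual casting logic): rendering the column as a string means the ISO-8601 literal with a `Z` suffix never matches textually, while parsing the literal to a timestamp matches.

```python
from datetime import datetime

stored = datetime(2019, 8, 11, 19, 33, 5)   # the timestamp column value
literal = "2019-08-11T19:33:05Z"            # ISO-8601 literal in the query

# If the column is cast to string (as the observed plan does for IN),
# the comparison is textual and the formats differ:
assert str(stored) == "2019-08-11 19:33:05"
assert str(stored) != literal               # so the IN filter drops the row

# If instead the literal is parsed to a timestamp (as `=` does),
# the comparison succeeds:
parsed = datetime.fromisoformat(literal.replace("Z", "+00:00"))
assert parsed.replace(tzinfo=None) == stored
```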






[jira] [Commented] (SPARK-33001) Why am I receiving this warning?

2022-11-25 Thread geekyouth (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-33001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17638647#comment-17638647
 ] 

geekyouth commented on SPARK-33001:
---

I have set the following in "spark-3.0.0-bin-hadoop3.2\conf\spark-defaults.conf":
{code:java}
spark.executor.processTreeMetrics.enabled false {code}
then restarted the terminal and ran spark-shell, but the warning appeared again.

 

> Why am I receiving this warning?
> 
>
> Key: SPARK-33001
> URL: https://issues.apache.org/jira/browse/SPARK-33001
> Project: Spark
>  Issue Type: Question
>  Components: Spark Core
>Affects Versions: 3.0.1
>Reporter: George Fotopoulos
>Priority: Major
>
> I am running Apache Spark Core using Scala 2.12.12 on IntelliJ IDEA 2020.2 
> with Docker 2.3.0.5
> I am running Windows 10 build 2004
> Can somebody explain why I am receiving this warning and what I can do
> about it?
> I tried googling this warning, but all I found was people asking about it
> with no answers.
> [screenshot|https://user-images.githubusercontent.com/1548352/94319642-c8102c80-ff93-11ea-9fea-f58de8da2268.png]
> {code:scala}
> WARN ProcfsMetricsGetter: Exception when trying to compute pagesize, as a 
> result reporting of ProcessTree metrics is stopped
> {code}
> Thanks in advance!






[jira] [Comment Edited] (SPARK-41236) The renamed field name cannot be recognized after group filtering

2022-11-25 Thread huldar chen (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-41236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17638626#comment-17638626
 ] 

huldar chen edited comment on SPARK-41236 at 11/25/22 10:49 AM:


If I fix it according to my idea, it will cause 2 closed jira issues to 
reappear.

SPARK-31663 and SPARK-31663.

This may involve knowledge of the SQL standard, which I am not familiar with. 
I don't think I can fix this bug. :(
h4. [jingxiong 
zhong|https://issues.apache.org/jira/secure/ViewProfile.jspa?name=zhongjingxiong]


was (Author: huldar):
If I fix it according to my idea, it will cause 2 closed jira issues to 
reappear.

SPARK-31663 and SPARK-31663.

This may involve knowledge of SQL standards, and I am not good at it here.I 
don't think I can fix this bug.:(

> The renamed field name cannot be recognized after group filtering
> -
>
> Key: SPARK-41236
> URL: https://issues.apache.org/jira/browse/SPARK-41236
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.2.0
>Reporter: jingxiong zhong
>Priority: Major
>
> {code:java}
> select collect_set(age) as age
> from db_table.table1
> group by name
> having size(age) > 1 
> {code}
> A simple SQL query: it works in Spark 2.4 but fails in Spark 3.2.0.
> Is this a bug or new standard behavior?
> h3. *like this:*
> {code:sql}
> create table db1.table1(age int, name string);
> insert into db1.table1 values(1, 'a');
> insert into db1.table1 values(2, 'b');
> insert into db1.table1 values(3, 'c');
> --then run sql like this 
> select collect_set(age) as age from db1.table1 group by name having size(age) 
> > 1 ;
> {code}
> h3. Stack Information
> org.apache.spark.sql.AnalysisException: cannot resolve 'age' given input 
> columns: [age]; line 4 pos 12;
> 'Filter (size('age, true) > 1)
> +- Aggregate [name#2], [collect_set(age#1, 0, 0) AS age#0]
>+- SubqueryAlias spark_catalog.db1.table1
>   +- HiveTableRelation [`db1`.`table1`, 
> org.apache.hadoop.hive.ql.io.orc.OrcSerde, Data Cols: [age#1, name#2], 
> Partition Cols: []]
>   at 
> org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:54)
>   at 
> org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$$nestedInanonfun$checkAnalysis$1$2.applyOrElse(CheckAnalysis.scala:179)
>   at 
> org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$$nestedInanonfun$checkAnalysis$1$2.applyOrElse(CheckAnalysis.scala:175)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformUpWithPruning$2(TreeNode.scala:535)
>   at 
> org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:82)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode.transformUpWithPruning(TreeNode.scala:535)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformUpWithPruning$1(TreeNode.scala:532)
>   at 
> org.apache.spark.sql.catalyst.trees.UnaryLike.mapChildren(TreeNode.scala:1128)
>   at 
> org.apache.spark.sql.catalyst.trees.UnaryLike.mapChildren$(TreeNode.scala:1127)
>   at 
> org.apache.spark.sql.catalyst.expressions.UnaryExpression.mapChildren(Expression.scala:467)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode.transformUpWithPruning(TreeNode.scala:532)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformUpWithPruning$1(TreeNode.scala:532)
>   at 
> org.apache.spark.sql.catalyst.trees.BinaryLike.mapChildren(TreeNode.scala:1154)
>   at 
> org.apache.spark.sql.catalyst.trees.BinaryLike.mapChildren$(TreeNode.scala:1153)
>   at 
> org.apache.spark.sql.catalyst.expressions.BinaryExpression.mapChildren(Expression.scala:555)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode.transformUpWithPruning(TreeNode.scala:532)
>   at 
> org.apache.spark.sql.catalyst.plans.QueryPlan.$anonfun$transformExpressionsUpWithPruning$1(QueryPlan.scala:181)
>   at 
> org.apache.spark.sql.catalyst.plans.QueryPlan.$anonfun$mapExpressions$1(QueryPlan.scala:193)
>   at 
> org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:82)
>   at 
> org.apache.spark.sql.catalyst.plans.QueryPlan.transformExpression$1(QueryPlan.scala:193)
>   at 
> org.apache.spark.sql.catalyst.plans.QueryPlan.recursiveTransform$1(QueryPlan.scala:204)
>   at 
> org.apache.spark.sql.catalyst.plans.QueryPlan.$anonfun$mapExpressions$4(QueryPlan.scala:214)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:323)
>   at 
> org.apache.spark.sql.catalyst.plans.QueryPlan.mapExpressions(QueryPlan.scala:214)
>   at 
> org.apache.spark.sql.catalyst.plans.QueryPlan.transformExpressionsUpWithPruning(QueryPlan.scala:181)
>   at 
> org.apache.spark.sql.catalyst

[jira] [Commented] (SPARK-41236) The renamed field name cannot be recognized after group filtering

2022-11-25 Thread huldar chen (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-41236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17638626#comment-17638626
 ] 

huldar chen commented on SPARK-41236:
-

If I fix it according to my idea, it will cause 2 closed jira issues to 
reappear.

SPARK-31663 and SPARK-31663.

This may involve knowledge of the SQL standard, which I am not familiar with. 
I don't think I can fix this bug. :(

> The renamed field name cannot be recognized after group filtering
> -
>
> Key: SPARK-41236
> URL: https://issues.apache.org/jira/browse/SPARK-41236
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.2.0
>Reporter: jingxiong zhong
>Priority: Major
>
> {code:java}
> select collect_set(age) as age
> from db_table.table1
> group by name
> having size(age) > 1 
> {code}
> A simple SQL query: it works in Spark 2.4 but fails in Spark 3.2.0.
> Is this a bug or new standard behavior?
> h3. *like this:*
> {code:sql}
> create table db1.table1(age int, name string);
> insert into db1.table1 values(1, 'a');
> insert into db1.table1 values(2, 'b');
> insert into db1.table1 values(3, 'c');
> --then run sql like this 
> select collect_set(age) as age from db1.table1 group by name having size(age) 
> > 1 ;
> {code}
> h3. Stack Information
> org.apache.spark.sql.AnalysisException: cannot resolve 'age' given input 
> columns: [age]; line 4 pos 12;
> 'Filter (size('age, true) > 1)
> +- Aggregate [name#2], [collect_set(age#1, 0, 0) AS age#0]
>+- SubqueryAlias spark_catalog.db1.table1
>   +- HiveTableRelation [`db1`.`table1`, 
> org.apache.hadoop.hive.ql.io.orc.OrcSerde, Data Cols: [age#1, name#2], 
> Partition Cols: []]
>   at 
> org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:54)
>   at 
> org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$$nestedInanonfun$checkAnalysis$1$2.applyOrElse(CheckAnalysis.scala:179)
>   at 
> org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$$nestedInanonfun$checkAnalysis$1$2.applyOrElse(CheckAnalysis.scala:175)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformUpWithPruning$2(TreeNode.scala:535)
>   at 
> org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:82)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode.transformUpWithPruning(TreeNode.scala:535)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformUpWithPruning$1(TreeNode.scala:532)
>   at 
> org.apache.spark.sql.catalyst.trees.UnaryLike.mapChildren(TreeNode.scala:1128)
>   at 
> org.apache.spark.sql.catalyst.trees.UnaryLike.mapChildren$(TreeNode.scala:1127)
>   at 
> org.apache.spark.sql.catalyst.expressions.UnaryExpression.mapChildren(Expression.scala:467)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode.transformUpWithPruning(TreeNode.scala:532)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformUpWithPruning$1(TreeNode.scala:532)
>   at 
> org.apache.spark.sql.catalyst.trees.BinaryLike.mapChildren(TreeNode.scala:1154)
>   at 
> org.apache.spark.sql.catalyst.trees.BinaryLike.mapChildren$(TreeNode.scala:1153)
>   at 
> org.apache.spark.sql.catalyst.expressions.BinaryExpression.mapChildren(Expression.scala:555)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode.transformUpWithPruning(TreeNode.scala:532)
>   at 
> org.apache.spark.sql.catalyst.plans.QueryPlan.$anonfun$transformExpressionsUpWithPruning$1(QueryPlan.scala:181)
>   at 
> org.apache.spark.sql.catalyst.plans.QueryPlan.$anonfun$mapExpressions$1(QueryPlan.scala:193)
>   at 
> org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:82)
>   at 
> org.apache.spark.sql.catalyst.plans.QueryPlan.transformExpression$1(QueryPlan.scala:193)
>   at 
> org.apache.spark.sql.catalyst.plans.QueryPlan.recursiveTransform$1(QueryPlan.scala:204)
>   at 
> org.apache.spark.sql.catalyst.plans.QueryPlan.$anonfun$mapExpressions$4(QueryPlan.scala:214)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:323)
>   at 
> org.apache.spark.sql.catalyst.plans.QueryPlan.mapExpressions(QueryPlan.scala:214)
>   at 
> org.apache.spark.sql.catalyst.plans.QueryPlan.transformExpressionsUpWithPruning(QueryPlan.scala:181)
>   at 
> org.apache.spark.sql.catalyst.plans.QueryPlan.transformExpressionsUp(QueryPlan.scala:161)
>   at 
> org.apache.spark.sql.catalyst.analysis.CheckAnalysis.$anonfun$checkAnalysis$1(CheckAnalysis.scala:175)
>   at 
> org.apache.spark.sql.catalyst.analysis.CheckAnalysis.$anonfun$checkAnalysis$1$adapted(CheckAnalysis.scala:94)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:263)
>   a

[jira] [Updated] (SPARK-41262) Enable canChangeCachedPlanOutputPartitioning by default

2022-11-25 Thread XiDuo You (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

XiDuo You updated SPARK-41262:
--
Description: Remove the `internal` tag of 
`spark.sql.optimizer.canChangeCachedPlanOutputPartitioning`, and change its 
default from false to true so that AQE works with cached plans.  (was: Tune 
spark.sql.optimizer.canChangeCachedPlanOutputPartitioning from false to true by 
default to make AQE work with cached plan.)

> Enable canChangeCachedPlanOutputPartitioning by default
> ---
>
> Key: SPARK-41262
> URL: https://issues.apache.org/jira/browse/SPARK-41262
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.4.0
>Reporter: XiDuo You
>Priority: Major
>
> Remove the `internal` tag of 
> `spark.sql.optimizer.canChangeCachedPlanOutputPartitioning`, and change its 
> default from false to true so that AQE works with cached plans.
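Until that default changes, the behavior can be opted into per-session. A minimal sketch (the config name is taken from this issue; the tables `t1`/`t2` are hypothetical):

{code:sql}
-- Opt in: let AQE change the output partitioning of plans that read cached data.
SET spark.sql.optimizer.canChangeCachedPlanOutputPartitioning=true;

-- Cached plans can then participate in AQE partitioning decisions downstream.
CACHE TABLE t1;
SELECT * FROM t1 JOIN t2 ON t1.id = t2.id;
{code}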



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-41240) Upgrade Protobuf from 3.19.4 to 3.19.5

2022-11-25 Thread Ruifeng Zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruifeng Zheng resolved SPARK-41240.
---
Fix Version/s: 3.4.0
   Resolution: Fixed

Issue resolved by pull request 38774
[https://github.com/apache/spark/pull/38774]

> Upgrade Protobuf from 3.19.4 to 3.19.5
> --
>
> Key: SPARK-41240
> URL: https://issues.apache.org/jira/browse/SPARK-41240
> Project: Spark
>  Issue Type: Dependency upgrade
>  Components: Build, Connect
>Affects Versions: 3.4.0
>Reporter: Bjørn Jørgensen
>Assignee: Bjørn Jørgensen
>Priority: Minor
> Fix For: 3.4.0
>
>
> [CVE-2022-1941|https://nvd.nist.gov/vuln/detail/CVE-2022-1941]






[jira] [Updated] (SPARK-41265) Check and upgrade buf.build/protocolbuffers/plugins/python to 3.19.5

2022-11-25 Thread Ruifeng Zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruifeng Zheng updated SPARK-41265:
--
Description: 
https://github.com/apache/spark/pull/38774

https://buf.build/protocolbuffers/plugins/python

when a new version >= 3.19.5 is available, we should then upgrade

  was:
https://github.com/apache/spark/pull/38774

when there is new version >= 3.19.5, should upgrade then


> Check and upgrade buf.build/protocolbuffers/plugins/python to 3.19.5
> 
>
> Key: SPARK-41265
> URL: https://issues.apache.org/jira/browse/SPARK-41265
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect, Project Infra
>Affects Versions: 3.4.0
>Reporter: Ruifeng Zheng
>Priority: Major
>
> https://github.com/apache/spark/pull/38774
> https://buf.build/protocolbuffers/plugins/python
> when a new version >= 3.19.5 is available, we should then upgrade






[jira] [Assigned] (SPARK-41240) Upgrade Protobuf from 3.19.4 to 3.19.5

2022-11-25 Thread Ruifeng Zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruifeng Zheng reassigned SPARK-41240:
-

Assignee: Bjørn Jørgensen

> Upgrade Protobuf from 3.19.4 to 3.19.5
> --
>
> Key: SPARK-41240
> URL: https://issues.apache.org/jira/browse/SPARK-41240
> Project: Spark
>  Issue Type: Dependency upgrade
>  Components: Build, Connect
>Affects Versions: 3.4.0
>Reporter: Bjørn Jørgensen
>Assignee: Bjørn Jørgensen
>Priority: Minor
>
> [CVE-2022-1941|https://nvd.nist.gov/vuln/detail/CVE-2022-1941]






[jira] [Created] (SPARK-41265) Check and upgrade buf.build/protocolbuffers/plugins/python to 3.19.5

2022-11-25 Thread Ruifeng Zheng (Jira)
Ruifeng Zheng created SPARK-41265:
-

 Summary: Check and upgrade 
buf.build/protocolbuffers/plugins/python to 3.19.5
 Key: SPARK-41265
 URL: https://issues.apache.org/jira/browse/SPARK-41265
 Project: Spark
  Issue Type: Sub-task
  Components: Connect, Project Infra
Affects Versions: 3.4.0
Reporter: Ruifeng Zheng


https://github.com/apache/spark/pull/38774

when a new version >= 3.19.5 is available, we should then upgrade






[jira] [Resolved] (SPARK-41245) Upgrade postgresql from 42.5.0 to 42.5.1

2022-11-25 Thread Ruifeng Zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruifeng Zheng resolved SPARK-41245.
---
Fix Version/s: 3.4.0
   Resolution: Fixed

Issue resolved by pull request 38791
[https://github.com/apache/spark/pull/38791]

> Upgrade postgresql from 42.5.0 to 42.5.1
> 
>
> Key: SPARK-41245
> URL: https://issues.apache.org/jira/browse/SPARK-41245
> Project: Spark
>  Issue Type: Dependency upgrade
>  Components: Build
>Affects Versions: 3.4.0
>Reporter: Bjørn Jørgensen
>Assignee: Bjørn Jørgensen
>Priority: Minor
> Fix For: 3.4.0
>
>
> [CVE-2022-41946|https://nvd.nist.gov/vuln/detail/CVE-2022-41946]






[jira] [Assigned] (SPARK-41245) Upgrade postgresql from 42.5.0 to 42.5.1

2022-11-25 Thread Ruifeng Zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruifeng Zheng reassigned SPARK-41245:
-

Assignee: Bjørn Jørgensen

> Upgrade postgresql from 42.5.0 to 42.5.1
> 
>
> Key: SPARK-41245
> URL: https://issues.apache.org/jira/browse/SPARK-41245
> Project: Spark
>  Issue Type: Dependency upgrade
>  Components: Build
>Affects Versions: 3.4.0
>Reporter: Bjørn Jørgensen
>Assignee: Bjørn Jørgensen
>Priority: Minor
>
> [CVE-2022-41946|https://nvd.nist.gov/vuln/detail/CVE-2022-41946]






[jira] [Assigned] (SPARK-41252) Upgrade arrow from 10.0.0 to 10.0.1

2022-11-25 Thread Ruifeng Zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruifeng Zheng reassigned SPARK-41252:
-

Assignee: BingKun Pan

> Upgrade arrow from 10.0.0 to 10.0.1
> ---
>
> Key: SPARK-41252
> URL: https://issues.apache.org/jira/browse/SPARK-41252
> Project: Spark
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: 3.4.0
>Reporter: BingKun Pan
>Assignee: BingKun Pan
>Priority: Minor
>







[jira] [Resolved] (SPARK-41252) Upgrade arrow from 10.0.0 to 10.0.1

2022-11-25 Thread Ruifeng Zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruifeng Zheng resolved SPARK-41252.
---
Fix Version/s: 3.4.0
   Resolution: Fixed

Issue resolved by pull request 38788
[https://github.com/apache/spark/pull/38788]

> Upgrade arrow from 10.0.0 to 10.0.1
> ---
>
> Key: SPARK-41252
> URL: https://issues.apache.org/jira/browse/SPARK-41252
> Project: Spark
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: 3.4.0
>Reporter: BingKun Pan
>Assignee: BingKun Pan
>Priority: Minor
> Fix For: 3.4.0
>
>







[jira] [Commented] (SPARK-41264) Make Literal support more datatypes

2022-11-25 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-41264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17638567#comment-17638567
 ] 

Apache Spark commented on SPARK-41264:
--

User 'zhengruifeng' has created a pull request for this issue:
https://github.com/apache/spark/pull/38800

> Make Literal support more datatypes
> ---
>
> Key: SPARK-41264
> URL: https://issues.apache.org/jira/browse/SPARK-41264
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect, PySpark
>Affects Versions: 3.4.0
>Reporter: Ruifeng Zheng
>Priority: Major
>










[jira] [Assigned] (SPARK-41264) Make Literal support more datatypes

2022-11-25 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-41264:


Assignee: (was: Apache Spark)

> Make Literal support more datatypes
> ---
>
> Key: SPARK-41264
> URL: https://issues.apache.org/jira/browse/SPARK-41264
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect, PySpark
>Affects Versions: 3.4.0
>Reporter: Ruifeng Zheng
>Priority: Major
>







[jira] [Assigned] (SPARK-41264) Make Literal support more datatypes

2022-11-25 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-41264:


Assignee: Apache Spark

> Make Literal support more datatypes
> ---
>
> Key: SPARK-41264
> URL: https://issues.apache.org/jira/browse/SPARK-41264
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect, PySpark
>Affects Versions: 3.4.0
>Reporter: Ruifeng Zheng
>Assignee: Apache Spark
>Priority: Major
>







[jira] (SPARK-41236) The renamed field name cannot be recognized after group filtering

2022-11-25 Thread huldar chen (Jira)


[ https://issues.apache.org/jira/browse/SPARK-41236 ]


huldar chen deleted comment on SPARK-41236:
-

was (Author: huldar):
ok, let's make it my first pr:)

> The renamed field name cannot be recognized after group filtering
> -
>
> Key: SPARK-41236
> URL: https://issues.apache.org/jira/browse/SPARK-41236
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.2.0
>Reporter: jingxiong zhong
>Priority: Major
>
> {code:java}
> select collect_set(age) as age
> from db_table.table1
> group by name
> having size(age) > 1 
> {code}
> A simple SQL query: it works in Spark 2.4 but no longer works in Spark 3.2.0.
> Is it a bug or a new standard?
> h3. *like this:*
> {code:sql}
> create table db1.table1(age int, name string);
> insert into db1.table1 values(1, 'a');
> insert into db1.table1 values(2, 'b');
> insert into db1.table1 values(3, 'c');
> --then run sql like this 
> select collect_set(age) as age from db1.table1 group by name having size(age) 
> > 1 ;
> {code}
> h3. Stack Information
> org.apache.spark.sql.AnalysisException: cannot resolve 'age' given input 
> columns: [age]; line 4 pos 12;
> 'Filter (size('age, true) > 1)
> +- Aggregate [name#2], [collect_set(age#1, 0, 0) AS age#0]
>+- SubqueryAlias spark_catalog.db1.table1
>   +- HiveTableRelation [`db1`.`table1`, 
> org.apache.hadoop.hive.ql.io.orc.OrcSerde, Data Cols: [age#1, name#2], 
> Partition Cols: []]
>   at 
> org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:54)
>   at 
> org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$$nestedInanonfun$checkAnalysis$1$2.applyOrElse(CheckAnalysis.scala:179)
>   at 
> org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$$nestedInanonfun$checkAnalysis$1$2.applyOrElse(CheckAnalysis.scala:175)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformUpWithPruning$2(TreeNode.scala:535)
>   at 
> org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:82)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode.transformUpWithPruning(TreeNode.scala:535)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformUpWithPruning$1(TreeNode.scala:532)
>   at 
> org.apache.spark.sql.catalyst.trees.UnaryLike.mapChildren(TreeNode.scala:1128)
>   at 
> org.apache.spark.sql.catalyst.trees.UnaryLike.mapChildren$(TreeNode.scala:1127)
>   at 
> org.apache.spark.sql.catalyst.expressions.UnaryExpression.mapChildren(Expression.scala:467)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode.transformUpWithPruning(TreeNode.scala:532)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformUpWithPruning$1(TreeNode.scala:532)
>   at 
> org.apache.spark.sql.catalyst.trees.BinaryLike.mapChildren(TreeNode.scala:1154)
>   at 
> org.apache.spark.sql.catalyst.trees.BinaryLike.mapChildren$(TreeNode.scala:1153)
>   at 
> org.apache.spark.sql.catalyst.expressions.BinaryExpression.mapChildren(Expression.scala:555)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode.transformUpWithPruning(TreeNode.scala:532)
>   at 
> org.apache.spark.sql.catalyst.plans.QueryPlan.$anonfun$transformExpressionsUpWithPruning$1(QueryPlan.scala:181)
>   at 
> org.apache.spark.sql.catalyst.plans.QueryPlan.$anonfun$mapExpressions$1(QueryPlan.scala:193)
>   at 
> org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:82)
>   at 
> org.apache.spark.sql.catalyst.plans.QueryPlan.transformExpression$1(QueryPlan.scala:193)
>   at 
> org.apache.spark.sql.catalyst.plans.QueryPlan.recursiveTransform$1(QueryPlan.scala:204)
>   at 
> org.apache.spark.sql.catalyst.plans.QueryPlan.$anonfun$mapExpressions$4(QueryPlan.scala:214)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:323)
>   at 
> org.apache.spark.sql.catalyst.plans.QueryPlan.mapExpressions(QueryPlan.scala:214)
>   at 
> org.apache.spark.sql.catalyst.plans.QueryPlan.transformExpressionsUpWithPruning(QueryPlan.scala:181)
>   at 
> org.apache.spark.sql.catalyst.plans.QueryPlan.transformExpressionsUp(QueryPlan.scala:161)
>   at 
> org.apache.spark.sql.catalyst.analysis.CheckAnalysis.$anonfun$checkAnalysis$1(CheckAnalysis.scala:175)
>   at 
> org.apache.spark.sql.catalyst.analysis.CheckAnalysis.$anonfun$checkAnalysis$1$adapted(CheckAnalysis.scala:94)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:263)
>   at 
> org.apache.spark.sql.catalyst.analysis.CheckAnalysis.checkAnalysis(CheckAnalysis.scala:94)
>   at 
> org.apache.spark.sql.catalyst.analysis.CheckAnalysis.checkAnalysis$(CheckAnalysis.scala:91)
>   at 
> org.apache.spark.sql.catalyst.analysis.Analyzer.checkAna
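A possible workaround for the report above (a sketch only, not confirmed in this thread; it assumes the `db1.table1` schema from the reproduction and avoids referencing the aggregate alias inside HAVING, which is what fails to resolve):

{code:sql}
-- Filter on the aggregate expression itself instead of its output alias.
select collect_set(age) as age
from db1.table1
group by name
having size(collect_set(age)) > 1;

-- Or wrap the aggregation in a subquery so the alias is in scope for the filter.
select age
from (
  select collect_set(age) as age
  from db1.table1
  group by name
) t
where size(age) > 1;
{code}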

[jira] [Created] (SPARK-41264) Make Literal support more datatypes

2022-11-25 Thread Ruifeng Zheng (Jira)
Ruifeng Zheng created SPARK-41264:
-

 Summary: Make Literal support more datatypes
 Key: SPARK-41264
 URL: https://issues.apache.org/jira/browse/SPARK-41264
 Project: Spark
  Issue Type: Sub-task
  Components: Connect, PySpark
Affects Versions: 3.4.0
Reporter: Ruifeng Zheng









[jira] [Commented] (SPARK-41263) Upgrade buf to v1.9.0

2022-11-25 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-41263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17638556#comment-17638556
 ] 

Apache Spark commented on SPARK-41263:
--

User 'zhengruifeng' has created a pull request for this issue:
https://github.com/apache/spark/pull/38797

> Upgrade buf to v1.9.0
> -
>
> Key: SPARK-41263
> URL: https://issues.apache.org/jira/browse/SPARK-41263
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect, Project Infra
>Affects Versions: 3.4.0
>Reporter: Ruifeng Zheng
>Priority: Major
>







[jira] [Assigned] (SPARK-41263) Upgrade buf to v1.9.0

2022-11-25 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-41263:


Assignee: Apache Spark

> Upgrade buf to v1.9.0
> -
>
> Key: SPARK-41263
> URL: https://issues.apache.org/jira/browse/SPARK-41263
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect, Project Infra
>Affects Versions: 3.4.0
>Reporter: Ruifeng Zheng
>Assignee: Apache Spark
>Priority: Major
>







[jira] [Assigned] (SPARK-41263) Upgrade buf to v1.9.0

2022-11-25 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-41263:


Assignee: (was: Apache Spark)

> Upgrade buf to v1.9.0
> -
>
> Key: SPARK-41263
> URL: https://issues.apache.org/jira/browse/SPARK-41263
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect, Project Infra
>Affects Versions: 3.4.0
>Reporter: Ruifeng Zheng
>Priority: Major
>







[jira] [Created] (SPARK-41263) Upgrade buf to v1.9.0

2022-11-25 Thread Ruifeng Zheng (Jira)
Ruifeng Zheng created SPARK-41263:
-

 Summary: Upgrade buf to v1.9.0
 Key: SPARK-41263
 URL: https://issues.apache.org/jira/browse/SPARK-41263
 Project: Spark
  Issue Type: Sub-task
  Components: Connect, Project Infra
Affects Versions: 3.4.0
Reporter: Ruifeng Zheng









[jira] [Commented] (SPARK-41236) The renamed field name cannot be recognized after group filtering

2022-11-25 Thread huldar chen (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-41236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17638551#comment-17638551
 ] 

huldar chen commented on SPARK-41236:
-

OK, let's make it my first PR :)

> The renamed field name cannot be recognized after group filtering
> -
>
> Key: SPARK-41236
> URL: https://issues.apache.org/jira/browse/SPARK-41236
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.2.0
>Reporter: jingxiong zhong
>Priority: Major
>
> {code:java}
> select collect_set(age) as age
> from db_table.table1
> group by name
> having size(age) > 1 
> {code}
> A simple SQL query: it works in Spark 2.4 but no longer works in Spark 3.2.0.
> Is it a bug or a new standard?
> h3. *like this:*
> {code:sql}
> create table db1.table1(age int, name string);
> insert into db1.table1 values(1, 'a');
> insert into db1.table1 values(2, 'b');
> insert into db1.table1 values(3, 'c');
> --then run sql like this 
> select collect_set(age) as age from db1.table1 group by name having size(age) 
> > 1 ;
> {code}
> h3. Stack Information
> org.apache.spark.sql.AnalysisException: cannot resolve 'age' given input 
> columns: [age]; line 4 pos 12;
> 'Filter (size('age, true) > 1)
> +- Aggregate [name#2], [collect_set(age#1, 0, 0) AS age#0]
>+- SubqueryAlias spark_catalog.db1.table1
>   +- HiveTableRelation [`db1`.`table1`, 
> org.apache.hadoop.hive.ql.io.orc.OrcSerde, Data Cols: [age#1, name#2], 
> Partition Cols: []]
>   at 
> org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:54)
>   at 
> org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$$nestedInanonfun$checkAnalysis$1$2.applyOrElse(CheckAnalysis.scala:179)
>   at 
> org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$$nestedInanonfun$checkAnalysis$1$2.applyOrElse(CheckAnalysis.scala:175)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformUpWithPruning$2(TreeNode.scala:535)
>   at 
> org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:82)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode.transformUpWithPruning(TreeNode.scala:535)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformUpWithPruning$1(TreeNode.scala:532)
>   at 
> org.apache.spark.sql.catalyst.trees.UnaryLike.mapChildren(TreeNode.scala:1128)
>   at 
> org.apache.spark.sql.catalyst.trees.UnaryLike.mapChildren$(TreeNode.scala:1127)
>   at 
> org.apache.spark.sql.catalyst.expressions.UnaryExpression.mapChildren(Expression.scala:467)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode.transformUpWithPruning(TreeNode.scala:532)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformUpWithPruning$1(TreeNode.scala:532)
>   at 
> org.apache.spark.sql.catalyst.trees.BinaryLike.mapChildren(TreeNode.scala:1154)
>   at 
> org.apache.spark.sql.catalyst.trees.BinaryLike.mapChildren$(TreeNode.scala:1153)
>   at 
> org.apache.spark.sql.catalyst.expressions.BinaryExpression.mapChildren(Expression.scala:555)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode.transformUpWithPruning(TreeNode.scala:532)
>   at 
> org.apache.spark.sql.catalyst.plans.QueryPlan.$anonfun$transformExpressionsUpWithPruning$1(QueryPlan.scala:181)
>   at 
> org.apache.spark.sql.catalyst.plans.QueryPlan.$anonfun$mapExpressions$1(QueryPlan.scala:193)
>   at 
> org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:82)
>   at 
> org.apache.spark.sql.catalyst.plans.QueryPlan.transformExpression$1(QueryPlan.scala:193)
>   at 
> org.apache.spark.sql.catalyst.plans.QueryPlan.recursiveTransform$1(QueryPlan.scala:204)
>   at 
> org.apache.spark.sql.catalyst.plans.QueryPlan.$anonfun$mapExpressions$4(QueryPlan.scala:214)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:323)
>   at 
> org.apache.spark.sql.catalyst.plans.QueryPlan.mapExpressions(QueryPlan.scala:214)
>   at 
> org.apache.spark.sql.catalyst.plans.QueryPlan.transformExpressionsUpWithPruning(QueryPlan.scala:181)
>   at 
> org.apache.spark.sql.catalyst.plans.QueryPlan.transformExpressionsUp(QueryPlan.scala:161)
>   at 
> org.apache.spark.sql.catalyst.analysis.CheckAnalysis.$anonfun$checkAnalysis$1(CheckAnalysis.scala:175)
>   at 
> org.apache.spark.sql.catalyst.analysis.CheckAnalysis.$anonfun$checkAnalysis$1$adapted(CheckAnalysis.scala:94)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:263)
>   at 
> org.apache.spark.sql.catalyst.analysis.CheckAnalysis.checkAnalysis(CheckAnalysis.scala:94)
>   at 
> org.apache.spark.sql.catalyst.analysis.CheckAnalysis.checkAnalysis$(CheckAnalysis.s