[GitHub] [spark] Peng-Lei commented on a change in pull request #35204: [SPARK-37878][SQL] Migrate SHOW CREATE TABLE to use v2 command by default

2022-01-13 Thread GitBox
Peng-Lei commented on a change in pull request #35204: URL: https://github.com/apache/spark/pull/35204#discussion_r784626408 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/command/tables.scala ## @@ -1054,6 +1054,15 @@ trait ShowCreateTableCommandBase {

[GitHub] [spark] ulysses-you opened a new pull request #35208: [SPARK-37904][SQL] Improve RebalancePartitions in rules of Optimizer

2022-01-13 Thread GitBox
ulysses-you opened a new pull request #35208: URL: https://github.com/apache/spark/pull/35208 ### What changes were proposed in this pull request? Improve `RebalancePartitions` in the following rules: - `NestedColumnAliasing` - `CollapseRepartition` - `EliminateSorts`

[GitHub] [spark] AngersZhuuuu commented on pull request #35207: [SPARK-37907][SQL] StaticInvoke support ConstantFolding

2022-01-13 Thread GitBox
AngersZhuuuu commented on pull request #35207: URL: https://github.com/apache/spark/pull/35207#issuecomment-1012846128 > shall we fix `Invoke` as well? Done. Do we need to add UTs for the other two cases? -- This is an automated message from the Apache Git Service. To respond to the

[GitHub] [spark] viirya commented on a change in pull request #35068: [SPARK-37896][SQL] Implement a ConstantColumnVector and improve performance of the hidden file metadata

2022-01-13 Thread GitBox
viirya commented on a change in pull request #35068: URL: https://github.com/apache/spark/pull/35068#discussion_r784614079 ## File path: sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/ConstantColumnVector.java ## @@ -0,0 +1,264 @@ +/* + * Licensed to the

[GitHub] [spark] attilapiros closed pull request #34234: [SPARK-36967][CORE] Report accurate shuffle block size if its skewed

2022-01-13 Thread GitBox
attilapiros closed pull request #34234: URL: https://github.com/apache/spark/pull/34234 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

[GitHub] [spark] viirya commented on a change in pull request #35068: [SPARK-37896][SQL] Implement a ConstantColumnVector and improve performance of the hidden file metadata

2022-01-13 Thread GitBox
viirya commented on a change in pull request #35068: URL: https://github.com/apache/spark/pull/35068#discussion_r784606503 ## File path: sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/ConstantColumnVector.java ## @@ -0,0 +1,264 @@ +/* + * Licensed to the

[GitHub] [spark] viirya commented on a change in pull request #35068: [SPARK-37896][SQL] Implement a ConstantColumnVector and improve performance of the hidden file metadata

2022-01-13 Thread GitBox
viirya commented on a change in pull request #35068: URL: https://github.com/apache/spark/pull/35068#discussion_r784606028 ## File path: sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/ConstantColumnVector.java ## @@ -0,0 +1,264 @@ +/* + * Licensed to the

[GitHub] [spark] viirya commented on a change in pull request #35068: [SPARK-37896][SQL] Implement a ConstantColumnVector and improve performance of the hidden file metadata

2022-01-13 Thread GitBox
viirya commented on a change in pull request #35068: URL: https://github.com/apache/spark/pull/35068#discussion_r784603695 ## File path: sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/ConstantColumnVector.java ## @@ -0,0 +1,264 @@ +/* + * Licensed to the

[GitHub] [spark] viirya commented on a change in pull request #35068: [SPARK-37896][SQL] Implement a ConstantColumnVector and improve performance of the hidden file metadata

2022-01-13 Thread GitBox
viirya commented on a change in pull request #35068: URL: https://github.com/apache/spark/pull/35068#discussion_r784601008 ## File path: sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/ConstantColumnVector.java ## @@ -0,0 +1,264 @@ +/* + * Licensed to the

[GitHub] [spark] AngersZhuuuu commented on a change in pull request #34848: [SPARK-37582][SPARK-37583][SQL] CONTAINS, STARTSWITH, ENDSWITH should support all data type

2022-01-13 Thread GitBox
AngersZhuuuu commented on a change in pull request #34848: URL: https://github.com/apache/spark/pull/34848#discussion_r784600216 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala ## @@ -450,22 +451,49 @@ case class

[GitHub] [spark] cloud-fan commented on pull request #35207: [SPARK-37907][SQL] StaticInvoke support ConstantFolding

2022-01-13 Thread GitBox
cloud-fan commented on pull request #35207: URL: https://github.com/apache/spark/pull/35207#issuecomment-1012837092 shall we fix `Invoke` as well? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go

[GitHub] [spark] AngersZhuuuu commented on pull request #35207: [SPARK-37907][SQL] StaticInvoke support ConstantFolding

2022-01-13 Thread GitBox
AngersZhuuuu commented on pull request #35207: URL: https://github.com/apache/spark/pull/35207#issuecomment-1012836061 ping @cloud-fan -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [spark] AngersZhuuuu opened a new pull request #35207: [SPARK-37907][SQL] StaticInvoke support ConstantFolding

2022-01-13 Thread GitBox
AngersZhuuuu opened a new pull request #35207: URL: https://github.com/apache/spark/pull/35207 ### What changes were proposed in this pull request? Currently, `StaticInvoke` does not implement `foldable`, so it can't be optimized by ConstantFolding. This PR adds that support. ### Why are the
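To make the idea in the snippet above concrete, here is a toy Python sketch of constant folding driven by a `foldable` flag. It is not Spark's Catalyst API: the `Literal`, `Column`, `Add`, and `constant_fold` names are hypothetical and exist only for illustration. The point it shows is that a node which never reports itself as foldable can never be replaced by a literal, which is the gap the PR description mentions for `StaticInvoke`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Literal:
    value: int

    @property
    def foldable(self):
        # A literal is always a compile-time constant.
        return True

    def eval(self):
        return self.value


@dataclass(frozen=True)
class Column:
    name: str

    @property
    def foldable(self):
        # A column reference depends on input rows, so it cannot be folded.
        return False


@dataclass(frozen=True)
class Add:
    left: object
    right: object

    @property
    def foldable(self):
        # An expression is foldable only if all of its children are.
        return self.left.foldable and self.right.foldable

    def eval(self):
        return self.left.eval() + self.right.eval()


def constant_fold(expr):
    """Replace every foldable subtree with the literal it evaluates to."""
    if isinstance(expr, Add):
        folded = Add(constant_fold(expr.left), constant_fold(expr.right))
        return Literal(folded.eval()) if folded.foldable else folded
    return expr
```

In this sketch, `constant_fold(Add(Column("x"), Add(Literal(1), Literal(2))))` keeps the column reference but collapses the inner addition into `Literal(3)`.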

[GitHub] [spark] beliefer commented on a change in pull request #35060: [SPARK-28137][SQL] Data Type Formatting Functions: `to_number`

2022-01-13 Thread GitBox
beliefer commented on a change in pull request #35060: URL: https://github.com/apache/spark/pull/35060#discussion_r784565837 ## File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/StringExpressionsSuite.scala ## @@ -888,6 +889,182 @@ class

[GitHub] [spark] yaooqinn commented on pull request #35178: [WIP][SPARK-37877][SQL] Support clear shuffle dependencies eagerly for thrift server

2022-01-13 Thread GitBox
yaooqinn commented on pull request #35178: URL: https://github.com/apache/spark/pull/35178#issuecomment-1012808007 Still in POC, any input is welcome :) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above

[GitHub] [spark] yaooqinn commented on a change in pull request #35178: [WIP][SPARK-37877][SQL] Support clear shuffle dependencies eagerly for thrift server

2022-01-13 Thread GitBox
yaooqinn commented on a change in pull request #35178: URL: https://github.com/apache/spark/pull/35178#discussion_r784533731 ## File path: sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkExecuteStatementOperation.scala ## @@ -288,10 +293,12 @@

[GitHub] [spark] itholic commented on pull request #34940: [PYTHON] Use raise ... from instead of simply raise where applicable

2022-01-13 Thread GitBox
itholic commented on pull request #34940: URL: https://github.com/apache/spark/pull/34940#issuecomment-1012807312 Just out of curiosity, could you explain why this "provides better tracebacks in tools such as Sentry"? -- This is an automated message from the Apache Git Service. To respond
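As background for the question above, a minimal Python sketch of what `raise ... from` does at the language level. It does not show how Sentry or any other tool renders tracebacks; the `parse_port` function is a hypothetical example. Chaining explicitly records the original exception on `__cause__`, and the standard traceback then reports it as the direct cause of the new exception.

```python
def parse_port(text):
    """Convert text to an integer port, wrapping low-level parse errors."""
    try:
        return int(text)
    except ValueError as err:
        # "from err" stores the original exception on __cause__ and marks
        # the chain as intentional (__suppress_context__ becomes True).
        raise RuntimeError(f"invalid port: {text!r}") from err


try:
    parse_port("http")
except RuntimeError as exc:
    caught = exc  # keep a reference so the chain can be inspected
```

Here `caught.__cause__` is the original `ValueError`, so code that inspects the exception can tell the wrapper apart from an unrelated error raised while handling another one.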

[GitHub] [spark] yaooqinn commented on a change in pull request #35178: [WIP][SPARK-37877][SQL] Support clear shuffle dependencies eagerly for thrift server

2022-01-13 Thread GitBox
yaooqinn commented on a change in pull request #35178: URL: https://github.com/apache/spark/pull/35178#discussion_r784531321 ## File path: sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkExecuteStatementOperation.scala ## @@ -332,6 +339,10 @@

[GitHub] [spark] AngersZhuuuu commented on a change in pull request #34848: [SPARK-37582][SPARK-37583][SQL] CONTAINS, STARTSWITH, ENDSWITH should support all data type

2022-01-13 Thread GitBox
AngersZhuuuu commented on a change in pull request #34848: URL: https://github.com/apache/spark/pull/34848#discussion_r784531099 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala ## @@ -450,22 +451,49 @@ case class

[GitHub] [spark] AngersZhuuuu opened a new pull request #35206: [SPARK-37906][SQL] spark-sql should not pass last comment to backend

2022-01-13 Thread GitBox
AngersZhuuuu opened a new pull request #35206: URL: https://github.com/apache/spark/pull/35206 ### What changes were proposed in this pull request? In https://github.com/apache/spark/pull/34815 we restored support for passing unclosed bracketed comments to the backend, but missed the case

[GitHub] [spark] itholic commented on pull request #35199: [SPARK-37902][PYTHON] Resolve typing issues detected by mypy==0.931

2022-01-13 Thread GitBox
itholic commented on pull request #35199: URL: https://github.com/apache/spark/pull/35199#issuecomment-1012800627 LGTM except https://github.com/apache/spark/pull/35199#discussion_r784439721 -- This is an automated message from the Apache Git Service. To respond to the message, please

[GitHub] [spark] itholic commented on a change in pull request #35199: [SPARK-37902][PYTHON] Resolve typing issues detected by mypy==0.931

2022-01-13 Thread GitBox
itholic commented on a change in pull request #35199: URL: https://github.com/apache/spark/pull/35199#discussion_r784526473 ## File path: python/pyspark/sql/tests/test_pandas_udf_typehints_with_future_annotations.py ## @@ -19,7 +19,7 @@ import sys import unittest from

[GitHub] [spark] huaxingao commented on pull request #34914: [SPARK-37627][SQL][FOLLOWUP] Separate SortedBucketTransform from BucketTransform

2022-01-13 Thread GitBox
huaxingao commented on pull request #34914: URL: https://github.com/apache/spark/pull/34914#issuecomment-1012799610 Thanks! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

[GitHub] [spark] huaxingao commented on a change in pull request #35204: [SPARK-37878][SQL] Migrate SHOW CREATE TABLE to use v2 command by default

2022-01-13 Thread GitBox
huaxingao commented on a change in pull request #35204: URL: https://github.com/apache/spark/pull/35204#discussion_r784522687 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/ShowCreateTableExec.scala ## @@ -71,19 +75,39 @@ case class

[GitHub] [spark] Peng-Lei commented on a change in pull request #35204: [SPARK-37878][SQL] Migrate SHOW CREATE TABLE to use v2 command by default

2022-01-13 Thread GitBox
Peng-Lei commented on a change in pull request #35204: URL: https://github.com/apache/spark/pull/35204#discussion_r784521194 ## File path: sql/core/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveSessionCatalog.scala ## @@ -266,12 +266,15 @@ class

[GitHub] [spark] Peng-Lei commented on a change in pull request #35204: [SPARK-37878][SQL] Migrate SHOW CREATE TABLE to use v2 command by default

2022-01-13 Thread GitBox
Peng-Lei commented on a change in pull request #35204: URL: https://github.com/apache/spark/pull/35204#discussion_r784520723 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/connector/catalog/V1Table.scala ## @@ -58,7 +59,7 @@ private[sql] case class

[GitHub] [spark] huaxingao commented on a change in pull request #35202: [SPARK-37479][SQL] Migrate DROP NAMESPACE to use V2 command by default

2022-01-13 Thread GitBox
huaxingao commented on a change in pull request #35202: URL: https://github.com/apache/spark/pull/35202#discussion_r784519897 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DropNamespaceExec.scala ## @@ -46,9 +46,17 @@ case class

[GitHub] [spark] beliefer commented on a change in pull request #35060: [SPARK-28137][SQL] Data Type Formatting Functions: `to_number`

2022-01-13 Thread GitBox
beliefer commented on a change in pull request #35060: URL: https://github.com/apache/spark/pull/35060#discussion_r784511626 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/NumberConstants.scala ## @@ -0,0 +1,250 @@ +/* + * Licensed to the Apache

[GitHub] [spark] beliefer commented on a change in pull request #35060: [SPARK-28137][SQL] Data Type Formatting Functions: `to_number`

2022-01-13 Thread GitBox
beliefer commented on a change in pull request #35060: URL: https://github.com/apache/spark/pull/35060#discussion_r784510828 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/NumberConstants.scala ## @@ -0,0 +1,250 @@ +/* + * Licensed to the Apache

[GitHub] [spark] Peng-Lei commented on a change in pull request #35204: [SPARK-37878][SQL] Migrate SHOW CREATE TABLE to use v2 command by default

2022-01-13 Thread GitBox
Peng-Lei commented on a change in pull request #35204: URL: https://github.com/apache/spark/pull/35204#discussion_r784509866 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/connector/catalog/V1Table.scala ## @@ -58,7 +59,7 @@ private[sql] case class

[GitHub] [spark] beliefer commented on a change in pull request #35060: [SPARK-28137][SQL] Data Type Formatting Functions: `to_number`

2022-01-13 Thread GitBox
beliefer commented on a change in pull request #35060: URL: https://github.com/apache/spark/pull/35060#discussion_r784507815 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/NumberConstants.scala ## @@ -0,0 +1,250 @@ +/* + * Licensed to the Apache

[GitHub] [spark] cloud-fan commented on a change in pull request #35204: [SPARK-37878][SQL] Migrate SHOW CREATE TABLE to use v2 command by default

2022-01-13 Thread GitBox
cloud-fan commented on a change in pull request #35204: URL: https://github.com/apache/spark/pull/35204#discussion_r784507786 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/ShowCreateTableExec.scala ## @@ -71,19 +75,39 @@ case class

[GitHub] [spark] cloud-fan commented on a change in pull request #35204: [SPARK-37878][SQL] Migrate SHOW CREATE TABLE to use v2 command by default

2022-01-13 Thread GitBox
cloud-fan commented on a change in pull request #35204: URL: https://github.com/apache/spark/pull/35204#discussion_r784506250 ## File path: sql/core/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveSessionCatalog.scala ## @@ -266,12 +266,15 @@ class

[GitHub] [spark] cloud-fan commented on a change in pull request #35204: [SPARK-37878][SQL] Migrate SHOW CREATE TABLE to use v2 command by default

2022-01-13 Thread GitBox
cloud-fan commented on a change in pull request #35204: URL: https://github.com/apache/spark/pull/35204#discussion_r784505264 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/command/tables.scala ## @@ -1054,6 +1054,15 @@ trait ShowCreateTableCommandBase {

[GitHub] [spark] cloud-fan commented on a change in pull request #35204: [SPARK-37878][SQL] Migrate SHOW CREATE TABLE to use v2 command by default

2022-01-13 Thread GitBox
cloud-fan commented on a change in pull request #35204: URL: https://github.com/apache/spark/pull/35204#discussion_r784504955 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/connector/catalog/V1Table.scala ## @@ -58,7 +59,7 @@ private[sql] case class

[GitHub] [spark] pralabhkumar commented on pull request #35191: [SPARK-37491][PYTHON]Fix Series.asof for unsorted values

2022-01-13 Thread GitBox
pralabhkumar commented on pull request #35191: URL: https://github.com/apache/spark/pull/35191#issuecomment-1012759452 @HyukjinKwon Will rebase and sync to latest master. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and

[GitHub] [spark] pralabhkumar commented on a change in pull request #35191: [SPARK-37491][PYTHON]Fix Series.asof for unsorted values

2022-01-13 Thread GitBox
pralabhkumar commented on a change in pull request #35191: URL: https://github.com/apache/spark/pull/35191#discussion_r784503236 ## File path: python/pyspark/pandas/series.py ## @@ -5228,22 +5228,62 @@ def asof(self, where: Union[Any, List]) -> Union[Scalar, "Series"]:

[GitHub] [spark] dongjoon-hyun closed pull request #35205: [SPARK-37905][INFRA] Make `merge_spark_pr.py` set primary author from the first commit in case of ties

2022-01-13 Thread GitBox
dongjoon-hyun closed pull request #35205: URL: https://github.com/apache/spark/pull/35205 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

[GitHub] [spark] dongjoon-hyun commented on pull request #35205: [SPARK-37905][INFRA] Make `merge_spark_pr.py` set primary author from the first commit in case of ties

2022-01-13 Thread GitBox
dongjoon-hyun commented on pull request #35205: URL: https://github.com/apache/spark/pull/35205#issuecomment-1012745234 Thank you all! I'll merge this because this is irrelevant to the UTs. -- This is an automated message from the Apache Git Service. To respond to the message, please log

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #35205: [SPARK-37905][INFRA] Make `merge_spark_pr.py` set primary author from the first commit in case of ties

2022-01-13 Thread GitBox
dongjoon-hyun commented on a change in pull request #35205: URL: https://github.com/apache/spark/pull/35205#discussion_r784500632 ## File path: dev/merge_spark_pr.py ## @@ -135,11 +135,12 @@ def merge_pr(pr_num, target_ref, title, body, pr_repo_desc):

[GitHub] [spark] gengliangwang commented on a change in pull request #35205: [SPARK-37905][INFRA] Make `merge_spark_pr.py` set primary author from the first commit in case of ties

2022-01-13 Thread GitBox
gengliangwang commented on a change in pull request #35205: URL: https://github.com/apache/spark/pull/35205#discussion_r784498340 ## File path: dev/merge_spark_pr.py ## @@ -135,11 +135,12 @@ def merge_pr(pr_num, target_ref, title, body, pr_repo_desc):

[GitHub] [spark] HyukjinKwon commented on a change in pull request #35205: [SPARK-37905][INFRA] Make `merge_spark_pr.py` set primary author from the first commit in case of ties

2022-01-13 Thread GitBox
HyukjinKwon commented on a change in pull request #35205: URL: https://github.com/apache/spark/pull/35205#discussion_r784496247 ## File path: dev/merge_spark_pr.py ## @@ -135,11 +135,12 @@ def merge_pr(pr_num, target_ref, title, body, pr_repo_desc):

[GitHub] [spark] gengliangwang commented on a change in pull request #35205: [SPARK-37905][INFRA] Make `merge_spark_pr.py` set primary author from the first commit in case of ties

2022-01-13 Thread GitBox
gengliangwang commented on a change in pull request #35205: URL: https://github.com/apache/spark/pull/35205#discussion_r784495201 ## File path: dev/merge_spark_pr.py ## @@ -135,11 +135,12 @@ def merge_pr(pr_num, target_ref, title, body, pr_repo_desc):

[GitHub] [spark] Peng-Lei commented on pull request #35204: [SPARK-37878][SQL] Migrate SHOW CREATE TABLE to use v2 command by default

2022-01-13 Thread GitBox
Peng-Lei commented on pull request #35204: URL: https://github.com/apache/spark/pull/35204#issuecomment-1012732429 @cloud-fan @imback82 Could you take a look? Thank you very much. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #35205: [SPARK-37905][INFRA] Make `merge_spark_pr.py` set primary author from the first commit in case of ties

2022-01-13 Thread GitBox
dongjoon-hyun commented on a change in pull request #35205: URL: https://github.com/apache/spark/pull/35205#discussion_r784491596 ## File path: dev/merge_spark_pr.py ## @@ -135,11 +135,12 @@ def merge_pr(pr_num, target_ref, title, body, pr_repo_desc):

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #35205: [SPARK-37905][INFRA] Make `merge_spark_pr.py` set primary author from the first commit in case of ties

2022-01-13 Thread GitBox
dongjoon-hyun commented on a change in pull request #35205: URL: https://github.com/apache/spark/pull/35205#discussion_r784491470 ## File path: dev/merge_spark_pr.py ## @@ -135,11 +135,12 @@ def merge_pr(pr_num, target_ref, title, body, pr_repo_desc):

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #35205: [SPARK-37905][INFRA] Make `merge_spark_pr.py` set primary author from the first commit in case of ties

2022-01-13 Thread GitBox
dongjoon-hyun commented on a change in pull request #35205: URL: https://github.com/apache/spark/pull/35205#discussion_r784491281 ## File path: dev/merge_spark_pr.py ## @@ -135,11 +135,12 @@ def merge_pr(pr_num, target_ref, title, body, pr_repo_desc):

[GitHub] [spark] HyukjinKwon commented on a change in pull request #35205: [SPARK-37905][INFRA] Make `merge_spark_pr.py` set primary author from the first commit in case of ties

2022-01-13 Thread GitBox
HyukjinKwon commented on a change in pull request #35205: URL: https://github.com/apache/spark/pull/35205#discussion_r784490412 ## File path: dev/merge_spark_pr.py ## @@ -135,11 +135,12 @@ def merge_pr(pr_num, target_ref, title, body, pr_repo_desc):

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #35205: [SPARK-37905][INFRA] Make `merge_spark_pr.py` set primary author from the first commit in case of ties

2022-01-13 Thread GitBox
dongjoon-hyun commented on a change in pull request #35205: URL: https://github.com/apache/spark/pull/35205#discussion_r784490728 ## File path: dev/merge_spark_pr.py ## @@ -135,11 +135,12 @@ def merge_pr(pr_num, target_ref, title, body, pr_repo_desc):

[GitHub] [spark] gengliangwang commented on a change in pull request #35205: [SPARK-37905][INFRA] Make `merge_spark_pr.py` set primary author from the first commit in case of ties

2022-01-13 Thread GitBox
gengliangwang commented on a change in pull request #35205: URL: https://github.com/apache/spark/pull/35205#discussion_r784489492 ## File path: dev/merge_spark_pr.py ## @@ -135,11 +135,12 @@ def merge_pr(pr_num, target_ref, title, body, pr_repo_desc):

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #35205: [SPARK-37905][INFRA] Make `merge_spark_pr.py` set primary author from the first commit in case of ties

2022-01-13 Thread GitBox
dongjoon-hyun commented on a change in pull request #35205: URL: https://github.com/apache/spark/pull/35205#discussion_r784487190 ## File path: dev/merge_spark_pr.py ## @@ -135,11 +135,12 @@ def merge_pr(pr_num, target_ref, title, body, pr_repo_desc):

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #35205: [SPARK-37905][INFRA] Make `merge_spark_pr.py` set primary author from the first commit in case of ties

2022-01-13 Thread GitBox
dongjoon-hyun commented on a change in pull request #35205: URL: https://github.com/apache/spark/pull/35205#discussion_r784487084 ## File path: dev/merge_spark_pr.py ## @@ -135,11 +135,12 @@ def merge_pr(pr_num, target_ref, title, body, pr_repo_desc):

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #35205: [SPARK-37905][INFRA] Make `merge_spark_pr.py` set primary author from the first commit in case of ties

2022-01-13 Thread GitBox
dongjoon-hyun commented on a change in pull request #35205: URL: https://github.com/apache/spark/pull/35205#discussion_r784487035 ## File path: dev/merge_spark_pr.py ## @@ -135,11 +135,12 @@ def merge_pr(pr_num, target_ref, title, body, pr_repo_desc):

[GitHub] [spark] dongjoon-hyun commented on pull request #35205: [SPARK-37905][INFRA] Make `merge_spark_pr.py` set primary author from the first commit in case of ties

2022-01-13 Thread GitBox
dongjoon-hyun commented on pull request #35205: URL: https://github.com/apache/spark/pull/35205#issuecomment-1012719263 cc @LuciferYang , @viirya , @gengliangwang , @HyukjinKwon -- This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] LuciferYang commented on pull request #35190: [SPARK-37893][CORE][TESTS] Avoid ConcurrentModificationException related to SparkFunSuite.LogAppender#_threshold"

2022-01-13 Thread GitBox
LuciferYang commented on pull request #35190: URL: https://github.com/apache/spark/pull/35190#issuecomment-1012718577 It doesn't matter ~ :) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [spark] dongjoon-hyun opened a new pull request #35205: [SPARK-37905][INFRA] Make `merge_spark_pr.py` set primary author from the first commit in case of ties

2022-01-13 Thread GitBox
dongjoon-hyun opened a new pull request #35205: URL: https://github.com/apache/spark/pull/35205 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How was

[GitHub] [spark] beliefer commented on a change in pull request #35130: [SPARK-37839][SQL] DS V2 supports partial aggregate push-down `AVG`

2022-01-13 Thread GitBox
beliefer commented on a change in pull request #35130: URL: https://github.com/apache/spark/pull/35130#discussion_r784483389 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/V2ScanRelationPushDown.scala ## @@ -88,25 +88,49 @@ object

[GitHub] [spark] dongjoon-hyun commented on pull request #35190: [SPARK-37893][CORE][TESTS] Avoid ConcurrentModificationException related to SparkFunSuite.LogAppender#_threshold"

2022-01-13 Thread GitBox
dongjoon-hyun commented on pull request #35190: URL: https://github.com/apache/spark/pull/35190#issuecomment-1012715794 I'll make a PR to fix the bug. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

[GitHub] [spark] dongjoon-hyun closed pull request #35181: [SPARK-37880][BUILD] Upgrade Scala to 2.13.8

2022-01-13 Thread GitBox
dongjoon-hyun closed pull request #35181: URL: https://github.com/apache/spark/pull/35181 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

[GitHub] [spark] dongjoon-hyun edited a comment on pull request #35190: [SPARK-37893][CORE][TESTS] Avoid ConcurrentModificationException related to SparkFunSuite.LogAppender#_threshold"

2022-01-13 Thread GitBox
dongjoon-hyun edited a comment on pull request #35190: URL: https://github.com/apache/spark/pull/35190#issuecomment-1012710793 Until now, I thought `Primary Author` was determined by the number of lines. However, it turns out that it's just the number of commits.

[GitHub] [spark] dongjoon-hyun commented on pull request #35190: [SPARK-37893][CORE][TESTS] Avoid ConcurrentModificationException related to SparkFunSuite.LogAppender#_threshold"

2022-01-13 Thread GitBox
dongjoon-hyun commented on pull request #35190: URL: https://github.com/apache/spark/pull/35190#issuecomment-1012710793 Until now, I thought `Primary Author` was determined by the number of lines. It turns out that it's just the number of commits.
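The rule discussed in this thread (primary author chosen by commit count, with SPARK-37905 proposing that ties go to the author of the first commit) can be sketched in Python as follows. This is an illustration under stated assumptions, not the code in `dev/merge_spark_pr.py`: the function name `pick_primary_author` is hypothetical, and the input is assumed to be a list of author names in chronological commit order.

```python
from collections import Counter


def pick_primary_author(commit_authors):
    """Return the author with the most commits; ties go to the earliest author.

    commit_authors is assumed to list one author name per commit,
    oldest commit first. Returns None when there are no commits.
    """
    if not commit_authors:
        return None
    counts = Counter(commit_authors)
    # Remember the index of each author's first commit for tie-breaking.
    first_seen = {}
    for index, author in enumerate(commit_authors):
        first_seen.setdefault(author, index)
    # Highest commit count wins; among equals, the smallest first index wins.
    return min(counts, key=lambda a: (-counts[a], first_seen[a]))
```

With one commit each from a contributor and a committer who adds a small fix-up, this rule credits whoever committed first, rather than leaving the outcome to an arbitrary ordering.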

[GitHub] [spark] Peng-Lei opened a new pull request #35204: [SPARK-37878][SQL] Migrate SHOW CREATE TABLE to use v2 command by default

2022-01-13 Thread GitBox
Peng-Lei opened a new pull request #35204: URL: https://github.com/apache/spark/pull/35204 ### What changes were proposed in this pull request? 1. Add `quoted(identifier: TableIdentifier)` to quote the table name of the V1 command (SHOW CREATE TABLE [AS SERDE]) to match V2

[GitHub] [spark] dongjoon-hyun commented on pull request #35190: [SPARK-37893][CORE][TESTS] Avoid ConcurrentModificationException related to SparkFunSuite.LogAppender#_threshold"

2022-01-13 Thread GitBox
dongjoon-hyun commented on pull request #35190: URL: https://github.com/apache/spark/pull/35190#issuecomment-1012707127 In your branch, the ownership was clean line by line. Let me check the bug in our merge script. ``` 7fd361973d2 (Liang-Chi Hsieh 2021-12-23 19:41:02 -0800 272)

[GitHub] [spark] dongjoon-hyun commented on pull request #35190: [SPARK-37893][CORE][TESTS] Avoid ConcurrentModificationException related to SparkFunSuite.LogAppender#_threshold"

2022-01-13 Thread GitBox
dongjoon-hyun commented on pull request #35190: URL: https://github.com/apache/spark/pull/35190#issuecomment-1012705323 Very sorry, @LuciferYang . The merge script didn't work like this so far. :( -- This is an automated message from the Apache Git Service. To respond to the message,

[GitHub] [spark] dongjoon-hyun commented on pull request #35190: [SPARK-37893][CORE][TESTS] Avoid ConcurrentModificationException related to SparkFunSuite.LogAppender#_threshold"

2022-01-13 Thread GitBox
dongjoon-hyun commented on pull request #35190: URL: https://github.com/apache/spark/pull/35190#issuecomment-1012704987 Uh, it looks wrong. Something went wrong during merging via the merge script. I only changed one line here, but it caused the following. ``` Lead-authored-by:

[GitHub] [spark] dongjoon-hyun closed pull request #35190: [SPARK-37893][CORE][TESTS] Avoid ConcurrentModificationException related to SparkFunSuite.LogAppender#_threshold"

2022-01-13 Thread GitBox
dongjoon-hyun closed pull request #35190: URL: https://github.com/apache/spark/pull/35190 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

[GitHub] [spark] dongjoon-hyun commented on pull request #35190: [SPARK-37893][CORE][TESTS] Avoid ConcurrentModificationException related to SparkFunSuite.LogAppender#_threshold"

2022-01-13 Thread GitBox
dongjoon-hyun commented on pull request #35190: URL: https://github.com/apache/spark/pull/35190#issuecomment-1012703926 Thank you all. Merged to master! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above

[GitHub] [spark] cloud-fan closed pull request #35158: [SPARK-37859][SQL] Do not check for metadata during schema comparison

2022-01-13 Thread GitBox
cloud-fan closed pull request #35158: URL: https://github.com/apache/spark/pull/35158 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

[GitHub] [spark] cloud-fan commented on pull request #35158: [SPARK-37859][SQL] Do not check for metadata during schema comparison

2022-01-13 Thread GitBox
cloud-fan commented on pull request #35158: URL: https://github.com/apache/spark/pull/35158#issuecomment-1012703463 thanks, merging to master/3.2! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go

[GitHub] [spark] HyukjinKwon commented on pull request #35158: [SPARK-37859][SQL] Do not check for metadata during schema comparison

2022-01-13 Thread GitBox
HyukjinKwon commented on pull request #35158: URL: https://github.com/apache/spark/pull/35158#issuecomment-1012703190 offline synced. It's because of a bug in my reverted fix. should be fine to go and merge  -- This is an automated message from the Apache Git Service. To respond to

[GitHub] [spark] Yikun edited a comment on pull request #35183: [SPARK-37886][PYTHON][TESTS] Use ComparisonTestBase in pandas test

2022-01-13 Thread GitBox
Yikun edited a comment on pull request #35183: URL: https://github.com/apache/spark/pull/35183#issuecomment-1012699764 FYI https://github.com/apache/spark/pull/35203, the Ops related tests have some refactors, so I also make it in a separated PR to help easy review. -- This is an

[GitHub] [spark] LuciferYang commented on pull request #35163: [SPARK-37864][SQL] Support vectorized read boolean values use RLE encoding with Parquet DataPage V2

2022-01-13 Thread GitBox
LuciferYang commented on pull request #35163: URL: https://github.com/apache/spark/pull/35163#issuecomment-1012702635 thanks @sunchao -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [spark] HyukjinKwon commented on pull request #35183: [SPARK-37886][PYTHON][TESTS] Use ComparisonTestBase in pandas test

2022-01-13 Thread GitBox
HyukjinKwon commented on pull request #35183: URL: https://github.com/apache/spark/pull/35183#issuecomment-1012702418  -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

[GitHub] [spark] LuciferYang commented on a change in pull request #35190: [SPARK-37893][CORE][TESTS] Avoid ConcurrentModificationException related to SparkFunSuite.LogAppender#_threshold"

2022-01-13 Thread GitBox
LuciferYang commented on a change in pull request #35190: URL: https://github.com/apache/spark/pull/35190#discussion_r784473891 ## File path: core/src/test/scala/org/apache/spark/SparkFunSuite.scala ## @@ -272,19 +272,23 @@ abstract class SparkFunSuite override def

[GitHub] [spark] cloud-fan commented on a change in pull request #35060: [SPARK-28137][SQL] Data Type Formatting Functions: `to_number`

2022-01-13 Thread GitBox
cloud-fan commented on a change in pull request #35060: URL: https://github.com/apache/spark/pull/35060#discussion_r784473607 ## File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/StringExpressionsSuite.scala ## @@ -888,6 +889,182 @@ class

[GitHub] [spark] cloud-fan commented on a change in pull request #35060: [SPARK-28137][SQL] Data Type Formatting Functions: `to_number`

2022-01-13 Thread GitBox
cloud-fan commented on a change in pull request #35060: URL: https://github.com/apache/spark/pull/35060#discussion_r784472530 ## File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/StringExpressionsSuite.scala ## @@ -888,6 +889,182 @@ class

[GitHub] [spark] Yikun commented on pull request #35183: [SPARK-37886][PYTHON][TESTS] Use ComparisonTestBase in pandas test

2022-01-13 Thread GitBox
Yikun commented on pull request #35183: URL: https://github.com/apache/spark/pull/35183#issuecomment-1012699764 FYI https://github.com/apache/spark/pull/35203, the Ops related tests have some refactors, so I make it in a separated PR to help easy review. -- This is an automated message

[GitHub] [spark] cloud-fan commented on a change in pull request #35060: [SPARK-28137][SQL] Data Type Formatting Functions: `to_number`

2022-01-13 Thread GitBox
cloud-fan commented on a change in pull request #35060: URL: https://github.com/apache/spark/pull/35060#discussion_r784470211 ## File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/StringExpressionsSuite.scala ## @@ -888,6 +889,182 @@ class

[GitHub] [spark] cloud-fan commented on a change in pull request #35060: [SPARK-28137][SQL] Data Type Formatting Functions: `to_number`

2022-01-13 Thread GitBox
cloud-fan commented on a change in pull request #35060: URL: https://github.com/apache/spark/pull/35060#discussion_r784469883 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/NumberConstants.scala ## @@ -0,0 +1,250 @@ +/* + * Licensed to the Apache

[GitHub] [spark] cloud-fan commented on a change in pull request #35060: [SPARK-28137][SQL] Data Type Formatting Functions: `to_number`

2022-01-13 Thread GitBox
cloud-fan commented on a change in pull request #35060: URL: https://github.com/apache/spark/pull/35060#discussion_r784469313 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/NumberConstants.scala ## @@ -0,0 +1,250 @@ +/* + * Licensed to the Apache

[GitHub] [spark] Yikun opened a new pull request #35203: [SPARK-37886][PYTHON][TESTS] Refactor on OpsTestCase and use ComparisonTestBase

2022-01-13 Thread GitBox
Yikun opened a new pull request #35203: URL: https://github.com/apache/spark/pull/35203 ### What changes were proposed in this pull request? - Rename TestCasesUtils to OpsTestCase - Make OpsTestCase inherited from `ComparisonTestBase`(`PandasOnSparkTestCase` with `pdf` and `psdf`)

[GitHub] [spark] imback82 commented on a change in pull request #35202: [SPARK-37479][SQL] Migrate DROP NAMESPACE to use V2 command by default

2022-01-13 Thread GitBox
imback82 commented on a change in pull request #35202: URL: https://github.com/apache/spark/pull/35202#discussion_r784468742 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DropNamespaceExec.scala ## @@ -46,9 +46,17 @@ case class

[GitHub] [spark] cloud-fan commented on a change in pull request #35060: [SPARK-28137][SQL] Data Type Formatting Functions: `to_number`

2022-01-13 Thread GitBox
cloud-fan commented on a change in pull request #35060: URL: https://github.com/apache/spark/pull/35060#discussion_r784468567 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/NumberConstants.scala ## @@ -0,0 +1,250 @@ +/* + * Licensed to the Apache

[GitHub] [spark] cloud-fan commented on a change in pull request #35060: [SPARK-28137][SQL] Data Type Formatting Functions: `to_number`

2022-01-13 Thread GitBox
cloud-fan commented on a change in pull request #35060: URL: https://github.com/apache/spark/pull/35060#discussion_r784468389 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/NumberConstants.scala ## @@ -0,0 +1,250 @@ +/* + * Licensed to the Apache

[GitHub] [spark] cloud-fan commented on a change in pull request #35060: [SPARK-28137][SQL] Data Type Formatting Functions: `to_number`

2022-01-13 Thread GitBox
cloud-fan commented on a change in pull request #35060: URL: https://github.com/apache/spark/pull/35060#discussion_r784467972 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/NumberConstants.scala ## @@ -0,0 +1,250 @@ +/* + * Licensed to the Apache

[GitHub] [spark] cloud-fan commented on a change in pull request #35060: [SPARK-28137][SQL] Data Type Formatting Functions: `to_number`

2022-01-13 Thread GitBox
cloud-fan commented on a change in pull request #35060: URL: https://github.com/apache/spark/pull/35060#discussion_r784467824 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/NumberConstants.scala ## @@ -0,0 +1,250 @@ +/* + * Licensed to the Apache

[GitHub] [spark] cloud-fan commented on a change in pull request #35060: [SPARK-28137][SQL] Data Type Formatting Functions: `to_number`

2022-01-13 Thread GitBox
cloud-fan commented on a change in pull request #35060: URL: https://github.com/apache/spark/pull/35060#discussion_r784467082 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/NumberConstants.scala ## @@ -0,0 +1,250 @@ +/* + * Licensed to the Apache

[GitHub] [spark] AngersZhuuuu commented on a change in pull request #34848: [SPARK-37582][SPARK-37583][SQL] CONTAINS, STARTSWITH, ENDSWITH should support all data type

2022-01-13 Thread GitBox
AngersZhuuuu commented on a change in pull request #34848: URL: https://github.com/apache/spark/pull/34848#discussion_r784466209 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala ## @@ -450,22 +451,49 @@ case class

[GitHub] [spark] cloud-fan commented on a change in pull request #35202: [SPARK-37479][SQL] Migrate DROP NAMESPACE to use V2 command by default

2022-01-13 Thread GitBox
cloud-fan commented on a change in pull request #35202: URL: https://github.com/apache/spark/pull/35202#discussion_r784463563 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DropNamespaceExec.scala ## @@ -46,9 +46,17 @@ case class

[GitHub] [spark] cloud-fan commented on a change in pull request #35202: [SPARK-37479][SQL] Migrate DROP NAMESPACE to use V2 command by default

2022-01-13 Thread GitBox
cloud-fan commented on a change in pull request #35202: URL: https://github.com/apache/spark/pull/35202#discussion_r784459401 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DropNamespaceExec.scala ## @@ -46,9 +46,17 @@ case class

[GitHub] [spark] Yaohua628 commented on a change in pull request #35147: [SPARK-37768][SQL][FOLLOWUP] Schema pruning for the metadata struct

2022-01-13 Thread GitBox
Yaohua628 commented on a change in pull request #35147: URL: https://github.com/apache/spark/pull/35147#discussion_r784456351 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/SchemaPruning.scala ## @@ -179,12 +189,16 @@ object SchemaPruning

[GitHub] [spark] Yaohua628 commented on a change in pull request #35147: [SPARK-37768][SQL][FOLLOWUP] Schema pruning for the metadata struct

2022-01-13 Thread GitBox
Yaohua628 commented on a change in pull request #35147: URL: https://github.com/apache/spark/pull/35147#discussion_r784456113 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/SchemaPruning.scala ## @@ -31,58 +31,70 @@ import

[GitHub] [spark] dchvn commented on pull request #35202: [SPARK-37479][SQL] Migrate DROP NAMESPACE to use V2 command by default

2022-01-13 Thread GitBox
dchvn commented on pull request #35202: URL: https://github.com/apache/spark/pull/35202#issuecomment-1012676663 cc @cloud-fan @imback82. Could you take a look if you have time? Thank you! -- This is an automated message from the Apache Git Service. To respond to the message, please log

[GitHub] [spark] dchvn opened a new pull request #35202: [SPARK-37479][SQL] Migrate DROP NAMESPACE to use V2 command by default

2022-01-13 Thread GitBox
dchvn opened a new pull request #35202: URL: https://github.com/apache/spark/pull/35202 ### What changes were proposed in this pull request? This PR migrates `DROP NAMESPACE` to use V2 command by default. ### Why are the changes needed? It's been a while since we introduced

[GitHub] [spark] cloud-fan closed pull request #34914: [SPARK-37627][SQL][FOLLOWUP] Separate SortedBucketTransform from BucketTransform

2022-01-13 Thread GitBox
cloud-fan closed pull request #34914: URL: https://github.com/apache/spark/pull/34914 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

[GitHub] [spark] cloud-fan commented on pull request #34914: [SPARK-37627][SQL][FOLLOWUP] Separate SortedBucketTransform from BucketTransform

2022-01-13 Thread GitBox
cloud-fan commented on pull request #34914: URL: https://github.com/apache/spark/pull/34914#issuecomment-1012673782 thanks, merging to master! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

[GitHub] [spark] Yikun commented on pull request #35183: [SPARK-37886][PySpark][TEST] Use ComparisonTestBase in pandas test

2022-01-13 Thread GitBox
Yikun commented on pull request #35183: URL: https://github.com/apache/spark/pull/35183#issuecomment-1012673454 FYI @xinrong-databricks @HyukjinKwon -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to
