[jira] [Commented] (SPARK-37161) RowToColumnConverter support AnsiIntervalType

2021-10-29 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-37161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17436232#comment-17436232
 ] 

Apache Spark commented on SPARK-37161:
--

User 'Peng-Lei' has created a pull request for this issue:
https://github.com/apache/spark/pull/34446

> RowToColumnConverter  support AnsiIntervalType
> --
>
> Key: SPARK-37161
> URL: https://issues.apache.org/jira/browse/SPARK-37161
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.3.0
>Reporter: PengLei
>Priority: Major
>
> Currently, we have a RowToColumnConverter for all data types except
> AnsiIntervalType.
> {code:java}
> // excerpt from RowToColumnConverter.getConverterForType
> val core = dataType match {
>   case BinaryType => BinaryConverter
>   case BooleanType => BooleanConverter
>   case ByteType => ByteConverter
>   case ShortType => ShortConverter
>   case IntegerType | DateType => IntConverter
>   case FloatType => FloatConverter
>   case LongType | TimestampType => LongConverter
>   case DoubleType => DoubleConverter
>   case StringType => StringConverter
>   case CalendarIntervalType => CalendarConverter
>   case at: ArrayType => ArrayConverter(getConverterForType(at.elementType, 
> at.containsNull))
>   case st: StructType => new StructConverter(st.fields.map(
> (f) => getConverterForType(f.dataType, f.nullable)))
>   case dt: DecimalType => new DecimalConverter(dt)
>   case mt: MapType => MapConverter(getConverterForType(mt.keyType, nullable = 
> false),
> getConverterForType(mt.valueType, mt.valueContainsNull))
>   case unknown => throw 
> QueryExecutionErrors.unsupportedDataTypeError(unknown.toString)
> }
> if (nullable) {
>   dataType match {
> case CalendarIntervalType => new StructNullableTypeConverter(core)
> case st: StructType => new StructNullableTypeConverter(core)
> case _ => new BasicNullableTypeConverter(core)
>   }
> } else {
>   core
> }
> {code}
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-37161) RowToColumnConverter support AnsiIntervalType

2021-10-29 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-37161:


Assignee: Apache Spark

> RowToColumnConverter  support AnsiIntervalType
> --
>
> Key: SPARK-37161
> URL: https://issues.apache.org/jira/browse/SPARK-37161
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.3.0
>Reporter: PengLei
>Assignee: Apache Spark
>Priority: Major
>
> Currently, we have a RowToColumnConverter for all data types except
> AnsiIntervalType.
> {code:java}
> // excerpt from RowToColumnConverter.getConverterForType
> val core = dataType match {
>   case BinaryType => BinaryConverter
>   case BooleanType => BooleanConverter
>   case ByteType => ByteConverter
>   case ShortType => ShortConverter
>   case IntegerType | DateType => IntConverter
>   case FloatType => FloatConverter
>   case LongType | TimestampType => LongConverter
>   case DoubleType => DoubleConverter
>   case StringType => StringConverter
>   case CalendarIntervalType => CalendarConverter
>   case at: ArrayType => ArrayConverter(getConverterForType(at.elementType, 
> at.containsNull))
>   case st: StructType => new StructConverter(st.fields.map(
> (f) => getConverterForType(f.dataType, f.nullable)))
>   case dt: DecimalType => new DecimalConverter(dt)
>   case mt: MapType => MapConverter(getConverterForType(mt.keyType, nullable = 
> false),
> getConverterForType(mt.valueType, mt.valueContainsNull))
>   case unknown => throw 
> QueryExecutionErrors.unsupportedDataTypeError(unknown.toString)
> }
> if (nullable) {
>   dataType match {
> case CalendarIntervalType => new StructNullableTypeConverter(core)
> case st: StructType => new StructNullableTypeConverter(core)
> case _ => new BasicNullableTypeConverter(core)
>   }
> } else {
>   core
> }
> {code}
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-37161) RowToColumnConverter support AnsiIntervalType

2021-10-29 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-37161:


Assignee: (was: Apache Spark)

> RowToColumnConverter  support AnsiIntervalType
> --
>
> Key: SPARK-37161
> URL: https://issues.apache.org/jira/browse/SPARK-37161
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.3.0
>Reporter: PengLei
>Priority: Major
>
> Currently, we have a RowToColumnConverter for all data types except
> AnsiIntervalType.
> {code:java}
> // excerpt from RowToColumnConverter.getConverterForType
> val core = dataType match {
>   case BinaryType => BinaryConverter
>   case BooleanType => BooleanConverter
>   case ByteType => ByteConverter
>   case ShortType => ShortConverter
>   case IntegerType | DateType => IntConverter
>   case FloatType => FloatConverter
>   case LongType | TimestampType => LongConverter
>   case DoubleType => DoubleConverter
>   case StringType => StringConverter
>   case CalendarIntervalType => CalendarConverter
>   case at: ArrayType => ArrayConverter(getConverterForType(at.elementType, 
> at.containsNull))
>   case st: StructType => new StructConverter(st.fields.map(
> (f) => getConverterForType(f.dataType, f.nullable)))
>   case dt: DecimalType => new DecimalConverter(dt)
>   case mt: MapType => MapConverter(getConverterForType(mt.keyType, nullable = 
> false),
> getConverterForType(mt.valueType, mt.valueContainsNull))
>   case unknown => throw 
> QueryExecutionErrors.unsupportedDataTypeError(unknown.toString)
> }
> if (nullable) {
>   dataType match {
> case CalendarIntervalType => new StructNullableTypeConverter(core)
> case st: StructType => new StructNullableTypeConverter(core)
> case _ => new BasicNullableTypeConverter(core)
>   }
> } else {
>   core
> }
> {code}
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-37161) RowToColumnConverter support AnsiIntervalType

2021-10-29 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-37161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17436231#comment-17436231
 ] 

Apache Spark commented on SPARK-37161:
--

User 'Peng-Lei' has created a pull request for this issue:
https://github.com/apache/spark/pull/34446

> RowToColumnConverter  support AnsiIntervalType
> --
>
> Key: SPARK-37161
> URL: https://issues.apache.org/jira/browse/SPARK-37161
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.3.0
>Reporter: PengLei
>Priority: Major
>
> Currently, we have a RowToColumnConverter for all data types except
> AnsiIntervalType.
> {code:java}
> // excerpt from RowToColumnConverter.getConverterForType
> val core = dataType match {
>   case BinaryType => BinaryConverter
>   case BooleanType => BooleanConverter
>   case ByteType => ByteConverter
>   case ShortType => ShortConverter
>   case IntegerType | DateType => IntConverter
>   case FloatType => FloatConverter
>   case LongType | TimestampType => LongConverter
>   case DoubleType => DoubleConverter
>   case StringType => StringConverter
>   case CalendarIntervalType => CalendarConverter
>   case at: ArrayType => ArrayConverter(getConverterForType(at.elementType, 
> at.containsNull))
>   case st: StructType => new StructConverter(st.fields.map(
> (f) => getConverterForType(f.dataType, f.nullable)))
>   case dt: DecimalType => new DecimalConverter(dt)
>   case mt: MapType => MapConverter(getConverterForType(mt.keyType, nullable = 
> false),
> getConverterForType(mt.valueType, mt.valueContainsNull))
>   case unknown => throw 
> QueryExecutionErrors.unsupportedDataTypeError(unknown.toString)
> }
> if (nullable) {
>   dataType match {
> case CalendarIntervalType => new StructNullableTypeConverter(core)
> case st: StructType => new StructNullableTypeConverter(core)
> case _ => new BasicNullableTypeConverter(core)
>   }
> } else {
>   core
> }
> {code}
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-36646) Push down group by partition column for Aggregate (Min/Max/Count) for Parquet

2021-10-29 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-36646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17436228#comment-17436228
 ] 

Apache Spark commented on SPARK-36646:
--

User 'huaxingao' has created a pull request for this issue:
https://github.com/apache/spark/pull/34445

> Push down group by partition column for Aggregate (Min/Max/Count) for Parquet
> -
>
> Key: SPARK-36646
> URL: https://issues.apache.org/jira/browse/SPARK-36646
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.3.0
>Reporter: Huaxin Gao
>Priority: Major
>
> If an aggregate (Min/Max/Count) over Parquet is grouped by a partition column,
> push down the group by as well.
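
For context, a hedged sketch of the query shape this targets (the table and column names are illustrative, not from the issue):
{code:java}
// A partitioned Parquet table; "day" is the partition column.
spark.sql("CREATE TABLE events (id BIGINT, day STRING) USING parquet PARTITIONED BY (day)")

// When the grouping key is the partition column, MIN/MAX/COUNT could be answered
// per partition from file/partition metadata instead of scanning row data.
spark.sql("SELECT day, COUNT(*) FROM events GROUP BY day").show()
{code}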



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-32567) Code-gen for full outer shuffled hash join

2021-10-29 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-32567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-32567:


Assignee: Apache Spark

> Code-gen for full outer shuffled hash join
> --
>
> Key: SPARK-32567
> URL: https://issues.apache.org/jira/browse/SPARK-32567
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.1.0
>Reporter: Cheng Su
>Assignee: Apache Spark
>Priority: Minor
>
> As a followup for [https://github.com/apache/spark/pull/29342] (non-codegen 
> full outer shuffled hash join), this task is to add code-gen for it.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-32567) Code-gen for full outer shuffled hash join

2021-10-29 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-32567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17436215#comment-17436215
 ] 

Apache Spark commented on SPARK-32567:
--

User 'c21' has created a pull request for this issue:
https://github.com/apache/spark/pull/3

> Code-gen for full outer shuffled hash join
> --
>
> Key: SPARK-32567
> URL: https://issues.apache.org/jira/browse/SPARK-32567
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.1.0
>Reporter: Cheng Su
>Priority: Minor
>
> As a followup for [https://github.com/apache/spark/pull/29342] (non-codegen 
> full outer shuffled hash join), this task is to add code-gen for it.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-32567) Code-gen for full outer shuffled hash join

2021-10-29 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-32567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17436214#comment-17436214
 ] 

Apache Spark commented on SPARK-32567:
--

User 'c21' has created a pull request for this issue:
https://github.com/apache/spark/pull/3

> Code-gen for full outer shuffled hash join
> --
>
> Key: SPARK-32567
> URL: https://issues.apache.org/jira/browse/SPARK-32567
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.1.0
>Reporter: Cheng Su
>Priority: Minor
>
> As a followup for [https://github.com/apache/spark/pull/29342] (non-codegen 
> full outer shuffled hash join), this task is to add code-gen for it.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-32567) Code-gen for full outer shuffled hash join

2021-10-29 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-32567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-32567:


Assignee: (was: Apache Spark)

> Code-gen for full outer shuffled hash join
> --
>
> Key: SPARK-32567
> URL: https://issues.apache.org/jira/browse/SPARK-32567
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.1.0
>Reporter: Cheng Su
>Priority: Minor
>
> As a followup for [https://github.com/apache/spark/pull/29342] (non-codegen 
> full outer shuffled hash join), this task is to add code-gen for it.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-37168) Improve error messages for SQL functions and operators under ANSI mode

2021-10-29 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-37168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17436179#comment-17436179
 ] 

Apache Spark commented on SPARK-37168:
--

User 'allisonwang-db' has created a pull request for this issue:
https://github.com/apache/spark/pull/34443

> Improve error messages for SQL functions and operators under ANSI mode
> --
>
> Key: SPARK-37168
> URL: https://issues.apache.org/jira/browse/SPARK-37168
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.3.0
>Reporter: Allison Wang
>Priority: Major
>
> Make error messages more actionable when ANSI mode is enabled.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-37168) Improve error messages for SQL functions and operators under ANSI mode

2021-10-29 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-37168:


Assignee: (was: Apache Spark)

> Improve error messages for SQL functions and operators under ANSI mode
> --
>
> Key: SPARK-37168
> URL: https://issues.apache.org/jira/browse/SPARK-37168
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.3.0
>Reporter: Allison Wang
>Priority: Major
>
> Make error messages more actionable when ANSI mode is enabled.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-37168) Improve error messages for SQL functions and operators under ANSI mode

2021-10-29 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-37168:


Assignee: Apache Spark

> Improve error messages for SQL functions and operators under ANSI mode
> --
>
> Key: SPARK-37168
> URL: https://issues.apache.org/jira/browse/SPARK-37168
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.3.0
>Reporter: Allison Wang
>Assignee: Apache Spark
>Priority: Major
>
> Make error messages more actionable when ANSI mode is enabled.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-37168) Improve error messages for SQL functions and operators under ANSI mode

2021-10-29 Thread Allison Wang (Jira)
Allison Wang created SPARK-37168:


 Summary: Improve error messages for SQL functions and operators 
under ANSI mode
 Key: SPARK-37168
 URL: https://issues.apache.org/jira/browse/SPARK-37168
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 3.3.0
Reporter: Allison Wang


Make error messages more actionable when ANSI mode is enabled.
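
For context, a minimal illustration of where such errors surface, assuming the standard `spark.sql.ansi.enabled` flag (the exact functions and operators covered are not listed in this issue):
{code:java}
// With ANSI mode on, operations that would silently return NULL instead fail at
// runtime; this issue is about making the resulting error messages more actionable.
spark.conf.set("spark.sql.ansi.enabled", "true")
spark.sql("SELECT CAST('abc' AS INT)").show()  // throws under ANSI mode, returns NULL otherwise
{code}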



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-37117) Can't read files in one of Parquet encryption modes (external keymaterial)

2021-10-29 Thread Huaxin Gao (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Huaxin Gao resolved SPARK-37117.

Fix Version/s: 3.3.0
   3.2.1
 Assignee: Gidon Gershinsky
   Resolution: Fixed

> Can't read files in one of Parquet encryption modes (external keymaterial) 
> ---
>
> Key: SPARK-37117
> URL: https://issues.apache.org/jira/browse/SPARK-37117
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.2.0
>Reporter: Gidon Gershinsky
>Assignee: Gidon Gershinsky
>Priority: Major
> Fix For: 3.2.1, 3.3.0
>
>
> Parquet encryption has a number of modes. One of them is "external
> keymaterial", which keeps encrypted data keys in a separate file (as opposed
> to inside the Parquet file). Upon reading, the Spark Parquet connector does not
> pass the file path, which causes an NPE.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-37167) Add benchmark for aggregate push down

2021-10-29 Thread Cheng Su (Jira)
Cheng Su created SPARK-37167:


 Summary: Add benchmark for aggregate push down
 Key: SPARK-37167
 URL: https://issues.apache.org/jira/browse/SPARK-37167
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 3.3.0
Reporter: Cheng Su


As we added aggregate push down for Parquet and ORC, let's also add a micro 
benchmark for both file formats, similar to filter push down and nested schema 
pruning.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-26365) spark-submit for k8s cluster doesn't propagate exit code

2021-10-29 Thread Naresh (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-26365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17436144#comment-17436144
 ] 

Naresh edited comment on SPARK-26365 at 10/29/21, 7:41 PM:
---

[~oscar.bonilla] Any plans to prioritize this issue? This will definitely block
Spark usage with K8s.


was (Author: gangishetty):
[~oscar.bonilla] Any plans to prioritize issue?? This will definitely lock the 
spark usage with K8s

> spark-submit for k8s cluster doesn't propagate exit code
> 
>
> Key: SPARK-26365
> URL: https://issues.apache.org/jira/browse/SPARK-26365
> Project: Spark
>  Issue Type: Bug
>  Components: Kubernetes, Spark Core, Spark Submit
>Affects Versions: 2.3.2, 2.4.0
>Reporter: Oscar Bonilla
>Priority: Minor
> Attachments: spark-2.4.5-raise-exception-k8s-failure.patch, 
> spark-3.0.0-raise-exception-k8s-failure.patch
>
>
> When launching apps using spark-submit in a Kubernetes cluster, if the Spark
> application fails (returns exit code = 1, for example), spark-submit will
> still exit gracefully and return exit code = 0.
> This is problematic, since there's no way to know whether there's been a problem
> with the Spark application.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-26365) spark-submit for k8s cluster doesn't propagate exit code

2021-10-29 Thread Naresh (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-26365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17436144#comment-17436144
 ] 

Naresh commented on SPARK-26365:


[~oscar.bonilla] Any plans to prioritize issue?? This will definitely lock the 
spark usage with K8s

> spark-submit for k8s cluster doesn't propagate exit code
> 
>
> Key: SPARK-26365
> URL: https://issues.apache.org/jira/browse/SPARK-26365
> Project: Spark
>  Issue Type: Bug
>  Components: Kubernetes, Spark Core, Spark Submit
>Affects Versions: 2.3.2, 2.4.0
>Reporter: Oscar Bonilla
>Priority: Minor
> Attachments: spark-2.4.5-raise-exception-k8s-failure.patch, 
> spark-3.0.0-raise-exception-k8s-failure.patch
>
>
> When launching apps using spark-submit in a Kubernetes cluster, if the Spark
> application fails (returns exit code = 1, for example), spark-submit will
> still exit gracefully and return exit code = 0.
> This is problematic, since there's no way to know whether there's been a problem
> with the Spark application.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-36844) Window function "first" (unboundedFollowing) appears significantly slower than "last" (unboundedPreceding) in identical circumstances

2021-10-29 Thread Tanel Kiis (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-36844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17436116#comment-17436116
 ] 

Tanel Kiis commented on SPARK-36844:


Hello,

I also hit this issue a while back and found that it is partly explained in
this code comment:
https://github.com/apache/spark/blob/abf9675a7559d5666e40f25098334b5edbf8c0c3/sql/core/src/main/scala/org/apache/spark/sql/execution/window/WindowFunctionFrame.scala#L609-L611

So it is not the fault of the first aggregator; it is the
UnboundedFollowing window frame. There are definitely some optimizations that
could be done.

If I followed your code correctly, then I think you would be better off using
the [lead|https://spark.apache.org/docs/latest/api/sql/index.html#lead] and
[lag|https://spark.apache.org/docs/latest/api/sql/index.html#lag] window
functions. With those you can drop the .rowsBetween(...) part from your window
specs, as in the sketch below.
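
A rough Scala sketch of that suggestion, reusing the column names from the snippet quoted below; this is an assumption about the intended usage, and it does not reproduce the ignorenulls null-skipping behaviour of the original first/last calls:
{code:java}
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{col, lag, lead}

// lead/lag look at a fixed offset from the current row, so no rowsBetween(...)
// frame is needed and the costly UnboundedFollowing frame is avoided.
val win = Window.partitionBy("PORT_TYPE", "loss_process").orderBy("rank")

// "df" stands for the joined dataframe from the quoted code; the rank columns
// could be handled the same way.
val dfWithNeighbours = df
  .withColumn("last_sf", lag(col("scale_factor"), 1).over(win))
  .withColumn("next_sf", lead(col("scale_factor"), 1).over(win))
{code}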

> Window function "first" (unboundedFollowing) appears significantly slower 
> than "last" (unboundedPreceding) in identical circumstances
> -
>
> Key: SPARK-36844
> URL: https://issues.apache.org/jira/browse/SPARK-36844
> Project: Spark
>  Issue Type: Bug
>  Components: PySpark, Windows
>Affects Versions: 3.1.1
>Reporter: Alain Bryden
>Priority: Minor
> Attachments: Physical Plan 2 - workaround.png, Pysical Plan.png
>
>
> I originally posted a question on SO because I thought perhaps I was doing 
> something wrong:
> [https://stackoverflow.com/questions/69308560|https://stackoverflow.com/questions/69308560/spark-first-window-function-is-taking-much-longer-than-last?noredirect=1#comment122505685_69308560]
> Perhaps I am, but I'm now fairly convinced that there's something wonky with 
> the implementation of `first` that's causing it to unnecessarily have a much 
> worse complexity than `last`.
>  
> More or less copy-pasted from the above post:
> I was working on a pyspark routine to interpolate the missing values in a 
> configuration table.
> Imagine a table of configuration values that go from 0 to 50,000. The user 
> specifies a few data points in between (say at 0, 50, 100, 500, 2000, 50) 
> and we interpolate the remainder. My solution mostly follows [this blog 
> post|https://walkenho.github.io/interpolating-time-series-p2-spark/] quite 
> closely, except I'm not using any UDFs.
> In troubleshooting the performance of this (takes ~3 minutes) I found that 
> one particular window function is taking all of the time, and everything else 
> I'm doing takes mere seconds.
> Here is the main area of interest - where I use window functions to fill in 
> the previous and next user-supplied configuration values:
> {code:python}
> from pyspark.sql import Window, functions as F
> # Create partition windows that are required to generate new rows from the 
> ones provided
> win_last = Window.partitionBy('PORT_TYPE', 
> 'loss_process').orderBy('rank').rowsBetween(Window.unboundedPreceding, 0)
> win_next = Window.partitionBy('PORT_TYPE', 
> 'loss_process').orderBy('rank').rowsBetween(0, Window.unboundedFollowing)
> # Join back in the provided config table to populate the "known" scale factors
> df_part1 = (df_scale_factors_template
>   .join(df_users_config, ['PORT_TYPE', 'loss_process', 'rank'], 'leftouter')
>   # Add computed columns that can lookup the prior config and next config for 
> each missing value
>   .withColumn('last_rank', F.last( F.col('rank'), 
> ignorenulls=True).over(win_last))
>   .withColumn('last_sf',   F.last( F.col('scale_factor'), 
> ignorenulls=True).over(win_last))
> ).cache()
> debug_log_dataframe(df_part1 , 'df_part1') # Force a .count() and time Part1
> df_part2 = (df_part1
>   .withColumn('next_rank', F.first(F.col('rank'), 
> ignorenulls=True).over(win_next))
>   .withColumn('next_sf',   F.first(F.col('scale_factor'), 
> ignorenulls=True).over(win_next))
> ).cache()
> debug_log_dataframe(df_part2 , 'df_part2') # Force a .count() and time Part2
> df_part3 = (df_part2
>   # Implements standard linear interpolation: y = y1 + ((y2-y1)/(x2-x1)) * (x-x1)
>   .withColumn('scale_factor',
>   F.when(F.col('last_rank') == F.col('next_rank'),
>  F.col('last_sf'))  # Handle div/0 case
>   .otherwise(F.col('last_sf') +
>  ((F.col('next_sf') - F.col('last_sf')) / (F.col('next_rank') - F.col('last_rank'))) *
>  (F.col('rank') - F.col('last_rank'))))
>   .select('PORT_TYPE', 'loss_process', 'rank', 'scale_factor')
> ).cache()
> debug_log_dataframe(df_part3, 'df_part3', explain=True)
> {code}
>  
> The above used to be a single chained dataframe statement, but I've since 
> split it into 3 parts so that I could isolate the part that's taking so long. 
> The results 

[jira] [Created] (SPARK-37166) SPIP: Storage Partitioned Join

2021-10-29 Thread Chao Sun (Jira)
Chao Sun created SPARK-37166:


 Summary: SPIP: Storage Partitioned Join
 Key: SPARK-37166
 URL: https://issues.apache.org/jira/browse/SPARK-37166
 Project: Spark
  Issue Type: New Feature
  Components: SQL
Affects Versions: 3.3.0
Reporter: Chao Sun


This JIRA tracks the SPIP for storage partitioned join.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-37165) Add REPEATABLE in TABLESAMPLE to specify seed

2021-10-29 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-37165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17436052#comment-17436052
 ] 

Apache Spark commented on SPARK-37165:
--

User 'huaxingao' has created a pull request for this issue:
https://github.com/apache/spark/pull/34442

> Add REPEATABLE in TABLESAMPLE to specify seed
> -
>
> Key: SPARK-37165
> URL: https://issues.apache.org/jira/browse/SPARK-37165
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.3.0
>Reporter: Huaxin Gao
>Priority: Minor
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-37165) Add REPEATABLE in TABLESAMPLE to specify seed

2021-10-29 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-37165:


Assignee: Apache Spark

> Add REPEATABLE in TABLESAMPLE to specify seed
> -
>
> Key: SPARK-37165
> URL: https://issues.apache.org/jira/browse/SPARK-37165
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.3.0
>Reporter: Huaxin Gao
>Assignee: Apache Spark
>Priority: Minor
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-37165) Add REPEATABLE in TABLESAMPLE to specify seed

2021-10-29 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-37165:


Assignee: (was: Apache Spark)

> Add REPEATABLE in TABLESAMPLE to specify seed
> -
>
> Key: SPARK-37165
> URL: https://issues.apache.org/jira/browse/SPARK-37165
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.3.0
>Reporter: Huaxin Gao
>Priority: Minor
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-37165) Add REPEATABLE in TABLESAMPLE to specify seed

2021-10-29 Thread Huaxin Gao (Jira)
Huaxin Gao created SPARK-37165:
--

 Summary: Add REPEATABLE in TABLESAMPLE to specify seed
 Key: SPARK-37165
 URL: https://issues.apache.org/jira/browse/SPARK-37165
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 3.3.0
Reporter: Huaxin Gao






--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-37164) Add ExpressionBuilder for functions with complex overloads

2021-10-29 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-37164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17436037#comment-17436037
 ] 

Apache Spark commented on SPARK-37164:
--

User 'cloud-fan' has created a pull request for this issue:
https://github.com/apache/spark/pull/34441

> Add ExpressionBuilder for functions with complex overloads
> --
>
> Key: SPARK-37164
> URL: https://issues.apache.org/jira/browse/SPARK-37164
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.3.0
>Reporter: Wenchen Fan
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-37164) Add ExpressionBuilder for functions with complex overloads

2021-10-29 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-37164:


Assignee: Apache Spark

> Add ExpressionBuilder for functions with complex overloads
> --
>
> Key: SPARK-37164
> URL: https://issues.apache.org/jira/browse/SPARK-37164
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.3.0
>Reporter: Wenchen Fan
>Assignee: Apache Spark
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-37164) Add ExpressionBuilder for functions with complex overloads

2021-10-29 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-37164:


Assignee: (was: Apache Spark)

> Add ExpressionBuilder for functions with complex overloads
> --
>
> Key: SPARK-37164
> URL: https://issues.apache.org/jira/browse/SPARK-37164
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.3.0
>Reporter: Wenchen Fan
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-37164) Add ExpressionBuilder for functions with complex overloads

2021-10-29 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-37164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17436036#comment-17436036
 ] 

Apache Spark commented on SPARK-37164:
--

User 'cloud-fan' has created a pull request for this issue:
https://github.com/apache/spark/pull/34441

> Add ExpressionBuilder for functions with complex overloads
> --
>
> Key: SPARK-37164
> URL: https://issues.apache.org/jira/browse/SPARK-37164
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.3.0
>Reporter: Wenchen Fan
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-37140) Inline type hints for python/pyspark/resultiterable.py

2021-10-29 Thread Maciej Szymkiewicz (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maciej Szymkiewicz resolved SPARK-37140.

Fix Version/s: 3.3.0
   Resolution: Fixed

Issue resolved by pull request 34413
[https://github.com/apache/spark/pull/34413]

> Inline type hints for python/pyspark/resultiterable.py
> --
>
> Key: SPARK-37140
> URL: https://issues.apache.org/jira/browse/SPARK-37140
> Project: Spark
>  Issue Type: Sub-task
>  Components: PySpark
>Affects Versions: 3.3.0
>Reporter: dch nguyen
>Assignee: dch nguyen
>Priority: Major
> Fix For: 3.3.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-37140) Inline type hints for python/pyspark/resultiterable.py

2021-10-29 Thread Maciej Szymkiewicz (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maciej Szymkiewicz reassigned SPARK-37140:
--

Assignee: dch nguyen

> Inline type hints for python/pyspark/resultiterable.py
> --
>
> Key: SPARK-37140
> URL: https://issues.apache.org/jira/browse/SPARK-37140
> Project: Spark
>  Issue Type: Sub-task
>  Components: PySpark
>Affects Versions: 3.3.0
>Reporter: dch nguyen
>Assignee: dch nguyen
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-37164) Add ExpressionBuilder for functions with complex overloads

2021-10-29 Thread Wenchen Fan (Jira)
Wenchen Fan created SPARK-37164:
---

 Summary: Add ExpressionBuilder for functions with complex overloads
 Key: SPARK-37164
 URL: https://issues.apache.org/jira/browse/SPARK-37164
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 3.3.0
Reporter: Wenchen Fan






--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-37163) Disallow casting Date as Numeric types

2021-10-29 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-37163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17435952#comment-17435952
 ] 

Apache Spark commented on SPARK-37163:
--

User 'gengliangwang' has created a pull request for this issue:
https://github.com/apache/spark/pull/34440

> Disallow casting Date as Numeric types
> --
>
> Key: SPARK-37163
> URL: https://issues.apache.org/jira/browse/SPARK-37163
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.3.0
>Reporter: Gengliang Wang
>Assignee: Gengliang Wang
>Priority: Major
>
> Currently, Date type values can be cast to numeric types, but the result
> is always NULL.
> On the other hand, numeric values can't be cast to Date type.
> It doesn't make sense to keep a cast from Date to numeric types that always
> yields NULL, so I suggest disallowing the conversion. We can add a legacy flag
> `spark.sql.legacy.allowCastDateAsNumeric` if users really want to fall back
> to the legacy behavior.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-37163) Disallow casting Date as Numeric types

2021-10-29 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-37163:


Assignee: Gengliang Wang  (was: Apache Spark)

> Disallow casting Date as Numeric types
> --
>
> Key: SPARK-37163
> URL: https://issues.apache.org/jira/browse/SPARK-37163
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.3.0
>Reporter: Gengliang Wang
>Assignee: Gengliang Wang
>Priority: Major
>
> Currently, Date type values can be cast to numeric types, but the result
> is always NULL.
> On the other hand, numeric values can't be cast to Date type.
> It doesn't make sense to keep a cast from Date to numeric types that always
> yields NULL, so I suggest disallowing the conversion. We can add a legacy flag
> `spark.sql.legacy.allowCastDateAsNumeric` if users really want to fall back
> to the legacy behavior.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-37163) Disallow casting Date as Numeric types

2021-10-29 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-37163:


Assignee: Apache Spark  (was: Gengliang Wang)

> Disallow casting Date as Numeric types
> --
>
> Key: SPARK-37163
> URL: https://issues.apache.org/jira/browse/SPARK-37163
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.3.0
>Reporter: Gengliang Wang
>Assignee: Apache Spark
>Priority: Major
>
> Currently, Date type values can be cast to numeric types, but the result
> is always NULL.
> On the other hand, numeric values can't be cast to Date type.
> It doesn't make sense to keep a cast from Date to numeric types that always
> yields NULL, so I suggest disallowing the conversion. We can add a legacy flag
> `spark.sql.legacy.allowCastDateAsNumeric` if users really want to fall back
> to the legacy behavior.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-37163) Disallow casting Date as Numeric types

2021-10-29 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-37163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17435950#comment-17435950
 ] 

Apache Spark commented on SPARK-37163:
--

User 'gengliangwang' has created a pull request for this issue:
https://github.com/apache/spark/pull/34440

> Disallow casting Date as Numeric types
> --
>
> Key: SPARK-37163
> URL: https://issues.apache.org/jira/browse/SPARK-37163
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.3.0
>Reporter: Gengliang Wang
>Assignee: Gengliang Wang
>Priority: Major
>
> Currently, Date type values can be cast to numeric types, but the result
> is always NULL.
> On the other hand, numeric values can't be cast to Date type.
> It doesn't make sense to keep a cast from Date to numeric types that always
> yields NULL, so I suggest disallowing the conversion. We can add a legacy flag
> `spark.sql.legacy.allowCastDateAsNumeric` if users really want to fall back
> to the legacy behavior.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-37163) Disallow casting Date as Numeric types

2021-10-29 Thread Gengliang Wang (Jira)
Gengliang Wang created SPARK-37163:
--

 Summary: Disallow casting Date as Numeric types
 Key: SPARK-37163
 URL: https://issues.apache.org/jira/browse/SPARK-37163
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 3.3.0
Reporter: Gengliang Wang
Assignee: Gengliang Wang


Currently, Date type values can be cast to numeric types, but the result
is always NULL.
On the other hand, numeric values can't be cast to Date type.
It doesn't make sense to keep a cast from Date to numeric types that always
yields NULL, so I suggest disallowing the conversion. We can add a legacy flag
`spark.sql.legacy.allowCastDateAsNumeric` if users really want to fall back to
the legacy behavior.
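
A minimal illustration of the current behavior described above (spark-shell):
{code:java}
// Today this is accepted but the result is always NULL:
spark.sql("SELECT CAST(DATE'2021-10-29' AS INT)").show()

// whereas the reverse direction is already rejected:
// spark.sql("SELECT CAST(1 AS DATE)")   // numeric cannot be cast to Date
{code}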



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-36800) The create table as select statement verifies the valid column name

2021-10-29 Thread dohongdayi (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-36800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17435947#comment-17435947
 ] 

dohongdayi edited comment on SPARK-36800 at 10/29/21, 11:54 AM:


I guess the image might be similar to this:

!SparkIssue.png!


was (Author: dohongdayi):
The image might be like:

!SparkIssue.png!

> The  create table as select statement verifies the valid column name
> 
>
> Key: SPARK-36800
> URL: https://issues.apache.org/jira/browse/SPARK-36800
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.1.0
>Reporter: melin
>Priority: Trivial
> Attachments: SparkIssue.png
>
>
> If the column name produced by the SELECT is not a valid column name, the error
> message is not very clear; it is recommended to add a column name check.
> {code:java}
> create table tdl_demo_dd as select 1+1{code}
> !image-2021-09-19-17-25-02-239.png!
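
A hedged illustration of the point (the table name follows the snippet above; the alias is the usual workaround until a clearer check or error message exists):
{code:java}
// The auto-generated column name for the expression (something like "(1 + 1)")
// is not a valid column name for the target table, and the current error does
// not make that obvious. Aliasing the expression sidesteps the problem:
spark.sql("CREATE TABLE tdl_demo_dd AS SELECT 1 + 1 AS col1")
{code}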



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-36800) The create table as select statement verifies the valid column name

2021-10-29 Thread dohongdayi (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-36800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17435947#comment-17435947
 ] 

dohongdayi commented on SPARK-36800:


The image might be like:

!SparkIssue.png!

> The  create table as select statement verifies the valid column name
> 
>
> Key: SPARK-36800
> URL: https://issues.apache.org/jira/browse/SPARK-36800
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.1.0
>Reporter: melin
>Priority: Trivial
> Attachments: SparkIssue.png
>
>
> If the column name produced by the SELECT is not a valid column name, the error
> message is not very clear; it is recommended to add a column name check.
> {code:java}
> create table tdl_demo_dd as select 1+1{code}
> !image-2021-09-19-17-25-02-239.png!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-36800) The create table as select statement verifies the valid column name

2021-10-29 Thread dohongdayi (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-36800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dohongdayi updated SPARK-36800:
---
Attachment: SparkIssue.png

> The  create table as select statement verifies the valid column name
> 
>
> Key: SPARK-36800
> URL: https://issues.apache.org/jira/browse/SPARK-36800
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.1.0
>Reporter: melin
>Priority: Trivial
> Attachments: SparkIssue.png
>
>
> If the column name produced by the SELECT is not a valid column name, the error
> message is not very clear; it is recommended to add a column name check.
> {code:java}
> create table tdl_demo_dd as select 1+1{code}
> !image-2021-09-19-17-25-02-239.png!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-37162) Web UI tasks table column sort order is wrong

2021-10-29 Thread Lichuanliang (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lichuanliang updated SPARK-37162:
-
Attachment: spark-3.2.0-web-ui.png

> Web UI tasks table column sort order is wrong
> -
>
> Key: SPARK-37162
> URL: https://issues.apache.org/jira/browse/SPARK-37162
> Project: Spark
>  Issue Type: Bug
>  Components: Web UI
>Affects Versions: 3.2.0
>Reporter: Lichuanliang
>Priority: Minor
> Attachments: spark-3.2.0-web-ui.png
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-37162) Web UI tasks table column sort order is wrong

2021-10-29 Thread Lichuanliang (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lichuanliang updated SPARK-37162:
-
Description: (was: !image-2021-10-29-19-20-23-667.png!)

> Web UI tasks table column sort order is wrong
> -
>
> Key: SPARK-37162
> URL: https://issues.apache.org/jira/browse/SPARK-37162
> Project: Spark
>  Issue Type: Bug
>  Components: Web UI
>Affects Versions: 3.2.0
>Reporter: Lichuanliang
>Priority: Minor
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-37162) Web UI tasks table column sort order is wrong

2021-10-29 Thread Lichuanliang (Jira)
Lichuanliang created SPARK-37162:


 Summary: Web UI tasks table column sort order is wrong
 Key: SPARK-37162
 URL: https://issues.apache.org/jira/browse/SPARK-37162
 Project: Spark
  Issue Type: Bug
  Components: Web UI
Affects Versions: 3.2.0
Reporter: Lichuanliang


!image-2021-10-29-19-20-23-667.png!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-37095) Inline type hints for files in python/pyspark/broadcast.py

2021-10-29 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-37095:


Assignee: Apache Spark

> Inline type hints for files in python/pyspark/broadcast.py
> --
>
> Key: SPARK-37095
> URL: https://issues.apache.org/jira/browse/SPARK-37095
> Project: Spark
>  Issue Type: Sub-task
>  Components: PySpark
>Affects Versions: 3.3.0
>Reporter: dch nguyen
>Assignee: Apache Spark
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-37095) Inline type hints for files in python/pyspark/broadcast.py

2021-10-29 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-37095:


Assignee: (was: Apache Spark)

> Inline type hints for files in python/pyspark/broadcast.py
> --
>
> Key: SPARK-37095
> URL: https://issues.apache.org/jira/browse/SPARK-37095
> Project: Spark
>  Issue Type: Sub-task
>  Components: PySpark
>Affects Versions: 3.3.0
>Reporter: dch nguyen
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-37095) Inline type hints for files in python/pyspark/broadcast.py

2021-10-29 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-37095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17435917#comment-17435917
 ] 

Apache Spark commented on SPARK-37095:
--

User 'dchvn' has created a pull request for this issue:
https://github.com/apache/spark/pull/34439

> Inline type hints for files in python/pyspark/broadcast.py
> --
>
> Key: SPARK-37095
> URL: https://issues.apache.org/jira/browse/SPARK-37095
> Project: Spark
>  Issue Type: Sub-task
>  Components: PySpark
>Affects Versions: 3.3.0
>Reporter: dch nguyen
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-37095) Inline type hints for files in python/pyspark/broadcast.py

2021-10-29 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-37095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17435914#comment-17435914
 ] 

Apache Spark commented on SPARK-37095:
--

User 'dchvn' has created a pull request for this issue:
https://github.com/apache/spark/pull/34439

> Inline type hints for files in python/pyspark/broadcast.py
> --
>
> Key: SPARK-37095
> URL: https://issues.apache.org/jira/browse/SPARK-37095
> Project: Spark
>  Issue Type: Sub-task
>  Components: PySpark
>Affects Versions: 3.3.0
>Reporter: dch nguyen
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-37161) RowToColumnConverter support AnsiIntervalType

2021-10-29 Thread PengLei (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-37161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17435880#comment-17435880
 ] 

PengLei commented on SPARK-37161:
-

working on this

> RowToColumnConverter  support AnsiIntervalType
> --
>
> Key: SPARK-37161
> URL: https://issues.apache.org/jira/browse/SPARK-37161
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.3.0
>Reporter: PengLei
>Priority: Major
>
> Currently, we have a RowToColumnConverter for all data types except
> AnsiIntervalType.
> {code:java}
> // excerpt from RowToColumnConverter.getConverterForType
> val core = dataType match {
>   case BinaryType => BinaryConverter
>   case BooleanType => BooleanConverter
>   case ByteType => ByteConverter
>   case ShortType => ShortConverter
>   case IntegerType | DateType => IntConverter
>   case FloatType => FloatConverter
>   case LongType | TimestampType => LongConverter
>   case DoubleType => DoubleConverter
>   case StringType => StringConverter
>   case CalendarIntervalType => CalendarConverter
>   case at: ArrayType => ArrayConverter(getConverterForType(at.elementType, 
> at.containsNull))
>   case st: StructType => new StructConverter(st.fields.map(
> (f) => getConverterForType(f.dataType, f.nullable)))
>   case dt: DecimalType => new DecimalConverter(dt)
>   case mt: MapType => MapConverter(getConverterForType(mt.keyType, nullable = 
> false),
> getConverterForType(mt.valueType, mt.valueContainsNull))
>   case unknown => throw 
> QueryExecutionErrors.unsupportedDataTypeError(unknown.toString)
> }
> if (nullable) {
>   dataType match {
> case CalendarIntervalType => new StructNullableTypeConverter(core)
> case st: StructType => new StructNullableTypeConverter(core)
> case _ => new BasicNullableTypeConverter(core)
>   }
> } else {
>   core
> }
> {code}
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-37161) RowToColumnConverter support AnsiIntervalType

2021-10-29 Thread PengLei (Jira)
PengLei created SPARK-37161:
---

 Summary: RowToColumnConverter  support AnsiIntervalType
 Key: SPARK-37161
 URL: https://issues.apache.org/jira/browse/SPARK-37161
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 3.3.0
Reporter: PengLei


Currently, we have a RowToColumnConverter for all data types except
AnsiIntervalType.
{code:java}
// excerpt from RowToColumnConverter.getConverterForType
val core = dataType match {
  case BinaryType => BinaryConverter
  case BooleanType => BooleanConverter
  case ByteType => ByteConverter
  case ShortType => ShortConverter
  case IntegerType | DateType => IntConverter
  case FloatType => FloatConverter
  case LongType | TimestampType => LongConverter
  case DoubleType => DoubleConverter
  case StringType => StringConverter
  case CalendarIntervalType => CalendarConverter
  case at: ArrayType => ArrayConverter(getConverterForType(at.elementType, 
at.containsNull))
  case st: StructType => new StructConverter(st.fields.map(
(f) => getConverterForType(f.dataType, f.nullable)))
  case dt: DecimalType => new DecimalConverter(dt)
  case mt: MapType => MapConverter(getConverterForType(mt.keyType, nullable = 
false),
getConverterForType(mt.valueType, mt.valueContainsNull))
  case unknown => throw 
QueryExecutionErrors.unsupportedDataTypeError(unknown.toString)
}

if (nullable) {
  dataType match {
case CalendarIntervalType => new StructNullableTypeConverter(core)
case st: StructType => new StructNullableTypeConverter(core)
case _ => new BasicNullableTypeConverter(core)
  }
} else {
  core
}

{code}
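
A minimal sketch of how the missing cases could be wired in, assuming the ANSI interval types map onto their underlying primitive storage (year-month intervals as int, day-time intervals as long); the actual change is in the pull request referenced in the comments above, not this sketch:
{code:java}
val core = dataType match {
  // ... existing cases ...
  // ANSI interval types are backed by primitive values, so they can presumably
  // reuse the existing primitive converters:
  case IntegerType | DateType | _: YearMonthIntervalType => IntConverter
  case LongType | TimestampType | _: DayTimeIntervalType => LongConverter
  // ... remaining cases unchanged ...
}
{code}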
 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-37157) Inline type hints for python/pyspark/util.py

2021-10-29 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-37157:


Assignee: Apache Spark

> Inline type hints for python/pyspark/util.py
> 
>
> Key: SPARK-37157
> URL: https://issues.apache.org/jira/browse/SPARK-37157
> Project: Spark
>  Issue Type: Sub-task
>  Components: PySpark
>Affects Versions: 3.2.0
>Reporter: Byron Hsu
>Assignee: Apache Spark
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-37157) Inline type hints for python/pyspark/util.py

2021-10-29 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-37157:


Assignee: (was: Apache Spark)

> Inline type hints for python/pyspark/util.py
> 
>
> Key: SPARK-37157
> URL: https://issues.apache.org/jira/browse/SPARK-37157
> Project: Spark
>  Issue Type: Sub-task
>  Components: PySpark
>Affects Versions: 3.2.0
>Reporter: Byron Hsu
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-37157) Inline type hints for python/pyspark/util.py

2021-10-29 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-37157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17435876#comment-17435876
 ] 

Apache Spark commented on SPARK-37157:
--

User 'dchvn' has created a pull request for this issue:
https://github.com/apache/spark/pull/34438

> Inline type hints for python/pyspark/util.py
> 
>
> Key: SPARK-37157
> URL: https://issues.apache.org/jira/browse/SPARK-37157
> Project: Spark
>  Issue Type: Sub-task
>  Components: PySpark
>Affects Versions: 3.2.0
>Reporter: Byron Hsu
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-37156) Inline type hints for python/pyspark/storagelevel.py

2021-10-29 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-37156:


Assignee: (was: Apache Spark)

> Inline type hints for python/pyspark/storagelevel.py
> 
>
> Key: SPARK-37156
> URL: https://issues.apache.org/jira/browse/SPARK-37156
> Project: Spark
>  Issue Type: Sub-task
>  Components: PySpark
>Affects Versions: 3.2.0
>Reporter: Byron Hsu
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-37156) Inline type hints for python/pyspark/storagelevel.py

2021-10-29 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-37156:


Assignee: Apache Spark

> Inline type hints for python/pyspark/storagelevel.py
> 
>
> Key: SPARK-37156
> URL: https://issues.apache.org/jira/browse/SPARK-37156
> Project: Spark
>  Issue Type: Sub-task
>  Components: PySpark
>Affects Versions: 3.2.0
>Reporter: Byron Hsu
>Assignee: Apache Spark
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-37156) Inline type hints for python/pyspark/storagelevel.py

2021-10-29 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-37156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17435875#comment-17435875
 ] 

Apache Spark commented on SPARK-37156:
--

User 'dchvn' has created a pull request for this issue:
https://github.com/apache/spark/pull/34437

> Inline type hints for python/pyspark/storagelevel.py
> 
>
> Key: SPARK-37156
> URL: https://issues.apache.org/jira/browse/SPARK-37156
> Project: Spark
>  Issue Type: Sub-task
>  Components: PySpark
>Affects Versions: 3.2.0
>Reporter: Byron Hsu
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-36975) Refactor HiveClientImpl collect hive client call logic

2021-10-29 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-36975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-36975.
-
Fix Version/s: 3.3.0
   Resolution: Fixed

Issue resolved by pull request 34400
[https://github.com/apache/spark/pull/34400]

> Refactor HiveClientImpl collect hive client call logic
> --
>
> Key: SPARK-36975
> URL: https://issues.apache.org/jira/browse/SPARK-36975
> Project: Spark
>  Issue Type: Task
>  Components: SQL
>Affects Versions: 3.1.2, 3.2.0
>Reporter: angerszhu
>Assignee: angerszhu
>Priority: Major
> Fix For: 3.3.0
>
>
> Currently, we treat one withHiveState call as one Hive client call, which is too
> weird. It needs to be refactored.
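
For context, a toy sketch of the distinction the description points at, with all names hypothetical rather than HiveClientImpl internals: counting one wrapped block as a single client call versus counting each underlying metastore invocation.

{code:scala}
// Hypothetical illustration, not Spark code: contrast block-level accounting
// (one "client call" per withHiveState block) with per-RPC accounting.
object HiveCallCountingSketch {
  var blockCount = 0 // incremented once per wrapped block (the behaviour the issue calls weird)
  var rpcCount = 0   // incremented once per underlying metastore call

  def withHiveState[A](body: => A): A = { blockCount += 1; body }
  def metastoreRpc[A](body: => A): A = { rpcCount += 1; body }

  def main(args: Array[String]): Unit = {
    withHiveState {
      metastoreRpc(()) // e.g. a getTable lookup
      metastoreRpc(()) // e.g. a getPartitions lookup
    }
    println(s"blocks=$blockCount, rpcs=$rpcCount") // blocks=1, rpcs=2
  }
}
{code}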



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-36975) Refactor HiveClientImpl collect hive client call logic

2021-10-29 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-36975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan reassigned SPARK-36975:
---

Assignee: angerszhu

> Refactor HiveClientImpl collect hive client call logic
> --
>
> Key: SPARK-36975
> URL: https://issues.apache.org/jira/browse/SPARK-36975
> Project: Spark
>  Issue Type: Task
>  Components: SQL
>Affects Versions: 3.1.2, 3.2.0
>Reporter: angerszhu
>Assignee: angerszhu
>Priority: Major
>
> Currently, we treat one withHiveState call as one Hive client call, which is too
> weird. It needs to be refactored.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-37149) Improve error messages for arithmetic overflow under ANSI mode

2021-10-29 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-37149.
-
Fix Version/s: 3.3.0
   Resolution: Fixed

Issue resolved by pull request 34427
[https://github.com/apache/spark/pull/34427]

> Improve error messages for arithmetic overflow under ANSI mode
> --
>
> Key: SPARK-37149
> URL: https://issues.apache.org/jira/browse/SPARK-37149
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.3.0
>Reporter: Allison Wang
>Assignee: Allison Wang
>Priority: Major
> Fix For: 3.3.0
>
>
> Improve error messages for arithmetic overflow exceptions. We can instruct 
> users to 1) turn off ANSI mode or 2) use `try_` functions if applicable.
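
To make the two remedies concrete, a hedged sketch using only standard configs and the built-in try_add function (the exact wording of the improved error message is whatever the PR settles on):

{code:scala}
import org.apache.spark.sql.SparkSession

// Sketch: the two ways out of an ANSI-mode integer overflow that the improved
// message is meant to point users at.
object AnsiOverflowRemedies {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[1]").appName("ansi-overflow").getOrCreate()

    spark.conf.set("spark.sql.ansi.enabled", "true")
    // spark.sql("SELECT 2147483647 + 1").show()  // throws an arithmetic overflow error under ANSI mode

    // Remedy 1: turn off ANSI mode, falling back to silent wrap-around.
    spark.conf.set("spark.sql.ansi.enabled", "false")
    spark.sql("SELECT 2147483647 + 1 AS wrapped").show()

    // Remedy 2: keep ANSI mode but use a try_ function, which returns NULL on overflow.
    spark.conf.set("spark.sql.ansi.enabled", "true")
    spark.sql("SELECT try_add(2147483647, 1) AS safe").show()

    spark.stop()
  }
}
{code}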



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-37149) Improve error messages for arithmetic overflow under ANSI mode

2021-10-29 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan reassigned SPARK-37149:
---

Assignee: Allison Wang

> Improve error messages for arithmetic overflow under ANSI mode
> --
>
> Key: SPARK-37149
> URL: https://issues.apache.org/jira/browse/SPARK-37149
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.3.0
>Reporter: Allison Wang
>Assignee: Allison Wang
>Priority: Major
>
> Improve error messages for arithmetic overflow exceptions. We can instruct 
> users to 1) turn off ANSI mode or 2) use `try_` functions if applicable.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-37160) Add a config to optionally disable padding for char type

2021-10-29 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-37160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17435851#comment-17435851
 ] 

Apache Spark commented on SPARK-37160:
--

User 'cloud-fan' has created a pull request for this issue:
https://github.com/apache/spark/pull/34436

> Add a config to optionally disable padding for char type
> ---
>
> Key: SPARK-37160
> URL: https://issues.apache.org/jira/browse/SPARK-37160
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.3.0
>Reporter: Wenchen Fan
>Assignee: Wenchen Fan
>Priority: Major
>
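
The ticket has no description, so for context here is a short sketch of the padding behaviour the proposed config would make optional; the new config's name is not stated in this thread, so it is deliberately left out.

{code:scala}
import org.apache.spark.sql.SparkSession

// Sketch of the behaviour in question: values inserted into a CHAR(n) column are
// blank-padded to length n, which the proposed config would let users turn off.
object CharPaddingDemo {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[1]").appName("char-padding").getOrCreate()

    spark.sql("CREATE TABLE char_demo (c CHAR(5)) USING parquet")
    spark.sql("INSERT INTO char_demo VALUES ('ab')")
    // length(c) is 5, not 2, because 'ab' is padded with trailing spaces to fit CHAR(5).
    spark.sql("SELECT c, length(c) FROM char_demo").show()

    spark.stop()
  }
}
{code}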




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-37160) Add a config to optionally disable padding for char type

2021-10-29 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-37160:


Assignee: Wenchen Fan  (was: Apache Spark)

> Add a config to optionally disable padding for char type
> ---
>
> Key: SPARK-37160
> URL: https://issues.apache.org/jira/browse/SPARK-37160
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.3.0
>Reporter: Wenchen Fan
>Assignee: Wenchen Fan
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-37160) Add a config to optionally disable padding for char type

2021-10-29 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-37160:


Assignee: Apache Spark  (was: Wenchen Fan)

> Add a config to optionally disable padding for char type
> ---
>
> Key: SPARK-37160
> URL: https://issues.apache.org/jira/browse/SPARK-37160
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.3.0
>Reporter: Wenchen Fan
>Assignee: Apache Spark
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-37160) Add a config to optionally disable padding for char type

2021-10-29 Thread Wenchen Fan (Jira)
Wenchen Fan created SPARK-37160:
---

 Summary: Add a config to optionally disable padding for char type
 Key: SPARK-37160
 URL: https://issues.apache.org/jira/browse/SPARK-37160
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 3.3.0
Reporter: Wenchen Fan
Assignee: Wenchen Fan






--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-37159) Change HiveExternalCatalogVersionsSuite to be able to test with Java 17

2021-10-29 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-37159:


Assignee: Apache Spark  (was: Kousuke Saruta)

> Change HiveExternalCatalogVersionsSuite to be able to test with Java 17
> ---
>
> Key: SPARK-37159
> URL: https://issues.apache.org/jira/browse/SPARK-37159
> Project: Spark
>  Issue Type: Bug
>  Components: SQL, Tests
>Affects Versions: 3.3.0
>Reporter: Kousuke Saruta
>Assignee: Apache Spark
>Priority: Minor
>
> SPARK-37105 seems to have fixed most of the tests in `sql/hive` for Java 17, except
> `HiveExternalCatalogVersionsSuite`.
> {code}
> [info] org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite *** ABORTED 
> *** (42 seconds, 526 milliseconds)
> [info]   spark-submit returned with exit code 1.
> [info]   Command line: 
> '/home/kou/work/oss/spark-java17/sql/hive/target/tmp/org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite/test-spark-d86af275-0c40-4b47-9cab-defa92a5ffa7/spark-3.2.0/bin/spark-submit'
>  '--name' 'prepare testing tables' '--master' 'local[2]' '--conf' 
> 'spark.ui.enabled=false' '--conf' 'spark.master.rest.enabled=false' '--conf' 
> 'spark.sql.hive.metastore.version=2.3' '--conf' 
> 'spark.sql.hive.metastore.jars=maven' '--conf' 
> 'spark.sql.warehouse.dir=/home/kou/work/oss/spark-java17/sql/hive/target/tmp/org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite/warehouse-69d9bdbc-54ce-443b-8677-a413663ddb62'
>  '--conf' 'spark.sql.test.version.index=0' '--driver-java-options' 
> '-Dderby.system.home=/home/kou/work/oss/spark-java17/sql/hive/target/tmp/org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite/warehouse-69d9bdbc-54ce-443b-8677-a413663ddb62'
>  
> '/home/kou/work/oss/spark-java17/sql/hive/target/tmp/org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite/test15166225869206697603.py'
> [info]   
> [info]   2021-10-28 06:07:18.486 - stderr> Using Spark's default log4j 
> profile: org/apache/spark/log4j-defaults.properties
> [info]   2021-10-28 06:07:18.49 - stderr> 21/10/28 22:07:18 INFO 
> SparkContext: Running Spark version 3.2.0
> [info]   2021-10-28 06:07:18.537 - stderr> 21/10/28 22:07:18 WARN 
> NativeCodeLoader: Unable to load native-hadoop library for your platform... 
> using builtin-java classes where applicable
> [info]   2021-10-28 06:07:18.616 - stderr> 21/10/28 22:07:18 INFO 
> ResourceUtils: ==
> [info]   2021-10-28 06:07:18.616 - stderr> 21/10/28 22:07:18 INFO 
> ResourceUtils: No custom resources configured for spark.driver.
> [info]   2021-10-28 06:07:18.616 - stderr> 21/10/28 22:07:18 INFO 
> ResourceUtils: ==
> [info]   2021-10-28 06:07:18.617 - stderr> 21/10/28 22:07:18 INFO 
> SparkContext: Submitted application: prepare testing tables
> [info]   2021-10-28 06:07:18.632 - stderr> 21/10/28 22:07:18 INFO 
> ResourceProfile: Default ResourceProfile created, executor resources: 
> Map(cores -> name: cores, amount: 1, script: , vendor: , memory -> name: 
> memory, amount: 1024, script: , vendor: , offHeap -> name: offHeap, amount: 
> 0, script: , vendor: ), task resources: Map(cpus -> name: cpus, amount: 1.0)
> [info]   2021-10-28 06:07:18.641 - stderr> 21/10/28 22:07:18 INFO 
> ResourceProfile: Limiting resource is cpu
> [info]   2021-10-28 06:07:18.641 - stderr> 21/10/28 22:07:18 INFO 
> ResourceProfileManager: Added ResourceProfile id: 0
> [info]   2021-10-28 06:07:18.679 - stderr> 21/10/28 22:07:18 INFO 
> SecurityManager: Changing view acls to: kou
> [info]   2021-10-28 06:07:18.679 - stderr> 21/10/28 22:07:18 INFO 
> SecurityManager: Changing modify acls to: kou
> [info]   2021-10-28 06:07:18.68 - stderr> 21/10/28 22:07:18 INFO 
> SecurityManager: Changing view acls groups to: 
> [info]   2021-10-28 06:07:18.68 - stderr> 21/10/28 22:07:18 INFO 
> SecurityManager: Changing modify acls groups to: 
> [info]   2021-10-28 06:07:18.68 - stderr> 21/10/28 22:07:18 INFO 
> SecurityManager: SecurityManager: authentication disabled; ui acls disabled; 
> users  with view permissions: Set(kou); groups with view permissions: Set(); 
> users  with modify permissions: Set(kou); groups with modify permissions: 
> Set()
> [info]   2021-10-28 06:07:18.886 - stderr> 21/10/28 22:07:18 INFO Utils: 
> Successfully started service 'sparkDriver' on port 35867.
> [info]   2021-10-28 06:07:18.906 - stderr> 21/10/28 22:07:18 INFO SparkEnv: 
> Registering MapOutputTracker
> [info]   2021-10-28 06:07:18.93 - stderr> 21/10/28 22:07:18 INFO SparkEnv: 
> Registering BlockManagerMaster
> [info]   2021-10-28 06:07:18.943 - stderr> 21/10/28 22:07:18 INFO 
> BlockManagerMasterEndpoint: Using 
> 

[jira] [Assigned] (SPARK-37159) Change HiveExternalCatalogVersionsSuite to be able to test with Java 17

2021-10-29 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-37159:


Assignee: Kousuke Saruta  (was: Apache Spark)

> Change HiveExternalCatalogVersionsSuite to be able to test with Java 17
> ---
>
> Key: SPARK-37159
> URL: https://issues.apache.org/jira/browse/SPARK-37159
> Project: Spark
>  Issue Type: Bug
>  Components: SQL, Tests
>Affects Versions: 3.3.0
>Reporter: Kousuke Saruta
>Assignee: Kousuke Saruta
>Priority: Minor
>
> SPARK-37105 seems to have fixed most of the tests in `sql/hive` for Java 17, except
> `HiveExternalCatalogVersionsSuite`.
> {code}
> [info] org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite *** ABORTED 
> *** (42 seconds, 526 milliseconds)
> [info]   spark-submit returned with exit code 1.
> [info]   Command line: 
> '/home/kou/work/oss/spark-java17/sql/hive/target/tmp/org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite/test-spark-d86af275-0c40-4b47-9cab-defa92a5ffa7/spark-3.2.0/bin/spark-submit'
>  '--name' 'prepare testing tables' '--master' 'local[2]' '--conf' 
> 'spark.ui.enabled=false' '--conf' 'spark.master.rest.enabled=false' '--conf' 
> 'spark.sql.hive.metastore.version=2.3' '--conf' 
> 'spark.sql.hive.metastore.jars=maven' '--conf' 
> 'spark.sql.warehouse.dir=/home/kou/work/oss/spark-java17/sql/hive/target/tmp/org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite/warehouse-69d9bdbc-54ce-443b-8677-a413663ddb62'
>  '--conf' 'spark.sql.test.version.index=0' '--driver-java-options' 
> '-Dderby.system.home=/home/kou/work/oss/spark-java17/sql/hive/target/tmp/org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite/warehouse-69d9bdbc-54ce-443b-8677-a413663ddb62'
>  
> '/home/kou/work/oss/spark-java17/sql/hive/target/tmp/org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite/test15166225869206697603.py'
> [info]   
> [info]   2021-10-28 06:07:18.486 - stderr> Using Spark's default log4j 
> profile: org/apache/spark/log4j-defaults.properties
> [info]   2021-10-28 06:07:18.49 - stderr> 21/10/28 22:07:18 INFO 
> SparkContext: Running Spark version 3.2.0
> [info]   2021-10-28 06:07:18.537 - stderr> 21/10/28 22:07:18 WARN 
> NativeCodeLoader: Unable to load native-hadoop library for your platform... 
> using builtin-java classes where applicable
> [info]   2021-10-28 06:07:18.616 - stderr> 21/10/28 22:07:18 INFO 
> ResourceUtils: ==
> [info]   2021-10-28 06:07:18.616 - stderr> 21/10/28 22:07:18 INFO 
> ResourceUtils: No custom resources configured for spark.driver.
> [info]   2021-10-28 06:07:18.616 - stderr> 21/10/28 22:07:18 INFO 
> ResourceUtils: ==
> [info]   2021-10-28 06:07:18.617 - stderr> 21/10/28 22:07:18 INFO 
> SparkContext: Submitted application: prepare testing tables
> [info]   2021-10-28 06:07:18.632 - stderr> 21/10/28 22:07:18 INFO 
> ResourceProfile: Default ResourceProfile created, executor resources: 
> Map(cores -> name: cores, amount: 1, script: , vendor: , memory -> name: 
> memory, amount: 1024, script: , vendor: , offHeap -> name: offHeap, amount: 
> 0, script: , vendor: ), task resources: Map(cpus -> name: cpus, amount: 1.0)
> [info]   2021-10-28 06:07:18.641 - stderr> 21/10/28 22:07:18 INFO 
> ResourceProfile: Limiting resource is cpu
> [info]   2021-10-28 06:07:18.641 - stderr> 21/10/28 22:07:18 INFO 
> ResourceProfileManager: Added ResourceProfile id: 0
> [info]   2021-10-28 06:07:18.679 - stderr> 21/10/28 22:07:18 INFO 
> SecurityManager: Changing view acls to: kou
> [info]   2021-10-28 06:07:18.679 - stderr> 21/10/28 22:07:18 INFO 
> SecurityManager: Changing modify acls to: kou
> [info]   2021-10-28 06:07:18.68 - stderr> 21/10/28 22:07:18 INFO 
> SecurityManager: Changing view acls groups to: 
> [info]   2021-10-28 06:07:18.68 - stderr> 21/10/28 22:07:18 INFO 
> SecurityManager: Changing modify acls groups to: 
> [info]   2021-10-28 06:07:18.68 - stderr> 21/10/28 22:07:18 INFO 
> SecurityManager: SecurityManager: authentication disabled; ui acls disabled; 
> users  with view permissions: Set(kou); groups with view permissions: Set(); 
> users  with modify permissions: Set(kou); groups with modify permissions: 
> Set()
> [info]   2021-10-28 06:07:18.886 - stderr> 21/10/28 22:07:18 INFO Utils: 
> Successfully started service 'sparkDriver' on port 35867.
> [info]   2021-10-28 06:07:18.906 - stderr> 21/10/28 22:07:18 INFO SparkEnv: 
> Registering MapOutputTracker
> [info]   2021-10-28 06:07:18.93 - stderr> 21/10/28 22:07:18 INFO SparkEnv: 
> Registering BlockManagerMaster
> [info]   2021-10-28 06:07:18.943 - stderr> 21/10/28 22:07:18 INFO 
> BlockManagerMasterEndpoint: Using 
> 

[jira] [Commented] (SPARK-37159) Change HiveExternalCatalogVersionsSuite to be able to test with Java 17

2021-10-29 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-37159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17435819#comment-17435819
 ] 

Apache Spark commented on SPARK-37159:
--

User 'sarutak' has created a pull request for this issue:
https://github.com/apache/spark/pull/34425

> Change HiveExternalCatalogVersionsSuite to be able to test with Java 17
> ---
>
> Key: SPARK-37159
> URL: https://issues.apache.org/jira/browse/SPARK-37159
> Project: Spark
>  Issue Type: Bug
>  Components: SQL, Tests
>Affects Versions: 3.3.0
>Reporter: Kousuke Saruta
>Assignee: Kousuke Saruta
>Priority: Minor
>
> SPARK-37105 seems to have fixed most of the tests in `sql/hive` for Java 17, except
> `HiveExternalCatalogVersionsSuite`.
> {code}
> [info] org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite *** ABORTED 
> *** (42 seconds, 526 milliseconds)
> [info]   spark-submit returned with exit code 1.
> [info]   Command line: 
> '/home/kou/work/oss/spark-java17/sql/hive/target/tmp/org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite/test-spark-d86af275-0c40-4b47-9cab-defa92a5ffa7/spark-3.2.0/bin/spark-submit'
>  '--name' 'prepare testing tables' '--master' 'local[2]' '--conf' 
> 'spark.ui.enabled=false' '--conf' 'spark.master.rest.enabled=false' '--conf' 
> 'spark.sql.hive.metastore.version=2.3' '--conf' 
> 'spark.sql.hive.metastore.jars=maven' '--conf' 
> 'spark.sql.warehouse.dir=/home/kou/work/oss/spark-java17/sql/hive/target/tmp/org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite/warehouse-69d9bdbc-54ce-443b-8677-a413663ddb62'
>  '--conf' 'spark.sql.test.version.index=0' '--driver-java-options' 
> '-Dderby.system.home=/home/kou/work/oss/spark-java17/sql/hive/target/tmp/org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite/warehouse-69d9bdbc-54ce-443b-8677-a413663ddb62'
>  
> '/home/kou/work/oss/spark-java17/sql/hive/target/tmp/org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite/test15166225869206697603.py'
> [info]   
> [info]   2021-10-28 06:07:18.486 - stderr> Using Spark's default log4j 
> profile: org/apache/spark/log4j-defaults.properties
> [info]   2021-10-28 06:07:18.49 - stderr> 21/10/28 22:07:18 INFO 
> SparkContext: Running Spark version 3.2.0
> [info]   2021-10-28 06:07:18.537 - stderr> 21/10/28 22:07:18 WARN 
> NativeCodeLoader: Unable to load native-hadoop library for your platform... 
> using builtin-java classes where applicable
> [info]   2021-10-28 06:07:18.616 - stderr> 21/10/28 22:07:18 INFO 
> ResourceUtils: ==
> [info]   2021-10-28 06:07:18.616 - stderr> 21/10/28 22:07:18 INFO 
> ResourceUtils: No custom resources configured for spark.driver.
> [info]   2021-10-28 06:07:18.616 - stderr> 21/10/28 22:07:18 INFO 
> ResourceUtils: ==
> [info]   2021-10-28 06:07:18.617 - stderr> 21/10/28 22:07:18 INFO 
> SparkContext: Submitted application: prepare testing tables
> [info]   2021-10-28 06:07:18.632 - stderr> 21/10/28 22:07:18 INFO 
> ResourceProfile: Default ResourceProfile created, executor resources: 
> Map(cores -> name: cores, amount: 1, script: , vendor: , memory -> name: 
> memory, amount: 1024, script: , vendor: , offHeap -> name: offHeap, amount: 
> 0, script: , vendor: ), task resources: Map(cpus -> name: cpus, amount: 1.0)
> [info]   2021-10-28 06:07:18.641 - stderr> 21/10/28 22:07:18 INFO 
> ResourceProfile: Limiting resource is cpu
> [info]   2021-10-28 06:07:18.641 - stderr> 21/10/28 22:07:18 INFO 
> ResourceProfileManager: Added ResourceProfile id: 0
> [info]   2021-10-28 06:07:18.679 - stderr> 21/10/28 22:07:18 INFO 
> SecurityManager: Changing view acls to: kou
> [info]   2021-10-28 06:07:18.679 - stderr> 21/10/28 22:07:18 INFO 
> SecurityManager: Changing modify acls to: kou
> [info]   2021-10-28 06:07:18.68 - stderr> 21/10/28 22:07:18 INFO 
> SecurityManager: Changing view acls groups to: 
> [info]   2021-10-28 06:07:18.68 - stderr> 21/10/28 22:07:18 INFO 
> SecurityManager: Changing modify acls groups to: 
> [info]   2021-10-28 06:07:18.68 - stderr> 21/10/28 22:07:18 INFO 
> SecurityManager: SecurityManager: authentication disabled; ui acls disabled; 
> users  with view permissions: Set(kou); groups with view permissions: Set(); 
> users  with modify permissions: Set(kou); groups with modify permissions: 
> Set()
> [info]   2021-10-28 06:07:18.886 - stderr> 21/10/28 22:07:18 INFO Utils: 
> Successfully started service 'sparkDriver' on port 35867.
> [info]   2021-10-28 06:07:18.906 - stderr> 21/10/28 22:07:18 INFO SparkEnv: 
> Registering MapOutputTracker
> [info]   2021-10-28 06:07:18.93 - stderr> 21/10/28 22:07:18 INFO SparkEnv: 
> Registering BlockManagerMaster
> [info]   2021-10-28 06:07:18.943 - stderr> 21/10/28 22:07:18 

[jira] [Commented] (SPARK-37094) Inline type hints for files in python/pyspark

2021-10-29 Thread Byron Hsu (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-37094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17435818#comment-17435818
 ] 

Byron Hsu commented on SPARK-37094:
---

[~dchvn] Sure, feel free to take it!

> Inline type hints for files in python/pyspark
> -
>
> Key: SPARK-37094
> URL: https://issues.apache.org/jira/browse/SPARK-37094
> Project: Spark
>  Issue Type: Umbrella
>  Components: PySpark
>Affects Versions: 3.3.0
>Reporter: dch nguyen
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-37159) Change HiveExternalCatalogVersionsSuite to be able to test with Java 17

2021-10-29 Thread Kousuke Saruta (Jira)
Kousuke Saruta created SPARK-37159:
--

 Summary: Change HiveExternalCatalogVersionsSuite to be able to 
test with Java 17
 Key: SPARK-37159
 URL: https://issues.apache.org/jira/browse/SPARK-37159
 Project: Spark
  Issue Type: Bug
  Components: SQL, Tests
Affects Versions: 3.3.0
Reporter: Kousuke Saruta
Assignee: Kousuke Saruta


SPARK-37105 seems to have fixed most of the tests in `sql/hive` for Java 17, except
`HiveExternalCatalogVersionsSuite`.

{code}
[info] org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite *** ABORTED 
*** (42 seconds, 526 milliseconds)
[info]   spark-submit returned with exit code 1.
[info]   Command line: 
'/home/kou/work/oss/spark-java17/sql/hive/target/tmp/org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite/test-spark-d86af275-0c40-4b47-9cab-defa92a5ffa7/spark-3.2.0/bin/spark-submit'
 '--name' 'prepare testing tables' '--master' 'local[2]' '--conf' 
'spark.ui.enabled=false' '--conf' 'spark.master.rest.enabled=false' '--conf' 
'spark.sql.hive.metastore.version=2.3' '--conf' 
'spark.sql.hive.metastore.jars=maven' '--conf' 
'spark.sql.warehouse.dir=/home/kou/work/oss/spark-java17/sql/hive/target/tmp/org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite/warehouse-69d9bdbc-54ce-443b-8677-a413663ddb62'
 '--conf' 'spark.sql.test.version.index=0' '--driver-java-options' 
'-Dderby.system.home=/home/kou/work/oss/spark-java17/sql/hive/target/tmp/org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite/warehouse-69d9bdbc-54ce-443b-8677-a413663ddb62'
 
'/home/kou/work/oss/spark-java17/sql/hive/target/tmp/org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite/test15166225869206697603.py'
[info]   
[info]   2021-10-28 06:07:18.486 - stderr> Using Spark's default log4j profile: 
org/apache/spark/log4j-defaults.properties
[info]   2021-10-28 06:07:18.49 - stderr> 21/10/28 22:07:18 INFO SparkContext: 
Running Spark version 3.2.0
[info]   2021-10-28 06:07:18.537 - stderr> 21/10/28 22:07:18 WARN 
NativeCodeLoader: Unable to load native-hadoop library for your platform... 
using builtin-java classes where applicable
[info]   2021-10-28 06:07:18.616 - stderr> 21/10/28 22:07:18 INFO 
ResourceUtils: ==
[info]   2021-10-28 06:07:18.616 - stderr> 21/10/28 22:07:18 INFO 
ResourceUtils: No custom resources configured for spark.driver.
[info]   2021-10-28 06:07:18.616 - stderr> 21/10/28 22:07:18 INFO 
ResourceUtils: ==
[info]   2021-10-28 06:07:18.617 - stderr> 21/10/28 22:07:18 INFO SparkContext: 
Submitted application: prepare testing tables
[info]   2021-10-28 06:07:18.632 - stderr> 21/10/28 22:07:18 INFO 
ResourceProfile: Default ResourceProfile created, executor resources: Map(cores 
-> name: cores, amount: 1, script: , vendor: , memory -> name: memory, amount: 
1024, script: , vendor: , offHeap -> name: offHeap, amount: 0, script: , 
vendor: ), task resources: Map(cpus -> name: cpus, amount: 1.0)
[info]   2021-10-28 06:07:18.641 - stderr> 21/10/28 22:07:18 INFO 
ResourceProfile: Limiting resource is cpu
[info]   2021-10-28 06:07:18.641 - stderr> 21/10/28 22:07:18 INFO 
ResourceProfileManager: Added ResourceProfile id: 0
[info]   2021-10-28 06:07:18.679 - stderr> 21/10/28 22:07:18 INFO 
SecurityManager: Changing view acls to: kou
[info]   2021-10-28 06:07:18.679 - stderr> 21/10/28 22:07:18 INFO 
SecurityManager: Changing modify acls to: kou
[info]   2021-10-28 06:07:18.68 - stderr> 21/10/28 22:07:18 INFO 
SecurityManager: Changing view acls groups to: 
[info]   2021-10-28 06:07:18.68 - stderr> 21/10/28 22:07:18 INFO 
SecurityManager: Changing modify acls groups to: 
[info]   2021-10-28 06:07:18.68 - stderr> 21/10/28 22:07:18 INFO 
SecurityManager: SecurityManager: authentication disabled; ui acls disabled; 
users  with view permissions: Set(kou); groups with view permissions: Set(); 
users  with modify permissions: Set(kou); groups with modify permissions: Set()
[info]   2021-10-28 06:07:18.886 - stderr> 21/10/28 22:07:18 INFO Utils: 
Successfully started service 'sparkDriver' on port 35867.
[info]   2021-10-28 06:07:18.906 - stderr> 21/10/28 22:07:18 INFO SparkEnv: 
Registering MapOutputTracker
[info]   2021-10-28 06:07:18.93 - stderr> 21/10/28 22:07:18 INFO SparkEnv: 
Registering BlockManagerMaster
[info]   2021-10-28 06:07:18.943 - stderr> 21/10/28 22:07:18 INFO 
BlockManagerMasterEndpoint: Using 
org.apache.spark.storage.DefaultTopologyMapper for getting topology information
[info]   2021-10-28 06:07:18.944 - stderr> 21/10/28 22:07:18 INFO 
BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
[info]   2021-10-28 06:07:18.945 - stdout> Traceback (most recent call last):
[info]   2021-10-28 06:07:18.946 - stdout>   File 

[jira] [Assigned] (SPARK-37155) Inline type hints for python/pyspark/statcounter.py

2021-10-29 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-37155:


Assignee: (was: Apache Spark)

> Inline type hints for python/pyspark/statcounter.py
> ---
>
> Key: SPARK-37155
> URL: https://issues.apache.org/jira/browse/SPARK-37155
> Project: Spark
>  Issue Type: Sub-task
>  Components: PySpark
>Affects Versions: 3.2.0
>Reporter: Byron Hsu
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-37155) Inline type hints for python/pyspark/statcounter.py

2021-10-29 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-37155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17435815#comment-17435815
 ] 

Apache Spark commented on SPARK-37155:
--

User 'dchvn' has created a pull request for this issue:
https://github.com/apache/spark/pull/34435

> Inline type hints for python/pyspark/statcounter.py
> ---
>
> Key: SPARK-37155
> URL: https://issues.apache.org/jira/browse/SPARK-37155
> Project: Spark
>  Issue Type: Sub-task
>  Components: PySpark
>Affects Versions: 3.2.0
>Reporter: Byron Hsu
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-37155) Inline type hints for python/pyspark/statcounter.py

2021-10-29 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-37155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17435816#comment-17435816
 ] 

Apache Spark commented on SPARK-37155:
--

User 'dchvn' has created a pull request for this issue:
https://github.com/apache/spark/pull/34435

> Inline type hints for python/pyspark/statcounter.py
> ---
>
> Key: SPARK-37155
> URL: https://issues.apache.org/jira/browse/SPARK-37155
> Project: Spark
>  Issue Type: Sub-task
>  Components: PySpark
>Affects Versions: 3.2.0
>Reporter: Byron Hsu
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-37155) Inline type hints for python/pyspark/statcounter.py

2021-10-29 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-37155:


Assignee: Apache Spark

> Inline type hints for python/pyspark/statcounter.py
> ---
>
> Key: SPARK-37155
> URL: https://issues.apache.org/jira/browse/SPARK-37155
> Project: Spark
>  Issue Type: Sub-task
>  Components: PySpark
>Affects Versions: 3.2.0
>Reporter: Byron Hsu
>Assignee: Apache Spark
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-37094) Inline type hints for files in python/pyspark

2021-10-29 Thread dch nguyen (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-37094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17435810#comment-17435810
 ] 

dch nguyen commented on SPARK-37094:


[~ByronHsu] I worked on some of these issues (statcounter, storagelevel and util)
last week but haven't created the PRs yet; I will open them soon. Sorry for the delay!

 

> Inline type hints for files in python/pyspark
> -
>
> Key: SPARK-37094
> URL: https://issues.apache.org/jira/browse/SPARK-37094
> Project: Spark
>  Issue Type: Umbrella
>  Components: PySpark
>Affects Versions: 3.3.0
>Reporter: dch nguyen
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-37158) Add doc about spark not supported hive built-in function

2021-10-29 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-37158:


Assignee: (was: Apache Spark)

> Add doc about spark not supported hive built-in function
> 
>
> Key: SPARK-37158
> URL: https://issues.apache.org/jira/browse/SPARK-37158
> Project: Spark
>  Issue Type: Improvement
>  Components: docs
>Affects Versions: 3.2.0
>Reporter: angerszhu
>Priority: Major
>
> Add documentation about Hive built-in functions that Spark does not support.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-37158) Add doc about spark not supported hive built-in function

2021-10-29 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-37158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17435803#comment-17435803
 ] 

Apache Spark commented on SPARK-37158:
--

User 'AngersZh' has created a pull request for this issue:
https://github.com/apache/spark/pull/34434

> Add doc about spark not supported hive built-in function
> 
>
> Key: SPARK-37158
> URL: https://issues.apache.org/jira/browse/SPARK-37158
> Project: Spark
>  Issue Type: Improvement
>  Components: docs
>Affects Versions: 3.2.0
>Reporter: angerszhu
>Priority: Major
>
> Add documentation about Hive built-in functions that Spark does not support.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-37158) Add doc about spark not supported hive built-in function

2021-10-29 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-37158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17435802#comment-17435802
 ] 

Apache Spark commented on SPARK-37158:
--

User 'AngersZh' has created a pull request for this issue:
https://github.com/apache/spark/pull/34434

> Add doc about spark not supported hive built-in function
> 
>
> Key: SPARK-37158
> URL: https://issues.apache.org/jira/browse/SPARK-37158
> Project: Spark
>  Issue Type: Improvement
>  Components: docs
>Affects Versions: 3.2.0
>Reporter: angerszhu
>Priority: Major
>
> Add documentation about Hive built-in functions that Spark does not support.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-37158) Add doc about spark not supported hive built-in function

2021-10-29 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-37158:


Assignee: Apache Spark

> Add doc about spark not supported hive built-in function
> 
>
> Key: SPARK-37158
> URL: https://issues.apache.org/jira/browse/SPARK-37158
> Project: Spark
>  Issue Type: Improvement
>  Components: docs
>Affects Versions: 3.2.0
>Reporter: angerszhu
>Assignee: Apache Spark
>Priority: Major
>
> Add documentation about Hive built-in functions that Spark does not support.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-37158) Add doc about spark not supported hive built-in function

2021-10-29 Thread angerszhu (Jira)
angerszhu created SPARK-37158:
-

 Summary: Add doc about spark not supported hive built-in function
 Key: SPARK-37158
 URL: https://issues.apache.org/jira/browse/SPARK-37158
 Project: Spark
  Issue Type: Improvement
  Components: docs
Affects Versions: 3.2.0
Reporter: angerszhu


Add documentation about Hive built-in functions that Spark does not support.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-37146) Inline type hints for python/pyspark/__init__.py

2021-10-29 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-37146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17435788#comment-17435788
 ] 

Apache Spark commented on SPARK-37146:
--

User 'dchvn' has created a pull request for this issue:
https://github.com/apache/spark/pull/34433

> Inline type hints for python/pyspark/__init__.py
> 
>
> Key: SPARK-37146
> URL: https://issues.apache.org/jira/browse/SPARK-37146
> Project: Spark
>  Issue Type: Sub-task
>  Components: PySpark
>Affects Versions: 3.3.0
>Reporter: dch nguyen
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-37146) Inline type hints for python/pyspark/__init__.py

2021-10-29 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-37146:


Assignee: (was: Apache Spark)

> Inline type hints for python/pyspark/__init__.py
> 
>
> Key: SPARK-37146
> URL: https://issues.apache.org/jira/browse/SPARK-37146
> Project: Spark
>  Issue Type: Sub-task
>  Components: PySpark
>Affects Versions: 3.3.0
>Reporter: dch nguyen
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-37146) Inline type hints for python/pyspark/__init__.py

2021-10-29 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-37146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17435787#comment-17435787
 ] 

Apache Spark commented on SPARK-37146:
--

User 'dchvn' has created a pull request for this issue:
https://github.com/apache/spark/pull/34433

> Inline type hints for python/pyspark/__init__.py
> 
>
> Key: SPARK-37146
> URL: https://issues.apache.org/jira/browse/SPARK-37146
> Project: Spark
>  Issue Type: Sub-task
>  Components: PySpark
>Affects Versions: 3.3.0
>Reporter: dch nguyen
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-37146) Inline type hints for python/pyspark/__init__.py

2021-10-29 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-37146:


Assignee: Apache Spark

> Inline type hints for python/pyspark/__init__.py
> 
>
> Key: SPARK-37146
> URL: https://issues.apache.org/jira/browse/SPARK-37146
> Project: Spark
>  Issue Type: Sub-task
>  Components: PySpark
>Affects Versions: 3.3.0
>Reporter: dch nguyen
>Assignee: Apache Spark
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org