[jira] [Commented] (SPARK-32002) spark error while select nest data

2020-11-22 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-32002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17237174#comment-17237174
 ] 

Apache Spark commented on SPARK-32002:
--

User 'guiyanakuang' has created a pull request for this issue:
https://github.com/apache/spark/pull/30467

> spark error while select nest data
> --
>
> Key: SPARK-32002
> URL: https://issues.apache.org/jira/browse/SPARK-32002
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.4.4
>Reporter: Yiqun Zhang
>Priority: Major
>
> nest-data.json
> {code:java}
> {"a": [{"b": [{"c": [1,2]}]}]}
> {"a": [{"b": [{"c": [1]}, {"c": [2]}]}]}{code}
> {code:java}
> val df: DataFrame = spark.read.json(testFile("nest-data.json"))
> df.createTempView("nest_table")
> sql("select a.b.c from nest_table").show()
> {code}
> {noformat}
> org.apache.spark.sql.AnalysisException: cannot resolve 'nest_table.`a`.`b`['c']' due to data type mismatch: argument 2 requires integral type, however, ''c'' is of string type.; line 1 pos 7;
> 'Project [a#6.b[c] AS c#8]
> +- SubqueryAlias `nest_table`
>    +- Relation[a#6] json
> {noformat}
> Analysing the cause: to resolve the extraction of c, ExtractValue matches on the data type of the a.b expression. a.b is itself extracted via GetArrayStructFields, so its type is ArrayType(ArrayType(StructType(...))); that type matches the GetArrayItem case, and the extraction ("c") is treated as an array ordinal.
> See org.apache.spark.sql.catalyst.expressions.ExtractValue:
> {code:java}
> def apply(
>     child: Expression,
>     extraction: Expression,
>     resolver: Resolver): Expression = {
>   (child.dataType, extraction) match {
>     case (StructType(fields), NonNullLiteral(v, StringType)) =>
>       val fieldName = v.toString
>       val ordinal = findField(fields, fieldName, resolver)
>       GetStructField(child, ordinal, Some(fieldName))
>     case (ArrayType(StructType(fields), containsNull), NonNullLiteral(v, StringType)) =>
>       val fieldName = v.toString
>       val ordinal = findField(fields, fieldName, resolver)
>       GetArrayStructFields(child, fields(ordinal).copy(name = fieldName),
>         ordinal, fields.length, containsNull)
>     // a.b has type ArrayType(ArrayType(StructType(...))), so it falls
>     // through to this case and "c" is treated as an array index:
>     case (_: ArrayType, _) => GetArrayItem(child, extraction)
>     case (MapType(kt, _, _), _) => GetMapValue(child, extraction)
>     case (otherType, _) =>
>       val errorMsg = otherType match {
>         case StructType(_) =>
>           s"Field name should be String Literal, but it's $extraction"
>         case other =>
>           s"Can't extract value from $child: need struct type but got ${other.catalogString}"
>       }
>       throw new AnalysisException(errorMsg)
>   }
> }{code}
>  
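> A minimal sketch of a possible workaround (untested; {{flatten}} has been available since Spark 2.4): flattening a.b first yields a plain array of structs, which resolves through GetArrayStructFields instead of GetArrayItem.
> {code:java}
> // a.b          => array<array<struct<c: array<bigint>>>>  (hits the GetArrayItem case)
> // flatten(a.b) => array<struct<c: array<bigint>>>         (hits GetArrayStructFields)
> sql("select flatten(a.b).c from nest_table").show()
> {code}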






[jira] [Commented] (SPARK-32587) SPARK SQL writing to JDBC target with bit datatype using Dataframe is writing NULL values

2020-11-22 Thread Mohit Dave (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-32587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17237171#comment-17237171
 ] 

Mohit Dave commented on SPARK-32587:


Hi [~zdh]

The issue here is that if the backend table has a BIT datatype and we use the Spark API 
to read from and write to the BIT column, NULL is written instead of the actual value for 
Azure SQL DB.

Steps to reproduce in Azure SQL DB:

1) Create a source table with a BIT datatype and insert some values.

2) Create a dataframe using spark.read from the source table.

3) Create a target table with a BIT datatype.

4) Write the dataframe to the target table using the df.write API.

 

Observation: data is written as NULL for the target BIT column. Other datatypes are 
working fine.
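
A minimal Scala sketch of the reproduction, assuming a SQL Server JDBC URL and the Microsoft driver (server, table, and credential names are placeholders):

{code:java}
val jdbcUrl = "jdbc:sqlserver://<server>.database.windows.net:1433;database=<db>"

// Read the source table containing the BIT column.
val df = spark.read
  .format("jdbc")
  .option("url", jdbcUrl)
  .option("dbtable", "source_table")
  .option("user", "<user>")
  .option("password", "<password>")
  .load()

// Write to a target table with the same BIT column;
// the BIT values arrive as NULL at the target.
df.write
  .format("jdbc")
  .option("url", jdbcUrl)
  .option("dbtable", "target_table")
  .option("user", "<user>")
  .option("password", "<password>")
  .mode("append")
  .save()
{code}
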
> SPARK SQL writing to JDBC target with bit datatype using Dataframe is writing 
> NULL values
> -
>
> Key: SPARK-32587
> URL: https://issues.apache.org/jira/browse/SPARK-32587
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.4.5
>Reporter: Mohit Dave
>Priority: Major
>
> While writing to a target in SQL Server with Microsoft's SQL Server driver via the 
> dataframe.write API, the target is storing NULL values for BIT columns.
>  
> Table definition
> Azure SQL DB
> 1) Create 2 tables with column type bit
> 2) Insert some records into 1 table
> Create a Spark job
> 1) Create a Dataframe using spark.read with the following query
> select  from 
> 2) Write the dataframe to a target table with a bit type column.
>  
> Observation: the bit type is getting converted to NULL at the target
>  
>  






[jira] [Resolved] (SPARK-33510) Update SBT to 1.4.3

2020-11-22 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-33510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-33510.
---
Fix Version/s: 3.1.0
   Resolution: Fixed

Issue resolved by pull request 30453
[https://github.com/apache/spark/pull/30453]

> Update SBT to 1.4.3
> ---
>
> Key: SPARK-33510
> URL: https://issues.apache.org/jira/browse/SPARK-33510
> Project: Spark
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: 3.1.0
>Reporter: William Hyun
>Assignee: William Hyun
>Priority: Major
> Fix For: 3.1.0
>
>







[jira] [Updated] (SPARK-33510) Update SBT to 1.4.4

2020-11-22 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-33510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-33510:
--
Summary: Update SBT to 1.4.4  (was: Update SBT to 1.4.3)

> Update SBT to 1.4.4
> ---
>
> Key: SPARK-33510
> URL: https://issues.apache.org/jira/browse/SPARK-33510
> Project: Spark
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: 3.1.0
>Reporter: William Hyun
>Assignee: William Hyun
>Priority: Major
> Fix For: 3.1.0
>
>







[jira] [Assigned] (SPARK-33510) Update SBT to 1.4.3

2020-11-22 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-33510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-33510:
-

Assignee: William Hyun

> Update SBT to 1.4.3
> ---
>
> Key: SPARK-33510
> URL: https://issues.apache.org/jira/browse/SPARK-33510
> Project: Spark
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: 3.1.0
>Reporter: William Hyun
>Assignee: William Hyun
>Priority: Major
>







[jira] [Assigned] (SPARK-33517) Incorrect menu item display and link in PySpark Usage Guide for Pandas with Apache Arrow

2020-11-22 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-33517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-33517:


Assignee: Apache Spark

> Incorrect menu item display and link in PySpark Usage Guide for Pandas with 
> Apache Arrow
> 
>
> Key: SPARK-33517
> URL: https://issues.apache.org/jira/browse/SPARK-33517
> Project: Spark
>  Issue Type: Bug
>  Components: docs
>Affects Versions: 3.0.0, 3.0.1
>Reporter: liucht-inspur
>Assignee: Apache Spark
>Priority: Minor
> Attachments: spark-doc.jpg
>
>
> The menu item and its link are set incorrectly; change "Apache Arrow in Spark" to "Apache 
> Arrow in PySpark".
>  






[jira] [Assigned] (SPARK-33517) Incorrect menu item display and link in PySpark Usage Guide for Pandas with Apache Arrow

2020-11-22 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-33517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-33517:


Assignee: (was: Apache Spark)

> Incorrect menu item display and link in PySpark Usage Guide for Pandas with 
> Apache Arrow
> 
>
> Key: SPARK-33517
> URL: https://issues.apache.org/jira/browse/SPARK-33517
> Project: Spark
>  Issue Type: Bug
>  Components: docs
>Affects Versions: 3.0.0, 3.0.1
>Reporter: liucht-inspur
>Priority: Minor
> Attachments: spark-doc.jpg
>
>
> The menu item and its link are set incorrectly; change "Apache Arrow in Spark" to "Apache 
> Arrow in PySpark".
>  






[jira] [Commented] (SPARK-33517) Incorrect menu item display and link in PySpark Usage Guide for Pandas with Apache Arrow

2020-11-22 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-33517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17237150#comment-17237150
 ] 

Apache Spark commented on SPARK-33517:
--

User 'liucht-inspur' has created a pull request for this issue:
https://github.com/apache/spark/pull/30466

> Incorrect menu item display and link in PySpark Usage Guide for Pandas with 
> Apache Arrow
> 
>
> Key: SPARK-33517
> URL: https://issues.apache.org/jira/browse/SPARK-33517
> Project: Spark
>  Issue Type: Bug
>  Components: docs
>Affects Versions: 3.0.0, 3.0.1
>Reporter: liucht-inspur
>Priority: Minor
> Attachments: spark-doc.jpg
>
>
> The menu item and its link are set incorrectly; change "Apache Arrow in Spark" to "Apache 
> Arrow in PySpark".
>  






[jira] [Updated] (SPARK-33517) Incorrect menu item display and link in PySpark Usage Guide for Pandas with Apache Arrow

2020-11-22 Thread liucht-inspur (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-33517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liucht-inspur updated SPARK-33517:
--
Attachment: spark-doc.jpg

> Incorrect menu item display and link in PySpark Usage Guide for Pandas with 
> Apache Arrow
> 
>
> Key: SPARK-33517
> URL: https://issues.apache.org/jira/browse/SPARK-33517
> Project: Spark
>  Issue Type: Bug
>  Components: docs
>Affects Versions: 3.0.0, 3.0.1
>Reporter: liucht-inspur
>Priority: Minor
> Attachments: spark-doc.jpg
>
>
> The menu item and its link are set incorrectly; change "Apache Arrow in Spark" to "Apache 
> Arrow in PySpark".
>  






[jira] [Created] (SPARK-33517) Incorrect menu item display and link in PySpark Usage Guide for Pandas with Apache Arrow

2020-11-22 Thread liucht-inspur (Jira)
liucht-inspur created SPARK-33517:
-

 Summary: Incorrect menu item display and link in PySpark Usage 
Guide for Pandas with Apache Arrow
 Key: SPARK-33517
 URL: https://issues.apache.org/jira/browse/SPARK-33517
 Project: Spark
  Issue Type: Bug
  Components: docs
Affects Versions: 3.0.1, 3.0.0
Reporter: liucht-inspur


The menu item and its link are set incorrectly; change "Apache Arrow in Spark" to "Apache 
Arrow in PySpark".

 






[jira] [Commented] (SPARK-33045) Implement built-in LIKE ANY and LIKE ALL UDF

2020-11-22 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-33045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17237141#comment-17237141
 ] 

Apache Spark commented on SPARK-33045:
--

User 'beliefer' has created a pull request for this issue:
https://github.com/apache/spark/pull/30465

> Implement built-in LIKE ANY and LIKE ALL UDF
> 
>
> Key: SPARK-33045
> URL: https://issues.apache.org/jira/browse/SPARK-33045
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.1.0
>Reporter: Yuming Wang
>Assignee: jiaan.geng
>Priority: Major
> Fix For: 3.1.0
>
>
> We already support the LIKE ANY / SOME / ALL syntax, but it will throw a 
> {{StackOverflowError}} if there are many elements (more than 14378 elements). 
> We should implement built-in LIKE ANY and LIKE ALL UDFs to fix this issue.
> {noformat}
> java.lang.StackOverflowError
>   at 
> scala.collection.IndexedSeqOptimized.foreach(IndexedSeqOptimized.scala:36)
>   at 
> scala.collection.IndexedSeqOptimized.foreach$(IndexedSeqOptimized.scala:33)
>   at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:38)
>   at scala.collection.generic.Growable.$plus$plus$eq(Growable.scala:62)
>   at scala.collection.generic.Growable.$plus$plus$eq$(Growable.scala:53)
>   at 
> scala.collection.mutable.ListBuffer.$plus$plus$eq(ListBuffer.scala:184)
>   at 
> scala.collection.mutable.ListBuffer.$plus$plus$eq(ListBuffer.scala:47)
>   at 
> scala.collection.generic.GenericCompanion.apply(GenericCompanion.scala:53)
>   at 
> org.apache.spark.sql.catalyst.expressions.BinaryExpression.children(Expression.scala:549)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode.foreach(TreeNode.scala:175)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$foreach$1(TreeNode.scala:175)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$foreach$1$adapted(TreeNode.scala:175)
>   at scala.collection.immutable.List.foreach(List.scala:392)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode.foreach(TreeNode.scala:175)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$foreach$1(TreeNode.scala:175)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$foreach$1$adapted(TreeNode.scala:175)
>   at scala.collection.immutable.List.foreach(List.scala:392)
> {noformat}
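> A hedged sketch of why the current rewrite overflows the stack (the table name and patterns below are hypothetical):
> {code:java}
> // LIKE ANY is currently expanded into a chain of ORs:
> //   c LIKE p1 OR c LIKE p2 OR ... OR c LIKE pN
> // With enough patterns the expression tree is so deep that recursive
> // traversals such as TreeNode.foreach blow the JVM stack, as above.
> val patterns = (1 to 20000).map(i => s"'%pat$i%'").mkString(", ")
> spark.sql(s"SELECT * FROM t WHERE c LIKE ANY ($patterns)")
> {code}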






[jira] [Resolved] (SPARK-33143) Make SocketAuthServer socket timeout configurable

2020-11-22 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-33143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon resolved SPARK-33143.
--
Fix Version/s: 3.1.0
   Resolution: Fixed

Issue resolved by pull request 30389
[https://github.com/apache/spark/pull/30389]

> Make SocketAuthServer socket timeout configurable
> -
>
> Key: SPARK-33143
> URL: https://issues.apache.org/jira/browse/SPARK-33143
> Project: Spark
>  Issue Type: Improvement
>  Components: PySpark, Spark Core
>Affects Versions: 2.4.7, 3.0.1
>Reporter: Miklos Szurap
>Assignee: Gabor Somogyi
>Priority: Major
> Fix For: 3.1.0
>
>
> In SPARK-21551 the socket timeout for PySpark applications was increased from 
> 3 to 15 seconds. However, it is still hardcoded.
> In certain situations even 15 seconds is not enough, so it should be made 
> configurable. 
> This is requested after seeing it in real-life workload failures.
> It has also been suggested and requested in an earlier comment in 
> [SPARK-18649|https://issues.apache.org/jira/browse/SPARK-18649?focusedCommentId=16493498=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16493498]
> In Spark 2.4 the timeout is set in
> [PythonRDD.scala|https://github.com/apache/spark/blob/branch-2.4/core/src/main/scala/org/apache/spark/api/python/PythonRDD.scala#L899];
> in Spark 3.x the code has been moved to
> [SocketAuthServer.scala|https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/security/SocketAuthServer.scala#L51]:
> {code}
> serverSocket.setSoTimeout(15000)
> {code}
> Please include this in both 2.4 and 3.x branches.
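> A minimal sketch of how the timeout could be made configurable (the config key below is hypothetical, for illustration only; the actual key is whatever the fix defines):
> {code:java}
> // Hypothetical config key; read the timeout from SparkConf instead of hardcoding it.
> val timeoutSecs = SparkEnv.get.conf.getTimeAsSeconds("spark.python.authSocketTimeout", "15s")
> serverSocket.setSoTimeout((timeoutSecs * 1000).toInt)
> {code}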






[jira] [Assigned] (SPARK-33143) Make SocketAuthServer socket timeout configurable

2020-11-22 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-33143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon reassigned SPARK-33143:


Assignee: Gabor Somogyi

> Make SocketAuthServer socket timeout configurable
> -
>
> Key: SPARK-33143
> URL: https://issues.apache.org/jira/browse/SPARK-33143
> Project: Spark
>  Issue Type: Improvement
>  Components: PySpark, Spark Core
>Affects Versions: 2.4.7, 3.0.1
>Reporter: Miklos Szurap
>Assignee: Gabor Somogyi
>Priority: Major
>
> In SPARK-21551 the socket timeout for PySpark applications was increased from 
> 3 to 15 seconds. However, it is still hardcoded.
> In certain situations even 15 seconds is not enough, so it should be made 
> configurable. 
> This is requested after seeing it in real-life workload failures.
> It has also been suggested and requested in an earlier comment in 
> [SPARK-18649|https://issues.apache.org/jira/browse/SPARK-18649?focusedCommentId=16493498=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16493498]
> In Spark 2.4 the timeout is set in
> [PythonRDD.scala|https://github.com/apache/spark/blob/branch-2.4/core/src/main/scala/org/apache/spark/api/python/PythonRDD.scala#L899];
> in Spark 3.x the code has been moved to
> [SocketAuthServer.scala|https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/security/SocketAuthServer.scala#L51]:
> {code}
> serverSocket.setSoTimeout(15000)
> {code}
> Please include this in both 2.4 and 3.x branches.






[jira] [Assigned] (SPARK-33516) Upgrade Scala 2.13 from 2.13.3 to 2.13.4

2020-11-22 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-33516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-33516:


Assignee: (was: Apache Spark)

> Upgrade Scala 2.13 from 2.13.3 to 2.13.4
> 
>
> Key: SPARK-33516
> URL: https://issues.apache.org/jira/browse/SPARK-33516
> Project: Spark
>  Issue Type: Sub-task
>  Components: Build
>Affects Versions: 3.1.0
>Reporter: Yang Jie
>Priority: Major
>
> Scala 2.13.4 has been released (https://github.com/scala/scala/releases/tag/v2.13.4)






[jira] [Assigned] (SPARK-33516) Upgrade Scala 2.13 from 2.13.3 to 2.13.4

2020-11-22 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-33516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-33516:


Assignee: Apache Spark

> Upgrade Scala 2.13 from 2.13.3 to 2.13.4
> 
>
> Key: SPARK-33516
> URL: https://issues.apache.org/jira/browse/SPARK-33516
> Project: Spark
>  Issue Type: Sub-task
>  Components: Build
>Affects Versions: 3.1.0
>Reporter: Yang Jie
>Assignee: Apache Spark
>Priority: Major
>
> Scala 2.13.4 has been released (https://github.com/scala/scala/releases/tag/v2.13.4)






[jira] [Commented] (SPARK-33516) Upgrade Scala 2.13 from 2.13.3 to 2.13.4

2020-11-22 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-33516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17237137#comment-17237137
 ] 

Apache Spark commented on SPARK-33516:
--

User 'LuciferYang' has created a pull request for this issue:
https://github.com/apache/spark/pull/30464

> Upgrade Scala 2.13 from 2.13.3 to 2.13.4
> 
>
> Key: SPARK-33516
> URL: https://issues.apache.org/jira/browse/SPARK-33516
> Project: Spark
>  Issue Type: Sub-task
>  Components: Build
>Affects Versions: 3.1.0
>Reporter: Yang Jie
>Priority: Major
>
> Scala 2.13.4 has been released (https://github.com/scala/scala/releases/tag/v2.13.4)






[jira] [Commented] (SPARK-32481) Support truncate table to move the data to trash

2020-11-22 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-32481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17237124#comment-17237124
 ] 

Apache Spark commented on SPARK-32481:
--

User 'gatorsmile' has created a pull request for this issue:
https://github.com/apache/spark/pull/30463

> Support truncate table to move the data to trash
> 
>
> Key: SPARK-32481
> URL: https://issues.apache.org/jira/browse/SPARK-32481
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core, SQL
>Affects Versions: 3.1.0
>Reporter: jobit mathew
>Assignee: Udbhav Agrawal
>Priority: Minor
> Fix For: 3.1.0
>
>
> *Instead of deleting the data, move it to trash. From the trash, data can then be 
> deleted permanently based on configuration.*






[jira] [Assigned] (SPARK-33423) DoubleParam can't parse a JSON number without decimal places

2020-11-22 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-33423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-33423:


Assignee: Apache Spark

> DoubleParam can't parse a JSON number without decimal places
> 
>
> Key: SPARK-33423
> URL: https://issues.apache.org/jira/browse/SPARK-33423
> Project: Spark
>  Issue Type: Bug
>  Components: MLlib
>Affects Versions: 2.4.6
>Reporter: Alexander Bouriakov
>Assignee: Apache Spark
>Priority: Minor
>
> {quote}new DoubleParam("", "", "").jsonDecode("1"){quote}
>  
> throws the following exception. It would be great if it could make 
> a double from a JInt (trivial):
>  
> {{java.lang.IllegalArgumentException: Cannot decode JInt(1) to Double.}}
> {{at org.apache.spark.ml.param.DoubleParam$.jValueDecode(params.scala:380)}}
> {{at org.apache.spark.ml.param.DoubleParam.jsonDecode(params.scala:349)}}
> {{at org.apache.spark.ml.param.DoubleParam.jsonDecode(params.scala:330)}}
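> A minimal sketch of a possible fix in {{DoubleParam.jValueDecode}}, assuming the json4s AST types Spark uses (illustrative only, not the actual patch):
> {code:java}
> import org.json4s.JsonAST._
>
> def jValueDecode(json: JValue): Double = json match {
>   case JDouble(x) => x
>   case JInt(x)    => x.toDouble  // also accept whole numbers such as JInt(1)
>   case other => throw new IllegalArgumentException(s"Cannot decode $other to Double.")
> }
> {code}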






[jira] [Assigned] (SPARK-33423) DoubleParam can't parse a JSON number without decimal places

2020-11-22 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-33423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-33423:


Assignee: (was: Apache Spark)

> DoubleParam can't parse a JSON number without decimal places
> 
>
> Key: SPARK-33423
> URL: https://issues.apache.org/jira/browse/SPARK-33423
> Project: Spark
>  Issue Type: Bug
>  Components: MLlib
>Affects Versions: 2.4.6
>Reporter: Alexander Bouriakov
>Priority: Minor
>
> {quote}new DoubleParam("", "", "").jsonDecode("1"){quote}
>  
> throws the following exception. It would be great if it could make 
> a double from a JInt (trivial):
>  
> {{java.lang.IllegalArgumentException: Cannot decode JInt(1) to Double.}}
> {{at org.apache.spark.ml.param.DoubleParam$.jValueDecode(params.scala:380)}}
> {{at org.apache.spark.ml.param.DoubleParam.jsonDecode(params.scala:349)}}
> {{at org.apache.spark.ml.param.DoubleParam.jsonDecode(params.scala:330)}}






[jira] [Commented] (SPARK-33423) DoubleParam can't parse a JSON number without decimal places

2020-11-22 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-33423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17237115#comment-17237115
 ] 

Apache Spark commented on SPARK-33423:
--

User 'AngersZh' has created a pull request for this issue:
https://github.com/apache/spark/pull/30462

> DoubleParam can't parse a JSON number without decimal places
> 
>
> Key: SPARK-33423
> URL: https://issues.apache.org/jira/browse/SPARK-33423
> Project: Spark
>  Issue Type: Bug
>  Components: MLlib
>Affects Versions: 2.4.6
>Reporter: Alexander Bouriakov
>Priority: Minor
>
> {quote}new DoubleParam("", "", "").jsonDecode("1"){quote}
>  
> throws the following exception. It would be great if it could make 
> a double from a JInt (trivial):
>  
> {{java.lang.IllegalArgumentException: Cannot decode JInt(1) to Double.}}
> {{at org.apache.spark.ml.param.DoubleParam$.jValueDecode(params.scala:380)}}
> {{at org.apache.spark.ml.param.DoubleParam.jsonDecode(params.scala:349)}}
> {{at org.apache.spark.ml.param.DoubleParam.jsonDecode(params.scala:330)}}






[jira] [Commented] (SPARK-33423) DoubleParam can't parse a JSON number without decimal places

2020-11-22 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-33423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17237116#comment-17237116
 ] 

Apache Spark commented on SPARK-33423:
--

User 'AngersZh' has created a pull request for this issue:
https://github.com/apache/spark/pull/30462

> DoubleParam can't parse a JSON number without decimal places
> 
>
> Key: SPARK-33423
> URL: https://issues.apache.org/jira/browse/SPARK-33423
> Project: Spark
>  Issue Type: Bug
>  Components: MLlib
>Affects Versions: 2.4.6
>Reporter: Alexander Bouriakov
>Priority: Minor
>
> {quote}new DoubleParam("", "", "").jsonDecode("1"){quote}
>  
> throws the following exception. It would be great if it could make 
> a double from a JInt (trivial):
>  
> {{java.lang.IllegalArgumentException: Cannot decode JInt(1) to Double.}}
> {{at org.apache.spark.ml.param.DoubleParam$.jValueDecode(params.scala:380)}}
> {{at org.apache.spark.ml.param.DoubleParam.jsonDecode(params.scala:349)}}
> {{at org.apache.spark.ml.param.DoubleParam.jsonDecode(params.scala:330)}}






[jira] [Created] (SPARK-33516) Upgrade Scala 2.13 from 2.13.3 to 2.13.4

2020-11-22 Thread Yang Jie (Jira)
Yang Jie created SPARK-33516:


 Summary: Upgrade Scala 2.13 from 2.13.3 to 2.13.4
 Key: SPARK-33516
 URL: https://issues.apache.org/jira/browse/SPARK-33516
 Project: Spark
  Issue Type: Sub-task
  Components: Build
Affects Versions: 3.1.0
Reporter: Yang Jie


Scala 2.13.4 has been released (https://github.com/scala/scala/releases/tag/v2.13.4)






[jira] [Resolved] (SPARK-33512) Upgrade test libraries

2020-11-22 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-33512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-33512.
---
Fix Version/s: 3.1.0
   Resolution: Fixed

Issue resolved by pull request 30456
[https://github.com/apache/spark/pull/30456]

> Upgrade test libraries
> --
>
> Key: SPARK-33512
> URL: https://issues.apache.org/jira/browse/SPARK-33512
> Project: Spark
>  Issue Type: Improvement
>  Components: Build, Tests
>Affects Versions: 3.1.0
>Reporter: Dongjoon Hyun
>Assignee: Dongjoon Hyun
>Priority: Major
> Fix For: 3.1.0
>
>







[jira] [Assigned] (SPARK-33512) Upgrade test libraries

2020-11-22 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-33512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-33512:
-

Assignee: Dongjoon Hyun

> Upgrade test libraries
> --
>
> Key: SPARK-33512
> URL: https://issues.apache.org/jira/browse/SPARK-33512
> Project: Spark
>  Issue Type: Improvement
>  Components: Build, Tests
>Affects Versions: 3.1.0
>Reporter: Dongjoon Hyun
>Assignee: Dongjoon Hyun
>Priority: Major
>







[jira] [Commented] (SPARK-33515) Improve exception messages while handling UnresolvedTable

2020-11-22 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-33515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17237046#comment-17237046
 ] 

Apache Spark commented on SPARK-33515:
--

User 'imback82' has created a pull request for this issue:
https://github.com/apache/spark/pull/30461

> Improve exception messages while handling UnresolvedTable
> -
>
> Key: SPARK-33515
> URL: https://issues.apache.org/jira/browse/SPARK-33515
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.1.0
>Reporter: Terry Kim
>Priority: Minor
>
> Improve exception messages while handling UnresolvedTable by adding command 
> name.






[jira] [Assigned] (SPARK-33515) Improve exception messages while handling UnresolvedTable

2020-11-22 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-33515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-33515:


Assignee: Apache Spark

> Improve exception messages while handling UnresolvedTable
> -
>
> Key: SPARK-33515
> URL: https://issues.apache.org/jira/browse/SPARK-33515
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.1.0
>Reporter: Terry Kim
>Assignee: Apache Spark
>Priority: Minor
>
> Improve exception messages while handling UnresolvedTable by adding command 
> name.






[jira] [Commented] (SPARK-33515) Improve exception messages while handling UnresolvedTable

2020-11-22 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-33515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17237045#comment-17237045
 ] 

Apache Spark commented on SPARK-33515:
--

User 'imback82' has created a pull request for this issue:
https://github.com/apache/spark/pull/30461

> Improve exception messages while handling UnresolvedTable
> -
>
> Key: SPARK-33515
> URL: https://issues.apache.org/jira/browse/SPARK-33515
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.1.0
>Reporter: Terry Kim
>Priority: Minor
>
> Improve exception messages while handling UnresolvedTable by adding command 
> name.






[jira] [Assigned] (SPARK-33515) Improve exception messages while handling UnresolvedTable

2020-11-22 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-33515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-33515:


Assignee: (was: Apache Spark)

> Improve exception messages while handling UnresolvedTable
> -
>
> Key: SPARK-33515
> URL: https://issues.apache.org/jira/browse/SPARK-33515
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.1.0
>Reporter: Terry Kim
>Priority: Minor
>
> Improve exception messages while handling UnresolvedTable by adding command 
> name.






[jira] [Created] (SPARK-33515) Improve exception messages while handling UnresolvedTable

2020-11-22 Thread Terry Kim (Jira)
Terry Kim created SPARK-33515:
-

 Summary: Improve exception messages while handling UnresolvedTable
 Key: SPARK-33515
 URL: https://issues.apache.org/jira/browse/SPARK-33515
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 3.1.0
Reporter: Terry Kim


Improve exception messages while handling UnresolvedTable by adding command 
name.






[jira] [Resolved] (SPARK-33469) Add current_timezone function

2020-11-22 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-33469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-33469.
---
Fix Version/s: 3.1.0
   Resolution: Fixed

Issue resolved by pull request 30400
[https://github.com/apache/spark/pull/30400]

> Add current_timezone function
> -
>
> Key: SPARK-33469
> URL: https://issues.apache.org/jira/browse/SPARK-33469
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.1.0
>Reporter: ulysses you
>Assignee: ulysses you
>Priority: Minor
> Fix For: 3.1.0
>
>
> Add current_timezone function.






[jira] [Assigned] (SPARK-33469) Add current_timezone function

2020-11-22 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-33469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-33469:
-

Assignee: ulysses you

> Add current_timezone function
> -
>
> Key: SPARK-33469
> URL: https://issues.apache.org/jira/browse/SPARK-33469
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.1.0
>Reporter: ulysses you
>Assignee: ulysses you
>Priority: Minor
>
> Add current_timezone function.






[jira] [Assigned] (SPARK-31962) Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

2020-11-22 Thread Jungtaek Lim (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-31962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jungtaek Lim reassigned SPARK-31962:


Assignee: Christopher Highman

> Provide modifiedAfter and modifiedBefore options when filtering from a 
> batch-based file data source
> ---
>
> Key: SPARK-31962
> URL: https://issues.apache.org/jira/browse/SPARK-31962
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.1.0
>Reporter: Christopher Highman
>Assignee: Christopher Highman
>Priority: Minor
>
> Two new options, _modifiedBefore_ and _modifiedAfter_, are provided, expecting 
> a value in 'YYYY-MM-DDTHH:mm:ss' format. _PartitioningAwareFileIndex_ considers 
> these options during the process of checking for files, just before 
> considering applied _PathFilters_ such as {{pathGlobFilter}}. In order to 
> filter file results, a new PathFilter class was derived for this purpose. 
> General house-keeping around classes extending PathFilter was performed for 
> neatness. It became apparent that support was needed to handle multiple 
> potential path filters, so logic was introduced for this purpose and the 
> associated tests written.
>  
> When loading files from a data source, there can often be thousands of 
> files within a given file path. In many cases I've seen, we want to start 
> loading from a folder path and ideally be able to begin loading files having 
> modification dates past a certain point. This would mean that out of thousands 
> of potential files, only the ones with modification dates greater than the 
> specified timestamp would be considered. This saves a ton of time 
> automatically and reduces the complexity of managing this in code.
>  
> *Example Usages*
> _Load all CSV files modified after date:_
> {{spark.read.format("csv").option("modifiedAfter","2020-06-15T05:00:00").load()}}
> _Load all CSV files modified before date:_
> {{spark.read.format("csv").option("modifiedBefore","2020-06-15T05:00:00").load()}}
> _Load all CSV files modified between two dates:_
> {{spark.read.format("csv").option("modifiedAfter","2019-01-15T05:00:00").option("modifiedBefore","2020-06-15T05:00:00").load()}}
>  






[jira] [Resolved] (SPARK-31962) Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

2020-11-22 Thread Jungtaek Lim (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-31962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jungtaek Lim resolved SPARK-31962.
--
Fix Version/s: 3.1.0
   Resolution: Fixed

Issue resolved by pull request 30411
[https://github.com/apache/spark/pull/30411]

> Provide modifiedAfter and modifiedBefore options when filtering from a 
> batch-based file data source
> ---
>
> Key: SPARK-31962
> URL: https://issues.apache.org/jira/browse/SPARK-31962
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.1.0
>Reporter: Christopher Highman
>Assignee: Christopher Highman
>Priority: Minor
> Fix For: 3.1.0
>
>
> Two new options, _modifiedBefore_ and _modifiedAfter_, are provided, expecting 
> a value in 'YYYY-MM-DDTHH:mm:ss' format. _PartitioningAwareFileIndex_ considers 
> these options during the process of checking for files, just before 
> considering applied _PathFilters_ such as {{pathGlobFilter}}. In order to 
> filter file results, a new PathFilter class was derived for this purpose. 
> General house-keeping around classes extending PathFilter was performed for 
> neatness. It became apparent that support was needed to handle multiple 
> potential path filters, so logic was introduced for this purpose and the 
> associated tests written.
>  
> When loading files from a data source, there can often be thousands of 
> files within a given file path. In many cases I've seen, we want to start 
> loading from a folder path and ideally be able to begin loading files having 
> modification dates past a certain point. This would mean that out of thousands 
> of potential files, only the ones with modification dates greater than the 
> specified timestamp would be considered. This saves a ton of time 
> automatically and reduces the complexity of managing this in code.
>  
> *Example Usages*
> _Load all CSV files modified after date:_
> {{spark.read.format("csv").option("modifiedAfter","2020-06-15T05:00:00").load()}}
> _Load all CSV files modified before date:_
> {{spark.read.format("csv").option("modifiedBefore","2020-06-15T05:00:00").load()}}
> _Load all CSV files modified between two dates:_
> {{spark.read.format("csv").option("modifiedAfter","2019-01-15T05:00:00").option("modifiedBefore","2020-06-15T05:00:00").load()}}
>  






[jira] [Commented] (SPARK-33340) spark run on kubernetes has Could not load KUBERNETES classes issue

2020-11-22 Thread Daniel Moore (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-33340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17237027#comment-17237027
 ] 

Daniel Moore commented on SPARK-33340:
--

Looks like it is bombing out when it verifies that 
`org.apache.spark.deploy.k8s.submit.KubernetesClientApplication` is loaded. 
Jump into that image and see if it is there. Also, maybe try an image built 
using just the kubernetes profile and not mesos or yarn.
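
A quick way to check from inside the image (a hedged sketch; it assumes you can start a spark-shell in the container):

{code:java}
// In a spark-shell inside the image: if the spark-kubernetes jar is on the
// classpath this returns the class, otherwise it throws ClassNotFoundException.
Class.forName("org.apache.spark.deploy.k8s.submit.KubernetesClientApplication")
{code}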

> spark run on kubernetes has Could not load KUBERNETES classes issue
> ---
>
> Key: SPARK-33340
> URL: https://issues.apache.org/jira/browse/SPARK-33340
> Project: Spark
>  Issue Type: Bug
>  Components: Kubernetes
>Affects Versions: 3.0.1
> Environment: Kubernete 1.16
> Spark (master branch code)
>Reporter: Xiu Juan Xiang
>Priority: Major
>
> Hi, I am trying to run spark on my kubernetes cluster (it's not a minikube 
> cluster). I followed this doc: 
> [https://spark.apache.org/docs/latest/running-on-kubernetes.html] to create 
> the spark docker image and then submit the application step by step. However, 
> it failed, and the log of the spark driver showed the error below:
> ```
> + exec /usr/bin/tini -s -- /opt/spark/bin/spark-submit --conf spark.driver.bindAddress=172.30.140.13 --deploy-mode client --properties-file /opt/spark/conf/spark.properties --class org.apache.spark.deploy.PythonRunner file:/root/Work/spark/examples/src/main/python/wordcount.py
> Exception in thread "main" org.apache.spark.SparkException: Could not load KUBERNETES classes. This copy of Spark may not have been compiled with KUBERNETES support.
> at org.apache.spark.deploy.SparkSubmit.error(SparkSubmit.scala:942)
> at org.apache.spark.deploy.SparkSubmit.prepareSubmitEnvironment(SparkSubmit.scala:265)
> at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:877)
> at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
> at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
> at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
> at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1013)
> at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1022)
> at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> ```
> I am not sure which step I am missing. I have been blocked here for several 
> days. Could you please help me with this? Thanks in advance!
>  
> By the way, below are the steps I did:
>  # Prepare a kubernetes cluster and check that I have appropriate permissions to 
> list, create, edit and delete pods;
> About this I am sure: I have all the necessary permissions.
>  # Build distribution
> ```
> ./dev/make-distribution.sh --name custom-spark --pip --r --tgz -Psparkr 
> -Phive -Phive-thriftserver -Pmesos -Pyarn -Pkubernetes
> ```
>  # Build spark docker image
> ```
> ./bin/docker-image-tool.sh spark -t latest build
> ```
>  # submit application 
> ```
> ./bin/spark-submit --master 
> k8s://https://c7.us-south.containers.cloud.ibm.com:31937 --deploy-mode 
> cluster --name spark-pi --class org.apache.spark.examples.SparkPi --conf 
> spark.executor.instances=5 --conf 
> spark.kubernetes.container.image=docker.io/bluebosh/spark:python3 
> examples/src/main/python/wordcount.py
> ```
> BTW, I am sure the master is correct, and my docker image contains 
> python.






[jira] [Commented] (SPARK-33427) Interpreted subexpression elimination

2020-11-22 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-33427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17237000#comment-17237000
 ] 

Apache Spark commented on SPARK-33427:
--

User 'viirya' has created a pull request for this issue:
https://github.com/apache/spark/pull/30459

> Interpreted subexpression elimination
> -
>
> Key: SPARK-33427
> URL: https://issues.apache.org/jira/browse/SPARK-33427
> Project: Spark
>  Issue Type: New Feature
>  Components: SQL
>Affects Versions: 3.1.0
>Reporter: L. C. Hsieh
>Assignee: L. C. Hsieh
>Priority: Major
> Fix For: 3.1.0
>
>
> Currently we only do subexpression elimination for codegen. For some reasons, 
> we may need to run interpreted expression evaluation; for example, when codegen 
> fails to compile and falls back to interpreted mode. This is commonly seen with 
> complex schemas coming from expressions, possibly produced by the query 
> optimizer too, e.g. SPARK-32945.
> We should also support subexpression elimination for interpreted evaluation. 
> That could reduce the performance difference when Spark falls back from codegen 
> to interpreted expression evaluation.
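> A small illustration of the duplication that subexpression elimination removes (a hedged sketch; the dataframe, column, and schema names are hypothetical):
> {code:java}
> // Both output columns share the subexpression from_json(s, 'a INT, b INT').
> // With subexpression elimination it is evaluated once per row; without it
> // (as in interpreted mode before this change), once per reference.
> df.selectExpr("from_json(s, 'a INT, b INT').a", "from_json(s, 'a INT, b INT').b")
> {code}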






[jira] [Commented] (SPARK-28704) Test backward compatibility on JDK9+ once we have a version supports JDK9+

2020-11-22 Thread Dongjoon Hyun (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-28704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17236993#comment-17236993
 ] 

Dongjoon Hyun commented on SPARK-28704:
---

This is resolved via https://github.com/apache/spark/pull/30451

> Test backward compatibility on JDK9+ once we have a version supports JDK9+
> --
>
> Key: SPARK-28704
> URL: https://issues.apache.org/jira/browse/SPARK-28704
> Project: Spark
>  Issue Type: Test
>  Components: SQL, Tests
>Affects Versions: 3.0.0
>Reporter: Yuming Wang
>Assignee: angerszhu
>Priority: Major
> Fix For: 3.1.0
>
>
> We skip the HiveExternalCatalogVersionsSuite test when testing with JAVA_9 or 
> later because our previous versions do not support JAVA_9 or later. We 
> should add it back once we have a version that supports JAVA_9 or later.


