[jira] [Updated] (SPARK-28930) Spark DESC FORMATTED TABLENAME information display issues

2019-09-02 Thread jobit mathew (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jobit mathew updated SPARK-28930:
-
Description: 
Spark DESC FORMATTED TABLENAME has information display issues: it shows an 
incorrect *Last Access* time, and some of the other information could be 
presented more clearly.

Test steps (a consolidated repro sketch follows the list):
 1. Open spark-sql.
 2. Create a partitioned table:
 CREATE EXTERNAL TABLE IF NOT EXISTS employees_info_extended ( id INT, name 
STRING, usd_flag STRING, salary DOUBLE, deductions MAP, address 
STRING ) PARTITIONED BY (entrytime STRING) STORED AS TEXTFILE location 
'hdfs://hacluster/user/sparkhive/warehouse';
 3. From spark-sql, check the table description:
 desc formatted tablename;
 4. From the Scala shell, check the table description:
 sql("desc formatted tablename").show()

*Issue 1:*
 If a column has no comment, the Spark Scala shell shows *"null" in lowercase*, 
while everywhere else (Hive beeline, Spark beeline, Spark SQL) it shows 
*uppercase "NULL"*. It would be better to show the same value in all places 
(an illustration follows the output below).

 
{code:java}
*scala>* sql("desc formatted employees_info_extended").show(false);
+----------------------------+----------------------------+---------+
|col_name                    |data_type                   |*comment*|
+----------------------------+----------------------------+---------+
|id                          |int                         |*null*   |
|name                        |string                      |*null*   |
|usd_flag                    |string                      |*null*   |
|salary                      |double                      |*null*   |
|deductions                  |map                         |*null*   |
|address                     |string                      |null     |
|entrytime                   |string                      |null     |
|# Partition Information    |                            |         |
|# col_name                  |data_type                   |comment  |
|entrytime                   |string                      |null     |
|                            |                            |         |
|# Detailed Table Information|                            |         |
|Database                    |sparkdb__                   |         |
|Table                       |employees_info_extended     |         |
|Owner                       |root                        |         |
*|Created Time                |Tue Aug 20 13:42:06 CST 2019|         |*
*|Last Access                 |Thu Jan 01 08:00:00 CST 1970|         |*
|Created By                  |Spark 2.4.3                 |         |
|Type                        |EXTERNAL                    |         |
|Provider                    |hive                        |         |
+----------------------------+----------------------------+---------+
only showing top 20 rows

*scala>*
{code}
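For reference, a minimal standalone illustration (not Spark's internal display code) of why the Scala shell differs: Dataset.show() renders SQL NULL cells with the lowercase literal "null":
{code:java}
// Minimal illustration: Dataset.show() prints NULL cells as lowercase "null",
// which is what the scala> output above reflects, while beeline/spark-sql
// print uppercase NULL for the same missing comments.
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[1]").appName("null-display").getOrCreate()
spark.sql("SELECT CAST(NULL AS STRING) AS comment").show()
// +-------+
// |comment|
// +-------+
// |   null|
// +-------+
{code}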
*Issue 2:*
 Spark SQL "desc formatted tablename" does not show the header (# col_name, 
data_type, comment) at the top of the query result, although the header is shown 
above the partition description. For better readability, the header should also 
be shown at the top of the result. Outside of spark-sql we can see the header 
(# col_name, data_type, comment) in spark-beeline and hive beeline (a small 
sketch after the output below shows how the header can still be recovered from 
the Scala shell).
{code:java}
*spark-sql>* desc formatted employees_info_extended1;
 id int *NULL*
 name string *NULL*
 usd_flag string NULL
 salary double NULL
 deductions map NULL
 address string NULL
 entrytime string NULL
 # Partition Information
 # col_name data_type comment
 entrytime string *NULL*

 # Detailed Table Information
 Database sparkdb__
 Table employees_info_extended1
 Owner spark
 *Created Time Tue Aug 20 14:50:37 CST 2019*
 *Last Access Thu Jan 01 08:00:00 CST 1970*
 Created By Spark 2.3.2.0201
 Type EXTERNAL
 Provider hive
 Table Properties [transient_lastDdlTime=1566286655]
 Location hdfs://hacluster/user/sparkhive/warehouse
 Serde Library org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
 InputFormat org.apache.hadoop.mapred.TextInputFormat
 OutputFormat org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
 Storage Properties [serialization.format=1]
 Partition Provider Catalog
 Time taken: 0.477 seconds, Fetched 27 row(s)
 *spark-sql>*

This is spark-beeline, which shows the headers:
0: jdbc:hive2://10.186.60.158:23040/default> desc formatted employees;
+------------+------------+------------------+
| col_name   | data_type  | comment          |
+------------+------------+------------------+
| name       | string     | Employee name    |
| salary     | float      | Employee salary  |
|            |            |                  |



{code}
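For completeness, a hedged illustration (reusing the `spark` session from the repro sketch above) of how the missing header can still be recovered from the Scala shell, since the DESC FORMATTED result is a DataFrame whose schema carries the column names:
{code:java}
// Hedged illustration: the DESC FORMATTED result is a DataFrame, so its schema
// already exposes the header (col_name, data_type, comment) even though the
// spark-sql CLI output above does not print a header row.
val desc = spark.sql("DESC FORMATTED employees_info_extended")
println(desc.schema.fieldNames.mkString(", "))  // col_name, data_type, comment
desc.show(100, truncate = false)
{code}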
 

*Issue 3:*
 I created the table on Aug 20, so the created time shown is correct. *But the 
Last Access time shows Jan 01 1970*. The Last Access time should not be earlier 
than the created time; better to show the correct date and time, or else show 
UNKNOWN (a small formatting sketch follows the quoted rows below).
 *[Created Time,Tue Aug 20 13:42:06 CST 2019,]*
 *[Last Access,Thu Jan 01 08:00:00 CST 1970,]*
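One way the display could avoid the 1970 value, sketched under the assumption that the catalog keeps Hive's default last-access value of 0 for tables that were never accessed (an illustration only, not Spark's actual code):
{code:java}
// Hedged sketch: render a non-positive last-access timestamp (Hive's default 0,
// which is what prints as Jan 01 1970 above) as UNKNOWN instead of the epoch.
def formatLastAccess(lastAccessMillis: Long): String =
  if (lastAccessMillis <= 0) "UNKNOWN"
  else new java.util.Date(lastAccessMillis).toString

println(formatLastAccess(0L))             // UNKNOWN
println(formatLastAccess(1566286655000L)) // the transient_lastDdlTime above, in millis
{code}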

[jira] [Assigned] (SPARK-28573) Convert InsertIntoTable(HiveTableRelation) to Datasource inserting for partitioned table

2019-09-02 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan reassigned SPARK-28573:
---

Assignee: Xianjin YE

> Convert InsertIntoTable(HiveTableRelation) to Datasource inserting for 
> partitioned table
> 
>
> Key: SPARK-28573
> URL: https://issues.apache.org/jira/browse/SPARK-28573
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Xianjin YE
>Assignee: Xianjin YE
>Priority: Major
>
> Currently we don't translate InsertInto(HiveTableRelation) to DataSource 
> insertion when a partitioned table is involved; the reason, quoting the 
> comments, is:
> {quote}// Inserting into partitioned table is not supported in Parquet/Orc 
> data source (yet).
> {quote}
>  
> This no longer holds, since datasource-table dynamic partition insert now 
> supports dynamic mode (SPARK-20236). I think it's worthwhile to translate 
> InsertIntoTable(HiveTableRelation) to a datasource table.






[jira] [Resolved] (SPARK-28573) Convert InsertIntoTable(HiveTableRelation) to Datasource inserting for partitioned table

2019-09-02 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-28573.
-
Fix Version/s: 3.0.0
   Resolution: Fixed

Issue resolved by pull request 25306
[https://github.com/apache/spark/pull/25306]

> Convert InsertIntoTable(HiveTableRelation) to Datasource inserting for 
> partitioned table
> 
>
> Key: SPARK-28573
> URL: https://issues.apache.org/jira/browse/SPARK-28573
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Xianjin YE
>Assignee: Xianjin YE
>Priority: Major
> Fix For: 3.0.0
>
>
> Currently we don't translate InsertInto(HiveTableRelation) to DataSource 
> insertion when a partitioned table is involved; the reason, quoting the 
> comments, is:
> {quote}// Inserting into partitioned table is not supported in Parquet/Orc 
> data source (yet).
> {quote}
>  
> This no longer holds, since datasource-table dynamic partition insert now 
> supports dynamic mode (SPARK-20236). I think it's worthwhile to translate 
> InsertIntoTable(HiveTableRelation) to a datasource table.






[jira] [Updated] (SPARK-28942) [Spark][WEB UI]Spark in local mode hostname display localhost in the Host Column of Task Summary Page

2019-09-02 Thread ABHISHEK KUMAR GUPTA (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ABHISHEK KUMAR GUPTA updated SPARK-28942:
-
Summary: [Spark][WEB UI]Spark in local mode hostname display localhost in 
the Host Column of Task Summary Page  (was: Spark in local mode hostname 
display localhost in the Host Column of Task Summary Page)

> [Spark][WEB UI]Spark in local mode hostname display localhost in the Host 
> Column of Task Summary Page
> -
>
> Key: SPARK-28942
> URL: https://issues.apache.org/jira/browse/SPARK-28942
> Project: Spark
>  Issue Type: Bug
>  Components: Web UI
>Affects Versions: 3.0.0
>Reporter: ABHISHEK KUMAR GUPTA
>Priority: Minor
>
> In the stage page, the Host column of the Task Summary section shows 'localhost' 
> instead of the host IP or host name listed as the Driver Host Name.
> Steps:
> spark-shell --master local
> create table emp(id int);
> insert into emp values(100);
> select * from emp;
> Go to the Stage UI page and check the Task Summary section.
> The Host column will display 'localhost' instead of the driver host.
>  
> Note: with spark-shell --master yarn, the UI displays the correct host name 
> under the column.






[jira] [Commented] (SPARK-28372) Document Spark WEB UI

2019-09-02 Thread zhengruifeng (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-28372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16921151#comment-16921151
 ] 

zhengruifeng commented on SPARK-28372:
--

[~smilegator] I think we may need to add a subtask for streaming?  As 
[~planga82]  suggested.

> Document Spark WEB UI
> -
>
> Key: SPARK-28372
> URL: https://issues.apache.org/jira/browse/SPARK-28372
> Project: Spark
>  Issue Type: Umbrella
>  Components: Documentation, Web UI
>Affects Versions: 3.0.0
>Reporter: Xiao Li
>Priority: Major
>
> Spark web UIs are being used to monitor the status and resource consumption 
> of your Spark applications and clusters. However, we do not have the 
> corresponding document. It is hard for end users to use and understand them. 






[jira] [Commented] (SPARK-28373) Document JDBC/ODBC Server page

2019-09-02 Thread zhengruifeng (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-28373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16921138#comment-16921138
 ] 

zhengruifeng commented on SPARK-28373:
--

[~planga82] Thanks!:D

> Document JDBC/ODBC Server page
> --
>
> Key: SPARK-28373
> URL: https://issues.apache.org/jira/browse/SPARK-28373
> Project: Spark
>  Issue Type: Sub-task
>  Components: Documentation, Web UI
>Affects Versions: 3.0.0
>Reporter: Xiao Li
>Priority: Major
>
> !https://user-images.githubusercontent.com/5399861/60809590-9dcf2500-a1bd-11e9-826e-33729bb97daf.png|width=1720,height=503!
>  
> [https://github.com/apache/spark/pull/25062] added a new column CLOSE TIME 
> and EXECUTION TIME. It is hard to understand the difference. We need to 
> document them; otherwise, it is hard for end users to understand them
>  






[jira] [Issue Comment Deleted] (SPARK-28953) Integration tests fail due to malformed URL

2019-09-02 Thread Yuming Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuming Wang updated SPARK-28953:

Comment: was deleted

(was: What is your Hadoop version and JDK version? Please see SPARK-27177 and 
SPARK-28693 for more details.)

> Integration tests fail due to malformed URL
> ---
>
> Key: SPARK-28953
> URL: https://issues.apache.org/jira/browse/SPARK-28953
> Project: Spark
>  Issue Type: Bug
>  Components: jenkins, Kubernetes
>Affects Versions: 3.0.0
>Reporter: Stavros Kontopoulos
>Priority: Major
>
> Tests failed on Ubuntu, verified on two different machines:
> KubernetesSuite:
> - Launcher client dependencies *** FAILED ***
>  java.net.MalformedURLException: no protocol: * http://172.31.46.91:30706
>  at java.net.URL.(URL.java:600)
>  at java.net.URL.(URL.java:497)
>  at java.net.URL.(URL.java:446)
>  at 
> org.apache.spark.deploy.k8s.integrationtest.DepsTestsSuite.$anonfun$$init$$1(DepsTestsSuite.scala:160)
>  at org.scalatest.OutcomeOf.outcomeOf(OutcomeOf.scala:85)
>  at org.scalatest.OutcomeOf.outcomeOf$(OutcomeOf.scala:83)
>  at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
>  at org.scalatest.Transformer.apply(Transformer.scala:22)
>  at org.scalatest.Transformer.apply(Transformer.scala:20)
>  at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:186)
>  
> Welcome to
>   __
>  / __/__ ___ _/ /__
>  _\ \/ _ \/ _ `/ __/ '_/
>  /___/ .__/\_,_/_/ /_/\_\ version 3.0.0-SNAPSHOT
>  /_/
>  
>  Using Scala version 2.12.8 (OpenJDK 64-Bit Server VM, Java 1.8.0_222)
>  Type in expressions to have them evaluated.
>  Type :help for more information.
>  
> scala> val pb = new ProcessBuilder().command("bash", "-c", "minikube service 
> ceph-nano-s3 -n spark --url")
>  pb: ProcessBuilder = java.lang.ProcessBuilder@46092840
> scala> pb.redirectErrorStream(true)
>  res0: ProcessBuilder = java.lang.ProcessBuilder@46092840
> scala> val proc = pb.start()
>  proc: Process = java.lang.UNIXProcess@5e9650d3
> scala> val r = org.apache.commons.io.IOUtils.toString(proc.getInputStream())
>  r: String =
>  "* http://172.31.46.91:30706
>  "
> Although (no asterisk):
> $ minikube service ceph-nano-s3 -n spark --url
> [http://172.31.46.91:30706|http://172.31.46.91:30706/]
>  
> This is weird because it fails at the java level, where does the asterisk 
> come from?
> $ minikube version
> minikube version: v1.3.1
> commit: ca60a424ce69a4d79f502650199ca2b52f29e631
>  
>  






[jira] [Comment Edited] (SPARK-28953) Integration tests fail due to malformed URL

2019-09-02 Thread Yuming Wang (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-28953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16921115#comment-16921115
 ] 

Yuming Wang edited comment on SPARK-28953 at 9/3/19 2:31 AM:
-

What is your Hadoop version and JDK version? Please see SPARK-27177 and 
SPARK-28693 for more details.


was (Author: q79969786):
What is you Hadoop version and JDK version? Please see SPARK-27177 and 
SPARK-28693 for more details.

> Integration tests fail due to malformed URL
> ---
>
> Key: SPARK-28953
> URL: https://issues.apache.org/jira/browse/SPARK-28953
> Project: Spark
>  Issue Type: Bug
>  Components: jenkins, Kubernetes
>Affects Versions: 3.0.0
>Reporter: Stavros Kontopoulos
>Priority: Major
>
> Tests failed on Ubuntu, verified on two different machines:
> KubernetesSuite:
> - Launcher client dependencies *** FAILED ***
>  java.net.MalformedURLException: no protocol: * http://172.31.46.91:30706
>  at java.net.URL.(URL.java:600)
>  at java.net.URL.(URL.java:497)
>  at java.net.URL.(URL.java:446)
>  at 
> org.apache.spark.deploy.k8s.integrationtest.DepsTestsSuite.$anonfun$$init$$1(DepsTestsSuite.scala:160)
>  at org.scalatest.OutcomeOf.outcomeOf(OutcomeOf.scala:85)
>  at org.scalatest.OutcomeOf.outcomeOf$(OutcomeOf.scala:83)
>  at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
>  at org.scalatest.Transformer.apply(Transformer.scala:22)
>  at org.scalatest.Transformer.apply(Transformer.scala:20)
>  at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:186)
>  
> Welcome to
>   __
>  / __/__ ___ _/ /__
>  _\ \/ _ \/ _ `/ __/ '_/
>  /___/ .__/\_,_/_/ /_/\_\ version 3.0.0-SNAPSHOT
>  /_/
>  
>  Using Scala version 2.12.8 (OpenJDK 64-Bit Server VM, Java 1.8.0_222)
>  Type in expressions to have them evaluated.
>  Type :help for more information.
>  
> scala> val pb = new ProcessBuilder().command("bash", "-c", "minikube service 
> ceph-nano-s3 -n spark --url")
>  pb: ProcessBuilder = java.lang.ProcessBuilder@46092840
> scala> pb.redirectErrorStream(true)
>  res0: ProcessBuilder = java.lang.ProcessBuilder@46092840
> scala> val proc = pb.start()
>  proc: Process = java.lang.UNIXProcess@5e9650d3
> scala> val r = org.apache.commons.io.IOUtils.toString(proc.getInputStream())
>  r: String =
>  "* http://172.31.46.91:30706
>  "
> Although (no asterisk):
> $ minikube service ceph-nano-s3 -n spark --url
> [http://172.31.46.91:30706|http://172.31.46.91:30706/]
>  
> This is weird because it fails at the java level, where does the asterisk 
> come from?
> $ minikube version
> minikube version: v1.3.1
> commit: ca60a424ce69a4d79f502650199ca2b52f29e631
>  
>  






[jira] [Commented] (SPARK-28953) Integration tests fail due to malformed URL

2019-09-02 Thread Yuming Wang (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-28953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16921115#comment-16921115
 ] 

Yuming Wang commented on SPARK-28953:
-

What is you Hadoop version and JDK version? Please see SPARK-27177 and 
SPARK-28693 for more details.

> Integration tests fail due to malformed URL
> ---
>
> Key: SPARK-28953
> URL: https://issues.apache.org/jira/browse/SPARK-28953
> Project: Spark
>  Issue Type: Bug
>  Components: jenkins, Kubernetes
>Affects Versions: 3.0.0
>Reporter: Stavros Kontopoulos
>Priority: Major
>
> Tests failed on Ubuntu, verified on two different machines:
> KubernetesSuite:
> - Launcher client dependencies *** FAILED ***
>  java.net.MalformedURLException: no protocol: * http://172.31.46.91:30706
>  at java.net.URL.(URL.java:600)
>  at java.net.URL.(URL.java:497)
>  at java.net.URL.(URL.java:446)
>  at 
> org.apache.spark.deploy.k8s.integrationtest.DepsTestsSuite.$anonfun$$init$$1(DepsTestsSuite.scala:160)
>  at org.scalatest.OutcomeOf.outcomeOf(OutcomeOf.scala:85)
>  at org.scalatest.OutcomeOf.outcomeOf$(OutcomeOf.scala:83)
>  at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
>  at org.scalatest.Transformer.apply(Transformer.scala:22)
>  at org.scalatest.Transformer.apply(Transformer.scala:20)
>  at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:186)
>  
> Welcome to
>   __
>  / __/__ ___ _/ /__
>  _\ \/ _ \/ _ `/ __/ '_/
>  /___/ .__/\_,_/_/ /_/\_\ version 3.0.0-SNAPSHOT
>  /_/
>  
>  Using Scala version 2.12.8 (OpenJDK 64-Bit Server VM, Java 1.8.0_222)
>  Type in expressions to have them evaluated.
>  Type :help for more information.
>  
> scala> val pb = new ProcessBuilder().command("bash", "-c", "minikube service 
> ceph-nano-s3 -n spark --url")
>  pb: ProcessBuilder = java.lang.ProcessBuilder@46092840
> scala> pb.redirectErrorStream(true)
>  res0: ProcessBuilder = java.lang.ProcessBuilder@46092840
> scala> val proc = pb.start()
>  proc: Process = java.lang.UNIXProcess@5e9650d3
> scala> val r = org.apache.commons.io.IOUtils.toString(proc.getInputStream())
>  r: String =
>  "* http://172.31.46.91:30706
>  "
> Although (no asterisk):
> $ minikube service ceph-nano-s3 -n spark --url
> [http://172.31.46.91:30706|http://172.31.46.91:30706/]
>  
> This is weird because it fails at the java level, where does the asterisk 
> come from?
> $ minikube version
> minikube version: v1.3.1
> commit: ca60a424ce69a4d79f502650199ca2b52f29e631
>  
>  






[jira] [Created] (SPARK-28955) Support for LocalDateTime semantics

2019-09-02 Thread Bill Schneider (Jira)
Bill Schneider created SPARK-28955:
--

 Summary: Support for LocalDateTime semantics
 Key: SPARK-28955
 URL: https://issues.apache.org/jira/browse/SPARK-28955
 Project: Spark
  Issue Type: Wish
  Components: SQL
Affects Versions: 2.3.0
Reporter: Bill Schneider


It would be great if Spark supported local times in DataFrames, rather than 
only instants. 

The specific use case I have in mind is something like the following (a small 
sketch follows the list):
 * parse "2019-01-01 17:00" (no timezone) from CSV -> LocalDateTime in a DataFrame
 * save to Parquet: the LocalDateTime is stored with the same integer value as 
2019-01-01 17:00 UTC, but with isAdjustedToUTC=false. (Currently Spark saves 
either INT96 or TIME_MILLIS/TIME_MICROS, which has isAdjustedToUTC=true)
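A small sketch of the desired wall-clock semantics using plain java.time; the Spark/Parquet side is the wish itself and does not exist, so only the parsing half is shown:
{code:java}
// Hedged sketch of the wished-for semantics: parse a timezone-less CSV value as a
// wall-clock LocalDateTime. The wish is that Spark could carry this end-to-end and
// write Parquet TIMESTAMP with isAdjustedToUTC=false; that part is not implemented.
import java.time.LocalDateTime
import java.time.format.DateTimeFormatter

val raw = "2019-01-01 17:00"
val local = LocalDateTime.parse(raw, DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm"))
println(local)  // 2019-01-01T17:00 -- no zone or offset attached
{code}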






[jira] [Updated] (SPARK-28954) For SparkCLI, start up with conf of HIVEAUXJARS, we add jar with SessionStateResourceLoader's addJar() API

2019-09-02 Thread angerszhu (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

angerszhu updated SPARK-28954:
--
Description: 
When starting up the SparkSQL CLI:

For extra jars passed through the hive conf {{HiveConf.ConfVars.HIVEAUXJARS}}, we 
don't need complex APIs to work around differences between Hive versions; we can 
handle it through Spark's SessionResourceLoader API, adding the jars to Spark and 
the SparkSession's running environment (a wiring sketch follows the shim snippets 
below).

*SessionResourceLoader API*:

{code:java}
val resourceLoader = SparkSQLEnv.sqlContext.sessionState.resourceLoader
  StringUtils.split(auxJars, ",").foreach(resourceLoader.addJar(_))
{code}


*v1.2.1ThriftServerShimUtils*:
{code:java}
private[thriftserver] def addToClassPath(
    loader: ClassLoader,
    auxJars: Array[String]): ClassLoader = {
  Utilities.addToClassPath(loader, auxJars)
}
{code}

*v2.3.5ThriftServerShimUtils*:
{code:java}
private[thriftserver] def addToClassPath(
    loader: ClassLoader,
    auxJars: Array[String]): ClassLoader = {
  val addAction = new AddToClassPathAction(loader, auxJars.toList.asJava)
  AccessController.doPrivileged(addAction)
}
{code}
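A hedged sketch of the surrounding wiring (simplified, and the "hive.aux.jars.path" key lookup is an assumption based on what HIVEAUXJARS resolves to; this is not the actual patch):
{code:java}
// Hedged sketch: read the aux-jar list and register each jar through Spark's
// session resource loader instead of going through the version-specific shims.
val auxJars = SparkSQLEnv.sqlContext.sparkContext.hadoopConfiguration
  .get("hive.aux.jars.path", "")
if (auxJars.nonEmpty) {
  auxJars.split(",").map(_.trim).filter(_.nonEmpty)
    .foreach(SparkSQLEnv.sqlContext.sessionState.resourceLoader.addJar)
}
{code}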


> For SparkCLI, start up with conf of HIVEAUXJARS, we add jar with 
> SessionStateResourceLoader's addJar() API
> --
>
> Key: SPARK-28954
> URL: https://issues.apache.org/jira/browse/SPARK-28954
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 2.4.0, 3.0.0
>Reporter: angerszhu
>Priority: Major
>
> When starting up the SparkSQL CLI:
> For extra jars passed through the hive conf {{HiveConf.ConfVars.HIVEAUXJARS}}, we 
> don't need complex APIs to work around differences between Hive versions; we can 
> handle it through Spark's SessionResourceLoader API, adding the jars to Spark 
> and the SparkSession's running environment.
> *SessionResourceLoader API*:
> {code:java}
> val resourceLoader = SparkSQLEnv.sqlContext.sessionState.resourceLoader
>   StringUtils.split(auxJars, ",").foreach(resourceLoader.addJar(_))
> {code}
> *v1.2.1ThriftServerShimUtils*:
> {code:java}
> private[thriftserver] def addToClassPath(
>             loader: ClassLoader,
>              auxJars: Array[String]): ClassLoader = {
>  Utilities.addToClassPath(loader, auxJars)
>  }
> {code}
> *v2.3.5ThriftServerShimUtils*:
> {code:java}
>  private[thriftserver] def addToClassPath(
>   loader: ClassLoader,
>   auxJars: Array[String]): ClassLoader = {
> val addAction = new AddToClassPathAction(loader, auxJars.toList.asJava)
> AccessController.doPrivileged(addAction)
>   }
> {code}






[jira] [Created] (SPARK-28954) For SparkCLI, start up with conf of HIVEAUXJARS, we add jar with SessionStateResourceLoader's addJar() API

2019-09-02 Thread angerszhu (Jira)
angerszhu created SPARK-28954:
-

 Summary: For SparkCLI, start up with conf of HIVEAUXJARS, we add 
jar with SessionStateResourceLoader's addJar() API
 Key: SPARK-28954
 URL: https://issues.apache.org/jira/browse/SPARK-28954
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 2.4.0, 3.0.0
Reporter: angerszhu









[jira] [Commented] (SPARK-28864) Add spark connector for Alibaba Log Service

2019-09-02 Thread Jungtaek Lim (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-28864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16921096#comment-16921096
 ] 

Jungtaek Lim commented on SPARK-28864:
--

I guess the general recommendation for new connectors has been "Apache Bahir 
serves this purpose", as Spark moved the existing connectors out of the Spark 
codebase into the Bahir project and tries to keep only first-class connectors 
in the Spark codebase.

Bahir project is here: [https://bahir.apache.org/]

> Add spark connector for Alibaba Log Service
> ---
>
> Key: SPARK-28864
> URL: https://issues.apache.org/jira/browse/SPARK-28864
> Project: Spark
>  Issue Type: New Feature
>  Components: Input/Output
>Affects Versions: 3.0.0
>Reporter: Ke Li
>Priority: Major
>
> Alibaba Log Service is a big data service that is widely used within Alibaba 
> Group and by thousands of Alibaba Cloud customers. The core storage engine of 
> Log Service, named Loghub, is a large-scale distributed storage system that 
> provides producer and consumer APIs to push and pull data, as Kafka, AWS 
> Kinesis and Azure Event Hubs do. 
> Many users of Log Service use Spark Streaming, Spark SQL and Spark Structured 
> Streaming to analyze data collected from both on-premise and cloud data 
> sources.
> Happy to hear any comments.






[jira] [Updated] (SPARK-28864) Add spark source connector for Alibaba Log Service

2019-09-02 Thread Ke Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ke Li updated SPARK-28864:
--
Summary: Add spark source connector for Alibaba Log Service  (was: Add 
spark source connector for Aliyun Log Service)

> Add spark source connector for Alibaba Log Service
> --
>
> Key: SPARK-28864
> URL: https://issues.apache.org/jira/browse/SPARK-28864
> Project: Spark
>  Issue Type: New Feature
>  Components: Input/Output
>Affects Versions: 3.0.0
>Reporter: Ke Li
>Priority: Major
>
> Alibaba Log Service is a big data service that is widely used within Alibaba 
> Group and by thousands of Alibaba Cloud customers. The core storage engine of 
> Log Service, named Loghub, is a large-scale distributed storage system that 
> provides producer and consumer APIs to push and pull data, as Kafka, AWS 
> Kinesis and Azure Event Hubs do. 
> Many users of Log Service use Spark Streaming, Spark SQL and Spark Structured 
> Streaming to analyze data collected from both on-premise and cloud data 
> sources.
> Happy to hear any comments.






[jira] [Updated] (SPARK-28864) Add spark connector for Alibaba Log Service

2019-09-02 Thread Ke Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ke Li updated SPARK-28864:
--
Summary: Add spark connector for Alibaba Log Service  (was: Add spark 
source connector for Alibaba Log Service)

> Add spark connector for Alibaba Log Service
> ---
>
> Key: SPARK-28864
> URL: https://issues.apache.org/jira/browse/SPARK-28864
> Project: Spark
>  Issue Type: New Feature
>  Components: Input/Output
>Affects Versions: 3.0.0
>Reporter: Ke Li
>Priority: Major
>
> Alibaba Log Service is a big data service that is widely used within Alibaba 
> Group and by thousands of Alibaba Cloud customers. The core storage engine of 
> Log Service, named Loghub, is a large-scale distributed storage system that 
> provides producer and consumer APIs to push and pull data, as Kafka, AWS 
> Kinesis and Azure Event Hubs do. 
> Many users of Log Service use Spark Streaming, Spark SQL and Spark Structured 
> Streaming to analyze data collected from both on-premise and cloud data 
> sources.
> Happy to hear any comments.






[jira] [Updated] (SPARK-28921) Spark jobs failing on latest versions of Kubernetes (1.15.3, 1.14.6, 1,13.10, 1.12.10, 1.11.10)

2019-09-02 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-28921:
--
Fix Version/s: 2.4.5

> Spark jobs failing on latest versions of Kubernetes (1.15.3, 1.14.6, 1,13.10, 
> 1.12.10, 1.11.10)
> ---
>
> Key: SPARK-28921
> URL: https://issues.apache.org/jira/browse/SPARK-28921
> Project: Spark
>  Issue Type: Bug
>  Components: Kubernetes
>Affects Versions: 2.3.3, 2.4.3
>Reporter: Paul Schweigert
>Assignee: Andy Grove
>Priority: Major
> Fix For: 2.4.5, 3.0.0
>
>
> Spark jobs are failing on latest versions of Kubernetes when jobs attempt to 
> provision executor pods (jobs like Spark-Pi that do not launch executors run 
> without a problem):
>  
> Here's an example error message:
>  
> {code:java}
> 19/08/30 01:29:09 INFO ExecutorPodsAllocator: Going to request 2 executors 
> from Kubernetes.
> 19/08/30 01:29:09 INFO ExecutorPodsAllocator: Going to request 2 executors 
> from Kubernetes.19/08/30 01:29:09 WARN WatchConnectionManager: Exec Failure: 
> HTTP 403, Status: 403 - 
> java.net.ProtocolException: Expected HTTP 101 response but was '403 
> Forbidden' 
> at 
> okhttp3.internal.ws.RealWebSocket.checkResponse(RealWebSocket.java:216) 
> at okhttp3.internal.ws.RealWebSocket$2.onResponse(RealWebSocket.java:183) 
> at okhttp3.RealCall$AsyncCall.execute(RealCall.java:141) 
> at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32) 
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  
> at java.lang.Thread.run(Thread.java:748)
> {code}
>  
> Looks like the issue is caused by fixes for a recent CVE : 
> CVE: [https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-14809]
> Fix: [https://github.com/fabric8io/kubernetes-client/pull/1669]
>  
> Looks like upgrading kubernetes-client to 4.4.2 would solve this issue.






[jira] [Comment Edited] (SPARK-28953) Integration tests fail due to malformed URL

2019-09-02 Thread Stavros Kontopoulos (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-28953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16921087#comment-16921087
 ] 

Stavros Kontopoulos edited comment on SPARK-28953 at 9/3/19 12:04 AM:
--

[~shaneknapp] [~eje] I can fix this since I am working on: SPARK-27936 but im 
wondering about the root cause.


was (Author: skonto):
[~shaneknapp] [~eje] I can fix this since I am working on: SPARK-27936 but im 
wondering of the root cause.

> Integration tests fail due to malformed URL
> ---
>
> Key: SPARK-28953
> URL: https://issues.apache.org/jira/browse/SPARK-28953
> Project: Spark
>  Issue Type: Bug
>  Components: jenkins, Kubernetes
>Affects Versions: 3.0.0
>Reporter: Stavros Kontopoulos
>Priority: Major
>
> Tests failed on Ubuntu, verified on two different machines:
> KubernetesSuite:
> - Launcher client dependencies *** FAILED ***
>  java.net.MalformedURLException: no protocol: * http://172.31.46.91:30706
>  at java.net.URL.(URL.java:600)
>  at java.net.URL.(URL.java:497)
>  at java.net.URL.(URL.java:446)
>  at 
> org.apache.spark.deploy.k8s.integrationtest.DepsTestsSuite.$anonfun$$init$$1(DepsTestsSuite.scala:160)
>  at org.scalatest.OutcomeOf.outcomeOf(OutcomeOf.scala:85)
>  at org.scalatest.OutcomeOf.outcomeOf$(OutcomeOf.scala:83)
>  at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
>  at org.scalatest.Transformer.apply(Transformer.scala:22)
>  at org.scalatest.Transformer.apply(Transformer.scala:20)
>  at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:186)
>  
> Welcome to
>   __
>  / __/__ ___ _/ /__
>  _\ \/ _ \/ _ `/ __/ '_/
>  /___/ .__/\_,_/_/ /_/\_\ version 3.0.0-SNAPSHOT
>  /_/
>  
>  Using Scala version 2.12.8 (OpenJDK 64-Bit Server VM, Java 1.8.0_222)
>  Type in expressions to have them evaluated.
>  Type :help for more information.
>  
> scala> val pb = new ProcessBuilder().command("bash", "-c", "minikube service 
> ceph-nano-s3 -n spark --url")
>  pb: ProcessBuilder = java.lang.ProcessBuilder@46092840
> scala> pb.redirectErrorStream(true)
>  res0: ProcessBuilder = java.lang.ProcessBuilder@46092840
> scala> val proc = pb.start()
>  proc: Process = java.lang.UNIXProcess@5e9650d3
> scala> val r = org.apache.commons.io.IOUtils.toString(proc.getInputStream())
>  r: String =
>  "* http://172.31.46.91:30706
>  "
> Although (no asterisk):
> $ minikube service ceph-nano-s3 -n spark --url
> [http://172.31.46.91:30706|http://172.31.46.91:30706/]
>  
> This is weird because it fails at the java level, where does the asterisk 
> come from?
> $ minikube version
> minikube version: v1.3.1
> commit: ca60a424ce69a4d79f502650199ca2b52f29e631
>  
>  






[jira] [Comment Edited] (SPARK-28953) Integration tests fail due to malformed URL

2019-09-02 Thread Stavros Kontopoulos (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-28953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16921087#comment-16921087
 ] 

Stavros Kontopoulos edited comment on SPARK-28953 at 9/3/19 12:03 AM:
--

[~shaneknapp] [~eje] I can fix this since I am working on: SPARK-27936 but im 
wondering of the root cause.


was (Author: skonto):
[~shaneknapp] I can fix this since I am working on: SPARK-27936 but im 
wondering of the root cause.

> Integration tests fail due to malformed URL
> ---
>
> Key: SPARK-28953
> URL: https://issues.apache.org/jira/browse/SPARK-28953
> Project: Spark
>  Issue Type: Bug
>  Components: jenkins, Kubernetes
>Affects Versions: 3.0.0
>Reporter: Stavros Kontopoulos
>Priority: Major
>
> Tests failed on Ubuntu, verified on two different machines:
> KubernetesSuite:
> - Launcher client dependencies *** FAILED ***
>  java.net.MalformedURLException: no protocol: * http://172.31.46.91:30706
>  at java.net.URL.(URL.java:600)
>  at java.net.URL.(URL.java:497)
>  at java.net.URL.(URL.java:446)
>  at 
> org.apache.spark.deploy.k8s.integrationtest.DepsTestsSuite.$anonfun$$init$$1(DepsTestsSuite.scala:160)
>  at org.scalatest.OutcomeOf.outcomeOf(OutcomeOf.scala:85)
>  at org.scalatest.OutcomeOf.outcomeOf$(OutcomeOf.scala:83)
>  at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
>  at org.scalatest.Transformer.apply(Transformer.scala:22)
>  at org.scalatest.Transformer.apply(Transformer.scala:20)
>  at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:186)
>  
> Welcome to
>   __
>  / __/__ ___ _/ /__
>  _\ \/ _ \/ _ `/ __/ '_/
>  /___/ .__/\_,_/_/ /_/\_\ version 3.0.0-SNAPSHOT
>  /_/
>  
>  Using Scala version 2.12.8 (OpenJDK 64-Bit Server VM, Java 1.8.0_222)
>  Type in expressions to have them evaluated.
>  Type :help for more information.
>  
> scala> val pb = new ProcessBuilder().command("bash", "-c", "minikube service 
> ceph-nano-s3 -n spark --url")
>  pb: ProcessBuilder = java.lang.ProcessBuilder@46092840
> scala> pb.redirectErrorStream(true)
>  res0: ProcessBuilder = java.lang.ProcessBuilder@46092840
> scala> val proc = pb.start()
>  proc: Process = java.lang.UNIXProcess@5e9650d3
> scala> val r = org.apache.commons.io.IOUtils.toString(proc.getInputStream())
>  r: String =
>  "* http://172.31.46.91:30706
>  "
> Although (no asterisk):
> $ minikube service ceph-nano-s3 -n spark --url
> [http://172.31.46.91:30706|http://172.31.46.91:30706/]
>  
> This is weird because it fails at the java level, where does the asterisk 
> come from?
> $ minikube version
> minikube version: v1.3.1
> commit: ca60a424ce69a4d79f502650199ca2b52f29e631
>  
>  






[jira] [Commented] (SPARK-28953) Integration tests fail due to malformed URL

2019-09-02 Thread Stavros Kontopoulos (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-28953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16921087#comment-16921087
 ] 

Stavros Kontopoulos commented on SPARK-28953:
-

[~shaneknapp] I can fix this since I am working on: SPARK-27936 but im 
wondering of the root cause.

> Integration tests fail due to malformed URL
> ---
>
> Key: SPARK-28953
> URL: https://issues.apache.org/jira/browse/SPARK-28953
> Project: Spark
>  Issue Type: Bug
>  Components: jenkins, Kubernetes
>Affects Versions: 3.0.0
>Reporter: Stavros Kontopoulos
>Priority: Major
>
> Tests failed on Ubuntu, verified on two different machines:
> KubernetesSuite:
> - Launcher client dependencies *** FAILED ***
>  java.net.MalformedURLException: no protocol: * http://172.31.46.91:30706
>  at java.net.URL.(URL.java:600)
>  at java.net.URL.(URL.java:497)
>  at java.net.URL.(URL.java:446)
>  at 
> org.apache.spark.deploy.k8s.integrationtest.DepsTestsSuite.$anonfun$$init$$1(DepsTestsSuite.scala:160)
>  at org.scalatest.OutcomeOf.outcomeOf(OutcomeOf.scala:85)
>  at org.scalatest.OutcomeOf.outcomeOf$(OutcomeOf.scala:83)
>  at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
>  at org.scalatest.Transformer.apply(Transformer.scala:22)
>  at org.scalatest.Transformer.apply(Transformer.scala:20)
>  at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:186)
>  
> Welcome to
>   __
>  / __/__ ___ _/ /__
>  _\ \/ _ \/ _ `/ __/ '_/
>  /___/ .__/\_,_/_/ /_/\_\ version 3.0.0-SNAPSHOT
>  /_/
>  
>  Using Scala version 2.12.8 (OpenJDK 64-Bit Server VM, Java 1.8.0_222)
>  Type in expressions to have them evaluated.
>  Type :help for more information.
>  
> scala> val pb = new ProcessBuilder().command("bash", "-c", "minikube service 
> ceph-nano-s3 -n spark --url")
>  pb: ProcessBuilder = java.lang.ProcessBuilder@46092840
> scala> pb.redirectErrorStream(true)
>  res0: ProcessBuilder = java.lang.ProcessBuilder@46092840
> scala> val proc = pb.start()
>  proc: Process = java.lang.UNIXProcess@5e9650d3
> scala> val r = org.apache.commons.io.IOUtils.toString(proc.getInputStream())
>  r: String =
>  "* http://172.31.46.91:30706
>  "
> Although (no asterisk):
> $ minikube service ceph-nano-s3 -n spark --url
> [http://172.31.46.91:30706|http://172.31.46.91:30706/]
>  
> This is weird because it fails at the java level, where does the asterisk 
> come from?
> $ minikube version
> minikube version: v1.3.1
> commit: ca60a424ce69a4d79f502650199ca2b52f29e631
>  
>  






[jira] [Created] (SPARK-28953) Integration tests fail due to malformed URL

2019-09-02 Thread Stavros Kontopoulos (Jira)
Stavros Kontopoulos created SPARK-28953:
---

 Summary: Integration tests fail due to malformed URL
 Key: SPARK-28953
 URL: https://issues.apache.org/jira/browse/SPARK-28953
 Project: Spark
  Issue Type: Bug
  Components: jenkins, Kubernetes
Affects Versions: 3.0.0
Reporter: Stavros Kontopoulos


Tests failed on Ubuntu, verified on two different machines:


KubernetesSuite:
- Launcher client dependencies *** FAILED ***
 java.net.MalformedURLException: no protocol: * http://172.31.46.91:30706
 at java.net.URL.(URL.java:600)
 at java.net.URL.(URL.java:497)
 at java.net.URL.(URL.java:446)
 at 
org.apache.spark.deploy.k8s.integrationtest.DepsTestsSuite.$anonfun$$init$$1(DepsTestsSuite.scala:160)
 at org.scalatest.OutcomeOf.outcomeOf(OutcomeOf.scala:85)
 at org.scalatest.OutcomeOf.outcomeOf$(OutcomeOf.scala:83)
 at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
 at org.scalatest.Transformer.apply(Transformer.scala:22)
 at org.scalatest.Transformer.apply(Transformer.scala:20)
 at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:186)

 

Welcome to
  __
 / __/__ ___ _/ /__
 _\ \/ _ \/ _ `/ __/ '_/
 /___/ .__/\_,_/_/ /_/\_\ version 3.0.0-SNAPSHOT
 /_/
 
 Using Scala version 2.12.8 (OpenJDK 64-Bit Server VM, Java 1.8.0_222)
 Type in expressions to have them evaluated.
 Type :help for more information.

 

scala> val pb = new ProcessBuilder().command("bash", "-c", "minikube service 
ceph-nano-s3 -n spark --url")
 pb: ProcessBuilder = java.lang.ProcessBuilder@46092840

scala> pb.redirectErrorStream(true)
 res0: ProcessBuilder = java.lang.ProcessBuilder@46092840

scala> val proc = pb.start()
 proc: Process = java.lang.UNIXProcess@5e9650d3

scala> val r = org.apache.commons.io.IOUtils.toString(proc.getInputStream())
 r: String =
 "* http://172.31.46.91:30706
 "

Although (no asterisk):
$ minikube service ceph-nano-s3 -n spark --url
[http://172.31.46.91:30706|http://172.31.46.91:30706/]

 

This is weird because it fails at the Java level; where does the asterisk come 
from?

$ minikube version
minikube version: v1.3.1
commit: ca60a424ce69a4d79f502650199ca2b52f29e631
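A hedged sketch of a possible workaround (an assumption, not the project's fix): strip the stray "* " marker from the minikube output before building the URL, which avoids the MalformedURLException above:
{code:java}
// Hedged workaround sketch: minikube's captured output above starts with "* ",
// so strip any leading asterisk and whitespace before constructing the URL.
val raw = "* http://172.31.46.91:30706\n"
val cleaned = raw.split("\\r?\\n")
  .map(_.trim.stripPrefix("*").trim)
  .find(_.nonEmpty)
  .getOrElse(raw.trim)
val url = new java.net.URL(cleaned)   // no MalformedURLException now
println(url)
{code}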

 

 






[jira] [Comment Edited] (SPARK-24227) Not able to submit spark job to kubernetes on 2.3

2019-09-02 Thread Dongjoon Hyun (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-24227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16921082#comment-16921082
 ] 

Dongjoon Hyun edited comment on SPARK-24227 at 9/2/19 11:58 PM:


Apache Spark 2.3.4 was the last release and `branch-2.3` becomes EOL. I'll 
resolve this issue as `Not A Problem`. Let's use the latest version with the 
proper K8s configuration as [~felipejfc] described in the above.


was (Author: dongjoon):
Apache Spark 2.3.4 was the last release and `branch-2.3` becomes EOL. I'll 
resolve this issue as `Not A Problem`. Please use the latest version with the 
proper K8s configuration.

> Not able to submit spark job to kubernetes on 2.3
> -
>
> Key: SPARK-24227
> URL: https://issues.apache.org/jira/browse/SPARK-24227
> Project: Spark
>  Issue Type: Bug
>  Components: Kubernetes
>Affects Versions: 2.3.0
>Reporter: Felipe Cavalcanti
>Priority: Major
>
> Hi, I'm trying to submit a spark job to kubernetes with no success, I 
> followed the steps @ 
> [https://spark.apache.org/docs/latest/running-on-kubernetes.html] with no 
> success, when I run:
>  
> {code:java}
> bin/spark-submit \
>   --master k8s://https://${host}:${port} \
>   --deploy-mode cluster \ 
>   --name jaeger-spark \
>   --class io.jaegertracing.spark.dependencies.DependenciesSparkJob \
>   --conf spark.executor.instances=5 \
>   --conf spark.kubernetes.container.image=bla/jaeger-deps-spark:latest\
>   --conf spark.kubernetes.namespace=spark \
>   local:///opt/spark/jars/jaeger-spark-dependencies-0.0.1-SNAPSHOT.jar
> {code}
>  
> Im getting the following stack trace:
> {code:java}
> 2018-05-09 17:06:02 WARN WatchConnectionManager:192 - Exec Failure 
> javax.net.ssl.SSLHandshakeException: 
> sun.security.validator.ValidatorException: PKIX path building failed: 
> sun.security.provider.certpath.SunCertPathBuilderException: unable to find 
> valid certification path to requested target at 
> sun.security.ssl.Alerts.getSSLException(Alerts.java:192) at 
> sun.security.ssl.SSLSocketImpl.fatal(SSLSocketImpl.java:1949) at 
> sun.security.ssl.Handshaker.fatalSE(Handshaker.java:302) at 
> sun.security.ssl.Handshaker.fatalSE(Handshaker.java:296) at 
> sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1514)
>  at 
> sun.security.ssl.ClientHandshaker.processMessage(ClientHandshaker.java:216) 
> at sun.security.ssl.Handshaker.processLoop(Handshaker.java:1026) at 
> sun.security.ssl.Handshaker.process_record(Handshaker.java:961) at 
> sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1062) at 
> sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1375)
>  at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1403) at 
> sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1387) at 
> okhttp3.internal.connection.RealConnection.connectTls(RealConnection.java:281)
>  at 
> okhttp3.internal.connection.RealConnection.establishProtocol(RealConnection.java:251)
>  at 
> okhttp3.internal.connection.RealConnection.connect(RealConnection.java:151) 
> at 
> okhttp3.internal.connection.StreamAllocation.findConnection(StreamAllocation.java:195)
>  at 
> okhttp3.internal.connection.StreamAllocation.findHealthyConnection(StreamAllocation.java:121)
>  at 
> okhttp3.internal.connection.StreamAllocation.newStream(StreamAllocation.java:100)
>  at 
> okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.java:42)
>  at 
> okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
>  at 
> okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
>  at 
> okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.java:93) 
> at 
> okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
>  at 
> okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
>  at 
> okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.java:93) 
> at 
> okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
>  at 
> okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.java:120)
>  at 
> okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
>  at 
> okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
>  at 
> io.fabric8.kubernetes.client.utils.HttpClientUtils$2.intercept(HttpClientUtils.java:90)
>  at 
> okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
>  at 
> okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
>  at okhttp3.RealCall.getResponseWithInterceptorChain(RealCall.java:185) at 
> okhttp3.RealCall$AsyncCall.execute(RealCall.java:135) at 
> 

[jira] [Resolved] (SPARK-24227) Not able to submit spark job to kubernetes on 2.3

2019-09-02 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-24227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-24227.
---
Resolution: Not A Problem

Apache Spark 2.3.4 was the last release and `branch-2.3` becomes EOL. I'll 
resolve this issue as `Not A Problem`. Please use the latest version with the 
proper K8s configuration.

> Not able to submit spark job to kubernetes on 2.3
> -
>
> Key: SPARK-24227
> URL: https://issues.apache.org/jira/browse/SPARK-24227
> Project: Spark
>  Issue Type: Bug
>  Components: Kubernetes
>Affects Versions: 2.3.0
>Reporter: Felipe Cavalcanti
>Priority: Major
>
> Hi, I'm trying to submit a spark job to kubernetes with no success, I 
> followed the steps @ 
> [https://spark.apache.org/docs/latest/running-on-kubernetes.html] with no 
> success, when I run:
>  
> {code:java}
> bin/spark-submit \
>   --master k8s://https://${host}:${port} \
>   --deploy-mode cluster \ 
>   --name jaeger-spark \
>   --class io.jaegertracing.spark.dependencies.DependenciesSparkJob \
>   --conf spark.executor.instances=5 \
>   --conf spark.kubernetes.container.image=bla/jaeger-deps-spark:latest\
>   --conf spark.kubernetes.namespace=spark \
>   local:///opt/spark/jars/jaeger-spark-dependencies-0.0.1-SNAPSHOT.jar
> {code}
>  
> Im getting the following stack trace:
> {code:java}
> 2018-05-09 17:06:02 WARN WatchConnectionManager:192 - Exec Failure 
> javax.net.ssl.SSLHandshakeException: 
> sun.security.validator.ValidatorException: PKIX path building failed: 
> sun.security.provider.certpath.SunCertPathBuilderException: unable to find 
> valid certification path to requested target at 
> sun.security.ssl.Alerts.getSSLException(Alerts.java:192) at 
> sun.security.ssl.SSLSocketImpl.fatal(SSLSocketImpl.java:1949) at 
> sun.security.ssl.Handshaker.fatalSE(Handshaker.java:302) at 
> sun.security.ssl.Handshaker.fatalSE(Handshaker.java:296) at 
> sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1514)
>  at 
> sun.security.ssl.ClientHandshaker.processMessage(ClientHandshaker.java:216) 
> at sun.security.ssl.Handshaker.processLoop(Handshaker.java:1026) at 
> sun.security.ssl.Handshaker.process_record(Handshaker.java:961) at 
> sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1062) at 
> sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1375)
>  at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1403) at 
> sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1387) at 
> okhttp3.internal.connection.RealConnection.connectTls(RealConnection.java:281)
>  at 
> okhttp3.internal.connection.RealConnection.establishProtocol(RealConnection.java:251)
>  at 
> okhttp3.internal.connection.RealConnection.connect(RealConnection.java:151) 
> at 
> okhttp3.internal.connection.StreamAllocation.findConnection(StreamAllocation.java:195)
>  at 
> okhttp3.internal.connection.StreamAllocation.findHealthyConnection(StreamAllocation.java:121)
>  at 
> okhttp3.internal.connection.StreamAllocation.newStream(StreamAllocation.java:100)
>  at 
> okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.java:42)
>  at 
> okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
>  at 
> okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
>  at 
> okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.java:93) 
> at 
> okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
>  at 
> okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
>  at 
> okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.java:93) 
> at 
> okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
>  at 
> okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.java:120)
>  at 
> okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
>  at 
> okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
>  at 
> io.fabric8.kubernetes.client.utils.HttpClientUtils$2.intercept(HttpClientUtils.java:90)
>  at 
> okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
>  at 
> okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
>  at okhttp3.RealCall.getResponseWithInterceptorChain(RealCall.java:185) at 
> okhttp3.RealCall$AsyncCall.execute(RealCall.java:135) at 
> okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32) at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  at java.lang.Thread.run(Thread.java:748) Caused by: 
> 

[jira] [Resolved] (SPARK-28921) Spark jobs failing on latest versions of Kubernetes (1.15.3, 1.14.6, 1,13.10, 1.12.10, 1.11.10)

2019-09-02 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-28921.
---
Fix Version/s: 3.0.0
   Resolution: Fixed

Issue resolved by pull request 25640
[https://github.com/apache/spark/pull/25640]

> Spark jobs failing on latest versions of Kubernetes (1.15.3, 1.14.6, 1,13.10, 
> 1.12.10, 1.11.10)
> ---
>
> Key: SPARK-28921
> URL: https://issues.apache.org/jira/browse/SPARK-28921
> Project: Spark
>  Issue Type: Bug
>  Components: Kubernetes
>Affects Versions: 2.3.3, 2.4.3
>Reporter: Paul Schweigert
>Assignee: Andy Grove
>Priority: Major
> Fix For: 3.0.0
>
>
> Spark jobs are failing on latest versions of Kubernetes when jobs attempt to 
> provision executor pods (jobs like Spark-Pi that do not launch executors run 
> without a problem):
>  
> Here's an example error message:
>  
> {code:java}
> 19/08/30 01:29:09 INFO ExecutorPodsAllocator: Going to request 2 executors 
> from Kubernetes.
> 19/08/30 01:29:09 INFO ExecutorPodsAllocator: Going to request 2 executors 
> from Kubernetes.19/08/30 01:29:09 WARN WatchConnectionManager: Exec Failure: 
> HTTP 403, Status: 403 - 
> java.net.ProtocolException: Expected HTTP 101 response but was '403 
> Forbidden' 
> at 
> okhttp3.internal.ws.RealWebSocket.checkResponse(RealWebSocket.java:216) 
> at okhttp3.internal.ws.RealWebSocket$2.onResponse(RealWebSocket.java:183) 
> at okhttp3.RealCall$AsyncCall.execute(RealCall.java:141) 
> at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32) 
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  
> at java.lang.Thread.run(Thread.java:748)
> {code}
>  
> Looks like the issue is caused by fixes for a recent CVE:
> CVE: [https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-14809]
> Fix: [https://github.com/fabric8io/kubernetes-client/pull/1669]
>  
> Looks like upgrading kubernetes-client to 4.4.2 would solve this issue.
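
For builds that manage the fabric8 client dependency themselves, a minimal sbt sketch of the pin described above (an sbt-based build is assumed here; Spark's own build picked up the new client through the pull request referenced in this resolution):

{code:scala}
// Hedged sketch: force io.fabric8:kubernetes-client to 4.4.2, the version the
// description above says handles the post-CVE-2019-14809 API server behaviour.
// Only relevant when this dependency is managed directly; it is not part of
// the ticket itself.
dependencyOverrides += "io.fabric8" % "kubernetes-client" % "4.4.2"
{code}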



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-28921) Spark jobs failing on latest versions of Kubernetes (1.15.3, 1.14.6, 1.13.10, 1.12.10, 1.11.10)

2019-09-02 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-28921:
-

Assignee: Andy Grove

> Spark jobs failing on latest versions of Kubernetes (1.15.3, 1.14.6, 1.13.10, 
> 1.12.10, 1.11.10)
> ---
>
> Key: SPARK-28921
> URL: https://issues.apache.org/jira/browse/SPARK-28921
> Project: Spark
>  Issue Type: Bug
>  Components: Kubernetes
>Affects Versions: 2.3.3, 2.4.3
>Reporter: Paul Schweigert
>Assignee: Andy Grove
>Priority: Major
>
> Spark jobs are failing on latest versions of Kubernetes when jobs attempt to 
> provision executor pods (jobs like Spark-Pi that do not launch executors run 
> without a problem):
>  
> Here's an example error message:
>  
> {code:java}
> 19/08/30 01:29:09 INFO ExecutorPodsAllocator: Going to request 2 executors 
> from Kubernetes.
> 19/08/30 01:29:09 INFO ExecutorPodsAllocator: Going to request 2 executors 
> from Kubernetes.
> 19/08/30 01:29:09 WARN WatchConnectionManager: Exec Failure: HTTP 403, Status: 403 - 
> java.net.ProtocolException: Expected HTTP 101 response but was '403 
> Forbidden' 
> at 
> okhttp3.internal.ws.RealWebSocket.checkResponse(RealWebSocket.java:216) 
> at okhttp3.internal.ws.RealWebSocket$2.onResponse(RealWebSocket.java:183) 
> at okhttp3.RealCall$AsyncCall.execute(RealCall.java:141) 
> at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32) 
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  
> at java.lang.Thread.run(Thread.java:748)
> {code}
>  
> Looks like the issue is caused by fixes for a recent CVE:
> CVE: [https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-14809]
> Fix: [https://github.com/fabric8io/kubernetes-client/pull/1669]
>  
> Looks like upgrading kubernetes-client to 4.4.2 would solve this issue.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-24227) Not able to submit spark job to kubernetes on 2.3

2019-09-02 Thread Dongjoon Hyun (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-24227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16921076#comment-16921076
 ] 

Dongjoon Hyun commented on SPARK-24227:
---

Thank you for reporting and analysis, [~felipejfc]!

> Not able to submit spark job to kubernetes on 2.3
> -
>
> Key: SPARK-24227
> URL: https://issues.apache.org/jira/browse/SPARK-24227
> Project: Spark
>  Issue Type: Bug
>  Components: Kubernetes
>Affects Versions: 2.3.0
>Reporter: Felipe Cavalcanti
>Priority: Major
>
> Hi, I'm trying to submit a Spark job to Kubernetes with no success. I 
> followed the steps at 
> [https://spark.apache.org/docs/latest/running-on-kubernetes.html], and when 
> I run:
>  
> {code:java}
> bin/spark-submit \
>   --master k8s://https://${host}:${port} \
>   --deploy-mode cluster \ 
>   --name jaeger-spark \
>   --class io.jaegertracing.spark.dependencies.DependenciesSparkJob \
>   --conf spark.executor.instances=5 \
>   --conf spark.kubernetes.container.image=bla/jaeger-deps-spark:latest\
>   --conf spark.kubernetes.namespace=spark \
>   local:///opt/spark/jars/jaeger-spark-dependencies-0.0.1-SNAPSHOT.jar
> {code}
>  
> I'm getting the following stack trace:
> {code:java}
> 2018-05-09 17:06:02 WARN WatchConnectionManager:192 - Exec Failure 
> javax.net.ssl.SSLHandshakeException: 
> sun.security.validator.ValidatorException: PKIX path building failed: 
> sun.security.provider.certpath.SunCertPathBuilderException: unable to find 
> valid certification path to requested target at 
> sun.security.ssl.Alerts.getSSLException(Alerts.java:192) at 
> sun.security.ssl.SSLSocketImpl.fatal(SSLSocketImpl.java:1949) at 
> sun.security.ssl.Handshaker.fatalSE(Handshaker.java:302) at 
> sun.security.ssl.Handshaker.fatalSE(Handshaker.java:296) at 
> sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1514)
>  at 
> sun.security.ssl.ClientHandshaker.processMessage(ClientHandshaker.java:216) 
> at sun.security.ssl.Handshaker.processLoop(Handshaker.java:1026) at 
> sun.security.ssl.Handshaker.process_record(Handshaker.java:961) at 
> sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1062) at 
> sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1375)
>  at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1403) at 
> sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1387) at 
> okhttp3.internal.connection.RealConnection.connectTls(RealConnection.java:281)
>  at 
> okhttp3.internal.connection.RealConnection.establishProtocol(RealConnection.java:251)
>  at 
> okhttp3.internal.connection.RealConnection.connect(RealConnection.java:151) 
> at 
> okhttp3.internal.connection.StreamAllocation.findConnection(StreamAllocation.java:195)
>  at 
> okhttp3.internal.connection.StreamAllocation.findHealthyConnection(StreamAllocation.java:121)
>  at 
> okhttp3.internal.connection.StreamAllocation.newStream(StreamAllocation.java:100)
>  at 
> okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.java:42)
>  at 
> okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
>  at 
> okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
>  at 
> okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.java:93) 
> at 
> okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
>  at 
> okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
>  at 
> okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.java:93) 
> at 
> okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
>  at 
> okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.java:120)
>  at 
> okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
>  at 
> okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
>  at 
> io.fabric8.kubernetes.client.utils.HttpClientUtils$2.intercept(HttpClientUtils.java:90)
>  at 
> okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
>  at 
> okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
>  at okhttp3.RealCall.getResponseWithInterceptorChain(RealCall.java:185) at 
> okhttp3.RealCall$AsyncCall.execute(RealCall.java:135) at 
> okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32) at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  at java.lang.Thread.run(Thread.java:748) Caused by: 
> sun.security.validator.ValidatorException: PKIX path building failed: 
> sun.security.provider.certpath.SunCertPathBuilderException: unable 

[jira] [Updated] (SPARK-24227) Not able to submit spark job to kubernetes on 2.3

2019-09-02 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-24227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-24227:
--
Labels: kubernetes  (was: kubernetes spark)

> Not able to submit spark job to kubernetes on 2.3
> -
>
> Key: SPARK-24227
> URL: https://issues.apache.org/jira/browse/SPARK-24227
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core, Spark Submit
>Affects Versions: 2.3.0
>Reporter: Felipe Cavalcanti
>Priority: Major
>  Labels: kubernetes
>
> Hi, I'm trying to submit a Spark job to Kubernetes with no success. I 
> followed the steps at 
> [https://spark.apache.org/docs/latest/running-on-kubernetes.html], and when 
> I run:
>  
> {code:java}
> bin/spark-submit \
>   --master k8s://https://${host}:${port} \
>   --deploy-mode cluster \ 
>   --name jaeger-spark \
>   --class io.jaegertracing.spark.dependencies.DependenciesSparkJob \
>   --conf spark.executor.instances=5 \
>   --conf spark.kubernetes.container.image=bla/jaeger-deps-spark:latest\
>   --conf spark.kubernetes.namespace=spark \
>   local:///opt/spark/jars/jaeger-spark-dependencies-0.0.1-SNAPSHOT.jar
> {code}
>  
> I'm getting the following stack trace:
> {code:java}
> 2018-05-09 17:06:02 WARN WatchConnectionManager:192 - Exec Failure 
> javax.net.ssl.SSLHandshakeException: 
> sun.security.validator.ValidatorException: PKIX path building failed: 
> sun.security.provider.certpath.SunCertPathBuilderException: unable to find 
> valid certification path to requested target at 
> sun.security.ssl.Alerts.getSSLException(Alerts.java:192) at 
> sun.security.ssl.SSLSocketImpl.fatal(SSLSocketImpl.java:1949) at 
> sun.security.ssl.Handshaker.fatalSE(Handshaker.java:302) at 
> sun.security.ssl.Handshaker.fatalSE(Handshaker.java:296) at 
> sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1514)
>  at 
> sun.security.ssl.ClientHandshaker.processMessage(ClientHandshaker.java:216) 
> at sun.security.ssl.Handshaker.processLoop(Handshaker.java:1026) at 
> sun.security.ssl.Handshaker.process_record(Handshaker.java:961) at 
> sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1062) at 
> sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1375)
>  at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1403) at 
> sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1387) at 
> okhttp3.internal.connection.RealConnection.connectTls(RealConnection.java:281)
>  at 
> okhttp3.internal.connection.RealConnection.establishProtocol(RealConnection.java:251)
>  at 
> okhttp3.internal.connection.RealConnection.connect(RealConnection.java:151) 
> at 
> okhttp3.internal.connection.StreamAllocation.findConnection(StreamAllocation.java:195)
>  at 
> okhttp3.internal.connection.StreamAllocation.findHealthyConnection(StreamAllocation.java:121)
>  at 
> okhttp3.internal.connection.StreamAllocation.newStream(StreamAllocation.java:100)
>  at 
> okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.java:42)
>  at 
> okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
>  at 
> okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
>  at 
> okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.java:93) 
> at 
> okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
>  at 
> okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
>  at 
> okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.java:93) 
> at 
> okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
>  at 
> okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.java:120)
>  at 
> okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
>  at 
> okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
>  at 
> io.fabric8.kubernetes.client.utils.HttpClientUtils$2.intercept(HttpClientUtils.java:90)
>  at 
> okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
>  at 
> okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
>  at okhttp3.RealCall.getResponseWithInterceptorChain(RealCall.java:185) at 
> okhttp3.RealCall$AsyncCall.execute(RealCall.java:135) at 
> okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32) at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  at java.lang.Thread.run(Thread.java:748) Caused by: 
> sun.security.validator.ValidatorException: PKIX path building failed: 
> sun.security.provider.certpath.SunCertPathBuilderException: 

[jira] [Updated] (SPARK-24227) Not able to submit spark job to kubernetes on 2.3

2019-09-02 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-24227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-24227:
--
Labels:   (was: kubernetes)

> Not able to submit spark job to kubernetes on 2.3
> -
>
> Key: SPARK-24227
> URL: https://issues.apache.org/jira/browse/SPARK-24227
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core, Spark Submit
>Affects Versions: 2.3.0
>Reporter: Felipe Cavalcanti
>Priority: Major
>
> Hi, I'm trying to submit a Spark job to Kubernetes with no success. I 
> followed the steps at 
> [https://spark.apache.org/docs/latest/running-on-kubernetes.html], and when 
> I run:
>  
> {code:java}
> bin/spark-submit \
>   --master k8s://https://${host}:${port} \
>   --deploy-mode cluster \ 
>   --name jaeger-spark \
>   --class io.jaegertracing.spark.dependencies.DependenciesSparkJob \
>   --conf spark.executor.instances=5 \
>   --conf spark.kubernetes.container.image=bla/jaeger-deps-spark:latest\
>   --conf spark.kubernetes.namespace=spark \
>   local:///opt/spark/jars/jaeger-spark-dependencies-0.0.1-SNAPSHOT.jar
> {code}
>  
> I'm getting the following stack trace:
> {code:java}
> 2018-05-09 17:06:02 WARN WatchConnectionManager:192 - Exec Failure 
> javax.net.ssl.SSLHandshakeException: 
> sun.security.validator.ValidatorException: PKIX path building failed: 
> sun.security.provider.certpath.SunCertPathBuilderException: unable to find 
> valid certification path to requested target at 
> sun.security.ssl.Alerts.getSSLException(Alerts.java:192) at 
> sun.security.ssl.SSLSocketImpl.fatal(SSLSocketImpl.java:1949) at 
> sun.security.ssl.Handshaker.fatalSE(Handshaker.java:302) at 
> sun.security.ssl.Handshaker.fatalSE(Handshaker.java:296) at 
> sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1514)
>  at 
> sun.security.ssl.ClientHandshaker.processMessage(ClientHandshaker.java:216) 
> at sun.security.ssl.Handshaker.processLoop(Handshaker.java:1026) at 
> sun.security.ssl.Handshaker.process_record(Handshaker.java:961) at 
> sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1062) at 
> sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1375)
>  at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1403) at 
> sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1387) at 
> okhttp3.internal.connection.RealConnection.connectTls(RealConnection.java:281)
>  at 
> okhttp3.internal.connection.RealConnection.establishProtocol(RealConnection.java:251)
>  at 
> okhttp3.internal.connection.RealConnection.connect(RealConnection.java:151) 
> at 
> okhttp3.internal.connection.StreamAllocation.findConnection(StreamAllocation.java:195)
>  at 
> okhttp3.internal.connection.StreamAllocation.findHealthyConnection(StreamAllocation.java:121)
>  at 
> okhttp3.internal.connection.StreamAllocation.newStream(StreamAllocation.java:100)
>  at 
> okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.java:42)
>  at 
> okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
>  at 
> okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
>  at 
> okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.java:93) 
> at 
> okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
>  at 
> okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
>  at 
> okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.java:93) 
> at 
> okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
>  at 
> okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.java:120)
>  at 
> okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
>  at 
> okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
>  at 
> io.fabric8.kubernetes.client.utils.HttpClientUtils$2.intercept(HttpClientUtils.java:90)
>  at 
> okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
>  at 
> okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
>  at okhttp3.RealCall.getResponseWithInterceptorChain(RealCall.java:185) at 
> okhttp3.RealCall$AsyncCall.execute(RealCall.java:135) at 
> okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32) at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  at java.lang.Thread.run(Thread.java:748) Caused by: 
> sun.security.validator.ValidatorException: PKIX path building failed: 
> sun.security.provider.certpath.SunCertPathBuilderException: unable to find 
> valid certification path to 

[jira] [Updated] (SPARK-24227) Not able to submit spark job to kubernetes on 2.3

2019-09-02 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-24227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-24227:
--
Component/s: (was: Spark Submit)
 (was: Spark Core)
 Kubernetes

> Not able to submit spark job to kubernetes on 2.3
> -
>
> Key: SPARK-24227
> URL: https://issues.apache.org/jira/browse/SPARK-24227
> Project: Spark
>  Issue Type: Bug
>  Components: Kubernetes
>Affects Versions: 2.3.0
>Reporter: Felipe Cavalcanti
>Priority: Major
>
> Hi, I'm trying to submit a Spark job to Kubernetes with no success. I 
> followed the steps at 
> [https://spark.apache.org/docs/latest/running-on-kubernetes.html], and when 
> I run:
>  
> {code:java}
> bin/spark-submit \
>   --master k8s://https://${host}:${port} \
>   --deploy-mode cluster \ 
>   --name jaeger-spark \
>   --class io.jaegertracing.spark.dependencies.DependenciesSparkJob \
>   --conf spark.executor.instances=5 \
>   --conf spark.kubernetes.container.image=bla/jaeger-deps-spark:latest\
>   --conf spark.kubernetes.namespace=spark \
>   local:///opt/spark/jars/jaeger-spark-dependencies-0.0.1-SNAPSHOT.jar
> {code}
>  
> I'm getting the following stack trace:
> {code:java}
> 2018-05-09 17:06:02 WARN WatchConnectionManager:192 - Exec Failure 
> javax.net.ssl.SSLHandshakeException: 
> sun.security.validator.ValidatorException: PKIX path building failed: 
> sun.security.provider.certpath.SunCertPathBuilderException: unable to find 
> valid certification path to requested target at 
> sun.security.ssl.Alerts.getSSLException(Alerts.java:192) at 
> sun.security.ssl.SSLSocketImpl.fatal(SSLSocketImpl.java:1949) at 
> sun.security.ssl.Handshaker.fatalSE(Handshaker.java:302) at 
> sun.security.ssl.Handshaker.fatalSE(Handshaker.java:296) at 
> sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1514)
>  at 
> sun.security.ssl.ClientHandshaker.processMessage(ClientHandshaker.java:216) 
> at sun.security.ssl.Handshaker.processLoop(Handshaker.java:1026) at 
> sun.security.ssl.Handshaker.process_record(Handshaker.java:961) at 
> sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1062) at 
> sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1375)
>  at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1403) at 
> sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1387) at 
> okhttp3.internal.connection.RealConnection.connectTls(RealConnection.java:281)
>  at 
> okhttp3.internal.connection.RealConnection.establishProtocol(RealConnection.java:251)
>  at 
> okhttp3.internal.connection.RealConnection.connect(RealConnection.java:151) 
> at 
> okhttp3.internal.connection.StreamAllocation.findConnection(StreamAllocation.java:195)
>  at 
> okhttp3.internal.connection.StreamAllocation.findHealthyConnection(StreamAllocation.java:121)
>  at 
> okhttp3.internal.connection.StreamAllocation.newStream(StreamAllocation.java:100)
>  at 
> okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.java:42)
>  at 
> okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
>  at 
> okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
>  at 
> okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.java:93) 
> at 
> okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
>  at 
> okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
>  at 
> okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.java:93) 
> at 
> okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
>  at 
> okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.java:120)
>  at 
> okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
>  at 
> okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
>  at 
> io.fabric8.kubernetes.client.utils.HttpClientUtils$2.intercept(HttpClientUtils.java:90)
>  at 
> okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
>  at 
> okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
>  at okhttp3.RealCall.getResponseWithInterceptorChain(RealCall.java:185) at 
> okhttp3.RealCall$AsyncCall.execute(RealCall.java:135) at 
> okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32) at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  at java.lang.Thread.run(Thread.java:748) Caused by: 
> sun.security.validator.ValidatorException: PKIX path building failed: 
> 

[jira] [Assigned] (SPARK-28951) Add release announce template

2019-09-02 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-28951:
-

Assignee: Dongjoon Hyun

> Add release announce template
> -
>
> Key: SPARK-28951
> URL: https://issues.apache.org/jira/browse/SPARK-28951
> Project: Spark
>  Issue Type: Task
>  Components: Project Infra
>Affects Versions: 2.4.5, 3.0.0
>Reporter: Dongjoon Hyun
>Assignee: Dongjoon Hyun
>Priority: Trivial
>




--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-28951) Add release announce template

2019-09-02 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-28951.
---
Fix Version/s: 3.0.0
   2.4.5
   Resolution: Fixed

Issue resolved by pull request 25656
[https://github.com/apache/spark/pull/25656]

> Add release announce template
> -
>
> Key: SPARK-28951
> URL: https://issues.apache.org/jira/browse/SPARK-28951
> Project: Spark
>  Issue Type: Task
>  Components: Project Infra
>Affects Versions: 2.4.5, 3.0.0
>Reporter: Dongjoon Hyun
>Assignee: Dongjoon Hyun
>Priority: Trivial
> Fix For: 2.4.5, 3.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-28372) Document Spark WEB UI

2019-09-02 Thread Pablo Langa Blanco (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-28372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16921026#comment-16921026
 ] 

Pablo Langa Blanco commented on SPARK-28372:


[~smilegator] We have a Streaming tab section in the documentation, but it has very 
little content. Do you think we should open a new issue to complete it?

> Document Spark WEB UI
> -
>
> Key: SPARK-28372
> URL: https://issues.apache.org/jira/browse/SPARK-28372
> Project: Spark
>  Issue Type: Umbrella
>  Components: Documentation, Web UI
>Affects Versions: 3.0.0
>Reporter: Xiao Li
>Priority: Major
>
> Spark web UIs are being used to monitor the status and resource consumption 
> of your Spark applications and clusters. However, we do not have the 
> corresponding document. It is hard for end users to use and understand them. 



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-28373) Document JDBC/ODBC Server page

2019-09-02 Thread Pablo Langa Blanco (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-28373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16921019#comment-16921019
 ] 

Pablo Langa Blanco commented on SPARK-28373:


[~podongfeng] OK, I can take care of it.

> Document JDBC/ODBC Server page
> --
>
> Key: SPARK-28373
> URL: https://issues.apache.org/jira/browse/SPARK-28373
> Project: Spark
>  Issue Type: Sub-task
>  Components: Documentation, Web UI
>Affects Versions: 3.0.0
>Reporter: Xiao Li
>Priority: Major
>
> !https://user-images.githubusercontent.com/5399861/60809590-9dcf2500-a1bd-11e9-826e-33729bb97daf.png|width=1720,height=503!
>  
> [https://github.com/apache/spark/pull/25062] added new columns, CLOSE TIME 
> and EXECUTION TIME. It is hard to understand the difference between them. We need 
> to document them; otherwise, it is hard for end users to understand them.
>  



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-27733) Upgrade to Avro 1.9.x

2019-09-02 Thread Dongjoon Hyun (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-27733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16921009#comment-16921009
 ] 

Dongjoon Hyun commented on SPARK-27733:
---

Great! Thank you for the follow-ups, [~Fokko]!

> Upgrade to Avro 1.9.x
> -
>
> Key: SPARK-27733
> URL: https://issues.apache.org/jira/browse/SPARK-27733
> Project: Spark
>  Issue Type: Improvement
>  Components: Build, SQL
>Affects Versions: 3.0.0
>Reporter: Ismaël Mejía
>Priority: Minor
>
> Avro 1.9.0 was released with many nice features, including reduced size (1 MB 
> less), removed dependencies (no Paranamer, no shaded Guava), and security 
> updates, so it is probably a worthwhile upgrade.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-28952) Getting error using 'LinearRegression' in Spark 2.3.4

2019-09-02 Thread Sandeep Singh (Jira)
Sandeep Singh created SPARK-28952:
-

 Summary: Getting error using 'LinearRegression' in Spark 2.3.4
 Key: SPARK-28952
 URL: https://issues.apache.org/jira/browse/SPARK-28952
 Project: Spark
  Issue Type: Bug
  Components: ML
Affects Versions: 2.4.3
Reporter: Sandeep Singh


Getting the following error while fitting 'LinearRegression':

 

File "C:\Spark\spark-2.4.3-bin-hadoop2.7\python\pyspark\sql\utils.py", line 79, 
in deco
 raise IllegalArgumentException(s.split(': ', 1)[1], stackTrace)

IllegalArgumentException: 'requirement failed: Column features must be of type 
struct<type:tinyint,size:int,indices:array<int>,values:array<double>> but was 
actually struct<type:tinyint,size:int,indices:array<int>,values:array<double>>.'
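
The expected and actual schemas in the error above print identically, which is the usual symptom of the features column being backed by the legacy org.apache.spark.mllib.linalg vectors while spark.ml's LinearRegression expects org.apache.spark.ml.linalg vectors. A minimal Scala sketch of building a features column that LinearRegression accepts (the reporter's traceback is from PySpark; the column names and data below are hypothetical and only illustrate the expected column type):

{code:scala}
import org.apache.spark.ml.feature.VectorAssembler
import org.apache.spark.ml.regression.LinearRegression

// Hypothetical toy data standing in for the reporter's dataset.
val training = spark.createDataFrame(Seq(
  (1.0, 2.0, 3.0, 10.0),
  (2.0, 4.0, 6.0, 20.0),
  (3.0, 6.0, 9.0, 30.0)
)).toDF("f1", "f2", "f3", "label")

// VectorAssembler emits the org.apache.spark.ml.linalg VectorUDT that spark.ml
// estimators require; a column built from org.apache.spark.mllib.linalg vectors
// carries the legacy UDT and fails the requirement check quoted above.
val assembled = new VectorAssembler()
  .setInputCols(Array("f1", "f2", "f3"))
  .setOutputCol("features")
  .transform(training)

val model = new LinearRegression().setLabelCol("label").fit(assembled)
{code}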



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-28951) Add release announce template

2019-09-02 Thread Dongjoon Hyun (Jira)
Dongjoon Hyun created SPARK-28951:
-

 Summary: Add release announce template
 Key: SPARK-28951
 URL: https://issues.apache.org/jira/browse/SPARK-28951
 Project: Spark
  Issue Type: Task
  Components: Project Infra
Affects Versions: 2.4.5, 3.0.0
Reporter: Dongjoon Hyun






--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-28912) MatchError exception in CheckpointWriteHandler

2019-09-02 Thread Aleksandr Kashkirov (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-28912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16920979#comment-16920979
 ] 

Aleksandr Kashkirov commented on SPARK-28912:
-

Steps to reproduce the error:
 # Start Hadoop in a pseudo-distributed mode.
 # In another terminal run command  {{nc -lk }}
 # In the Spark shell execute the following statements:
{code:java}
scala> val ssc = new StreamingContext(sc, Seconds(30))
ssc: org.apache.spark.streaming.StreamingContext = 
org.apache.spark.streaming.StreamingContext@376fd14f

scala> ssc.checkpoint("hdfs://localhost:9000/checkpoint-01")   

scala> val lines = ssc.socketTextStream("localhost", )
lines: org.apache.spark.streaming.dstream.ReceiverInputDStream[String] = 
org.apache.spark.streaming.dstream.SocketInputDStream@39b7d031
  
scala> val words = lines.flatMap(_.split(" "))
words: org.apache.spark.streaming.dstream.DStream[String] = 
org.apache.spark.streaming.dstream.FlatMappedDStream@637ae337   
   
scala> val pairs = words.map(word => (word, 1))  
pairs: org.apache.spark.streaming.dstream.DStream[(String, Int)] = 
org.apache.spark.streaming.dstream.MappedDStream@523d07cc
 
scala> val wordCounts = pairs.reduceByKey(_ + _)   
wordCounts: org.apache.spark.streaming.dstream.DStream[(String, Int)] = 
org.apache.spark.streaming.dstream.ShuffledDStream@3c62183b
   
scala> wordCounts.print()
   
scala> ssc.start()   

scala> ssc.awaitTermination()
{code}
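
The directory name appears to be the trigger: getCheckpointFiles matches candidate paths against the checkpoint-file pattern using the full path string, so a directory literally named "checkpoint-<digits>" lets the temporary UUID-named file through, and sortFunc (Checkpoint.scala:121 in the trace below) then fails to match that name. Until the matching is hardened, a sketch of the obvious workaround, assuming the rest of the session above stays unchanged:

{code:scala}
// Likely workaround (sketch): use a checkpoint directory whose name does not
// itself look like a checkpoint file ("checkpoint-" followed by digits), so the
// temporary UUID-named files inside it are no longer mistaken for checkpoints.
ssc.checkpoint("hdfs://localhost:9000/streaming-checkpoints")
{code}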

> MatchError exception in CheckpointWriteHandler
> --
>
> Key: SPARK-28912
> URL: https://issues.apache.org/jira/browse/SPARK-28912
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.3.0, 2.3.2
>Reporter: Aleksandr Kashkirov
>Priority: Minor
>
> Setting checkpoint directory name to "checkpoint-" plus some digits (e.g. 
> "checkpoint-01") results in the following error:
> {code:java}
> Exception in thread "pool-32-thread-1" scala.MatchError: 
> 0523a434-0daa-4ea6-a050-c4eb3c557d8c (of class java.lang.String) 
>  at 
> org.apache.spark.streaming.Checkpoint$.org$apache$spark$streaming$Checkpoint$$sortFunc$1(Checkpoint.scala:121)
>  
>  at 
> org.apache.spark.streaming.Checkpoint$$anonfun$getCheckpointFiles$1.apply(Checkpoint.scala:132)
>  
>  at 
> org.apache.spark.streaming.Checkpoint$$anonfun$getCheckpointFiles$1.apply(Checkpoint.scala:132)
>  
>  at scala.math.Ordering$$anon$9.compare(Ordering.scala:200) 
>  at java.util.TimSort.countRunAndMakeAscending(TimSort.java:355) 
>  at java.util.TimSort.sort(TimSort.java:234) 
>  at java.util.Arrays.sort(Arrays.java:1438) 
>  at scala.collection.SeqLike$class.sorted(SeqLike.scala:648) 
>  at scala.collection.mutable.ArrayOps$ofRef.sorted(ArrayOps.scala:186) 
>  at scala.collection.SeqLike$class.sortWith(SeqLike.scala:601) 
>  at scala.collection.mutable.ArrayOps$ofRef.sortWith(ArrayOps.scala:186) 
>  at 
> org.apache.spark.streaming.Checkpoint$.getCheckpointFiles(Checkpoint.scala:132)
>  
>  at 
> org.apache.spark.streaming.CheckpointWriter$CheckpointWriteHandler.run(Checkpoint.scala:262)
>  
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  
>  at java.lang.Thread.run(Thread.java:748){code}



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-28463) Thriftserver throws java.math.BigDecimal incompatible with org.apache.hadoop.hive.common.type.HiveDecimal

2019-09-02 Thread Yuming Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuming Wang reassigned SPARK-28463:
---

Assignee: Yuming Wang

> Thriftserver throws java.math.BigDecimal incompatible with 
> org.apache.hadoop.hive.common.type.HiveDecimal
> -
>
> Key: SPARK-28463
> URL: https://issues.apache.org/jira/browse/SPARK-28463
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Yuming Wang
>Assignee: Yuming Wang
>Priority: Major
>
> How to reproduce this issue:
> {code:sh}
> build/sbt clean package -Phive -Phive-thriftserver -Phadoop-3.2
> export SPARK_PREPEND_CLASSES=true
> sbin/start-thriftserver.sh
> [root@spark-3267648 spark]# bin/beeline -u 
> jdbc:hive2://localhost:1/default -e "select cast(1 as decimal(38, 18));"
> Connecting to jdbc:hive2://localhost:1/default
> Connected to: Spark SQL (version 3.0.0-SNAPSHOT)
> Driver: Hive JDBC (version 2.3.5)
> Transaction isolation: TRANSACTION_REPEATABLE_READ
> Error: java.lang.ClassCastException: java.math.BigDecimal incompatible with 
> org.apache.hadoop.hive.common.type.HiveDecimal (state=,code=0)
> Closing: 0: jdbc:hive2://localhost:1/default
> {code}
> Logs:
> {noformat}
> java.lang.RuntimeException: java.lang.ClassCastException: 
> java.math.BigDecimal incompatible with 
> org.apache.hadoop.hive.common.type.HiveDecimal
>   at 
> org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:83)
>   at 
> org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36)
>   at 
> org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:63)
>   at 
> java.security.AccessController.doPrivileged(AccessController.java:770)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
>   at 
> org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59)
>   at com.sun.proxy.$Proxy31.fetchResults(Unknown Source)
>   at 
> org.apache.hive.service.cli.CLIService.fetchResults(CLIService.java:521)
>   at 
> org.apache.hive.service.cli.thrift.ThriftCLIService.FetchResults(ThriftCLIService.java:623)
>   at 
> org.apache.hive.service.rpc.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1717)
>   at 
> org.apache.hive.service.rpc.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1702)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:38)
>   at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
>   at 
> org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:53)
>   at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:310)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:819)
> Caused by: java.lang.ClassCastException: java.math.BigDecimal incompatible 
> with org.apache.hadoop.hive.common.type.HiveDecimal
>   at 
> org.apache.hive.service.cli.ColumnBasedSet.addRow(ColumnBasedSet.java:111)
>   at 
> org.apache.hive.service.cli.ColumnBasedSet.addRow(ColumnBasedSet.java:42)
>   at 
> org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.$anonfun$getNextRowSet$1(SparkExecuteStatementOperation.scala:150)
>   at 
> org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$Lambda$1921.9054D6E0.apply(Unknown
>  Source)
>   at 
> org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.withSchedulerPool(SparkExecuteStatementOperation.scala:298)
>   at 
> org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.getNextRowSet(SparkExecuteStatementOperation.scala:112)
>   at 
> org.apache.hive.service.cli.operation.OperationManager.getOperationNextRowSet(OperationManager.java:244)
>   at 
> org.apache.hive.service.cli.session.HiveSessionImpl.fetchResults(HiveSessionImpl.java:799)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78)
>   ... 18 more
> {noformat}
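
For contrast, the same cast evaluated through a plain Spark session never reaches HiveServer2's ColumnBasedSet row marshalling, which is where the ClassCastException above is thrown, so a quick spark-shell sanity check against the same build is expected to succeed:

{code:scala}
// Sanity check outside the Thrift server path; the failure above happens while
// HiveServer2 converts result rows into a ColumnBasedSet, not in the Spark SQL
// engine itself.
spark.sql("SELECT CAST(1 AS DECIMAL(38, 18)) AS d").show(false)
{code}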



--
This message was sent by Atlassian Jira

[jira] [Resolved] (SPARK-28463) Thriftserver throws java.math.BigDecimal incompatible with org.apache.hadoop.hive.common.type.HiveDecimal

2019-09-02 Thread Yuming Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuming Wang resolved SPARK-28463.
-
Resolution: Fixed

> Thriftserver throws java.math.BigDecimal incompatible with 
> org.apache.hadoop.hive.common.type.HiveDecimal
> -
>
> Key: SPARK-28463
> URL: https://issues.apache.org/jira/browse/SPARK-28463
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Yuming Wang
>Assignee: Yuming Wang
>Priority: Major
>
> How to reproduce this issue:
> {code:sh}
> build/sbt clean package -Phive -Phive-thriftserver -Phadoop-3.2
> export SPARK_PREPEND_CLASSES=true
> sbin/start-thriftserver.sh
> [root@spark-3267648 spark]# bin/beeline -u 
> jdbc:hive2://localhost:1/default -e "select cast(1 as decimal(38, 18));"
> Connecting to jdbc:hive2://localhost:1/default
> Connected to: Spark SQL (version 3.0.0-SNAPSHOT)
> Driver: Hive JDBC (version 2.3.5)
> Transaction isolation: TRANSACTION_REPEATABLE_READ
> Error: java.lang.ClassCastException: java.math.BigDecimal incompatible with 
> org.apache.hadoop.hive.common.type.HiveDecimal (state=,code=0)
> Closing: 0: jdbc:hive2://localhost:1/default
> {code}
> Logs:
> {noformat}
> java.lang.RuntimeException: java.lang.ClassCastException: 
> java.math.BigDecimal incompatible with 
> org.apache.hadoop.hive.common.type.HiveDecimal
>   at 
> org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:83)
>   at 
> org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36)
>   at 
> org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:63)
>   at 
> java.security.AccessController.doPrivileged(AccessController.java:770)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
>   at 
> org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59)
>   at com.sun.proxy.$Proxy31.fetchResults(Unknown Source)
>   at 
> org.apache.hive.service.cli.CLIService.fetchResults(CLIService.java:521)
>   at 
> org.apache.hive.service.cli.thrift.ThriftCLIService.FetchResults(ThriftCLIService.java:623)
>   at 
> org.apache.hive.service.rpc.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1717)
>   at 
> org.apache.hive.service.rpc.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1702)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:38)
>   at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
>   at 
> org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:53)
>   at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:310)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:819)
> Caused by: java.lang.ClassCastException: java.math.BigDecimal incompatible 
> with org.apache.hadoop.hive.common.type.HiveDecimal
>   at 
> org.apache.hive.service.cli.ColumnBasedSet.addRow(ColumnBasedSet.java:111)
>   at 
> org.apache.hive.service.cli.ColumnBasedSet.addRow(ColumnBasedSet.java:42)
>   at 
> org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.$anonfun$getNextRowSet$1(SparkExecuteStatementOperation.scala:150)
>   at 
> org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$Lambda$1921.9054D6E0.apply(Unknown
>  Source)
>   at 
> org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.withSchedulerPool(SparkExecuteStatementOperation.scala:298)
>   at 
> org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.getNextRowSet(SparkExecuteStatementOperation.scala:112)
>   at 
> org.apache.hive.service.cli.operation.OperationManager.getOperationNextRowSet(OperationManager.java:244)
>   at 
> org.apache.hive.service.cli.session.HiveSessionImpl.fetchResults(HiveSessionImpl.java:799)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78)
>   ... 18 more
> {noformat}



--
This message was sent by Atlassian Jira

[jira] [Updated] (SPARK-28463) Thriftserver throws java.math.BigDecimal incompatible with org.apache.hadoop.hive.common.type.HiveDecimal

2019-09-02 Thread Yuming Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuming Wang updated SPARK-28463:

Fix Version/s: 3.0.0

> Thriftserver throws java.math.BigDecimal incompatible with 
> org.apache.hadoop.hive.common.type.HiveDecimal
> -
>
> Key: SPARK-28463
> URL: https://issues.apache.org/jira/browse/SPARK-28463
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Yuming Wang
>Assignee: Yuming Wang
>Priority: Major
> Fix For: 3.0.0
>
>
> How to reproduce this issue:
> {code:sh}
> build/sbt clean package -Phive -Phive-thriftserver -Phadoop-3.2
> export SPARK_PREPEND_CLASSES=true
> sbin/start-thriftserver.sh
> [root@spark-3267648 spark]# bin/beeline -u 
> jdbc:hive2://localhost:1/default -e "select cast(1 as decimal(38, 18));"
> Connecting to jdbc:hive2://localhost:1/default
> Connected to: Spark SQL (version 3.0.0-SNAPSHOT)
> Driver: Hive JDBC (version 2.3.5)
> Transaction isolation: TRANSACTION_REPEATABLE_READ
> Error: java.lang.ClassCastException: java.math.BigDecimal incompatible with 
> org.apache.hadoop.hive.common.type.HiveDecimal (state=,code=0)
> Closing: 0: jdbc:hive2://localhost:1/default
> {code}
> Logs:
> {noformat}
> java.lang.RuntimeException: java.lang.ClassCastException: 
> java.math.BigDecimal incompatible with 
> org.apache.hadoop.hive.common.type.HiveDecimal
>   at 
> org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:83)
>   at 
> org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36)
>   at 
> org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:63)
>   at 
> java.security.AccessController.doPrivileged(AccessController.java:770)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
>   at 
> org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59)
>   at com.sun.proxy.$Proxy31.fetchResults(Unknown Source)
>   at 
> org.apache.hive.service.cli.CLIService.fetchResults(CLIService.java:521)
>   at 
> org.apache.hive.service.cli.thrift.ThriftCLIService.FetchResults(ThriftCLIService.java:623)
>   at 
> org.apache.hive.service.rpc.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1717)
>   at 
> org.apache.hive.service.rpc.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1702)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:38)
>   at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
>   at 
> org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:53)
>   at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:310)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:819)
> Caused by: java.lang.ClassCastException: java.math.BigDecimal incompatible 
> with org.apache.hadoop.hive.common.type.HiveDecimal
>   at 
> org.apache.hive.service.cli.ColumnBasedSet.addRow(ColumnBasedSet.java:111)
>   at 
> org.apache.hive.service.cli.ColumnBasedSet.addRow(ColumnBasedSet.java:42)
>   at 
> org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.$anonfun$getNextRowSet$1(SparkExecuteStatementOperation.scala:150)
>   at 
> org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$Lambda$1921.9054D6E0.apply(Unknown
>  Source)
>   at 
> org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.withSchedulerPool(SparkExecuteStatementOperation.scala:298)
>   at 
> org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.getNextRowSet(SparkExecuteStatementOperation.scala:112)
>   at 
> org.apache.hive.service.cli.operation.OperationManager.getOperationNextRowSet(OperationManager.java:244)
>   at 
> org.apache.hive.service.cli.session.HiveSessionImpl.fetchResults(HiveSessionImpl.java:799)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78)
>   ... 18 more
> {noformat}



--
This message was sent 

[jira] [Created] (SPARK-28950) FollowingUp: Change whereClause to be optional in DELETE

2019-09-02 Thread Xianyin Xin (Jira)
Xianyin Xin created SPARK-28950:
---

 Summary: FollowingUp: Change whereClause to be optional in DELETE
 Key: SPARK-28950
 URL: https://issues.apache.org/jira/browse/SPARK-28950
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 3.0.0
Reporter: Xianyin Xin






--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-28950) [SPARK-28351] FollowingUp: Change whereClause to be optional in DELETE

2019-09-02 Thread Xianyin Xin (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xianyin Xin updated SPARK-28950:

Summary: [SPARK-28351] FollowingUp: Change whereClause to be optional in 
DELETE  (was: FollowingUp: Change whereClause to be optional in DELETE)

> [SPARK-28351] FollowingUp: Change whereClause to be optional in DELETE
> --
>
> Key: SPARK-28950
> URL: https://issues.apache.org/jira/browse/SPARK-28950
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Xianyin Xin
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-28705) drop the tables in AnalysisExternalCatalogSuite after the testcase execution

2019-09-02 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon resolved SPARK-28705.
--
Fix Version/s: 3.0.0
   Resolution: Fixed

Issue resolved by pull request 25427
[https://github.com/apache/spark/pull/25427]

> drop the tables in AnalysisExternalCatalogSuite after the testcase execution
> 
>
> Key: SPARK-28705
> URL: https://issues.apache.org/jira/browse/SPARK-28705
> Project: Spark
>  Issue Type: Test
>  Components: SQL
>Affects Versions: 2.4.3
>Reporter: Sandeep Katta
>Assignee: Sandeep Katta
>Priority: Trivial
> Fix For: 3.0.0
>
>
> drop the tables after each test case is executed



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-28705) drop the tables in AnalysisExternalCatalogSuite after the testcase execution

2019-09-02 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon reassigned SPARK-28705:


Assignee: Sandeep Katta

> drop the tables in AnalysisExternalCatalogSuite after the testcase execution
> 
>
> Key: SPARK-28705
> URL: https://issues.apache.org/jira/browse/SPARK-28705
> Project: Spark
>  Issue Type: Test
>  Components: SQL
>Affects Versions: 2.4.3
>Reporter: Sandeep Katta
>Assignee: Sandeep Katta
>Priority: Trivial
>
> drop the tables after each test case is executed



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-28949) Kubernetes CGroup leaking leads to Spark Pods hang in Pending status

2019-09-02 Thread Kent Yao (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kent Yao updated SPARK-28949:
-
Description: 
After running Spark on k8s for a few days, some kubelets fail to create pods, 
logging warning messages like
{code:java}
\"mkdir 
/sys/fs/cgroup/memory/kubepods/burstable/podb4a04361-ca89-11e9-a224-6c92bf35392e/1d5aed3ea20b246ec4f121f778f48c493e3e8678f2afe58a96c15180176e:
 no space left on device\"
{code}
The k8s cluster and the kubelet node otherwise have free resources.

These pods stay zombied for days before we manually notice and terminate them.

It may be relatively easy to identify zombied driver pods, but it is quite 
inconvenient to identify executor pods when Spark applications scale out.

This is probably related to [https://github.com/kubernetes/kubernetes/issues/70324]

Do we need a timeout, retry or failover mechanism for Spark to handle these 
kinds of k8s kernel issues?

 

 

  was:
After running Spark on k8s for a few days, some kubelet fails to create pod 
caused by warning message like
{code:java}
\"mkdir 
/sys/fs/cgroup/memory/kubepods/burstable/podb4a04361-ca89-11e9-a224-6c92bf35392e/1d5aed3ea20b246ec4f121f778f48c493e3e8678f2afe58a96c15180176e:
 no space left on device\"
{code}
The k8s cluster and the kubelet node are free.

These pods zombie over days before we manually notify and terminate them. Maybe 
it

is a little bit 

This probably related to [https://github.com/kubernetes/kubernetes/issues/70324]

Do we need a timeout, retry or failover mechanism for Spark to handle these 
kinds of k8s kernel issues?

 

 


> Kubernetes CGroup leaking leads to Spark Pods hang in Pending status
> 
>
> Key: SPARK-28949
> URL: https://issues.apache.org/jira/browse/SPARK-28949
> Project: Spark
>  Issue Type: Bug
>  Components: Kubernetes
>Affects Versions: 2.3.3, 2.4.4
>Reporter: Kent Yao
>Priority: Major
> Attachments: describe-driver-pod.txt, describe-executor-pod.txt
>
>
> After running Spark on k8s for a few days, some kubelets fail to create pods, 
> logging warning messages like
> {code:java}
> \"mkdir 
> /sys/fs/cgroup/memory/kubepods/burstable/podb4a04361-ca89-11e9-a224-6c92bf35392e/1d5aed3ea20b246ec4f121f778f48c493e3e8678f2afe58a96c15180176e:
>  no space left on device\"
> {code}
> The k8s cluster and the kubelet node otherwise have free resources.
> These pods stay zombied for days before we manually notice and terminate them.
> It may be relatively easy to identify zombied driver pods, but it is quite 
> inconvenient to identify executor pods when Spark applications scale out.
> This is probably related to 
> [https://github.com/kubernetes/kubernetes/issues/70324]
> Do we need a timeout, retry or failover mechanism for Spark to handle these 
> kinds of k8s kernel issues?
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-28949) Kubernetes CGroup leaking leads to Spark Pods hang in Pending status

2019-09-02 Thread Kent Yao (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kent Yao updated SPARK-28949:
-
Attachment: describe-executor-pod.txt

> Kubernetes CGroup leaking leads to Spark Pods hang in Pending status
> 
>
> Key: SPARK-28949
> URL: https://issues.apache.org/jira/browse/SPARK-28949
> Project: Spark
>  Issue Type: Bug
>  Components: Kubernetes
>Affects Versions: 2.3.3, 2.4.4
>Reporter: Kent Yao
>Priority: Major
> Attachments: describe-driver-pod.txt, describe-executor-pod.txt
>
>
> After running Spark on k8s for a few days, some kubelet fails to create pod 
> caused by warning message like
> {code:java}
> \"mkdir 
> /sys/fs/cgroup/memory/kubepods/burstable/podb4a04361-ca89-11e9-a224-6c92bf35392e/1d5aed3ea20b246ec4f121f778f48c493e3e8678f2afe58a96c15180176e:
>  no space left on device\"
> {code}
> The k8s cluster and the kubelet node are free.
> These pods zombie over days before we manually notify and terminate them. 
> Maybe it
> is a little bit 
> This probably related to 
> [https://github.com/kubernetes/kubernetes/issues/70324]
> Do we need a timeout, retry or failover mechanism for Spark to handle these 
> kinds of k8s kernel issues?
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-28949) Kubernetes CGroup leaking leads to Spark Pods hang in Pending status

2019-09-02 Thread Kent Yao (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kent Yao updated SPARK-28949:
-
Attachment: describe-driver-pod.txt

> Kubernetes CGroup leaking leads to Spark Pods hang in Pending status
> 
>
> Key: SPARK-28949
> URL: https://issues.apache.org/jira/browse/SPARK-28949
> Project: Spark
>  Issue Type: Bug
>  Components: Kubernetes
>Affects Versions: 2.3.3, 2.4.4
>Reporter: Kent Yao
>Priority: Major
> Attachments: describe-driver-pod.txt, describe-executor-pod.txt
>
>
> After running Spark on k8s for a few days, some kubelet fails to create pod 
> caused by warning message like
> {code:java}
> \"mkdir 
> /sys/fs/cgroup/memory/kubepods/burstable/podb4a04361-ca89-11e9-a224-6c92bf35392e/1d5aed3ea20b246ec4f121f778f48c493e3e8678f2afe58a96c15180176e:
>  no space left on device\"
> {code}
> The k8s cluster and the kubelet node are free.
> These pods zombie over days before we manually notify and terminate them. 
> Maybe it
> is a little bit 
> This probably related to 
> [https://github.com/kubernetes/kubernetes/issues/70324]
> Do we need a timeout, retry or failover mechanism for Spark to handle these 
> kinds of k8s kernel issues?
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-28949) Kubernetes CGroup leaking leads to Spark Pods hang in Pending status

2019-09-02 Thread Kent Yao (Jira)
Kent Yao created SPARK-28949:


 Summary: Kubernetes CGroup leaking leads to Spark Pods hang in 
Pending status
 Key: SPARK-28949
 URL: https://issues.apache.org/jira/browse/SPARK-28949
 Project: Spark
  Issue Type: Bug
  Components: Kubernetes
Affects Versions: 2.4.4, 2.3.3
Reporter: Kent Yao


After running Spark on k8s for a few days, some kubelets fail to create pods, 
failing with a warning message like
{code:java}
"mkdir 
/sys/fs/cgroup/memory/kubepods/burstable/podb4a04361-ca89-11e9-a224-6c92bf35392e/1d5aed3ea20b246ec4f121f778f48c493e3e8678f2afe58a96c15180176e:
 no space left on device"
{code}
The k8s cluster and the kubelet node otherwise have free resources.

These pods stay zombied for days before we manually notice and terminate them. 
Maybe it is somewhat easy to identify zombied driver pods, but it is quite 
inconvenient to identify executor pods when Spark applications scale out.

This is probably related to [https://github.com/kubernetes/kubernetes/issues/70324]

Do we need a timeout, retry, or failover mechanism for Spark to handle these 
kinds of k8s kernel issues?

 

 



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-28025) HDFSBackedStateStoreProvider should not leak .crc files

2019-09-02 Thread Gabor Somogyi (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-28025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16920706#comment-16920706
 ] 

Gabor Somogyi commented on SPARK-28025:
---

[~ste...@apache.org] I think adding a "file.bytes-per-checksum = 0" option 
would be a workaround (at least here). The whole point of choosing 
ChecksumFileSystem is to have checksums.

> HDFSBackedStateStoreProvider should not leak .crc files 
> 
>
> Key: SPARK-28025
> URL: https://issues.apache.org/jira/browse/SPARK-28025
> Project: Spark
>  Issue Type: Bug
>  Components: Structured Streaming
>Affects Versions: 2.4.3
> Environment: Spark 2.4.3
> Kubernetes 1.11(?) (OpenShift)
> StateStore storage on a mounted PVC. Viewed as a local filesystem by the 
> `FileContextBasedCheckpointFileManager` : 
> {noformat}
> scala> glusterfm.isLocal
> res17: Boolean = true{noformat}
>Reporter: Gerard Maas
>Assignee: Jungtaek Lim
>Priority: Major
> Fix For: 2.4.4, 3.0.0
>
>
> The HDFSBackedStateStoreProvider when using the default CheckpointFileManager 
> is leaving '.crc' files behind. There's a .crc file created for each 
> `atomicFile` operation of the CheckpointFileManager.
> Over time, the number of files becomes very large. It makes the state store 
> file system constantly increase in size and, in our case, deteriorates the 
> file system performance.
> Here's a sample of one of our spark storage volumes after 2 days of execution 
> (4 stateful streaming jobs, each on a different sub-dir):
>  # 
> {noformat}
> Total files in PVC (used for checkpoints and state store)
> $find . | wc -l
> 431796
> # .crc files
> $find . -name "*.crc" | wc -l
> 418053{noformat}
> With each .crc file taking one storage block, the used storage runs into the 
> GBs of data.
> These jobs are running on Kubernetes. Our shared storage provider, GlusterFS, 
> shows serious performance deterioration with this large number of files:
> {noformat}
> DEBUG HDFSBackedStateStoreProvider: fetchFiles() took 29164ms{noformat}
>  
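Before upgrading to a release with the fix (2.4.4 / 3.0.0 per this issue), one interim mitigation is to periodically sweep orphaned .crc sidecars out of the checkpoint/state directory. The following is only a hedged sketch using the Hadoop FileSystem API, not something Spark does itself; it relies on the standard ChecksumFileSystem naming convention where a data file `name` has a sidecar `.name.crc`.
{code:scala}
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

object CrcSweeper {
  def main(args: Array[String]): Unit = {
    val root = new Path(args(0))                      // checkpoint / state store root
    val fs: FileSystem = root.getFileSystem(new Configuration())
    val files = fs.listFiles(root, /* recursive = */ true)
    var removed = 0
    while (files.hasNext) {
      val p = files.next().getPath
      if (p.getName.startsWith(".") && p.getName.endsWith(".crc")) {
        // a .crc sidecar is orphaned when its data file no longer exists
        val dataFile = new Path(p.getParent, p.getName.stripPrefix(".").stripSuffix(".crc"))
        if (!fs.exists(dataFile)) {
          fs.delete(p, false)
          removed += 1
        }
      }
    }
    println(s"Removed $removed orphaned .crc files under $root")
  }
}
{code}
Run it only while the streaming jobs are stopped, since it races with the state store writer otherwise.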



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-28906) `bin/spark-submit --version` shows incorrect info

2019-09-02 Thread Kazuaki Ishizaki (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-28906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16920594#comment-16920594
 ] 

Kazuaki Ishizaki edited comment on SPARK-28906 at 9/2/19 8:48 AM:
--

For the information from the {{git}} command, the {{.git}} directory is deleted after 
{{git clone}} is executed. As a result, we cannot get information from the {{git}} 
command. When I tentatively stop deleting the {{.git}} directory, 
{{spark-version-info.properties}} includes the correct information, for example:
{code}
version=2.3.4
user=ishizaki
revision=8c6f8150f3c6298ff4e1c7e06028f12d7eaf0210
branch=HEAD
date=2019-09-02T02:31:25Z
url=https://gitbox.apache.org/repos/asf/spark.git
{code}



was (Author: kiszk):
For the information from the {{git}} command, the {{.git}} directory is deleted after 
{{git clone}} is executed. When I tentatively stop deleting the {{.git}} directory, 
{{spark-version-info.properties}} includes the correct information, for example:
{code}
version=2.3.4
user=ishizaki
revision=8c6f8150f3c6298ff4e1c7e06028f12d7eaf0210
branch=HEAD
date=2019-09-02T02:31:25Z
url=https://gitbox.apache.org/repos/asf/spark.git
{code}
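For context, the properties shown above are the kind of output one can only produce while the {{.git}} directory is still present. A hedged sketch of generating such a file from git metadata (this is not Spark's actual build script, just an illustration of the mechanism):
{code:scala}
import java.io.PrintWriter
import scala.sys.process._
import scala.util.Try

object WriteVersionInfo {
  // run a git command and fall back to an empty value when .git is missing
  private def git(args: String): String =
    Try(s"git $args".!!.trim).getOrElse("")

  def main(args: Array[String]): Unit = {
    val version = args.headOption.getOrElse("unknown")
    val props = Seq(
      "version"  -> version,
      "user"     -> sys.props.getOrElse("user.name", "unknown"),
      "revision" -> git("rev-parse HEAD"),
      "branch"   -> git("rev-parse --abbrev-ref HEAD"),
      "date"     -> java.time.Instant.now().toString,
      "url"      -> git("config --get remote.origin.url")
    )
    val out = new PrintWriter("spark-version-info.properties")
    try props.foreach { case (k, v) => out.println(s"$k=$v") }
    finally out.close()
  }
}
{code}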


> `bin/spark-submit --version` shows incorrect info
> -
>
> Key: SPARK-28906
> URL: https://issues.apache.org/jira/browse/SPARK-28906
> Project: Spark
>  Issue Type: Bug
>  Components: Project Infra
>Affects Versions: 2.3.1, 2.3.2, 2.3.3, 2.3.4, 2.4.4, 2.4.0, 2.4.1, 2.4.2, 
> 3.0.0, 2.4.3
>Reporter: Marcelo Vanzin
>Priority: Minor
> Attachments: image-2019-08-29-05-50-13-526.png
>
>
> Since Spark 2.3.1, `spark-submit` shows incorrect information.
> {code}
> $ bin/spark-submit --version
> Welcome to
>     __
>  / __/__  ___ _/ /__
> _\ \/ _ \/ _ `/ __/  '_/
>/___/ .__/\_,_/_/ /_/\_\   version 2.3.3
>   /_/
> Using Scala version 2.11.8, OpenJDK 64-Bit Server VM, 1.8.0_222
> Branch
> Compiled by user  on 2019-02-04T13:00:46Z
> Revision
> Url
> Type --help for more information.
> {code}



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-28916) Generated SpecificSafeProjection.apply method grows beyond 64 KB when use SparkSQL

2019-09-02 Thread Wenchen Fan (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-28916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16920692#comment-16920692
 ] 

Wenchen Fan commented on SPARK-28916:
-

To double-check, this is just an error message, not an actual exception, right?
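If it does surface as a real failure, a rough reproduction along the lines the reporter describes might look like the sketch below. The 5000-column width, local mode, and generated column names are assumptions for illustration, not the reporter's exact table.
{code:scala}
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object WideDescribe {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[2]")
      .appName("wide-describe-repro")
      .getOrCreate()

    // Build a DataFrame with ~5000 numeric columns, similar in spirit to a 5000-field table.
    val cols = (0 until 5000).map(i => (col("id") + lit(i)).as(s"c$i"))
    val wide = spark.range(10).select(cols: _*)

    // describe() generates very large projections over all columns,
    // which is where the 64 KB generated-method limit tends to be hit.
    wide.describe().show()

    spark.stop()
  }
}
{code}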

> Generated SpecificSafeProjection.apply method grows beyond 64 KB when use  
> SparkSQL
> ---
>
> Key: SPARK-28916
> URL: https://issues.apache.org/jira/browse/SPARK-28916
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.3.1, 2.4.3
>Reporter: MOBIN
>Priority: Major
>
> This can be reproduced with the following steps:
> 1. Create a table with 5000 fields
> 2. val data=spark.sql("select * from spark64kb limit 10");
> 3. data.describe()
> Then the following error occurs:
> {code:java}
> WARN scheduler.TaskSetManager: Lost task 0.0 in stage 1.0 (TID 0, localhost, 
> executor 1): org.codehaus.janino.InternalCompilerException: failed to 
> compile: org.codehaus.janino.InternalCompilerException: Compiling 
> "GeneratedClass": Code of method 
> "apply(Ljava/lang/Object;)Ljava/lang/Object;" of class 
> "org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificMutableProjection"
>  grows beyond 64 KB
>   at 
> org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$.org$apache$spark$sql$catalyst$expressions$codegen$CodeGenerator$$doCompile(CodeGenerator.scala:1298)
>   at 
> org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$$anon$1.load(CodeGenerator.scala:1376)
>   at 
> org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$$anon$1.load(CodeGenerator.scala:1373)
>   at 
> org.spark_project.guava.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3599)
>   at 
> org.spark_project.guava.cache.LocalCache$Segment.loadSync(LocalCache.java:2379)
>   at 
> org.spark_project.guava.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2342)
>   at 
> org.spark_project.guava.cache.LocalCache$Segment.get(LocalCache.java:2257)
>   at org.spark_project.guava.cache.LocalCache.get(LocalCache.java:4000)
>   at org.spark_project.guava.cache.LocalCache.getOrLoad(LocalCache.java:4004)
>   at 
> org.spark_project.guava.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4874)
>   at 
> org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$.compile(CodeGenerator.scala:1238)
>   at 
> org.apache.spark.sql.catalyst.expressions.codegen.GenerateMutableProjection$.create(GenerateMutableProjection.scala:143)
>   at 
> org.apache.spark.sql.catalyst.expressions.codegen.GenerateMutableProjection$.generate(GenerateMutableProjection.scala:44)
>   at 
> org.apache.spark.sql.execution.SparkPlan.newMutableProjection(SparkPlan.scala:385)
>   at 
> org.apache.spark.sql.execution.aggregate.SortAggregateExec$$anonfun$doExecute$1$$anonfun$3$$anonfun$4.apply(SortAggregateExec.scala:96)
>   at 
> org.apache.spark.sql.execution.aggregate.SortAggregateExec$$anonfun$doExecute$1$$anonfun$3$$anonfun$4.apply(SortAggregateExec.scala:95)
>   at 
> org.apache.spark.sql.execution.aggregate.AggregationIterator.generateProcessRow(AggregationIterator.scala:180)
>   at 
> org.apache.spark.sql.execution.aggregate.AggregationIterator.(AggregationIterator.scala:199)
>   at 
> org.apache.spark.sql.execution.aggregate.SortBasedAggregationIterator.(SortBasedAggregationIterator.scala:40)
>   at 
> org.apache.spark.sql.execution.aggregate.SortAggregateExec$$anonfun$doExecute$1$$anonfun$3.apply(SortAggregateExec.scala:86)
>   at 
> org.apache.spark.sql.execution.aggregate.SortAggregateExec$$anonfun$doExecute$1$$anonfun$3.apply(SortAggregateExec.scala:77)
>   at 
> org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndexInternal$1$$anonfun$12.apply(RDD.scala:823)
>   at 
> org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndexInternal$1$$anonfun$12.apply(RDD.scala:823)
>   at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
>   at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
>   at org.apache.spark.scheduler.Task.run(Task.scala:121)
>   at 
> org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:408)
>   at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: org.codehaus.janino.InternalCompilerException: Compiling 
> "GeneratedClass": Code of method 
> "apply(Ljava/lang/Object;)Ljava/lang/Object;" 

[jira] [Commented] (SPARK-27733) Upgrade to Avro 1.9.x

2019-09-02 Thread Fokko Driesprong (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-27733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16920685#comment-16920685
 ] 

Fokko Driesprong commented on SPARK-27733:
--

The regression issue has been resolved with the freshly released Avro 1.9.1. 
I'll look into the issues with the Hive dependency.

> Upgrade to Avro 1.9.x
> -
>
> Key: SPARK-27733
> URL: https://issues.apache.org/jira/browse/SPARK-27733
> Project: Spark
>  Issue Type: Improvement
>  Components: Build, SQL
>Affects Versions: 3.0.0
>Reporter: Ismaël Mejía
>Priority: Minor
>
> Avro 1.9.0 was released with many nice features, including reduced size (1 MB 
> less), removed dependencies (no paranamer, no shaded guava), and security 
> updates, so it is probably a worthwhile upgrade.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-28694) Add Java/Scala StructuredKerberizedKafkaWordCount examples

2019-09-02 Thread daile (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-28694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16920677#comment-16920677
 ] 

daile commented on SPARK-28694:
---

ok

> Add Java/Scala StructuredKerberizedKafkaWordCount examples
> --
>
> Key: SPARK-28694
> URL: https://issues.apache.org/jira/browse/SPARK-28694
> Project: Spark
>  Issue Type: Improvement
>  Components: Examples, Structured Streaming
>Affects Versions: 3.0.0
>Reporter: hong dongdong
>Priority: Minor
>
> Currently, the `StructuredKafkaWordCount` example does not support accessing Kafka 
> with Kerberos authentication. Add a parameter that indicates whether Kerberos is used.
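One possible shape for such a flag, as a hedged sketch only (the example eventually added by the PR may differ; the Kafka options shown are the standard "kafka."-prefixed pass-through client settings of the Kafka source):
{code:scala}
import org.apache.spark.sql.SparkSession

object StructuredKerberizedKafkaWordCountSketch {
  def main(args: Array[String]): Unit = {
    // args: <bootstrap-servers> <topics> <useKerberos: true|false>
    val Array(bootstrapServers, topics, useKerberos) = args.take(3)

    val spark = SparkSession.builder()
      .appName("StructuredKerberizedKafkaWordCountSketch")
      .getOrCreate()
    import spark.implicits._

    var reader = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", bootstrapServers)
      .option("subscribe", topics)

    if (useKerberos.toBoolean) {
      // Kafka client settings are passed through with the "kafka." prefix
      reader = reader
        .option("kafka.security.protocol", "SASL_PLAINTEXT")
        .option("kafka.sasl.kerberos.service.name", "kafka")
    }

    val counts = reader.load()
      .selectExpr("CAST(value AS STRING)")
      .as[String]
      .flatMap(_.split(" "))
      .groupBy("value")
      .count()

    counts.writeStream
      .outputMode("complete")
      .format("console")
      .start()
      .awaitTermination()
  }
}
{code}
A real run would also need the JAAS/keytab configuration delivered to the driver and executors, which is out of scope for this sketch.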



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-18084) write.partitionBy() does not recognize nested columns that select() can access

2019-09-02 Thread Wenchen Fan (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-18084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16920676#comment-16920676
 ] 

Wenchen Fan commented on SPARK-18084:
-

It looks to me like we should just add more documentation to `partitionBy`. The string 
there is a column name, while the string in `select` is a general expression.
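Alongside the documentation note, the practical workaround is to promote the nested field to a top-level column first. A hedged Scala sketch (the column names follow the report; the output path and Parquet format are illustrative):
{code:scala}
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, struct}

object PartitionByNestedField {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[2]")
      .appName("partitionBy-nested-field")
      .getOrCreate()
    import spark.implicits._

    // schema: a: struct<b: int>, mirroring the report
    val df = Seq(5, 6).toDF("b").select(struct(col("b")).as("a"))

    // df.write.partitionBy("a.b") fails: partitionBy expects a top-level column *name*,
    // while select("a.b") accepts a general expression.
    df.withColumn("a_b", col("a.b"))
      .write
      .partitionBy("a_b")
      .parquet("/tmp/partitionBy-nested-demo")   // illustrative output path

    spark.stop()
  }
}
{code}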

> write.partitionBy() does not recognize nested columns that select() can access
> --
>
> Key: SPARK-18084
> URL: https://issues.apache.org/jira/browse/SPARK-18084
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.0.0, 2.0.1, 2.4.3
>Reporter: Nicholas Chammas
>Priority: Minor
>
> Here's a simple repro in the PySpark shell:
> {code}
> from pyspark.sql import Row
> rdd = spark.sparkContext.parallelize([Row(a=Row(b=5))])
> df = spark.createDataFrame(rdd)
> df.printSchema()
> df.select('a.b').show()  # works
> df.write.partitionBy('a.b').text('/tmp/test')  # doesn't work
> {code}
> Here's what I see when I run this:
> {code}
> >>> from pyspark.sql import Row
> >>> rdd = spark.sparkContext.parallelize([Row(a=Row(b=5))])
> >>> df = spark.createDataFrame(rdd)
> >>> df.printSchema()
> root
>  |-- a: struct (nullable = true)
>  ||-- b: long (nullable = true)
> >>> df.show()
> +---+
> |  a|
> +---+
> |[5]|
> +---+
> >>> df.select('a.b').show()
> +---+
> |  b|
> +---+
> |  5|
> +---+
> >>> df.write.partitionBy('a.b').text('/tmp/test')
> Traceback (most recent call last):
>   File 
> "/usr/local/Cellar/apache-spark/2.0.1/libexec/python/pyspark/sql/utils.py", 
> line 63, in deco
> return f(*a, **kw)
>   File 
> "/usr/local/Cellar/apache-spark/2.0.1/libexec/python/lib/py4j-0.10.3-src.zip/py4j/protocol.py",
>  line 319, in get_return_value
> py4j.protocol.Py4JJavaError: An error occurred while calling o233.text.
> : org.apache.spark.sql.AnalysisException: Partition column a.b not found in 
> schema 
> StructType(StructField(a,StructType(StructField(b,LongType,true)),true));
>   at 
> org.apache.spark.sql.execution.datasources.PartitioningUtils$$anonfun$partitionColumnsSchema$1$$anonfun$apply$10.apply(PartitioningUtils.scala:368)
>   at 
> org.apache.spark.sql.execution.datasources.PartitioningUtils$$anonfun$partitionColumnsSchema$1$$anonfun$apply$10.apply(PartitioningUtils.scala:368)
>   at scala.Option.getOrElse(Option.scala:121)
>   at 
> org.apache.spark.sql.execution.datasources.PartitioningUtils$$anonfun$partitionColumnsSchema$1.apply(PartitioningUtils.scala:367)
>   at 
> org.apache.spark.sql.execution.datasources.PartitioningUtils$$anonfun$partitionColumnsSchema$1.apply(PartitioningUtils.scala:366)
>   at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
>   at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
>   at scala.collection.Iterator$class.foreach(Iterator.scala:893)
>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
>   at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
>   at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
>   at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
>   at scala.collection.AbstractTraversable.map(Traversable.scala:104)
>   at 
> org.apache.spark.sql.execution.datasources.PartitioningUtils$.partitionColumnsSchema(PartitioningUtils.scala:366)
>   at 
> org.apache.spark.sql.execution.datasources.PartitioningUtils$.validatePartitionColumn(PartitioningUtils.scala:349)
>   at 
> org.apache.spark.sql.execution.datasources.DataSource.write(DataSource.scala:458)
>   at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:211)
>   at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:194)
>   at org.apache.spark.sql.DataFrameWriter.text(DataFrameWriter.scala:534)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:237)
>   at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
>   at py4j.Gateway.invoke(Gateway.java:280)
>   at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
>   at py4j.commands.CallCommand.execute(CallCommand.java:79)
>   at py4j.GatewayConnection.run(GatewayConnection.java:214)
>   at java.lang.Thread.run(Thread.java:745)
> During handling of the above exception, another exception occurred:
> Traceback (most recent call last):
>   File "", line 1, in 
>   

[jira] [Commented] (SPARK-10892) Join with Data Frame returns wrong results

2019-09-02 Thread Wenchen Fan (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-10892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16920675#comment-16920675
 ] 

Wenchen Fan commented on SPARK-10892:
-

This is more or less fixed in 3.0 by https://github.com/apache/spark/pull/25107 . 
Now Spark can detect an ambiguous join condition and throws an exception.
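For users on earlier versions, the usual way to avoid the silent mis-resolution is to rename the conflicting "value" column in each branch before joining. A hedged sketch in spark-shell terms, reusing the prcp/tmin/tmax DataFrames from the report below:
{code:scala}
import spark.implicits._

// rename the ambiguous "value" column in each branch before joining,
// so the final projection cannot resolve all three aliases to the same column
val prcpR = prcp.select($"date_str", $"year", $"month", $"day", $"value".as("PRCP"))
val tminR = tmin.select($"date_str", $"value".as("TMIN"))
val tmaxR = tmax.select($"date_str", $"value".as("TMAX"))

val out = prcpR.join(tminR, "date_str").join(tmaxR, "date_str")
out.filter("year = 2012 AND month = 10").show()
{code}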

> Join with Data Frame returns wrong results
> --
>
> Key: SPARK-10892
> URL: https://issues.apache.org/jira/browse/SPARK-10892
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 1.4.1, 1.5.0, 2.4.0
>Reporter: Ofer Mendelevitch
>Priority: Critical
>  Labels: correctness
> Attachments: data.json
>
>
> I'm attaching a simplified reproducible example of the problem:
> 1. Loading a JSON file from HDFS as a Data Frame
> 2. Creating 3 data frames: PRCP, TMIN, TMAX
> 3. Joining the data frames together. Each of those has a column "value" with 
> the same name, so renaming them after the join.
> 4. The output seems incorrect; the first column has the correct values, but 
> the two other columns seem to have a copy of the values from the first column.
> Here's the sample code:
> {code}
> import org.apache.spark.sql._
> val sqlc = new SQLContext(sc)
> val weather = sqlc.read.format("json").load("data.json")
> val prcp = weather.filter("metric = 'PRCP'").as("prcp").cache()
> val tmin = weather.filter("metric = 'TMIN'").as("tmin").cache()
> val tmax = weather.filter("metric = 'TMAX'").as("tmax").cache()
> prcp.filter("year=2012 and month=10").show()
> tmin.filter("year=2012 and month=10").show()
> tmax.filter("year=2012 and month=10").show()
> val out = (prcp.join(tmin, "date_str").join(tmax, "date_str")
>   .select(prcp("year"), prcp("month"), prcp("day"), prcp("date_str"),
> prcp("value").alias("PRCP"), tmin("value").alias("TMIN"),
> tmax("value").alias("TMAX")) )
> out.filter("year=2012 and month=10").show()
> {code}
> The output is:
> {code}
> ++---+--+-+---+-++
> |date_str|day|metric|month|station|value|year|
> ++---+--+-+---+-++
> |20121001|  1|  PRCP|   10|USW00023272|0|2012|
> |20121002|  2|  PRCP|   10|USW00023272|0|2012|
> |20121003|  3|  PRCP|   10|USW00023272|0|2012|
> |20121004|  4|  PRCP|   10|USW00023272|0|2012|
> |20121005|  5|  PRCP|   10|USW00023272|0|2012|
> |20121006|  6|  PRCP|   10|USW00023272|0|2012|
> |20121007|  7|  PRCP|   10|USW00023272|0|2012|
> |20121008|  8|  PRCP|   10|USW00023272|0|2012|
> |20121009|  9|  PRCP|   10|USW00023272|0|2012|
> |20121010| 10|  PRCP|   10|USW00023272|0|2012|
> |20121011| 11|  PRCP|   10|USW00023272|3|2012|
> |20121012| 12|  PRCP|   10|USW00023272|0|2012|
> |20121013| 13|  PRCP|   10|USW00023272|0|2012|
> |20121014| 14|  PRCP|   10|USW00023272|0|2012|
> |20121015| 15|  PRCP|   10|USW00023272|0|2012|
> |20121016| 16|  PRCP|   10|USW00023272|0|2012|
> |20121017| 17|  PRCP|   10|USW00023272|0|2012|
> |20121018| 18|  PRCP|   10|USW00023272|0|2012|
> |20121019| 19|  PRCP|   10|USW00023272|0|2012|
> |20121020| 20|  PRCP|   10|USW00023272|0|2012|
> ++---+--+-+---+-+——+
> ++---+--+-+---+-++
> |date_str|day|metric|month|station|value|year|
> ++---+--+-+---+-++
> |20121001|  1|  TMIN|   10|USW00023272|  139|2012|
> |20121002|  2|  TMIN|   10|USW00023272|  178|2012|
> |20121003|  3|  TMIN|   10|USW00023272|  144|2012|
> |20121004|  4|  TMIN|   10|USW00023272|  144|2012|
> |20121005|  5|  TMIN|   10|USW00023272|  139|2012|
> |20121006|  6|  TMIN|   10|USW00023272|  128|2012|
> |20121007|  7|  TMIN|   10|USW00023272|  122|2012|
> |20121008|  8|  TMIN|   10|USW00023272|  122|2012|
> |20121009|  9|  TMIN|   10|USW00023272|  139|2012|
> |20121010| 10|  TMIN|   10|USW00023272|  128|2012|
> |20121011| 11|  TMIN|   10|USW00023272|  122|2012|
> |20121012| 12|  TMIN|   10|USW00023272|  117|2012|
> |20121013| 13|  TMIN|   10|USW00023272|  122|2012|
> |20121014| 14|  TMIN|   10|USW00023272|  128|2012|
> |20121015| 15|  TMIN|   10|USW00023272|  128|2012|
> |20121016| 16|  TMIN|   10|USW00023272|  156|2012|
> |20121017| 17|  TMIN|   10|USW00023272|  139|2012|
> |20121018| 18|  TMIN|   10|USW00023272|  161|2012|
> |20121019| 19|  TMIN|   10|USW00023272|  133|2012|
> |20121020| 20|  TMIN|   10|USW00023272|  122|2012|
> ++---+--+-+---+-+——+
> ++---+--+-+---+-++
> |date_str|day|metric|month|station|value|year|
> ++---+--+-+---+-++
> |20121001|  1|  TMAX|   10|USW00023272|  322|2012|
> |20121002|  2|  TMAX|   10|USW00023272|  344|2012|
> |20121003|  3|  TMAX|   10|USW00023272|  

[jira] [Commented] (SPARK-28694) Add Java/Scala StructuredKerberizedKafkaWordCount examples

2019-09-02 Thread hong dongdong (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-28694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16920674#comment-16920674
 ] 

hong dongdong commented on SPARK-28694:
---

[~726575...@qq.com] Thanks, but I am already working on this and will push it later 
today. It is related to SPARK-28691.

> Add Java/Scala StructuredKerberizedKafkaWordCount examples
> --
>
> Key: SPARK-28694
> URL: https://issues.apache.org/jira/browse/SPARK-28694
> Project: Spark
>  Issue Type: Improvement
>  Components: Examples, Structured Streaming
>Affects Versions: 3.0.0
>Reporter: hong dongdong
>Priority: Minor
>
> Currently, the `StructuredKafkaWordCount` example does not support accessing Kafka 
> with Kerberos authentication. Add a parameter that indicates whether Kerberos is used.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-28948) support data source v2 in CREATE TABLE USING

2019-09-02 Thread Wenchen Fan (Jira)
Wenchen Fan created SPARK-28948:
---

 Summary: support data source v2 in CREATE TABLE USING
 Key: SPARK-28948
 URL: https://issues.apache.org/jira/browse/SPARK-28948
 Project: Spark
  Issue Type: New Feature
  Components: SQL
Affects Versions: 3.0.0
Reporter: Wenchen Fan
Assignee: Wenchen Fan






--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-28694) Add Java/Scala StructuredKerberizedKafkaWordCount examples

2019-09-02 Thread daile (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-28694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16920665#comment-16920665
 ] 

daile commented on SPARK-28694:
---

I will work on this

> Add Java/Scala StructuredKerberizedKafkaWordCount examples
> --
>
> Key: SPARK-28694
> URL: https://issues.apache.org/jira/browse/SPARK-28694
> Project: Spark
>  Issue Type: Improvement
>  Components: Examples, Structured Streaming
>Affects Versions: 3.0.0
>Reporter: hong dongdong
>Priority: Minor
>
> Currently, the `StructuredKafkaWordCount` example does not support accessing Kafka 
> with Kerberos authentication. Add a parameter that indicates whether Kerberos is used.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-28947) Status logging occurs on every state change but not at an interval for liveness.

2019-09-02 Thread Kent Yao (Jira)
Kent Yao created SPARK-28947:


 Summary: Status logging occurs on every state change but not at an 
interval for liveness.
 Key: SPARK-28947
 URL: https://issues.apache.org/jira/browse/SPARK-28947
 Project: Spark
  Issue Type: Bug
  Components: Kubernetes
Affects Versions: 2.4.4, 2.3.3
Reporter: Kent Yao


The start method of `LoggingPodStatusWatcherImpl`  should be invoked



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-28940) Subquery reuse across all subquery levels

2019-09-02 Thread Peter Toth (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Toth updated SPARK-28940:
---
Summary: Subquery reuse across all subquery levels  (was: Subquery reuse 
accross all subquery levels)

> Subquery reuse across all subquery levels
> -
>
> Key: SPARK-28940
> URL: https://issues.apache.org/jira/browse/SPARK-28940
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Peter Toth
>Priority: Major
>
> Currently subquery reuse doesn't work across all subquery levels.
> Here is an example query:
> {noformat}
> SELECT (SELECT avg(key) FROM testData), (SELECT (SELECT avg(key) FROM 
> testData))
> FROM testData
> LIMIT 1
> {noformat}
> where the plan now is:
> {noformat}
> CollectLimit 1
> +- *(1) Project [Subquery scalar-subquery#268, [id=#231] AS 
> scalarsubquery()#276, Subquery scalar-subquery#270, [id=#266] AS 
> scalarsubquery()#277]
>:  :- Subquery scalar-subquery#268, [id=#231]
>:  :  +- *(2) HashAggregate(keys=[], functions=[avg(cast(key#13 as 
> bigint))], output=[avg(key)#272])
>:  : +- Exchange SinglePartition, true, [id=#227]
>:  :+- *(1) HashAggregate(keys=[], 
> functions=[partial_avg(cast(key#13 as bigint))], output=[sum#282, count#283L])
>:  :   +- *(1) SerializeFromObject 
> [knownnotnull(assertnotnull(input[0, 
> org.apache.spark.sql.test.SQLTestData$TestData, true])).key AS key#13]
>:  :  +- Scan[obj#12]
>:  +- Subquery scalar-subquery#270, [id=#266]
>: +- *(1) Project [Subquery scalar-subquery#269, [id=#263] AS 
> scalarsubquery()#275]
>::  +- Subquery scalar-subquery#269, [id=#263]
>:: +- *(2) HashAggregate(keys=[], functions=[avg(cast(key#13 
> as bigint))], output=[avg(key)#274])
>::+- Exchange SinglePartition, true, [id=#259]
>::   +- *(1) HashAggregate(keys=[], 
> functions=[partial_avg(cast(key#13 as bigint))], output=[sum#286, count#287L])
>::  +- *(1) SerializeFromObject 
> [knownnotnull(assertnotnull(input[0, 
> org.apache.spark.sql.test.SQLTestData$TestData, true])).key AS key#13]
>:: +- Scan[obj#12]
>:+- *(1) Scan OneRowRelation[]
>+- *(1) SerializeFromObject
>   +- Scan[obj#12]
> {noformat}
> but it could be:
> {noformat}
> CollectLimit 1
> +- *(1) Project [ReusedSubquery Subquery scalar-subquery#241, [id=#148] AS 
> scalarsubquery()#248, Subquery scalar-subquery#242, [id=#164] AS 
> scalarsubquery()#249]
>:  :- ReusedSubquery Subquery scalar-subquery#241, [id=#148]
>:  +- Subquery scalar-subquery#242, [id=#164]
>: +- *(1) Project [Subquery scalar-subquery#241, [id=#148] AS 
> scalarsubquery()#247]
>::  +- Subquery scalar-subquery#241, [id=#148]
>:: +- *(2) HashAggregate(keys=[], functions=[avg(cast(key#13 
> as bigint))], output=[avg(key)#246])
>::+- Exchange SinglePartition, true, [id=#144]
>::   +- *(1) HashAggregate(keys=[], 
> functions=[partial_avg(cast(key#13 as bigint))], output=[sum#258, count#259L])
>::  +- *(1) SerializeFromObject 
> [knownnotnull(assertnotnull(input[0, 
> org.apache.spark.sql.test.SQLTestData$TestData, true])).key AS key#13]
>:: +- Scan[obj#12]
>:+- *(1) Scan OneRowRelation[]
>+- *(1) SerializeFromObject
>   +- Scan[obj#12]
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-28912) MatchError exception in CheckpointWriteHandler

2019-09-02 Thread Hyukjin Kwon (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-28912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16920647#comment-16920647
 ] 

Hyukjin Kwon commented on SPARK-28912:
--

ping [~avk1]

> MatchError exception in CheckpointWriteHandler
> --
>
> Key: SPARK-28912
> URL: https://issues.apache.org/jira/browse/SPARK-28912
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.3.0, 2.3.2
>Reporter: Aleksandr Kashkirov
>Priority: Minor
>
> Setting checkpoint directory name to "checkpoint-" plus some digits (e.g. 
> "checkpoint-01") results in the following error:
> {code:java}
> Exception in thread "pool-32-thread-1" scala.MatchError: 
> 0523a434-0daa-4ea6-a050-c4eb3c557d8c (of class java.lang.String) 
>  at 
> org.apache.spark.streaming.Checkpoint$.org$apache$spark$streaming$Checkpoint$$sortFunc$1(Checkpoint.scala:121)
>  
>  at 
> org.apache.spark.streaming.Checkpoint$$anonfun$getCheckpointFiles$1.apply(Checkpoint.scala:132)
>  
>  at 
> org.apache.spark.streaming.Checkpoint$$anonfun$getCheckpointFiles$1.apply(Checkpoint.scala:132)
>  
>  at scala.math.Ordering$$anon$9.compare(Ordering.scala:200) 
>  at java.util.TimSort.countRunAndMakeAscending(TimSort.java:355) 
>  at java.util.TimSort.sort(TimSort.java:234) 
>  at java.util.Arrays.sort(Arrays.java:1438) 
>  at scala.collection.SeqLike$class.sorted(SeqLike.scala:648) 
>  at scala.collection.mutable.ArrayOps$ofRef.sorted(ArrayOps.scala:186) 
>  at scala.collection.SeqLike$class.sortWith(SeqLike.scala:601) 
>  at scala.collection.mutable.ArrayOps$ofRef.sortWith(ArrayOps.scala:186) 
>  at 
> org.apache.spark.streaming.Checkpoint$.getCheckpointFiles(Checkpoint.scala:132)
>  
>  at 
> org.apache.spark.streaming.CheckpointWriter$CheckpointWriteHandler.run(Checkpoint.scala:262)
>  
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  
>  at java.lang.Thread.run(Thread.java:748){code}



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-28941) Spark Sql Jobs

2019-09-02 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon resolved SPARK-28941.
--
Resolution: Invalid

Please ask questions on the mailing lists.

> Spark Sql Jobs
> --
>
> Key: SPARK-28941
> URL: https://issues.apache.org/jira/browse/SPARK-28941
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 2.4.3
>Reporter: Brahmendra
>Priority: Major
>  Labels: github-import, pull-request-available
> Fix For: 2.4.3
>
>
> Hi Team,
> I need a favor regarding Spark SQL jobs.
> I have 200+ Spark SQL queries running against 7 different Hive tables.
> How can we execute all 200+ Spark SQL jobs from a single jar file?
> Currently we are managing a separate jar file for each table (7 jars in total).
>  
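Although this belongs on the mailing lists, one common approach is to drive all statements from a single application jar. A hedged sketch, assuming the 200+ statements are kept in a semicolon-separated SQL file passed as the first argument (the file layout and output handling are assumptions, not a recommendation from this issue):
{code:scala}
import scala.io.Source
import org.apache.spark.sql.SparkSession

object RunAllQueries {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("RunAllQueries")
      .enableHiveSupport()
      .getOrCreate()

    // read all statements from one file instead of building one jar per table
    val source = Source.fromFile(args(0))
    val statements =
      try source.mkString.split(";").map(_.trim).filter(_.nonEmpty)
      finally source.close()

    statements.foreach { statement =>
      println(s"Running: $statement")
      spark.sql(statement).show(truncate = false)  // or write the result to a target table
    }

    spark.stop()
  }
}
{code}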



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-28943) NoSuchMethodError: shaded.parquet.org.apache.thrift.EncodingUtils.setBit(BIZ)B

2019-09-02 Thread Hyukjin Kwon (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-28943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16920644#comment-16920644
 ] 

Hyukjin Kwon commented on SPARK-28943:
--

Does this happen in regular Apache Spark too, not CDH?
Also, please provide steps to reproduce.

> NoSuchMethodError: 
> shaded.parquet.org.apache.thrift.EncodingUtils.setBit(BIZ)B 
> ---
>
> Key: SPARK-28943
> URL: https://issues.apache.org/jira/browse/SPARK-28943
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.2.0
>Reporter: Michael Heuer
>Priority: Major
>
> Since adapting our build for Spark 2.4.x, we are unable to run on Spark 2.2.0 
> provided by CDH.  For more details, please see linked issue 
> https://github.com/bigdatagenomics/adam/issues/2157



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-28941) Spark Sql Jobs

2019-09-02 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon updated SPARK-28941:
-
Target Version/s:   (was: 2.4.3)

> Spark Sql Jobs
> --
>
> Key: SPARK-28941
> URL: https://issues.apache.org/jira/browse/SPARK-28941
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 2.4.3
>Reporter: Brahmendra
>Priority: Major
>  Labels: github-import, pull-request-available
> Fix For: 2.4.3
>
>
> Hi Team,
> I need a favor regarding Spark SQL jobs.
> I have 200+ Spark SQL queries running against 7 different Hive tables.
> How can we execute all 200+ Spark SQL jobs from a single jar file?
> Currently we are managing a separate jar file for each table (7 jars in total).
>  



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-28943) NoSuchMethodError: shaded.parquet.org.apache.thrift.EncodingUtils.setBit(BIZ)B

2019-09-02 Thread Hyukjin Kwon (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-28943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16920644#comment-16920644
 ] 

Hyukjin Kwon edited comment on SPARK-28943 at 9/2/19 6:27 AM:
--

Does this happen in regular Apache Spark too, not CDH?
Also, please provide steps to reproduce.

Also, does this happen with Apache Spark 2.4.x, or with 2.2.0?


was (Author: hyukjin.kwon):
Does this happen in regular Apache Spark too, not CDH?
Also, please provide steps to reproduce.

> NoSuchMethodError: 
> shaded.parquet.org.apache.thrift.EncodingUtils.setBit(BIZ)B 
> ---
>
> Key: SPARK-28943
> URL: https://issues.apache.org/jira/browse/SPARK-28943
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.2.0
>Reporter: Michael Heuer
>Priority: Major
>
> Since adapting our build for Spark 2.4.x, we are unable to run on Spark 2.2.0 
> provided by CDH.  For more details, please see linked issue 
> https://github.com/bigdatagenomics/adam/issues/2157



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-27336) Incorrect DataSet.summary() result

2019-09-02 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-27336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon resolved SPARK-27336.
--
Resolution: Won't Fix

> Incorrect DataSet.summary() result
> --
>
> Key: SPARK-27336
> URL: https://issues.apache.org/jira/browse/SPARK-27336
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Gengliang Wang
>Priority: Major
> Attachments: test.csv
>
>
> There is a single data point of 1.0E8 in the minimum_nights column, out 
> of about 8k records, but .summary() reports it as both the 75th percentile and the max.
> I compared this with approxQuantile, and approxQuantile for the 75th percentile gave the 
> correct value of 30.0.
> To reproduce:
> {code:java}
> scala> val df = 
> spark.read.format("csv").load("test.csv").withColumn("minimum_nights", 
> '_c0.cast("Int"))
> df: org.apache.spark.sql.DataFrame = [_c0: string, minimum_nights: int]
> scala> df.select("minimum_nights").summary().show()
> +---+--+
> |summary|minimum_nights|
> +---+--+
> |  count|  7072|
> |   mean| 14156.35407239819|
> | stddev|1189128.5444975856|
> |min| 1|
> |25%| 2|
> |50%| 4|
> |75%| 1|
> |max| 1|
> +---+--+
> scala> df.stat.approxQuantile("minimum_nights", Array(0.75), 0.1)
> res1: Array[Double] = Array(30.0)
> scala> df.stat.approxQuantile("minimum_nights", Array(0.75), 0.001)
> res2: Array[Double] = Array(30.0)
> scala> df.stat.approxQuantile("minimum_nights", Array(0.75), 0.0001)
> res3: Array[Double] = Array(1.0E8)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org