[jira] [Commented] (SPARK-24152) Flaky Test: SparkR

2018-05-02 Thread Hyukjin Kwon (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-24152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16461965#comment-16461965
 ] 

Hyukjin Kwon commented on SPARK-24152:
--

Thanks [~viirya], I retriggered one build. Will resolve this once it passes.

> Flaky Test: SparkR
> --
>
> Key: SPARK-24152
> URL: https://issues.apache.org/jira/browse/SPARK-24152
> Project: Spark
>  Issue Type: Bug
>  Components: SparkR
>Affects Versions: 2.4.0
>Reporter: Dongjoon Hyun
>Assignee: Liang-Chi Hsieh
>Priority: Critical
>
> The PR builder and master branch tests fail with the following SparkR error 
> for an unknown reason:
> {code}
> * this is package 'SparkR' version '2.4.0'
> * checking CRAN incoming feasibility ...Error in 
> .check_package_CRAN_incoming(pkgdir) : 
>   dims [product 24] do not match the length of object [0]
> Execution halted
> {code}
> *PR BUILDER*
> - https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90039/
> - https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89983/
> - https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89998/
> *MASTER BRANCH*
> - 
> https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-sbt-hadoop-2.6/4458/
>  (Fail with no failures)
> This is critical because we have already started to merge PRs while ignoring 
> this **known unknown** SparkR failure.
> - https://github.com/apache/spark/pull/21175



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-24152) Flaky Test: SparkR

2018-05-02 Thread Hyukjin Kwon (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-24152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon reassigned SPARK-24152:


Assignee: Liang-Chi Hsieh




[jira] [Commented] (SPARK-24152) Flaky Test: SparkR

2018-05-02 Thread Liang-Chi Hsieh (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-24152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16461962#comment-16461962
 ] 

Liang-Chi Hsieh commented on SPARK-24152:
-

The CRAN sysadmin replied to me that it should be fixed now. I can't access my
laptop, so I can't confirm it myself. Maybe someone can confirm it by checking
whether the Jenkins R tests pass now.

Thanks.







[jira] [Commented] (SPARK-24151) CURRENT_DATE, CURRENT_TIMESTAMP incorrectly resolved as column names when caseSensitive is enabled

2018-05-02 Thread Dongjoon Hyun (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-24151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16461957#comment-16461957
 ] 

Dongjoon Hyun commented on SPARK-24151:
---

Thank you for reporting and fixing this, [~jamesthomp].
I also checked that this is a regression in Apache Spark 2.2.1, as you reported.

> CURRENT_DATE, CURRENT_TIMESTAMP incorrectly resolved as column names when 
> caseSensitive is enabled
> --
>
> Key: SPARK-24151
> URL: https://issues.apache.org/jira/browse/SPARK-24151
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.2.1, 2.3.0
>Reporter: James Thompson
>Priority: Major
>
> After this change: https://issues.apache.org/jira/browse/SPARK-22333
> Running SQL such as "CURRENT_TIMESTAMP" can fail when spark.sql.caseSensitive 
> has been enabled:
> {code:java}
> org.apache.spark.sql.AnalysisException: cannot resolve '`CURRENT_TIMESTAMP`' 
> given input columns: [col1]{code}
> This is because the analyzer incorrectly uses a case-sensitive resolver to 
> resolve the function. I will submit a PR with a fix and a test for this.
>  
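The failure mode described above can be illustrated outside of Spark. The
following is a hypothetical sketch of the analyzer's resolution step, not
Spark's actual code: built-in functions are registered under lowercase names,
so a case-sensitive lookup misses `CURRENT_TIMESTAMP` and the analyzer falls
back to treating it as a column reference.

```python
# Hypothetical model of the resolution bug; names are illustrative,
# not Spark's real analyzer internals.

FUNCTION_REGISTRY = {"current_timestamp", "current_date"}  # lowercase keys


def resolve(name: str, case_sensitive: bool) -> bool:
    """Return True if `name` resolves to a registered built-in function."""
    if case_sensitive:
        # Buggy path: exact-match lookup misses upper-case SQL keywords.
        return name in FUNCTION_REGISTRY
    # Fixed path: normalize before lookup, regardless of spark.sql.caseSensitive.
    return name.lower() in FUNCTION_REGISTRY


# With case-sensitive resolution the function is not found, which is why the
# analyzer then reports it as an unresolvable column.
assert resolve("CURRENT_TIMESTAMP", case_sensitive=False) is True
assert resolve("CURRENT_TIMESTAMP", case_sensitive=True) is False
```

The fix described in the comment amounts to always using the case-insensitive
path when resolving function names, independent of the column-resolution
setting.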






[jira] [Assigned] (SPARK-24169) JsonToStructs should not access SQLConf at executor side

2018-05-02 Thread Apache Spark (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-24169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-24169:


Assignee: Apache Spark  (was: Wenchen Fan)

> JsonToStructs should not access SQLConf at executor side
> 
>
> Key: SPARK-24169
> URL: https://issues.apache.org/jira/browse/SPARK-24169
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.3.0
>Reporter: Wenchen Fan
>Assignee: Apache Spark
>Priority: Major
>







[jira] [Assigned] (SPARK-24169) JsonToStructs should not access SQLConf at executor side

2018-05-02 Thread Apache Spark (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-24169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-24169:


Assignee: Wenchen Fan  (was: Apache Spark)




[jira] [Commented] (SPARK-24169) JsonToStructs should not access SQLConf at executor side

2018-05-02 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-24169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16461954#comment-16461954
 ] 

Apache Spark commented on SPARK-24169:
--

User 'cloud-fan' has created a pull request for this issue:
https://github.com/apache/spark/pull/21226




[jira] [Updated] (SPARK-24151) CURRENT_DATE, CURRENT_TIMESTAMP incorrectly resolved as column names when caseSensitive is enabled

2018-05-02 Thread Dongjoon Hyun (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-24151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-24151:
--
Affects Version/s: 2.2.1




[jira] [Created] (SPARK-24169) JsonToStructs should not access SQLConf at executor side

2018-05-02 Thread Wenchen Fan (JIRA)
Wenchen Fan created SPARK-24169:
---

 Summary: JsonToStructs should not access SQLConf at executor side
 Key: SPARK-24169
 URL: https://issues.apache.org/jira/browse/SPARK-24169
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 2.3.0
Reporter: Wenchen Fan
Assignee: Wenchen Fan









[jira] [Commented] (SPARK-24168) WindowExec should not access SQLConf at executor side

2018-05-02 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-24168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16461951#comment-16461951
 ] 

Apache Spark commented on SPARK-24168:
--

User 'cloud-fan' has created a pull request for this issue:
https://github.com/apache/spark/pull/21225

> WindowExec should not access SQLConf at executor side
> -
>
> Key: SPARK-24168
> URL: https://issues.apache.org/jira/browse/SPARK-24168
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.3.0
>Reporter: Wenchen Fan
>Assignee: Wenchen Fan
>Priority: Major
>







[jira] [Assigned] (SPARK-24168) WindowExec should not access SQLConf at executor side

2018-05-02 Thread Apache Spark (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-24168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-24168:


Assignee: Wenchen Fan  (was: Apache Spark)




[jira] [Assigned] (SPARK-24168) WindowExec should not access SQLConf at executor side

2018-05-02 Thread Apache Spark (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-24168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-24168:


Assignee: Apache Spark  (was: Wenchen Fan)




[jira] [Created] (SPARK-24168) WindowExec should not access SQLConf at executor side

2018-05-02 Thread Wenchen Fan (JIRA)
Wenchen Fan created SPARK-24168:
---

 Summary: WindowExec should not access SQLConf at executor side
 Key: SPARK-24168
 URL: https://issues.apache.org/jira/browse/SPARK-24168
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 2.3.0
Reporter: Wenchen Fan
Assignee: Wenchen Fan









[jira] [Assigned] (SPARK-24167) ParquetFilters should not access SQLConf at executor side

2018-05-02 Thread Apache Spark (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-24167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-24167:


Assignee: Apache Spark  (was: Wenchen Fan)

> ParquetFilters should not access SQLConf at executor side
> -
>
> Key: SPARK-24167
> URL: https://issues.apache.org/jira/browse/SPARK-24167
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.3.0
>Reporter: Wenchen Fan
>Assignee: Apache Spark
>Priority: Major
>







[jira] [Assigned] (SPARK-24167) ParquetFilters should not access SQLConf at executor side

2018-05-02 Thread Apache Spark (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-24167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-24167:


Assignee: Wenchen Fan  (was: Apache Spark)




[jira] [Commented] (SPARK-24167) ParquetFilters should not access SQLConf at executor side

2018-05-02 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-24167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16461949#comment-16461949
 ] 

Apache Spark commented on SPARK-24167:
--

User 'cloud-fan' has created a pull request for this issue:
https://github.com/apache/spark/pull/21224




[jira] [Created] (SPARK-24167) ParquetFilters should not access SQLConf at executor side

2018-05-02 Thread Wenchen Fan (JIRA)
Wenchen Fan created SPARK-24167:
---

 Summary: ParquetFilters should not access SQLConf at executor side
 Key: SPARK-24167
 URL: https://issues.apache.org/jira/browse/SPARK-24167
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 2.3.0
Reporter: Wenchen Fan
Assignee: Wenchen Fan









[jira] [Commented] (SPARK-24166) InMemoryTableScanExec should not access SQLConf at executor side

2018-05-02 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-24166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16461935#comment-16461935
 ] 

Apache Spark commented on SPARK-24166:
--

User 'cloud-fan' has created a pull request for this issue:
https://github.com/apache/spark/pull/21223

> InMemoryTableScanExec should not access SQLConf at executor side
> 
>
> Key: SPARK-24166
> URL: https://issues.apache.org/jira/browse/SPARK-24166
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.3.0
>Reporter: Wenchen Fan
>Assignee: Wenchen Fan
>Priority: Major
>







[jira] [Assigned] (SPARK-24166) InMemoryTableScanExec should not access SQLConf at executor side

2018-05-02 Thread Apache Spark (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-24166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-24166:


Assignee: Apache Spark  (was: Wenchen Fan)




[jira] [Assigned] (SPARK-24166) InMemoryTableScanExec should not access SQLConf at executor side

2018-05-02 Thread Apache Spark (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-24166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-24166:


Assignee: Wenchen Fan  (was: Apache Spark)




[jira] [Created] (SPARK-24166) InMemoryTableScanExec should not access SQLConf at executor side

2018-05-02 Thread Wenchen Fan (JIRA)
Wenchen Fan created SPARK-24166:
---

 Summary: InMemoryTableScanExec should not access SQLConf at 
executor side
 Key: SPARK-24166
 URL: https://issues.apache.org/jira/browse/SPARK-24166
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 2.3.0
Reporter: Wenchen Fan
Assignee: Wenchen Fan









[jira] [Comment Edited] (SPARK-24152) Flaky Test: SparkR

2018-05-02 Thread Hyukjin Kwon (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-24152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16461932#comment-16461932
 ] 

Hyukjin Kwon edited comment on SPARK-24152 at 5/3/18 4:44 AM:
--

The past issue was fixed within only a couple of hours (after he contacted the 
CRAN admin). We will take action if this one takes longer.


was (Author: hyukjin.kwon):
The past issue was fixed within only a couple of hours. We will take action if 
this one takes longer.




[jira] [Commented] (SPARK-24152) Flaky Test: SparkR

2018-05-02 Thread Hyukjin Kwon (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-24152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16461932#comment-16461932
 ] 

Hyukjin Kwon commented on SPARK-24152:
--

The past issue was fixed within only a couple of hours. We will take action if 
this one takes longer.




[jira] [Commented] (SPARK-24152) Flaky Test: SparkR

2018-05-02 Thread Shivaram Venkataraman (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-24152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16461931#comment-16461931
 ] 

Shivaram Venkataraman commented on SPARK-24152:
---

If this is blocking all PRs, I think it's fine to temporarily remove the CRAN 
check from Jenkins. We'll just need to be extra careful while merging SparkR 
PRs for a short period of time.




[jira] [Comment Edited] (SPARK-24152) Flaky Test: SparkR

2018-05-02 Thread Hyukjin Kwon (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-24152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16461929#comment-16461929
 ] 

Hyukjin Kwon edited comment on SPARK-24152 at 5/3/18 4:40 AM:
--

From Liang-Chi's comment and the previous discussion and resolution,
[https://github.com/apache/spark/pull/20005] (SPARK-22812), it seems this is a
problem on R's side. We (mainly he) investigated the problem there, and he
solved it by reporting the problem to the R developers. I think it's outside of
Spark, and we can wait for the response.

BTW, I think this is quite critical since it blocks all other PRs.


was (Author: hyukjin.kwon):
From Liang-Chi's comment and given previous discussion and resolution - 
[https://github.com/apache/spark/pull/20005,] (SPARK-22812) seems it's a 
problem from R's. We (mainly he) investigated this problem there and he solved 
that by asking / reporting the problem to R dev. I think it's outside of Spark 
and we could wait for the response.

BTW, I think this is quite critical since it blocks all other PRs.




[jira] [Comment Edited] (SPARK-24152) Flaky Test: SparkR

2018-05-02 Thread Hyukjin Kwon (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-24152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16461929#comment-16461929
 ] 

Hyukjin Kwon edited comment on SPARK-24152 at 5/3/18 4:39 AM:
--

From Liang-Chi's comment and given previous discussion and resolution - 
[https://github.com/apache/spark/pull/20005,] (SPARK-22812) seems it's a 
problem from R's. We (mainly he) investigated this problem there and he solved 
that by asking / reporting the problem to R dev. I think it's outside of Spark 
and we could wait for the response.

BTW, I think this is quite critical since it blocks all other PRs.


was (Author: hyukjin.kwon):
From Liang-Chi's comment and given previous discussion and resolution - 
[https://github.com/apache/spark/pull/20005,] seems it's a problem from R's. 
We (mainly he) investigated this problem there and he solved that by asking / 
reporting the problem to R dev. I think it's outside of Spark and we could 
wait for the response.




[jira] [Commented] (SPARK-24152) Flaky Test: SparkR

2018-05-02 Thread Hyukjin Kwon (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-24152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16461929#comment-16461929
 ] 

Hyukjin Kwon commented on SPARK-24152:
--

From Liang-Chi's comment and given previous discussion and resolution - 
[https://github.com/apache/spark/pull/20005,] seems it's a problem from R's. 
We (mainly he) investigated this problem there and he solved that by asking / 
reporting the problem to R dev. I think it's outside of Spark and we could 
wait for the response.

> Flaky Test: SparkR
> --
>
> Key: SPARK-24152
> URL: https://issues.apache.org/jira/browse/SPARK-24152
> Project: Spark
>  Issue Type: Bug
>  Components: SparkR
>Affects Versions: 2.4.0
>Reporter: Dongjoon Hyun
>Priority: Critical
>
> The PR builder and master branch tests fail with the following SparkR error 
> for an unknown reason. The error message is:
> {code}
> * this is package 'SparkR' version '2.4.0'
> * checking CRAN incoming feasibility ...Error in 
> .check_package_CRAN_incoming(pkgdir) : 
>   dims [product 24] do not match the length of object [0]
> Execution halted
> {code}
> *PR BUILDER*
> - https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90039/
> - https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89983/
> - https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89998/
> *MASTER BRANCH*
> - 
> https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-sbt-hadoop-2.6/4458/
>  (Fail with no failures)
> This is critical because we have already started merging PRs while ignoring 
> this **known unknown** SparkR failure.
> - https://github.com/apache/spark/pull/21175



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-24152) Flaky Test: SparkR

2018-05-02 Thread Shivaram Venkataraman (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-24152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16461920#comment-16461920
 ] 

Shivaram Venkataraman commented on SPARK-24152:
---

Unfortunately, I don't have time to look at this until Friday. Do we know if 
the problem is in SparkR or in some other package?

> Flaky Test: SparkR
> --
>
> Key: SPARK-24152
> URL: https://issues.apache.org/jira/browse/SPARK-24152
> Project: Spark
>  Issue Type: Bug
>  Components: SparkR
>Affects Versions: 2.4.0
>Reporter: Dongjoon Hyun
>Priority: Critical
>
> The PR builder and master branch tests fail with the following SparkR error 
> for an unknown reason. The error message is:
> {code}
> * this is package 'SparkR' version '2.4.0'
> * checking CRAN incoming feasibility ...Error in 
> .check_package_CRAN_incoming(pkgdir) : 
>   dims [product 24] do not match the length of object [0]
> Execution halted
> {code}
> *PR BUILDER*
> - https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90039/
> - https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89983/
> - https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89998/
> *MASTER BRANCH*
> - 
> https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-sbt-hadoop-2.6/4458/
>  (Fail with no failures)
> This is critical because we have already started merging PRs while ignoring 
> this **known unknown** SparkR failure.
> - https://github.com/apache/spark/pull/21175



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-24152) Flaky Test: SparkR

2018-05-02 Thread Liang-Chi Hsieh (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-24152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16461867#comment-16461867
 ] 

Liang-Chi Hsieh commented on SPARK-24152:
-

Thanks [~hyukjin.kwon] for pinging me. I found a problem in the CRAN 
PACKAGES.in file; it seems to be causing the R test failure again. I have 
already emailed the CRAN sysadmin for help.

> Flaky Test: SparkR
> --
>
> Key: SPARK-24152
> URL: https://issues.apache.org/jira/browse/SPARK-24152
> Project: Spark
>  Issue Type: Bug
>  Components: SparkR
>Affects Versions: 2.4.0
>Reporter: Dongjoon Hyun
>Priority: Critical
>
> The PR builder and master branch tests fail with the following SparkR error 
> for an unknown reason. The error message is:
> {code}
> * this is package 'SparkR' version '2.4.0'
> * checking CRAN incoming feasibility ...Error in 
> .check_package_CRAN_incoming(pkgdir) : 
>   dims [product 24] do not match the length of object [0]
> Execution halted
> {code}
> *PR BUILDER*
> - https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90039/
> - https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89983/
> - https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89998/
> *MASTER BRANCH*
> - 
> https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-sbt-hadoop-2.6/4458/
>  (Fail with no failures)
> This is critical because we have already started merging PRs while ignoring 
> this **known unknown** SparkR failure.
> - https://github.com/apache/spark/pull/21175



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-24161) Enable debug package feature on structured streaming

2018-05-02 Thread Apache Spark (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-24161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-24161:


Assignee: (was: Apache Spark)

> Enable debug package feature on structured streaming
> 
>
> Key: SPARK-24161
> URL: https://issues.apache.org/jira/browse/SPARK-24161
> Project: Spark
>  Issue Type: Improvement
>  Components: Structured Streaming
>Affects Versions: 2.3.0
>Reporter: Jungtaek Lim
>Priority: Major
>
> Currently, the debug package has an implicit class that matches Dataset to 
> provide debug features on the Dataset class. It doesn't work with structured 
> streaming: it requires the query to already be started, and the information 
> can be retrieved from StreamingQuery, not Dataset. For the same reason, 
> "explain" had to be placed on StreamingQuery even though it exists on Dataset.
> This issue tracks the effort to enable the debug package feature on 
> structured streaming. Unlike batch, it may have some restrictions.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-24161) Enable debug package feature on structured streaming

2018-05-02 Thread Apache Spark (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-24161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-24161:


Assignee: Apache Spark

> Enable debug package feature on structured streaming
> 
>
> Key: SPARK-24161
> URL: https://issues.apache.org/jira/browse/SPARK-24161
> Project: Spark
>  Issue Type: Improvement
>  Components: Structured Streaming
>Affects Versions: 2.3.0
>Reporter: Jungtaek Lim
>Assignee: Apache Spark
>Priority: Major
>
> Currently, the debug package has an implicit class that matches Dataset to 
> provide debug features on the Dataset class. It doesn't work with structured 
> streaming: it requires the query to already be started, and the information 
> can be retrieved from StreamingQuery, not Dataset. For the same reason, 
> "explain" had to be placed on StreamingQuery even though it exists on Dataset.
> This issue tracks the effort to enable the debug package feature on 
> structured streaming. Unlike batch, it may have some restrictions.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-24161) Enable debug package feature on structured streaming

2018-05-02 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-24161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16461858#comment-16461858
 ] 

Apache Spark commented on SPARK-24161:
--

User 'HeartSaVioR' has created a pull request for this issue:
https://github.com/apache/spark/pull/21222

> Enable debug package feature on structured streaming
> 
>
> Key: SPARK-24161
> URL: https://issues.apache.org/jira/browse/SPARK-24161
> Project: Spark
>  Issue Type: Improvement
>  Components: Structured Streaming
>Affects Versions: 2.3.0
>Reporter: Jungtaek Lim
>Priority: Major
>
> Currently, the debug package has an implicit class that matches Dataset to 
> provide debug features on the Dataset class. It doesn't work with structured 
> streaming: it requires the query to already be started, and the information 
> can be retrieved from StreamingQuery, not Dataset. For the same reason, 
> "explain" had to be placed on StreamingQuery even though it exists on Dataset.
> This issue tracks the effort to enable the debug package feature on 
> structured streaming. Unlike batch, it may have some restrictions.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-24110) Avoid calling UGI loginUserFromKeytab in ThriftServer

2018-05-02 Thread Saisai Shao (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-24110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saisai Shao resolved SPARK-24110.
-
   Resolution: Fixed
Fix Version/s: 2.4.0

Issue resolved by pull request 21178
[https://github.com/apache/spark/pull/21178]

> Avoid calling UGI loginUserFromKeytab in ThriftServer
> -
>
> Key: SPARK-24110
> URL: https://issues.apache.org/jira/browse/SPARK-24110
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.3.0
>Reporter: Saisai Shao
>Assignee: Saisai Shao
>Priority: Major
> Fix For: 2.4.0
>
>
> Spark ThriftServer calls UGI.loginUserFromKeytab twice during initialization. 
> This is unnecessary and can cause various problems, such as Hadoop IPC 
> failures after 7 days or RM failover issues.
> We should remove the unnecessary login logic and make sure the UGI in the 
> context is never created again.
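The "login once" idea described above can be sketched in plain Python. This is an illustrative analogue, not Hadoop's actual UserGroupInformation API; the function name and the module-level guard are assumptions for the sketch.

```python
# Illustrative idempotent-login guard (hypothetical names): the fix amounts to
# ensuring the keytab login runs a single time, with every later caller reusing
# the already-initialized identity instead of logging in again.
import threading

_login_lock = threading.Lock()
_logged_in_user = None

def login_from_keytab(principal: str) -> str:
    """Stand-in for UGI.loginUserFromKeytab: later calls are no-ops."""
    global _logged_in_user
    with _login_lock:
        if _logged_in_user is None:   # only the first call performs a login
            _logged_in_user = principal
        return _logged_in_user

print(login_from_keytab("hive/host@REALM"))   # performs the login
print(login_from_keytab("other@REALM"))       # reuses the first identity
```

Both calls return the identity established by the first login, which mirrors why a second loginUserFromKeytab call is unnecessary.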



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-24110) Avoid calling UGI loginUserFromKeytab in ThriftServer

2018-05-02 Thread Saisai Shao (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-24110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saisai Shao reassigned SPARK-24110:
---

Assignee: Saisai Shao

> Avoid calling UGI loginUserFromKeytab in ThriftServer
> -
>
> Key: SPARK-24110
> URL: https://issues.apache.org/jira/browse/SPARK-24110
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.3.0
>Reporter: Saisai Shao
>Assignee: Saisai Shao
>Priority: Major
> Fix For: 2.4.0
>
>
> Spark ThriftServer calls UGI.loginUserFromKeytab twice during initialization. 
> This is unnecessary and can cause various problems, such as Hadoop IPC 
> failures after 7 days or RM failover issues.
> We should remove the unnecessary login logic and make sure the UGI in the 
> context is never created again.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-24152) Flaky Test: SparkR

2018-05-02 Thread Hyukjin Kwon (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-24152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16461798#comment-16461798
 ] 

Hyukjin Kwon commented on SPARK-24152:
--

cc [~viirya] too

> Flaky Test: SparkR
> --
>
> Key: SPARK-24152
> URL: https://issues.apache.org/jira/browse/SPARK-24152
> Project: Spark
>  Issue Type: Bug
>  Components: SparkR
>Affects Versions: 2.4.0
>Reporter: Dongjoon Hyun
>Priority: Critical
>
> The PR builder and master branch tests fail with the following SparkR error 
> for an unknown reason. The error message is:
> {code}
> * this is package 'SparkR' version '2.4.0'
> * checking CRAN incoming feasibility ...Error in 
> .check_package_CRAN_incoming(pkgdir) : 
>   dims [product 24] do not match the length of object [0]
> Execution halted
> {code}
> *PR BUILDER*
> - https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90039/
> - https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89983/
> - https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89998/
> *MASTER BRANCH*
> - 
> https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-sbt-hadoop-2.6/4458/
>  (Fail with no failures)
> This is critical because we have already started merging PRs while ignoring 
> this **known unknown** SparkR failure.
> - https://github.com/apache/spark/pull/21175



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-22812) Failing cran-check on master

2018-05-02 Thread Hyukjin Kwon (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-22812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon reassigned SPARK-22812:


Assignee: Liang-Chi Hsieh

> Failing cran-check on master 
> -
>
> Key: SPARK-22812
> URL: https://issues.apache.org/jira/browse/SPARK-22812
> Project: Spark
>  Issue Type: Bug
>  Components: SparkR
>Affects Versions: 2.3.0
>Reporter: Hossein Falaki
>Assignee: Liang-Chi Hsieh
>Priority: Minor
>
> When I run {{R/run-tests.sh}} or {{R/check-cran.sh}} I get the following 
> failure message:
> {code}
> * checking CRAN incoming feasibility ...Error in 
> .check_package_CRAN_incoming(pkgdir) :
>   dims [product 22] do not match the length of object [0]
> {code}
> cc [~felixcheung] have you experienced this error before?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-24165) UDF within when().otherwise() raises NullPointerException

2018-05-02 Thread Jingxuan Wang (JIRA)
Jingxuan Wang created SPARK-24165:
-

 Summary: UDF within when().otherwise() raises NullPointerException
 Key: SPARK-24165
 URL: https://issues.apache.org/jira/browse/SPARK-24165
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 2.2.0
Reporter: Jingxuan Wang


I have a UDF that takes java.sql.Timestamp and String as input column types and 
returns an Array of (Seq[case class], Double) as output. Since some values in 
the input columns can be null, I wrapped the UDF in a when($input.isNull, 
null).otherwise(UDF) filter. The function works well when I test it in the 
Spark shell. But when run as a Scala jar via spark-submit in yarn cluster mode, 
it raises a NullPointerException that points to the UDF. If I remove the 
when().otherwise() condition and instead put the null check inside the UDF, the 
function works without issue in spark-submit.
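The workaround the reporter describes, moving the null check inside the UDF so the function is total, can be sketched in plain Python. This is a minimal illustrative analogue, not the reporter's actual code; the function name and the year-extraction logic are assumptions.

```python
# Sketch of the reported workaround: instead of wrapping the UDF in
# when(col.isNull(), null).otherwise(udf(col)), guard against null (None)
# inside the UDF body itself so it never dereferences a missing value.
from typing import Optional

def safe_extract_year(timestamp: Optional[str]) -> Optional[int]:
    """UDF-style function that tolerates a null input column value."""
    if timestamp is None:          # null check moved inside the UDF
        return None
    return int(timestamp[:4])      # e.g. '2018-05-02 ...' -> 2018

print(safe_extract_year(None))            # None instead of an exception
print(safe_extract_year("2018-05-02"))    # 2018
```

A UDF written this way stays safe regardless of whether the surrounding plan short-circuits the null branch.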



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-23429) Add executor memory metrics to heartbeat and expose in executors REST API

2018-05-02 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16461752#comment-16461752
 ] 

Apache Spark commented on SPARK-23429:
--

User 'edwinalu' has created a pull request for this issue:
https://github.com/apache/spark/pull/21221

> Add executor memory metrics to heartbeat and expose in executors REST API
> -
>
> Key: SPARK-23429
> URL: https://issues.apache.org/jira/browse/SPARK-23429
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core
>Affects Versions: 2.2.1
>Reporter: Edwina Lu
>Priority: Major
>
> Add new executor level memory metrics ( jvmUsedMemory, onHeapExecutionMemory, 
> offHeapExecutionMemory, onHeapStorageMemory, offHeapStorageMemory, 
> onHeapUnifiedMemory, and offHeapUnifiedMemory), and expose these via the 
> executors REST API. This information will help provide insight into how 
> executor and driver JVM memory is used, and for the different memory regions. 
> It can be used to help determine good values for spark.executor.memory, 
> spark.driver.memory, spark.memory.fraction, and spark.memory.storageFraction.
> Add an ExecutorMetrics class, with jvmUsedMemory, onHeapExecutionMemory, 
> offHeapExecutionMemory, onHeapStorageMemory, and offHeapStorageMemory. This 
> will track the memory usage at the executor level. The new ExecutorMetrics 
> will be sent by executors to the driver as part of the Heartbeat. A heartbeat 
> will be added for the driver as well, to collect these metrics for the driver.
> Modify the EventLoggingListener to log ExecutorMetricsUpdate events if there 
> is a new peak value for one of the memory metrics for an executor and stage. 
> Only the ExecutorMetrics will be logged, and not the TaskMetrics, to minimize 
> additional logging. Analysis of a set of sample applications showed a 0.25% 
> increase in the size of the Spark history log with this approach.
> Modify the AppStatusListener to collect snapshots of peak values for each 
> memory metric. Each snapshot has the time, jvmUsedMemory, executionMemory and 
> storageMemory, and list of active stages.
> Add the new memory metrics (snapshots of peak values for each memory metric) 
> to the executors REST API.
> This is a subtask for SPARK-23206. Please refer to the design doc for that 
> ticket for more details.
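The "snapshot of peak values" idea described above can be sketched in plain Python. This is an illustrative analogue under stated assumptions, not Spark's implementation; the class and metric names merely mirror the ticket.

```python
# Sketch of peak-value tracking: keep one running peak per memory metric and
# record a snapshot only when some peak advances, which bounds how many
# ExecutorMetricsUpdate-style events get logged.
from dataclasses import dataclass, field

@dataclass
class PeakTracker:
    peaks: dict = field(default_factory=dict)
    snapshots: list = field(default_factory=list)

    def update(self, time, metrics):
        """Record a snapshot only if any metric exceeds its previous peak."""
        advanced = False
        for name, value in metrics.items():
            if value > self.peaks.get(name, -1):
                self.peaks[name] = value
                advanced = True
        if advanced:
            self.snapshots.append((time, dict(self.peaks)))

tracker = PeakTracker()
tracker.update(1, {"jvm_used_memory": 100, "on_heap_execution": 50})
tracker.update(2, {"jvm_used_memory": 90,  "on_heap_execution": 40})  # no peak
tracker.update(3, {"jvm_used_memory": 120, "on_heap_execution": 40})
print(len(tracker.snapshots))   # 2 -- only peak-advancing heartbeats are kept
```

Only the first and third heartbeats produce snapshots, which is why the ticket's approach keeps the history-log overhead small.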



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-24164) Support column list as the pivot column in Pivot

2018-05-02 Thread Maryann Xue (JIRA)
Maryann Xue created SPARK-24164:
---

 Summary: Support column list as the pivot column in Pivot
 Key: SPARK-24164
 URL: https://issues.apache.org/jira/browse/SPARK-24164
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 2.3.0
Reporter: Maryann Xue


This is part of the functionality extension to Pivot SQL support tracked in SPARK-24035.

Currently, we support only a single column as the pivot column. A column list 
as the pivot column would look like:
{code:java}
SELECT * FROM (
  SELECT year, course, earnings FROM courseSales
)
PIVOT (
  sum(earnings)
  FOR (course, year) IN (('dotNET', 2012), ('Java', 2013))
);{code}
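The multi-column pivot above can be illustrated with a tiny pure-Python aggregation. This is a sketch of the semantics only, not Spark's implementation; the sample rows are invented for illustration.

```python
# Sketch of pivoting on a column *list*: the pivot key is the composite
# (course, year) tuple rather than a single column value, and sum(earnings)
# is computed per composite key appearing in the IN list.
from collections import defaultdict

course_sales = [
    ("dotNET", 2012, 10000),
    ("Java",   2013, 20000),
    ("dotNET", 2012, 5000),
]

pivot_keys = [("dotNET", 2012), ("Java", 2013)]
sums = defaultdict(int)
for course, year, earnings in course_sales:
    if (course, year) in pivot_keys:
        sums[(course, year)] += earnings   # sum(earnings) per composite key

row = {key: sums[key] for key in pivot_keys}
print(row)   # {('dotNET', 2012): 15000, ('Java', 2013): 20000}
```

Each composite key in the IN list becomes one output column, exactly as each scalar value does in the single-column case.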



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-24163) Support "ANY" or sub-query for Pivot "IN" clause

2018-05-02 Thread Maryann Xue (JIRA)
Maryann Xue created SPARK-24163:
---

 Summary: Support "ANY" or sub-query for Pivot "IN" clause
 Key: SPARK-24163
 URL: https://issues.apache.org/jira/browse/SPARK-24163
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 2.3.0
Reporter: Maryann Xue


This is part of the functionality extension to Pivot SQL support tracked in SPARK-24035.

Currently, only literal values are allowed in the Pivot "IN" clause. To support 
ANY or a sub-query in the "IN" clause (examples of which are provided below), 
we need to enable evaluation of a sub-query before or during query analysis.
{code:java}
SELECT * FROM (
  SELECT year, course, earnings FROM courseSales
)
PIVOT (
  sum(earnings)
  FOR course IN ANY
);{code}
{code:java}
SELECT * FROM (
  SELECT year, course, earnings FROM courseSales
)
PIVOT (
  sum(earnings)
  FOR course IN (
SELECT course FROM courses
WHERE region = 'AZ'
  )
);
{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-24162) Support aliased literal values for Pivot "IN" clause

2018-05-02 Thread Maryann Xue (JIRA)
Maryann Xue created SPARK-24162:
---

 Summary: Support aliased literal values for Pivot "IN" clause
 Key: SPARK-24162
 URL: https://issues.apache.org/jira/browse/SPARK-24162
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 2.3.0
Reporter: Maryann Xue


This is part of the functionality extension to Pivot SQL support tracked in SPARK-24035.

When literal values are specified in the Pivot "IN" clause, it would be nice to 
allow aliases for those values so that the output column names can be 
customized. For example:
{code:java}
SELECT * FROM (
  SELECT year, course, earnings FROM courseSales
)
PIVOT (
  sum(earnings)
  FOR course IN ('dotNET' as c1, 'Java' as c2)
);{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-24111) Add TPCDS v2.7 (latest) queries in TPCDSQueryBenchmark

2018-05-02 Thread Xiao Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-24111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li resolved SPARK-24111.
-
   Resolution: Fixed
 Assignee: Takeshi Yamamuro
Fix Version/s: 2.4.0

> Add TPCDS v2.7 (latest) queries in TPCDSQueryBenchmark
> --
>
> Key: SPARK-24111
> URL: https://issues.apache.org/jira/browse/SPARK-24111
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 2.3.0
>Reporter: Takeshi Yamamuro
>Assignee: Takeshi Yamamuro
>Priority: Trivial
> Fix For: 2.4.0
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-24161) Enable debug package feature on structured streaming

2018-05-02 Thread Jungtaek Lim (JIRA)
Jungtaek Lim created SPARK-24161:


 Summary: Enable debug package feature on structured streaming
 Key: SPARK-24161
 URL: https://issues.apache.org/jira/browse/SPARK-24161
 Project: Spark
  Issue Type: Improvement
  Components: Structured Streaming
Affects Versions: 2.3.0
Reporter: Jungtaek Lim


Currently, the debug package has an implicit class that matches Dataset to 
provide debug features on the Dataset class. It doesn't work with structured 
streaming: it requires the query to already be started, and the information can 
be retrieved from StreamingQuery, not Dataset. For the same reason, "explain" 
had to be placed on StreamingQuery even though it exists on Dataset.

This issue tracks the effort to enable the debug package feature on structured 
streaming. Unlike batch, it may have some restrictions.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-24161) Enable debug package feature on structured streaming

2018-05-02 Thread Jungtaek Lim (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-24161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16461713#comment-16461713
 ] 

Jungtaek Lim commented on SPARK-24161:
--

I have a working patch and will open a PR soon.

> Enable debug package feature on structured streaming
> 
>
> Key: SPARK-24161
> URL: https://issues.apache.org/jira/browse/SPARK-24161
> Project: Spark
>  Issue Type: Improvement
>  Components: Structured Streaming
>Affects Versions: 2.3.0
>Reporter: Jungtaek Lim
>Priority: Major
>
> Currently, the debug package has an implicit class that matches Dataset to 
> provide debug features on the Dataset class. It doesn't work with structured 
> streaming: it requires the query to already be started, and the information 
> can be retrieved from StreamingQuery, not Dataset. For the same reason, 
> "explain" had to be placed on StreamingQuery even though it exists on Dataset.
> This issue tracks the effort to enable the debug package feature on 
> structured streaming. Unlike batch, it may have some restrictions.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-24157) Enable no-data micro batches for streaming aggregation and deduplication

2018-05-02 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-24157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16461680#comment-16461680
 ] 

Apache Spark commented on SPARK-24157:
--

User 'tdas' has created a pull request for this issue:
https://github.com/apache/spark/pull/21220

> Enable no-data micro batches for streaming aggregation and deduplication
> 
>
> Key: SPARK-24157
> URL: https://issues.apache.org/jira/browse/SPARK-24157
> Project: Spark
>  Issue Type: Sub-task
>  Components: Structured Streaming
>Affects Versions: 2.3.0
>Reporter: Tathagata Das
>Assignee: Tathagata Das
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-24157) Enable no-data micro batches for streaming aggregation and deduplication

2018-05-02 Thread Apache Spark (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-24157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-24157:


Assignee: Apache Spark  (was: Tathagata Das)

> Enable no-data micro batches for streaming aggregation and deduplication
> 
>
> Key: SPARK-24157
> URL: https://issues.apache.org/jira/browse/SPARK-24157
> Project: Spark
>  Issue Type: Sub-task
>  Components: Structured Streaming
>Affects Versions: 2.3.0
>Reporter: Tathagata Das
>Assignee: Apache Spark
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-24157) Enable no-data micro batches for streaming aggregation and deduplication

2018-05-02 Thread Apache Spark (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-24157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-24157:


Assignee: Tathagata Das  (was: Apache Spark)

> Enable no-data micro batches for streaming aggregation and deduplication
> 
>
> Key: SPARK-24157
> URL: https://issues.apache.org/jira/browse/SPARK-24157
> Project: Spark
>  Issue Type: Sub-task
>  Components: Structured Streaming
>Affects Versions: 2.3.0
>Reporter: Tathagata Das
>Assignee: Tathagata Das
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-24160) ShuffleBlockFetcherIterator should fail if it receives zero-size blocks

2018-05-02 Thread Apache Spark (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-24160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-24160:


Assignee: Josh Rosen  (was: Apache Spark)

> ShuffleBlockFetcherIterator should fail if it receives zero-size blocks
> ---
>
> Key: SPARK-24160
> URL: https://issues.apache.org/jira/browse/SPARK-24160
> Project: Spark
>  Issue Type: Improvement
>  Components: Shuffle
>Affects Versions: 2.3.0
>Reporter: Josh Rosen
>Assignee: Josh Rosen
>Priority: Major
>
> In the shuffle layer, we guarantee that zero-size blocks will never be 
> requested (a block containing zero records is always 0 bytes in size and is 
> marked as empty such that it will never be legitimately requested by 
> executors). However, we failed to take advantage of this in the shuffle-read 
> path: the existing code did not explicitly check whether blocks are 
> non-zero-size.
>  
> We should add `buf.size != 0` checks to ShuffleBlockFetcherIterator to take 
> advantage of this invariant and prevent potential data loss / corruption 
> issues. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-24160) ShuffleBlockFetcherIterator should fail if it receives zero-size blocks

2018-05-02 Thread Apache Spark (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-24160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-24160:


Assignee: Apache Spark  (was: Josh Rosen)

> ShuffleBlockFetcherIterator should fail if it receives zero-size blocks
> ---
>
> Key: SPARK-24160
> URL: https://issues.apache.org/jira/browse/SPARK-24160
> Project: Spark
>  Issue Type: Improvement
>  Components: Shuffle
>Affects Versions: 2.3.0
>Reporter: Josh Rosen
>Assignee: Apache Spark
>Priority: Major
>
> In the shuffle layer, we guarantee that zero-size blocks will never be 
> requested (a block containing zero records is always 0 bytes in size and is 
> marked as empty such that it will never be legitimately requested by 
> executors). However, we failed to take advantage of this in the shuffle-read 
> path: the existing code did not explicitly check whether blocks are 
> non-zero-size.
>  
> We should add `buf.size != 0` checks to ShuffleBlockFetcherIterator to take 
> advantage of this invariant and prevent potential data loss / corruption 
> issues. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-24160) ShuffleBlockFetcherIterator should fail if it receives zero-size blocks

2018-05-02 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-24160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16461675#comment-16461675
 ] 

Apache Spark commented on SPARK-24160:
--

User 'JoshRosen' has created a pull request for this issue:
https://github.com/apache/spark/pull/21219

> ShuffleBlockFetcherIterator should fail if it receives zero-size blocks
> ---
>
> Key: SPARK-24160
> URL: https://issues.apache.org/jira/browse/SPARK-24160
> Project: Spark
>  Issue Type: Improvement
>  Components: Shuffle
>Affects Versions: 2.3.0
>Reporter: Josh Rosen
>Assignee: Josh Rosen
>Priority: Major
>
> In the shuffle layer, we guarantee that zero-size blocks will never be 
> requested (a block containing zero records is always 0 bytes in size and is 
> marked as empty such that it will never be legitimately requested by 
> executors). However, we failed to take advantage of this in the shuffle-read 
> path: the existing code did not explicitly check whether blocks are 
> non-zero-size.
>  
> We should add `buf.size != 0` checks to ShuffleBlockFetcherIterator to take 
> advantage of this invariant and prevent potential data loss / corruption 
> issues. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-24160) ShuffleBlockFetcherIterator should fail if it receives zero-size blocks

2018-05-02 Thread Josh Rosen (JIRA)
Josh Rosen created SPARK-24160:
--

 Summary: ShuffleBlockFetcherIterator should fail if it receives 
zero-size blocks
 Key: SPARK-24160
 URL: https://issues.apache.org/jira/browse/SPARK-24160
 Project: Spark
  Issue Type: Improvement
  Components: Shuffle
Affects Versions: 2.3.0
Reporter: Josh Rosen
Assignee: Josh Rosen


In the shuffle layer, we guarantee that zero-size blocks will never be 
requested (a block containing zero records is always 0 bytes in size and is 
marked as empty such that it will never be legitimately requested by 
executors). However, we failed to take advantage of this in the shuffle-read 
path: the existing code did not explicitly check whether blocks are 
non-zero-size.

 

We should add `buf.size != 0` checks to ShuffleBlockFetcherIterator to take 
advantage of this invariant and prevent potential data loss / corruption 
issues. 
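A minimal sketch of the proposed guard, using a stand-in type since Spark's actual ManagedBuffer and block-id types are not reproduced here (names are illustrative):

```scala
// Stand-in for Spark's ManagedBuffer; only `size` matters for this sketch.
trait SizedBuffer { def size: Long }

// Fail fast on zero-size blocks: an empty block should never have been
// requested, so receiving one indicates loss or corruption upstream.
def checkReceivedBlock(blockId: String, buf: SizedBuffer): Unit = {
  if (buf.size == 0) {
    throw new IllegalStateException(
      s"Received a zero-size buffer for block $blockId; " +
        "zero-size blocks should never be requested")
  }
}
```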



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-24155) Instrumentation improvement for clustering

2018-05-02 Thread Lu Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-24155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lu Wang updated SPARK-24155:

Summary: Instrumentation improvement for clustering  (was: Instrument 
improvement for clustering)

> Instrumentation improvement for clustering
> --
>
> Key: SPARK-24155
> URL: https://issues.apache.org/jira/browse/SPARK-24155
> Project: Spark
>  Issue Type: Sub-task
>  Components: ML
>Affects Versions: 2.3.0
>Reporter: Lu Wang
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-24132) Instrumentation improvement for classification

2018-05-02 Thread Lu Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-24132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lu Wang updated SPARK-24132:

Summary: Instrumentation improvement for classification  (was: Instruments 
improvement for classification)

> Instrumentation improvement for classification
> --
>
> Key: SPARK-24132
> URL: https://issues.apache.org/jira/browse/SPARK-24132
> Project: Spark
>  Issue Type: Sub-task
>  Components: ML
>Affects Versions: 2.3.0
>Reporter: Lu Wang
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-24159) Enable no-data micro batches for streaming mapGroupswithState

2018-05-02 Thread Tathagata Das (JIRA)
Tathagata Das created SPARK-24159:
-

 Summary: Enable no-data micro batches for streaming 
mapGroupswithState
 Key: SPARK-24159
 URL: https://issues.apache.org/jira/browse/SPARK-24159
 Project: Spark
  Issue Type: Sub-task
  Components: Structured Streaming
Affects Versions: 2.3.0
Reporter: Tathagata Das


When event-time timeout is enabled, use watermark updates to decide whether to 
run another batch.

When processing-time timeout is enabled, use the processing time to decide when 
to run more batches.




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-24158) Enable no-data micro batches for streaming joins

2018-05-02 Thread Tathagata Das (JIRA)
Tathagata Das created SPARK-24158:
-

 Summary: Enable no-data micro batches for streaming joins
 Key: SPARK-24158
 URL: https://issues.apache.org/jira/browse/SPARK-24158
 Project: Spark
  Issue Type: Sub-task
  Components: Structured Streaming
Affects Versions: 2.3.0
Reporter: Tathagata Das






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-24158) Enable no-data micro batches for streaming joins

2018-05-02 Thread Tathagata Das (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-24158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tathagata Das reassigned SPARK-24158:
-

Assignee: Tathagata Das

> Enable no-data micro batches for streaming joins
> 
>
> Key: SPARK-24158
> URL: https://issues.apache.org/jira/browse/SPARK-24158
> Project: Spark
>  Issue Type: Sub-task
>  Components: Structured Streaming
>Affects Versions: 2.3.0
>Reporter: Tathagata Das
>Assignee: Tathagata Das
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-24156) Enable no-data micro batches for more eager streaming state clean up

2018-05-02 Thread Tathagata Das (JIRA)
Tathagata Das created SPARK-24156:
-

 Summary: Enable no-data micro batches for more eager streaming 
state clean up 
 Key: SPARK-24156
 URL: https://issues.apache.org/jira/browse/SPARK-24156
 Project: Spark
  Issue Type: Improvement
  Components: Structured Streaming
Affects Versions: 2.3.0
Reporter: Tathagata Das
Assignee: Tathagata Das


Currently, MicroBatchExecution in Structured Streaming runs batches only when 
there is new data to process. This is sensible in most cases, as we don't want 
to unnecessarily use resources when there is nothing new to process. However, in 
some cases of stateful streaming queries, this delays state clean-up as well as 
clean-up-based output. For example, consider a streaming aggregation query with 
watermark-based state cleanup. The watermark is updated after every batch with 
new data completes. The updated value is used in the next batch to clean up 
state and to output finalized aggregates in append mode. However, if there is no 
data, then the next batch does not occur, and cleanup/output gets delayed 
unnecessarily. This is true for all stateful streaming operators: aggregation, 
deduplication, joins, and mapGroupsWithState.

This issue tracks the work to enable no-data batches in MicroBatchExecution. 
The major challenge is that all the tests of relevant stateful operations add 
dummy data to force another batch for testing the state cleanup, so a lot of 
the tests are going to be changed. The plan is therefore to enable no-data 
batches for different stateful operators one at a time.
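The trigger decision described above could be sketched as follows (illustrative names only, not Spark's actual internals):

```scala
// Run a batch either when new data arrived or when the event-time watermark
// advanced since the last batch, so that state cleanup and append-mode
// output are not postponed until the next data arrival.
def shouldRunBatch(newDataAvailable: Boolean,
                   lastWatermarkMs: Long,
                   currentWatermarkMs: Long): Boolean = {
  newDataAvailable || currentWatermarkMs > lastWatermarkMs
}
```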



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-24157) Enable no-data micro batches for streaming aggregation and deduplication

2018-05-02 Thread Tathagata Das (JIRA)
Tathagata Das created SPARK-24157:
-

 Summary: Enable no-data micro batches for streaming aggregation 
and deduplication
 Key: SPARK-24157
 URL: https://issues.apache.org/jira/browse/SPARK-24157
 Project: Spark
  Issue Type: Sub-task
  Components: Structured Streaming
Affects Versions: 2.3.0
Reporter: Tathagata Das
Assignee: Tathagata Das






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-24155) Instrument improvement for clustering

2018-05-02 Thread Apache Spark (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-24155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-24155:


Assignee: Apache Spark

> Instrument improvement for clustering
> -
>
> Key: SPARK-24155
> URL: https://issues.apache.org/jira/browse/SPARK-24155
> Project: Spark
>  Issue Type: Sub-task
>  Components: ML
>Affects Versions: 2.3.0
>Reporter: Lu Wang
>Assignee: Apache Spark
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-24155) Instrument improvement for clustering

2018-05-02 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-24155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16461640#comment-16461640
 ] 

Apache Spark commented on SPARK-24155:
--

User 'ludatabricks' has created a pull request for this issue:
https://github.com/apache/spark/pull/21218

> Instrument improvement for clustering
> -
>
> Key: SPARK-24155
> URL: https://issues.apache.org/jira/browse/SPARK-24155
> Project: Spark
>  Issue Type: Sub-task
>  Components: ML
>Affects Versions: 2.3.0
>Reporter: Lu Wang
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-24155) Instrument improvement for clustering

2018-05-02 Thread Apache Spark (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-24155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-24155:


Assignee: (was: Apache Spark)

> Instrument improvement for clustering
> -
>
> Key: SPARK-24155
> URL: https://issues.apache.org/jira/browse/SPARK-24155
> Project: Spark
>  Issue Type: Sub-task
>  Components: ML
>Affects Versions: 2.3.0
>Reporter: Lu Wang
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-24155) Instrument improvement for clustering

2018-05-02 Thread Lu Wang (JIRA)
Lu Wang created SPARK-24155:
---

 Summary: Instrument improvement for clustering
 Key: SPARK-24155
 URL: https://issues.apache.org/jira/browse/SPARK-24155
 Project: Spark
  Issue Type: Sub-task
  Components: ML
Affects Versions: 2.3.0
Reporter: Lu Wang






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-18791) Stream-Stream Joins

2018-05-02 Thread Tathagata Das (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-18791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tathagata Das resolved SPARK-18791.
---
   Resolution: Done
Fix Version/s: 2.3.0

> Stream-Stream Joins
> ---
>
> Key: SPARK-18791
> URL: https://issues.apache.org/jira/browse/SPARK-18791
> Project: Spark
>  Issue Type: New Feature
>  Components: Structured Streaming
>Reporter: Michael Armbrust
>Assignee: Tathagata Das
>Priority: Major
> Fix For: 2.3.0
>
>
> Stream-stream join is a much-requested but missing feature in Structured 
> Streaming. While the join API exists in Datasets and DataFrames, it throws 
> UnsupportedOperationException when applied between two streaming 
> Datasets/DataFrames. To support this, we have to maintain the same semantics 
> as other Structured Streaming operations: the result of the operation after 
> consuming the two data streams' data up to positions/offsets X and Y, 
> respectively, must be the same as a single batch join operation on all the 
> data up to positions X and Y, respectively. To achieve this, the execution 
> has to buffer past data (i.e. streaming state) from each stream, so that 
> future data can be matched against past data. Here is a set of high-level 
> requirements: 
> - Buffer past rows as streaming state (using StateStore), and joining with 
> the past rows.
> - Support state cleanup using the event time watermark when possible.
> - Support different types of joins (inner, left outer, right outer is in 
> highest demand for ETL/enrichment type use cases [kafka -> best-effort enrich 
> -> write to S3])
> - Support cascading join operations (i.e. joining more than 2 streams)
> - Support multiple output modes (Append mode is in highest demand for 
> enabling ETL/enrichment type use cases)
> All the work to incrementally build this is going to be represented by this 
> JIRA, with specific subtasks for each step. At this point, the rough direction 
> is as follows:
> - Implement stream-stream inner join in Append Mode, supporting multiple 
> cascaded joins.
> - Extend it to stream-stream left/right outer join in Append Mode.
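For context, a stream-stream inner join of the kind this JIRA tracks might look like the following once supported (column and topic names are made up; watermarks on both sides let the engine eventually drop buffered state):

```scala
import org.apache.spark.sql.functions.expr

// Both sides declare event-time watermarks so buffered join state can be
// cleaned up once matches can no longer occur.
val impressions = spark.readStream.format("kafka")
  .option("subscribe", "impressions").load()
  .withWatermark("impressionTime", "2 hours")

val clicks = spark.readStream.format("kafka")
  .option("subscribe", "clicks").load()
  .withWatermark("clickTime", "3 hours")

// Inner join with an event-time range condition bounding the state.
val joined = impressions.join(clicks, expr("""
  clickAdId = impressionAdId AND
  clickTime >= impressionTime AND
  clickTime <= impressionTime + interval 1 hour"""))
```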



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-23923) High-order function: cardinality(x) → bigint

2018-05-02 Thread Xiao Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-23923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li resolved SPARK-23923.
-
   Resolution: Fixed
 Assignee: Kazuaki Ishizaki
Fix Version/s: 2.4.0

> High-order function: cardinality(x) → bigint
> 
>
> Key: SPARK-23923
> URL: https://issues.apache.org/jira/browse/SPARK-23923
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 2.3.0
>Reporter: Xiao Li
>Assignee: Kazuaki Ishizaki
>Priority: Major
> Fix For: 2.4.0
>
>
> Ref: https://prestodb.io/docs/current/functions/array.html and  
> https://prestodb.io/docs/current/functions/map.html.
> Returns the cardinality (size) of the array/map x.
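Illustrative behaviour once implemented, assuming the Presto semantics referenced above (a sketch, not an authoritative test):

```scala
// cardinality() should return the number of elements of an array or map.
spark.sql("SELECT cardinality(array(1, 2, 3))").show()      // 3
spark.sql("SELECT cardinality(map('a', 1, 'b', 2))").show() // 2
```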



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-24123) Fix a flaky test `DateTimeUtilsSuite.monthsBetween`

2018-05-02 Thread Xiao Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-24123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li resolved SPARK-24123.
-
   Resolution: Fixed
 Assignee: Marco Gaido
Fix Version/s: 2.4.0

> Fix a flaky test `DateTimeUtilsSuite.monthsBetween`
> ---
>
> Key: SPARK-24123
> URL: https://issues.apache.org/jira/browse/SPARK-24123
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.4.0
>Reporter: Dongjoon Hyun
>Assignee: Marco Gaido
>Priority: Minor
> Fix For: 2.4.0
>
>
> **MASTER BRANCH**
> https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-2.6/4810/testReport/org.apache.spark.sql.catalyst.util/DateTimeUtilsSuite/monthsBetween/
> {code}
> Error Message
> 3.949596773820191 did not equal 3.9495967741935485
> Stacktrace
>   org.scalatest.exceptions.TestFailedException: 3.949596773820191 did not 
> equal 3.9495967741935485
>   at 
> org.scalatest.Assertions$class.newAssertionFailedException(Assertions.scala:528)
>   at 
> org.scalatest.FunSuite.newAssertionFailedException(FunSuite.scala:1560)
>   at 
> org.scalatest.Assertions$AssertionsHelper.macroAssert(Assertions.scala:501)
>   at 
> org.apache.spark.sql.catalyst.util.DateTimeUtilsSuite$$anonfun$25.apply(DateTimeUtilsSuite.scala:495)
>   at 
> org.apache.spark.sql.catalyst.util.DateTimeUtilsSuite$$anonfun$25.apply(DateTimeUtilsSuite.scala:488)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-24097) Instruments improvements - RandomForest and GradientBoostedTree

2018-05-02 Thread Joseph K. Bradley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-24097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph K. Bradley updated SPARK-24097:
--
Shepherd: Joseph K. Bradley

> Instruments improvements - RandomForest and GradientBoostedTree
> ---
>
> Key: SPARK-24097
> URL: https://issues.apache.org/jira/browse/SPARK-24097
> Project: Spark
>  Issue Type: Sub-task
>  Components: ML
>Affects Versions: 2.3.0
>Reporter: Weichen Xu
>Priority: Major
>
> Instruments improvements - RandomForest and GradientBoostedTree



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-24097) Instruments improvements - RandomForest and GradientBoostedTree

2018-05-02 Thread Joseph K. Bradley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-24097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph K. Bradley reassigned SPARK-24097:
-

Assignee: Weichen Xu

> Instruments improvements - RandomForest and GradientBoostedTree
> ---
>
> Key: SPARK-24097
> URL: https://issues.apache.org/jira/browse/SPARK-24097
> Project: Spark
>  Issue Type: Sub-task
>  Components: ML
>Affects Versions: 2.3.0
>Reporter: Weichen Xu
>Assignee: Weichen Xu
>Priority: Major
>
> Instruments improvements - RandomForest and GradientBoostedTree



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-24133) Reading Parquet files containing large strings can fail with java.lang.ArrayIndexOutOfBoundsException

2018-05-02 Thread Xiao Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-24133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li resolved SPARK-24133.
-
   Resolution: Fixed
 Assignee: Ala Luszczak
Fix Version/s: 2.4.0

> Reading Parquet files containing large strings can fail with 
> java.lang.ArrayIndexOutOfBoundsException
> -
>
> Key: SPARK-24133
> URL: https://issues.apache.org/jira/browse/SPARK-24133
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.3.0
>Reporter: Ala Luszczak
>Assignee: Ala Luszczak
>Priority: Major
> Fix For: 2.4.0
>
>
> ColumnVectors store string data in one big byte array. Since the array size 
> is capped at just under Integer.MAX_VALUE, a single ColumnVector cannot store 
> more than 2GB of string data.
> However, since the Parquet files commonly contain large blobs stored as 
> strings, and ColumnVectors by default carry 4096 values, it's entirely 
> possible to go past that limit.
> In such cases a negative capacity is requested from 
> WritableColumnVector.reserve(). The call succeeds (the requested capacity is 
> smaller than what is already allocated), and consequently 
> java.lang.ArrayIndexOutOfBoundsException is thrown when the reader actually 
> attempts to put the data into the array.
> This behavior is hard to troubleshoot for users. Spark should instead check 
> for a negative requested capacity in WritableColumnVector.reserve() and throw 
> a more informative error, instructing the user to tweak the ColumnarBatch 
> size.
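A hedged sketch of what such a guard could look like; the method shape and the wording of the message are illustrative, not Spark's actual code:

```scala
// Reject integer-overflowed (negative) capacity requests up front with an
// actionable message, instead of failing later with an
// ArrayIndexOutOfBoundsException deep inside the reader.
def reserve(requiredCapacity: Int): Unit = {
  if (requiredCapacity < 0) {
    throw new RuntimeException(
      "Cannot reserve the requested number of bytes in a single column " +
        "vector; consider reducing the ColumnarBatch size (rows per batch)")
  }
  // ... existing capacity-growth logic ...
}
```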



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-24154) AccumulatorV2 loses type information during serialization

2018-05-02 Thread Sergey Zhemzhitsky (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-24154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Zhemzhitsky updated SPARK-24154:
---
Description: 
AccumulatorV2 loses type information during serialization.
It happens 
[here|https://github.com/apache/spark/blob/4f5bad615b47d743b8932aea1071652293981604/core/src/main/scala/org/apache/spark/util/AccumulatorV2.scala#L164]
 during *writeReplace* call
{code:scala}
final protected def writeReplace(): Any = {
  if (atDriverSide) {
    if (!isRegistered) {
      throw new UnsupportedOperationException(
        "Accumulator must be registered before send to executor")
    }
    val copyAcc = copyAndReset()
    assert(copyAcc.isZero, "copyAndReset must return a zero value copy")
    val isInternalAcc = name.isDefined &&
      name.get.startsWith(InternalAccumulator.METRICS_PREFIX)
    if (isInternalAcc) {
      // Do not serialize the name of internal accumulator and send it to executor.
      copyAcc.metadata = metadata.copy(name = None)
    } else {
      // For non-internal accumulators, we still need to send the name because
      // users may need to access the accumulator name at executor side, or they
      // may keep the accumulators sent from executors and access the name when
      // the registered accumulator is already garbage collected(e.g. SQLMetrics).
      copyAcc.metadata = metadata
    }
    copyAcc
  } else {
    this
  }
}
{code}

This means it is effectively impossible to create new accumulators by adding 
behaviour to existing ones via mix-ins or inheritance, without also overriding 
*copy*.

For example the following snippet ...
{code:scala}
// jl is an alias for java.lang, assumed by the original snippet
import java.{lang => jl}

trait TripleCount {
  self: LongAccumulator =>
  abstract override def add(v: jl.Long): Unit = {
    self.add(v * 3)
  }
}
val acc = new LongAccumulator with TripleCount
sc.register(acc)

val data = 1 to 10
val rdd = sc.makeRDD(data, 5)

rdd.foreach(acc.add(_))
acc.value shouldBe 3 * data.sum
{code}

... fails with

{code:none}
org.scalatest.exceptions.TestFailedException: 55 was not equal to 165
  at org.scalatest.MatchersHelper$.indicateFailure(MatchersHelper.scala:340)
  at org.scalatest.Matchers$AnyShouldWrapper.shouldBe(Matchers.scala:6864)
{code}

Also, such behaviour seems to be error-prone and confusing, because an 
implementor does not get the same thing as they see in the code.

  was:
AccumulatorV2 loses type information during serialization.
It happens 
[here|https://github.com/apache/spark/blob/4f5bad615b47d743b8932aea1071652293981604/core/src/main/scala/org/apache/spark/util/AccumulatorV2.scala#L164]
 during *writeReplace* call
{code:scala}
final protected def writeReplace(): Any = {
  if (atDriverSide) {
    if (!isRegistered) {
      throw new UnsupportedOperationException(
        "Accumulator must be registered before send to executor")
    }
    val copyAcc = copyAndReset()
    assert(copyAcc.isZero, "copyAndReset must return a zero value copy")
    val isInternalAcc = name.isDefined &&
      name.get.startsWith(InternalAccumulator.METRICS_PREFIX)
    if (isInternalAcc) {
      // Do not serialize the name of internal accumulator and send it to executor.
      copyAcc.metadata = metadata.copy(name = None)
    } else {
      // For non-internal accumulators, we still need to send the name because
      // users may need to access the accumulator name at executor side, or they
      // may keep the accumulators sent from executors and access the name when
      // the registered accumulator is already garbage collected(e.g. SQLMetrics).
      copyAcc.metadata = metadata
    }
    copyAcc
  } else {
    this
  }
}
{code}

This means it is effectively impossible to create new accumulators by adding 
behaviour to existing ones via mix-ins or inheritance, without also overriding 
*copy*.

For example the following snippet ...
{code:scala}
// jl is an alias for java.lang, assumed by the original snippet
import java.{lang => jl}

trait TripleCount {
  self: LongAccumulator =>
  abstract override def add(v: jl.Long): Unit = {
    self.add(v * 3)
  }
}
val acc = new LongAccumulator with TripleCount
sc.register(acc)

val data = 1 to 10
val rdd = sc.makeRDD(data, 5)

rdd.foreach(acc.add(_))
acc.value shouldBe 3 * data.sum
{code}

... fails with

{code:none}
org.scalatest.exceptions.TestFailedException: 55 was not equal to 165
  at org.scalatest.MatchersHelper$.indicateFailure(MatchersHelper.scala:340)
  at org.scalatest.Matchers$AnyShouldWrapper.shouldBe(Matchers.scala:6864)
{code}


> AccumulatorV2 loses type information during serialization
> -
>
> Key: SPARK-24154
> URL: https://issues.apache.org/jira/browse/SPARK-24154
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.2.0, 2.2.1, 2.3.0, 2.3.1
> Environment: Scala 2.11
> Spark 2.2.0
>Reporter: Sergey Zhemzhitsky
>Priority: Major
>
> AccumulatorV2 loses 

[jira] [Created] (SPARK-24154) AccumulatorV2 loses type information during serialization

2018-05-02 Thread Sergey Zhemzhitsky (JIRA)
Sergey Zhemzhitsky created SPARK-24154:
--

 Summary: AccumulatorV2 loses type information during serialization
 Key: SPARK-24154
 URL: https://issues.apache.org/jira/browse/SPARK-24154
 Project: Spark
  Issue Type: Bug
  Components: Spark Core
Affects Versions: 2.3.0, 2.2.1, 2.2.0, 2.3.1
 Environment: Scala 2.11
Spark 2.2.0
Reporter: Sergey Zhemzhitsky


AccumulatorV2 loses type information during serialization.
It happens 
[here|https://github.com/apache/spark/blob/4f5bad615b47d743b8932aea1071652293981604/core/src/main/scala/org/apache/spark/util/AccumulatorV2.scala#L164]
 during *writeReplace* call
{code:scala}
final protected def writeReplace(): Any = {
  if (atDriverSide) {
    if (!isRegistered) {
      throw new UnsupportedOperationException(
        "Accumulator must be registered before send to executor")
    }
    val copyAcc = copyAndReset()
    assert(copyAcc.isZero, "copyAndReset must return a zero value copy")
    val isInternalAcc = name.isDefined &&
      name.get.startsWith(InternalAccumulator.METRICS_PREFIX)
    if (isInternalAcc) {
      // Do not serialize the name of internal accumulator and send it to executor.
      copyAcc.metadata = metadata.copy(name = None)
    } else {
      // For non-internal accumulators, we still need to send the name because
      // users may need to access the accumulator name at executor side, or they
      // may keep the accumulators sent from executors and access the name when
      // the registered accumulator is already garbage collected(e.g. SQLMetrics).
      copyAcc.metadata = metadata
    }
    copyAcc
  } else {
    this
  }
}
{code}

This means it is effectively impossible to create new accumulators by adding 
behaviour to existing ones via mix-ins or inheritance, without also overriding 
*copy*.

For example the following snippet ...
{code:scala}
// jl is an alias for java.lang, assumed by the original snippet
import java.{lang => jl}

trait TripleCount {
  self: LongAccumulator =>
  abstract override def add(v: jl.Long): Unit = {
    self.add(v * 3)
  }
}
val acc = new LongAccumulator with TripleCount
sc.register(acc)

val data = 1 to 10
val rdd = sc.makeRDD(data, 5)

rdd.foreach(acc.add(_))
acc.value shouldBe 3 * data.sum
{code}

... fails with

{code:none}
org.scalatest.exceptions.TestFailedException: 55 was not equal to 165
  at org.scalatest.MatchersHelper$.indicateFailure(MatchersHelper.scala:340)
  at org.scalatest.Matchers$AnyShouldWrapper.shouldBe(Matchers.scala:6864)
{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-4502) Spark SQL reads unneccesary nested fields from Parquet

2018-05-02 Thread Evan McClain (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-4502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16461502#comment-16461502
 ] 

Evan McClain commented on SPARK-4502:
-

The workaround I've been using is to explicitly pass in the read schema.

It's an ugly workaround (typos in the field names and/or types can lead to 
seemingly unrelated errors), but it works.
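A sketch of that workaround, using hypothetical paths and the Tweets field names from this issue:

```scala
import org.apache.spark.sql.types._

// Supply an explicit read schema containing only the needed nested field,
// so the Parquet reader does not materialize the other fields of `User`.
val schema = StructType(Seq(
  StructField("User", StructType(Seq(
    StructField("contributors_enabled", BooleanType))))))

val tweets = spark.read.schema(schema).parquet("/path/to/tweets")
tweets.select("User.contributors_enabled")
```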

> Spark SQL reads unneccesary nested fields from Parquet
> --
>
> Key: SPARK-4502
> URL: https://issues.apache.org/jira/browse/SPARK-4502
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 1.1.0
>Reporter: Liwen Sun
>Priority: Critical
>
> When reading a field of a nested column from Parquet, Spark SQL reads and 
> assembles all the fields of that nested column. This is unnecessary, as 
> Parquet supports fine-grained field reads out of a nested column. This may 
> degrade performance significantly when a nested column has many fields. 
> For example, I loaded json tweets data into SparkSQL and ran the following 
> query:
> {{SELECT User.contributors_enabled from Tweets;}}
> User is a nested structure that has 38 primitive fields (for Tweets schema, 
> see: https://dev.twitter.com/overview/api/tweets), here is the log message:
> {{14/11/19 16:36:49 INFO InternalParquetRecordReader: Assembled and processed 
> 385779 records from 38 columns in 3976 ms: 97.02691 rec/ms, 3687.0227 
> cell/ms}}
> For comparison, I also ran:
> {{SELECT User FROM Tweets;}}
> And here is the log message:
> {{14/11/19 16:45:40 INFO InternalParquetRecordReader: Assembled and processed 
> 385779 records from 38 columns in 9461 ms: 40.77571 rec/ms, 1549.477 cell/ms}}
> So both queries load 38 columns from Parquet, while the first query only 
> needs 1 column. I also measured the bytes read within Parquet. In these two 
> cases, the same number of bytes (99365194 bytes) were read. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-23971) Should not leak Spark sessions across test suites

2018-05-02 Thread Xiao Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-23971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li updated SPARK-23971:

Fix Version/s: 2.3.1

> Should not leak Spark sessions across test suites
> -
>
> Key: SPARK-23971
> URL: https://issues.apache.org/jira/browse/SPARK-23971
> Project: Spark
>  Issue Type: Bug
>  Components: SQL, Tests
>Affects Versions: 2.4.0
>Reporter: Eric Liang
>Assignee: Eric Liang
>Priority: Major
> Fix For: 2.3.1, 2.4.0
>
>
> Many suites currently leak Spark sessions (sometimes with stopped 
> SparkContexts) via the thread-local active Spark session and default Spark 
> session. We should attempt to clean these up and detect when this happens to 
> improve the reproducibility of tests.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-23971) Should not leak Spark sessions across test suites

2018-05-02 Thread Xiao Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-23971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li updated SPARK-23971:

Component/s: Tests

> Should not leak Spark sessions across test suites
> -
>
> Key: SPARK-23971
> URL: https://issues.apache.org/jira/browse/SPARK-23971
> Project: Spark
>  Issue Type: Bug
>  Components: SQL, Tests
>Affects Versions: 2.4.0
>Reporter: Eric Liang
>Assignee: Eric Liang
>Priority: Major
> Fix For: 2.3.1, 2.4.0
>
>
> Many suites currently leak Spark sessions (sometimes with stopped 
> SparkContexts) via the thread-local active Spark session and default Spark 
> session. We should attempt to clean these up and detect when this happens to 
> improve the reproducibility of tests.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-24013) ApproximatePercentile grinds to a halt on sorted input.

2018-05-02 Thread Xiao Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-24013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li resolved SPARK-24013.
-
   Resolution: Fixed
 Assignee: Marco Gaido
Fix Version/s: 2.4.0

> ApproximatePercentile grinds to a halt on sorted input.
> ---
>
> Key: SPARK-24013
> URL: https://issues.apache.org/jira/browse/SPARK-24013
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.3.0
>Reporter: Juliusz Sompolski
>Assignee: Marco Gaido
>Priority: Major
> Fix For: 2.4.0
>
> Attachments: screenshot-1.png
>
>
> Running
> {code}
> sql("select approx_percentile(rid, array(0.1)) from (select rand() as rid 
> from range(1000))").collect()
> {code}
> takes 7 seconds, while
> {code}
> sql("select approx_percentile(id, array(0.1)) from range(1000)").collect()
> {code}
> grinds to a halt: it processes the first million rows quickly, then slows 
> down to a few thousand rows per second (4M rows processed after 20 minutes).
> Thread dumps show that it spends its time in QuantileSummary.compress.
> Perhaps it hits some edge-case inefficiency when dealing with sorted data.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-23489) Flaky Test: HiveExternalCatalogVersionsSuite

2018-05-02 Thread Dongjoon Hyun (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-23489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-23489:
--
Description: 
I saw this error in an unrelated PR. It looks like a misconfiguration of the 
Jenkins node where the tests are run.

{code}
Error Message
java.io.IOException: Cannot run program "./bin/spark-submit" (in directory 
"/tmp/test-spark/spark-2.2.1"): error=2, No such file or directory
Stacktrace
sbt.ForkMain$ForkError: java.io.IOException: Cannot run program 
"./bin/spark-submit" (in directory "/tmp/test-spark/spark-2.2.1"): error=2, No 
such file or directory
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1048)
at 
org.apache.spark.sql.hive.SparkSubmitTestUtils$class.runSparkSubmit(SparkSubmitTestUtils.scala:73)
at 
org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite.runSparkSubmit(HiveExternalCatalogVersionsSuite.scala:43)
at 
org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite$$anonfun$beforeAll$1.apply(HiveExternalCatalogVersionsSuite.scala:176)
at 
org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite$$anonfun$beforeAll$1.apply(HiveExternalCatalogVersionsSuite.scala:161)
at scala.collection.immutable.List.foreach(List.scala:381)
at 
org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite.beforeAll(HiveExternalCatalogVersionsSuite.scala:161)
at 
org.scalatest.BeforeAndAfterAll$class.liftedTree1$1(BeforeAndAfterAll.scala:212)
at 
org.scalatest.BeforeAndAfterAll$class.run(BeforeAndAfterAll.scala:210)
at org.apache.spark.SparkFunSuite.run(SparkFunSuite.scala:52)
at 
org.scalatest.tools.Framework.org$scalatest$tools$Framework$$runSuite(Framework.scala:314)
at 
org.scalatest.tools.Framework$ScalaTestTask.execute(Framework.scala:480)
at sbt.ForkMain$Run$2.call(ForkMain.java:296)
at sbt.ForkMain$Run$2.call(ForkMain.java:286)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: sbt.ForkMain$ForkError: java.io.IOException: error=2, No such file 
or directory
at java.lang.UNIXProcess.forkAndExec(Native Method)
at java.lang.UNIXProcess.(UNIXProcess.java:248)
at java.lang.ProcessImpl.start(ProcessImpl.java:134)
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029)
... 17 more
{code}

This is the link: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87615/testReport/.

*MASTER BRANCH*
- 
https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-sbt-hadoop-2.7/4389

*BRANCH 2.3*
- 
https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-branch-2.3-test-sbt-hadoop-2.6/321/

*NOTE: This failure frequently appears as `Test Result (no failures)`*
- 
https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-2.6/4811/
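The failure above is an opaque `error=2, No such file or directory` raised when the suite exec()s `./bin/spark-submit` from a distribution directory that was never unpacked. A defensive pre-check of the kind sketched below would fail fast with a readable message (Python sketch; `check_spark_dist` is a hypothetical helper, not part of the actual suite):

```python
import os

def check_spark_dist(spark_home):
    """Return the spark-submit path if the unpacked Spark distribution looks
    usable; otherwise raise a descriptive error instead of the opaque
    'error=2' produced when exec() is given a path that does not exist."""
    submit = os.path.join(spark_home, "bin", "spark-submit")
    if not os.path.isdir(spark_home):
        raise FileNotFoundError(
            f"Spark distribution missing: {spark_home} (download/unpack failed?)")
    if not os.path.isfile(submit):
        raise FileNotFoundError(f"no spark-submit under {spark_home}")
    return submit

# e.g. check_spark_dist("/tmp/test-spark/spark-2.2.1") before launching it
```

Running such a check before `ProcessBuilder.start()` would make the Jenkins misconfiguration obvious in the test report.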


  was:
I saw this error in an unrelated PR. It looks like a misconfiguration of the 
Jenkins node where the tests are run.

{code}
Error Message
java.io.IOException: Cannot run program "./bin/spark-submit" (in directory 
"/tmp/test-spark/spark-2.2.1"): error=2, No such file or directory
Stacktrace
sbt.ForkMain$ForkError: java.io.IOException: Cannot run program 
"./bin/spark-submit" (in directory "/tmp/test-spark/spark-2.2.1"): error=2, No 
such file or directory
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1048)
at 
org.apache.spark.sql.hive.SparkSubmitTestUtils$class.runSparkSubmit(SparkSubmitTestUtils.scala:73)
at 
org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite.runSparkSubmit(HiveExternalCatalogVersionsSuite.scala:43)
at 
org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite$$anonfun$beforeAll$1.apply(HiveExternalCatalogVersionsSuite.scala:176)
at 
org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite$$anonfun$beforeAll$1.apply(HiveExternalCatalogVersionsSuite.scala:161)
at scala.collection.immutable.List.foreach(List.scala:381)
at 
org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite.beforeAll(HiveExternalCatalogVersionsSuite.scala:161)
at 
org.scalatest.BeforeAndAfterAll$class.liftedTree1$1(BeforeAndAfterAll.scala:212)
at 
org.scalatest.BeforeAndAfterAll$class.run(BeforeAndAfterAll.scala:210)
at org.apache.spark.SparkFunSuite.run(SparkFunSuite.scala:52)
at 
org.scalatest.tools.Framework.org$scalatest$tools$Framework$$runSuite(Framework.scala:314)
at 
org.scalatest.tools.Framework$ScalaTestTask.execute(Framework.scala:480)
at sbt.ForkMain$Run$2.call(ForkMain.java:296)

[jira] [Created] (SPARK-24153) Flaky Test: DirectKafkaStreamSuite

2018-05-02 Thread Dongjoon Hyun (JIRA)
Dongjoon Hyun created SPARK-24153:
-

 Summary: Flaky Test: DirectKafkaStreamSuite
 Key: SPARK-24153
 URL: https://issues.apache.org/jira/browse/SPARK-24153
 Project: Spark
  Issue Type: Bug
  Components: Structured Streaming
Affects Versions: 2.4.0
Reporter: Dongjoon Hyun


{code}
Test Result (5 failures / +5)

org.apache.spark.streaming.kafka010.DirectKafkaStreamSuite.receiving from 
largest starting offset
org.apache.spark.streaming.kafka010.DirectKafkaStreamSuite.creating stream 
by offset
org.apache.spark.streaming.kafka010.DirectKafkaStreamSuite.offset recovery
org.apache.spark.streaming.kafka010.DirectKafkaStreamSuite.offset recovery 
from kafka
org.apache.spark.streaming.kafka010.DirectKafkaStreamSuite.Direct Kafka 
stream report input information
{code}

- 
https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-branch-2.3-test-maven-hadoop-2.7/348/



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-24152) Flaky Test: SparkR

2018-05-02 Thread Dongjoon Hyun (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-24152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-24152:
--
Description: 
The PR builder and master-branch tests fail with the following SparkR error 
for an unknown reason:

{code}
* this is package 'SparkR' version '2.4.0'
* checking CRAN incoming feasibility ...Error in 
.check_package_CRAN_incoming(pkgdir) : 
  dims [product 24] do not match the length of object [0]
Execution halted
{code}

*PR BUILDER*
- https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90039/
- https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89983/
- https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89998/

*MASTER BRANCH*
- 
https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-sbt-hadoop-2.6/4458/
 (Fail with no failures)

This is critical because we have already started merging PRs while ignoring 
this **known unknown** SparkR failure.
- https://github.com/apache/spark/pull/21175

  was:
The PR builder and master-branch tests fail with the following SparkR error 
for an unknown reason:

{code}
* this is package 'SparkR' version '2.4.0'
* checking CRAN incoming feasibility ...Error in 
.check_package_CRAN_incoming(pkgdir) : 
  dims [product 24] do not match the length of object [0]
Execution halted
{code}

*PR BUILDER*
- https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90039/
- https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89983/
- https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89998/

*MASTER BRANCH*
- 
https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-sbt-hadoop-2.6/lastCompletedBuild/console

This is critical because we have already started merging PRs while ignoring 
this **known unknown** SparkR failure.
- https://github.com/apache/spark/pull/21175


> Flaky Test: SparkR
> --
>
> Key: SPARK-24152
> URL: https://issues.apache.org/jira/browse/SPARK-24152
> Project: Spark
>  Issue Type: Bug
>  Components: SparkR
>Affects Versions: 2.4.0
>Reporter: Dongjoon Hyun
>Priority: Critical
>
> The PR builder and master-branch tests fail with the following SparkR error 
> for an unknown reason:
> {code}
> * this is package 'SparkR' version '2.4.0'
> * checking CRAN incoming feasibility ...Error in 
> .check_package_CRAN_incoming(pkgdir) : 
>   dims [product 24] do not match the length of object [0]
> Execution halted
> {code}
> *PR BUILDER*
> - https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90039/
> - https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89983/
> - https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89998/
> *MASTER BRANCH*
> - 
> https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-sbt-hadoop-2.6/4458/
>  (Fail with no failures)
> This is critical because we have already started merging PRs while ignoring 
> this **known unknown** SparkR failure.
> - https://github.com/apache/spark/pull/21175



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-24152) Flaky Test: SparkR

2018-05-02 Thread Dongjoon Hyun (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-24152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-24152:
--
Description: 
The PR builder and master-branch tests fail with the following SparkR error 
for an unknown reason:

{code}
* this is package 'SparkR' version '2.4.0'
* checking CRAN incoming feasibility ...Error in 
.check_package_CRAN_incoming(pkgdir) : 
  dims [product 24] do not match the length of object [0]
Execution halted
{code}

*PR BUILDER*
- https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90039/
- https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89983/
- https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89998/

*MASTER BRANCH*
- 
https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-sbt-hadoop-2.6/lastCompletedBuild/console

This is critical because we have already started merging PRs while ignoring 
this **known unknown** SparkR failure.
- https://github.com/apache/spark/pull/21175

  was:
The PR builder fails with the following SparkR error for an unknown reason:

{code}
* this is package 'SparkR' version '2.4.0'
* checking CRAN incoming feasibility ...Error in 
.check_package_CRAN_incoming(pkgdir) : 
  dims [product 24] do not match the length of object [0]
Execution halted
{code}

- https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90039/
- https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89983/
- https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89998/

This is critical because we have already started merging PRs while ignoring 
this **known unknown** SparkR failure.
- https://github.com/apache/spark/pull/21175


> Flaky Test: SparkR
> --
>
> Key: SPARK-24152
> URL: https://issues.apache.org/jira/browse/SPARK-24152
> Project: Spark
>  Issue Type: Bug
>  Components: SparkR
>Affects Versions: 2.4.0
>Reporter: Dongjoon Hyun
>Priority: Critical
>
> The PR builder and master-branch tests fail with the following SparkR error 
> for an unknown reason:
> {code}
> * this is package 'SparkR' version '2.4.0'
> * checking CRAN incoming feasibility ...Error in 
> .check_package_CRAN_incoming(pkgdir) : 
>   dims [product 24] do not match the length of object [0]
> Execution halted
> {code}
> *PR BUILDER*
> - https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90039/
> - https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89983/
> - https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89998/
> *MASTER BRANCH*
> - 
> https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-sbt-hadoop-2.6/lastCompletedBuild/console
> This is critical because we have already started merging PRs while ignoring 
> this **known unknown** SparkR failure.
> - https://github.com/apache/spark/pull/21175



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-24152) Flaky Test: SparkR

2018-05-02 Thread Dongjoon Hyun (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-24152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-24152:
--
Description: 
The PR builder fails with the following SparkR error for an unknown reason:

{code}
* this is package 'SparkR' version '2.4.0'
* checking CRAN incoming feasibility ...Error in 
.check_package_CRAN_incoming(pkgdir) : 
  dims [product 24] do not match the length of object [0]
Execution halted
{code}

- https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90039/
- https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89983/
- https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89998/

This is critical because we have already started merging PRs while ignoring 
this **known unknown** SparkR failure.
- https://github.com/apache/spark/pull/21175

  was:
The PR builder fails with the following SparkR error for an unknown reason:

{code}
* this is package 'SparkR' version '2.4.0'
* checking CRAN incoming feasibility ...Error in 
.check_package_CRAN_incoming(pkgdir) : 
  dims [product 24] do not match the length of object [0]
Execution halted
{code}

- https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90039/
- https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89983/
- https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89998/


> Flaky Test: SparkR
> --
>
> Key: SPARK-24152
> URL: https://issues.apache.org/jira/browse/SPARK-24152
> Project: Spark
>  Issue Type: Bug
>  Components: SparkR
>Affects Versions: 2.4.0
>Reporter: Dongjoon Hyun
>Priority: Critical
>
> The PR builder fails with the following SparkR error for an unknown reason:
> {code}
> * this is package 'SparkR' version '2.4.0'
> * checking CRAN incoming feasibility ...Error in 
> .check_package_CRAN_incoming(pkgdir) : 
>   dims [product 24] do not match the length of object [0]
> Execution halted
> {code}
> - https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90039/
> - https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89983/
> - https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89998/
> This is critical because we have already started merging PRs while ignoring 
> this **known unknown** SparkR failure.
> - https://github.com/apache/spark/pull/21175



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-24152) Flaky Test: SparkR

2018-05-02 Thread Dongjoon Hyun (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-24152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16461328#comment-16461328
 ] 

Dongjoon Hyun commented on SPARK-24152:
---

cc [~shivaram], [~felixcheung], [~yanboliang]

> Flaky Test: SparkR
> --
>
> Key: SPARK-24152
> URL: https://issues.apache.org/jira/browse/SPARK-24152
> Project: Spark
>  Issue Type: Bug
>  Components: SparkR
>Affects Versions: 2.4.0
>Reporter: Dongjoon Hyun
>Priority: Critical
>
> The PR builder fails with the following SparkR error for an unknown reason:
> {code}
> * this is package 'SparkR' version '2.4.0'
> * checking CRAN incoming feasibility ...Error in 
> .check_package_CRAN_incoming(pkgdir) : 
>   dims [product 24] do not match the length of object [0]
> Execution halted
> {code}
> - https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90039/
> - https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89983/
> - https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89998/



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-24152) Flaky Test: SparkR

2018-05-02 Thread Dongjoon Hyun (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-24152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-24152:
--
Description: 
The PR builder fails with the following SparkR error for an unknown reason:

{code}
* this is package 'SparkR' version '2.4.0'
* checking CRAN incoming feasibility ...Error in 
.check_package_CRAN_incoming(pkgdir) : 
  dims [product 24] do not match the length of object [0]
Execution halted
{code}

- https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90039/
- https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89983/
- https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89998/

  was:
The PR builder fails with the following SparkR error for an unknown reason:

{code}
* this is package 'SparkR' version '2.4.0'
* checking CRAN incoming feasibility ...Error in 
.check_package_CRAN_incoming(pkgdir) : 
  dims [product 24] do not match the length of object [0]
Execution halted
{code}

- https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90039/
- https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89983/


> Flaky Test: SparkR
> --
>
> Key: SPARK-24152
> URL: https://issues.apache.org/jira/browse/SPARK-24152
> Project: Spark
>  Issue Type: Bug
>  Components: SparkR
>Affects Versions: 2.4.0
>Reporter: Dongjoon Hyun
>Priority: Critical
>
> The PR builder fails with the following SparkR error for an unknown reason:
> {code}
> * this is package 'SparkR' version '2.4.0'
> * checking CRAN incoming feasibility ...Error in 
> .check_package_CRAN_incoming(pkgdir) : 
>   dims [product 24] do not match the length of object [0]
> Execution halted
> {code}
> - https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90039/
> - https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89983/
> - https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89998/



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-24152) Flaky Test: SparkR

2018-05-02 Thread Dongjoon Hyun (JIRA)
Dongjoon Hyun created SPARK-24152:
-

 Summary: Flaky Test: SparkR
 Key: SPARK-24152
 URL: https://issues.apache.org/jira/browse/SPARK-24152
 Project: Spark
  Issue Type: Bug
  Components: SparkR
Affects Versions: 2.4.0
Reporter: Dongjoon Hyun


The PR builder fails with the following SparkR error for an unknown reason:

{code}
* this is package 'SparkR' version '2.4.0'
* checking CRAN incoming feasibility ...Error in 
.check_package_CRAN_incoming(pkgdir) : 
  dims [product 24] do not match the length of object [0]
Execution halted
{code}

- https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90039/
- https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89983/



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-24135) [K8s] Executors that fail to start up because of init-container errors are not retried and limit the executor pool size

2018-05-02 Thread Erik Erlandson (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-24135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16461324#comment-16461324
 ] 

Erik Erlandson commented on SPARK-24135:


> In the case of the executor failing to start at all, this wouldn't be caught 
> by Spark's task failure count logic because you're never going to end up 
> scheduling tasks on these executors that failed to start.

Aha, that argues for allowing a way to give up after repeated pod start 
failures.
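A give-up policy of the kind suggested here could look like the following sketch (Python; `PodAllocator`, its method names, and the threshold are hypothetical illustrations, not the actual KubernetesClusterSchedulerBackend):

```python
class PodAllocator:
    """Sketch: stop treating an executor pod as pending after it repeatedly
    fails to start (e.g. Init:Error), so the allocator can request
    replacements instead of stalling the pool. Hypothetical, not Spark code."""

    def __init__(self, max_start_failures=3):
        self.max_start_failures = max_start_failures
        self.failures = {}    # executor id -> consecutive start failures
        self.given_up = set()

    def on_start_failure(self, exec_id):
        self.failures[exec_id] = self.failures.get(exec_id, 0) + 1
        if self.failures[exec_id] >= self.max_start_failures:
            self.given_up.add(exec_id)  # no longer counted as "pending"
            return "give-up"
        return "retry"                  # request a replacement pod

alloc = PodAllocator(max_start_failures=2)
print(alloc.on_start_failure("exec-1"))  # retry
print(alloc.on_start_failure("exec-1"))  # give-up
```

Resolving a repeatedly failing pod to a terminal state is what lets the allocator request the next round of executors instead of waiting forever.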

> [K8s] Executors that fail to start up because of init-container errors are 
> not retried and limit the executor pool size
> ---
>
> Key: SPARK-24135
> URL: https://issues.apache.org/jira/browse/SPARK-24135
> Project: Spark
>  Issue Type: Bug
>  Components: Kubernetes
>Affects Versions: 2.3.0
>Reporter: Matt Cheah
>Priority: Major
>
> In KubernetesClusterSchedulerBackend, we detect if executors disconnect after 
> having been started or if executors hit the {{ERROR}} or {{DELETED}} states. 
> When executors fail in these ways, they are removed from the pending 
> executors pool and the driver should retry requesting these executors.
> However, the driver does not handle a different class of error: when the pod 
> enters the {{Init:Error}} state. This state comes up when the executor fails 
> to launch because one of its init-containers fails. Spark itself doesn't 
> attach any init-containers to the executors. However, custom web hooks can 
> run on the cluster and attach init-containers to the executor pods. 
> Additionally, pod presets can specify init containers to run on these pods. 
> Therefore Spark should handle the {{Init:Error}} cases regardless of whether 
> Spark itself is aware of init-containers.
> This class of error is particularly bad because when we hit this state, the 
> failed executor will never start, but it's still seen as pending by the 
> executor allocator. The executor allocator won't request more rounds of 
> executors because its current batch hasn't been resolved to either running or 
> failed. Therefore we end up stuck with the number of executors 
> that successfully started before the faulty one failed to start, potentially 
> creating a fake resource bottleneck.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-24150) Race condition in FsHistoryProvider

2018-05-02 Thread William Montaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-24150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

William Montaz updated SPARK-24150:
---
Description: 
There exists a race condition in the checkLogs method between threads of 
replayExecutor. They use the field "applications" to synchronise, but they also 
update that field.

The problem is that the threads will eventually synchronise on different 
monitors (because they synchronise on different objects whose references have 
been assigned to "applications"), breaking the initial synchronisation intent. 
This is even more likely to occur when number_new_log_files > 
replayExecutor_pool_size.

If such a log disappears (it will not be present in the list "applications"), 
it will be impossible to read it from the UI (being in the list "applications" 
is a mandatory check to avoid getting a 404).

Workaround:
 * use a permanent object as a monitor on which to synchronise (or synchronise 
on `this`)
 * keep the field volatile for all other read accesses
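Why synchronising on a field that is itself reassigned breaks mutual exclusion can be shown with a minimal, deterministic sketch (Python locks standing in for JVM monitors; the names are illustrative, not the FsHistoryProvider code):

```python
import threading

class Holder:
    def __init__(self):
        # Broken pattern: the object used as the monitor is replaced
        # together with the field it is supposed to guard.
        self.applications_lock = threading.Lock()

h = Holder()

# Thread A enters the critical section via the current monitor...
old_lock = h.applications_lock
old_lock.acquire()

# ...meanwhile another thread "updates applications", swapping the monitor.
h.applications_lock = threading.Lock()

# Thread B now synchronises on the *new* object and gets in immediately,
# even though A has not released: mutual exclusion is gone.
entered = h.applications_lock.acquire(blocking=False)
print(entered)  # True -- two threads are inside the "guarded" section

old_lock.release()
```

A permanent monitor object (or `this`) avoids this, because every thread then blocks on the same object for the lifetime of the provider.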

  was:
There exists a race condition in the checkLogs method between threads of 
replayExecutor. They use the field "applications" to synchronise, but they also 
update that field.

The problem is that the threads will eventually synchronise on different 
monitors (because they synchronise on different objects whose references have 
been assigned to "applications"), breaking the initial synchronisation intent. 
This is even more likely to occur when number_new_log_files > 
replayExecutor_pool_size.

Workaround:
 * use a permanent object as a monitor on which to synchronise (or synchronise 
on `this`)
 * keep the field volatile for all other read accesses


> Race condition in FsHistoryProvider
> ---
>
> Key: SPARK-24150
> URL: https://issues.apache.org/jira/browse/SPARK-24150
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.2.0
>Reporter: William Montaz
>Priority: Major
>
> There exists a race condition in the checkLogs method between threads of 
> replayExecutor. They use the field "applications" to synchronise, but they 
> also update that field.
> The problem is that the threads will eventually synchronise on different 
> monitors (because they synchronise on different objects whose references have 
> been assigned to "applications"), breaking the initial synchronisation 
> intent. This is even more likely to occur when number_new_log_files > 
> replayExecutor_pool_size.
> If such a log disappears (it will not be present in the list "applications"), 
> it will be impossible to read it from the UI (being in the list 
> "applications" is a mandatory check to avoid getting a 404).
> Workaround:
>  * use a permanent object as a monitor on which to synchronise (or 
> synchronise on `this`)
>  * keep the field volatile for all other read accesses



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-24150) Race condition in FsHistoryProvider

2018-05-02 Thread William Montaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-24150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

William Montaz updated SPARK-24150:
---
Description: 
There exists a race condition in the checkLogs method between threads of 
replayExecutor. They use the field "applications" to synchronise, but they also 
update that field.

The problem is that the threads will eventually synchronise on different 
monitors (because they synchronise on different objects whose references have 
been assigned to "applications"), breaking the initial synchronisation intent. 
This is even more likely to occur when number_new_log_files > 
replayExecutor_pool_size.

Workaround:
 * use a permanent object as a monitor on which to synchronise (or synchronise 
on `this`)
 * keep the field volatile for all other read accesses

  was:
There exists a race condition in the checkLogs method between threads of 
replayExecutor. They use the field "applications" to synchronise, but they also 
update that field.

The problem is that the threads will eventually synchronise on different 
monitors (because they synchronise on different objects whose references have 
been assigned to "applications"), breaking the initial synchronisation intent. 
This is even more likely to occur when number_new_log_files > 
replayExecutor_pool_size.

Workaround:
 * use a permanent object as a monitor on which to synchronise (or synchronise 
on `this`)
 * keep the field volatile for all other read accesses


> Race condition in FsHistoryProvider
> ---
>
> Key: SPARK-24150
> URL: https://issues.apache.org/jira/browse/SPARK-24150
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.2.0
>Reporter: William Montaz
>Priority: Major
>
> There exists a race condition in the checkLogs method between threads of 
> replayExecutor. They use the field "applications" to synchronise, but they 
> also update that field.
> The problem is that the threads will eventually synchronise on different 
> monitors (because they synchronise on different objects whose references have 
> been assigned to "applications"), breaking the initial synchronisation 
> intent. This is even more likely to occur when number_new_log_files > 
> replayExecutor_pool_size.
> Workaround:
>  * use a permanent object as a monitor on which to synchronise (or 
> synchronise on `this`)
>  * keep the field volatile for all other read accesses



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-24150) Race condition in FsHistoryProvider

2018-05-02 Thread William Montaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-24150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

William Montaz updated SPARK-24150:
---
Priority: Major  (was: Minor)

> Race condition in FsHistoryProvider
> ---
>
> Key: SPARK-24150
> URL: https://issues.apache.org/jira/browse/SPARK-24150
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.2.0
>Reporter: William Montaz
>Priority: Major
>
> There exists a race condition in the checkLogs method between threads of 
> replayExecutor. They use the field "applications" to synchronise, but they 
> also update that field.
> The problem is that the threads will eventually synchronise on different 
> monitors (because they synchronise on different objects whose references have 
> been assigned to "applications"), breaking the initial synchronisation 
> intent. This is even more likely to occur when number_new_log_files > 
> replayExecutor_pool_size.
> Workaround:
>  * use a permanent object as a monitor on which to synchronise (or 
> synchronise on `this`)
>  * keep the field volatile for all other read accesses



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-24150) Race condition in FsHistoryProvider

2018-05-02 Thread William Montaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-24150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

William Montaz updated SPARK-24150:
---
Description: 
There exists a race condition in the checkLogs method between threads of 
replayExecutor. They use the field "applications" to synchronise, but they also 
update that field.

The problem is that the threads will eventually synchronise on different 
monitors (because they synchronise on different objects whose references have 
been assigned to "applications"), breaking the initial synchronisation intent. 
This is even more likely to occur when number_new_log_files > 
replayExecutor_pool_size.

Workaround:
 * use a permanent object as a monitor on which to synchronise (or synchronise 
on `this`)
 * keep the field volatile for all other read accesses

  was:
There exists a race condition in the checkLogs method between threads of 
replayExecutor. They use the field "applications" to synchronise, but they also 
update that field.

The problem is that if the number of tasks (the number of new log files to 
replay and add to the applications list) is greater than the number of threads 
in the pool, the threads will eventually synchronise on different monitors 
(because they synchronise on different objects whose references have been 
assigned to "applications"), breaking the initial synchronisation intent.

Workaround:
 * use a permanent object as a monitor on which to synchronise (or synchronise 
on `this`)
 * keep the field volatile for all other read accesses


> Race condition in FsHistoryProvider
> ---
>
> Key: SPARK-24150
> URL: https://issues.apache.org/jira/browse/SPARK-24150
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.2.0
>Reporter: William Montaz
>Priority: Minor
>
> There exists a race condition in the checkLogs method between threads of 
> replayExecutor. They use the field "applications" to synchronise, but they 
> also update that field.
> The problem is that threads will eventually synchronise on different monitors 
> (because they synchronise on the different object references that are 
> successively assigned to "applications"), breaking the initial 
> synchronisation intent. This is even more likely to reproduce when 
> number_new_log_files > replayExecutor_pool_size.
> Workaround:
>  * use a permanent object as a monitor on which to synchronise (or 
> synchronise on `this`)
>  * keep volatile field for all other read accesses
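The suggested workaround can be sketched in plain Java. This is a minimal illustration of the pattern only, not actual FsHistoryProvider code; the class and method names are invented:

```java
import java.util.LinkedList;

// Sketch of the proposed fix: writers synchronise on a permanent monitor
// object instead of on the volatile "applications" field itself, whose
// identity changes on every reassignment and is therefore unsafe as a lock.
public class HistoryApps {
    // Permanent monitor: this reference is never reassigned.
    private final Object appsLock = new Object();

    // Volatile so readers always see the most recently published list.
    private volatile LinkedList<String> applications = new LinkedList<>();

    // Writer path: every update contends on the same monitor.
    public void addApplication(String appId) {
        synchronized (appsLock) {
            // Copy-on-write: build a new list, then publish it atomically.
            LinkedList<String> updated = new LinkedList<>(applications);
            updated.add(appId);
            applications = updated;
        }
    }

    // Reader path: no lock needed, the volatile read is enough.
    public int applicationCount() {
        return applications.size();
    }

    public static void main(String[] args) throws InterruptedException {
        HistoryApps apps = new HistoryApps();
        Thread[] workers = new Thread[4];
        for (int i = 0; i < workers.length; i++) {
            final int id = i;
            workers[i] = new Thread(() -> {
                for (int j = 0; j < 100; j++) {
                    apps.addApplication("app-" + id + "-" + j);
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            t.join();
        }
        // All 400 updates survive because every writer held the same monitor.
        System.out.println(apps.applicationCount()); // prints 400
    }
}
```

The key point is that `appsLock` is never reassigned, so all writers serialise on one monitor, while other read accesses rely only on the volatile read of `applications`.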



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-24150) Race condition in FsHistoryProvider

2018-05-02 Thread William Montaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-24150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

William Montaz updated SPARK-24150:
---
Description: 
There exists a race condition in the checkLogs method between threads of 
replayExecutor. They use the field "applications" to synchronise, but they also 
update that field.

The problem is that if the number of tasks (the number of new log files to 
replay and add to the applications list) is greater than the number of threads 
in the pool, threads will eventually synchronise on different monitors (because 
they synchronise on the different object references that are successively 
assigned to "applications"), breaking the initial synchronisation intent.

Workaround:
 * use a permanent object as a monitor on which to synchronise (or synchronise 
on `this`)
 * keep volatile field for all other read accesses

  was:
There exists a race condition in the checkLogs method between threads of 
replayExecutor. They use the field "applications" to synchronise, but they also 
update that field.

The problem is that if the number of tasks (the number of new log files to 
replay and add to the applications list) is greater than the number of threads 
in the pool, there is a high chance that a thread will synchronise on an 
updated reference of applications (since the field is volatile and reassigned) 
while other threads are still synchronised on an old reference of applications. 
That is where the race condition occurs.

Workaround:
 * use a permanent object as a monitor on which to synchronise (or synchronise 
on `this`)
 * keep volatile field for all other read accesses


> Race condition in FsHistoryProvider
> ---
>
> Key: SPARK-24150
> URL: https://issues.apache.org/jira/browse/SPARK-24150
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.2.0
>Reporter: William Montaz
>Priority: Minor
>
> There exists a race condition in the checkLogs method between threads of 
> replayExecutor. They use the field "applications" to synchronise, but they 
> also update that field.
> The problem is that if the number of tasks (the number of new log files to 
> replay and add to the applications list) is greater than the number of 
> threads in the pool, threads will eventually synchronise on different 
> monitors (because they synchronise on the different object references that 
> are successively assigned to "applications"), breaking the initial 
> synchronisation intent.
> Workaround:
>  * use a permanent object as a monitor on which to synchronise (or 
> synchronise on `this`)
>  * keep volatile field for all other read accesses



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-24150) Race condition in FsHistoryProvider

2018-05-02 Thread William Montaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-24150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

William Montaz updated SPARK-24150:
---
Description: 
There exists a race condition in the checkLogs method between threads of 
replayExecutor. They use the field "applications" to synchronise, but they also 
update that field.

The problem is that if the number of tasks (the number of new log files to 
replay and add to the applications list) is greater than the number of threads 
in the pool, there is a high chance that a thread will synchronise on an 
updated reference of applications (since the field is volatile and reassigned) 
while other threads are still synchronised on an old reference of applications. 
That is where the race condition occurs.

Workaround:
 * use a permanent object as a monitor on which to synchronise (or synchronise 
on `this`)
 * keep volatile field for all other read accesses

  was:
There exists a race condition between the methods checkLogs and cleanLogs.

cleanLogs can read the field applications while it is concurrently being 
updated by checkLogs. It is possible for checkLogs to add newly fetched logs 
and set applications, and for that update to then be erased by cleanLogs 
writing back an old version of applications. The problem is that the fetched 
log will no longer appear in applications, making it impossible to display the 
corresponding application in the History Server, since it must be in the 
LinkedList applications.

Workaround:
 * use a permanent object as a monitor on which to synchronise
 * keep volatile field for all other read accesses


> Race condition in FsHistoryProvider
> ---
>
> Key: SPARK-24150
> URL: https://issues.apache.org/jira/browse/SPARK-24150
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.2.0
>Reporter: William Montaz
>Priority: Minor
>
> There exists a race condition in the checkLogs method between threads of 
> replayExecutor. They use the field "applications" to synchronise, but they 
> also update that field.
> The problem is that if the number of tasks (the number of new log files to 
> replay and add to the applications list) is greater than the number of 
> threads in the pool, there is a high chance that a thread will synchronise on 
> an updated reference of applications (since the field is volatile and 
> reassigned) while other threads are still synchronised on an old reference of 
> applications. That is where the race condition occurs.
> Workaround:
>  * use a permanent object as a monitor on which to synchronise (or 
> synchronise on `this`)
>  * keep volatile field for all other read accesses



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-22918) sbt test (spark - local) fail after upgrading to 2.2.1 with: java.security.AccessControlException: access denied org.apache.derby.security.SystemPermission( "engine",

2018-05-02 Thread Sam Garrett (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16461250#comment-16461250
 ] 

Sam Garrett commented on SPARK-22918:
-

+1 same issue

> sbt test (spark - local) fail after upgrading to 2.2.1 with: 
> java.security.AccessControlException: access denied 
> org.apache.derby.security.SystemPermission( "engine", "usederbyinternals" )
> 
>
> Key: SPARK-22918
> URL: https://issues.apache.org/jira/browse/SPARK-22918
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.2.1
>Reporter: Damian Momot
>Priority: Major
>
> After upgrading 2.2.0 -> 2.2.1, the sbt test command in one of my projects 
> started to fail with the following exception:
> {noformat}
> java.security.AccessControlException: access denied 
> org.apache.derby.security.SystemPermission( "engine", "usederbyinternals" )
>   at 
> java.security.AccessControlContext.checkPermission(AccessControlContext.java:472)
>   at 
> java.security.AccessController.checkPermission(AccessController.java:884)
>   at 
> org.apache.derby.iapi.security.SecurityUtil.checkDerbyInternalsPrivilege(Unknown
>  Source)
>   at org.apache.derby.iapi.services.monitor.Monitor.startMonitor(Unknown 
> Source)
>   at org.apache.derby.iapi.jdbc.JDBCBoot$1.run(Unknown Source)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at org.apache.derby.iapi.jdbc.JDBCBoot.boot(Unknown Source)
>   at org.apache.derby.iapi.jdbc.JDBCBoot.boot(Unknown Source)
>   at org.apache.derby.jdbc.EmbeddedDriver.boot(Unknown Source)
>   at org.apache.derby.jdbc.EmbeddedDriver.(Unknown Source)
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>   at java.lang.Class.newInstance(Class.java:442)
>   at 
> org.datanucleus.store.rdbms.connectionpool.AbstractConnectionPoolFactory.loadDriver(AbstractConnectionPoolFactory.java:47)
>   at 
> org.datanucleus.store.rdbms.connectionpool.BoneCPConnectionPoolFactory.createConnectionPool(BoneCPConnectionPoolFactory.java:54)
>   at 
> org.datanucleus.store.rdbms.ConnectionFactoryImpl.generateDataSources(ConnectionFactoryImpl.java:238)
>   at 
> org.datanucleus.store.rdbms.ConnectionFactoryImpl.initialiseDataSources(ConnectionFactoryImpl.java:131)
>   at 
> org.datanucleus.store.rdbms.ConnectionFactoryImpl.(ConnectionFactoryImpl.java:85)
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>   at 
> org.datanucleus.plugin.NonManagedPluginRegistry.createExecutableExtension(NonManagedPluginRegistry.java:631)
>   at 
> org.datanucleus.plugin.PluginManager.createExecutableExtension(PluginManager.java:325)
>   at 
> org.datanucleus.store.AbstractStoreManager.registerConnectionFactory(AbstractStoreManager.java:282)
>   at 
> org.datanucleus.store.AbstractStoreManager.(AbstractStoreManager.java:240)
>   at 
> org.datanucleus.store.rdbms.RDBMSStoreManager.(RDBMSStoreManager.java:286)
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>   at 
> org.datanucleus.plugin.NonManagedPluginRegistry.createExecutableExtension(NonManagedPluginRegistry.java:631)
>   at 
> org.datanucleus.plugin.PluginManager.createExecutableExtension(PluginManager.java:301)
>   at 
> org.datanucleus.NucleusContext.createStoreManagerForProperties(NucleusContext.java:1187)
>   at org.datanucleus.NucleusContext.initialise(NucleusContext.java:356)
>   at 
> org.datanucleus.api.jdo.JDOPersistenceManagerFactory.freezeConfiguration(JDOPersistenceManagerFactory.java:775)
>   at 
> org.datanucleus.api.jdo.JDOPersistenceManagerFactory.createPersistenceManagerFactory(JDOPersistenceManagerFactory.java:333)
>   at 
> 

[jira] [Assigned] (SPARK-24151) CURRENT_DATE, CURRENT_TIMESTAMP incorrectly resolved as column names when caseSensitive is enabled

2018-05-02 Thread Apache Spark (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-24151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-24151:


Assignee: (was: Apache Spark)

> CURRENT_DATE, CURRENT_TIMESTAMP incorrectly resolved as column names when 
> caseSensitive is enabled
> --
>
> Key: SPARK-24151
> URL: https://issues.apache.org/jira/browse/SPARK-24151
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.3.0
>Reporter: James Thompson
>Priority: Major
>
> After this change: https://issues.apache.org/jira/browse/SPARK-22333
> Running SQL such as "CURRENT_TIMESTAMP" can fail when spark.sql.caseSensitive 
> has been enabled:
> {code:java}
> org.apache.spark.sql.AnalysisException: cannot resolve '`CURRENT_TIMESTAMP`' 
> given input columns: [col1]{code}
> This is because the analyzer incorrectly uses a case-sensitive resolver to 
> resolve the function. I will submit a PR with a fix + test for this.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-24151) CURRENT_DATE, CURRENT_TIMESTAMP incorrectly resolved as column names when caseSensitive is enabled

2018-05-02 Thread Apache Spark (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-24151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-24151:


Assignee: Apache Spark

> CURRENT_DATE, CURRENT_TIMESTAMP incorrectly resolved as column names when 
> caseSensitive is enabled
> --
>
> Key: SPARK-24151
> URL: https://issues.apache.org/jira/browse/SPARK-24151
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.3.0
>Reporter: James Thompson
>Assignee: Apache Spark
>Priority: Major
>
> After this change: https://issues.apache.org/jira/browse/SPARK-22333
> Running SQL such as "CURRENT_TIMESTAMP" can fail when spark.sql.caseSensitive 
> has been enabled:
> {code:java}
> org.apache.spark.sql.AnalysisException: cannot resolve '`CURRENT_TIMESTAMP`' 
> given input columns: [col1]{code}
> This is because the analyzer incorrectly uses a case-sensitive resolver to 
> resolve the function. I will submit a PR with a fix + test for this.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-24151) CURRENT_DATE, CURRENT_TIMESTAMP incorrectly resolved as column names when caseSensitive is enabled

2018-05-02 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-24151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16461237#comment-16461237
 ] 

Apache Spark commented on SPARK-24151:
--

User 'jamesthomp' has created a pull request for this issue:
https://github.com/apache/spark/pull/21217

> CURRENT_DATE, CURRENT_TIMESTAMP incorrectly resolved as column names when 
> caseSensitive is enabled
> --
>
> Key: SPARK-24151
> URL: https://issues.apache.org/jira/browse/SPARK-24151
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.3.0
>Reporter: James Thompson
>Priority: Major
>
> After this change: https://issues.apache.org/jira/browse/SPARK-22333
> Running SQL such as "CURRENT_TIMESTAMP" can fail when spark.sql.caseSensitive 
> has been enabled:
> {code:java}
> org.apache.spark.sql.AnalysisException: cannot resolve '`CURRENT_TIMESTAMP`' 
> given input columns: [col1]{code}
> This is because the analyzer incorrectly uses a case-sensitive resolver to 
> resolve the function. I will submit a PR with a fix + test for this.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-24151) CURRENT_DATE, CURRENT_TIMESTAMP incorrectly resolved as column names when caseSensitive is enabled

2018-05-02 Thread James Thompson (JIRA)
James Thompson created SPARK-24151:
--

 Summary: CURRENT_DATE, CURRENT_TIMESTAMP incorrectly resolved as 
column names when caseSensitive is enabled
 Key: SPARK-24151
 URL: https://issues.apache.org/jira/browse/SPARK-24151
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 2.3.0
Reporter: James Thompson


After this change: https://issues.apache.org/jira/browse/SPARK-22333

Running SQL such as "CURRENT_TIMESTAMP" can fail when spark.sql.caseSensitive 
has been enabled:
{code:java}
org.apache.spark.sql.AnalysisException: cannot resolve '`CURRENT_TIMESTAMP`' 
given input columns: [col1]{code}
This is because the analyzer incorrectly uses a case-sensitive resolver to 
resolve the function. I will submit a PR with a fix + test for this.
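The kind of case-insensitive lookup the fix needs can be sketched in Java. This is a hypothetical registry for illustration only, not Spark's actual FunctionRegistry or Analyzer code:

```java
import java.util.Map;
import java.util.TreeMap;

// Sketch: function names should resolve case-insensitively regardless of the
// spark.sql.caseSensitive setting, which governs column resolution. A map
// ordered by CASE_INSENSITIVE_ORDER ignores case on every lookup.
public class FunctionRegistrySketch {
    private final Map<String, String> functions =
            new TreeMap<>(String.CASE_INSENSITIVE_ORDER);

    public FunctionRegistrySketch() {
        // Hypothetical entries standing in for the built-in functions.
        functions.put("current_timestamp", "returns the current timestamp");
        functions.put("current_date", "returns the current date");
    }

    public boolean isFunction(String name) {
        return functions.containsKey(name);
    }

    public static void main(String[] args) {
        FunctionRegistrySketch registry = new FunctionRegistrySketch();
        // Both spellings resolve, so CURRENT_TIMESTAMP is never mistaken
        // for an unresolvable column name.
        System.out.println(registry.isFunction("CURRENT_TIMESTAMP")); // true
        System.out.println(registry.isFunction("current_timestamp")); // true
    }
}
```

With a lookup like this, "CURRENT_TIMESTAMP" resolves to the function even when column resolution itself is case-sensitive.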

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-24135) [K8s] Executors that fail to start up because of init-container errors are not retried and limit the executor pool size

2018-05-02 Thread Matt Cheah (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-24135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16461188#comment-16461188
 ] 

Matt Cheah edited comment on SPARK-24135 at 5/2/18 3:37 PM:


{quote}Restarting seems like it would eventually be limited by the job failure 
limit that Spark already has. If pod startup failures are deterministic the job 
failure count will hit this limit and job will be killed that way.
{quote}
In the case of the executor failing to start at all, this wouldn't be caught by 
Spark's task failure count logic because you're never going to end up 
scheduling tasks on these executors that failed to start.


was (Author: mcheah):
> Restarting seems like it would eventually be limited by the job failure limit 
>that Spark already has. If pod startup failures are deterministic the job 
>failure count will hit this limit and job will be killed that way.

In the case of the executor failing to start at all, this wouldn't be caught by 
Spark's task failure count logic because you're never going to end up 
scheduling tasks on these executors that failed to start.

> [K8s] Executors that fail to start up because of init-container errors are 
> not retried and limit the executor pool size
> ---
>
> Key: SPARK-24135
> URL: https://issues.apache.org/jira/browse/SPARK-24135
> Project: Spark
>  Issue Type: Bug
>  Components: Kubernetes
>Affects Versions: 2.3.0
>Reporter: Matt Cheah
>Priority: Major
>
> In KubernetesClusterSchedulerBackend, we detect if executors disconnect after 
> having been started or if executors hit the {{ERROR}} or {{DELETED}} states. 
> When executors fail in these ways, they are removed from the pending 
> executors pool and the driver should retry requesting these executors.
> However, the driver does not handle a different class of error: when the pod 
> enters the {{Init:Error}} state. This state comes up when the executor fails 
> to launch because one of its init-containers fails. Spark itself doesn't 
> attach any init-containers to the executors. However, custom web hooks can 
> run on the cluster and attach init-containers to the executor pods. 
> Additionally, pod presets can specify init containers to run on these pods. 
> Therefore Spark should handle the {{Init:Error}} cases regardless of whether 
> Spark itself is aware of init-containers.
> This class of error is particularly bad because when we hit this state, the 
> failed executor will never start, but it's still seen as pending by the 
> executor allocator. The executor allocator won't request more rounds of 
> executors because its current batch hasn't been resolved to either running or 
> failed. Therefore we end up stuck with the number of executors that 
> successfully started before the faulty one failed to start, potentially 
> creating a fake resource bottleneck.
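The starvation described above can be modelled with a small Java sketch. The names are hypothetical and KubernetesClusterSchedulerBackend does not look like this; the sketch only shows why an unhandled {{Init:Error}} pod blocks the next batch:

```java
import java.util.HashMap;
import java.util.Map;

// Minimal model: a new batch of executors is only requested once every pod
// in the current batch has resolved to RUNNING or FAILED. If INIT_ERROR is
// not treated as a failure, the batch never resolves and allocation stalls.
public class ExecutorAllocatorSketch {
    public enum PodState { PENDING, RUNNING, FAILED, INIT_ERROR }

    private final Map<String, PodState> batch = new HashMap<>();

    public void requestExecutor(String pod) {
        batch.put(pod, PodState.PENDING);
    }

    public void observe(String pod, PodState state) {
        batch.put(pod, state);
    }

    public boolean canRequestNextBatch(boolean handleInitError) {
        for (PodState s : batch.values()) {
            boolean resolved = s == PodState.RUNNING
                    || s == PodState.FAILED
                    || (handleInitError && s == PodState.INIT_ERROR);
            if (!resolved) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        ExecutorAllocatorSketch alloc = new ExecutorAllocatorSketch();
        alloc.requestExecutor("exec-1");
        alloc.requestExecutor("exec-2");
        alloc.observe("exec-1", PodState.RUNNING);
        alloc.observe("exec-2", PodState.INIT_ERROR);
        // Stuck: INIT_ERROR is neither RUNNING nor FAILED.
        System.out.println(alloc.canRequestNextBatch(false)); // false
        // Once INIT_ERROR counts as a failure, allocation can proceed.
        System.out.println(alloc.canRequestNextBatch(true));  // true
    }
}
```

The fake resource bottleneck is exactly the `false` case: the allocator sees a perpetually pending pod and never asks for more executors.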



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-24135) [K8s] Executors that fail to start up because of init-container errors are not retried and limit the executor pool size

2018-05-02 Thread Matt Cheah (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-24135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16460047#comment-16460047
 ] 

Matt Cheah edited comment on SPARK-24135 at 5/2/18 3:37 PM:


{quote}But I'm not sure how much this buys us because very likely the newly 
requested executors will fail to be initialized,
{quote}
That's entirely up to the behavior of the init container itself - there are 
many reasons to believe that a given init container's logic can be flaky. But 
it's not immediately obvious to me whether or not the init container's failure 
should count towards a job failure. Job failures shouldn't be caused by 
failures in the framework, and in this case, the framework has added the 
init-container for these pods - in other words, the user's code didn't directly 
cause the job failure.


was (Author: mcheah):
_> But I'm not sure how much this buys us because very likely the newly 
requested executors will fail to be initialized,_

That's entirely up to the behavior of the init container itself - there are 
many reasons to believe that a given init container's logic can be flaky. But 
it's not immediately obvious to me whether or not the init container's failure 
should count towards a job failure. Job failures shouldn't be caused by 
failures in the framework, and in this case, the framework has added the 
init-container for these pods - in other words, the user's code didn't directly 
cause the job failure.

> [K8s] Executors that fail to start up because of init-container errors are 
> not retried and limit the executor pool size
> ---
>
> Key: SPARK-24135
> URL: https://issues.apache.org/jira/browse/SPARK-24135
> Project: Spark
>  Issue Type: Bug
>  Components: Kubernetes
>Affects Versions: 2.3.0
>Reporter: Matt Cheah
>Priority: Major
>
> In KubernetesClusterSchedulerBackend, we detect if executors disconnect after 
> having been started or if executors hit the {{ERROR}} or {{DELETED}} states. 
> When executors fail in these ways, they are removed from the pending 
> executors pool and the driver should retry requesting these executors.
> However, the driver does not handle a different class of error: when the pod 
> enters the {{Init:Error}} state. This state comes up when the executor fails 
> to launch because one of its init-containers fails. Spark itself doesn't 
> attach any init-containers to the executors. However, custom web hooks can 
> run on the cluster and attach init-containers to the executor pods. 
> Additionally, pod presets can specify init containers to run on these pods. 
> Therefore Spark should handle the {{Init:Error}} cases regardless of whether 
> Spark itself is aware of init-containers.
> This class of error is particularly bad because when we hit this state, the 
> failed executor will never start, but it's still seen as pending by the 
> executor allocator. The executor allocator won't request more rounds of 
> executors because its current batch hasn't been resolved to either running or 
> failed. Therefore we end up stuck with the number of executors that 
> successfully started before the faulty one failed to start, potentially 
> creating a fake resource bottleneck.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-24135) [K8s] Executors that fail to start up because of init-container errors are not retried and limit the executor pool size

2018-05-02 Thread Matt Cheah (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-24135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16461188#comment-16461188
 ] 

Matt Cheah commented on SPARK-24135:


> Restarting seems like it would eventually be limited by the job failure limit 
>that Spark already has. If pod startup failures are deterministic the job 
>failure count will hit this limit and job will be killed that way.

In the case of the executor failing to start at all, this wouldn't be caught by 
Spark's task failure count logic because you're never going to end up 
scheduling tasks on these executors that failed to start.

> [K8s] Executors that fail to start up because of init-container errors are 
> not retried and limit the executor pool size
> ---
>
> Key: SPARK-24135
> URL: https://issues.apache.org/jira/browse/SPARK-24135
> Project: Spark
>  Issue Type: Bug
>  Components: Kubernetes
>Affects Versions: 2.3.0
>Reporter: Matt Cheah
>Priority: Major
>
> In KubernetesClusterSchedulerBackend, we detect if executors disconnect after 
> having been started or if executors hit the {{ERROR}} or {{DELETED}} states. 
> When executors fail in these ways, they are removed from the pending 
> executors pool and the driver should retry requesting these executors.
> However, the driver does not handle a different class of error: when the pod 
> enters the {{Init:Error}} state. This state comes up when the executor fails 
> to launch because one of its init-containers fails. Spark itself doesn't 
> attach any init-containers to the executors. However, custom web hooks can 
> run on the cluster and attach init-containers to the executor pods. 
> Additionally, pod presets can specify init containers to run on these pods. 
> Therefore Spark should handle the {{Init:Error}} cases regardless of whether 
> Spark itself is aware of init-containers.
> This class of error is particularly bad because when we hit this state, the 
> failed executor will never start, but it's still seen as pending by the 
> executor allocator. The executor allocator won't request more rounds of 
> executors because its current batch hasn't been resolved to either running or 
> failed. Therefore we end up stuck with the number of executors that 
> successfully started before the faulty one failed to start, potentially 
> creating a fake resource bottleneck.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-24150) Race condition in FsHistoryProvider

2018-05-02 Thread William Montaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-24150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

William Montaz updated SPARK-24150:
---
Description: 
There exists a race condition between the methods checkLogs and cleanLogs.

cleanLogs can read the field applications while it is concurrently being 
updated by checkLogs. It is possible for checkLogs to add newly fetched logs 
and set applications, and for that update to then be erased by cleanLogs 
writing back an old version of applications. The problem is that the fetched 
log will no longer appear in applications, making it impossible to display the 
corresponding application in the History Server, since it must be in the 
LinkedList applications.

Workaround:
 * use a permanent object as a monitor on which to synchronise
 * keep volatile field for all other read accesses

  was:
There exists a race condition between the methods checkLogs and cleanLogs.

Workaround:
 * use a permanent object as a monitor on which to synchronise
 * keep volatile field for all other read accesses


> Race condition in FsHistoryProvider
> ---
>
> Key: SPARK-24150
> URL: https://issues.apache.org/jira/browse/SPARK-24150
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.2.0
>Reporter: William Montaz
>Priority: Minor
>
> There exists a race condition between the methods checkLogs and cleanLogs.
> cleanLogs can read the field applications while it is concurrently being 
> updated by checkLogs. It is possible for checkLogs to add newly fetched logs 
> and set applications, and for that update to then be erased by cleanLogs 
> writing back an old version of applications. The problem is that the fetched 
> log will no longer appear in applications, making it impossible to display 
> the corresponding application in the History Server, since it must be in the 
> LinkedList applications.
> Workaround:
>  * use a permanent object as a monitor on which to synchronise
>  * keep volatile field for all other read accesses



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-24150) Race condition in FsHistoryProvider

2018-05-02 Thread William Montaz (JIRA)
William Montaz created SPARK-24150:
--

 Summary: Race condition in FsHistoryProvider
 Key: SPARK-24150
 URL: https://issues.apache.org/jira/browse/SPARK-24150
 Project: Spark
  Issue Type: Bug
  Components: Spark Core
Affects Versions: 2.2.0
Reporter: William Montaz


There exists a race condition between the methods checkLogs and cleanLogs.

Workaround:
 * use a permanent object as a monitor on which to synchronise
 * keep volatile field for all other read accesses



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-24135) [K8s] Executors that fail to start up because of init-container errors are not retried and limit the executor pool size

2018-05-02 Thread Erik Erlandson (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-24135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16461154#comment-16461154
 ] 

Erik Erlandson commented on SPARK-24135:


IIRC the dynamic allocation heuristic was to avoid scheduling new executors if 
there were executors still pending, to prevent a positive feedback loop from 
swamping kube with ever-increasing numbers of executor pod scheduling requests. 
How does that interact with the concept of killing a pending executor because 
its pod start is failing?

 

Restarting seems like it would eventually be limited by the job failure limit 
that Spark already has. If pod startup failures are deterministic, the job 
failure count will hit this limit and the job will be killed that way. That 
isn't mutually exclusive with supporting some maximum number of pod startup 
attempts in the back-end, however.

> [K8s] Executors that fail to start up because of init-container errors are 
> not retried and limit the executor pool size
> ---
>
> Key: SPARK-24135
> URL: https://issues.apache.org/jira/browse/SPARK-24135
> Project: Spark
>  Issue Type: Bug
>  Components: Kubernetes
>Affects Versions: 2.3.0
>Reporter: Matt Cheah
>Priority: Major
>
> In KubernetesClusterSchedulerBackend, we detect if executors disconnect after 
> having been started or if executors hit the {{ERROR}} or {{DELETED}} states. 
> When executors fail in these ways, they are removed from the pending 
> executors pool and the driver should retry requesting these executors.
> However, the driver does not handle a different class of error: when the pod 
> enters the {{Init:Error}} state. This state comes up when the executor fails 
> to launch because one of its init-containers fails. Spark itself doesn't 
> attach any init-containers to the executors. However, custom web hooks can 
> run on the cluster and attach init-containers to the executor pods. 
> Additionally, pod presets can specify init containers to run on these pods. 
> Therefore Spark should be handling the {{Init:Error}} cases regardless if 
> Spark itself is aware of init-containers or not.
> This class of error is particularly bad because when we hit this state, the 
> failed executor will never start, but it's still seen as pending by the 
> executor allocator. The executor allocator won't request more rounds of 
> executors because its current batch hasn't been resolved to either running or 
> failed. Therefore we end up with being stuck with the number of executors 
> that successfully started before the faulty one failed to start, potentially 
> creating a fake resource bottleneck.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org


