[jira] [Updated] (SPARK-24343) Avoid shuffle for the bucketed table when shuffle.partition > bucket number

2018-05-22 Thread yucai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yucai updated SPARK-24343: Description: When shuffle.partition > bucket number, Spark needs to shuffle the bucket table as per the

[jira] [Assigned] (SPARK-24063) Control maximum epoch backlog

2018-05-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24063: Assignee: Apache Spark > Control maximum epoch backlog

[jira] [Assigned] (SPARK-24063) Control maximum epoch backlog

2018-05-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24063: Assignee: (was: Apache Spark) > Control maximum epoch backlog >

[jira] [Commented] (SPARK-24329) Remove comments filtering before parsing of CSV files

2018-05-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16483848#comment-16483848 ] Apache Spark commented on SPARK-24329: User 'MaxGekk' has created a pull request for this issue:

[jira] [Updated] (SPARK-24342) Large Task prior scheduling to Reduce overall execution time

2018-05-22 Thread gao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gao updated SPARK-24342: Attachment: tasktimespan.PNG > Large Task prior scheduling to Reduce overall execution time >

[jira] [Updated] (SPARK-24342) Large Task prior scheduling to Reduce overall execution time

2018-05-22 Thread gao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gao updated SPARK-24342: Attachment: (was: taskstimespan.png) > Large Task prior scheduling to Reduce overall execution time >

[jira] [Updated] (SPARK-24342) Large Task prior scheduling to Reduce overall execution time

2018-05-22 Thread gao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gao updated SPARK-24342: Attachment: tasktimespan.PNG > Large Task prior scheduling to Reduce overall execution time >

[jira] [Created] (SPARK-24343) Avoid shuffle for the bucketed table when shuffle.partition > bucket number

2018-05-22 Thread yucai (JIRA)
yucai created SPARK-24343: Summary: Avoid shuffle for the bucketed table when shuffle.partition > bucket number Key: SPARK-24343 URL: https://issues.apache.org/jira/browse/SPARK-24343 Project: Spark

[jira] [Commented] (SPARK-23777) Missing DAG arrows between stages

2018-05-22 Thread Gabor Sudar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16483701#comment-16483701 ] Gabor Sudar commented on SPARK-23777: Started to work on a fix for this. > Missing DAG arrows

[jira] [Created] (SPARK-24344) Spark SQL Thrift Server issue

2018-05-22 Thread L (JIRA)
L created SPARK-24344: Summary: Spark SQL Thrift Server issue Key: SPARK-24344 URL: https://issues.apache.org/jira/browse/SPARK-24344 Project: Spark Issue Type: Bug Components: SQL

[jira] [Updated] (SPARK-24344) Spark SQL Thrift Server issue

2018-05-22 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido updated SPARK-24344: Priority: Major (was: Blocker) > Spark SQL Thrift Server issue

[jira] [Updated] (SPARK-24343) Avoid shuffle for the bucketed table when shuffle.partition > bucket number

2018-05-22 Thread yucai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yucai updated SPARK-24343: Description: When shuffle.partition > bucket number, Spark needs to shuffle the bucket table as per the

[jira] [Assigned] (SPARK-24343) Avoid shuffle for the bucketed table when shuffle.partition > bucket number

2018-05-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24343: Assignee: (was: Apache Spark) > Avoid shuffle for the bucketed table when

[jira] [Assigned] (SPARK-24343) Avoid shuffle for the bucketed table when shuffle.partition > bucket number

2018-05-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24343: Assignee: Apache Spark > Avoid shuffle for the bucketed table when shuffle.partition >

[jira] [Commented] (SPARK-24343) Avoid shuffle for the bucketed table when shuffle.partition > bucket number

2018-05-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16483723#comment-16483723 ] Apache Spark commented on SPARK-24343: User 'yucai' has created a pull request for this issue:

[jira] [Resolved] (SPARK-24321) Extract common code from Divide/Remainder to a base trait

2018-05-22 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-24321. Resolution: Fixed Fix Version/s: 2.4.0 Issue resolved by pull request 21367

[jira] [Commented] (SPARK-24344) Spark SQL Thrift Server issue

2018-05-22 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16483829#comment-16483829 ] Marco Gaido commented on SPARK-24344: I moved to Major as Critical and Blocker are reserved for

[jira] [Commented] (SPARK-24341) Codegen compile error from predicate subquery

2018-05-22 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16483846#comment-16483846 ] Marco Gaido commented on SPARK-24341: This is an issue in the Optimizer, rather than a codegen

[jira] [Updated] (SPARK-24342) Large Task prior scheduling to Reduce overall execution time

2018-05-22 Thread gao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gao updated SPARK-24342: Attachment: taskstimespan.png > Large Task prior scheduling to Reduce overall execution time >

[jira] [Assigned] (SPARK-24321) Extract common code from Divide/Remainder to a base trait

2018-05-22 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-24321: Assignee: Kris Mok > Extract common code from Divide/Remainder to a base trait >

[jira] [Commented] (SPARK-24349) obtainDelegationTokens() exits JVM if Driver use JDBC instead of using metastore

2018-05-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16484096#comment-16484096 ] Apache Spark commented on SPARK-24349: User 'LantaoJin' has created a pull request for this issue:

[jira] [Commented] (SPARK-24324) UserDefinedFunction mixes column labels

2018-05-22 Thread Cristian Consonni (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16484138#comment-16484138 ] Cristian Consonni commented on SPARK-24324: [~hyukjin.kwon] said: > Can you narrow down the

[jira] [Assigned] (SPARK-24351) offsetLog/commitLog purge thresholdBatchId should be computed with current committed epoch but not currentBatchId in CP mode

2018-05-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24351: Assignee: (was: Apache Spark) > offsetLog/commitLog purge thresholdBatchId should be

[jira] [Assigned] (SPARK-24351) offsetLog/commitLog purge thresholdBatchId should be computed with current committed epoch but not currentBatchId in CP mode

2018-05-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24351: Assignee: Apache Spark > offsetLog/commitLog purge thresholdBatchId should be computed

[jira] [Commented] (SPARK-24350) ClassCastException in "array_position" function

2018-05-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16484172#comment-16484172 ] Apache Spark commented on SPARK-24350: User 'wajda' has created a pull request for this issue:

[jira] [Created] (SPARK-24347) df.alias() in python API should not clear metadata by default

2018-05-22 Thread Tomasz Bartczak (JIRA)
Tomasz Bartczak created SPARK-24347: Summary: df.alias() in python API should not clear metadata by default Key: SPARK-24347 URL: https://issues.apache.org/jira/browse/SPARK-24347 Project: Spark

[jira] [Updated] (SPARK-24342) Large Task prior scheduling to Reduce overall execution time

2018-05-22 Thread gao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gao updated SPARK-24342: Description: When performing a set of concurrent tasks, if the relatively large task (long-time task) performs

[jira] [Updated] (SPARK-24342) Large Task prior scheduling to Reduce overall execution time

2018-05-22 Thread gao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gao updated SPARK-24342: Attachment: (was: tasktimespan.PNG) > Large Task prior scheduling to Reduce overall execution time >

[jira] [Commented] (SPARK-24063) Control maximum epoch backlog

2018-05-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16483796#comment-16483796 ] Apache Spark commented on SPARK-24063: User 'efimpoberezkin' has created a pull request for this

[jira] [Commented] (SPARK-20114) spark.ml parity for sequential pattern mining - PrefixSpan

2018-05-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16483830#comment-16483830 ] Apache Spark commented on SPARK-20114: User 'WeichenXu123' has created a pull request for this

[jira] [Assigned] (SPARK-24349) obtainDelegationTokens() exits JVM if Driver use JDBC instead of using metastore

2018-05-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24349: Assignee: (was: Apache Spark) > obtainDelegationTokens() exits JVM if Driver use JDBC

[jira] [Assigned] (SPARK-24349) obtainDelegationTokens() exits JVM if Driver use JDBC instead of using metastore

2018-05-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24349: Assignee: Apache Spark > obtainDelegationTokens() exits JVM if Driver use JDBC instead of

[jira] [Updated] (SPARK-22269) Java style checks should be run in Jenkins

2018-05-22 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-22269: Priority: Major (was: Minor) > Java style checks should be run in Jenkins >

[jira] [Commented] (SPARK-24334) Race condition in ArrowPythonRunner causes unclean shutdown of Arrow memory allocator

2018-05-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16484130#comment-16484130 ] Apache Spark commented on SPARK-24334: User 'icexelloss' has created a pull request for this issue:

[jira] [Assigned] (SPARK-24334) Race condition in ArrowPythonRunner causes unclean shutdown of Arrow memory allocator

2018-05-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24334: Assignee: Apache Spark > Race condition in ArrowPythonRunner causes unclean shutdown of

[jira] [Comment Edited] (SPARK-24324) UserDefinedFunction mixes column labels

2018-05-22 Thread Cristian Consonni (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16484138#comment-16484138 ] Cristian Consonni edited comment on SPARK-24324 at 5/22/18 3:30 PM:

[jira] [Commented] (SPARK-22269) Java style checks should be run in Jenkins

2018-05-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16484151#comment-16484151 ] Apache Spark commented on SPARK-22269: User 'HyukjinKwon' has created a pull request for this

[jira] [Assigned] (SPARK-22269) Java style checks should be run in Jenkins

2018-05-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-22269: Assignee: (was: Apache Spark) > Java style checks should be run in Jenkins >

[jira] [Assigned] (SPARK-22269) Java style checks should be run in Jenkins

2018-05-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-22269: Assignee: Apache Spark > Java style checks should be run in Jenkins >

[jira] [Created] (SPARK-24350) ClassCastException in "array_position" function

2018-05-22 Thread Alex Wajda (JIRA)
Alex Wajda created SPARK-24350: Summary: ClassCastException in "array_position" function Key: SPARK-24350 URL: https://issues.apache.org/jira/browse/SPARK-24350 Project: Spark Issue Type: Bug

[jira] [Created] (SPARK-24351) offsetLog/commitLog purge thresholdBatchId should be computed with current committed epoch but not currentBatchId in CP mode

2018-05-22 Thread huangtengfei (JIRA)
huangtengfei created SPARK-24351: Summary: offsetLog/commitLog purge thresholdBatchId should be computed with current committed epoch but not currentBatchId in CP mode Key: SPARK-24351 URL:

[jira] [Commented] (SPARK-24351) offsetLog/commitLog purge thresholdBatchId should be computed with current committed epoch but not currentBatchId in CP mode

2018-05-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16484173#comment-16484173 ] Apache Spark commented on SPARK-24351: User 'ivoson' has created a pull request for this issue:

[jira] [Assigned] (SPARK-24350) ClassCastException in "array_position" function

2018-05-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24350: Assignee: Apache Spark > ClassCastException in "array_position" function >

[jira] [Assigned] (SPARK-24350) ClassCastException in "array_position" function

2018-05-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24350: Assignee: (was: Apache Spark) > ClassCastException in "array_position" function >

[jira] [Assigned] (SPARK-24345) Improve ParseError stop location when offending symbol is a token

2018-05-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24345: Assignee: Apache Spark > Improve ParseError stop location when offending symbol is a

[jira] [Assigned] (SPARK-24345) Improve ParseError stop location when offending symbol is a token

2018-05-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24345: Assignee: (was: Apache Spark) > Improve ParseError stop location when offending

[jira] [Commented] (SPARK-24345) Improve ParseError stop location when offending symbol is a token

2018-05-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16483895#comment-16483895 ] Apache Spark commented on SPARK-24345: User 'rubenfiszel' has created a pull request for this

[jira] [Assigned] (SPARK-20087) Include accumulators / taskMetrics when sending TaskKilled to onTaskEnd listeners

2018-05-22 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-20087: Assignee: Xianjin YE > Include accumulators / taskMetrics when sending TaskKilled to

[jira] [Resolved] (SPARK-21673) Spark local directory is not set correctly

2018-05-22 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-21673. Resolution: Fixed Fix Version/s: 2.4.0 Issue resolved by pull request 18894

[jira] [Assigned] (SPARK-21673) Spark local directory is not set correctly

2018-05-22 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen reassigned SPARK-21673: Assignee: Jake Charland > Spark local directory is not set correctly >

[jira] [Resolved] (SPARK-24313) Collection functions interpreted execution doesn't work with complex types

2018-05-22 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-24313. Resolution: Fixed Fix Version/s: 2.4.0 Issue resolved by pull request 21361

[jira] [Assigned] (SPARK-24313) Collection functions interpreted execution doesn't work with complex types

2018-05-22 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-24313: Assignee: Marco Gaido > Collection functions interpreted execution doesn't work with

[jira] [Comment Edited] (SPARK-24334) Race condition in ArrowPythonRunner causes unclean shutdown of Arrow memory allocator

2018-05-22 Thread Mateusz Pieniak (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16483991#comment-16483991 ] Mateusz Pieniak edited comment on SPARK-24334 at 5/22/18 1:59 PM: I

[jira] [Commented] (SPARK-24334) Race condition in ArrowPythonRunner causes unclean shutdown of Arrow memory allocator

2018-05-22 Thread Mateusz Pieniak (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16483991#comment-16483991 ] Mateusz Pieniak commented on SPARK-24334: I came across with this issue while running my custom

[jira] [Resolved] (SPARK-24244) Parse only required columns of CSV file

2018-05-22 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-24244. Resolution: Fixed Fix Version/s: 2.4.0 Issue resolved by pull request 21296

[jira] [Assigned] (SPARK-24348) scala.MatchError in the "element_at" expression

2018-05-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24348: Assignee: Apache Spark > scala.MatchError in the "element_at" expression >

[jira] [Commented] (SPARK-24334) Race condition in ArrowPythonRunner causes unclean shutdown of Arrow memory allocator

2018-05-22 Thread Li Jin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16484086#comment-16484086 ] Li Jin commented on SPARK-24334: [~pi3ni0] did it happen for you when your UDF throws exception? > Race

[jira] [Updated] (SPARK-24345) Improve ParseError stop location when offending symbol is a token

2018-05-22 Thread Ruben Fiszel (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ruben Fiszel updated SPARK-24345: Description: In the case where the offending symbol of a syntaxError is a CommonToken, this PR

[jira] [Resolved] (SPARK-20087) Include accumulators / taskMetrics when sending TaskKilled to onTaskEnd listeners

2018-05-22 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-20087. Resolution: Fixed Fix Version/s: 2.4.0 Issue resolved by pull request 21165

[jira] [Assigned] (SPARK-24244) Parse only required columns of CSV file

2018-05-22 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-24244: Assignee: Maxim Gekk > Parse only required columns of CSV file >

[jira] [Created] (SPARK-24349) obtainDelegationTokens() exits JVM if Driver use JDBC instead of using metastore

2018-05-22 Thread Lantao Jin (JIRA)
Lantao Jin created SPARK-24349: Summary: obtainDelegationTokens() exits JVM if Driver use JDBC instead of using metastore Key: SPARK-24349 URL: https://issues.apache.org/jira/browse/SPARK-24349

[jira] [Created] (SPARK-24345) Improve ParseError stop location when offending symbol is a token

2018-05-22 Thread Ruben Fiszel (JIRA)
Ruben Fiszel created SPARK-24345: Summary: Improve ParseError stop location when offending symbol is a token Key: SPARK-24345 URL: https://issues.apache.org/jira/browse/SPARK-24345 Project: Spark

[jira] [Commented] (SPARK-24273) Failure while using .checkpoint method

2018-05-22 Thread Jami Malikzade (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16483908#comment-16483908 ] Jami Malikzade commented on SPARK-24273: [~kiszk] I went deeper and found more: This way it

[jira] [Created] (SPARK-24346) Executors are unable to fetch remote cache blocks

2018-05-22 Thread Truong Duc Kien (JIRA)
Truong Duc Kien created SPARK-24346: Summary: Executors are unable to fetch remote cache blocks Key: SPARK-24346 URL: https://issues.apache.org/jira/browse/SPARK-24346 Project: Spark

[jira] [Created] (SPARK-24348) scala.MatchError in the "element_at" expression

2018-05-22 Thread Alex Wajda (JIRA)
Alex Wajda created SPARK-24348: Summary: scala.MatchError in the "element_at" expression Key: SPARK-24348 URL: https://issues.apache.org/jira/browse/SPARK-24348 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-24348) scala.MatchError in the "element_at" expression

2018-05-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16484046#comment-16484046 ] Apache Spark commented on SPARK-24348: User 'wajda' has created a pull request for this issue:

[jira] [Assigned] (SPARK-24348) scala.MatchError in the "element_at" expression

2018-05-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24348: Assignee: (was: Apache Spark) > scala.MatchError in the "element_at" expression >

[jira] [Commented] (SPARK-24269) Infer nullability rather than declaring all columns as nullable

2018-05-22 Thread Simeon Simeonov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16484071#comment-16484071 ] Simeon Simeonov commented on SPARK-24269: There are many reasons why correct nullability

[jira] [Commented] (SPARK-6235) Address various 2G limits

2018-05-22 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16484069#comment-16484069 ] Imran Rashid commented on SPARK-6235: Would be nice to find a better home for this, but for now I

[jira] [Commented] (SPARK-24341) Codegen compile error from predicate subquery

2018-05-22 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16484225#comment-16484225 ] Xiao Li commented on SPARK-24341: cc [~dkbiswal] Could you take a look at this too? > Codegen compile

[jira] [Comment Edited] (SPARK-13638) Support for saving with a quote mode

2018-05-22 Thread Umesh K (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16484255#comment-16484255 ] Umesh K edited comment on SPARK-13638 at 5/22/18 4:36 PM: [~rxin] Just want to

[jira] [Updated] (SPARK-24351) offsetLog/commitLog purge thresholdBatchId should be computed with current committed epoch but not currentBatchId in CP mode

2018-05-22 Thread huangtengfei (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] huangtengfei updated SPARK-24351: Description: In structured streaming, there is a conf spark.sql.streaming.minBatchesToRetain

[jira] [Created] (SPARK-24353) Add support for pod affinity/anti-affinity

2018-05-22 Thread Stavros Kontopoulos (JIRA)
Stavros Kontopoulos created SPARK-24353: Summary: Add support for pod affinity/anti-affinity Key: SPARK-24353 URL: https://issues.apache.org/jira/browse/SPARK-24353 Project: Spark

[jira] [Updated] (SPARK-24257) LongToUnsafeRowMap calculate the new size may be wrong

2018-05-22 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-24257: Labels: correctness (was: ) > LongToUnsafeRowMap calculate the new size may be wrong >

[jira] [Updated] (SPARK-24257) LongToUnsafeRowMap calculate the new size may be wrong

2018-05-22 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-24257: Priority: Blocker (was: Minor) > LongToUnsafeRowMap calculate the new size may be wrong >

[jira] [Created] (SPARK-24352) Flaky test: StandaloneDynamicAllocationSuite

2018-05-22 Thread Marcelo Vanzin (JIRA)
Marcelo Vanzin created SPARK-24352: Summary: Flaky test: StandaloneDynamicAllocationSuite Key: SPARK-24352 URL: https://issues.apache.org/jira/browse/SPARK-24352 Project: Spark Issue Type:

[jira] [Updated] (SPARK-24353) Add support for pod affinity/anti-affinity

2018-05-22 Thread Stavros Kontopoulos (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stavros Kontopoulos updated SPARK-24353: Description: Spark on K8s allows to place driver/executor pods on specific k8s

[jira] [Commented] (SPARK-13638) Support for saving with a quote mode

2018-05-22 Thread Umesh K (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16484255#comment-16484255 ] Umesh K commented on SPARK-13638: Just want to confirm are we ever going to have quoteMode or we always

[jira] [Updated] (SPARK-24257) LongToUnsafeRowMap calculate the new size may be wrong

2018-05-22 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-24257: Target Version/s: 2.3.1 (was: 2.3.2) > LongToUnsafeRowMap calculate the new size may be wrong >

[jira] [Updated] (SPARK-24257) LongToUnsafeRowMap calculate the new size may be wrong

2018-05-22 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-24257: Target Version/s: 2.3.2 > LongToUnsafeRowMap calculate the new size may be wrong >

[jira] [Commented] (SPARK-24339) spark sql can not prune column in transform/map/reduce query

2018-05-22 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16486561#comment-16486561 ] Hyukjin Kwon commented on SPARK-24339: (Don't set the target versions usually reserved for

[jira] [Updated] (SPARK-24358) createDataFrame in Python 3 should be able to infer bytes type as Binary type

2018-05-22 Thread Joel Croteau (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Croteau updated SPARK-24358: Labels: Python3 (was: ) Description: createDataFrame can infer Python 3's bytearray

[jira] [Commented] (SPARK-22366) Support ignoreMissingFiles flag parallel to ignoreCorruptFiles

2018-05-22 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16486717#comment-16486717 ] Hyukjin Kwon commented on SPARK-22366: Oops, sorry. I mistakenly edited the JIRA. I reverted it

[jira] [Updated] (SPARK-22366) Support ignoreMissingFiles flag parallel to ignoreCorruptFiles

2018-05-22 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-22366: Description: +underlined text+There's an existing flag "spark.sql.files.ignoreCorruptFiles"

[jira] [Updated] (SPARK-22366) Support ignoreMissingFiles flag parallel to ignoreCorruptFiles

2018-05-22 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-22366: Description: There's an existing flag "spark.sql.files.ignoreCorruptFiles" that will quietly

[jira] [Updated] (SPARK-24342) Large Task prior scheduling to Reduce overall execution time

2018-05-22 Thread gao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gao updated SPARK-24342: Priority: Major (was: Minor) > Large Task prior scheduling to Reduce overall execution time >

[jira] [Commented] (SPARK-24358) createDataFrame in Python should be able to infer bytes type as Binary type

2018-05-22 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16486569#comment-16486569 ] Hyukjin Kwon commented on SPARK-24358: -- ? do you mean bytes in Python 2? that's an alias for str,

[jira] [Commented] (SPARK-24358) createDataFrame in Python should be able to infer bytes type as Binary type

2018-05-22 Thread Joel Croteau (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16486581#comment-16486581 ] Joel Croteau commented on SPARK-24358: -- No, I mean the bytes type in Python 3. This code:

[jira] [Comment Edited] (SPARK-24358) createDataFrame in Python should be able to infer bytes type as Binary type

2018-05-22 Thread Joel Croteau (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16486581#comment-16486581 ] Joel Croteau edited comment on SPARK-24358 at 5/23/18 1:47 AM: --- No, I mean

[jira] [Commented] (SPARK-24356) Duplicate strings in File.path managed by FileSegmentManagedBuffer

2018-05-22 Thread Misha Dmitriev (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16486580#comment-16486580 ] Misha Dmitriev commented on SPARK-24356: I plan to work on this feature. > Duplicate strings in

[jira] [Commented] (SPARK-24357) createDataFrame in Python infers large integers as long type and then fails silently when converting them

2018-05-22 Thread Joel Croteau (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16486571#comment-16486571 ] Joel Croteau commented on SPARK-24357: -- Fair enough, here is some code to reproduce it:

[jira] [Commented] (SPARK-24358) createDataFrame in Python 3 should be able to infer bytes type as Binary type

2018-05-22 Thread Joel Croteau (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16486594#comment-16486594 ] Joel Croteau commented on SPARK-24358: -- Done. > createDataFrame in Python 3 should be able to infer

[jira] [Created] (SPARK-24361) Polish code block manipulation API

2018-05-22 Thread Liang-Chi Hsieh (JIRA)
Liang-Chi Hsieh created SPARK-24361: --- Summary: Polish code block manipulation API Key: SPARK-24361 URL: https://issues.apache.org/jira/browse/SPARK-24361 Project: Spark Issue Type:

[jira] [Updated] (SPARK-24342) Large Task prior scheduling to Reduce overall execution time

2018-05-22 Thread gao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gao updated SPARK-24342: Component/s: (was: Optimizer) Spark Core > Large Task prior scheduling to Reduce overall

[jira] [Closed] (SPARK-24349) obtainDelegationTokens() exits JVM if Driver use JDBC instead of using metastore

2018-05-22 Thread Lantao Jin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lantao Jin closed SPARK-24349. -- > obtainDelegationTokens() exits JVM if Driver use JDBC instead of using > metastore >

[jira] [Resolved] (SPARK-24349) obtainDelegationTokens() exits JVM if Driver use JDBC instead of using metastore

2018-05-22 Thread Lantao Jin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lantao Jin resolved SPARK-24349. Resolution: Not A Problem delegationTokensRequired has been checked in SparkSQLCLIDriver.scala >

[jira] [Commented] (SPARK-22055) Port release scripts

2018-05-22 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16486728#comment-16486728 ] Felix Cheung commented on SPARK-22055: -- interesting - I'd definitely be happy to help. do you have

[jira] [Commented] (SPARK-24324) UserDefinedFunction mixes column labels

2018-05-22 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16486565#comment-16486565 ] Hyukjin Kwon commented on SPARK-24324: -- Ah, I meant shorter reproducer should make other guys easier
