[jira] [Updated] (SPARK-32060) Huber loss Convergence

2020-06-30 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng updated SPARK-32060: - Description: |performance test in https://issues.apache.org/jira/browse/SPARK-31783, Huber loss

[jira] [Updated] (SPARK-32046) current_timestamp called in a cache dataframe freezes the time for all future calls

2020-06-30 Thread Dustin Smith (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dustin Smith updated SPARK-32046: - Description: If I call current_timestamp 3 times while caching the dataframe variable in order

[jira] [Commented] (SPARK-32119) ExecutorPlugin doesn't work with Standalone Cluster

2020-06-30 Thread Luca Canali (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17148350#comment-17148350 ] Luca Canali commented on SPARK-32119: - No plans that I know of. CC [~vanzin] > ExecutorPlugin

[jira] [Commented] (SPARK-31060) Handle column names containing `dots` in data source `Filter`

2020-06-30 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17148594#comment-17148594 ] Apache Spark commented on SPARK-31060: -- User 'HyukjinKwon' has created a pull request for this

[jira] [Commented] (SPARK-32115) Incorrect results for SUBSTRING when overflow

2020-06-30 Thread Yuanjian Li (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17148409#comment-17148409 ] Yuanjian Li commented on SPARK-32115: - Thank you for verifying! [~dongjoon] > Incorrect results for

[jira] [Comment Edited] (SPARK-32130) Spark 3.0 json load performance is unacceptable in comparison of Spark 2.4

2020-06-30 Thread JinxinTang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17148393#comment-17148393 ] JinxinTang edited comment on SPARK-32130 at 6/30/20, 8:07 AM: -- [~lotus2you] 

[jira] [Comment Edited] (SPARK-32130) Spark 3.0 json load performance is unacceptable in comparison of Spark 2.4

2020-06-30 Thread JinxinTang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17148393#comment-17148393 ] JinxinTang edited comment on SPARK-32130 at 6/30/20, 8:05 AM: -- [~lotus2you] 

[jira] [Created] (SPARK-32137) AttributeError: Can only use .dt accessor with datetimelike values

2020-06-30 Thread David Lacalle Castillo (Jira)
David Lacalle Castillo created SPARK-32137: -- Summary: AttributeError: Can only use .dt accessor with datetimelike values Key: SPARK-32137 URL: https://issues.apache.org/jira/browse/SPARK-32137

[jira] [Comment Edited] (SPARK-32060) Huber loss Convergence

2020-06-30 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17148412#comment-17148412 ] zhengruifeng edited comment on SPARK-32060 at 6/30/20, 8:17 AM: I found

[jira] [Commented] (SPARK-32130) Spark 3.0 json load performance is unacceptable in comparison of Spark 2.4

2020-06-30 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17148416#comment-17148416 ] Jungtaek Lim commented on SPARK-32130: -- I have a feeling that the "opt-in" approach is correct as it brings

[jira] [Commented] (SPARK-28861) Jetty property handling: java.lang.NumberFormatException: For input string: "unknown".

2020-06-30 Thread Siddhant Chadha (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17148444#comment-17148444 ] Siddhant Chadha commented on SPARK-28861: - I’d like to take this up > Jetty property handling:

[jira] [Commented] (SPARK-3181) Add Robust Regression Algorithm with Huber Estimator

2020-06-30 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-3181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17148452#comment-17148452 ] zhengruifeng commented on SPARK-3181: - I am working on blockify+gemv/gemm for better performance, and

[jira] [Commented] (SPARK-32130) Spark 3.0 json load performance is unacceptable in comparison of Spark 2.4

2020-06-30 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17148559#comment-17148559 ] Jungtaek Lim commented on SPARK-32130: -- So anyone can just reproduce via running spark-shell on

[jira] [Commented] (SPARK-32060) Huber loss Convergence

2020-06-30 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17148412#comment-17148412 ] zhengruifeng commented on SPARK-32060: -- I found that the optimization of Huber Loss is unstable, if

[jira] [Updated] (SPARK-32060) Huber loss Convergence

2020-06-30 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng updated SPARK-32060: - Attachment: (was: image-2020-06-28-18-05-28-867.png) > Huber loss Convergence >

[jira] [Commented] (SPARK-32130) Spark 3.0 json load performance is unacceptable in comparison of Spark 2.4

2020-06-30 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17148570#comment-17148570 ] Jungtaek Lim commented on SPARK-32130: -- Looks like we already saw the difference but we missed to

[jira] [Commented] (SPARK-32083) Unnecessary tasks are launched when input is empty with AQE

2020-06-30 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17148585#comment-17148585 ] Apache Spark commented on SPARK-32083: -- User 'manuzhang' has created a pull request for this issue:

[jira] [Commented] (SPARK-32130) Spark 3.0 json load performance is unacceptable in comparison of Spark 2.4

2020-06-30 Thread JinxinTang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17148393#comment-17148393 ] JinxinTang commented on SPARK-32130: [~lotus2you] Thank you for your feedback. Lots of time is used

[jira] [Comment Edited] (SPARK-32130) Spark 3.0 json load performance is unacceptable in comparison of Spark 2.4

2020-06-30 Thread JinxinTang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17148393#comment-17148393 ] JinxinTang edited comment on SPARK-32130 at 6/30/20, 7:39 AM: -- [~lotus2you] 

[jira] [Comment Edited] (SPARK-32060) Huber loss Convergence

2020-06-30 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17148412#comment-17148412 ] zhengruifeng edited comment on SPARK-32060 at 6/30/20, 8:20 AM: I found

[jira] [Commented] (SPARK-32083) Unnecessary tasks are launched when input is empty with AQE

2020-06-30 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17148584#comment-17148584 ] Apache Spark commented on SPARK-32083: -- User 'manuzhang' has created a pull request for this issue:

[jira] [Created] (SPARK-32136) Spark producing incorrect groupBy results when key is a struct with nullable properties

2020-06-30 Thread Jason Moore (Jira)
Jason Moore created SPARK-32136: --- Summary: Spark producing incorrect groupBy results when key is a struct with nullable properties Key: SPARK-32136 URL: https://issues.apache.org/jira/browse/SPARK-32136

[jira] [Updated] (SPARK-32130) Spark 3.0 json load performance is unacceptable in comparison of Spark 2.4

2020-06-30 Thread Gourav (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gourav updated SPARK-32130: --- Attachment: SPARK 32130 - replication and findings.ipynb > Spark 3.0 json load performance is unacceptable

[jira] [Updated] (SPARK-32136) Spark producing incorrect groupBy results when key is a struct with nullable properties

2020-06-30 Thread Jason Moore (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Moore updated SPARK-32136: Description: I'm in the process of migrating from Spark 2.4.x to Spark 3.0.0 and I'm noticing a

[jira] [Commented] (SPARK-25556) Predicate Pushdown for Nested fields

2020-06-30 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-25556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17148587#comment-17148587 ] Apache Spark commented on SPARK-25556: -- User 'HyukjinKwon' has created a pull request for this

[jira] [Commented] (SPARK-17636) Parquet predicate pushdown for nested fields

2020-06-30 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-17636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17148588#comment-17148588 ] Apache Spark commented on SPARK-17636: -- User 'HyukjinKwon' has created a pull request for this

[jira] [Commented] (SPARK-31026) Parquet predicate pushdown on columns with dots

2020-06-30 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17148590#comment-17148590 ] Apache Spark commented on SPARK-31026: -- User 'HyukjinKwon' has created a pull request for this

[jira] [Commented] (SPARK-17636) Parquet predicate pushdown for nested fields

2020-06-30 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-17636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17148589#comment-17148589 ] Apache Spark commented on SPARK-17636: -- User 'HyukjinKwon' has created a pull request for this

[jira] [Comment Edited] (SPARK-32130) Spark 3.0 json load performance is unacceptable in comparison of Spark 2.4

2020-06-30 Thread JinxinTang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17148393#comment-17148393 ] JinxinTang edited comment on SPARK-32130 at 6/30/20, 7:58 AM: -- [~lotus2you] 

[jira] [Comment Edited] (SPARK-32130) Spark 3.0 json load performance is unacceptable in comparison of Spark 2.4

2020-06-30 Thread JinxinTang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17148393#comment-17148393 ] JinxinTang edited comment on SPARK-32130 at 6/30/20, 7:59 AM: -- [~lotus2you] 

[jira] [Updated] (SPARK-32130) Spark 3.0 json load performance is unacceptable in comparison of Spark 2.4

2020-06-30 Thread Gourav (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gourav updated SPARK-32130: --- Attachment: (was: SPARK 32130 - replication and findings.ipynb) > Spark 3.0 json load performance is

[jira] [Commented] (SPARK-32130) Spark 3.0 json load performance is unacceptable in comparison of Spark 2.4

2020-06-30 Thread Gourav (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17148459#comment-17148459 ] Gourav commented on SPARK-32130: [~JinxinTang] and [~lotus2you] I think that it is only the first time

[jira] [Updated] (SPARK-32130) Spark 3.0 json load performance is unacceptable in comparison of Spark 2.4

2020-06-30 Thread Gourav (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gourav updated SPARK-32130: --- Attachment: SPARK 32130 - replication and findings.ipynb > Spark 3.0 json load performance is unacceptable

[jira] [Comment Edited] (SPARK-32130) Spark 3.0 json load performance is unacceptable in comparison of Spark 2.4

2020-06-30 Thread Gourav (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17148459#comment-17148459 ] Gourav edited comment on SPARK-32130 at 6/30/20, 9:02 AM: -- [~JinxinTang] and

[jira] [Commented] (SPARK-32130) Spark 3.0 json load performance is unacceptable in comparison of Spark 2.4

2020-06-30 Thread JinxinTang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17148501#comment-17148501 ] JinxinTang commented on SPARK-32130: [~gourav.sengupta] Nice notebook, it seems the row count() is 

[jira] [Updated] (SPARK-32137) AttributeError: Can only use .dt accessor with datetimelike values

2020-06-30 Thread David Lacalle Castillo (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Lacalle Castillo updated SPARK-32137: --- Priority: Critical (was: Major) > AttributeError: Can only use .dt

[jira] [Commented] (SPARK-32130) Spark 3.0 json load performance is unacceptable in comparison of Spark 2.4

2020-06-30 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17148546#comment-17148546 ] Jungtaek Lim commented on SPARK-32130: -- For me it's reproduced consistently. Please make sure you

[jira] [Commented] (SPARK-31060) Handle column names containing `dots` in data source `Filter`

2020-06-30 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17148592#comment-17148592 ] Apache Spark commented on SPARK-31060: -- User 'HyukjinKwon' has created a pull request for this

[jira] [Commented] (SPARK-31026) Parquet predicate pushdown on columns with dots

2020-06-30 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17148593#comment-17148593 ] Apache Spark commented on SPARK-31026: -- User 'HyukjinKwon' has created a pull request for this

[jira] [Commented] (SPARK-32130) Spark 3.0 json load performance is unacceptable in comparison of Spark 2.4

2020-06-30 Thread Sanjeev Mishra (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17148663#comment-17148663 ] Sanjeev Mishra commented on SPARK-32130: I tried to load entire dataset using above suggestions

[jira] [Updated] (SPARK-31816) Create high level description about JDBC connection providers for users/developers

2020-06-30 Thread Gabor Somogyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Somogyi updated SPARK-31816: -- Summary: Create high level description about JDBC connection providers for users/developers

[jira] [Commented] (SPARK-32119) ExecutorPlugin doesn't work with Standalone Cluster

2020-06-30 Thread Thomas Graves (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17148746#comment-17148746 ] Thomas Graves commented on SPARK-32119: --- You can specify the jars in extraClassPath but it

[jira] [Updated] (SPARK-29919) remove python2 test execution

2020-06-30 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-29919: - Parent: SPARK-29909 Issue Type: Sub-task (was: Improvement) > remove python2 test

[jira] [Commented] (SPARK-29803) remove all instances of 'from __future__ import print_function'

2020-06-30 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17148755#comment-17148755 ] Hyukjin Kwon commented on SPARK-29803: -- I will do it at SPARK-29909 > remove all instances of

[jira] [Updated] (SPARK-29919) remove python2 test execution

2020-06-30 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-29919: - Parent: (was: SPARK-29909) Issue Type: Improvement (was: Sub-task) > remove

[jira] [Commented] (SPARK-32138) Drop Python 2, 3.4 and 3.5 in codes and documentation

2020-06-30 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17148766#comment-17148766 ] Apache Spark commented on SPARK-32138: -- User 'HyukjinKwon' has created a pull request for this

[jira] [Updated] (SPARK-29919) remove python2 test execution

2020-06-30 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-29919: - Parent: (was: SPARK-27884) Issue Type: Bug (was: Sub-task) > remove python2 test

[jira] [Updated] (SPARK-29919) remove python2 test execution

2020-06-30 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-29919: - Parent: SPARK-29909 Issue Type: Sub-task (was: Bug) > remove python2 test execution >

[jira] [Commented] (SPARK-31797) Adds TIMESTAMP_SECONDS, TIMESTAMP_MILLIS and TIMESTAMP_MICROS functions

2020-06-30 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17148694#comment-17148694 ] Apache Spark commented on SPARK-31797: -- User 'cloud-fan' has created a pull request for this issue:

[jira] [Commented] (SPARK-31797) Adds TIMESTAMP_SECONDS, TIMESTAMP_MILLIS and TIMESTAMP_MICROS functions

2020-06-30 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17148696#comment-17148696 ] Apache Spark commented on SPARK-31797: -- User 'cloud-fan' has created a pull request for this issue:

[jira] [Commented] (SPARK-32135) Show Spark Driver name on Spark history web page

2020-06-30 Thread Thomas Graves (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17148742#comment-17148742 ] Thomas Graves commented on SPARK-32135: --- [~gaurangi] can you please clarify what you mean by

[jira] [Resolved] (SPARK-29803) remove all instances of 'from __future__ import print_function'

2020-06-30 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-29803. -- Resolution: Duplicate > remove all instances of 'from __future__ import print_function' >

[jira] [Updated] (SPARK-29803) remove all instances of 'from __future__ import print_function'

2020-06-30 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-29803: - Parent: (was: SPARK-29909) Issue Type: Improvement (was: Sub-task) > remove all

[jira] [Updated] (SPARK-32130) Spark 3.0 json load performance is unacceptable in comparison of Spark 2.4

2020-06-30 Thread Sanjeev Mishra (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sanjeev Mishra updated SPARK-32130: --- Description: We are planning to move to Spark 3 but the read performance of our json files

[jira] [Updated] (SPARK-32138) Drop Python 2, 3.4 and 3.5 in codes and documentation

2020-06-30 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-32138: - Summary: Drop Python 2, 3.4 and 3.5 in codes and documentation (was: Drop Python 2, 3.4 and

[jira] [Commented] (SPARK-31816) Create high level description about JDBC connection providers for users/developers

2020-06-30 Thread Gabor Somogyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17148712#comment-17148712 ] Gabor Somogyi commented on SPARK-31816: --- Originally I've thought it's enough to add developer

[jira] [Updated] (SPARK-29919) Remove python2 test execution in Jenkins environment

2020-06-30 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-29919: - Summary: Remove python2 test execution in Jenkins environment (was: remove python2 test

[jira] [Comment Edited] (SPARK-29803) remove all instances of 'from __future__ import print_function'

2020-06-30 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17148755#comment-17148755 ] Hyukjin Kwon edited comment on SPARK-29803 at 6/30/20, 3:07 PM: I will

[jira] [Updated] (SPARK-29802) Update remaining python scripts in repo to python3 shebang

2020-06-30 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-29802: - Summary: Update remaining python scripts in repo to python3 shebang (was: update remaining

[jira] [Created] (SPARK-32138) Drop Python 2, 3.4 and 3.5 in the main and dev codes

2020-06-30 Thread Hyukjin Kwon (Jira)
Hyukjin Kwon created SPARK-32138: Summary: Drop Python 2, 3.4 and 3.5 in the main and dev codes Key: SPARK-32138 URL: https://issues.apache.org/jira/browse/SPARK-32138 Project: Spark Issue

[jira] [Comment Edited] (SPARK-32130) Spark 3.0 json load performance is unacceptable in comparison of Spark 2.4

2020-06-30 Thread Sanjeev Mishra (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17148663#comment-17148663 ] Sanjeev Mishra edited comment on SPARK-32130 at 6/30/20, 1:35 PM: -- I

[jira] [Assigned] (SPARK-32068) Spark 3 UI task launch time show in error time zone

2020-06-30 Thread Thomas Graves (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves reassigned SPARK-32068: - Assignee: JinxinTang > Spark 3 UI task launch time show in error time zone >

[jira] [Resolved] (SPARK-32068) Spark 3 UI task launch time show in error time zone

2020-06-30 Thread Thomas Graves (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves resolved SPARK-32068. --- Fix Version/s: 3.1.0 3.0.1 Resolution: Fixed > Spark 3 UI task

[jira] [Assigned] (SPARK-32138) Drop Python 2, 3.4 and 3.5 in codes and documentation

2020-06-30 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-32138: Assignee: Apache Spark > Drop Python 2, 3.4 and 3.5 in codes and documentation >

[jira] [Assigned] (SPARK-32138) Drop Python 2, 3.4 and 3.5 in codes and documentation

2020-06-30 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-32138: Assignee: (was: Apache Spark) > Drop Python 2, 3.4 and 3.5 in codes and

[jira] [Commented] (SPARK-32132) Thriftserver interval returns "4 weeks 2 days" in 2.4 and "30 days" in 3.0

2020-06-30 Thread Juliusz Sompolski (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17148848#comment-17148848 ] Juliusz Sompolski commented on SPARK-32132: --- Also 2.4 adds "interval" at the start, while 3.0

[jira] [Commented] (SPARK-32119) ExecutorPlugin doesn't work with Standalone Cluster

2020-06-30 Thread Kousuke Saruta (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17148790#comment-17148790 ] Kousuke Saruta commented on SPARK-32119: Yeah I know it works with extraClassPath but as you

[jira] [Updated] (SPARK-32026) Add PrometheusServletSuite

2020-06-30 Thread Eren Avsarogullari (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eren Avsarogullari updated SPARK-32026: --- Description: This Jira aims to add _PrometheusServletSuite_. *Note:* This

[jira] [Commented] (SPARK-32132) Thriftserver interval returns "4 weeks 2 days" in 2.4 and "30 days" in 3.0

2020-06-30 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17148836#comment-17148836 ] Wenchen Fan commented on SPARK-32132: - We did it intentionally in

[jira] [Updated] (SPARK-31935) Hadoop file system config should be effective in data source options

2020-06-30 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-31935: Issue Type: Bug (was: Improvement) > Hadoop file system config should be effective in data

[jira] [Updated] (SPARK-32121) ExternalShuffleBlockResolverSuite failed on Windows

2020-06-30 Thread Cheng Pan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Pan updated SPARK-32121: -- Issue Type: Bug (was: Test) > ExternalShuffleBlockResolverSuite failed on Windows >

[jira] [Updated] (SPARK-31935) Hadoop file system config should be effective in data source options

2020-06-30 Thread Cheng Lian (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-31935: --- Affects Version/s: (was: 3.0.1) (was: 3.1.0)

[jira] [Assigned] (SPARK-31336) Support Oracle Kerberos login in JDBC connector

2020-06-30 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-31336: - Assignee: Gabor Somogyi > Support Oracle Kerberos login in JDBC connector >

[jira] [Resolved] (SPARK-31336) Support Oracle Kerberos login in JDBC connector

2020-06-30 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-31336. --- Fix Version/s: 3.1.0 Resolution: Fixed Issue resolved by pull request 28863

[jira] [Commented] (SPARK-32088) test of pyspark.sql.functions.timestamp_seconds failed if non-american timezone setting

2020-06-30 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17148913#comment-17148913 ] Apache Spark commented on SPARK-32088: -- User 'MaxGekk' has created a pull request for this issue:

[jira] [Commented] (SPARK-32130) Spark 3.0 json load performance is unacceptable in comparison of Spark 2.4

2020-06-30 Thread Sean R. Owen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17148936#comment-17148936 ] Sean R. Owen commented on SPARK-32130: -- So, is the issue that it's trying and failing to parse

[jira] [Updated] (SPARK-28664) ORDER BY in aggregate function

2020-06-30 Thread Will Zimmerman (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Will Zimmerman updated SPARK-28664: --- Attachment: image-2020-06-30-15-49-46-796.png > ORDER BY in aggregate function >

[jira] [Updated] (SPARK-23631) Add summary to RandomForestClassificationModel

2020-06-30 Thread Huaxin Gao (Jira)
[ https://issues.apache.org/jira/browse/SPARK-23631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Huaxin Gao updated SPARK-23631: --- Parent: SPARK-32139 Issue Type: Sub-task (was: New Feature) > Add summary to

[jira] [Created] (SPARK-32140) Add summary to FMClassificationModel

2020-06-30 Thread Huaxin Gao (Jira)
Huaxin Gao created SPARK-32140: -- Summary: Add summary to FMClassificationModel Key: SPARK-32140 URL: https://issues.apache.org/jira/browse/SPARK-32140 Project: Spark Issue Type: Sub-task

[jira] [Updated] (SPARK-31893) Add a generic ClassificationSummary trait

2020-06-30 Thread Huaxin Gao (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Huaxin Gao updated SPARK-31893: --- Parent: SPARK-32139 Issue Type: Sub-task (was: Improvement) > Add a generic

[jira] [Created] (SPARK-32139) Unify Classification Training Summary

2020-06-30 Thread Huaxin Gao (Jira)
Huaxin Gao created SPARK-32139: -- Summary: Unify Classification Training Summary Key: SPARK-32139 URL: https://issues.apache.org/jira/browse/SPARK-32139 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-28664) ORDER BY in aggregate function

2020-06-30 Thread Will Zimmerman (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17148952#comment-17148952 ] Will Zimmerman commented on SPARK-28664: [~yumwang] - Would this allow for the changing of Null

[jira] [Updated] (SPARK-20249) Add summary for LinearSVCModel

2020-06-30 Thread Huaxin Gao (Jira)
[ https://issues.apache.org/jira/browse/SPARK-20249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Huaxin Gao updated SPARK-20249: --- Parent: SPARK-32139 Issue Type: Sub-task (was: Improvement) > Add summary for

[jira] [Commented] (SPARK-32088) test of pyspark.sql.functions.timestamp_seconds failed if non-american timezone setting

2020-06-30 Thread huangtianhua (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17149044#comment-17149044 ] huangtianhua commented on SPARK-32088: -- The test succeeds with the modification

[jira] [Created] (SPARK-32142) Keep the original tests and codes to avoid potential conflicts in dev in ParquetFilterSuite and ParquetIOSuite

2020-06-30 Thread Hyukjin Kwon (Jira)
Hyukjin Kwon created SPARK-32142: Summary: Keep the original tests and codes to avoid potential conflicts in dev in ParquetFilterSuite and ParquetIOSuite Key: SPARK-32142 URL:

[jira] [Commented] (SPARK-32140) Add summary to FMClassificationModel

2020-06-30 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17148985#comment-17148985 ] Apache Spark commented on SPARK-32140: -- User 'huaxingao' has created a pull request for this issue:

[jira] [Assigned] (SPARK-32140) Add summary to FMClassificationModel

2020-06-30 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-32140: Assignee: (was: Apache Spark) > Add summary to FMClassificationModel >

[jira] [Commented] (SPARK-32140) Add summary to FMClassificationModel

2020-06-30 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17148984#comment-17148984 ] Apache Spark commented on SPARK-32140: -- User 'huaxingao' has created a pull request for this issue:

[jira] [Assigned] (SPARK-32140) Add summary to FMClassificationModel

2020-06-30 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-32140: Assignee: Apache Spark > Add summary to FMClassificationModel >

[jira] [Commented] (SPARK-32130) Spark 3.0 json load performance is unacceptable in comparison of Spark 2.4

2020-06-30 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17149019#comment-17149019 ] Jungtaek Lim commented on SPARK-32130: -- There might be some tricks to make type inference for

[jira] [Created] (SPARK-32141) Repartition leads to out of memory

2020-06-30 Thread Lekshmi Nair (Jira)
Lekshmi Nair created SPARK-32141: Summary: Repartition leads to out of memory Key: SPARK-32141 URL: https://issues.apache.org/jira/browse/SPARK-32141 Project: Spark Issue Type: Bug

[jira] [Assigned] (SPARK-32142) Keep the original tests and codes to avoid potential conflicts in dev in ParquetFilterSuite and ParquetIOSuite

2020-06-30 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-32142: Assignee: (was: Apache Spark) > Keep the original tests and codes to avoid potential

[jira] [Assigned] (SPARK-32142) Keep the original tests and codes to avoid potential conflicts in dev in ParquetFilterSuite and ParquetIOSuite

2020-06-30 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-32142: Assignee: Apache Spark > Keep the original tests and codes to avoid potential conflicts

[jira] [Commented] (SPARK-32142) Keep the original tests and codes to avoid potential conflicts in dev in ParquetFilterSuite and ParquetIOSuite

2020-06-30 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17149035#comment-17149035 ] Apache Spark commented on SPARK-32142: -- User 'HyukjinKwon' has created a pull request for this

[jira] [Updated] (SPARK-31935) Hadoop file system config should be effective in data source options

2020-06-30 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-31935: Fix Version/s: 2.4.7 > Hadoop file system config should be effective in data source options >

[jira] [Commented] (SPARK-32136) Spark producing incorrect groupBy results when key is a struct with nullable properties

2020-06-30 Thread Jason Moore (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17149073#comment-17149073 ] Jason Moore commented on SPARK-32136: - Here is a similar test, and why it's a problem for what I'm

[jira] [Commented] (SPARK-27780) Shuffle server & client should be versioned to enable smoother upgrade

2020-06-30 Thread Jason Moore (Jira)
[ https://issues.apache.org/jira/browse/SPARK-27780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17149083#comment-17149083 ] Jason Moore commented on SPARK-27780: - I encounter this on 3.0.0 running with a much older shuffle

[jira] [Assigned] (SPARK-32143) Fast fail when the AQE skew join produce too many splits

2020-06-30 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-32143: Assignee: (was: Apache Spark) > Fast fail when the AQE skew join produce too many

[jira] [Commented] (SPARK-32143) Fast fail when the AQE skew join produce too many splits

2020-06-30 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17149095#comment-17149095 ] Apache Spark commented on SPARK-32143: -- User 'LantaoJin' has created a pull request for this issue:
