[jira] [Commented] (SPARK-22575) Making Spark Thrift Server clean up its cache

2018-02-01 Thread Oz Ben-Ami (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348555#comment-16348555 ] Oz Ben-Ami commented on SPARK-22575: [~mgaido] The issue was apparently related to dynamic allocation

[jira] [Updated] (SPARK-23295) Exclude Waring message when generating versions in make-distribution.sh

2018-02-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-23295: -- Priority: Minor (was: Major) Issue Type: Improvement (was: Bug) > Exclude Waring message when

[jira] [Assigned] (SPARK-23235) Add executor Threaddump to api

2018-02-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23235: Assignee: Apache Spark > Add executor Threaddump to api > --

[jira] [Commented] (SPARK-23235) Add executor Threaddump to api

2018-02-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348616#comment-16348616 ] Apache Spark commented on SPARK-23235: -- User 'attilapiros' has created a pull request for this

[jira] [Updated] (SPARK-23300) Print out if Pandas and PyArrow are installed or not in tests

2018-02-01 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-23300: - Summary: Print out if Pandas and PyArrow are installed or not in tests (was: Print out if

[jira] [Commented] (SPARK-22575) Making Spark Thrift Server clean up its cache

2018-02-01 Thread Oz Ben-Ami (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348617#comment-16348617 ] Oz Ben-Ami commented on SPARK-22575: That makes sense. Not sure why we only saw this with STS, likely

[jira] [Assigned] (SPARK-23235) Add executor Threaddump to api

2018-02-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23235: Assignee: (was: Apache Spark) > Add executor Threaddump to api >

[jira] [Assigned] (SPARK-23202) Add new API in DataSourceWriter: onDataWriterCommit

2018-02-01 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-23202: --- Assignee: Gengliang Wang > Add new API in DataSourceWriter: onDataWriterCommit >

[jira] [Resolved] (SPARK-23202) Add new API in DataSourceWriter: onDataWriterCommit

2018-02-01 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-23202. - Resolution: Fixed Fix Version/s: 2.4.0 Issue resolved by pull request 20454

[jira] [Commented] (SPARK-22575) Making Spark Thrift Server clean up its cache

2018-02-01 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348622#comment-16348622 ] Marco Gaido commented on SPARK-22575: - I think STS is the only Spark application where this can

[jira] [Commented] (SPARK-23301) data source v2 column pruning with arbitrary expressions is broken

2018-02-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348658#comment-16348658 ] Apache Spark commented on SPARK-23301: -- User 'cloud-fan' has created a pull request for this issue:

[jira] [Assigned] (SPARK-23301) data source v2 column pruning with arbitrary expressions is broken

2018-02-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23301: Assignee: Wenchen Fan (was: Apache Spark) > data source v2 column pruning with arbitrary

[jira] [Assigned] (SPARK-23301) data source v2 column pruning with arbitrary expressions is broken

2018-02-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23301: Assignee: Apache Spark (was: Wenchen Fan) > data source v2 column pruning with arbitrary

[jira] [Created] (SPARK-23299) __repr__ broken for Rows instantiated with *args

2018-02-01 Thread Oli Hall (JIRA)
Oli Hall created SPARK-23299: Summary: __repr__ broken for Rows instantiated with *args Key: SPARK-23299 URL: https://issues.apache.org/jira/browse/SPARK-23299 Project: Spark Issue Type: Bug

[jira] [Assigned] (SPARK-23300) Print out if Pandas and PyArrow are installed or not in tests

2018-02-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23300: Assignee: (was: Apache Spark) > Print out if Pandas and PyArrow are installed or not

[jira] [Assigned] (SPARK-23300) Print out if Pandas and PyArrow are installed or not in tests

2018-02-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23300: Assignee: Apache Spark > Print out if Pandas and PyArrow are installed or not in tests >

[jira] [Commented] (SPARK-22575) Making Spark Thrift Server clean up its cache

2018-02-01 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348548#comment-16348548 ] Marco Gaido commented on SPARK-22575: - I am not able to reproduce the issue. May I ask you to provide

[jira] [Comment Edited] (SPARK-22575) Making Spark Thrift Server clean up its cache

2018-02-01 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348548#comment-16348548 ] Marco Gaido edited comment on SPARK-22575 at 2/1/18 1:06 PM: - I am not able

[jira] [Commented] (SPARK-23300) Print out if Pandas and PyArrow are installed or not in tests

2018-02-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348547#comment-16348547 ] Apache Spark commented on SPARK-23300: -- User 'HyukjinKwon' has created a pull request for this

[jira] [Created] (SPARK-23298) distinct.count on Dataset/DataFrame yields non-deterministic results

2018-02-01 Thread Mateusz Jukiewicz (JIRA)
Mateusz Jukiewicz created SPARK-23298: - Summary: distinct.count on Dataset/DataFrame yields non-deterministic results Key: SPARK-23298 URL: https://issues.apache.org/jira/browse/SPARK-23298

[jira] [Assigned] (SPARK-23256) Add columnSchema method to PySpark image reader

2018-02-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23256: Assignee: Apache Spark > Add columnSchema method to PySpark image reader >

[jira] [Commented] (SPARK-23256) Add columnSchema method to PySpark image reader

2018-02-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348643#comment-16348643 ] Apache Spark commented on SPARK-23256: -- User 'HyukjinKwon' has created a pull request for this

[jira] [Assigned] (SPARK-23256) Add columnSchema method to PySpark image reader

2018-02-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23256: Assignee: (was: Apache Spark) > Add columnSchema method to PySpark image reader >

[jira] [Commented] (SPARK-22575) Making Spark Thrift Server clean up its cache

2018-02-01 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348603#comment-16348603 ] Marco Gaido commented on SPARK-22575: - Then the problem is likely that the executors are killed in

[jira] [Created] (SPARK-23301) data source v2 column pruning with arbitrary expressions is broken

2018-02-01 Thread Wenchen Fan (JIRA)
Wenchen Fan created SPARK-23301: --- Summary: data source v2 column pruning with arbitrary expressions is broken Key: SPARK-23301 URL: https://issues.apache.org/jira/browse/SPARK-23301 Project: Spark

[jira] [Created] (SPARK-23300) Print out if NumPy, Pandas and PyArrow are installed or not in tests

2018-02-01 Thread Hyukjin Kwon (JIRA)
Hyukjin Kwon created SPARK-23300: Summary: Print out if NumPy, Pandas and PyArrow are installed or not in tests Key: SPARK-23300 URL: https://issues.apache.org/jira/browse/SPARK-23300 Project: Spark

[jira] [Commented] (SPARK-22751) Improve ML RandomForest shuffle performance

2018-02-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348430#comment-16348430 ] Apache Spark commented on SPARK-22751: -- User 'lucio-yz' has created a pull request for this issue:

[jira] [Assigned] (SPARK-22751) Improve ML RandomForest shuffle performance

2018-02-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-22751: Assignee: Apache Spark > Improve ML RandomForest shuffle performance >

[jira] [Assigned] (SPARK-22751) Improve ML RandomForest shuffle performance

2018-02-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-22751: Assignee: (was: Apache Spark) > Improve ML RandomForest shuffle performance >

[jira] [Resolved] (SPARK-23289) OneForOneBlockFetcher.DownloadCallback.onData may write just a part of data

2018-02-01 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-23289. - Resolution: Fixed Fix Version/s: 2.3.0 Issue resolved by pull request 20461

[jira] [Updated] (SPARK-23296) Diagnostics message for user code exceptions should include the stacktrace

2018-02-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-23296: -- Shepherd: (was: Sean Owen) > Diagnostics message for user code exceptions should include the

[jira] [Updated] (SPARK-23296) Diagnostics message for user code exceptions should include the stacktrace

2018-02-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-23296: -- Priority: Trivial (was: Major) > Diagnostics message for user code exceptions should include the

[jira] [Created] (SPARK-23303) improve the explain result for data source v2 relations

2018-02-01 Thread Wenchen Fan (JIRA)
Wenchen Fan created SPARK-23303: --- Summary: improve the explain result for data source v2 relations Key: SPARK-23303 URL: https://issues.apache.org/jira/browse/SPARK-23303 Project: Spark Issue

[jira] [Updated] (SPARK-23305) Add `spark.sql.files.ignoreMissingFiles` test case for ORC data source

2018-02-01 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-23305: -- Summary: Add `spark.sql.files.ignoreMissingFiles` test case for ORC data source (was: Add

[jira] [Updated] (SPARK-23305) Add `spark.sql.files.ignoreMissingFiles` test case for ORC

2018-02-01 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-23305: -- Summary: Add `spark.sql.files.ignoreMissingFiles` test case for ORC (was: Add

[jira] [Updated] (SPARK-23290) inadvertent change in handling of DateType when converting to pandas dataframe

2018-02-01 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-23290: Priority: Blocker (was: Major) > inadvertent change in handling of DateType when converting to pandas

[jira] [Resolved] (SPARK-23301) data source v2 column pruning with arbitrary expressions is broken

2018-02-01 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-23301. - Resolution: Fixed Fix Version/s: 2.3.0 > data source v2 column pruning with arbitrary expressions

[jira] [Commented] (SPARK-23292) python tests related to pandas are skipped

2018-02-01 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349040#comment-16349040 ] Yin Huai commented on SPARK-23292: -- So, jenkins does have the right version of pandas and pyarrow for

[jira] [Created] (SPARK-23305) Add `spark.sql.files.ignoreMissingFiles` for ORC data source

2018-02-01 Thread Dongjoon Hyun (JIRA)
Dongjoon Hyun created SPARK-23305: - Summary: Add `spark.sql.files.ignoreMissingFiles` for ORC data source Key: SPARK-23305 URL: https://issues.apache.org/jira/browse/SPARK-23305 Project: Spark

[jira] [Assigned] (SPARK-23305) Add `spark.sql.files.ignoreMissingFiles` test case for ORC

2018-02-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23305: Assignee: (was: Apache Spark) > Add `spark.sql.files.ignoreMissingFiles` test case

[jira] [Assigned] (SPARK-23305) Add `spark.sql.files.ignoreMissingFiles` test case for ORC

2018-02-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23305: Assignee: Apache Spark > Add `spark.sql.files.ignoreMissingFiles` test case for ORC >

[jira] [Commented] (SPARK-23305) Add `spark.sql.files.ignoreMissingFiles` test case for ORC

2018-02-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349054#comment-16349054 ] Apache Spark commented on SPARK-23305: -- User 'dongjoon-hyun' has created a pull request for this

[jira] [Commented] (SPARK-23107) ML, Graph 2.3 QA: API: New Scala APIs, docs

2018-02-01 Thread Sameer Agarwal (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349090#comment-16349090 ] Sameer Agarwal commented on SPARK-23107: Thanks [~yanboliang], I'll cut the next RC as soon as

[jira] [Commented] (SPARK-23304) Spark SQL coalesce() against hive not working

2018-02-01 Thread Sameer Agarwal (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349147#comment-16349147 ] Sameer Agarwal commented on SPARK-23304: [~tgraves] just to rule out the obvious, was there a

[jira] [Created] (SPARK-23306) Race condition in TaskMemoryManager

2018-02-01 Thread Zhan Zhang (JIRA)
Zhan Zhang created SPARK-23306: -- Summary: Race condition in TaskMemoryManager Key: SPARK-23306 URL: https://issues.apache.org/jira/browse/SPARK-23306 Project: Spark Issue Type: Bug

[jira] [Created] (SPARK-23304) Spark SQL coalesce() against hive not working

2018-02-01 Thread Thomas Graves (JIRA)
Thomas Graves created SPARK-23304: - Summary: Spark SQL coalesce() against hive not working Key: SPARK-23304 URL: https://issues.apache.org/jira/browse/SPARK-23304 Project: Spark Issue Type:

[jira] [Commented] (SPARK-15473) CSV fails to write and read back empty dataframe

2018-02-01 Thread Andrey Taptunov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349075#comment-16349075 ] Andrey Taptunov commented on SPARK-15473: - Just want to point out that example from description

[jira] [Resolved] (SPARK-13983) HiveThriftServer2 can not get "--hiveconf" or ''--hivevar" variables since 1.6 version (both multi-session and single session)

2018-02-01 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-13983. - Resolution: Fixed Assignee: Yuming Wang (was: Cheng Lian) Fix Version/s: 2.3.0 >

[jira] [Assigned] (SPARK-23304) Spark SQL coalesce() against hive not working

2018-02-01 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li reassigned SPARK-23304: --- Assignee: Xiao Li > Spark SQL coalesce() against hive not working >

[jira] [Updated] (SPARK-23304) Spark SQL coalesce() against hive not working

2018-02-01 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-23304: Target Version/s: 2.3.0 > Spark SQL coalesce() against hive not working >

[jira] [Commented] (SPARK-23294) Spark Streaming + Rate source + Console Sink : Receiver MaxRate is violated

2018-02-01 Thread Ravinder Matte (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349111#comment-16349111 ] Ravinder Matte commented on SPARK-23294: [~maropu] Please do let me know if you need any

[jira] [Commented] (SPARK-8835) Provide pluggable Congestion Strategies to deal with Streaming load

2018-02-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348886#comment-16348886 ] Apache Spark commented on SPARK-8835: - User 'EmergentOrder' has created a pull request for this issue:

[jira] [Assigned] (SPARK-23303) improve the explain result for data source v2 relations

2018-02-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23303: Assignee: Apache Spark (was: Wenchen Fan) > improve the explain result for data source

[jira] [Commented] (SPARK-18016) Code Generation: Constant Pool Past Limit for Wide/Nested Dataset

2018-02-01 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348980#comment-16348980 ] Kazuaki Ishizaki commented on SPARK-18016: -- This issue has been fixed in Spark 2.3 or master. It

[jira] [Commented] (SPARK-19275) Spark Streaming, Kafka receiver, "Failed to get records for ... after polling for 512"

2018-02-01 Thread Riccardo Vincelli (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348996#comment-16348996 ] Riccardo Vincelli commented on SPARK-19275: --- Hi, I would like to point out that a timeout could

[jira] [Created] (SPARK-23302) Refactor group aggregate pandas UDF to its own catalyst rules

2018-02-01 Thread Li Jin (JIRA)
Li Jin created SPARK-23302: -- Summary: Refactor group aggregate pandas UDF to its own catalyst rules Key: SPARK-23302 URL: https://issues.apache.org/jira/browse/SPARK-23302 Project: Spark Issue

[jira] [Commented] (SPARK-23235) Add executor Threaddump to api

2018-02-01 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348758#comment-16348758 ] Imran Rashid commented on SPARK-23235: -- Just brainstorming I'd like to get some opinions on: I

[jira] [Commented] (SPARK-23303) improve the explain result for data source v2 relations

2018-02-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348737#comment-16348737 ] Apache Spark commented on SPARK-23303: -- User 'cloud-fan' has created a pull request for this issue:

[jira] [Assigned] (SPARK-23303) improve the explain result for data source v2 relations

2018-02-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23303: Assignee: Wenchen Fan (was: Apache Spark) > improve the explain result for data source

[jira] [Updated] (SPARK-23307) Spark UI should sort jobs/stages with the completed timestamp before cleaning up them

2018-02-01 Thread Sameer Agarwal (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sameer Agarwal updated SPARK-23307: --- Priority: Blocker (was: Major) > Spark UI should sort jobs/stages with the completed

[jira] [Updated] (SPARK-23307) Spark UI should sort jobs/stages with the completed timestamp before cleaning up them

2018-02-01 Thread Sameer Agarwal (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sameer Agarwal updated SPARK-23307: --- Target Version/s: 2.3.0 > Spark UI should sort jobs/stages with the completed timestamp

[jira] [Commented] (SPARK-23304) Spark SQL coalesce() against hive not working

2018-02-01 Thread Sameer Agarwal (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349346#comment-16349346 ] Sameer Agarwal commented on SPARK-23304: Also, is there a JIRA/repro for the caching issue you

[jira] [Commented] (SPARK-23304) Spark SQL coalesce() against hive not working

2018-02-01 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349367#comment-16349367 ] Thomas Graves commented on SPARK-23304: --- ok I've attached 2 files one with spark 2.3 and one with

[jira] [Assigned] (SPARK-23307) Spark UI should sort jobs/stages with the completed timestamp before cleaning up them

2018-02-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23307: Assignee: Shixiong Zhu (was: Apache Spark) > Spark UI should sort jobs/stages with the

[jira] [Created] (SPARK-23308) ignoreCorruptFiles should not ignore retryable IOException

2018-02-01 Thread Márcio Furlani Carmona (JIRA)
Márcio Furlani Carmona created SPARK-23308: -- Summary: ignoreCorruptFiles should not ignore retryable IOException Key: SPARK-23308 URL: https://issues.apache.org/jira/browse/SPARK-23308

[jira] [Commented] (SPARK-23307) Spark UI should sort jobs/stages with the completed timestamp before cleaning up them

2018-02-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349382#comment-16349382 ] Apache Spark commented on SPARK-23307: -- User 'zsxwing' has created a pull request for this issue:

[jira] [Assigned] (SPARK-23307) Spark UI should sort jobs/stages with the completed timestamp before cleaning up them

2018-02-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23307: Assignee: Apache Spark (was: Shixiong Zhu) > Spark UI should sort jobs/stages with the

[jira] [Commented] (SPARK-23294) Spark Streaming + Rate source + Console Sink : Receiver MaxRate is violated

2018-02-01 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349411#comment-16349411 ] Shixiong Zhu commented on SPARK-23294: -- [~rmatte] the configurations you posted in the ticket is for

[jira] [Resolved] (SPARK-23294) Spark Streaming + Rate source + Console Sink : Receiver MaxRate is violated

2018-02-01 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-23294. -- Resolution: Not A Problem > Spark Streaming + Rate source + Console Sink : Receiver MaxRate is

[jira] [Commented] (SPARK-23304) Spark SQL coalesce() against hive not working

2018-02-01 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349446#comment-16349446 ] Thomas Graves commented on SPARK-23304: --- It still seems like a bug to me since the coalesce isn't

[jira] [Commented] (SPARK-23304) Spark SQL coalesce() against hive not working

2018-02-01 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349444#comment-16349444 ] Thomas Graves commented on SPARK-23304: --- [~smilegator] just to make sure you saw my comment above,

[jira] [Commented] (SPARK-19371) Cannot spread cached partitions evenly across executors

2018-02-01 Thread Dan Meany (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349440#comment-16349440 ] Dan Meany commented on SPARK-19371: --- We have had this issue on many occasions and nothing I tried

[jira] [Commented] (SPARK-23304) Spark SQL coalesce() against hive not working

2018-02-01 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349212#comment-16349212 ] Thomas Graves commented on SPARK-23304: --- yes there are difference in the # of partitions between

[jira] [Commented] (SPARK-23304) Spark SQL coalesce() against hive not working

2018-02-01 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349220#comment-16349220 ] Thomas Graves commented on SPARK-23304: --- If it helps , spark 2.3 # partitions is 317531 and spark

[jira] [Assigned] (SPARK-23307) Spark UI should sort jobs/stages with the completed timestamp before cleaning up them

2018-02-01 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu reassigned SPARK-23307: Assignee: Shixiong Zhu > Spark UI should sort jobs/stages with the completed timestamp

[jira] [Updated] (SPARK-23292) python tests related to pandas are skipped

2018-02-01 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated SPARK-23292: - Priority: Critical (was: Blocker) > python tests related to pandas are skipped >

[jira] [Commented] (SPARK-23304) Spark SQL coalesce() against hive not working

2018-02-01 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349340#comment-16349340 ] Xiao Li commented on SPARK-23304: - I do not think our native ORC reader respects

[jira] [Assigned] (SPARK-23306) Race condition in TaskMemoryManager

2018-02-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23306: Assignee: (was: Apache Spark) > Race condition in TaskMemoryManager >

[jira] [Assigned] (SPARK-23306) Race condition in TaskMemoryManager

2018-02-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23306: Assignee: Apache Spark > Race condition in TaskMemoryManager >

[jira] [Commented] (SPARK-23306) Race condition in TaskMemoryManager

2018-02-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349203#comment-16349203 ] Apache Spark commented on SPARK-23306: -- User 'zhzhan' has created a pull request for this issue:

[jira] [Commented] (SPARK-23304) Spark SQL coalesce() against hive not working

2018-02-01 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349227#comment-16349227 ] Thomas Graves commented on SPARK-23304: --- Ok, I just realized what you are getting at, I tried on

[jira] [Updated] (SPARK-23304) Spark SQL coalesce() against hive not working

2018-02-01 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated SPARK-23304: -- Priority: Major (was: Blocker) > Spark SQL coalesce() against hive not working >

[jira] [Commented] (SPARK-23304) Spark SQL coalesce() against hive not working

2018-02-01 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349339#comment-16349339 ] Xiao Li commented on SPARK-23304: - Hi, [~tgraves], could you change the two SQLConf `spark.sql.orc.impl`

[jira] [Updated] (SPARK-23304) Spark SQL coalesce() against hive not working

2018-02-01 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated SPARK-23304: -- Description: The query below seems to ignore the coalesce. This is running spark 2.2 or spark

[jira] [Created] (SPARK-23307) Spark UI should sort jobs/stages with the completed timestamp before cleaning up them

2018-02-01 Thread Shixiong Zhu (JIRA)
Shixiong Zhu created SPARK-23307: Summary: Spark UI should sort jobs/stages with the completed timestamp before cleaning up them Key: SPARK-23307 URL: https://issues.apache.org/jira/browse/SPARK-23307

[jira] [Commented] (SPARK-23307) Spark UI should sort jobs/stages with the completed timestamp before cleaning up them

2018-02-01 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349288#comment-16349288 ] Shixiong Zhu commented on SPARK-23307: -- cc [~vanzin] [~cloud_fan] > Spark UI should sort

[jira] [Updated] (SPARK-23304) Spark SQL coalesce() against hive not working

2018-02-01 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated SPARK-23304: -- Attachment: spark23_oldorc_explain.txt spark22_oldorc_explain.txt > Spark SQL

[jira] [Commented] (SPARK-23304) Spark SQL coalesce() against hive not working

2018-02-01 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349369#comment-16349369 ] Thomas Graves commented on SPARK-23304: --- Note I've removed some of the columns from the output, if

[jira] [Assigned] (SPARK-23296) Diagnostics message for user code exceptions should include the stacktrace

2018-02-01 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin reassigned SPARK-23296: -- Assignee: Gera Shegalov > Diagnostics message for user code exceptions should include

[jira] [Resolved] (SPARK-23296) Diagnostics message for user code exceptions should include the stacktrace

2018-02-01 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-23296. Resolution: Fixed Fix Version/s: 2.3.0 Issue resolved by pull request 20470

[jira] [Commented] (SPARK-23309) Spark 2.3 cached query performance 20-30% worse then spark 2.2

2018-02-01 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349483#comment-16349483 ] Thomas Graves commented on SPARK-23309: --- Sure, I can also run with the --conf

[jira] [Commented] (SPARK-23304) Spark SQL coalesce() against hive not working

2018-02-01 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349270#comment-16349270 ] Thomas Graves commented on SPARK-23304: --- So with the new ORC code, is there any way to control the #

[jira] [Updated] (SPARK-23307) Spark UI should sort jobs/stages with the completed timestamp before cleaning up them

2018-02-01 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-23307: - Description: When you have a long running job, it may be deleted from UI quickly when it

[jira] [Commented] (SPARK-12297) Add work-around for Parquet/Hive int96 timestamp bug.

2018-02-01 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349325#comment-16349325 ] Yin Huai commented on SPARK-12297: -- [~zi] has this issue got resolved in Hive? I see HIVE-12767 is still

[jira] [Commented] (SPARK-23307) Spark UI should sort jobs/stages with the completed timestamp before cleaning up them

2018-02-01 Thread Sameer Agarwal (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349323#comment-16349323 ] Sameer Agarwal commented on SPARK-23307: Bumping this to a blocker for 2.3 > Spark UI should

[jira] [Commented] (SPARK-23309) Spark 2.3 cached query performance 20-30% worse then spark 2.2

2018-02-01 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349431#comment-16349431 ] Thomas Graves commented on SPARK-23309: --- I'm curious if anyone else is seeing the same behavior? 

[jira] [Commented] (SPARK-23304) Spark SQL coalesce() against hive not working

2018-02-01 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349436#comment-16349436 ] Xiao Li commented on SPARK-23304: - In this release, we also made a change in the default of another

[jira] [Commented] (SPARK-23309) Spark 2.3 cached query performance 20-30% worse then spark 2.2

2018-02-01 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349470#comment-16349470 ] Xiao Li commented on SPARK-23309: - [~tgraves] Could you first run count before you run the show?

[jira] [Updated] (SPARK-23309) Spark 2.3 cached query performance 20-30% worse then spark 2.2

2018-02-01 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-23309: Priority: Blocker (was: Major) > Spark 2.3 cached query performance 20-30% worse then spark 2.2 >

[jira] [Updated] (SPARK-23304) Spark SQL coalesce() against hive not working

2018-02-01 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated SPARK-23304: -- Attachment: spark23_oldorc_explain_convermetastoreorcfalse.txt > Spark SQL coalesce() against
