[jira] [Assigned] (SPARK-24966) Fix the precedence rule for set operations.

2018-07-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24966: Assignee: Apache Spark > Fix the precedence rule for set operations. >

[jira] [Commented] (SPARK-24966) Fix the precedence rule for set operations.

2018-07-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16564772#comment-16564772 ] Apache Spark commented on SPARK-24966: -- User 'dilipbiswal' has created a pull request for this

[jira] [Assigned] (SPARK-24966) Fix the precedence rule for set operations.

2018-07-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24966: Assignee: (was: Apache Spark) > Fix the precedence rule for set operations. >

[jira] [Resolved] (SPARK-24951) Table valued functions should throw AnalysisException instead of IllegalArgumentException

2018-07-31 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-24951. - Resolution: Fixed Fix Version/s: 2.4.0 > Table valued functions should throw AnalysisException

[jira] [Resolved] (SPARK-24311) Refactor HDFSBackedStateStoreProvider to remove duplicated logic between operations on delta file and snapshot file

2018-07-31 Thread Jungtaek Lim (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim resolved SPARK-24311. -- Resolution: Won't Fix As I got feedback on

[jira] [Resolved] (SPARK-24893) Remove the entire CaseWhen if all the outputs are semantic equivalence

2018-07-31 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-24893. - Resolution: Fixed Fix Version/s: 2.4.0 Issue resolved by pull request 21852

[jira] [Created] (SPARK-24986) OOM in BufferHolder during writes to a stream

2018-07-31 Thread Sanket Reddy (JIRA)
Sanket Reddy created SPARK-24986: Summary: OOM in BufferHolder during writes to a stream Key: SPARK-24986 URL: https://issues.apache.org/jira/browse/SPARK-24986 Project: Spark Issue Type:

[jira] [Created] (SPARK-24985) Executing SQL with "Full Outer Join" on top of large tables when there is data skew met OOM

2018-07-31 Thread sheperd huang (JIRA)
sheperd huang created SPARK-24985: - Summary: Executing SQL with "Full Outer Join" on top of large tables when there is data skew met OOM Key: SPARK-24985 URL: https://issues.apache.org/jira/browse/SPARK-24985

[jira] [Commented] (SPARK-23874) Upgrade apache/arrow to 0.10.0

2018-07-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16564570#comment-16564570 ] Apache Spark commented on SPARK-23874: -- User 'BryanCutler' has created a pull request for this

[jira] [Commented] (SPARK-24615) Accelerator-aware task scheduling for Spark

2018-07-31 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16564562#comment-16564562 ] Saisai Shao commented on SPARK-24615: - Leveraging dynamic allocation to tear down and bring up

[jira] [Comment Edited] (SPARK-24615) Accelerator-aware task scheduling for Spark

2018-07-31 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16564562#comment-16564562 ] Saisai Shao edited comment on SPARK-24615 at 8/1/18 12:35 AM: -- Leveraging

[jira] [Commented] (SPARK-24615) Accelerator-aware task scheduling for Spark

2018-07-31 Thread Erik Erlandson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16564558#comment-16564558 ] Erik Erlandson commented on SPARK-24615: Am I understanding correctly that this can't assign

[jira] [Resolved] (SPARK-24976) Allow None for Decimal type conversion (specific to PyArrow 0.9.0)

2018-07-31 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler resolved SPARK-24976. -- Resolution: Fixed Fix Version/s: 2.3.2 2.4.0 Issue resolved by pull

[jira] [Assigned] (SPARK-24976) Allow None for Decimal type conversion (specific to PyArrow 0.9.0)

2018-07-31 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler reassigned SPARK-24976: Assignee: Hyukjin Kwon > Allow None for Decimal type conversion (specific to PyArrow

[jira] [Commented] (SPARK-19394) "assertion failed: Expected hostname" on macOS when self-assigned IP contains a percent sign

2018-07-31 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16564532#comment-16564532 ] Yuming Wang commented on SPARK-19394: - Try to add {{::1             localhost}} to /etc/hosts. >

[jira] [Commented] (SPARK-24977) input_file_name() result can't save and use for partitionBy()

2018-07-31 Thread kevin yu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16564490#comment-16564490 ] kevin yu commented on SPARK-24977: -- Hello Srinivasarao: Can you show the steps you encountered the

[jira] [Created] (SPARK-24984) Spark Streaming with xml data

2018-07-31 Thread Kavya (JIRA)
Kavya created SPARK-24984: - Summary: Spark Streaming with xml data Key: SPARK-24984 URL: https://issues.apache.org/jira/browse/SPARK-24984 Project: Spark Issue Type: New Feature

[jira] [Updated] (SPARK-24983) Collapsing multiple project statements with dependent When-Otherwise statements on the same column can OOM the driver

2018-07-31 Thread David Vogelbacher (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Vogelbacher updated SPARK-24983: -- Description: I noticed that writing a spark job that includes many sequential

[jira] [Created] (SPARK-24983) Collapsing multiple project statements with dependent When-Otherwise statements on the same column can OOM the driver

2018-07-31 Thread David Vogelbacher (JIRA)
David Vogelbacher created SPARK-24983: - Summary: Collapsing multiple project statements with dependent When-Otherwise statements on the same column can OOM the driver Key: SPARK-24983 URL:

[jira] [Assigned] (SPARK-24982) UDAF resolution should not throw java.lang.AssertionError

2018-07-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24982: Assignee: Reynold Xin (was: Apache Spark) > UDAF resolution should not throw

[jira] [Commented] (SPARK-24982) UDAF resolution should not throw java.lang.AssertionError

2018-07-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16564364#comment-16564364 ] Apache Spark commented on SPARK-24982: -- User 'rxin' has created a pull request for this issue:

[jira] [Assigned] (SPARK-24982) UDAF resolution should not throw java.lang.AssertionError

2018-07-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24982: Assignee: Apache Spark (was: Reynold Xin) > UDAF resolution should not throw

[jira] [Assigned] (SPARK-24982) UDAF resolution should not throw java.lang.AssertionError

2018-07-31 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin reassigned SPARK-24982: --- Assignee: Reynold Xin > UDAF resolution should not throw java.lang.AssertionError >

[jira] [Comment Edited] (SPARK-24882) data source v2 API improvement

2018-07-31 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16564362#comment-16564362 ] Ryan Blue edited comment on SPARK-24882 at 7/31/18 9:02 PM: {quote}the

[jira] [Commented] (SPARK-24882) data source v2 API improvement

2018-07-31 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16564362#comment-16564362 ] Ryan Blue commented on SPARK-24882: --- {quote}the problem is then we need to make `CatalogSupport` a

[jira] [Assigned] (SPARK-24973) Add numIter to Python ClusteringSummary

2018-07-31 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen reassigned SPARK-24973: - Assignee: Huaxin Gao > Add numIter to Python ClusteringSummary >

[jira] [Updated] (SPARK-18057) Update structured streaming kafka from 0.10.0.1 to 2.0.0

2018-07-31 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-18057: - Summary: Update structured streaming kafka from 0.10.0.1 to 2.0.0 (was: Update structured

[jira] [Resolved] (SPARK-18057) Update structured streaming kafka from 0.10.0.1 to 2.0.0

2018-07-31 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-18057. -- Resolution: Fixed Assignee: Ted Yu Fix Version/s: 2.4.0 > Update structured

[jira] [Commented] (SPARK-23914) High-order function: array_union(x, y) → array

2018-07-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16564227#comment-16564227 ] Apache Spark commented on SPARK-23914: -- User 'kiszk' has created a pull request for this issue:

[jira] [Updated] (SPARK-24982) UDAF resolution should not throw java.lang.AssertionError

2018-07-31 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-24982: Description: See udaf.sql.out: {code:java} -- !query 3 SELECT default.myDoubleAvg(int_col1, 3)

[jira] [Created] (SPARK-24982) UDAF resolution should not throw java.lang.AssertionError

2018-07-31 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-24982: --- Summary: UDAF resolution should not throw java.lang.AssertionError Key: SPARK-24982 URL: https://issues.apache.org/jira/browse/SPARK-24982 Project: Spark

[jira] [Commented] (SPARK-24981) ShutdownHook timeout causes job to fail when succeeded when SparkContext stop() not called by user program

2018-07-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16564175#comment-16564175 ] Apache Spark commented on SPARK-24981: -- User 'hthuynh2' has created a pull request for this issue:

[jira] [Assigned] (SPARK-24981) ShutdownHook timeout causes job to fail when succeeded when SparkContext stop() not called by user program

2018-07-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24981: Assignee: Apache Spark > ShutdownHook timeout causes job to fail when succeeded when

[jira] [Assigned] (SPARK-24981) ShutdownHook timeout causes job to fail when succeeded when SparkContext stop() not called by user program

2018-07-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24981: Assignee: (was: Apache Spark) > ShutdownHook timeout causes job to fail when

[jira] [Commented] (SPARK-14643) Remove overloaded methods which become ambiguous in Scala 2.12

2018-07-31 Thread Stavros Kontopoulos (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16564172#comment-16564172 ] Stavros Kontopoulos commented on SPARK-14643: - Cool thanks! > Remove overloaded methods

[jira] [Comment Edited] (SPARK-14643) Remove overloaded methods which become ambiguous in Scala 2.12

2018-07-31 Thread Stavros Kontopoulos (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16564148#comment-16564148 ] Stavros Kontopoulos edited comment on SPARK-14643 at 7/31/18 6:38 PM:

[jira] [Resolved] (SPARK-24609) PySpark/SparkR doc doesn't explain RandomForestClassifier.featureSubsetStrategy well

2018-07-31 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-24609. --- Resolution: Fixed Fix Version/s: 2.4.0 Issue resolved by pull request 21788

[jira] [Updated] (SPARK-24609) PySpark/SparkR doc doesn't explain RandomForestClassifier.featureSubsetStrategy well

2018-07-31 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-24609: -- Priority: Minor (was: Major) Issue Type: Improvement (was: Bug) > PySpark/SparkR doc doesn't

[jira] [Assigned] (SPARK-24609) PySpark/SparkR doc doesn't explain RandomForestClassifier.featureSubsetStrategy well

2018-07-31 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen reassigned SPARK-24609: - Assignee: zhengruifeng > PySpark/SparkR doc doesn't explain >

[jira] [Commented] (SPARK-24980) add support for pandas/arrow etc for python2.7 and pypy builds

2018-07-31 Thread shane knapp (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16564154#comment-16564154 ] shane knapp commented on SPARK-24980: - [~hyukjin.kwon] thought you might be interested in this!  ;)

[jira] [Comment Edited] (SPARK-14643) Remove overloaded methods which become ambiguous in Scala 2.12

2018-07-31 Thread Stavros Kontopoulos (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16564148#comment-16564148 ] Stavros Kontopoulos edited comment on SPARK-14643 at 7/31/18 6:34 PM:

[jira] [Commented] (SPARK-14643) Remove overloaded methods which become ambiguous in Scala 2.12

2018-07-31 Thread Stavros Kontopoulos (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16564148#comment-16564148 ] Stavros Kontopoulos commented on SPARK-14643: - [~srowen] Right now there is no such change

[jira] [Commented] (SPARK-24773) support reading AVRO logical types - Timestamp with different precisions

2018-07-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16564145#comment-16564145 ] Apache Spark commented on SPARK-24773: -- User 'gengliangwang' has created a pull request for this

[jira] [Assigned] (SPARK-24773) support reading AVRO logical types - Timestamp with different precisions

2018-07-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24773: Assignee: (was: Apache Spark) > support reading AVRO logical types - Timestamp with

[jira] [Assigned] (SPARK-24773) support reading AVRO logical types - Timestamp with different precisions

2018-07-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24773: Assignee: Apache Spark > support reading AVRO logical types - Timestamp with different

[jira] [Resolved] (SPARK-24973) Add numIter to Python ClusteringSummary

2018-07-31 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-24973. --- Resolution: Duplicate I agree, though I think this should be thought of as one big issue. > Add

[jira] [Updated] (SPARK-24981) ShutdownHook timeout causes job to fail when succeeded when SparkContext stop() not called by user program

2018-07-31 Thread Hieu Tri Huynh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hieu Tri Huynh updated SPARK-24981: --- Priority: Minor (was: Major) > ShutdownHook timeout causes job to fail when succeeded when

[jira] [Updated] (SPARK-24981) ShutdownHook timeout causes job to fail when succeeded when SparkContext stop() not called by user program

2018-07-31 Thread Hieu Tri Huynh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hieu Tri Huynh updated SPARK-24981: --- Summary: ShutdownHook timeout causes job to fail when succeeded when SparkContext stop()

[jira] [Created] (SPARK-24981) ShutdownHook timeout causes job to fail when succeeded when SparkContext stop() not called

2018-07-31 Thread Hieu Tri Huynh (JIRA)
Hieu Tri Huynh created SPARK-24981: -- Summary: ShutdownHook timeout causes job to fail when succeeded when SparkContext stop() not called Key: SPARK-24981 URL: https://issues.apache.org/jira/browse/SPARK-24981

[jira] [Updated] (SPARK-24287) Spark -packages option should support classifier, no-transitive, and custom conf

2018-07-31 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-24287: -- Issue Type: Improvement (was: Bug) This covers a lot of the same ground as 

[jira] [Commented] (SPARK-24951) Table valued functions should throw AnalysisException instead of IllegalArgumentException

2018-07-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16564075#comment-16564075 ] Apache Spark commented on SPARK-24951: -- User 'rxin' has created a pull request for this issue:

[jira] [Assigned] (SPARK-24951) Table valued functions should throw AnalysisException instead of IllegalArgumentException

2018-07-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24951: Assignee: Reynold Xin (was: Apache Spark) > Table valued functions should throw

[jira] [Assigned] (SPARK-24951) Table valued functions should throw AnalysisException instead of IllegalArgumentException

2018-07-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24951: Assignee: Apache Spark (was: Reynold Xin) > Table valued functions should throw

[jira] [Updated] (SPARK-24971) remove SupportsDeprecatedScanRow

2018-07-31 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-24971: Issue Type: Sub-task (was: Improvement) Parent: SPARK-22386 > remove

[jira] [Updated] (SPARK-24974) Spark put all file's paths into SharedInMemoryCache even for unused partitions.

2018-07-31 Thread andrzej.stankev...@gmail.com (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] andrzej.stankev...@gmail.com updated SPARK-24974: - Description: SharedInMemoryCache has all  filestatus no matter

[jira] [Created] (SPARK-24980) add support for pandas/arrow etc for python2.7 and pypy builds

2018-07-31 Thread shane knapp (JIRA)
shane knapp created SPARK-24980: --- Summary: add support for pandas/arrow etc for python2.7 and pypy builds Key: SPARK-24980 URL: https://issues.apache.org/jira/browse/SPARK-24980 Project: Spark

[jira] [Commented] (SPARK-14643) Remove overloaded methods which become ambiguous in Scala 2.12

2018-07-31 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16563998#comment-16563998 ] Sean Owen commented on SPARK-14643: --- [~skonto] is the conclusion that there's a change involving the 

[jira] [Updated] (SPARK-14540) Support Scala 2.12 closures and Java 8 lambdas in ClosureCleaner

2018-07-31 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-14540: -- Labels: release-notes (was: ) > Support Scala 2.12 closures and Java 8 lambdas in ClosureCleaner >

[jira] [Assigned] (SPARK-24917) Sending a partition over netty results in 2x memory usage

2018-07-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24917: Assignee: (was: Apache Spark) > Sending a partition over netty results in 2x memory

[jira] [Assigned] (SPARK-24917) Sending a partition over netty results in 2x memory usage

2018-07-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24917: Assignee: Apache Spark > Sending a partition over netty results in 2x memory usage >

[jira] [Commented] (SPARK-24917) Sending a partition over netty results in 2x memory usage

2018-07-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16563918#comment-16563918 ] Apache Spark commented on SPARK-24917: -- User 'vincent-grosbois' has created a pull request for this

[jira] [Assigned] (SPARK-24979) add AnalysisHelper#resolveOperatorsUp

2018-07-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24979: Assignee: Apache Spark (was: Wenchen Fan) > add AnalysisHelper#resolveOperatorsUp >

[jira] [Commented] (SPARK-24979) add AnalysisHelper#resolveOperatorsUp

2018-07-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16563896#comment-16563896 ] Apache Spark commented on SPARK-24979: -- User 'cloud-fan' has created a pull request for this issue:

[jira] [Assigned] (SPARK-24979) add AnalysisHelper#resolveOperatorsUp

2018-07-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24979: Assignee: Wenchen Fan (was: Apache Spark) > add AnalysisHelper#resolveOperatorsUp >

[jira] [Updated] (SPARK-24979) add AnalysisHelper#resolveOperatorsUp

2018-07-31 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-24979: Summary: add AnalysisHelper#resolveOperatorsUp (was: add resolveOperatorsUp) > add

[jira] [Created] (SPARK-24979) add resolveOperatorsUp

2018-07-31 Thread Wenchen Fan (JIRA)
Wenchen Fan created SPARK-24979: --- Summary: add resolveOperatorsUp Key: SPARK-24979 URL: https://issues.apache.org/jira/browse/SPARK-24979 Project: Spark Issue Type: Improvement

[jira] [Resolved] (SPARK-24536) Query with nonsensical LIMIT hits AssertionError

2018-07-31 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-24536. - Resolution: Fixed Fix Version/s: 2.4.0 2.3.2 > Query with nonsensical LIMIT

[jira] [Comment Edited] (SPARK-24615) Accelerator-aware task scheduling for Spark

2018-07-31 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16563730#comment-16563730 ] Saisai Shao edited comment on SPARK-24615 at 7/31/18 2:16 PM: -- Hi

[jira] [Commented] (SPARK-24615) Accelerator-aware task scheduling for Spark

2018-07-31 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16563730#comment-16563730 ] Saisai Shao commented on SPARK-24615: - Hi [~tgraves], I think eval() might unnecessarily break the

[jira] [Commented] (SPARK-24615) Accelerator-aware task scheduling for Spark

2018-07-31 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16563711#comment-16563711 ] Thomas Graves commented on SPARK-24615: --- so I guess my question is this the right approach at all. 

[jira] [Commented] (SPARK-24579) SPIP: Standardize Optimized Data Exchange between Spark and DL/AI frameworks

2018-07-31 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16563689#comment-16563689 ] Thomas Graves commented on SPARK-24579: --- going from Spark feeds data into DL/AI frameworks for

[jira] [Resolved] (SPARK-24816) SQL interface support repartitionByRange

2018-07-31 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang resolved SPARK-24816. - Resolution: Won't Fix {{Order by}} is implemented by {{rangepartitioning}}. > SQL interface

[jira] [Commented] (SPARK-24978) Add spark.sql.fast.hash.aggregate.row.max.capacity to configure the capacity of fast aggregation.

2018-07-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16563487#comment-16563487 ] Apache Spark commented on SPARK-24978: -- User 'heary-cao' has created a pull request for this issue:

[jira] [Assigned] (SPARK-24978) Add spark.sql.fast.hash.aggregate.row.max.capacity to configure the capacity of fast aggregation.

2018-07-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24978: Assignee: (was: Apache Spark) > Add spark.sql.fast.hash.aggregate.row.max.capacity

[jira] [Assigned] (SPARK-24978) Add spark.sql.fast.hash.aggregate.row.max.capacity to configure the capacity of fast aggregation.

2018-07-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24978: Assignee: Apache Spark > Add spark.sql.fast.hash.aggregate.row.max.capacity to configure

[jira] [Created] (SPARK-24978) Add spark.sql.fast.hash.aggregate.row.max.capacity to configure the capacity of fast aggregation.

2018-07-31 Thread caoxuewen (JIRA)
caoxuewen created SPARK-24978: - Summary: Add spark.sql.fast.hash.aggregate.row.max.capacity to configure the capacity of fast aggregation. Key: SPARK-24978 URL: https://issues.apache.org/jira/browse/SPARK-24978

[jira] [Resolved] (SPARK-24968) Configurable Chunksize in ChunkedByteBufferOutputStream

2018-07-31 Thread Vincent (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vincent resolved SPARK-24968. - Resolution: Fixed > Configurable Chunksize in ChunkedByteBufferOutputStream >

[jira] [Commented] (SPARK-24968) Configurable Chunksize in ChunkedByteBufferOutputStream

2018-07-31 Thread Vincent (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16563467#comment-16563467 ] Vincent commented on SPARK-24968: - indeed they are closely related I'll close this ticket >

[jira] [Commented] (SPARK-24946) PySpark - Allow np.Arrays and pd.Series in df.approxQuantile

2018-07-31 Thread Paul Westenthanner (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16563455#comment-16563455 ] Paul Westenthanner commented on SPARK-24946: Sure, go ahead :)  > PySpark - Allow np.Arrays

[jira] [Commented] (SPARK-14540) Support Scala 2.12 closures and Java 8 lambdas in ClosureCleaner

2018-07-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16563434#comment-16563434 ] Apache Spark commented on SPARK-14540: -- User 'skonto' has created a pull request for this issue:

[jira] [Commented] (SPARK-24977) input_file_name() result can't save and use for partitionBy()

2018-07-31 Thread Srinivasarao Padala (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16563424#comment-16563424 ] Srinivasarao Padala commented on SPARK-24977: - I was able to see the filenames by

[jira] [Created] (SPARK-24977) input_file_name() result can't save and use for paritionBy()

2018-07-31 Thread Srinivasarao Padala (JIRA)
Srinivasarao Padala created SPARK-24977: --- Summary: input_file_name() result can't save and use for paritionBy() Key: SPARK-24977 URL: https://issues.apache.org/jira/browse/SPARK-24977 Project:

[jira] [Updated] (SPARK-24977) input_file_name() result can't save and use for partitionBy()

2018-07-31 Thread Srinivasarao Padala (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Srinivasarao Padala updated SPARK-24977: Summary: input_file_name() result can't save and use for partitionBy() (was:

[jira] [Commented] (SPARK-24968) Configurable Chunksize in ChunkedByteBufferOutputStream

2018-07-31 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16563274#comment-16563274 ] Hyukjin Kwon commented on SPARK-24968: -- So is it a duplicate of SPARK-24917 roughly? I suggest to

[jira] [Resolved] (SPARK-24949) pyspark.sql.Column breaks the iterable contract

2018-07-31 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-24949. -- Resolution: Won't Fix Let me leave this resolved. Seems difficult to fix but I wonder if it's

[jira] [Commented] (SPARK-24970) Spark Kinesis streaming application fails to recover from streaming checkpoint due to ProvisionedThroughputExceededException

2018-07-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16563267#comment-16563267 ] Apache Spark commented on SPARK-24970: -- User 'brucezhao11' has created a pull request for this

[jira] [Assigned] (SPARK-24970) Spark Kinesis streaming application fails to recover from streaming checkpoint due to ProvisionedThroughputExceededException

2018-07-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24970: Assignee: (was: Apache Spark) > Spark Kinesis streaming application fails to recover

[jira] [Assigned] (SPARK-24970) Spark Kinesis streaming application fails to recover from streaming checkpoint due to ProvisionedThroughputExceededException

2018-07-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24970: Assignee: Apache Spark > Spark Kinesis streaming application fails to recover from

[jira] [Resolved] (SPARK-24975) Spark history server REST API /api/v1/version returns error 404

2018-07-31 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Saisai Shao resolved SPARK-24975. - Resolution: Duplicate > Spark history server REST API /api/v1/version returns error 404 >

[jira] [Commented] (SPARK-24975) Spark history server REST API /api/v1/version returns error 404

2018-07-31 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16563259#comment-16563259 ] Marco Gaido commented on SPARK-24975: - This seems a duplicate of SPARK-24188. Despite here I see

[jira] [Commented] (SPARK-24720) kafka transaction creates Non-consecutive Offsets (due to transaction offset) making streaming fail when failOnDataLoss=true

2018-07-31 Thread Quentin Ambard (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16563256#comment-16563256 ] Quentin Ambard commented on SPARK-24720: Maybe we could improve that. As we already have to 

[jira] [Resolved] (SPARK-24972) PivotFirst could not handle pivot columns of complex types

2018-07-31 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-24972. - Resolution: Fixed Assignee: Maryann Xue Fix Version/s: 2.4.0 > PivotFirst could not