[jira] [Comment Edited] (SPARK-15725) Dynamic allocation hangs YARN app when executors time out

2016-12-21 Thread jin xing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15769384#comment-15769384 ] jin xing edited comment on SPARK-15725 at 12/22/16 7:54 AM:

[jira] [Commented] (SPARK-15725) Dynamic allocation hangs YARN app when executors time out

2016-12-21 Thread jin xing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15769384#comment-15769384 ] jin xing commented on SPARK-15725: -- [~b...@cloudera.com] May I ask two questions? 1. "a large stage will

[jira] [Assigned] (SPARK-18975) Add an API to remove SparkListener from SparkContext

2016-12-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18975: Assignee: Apache Spark > Add an API to remove SparkListener from SparkContext >

[jira] [Commented] (SPARK-18975) Add an API to remove SparkListener from SparkContext

2016-12-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15769375#comment-15769375 ] Apache Spark commented on SPARK-18975: -- User 'jerryshao' has created a pull request for this issue:

[jira] [Assigned] (SPARK-18975) Add an API to remove SparkListener from SparkContext

2016-12-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18975: Assignee: (was: Apache Spark) > Add an API to remove SparkListener from SparkContext

[jira] [Created] (SPARK-18975) Add an API to remove SparkListener from SparkContext

2016-12-21 Thread Saisai Shao (JIRA)
Saisai Shao created SPARK-18975: --- Summary: Add an API to remove SparkListener from SparkContext Key: SPARK-18975 URL: https://issues.apache.org/jira/browse/SPARK-18975 Project: Spark Issue

[jira] [Resolved] (SPARK-18908) It's hard for the user to see the failure if StreamExecution fails to create the logical plan

2016-12-21 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-18908. -- Resolution: Fixed Fix Version/s: 2.2.0 2.1.1 > It's hard for the

[jira] [Created] (SPARK-18974) FileInputDStream could not detected files which moved to the directory

2016-12-21 Thread Adam Wang (JIRA)
Adam Wang created SPARK-18974: - Summary: FileInputDStream could not detected files which moved to the directory Key: SPARK-18974 URL: https://issues.apache.org/jira/browse/SPARK-18974 Project: Spark

[jira] [Commented] (SPARK-18964) HiveContext does not support Time Interval Literals

2016-12-21 Thread Suhas Nalapure (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15769073#comment-15769073 ] Suhas Nalapure commented on SPARK-18964: Right, the assumption is "In addition to the basic

[jira] [Assigned] (SPARK-18973) Remove SortPartitions and RedistributeData

2016-12-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18973: Assignee: Reynold Xin (was: Apache Spark) > Remove SortPartitions and RedistributeData >

[jira] [Commented] (SPARK-18973) Remove SortPartitions and RedistributeData

2016-12-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15768988#comment-15768988 ] Apache Spark commented on SPARK-18973: -- User 'rxin' has created a pull request for this issue:

[jira] [Assigned] (SPARK-18973) Remove SortPartitions and RedistributeData

2016-12-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18973: Assignee: Apache Spark (was: Reynold Xin) > Remove SortPartitions and RedistributeData >

[jira] [Created] (SPARK-18973) Remove SortPartitions and RedistributeData

2016-12-21 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-18973: --- Summary: Remove SortPartitions and RedistributeData Key: SPARK-18973 URL: https://issues.apache.org/jira/browse/SPARK-18973 Project: Spark Issue Type: Task

[jira] [Commented] (SPARK-18972) Fix the netty thread names for RPC

2016-12-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15768973#comment-15768973 ] Apache Spark commented on SPARK-18972: -- User 'zsxwing' has created a pull request for this issue:

[jira] [Assigned] (SPARK-18972) Fix the netty thread names for RPC

2016-12-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18972: Assignee: Apache Spark (was: Shixiong Zhu) > Fix the netty thread names for RPC >

[jira] [Assigned] (SPARK-18972) Fix the netty thread names for RPC

2016-12-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18972: Assignee: Shixiong Zhu (was: Apache Spark) > Fix the netty thread names for RPC >

[jira] [Created] (SPARK-18972) Fix the netty thread names for RPC

2016-12-21 Thread Shixiong Zhu (JIRA)
Shixiong Zhu created SPARK-18972: Summary: Fix the netty thread names for RPC Key: SPARK-18972 URL: https://issues.apache.org/jira/browse/SPARK-18972 Project: Spark Issue Type: Improvement

[jira] [Resolved] (SPARK-18933) Different log output between Terminal screen and stderr file

2016-12-21 Thread Sean Wong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Wong resolved SPARK-18933. --- Resolution: Not A Bug This is not a bug, but the log system can be further improved. > Different log

[jira] [Commented] (SPARK-18933) Different log output between Terminal screen and stderr file

2016-12-21 Thread Sean Wong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15768775#comment-15768775 ] Sean Wong commented on SPARK-18933: --- Finally, I have got the answer. It's not the bug but the log

[jira] [Created] (SPARK-18971) Netty issue may cause the shuffle client hang

2016-12-21 Thread Shixiong Zhu (JIRA)
Shixiong Zhu created SPARK-18971: Summary: Netty issue may cause the shuffle client hang Key: SPARK-18971 URL: https://issues.apache.org/jira/browse/SPARK-18971 Project: Spark Issue Type:

[jira] [Updated] (SPARK-18970) FileSource failure during file list refresh doesn't cause an application to fail, but stops further processing

2016-12-21 Thread Lev (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lev updated SPARK-18970: Summary: FileSource failure during file list refresh doesn't cause an application to fail, but stops further

[jira] [Updated] (SPARK-18970) FileSource failure during refresh doesn't cause an application to fail, but stops further processing

2016-12-21 Thread Lev (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lev updated SPARK-18970: Summary: FileSource failure during refresh doesn't cause an application to fail, but stops further processing

[jira] [Created] (SPARK-18970) FileSource filure during refresh doesn't cause an application to fail, but stops further processing

2016-12-21 Thread Lev (JIRA)
Lev created SPARK-18970: --- Summary: FileSource filure during refresh doesn't cause an application to fail, but stops further processing Key: SPARK-18970 URL: https://issues.apache.org/jira/browse/SPARK-18970

[jira] [Issue Comment Deleted] (SPARK-18933) Different log output between Terminal screen and stderr file

2016-12-21 Thread Sean Wong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Wong updated SPARK-18933: -- Comment: was deleted (was: I found that Loginfo codes in the transformation() or actions() are only

[jira] [Reopened] (SPARK-18933) Different log output between Terminal screen and stderr file

2016-12-21 Thread Sean Wong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Wong reopened SPARK-18933: --- I found that Loginfo codes in the transformation() or actions() are only shown in the terminal screen.

[jira] [Commented] (SPARK-18933) Different log output between Terminal screen and stderr file

2016-12-21 Thread Sean Wong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15768739#comment-15768739 ] Sean Wong commented on SPARK-18933: --- I found that Loginfo codes in the transformation() or actions()

[jira] [Assigned] (SPARK-18969) PullOutNondeterministic should work for Aggregate operator

2016-12-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18969: Assignee: Apache Spark (was: Reynold Xin) > PullOutNondeterministic should work for

[jira] [Commented] (SPARK-18969) PullOutNondeterministic should work for Aggregate operator

2016-12-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15768706#comment-15768706 ] Apache Spark commented on SPARK-18969: -- User 'rxin' has created a pull request for this issue:

[jira] [Assigned] (SPARK-18969) PullOutNondeterministic should work for Aggregate operator

2016-12-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18969: Assignee: Reynold Xin (was: Apache Spark) > PullOutNondeterministic should work for

[jira] [Created] (SPARK-18969) PullOutNondeterministic should work for Aggregate operator

2016-12-21 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-18969: --- Summary: PullOutNondeterministic should work for Aggregate operator Key: SPARK-18969 URL: https://issues.apache.org/jira/browse/SPARK-18969 Project: Spark

[jira] [Commented] (SPARK-18959) invalid resource statistics for standalone cluster

2016-12-21 Thread hustfxj (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15768696#comment-15768696 ] hustfxj commented on SPARK-18959: - 2.2.0-SNAPSHOT > invalid resource statistics for standalone cluster >

[jira] [Resolved] (SPARK-18903) uiWebUrl is not accessible to SparkR

2016-12-21 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung resolved SPARK-18903. -- Resolution: Fixed Assignee: Felix Cheung Target Version/s: 2.2.0 >

[jira] [Resolved] (SPARK-18528) limit + groupBy leads to java.lang.NullPointerException

2016-12-21 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell resolved SPARK-18528. --- Resolution: Fixed Assignee: Takeshi Yamamuro Fix Version/s: 2.2.0

[jira] [Resolved] (SPARK-18234) Update mode in structured streaming

2016-12-21 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das resolved SPARK-18234. --- Resolution: Fixed Fix Version/s: 2.2.0 2.1.1 Issue resolved by

[jira] [Updated] (SPARK-17807) Scalatest listed as compile dependency in spark-tags

2016-12-21 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin updated SPARK-17807: --- Fix Version/s: 2.2.0 > Scalatest listed as compile dependency in spark-tags >

[jira] [Resolved] (SPARK-18588) KafkaSourceStressForDontFailOnDataLossSuite is flaky

2016-12-21 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das resolved SPARK-18588. --- Resolution: Fixed Fix Version/s: 2.2.0 2.1.1 Issue resolved by

[jira] [Created] (SPARK-18968) .sparkStaging quickly fill up HDFS

2016-12-21 Thread Chen He (JIRA)
Chen He created SPARK-18968: --- Summary: .sparkStaging quickly fill up HDFS Key: SPARK-18968 URL: https://issues.apache.org/jira/browse/SPARK-18968 Project: Spark Issue Type: Bug

[jira] [Resolved] (SPARK-18775) Limit the max number of records written per file

2016-12-21 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell resolved SPARK-18775. --- Resolution: Fixed Fix Version/s: 2.2.0 > Limit the max number of records

[jira] [Commented] (SPARK-18036) Decision Trees do not handle edge cases

2016-12-21 Thread Ilya Matiach (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15768372#comment-15768372 ] Ilya Matiach commented on SPARK-18036: -- Thanks, I've sent a pull request to fix this. > Decision

[jira] [Assigned] (SPARK-18036) Decision Trees do not handle edge cases

2016-12-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18036: Assignee: Apache Spark > Decision Trees do not handle edge cases >

[jira] [Commented] (SPARK-18036) Decision Trees do not handle edge cases

2016-12-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15768356#comment-15768356 ] Apache Spark commented on SPARK-18036: -- User 'imatiach-msft' has created a pull request for this

[jira] [Assigned] (SPARK-18036) Decision Trees do not handle edge cases

2016-12-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18036: Assignee: (was: Apache Spark) > Decision Trees do not handle edge cases >

[jira] [Updated] (SPARK-18700) getCached in HiveMetastoreCatalog not thread safe cause driver OOM

2016-12-21 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell updated SPARK-18700: -- Fix Version/s: 2.0.3 > getCached in HiveMetastoreCatalog not thread safe cause driver

[jira] [Updated] (SPARK-18949) Add recoverPartitions API to Catalog

2016-12-21 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-18949: Fix Version/s: 2.2.0 > Add recoverPartitions API to Catalog >

[jira] [Commented] (SPARK-10523) SparkR formula syntax to turn strings/factors into numerics

2016-12-21 Thread Vincent Warmerdam (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15768274#comment-15768274 ] Vincent Warmerdam commented on SPARK-10523: --- more machine learning models via h2o and a more

[jira] [Assigned] (SPARK-18967) Locality preferences should be used when scheduling even when delay scheduling is turned off

2016-12-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18967: Assignee: Apache Spark (was: Imran Rashid) > Locality preferences should be used when

[jira] [Assigned] (SPARK-18967) Locality preferences should be used when scheduling even when delay scheduling is turned off

2016-12-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18967: Assignee: Imran Rashid (was: Apache Spark) > Locality preferences should be used when

[jira] [Commented] (SPARK-18967) Locality preferences should be used when scheduling even when delay scheduling is turned off

2016-12-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15768269#comment-15768269 ] Apache Spark commented on SPARK-18967: -- User 'squito' has created a pull request for this issue:

[jira] [Commented] (SPARK-10523) SparkR formula syntax to turn strings/factors into numerics

2016-12-21 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15768165#comment-15768165 ] Felix Cheung commented on SPARK-10523: -- [~cantdutchthis] I'm curious, do you know why all your

[jira] [Resolved] (SPARK-18954) Fix flaky test: o.a.s.streaming.BasicOperationsSuite rdd cleanup - map and window

2016-12-21 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-18954. -- Resolution: Fixed Fix Version/s: 2.2.0 2.1.1 2.0.3

[jira] [Resolved] (SPARK-18031) Flaky test: org.apache.spark.streaming.scheduler.ExecutorAllocationManagerSuite basic functionality

2016-12-21 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das resolved SPARK-18031. --- Resolution: Fixed Fix Version/s: 2.1.0 Issue resolved by pull request 16321

[jira] [Updated] (SPARK-18925) Reduce memory usage of mapWithState

2016-12-21 Thread Vladimir Pchelko (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vladimir Pchelko updated SPARK-18925: - Description: With default settings mapWithState leads to storing up to 10 copies of

[jira] [Updated] (SPARK-18031) Flaky test: org.apache.spark.streaming.scheduler.ExecutorAllocationManagerSuite basic functionality

2016-12-21 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-18031: -- Fix Version/s: (was: 2.1.0) 2.1.1 > Flaky test: >

[jira] [Updated] (SPARK-18925) Reduce memory usage of mapWithState

2016-12-21 Thread Vladimir Pchelko (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vladimir Pchelko updated SPARK-18925: - Description: With default settings mapWithState leads to storing up to 10 copies of

[jira] [Commented] (SPARK-16951) Alternative implementation of NOT IN to Anti-join

2016-12-21 Thread Nattavut Sutyanyong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15767904#comment-15767904 ] Nattavut Sutyanyong commented on SPARK-16951: - I was wrong on case 3 when the subquery

[jira] [Comment Edited] (SPARK-16951) Alternative implementation of NOT IN to Anti-join

2016-12-21 Thread Nattavut Sutyanyong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15412882#comment-15412882 ] Nattavut Sutyanyong edited comment on SPARK-16951 at 12/21/16 7:13 PM:

[jira] [Comment Edited] (SPARK-16951) Alternative implementation of NOT IN to Anti-join

2016-12-21 Thread Nattavut Sutyanyong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15412878#comment-15412878 ] Nattavut Sutyanyong edited comment on SPARK-16951 at 12/21/16 7:10 PM:

[jira] [Commented] (SPARK-18967) Locality preferences should be used when scheduling even when delay scheduling is turned off

2016-12-21 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15767824#comment-15767824 ] Imran Rashid commented on SPARK-18967: -- cc [~kayousterhout] [~markhamstra] [~mridulm80] > Locality

[jira] [Resolved] (SPARK-18894) Event time watermark delay threshold specified in months or years gives incorrect results

2016-12-21 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-18894. -- Resolution: Fixed Fix Version/s: 2.2.0 2.1.1 > Event time watermark

[jira] [Comment Edited] (SPARK-18966) NOT IN subquery with correlated expressions may return incorrect result

2016-12-21 Thread Nattavut Sutyanyong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15767783#comment-15767783 ] Nattavut Sutyanyong edited comment on SPARK-18966 at 12/21/16 6:34 PM:

[jira] [Commented] (SPARK-18966) NOT IN subquery with correlated expressions may return incorrect result

2016-12-21 Thread Nattavut Sutyanyong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15767783#comment-15767783 ] Nattavut Sutyanyong commented on SPARK-18966: - {code} == Analyzed Logical Plan == a1: int,

[jira] [Updated] (SPARK-18886) Delay scheduling should not delay some executors indefinitely if one task is scheduled before delay timeout

2016-12-21 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Imran Rashid updated SPARK-18886: - Description: Delay scheduling can introduce an unbounded delay and underutilization of cluster

[jira] [Created] (SPARK-18967) Locality preferences should be used when scheduling even when delay scheduling is turned off

2016-12-21 Thread Imran Rashid (JIRA)
Imran Rashid created SPARK-18967: Summary: Locality preferences should be used when scheduling even when delay scheduling is turned off Key: SPARK-18967 URL: https://issues.apache.org/jira/browse/SPARK-18967

[jira] [Created] (SPARK-18966) NOT IN subquery with correlated expressions may return incorrect result

2016-12-21 Thread Nattavut Sutyanyong (JIRA)
Nattavut Sutyanyong created SPARK-18966: --- Summary: NOT IN subquery with correlated expressions may return incorrect result Key: SPARK-18966 URL: https://issues.apache.org/jira/browse/SPARK-18966

[jira] [Resolved] (SPARK-18951) Upgrade com.thoughtworks.paranamer/paranamer to 2.6

2016-12-21 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai resolved SPARK-18951. -- Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 16359

[jira] [Commented] (SPARK-18863) Output non-aggregate expressions without GROUP BY in a subquery does not yield an error

2016-12-21 Thread Nattavut Sutyanyong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15767537#comment-15767537 ] Nattavut Sutyanyong commented on SPARK-18863: - Another case to investigate

[jira] [Comment Edited] (SPARK-18863) Output non-aggregate expressions without GROUP BY in a subquery does not yield an error

2016-12-21 Thread Nattavut Sutyanyong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15767537#comment-15767537 ] Nattavut Sutyanyong edited comment on SPARK-18863 at 12/21/16 4:54 PM:

[jira] [Resolved] (SPARK-18564) mapWithState: add configuration for DEFAULT_CHECKPOINT_DURATION_MULTIPLIER

2016-12-21 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-18564. --- Resolution: Not A Problem Agree, you can set this interval directly if desired. > mapWithState: add

[jira] [Commented] (SPARK-18965) wholeTextFiles() is not able to read large files

2016-12-21 Thread Pradeep Misra (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15767462#comment-15767462 ] Pradeep Misra commented on SPARK-18965: --- The reason I raised it as a bug is due to fact that even

[jira] [Resolved] (SPARK-18965) wholeTextFiles() is not able to read large files

2016-12-21 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-18965. --- Resolution: Invalid wholeTextFiles reads whole files into memory. With a large enough file, you run

[jira] [Resolved] (SPARK-18914) Local UDTs test (org.apache.spark.sql.UserDefinedTypeSuite) fails due to "ClassCastException: java.lang.Integer cannot be cast to org.apache.spark.sql.UDT$MyDenseVector

2016-12-21 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-18914. --- Resolution: Duplicate OK, but then this isn't a bug in Spark today. It's something caused by a

[jira] [Created] (SPARK-18965) wholeTextFiles() is not able to read large files

2016-12-21 Thread Pradeep Misra (JIRA)
Pradeep Misra created SPARK-18965: - Summary: wholeTextFiles() is not able to read large files Key: SPARK-18965 URL: https://issues.apache.org/jira/browse/SPARK-18965 Project: Spark Issue

[jira] [Commented] (SPARK-18959) invalid resource statistics for standalone cluster

2016-12-21 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15767343#comment-15767343 ] Sean Owen commented on SPARK-18959: --- What version of Spark? This sounds like something fixed a long

[jira] [Comment Edited] (SPARK-18710) Add offset to GeneralizedLinearRegression models

2016-12-21 Thread Yanbo Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15767276#comment-15767276 ] Yanbo Liang edited comment on SPARK-18710 at 12/21/16 3:14 PM: ---

[jira] [Commented] (SPARK-18710) Add offset to GeneralizedLinearRegression models

2016-12-21 Thread Yanbo Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15767276#comment-15767276 ] Yanbo Liang commented on SPARK-18710: - {{IterativelyReweightedLeastSquares}} is {{private[ml]}}, we

[jira] [Comment Edited] (SPARK-18710) Add offset to GeneralizedLinearRegression models

2016-12-21 Thread Yanbo Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15767276#comment-15767276 ] Yanbo Liang edited comment on SPARK-18710 at 12/21/16 3:13 PM: ---

[jira] [Resolved] (SPARK-18956) Python API should reuse existing SparkSession while creating new SQLContext instances

2016-12-21 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-18956. --- Resolution: Duplicate > Python API should reuse existing SparkSession while creating new SQLContext

[jira] [Commented] (SPARK-18956) Python API should reuse existing SparkSession while creating new SQLContext instances

2016-12-21 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15766987#comment-15766987 ] Liang-Chi Hsieh commented on SPARK-18956: - Yeah, I think so. > Python API should reuse existing

[jira] [Resolved] (SPARK-18947) SQLContext.tableNames should not call Catalog.listTables

2016-12-21 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-18947. - Resolution: Fixed Fix Version/s: 2.2.0 2.1.1 2.0.3

[jira] [Commented] (SPARK-18956) Python API should reuse existing SparkSession while creating new SQLContext instances

2016-12-21 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15766797#comment-15766797 ] Sean Owen commented on SPARK-18956: --- Seems like a duplicate of / closely related to SPARK-18687 >

[jira] [Updated] (SPARK-18957) when WAL time out, loss data

2016-12-21 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-18957: -- Flags: (was: Important) Target Version/s: (was: 1.6.2) Priority: Major

[jira] [Commented] (SPARK-18964) HiveContext does not support Time Interval Literals

2016-12-21 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15766789#comment-15766789 ] Sean Owen commented on SPARK-18964: --- Is that a bug? The docs describe Spark SQL functionality, not

[jira] [Created] (SPARK-18964) HiveContext does not support Time Interval Literals

2016-12-21 Thread Suhas Nalapure (JIRA)
Suhas Nalapure created SPARK-18964: -- Summary: HiveContext does not support Time Interval Literals Key: SPARK-18964 URL: https://issues.apache.org/jira/browse/SPARK-18964 Project: Spark

[jira] [Commented] (SPARK-18963) Test Failuire on big endian; o.a.s.unsafe.types.UTF8StringSuite.writeToOutputStreamIntArray

2016-12-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15766737#comment-15766737 ] Apache Spark commented on SPARK-18963: -- User 'robbinspg' has created a pull request for this issue:

[jira] [Assigned] (SPARK-18963) Test Failuire on big endian; o.a.s.unsafe.types.UTF8StringSuite.writeToOutputStreamIntArray

2016-12-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18963: Assignee: (was: Apache Spark) > Test Failuire on big endian; >

[jira] [Assigned] (SPARK-18963) Test Failuire on big endian; o.a.s.unsafe.types.UTF8StringSuite.writeToOutputStreamIntArray

2016-12-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18963: Assignee: Apache Spark > Test Failuire on big endian; >

[jira] [Created] (SPARK-18963) Test Failuire on big endian; o.a.s.unsafe.types.UTF8StringSuite.writeToOutputStreamIntArray

2016-12-21 Thread Pete Robbins (JIRA)
Pete Robbins created SPARK-18963: Summary: Test Failuire on big endian; o.a.s.unsafe.types.UTF8StringSuite.writeToOutputStreamIntArray Key: SPARK-18963 URL: https://issues.apache.org/jira/browse/SPARK-18963

[jira] [Assigned] (SPARK-18925) Reduce memory usage of mapWithState

2016-12-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18925: Assignee: (was: Apache Spark) > Reduce memory usage of mapWithState >

[jira] [Commented] (SPARK-18925) Reduce memory usage of mapWithState

2016-12-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15766610#comment-15766610 ] Apache Spark commented on SPARK-18925: -- User 'vpchelko' has created a pull request for this issue:

[jira] [Assigned] (SPARK-18925) Reduce memory usage of mapWithState

2016-12-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18925: Assignee: Apache Spark > Reduce memory usage of mapWithState >

[jira] [Comment Edited] (SPARK-18564) mapWithState: add configuration for DEFAULT_CHECKPOINT_DURATION_MULTIPLIER

2016-12-21 Thread Vladimir Pchelko (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15766561#comment-15766561 ] Vladimir Pchelko edited comment on SPARK-18564 at 12/21/16 9:21 AM:

[jira] [Comment Edited] (SPARK-18564) mapWithState: add configuration for DEFAULT_CHECKPOINT_DURATION_MULTIPLIER

2016-12-21 Thread Vladimir Pchelko (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15766561#comment-15766561 ] Vladimir Pchelko edited comment on SPARK-18564 at 12/21/16 9:21 AM:

[jira] [Comment Edited] (SPARK-18564) mapWithState: add configuration for DEFAULT_CHECKPOINT_DURATION_MULTIPLIER

2016-12-21 Thread Vladimir Pchelko (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15766561#comment-15766561 ] Vladimir Pchelko edited comment on SPARK-18564 at 12/21/16 9:20 AM:

[jira] [Commented] (SPARK-18564) mapWithState: add configuration for DEFAULT_CHECKPOINT_DURATION_MULTIPLIER

2016-12-21 Thread Vladimir Pchelko (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15766561#comment-15766561 ] Vladimir Pchelko commented on SPARK-18564: -- Currently user can modify interval of mapWithState

[jira] [Resolved] (SPARK-18962) Unable to create parquet file for the given data

2016-12-21 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-18962. --- Resolution: Invalid Per https://issues.apache.org/jira/browse/SPARK-18877 this should be a Parquet

[jira] [Updated] (SPARK-18923) Support SKIP_PYTHONDOC/RDOC in doc generation

2016-12-21 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-18923: -- Assignee: Dongjoon Hyun > Support SKIP_PYTHONDOC/RDOC in doc generation >

[jira] [Resolved] (SPARK-18923) Support SKIP_PYTHONDOC/RDOC in doc generation

2016-12-21 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-18923. --- Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 16336

[jira] [Updated] (SPARK-18962) Unable to create parquet file for the given data

2016-12-21 Thread Navya Krishnappa (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navya Krishnappa updated SPARK-18962: - Affects Version/s: 2.0.2 > Unable to create parquet file for the given data >

[jira] [Created] (SPARK-18962) Unable to create parquet file for the given data

2016-12-21 Thread Navya Krishnappa (JIRA)
Navya Krishnappa created SPARK-18962: Summary: Unable to create parquet file for the given data Key: SPARK-18962 URL: https://issues.apache.org/jira/browse/SPARK-18962 Project: Spark
