[jira] [Updated] (SPARK-19302) Fix the wrong item format in security.md

2017-01-20 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-19302: -- Issue Type: Improvement (was: Bug) [~sarutak] please don't make a JIRA for this kind of thing. >

[jira] [Commented] (SPARK-19307) SPARK-17387 caused ignorance of conf object passed to SparkContext:

2017-01-20 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15831572#comment-15831572 ] Sean Owen commented on SPARK-19307: --- [~yuriy_hupalo] we don't use patches. Please read

[jira] [Commented] (SPARK-19287) JavaPairRDD flatMapValues requires function returning Iterable, not Iterator

2017-01-20 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15831600#comment-15831600 ] Sean Owen commented on SPARK-19287: --- Whether it was an oversight doesn't really change things. Someone

[jira] [Resolved] (SPARK-10842) Eliminate create duplicate stage while generate job dag

2017-01-20 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-10842. -- Resolution: Duplicate I am resolving this as a duplicate per the reporter's comment

[jira] [Commented] (SPARK-19307) SPARK-17387 caused ignorance of conf object passed to SparkContext:

2017-01-20 Thread yuriy_hupalo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15831609#comment-15831609 ] yuriy_hupalo commented on SPARK-19307: -- pull: https://github.com/yuriyhupalo/spark/pull/1 >

[jira] [Created] (SPARK-19310) PySpark Window over function changes behaviour regarding Order-By

2017-01-20 Thread Lucas Tittmann (JIRA)
Lucas Tittmann created SPARK-19310: -- Summary: PySpark Window over function changes behaviour regarding Order-By Key: SPARK-19310 URL: https://issues.apache.org/jira/browse/SPARK-19310 Project: Spark

[jira] [Commented] (SPARK-19288) Failure (at test_sparkSQL.R#1300): date functions on a DataFrame in R/run-tests.sh

2017-01-20 Thread Nirman Narang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15831748#comment-15831748 ] Nirman Narang commented on SPARK-19288: --- [~felixcheung] Version 2.0.1 > Failure (at

[jira] [Created] (SPARK-19311) UDFs disregard UDT type hierarchy

2017-01-20 Thread Gregor Moehler (JIRA)
Gregor Moehler created SPARK-19311: -- Summary: UDFs disregard UDT type hierarchy Key: SPARK-19311 URL: https://issues.apache.org/jira/browse/SPARK-19311 Project: Spark Issue Type: Bug

[jira] [Updated] (SPARK-19311) UDFs disregard UDT type hierarchy

2017-01-20 Thread Gregor Moehler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gregor Moehler updated SPARK-19311: --- Description: When you define UDTs based on hierarchical traits UDFs disregard the type

[jira] [Updated] (SPARK-19311) UDFs disregard UDT type hierarchy

2017-01-20 Thread Gregor Moehler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gregor Moehler updated SPARK-19311: --- Description: When you define UDTs based on hierarchical traits UDFs disregard the type

[jira] [Updated] (SPARK-19302) Fix the wrong item format in security.md

2017-01-20 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-19302: -- Assignee: Kousuke Saruta > Fix the wrong item format in security.md >

[jira] [Resolved] (SPARK-19302) Fix the wrong item format in security.md

2017-01-20 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-19302. --- Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 16653

[jira] [Updated] (SPARK-19311) UDFs disregard UDT type hierarchy

2017-01-20 Thread Gregor Moehler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gregor Moehler updated SPARK-19311: --- Description: When you define UDTs based on hierarchical traits UDFs disregard the type

[jira] [Updated] (SPARK-19311) UDFs disregard UDT type hierarchy

2017-01-20 Thread Gregor Moehler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gregor Moehler updated SPARK-19311: --- Description: When you define UDTs based on hierarchical traits UDFs disregard the type

[jira] [Updated] (SPARK-19311) UDFs disregard UDT type hierarchy

2017-01-20 Thread Gregor Moehler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gregor Moehler updated SPARK-19311: --- Description: When you define UDTs based on hierarchical traits UDFs disregard the type

[jira] [Updated] (SPARK-19302) Fix the wrong item format in security.md

2017-01-20 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-19302: -- Fix Version/s: (was: 3.0.0) 2.2.0 > Fix the wrong item format in security.md >

[jira] [Resolved] (SPARK-18431) Hard coded value in org.apache.spark.streaming.kinesis.KinesisReceiver

2017-01-20 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-18431. --- Resolution: Won't Fix > Hard coded value in org.apache.spark.streaming.kinesis.KinesisReceiver >

[jira] [Updated] (SPARK-19155) ML GLR string params should support both uppercase and lowercase

2017-01-20 Thread Yanbo Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanbo Liang updated SPARK-19155: Summary: ML GLR string params should support both uppercase and lowercase (was: ML estimator

[jira] [Updated] (SPARK-19155) ML GLR string params should support both uppercase and lowercase

2017-01-20 Thread Yanbo Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanbo Liang updated SPARK-19155: Description: ML {{GeneralizedLinearRegression}} should support both uppercase and lowercase. For

[jira] [Updated] (SPARK-19155) MLlib GeneralizedLinearRegression family and link should case insensitive

2017-01-20 Thread Yanbo Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanbo Liang updated SPARK-19155: Summary: MLlib GeneralizedLinearRegression family and link should case insensitive (was: ML GLR

[jira] [Updated] (SPARK-19155) MLlib GeneralizedLinearRegression family and link should case insensitive

2017-01-20 Thread Yanbo Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanbo Liang updated SPARK-19155: Description: ML {{GeneralizedLinearRegression}} only supports lowercase input for {{family}} and

[jira] [Assigned] (SPARK-19313) GaussianMixture throws cryptic error when number of features is too high

2017-01-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19313: Assignee: Apache Spark > GaussianMixture throws cryptic error when number of features is

[jira] [Assigned] (SPARK-19313) GaussianMixture throws cryptic error when number of features is too high

2017-01-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19313: Assignee: (was: Apache Spark) > GaussianMixture throws cryptic error when number of

[jira] [Assigned] (SPARK-18823) Assignation by column name variable not available or bug?

2017-01-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18823: Assignee: (was: Apache Spark) > Assignation by column name variable not available or

[jira] [Created] (SPARK-19314) Do not allow sort before aggregation in Structured Streaming plan

2017-01-20 Thread Tathagata Das (JIRA)
Tathagata Das created SPARK-19314: - Summary: Do not allow sort before aggregation in Structured Streaming plan Key: SPARK-19314 URL: https://issues.apache.org/jira/browse/SPARK-19314 Project: Spark

[jira] [Assigned] (SPARK-19314) Do not allow sort before aggregation in Structured Streaming plan

2017-01-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19314: Assignee: Tathagata Das (was: Apache Spark) > Do not allow sort before aggregation in

[jira] [Assigned] (SPARK-18120) QueryExecutionListener method doesnt' get executed for DataFrameWriter methods

2017-01-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18120: Assignee: (was: Apache Spark) > QueryExecutionListener method doesnt' get executed

[jira] [Assigned] (SPARK-18120) QueryExecutionListener method doesnt' get executed for DataFrameWriter methods

2017-01-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18120: Assignee: Apache Spark > QueryExecutionListener method doesnt' get executed for

[jira] [Commented] (SPARK-18120) QueryExecutionListener method doesnt' get executed for DataFrameWriter methods

2017-01-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832362#comment-15832362 ] Apache Spark commented on SPARK-18120: -- User 'salilsurendran' has created a pull request for this

[jira] [Created] (SPARK-19315) StructType should support nested lookup; throws IllegalArgumentException

2017-01-20 Thread Vinay varma (JIRA)
Vinay varma created SPARK-19315: --- Summary: StructType should support nested lookup; throws IllegalArgumentException Key: SPARK-19315 URL: https://issues.apache.org/jira/browse/SPARK-19315 Project:

[jira] [Commented] (SPARK-19313) GaussianMixture throws cryptic error when number of features is too high

2017-01-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832290#comment-15832290 ] Apache Spark commented on SPARK-19313: -- User 'sethah' has created a pull request for this issue:

[jira] [Commented] (SPARK-19314) Do not allow sort before aggregation in Structured Streaming plan

2017-01-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832299#comment-15832299 ] Apache Spark commented on SPARK-19314: -- User 'tdas' has created a pull request for this issue:

[jira] [Assigned] (SPARK-19314) Do not allow sort before aggregation in Structured Streaming plan

2017-01-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19314: Assignee: Apache Spark (was: Tathagata Das) > Do not allow sort before aggregation in

[jira] [Commented] (SPARK-18823) Assignation by column name variable not available or bug?

2017-01-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832324#comment-15832324 ] Apache Spark commented on SPARK-18823: -- User 'felixcheung' has created a pull request for this

[jira] [Assigned] (SPARK-18823) Assignation by column name variable not available or bug?

2017-01-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18823: Assignee: Apache Spark > Assignation by column name variable not available or bug? >

[jira] [Commented] (SPARK-17890) scala.ScalaReflectionException

2017-01-20 Thread Dave DeCaprio (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832620#comment-15832620 ] Dave DeCaprio commented on SPARK-17890: --- I'm running into this also. Naively changing the above

[jira] [Created] (SPARK-19317) UnsupportedOperationException: empty.reduceLeft in LinearSeqOptimized

2017-01-20 Thread Barry Becker (JIRA)
Barry Becker created SPARK-19317: Summary: UnsupportedOperationException: empty.reduceLeft in LinearSeqOptimized Key: SPARK-19317 URL: https://issues.apache.org/jira/browse/SPARK-19317 Project: Spark

[jira] [Created] (SPARK-19318) Docker test case failure: `SPARK-16625: General data types to be mapped to Oracle`

2017-01-20 Thread Xiao Li (JIRA)
Xiao Li created SPARK-19318: --- Summary: Docker test case failure: `SPARK-16625: General data types to be mapped to Oracle` Key: SPARK-19318 URL: https://issues.apache.org/jira/browse/SPARK-19318 Project:

[jira] [Resolved] (SPARK-18589) persist() resolves "java.lang.RuntimeException: Invalid PythonUDF (...), requires attributes from more than one child"

2017-01-20 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell resolved SPARK-18589. --- Resolution: Fixed > persist() resolves "java.lang.RuntimeException: Invalid

[jira] [Commented] (SPARK-19318) Docker test case failure: `SPARK-16625: General data types to be mapped to Oracle`

2017-01-20 Thread Suresh Thalamati (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832659#comment-15832659 ] Suresh Thalamati commented on SPARK-19318: -- I am looking into this test failure. > Docker test

[jira] [Commented] (SPARK-12837) Spark driver requires large memory space for serialized results even there are no data collected to the driver

2017-01-20 Thread Jonathan Alvarado (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832598#comment-15832598 ] Jonathan Alvarado commented on SPARK-12837: --- I am seeing this issue on EMR 5.2.0 with Spark

[jira] [Assigned] (SPARK-13478) Fetching delegation tokens for Hive fails when using proxy users

2017-01-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13478: Assignee: Apache Spark (was: Marcelo Vanzin) > Fetching delegation tokens for Hive fails

[jira] [Commented] (SPARK-13478) Fetching delegation tokens for Hive fails when using proxy users

2017-01-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832617#comment-15832617 ] Apache Spark commented on SPARK-13478: -- User 'vanzin' has created a pull request for this issue:

[jira] [Assigned] (SPARK-13478) Fetching delegation tokens for Hive fails when using proxy users

2017-01-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13478: Assignee: Marcelo Vanzin (was: Apache Spark) > Fetching delegation tokens for Hive fails

[jira] [Comment Edited] (SPARK-17890) scala.ScalaReflectionException

2017-01-20 Thread Dave DeCaprio (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832620#comment-15832620 ] Dave DeCaprio edited comment on SPARK-17890 at 1/20/17 11:46 PM: - I'm

[jira] [Updated] (SPARK-18589) persist() resolves "java.lang.RuntimeException: Invalid PythonUDF (...), requires attributes from more than one child"

2017-01-20 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell updated SPARK-18589: -- Fix Version/s: 2.2.0 2.1.1 > persist() resolves

[jira] [Commented] (SPARK-19300) Executor is waiting for lock

2017-01-20 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832658#comment-15832658 ] Shixiong Zhu commented on SPARK-19300: -- Could you provide the full thread dump? Looks like there is

[jira] [Commented] (SPARK-19289) UnCache Dataset using Name

2017-01-20 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832663#comment-15832663 ] Xiao Li commented on SPARK-19289: - Basically, you are creating a view for that dataframe. View name is

[jira] [Created] (SPARK-19321) Support Hive 2.x's metastore

2017-01-20 Thread Yin Huai (JIRA)
Yin Huai created SPARK-19321: Summary: Support Hive 2.x's metastore Key: SPARK-19321 URL: https://issues.apache.org/jira/browse/SPARK-19321 Project: Spark Issue Type: Improvement

[jira] [Resolved] (SPARK-16101) Refactoring CSV data source to be consistent with JSON data source

2017-01-20 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-16101. - Resolution: Fixed Assignee: Hyukjin Kwon Fix Version/s: 2.2.0 > Refactoring CSV

[jira] [Resolved] (SPARK-14536) NPE in JDBCRDD when array column contains nulls (postgresql)

2017-01-20 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-14536. - Resolution: Fixed Assignee: Suresh Thalamati Fix Version/s: 2.2.0 > NPE in JDBCRDD when

[jira] [Comment Edited] (SPARK-18392) LSH API, algorithm, and documentation follow-ups

2017-01-20 Thread Yun Ni (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832404#comment-15832404 ] Yun Ni edited comment on SPARK-18392 at 1/20/17 9:15 PM: - Hi David, Thanks for

[jira] [Commented] (SPARK-19111) S3 Mesos history upload fails silently if too large

2017-01-20 Thread Jisoo Kim (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832458#comment-15832458 ] Jisoo Kim commented on SPARK-19111: --- Thanks [~ste...@apache.org] for information, using S3a helped with

[jira] [Comment Edited] (SPARK-18859) Catalyst codegen does not mark column as nullable when it should. Causes NPE

2017-01-20 Thread Erik LaBianca (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832516#comment-15832516 ] Erik LaBianca edited comment on SPARK-18859 at 1/20/17 10:32 PM: - Not

[jira] [Comment Edited] (SPARK-18859) Catalyst codegen does not mark column as nullable when it should. Causes NPE

2017-01-20 Thread Erik LaBianca (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832516#comment-15832516 ] Erik LaBianca edited comment on SPARK-18859 at 1/20/17 10:33 PM: - Not

[jira] [Comment Edited] (SPARK-18859) Catalyst codegen does not mark column as nullable when it should. Causes NPE

2017-01-20 Thread Erik LaBianca (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832516#comment-15832516 ] Erik LaBianca edited comment on SPARK-18859 at 1/20/17 10:32 PM: - Not

[jira] [Commented] (SPARK-16599) java.util.NoSuchElementException: None.get at at org.apache.spark.storage.BlockInfoManager.releaseAllLocksForTask(BlockInfoManager.scala:343)

2017-01-20 Thread Drew Robb (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832569#comment-15832569 ] Drew Robb commented on SPARK-16599: --- I encountered an identical exception when using a singleton spark

[jira] [Commented] (SPARK-18392) LSH API, algorithm, and documentation follow-ups

2017-01-20 Thread Yun Ni (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832404#comment-15832404 ] Yun Ni commented on SPARK-18392: Hi David, Thanks for the question. I did not group the records by their

[jira] [Commented] (SPARK-18392) LSH API, algorithm, and documentation follow-ups

2017-01-20 Thread Yun Ni (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832487#comment-15832487 ] Yun Ni commented on SPARK-18392: Yes, comparing if the hash signature equals is faster than computing the

[jira] [Updated] (SPARK-18750) spark should be able to control the number of executor and should not throw stack overslow

2017-01-20 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin updated SPARK-18750: --- Description: When running SQL queries on large datasets, the job fails with stack overflow

[jira] [Resolved] (SPARK-19314) Do not allow sort before aggregation in Structured Streaming plan

2017-01-20 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das resolved SPARK-19314. --- Resolution: Fixed Fix Version/s: 3.0.0 2.0.3

[jira] [Comment Edited] (SPARK-18392) LSH API, algorithm, and documentation follow-ups

2017-01-20 Thread Yun Ni (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832487#comment-15832487 ] Yun Ni edited comment on SPARK-18392 at 1/20/17 10:29 PM: -- Yes, comparing if the

[jira] [Created] (SPARK-19316) Spark event logs are huge compared to 1.5.2

2017-01-20 Thread Jisoo Kim (JIRA)
Jisoo Kim created SPARK-19316: - Summary: Spark event logs are huge compared to 1.5.2 Key: SPARK-19316 URL: https://issues.apache.org/jira/browse/SPARK-19316 Project: Spark Issue Type: Bug

[jira] [Comment Edited] (SPARK-19296) Awkward changes for JdbcUtils.saveTable in Spark 2.1.0

2017-01-20 Thread Paul Wu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15831033#comment-15831033 ] Paul Wu edited comment on SPARK-19296 at 1/20/17 9:52 PM: -- We found this Util is

[jira] [Commented] (SPARK-18392) LSH API, algorithm, and documentation follow-ups

2017-01-20 Thread David S (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832464#comment-15832464 ] David S commented on SPARK-18392: - Hi Yun and thanks for the answer, but my question now is, are there

[jira] [Commented] (SPARK-19111) S3 Mesos history upload fails silently if too large

2017-01-20 Thread Jisoo Kim (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832473#comment-15832473 ] Jisoo Kim commented on SPARK-19111: --- Related to https://issues.apache.org/jira/browse/SPARK-19316. >

[jira] [Commented] (SPARK-19316) Spark event logs are huge compared to 1.5.2

2017-01-20 Thread Jisoo Kim (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832470#comment-15832470 ] Jisoo Kim commented on SPARK-19316: --- Related to https://issues.apache.org/jira/browse/SPARK-19111 >

[jira] [Commented] (SPARK-18859) Catalyst codegen does not mark column as nullable when it should. Causes NPE

2017-01-20 Thread Erik LaBianca (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832516#comment-15832516 ] Erik LaBianca commented on SPARK-18859: --- Not quite a repro, but here's explain output. {noformat}

[jira] [Issue Comment Deleted] (SPARK-16599) java.util.NoSuchElementException: None.get at at org.apache.spark.storage.BlockInfoManager.releaseAllLocksForTask(BlockInfoManager.scala:343)

2017-01-20 Thread Drew Robb (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Drew Robb updated SPARK-16599: -- Comment: was deleted (was: I encountered an identical exception when using a singleton spark session.

[jira] [Commented] (SPARK-18750) spark should be able to control the number of executor and should not throw stack overslow

2017-01-20 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832684#comment-15832684 ] Marcelo Vanzin commented on SPARK-18750: Yay, I can reproduce it with a unit test against

[jira] [Created] (SPARK-19319) SparkR Kmeans summary returns error when the cluster size doesn't equal to k

2017-01-20 Thread Miao Wang (JIRA)
Miao Wang created SPARK-19319: - Summary: SparkR Kmeans summary returns error when the cluster size doesn't equal to k Key: SPARK-19319 URL: https://issues.apache.org/jira/browse/SPARK-19319 Project:

[jira] [Assigned] (SPARK-19319) SparkR Kmeans summary returns error when the cluster size doesn't equal to k

2017-01-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19319: Assignee: Apache Spark > SparkR Kmeans summary returns error when the cluster size

[jira] [Commented] (SPARK-19319) SparkR Kmeans summary returns error when the cluster size doesn't equal to k

2017-01-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832698#comment-15832698 ] Apache Spark commented on SPARK-19319: -- User 'wangmiao1981' has created a pull request for this

[jira] [Assigned] (SPARK-19319) SparkR Kmeans summary returns error when the cluster size doesn't equal to k

2017-01-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19319: Assignee: (was: Apache Spark) > SparkR Kmeans summary returns error when the cluster

[jira] [Assigned] (SPARK-18750) spark should be able to control the number of executor and should not throw stack overslow

2017-01-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18750: Assignee: Apache Spark > spark should be able to control the number of executor and

[jira] [Commented] (SPARK-18750) spark should be able to control the number of executor and should not throw stack overslow

2017-01-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832703#comment-15832703 ] Apache Spark commented on SPARK-18750: -- User 'vanzin' has created a pull request for this issue:

[jira] [Assigned] (SPARK-18750) spark should be able to control the number of executor and should not throw stack overslow

2017-01-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18750: Assignee: (was: Apache Spark) > spark should be able to control the number of

[jira] [Commented] (SPARK-19316) Spark event logs are huge compared to 1.5.2

2017-01-20 Thread Jisoo Kim (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832710#comment-15832710 ] Jisoo Kim commented on SPARK-19316: --- I suspect this is due to "SparkListenerTaskEnd" event log having

[jira] [Created] (SPARK-19320) Allow guaranteed amount of GPU to be used when launching jobs

2017-01-20 Thread Timothy Chen (JIRA)
Timothy Chen created SPARK-19320: Summary: Allow guaranteed amount of GPU to be used when launching jobs Key: SPARK-19320 URL: https://issues.apache.org/jira/browse/SPARK-19320 Project: Spark

[jira] [Resolved] (SPARK-19267) Fix a race condition when stopping StateStore

2017-01-20 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das resolved SPARK-19267. --- Resolution: Fixed Fix Version/s: 3.0.0 2.1.1 Issue resolved by

[jira] [Resolved] (SPARK-19305) partitioned table should always put partition columns at the end of table schema

2017-01-20 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-19305. - Resolution: Fixed Issue resolved by pull request 16655

[jira] [Commented] (SPARK-19288) Failure (at test_sparkSQL.R#1300): date functions on a DataFrame in R/run-tests.sh

2017-01-20 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832837#comment-15832837 ] Felix Cheung commented on SPARK-19288: -- hmm, that's odd. what system and R version? I'm wondering if

[jira] [Assigned] (SPARK-18788) Add getNumPartitions() to SparkR

2017-01-20 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung reassigned SPARK-18788: Assignee: Felix Cheung > Add getNumPartitions() to SparkR >

[jira] [Assigned] (SPARK-18788) Add getNumPartitions() to SparkR

2017-01-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18788: Assignee: (was: Apache Spark) > Add getNumPartitions() to SparkR >

[jira] [Assigned] (SPARK-18788) Add getNumPartitions() to SparkR

2017-01-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18788: Assignee: Apache Spark > Add getNumPartitions() to SparkR >

[jira] [Commented] (SPARK-18788) Add getNumPartitions() to SparkR

2017-01-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832873#comment-15832873 ] Apache Spark commented on SPARK-18788: -- User 'felixcheung' has created a pull request for this

[jira] [Updated] (SPARK-19155) MLlib GeneralizedLinearRegression family and link should case insensitive

2017-01-20 Thread Yanbo Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanbo Liang updated SPARK-19155: Description: ML {{GeneralizedLinearRegression}} should support both uppercase and lowercase. For

[jira] [Assigned] (SPARK-19155) MLlib GeneralizedLinearRegression family and link should case insensitive

2017-01-20 Thread Yanbo Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanbo Liang reassigned SPARK-19155: --- Assignee: Yanbo Liang > MLlib GeneralizedLinearRegression family and link should case

[jira] [Comment Edited] (SPARK-19311) UDFs disregard UDT type hierarchy

2017-01-20 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15831888#comment-15831888 ] Liang-Chi Hsieh edited comment on SPARK-19311 at 1/20/17 3:06 PM: --

[jira] [Commented] (SPARK-19299) Nulls in non nullable columns causes data corruption in parquet

2017-01-20 Thread Jason White (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832006#comment-15832006 ] Jason White commented on SPARK-19299: - Also seeing this same behaviour in Spark 2.0.1 when creating a

[jira] [Commented] (SPARK-19311) UDFs disregard UDT type hierarchy

2017-01-20 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15831888#comment-15831888 ] Liang-Chi Hsieh commented on SPARK-19311: - [~Gregor Moehler] I think you already have the fixing.

[jira] [Commented] (SPARK-19311) UDFs disregard UDT type hierarchy

2017-01-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15831938#comment-15831938 ] Apache Spark commented on SPARK-19311: -- User 'gmoehler' has created a pull request for this issue:

[jira] [Assigned] (SPARK-19311) UDFs disregard UDT type hierarchy

2017-01-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19311: Assignee: (was: Apache Spark) > UDFs disregard UDT type hierarchy >

[jira] [Commented] (SPARK-16683) Group by does not work after multiple joins of the same dataframe

2017-01-20 Thread Andrew Ray (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15831878#comment-15831878 ] Andrew Ray commented on SPARK-16683: I'm working on a solution for this > Group by does not work

[jira] [Assigned] (SPARK-19311) UDFs disregard UDT type hierarchy

2017-01-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19311: Assignee: Apache Spark > UDFs disregard UDT type hierarchy >

[jira] [Commented] (SPARK-17602) PySpark - Performance Optimization Large Size of Broadcast Variable

2017-01-20 Thread Junfeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15831964#comment-15831964 ] Junfeng commented on SPARK-17602: - [~davies] the trouble really is the python worker share mode is not

[jira] [Created] (SPARK-19312) Spark gives wrong error message when failes to create file due to hdfs quota limit.

2017-01-20 Thread Rivkin Andrey (JIRA)
Rivkin Andrey created SPARK-19312: - Summary: Spark gives wrong error message when failes to create file due to hdfs quota limit. Key: SPARK-19312 URL: https://issues.apache.org/jira/browse/SPARK-19312

[jira] [Commented] (SPARK-19307) SPARK-17387 caused ignorance of conf object passed to SparkContext:

2017-01-20 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15831548#comment-15831548 ] Sean Owen commented on SPARK-19307: --- CC [~zjffdu] > SPARK-17387 caused ignorance of conf object passed

[jira] [Resolved] (SPARK-19301) SparkContext is ignoring SparkConf when _jvm is not initialized on spark-submit

2017-01-20 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-19301. --- Resolution: Duplicate > SparkContext is ignoring SparkConf when _jvm is not initialized on >

[jira] [Updated] (SPARK-8273) Driver hangs up when yarn shutdown in client mode

2017-01-20 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-8273: - Assignee: Tao Wang > Driver hangs up when yarn shutdown in client mode >
