[jira] [Comment Edited] (SPARK-25106) A new Kafka consumer gets created for every batch

2018-08-24 Thread Jungtaek Lim (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16591278#comment-16591278 ] Jungtaek Lim edited comment on SPARK-25106 at 8/24/18 7:46 AM: --- I played

[jira] [Commented] (SPARK-25106) A new Kafka consumer gets created for every batch

2018-08-24 Thread Jungtaek Lim (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16591278#comment-16591278 ] Jungtaek Lim commented on SPARK-25106: -- I played with the project and looks like it is affected by 

[jira] [Resolved] (SPARK-25178) Directly ship the StructType objects of the keySchema / valueSchema for xxxHashMapGenerator

2018-08-24 Thread Takuya Ueshin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takuya Ueshin resolved SPARK-25178. --- Resolution: Fixed Assignee: Kazuaki Ishizaki Fix Version/s: 2.4.0 Issue

[jira] [Commented] (SPARK-23698) Spark code contains numerous undefined names in Python 3

2018-08-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16591402#comment-16591402 ] Apache Spark commented on SPARK-23698: -- User 'cclauss' has created a pull request for this issue:

[jira] [Updated] (SPARK-25219) KMeans Clustering - Text Data - Results are incorrect

2018-08-24 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido updated SPARK-25219: Component/s: (was: Spark Submit) ML > KMeans Clustering - Text Data -

[jira] [Created] (SPARK-25222) Spark on Kubernetes Pod Watcher dumps raw container status

2018-08-24 Thread Rob Vesse (JIRA)
Rob Vesse created SPARK-25222: - Summary: Spark on Kubernetes Pod Watcher dumps raw container status Key: SPARK-25222 URL: https://issues.apache.org/jira/browse/SPARK-25222 Project: Spark Issue

[jira] [Commented] (SPARK-25222) Spark on Kubernetes Pod Watcher dumps raw container status

2018-08-24 Thread Rob Vesse (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16591386#comment-16591386 ] Rob Vesse commented on SPARK-25222: --- There is also a similar issue with task failure: {noformat}

[jira] [Commented] (SPARK-25219) KMeans Clustering - Text Data - Results are incorrect

2018-08-24 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16591423#comment-16591423 ] Marco Gaido commented on SPARK-25219: - Hi [~VVasanth], a JIRA like this is very difficult to work

[jira] [Updated] (SPARK-25206) Wrong data may be returned when enable pushdown

2018-08-24 Thread yucai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yucai updated SPARK-25206: -- Attachment: image-2018-08-24-18-05-23-485.png > Wrong data may be returned when enable pushdown >

[jira] [Assigned] (SPARK-25222) Spark on Kubernetes Pod Watcher dumps raw container status

2018-08-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25222: Assignee: Apache Spark > Spark on Kubernetes Pod Watcher dumps raw container status >

[jira] [Assigned] (SPARK-25222) Spark on Kubernetes Pod Watcher dumps raw container status

2018-08-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25222: Assignee: (was: Apache Spark) > Spark on Kubernetes Pod Watcher dumps raw container

[jira] [Commented] (SPARK-25222) Spark on Kubernetes Pod Watcher dumps raw container status

2018-08-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16591434#comment-16591434 ] Apache Spark commented on SPARK-25222: -- User 'rvesse' has created a pull request for this issue:

[jira] [Created] (SPARK-25223) Use a map to store values for NamedLambdaVariable.

2018-08-24 Thread Takuya Ueshin (JIRA)
Takuya Ueshin created SPARK-25223: - Summary: Use a map to store values for NamedLambdaVariable. Key: SPARK-25223 URL: https://issues.apache.org/jira/browse/SPARK-25223 Project: Spark Issue

[jira] [Commented] (SPARK-11215) Add multiple columns support to StringIndexer

2018-08-24 Thread Barry Becker (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16591667#comment-16591667 ] Barry Becker commented on SPARK-11215: -- Is the main motivation for this feature performance? Can

[jira] [Updated] (SPARK-25230) Upper behaves incorrect for string contains "ß"

2018-08-24 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang updated SPARK-25230: Description: How to reproduce: {code:sql} spark-sql> SELECT upper('Haßler'); HASSLER {code}
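For readers unfamiliar with the "ß" case-mapping behaviour this report revolves around, here is a small illustration in Python. It is not Spark's code path and says nothing about what the reporter expected (the description above is truncated). It only shows that Unicode full case mapping expands "ß" to "SS", so the uppercased string grows by one character and lowercasing it does not restore the original.

```python
# Illustration only: Python's str.upper(), not Spark SQL's upper().
original = "Haßler"
uppercased = original.upper()    # full Unicode case mapping expands "ß" to "SS"
round_trip = uppercased.lower()  # lowercasing cannot recover the "ß"

print(uppercased)
print(len(original), len(uppercased))
print(round_trip == original.lower())
```

The length change is the practical point: any code that assumes upper-casing preserves string length or is reversible will be surprised by this input.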

[jira] [Commented] (SPARK-25206) Wrong data may be returned when enable pushdown

2018-08-24 Thread yucai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16591756#comment-16591756 ] yucai commented on SPARK-25206: --- [~cloud_fan] , we need both [https://github.com/apache/spark/pull/21696] 

[jira] [Created] (SPARK-25231) Running a Large Job with Speculation On Causes Executor Heartbeats to Time Out on Driver

2018-08-24 Thread Parth Gandhi (JIRA)
Parth Gandhi created SPARK-25231: Summary: Running a Large Job with Speculation On Causes Executor Heartbeats to Time Out on Driver Key: SPARK-25231 URL: https://issues.apache.org/jira/browse/SPARK-25231

[jira] [Updated] (SPARK-25230) Upper behaves incorrect for string contains "ß"

2018-08-24 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang updated SPARK-25230: Attachment: MySQL.png > Upper behaves incorrect for string contains "ß" >

[jira] [Created] (SPARK-25229) ExternalCatalogUtils.prunePartitionsByFilter throw an AnalysisException when partition name contains upper letter

2018-08-24 Thread Xiaochen Ouyang (JIRA)
Xiaochen Ouyang created SPARK-25229: --- Summary: ExternalCatalogUtils.prunePartitionsByFilter throw an AnalysisException when partition name contains upper letter Key: SPARK-25229 URL:

[jira] [Assigned] (SPARK-25229) ExternalCatalogUtils.prunePartitionsByFilter throw an AnalysisException when partition name contains upper letter

2018-08-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25229: Assignee: (was: Apache Spark) > ExternalCatalogUtils.prunePartitionsByFilter throw

[jira] [Assigned] (SPARK-25229) ExternalCatalogUtils.prunePartitionsByFilter throw an AnalysisException when partition name contains upper letter

2018-08-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25229: Assignee: Apache Spark > ExternalCatalogUtils.prunePartitionsByFilter throw an

[jira] [Commented] (SPARK-25229) ExternalCatalogUtils.prunePartitionsByFilter throw an AnalysisException when partition name contains upper letter

2018-08-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16591640#comment-16591640 ] Apache Spark commented on SPARK-25229: -- User 'ouyangxiaochen' has created a pull request for this

[jira] [Created] (SPARK-25230) Upper behaves incorrect for string contains "ß"

2018-08-24 Thread Yuming Wang (JIRA)
Yuming Wang created SPARK-25230: --- Summary: Upper behaves incorrect for string contains "ß" Key: SPARK-25230 URL: https://issues.apache.org/jira/browse/SPARK-25230 Project: Spark Issue Type:

[jira] [Updated] (SPARK-25206) Wrong data may be returned when enable pushdown

2018-08-24 Thread yucai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yucai updated SPARK-25206: -- Attachment: image-2018-08-24-22-33-03-231.png > Wrong data may be returned when enable pushdown >

[jira] [Updated] (SPARK-25206) Wrong data may be returned when enable pushdown

2018-08-24 Thread yucai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yucai updated SPARK-25206: -- Attachment: image-2018-08-24-22-46-05-346.png > Wrong data may be returned when enable pushdown >

[jira] [Updated] (SPARK-25029) Scala 2.12 issues: TaskNotSerializable and Janino "Two non-abstract methods ..." errors

2018-08-24 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-25029: -- Priority: Major (was: Blocker) > Scala 2.12 issues: TaskNotSerializable and Janino "Two non-abstract

[jira] [Resolved] (SPARK-25029) Scala 2.12 issues: TaskNotSerializable and Janino "Two non-abstract methods ..." errors

2018-08-24 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-25029. --- Resolution: Fixed Assignee: Sean Owen Fix Version/s: 2.4.0 > Scala 2.12 issues:

[jira] [Updated] (SPARK-25047) Can't assign SerializedLambda to scala.Function1 in deserialization of BucketedRandomProjectionLSHModel

2018-08-24 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-25047: -- Docs Text: Release Notes text: In Scala 2.12, in some rare cases, Spark jobs will fail with an error

[jira] [Updated] (SPARK-25230) Upper behaves incorrect for string contains "ß"

2018-08-24 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang updated SPARK-25230: Attachment: WechatIMG511.jpeg > Upper behaves incorrect for string contains "ß" >

[jira] [Updated] (SPARK-25230) Upper behaves incorrect for string contains "ß"

2018-08-24 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang updated SPARK-25230: Attachment: (was: WechatIMG511.jpeg) > Upper behaves incorrect for string contains "ß" >

[jira] [Updated] (SPARK-25230) Upper behaves incorrect for string contains "ß"

2018-08-24 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang updated SPARK-25230: Attachment: Teradata.jpeg > Upper behaves incorrect for string contains "ß" >

[jira] [Updated] (SPARK-25206) Wrong data may be returned when enable pushdown

2018-08-24 Thread yucai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yucai updated SPARK-25206: -- Attachment: image-2018-08-24-22-34-11-539.png > Wrong data may be returned when enable pushdown >

[jira] [Assigned] (SPARK-25231) Running a Large Job with Speculation On Causes Executor Heartbeats to Time Out on Driver

2018-08-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25231: Assignee: (was: Apache Spark) > Running a Large Job with Speculation On Causes

[jira] [Commented] (SPARK-25231) Running a Large Job with Speculation On Causes Executor Heartbeats to Time Out on Driver

2018-08-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16591818#comment-16591818 ] Apache Spark commented on SPARK-25231: -- User 'pgandhi999' has created a pull request for this

[jira] [Assigned] (SPARK-25073) Spark-submit on Yarn Task : When the yarn.nodemanager.resource.memory-mb and/or yarn.scheduler.maximum-allocation-mb is insufficient, Spark always reports an error requ

2018-08-24 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen reassigned SPARK-25073: - Assignee: Sujith > Spark-submit on Yarn Task : When the yarn.nodemanager.resource.memory-mb >

[jira] [Resolved] (SPARK-25073) Spark-submit on Yarn Task : When the yarn.nodemanager.resource.memory-mb and/or yarn.scheduler.maximum-allocation-mb is insufficient, Spark always reports an error requ

2018-08-24 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-25073. --- Resolution: Fixed Fix Version/s: 2.4.0 Issue resolved by pull request 22199

[jira] [Updated] (SPARK-25230) Upper behaves incorrect for string contains "ß"

2018-08-24 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang updated SPARK-25230: Description: How to reproduce: {code:sql} spark-sql> SELECT upper('Haßler'); HASSLER {code}

[jira] [Updated] (SPARK-25230) Upper behavior incorrect for string contains "ß"

2018-08-24 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang updated SPARK-25230: Summary: Upper behavior incorrect for string contains "ß" (was: Upper behaves incorrect for

[jira] [Updated] (SPARK-25073) Spark-submit on Yarn Task : When the yarn.nodemanager.resource.memory-mb and/or yarn.scheduler.maximum-allocation-mb is insufficient, Spark always reports an error reque

2018-08-24 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-25073: -- Priority: Trivial (was: Minor) Issue Type: Improvement (was: Bug) This ends up just being a

[jira] [Updated] (SPARK-25029) Scala 2.12 issues: TaskNotSerializable and Janino "Two non-abstract methods ..." errors

2018-08-24 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-25029: -- Docs Text: Release Notes text: Because of differences in how Scala 2.12 serializes closures, you may

[jira] [Updated] (SPARK-25230) Upper behavior incorrect for string contains "ß"

2018-08-24 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang updated SPARK-25230: Description: How to reproduce: {code:sql} spark-sql> SELECT upper('Haßler'); HASSLER {code}

[jira] [Updated] (SPARK-25230) Upper behaves incorrect for string contains "ß"

2018-08-24 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang updated SPARK-25230: Description: How to reproduce: {code:sql} spark-sql> SELECT upper('Haßler'); HASSLER {code}

[jira] [Updated] (SPARK-25206) Wrong data may be returned when enable pushdown

2018-08-24 Thread yucai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yucai updated SPARK-25206: -- Attachment: pr22183.png > Wrong data may be returned when enable pushdown >

[jira] [Updated] (SPARK-25230) Upper behaves incorrect for string contains "ß"

2018-08-24 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang updated SPARK-25230: Attachment: Oracle.png > Upper behaves incorrect for string contains "ß" >

[jira] [Updated] (SPARK-25230) Upper behaves incorrect for string contains "ß"

2018-08-24 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang updated SPARK-25230: Description: How to reproduce: {code:sql} spark-sql> SELECT upper('Haßler'); HASSLER {code}

[jira] [Assigned] (SPARK-25231) Running a Large Job with Speculation On Causes Executor Heartbeats to Time Out on Driver

2018-08-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25231: Assignee: Apache Spark > Running a Large Job with Speculation On Causes Executor

[jira] [Commented] (SPARK-25233) Give the user the option of specifying a fixed minimum message per partition per batch when using kafka direct API with backpressure

2018-08-24 Thread Reza Safi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16591889#comment-16591889 ] Reza Safi commented on SPARK-25233: --- I will send a PR shortly for this. > Give the user the option of

[jira] [Created] (SPARK-25233) Give the user the option of specifying a fixed minimum message per partition per batch when using kafka direct API with backpressure

2018-08-24 Thread Reza Safi (JIRA)
Reza Safi created SPARK-25233: - Summary: Give the user the option of specifying a fixed minimum message per partition per batch when using kafka direct API with backpressure Key: SPARK-25233 URL:

[jira] [Assigned] (SPARK-24090) Kubernetes Backend Hotlist for Spark 2.4

2018-08-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24090: Assignee: Anirudh Ramanathan (was: Apache Spark) > Kubernetes Backend Hotlist for Spark

[jira] [Commented] (SPARK-7768) Make user-defined type (UDT) API public

2018-08-24 Thread Alexander (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16592116#comment-16592116 ] Alexander commented on SPARK-7768: -- [~pgrandjean], are you thinking of writing a library for that? :) >

[jira] [Created] (SPARK-25234) SparkR:::parallelize doesn't handle integer overflow properly

2018-08-24 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-25234: - Summary: SparkR:::parallelize doesn't handle integer overflow properly Key: SPARK-25234 URL: https://issues.apache.org/jira/browse/SPARK-25234 Project: Spark

[jira] [Assigned] (SPARK-24090) Kubernetes Backend Hotlist for Spark 2.4

2018-08-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24090: Assignee: Apache Spark (was: Anirudh Ramanathan) > Kubernetes Backend Hotlist for Spark

[jira] [Commented] (SPARK-24090) Kubernetes Backend Hotlist for Spark 2.4

2018-08-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16592001#comment-16592001 ] Apache Spark commented on SPARK-24090: -- User 'liyinan926' has created a pull request for this

[jira] [Commented] (SPARK-24391) to_json/from_json should support arrays of primitives, and more generally all JSON

2018-08-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16592083#comment-16592083 ] Apache Spark commented on SPARK-24391: -- User 'MaxGekk' has created a pull request for this issue:

[jira] [Updated] (SPARK-25234) SparkR:::parallelize doesn't handle integer overflow properly

2018-08-24 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-25234: -- Description: parallelize uses integer multiplication, which cannot handle size over ~47000.
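The "~47000" figure in the description above is consistent with a signed 32-bit integer product overflowing. The sketch below checks the arithmetic in Python. It assumes, purely for illustration, that the product involves two factors of roughly that size; the truncated description does not say which values SparkR actually multiplies, and the helper name `fits_in_int32` is hypothetical.

```python
import math

INT32_MAX = 2**31 - 1  # largest value a signed 32-bit integer can hold


def fits_in_int32(n: int) -> bool:
    """Return True if n * n is representable as a signed 32-bit integer."""
    return n * n <= INT32_MAX


# Largest n whose square still fits (assumption: both factors are about n).
threshold = math.isqrt(INT32_MAX)

print(threshold)
print(fits_in_int32(threshold), fits_in_int32(threshold + 1))
```

Python integers are arbitrary precision, so the product itself never overflows here; the comparison against `INT32_MAX` stands in for what a fixed-width 32-bit integer type would be unable to represent.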

[jira] [Assigned] (SPARK-25234) SparkR:::parallelize doesn't handle integer overflow properly

2018-08-24 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng reassigned SPARK-25234: - Assignee: Xiangrui Meng > SparkR:::parallelize doesn't handle integer overflow

[jira] [Comment Edited] (SPARK-7768) Make user-defined type (UDT) API public

2018-08-24 Thread Alexander (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16592116#comment-16592116 ] Alexander edited comment on SPARK-7768 at 8/24/18 8:01 PM: --- [~pgrandjean], are

[jira] [Created] (SPARK-25232) Support Full-Text Search in Spark SQL

2018-08-24 Thread Lijie Xu (JIRA)
Lijie Xu created SPARK-25232: Summary: Support Full-Text Search in Spark SQL Key: SPARK-25232 URL: https://issues.apache.org/jira/browse/SPARK-25232 Project: Spark Issue Type: New Feature

[jira] [Commented] (SPARK-25083) remove the type erasure hack in data source scan

2018-08-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16591849#comment-16591849 ] Apache Spark commented on SPARK-25083: -- User 'xuanyuanking' has created a pull request for this

[jira] [Commented] (SPARK-10795) FileNotFoundException while deploying pyspark job on cluster

2018-08-24 Thread Furcy Pin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16591868#comment-16591868 ] Furcy Pin commented on SPARK-10795: --- Hi, I came across this ticket with the same issue: my yarn job

[jira] [Assigned] (SPARK-25233) Give the user the option of specifying a fixed minimum message per partition per batch when using kafka direct API with backpressure

2018-08-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25233: Assignee: Apache Spark > Give the user the option of specifying a fixed minimum message

[jira] [Assigned] (SPARK-25233) Give the user the option of specifying a fixed minimum message per partition per batch when using kafka direct API with backpressure

2018-08-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25233: Assignee: (was: Apache Spark) > Give the user the option of specifying a fixed

[jira] [Commented] (SPARK-25233) Give the user the option of specifying a fixed minimum message per partition per batch when using kafka direct API with backpressure

2018-08-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16591904#comment-16591904 ] Apache Spark commented on SPARK-25233: -- User 'rezasafi' has created a pull request for this issue:

[jira] [Commented] (SPARK-25234) SparkR:::parallelize doesn't handle integer overflow properly

2018-08-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16592004#comment-16592004 ] Apache Spark commented on SPARK-25234: -- User 'mengxr' has created a pull request for this issue:

[jira] [Commented] (SPARK-19335) Spark should support doing an efficient DataFrame Upsert via JDBC

2018-08-24 Thread drew zoellner (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16592003#comment-16592003 ] drew zoellner commented on SPARK-19335: --- + 1 , is this still in progress? > Spark should support

[jira] [Commented] (SPARK-25106) A new Kafka consumer gets created for every batch

2018-08-24 Thread Alexis Seigneurin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16591834#comment-16591834 ] Alexis Seigneurin commented on SPARK-25106: --- I just built the code from

[jira] [Assigned] (SPARK-25083) remove the type erasure hack in data source scan

2018-08-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25083: Assignee: (was: Apache Spark) > remove the type erasure hack in data source scan >

[jira] [Assigned] (SPARK-25083) remove the type erasure hack in data source scan

2018-08-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25083: Assignee: Apache Spark > remove the type erasure hack in data source scan >

[jira] [Comment Edited] (SPARK-7768) Make user-defined type (UDT) API public

2018-08-24 Thread Alexander (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16592128#comment-16592128 ] Alexander edited comment on SPARK-7768 at 8/24/18 8:22 PM: --- I've also noticed

[jira] [Assigned] (SPARK-25174) ApplicationMaster suspends when unregistering itself from RM with extreme large diagnostic message

2018-08-24 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin reassigned SPARK-25174: -- Assignee: Kent Yao > ApplicationMaster suspends when unregistering itself from RM

[jira] [Resolved] (SPARK-25174) ApplicationMaster suspends when unregistering itself from RM with extreme large diagnostic message

2018-08-24 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-25174. Resolution: Fixed Fix Version/s: 2.4.0 Issue resolved by pull request 22180

[jira] [Resolved] (SPARK-19094) Plumb through logging/error messages from the JVM to Jupyter PySpark

2018-08-24 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk resolved SPARK-19094. - Resolution: Won't Fix No longer as important given other changes. > Plumb through logging/error

[jira] [Commented] (SPARK-25206) Wrong data may be returned for Parquet

2018-08-24 Thread yucai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16592392#comment-16592392 ] yucai commented on SPARK-25206: --- [~dongjoon] , the reason you see `null` without predicate pushdown, it is

[jira] [Comment Edited] (SPARK-7768) Make user-defined type (UDT) API public

2018-08-24 Thread Alexander (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16592128#comment-16592128 ] Alexander edited comment on SPARK-7768 at 8/24/18 8:21 PM: --- I've also noticed

[jira] [Comment Edited] (SPARK-7768) Make user-defined type (UDT) API public

2018-08-24 Thread Alexander (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16592128#comment-16592128 ] Alexander edited comment on SPARK-7768 at 8/24/18 8:21 PM: --- I've also noticed

[jira] [Comment Edited] (SPARK-7768) Make user-defined type (UDT) API public

2018-08-24 Thread Alexander (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16592128#comment-16592128 ] Alexander edited comment on SPARK-7768 at 8/24/18 8:22 PM: --- I've also noticed

[jira] [Commented] (SPARK-7768) Make user-defined type (UDT) API public

2018-08-24 Thread Alexander (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16592128#comment-16592128 ] Alexander commented on SPARK-7768: -- I've also noticed that there are some idiosyncracies in the

[jira] [Commented] (SPARK-25124) VectorSizeHint.size is buggy, breaking streaming pipeline

2018-08-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16592174#comment-16592174 ] Apache Spark commented on SPARK-25124: -- User 'huaxingao' has created a pull request for this issue:

[jira] [Resolved] (SPARK-25106) A new Kafka consumer gets created for every batch

2018-08-24 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-25106. -- Resolution: Duplicate Thanks for reporting this. I'm closing this as a duplicate of

[jira] [Comment Edited] (SPARK-25206) Wrong data may be returned when enable pushdown for Parquet

2018-08-24 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16592320#comment-16592320 ] Dongjoon Hyun edited comment on SPARK-25206 at 8/24/18 11:48 PM: - +1 for

[jira] [Comment Edited] (SPARK-25206) Wrong data may be returned when enable pushdown for Parquet

2018-08-24 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16592320#comment-16592320 ] Dongjoon Hyun edited comment on SPARK-25206 at 8/24/18 11:45 PM: - +1 for

[jira] [Resolved] (SPARK-25223) Use a map to store values for NamedLambdaVariable.

2018-08-24 Thread Takuya Ueshin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takuya Ueshin resolved SPARK-25223. --- Resolution: Won't Do > Use a map to store values for NamedLambdaVariable. >

[jira] [Updated] (SPARK-25206) Wrong data may be returned for Parquet

2018-08-24 Thread yucai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yucai updated SPARK-25206: -- Attachment: image-2018-08-25-10-04-21-901.png > Wrong data may be returned for Parquet >

[jira] [Commented] (SPARK-25175) Case-insensitive field resolution when reading from ORC

2018-08-24 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16592419#comment-16592419 ] Dongjoon Hyun commented on SPARK-25175: --- [~seancxmao]. I know you are working, but could you give

[jira] [Created] (SPARK-25236) Investigate using a logging library inside of PySpark on the workers instead of print

2018-08-24 Thread holdenk (JIRA)
holdenk created SPARK-25236: --- Summary: Investigate using a logging library inside of PySpark on the workers instead of print Key: SPARK-25236 URL: https://issues.apache.org/jira/browse/SPARK-25236 Project:

[jira] [Comment Edited] (SPARK-25206) Wrong data may be returned when enable pushdown for Parquet

2018-08-24 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16592320#comment-16592320 ] Dongjoon Hyun edited comment on SPARK-25206 at 8/24/18 11:52 PM: - +1 for

[jira] [Commented] (SPARK-25206) Wrong data may be returned for Parquet

2018-08-24 Thread yucai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16592384#comment-16592384 ] yucai commented on SPARK-25206: --- [~dongjoon], I still think this bug is related to pushdown, but

[jira] [Commented] (SPARK-25206) Wrong data may be returned for Parquet

2018-08-24 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16592403#comment-16592403 ] Dongjoon Hyun commented on SPARK-25206: --- If this is only reporting SPARK-25132, we had better

[jira] [Updated] (SPARK-25132) Case-insensitive field resolution when reading from Parquet

2018-08-24 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-25132: -- Affects Version/s: 2.2.0 > Case-insensitive field resolution when reading from Parquet >

[jira] [Commented] (SPARK-25214) Kafka v2 source may return duplicated records when `failOnDataLoss` is `false`

2018-08-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16592213#comment-16592213 ] Apache Spark commented on SPARK-25214: -- User 'zsxwing' has created a pull request for this issue:

[jira] [Resolved] (SPARK-25234) SparkR:::parallelize doesn't handle integer overflow properly

2018-08-24 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-25234. --- Resolution: Fixed Fix Version/s: 2.3.2 2.4.0 Issue resolved by

[jira] [Resolved] (SPARK-25229) ExternalCatalogUtils.prunePartitionsByFilter throw an AnalysisException when partition name contains upper letter

2018-08-24 Thread Xiaochen Ouyang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaochen Ouyang resolved SPARK-25229. - Resolution: Not A Bug > ExternalCatalogUtils.prunePartitionsByFilter throw an

[jira] [Updated] (SPARK-25206) Wrong data may be returned for Parquet

2018-08-24 Thread yucai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yucai updated SPARK-25206: -- Attachment: image-2018-08-25-09-54-53-219.png > Wrong data may be returned for Parquet >

[jira] [Commented] (SPARK-25206) Wrong data may be returned for Parquet

2018-08-24 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16592402#comment-16592402 ] Dongjoon Hyun commented on SPARK-25206: --- Yes. That's my point. This is a simple duplication of

[jira] [Assigned] (SPARK-25202) SQL Function Split Should Respect Limit Argument

2018-08-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25202: Assignee: Apache Spark > SQL Function Split Should Respect Limit Argument >

[jira] [Commented] (SPARK-25202) SQL Function Split Should Respect Limit Argument

2018-08-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16592156#comment-16592156 ] Apache Spark commented on SPARK-25202: -- User 'phegstrom' has created a pull request for this issue:

[jira] [Assigned] (SPARK-25202) SQL Function Split Should Respect Limit Argument

2018-08-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25202: Assignee: (was: Apache Spark) > SQL Function Split Should Respect Limit Argument >

[jira] [Updated] (SPARK-25206) Wrong data may be returned when enable pushdown for Parquet

2018-08-24 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-25206: -- Description: In current Spark 2.3.1, below query returns wrong data silently. {code:java}
