[jira] [Created] (SPARK-25339) Refactor FilterPushdownBenchmark to use main method

2018-09-04 Thread Yuming Wang (JIRA)
Yuming Wang created SPARK-25339: --- Summary: Refactor FilterPushdownBenchmark to use main method Key: SPARK-25339 URL: https://issues.apache.org/jira/browse/SPARK-25339 Project: Spark Issue

[jira] [Resolved] (SPARK-25336) Revert SPARK-24863 and SPARK-24748

2018-09-04 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-25336. Resolution: Fixed. Fix Version/s: 2.4.0. Issue resolved by pull request 22334.

[jira] [Commented] (SPARK-19145) Timestamp to String casting is slowing the query significantly

2018-09-04 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16603937#comment-16603937 ] Yuming Wang commented on SPARK-19145: - How about we target this ticket and

[jira] [Created] (SPARK-25338) Several tests miss calling super.afterAll() in their afterAll() method

2018-09-04 Thread Kazuaki Ishizaki (JIRA)
Kazuaki Ishizaki created SPARK-25338: Summary: Several tests miss calling super.afterAll() in their afterAll() method Key: SPARK-25338 URL: https://issues.apache.org/jira/browse/SPARK-25338

[jira] [Commented] (SPARK-25306) Avoid skewed filter trees to speed up `createFilter` in ORC

2018-09-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16603926#comment-16603926 ] Apache Spark commented on SPARK-25306: -- User 'dongjoon-hyun' has created a pull request for this

[jira] [Commented] (SPARK-25306) Avoid skewed filter trees to speed up `createFilter` in ORC

2018-09-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16603921#comment-16603921 ] Apache Spark commented on SPARK-25306: -- User 'dongjoon-hyun' has created a pull request for this

[jira] [Commented] (SPARK-25091) UNCACHE TABLE, CLEAR CACHE, rdd.unpersist() does not clean up executor memory

2018-09-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16603912#comment-16603912 ] Apache Spark commented on SPARK-25091: -- User 'cfangplus' has created a pull request for this issue:

[jira] [Assigned] (SPARK-25091) UNCACHE TABLE, CLEAR CACHE, rdd.unpersist() does not clean up executor memory

2018-09-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25091: Assignee: Apache Spark > UNCACHE TABLE, CLEAR CACHE, rdd.unpersist() does not clean up

[jira] [Assigned] (SPARK-25091) UNCACHE TABLE, CLEAR CACHE, rdd.unpersist() does not clean up executor memory

2018-09-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25091: Assignee: (was: Apache Spark) > UNCACHE TABLE, CLEAR CACHE, rdd.unpersist() does not

[jira] [Commented] (SPARK-25337) HiveExternalCatalogVersionsSuite + Scala 2.12 = NoSuchMethodError: org.apache.spark.sql.execution.datasources.FileFormat.$init$(Lorg/apache/spark/sql/execution/datasou

2018-09-04 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16603865#comment-16603865 ] Dongjoon Hyun commented on SPARK-25337: --- [~srowen]. I reproduced this locally. The failure occurs

[jira] [Updated] (SPARK-25337) HiveExternalCatalogVersionsSuite + Scala 2.12 = NoSuchMethodError: org.apache.spark.sql.execution.datasources.FileFormat.$init$(Lorg/apache/spark/sql/execution/datasourc

2018-09-04 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-25337: -- Description: Observed in the Scala 2.12 pull request builder consistently now. I don't see

[jira] [Closed] (SPARK-24256) ExpressionEncoder should support user-defined types as fields of Scala case class and tuple

2018-09-04 Thread Fangshi Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fangshi Li closed SPARK-24256. -- > ExpressionEncoder should support user-defined types as fields of Scala case > class and tuple >

[jira] [Resolved] (SPARK-25300) Unified the configuration parameter `spark.shuffle.service.enabled`

2018-09-04 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-25300. Resolution: Fixed. Fix Version/s: 2.4.0. Issue resolved by pull request 22306.

[jira] [Assigned] (SPARK-25300) Unified the configuration parameter `spark.shuffle.service.enabled`

2018-09-04 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-25300: --- Assignee: liuxian > Unified the configuration parameter `spark.shuffle.service.enabled` >

[jira] [Updated] (SPARK-25306) Avoid skewed filter trees to speed up `createFilter` in ORC

2018-09-04 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-25306: -- Description: In both ORC data sources, createFilter function has exponential time complexity

[jira] [Updated] (SPARK-25306) Avoid skewed filter trees to speed up `createFilter` in ORC

2018-09-04 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-25306: -- Summary: Avoid skewed filter trees to speed up `createFilter` in ORC (was: Use cache to

[jira] [Resolved] (SPARK-25306) Use cache to speed up `createFilter` in ORC

2018-09-04 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-25306. Resolution: Fixed. Fix Version/s: 2.4.0. Issue resolved by pull request 22313.

[jira] [Assigned] (SPARK-25306) Use cache to speed up `createFilter` in ORC

2018-09-04 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-25306: --- Assignee: Dongjoon Hyun > Use cache to speed up `createFilter` in ORC >

[jira] [Commented] (SPARK-25299) Use remote storage for persisting shuffle data

2018-09-04 Thread Matt Cheah (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16603807#comment-16603807 ] Matt Cheah commented on SPARK-25299: (Changed the title to "remote storage" for a little more

[jira] [Updated] (SPARK-25299) Use remote storage for persisting shuffle data

2018-09-04 Thread Matt Cheah (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Cheah updated SPARK-25299: --- Summary: Use remote storage for persisting shuffle data (was: Use distributed storage for

[jira] [Comment Edited] (SPARK-19145) Timestamp to String casting is slowing the query significantly

2018-09-04 Thread Aaron Hiniker (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16603803#comment-16603803 ] Aaron Hiniker edited comment on SPARK-19145 at 9/5/18 1:20 AM: --- I found

[jira] [Commented] (SPARK-19145) Timestamp to String casting is slowing the query significantly

2018-09-04 Thread Aaron Hiniker (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16603803#comment-16603803 ] Aaron Hiniker commented on SPARK-19145: --- I found another (potentially huge) performance impact 

[jira] [Resolved] (SPARK-24256) ExpressionEncoder should support user-defined types as fields of Scala case class and tuple

2018-09-04 Thread Fangshi Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fangshi Li resolved SPARK-24256. Resolution: Won't Fix > ExpressionEncoder should support user-defined types as fields of Scala

[jira] [Commented] (SPARK-24256) ExpressionEncoder should support user-defined types as fields of Scala case class and tuple

2018-09-04 Thread Fangshi Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16603800#comment-16603800 ] Fangshi Li commented on SPARK-24256: To summarize our discussion for this pr: Spark-avro is now

[jira] [Updated] (SPARK-25332) Instead of broadcast hash join ,Sort merge join has selected when restart spark-shell/spark-JDBC for hive provider

2018-09-04 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takeshi Yamamuro updated SPARK-25332: - Issue Type: Improvement (was: Bug) > Instead of broadcast hash join ,Sort merge join

[jira] [Commented] (SPARK-25332) Instead of broadcast hash join ,Sort merge join has selected when restart spark-shell/spark-JDBC for hive provider

2018-09-04 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16603761#comment-16603761 ] Takeshi Yamamuro commented on SPARK-25332: -- Probably, you need to describe more about this

[jira] [Commented] (SPARK-25258) Upgrade kryo package to version 4.0.2

2018-09-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16603712#comment-16603712 ] Apache Spark commented on SPARK-25258: -- User 'wangyum' has created a pull request for this issue:

[jira] [Assigned] (SPARK-25258) Upgrade kryo package to version 4.0.2

2018-09-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25258: Assignee: Apache Spark > Upgrade kryo package to version 4.0.2 >

[jira] [Commented] (SPARK-25258) Upgrade kryo package to version 4.0.2

2018-09-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16603710#comment-16603710 ] Apache Spark commented on SPARK-25258: -- User 'wangyum' has created a pull request for this issue:

[jira] [Assigned] (SPARK-25258) Upgrade kryo package to version 4.0.2

2018-09-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25258: Assignee: (was: Apache Spark) > Upgrade kryo package to version 4.0.2 >

[jira] [Updated] (SPARK-25337) HiveExternalCatalogVersionsSuite + Scala 2.12 = NoSuchMethodError: org.apache.spark.sql.execution.datasources.FileFormat.$init$(Lorg/apache/spark/sql/execution/datasourc

2018-09-04 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-25337: -- Description: Observed in the Scala 2.12 pull request builder consistently now. I don't see

[jira] [Commented] (SPARK-25337) HiveExternalCatalogVersionsSuite + Scala 2.12 = NoSuchMethodError: org.apache.spark.sql.execution.datasources.FileFormat.$init$(Lorg/apache/spark/sql/execution/datasou

2018-09-04 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16603663#comment-16603663 ] Dongjoon Hyun commented on SPARK-25337: --- I'll take a look, [~srowen]. >

[jira] [Resolved] (SPARK-25297) Future for Scala 2.12 will block on a already shutdown ExecutionContext

2018-09-04 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-25297. --- Resolution: Duplicate > Future for Scala 2.12 will block on a already shutdown ExecutionContext >

[jira] [Commented] (SPARK-24748) Support for reporting custom metrics via Streaming Query Progress

2018-09-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16603652#comment-16603652 ] Apache Spark commented on SPARK-24748: -- User 'zsxwing' has created a pull request for this issue:

[jira] [Commented] (SPARK-24748) Support for reporting custom metrics via Streaming Query Progress

2018-09-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16603651#comment-16603651 ] Apache Spark commented on SPARK-24748: -- User 'zsxwing' has created a pull request for this issue:

[jira] [Commented] (SPARK-24863) Report offset lag as a custom metrics for Kafka structured streaming source

2018-09-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16603649#comment-16603649 ] Apache Spark commented on SPARK-24863: -- User 'zsxwing' has created a pull request for this issue:

[jira] [Commented] (SPARK-24863) Report offset lag as a custom metrics for Kafka structured streaming source

2018-09-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16603645#comment-16603645 ] Apache Spark commented on SPARK-24863: -- User 'zsxwing' has created a pull request for this issue:

[jira] [Commented] (SPARK-25336) Revert SPARK-24863 and SPARK-24748

2018-09-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16603646#comment-16603646 ] Apache Spark commented on SPARK-25336: -- User 'zsxwing' has created a pull request for this issue:

[jira] [Updated] (SPARK-25336) Revert SPARK-24863 and SPARK-24748

2018-09-04 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-25336: - Summary: Revert SPARK-24863 and SPARK-24748 (was: Revert SPARK-24863 and SPARK 24748) >

[jira] [Assigned] (SPARK-25336) Revert SPARK-24863 and SPARK-24748

2018-09-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25336: Assignee: (was: Apache Spark) > Revert SPARK-24863 and SPARK-24748 >

[jira] [Commented] (SPARK-25336) Revert SPARK-24863 and SPARK-24748

2018-09-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16603643#comment-16603643 ] Apache Spark commented on SPARK-25336: -- User 'zsxwing' has created a pull request for this issue:

[jira] [Updated] (SPARK-25336) Revert SPARK-24863 and SPARK-24748

2018-09-04 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-25336: - Description: Revert SPARK-24863 and SPARK-24748. We will revisit them when the data source v2

[jira] [Assigned] (SPARK-25336) Revert SPARK-24863 and SPARK-24748

2018-09-04 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu reassigned SPARK-25336: Assignee: Shixiong Zhu > Revert SPARK-24863 and SPARK-24748 >

[jira] [Assigned] (SPARK-25336) Revert SPARK-24863 and SPARK-24748

2018-09-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25336: Assignee: Apache Spark > Revert SPARK-24863 and SPARK-24748 >

[jira] [Created] (SPARK-25337) HiveExternalCatalogVersionsSuite + Scala 2.12 = NoSuchMethodError: org.apache.spark.sql.execution.datasources.FileFormat.$init$(Lorg/apache/spark/sql/execution/datasourc

2018-09-04 Thread Sean Owen (JIRA)
Sean Owen created SPARK-25337: - Summary: HiveExternalCatalogVersionsSuite + Scala 2.12 = NoSuchMethodError: org.apache.spark.sql.execution.datasources.FileFormat.$init$(Lorg/apache/spark/sql/execution/datasources/FileFormat;) Key:

[jira] [Created] (SPARK-25336) Revert SPARK-24863 and SPARK 24748

2018-09-04 Thread Shixiong Zhu (JIRA)
Shixiong Zhu created SPARK-25336: Summary: Revert SPARK-24863 and SPARK 24748 Key: SPARK-25336 URL: https://issues.apache.org/jira/browse/SPARK-25336 Project: Spark Issue Type: Task

[jira] [Updated] (SPARK-25335) Skip Zinc downloading if it's installed in the system

2018-09-04 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-25335: -- Description: Zinc is 23.5MB. {code} $ curl -LO

[jira] [Updated] (SPARK-25335) Skip Zinc downloading if it's installed in the system

2018-09-04 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-25335: -- Description: Zinc is 23.5MB. {code} $ curl -LO

[jira] [Assigned] (SPARK-25335) Skip Zinc downloading if it's installed in the system

2018-09-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25335: Assignee: (was: Apache Spark) > Skip Zinc downloading if it's installed in the

[jira] [Assigned] (SPARK-25335) Skip Zinc downloading if it's installed in the system

2018-09-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25335: Assignee: Apache Spark > Skip Zinc downloading if it's installed in the system >

[jira] [Commented] (SPARK-25335) Skip Zinc downloading if it's installed in the system

2018-09-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16603600#comment-16603600 ] Apache Spark commented on SPARK-25335: -- User 'dongjoon-hyun' has created a pull request for this

[jira] [Updated] (SPARK-25335) Skip Zinc downloading if it's installed in the system

2018-09-04 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-25335: -- Description: Zinc is 23.5MB. {code} $ curl -LO

[jira] [Updated] (SPARK-25335) Skip Zinc downloading if it's installed in the system

2018-09-04 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-25335: -- Priority: Minor (was: Major) > Skip Zinc downloading if it's installed in the system >

[jira] [Created] (SPARK-25335) Skip Zinc downloading if it's installed in the system

2018-09-04 Thread Dongjoon Hyun (JIRA)
Dongjoon Hyun created SPARK-25335: - Summary: Skip Zinc downloading if it's installed in the system Key: SPARK-25335 URL: https://issues.apache.org/jira/browse/SPARK-25335 Project: Spark

[jira] [Updated] (SPARK-25333) Ability to add new columns in Dataset in a user-defined position

2018-09-04 Thread Walid Mellouli (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walid Mellouli updated SPARK-25333: --- Description: When we add new columns in a Dataset, they are added automatically at the end

[jira] [Commented] (SPARK-24316) Spark sql queries stall for column width more than 6k for parquet based table

2018-09-04 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16603590#comment-16603590 ] Ruslan Dautkhanov commented on SPARK-24316: --- Thanks [~bersprockets]  Is cloudera

[jira] [Updated] (SPARK-23131) Kryo raises StackOverflow during serializing GLR model

2018-09-04 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-23131: -- Summary: Kryo raises StackOverflow during serializing GLR model (was: Stackoverflow using ML

[jira] [Commented] (SPARK-25258) Upgrade kryo package to version 4.0.2

2018-09-04 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16603466#comment-16603466 ] Dongjoon Hyun commented on SPARK-25258: --- [~yumwang]. You wrote that you had submitted PR, but you

[jira] [Updated] (SPARK-25258) Upgrade kryo package to version 4.0.2

2018-09-04 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-25258: -- Summary: Upgrade kryo package to version 4.0.2 (was: Upgrade kryo package to version 4.0.2+)

[jira] [Commented] (SPARK-24316) Spark sql queries stall for column width more than 6k for parquet based table

2018-09-04 Thread Bruce Robbins (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16603453#comment-16603453 ] Bruce Robbins commented on SPARK-24316: --- This is likely SPARK-25164. > Spark sql queries stall

[jira] [Commented] (SPARK-25334) Default SessionCatalog should support UDFs

2018-09-04 Thread Tomasz Gawęda (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16603341#comment-16603341 ] Tomasz Gawęda commented on SPARK-25334: --- If committers say it's not very important, I can start

[jira] [Updated] (SPARK-25334) Default SessionCatalog should support UDFs

2018-09-04 Thread Tomasz Gawęda (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tomasz Gawęda updated SPARK-25334: -- Summary: Default SessionCatalog should support UDFs (was: Default SessionCatalog doesn't

[jira] [Created] (SPARK-25334) Default SessionCatalog doesn't support UDFs

2018-09-04 Thread Tomasz Gawęda (JIRA)
Tomasz Gawęda created SPARK-25334: - Summary: Default SessionCatalog doesn't support UDFs Key: SPARK-25334 URL: https://issues.apache.org/jira/browse/SPARK-25334 Project: Spark Issue Type:

[jira] [Updated] (SPARK-25333) Ability to add new columns in the beginning of a Dataset

2018-09-04 Thread Walid Mellouli (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walid Mellouli updated SPARK-25333: --- External issue URL: https://github.com/apache/spark/pull/22332 Labels:

[jira] [Assigned] (SPARK-25333) Ability to add new columns in the beginning of a Dataset

2018-09-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25333: Assignee: (was: Apache Spark) > Ability to add new columns in the beginning of a

[jira] [Commented] (SPARK-25333) Ability to add new columns in the beginning of a Dataset

2018-09-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16603330#comment-16603330 ] Apache Spark commented on SPARK-25333: -- User 'wmellouli' has created a pull request for this issue:

[jira] [Assigned] (SPARK-25333) Ability to add new columns in the beginning of a Dataset

2018-09-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25333: Assignee: Apache Spark > Ability to add new columns in the beginning of a Dataset >

[jira] [Commented] (SPARK-25333) Ability to add new columns in the beginning of a Dataset

2018-09-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16603329#comment-16603329 ] Apache Spark commented on SPARK-25333: -- User 'wmellouli' has created a pull request for this issue:

[jira] [Resolved] (SPARK-25248) Audit barrier APIs for Spark 2.4

2018-09-04 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-25248. Resolution: Fixed. Fix Version/s: 2.4.0. Issue resolved by pull request 22240.

[jira] [Created] (SPARK-25333) Ability to add new columns in the beginning of a Dataset

2018-09-04 Thread Walid Mellouli (JIRA)
Walid Mellouli created SPARK-25333: -- Summary: Ability to add new columns in the beginning of a Dataset Key: SPARK-25333 URL: https://issues.apache.org/jira/browse/SPARK-25333 Project: Spark

[jira] [Created] (SPARK-25332) Instead of broadcast hash join ,Sort merge join has selected when restart spark-shell/spark-JDBC for hive provider

2018-09-04 Thread Babulal (JIRA)
Babulal created SPARK-25332: --- Summary: Instead of broadcast hash join ,Sort merge join has selected when restart spark-shell/spark-JDBC for hive provider Key: SPARK-25332 URL:

[jira] [Assigned] (SPARK-22666) Spark datasource for image format

2018-09-04 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng reassigned SPARK-22666: - Assignee: Weichen Xu > Spark datasource for image format >

[jira] [Commented] (SPARK-25317) MemoryBlock performance regression

2018-09-04 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16603174#comment-16603174 ] Marco Gaido commented on SPARK-25317: - I think I have a fix for this. I can submit a PR if you want,

[jira] [Commented] (SPARK-25331) Structured Streaming File Sink duplicates records in case of driver failure

2018-09-04 Thread Mihaly Toth (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16603143#comment-16603143 ] Mihaly Toth commented on SPARK-25331: - After looking into how this could be solved there are a few

[jira] [Commented] (SPARK-25271) Creating parquet table with all the column null throws exception

2018-09-04 Thread Sujith (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16603126#comment-16603126 ] Sujith commented on SPARK-25271: [~cloud_fan] [~sowen]  Will this cause a compatibility problem compare

[jira] [Commented] (SPARK-25271) Creating parquet table with all the column null throws exception

2018-09-04 Thread Sujith (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16603128#comment-16603128 ] Sujith commented on SPARK-25271: cc [~hyukjin.kwon] > Creating parquet table with all the column null

[jira] [Assigned] (SPARK-25331) Structured Streaming File Sink duplicates records in case of driver failure

2018-09-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25331: Assignee: (was: Apache Spark) > Structured Streaming File Sink duplicates records in

[jira] [Commented] (SPARK-25331) Structured Streaming File Sink duplicates records in case of driver failure

2018-09-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16603120#comment-16603120 ] Apache Spark commented on SPARK-25331: -- User 'misutoth' has created a pull request for this issue:

[jira] [Assigned] (SPARK-25331) Structured Streaming File Sink duplicates records in case of driver failure

2018-09-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25331: Assignee: Apache Spark > Structured Streaming File Sink duplicates records in case of

[jira] [Created] (SPARK-25331) Structured Streaming File Sink duplicates records in case of driver failure

2018-09-04 Thread Mihaly Toth (JIRA)
Mihaly Toth created SPARK-25331: --- Summary: Structured Streaming File Sink duplicates records in case of driver failure Key: SPARK-25331 URL: https://issues.apache.org/jira/browse/SPARK-25331 Project:

[jira] [Updated] (SPARK-25331) Structured Streaming File Sink duplicates records in case of driver failure

2018-09-04 Thread Mihaly Toth (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mihaly Toth updated SPARK-25331: Description: Let's assume {{FileStreamSink.addBatch}} is called and an appropriate job has been

[jira] [Commented] (SPARK-25271) Creating parquet table with all the column null throws exception

2018-09-04 Thread shivusondur (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16603106#comment-16603106 ] shivusondur commented on SPARK-25271: - After further analyzing the issue, I got the following details. In

[jira] [Commented] (SPARK-19355) Use map output statistics to improve global limit's parallelism

2018-09-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16602990#comment-16602990 ] Apache Spark commented on SPARK-19355: -- User 'jiangxb1987' has created a pull request for this

[jira] [Assigned] (SPARK-25176) Kryo fails to serialize a parametrised type hierarchy

2018-09-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25176: Assignee: (was: Apache Spark) > Kryo fails to serialize a parametrised type

[jira] [Assigned] (SPARK-25176) Kryo fails to serialize a parametrised type hierarchy

2018-09-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25176: Assignee: Apache Spark > Kryo fails to serialize a parametrised type hierarchy >

[jira] [Commented] (SPARK-25176) Kryo fails to serialize a parametrised type hierarchy

2018-09-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16602974#comment-16602974 ] Apache Spark commented on SPARK-25176: -- User 'wangyum' has created a pull request for this issue:

[jira] [Comment Edited] (SPARK-20168) Enable kinesis to start stream from Initial position specified by a timestamp

2018-09-04 Thread Vladimir Pchelko (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16602909#comment-16602909 ] Vladimir Pchelko edited comment on SPARK-20168 at 9/4/18 11:04 AM: ---

[jira] [Commented] (SPARK-20168) Enable kinesis to start stream from Initial position specified by a timestamp

2018-09-04 Thread Vladimir Pchelko (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16602909#comment-16602909 ] Vladimir Pchelko commented on SPARK-20168: -- [~srowen] this bug must be covered by unit tests

[jira] [Comment Edited] (SPARK-24189) Spark Structured Streaming not working with the Kafka Transactions

2018-09-04 Thread Binzi Cao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16602889#comment-16602889 ] Binzi Cao edited comment on SPARK-24189 at 9/4/18 10:49 AM: It seems I'm

[jira] [Comment Edited] (SPARK-24189) Spark Structured Streaming not working with the Kafka Transactions

2018-09-04 Thread Binzi Cao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16602889#comment-16602889 ] Binzi Cao edited comment on SPARK-24189 at 9/4/18 10:49 AM: It seems I'm

[jira] [Commented] (SPARK-24189) Spark Structured Streaming not working with the Kafka Transactions

2018-09-04 Thread Binzi Cao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16602889#comment-16602889 ] Binzi Cao commented on SPARK-24189: --- It seems I'm hitting a similar issue. I managed to set the Kafka

[jira] [Assigned] (SPARK-25328) Add an example for having two columns as the grouping key in group aggregate pandas UDF

2018-09-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25328: Assignee: (was: Apache Spark) > Add an example for having two columns as the

[jira] [Assigned] (SPARK-25328) Add an example for having two columns as the grouping key in group aggregate pandas UDF

2018-09-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25328: Assignee: Apache Spark > Add an example for having two columns as the grouping key in

[jira] [Commented] (SPARK-25328) Add an example for having two columns as the grouping key in group aggregate pandas UDF

2018-09-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16602846#comment-16602846 ] Apache Spark commented on SPARK-25328: -- User 'HyukjinKwon' has created a pull request for this

[jira] [Assigned] (SPARK-22666) Spark datasource for image format

2018-09-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-22666: Assignee: Apache Spark > Spark datasource for image format >

[jira] [Assigned] (SPARK-22666) Spark datasource for image format

2018-09-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-22666: Assignee: (was: Apache Spark) > Spark datasource for image format >

[jira] [Commented] (SPARK-22666) Spark datasource for image format

2018-09-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16602845#comment-16602845 ] Apache Spark commented on SPARK-22666: -- User 'WeichenXu123' has created a pull request for this

[jira] [Updated] (SPARK-25301) When a view uses an UDF from a non default database, Spark analyser throws AnalysisException

2018-09-04 Thread Vinod KC (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod KC updated SPARK-25301: - Description: When a hive view uses an UDF from a non default database, Spark analyser throws

[jira] [Updated] (SPARK-25301) When a view uses an UDF from a non default database, Spark analyser throws AnalysisException

2018-09-04 Thread Vinod KC (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod KC updated SPARK-25301: - Description: When a hive view uses an UDF from a non default database, Spark analyser throws

[jira] [Updated] (SPARK-25330) Permission issue after upgrading Hadoop version to 2.7.7

2018-09-04 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang updated SPARK-25330: Description: How to reproduce: {code:java} # build spark ./dev/make-distribution.sh --name
