[jira] [Commented] (SPARK-20392) Slow performance when calling fit on ML pipeline for dataset with many columns but few rows

2017-04-20 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15978178#comment-15978178 ] Kazuaki Ishizaki commented on SPARK-20392: -- Is there a program to reproduce this

[jira] [Commented] (SPARK-20392) Slow performance when calling fit on ML pipeline for dataset with many columns but few rows

2017-04-20 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15978157#comment-15978157 ] Nick Pentreath commented on SPARK-20392: cc [~viirya] > Slow performance when c

[jira] [Assigned] (SPARK-20423) fix MLOR coeffs centering when reg == 0

2017-04-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20423: Assignee: (was: Apache Spark) > fix MLOR coeffs centering when reg == 0 >

[jira] [Assigned] (SPARK-20423) fix MLOR coeffs centering when reg == 0

2017-04-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20423: Assignee: Apache Spark > fix MLOR coeffs centering when reg == 0 > ---

[jira] [Commented] (SPARK-20423) fix MLOR coeffs centering when reg == 0

2017-04-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15978149#comment-15978149 ] Apache Spark commented on SPARK-20423: -- User 'WeichenXu123' has created a pull reque

[jira] [Created] (SPARK-20423) fix MLOR coeffs centering when reg == 0

2017-04-20 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-20423: -- Summary: fix MLOR coeffs centering when reg == 0 Key: SPARK-20423 URL: https://issues.apache.org/jira/browse/SPARK-20423 Project: Spark Issue Type: Improvement

[jira] [Assigned] (SPARK-20416) Column names inconsistent for UDFs in SQL vs Dataset

2017-04-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20416: Assignee: Apache Spark > Column names inconsistent for UDFs in SQL vs Dataset > --

[jira] [Assigned] (SPARK-20416) Column names inconsistent for UDFs in SQL vs Dataset

2017-04-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20416: Assignee: (was: Apache Spark) > Column names inconsistent for UDFs in SQL vs Dataset >

[jira] [Commented] (SPARK-20416) Column names inconsistent for UDFs in SQL vs Dataset

2017-04-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15977999#comment-15977999 ] Apache Spark commented on SPARK-20416: -- User 'maropu' has created a pull request for

[jira] [Created] (SPARK-20422) Worker registration retries should be configurable

2017-04-20 Thread Cody (JIRA)
Cody created SPARK-20422: Summary: Worker registration retries should be configurable Key: SPARK-20422 URL: https://issues.apache.org/jira/browse/SPARK-20422 Project: Spark Issue Type: Bug

[jira] [Resolved] (SPARK-20329) Resolution error when HAVING clause uses GROUP BY expression that involves implicit type coercion

2017-04-20 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-20329. - Resolution: Fixed Assignee: Herman van Hovell Fix Version/s: 2.3.0

[jira] [Resolved] (SPARK-20367) Spark silently escapes partition column names

2017-04-20 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-20367. - Resolution: Fixed Fix Version/s: 2.3.0 2.2.0 > Spark silently escapes p

[jira] [Assigned] (SPARK-20367) Spark silently escapes partition column names

2017-04-20 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-20367: --- Assignee: Juliusz Sompolski > Spark silently escapes partition column names > --

[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-04-20 Thread Helena Edelson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15977896#comment-15977896 ] Helena Edelson commented on SPARK-18057: I think this fix in 0.10.2.0 was a big p

[jira] [Assigned] (SPARK-19951) Add string concatenate operator || to Spark SQL

2017-04-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19951: Assignee: Apache Spark > Add string concatenate operator || to Spark SQL > ---

[jira] [Assigned] (SPARK-19951) Add string concatenate operator || to Spark SQL

2017-04-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19951: Assignee: (was: Apache Spark) > Add string concatenate operator || to Spark SQL >

[jira] [Commented] (SPARK-19951) Add string concatenate operator || to Spark SQL

2017-04-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15977887#comment-15977887 ] Apache Spark commented on SPARK-19951: -- User 'maropu' has created a pull request for

[jira] [Commented] (SPARK-19951) Add string concatenate operator || to Spark SQL

2017-04-20 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15977885#comment-15977885 ] Takeshi Yamamuro commented on SPARK-19951: -- okay, thanks! > Add string concaten

[jira] [Resolved] (SPARK-20172) Event log without read permission should be filtered out before actually reading it

2017-04-20 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-20172. Resolution: Fixed Assignee: Saisai Shao Fix Version/s: 2.2.0 > Event log wi

[jira] [Updated] (SPARK-20421) Mark JobProgressListener (and related classes) as deprecated

2017-04-20 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin updated SPARK-20421: --- Summary: Mark JobProgressListener (and related classes) as deprecated (was: Mark JobPrgressL

[jira] [Created] (SPARK-20421) Mark JobPrgressListener (and related classes) as deprecated

2017-04-20 Thread Marcelo Vanzin (JIRA)
Marcelo Vanzin created SPARK-20421: -- Summary: Mark JobPrgressListener (and related classes) as deprecated Key: SPARK-20421 URL: https://issues.apache.org/jira/browse/SPARK-20421 Project: Spark

[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-04-20 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15977709#comment-15977709 ] Cody Koeninger commented on SPARK-18057: People have also been reporting that exp

[jira] [Comment Edited] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-04-20 Thread Helena Edelson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15977690#comment-15977690 ] Helena Edelson edited comment on SPARK-18057 at 4/20/17 10:15 PM: -

[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-04-20 Thread Helena Edelson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15977690#comment-15977690 ] Helena Edelson commented on SPARK-18057: Hi [~marmbrus], 0.10.2.0 is out. When I

[jira] [Commented] (SPARK-20420) Add events to the external catalog

2017-04-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15977646#comment-15977646 ] Apache Spark commented on SPARK-20420: -- User 'hvanhovell' has created a pull request

[jira] [Assigned] (SPARK-20420) Add events to the external catalog

2017-04-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20420: Assignee: Herman van Hovell (was: Apache Spark) > Add events to the external catalog > --

[jira] [Assigned] (SPARK-20420) Add events to the external catalog

2017-04-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20420: Assignee: Apache Spark (was: Herman van Hovell) > Add events to the external catalog > --

[jira] [Assigned] (SPARK-19951) Add string concatenate operator || to Spark SQL

2017-04-20 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell reassigned SPARK-19951: - Assignee: (was: Herman van Hovell) > Add string concatenate operator || to S

[jira] [Commented] (SPARK-19951) Add string concatenate operator || to Spark SQL

2017-04-20 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15977629#comment-15977629 ] Herman van Hovell commented on SPARK-19951: --- [~maropu] go for it! > Add string

[jira] [Created] (SPARK-20420) Add events to the external catalog

2017-04-20 Thread Herman van Hovell (JIRA)
Herman van Hovell created SPARK-20420: - Summary: Add events to the external catalog Key: SPARK-20420 URL: https://issues.apache.org/jira/browse/SPARK-20420 Project: Spark Issue Type: Impr

[jira] [Commented] (SPARK-20251) Spark streaming skips batches in a case of failure

2017-04-20 Thread Nan Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15977589#comment-15977589 ] Nan Zhu commented on SPARK-20251: - ignore my previous comments...the moving on Spark Stre

[jira] [Resolved] (SPARK-20410) Make SparkConf a def instead of a val in SharedSQLContext

2017-04-20 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell resolved SPARK-20410. --- Resolution: Fixed Fix Version/s: 2.2.0 > Make SparkConf a def instead of a val

[jira] [Resolved] (SPARK-20334) Return a better error message when correlated predicates contain aggregate expression that has mixture of outer and local references

2017-04-20 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell resolved SPARK-20334. --- Resolution: Fixed Assignee: Dilip Biswal Fix Version/s: 2.2.0 > Retur

[jira] [Updated] (SPARK-20419) Support for Mesos Maintenance primitives

2017-04-20 Thread Kamal Gurala (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kamal Gurala updated SPARK-20419: - Description: With Mesos 0.25.0, maintenance primitives have been added. https://mesos.apache.org/

[jira] [Created] (SPARK-20419) Support for Mesos Maintenance primitives

2017-04-20 Thread Kamal Gurala (JIRA)
Kamal Gurala created SPARK-20419: Summary: Support for Mesos Maintenance primitives Key: SPARK-20419 URL: https://issues.apache.org/jira/browse/SPARK-20419 Project: Spark Issue Type: Improvem

[jira] [Created] (SPARK-20418) multi-label classification support

2017-04-20 Thread yu peng (JIRA)
yu peng created SPARK-20418: --- Summary: multi-label classification support Key: SPARK-20418 URL: https://issues.apache.org/jira/browse/SPARK-20418 Project: Spark Issue Type: New Feature Co

[jira] [Commented] (SPARK-20417) Move error reporting for subquery from Analyzer to CheckAnalysis

2017-04-20 Thread Dilip Biswal (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15977286#comment-15977286 ] Dilip Biswal commented on SPARK-20417: -- Currently waiting on [pr 17636|https://githu

[jira] [Created] (SPARK-20417) Move error reporting for subquery from Analyzer to CheckAnalysis

2017-04-20 Thread Dilip Biswal (JIRA)
Dilip Biswal created SPARK-20417: Summary: Move error reporting for subquery from Analyzer to CheckAnalysis Key: SPARK-20417 URL: https://issues.apache.org/jira/browse/SPARK-20417 Project: Spark

[jira] [Created] (SPARK-20416) Column names inconsistent for UDFs in SQL vs Dataset

2017-04-20 Thread Jacek Laskowski (JIRA)
Jacek Laskowski created SPARK-20416: --- Summary: Column names inconsistent for UDFs in SQL vs Dataset Key: SPARK-20416 URL: https://issues.apache.org/jira/browse/SPARK-20416 Project: Spark Is

[jira] [Commented] (SPARK-20396) Add support for pandas udf in pyspark

2017-04-20 Thread Li Jin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15977236#comment-15977236 ] Li Jin commented on SPARK-20396: I am currently working on this. I'll keep updating statu

[jira] [Updated] (SPARK-12717) pyspark broadcast fails when using multiple threads

2017-04-20 Thread Srinivasa Reddy Vundela (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Srinivasa Reddy Vundela updated SPARK-12717: Attachment: run.log Please find the attached log with fix for the following

[jira] [Commented] (SPARK-13747) Concurrent execution in SQL doesn't work with Scala ForkJoinPool

2017-04-20 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15977132#comment-15977132 ] Shixiong Zhu commented on SPARK-13747: -- [~mousa] could you try the master branch? Th

[jira] [Updated] (SPARK-20415) SPARK job hangs while writing DataFrame to HDFS

2017-04-20 Thread P K (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] P K updated SPARK-20415: Environment: EMR 5.4.0 (was: Latest EMR) > SPARK job hangs while writing DataFrame to HDFS > -

[jira] [Updated] (SPARK-20415) SPARK job hangs while writing DataFrame to HDFS

2017-04-20 Thread P K (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] P K updated SPARK-20415: Summary: SPARK job hangs while writing DataFrame to HDFS (was: SPARK job hangs while writing table to HDFS) > SPA

[jira] [Created] (SPARK-20415) SPARK job hangs while writing table to HDFS

2017-04-20 Thread P K (JIRA)
P K created SPARK-20415: --- Summary: SPARK job hangs while writing table to HDFS Key: SPARK-20415 URL: https://issues.apache.org/jira/browse/SPARK-20415 Project: Spark Issue Type: Bug Component

[jira] [Updated] (SPARK-20413) New Optimizer Hint to prevent collapsing of adjacent projections

2017-04-20 Thread Michael Styles (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Styles updated SPARK-20413: --- Description: I am proposing that a new optimizer hint called NO_COLLAPSE be introduced. This

[jira] [Updated] (SPARK-20413) New Optimizer Hint to prevent collapsing of adjacent projections

2017-04-20 Thread Michael Styles (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Styles updated SPARK-20413: --- Description: I am proposing that a new optimizer hint called NO_COLLAPSE be introduced. This

[jira] [Updated] (SPARK-20413) New Optimizer Hint to prevent collapsing of adjacent projections

2017-04-20 Thread Michael Styles (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Styles updated SPARK-20413: --- Description: I am proposing that a new optimizer hint called NO_COLLAPSE be introduced. This

[jira] [Assigned] (SPARK-20358) Executors failing stage on interrupted exception thrown by cancelled tasks

2017-04-20 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai reassigned SPARK-20358: Assignee: Eric Liang > Executors failing stage on interrupted exception thrown by cancelled tasks

[jira] [Resolved] (SPARK-20358) Executors failing stage on interrupted exception thrown by cancelled tasks

2017-04-20 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai resolved SPARK-20358. -- Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 17659 [https://github.com/

[jira] [Assigned] (SPARK-18891) Support for specific collection types

2017-04-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18891: Assignee: Apache Spark > Support for specific collection types > -

[jira] [Resolved] (SPARK-20407) ParquetQuerySuite 'Enabling/disabling ignoreCorruptFiles' flaky test

2017-04-20 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell resolved SPARK-20407. --- Resolution: Fixed Assignee: Bogdan Raducanu Fix Version/s: 2.2.0 > Pa

[jira] [Assigned] (SPARK-18891) Support for specific collection types

2017-04-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18891: Assignee: (was: Apache Spark) > Support for specific collection types > --

[jira] [Updated] (SPARK-20414) avoid creating only 16 reducers when calling topByKey()

2017-04-20 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-20414: -- Shepherd: (was: Sean Owen) Flags: (was: Patch) Affects Version/s: (

[jira] [Assigned] (SPARK-20414) avoid creating only 16 reducers when calling topByKey()

2017-04-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20414: Assignee: Apache Spark > avoid creating only 16 reducers when calling topByKey() > ---

[jira] [Assigned] (SPARK-20414) avoid creating only 16 reducers when calling topByKey()

2017-04-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20414: Assignee: (was: Apache Spark) > avoid creating only 16 reducers when calling topByKey(

[jira] [Commented] (SPARK-20414) avoid creating only 16 reducers when calling topByKey()

2017-04-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15976948#comment-15976948 ] Apache Spark commented on SPARK-20414: -- User 'yangyangyyy' has created a pull reques

[jira] [Updated] (SPARK-20414) avoid creating only 16 reducers when calling topByKey()

2017-04-20 Thread Yang Yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Yang updated SPARK-20414: -- Flags: Patch Description: currently in the MLlib topByKey() function, it directly calls aggre

[jira] [Comment Edited] (SPARK-650) Add a "setup hook" API for running initialization code on each executor

2017-04-20 Thread Michael Schmeißer (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15976902#comment-15976902 ] Michael Schmeißer edited comment on SPARK-650 at 4/20/17 3:34 PM: ---

[jira] [Comment Edited] (SPARK-650) Add a "setup hook" API for running initialization code on each executor

2017-04-20 Thread Michael Schmeißer (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15976902#comment-15976902 ] Michael Schmeißer edited comment on SPARK-650 at 4/20/17 3:34 PM: ---

[jira] [Created] (SPARK-20414) avoid creating only 16 reducers when calling topByKey()

2017-04-20 Thread Yang Yang (JIRA)
Yang Yang created SPARK-20414: - Summary: avoid creating only 16 reducers when calling topByKey() Key: SPARK-20414 URL: https://issues.apache.org/jira/browse/SPARK-20414 Project: Spark Issue Type:

[jira] [Comment Edited] (SPARK-650) Add a "setup hook" API for running initialization code on each executor

2017-04-20 Thread Michael Schmeißer (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15976902#comment-15976902 ] Michael Schmeißer edited comment on SPARK-650 at 4/20/17 3:33 PM: ---

[jira] [Assigned] (SPARK-20413) New Optimizer Hint to prevent collapsing of adjacent projections

2017-04-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20413: Assignee: (was: Apache Spark) > New Optimizer Hint to prevent collapsing of adjacent p

[jira] [Comment Edited] (SPARK-650) Add a "setup hook" API for running initialization code on each executor

2017-04-20 Thread Michael Schmeißer (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15976902#comment-15976902 ] Michael Schmeißer edited comment on SPARK-650 at 4/20/17 3:32 PM: ---

[jira] [Assigned] (SPARK-20413) New Optimizer Hint to prevent collapsing of adjacent projections

2017-04-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20413: Assignee: Apache Spark > New Optimizer Hint to prevent collapsing of adjacent projections

[jira] [Comment Edited] (SPARK-650) Add a "setup hook" API for running initialization code on each executor

2017-04-20 Thread Michael Schmeißer (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15976902#comment-15976902 ] Michael Schmeißer edited comment on SPARK-650 at 4/20/17 3:32 PM: ---

[jira] [Commented] (SPARK-20413) New Optimizer Hint to prevent collapsing of adjacent projections

2017-04-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15976903#comment-15976903 ] Apache Spark commented on SPARK-20413: -- User 'ptkool' has created a pull request for

[jira] [Comment Edited] (SPARK-650) Add a "setup hook" API for running initialization code on each executor

2017-04-20 Thread Michael Schmeißer (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15976902#comment-15976902 ] Michael Schmeißer edited comment on SPARK-650 at 4/20/17 3:31 PM: ---

[jira] [Commented] (SPARK-650) Add a "setup hook" API for running initialization code on each executor

2017-04-20 Thread Michael Schmeißer (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15976902#comment-15976902 ] Michael Schmeißer commented on SPARK-650: - In a nutshell, we have our own class "My

[jira] [Assigned] (SPARK-20412) NullPointerException in places expecting non-optional partitionSpec.

2017-04-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20412: Assignee: (was: Apache Spark) > NullPointerException in places expecting non-optional

[jira] [Created] (SPARK-20413) New Optimizer Hint to prevent collapsing of adjacent projections

2017-04-20 Thread Michael Styles (JIRA)
Michael Styles created SPARK-20413: -- Summary: New Optimizer Hint to prevent collapsing of adjacent projections Key: SPARK-20413 URL: https://issues.apache.org/jira/browse/SPARK-20413 Project: Spark

[jira] [Assigned] (SPARK-20412) NullPointerException in places expecting non-optional partitionSpec.

2017-04-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20412: Assignee: Apache Spark > NullPointerException in places expecting non-optional partitionSp

[jira] [Commented] (SPARK-20184) performance regression for complex/long sql when enable whole stage codegen

2017-04-20 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15976897#comment-15976897 ] Kazuaki Ishizaki commented on SPARK-20184: -- The root cause is overhead in Java c

[jira] [Commented] (SPARK-20412) NullPointerException in places expecting non-optional partitionSpec.

2017-04-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15976898#comment-15976898 ] Apache Spark commented on SPARK-20412: -- User 'juliuszsompolski' has created a pull r

[jira] [Updated] (SPARK-20409) fail early if aggregate function in GROUP BY

2017-04-20 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell updated SPARK-20409: -- Fix Version/s: 2.1.2 > fail early if aggregate function in GROUP BY > -

[jira] [Updated] (SPARK-20412) NullPointerException in places expecting non-optional partitionSpec.

2017-04-20 Thread Juliusz Sompolski (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Juliusz Sompolski updated SPARK-20412: -- Description: A number of commands expect a partition specification without empty values

[jira] [Created] (SPARK-20412) NullPointerException in places expecting non-optional partitionSpec.

2017-04-20 Thread Juliusz Sompolski (JIRA)
Juliusz Sompolski created SPARK-20412: - Summary: NullPointerException in places expecting non-optional partitionSpec. Key: SPARK-20412 URL: https://issues.apache.org/jira/browse/SPARK-20412 Projec

[jira] [Resolved] (SPARK-20409) fail early if aggregate function in GROUP BY

2017-04-20 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell resolved SPARK-20409. --- Resolution: Fixed Fix Version/s: 2.2.0 > fail early if aggregate function in G

[jira] [Commented] (SPARK-19732) DataFrame.fillna() does not work for bools in PySpark

2017-04-20 Thread Len Frodgers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15976778#comment-15976778 ] Len Frodgers commented on SPARK-19732: -- Thanks. Have updated > DataFrame.fillna() d

[jira] [Issue Comment Deleted] (SPARK-19732) DataFrame.fillna() does not work for bools in PySpark

2017-04-20 Thread Len Frodgers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Len Frodgers updated SPARK-19732: - Comment: was deleted (was: Actually there's another anomaly: Spark (and pyspark) supports filling

[jira] [Updated] (SPARK-19732) DataFrame.fillna() does not work for bools in PySpark

2017-04-20 Thread Len Frodgers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Len Frodgers updated SPARK-19732: - Description: In PySpark, the fillna function of DataFrame inadvertently casts bools to ints, so

[jira] [Updated] (SPARK-19732) DataFrame.fillna() does not work for bools in PySpark

2017-04-20 Thread Len Frodgers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Len Frodgers updated SPARK-19732: - Issue Type: Improvement (was: Bug) > DataFrame.fillna() does not work for bools in PySpark > ---

[jira] [Updated] (SPARK-19732) DataFrame.fillna() does not work for bools in PySpark

2017-04-20 Thread Len Frodgers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Len Frodgers updated SPARK-19732: - Priority: Minor (was: Major) > DataFrame.fillna() does not work for bools in PySpark > -

[jira] [Updated] (SPARK-20411) New features for expression.scalalang.typed

2017-04-20 Thread Loic Descotte (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Loic Descotte updated SPARK-20411: -- Description: In Spark 2 it is possible to use typed expressions for aggregation methods: {cod

[jira] [Created] (SPARK-20411) New features for expression.scalalang.typed

2017-04-20 Thread Loic Descotte (JIRA)
Loic Descotte created SPARK-20411: - Summary: New features for expression.scalalang.typed Key: SPARK-20411 URL: https://issues.apache.org/jira/browse/SPARK-20411 Project: Spark Issue Type: Imp

[jira] [Commented] (SPARK-20404) Regression with accumulator names when migrating from 1.6 to 2.x

2017-04-20 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15976732#comment-15976732 ] Sean Owen commented on SPARK-20404: --- We use pull requests, not patches. See http://spar

[jira] [Commented] (SPARK-20404) Regression with accumulator names when migrating from 1.6 to 2.x

2017-04-20 Thread Sergey Zhemzhitsky (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15976726#comment-15976726 ] Sergey Zhemzhitsky commented on SPARK-20404: Thanks for drawing attention to

[jira] [Commented] (SPARK-20410) Make SparkConf a def instead of a val in SharedSQLContext

2017-04-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15976723#comment-15976723 ] Apache Spark commented on SPARK-20410: -- User 'hvanhovell' has created a pull request

[jira] [Updated] (SPARK-20404) Regression with accumulator names when migrating from 1.6 to 2.x

2017-04-20 Thread Sergey Zhemzhitsky (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Zhemzhitsky updated SPARK-20404: --- Attachment: (was: spark-context-accum-option.patch) > Regression with accumulator

[jira] [Updated] (SPARK-20404) Regression with accumulator names when migrating from 1.6 to 2.x

2017-04-20 Thread Sergey Zhemzhitsky (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Zhemzhitsky updated SPARK-20404: --- Attachment: spark-context-accum-option.patch > Regression with accumulator names when

[jira] [Assigned] (SPARK-20410) Make SparkConf a def instead of a val in SharedSQLContext

2017-04-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20410: Assignee: Herman van Hovell (was: Apache Spark) > Make SparkConf a def instead of a val i

[jira] [Assigned] (SPARK-20410) Make SparkConf a def instead of a val in SharedSQLContext

2017-04-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20410: Assignee: Apache Spark (was: Herman van Hovell) > Make SparkConf a def instead of a val i

[jira] [Updated] (SPARK-20410) Make SparkConf a def instead of a val in SharedSQLContext

2017-04-20 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell updated SPARK-20410: -- Issue Type: Improvement (was: Bug) > Make SparkConf a def instead of a val in SharedSQ

[jira] [Created] (SPARK-20410) Make SparkConf a def instead of a val in SharedSQLContext

2017-04-20 Thread Herman van Hovell (JIRA)
Herman van Hovell created SPARK-20410: - Summary: Make SparkConf a def instead of a val in SharedSQLContext Key: SPARK-20410 URL: https://issues.apache.org/jira/browse/SPARK-20410 Project: Spark

[jira] [Resolved] (SPARK-17593) list files on s3 very slow

2017-04-20 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran resolved SPARK-17593. Resolution: Fixed closing as fixed now that Hadoop 2.8.0 is out the door. Upgrade your hado

[jira] [Resolved] (SPARK-20405) Dataset.withNewExecutionId should be private

2017-04-20 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell resolved SPARK-20405. --- Resolution: Fixed Fix Version/s: 2.2.0 > Dataset.withNewExecutionId should be

[jira] [Commented] (SPARK-20404) Regression with accumulator names when migrating from 1.6 to 2.x

2017-04-20 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15976599#comment-15976599 ] Sean Owen commented on SPARK-20404: --- Hm, no I reverse myself. Yes it should fail fast i

[jira] [Commented] (SPARK-13747) Concurrent execution in SQL doesn't work with Scala ForkJoinPool

2017-04-20 Thread Mousa HAMAD (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15976580#comment-15976580 ] Mousa HAMAD commented on SPARK-13747: - I am also running into this issue *sporadicall

[jira] [Resolved] (SPARK-20387) Permissive mode is not replacing corrupt record with null

2017-04-20 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-20387. -- Resolution: Duplicate I am pretty sure that this is fixed in 2.2.0 in that JIRA. I am resolving

[jira] [Assigned] (SPARK-20409) fail early if aggregate function in GROUP BY

2017-04-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20409: Assignee: Apache Spark (was: Wenchen Fan) > fail early if aggregate function in GROUP BY
