[jira] [Resolved] (SPARK-14519) Cross-publish Kafka for Scala 2.12

2017-07-12 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-14519. Resolution: Fixed

[jira] [Assigned] (SPARK-14280) Update change-version.sh and pom.xml to add Scala 2.12 profiles

2017-07-12 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen reassigned SPARK-14280: Assignee: (was: Josh Rosen)

[jira] [Resolved] (SPARK-14438) Cross-publish Breeze for Scala 2.12

2017-07-12 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-14438. Resolution: Fixed

[jira] [Assigned] (SPARK-14280) Update change-version.sh and pom.xml to add Scala 2.12 profiles

2017-07-12 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen reassigned SPARK-14280: Assignee: (was: Josh Rosen)

[jira] [Assigned] (SPARK-14650) Compile Spark REPL for Scala 2.12

2017-07-12 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen reassigned SPARK-14650: Assignee: (was: Josh Rosen)

[jira] [Assigned] (SPARK-14280) Update change-version.sh and pom.xml to add Scala 2.12 profiles

2017-07-12 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen reassigned SPARK-14280: Assignee: Josh Rosen

[jira] [Updated] (SPARK-21444) Fetch failure due to node reboot causes job failure

2017-07-17 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-21444: Affects Version/s: 2.3.0 (was: 2.0.2)

[jira] [Commented] (SPARK-21444) Fetch failure due to node reboot causes job failure

2017-07-17 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16090809#comment-16090809 ] Josh Rosen commented on SPARK-21444: I'm going to adjust the "affects versions" on th

[jira] [Commented] (SPARK-21444) Fetch failure due to node reboot causes job failure

2017-07-17 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16090815#comment-16090815 ] Josh Rosen commented on SPARK-21444: I spot the problem: in the old code, we removed

[jira] [Assigned] (SPARK-21444) Fetch failure due to node reboot causes job failure

2017-07-17 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen reassigned SPARK-21444: Assignee: Josh Rosen

[jira] [Resolved] (SPARK-21444) Fetch failure due to node reboot causes job failure

2017-07-17 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-21444. Resolution: Fixed Fix Version/s: 2.3.0

[jira] [Commented] (SPARK-14643) Remove overloaded methods which become ambiguous in Scala 2.12

2017-07-22 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16097169#comment-16097169 ] Josh Rosen commented on SPARK-14643: [~srowen], I just posted a comment about this ov

[jira] [Updated] (SPARK-16175) Handle None for all Python UDT

2016-06-28 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-16175: Assignee: Davies Liu

[jira] [Commented] (SPARK-16175) Handle None for all Python UDT

2016-06-28 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15353684#comment-15353684 ] Josh Rosen commented on SPARK-16175: Here's a published copy of the error reproductio

[jira] [Created] (SPARK-16555) Work around Jekyll error-handling bug which led to silent doc build failures

2016-07-14 Thread Josh Rosen (JIRA)
Josh Rosen created SPARK-16555: -- Summary: Work around Jekyll error-handling bug which led to silent doc build failures Key: SPARK-16555 URL: https://issues.apache.org/jira/browse/SPARK-16555 Project: Spa

[jira] [Commented] (SPARK-16550) Caching data with replication doesn't replicate data

2016-07-14 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15378593#comment-15378593 ] Josh Rosen commented on SPARK-16550: *For the case involving a REPL-defined class:* T

[jira] [Assigned] (SPARK-16550) Caching data with replication doesn't replicate data

2016-07-14 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen reassigned SPARK-16550: Assignee: Josh Rosen

[jira] [Commented] (SPARK-16550) Caching data with replication doesn't replicate data

2016-07-15 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15379793#comment-15379793 ] Josh Rosen commented on SPARK-16550: I have a partial fix for the "unable to replicat

[jira] [Resolved] (SPARK-5581) When writing sorted map output file, avoid open / close between each partition

2016-07-24 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-5581. --- Resolution: Fixed Fix Version/s: 2.1.0 Issue resolved by pull request 13382 [https://github.com

[jira] [Updated] (SPARK-5581) When writing sorted map output file, avoid open / close between each partition

2016-07-24 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-5581: Assignee: Brian Cho (was: Josh Rosen)

[jira] [Updated] (SPARK-5581) When writing sorted map output file, avoid open / close between each partition

2016-07-24 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-5581: Assignee: Josh Rosen

[jira] [Commented] (SPARK-7953) Spark should cleanup output dir if job fails

2016-07-25 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15392386#comment-15392386 ] Josh Rosen commented on SPARK-7953: --- [~uncleGen], to my knowledge this has not been work

[jira] [Reopened] (SPARK-15271) Allow force pulling executor docker images

2016-07-25 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen reopened SPARK-15271: Re-opening this because I had to revert the patch after it caused a master build break. > Allow force

[jira] [Updated] (SPARK-16166) Correctly honor off heap memory usage in web ui and log display

2016-07-25 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-16166: Assignee: Saisai Shao

[jira] [Resolved] (SPARK-16166) Correctly honor off heap memory usage in web ui and log display

2016-07-25 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-16166. Resolution: Fixed Fix Version/s: 2.1.0 Issue resolved by pull request 13920 [https://github.

[jira] [Created] (SPARK-16787) SparkContext.addFile() should not fail if called twice with the same file

2016-07-28 Thread Josh Rosen (JIRA)
Josh Rosen created SPARK-16787: -- Summary: SparkContext.addFile() should not fail if called twice with the same file Key: SPARK-16787 URL: https://issues.apache.org/jira/browse/SPARK-16787 Project: Spark

[jira] [Updated] (SPARK-16787) SparkContext.addFile() should not fail if called twice with the same file

2016-07-28 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-16787: Target Version/s: 2.0.1 (was: 1.6.3, 2.0.1)

[jira] [Created] (SPARK-27542) SparkHadoopWriter doesn't set call setWorkOutputPath, causing NPEs for some legacy OutputFormats

2019-04-22 Thread Josh Rosen (JIRA)
Josh Rosen created SPARK-27542: -- Summary: SparkHadoopWriter doesn't set call setWorkOutputPath, causing NPEs for some legacy OutputFormats Key: SPARK-27542 URL: https://issues.apache.org/jira/browse/SPARK-27542

[jira] [Updated] (SPARK-27542) SparkHadoopWriter doesn't set call setWorkOutputPath, causing NPEs when using certain legacy OutputFormats

2019-04-22 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-27542: --- Summary: SparkHadoopWriter doesn't set call setWorkOutputPath, causing NPEs when using certain legac

[jira] [Commented] (SPARK-27542) SparkHadoopWriter doesn't set call setWorkOutputPath, causing NPEs when using certain legacy OutputFormats

2019-04-24 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16825560#comment-16825560 ] Josh Rosen commented on SPARK-27542: [~shivuson...@gmail.com], unfortunately I don't

[jira] [Created] (SPARK-27561) Support "lateral column alias references" to allow column aliases to be used within SELECT clauses

2019-04-24 Thread Josh Rosen (JIRA)
Josh Rosen created SPARK-27561: -- Summary: Support "lateral column alias references" to allow column aliases to be used within SELECT clauses Key: SPARK-27561 URL: https://issues.apache.org/jira/browse/SPARK-27561

[jira] [Updated] (SPARK-27561) Support "lateral column alias references" to allow column aliases to be used within SELECT clauses

2019-04-24 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-27561: --- Description: Amazon Redshift has a feature called "lateral column alias references: [https://aws.am

[jira] [Updated] (SPARK-27561) Support "lateral column alias references" to allow column aliases to be used within SELECT clauses

2019-04-24 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-27561: --- Description: Amazon Redshift has a feature called "lateral column alias references": [https://aws.a

[jira] [Commented] (SPARK-27530) FetchFailedException: Received a zero-size buffer for block shuffle

2019-04-24 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16825770#comment-16825770 ] Josh Rosen commented on SPARK-27530: This specific error message was added in SPARK-

[jira] [Comment Edited] (SPARK-27530) FetchFailedException: Received a zero-size buffer for block shuffle

2019-04-24 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16825770#comment-16825770 ] Josh Rosen edited comment on SPARK-27530 at 4/25/19 6:32 AM

[jira] [Comment Edited] (SPARK-27216) Upgrade RoaringBitmap to 0.7.45 to fix Kryo unsafe ser/dser issue

2019-04-24 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16825782#comment-16825782 ] Josh Rosen edited comment on SPARK-27216 at 4/25/19 6:38 AM

[jira] [Updated] (SPARK-27216) Upgrade RoaringBitmap to 0.7.45 to fix Kryo unsafe ser/dser issue

2019-04-24 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-27216: Labels: correctness (was: )

[jira] [Commented] (SPARK-27216) Upgrade RoaringBitmap to 0.7.45 to fix Kryo unsafe ser/dser issue

2019-04-24 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16825782#comment-16825782 ] Josh Rosen commented on SPARK-27216: I've added the {{correctness}} label to this ti

[jira] [Commented] (SPARK-23178) Kryo Unsafe problems with count distinct from cache

2019-04-24 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16825783#comment-16825783 ] Josh Rosen commented on SPARK-23178: This might be fixed by SPARK-27216 > Kryo Unsa

[jira] [Created] (SPARK-27573) Collapse adjacent aggregate physical operators when possible

2019-04-25 Thread Josh Rosen (JIRA)
Josh Rosen created SPARK-27573: -- Summary: Collapse adjacent aggregate physical operators when possible Key: SPARK-27573 URL: https://issues.apache.org/jira/browse/SPARK-27573 Project: Spark Iss

[jira] [Updated] (SPARK-27573) Collapse adjacent physical aggregate operators when possible

2019-04-25 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-27573: --- Summary: Collapse adjacent physical aggregate operators when possible (was: Collapse adjacent aggre

[jira] [Updated] (SPARK-27573) Collapse adjacent physical aggregate operators when possible

2019-04-25 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-27573: --- Description: When an aggregation requires a shuffle, Spark SQL performs separate partial and final

[jira] [Updated] (SPARK-27573) Collapse adjacent physical aggregate operators when possible

2019-04-25 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-27573: --- Description: When an aggregation requires a shuffle, Spark SQL performs separate partial and final

[jira] [Updated] (SPARK-27573) Skip partial aggregation when data is already partitioned (or collapse adjacent partial and final aggregates)

2019-04-25 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-27573: --- Summary: Skip partial aggregation when data is already partitioned (or collapse adjacent partial and

[jira] [Updated] (SPARK-27573) Skip partial aggregation when data is already partitioned (or collapse adjacent partial and final aggregates)

2019-04-25 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-27573: --- Description: When an aggregation requires a shuffle, Spark SQL performs separate partial and final

[jira] [Created] (SPARK-27581) DataFrame countDistinct("*") fails with AnalysisException: "Invalid usage of '*' in expression 'count'"

2019-04-26 Thread Josh Rosen (JIRA)
Josh Rosen created SPARK-27581: -- Summary: DataFrame countDistinct("*") fails with AnalysisException: "Invalid usage of '*' in expression 'count'" Key: SPARK-27581 URL: https://issues.apache.org/jira/browse/SPARK-2758

[jira] [Updated] (SPARK-27581) DataFrame countDistinct("*") fails with AnalysisException: "Invalid usage of '*' in expression 'count'"

2019-04-26 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-27581: --- Description: If I have a DataFrame then I can use {{count("*")}} as an expression, e.g.: {code} imp

[jira] [Updated] (SPARK-27581) DataFrame countDistinct("*") fails with AnalysisException: "Invalid usage of '*' in expression 'count'"

2019-04-26 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-27581: Issue Type: Bug (was: New Feature)

[jira] [Updated] (SPARK-27290) remove unneed sort under Aggregate

2019-04-26 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-27290: --- Description: I saw some tickets to remove unneeded sort in plan while I think there's another case

[jira] [Updated] (SPARK-27290) remove unneed sort under Aggregate

2019-04-26 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-27290: --- Description: I saw some tickets to remove unneeded sort in plan while I think there's another case

[jira] [Commented] (SPARK-27290) remove unneed sort under Aggregate

2019-04-26 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16827448#comment-16827448 ] Josh Rosen commented on SPARK-27290: Regarding that test case, my best guess is that

[jira] [Commented] (SPARK-27213) Unexpected results when filter is used after distinct

2019-04-26 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16827449#comment-16827449 ] Josh Rosen commented on SPARK-27213: Since this sounds like a legitimate query corre

[jira] [Updated] (SPARK-27213) Unexpected results when filter is used after distinct

2019-04-26 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-27213: Labels: correctness distinct filter (was: distinct filter)

[jira] [Comment Edited] (SPARK-27213) Unexpected results when filter is used after distinct

2019-04-26 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16827449#comment-16827449 ] Josh Rosen edited comment on SPARK-27213 at 4/27/19 5:13 AM

[jira] [Commented] (SPARK-27213) Unexpected results when filter is used after distinct

2019-04-26 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16827453#comment-16827453 ] Josh Rosen commented on SPARK-27213: SPARK-26767 sounds like a similar, possibly-dup

[jira] [Updated] (SPARK-27581) DataFrame countDistinct("*") fails with AnalysisException: "Invalid usage of '*' in expression 'count'"

2019-04-27 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-27581: --- Description: If I have a DataFrame then I can use {{count("*")}} as an expression, e.g.: {code:java}

[jira] [Commented] (SPARK-27586) Improve binary comparison: replace Scala's for-comprehension if statements with while loop

2019-04-29 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16829303#comment-16829303 ] Josh Rosen commented on SPARK-27586: Good find! This sounds pretty straightforward t

[jira] [Commented] (SPARK-27213) Unexpected results when filter is used after distinct

2019-04-29 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16829309#comment-16829309 ] Josh Rosen commented on SPARK-27213: Hmm, this must have been fixed relatively recen

[jira] [Created] (SPARK-27607) Improve performance of Row.toString()

2019-04-30 Thread Josh Rosen (JIRA)
Josh Rosen created SPARK-27607: -- Summary: Improve performance of Row.toString() Key: SPARK-27607 URL: https://issues.apache.org/jira/browse/SPARK-27607 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-27607) Improve performance of Row.toString()

2019-05-01 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16830997#comment-16830997 ] Josh Rosen commented on SPARK-27607: Feel free to take this. > Improve performance

[jira] [Commented] (SPARK-17637) Packed scheduling for Spark tasks across executors

2019-05-01 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16831014#comment-16831014 ] Josh Rosen commented on SPARK-17637: I think this old feature suggestion is still ve

[jira] [Updated] (SPARK-27619) MapType should be prohibited in hash expressions

2019-05-01 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-27619: --- Description: Spark currently allows MapType expressions to be used as input to hash expressions, bu

[jira] [Updated] (SPARK-27619) MapType should be prohibited in hash expressions

2019-05-01 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-27619: --- Description: Spark currently allows MapType expressions to be used as input to hash expressions, bu

[jira] [Created] (SPARK-27619) MapType should be prohibited in hash expressions

2019-05-01 Thread Josh Rosen (JIRA)
Josh Rosen created SPARK-27619: -- Summary: MapType should be prohibited in hash expressions Key: SPARK-27619 URL: https://issues.apache.org/jira/browse/SPARK-27619 Project: Spark Issue Type: Bug

[jira] [Updated] (SPARK-27619) MapType should be prohibited in hash expressions

2019-05-02 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-27619: --- Description: Spark currently allows MapType expressions to be used as input to hash expressions, bu

[jira] [Updated] (SPARK-27619) MapType should be prohibited in hash expressions

2019-05-02 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-27619: --- Description: Spark currently allows MapType expressions to be used as input to hash expressions, bu

[jira] [Commented] (SPARK-26555) Thread safety issue causes createDataset to fail with misleading errors

2019-05-06 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16834025#comment-16834025 ] Josh Rosen commented on SPARK-26555: [~cloud_fan] [~srowen], could we backport this

[jira] [Commented] (SPARK-26555) Thread safety issue causes createDataset to fail with misleading errors

2019-05-06 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16834203#comment-16834203 ] Josh Rosen commented on SPARK-26555: I won't be able to tackle a backport for at lea

[jira] [Created] (SPARK-27653) Add max_by() / min_by() SQL aggregate functions

2019-05-07 Thread Josh Rosen (JIRA)
Josh Rosen created SPARK-27653: -- Summary: Add max_by() / min_by() SQL aggregate functions Key: SPARK-27653 URL: https://issues.apache.org/jira/browse/SPARK-27653 Project: Spark Issue Type: New F

[jira] [Created] (SPARK-27676) InMemoryFileIndex should hard-fail on missing files instead of logging and continuing

2019-05-10 Thread Josh Rosen (JIRA)
Josh Rosen created SPARK-27676: -- Summary: InMemoryFileIndex should hard-fail on missing files instead of logging and continuing Key: SPARK-27676 URL: https://issues.apache.org/jira/browse/SPARK-27676 Pro

[jira] [Updated] (SPARK-27676) InMemoryFileIndex should hard-fail on missing files instead of logging and continuing

2019-05-10 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-27676: --- Description: Spark's {{InMemoryFileIndex}} contains two places where {{FileNotFound}} exceptions ar

[jira] [Updated] (SPARK-27676) InMemoryFileIndex should hard-fail on missing files instead of logging and continuing

2019-05-10 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-27676: --- Description: Spark's {{InMemoryFileIndex}} contains two places where {{FileNotFound}} exceptions ar

[jira] [Updated] (SPARK-27676) InMemoryFileIndex should hard-fail on missing files instead of logging and continuing

2019-05-10 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-27676: --- Description: Spark's {{InMemoryFileIndex}} contains two places where {{FileNotFound}} exceptions ar

[jira] [Resolved] (SPARK-3289) Avoid job failures due to rescheduling of failing tasks on buggy machines

2019-05-11 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-3289. --- Resolution: Fixed As part of a cleanup of old tickets filed by me, I'm resolving this as "Fixed" bec

[jira] [Resolved] (SPARK-8352) Affixed table of contents, similar to Bootstrap 3 docs

2019-05-11 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-8352. --- Resolution: Fixed The new docs have a sidebar TOC, so marking as done. > Affixed table of contents,

[jira] [Resolved] (SPARK-8351) Umbella for improving Spark documentation CSS + JS

2019-05-11 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-8351. Resolution: Done

[jira] [Created] (SPARK-27684) Reduce ScalaUDF conversion overheads for primitives

2019-05-12 Thread Josh Rosen (JIRA)
Josh Rosen created SPARK-27684: -- Summary: Reduce ScalaUDF conversion overheads for primitives Key: SPARK-27684 URL: https://issues.apache.org/jira/browse/SPARK-27684 Project: Spark Issue Type: I

[jira] [Updated] (SPARK-27684) Reduce ScalaUDF conversion overheads for primitives

2019-05-12 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-27684: --- Description: I believe that we can reduce ScalaUDF overheads when operating over primitive types.

[jira] [Updated] (SPARK-27684) Reduce ScalaUDF conversion overheads for primitives

2019-05-12 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-27684: --- Description: I believe that we can reduce ScalaUDF overheads when operating over primitive types.

[jira] [Updated] (SPARK-27684) Reduce ScalaUDF conversion overheads for primitives

2019-05-12 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-27684: --- Description: I believe that we can reduce ScalaUDF overheads when operating over primitive types.

[jira] [Updated] (SPARK-27684) Reduce ScalaUDF conversion overheads for primitives

2019-05-12 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-27684: --- Description: I believe that we can reduce ScalaUDF overheads when operating over primitive types.

[jira] [Updated] (SPARK-27685) `union` doesn't promote non-nullable columns of struct to nullable

2019-05-12 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-27685: Labels: correctness (was: )

[jira] [Created] (SPARK-18034) Upgrade to MiMa 0.1.11

2016-10-20 Thread Josh Rosen (JIRA)
Josh Rosen created SPARK-18034: -- Summary: Upgrade to MiMa 0.1.11 Key: SPARK-18034 URL: https://issues.apache.org/jira/browse/SPARK-18034 Project: Spark Issue Type: Bug Components: Proj

[jira] [Commented] (SPARK-18037) Event listener should be aware of multiple tries of same stage

2016-10-20 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15593347#comment-15593347 ] Josh Rosen commented on SPARK-18037: Ahhh, I remember there being other JIRAs related

[jira] [Resolved] (SPARK-18034) Upgrade to MiMa 0.1.11

2016-10-21 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-18034. Resolution: Fixed Fix Version/s: 2.1.0 Issue resolved by pull request 15571 [https://github.

[jira] [Updated] (SPARK-18034) Upgrade to MiMa 0.1.11

2016-10-21 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-18034: --- Target Version/s: 2.0.2, 2.1.0 (was: 2.1.0) > Upgrade to MiMa 0.1.11 > -- > >

[jira] [Updated] (SPARK-18034) Upgrade to MiMa 0.1.11

2016-10-21 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-18034: --- Fix Version/s: 2.0.2 > Upgrade to MiMa 0.1.11 > -- > > Key: SPARK

[jira] [Created] (SPARK-18182) Expose ReplayListenerBus.replay() overload which accepts Iterator

2016-10-31 Thread Josh Rosen (JIRA)
Josh Rosen created SPARK-18182: -- Summary: Expose ReplayListenerBus.replay() overload which accepts Iterator Key: SPARK-18182 URL: https://issues.apache.org/jira/browse/SPARK-18182 Project: Spark

[jira] [Created] (SPARK-18236) Reduce memory usage of Spark UI and HistoryServer by reducing duplicate objects

2016-11-02 Thread Josh Rosen (JIRA)
Josh Rosen created SPARK-18236: -- Summary: Reduce memory usage of Spark UI and HistoryServer by reducing duplicate objects Key: SPARK-18236 URL: https://issues.apache.org/jira/browse/SPARK-18236 Project:

[jira] [Commented] (SPARK-14960) Don't perform treeAggregation in local mode

2016-11-03 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15633478#comment-15633478 ] Josh Rosen commented on SPARK-14960: It turns out that {{treeAggregation}}'s extra co

[jira] [Closed] (SPARK-14960) Don't perform treeAggregation in local mode

2016-11-03 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen closed SPARK-14960. -- Resolution: Won't Fix > Don't perform treeAggregation in local mode > -

[jira] [Updated] (SPARK-18254) UDFs don't see aliased column names

2016-11-03 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-18254: --- Labels: correctness (was: ) > UDFs don't see aliased column names >

[jira] [Updated] (SPARK-18256) Improve performance of event log replay in HistoryServer based on profiler results

2016-11-03 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-18256: --- Issue Type: Improvement (was: Bug) > Improve performance of event log replay in HistoryServer based

[jira] [Created] (SPARK-18256) Improve performance of event log replay in HistoryServer based on profiler results

2016-11-03 Thread Josh Rosen (JIRA)
Josh Rosen created SPARK-18256: -- Summary: Improve performance of event log replay in HistoryServer based on profiler results Key: SPARK-18256 URL: https://issues.apache.org/jira/browse/SPARK-18256 Projec

[jira] [Commented] (SPARK-14220) Build and test Spark against Scala 2.12

2016-11-03 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15634050#comment-15634050 ] Josh Rosen commented on SPARK-14220: SPARK-14643 is likely to be the hardest task. >

[jira] [Resolved] (SPARK-18236) Reduce memory usage of Spark UI and HistoryServer by reducing duplicate objects

2016-11-07 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-18236. Resolution: Fixed Fix Version/s: 2.2.0 Merged into master (2.2.0). > Reduce memory usage of

[jira] [Created] (SPARK-18362) Use TextFileFormat in implementation of JsonFileFormat and CSVFileFormat

2016-11-08 Thread Josh Rosen (JIRA)
Josh Rosen created SPARK-18362: -- Summary: Use TextFileFormat in implementation of JsonFileFormat and CSVFileFormat Key: SPARK-18362 URL: https://issues.apache.org/jira/browse/SPARK-18362 Project: Spark

[jira] [Created] (SPARK-18406) Race between end-of-task and completion iterator read lock release

2016-11-10 Thread Josh Rosen (JIRA)
Josh Rosen created SPARK-18406: -- Summary: Race between end-of-task and completion iterator read lock release Key: SPARK-18406 URL: https://issues.apache.org/jira/browse/SPARK-18406 Project: Spark

[jira] [Updated] (SPARK-18406) Race between end-of-task and completion iterator read lock release

2016-11-10 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-18406: --- Description: The following log comes from a production streaming job where executors periodically di
