[jira] [Comment Edited] (SPARK-24474) Cores are left idle when there are a lot of tasks to run

2018-07-04 Thread Al M (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16532910#comment-16532910 ] Al M edited comment on SPARK-24474 at 7/4/18 4:17 PM: -- My initial tests suggest

[jira] [Commented] (SPARK-24474) Cores are left idle when there are a lot of tasks to run

2018-07-04 Thread Al M (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16532910#comment-16532910 ] Al M commented on SPARK-24474: -- My initial tests suggest that this stops the issue from happening. Thanks!

[jira] [Commented] (SPARK-13127) Upgrade Parquet to 1.9 (Fixes parquet sorting)

2018-07-04 Thread Al M (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16532534#comment-16532534 ] Al M commented on SPARK-13127: -- Would be great to get this resolved in Spark 2.3.2. Especially since

[jira] [Updated] (SPARK-24474) Cores are left idle when there are a lot of tasks to run

2018-06-12 Thread Al M (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Al M updated SPARK-24474: - Summary: Cores are left idle when there are a lot of tasks to run (was: Cores are left idle when there are a

[jira] [Commented] (SPARK-24474) Cores are left idle when there are a lot of stages to run

2018-06-11 Thread Al M (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16507965#comment-16507965 ] Al M commented on SPARK-24474: -- Also tried changing spark.scheduler.mode to "FIFO"; that didn't fix the

[jira] [Commented] (SPARK-24474) Cores are left idle when there are a lot of stages to run

2018-06-07 Thread Al M (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16504730#comment-16504730 ] Al M commented on SPARK-24474: -- Sample code that reproduces this: {code:java} val builder =

[jira] [Updated] (SPARK-24474) Cores are left idle when there are a lot of stages to run

2018-06-07 Thread Al M (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Al M updated SPARK-24474: - Description: I've observed an issue happening consistently when: * A job contains a join of two datasets *

[jira] [Updated] (SPARK-24474) Cores are left idle when there are a lot of stages to run

2018-06-06 Thread Al M (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Al M updated SPARK-24474: - Description: I've observed an issue happening consistently when: * A job contains a join of two datasets *

[jira] [Commented] (SPARK-24474) Cores are left idle when there are a lot of stages to run

2018-06-06 Thread Al M (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16503225#comment-16503225 ] Al M commented on SPARK-24474: -- I appreciate that 2.2.0 is slightly old but I couldn't see any scheduler

[jira] [Updated] (SPARK-24474) Cores are left idle when there are a lot of stages to run

2018-06-06 Thread Al M (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Al M updated SPARK-24474: - Description: I've observed an issue happening consistently when: * A job contains a join of two datasets *

[jira] [Created] (SPARK-24474) Cores are left idle when there are a lot of stages to run

2018-06-06 Thread Al M (JIRA)
Al M created SPARK-24474: Summary: Cores are left idle when there are a lot of stages to run Key: SPARK-24474 URL: https://issues.apache.org/jira/browse/SPARK-24474 Project: Spark Issue Type: Bug

[jira] [Updated] (SPARK-24474) Cores are left idle when there are a lot of stages to run

2018-06-06 Thread Al M (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Al M updated SPARK-24474: - Description: I've observed an issue happening consistently when: * A job contains a join of two datasets *

[jira] [Updated] (SPARK-24306) Sort a Dataset with a lambda (like RDD.sortBy)

2018-05-17 Thread Al M (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Al M updated SPARK-24306: - Summary: Sort a Dataset with a lambda (like RDD.sortBy) (was: Sort a Dataset with a lambda (like RDD.sortBy()

[jira] [Updated] (SPARK-24306) Sort a Dataset with a lambda (like RDD.sortBy() )

2018-05-17 Thread Al M (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Al M updated SPARK-24306: - Summary: Sort a Dataset with a lambda (like RDD.sortBy() ) (was: Sort a Dataset with a lambda (like

[jira] [Created] (SPARK-24306) Sort a Dataset with a lambda (like RDD.sortBy()

2018-05-17 Thread Al M (JIRA)
Al M created SPARK-24306: Summary: Sort a Dataset with a lambda (like RDD.sortBy() Key: SPARK-24306 URL: https://issues.apache.org/jira/browse/SPARK-24306 Project: Spark Issue Type: Improvement

[jira] [Closed] (SPARK-14532) Spark SQL IF/ELSE does not handle Double correctly

2016-04-12 Thread Al M (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Al M closed SPARK-14532. Resolution: Fixed Fix Version/s: 2.0.0 > Spark SQL IF/ELSE does not handle Double correctly >

[jira] [Commented] (SPARK-14532) Spark SQL IF/ELSE does not handle Double correctly

2016-04-12 Thread Al M (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15237058#comment-15237058 ] Al M commented on SPARK-14532: -- Thanks Bo. I'll look forward to the 2.0 release :) > Spark SQL IF/ELSE

[jira] [Created] (SPARK-14532) Spark SQL IF/ELSE does not handle Double correctly

2016-04-11 Thread Al M (JIRA)
Al M created SPARK-14532: Summary: Spark SQL IF/ELSE does not handle Double correctly Key: SPARK-14532 URL: https://issues.apache.org/jira/browse/SPARK-14532 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-9872) Allow passing of 'numPartitions' to DataFrame joins

2015-08-18 Thread Al M (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14701158#comment-14701158 ] Al M commented on SPARK-9872: - I would also be happy if we just get the partition count from

[jira] [Updated] (SPARK-9872) Allow passing of 'numPartitions' to DataFrame joins

2015-08-12 Thread Al M (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Al M updated SPARK-9872: Description: When I join two normal RDDs, I can set the number of shuffle partitions in the 'numPartitions'

[jira] [Created] (SPARK-9872) Allow passing of 'numPartitions' to DataFrame joins

2015-08-12 Thread Al M (JIRA)
Al M created SPARK-9872: --- Summary: Allow passing of 'numPartitions' to DataFrame joins Key: SPARK-9872 URL: https://issues.apache.org/jira/browse/SPARK-9872 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-5768) Spark UI Shows incorrect memory under Yarn

2015-02-12 Thread Al M (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14318088#comment-14318088 ] Al M commented on SPARK-5768: - So when it says *Memory Used* 3.2GB / 20GB it actually means we

[jira] [Created] (SPARK-5768) Spark UI Shows incorrect memory under Yarn

2015-02-12 Thread Al M (JIRA)
Al M created SPARK-5768: --- Summary: Spark UI Shows incorrect memory under Yarn Key: SPARK-5768 URL: https://issues.apache.org/jira/browse/SPARK-5768 Project: Spark Issue Type: Bug Components:

[jira] [Commented] (SPARK-5270) Elegantly check if RDD is empty

2015-01-16 Thread Al M (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279962#comment-14279962 ] Al M commented on SPARK-5270: - Good point it's not a catch-all solution. The

[jira] [Commented] (SPARK-5270) Elegantly check if RDD is empty

2015-01-16 Thread Al M (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14280260#comment-14280260 ] Al M commented on SPARK-5270: - I don't mind at all. I'd be really happy to have such a

[jira] [Commented] (SPARK-5270) Elegantly check if RDD is empty

2015-01-15 Thread Al M (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14278983#comment-14278983 ] Al M commented on SPARK-5270: - I just noticed that rdd.partitions.size is set to 0 for empty

[jira] [Created] (SPARK-5270) Elegantly check if RDD is empty

2015-01-15 Thread Al M (JIRA)
Al M created SPARK-5270: --- Summary: Elegantly check if RDD is empty Key: SPARK-5270 URL: https://issues.apache.org/jira/browse/SPARK-5270 Project: Spark Issue Type: Improvement Affects Versions:

[jira] [Updated] (SPARK-5270) Elegantly check if RDD is empty

2015-01-15 Thread Al M (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Al M updated SPARK-5270: Description: Right now there is no clean way to check if an RDD is empty. As discussed here:

[jira] [Commented] (SPARK-5137) subtract does not take the spark.default.parallelism into account

2015-01-08 Thread Al M (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14269263#comment-14269263 ] Al M commented on SPARK-5137: - That's right. {code}a{code} has 11 partitions and

[jira] [Comment Edited] (SPARK-5137) subtract does not take the spark.default.parallelism into account

2015-01-08 Thread Al M (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14269263#comment-14269263 ] Al M edited comment on SPARK-5137 at 1/8/15 12:30 PM: -- That's right.

[jira] [Comment Edited] (SPARK-5137) subtract does not take the spark.default.parallelism into account

2015-01-08 Thread Al M (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14269263#comment-14269263 ] Al M edited comment on SPARK-5137 at 1/8/15 12:30 PM: -- That's right.

[jira] [Closed] (SPARK-5137) subtract does not take the spark.default.parallelism into account

2015-01-08 Thread Al M (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Al M closed SPARK-5137. --- Resolution: Not a Problem subtract does not take the spark.default.parallelism into account

[jira] [Commented] (SPARK-5137) subtract does not take the spark.default.parallelism into account

2015-01-08 Thread Al M (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268969#comment-14268969 ] Al M commented on SPARK-5137: - Yes I do mean subtractByKey. Sorry for not being clear. I'm

[jira] [Created] (SPARK-5137) subtract does not take the spark.default.parallelism into account

2015-01-07 Thread Al M (JIRA)
Al M created SPARK-5137: --- Summary: subtract does not take the spark.default.parallelism into account Key: SPARK-5137 URL: https://issues.apache.org/jira/browse/SPARK-5137 Project: Spark Issue Type: