[jira] [Commented] (SPARK-3333) Large number of partitions causes OOM

2014-08-31 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14117062#comment-14117062 ] Josh Rosen commented on SPARK-: --- [~shivaram] and I discussed this; we have a few ide

[jira] [Updated] (SPARK-2312) Spark Actors do not handle unknown messages in their receive methods

2014-09-01 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-2312: -- Assignee: Josh Rosen > Spark Actors do not handle unknown messages in their receive methods > --

[jira] [Updated] (SPARK-2312) Spark Actors do not handle unknown messages in their receive methods

2014-09-01 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-2312: -- Assignee: Isaias Barroso (was: Josh Rosen) > Spark Actors do not handle unknown messages in their recei

[jira] [Assigned] (SPARK-2638) Improve concurrency of fetching Map outputs

2014-09-01 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen reassigned SPARK-2638: - Assignee: Josh Rosen > Improve concurrency of fetching Map outputs >

[jira] [Commented] (SPARK-3333) Large number of partitions causes OOM

2014-09-01 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14117905#comment-14117905 ] Josh Rosen commented on SPARK-: --- Still investigating. I tried this on my laptop by

[jira] [Resolved] (SPARK-3331) PEP8 tests fail because they check unzipped py4j code

2014-09-02 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-3331. --- Resolution: Fixed Fix Version/s: 1.2.0 Issue resolved by pull request [https://github.com/

[jira] [Updated] (SPARK-3181) Add Robust Regression Algorithm with Huber Estimator

2014-09-02 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-3181: -- Assignee: (was: Matthew Farrellee) > Add Robust Regression Algorithm with Huber Estimator >

[jira] [Commented] (SPARK-3333) Large number of partitions causes OOM

2014-09-02 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14118695#comment-14118695 ] Josh Rosen commented on SPARK-: --- I was unable to reproduce this on a cluster with tw

[jira] [Commented] (SPARK-3333) Large number of partitions causes OOM

2014-09-02 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14118775#comment-14118775 ] Josh Rosen commented on SPARK-: --- Tried this on a m3.xlarge cluster (1 master, 1 work

[jira] [Updated] (SPARK-3333) Large number of partitions causes OOM

2014-09-02 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-: -- Attachment: spark--logs.zip I've attached some logs from my most recent run, showing a ~100 second

[jira] [Commented] (SPARK-3333) Large number of partitions causes OOM

2014-09-02 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14118826#comment-14118826 ] Josh Rosen commented on SPARK-: --- 360/220 is approximately 1.6. Using Splunk, I comp

[jira] [Created] (SPARK-3358) PySpark worker fork()ing performance regression in m3.* / PVM instances

2014-09-02 Thread Josh Rosen (JIRA)
Josh Rosen created SPARK-3358: - Summary: PySpark worker fork()ing performance regression in m3.* / PVM instances Key: SPARK-3358 URL: https://issues.apache.org/jira/browse/SPARK-3358 Project: Spark

[jira] [Commented] (SPARK-3358) PySpark worker fork()ing performance regression in m3.* / PVM instances

2014-09-02 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14119089#comment-14119089 ] Josh Rosen commented on SPARK-3358: --- Credit where it's due: Davies pointed out the poten

[jira] [Commented] (SPARK-3358) PySpark worker fork()ing performance regression in m3.* / PVM instances

2014-09-02 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14119093#comment-14119093 ] Josh Rosen commented on SPARK-3358: --- Update: that same microbenchmark that I posted abov

[jira] [Commented] (SPARK-3358) PySpark worker fork()ing performance regression in m3.* / PVM instances

2014-09-02 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14119145#comment-14119145 ] Josh Rosen commented on SPARK-3358: --- Yes, I meant to link the two issues. > PySpark wor

[jira] [Resolved] (SPARK-3309) Put all public API in __all__

2014-09-03 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-3309. --- Resolution: Fixed Fix Version/s: 1.2.0 Issue resolved by pull request 2205 [https://github.com/

[jira] [Commented] (SPARK-3358) PySpark worker fork()ing performance regression in m3.* / PVM instances

2014-09-03 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120354#comment-14120354 ] Josh Rosen commented on SPARK-3358: --- Agreed. Long term, I think it would be better to a

[jira] [Updated] (SPARK-3389) Add converter class to make reading Parquet files easy with PySpark

2014-09-03 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-3389: -- Component/s: PySpark > Add converter class to make reading Parquet files easy with PySpark > ---

[jira] [Commented] (SPARK-2491) When an OOM is thrown,the executor does not stop properly.

2014-09-03 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120699#comment-14120699 ] Josh Rosen commented on SPARK-2491: --- Can you provide a little more context for how to tr

[jira] [Commented] (SPARK-3030) reuse python worker

2014-09-03 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120801#comment-14120801 ] Josh Rosen commented on SPARK-3030: --- Do we ever clean up these workers? If we don't, th

[jira] [Commented] (SPARK-3030) reuse python worker

2014-09-03 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120812#comment-14120812 ] Josh Rosen commented on SPARK-3030: --- Well, I guess this can be configurable at first and

[jira] [Resolved] (SPARK-2435) Add shutdown hook to bin/pyspark

2014-09-03 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-2435. --- Resolution: Fixed Fix Version/s: (was: 1.1.0) 1.2.0 Issue resolved by pu

[jira] [Updated] (SPARK-2435) Add shutdown hook to bin/pyspark

2014-09-03 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-2435: -- Assignee: Matthew Farrellee (was: Josh Rosen) > Add shutdown hook to bin/pyspark >

[jira] [Resolved] (SPARK-1078) Replace lift-json with json4s-jackson

2014-09-04 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-1078. --- Resolution: Fixed Fix Version/s: 1.0.0 It looks like this was fixed in SPARK-1132 / Spark 1.0.0

[jira] [Updated] (SPARK-3286) Cannot view ApplicationMaster UI when Yarn’s url scheme is https

2014-09-04 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-3286: -- Component/s: Web UI > Cannot view ApplicationMaster UI when Yarn’s url scheme is https > ---

[jira] [Updated] (SPARK-2015) Spark UI issues at scale

2014-09-04 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-2015: -- Component/s: Web UI > Spark UI issues at scale > > > Key: SPARK

[jira] [Assigned] (SPARK-3061) Maven build fails in Windows OS

2014-09-04 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen reassigned SPARK-3061: - Assignee: Andrew Or (was: Josh Rosen) Re-assigning to Andrew, who's going to backport it. > Mav

[jira] [Updated] (SPARK-2334) Attribute Error calling PipelinedRDD.id() in pyspark

2014-09-04 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-2334: -- Affects Version/s: 1.1.0 > Attribute Error calling PipelinedRDD.id() in pyspark > --

[jira] [Resolved] (SPARK-3399) Test for PySpark should ignore HADOOP_CONF_DIR and YARN_CONF_DIR

2014-09-05 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-3399. --- Resolution: Fixed Fix Version/s: 1.2.0 Issue resolved by pull request 2270 [https://github.com/

[jira] [Commented] (SPARK-2491) When an OOM is thrown,the executor does not stop properly.

2014-09-05 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14123326#comment-14123326 ] Josh Rosen commented on SPARK-2491: --- Ah, I see. It looks like we don't want to display

[jira] [Updated] (SPARK-2714) DAGScheduler should log jobid when runJob finishes

2014-09-05 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-2714: -- Summary: DAGScheduler should log jobid when runJob finishes (was: DAGScheduler logs jobid when runJob f

[jira] [Updated] (SPARK-2714) DAGScheduler should log jobid when runJob finishes

2014-09-05 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-2714: -- Description: When DAGScheduler concurrently runs multiple jobs, SparkContext only logs "Job finished" an

[jira] [Resolved] (SPARK-3406) Python persist API does not have a default storage level

2014-09-06 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-3406. --- Resolution: Fixed Fix Version/s: 1.2.0 Fixed by Holden in https://github.com/apache/spark/pull/

[jira] [Resolved] (SPARK-3397) Bump pom.xml version number of master branch to 1.2.0-SNAPSHOT

2014-09-06 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-3397. --- Resolution: Fixed Fix Version/s: 1.2.0 Issue resolved by pull request 2268 [https://github.com/

[jira] [Resolved] (SPARK-3301) The spark version in the welcome message of pyspark is not correct

2014-09-06 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-3301. --- Resolution: Fixed Fix Version/s: 1.2.0 > The spark version in the welcome message of pyspark is

[jira] [Resolved] (SPARK-3273) We should read the version information from the same place.

2014-09-06 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-3273. --- Resolution: Fixed Fix Version/s: 1.2.0 Issue resolved by pull request 2175 [https://github.com/

[jira] [Resolved] (SPARK-2334) Attribute Error calling PipelinedRDD.id() in pyspark

2014-09-06 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-2334. --- Resolution: Fixed Fix Version/s: 1.2.0 Issue resolved by pull request 2296 [https://github.com/

[jira] [Updated] (SPARK-2232) Fix Jenkins tests in Maven

2014-09-07 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-2232: -- Priority: Critical (was: Major) > Fix Jenkins tests in Maven > -- > >

[jira] [Updated] (SPARK-2232) Fix Jenkins tests in Maven

2014-09-07 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-2232: -- Priority: Blocker (was: Critical) > Fix Jenkins tests in Maven > -- > >

[jira] [Updated] (SPARK-2232) Fix Jenkins tests in Maven

2014-09-07 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-2232: -- Description: It appears Maven tests are failing under the newer Hadoop configurations. We need to go th

[jira] [Created] (SPARK-3433) Mima false-positives with @DeveloperAPI and @Experimental annotations

2014-09-07 Thread Josh Rosen (JIRA)
Josh Rosen created SPARK-3433: - Summary: Mima false-positives with @DeveloperAPI and @Experimental annotations Key: SPARK-3433 URL: https://issues.apache.org/jira/browse/SPARK-3433 Project: Spark

[jira] [Resolved] (SPARK-3415) Using sys.stderr in pyspark results in error

2014-09-07 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-3415. --- Resolution: Fixed Fix Version/s: 1.2.0 Issue resolved by pull request 2287 [https://github.com/

[jira] [Resolved] (SPARK-675) Gateway JVM should ask for less than SPARK_MEM memory

2014-09-08 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-675. -- Resolution: Invalid Thanks for the reminder. I'm going to close this since it only affected a very old

[jira] [Resolved] (SPARK-3047) add an option to use str in textFileRDD()

2014-09-11 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-3047. --- Resolution: Fixed Fix Version/s: 1.2.0 Issue resolved by pull request 1951 [https://github.com/

[jira] [Updated] (SPARK-1579) PySpark should distinguish expected IOExceptions from unexpected ones in the worker

2014-09-12 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-1579: -- Fix Version/s: (was: 1.1.0) 1.0.0 > PySpark should distinguish expected IOExcepti

[jira] [Commented] (SPARK-3500) SchemaRDD from jsonRDD() has not coalesce() method

2014-09-12 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14132420#comment-14132420 ] Josh Rosen commented on SPARK-3500: --- This feels like a bug, not a missing feature, since

[jira] [Resolved] (SPARK-3094) Support run pyspark in PyPy

2014-09-12 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-3094. --- Resolution: Fixed Fix Version/s: 1.2.0 Issue resolved by pull request 2144 [https://github.com/

[jira] [Resolved] (SPARK-3500) SchemaRDD from jsonRDD() has not coalesce() method

2014-09-12 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-3500. --- Resolution: Fixed Fix Version/s: 1.1.1 1.2.0 Issue resolved by pull request

[jira] [Resolved] (SPARK-3030) reuse python worker

2014-09-13 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-3030. --- Resolution: Fixed Fix Version/s: 1.2.0 Issue resolved by pull request 2259 [https://github.com/

[jira] [Resolved] (SPARK-3463) Show metrics about spilling in Python

2014-09-13 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-3463. --- Resolution: Fixed Fix Version/s: 1.2.0 Issue resolved by pull request 2336 [https://github.com/

[jira] [Commented] (SPARK-3534) Avoid running MLlib and Streaming tests when testing SQL PRs

2014-09-15 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14134290#comment-14134290 ] Josh Rosen commented on SPARK-3534: --- Looks like this has been proposed before: SPARK-145

[jira] [Updated] (SPARK-1517) Publish nightly snapshots of documentation, maven artifacts, and binary builds

2014-09-15 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-1517: -- Fix Version/s: (was: 1.1.0) 1.2.0 We should revisit this for the 1.2.0 release cy

[jira] [Resolved] (SPARK-3104) Jenkins failing to test some PRs when asked to

2014-09-15 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-3104. --- Resolution: Cannot Reproduce Resolving this as "cannot reproduce" for now, since Jenkins seems to have

[jira] [Resolved] (SPARK-2232) Fix Jenkins tests in Maven

2014-09-15 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-2232. --- Resolution: Fixed This has been fixed; the Maven builds have now been green for a few days. > Fix Jen

[jira] [Resolved] (SPARK-2951) SerDeUtils.pythonToPairRDD fails on RDDs of pickled array.arrays in Python 2.6

2014-09-15 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-2951. --- Resolution: Fixed Fix Version/s: 1.2.0 Issue resolved by pull request 2365 [https://github.com/

[jira] [Resolved] (SPARK-1087) Separate file for traceback and callsite related functions

2014-09-15 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-1087. --- Resolution: Fixed Fix Version/s: 1.2.0 Issue resolved by pull request 2385 [https://github.com/

[jira] [Commented] (SPARK-922) Update Spark AMI to Python 2.7

2014-09-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14135860#comment-14135860 ] Josh Rosen commented on SPARK-922: -- [~nchammas] In the long run, it might be nice to autom

[jira] [Resolved] (SPARK-3519) PySpark RDDs are missing the distinct(n) method

2014-09-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-3519. --- Resolution: Fixed Fix Version/s: 1.2.0 Issue resolved by pull request 2383 [https://github.com/

[jira] [Created] (SPARK-3556) Monitoring and debugging improvements (Spark 1.2)

2014-09-16 Thread Josh Rosen (JIRA)
Josh Rosen created SPARK-3556: - Summary: Monitoring and debugging improvements (Spark 1.2) Key: SPARK-3556 URL: https://issues.apache.org/jira/browse/SPARK-3556 Project: Spark Issue Type: Umbrell

[jira] [Updated] (SPARK-3556) Monitoring and debugging improvements (Spark 1.2)

2014-09-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-3556: -- Issue Type: Epic (was: Umbrella) > Monitoring and debugging improvements (Spark 1.2) >

[jira] [Updated] (SPARK-3556) Monitoring and debugging improvements (Spark 1.2)

2014-09-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-3556: -- Epic Name: Monitoring and debugging improvements (Spark 1.2) > Monitoring and debugging improvements (Sp

[jira] [Updated] (SPARK-3067) JobProgressPage could not show Fair Scheduler Pools section sometimes

2014-09-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-3067: -- Component/s: Web UI > JobProgressPage could not show Fair Scheduler Pools section sometimes > --

[jira] [Commented] (SPARK-3067) JobProgressPage could not show Fair Scheduler Pools section sometimes

2014-09-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14136387#comment-14136387 ] Josh Rosen commented on SPARK-3067: --- Do you think SPARK-1208 sounds related to this? >

[jira] [Resolved] (SPARK-2414) Remove jquery

2014-09-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-2414. --- Resolution: Won't Fix Resolving this as "Won't Fix", since several of the web UI visualization PRs wi

[jira] [Created] (SPARK-3558) Throw exception for concurrently-running SparkContexts / StreamingContexts in the same JVM

2014-09-16 Thread Josh Rosen (JIRA)
Josh Rosen created SPARK-3558: - Summary: Throw exception for concurrently-running SparkContexts / StreamingContexts in the same JVM Key: SPARK-3558 URL: https://issues.apache.org/jira/browse/SPARK-3558 Pr

[jira] [Resolved] (SPARK-2463) Creating multiple StreamingContexts from shell generates duplicate Streaming tabs in UI

2014-09-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-2463. --- Resolution: Invalid Going to resolve this as "Invalid", since we don't currently support concurrently

[jira] [Reopened] (SPARK-2463) Creating multiple StreamingContexts from shell generates duplicate Streaming tabs in UI

2014-09-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen reopened SPARK-2463: --- Assignee: Josh Rosen [~nchammas] Good point. I'll re-open and investigate. > Creating multiple Str

[jira] [Resolved] (SPARK-744) BlockManagerUI with no RDD: java.lang.UnsupportedOperationException: empty.reduceLeft

2014-09-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-744. -- Resolution: Incomplete > BlockManagerUI with no RDD: java.lang.UnsupportedOperationException: > empty.re

[jira] [Updated] (SPARK-611) Allow JStack to be run from web UI

2014-09-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-611: - Target Version/s: 1.2.0 > Allow JStack to be run from web UI > -- > >

[jira] [Updated] (SPARK-611) Allow JStack to be run from web UI

2014-09-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-611: - Summary: Allow JStack to be run from web UI (was: Expose basic JVM metrics in WebUI.) > Allow JStack to b

[jira] [Updated] (SPARK-2105) SparkUI doesn't remove active stages that failed

2014-09-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-2105: -- Component/s: Web UI > SparkUI doesn't remove active stages that failed > ---

[jira] [Updated] (SPARK-1622) Expose input split(s) accessed by a task in UI or logs

2014-09-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-1622: -- Component/s: Web UI > Expose input split(s) accessed by a task in UI or logs > -

[jira] [Updated] (SPARK-3074) support groupByKey() with hot keys in PySpark

2014-09-17 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-3074: -- Component/s: PySpark > support groupByKey() with hot keys in PySpark > -

[jira] [Commented] (SPARK-2321) Design a proper progress reporting & event listener API

2014-09-17 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14138242#comment-14138242 ] Josh Rosen commented on SPARK-2321: --- I agree that this should be a pull API. A pull-bas

[jira] [Resolved] (SPARK-3554) handle large dataset in closure of PySpark

2014-09-18 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-3554. --- Resolution: Fixed Fix Version/s: 1.2.0 Issue resolved by pull request 2417 [https://github.com/

[jira] [Resolved] (SPARK-1701) Inconsistent naming: "slice" or "partition"

2014-09-19 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-1701. --- Resolution: Fixed Fix Version/s: 1.2.0 Issue resolved by pull request 2304 [https://github.com/

[jira] [Created] (SPARK-3616) Add Selenium tests to Web UI

2014-09-20 Thread Josh Rosen (JIRA)
Josh Rosen created SPARK-3616: - Summary: Add Selenium tests to Web UI Key: SPARK-3616 URL: https://issues.apache.org/jira/browse/SPARK-3616 Project: Spark Issue Type: Improvement Compon

[jira] [Updated] (SPARK-3610) History server log name should not be based on user input

2014-09-20 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-3610: -- Component/s: Web UI > History server log name should not be based on user input > --

[jira] [Updated] (SPARK-1966) Cannot cancel tasks running locally

2014-09-20 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-1966: -- Affects Version/s: 1.1.0 > Cannot cancel tasks running locally > --- > >

[jira] [Commented] (SPARK-1966) Cannot cancel tasks running locally

2014-09-20 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14142333#comment-14142333 ] Josh Rosen commented on SPARK-1966: --- I think this is still an issue even in 1.1.0; I ran

[jira] [Commented] (SPARK-1966) Cannot cancel tasks running locally

2014-09-20 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14142334#comment-14142334 ] Josh Rosen commented on SPARK-1966: --- Actually, scratch that; it wasn't an issue since lo

[jira] [Created] (SPARK-3626) Replace AsyncRDDActions with a more general async. API

2014-09-21 Thread Josh Rosen (JIRA)
Josh Rosen created SPARK-3626: - Summary: Replace AsyncRDDActions with a more general async. API Key: SPARK-3626 URL: https://issues.apache.org/jira/browse/SPARK-3626 Project: Spark Issue Type: Im

[jira] [Commented] (SPARK-2321) Design a proper progress reporting & event listener API

2014-09-21 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14142772#comment-14142772 ] Josh Rosen commented on SPARK-2321: --- The scheduler has some data structures like StageIn

[jira] [Created] (SPARK-3634) Python modules added through addPyFile should take precedence over system modules

2014-09-21 Thread Josh Rosen (JIRA)
Josh Rosen created SPARK-3634: - Summary: Python modules added through addPyFile should take precedence over system modules Key: SPARK-3634 URL: https://issues.apache.org/jira/browse/SPARK-3634 Project: Sp

[jira] [Commented] (SPARK-2321) Design a proper progress reporting & event listener API

2014-09-22 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14143425#comment-14143425 ] Josh Rosen commented on SPARK-2321: --- {quote} ... maybe we should redesign the SparkListe

[jira] [Updated] (SPARK-3588) Gaussian Mixture Model clustering

2014-09-22 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-3588: -- Assignee: Meethu Mathew > Gaussian Mixture Model clustering > - > >

[jira] [Commented] (SPARK-3431) Parallelize execution of tests

2014-09-22 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14143559#comment-14143559 ] Josh Rosen commented on SPARK-3431: --- It would be great to address this soon, since sever

[jira] [Resolved] (SPARK-2373) RDD add span function (split an RDD to two RDD based on user's function)

2014-09-22 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-2373. --- Resolution: Won't Fix Resolving this as "Won't Fix", per discussion on the PR. [Matei said|https://g

[jira] [Created] (SPARK-3644) REST API for Spark application info (jobs / stages / tasks / storage info)

2014-09-22 Thread Josh Rosen (JIRA)
Josh Rosen created SPARK-3644: - Summary: REST API for Spark application info (jobs / stages / tasks / storage info) Key: SPARK-3644 URL: https://issues.apache.org/jira/browse/SPARK-3644 Project: Spark

[jira] [Updated] (SPARK-3644) REST API for Spark application info (jobs / stages / tasks / storage info)

2014-09-22 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-3644: -- Assignee: (was: Josh Rosen) > REST API for Spark application info (jobs / stages / tasks / storage i

[jira] [Updated] (SPARK-3644) REST API for Spark application info (jobs / stages / tasks / storage info)

2014-09-22 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-3644: -- Description: This JIRA is a forum to draft a design proposal for a REST interface for accessing informa

[jira] [Commented] (SPARK-3431) Parallelize execution of tests

2014-09-22 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14143684#comment-14143684 ] Josh Rosen commented on SPARK-3431: --- [~nchammas] I'm not sure. The different test suite

[jira] [Commented] (SPARK-3642) Better document the nuances of shared variables

2014-09-23 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14145918#comment-14145918 ] Josh Rosen commented on SPARK-3642: --- I've linked this JIRA to a couple of related ticket

[jira] [Resolved] (SPARK-3634) Python modules added through addPyFile should take precedence over system modules

2014-09-24 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-3634. --- Resolution: Fixed Fix Version/s: 1.2.0 Issue resolved by pull request 2492 [https://github.com/

[jira] [Resolved] (SPARK-3679) pickle the exact globals of functions

2014-09-24 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-3679. --- Resolution: Fixed Fix Version/s: 1.2.0 Issue resolved by pull request 2522 [https://github.com/

[jira] [Commented] (SPARK-889) Bring back DFS broadcast

2014-09-24 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14147032#comment-14147032 ] Josh Rosen commented on SPARK-889: -- In fact, I think [~rxin] has some JIRAs and PRs to mak

[jira] [Commented] (SPARK-3639) Kinesis examples set master as local

2014-09-24 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14147042#comment-14147042 ] Josh Rosen commented on SPARK-3639: --- This sounds reasonable to me; feel free to open a P

[jira] [Comment Edited] (SPARK-1823) ExternalAppendOnlyMap can still OOM if one key is very large

2014-09-25 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14148208#comment-14148208 ] Josh Rosen edited comment on SPARK-1823 at 9/25/14 7:42 PM: SP

[jira] [Commented] (SPARK-1823) ExternalAppendOnlyMap can still OOM if one key is very large

2014-09-25 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14148208#comment-14148208 ] Josh Rosen commented on SPARK-1823: --- SPARK-3074 is a related issue for PySpark. > Exter

[jira] [Commented] (SPARK-3690) Closing shuffle writers we swallow more important exception

2014-09-25 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14148365#comment-14148365 ] Josh Rosen commented on SPARK-3690: --- For additional context, here's the mailing list thr
