[jira] [Created] (SPARK-2937) Separate out sampleByKeyExact in PairRDDFunctions as its own API

2014-08-09 Thread Doris Xin (JIRA)
Doris Xin created SPARK-2937: Summary: Separate out sampleByKeyExact in PairRDDFunctions as its own API Key: SPARK-2937 URL: https://issues.apache.org/jira/browse/SPARK-2937 Project: Spark

[jira] [Commented] (SPARK-2937) Separate out sampleByKeyExact in PairRDDFunctions as its own API

2014-08-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14091681#comment-14091681 ] Apache Spark commented on SPARK-2937: User 'dorx' has created a pull request for this

[jira] [Commented] (SPARK-2926) Add MR-style (merge-sort) SortShuffleReader for sort-based shuffle

2014-08-09 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14091684#comment-14091684 ] Saisai Shao commented on SPARK-2926: Hi Sandy, Thanks a lot for your comments, basic

[jira] [Created] (SPARK-2938) Support SASL authentication in Netty network module

2014-08-09 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-2938: Summary: Support SASL authentication in Netty network module Key: SPARK-2938 URL: https://issues.apache.org/jira/browse/SPARK-2938 Project: Spark Issue Type:

[jira] [Created] (SPARK-2939) Support fetching in-memory blocks for Netty network module

2014-08-09 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-2939: Summary: Support fetching in-memory blocks for Netty network module Key: SPARK-2939 URL: https://issues.apache.org/jira/browse/SPARK-2939 Project: Spark Issue

[jira] [Commented] (SPARK-2926) Add MR-style (merge-sort) SortShuffleReader for sort-based shuffle

2014-08-09 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14091685#comment-14091685 ] Matei Zaharia commented on SPARK-2926: Hey Saisai, a couple of questions about this:

[jira] [Updated] (SPARK-2468) zero-copy shuffle network communication

2014-08-09 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-2468: Description: Right now shuffle send goes through the block manager. This is inefficient because it

[jira] [Created] (SPARK-2940) Support fetching multiple blocks in a single request in Netty network module

2014-08-09 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-2940: Summary: Support fetching multiple blocks in a single request in Netty network module Key: SPARK-2940 URL: https://issues.apache.org/jira/browse/SPARK-2940 Project:

[jira] [Created] (SPARK-2941) Add config option to support NIO vs OIO in Netty network module

2014-08-09 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-2941: Summary: Add config option to support NIO vs OIO in Netty network module Key: SPARK-2941 URL: https://issues.apache.org/jira/browse/SPARK-2941 Project: Spark

[jira] [Created] (SPARK-2942) Report error messages back from server to client

2014-08-09 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-2942: Summary: Report error messages back from server to client Key: SPARK-2942 URL: https://issues.apache.org/jira/browse/SPARK-2942 Project: Spark Issue Type:

[jira] [Updated] (SPARK-2936) Migrate Netty network module from Java to Scala

2014-08-09 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-2936: Issue Type: Sub-task (was: Improvement) Parent: SPARK-2468 Migrate Netty network module

[jira] [Updated] (SPARK-2939) Support fetching in-memory blocks for Netty network module

2014-08-09 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-2939: Target Version/s: 1.2.0 Support fetching in-memory blocks for Netty network module

[jira] [Updated] (SPARK-2942) Report error messages back from server to client

2014-08-09 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-2942: Target Version/s: 1.2.0 Report error messages back from server to client

[jira] [Created] (SPARK-2943) Create config options for Netty sendBufferSize and receiveBufferSize

2014-08-09 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-2943: Summary: Create config options for Netty sendBufferSize and receiveBufferSize Key: SPARK-2943 URL: https://issues.apache.org/jira/browse/SPARK-2943 Project: Spark

[jira] [Created] (SPARK-2944) sc.makeRDD doesn't distribute partitions evenly

2014-08-09 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-2944: Summary: sc.makeRDD doesn't distribute partitions evenly Key: SPARK-2944 URL: https://issues.apache.org/jira/browse/SPARK-2944 Project: Spark Issue Type:

[jira] [Updated] (SPARK-2944) sc.makeRDD doesn't distribute partitions evenly

2014-08-09 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-2944: Description: 16 nodes EC2 cluster: {code} val rdd = sc.makeRDD(0 until 1e9.toInt, 1000).cache()

[jira] [Resolved] (SPARK-2861) Doc comment of DoubleRDDFunctions.histogram is incorrect

2014-08-09 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-2861. Resolution: Fixed Fix Version/s: 1.1.0 Issue resolved by pull request 1786

[jira] [Updated] (SPARK-2861) Doc comment of DoubleRDDFunctions.histogram is incorrect

2014-08-09 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2861: Assignee: Chandan Kumar Doc comment of DoubleRDDFunctions.histogram is incorrect

[jira] [Commented] (SPARK-2944) sc.makeRDD doesn't distribute partitions evenly

2014-08-09 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14091699#comment-14091699 ] Patrick Wendell commented on SPARK-2944: Hey [~mengxr], do you know how the

[jira] [Created] (SPARK-2945) Allow specifying num of executors in the context configuration

2014-08-09 Thread Shay Rojansky (JIRA)
Shay Rojansky created SPARK-2945: Summary: Allow specifying num of executors in the context configuration Key: SPARK-2945 URL: https://issues.apache.org/jira/browse/SPARK-2945 Project: Spark

[jira] [Created] (SPARK-2946) Allow specifying * for --num-executors in YARN

2014-08-09 Thread Shay Rojansky (JIRA)
Shay Rojansky created SPARK-2946: Summary: Allow specifying * for --num-executors in YARN Key: SPARK-2946 URL: https://issues.apache.org/jira/browse/SPARK-2946 Project: Spark Issue Type:

[jira] [Commented] (SPARK-2931) getAllowedLocalityLevel() throws ArrayIndexOutOfBoundsException

2014-08-09 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14091700#comment-14091700 ] Patrick Wendell commented on SPARK-2931: [~matei] Hey Matei - IIRC you looked at

[jira] [Commented] (SPARK-2945) Allow specifying num of executors in the context configuration

2014-08-09 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14091701#comment-14091701 ] Patrick Wendell commented on SPARK-2945: Hey [~roji] I believe this already exists

[jira] [Commented] (SPARK-1406) PMML model evaluation support via MLib

2014-08-09 Thread Vincenzo Selvaggio (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14091738#comment-14091738 ] Vincenzo Selvaggio commented on SPARK-1406: I agree with Sean, I could see the

[jira] [Commented] (SPARK-1406) PMML model evaluation support via MLib

2014-08-09 Thread Vincenzo Selvaggio (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14091747#comment-14091747 ] Vincenzo Selvaggio commented on SPARK-1406: Thanks for clarifying. PMML model

[jira] [Commented] (SPARK-2931) getAllowedLocalityLevel() throws ArrayIndexOutOfBoundsException

2014-08-09 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14091746#comment-14091746 ] Mridul Muralidharan commented on SPARK-2931: [~kayousterhout] this is weird, I

[jira] [Commented] (SPARK-2926) Add MR-style (merge-sort) SortShuffleReader for sort-based shuffle

2014-08-09 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14091753#comment-14091753 ] Saisai Shao commented on SPARK-2926: Hi Matei, thanks a lot for your comments. The

[jira] [Comment Edited] (SPARK-2926) Add MR-style (merge-sort) SortShuffleReader for sort-based shuffle

2014-08-09 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14091753#comment-14091753 ] Saisai Shao edited comment on SPARK-2926 at 8/9/14 2:09 PM: Hi

[jira] [Created] (SPARK-2947) DAGScheduler scheduling dead cycle

2014-08-09 Thread Guoqiang Li (JIRA)
Guoqiang Li created SPARK-2947: Summary: DAGScheduler scheduling dead cycle Key: SPARK-2947 URL: https://issues.apache.org/jira/browse/SPARK-2947 Project: Spark Issue Type: Bug

[jira] [Updated] (SPARK-2947) DAGScheduler scheduling dead cycle

2014-08-09 Thread Guoqiang Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guoqiang Li updated SPARK-2947: Fix Version/s: 1.0.3 DAGScheduler scheduling dead cycle

[jira] [Updated] (SPARK-2947) DAGScheduler scheduling dead cycle

2014-08-09 Thread Guoqiang Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guoqiang Li updated SPARK-2947: Fix Version/s: 1.1.0 DAGScheduler scheduling dead cycle

[jira] [Updated] (SPARK-2947) DAGScheduler scheduling dead cycle

2014-08-09 Thread Guoqiang Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guoqiang Li updated SPARK-2947: Description: Stage to resubmit more than 5 times. This seems to be caused by

[jira] [Commented] (SPARK-2944) sc.makeRDD doesn't distribute partitions evenly

2014-08-09 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14091784#comment-14091784 ] Xiangrui Meng commented on SPARK-2944: I checked that one first. It was okay after

[jira] [Commented] (SPARK-2945) Allow specifying num of executors in the context configuration

2014-08-09 Thread Sandy Ryza (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14091817#comment-14091817 ] Sandy Ryza commented on SPARK-2945: spark.executor.instances apparently isn't used for

[jira] [Updated] (SPARK-2931) getAllowedLocalityLevel() throws ArrayIndexOutOfBoundsException

2014-08-09 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-2931: Attachment: scala-sort-by-key.err @ [~kayousterhout]: I can see how that code in {{executorLost()}}

[jira] [Updated] (SPARK-2931) getAllowedLocalityLevel() throws ArrayIndexOutOfBoundsException

2014-08-09 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mridul Muralidharan updated SPARK-2931: Attachment: test.patch A patch to showcase the exception getAllowedLocalityLevel()

[jira] [Commented] (SPARK-2931) getAllowedLocalityLevel() throws ArrayIndexOutOfBoundsException

2014-08-09 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14091881#comment-14091881 ] Mridul Muralidharan commented on SPARK-2931: [~joshrosen] [~kayousterhout]

[jira] [Commented] (SPARK-2948) PySpark doesn't work on Python 2.6

2014-08-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14091907#comment-14091907 ] Apache Spark commented on SPARK-2948: User 'sarutak' has created a pull request for

[jira] [Created] (SPARK-2949) SparkContext does not fate-share with ActorSystem

2014-08-09 Thread Aaron Davidson (JIRA)
Aaron Davidson created SPARK-2949: Summary: SparkContext does not fate-share with ActorSystem Key: SPARK-2949 URL: https://issues.apache.org/jira/browse/SPARK-2949 Project: Spark Issue

[jira] [Updated] (SPARK-2948) PySpark doesn't work on Python 2.6

2014-08-09 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-2948: Affects Version/s: (was: 1.0.2) 1.1.0 PySpark doesn't work on Python 2.6

[jira] [Created] (SPARK-2950) Add GC time and Shuffle Write time to JobLogger output

2014-08-09 Thread Shivaram Venkataraman (JIRA)
Shivaram Venkataraman created SPARK-2950: Summary: Add GC time and Shuffle Write time to JobLogger output Key: SPARK-2950 URL: https://issues.apache.org/jira/browse/SPARK-2950 Project: Spark

[jira] [Created] (SPARK-2951) SerDeUtils.pythonToPairRDD fails on RDDs of pickled array.arrays in Python 2.6

2014-08-09 Thread Josh Rosen (JIRA)
Josh Rosen created SPARK-2951: Summary: SerDeUtils.pythonToPairRDD fails on RDDs of pickled array.arrays in Python 2.6 Key: SPARK-2951 URL: https://issues.apache.org/jira/browse/SPARK-2951 Project:

[jira] [Commented] (SPARK-2871) Missing API in PySpark

2014-08-09 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14091986#comment-14091986 ] Josh Rosen commented on SPARK-2871: There's actually an open PR for this that's

[jira] [Resolved] (SPARK-1766) Move reduceByKey definitions next to each other in PairRDDFunctions

2014-08-09 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-1766. Resolution: Fixed Fix Version/s: 1.1.0 Assignee: Chris Cope

[jira] [Resolved] (SPARK-2894) spark-shell doesn't accept flags

2014-08-09 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-2894. Resolution: Duplicate spark-shell doesn't accept flags

[jira] [Commented] (SPARK-2894) spark-shell doesn't accept flags

2014-08-09 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14091996#comment-14091996 ] Patrick Wendell commented on SPARK-2894: I closed this in favor of SPARK-2678

[jira] [Updated] (SPARK-2678) `Spark-submit` overrides user application options

2014-08-09 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2678: Assignee: Kousuke Saruta (was: Cheng Lian) `Spark-submit` overrides user application

[jira] [Created] (SPARK-2952) Enable logging actor messages at DEBUG level

2014-08-09 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-2952: Summary: Enable logging actor messages at DEBUG level Key: SPARK-2952 URL: https://issues.apache.org/jira/browse/SPARK-2952 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-2952) Enable logging actor messages at DEBUG level

2014-08-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14092012#comment-14092012 ] Apache Spark commented on SPARK-2952: User 'rxin' has created a pull request for this

[jira] [Updated] (SPARK-2952) Enable logging actor messages at DEBUG level

2014-08-09 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-2952: Target Version/s: 1.1.0 Enable logging actor messages at DEBUG level

[jira] [Updated] (SPARK-2952) Enable logging actor messages at DEBUG level

2014-08-09 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-2952: Component/s: Spark Core Enable logging actor messages at DEBUG level

[jira] [Commented] (SPARK-2907) Use mutable.HashMap to represent Model in Word2Vec

2014-08-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14092014#comment-14092014 ] Apache Spark commented on SPARK-2907: User 'Ishiihara' has created a pull request for