[jira] [Commented] (SPARK-6223) Avoid Build warning- enable implicit value scala.language.existentials visible

2015-03-09 Thread Vinod KC (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14352837#comment-14352837 ] Vinod KC commented on SPARK-6223: - I'm working on this Avoid Build warning- enable

[jira] [Closed] (SPARK-6224) Also collect NamedExpressions in PhysicalOperation

2015-03-09 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang-Chi Hsieh closed SPARK-6224. -- Resolution: Not a Problem Also collect NamedExpressions in PhysicalOperation

[jira] [Created] (SPARK-6224) Also collect NamedExpressions in PhysicalOperation

2015-03-09 Thread Liang-Chi Hsieh (JIRA)
Liang-Chi Hsieh created SPARK-6224: -- Summary: Also collect NamedExpressions in PhysicalOperation Key: SPARK-6224 URL: https://issues.apache.org/jira/browse/SPARK-6224 Project: Spark Issue

[jira] [Created] (SPARK-6225) Resolve most build warnings, 1.3.0 edition

2015-03-09 Thread Sean Owen (JIRA)
Sean Owen created SPARK-6225: Summary: Resolve most build warnings, 1.3.0 edition Key: SPARK-6225 URL: https://issues.apache.org/jira/browse/SPARK-6225 Project: Spark Issue Type: Improvement

[jira] [Resolved] (SPARK-6188) Instance types can be mislabeled when re-starting cluster with default arguments

2015-03-09 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-6188. -- Resolution: Fixed Fix Version/s: 1.4.0 Issue resolved by pull request 4916

[jira] [Commented] (SPARK-6224) Also collect NamedExpressions in PhysicalOperation

2015-03-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14352945#comment-14352945 ] Apache Spark commented on SPARK-6224: - User 'viirya' has created a pull request for

[jira] [Commented] (SPARK-5986) Model import/export for KMeansModel

2015-03-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353022#comment-14353022 ] Apache Spark commented on SPARK-5986: - User 'yinxusen' has created a pull request for

[jira] [Created] (SPARK-6223) Avoid Build warning- enable implicit value scala.language.existentials visible

2015-03-09 Thread Vinod KC (JIRA)
Vinod KC created SPARK-6223: --- Summary: Avoid Build warning- enable implicit value scala.language.existentials visible Key: SPARK-6223 URL: https://issues.apache.org/jira/browse/SPARK-6223 Project: Spark

[jira] [Commented] (SPARK-6223) Avoid Build warning- enable implicit value scala.language.existentials visible

2015-03-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14352866#comment-14352866 ] Apache Spark commented on SPARK-6223: - User 'vinodkc' has created a pull request for

[jira] [Commented] (SPARK-6225) Resolve most build warnings, 1.3.0 edition

2015-03-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14352984#comment-14352984 ] Apache Spark commented on SPARK-6225: - User 'srowen' has created a pull request for

[jira] [Commented] (SPARK-3066) Support recommendAll in matrix factorization model

2015-03-09 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353129#comment-14353129 ] Sean Owen commented on SPARK-3066: -- My anecdotal experience with it was that getting an

[jira] [Updated] (SPARK-6188) Instance types can be mislabeled when re-starting cluster with default arguments

2015-03-09 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-6188: - Shepherd: (was: Josh Rosen) Assignee: Theodore Vasiloudis Instance types can be mislabeled when

[jira] [Resolved] (SPARK-6223) Avoid Build warning- enable implicit value scala.language.existentials visible

2015-03-09 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-6223. -- Resolution: Duplicate I think this is a subset of a larger logical change, to clean up all similar

[jira] [Commented] (SPARK-5986) Model import/export for KMeansModel

2015-03-09 Thread Xusen Yin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353117#comment-14353117 ] Xusen Yin commented on SPARK-5986: -- Get it. Do you mind assign SPARK-5991 to me? Thanks!

[jira] [Commented] (SPARK-6201) INSET should coerce types

2015-03-09 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353030#comment-14353030 ] Cheng Lian commented on SPARK-6201: --- Played Hive type implicit conversion a bit more and

[jira] [Commented] (SPARK-3066) Support recommendAll in matrix factorization model

2015-03-09 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353114#comment-14353114 ] Joseph K. Bradley commented on SPARK-3066: -- Oops, true, not an actual metric.

[jira] [Updated] (SPARK-6227) PCA and SVD for PySpark

2015-03-09 Thread Julien Amelot (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Julien Amelot updated SPARK-6227: - Affects Version/s: 1.2.1 PCA and SVD for PySpark --- Key:

[jira] [Commented] (SPARK-5986) Model import/export for KMeansModel

2015-03-09 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353109#comment-14353109 ] Joseph K. Bradley commented on SPARK-5986: -- I'd recommend doing the 2 separately

[jira] [Created] (SPARK-6227) PCA and SVD for PySpark

2015-03-09 Thread Julien Amelot (JIRA)
Julien Amelot created SPARK-6227: Summary: PCA and SVD for PySpark Key: SPARK-6227 URL: https://issues.apache.org/jira/browse/SPARK-6227 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-5134) Bump default Hadoop version to 2+

2015-03-09 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353047#comment-14353047 ] Sean Owen commented on SPARK-5134: -- Yep, I confirmed that ... {code} [INFO] \-

[jira] [Created] (SPARK-6226) Support model save/load in Python's KMeans

2015-03-09 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-6226: Summary: Support model save/load in Python's KMeans Key: SPARK-6226 URL: https://issues.apache.org/jira/browse/SPARK-6226 Project: Spark Issue Type:

[jira] [Resolved] (SPARK-4734) [Streaming]limit the file Dstream size for each batch

2015-03-09 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-4734. -- Resolution: Won't Fix I feel strongly that this sort of change introduces new problems and doesn't

[jira] [Created] (SPARK-6230) Provide authentication and encryption for Spark's RPC

2015-03-09 Thread Marcelo Vanzin (JIRA)
Marcelo Vanzin created SPARK-6230: - Summary: Provide authentication and encryption for Spark's RPC Key: SPARK-6230 URL: https://issues.apache.org/jira/browse/SPARK-6230 Project: Spark Issue

[jira] [Commented] (SPARK-3066) Support recommendAll in matrix factorization model

2015-03-09 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353225#comment-14353225 ] Joseph K. Bradley commented on SPARK-3066: -- Thanks for the references! I'll take

[jira] [Comment Edited] (SPARK-6201) INSET should coerce types

2015-03-09 Thread Jianshi Huang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353248#comment-14353248 ] Jianshi Huang edited comment on SPARK-6201 at 3/9/15 5:40 PM: --

[jira] [Commented] (SPARK-6201) INSET should coerce types

2015-03-09 Thread Jianshi Huang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353248#comment-14353248 ] Jianshi Huang commented on SPARK-6201: -- Implicit coercion outside the Numeric domain

[jira] [Comment Edited] (SPARK-6201) INSET should coerce types

2015-03-09 Thread Jianshi Huang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353248#comment-14353248 ] Jianshi Huang edited comment on SPARK-6201 at 3/9/15 5:39 PM: --

[jira] [Comment Edited] (SPARK-6201) INSET should coerce types

2015-03-09 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353030#comment-14353030 ] Cheng Lian edited comment on SPARK-6201 at 3/9/15 5:10 PM: ---

[jira] [Created] (SPARK-6229) Support encryption in network/common module

2015-03-09 Thread Marcelo Vanzin (JIRA)
Marcelo Vanzin created SPARK-6229: - Summary: Support encryption in network/common module Key: SPARK-6229 URL: https://issues.apache.org/jira/browse/SPARK-6229 Project: Spark Issue Type:

[jira] [Comment Edited] (SPARK-6201) INSET should coerce types

2015-03-09 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353030#comment-14353030 ] Cheng Lian edited comment on SPARK-6201 at 3/9/15 5:13 PM: ---

[jira] [Commented] (SPARK-3278) Isotonic regression

2015-03-09 Thread Vladimir Vladimirov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353252#comment-14353252 ] Vladimir Vladimirov commented on SPARK-3278: Had anyone benchmarked the

[jira] [Commented] (SPARK-5986) Model import/export for KMeansModel

2015-03-09 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353221#comment-14353221 ] Joseph K. Bradley commented on SPARK-5986: -- Subtask assigned Model

[jira] [Updated] (SPARK-6226) Support model save/load in Python's KMeans

2015-03-09 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-6226: - Assignee: Xusen Yin Support model save/load in Python's KMeans

[jira] [Created] (SPARK-6228) Provide SASL support in network/common module

2015-03-09 Thread Marcelo Vanzin (JIRA)
Marcelo Vanzin created SPARK-6228: - Summary: Provide SASL support in network/common module Key: SPARK-6228 URL: https://issues.apache.org/jira/browse/SPARK-6228 Project: Spark Issue Type:

[jira] [Commented] (SPARK-6219) Expand Python lint checks to check for compilation errors

2015-03-09 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353325#comment-14353325 ] Nicholas Chammas commented on SPARK-6219: - That's a good point, I haven't checked

[jira] [Commented] (SPARK-4911) Report the inputs and outputs of Spark jobs so that external systems can track data lineage

2015-03-09 Thread Ted Malaska (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353598#comment-14353598 ] Ted Malaska commented on SPARK-4911: Thanks [~sandyr]. Yes this would be very

[jira] [Commented] (SPARK-4600) org.apache.spark.graphx.VertexRDD.diff does not work

2015-03-09 Thread Ankur Dave (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353627#comment-14353627 ] Ankur Dave commented on SPARK-4600: --- As I wrote in SPARK-6022, this is the documented

[jira] [Commented] (SPARK-6190) create LargeByteBuffer abstraction for eliminating 2GB limit on blocks

2015-03-09 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353687#comment-14353687 ] Imran Rashid commented on SPARK-6190: - Another observation as I've dug into the

[jira] [Commented] (SPARK-6211) Test Python Kafka API using Python unit tests

2015-03-09 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353737#comment-14353737 ] Tathagata Das commented on SPARK-6211: -- That a good point. That requires the

[jira] [Updated] (SPARK-6232) Spark Streaming: simple application stalls processing

2015-03-09 Thread Platon Potapov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Platon Potapov updated SPARK-6232: -- Description: Below is a snippet of a simple test application. Run it in one terminal window,

[jira] [Commented] (SPARK-6232) Spark Streaming: simple application stalls processing

2015-03-09 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353487#comment-14353487 ] Sean Owen commented on SPARK-6232: -- I can't reproduce this, although, I just tried a

[jira] [Commented] (SPARK-4911) Report the inputs and outputs of Spark jobs so that external systems can track data lineage

2015-03-09 Thread Sandy Ryza (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353554#comment-14353554 ] Sandy Ryza commented on SPARK-4911: --- I know that [~malaskat] has played around with a

[jira] [Commented] (SPARK-5368) Spark should support NAT (via akka improvements)

2015-03-09 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353573#comment-14353573 ] Sean Owen commented on SPARK-5368: -- I feel qualified enough to review doc or config

[jira] [Commented] (SPARK-6022) GraphX `diff` test incorrectly operating on values (not VertexId's)

2015-03-09 Thread Ankur Dave (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353617#comment-14353617 ] Ankur Dave commented on SPARK-6022: --- [~maropu] is correct: the original intent of diff

[jira] [Commented] (SPARK-6113) Stabilize DecisionTree and ensembles APIs

2015-03-09 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353634#comment-14353634 ] Joseph K. Bradley commented on SPARK-6113: -- Thanks! I don't think there are

[jira] [Created] (SPARK-6234) 10% Performance regression with Breeze upgrade

2015-03-09 Thread Nishkam Ravi (JIRA)
Nishkam Ravi created SPARK-6234: --- Summary: 10% Performance regression with Breeze upgrade Key: SPARK-6234 URL: https://issues.apache.org/jira/browse/SPARK-6234 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-5368) Spark should support NAT (via akka improvements)

2015-03-09 Thread Timothy St. Clair (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353586#comment-14353586 ] Timothy St. Clair commented on SPARK-5368: -- [~sowen] IIRC there are other Bugs

[jira] [Commented] (SPARK-5544) wholeTextFiles should recognize multiple input paths delimited by ,

2015-03-09 Thread Lev Khomich (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353588#comment-14353588 ] Lev Khomich commented on SPARK-5544: I would like to work on this. [~mengxr], some

[jira] [Updated] (SPARK-677) PySpark should not collect results through local filesystem

2015-03-09 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-677: - Target Version/s: 1.2.2, 1.4.0, 1.3.1 Affects Version/s: (was: 0.7.0) 1.4.0

[jira] [Commented] (SPARK-677) PySpark should not collect results through local filesystem

2015-03-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353641#comment-14353641 ] Apache Spark commented on SPARK-677: User 'davies' has created a pull request for this

[jira] [Created] (SPARK-6233) Should spark.ml Models be distributed by default?

2015-03-09 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-6233: Summary: Should spark.ml Models be distributed by default? Key: SPARK-6233 URL: https://issues.apache.org/jira/browse/SPARK-6233 Project: Spark

[jira] [Updated] (SPARK-6232) Spark Streaming: simple application stalls processing

2015-03-09 Thread Platon Potapov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Platon Potapov updated SPARK-6232: -- Description: Below is a complete source code of a very simple test application. Run it in one

[jira] [Resolved] (SPARK-4355) OnlineSummarizer doesn't merge mean correctly

2015-03-09 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-4355. -- Resolution: Fixed Target Version/s: 1.2.0, 1.1.1, 1.0.3 (was: 1.1.1, 1.2.0, 1.0.3)

[jira] [Updated] (SPARK-4355) OnlineSummarizer doesn't merge mean correctly

2015-03-09 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-4355: - Fix Version/s: 1.0.3 OnlineSummarizer doesn't merge mean correctly

[jira] [Updated] (SPARK-6232) Spark Streaming: simple application stalls processing

2015-03-09 Thread Platon Potapov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Platon Potapov updated SPARK-6232: -- Environment: Ubuntu, MacOS. Tried builds with scala 2.11 and 2.10 (for kafka receiver). Also

[jira] [Updated] (SPARK-6222) [STREAMING] All data may not be recovered from WAL when driver is killed

2015-03-09 Thread Hari Shreedharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Shreedharan updated SPARK-6222: Description: When testing for our next release, our internal tests written by [~wypoon]

[jira] [Updated] (SPARK-6231) Join on two tables (generated from same one) is broken

2015-03-09 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-6231: --- Labels: DataFrame (was: ) Join on two tables (generated from same one) is broken

[jira] [Updated] (SPARK-6050) Spark on YARN does not work --executor-cores is specified

2015-03-09 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-6050: --- Fix Version/s: (was: 1.4.0) Spark on YARN does not work --executor-cores is specified

[jira] [Commented] (SPARK-3278) Isotonic regression

2015-03-09 Thread Martin Zapletal (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353549#comment-14353549 ] Martin Zapletal commented on SPARK-3278: What particular benchmarks would you like

[jira] [Commented] (SPARK-5368) Spark should support NAT (via akka improvements)

2015-03-09 Thread Matthew Farrellee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353534#comment-14353534 ] Matthew Farrellee commented on SPARK-5368: -- [~srowen] will you take a look at

[jira] [Commented] (SPARK-6222) [STREAMING] All data may not be recovered from WAL when driver is killed

2015-03-09 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353358#comment-14353358 ] Tathagata Das commented on SPARK-6222: -- Could you upload the stack traces, and logs

[jira] [Created] (SPARK-6231) Join on two tables (generated from same one) is broken

2015-03-09 Thread Davies Liu (JIRA)
Davies Liu created SPARK-6231: - Summary: Join on two tables (generated from same one) is broken Key: SPARK-6231 URL: https://issues.apache.org/jira/browse/SPARK-6231 Project: Spark Issue Type:

[jira] [Commented] (SPARK-6192) Enhance MLlib's Python API (GSoC 2015)

2015-03-09 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353347#comment-14353347 ] Xiangrui Meng commented on SPARK-6192: -- [~Manglano] and [~leckie-chn] Thanks for your

[jira] [Created] (SPARK-6232) Spark Streaming: simple application stalls processing

2015-03-09 Thread Platon Potapov (JIRA)
Platon Potapov created SPARK-6232: - Summary: Spark Streaming: simple application stalls processing Key: SPARK-6232 URL: https://issues.apache.org/jira/browse/SPARK-6232 Project: Spark Issue

[jira] [Updated] (SPARK-6232) Spark Streaming: simple application stalls processing

2015-03-09 Thread Platon Potapov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Platon Potapov updated SPARK-6232: -- Description: Below is a complete source code of a very simple test application. Run it in one

[jira] [Updated] (SPARK-6232) Spark Streaming: simple application stalls processing

2015-03-09 Thread Platon Potapov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Platon Potapov updated SPARK-6232: -- Description: Below is a complete source code of a very simple test application. Run it in one

[jira] [Commented] (SPARK-6228) Provide SASL support in network/common module

2015-03-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353412#comment-14353412 ] Apache Spark commented on SPARK-6228: - User 'vanzin' has created a pull request for

[jira] [Commented] (SPARK-6219) Expand Python lint checks to check for compilation errors

2015-03-09 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353473#comment-14353473 ] Sean Owen commented on SPARK-6219: -- OK, that's reasonable to make sure that everything

[jira] [Updated] (SPARK-6232) Spark Streaming: simple application stalls processing

2015-03-09 Thread Platon Potapov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Platon Potapov updated SPARK-6232: -- Description: Below is a snippet of a simple test application. Run it in one terminal window,

[jira] [Commented] (SPARK-6022) GraphX `diff` test incorrectly operating on values (not VertexId's)

2015-03-09 Thread Brennon York (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353364#comment-14353364 ] Brennon York commented on SPARK-6022: - The test is correct (in what I believe {{diff}}

[jira] [Commented] (SPARK-3278) Isotonic regression

2015-03-09 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353419#comment-14353419 ] Xiangrui Meng commented on SPARK-3278: -- I don't know any. It really depends on how

[jira] [Commented] (SPARK-3477) Clean up code in Yarn Client / ClientBase

2015-03-09 Thread Peter Rudenko (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353784#comment-14353784 ] Peter Rudenko commented on SPARK-3477: -- +1 to return these classes to public. There's

[jira] [Commented] (SPARK-6234) 10% Performance regression with Breeze upgrade

2015-03-09 Thread Nishkam Ravi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353793#comment-14353793 ] Nishkam Ravi commented on SPARK-6234: - [~mengxr] Variant of

[jira] [Commented] (SPARK-6142) 10-12% Performance regression with finalize

2015-03-09 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353826#comment-14353826 ] Sean Owen commented on SPARK-6142: -- Is this resolved by reverting those commits then?

[jira] [Commented] (SPARK-6005) Flaky test: o.a.s.streaming.kafka.DirectKafkaStreamSuite.offset recovery

2015-03-09 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353838#comment-14353838 ] Tathagata Das commented on SPARK-6005: -- [~c...@koeninger.org] Can you check this out?

[jira] [Commented] (SPARK-6222) [STREAMING] All data may not be recovered from WAL when driver is killed

2015-03-09 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353840#comment-14353840 ] Tathagata Das commented on SPARK-6222: -- Which patch fixes the issue? [STREAMING]

[jira] [Commented] (SPARK-6234) 10% Performance regression with Breeze upgrade

2015-03-09 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353854#comment-14353854 ] Sean Owen commented on SPARK-6234: -- No, the thing that's not important here is the

[jira] [Commented] (SPARK-6222) [STREAMING] All data may not be recovered from WAL when driver is killed

2015-03-09 Thread Hari Shreedharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353867#comment-14353867 ] Hari Shreedharan commented on SPARK-6222: - [~srowen] This patch is actually not

[jira] [Updated] (SPARK-2629) Improve performance of DStream.updateStateByKey

2015-03-09 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-2629: - Summary: Improve performance of DStream.updateStateByKey (was: Improve performance of

[jira] [Commented] (SPARK-5155) Python API for MQTT streaming

2015-03-09 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353913#comment-14353913 ] Tathagata Das commented on SPARK-5155: -- This issue is still blocking on us figuring

[jira] [Created] (SPARK-6238) Support shuffle where individual blocks might be 2G

2015-03-09 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-6238: -- Summary: Support shuffle where individual blocks might be 2G Key: SPARK-6238 URL: https://issues.apache.org/jira/browse/SPARK-6238 Project: Spark Issue Type:

[jira] [Created] (SPARK-6237) Support network transfer for blocks larger than 2G

2015-03-09 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-6237: -- Summary: Support network transfer for blocks larger than 2G Key: SPARK-6237 URL: https://issues.apache.org/jira/browse/SPARK-6237 Project: Spark Issue Type:

[jira] [Updated] (SPARK-5155) Python API for MQTT streaming

2015-03-09 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-5155: - Target Version/s: 1.4.0 (was: 1.3.0) Python API for MQTT streaming

[jira] [Created] (SPARK-6236) Support caching blocks larger than 2G

2015-03-09 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-6236: -- Summary: Support caching blocks larger than 2G Key: SPARK-6236 URL: https://issues.apache.org/jira/browse/SPARK-6236 Project: Spark Issue Type: Sub-task

[jira] [Commented] (SPARK-6190) create LargeByteBuffer abstraction for eliminating 2GB limit on blocks

2015-03-09 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353940#comment-14353940 ] Reynold Xin commented on SPARK-6190: Hi [~imranr], As I said earlier, I would advise

[jira] [Commented] (SPARK-6128) Update Spark Streaming Guide for Spark 1.3

2015-03-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353818#comment-14353818 ] Apache Spark commented on SPARK-6128: - User 'tdas' has created a pull request for this

[jira] [Commented] (SPARK-6222) [STREAMING] All data may not be recovered from WAL when driver is killed

2015-03-09 Thread Hari Shreedharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353842#comment-14353842 ] Hari Shreedharan commented on SPARK-6222: - The one on the jira. [STREAMING] All

[jira] [Updated] (SPARK-5045) Update FlumePollingReceiver to use updated Receiver API

2015-03-09 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-5045: - Target Version/s: (was: 1.3.0) Update FlumePollingReceiver to use updated Receiver API

[jira] [Updated] (SPARK-5048) Add Flume to the Python Streaming API

2015-03-09 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-5048: - Assignee: Hari Shreedharan Add Flume to the Python Streaming API

[jira] [Updated] (SPARK-5048) Add Flume to the Python Streaming API

2015-03-09 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-5048: - Target Version/s: 1.4.0 (was: 1.3.0) Add Flume to the Python Streaming API

[jira] [Commented] (SPARK-5048) Add Flume to the Python Streaming API

2015-03-09 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353918#comment-14353918 ] Tathagata Das commented on SPARK-5048: -- [~hshreedharan] Can you take a crack at this?

[jira] [Updated] (SPARK-5682) Add encrypted shuffle in spark

2015-03-09 Thread liyunzhang_intel (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liyunzhang_intel updated SPARK-5682: Summary: Add encrypted shuffle in spark (was: Reuse hadoop encrypted shuffle algorithm to

[jira] [Commented] (SPARK-6234) 10% Performance regression with Breeze upgrade

2015-03-09 Thread Nishkam Ravi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353843#comment-14353843 ] Nishkam Ravi commented on SPARK-6234: - Are we saying that Breeze's performance is

[jira] [Commented] (SPARK-6222) [STREAMING] All data may not be recovered from WAL when driver is killed

2015-03-09 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353858#comment-14353858 ] Sean Owen commented on SPARK-6222: -- [~hshreedharan] you can make a [WIP] pull request

[jira] [Commented] (SPARK-5252) Streaming StatefulNetworkWordCount example hangs

2015-03-09 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353857#comment-14353857 ] Tathagata Das commented on SPARK-5252: -- [~LutzBuech] Can you try out the latest

[jira] [Commented] (SPARK-6234) 10% Performance regression with Breeze upgrade

2015-03-09 Thread Nishkam Ravi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353877#comment-14353877 ] Nishkam Ravi commented on SPARK-6234: - Right. This particular implementation can be

[jira] [Updated] (SPARK-5042) Updated Receiver API to make it easier to write reliable receivers that ack source

2015-03-09 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-5042: - Target Version/s: (was: 1.4.0) Updated Receiver API to make it easier to write reliable

[jira] [Commented] (SPARK-2629) Improve performance of DStream.updateStateByKey

2015-03-09 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353906#comment-14353906 ] Tathagata Das commented on SPARK-2629: -- Since IndexRDD is not supposed to be added to

[jira] [Updated] (SPARK-5205) Inconsistent behaviour between Streaming job and others, when click kill link in WebUI

2015-03-09 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-5205: - Target Version/s: 1.4.0, 1.3.1 (was: 1.3.0, 1.2.1) Inconsistent behaviour between Streaming job

[jira] [Updated] (SPARK-5046) Update KinesisReceiver to use updated Receiver API

2015-03-09 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-5046: - Target Version/s: 1.4.0 (was: 1.3.0) Update KinesisReceiver to use updated Receiver API
