[jira] [Created] (SPARK-2597) Improve the code related to Table Scan

2014-07-20 Thread Yin Huai (JIRA)
Yin Huai created SPARK-2597: --- Summary: Improve the code related to Table Scan Key: SPARK-2597 URL: https://issues.apache.org/jira/browse/SPARK-2597 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-2597) Improve the code related to Table Scan

2014-07-20 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14067814#comment-14067814 ] Yin Huai commented on SPARK-2597: - Hive uses HiveInputFormat as the wrapper of different

[jira] [Commented] (SPARK-2521) Broadcast RDD object once per TaskSet (instead of sending it for every task)

2014-07-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14067824#comment-14067824 ] Apache Spark commented on SPARK-2521: - User 'rxin' has created a pull request for this

[jira] [Created] (SPARK-2598) RangePartitioner's binary search does not use the given Ordering

2014-07-20 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-2598: -- Summary: RangePartitioner's binary search does not use the given Ordering Key: SPARK-2598 URL: https://issues.apache.org/jira/browse/SPARK-2598 Project: Spark
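The issue title can be illustrated with a minimal sketch, which is not Spark's code: a binary search over partition bounds only returns the right partition if it compares keys with the same ordering that sorted the bounds. The names `find_partition` and `descending` below are hypothetical and exist only for this example.

```python
from bisect import bisect_left


def descending(a, b):
    """Hypothetical custom ordering: larger values sort first."""
    return (a < b) - (a > b)


def find_partition(bounds, key, compare):
    """Binary search that honours the caller-supplied ordering.

    Returns the index of the first bound that is not ordered before `key`,
    or len(bounds) if every bound is ordered before it.
    """
    lo, hi = 0, len(bounds)
    while lo < hi:
        mid = (lo + hi) // 2
        if compare(bounds[mid], key) < 0:
            lo = mid + 1
        else:
            hi = mid
    return lo


# Bounds sorted with the custom (descending) ordering.
bounds = [30, 20, 10]

with_ordering = find_partition(bounds, 25, descending)
# bisect_left assumes natural ascending order, so it is not valid here.
with_natural_order = bisect_left(bounds, 25)
```

With these inputs the ordering-aware search places key 25 between bounds 30 and 20, while the search that assumes natural ordering walks off the end of the list.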

[jira] [Commented] (SPARK-2045) Sort-based shuffle implementation

2014-07-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14067826#comment-14067826 ] Apache Spark commented on SPARK-2045: - User 'mateiz' has created a pull request for

[jira] [Commented] (SPARK-2598) RangePartitioner's binary search does not use the given Ordering

2014-07-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14067836#comment-14067836 ] Apache Spark commented on SPARK-2598: - User 'rxin' has created a pull request for this

[jira] [Updated] (SPARK-2583) ConnectionManager cannot distinguish whether error occurred or not

2014-07-20 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-2583: --- Priority: Critical (was: Major) ConnectionManager cannot distinguish whether error occurred or not

[jira] [Updated] (SPARK-2583) ConnectionManager cannot distinguish whether error occurred or not

2014-07-20 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-2583: --- Target Version/s: 1.1.0 Assignee: Kousuke Saruta ConnectionManager cannot distinguish

[jira] [Created] (SPARK-2599) almostEquals mllib.util.TestingUtils does not behave as expected when comparing against 0.0

2014-07-20 Thread Doris Xin (JIRA)
Doris Xin created SPARK-2599: Summary: almostEquals mllib.util.TestingUtils does not behave as expected when comparing against 0.0 Key: SPARK-2599 URL: https://issues.apache.org/jira/browse/SPARK-2599

[jira] [Commented] (SPARK-2512) Stratified sampling

2014-07-20 Thread Doris Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14067848#comment-14067848 ] Doris Xin commented on SPARK-2512: -- Hey Xiangrui can you close this one since there's

[jira] [Closed] (SPARK-2600) Correlations (Pearson, Spearman)

2014-07-20 Thread Doris Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Doris Xin closed SPARK-2600. Resolution: Implemented Correlations (Pearson, Spearman)

[jira] [Commented] (SPARK-2599) almostEquals mllib.util.TestingUtils does not behave as expected when comparing against 0.0

2014-07-20 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14067876#comment-14067876 ] Sean Owen commented on SPARK-2599: -- The relative error will never be more than 2.0; it
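A small sketch may help show why a relative-error comparison misbehaves against 0.0. It assumes relative error is defined as |x - y| divided by the larger magnitude, which is consistent with the 2.0 bound quoted above but is not confirmed to be the formula in mllib.util.TestingUtils; `relative_error` and `almost_equals` are hypothetical names.

```python
def relative_error(x: float, y: float) -> float:
    """Assumed definition: |x - y| scaled by the larger of |x| and |y|."""
    if x == y:
        return 0.0
    return abs(x - y) / max(abs(x), abs(y))


def almost_equals(x: float, y: float, epsilon: float = 1e-10) -> bool:
    """True when the assumed relative error is below epsilon."""
    return relative_error(x, y) < epsilon


# Against 0.0 the relative error is always 1.0, however small the other value.
tiny_vs_zero = relative_error(0.0, 1e-300)

# The bound is reached when the values have equal magnitude and opposite sign.
opposite_signs = relative_error(1.0, -1.0)
```

Under this definition no nonzero value, however tiny, compares as almost equal to 0.0, so a test against zero would need an absolute tolerance instead.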

[jira] [Comment Edited] (SPARK-2599) almostEquals mllib.util.TestingUtils does not behave as expected when comparing against 0.0

2014-07-20 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14067876#comment-14067876 ] Sean Owen edited comment on SPARK-2599 at 7/20/14 10:20 AM:

[jira] [Commented] (SPARK-2552) Stabilize the computation of logistic function in pyspark

2014-07-20 Thread Michael Yannakopoulos (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14067971#comment-14067971 ] Michael Yannakopoulos commented on SPARK-2552: -- Xiangrui Meng, Sorry about

[jira] [Created] (SPARK-2601) py4j.Py4JException on sc.pickleFile

2014-07-20 Thread Kevin Matzen (JIRA)
Kevin Matzen created SPARK-2601: --- Summary: py4j.Py4JException on sc.pickleFile Key: SPARK-2601 URL: https://issues.apache.org/jira/browse/SPARK-2601 Project: Spark Issue Type: Bug

[jira] [Updated] (SPARK-2601) py4j.Py4JException on sc.pickleFile

2014-07-20 Thread Kevin Matzen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kevin Matzen updated SPARK-2601: Description: {code:title=test.py} from pyspark import SparkContext text_filename = 'README.md'

[jira] [Resolved] (SPARK-2519) Eliminate pattern-matching on Tuple2 in performance-critical aggregation code

2014-07-20 Thread Sandy Ryza (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza resolved SPARK-2519. --- Resolution: Fixed Fix Version/s: 1.1.0 Eliminate pattern-matching on Tuple2 in

[jira] [Commented] (SPARK-2047) Use less memory in AppendOnlyMap.destructiveSortedIterator

2014-07-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14068042#comment-14068042 ] Apache Spark commented on SPARK-2047: - User 'aarondav' has created a pull request for

[jira] [Created] (SPARK-2602) sbt/sbt test steals window focus on OS X

2014-07-20 Thread Nicholas Chammas (JIRA)
Nicholas Chammas created SPARK-2602: --- Summary: sbt/sbt test steals window focus on OS X Key: SPARK-2602 URL: https://issues.apache.org/jira/browse/SPARK-2602 Project: Spark Issue Type:

[jira] [Commented] (SPARK-2602) sbt/sbt test steals window focus on OS X

2014-07-20 Thread Gera Shegalov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14068044#comment-14068044 ] Gera Shegalov commented on SPARK-2602: -- Take a look at the thread on HADOOP-10290

[jira] [Commented] (SPARK-2602) sbt/sbt test steals window focus on OS X

2014-07-20 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14068045#comment-14068045 ] Sean Owen commented on SPARK-2602: -- I have not observed this ever. OS X 10.9.4 / Java 7.

[jira] [Commented] (SPARK-2602) sbt/sbt test steals window focus on OS X

2014-07-20 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14068050#comment-14068050 ] Nicholas Chammas commented on SPARK-2602: - Ah, I'm on Java 6. Looking at [this

[jira] [Comment Edited] (SPARK-2602) sbt/sbt test steals window focus on OS X

2014-07-20 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14068050#comment-14068050 ] Nicholas Chammas edited comment on SPARK-2602 at 7/20/14 9:24 PM:

[jira] [Commented] (SPARK-2282) PySpark crashes if too many tasks complete quickly

2014-07-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14068082#comment-14068082 ] Apache Spark commented on SPARK-2282: - User 'aarondav' has created a pull request for

[jira] [Commented] (SPARK-2282) PySpark crashes if too many tasks complete quickly

2014-07-20 Thread Aaron Davidson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14068083#comment-14068083 ] Aaron Davidson commented on SPARK-2282: --- Hey Ken, I created [PR

[jira] [Created] (SPARK-2603) Remove unnecessary toMap and toList in converting Java collections to Scala collections JsonRDD.scala

2014-07-20 Thread Yin Huai (JIRA)
Yin Huai created SPARK-2603: --- Summary: Remove unnecessary toMap and toList in converting Java collections to Scala collections JsonRDD.scala Key: SPARK-2603 URL: https://issues.apache.org/jira/browse/SPARK-2603

[jira] [Updated] (SPARK-2603) Remove unnecessary toMap and toList in converting Java collections to Scala collections JsonRDD.scala

2014-07-20 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated SPARK-2603: Description: In JsonRDD.scalafy, we are using toMap/toList to convert a Java Map/List to a Scala one.

[jira] [Commented] (SPARK-2602) sbt/sbt test steals window focus on OS X

2014-07-20 Thread Debasish Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14068094#comment-14068094 ] Debasish Das commented on SPARK-2602: - CDH5 does not even support java6 anymore !

[jira] [Updated] (SPARK-2082) Stratified sampling implementation in PairRDDFunctions

2014-07-20 Thread Doris Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Doris Xin updated SPARK-2082: - Target Version/s: 1.1.0 Stratified sampling implementation in PairRDDFunctions

[jira] [Closed] (SPARK-2512) Stratified sampling

2014-07-20 Thread Doris Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Doris Xin closed SPARK-2512. Resolution: Duplicate Stratified sampling --- Key: SPARK-2512

[jira] [Resolved] (SPARK-2552) Stabilize the computation of logistic function in pyspark

2014-07-20 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-2552. -- Resolution: Fixed Fix Version/s: 1.1.0 Issue resolved by pull request 1493
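For readers unfamiliar with the problem named in this issue title, here is a minimal sketch of one common way to evaluate the logistic function without overflow. It is not taken from pull request 1493 and may differ from the actual fix; `stable_logistic` and `margin` are hypothetical names.

```python
import math


def stable_logistic(margin: float) -> float:
    """Evaluate 1 / (1 + exp(-margin)) without overflowing for large |margin|.

    The branch on the sign guarantees math.exp only ever receives a
    non-positive argument, so it can underflow to 0.0 but never overflow.
    """
    if margin >= 0:
        return 1.0 / (1.0 + math.exp(-margin))
    exp_margin = math.exp(margin)
    return exp_margin / (1.0 + exp_margin)


def naive_logistic(margin: float) -> float:
    """Direct formula; math.exp overflows for large negative margins."""
    return 1.0 / (1.0 + math.exp(-margin))


try:
    naive_logistic(-1000.0)
    naive_overflowed = False
except OverflowError:
    naive_overflowed = True
```

Both branches compute the same mathematical value; only the form of the expression changes, so results agree with the direct formula wherever the direct formula does not overflow.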

[jira] [Commented] (SPARK-2599) almostEquals mllib.util.TestingUtils does not behave as expected when comparing against 0.0

2014-07-20 Thread Doris Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14068113#comment-14068113 ] Doris Xin commented on SPARK-2599: -- Found this in-depth article discussing the different

[jira] [Comment Edited] (SPARK-2599) almostEquals mllib.util.TestingUtils does not behave as expected when comparing against 0.0

2014-07-20 Thread Doris Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14068113#comment-14068113 ] Doris Xin edited comment on SPARK-2599 at 7/21/14 2:06 AM: ---

[jira] [Commented] (SPARK-2511) Add TF-IDF featurizer

2014-07-20 Thread Michael Yannakopoulos (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14068159#comment-14068159 ] Michael Yannakopoulos commented on SPARK-2511: -- I am really interested in

[jira] [Commented] (SPARK-2470) Fix PEP 8 violations

2014-07-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14068162#comment-14068162 ] Apache Spark commented on SPARK-2470: - User 'nchammas' has created a pull request for

[jira] [Resolved] (SPARK-1945) Add full Java examples in MLlib docs

2014-07-20 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-1945. -- Resolution: Fixed Fix Version/s: 1.1.0 Issue resolved by pull request 1311

[jira] [Commented] (SPARK-2582) Make Block Manager Master pluggable

2014-07-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14068238#comment-14068238 ] Apache Spark commented on SPARK-2582: - User 'harishreedharan' has created a pull