[jira] [Commented] (SPARK-3261) KMeans clusterer can return duplicate cluster centers

2015-02-24 Thread Derrick Burns (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14336169#comment-14336169 ] Derrick Burns commented on SPARK-3261: -- One solution is to run KMeansParallel or KMea

[jira] [Commented] (SPARK-5775) GenericRow cannot be cast to SpecificMutableRow when nested data and partitioned table

2015-02-24 Thread Anselme Vignon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14336152#comment-14336152 ] Anselme Vignon commented on SPARK-5775: --- [~marmbrus][~lian cheng] Hi, I'm quite new

[jira] [Commented] (SPARK-5969) The pyspark.rdd.sortByKey always fills only two partitions when ascending=False.

2015-02-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14336128#comment-14336128 ] Apache Spark commented on SPARK-5969: - User 'foxik' has created a pull request for thi

[jira] [Commented] (SPARK-5999) Remove duplicate Literal matching block

2015-02-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14336120#comment-14336120 ] Apache Spark commented on SPARK-5999: - User 'viirya' has created a pull request for th

[jira] [Created] (SPARK-5999) Remove duplicate Literal matching block

2015-02-24 Thread Liang-Chi Hsieh (JIRA)
Liang-Chi Hsieh created SPARK-5999: -- Summary: Remove duplicate Literal matching block Key: SPARK-5999 URL: https://issues.apache.org/jira/browse/SPARK-5999 Project: Spark Issue Type: Improve

[jira] [Commented] (SPARK-5970) Temporary directories are not removed (but their content is)

2015-02-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14336113#comment-14336113 ] Apache Spark commented on SPARK-5970: - User 'foxik' has created a pull request for thi

[jira] [Commented] (SPARK-5970) Temporary directories are not removed (but their content is)

2015-02-24 Thread Milan Straka (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14336070#comment-14336070 ] Milan Straka commented on SPARK-5970: - I found a _Contributing to Spark_ guide, will d

[jira] [Commented] (SPARK-1823) ExternalAppendOnlyMap can still OOM if one key is very large

2015-02-24 Thread Saurabh Santhosh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14336057#comment-14336057 ] Saurabh Santhosh commented on SPARK-1823: - What is the status of this issue? Is th

[jira] [Commented] (SPARK-4705) Driver retries in cluster mode always fail if event logging is enabled

2015-02-24 Thread Twinkle Sachdeva (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14336034#comment-14336034 ] Twinkle Sachdeva commented on SPARK-4705: - Hi [~vanzin] Working on it. Thanks, T

[jira] [Commented] (SPARK-5837) HTTP 500 if try to access Spark UI in yarn-cluster or yarn-client mode

2015-02-24 Thread Mukesh Jha (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14336027#comment-14336027 ] Mukesh Jha commented on SPARK-5837: --- My Hadoop version is Hadoop 2.5.0-cdh5.3.0 > HTTP

[jira] [Commented] (SPARK-5837) HTTP 500 if try to access Spark UI in yarn-cluster or yarn-client mode

2015-02-24 Thread Mukesh Jha (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14336026#comment-14336026 ] Mukesh Jha commented on SPARK-5837: --- [~srowen] From the Driver logs [3] I can see that S

[jira] [Resolved] (SPARK-5994) Python DataFrame documentation fixes

2015-02-24 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-5994. - Resolution: Fixed Issue resolved by pull request 4756 [https://github.com/apache/spark/pul

[jira] [Commented] (SPARK-5751) Flaky test: o.a.s.sql.hive.thriftserver.HiveThriftServer2Suite sometimes times out

2015-02-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335987#comment-14335987 ] Apache Spark commented on SPARK-5751: - User 'liancheng' has created a pull request for

[jira] [Created] (SPARK-5998) Make Spark Streaming checkpoint version compatible

2015-02-24 Thread Saisai Shao (JIRA)
Saisai Shao created SPARK-5998: -- Summary: Make Spark Streaming checkpoint version compatible Key: SPARK-5998 URL: https://issues.apache.org/jira/browse/SPARK-5998 Project: Spark Issue Type: Impr

[jira] [Resolved] (SPARK-5286) Fail to drop an invalid table when using the data source API

2015-02-24 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-5286. - Resolution: Fixed Issue resolved by pull request 4755 [https://github.com/apache/spark/pul

[jira] [Commented] (SPARK-5996) DataFrame.collect() doesn't recognize UDTs

2015-02-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335947#comment-14335947 ] Apache Spark commented on SPARK-5996: - User 'marmbrus' has created a pull request for

[jira] [Updated] (SPARK-5775) GenericRow cannot be cast to SpecificMutableRow when nested data and partitioned table

2015-02-24 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-5775: Assignee: Cheng Lian (was: Michael Armbrust) > GenericRow cannot be cast to SpecificMutable

[jira] [Updated] (SPARK-5993) Published Kafka-assembly JAR was empty in 1.3.0-RC1

2015-02-24 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-5993: - Fix Version/s: 1.3.0 > Published Kafka-assembly JAR was empty in 1.3.0-RC1 > -

[jira] [Updated] (SPARK-5775) GenericRow cannot be cast to SpecificMutableRow when nested data and partitioned table

2015-02-24 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-5775: Priority: Blocker (was: Major) > GenericRow cannot be cast to SpecificMutableRow when neste

[jira] [Assigned] (SPARK-5775) GenericRow cannot be cast to SpecificMutableRow when nested data and partitioned table

2015-02-24 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust reassigned SPARK-5775: --- Assignee: Michael Armbrust > GenericRow cannot be cast to SpecificMutableRow when nes

[jira] [Updated] (SPARK-5775) GenericRow cannot be cast to SpecificMutableRow when nested data and partitioned table

2015-02-24 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-5775: Target Version/s: 1.3.0 > GenericRow cannot be cast to SpecificMutableRow when nested data a

[jira] [Resolved] (SPARK-5985) sortBy -> orderBy in Python

2015-02-24 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-5985. - Resolution: Fixed Issue resolved by pull request 4752 [https://github.com/apache/spark/pul

[jira] [Commented] (SPARK-5960) Allow AWS credentials to be passed to KinesisUtils.createStream()

2015-02-24 Thread Chris Fregly (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335846#comment-14335846 ] Chris Fregly commented on SPARK-5960: - pushing this up to 1.3.1 > Allow AWS credentia

[jira] [Updated] (SPARK-5960) Allow AWS credentials to be passed to KinesisUtils.createStream()

2015-02-24 Thread Chris Fregly (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Fregly updated SPARK-5960: Target Version/s: 1.3.1 (was: 1.4.0) > Allow AWS credentials to be passed to KinesisUtils.createStr

[jira] [Commented] (SPARK-5994) Python DataFrame documentation fixes

2015-02-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335845#comment-14335845 ] Apache Spark commented on SPARK-5994: - User 'davies' has created a pull request for th

[jira] [Updated] (SPARK-5997) Increase partition count without performing a shuffle

2015-02-24 Thread Andrew Ash (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Ash updated SPARK-5997: -- Description: When decreasing partition count with rdd.repartition() or rdd.coalesce(), the user has the

[jira] [Commented] (SPARK-5286) Fail to drop an invalid table when using the data source API

2015-02-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335835#comment-14335835 ] Apache Spark commented on SPARK-5286: - User 'yhuai' has created a pull request for thi

[jira] [Updated] (SPARK-5286) Fail to drop an invalid table when using the data source API

2015-02-24 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated SPARK-5286: Priority: Blocker (was: Critical) > Fail to drop an invalid table when using the data source API >

[jira] [Created] (SPARK-5997) Increase partition count without performing a shuffle

2015-02-24 Thread Andrew Ash (JIRA)
Andrew Ash created SPARK-5997: - Summary: Increase partition count without performing a shuffle Key: SPARK-5997 URL: https://issues.apache.org/jira/browse/SPARK-5997 Project: Spark Issue Type: Imp

[jira] [Reopened] (SPARK-5286) Fail to drop an invalid table when using the data source API

2015-02-24 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai reopened SPARK-5286: - I am reopening this issue because we need to catch all Throwables instead of just Exceptions. > Fail to drop an

[jira] [Commented] (SPARK-5978) Spark examples cannot compile with Hadoop 2

2015-02-24 Thread Michael Nazario (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335827#comment-14335827 ] Michael Nazario commented on SPARK-5978: If I get the chance I'll look into it. I'

[jira] [Commented] (SPARK-5979) `--packages` should not exclude spark streaming assembly jars for kafka and flume

2015-02-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335823#comment-14335823 ] Apache Spark commented on SPARK-5979: - User 'brkyvz' has created a pull request for th

[jira] [Closed] (SPARK-5816) Add huge backward compatibility warning in DriverWrapper

2015-02-24 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or closed SPARK-5816. Resolution: Fixed Fix Version/s: 1.3.0 Target Version/s: 1.3.0 (was: 1.3.0, 1.4.0) > Add h

[jira] [Created] (SPARK-5996) DataFrame.collect() doesn't recognize UDTs

2015-02-24 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-5996: Summary: DataFrame.collect() doesn't recognize UDTs Key: SPARK-5996 URL: https://issues.apache.org/jira/browse/SPARK-5996 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-5845) Time to cleanup spilled shuffle files not included in shuffle write time

2015-02-24 Thread Kay Ousterhout (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335757#comment-14335757 ] Kay Ousterhout commented on SPARK-5845: --- I'd go with (b) -- it's fine (and good, I t

[jira] [Created] (SPARK-5995) Make ML Prediction Developer APIs public

2015-02-24 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-5995: Summary: Make ML Prediction Developer APIs public Key: SPARK-5995 URL: https://issues.apache.org/jira/browse/SPARK-5995 Project: Spark Issue Type: Su

[jira] [Commented] (SPARK-5994) Python DataFrame documentation fixes

2015-02-24 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335750#comment-14335750 ] Reynold Xin commented on SPARK-5994: Added one more: "GroupedData doesn't have any des

[jira] [Updated] (SPARK-5994) Python DataFrame documentation fixes

2015-02-24 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-5994: --- Description: select empty should NOT be the same as select(*). make sure selectExpr is behaving the s

[jira] [Commented] (SPARK-5845) Time to cleanup spilled shuffle files not included in shuffle write time

2015-02-24 Thread Ilya Ganelin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335751#comment-14335751 ] Ilya Ganelin commented on SPARK-5845: - My mistake - missed your comment about the spil

[jira] [Created] (SPARK-5994) Python DataFrame documentation fixes

2015-02-24 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-5994: -- Summary: Python DataFrame documentation fixes Key: SPARK-5994 URL: https://issues.apache.org/jira/browse/SPARK-5994 Project: Spark Issue Type: Sub-task

[jira] [Updated] (SPARK-3956) Python API for Distributed Matrix

2015-02-24 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-3956: - Assignee: (was: Davies Liu) > Python API for Distributed Matrix >

[jira] [Commented] (SPARK-5993) Published Kafka-assembly JAR was empty in 1.3.0-RC1

2015-02-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335736#comment-14335736 ] Apache Spark commented on SPARK-5993: - User 'tdas' has created a pull request for this

[jira] [Created] (SPARK-5993) Published Kafka-assembly JAR was empty in 1.3.0-RC1

2015-02-24 Thread Tathagata Das (JIRA)
Tathagata Das created SPARK-5993: Summary: Published Kafka-assembly JAR was empty in 1.3.0-RC1 Key: SPARK-5993 URL: https://issues.apache.org/jira/browse/SPARK-5993 Project: Spark Issue Type:

[jira] [Updated] (SPARK-5992) Locality Sensitive Hashing (LSH) for MLlib

2015-02-24 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-5992: - Description: Locality Sensitive Hashing (LSH) would be very useful for ML. It would be gr

[jira] [Created] (SPARK-5992) Locality Sensitive Hashing (LSH) for MLlib

2015-02-24 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-5992: Summary: Locality Sensitive Hashing (LSH) for MLlib Key: SPARK-5992 URL: https://issues.apache.org/jira/browse/SPARK-5992 Project: Spark Issue Type:

[jira] [Resolved] (SPARK-5751) Flaky test: o.a.s.sql.hive.thriftserver.HiveThriftServer2Suite sometimes times out

2015-02-24 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian resolved SPARK-5751. --- Resolution: Fixed Fix Version/s: 1.3.0 Issue resolved by pull request 4720 [https://github.com/

[jira] [Updated] (SPARK-3956) Python API for Distributed Matrix

2015-02-24 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-3956: - Target Version/s: 1.4.0 (was: 1.2.0) > Python API for Distributed Matrix > --

[jira] [Updated] (SPARK-3956) Python API for Distributed Matrix

2015-02-24 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-3956: - Component/s: MLlib > Python API for Distributed Matrix > -

[jira] [Updated] (SPARK-3956) Python API for Distributed Matrix

2015-02-24 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-3956: - Priority: Major (was: Minor) > Python API for Distributed Matrix > --

[jira] [Updated] (SPARK-3851) Support for reading parquet files with different but compatible schema

2015-02-24 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-3851: --- Fix Version/s: 1.3.0 > Support for reading parquet files with different but compatible schema

[jira] [Created] (SPARK-5991) Python API for ML model import/export

2015-02-24 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-5991: Summary: Python API for ML model import/export Key: SPARK-5991 URL: https://issues.apache.org/jira/browse/SPARK-5991 Project: Spark Issue Type: Sub-t

[jira] [Created] (SPARK-5990) Model import/export for IsotonicRegression

2015-02-24 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-5990: Summary: Model import/export for IsotonicRegression Key: SPARK-5990 URL: https://issues.apache.org/jira/browse/SPARK-5990 Project: Spark Issue Type:

[jira] [Commented] (SPARK-5985) sortBy -> orderBy in Python

2015-02-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335693#comment-14335693 ] Apache Spark commented on SPARK-5985: - User 'rxin' has created a pull request for this

[jira] [Created] (SPARK-5989) Model import/export for LDAModel

2015-02-24 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-5989: Summary: Model import/export for LDAModel Key: SPARK-5989 URL: https://issues.apache.org/jira/browse/SPARK-5989 Project: Spark Issue Type: Sub-task

[jira] [Created] (SPARK-5988) Model import/export for PowerIterationClusteringModel

2015-02-24 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-5988: Summary: Model import/export for PowerIterationClusteringModel Key: SPARK-5988 URL: https://issues.apache.org/jira/browse/SPARK-5988 Project: Spark I

[jira] [Created] (SPARK-5987) Model import/export for GaussianMixtureModel

2015-02-24 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-5987: Summary: Model import/export for GaussianMixtureModel Key: SPARK-5987 URL: https://issues.apache.org/jira/browse/SPARK-5987 Project: Spark Issue Type

[jira] [Created] (SPARK-5986) Model import/export for KMeansModel

2015-02-24 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-5986: Summary: Model import/export for KMeansModel Key: SPARK-5986 URL: https://issues.apache.org/jira/browse/SPARK-5986 Project: Spark Issue Type: Sub-tas

[jira] [Created] (SPARK-5985) sortBy -> orderBy in Python

2015-02-24 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-5985: -- Summary: sortBy -> orderBy in Python Key: SPARK-5985 URL: https://issues.apache.org/jira/browse/SPARK-5985 Project: Spark Issue Type: Sub-task Componen

[jira] [Commented] (SPARK-5845) Time to cleanup spilled shuffle files not included in shuffle write time

2015-02-24 Thread Kay Ousterhout (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335676#comment-14335676 ] Kay Ousterhout commented on SPARK-5845: --- [~pwendell] that's exactly what I meant --

[jira] [Updated] (SPARK-5845) Time to cleanup spilled shuffle files not included in shuffle write time

2015-02-24 Thread Kay Ousterhout (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kay Ousterhout updated SPARK-5845: -- Summary: Time to cleanup spilled shuffle files not included in shuffle write time (was: Time to

[jira] [Updated] (SPARK-5962) [MLLIB] Python support for Power Iteration Clustering

2015-02-24 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-5962: - Assignee: Stephen Boesch > [MLLIB] Python support for Power Iteration Clustering > ---

[jira] [Closed] (SPARK-5963) [MLLIB] Python support for Power Iteration Clustering

2015-02-24 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley closed SPARK-5963. Resolution: Duplicate Assignee: (was: Stephen Boesch) > [MLLIB] Python support for

[jira] [Commented] (SPARK-5904) DataFrame methods with varargs do not work in Java

2015-02-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335674#comment-14335674 ] Apache Spark commented on SPARK-5904: - User 'rxin' has created a pull request for this

[jira] [Commented] (SPARK-5845) Time to cleanup intermediate shuffle files not included in shuffle write time

2015-02-24 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335667#comment-14335667 ] Patrick Wendell commented on SPARK-5845: [~kayousterhout] did you mean the time re

[jira] [Comment Edited] (SPARK-5845) Time to cleanup intermediate shuffle files not included in shuffle write time

2015-02-24 Thread Ilya Ganelin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335646#comment-14335646 ] Ilya Ganelin edited comment on SPARK-5845 at 2/24/15 11:19 PM: -

[jira] [Commented] (SPARK-5845) Time to cleanup intermediate shuffle files not included in shuffle write time

2015-02-24 Thread Ilya Ganelin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335646#comment-14335646 ] Ilya Ganelin commented on SPARK-5845: - If I understand correctly, the file cleanup hap

[jira] [Resolved] (SPARK-5436) Validate GradientBoostedTrees during training

2015-02-24 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley resolved SPARK-5436. -- Resolution: Fixed Fix Version/s: 1.4.0 Target Version/s: 1.4.0 > Valida

[jira] [Commented] (SPARK-4123) Show new dependencies added in pull requests

2015-02-24 Thread Brennon York (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335637#comment-14335637 ] Brennon York commented on SPARK-4123: - Gents, need a bit of input on this one. Looks l

[jira] [Created] (SPARK-5984) TimSort broken

2015-02-24 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-5984: -- Summary: TimSort broken Key: SPARK-5984 URL: https://issues.apache.org/jira/browse/SPARK-5984 Project: Spark Issue Type: Bug Components: Spark Core

[jira] [Resolved] (SPARK-5973) zip two rdd with AutoBatchedSerializer will fail

2015-02-24 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley resolved SPARK-5973. -- Resolution: Fixed Fix Version/s: 1.2.2 1.3.0 > zip two rdd wit

[jira] [Updated] (SPARK-5973) zip two rdd with AutoBatchedSerializer will fail

2015-02-24 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-5973: - Assignee: Davies Liu > zip two rdd with AutoBatchedSerializer will fail >

[jira] [Commented] (SPARK-5974) Add save/load to examples in ML guide

2015-02-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335609#comment-14335609 ] Apache Spark commented on SPARK-5974: - User 'jkbradley' has created a pull request for

[jira] [Commented] (SPARK-5980) Add GradientBoostedTrees Python examples to ML guide

2015-02-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335610#comment-14335610 ] Apache Spark commented on SPARK-5980: - User 'jkbradley' has created a pull request for

[jira] [Commented] (SPARK-2344) Add Fuzzy C-Means algorithm to MLlib

2015-02-24 Thread Alex (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335577#comment-14335577 ] Alex commented on SPARK-2344: - You're right of course - i did not mean to leave it there (only

[jira] [Commented] (SPARK-5982) Remove Local Read Time

2015-02-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335576#comment-14335576 ] Apache Spark commented on SPARK-5982: - User 'kayousterhout' has created a pull request

[jira] [Commented] (SPARK-5312) Use sbt to detect new or changed public classes in PRs

2015-02-24 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335573#comment-14335573 ] Nicholas Chammas commented on SPARK-5312: - It's something to consider I guess. Spa

[jira] [Created] (SPARK-5983) Don't respond to HTTP TRACE in HTTP-based UIs

2015-02-24 Thread Sean Owen (JIRA)
Sean Owen created SPARK-5983: Summary: Don't respond to HTTP TRACE in HTTP-based UIs Key: SPARK-5983 URL: https://issues.apache.org/jira/browse/SPARK-5983 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-5312) Use sbt to detect new or changed public classes in PRs

2015-02-24 Thread Brennon York (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335565#comment-14335565 ] Brennon York commented on SPARK-5312: - New thoughts... So I checked out the link which

[jira] [Updated] (SPARK-5981) pyspark ML models should support predict/transform on vector within map

2015-02-24 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-5981: - Description: Currently, most Python models only have limited support for single-vector pr

[jira] [Updated] (SPARK-5981) pyspark ML models fail during predict/transform on vector within map

2015-02-24 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-5981: - Target Version/s: 1.4.0 (was: 1.3.0) > pyspark ML models fail during predict/transform on

[jira] [Updated] (SPARK-5981) pyspark ML models fail during predict/transform on vector within map

2015-02-24 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-5981: - Issue Type: Improvement (was: Bug) > pyspark ML models fail during predict/transform on v

[jira] [Updated] (SPARK-5981) pyspark ML models fail during predict/transform on vector within map

2015-02-24 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-5981: - Priority: Major (was: Critical) > pyspark ML models fail during predict/transform on vect

[jira] [Updated] (SPARK-5981) pyspark ML models should support predict/transform on vector within map

2015-02-24 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-5981: - Summary: pyspark ML models should support predict/transform on vector within map (was: py

[jira] [Created] (SPARK-5982) Remove Local Read Time

2015-02-24 Thread Kay Ousterhout (JIRA)
Kay Ousterhout created SPARK-5982: - Summary: Remove Local Read Time Key: SPARK-5982 URL: https://issues.apache.org/jira/browse/SPARK-5982 Project: Spark Issue Type: Bug Reporter:

[jira] [Comment Edited] (SPARK-5140) Two RDDs which are scheduled concurrently should be able to wait on parent in all cases

2015-02-24 Thread Corey J. Nolet (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335518#comment-14335518 ] Corey J. Nolet edited comment on SPARK-5140 at 2/24/15 9:50 PM:

[jira] [Created] (SPARK-5981) pyspark ML models fail during predict/transform on vector within map

2015-02-24 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-5981: Summary: pyspark ML models fail during predict/transform on vector within map Key: SPARK-5981 URL: https://issues.apache.org/jira/browse/SPARK-5981 Project: S

[jira] [Commented] (SPARK-5140) Two RDDs which are scheduled concurrently should be able to wait on parent in all cases

2015-02-24 Thread Corey J. Nolet (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335518#comment-14335518 ] Corey J. Nolet commented on SPARK-5140: --- I have a framework (similar to cascading) w

[jira] [Updated] (SPARK-5973) zip two rdd with AutoBatchedSerializer will fail

2015-02-24 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-5973: -- Description: zip two rdd with AutoBatchedSerializer will fail, this bug was introduced by SPARK-4841 {

[jira] [Updated] (SPARK-5971) Add Mesos support to spark-ec2

2015-02-24 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-5971: Description: Right now, spark-ec2 can only launch Spark clusters that use the standalone ma

[jira] [Resolved] (SPARK-5952) Failure to lock metastore client in tableExists()

2015-02-24 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-5952. - Resolution: Fixed Fix Version/s: 1.3.0 Issue resolved by pull request 4746 [https:/

[jira] [Created] (SPARK-5979) `--packages` should not exclude spark streaming assembly jars for kafka and flume

2015-02-24 Thread Burak Yavuz (JIRA)
Burak Yavuz created SPARK-5979: -- Summary: `--packages` should not exclude spark streaming assembly jars for kafka and flume Key: SPARK-5979 URL: https://issues.apache.org/jira/browse/SPARK-5979 Project:

[jira] [Created] (SPARK-5980) Add GradientBoostedTrees Python examples to ML guide

2015-02-24 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-5980: Summary: Add GradientBoostedTrees Python examples to ML guide Key: SPARK-5980 URL: https://issues.apache.org/jira/browse/SPARK-5980 Project: Spark Is

[jira] [Commented] (SPARK-2335) k-Nearest Neighbor classification and regression for MLLib

2015-02-24 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335482#comment-14335482 ] Xiangrui Meng commented on SPARK-2335: -- [~Rusty] For exact k-NN, the problem is compl

[jira] [Commented] (SPARK-2336) Approximate k-NN Models for MLLib

2015-02-24 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335476#comment-14335476 ] Xiangrui Meng commented on SPARK-2336: -- [~Rusty] Could you provide a summary of your

[jira] [Commented] (SPARK-5973) zip two rdd with AutoBatchedSerializer will fail

2015-02-24 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335473#comment-14335473 ] Josh Rosen commented on SPARK-5973: --- Just for searchability's sake, what error message d

[jira] [Commented] (SPARK-5976) Factors returned by ALS do not have partitioners associated.

2015-02-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335464#comment-14335464 ] Apache Spark commented on SPARK-5976: - User 'mengxr' has created a pull request for th

[jira] [Commented] (SPARK-5978) Spark examples cannot compile with Hadoop 2

2015-02-24 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335461#comment-14335461 ] Sean Owen commented on SPARK-5978: -- Pretty similar to https://issues.apache.org/jira/brow

[jira] [Created] (SPARK-5978) Spark examples cannot compile with Hadoop 2

2015-02-24 Thread Michael Nazario (JIRA)
Michael Nazario created SPARK-5978: -- Summary: Spark examples cannot compile with Hadoop 2 Key: SPARK-5978 URL: https://issues.apache.org/jira/browse/SPARK-5978 Project: Spark Issue Type: Bug

[jira] [Updated] (SPARK-3665) Java API for GraphX

2015-02-24 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-3665: - Target Version/s: 1.4.0 (was: 1.3.0) > Java API for GraphX > --- > > Key:

[jira] [Updated] (SPARK-3665) Java API for GraphX

2015-02-24 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-3665: - Affects Version/s: 1.0.0 > Java API for GraphX > --- > > Key: SPARK-3665 >
