[jira] [Commented] (SPARK-1271) Use Iterator[X] in co-group and group-by signatures

2014-04-01 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13957293#comment-13957293 ] holdenk commented on SPARK-1271: https://github.com/apache/spark/pull/242 Use

[jira] [Commented] (SPARK-939) Allow user jars to take precedence over Spark jars, if desired

2014-04-01 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13957294#comment-13957294 ] holdenk commented on SPARK-939: --- https://github.com/apache/spark/pull/217 Allow user jars

[jira] [Created] (SPARK-1551) Spark master does not build in sbt

2014-04-20 Thread holdenk (JIRA)
holdenk created SPARK-1551: -- Summary: Spark master does not build in sbt Key: SPARK-1551 URL: https://issues.apache.org/jira/browse/SPARK-1551 Project: Spark Issue Type: Bug Reporter:

[jira] [Commented] (SPARK-1551) Spark master does not build in sbt

2014-04-21 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13975442#comment-13975442 ] holdenk commented on SPARK-1551: Sorry about that, I had something dirty locally that

[jira] [Created] (SPARK-2023) PySpark reduce does a map side reduce and then sends the results to the driver for final reduce, instead do this more like Scala Spark.

2014-06-04 Thread holdenk (JIRA)
holdenk created SPARK-2023: -- Summary: PySpark reduce does a map side reduce and then sends the results to the driver for final reduce, instead do this more like Scala Spark. Key: SPARK-2023 URL:

[jira] [Created] (SPARK-3311) SparkFiles.get doesn't work in local mode

2014-08-29 Thread holdenk (JIRA)
holdenk created SPARK-3311: -- Summary: SparkFiles.get doesn't work in local mode Key: SPARK-3311 URL: https://issues.apache.org/jira/browse/SPARK-3311 Project: Spark Issue Type: Bug

[jira] [Created] (SPARK-3312) Add a groupByKey which returns a special GroupBy object like in pandas

2014-08-29 Thread holdenk (JIRA)
holdenk created SPARK-3312: -- Summary: Add a groupByKey which returns a special GroupBy object like in pandas Key: SPARK-3312 URL: https://issues.apache.org/jira/browse/SPARK-3312 Project: Spark

[jira] [Created] (SPARK-3314) Script creation of AMIs

2014-08-29 Thread holdenk (JIRA)
holdenk created SPARK-3314: -- Summary: Script creation of AMIs Key: SPARK-3314 URL: https://issues.apache.org/jira/browse/SPARK-3314 Project: Spark Issue Type: Improvement Components: EC2

[jira] [Commented] (SPARK-3311) SparkFiles.get doesn't work in local mode

2014-08-29 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14115890#comment-14115890 ] holdenk commented on SPARK-3311: Note: this works if you add a file under ./ but not

[jira] [Closed] (SPARK-3311) SparkFiles.get doesn't work in local mode

2014-08-29 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk closed SPARK-3311. -- Resolution: Fixed turns out the comment was just out of date, I'll fix the comment in another PR so

[jira] [Created] (SPARK-3318) The documentation for addFiles is wrong

2014-08-29 Thread holdenk (JIRA)
holdenk created SPARK-3318: -- Summary: The documentation for addFiles is wrong Key: SPARK-3318 URL: https://issues.apache.org/jira/browse/SPARK-3318 Project: Spark Issue Type: Documentation

[jira] [Created] (SPARK-3406) Python persist API does not have a default storage level

2014-09-04 Thread holdenk (JIRA)
holdenk created SPARK-3406: -- Summary: Python persist API does not have a default storage level Key: SPARK-3406 URL: https://issues.apache.org/jira/browse/SPARK-3406 Project: Spark Issue Type: Bug

[jira] [Created] (SPARK-3444) Provide a way to easily change the log level in the Spark shell while running

2014-09-08 Thread holdenk (JIRA)
holdenk created SPARK-3444: -- Summary: Provide a way to easily change the log level in the Spark shell while running Key: SPARK-3444 URL: https://issues.apache.org/jira/browse/SPARK-3444 Project: Spark

[jira] [Created] (SPARK-3754) Spark Streaming fileSystem API is not callable from Java

2014-09-30 Thread holdenk (JIRA)
holdenk created SPARK-3754: -- Summary: Spark Streaming fileSystem API is not callable from Java Key: SPARK-3754 URL: https://issues.apache.org/jira/browse/SPARK-3754 Project: Spark Issue Type: Bug

[jira] [Created] (SPARK-4015) Documentation in the streaming context references non-existent function

2014-10-20 Thread holdenk (JIRA)
holdenk created SPARK-4015: -- Summary: Documentation in the streaming context references non-existent function Key: SPARK-4015 URL: https://issues.apache.org/jira/browse/SPARK-4015 Project: Spark

[jira] [Commented] (SPARK-3359) `sbt/sbt unidoc` doesn't work with Java 8

2014-10-22 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14180104#comment-14180104 ] holdenk commented on SPARK-3359: I think I've got a fix for it, I'll send a PR :)

[jira] [Created] (SPARK-4767) Add support for launching in a specified placement group to spark ec2 scripts.

2014-12-05 Thread holdenk (JIRA)
holdenk created SPARK-4767: -- Summary: Add support for launching in a specified placement group to spark ec2 scripts. Key: SPARK-4767 URL: https://issues.apache.org/jira/browse/SPARK-4767 Project: Spark

[jira] [Commented] (SPARK-4877) userClassPathFirst doesn't handle user classes inheriting from parent

2015-02-04 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14305810#comment-14305810 ] holdenk commented on SPARK-4877: Hi Matt, I don't believe we need to override loadClass,

[jira] [Commented] (SPARK-7511) PySpark ML seed Param should be random by default

2015-05-13 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14542864#comment-14542864 ] holdenk commented on SPARK-7511: I can do this :) PySpark ML seed Param should be random

[jira] [Commented] (SPARK-7711) startTime() is missing

2015-05-19 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14551498#comment-14551498 ] holdenk commented on SPARK-7711: I can add this. startTime() is missing

[jira] [Commented] (SPARK-7781) GradientBoostedTrees.trainRegressor is missing maxBins parameter in pyspark

2015-05-20 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14553596#comment-14553596 ] holdenk commented on SPARK-7781: I can take this :) GradientBoostedTrees.trainRegressor

[jira] [Commented] (SPARK-8069) Add support for cutoff to RandomForestClassifier

2015-06-06 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14575627#comment-14575627 ] holdenk commented on SPARK-8069: Cool, I'll give that some thought this weekend :) Add

[jira] [Created] (SPARK-8069) Add support for cutoff to RandomForestClassifier

2015-06-03 Thread holdenk (JIRA)
holdenk created SPARK-8069: -- Summary: Add support for cutoff to RandomForestClassifier Key: SPARK-8069 URL: https://issues.apache.org/jira/browse/SPARK-8069 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-8498) Fix NullPointerException in error-handling path in UnsafeShuffleWriter

2015-06-20 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14594452#comment-14594452 ] holdenk commented on SPARK-8498: I could take this :) Fix NullPointerException in

[jira] [Created] (SPARK-8601) Disable feature scaling in Linear Regression

2015-06-24 Thread holdenk (JIRA)
holdenk created SPARK-8601: -- Summary: Disable feature scaling in Linear Regression Key: SPARK-8601 URL: https://issues.apache.org/jira/browse/SPARK-8601 Project: Spark Issue Type: Sub-task

[jira] [Created] (SPARK-8613) Add a param for disabling of feature scaling, default to true

2015-06-24 Thread holdenk (JIRA)
holdenk created SPARK-8613: -- Summary: Add a param for disabling of feature scaling, default to true Key: SPARK-8613 URL: https://issues.apache.org/jira/browse/SPARK-8613 Project: Spark Issue Type:

[jira] [Commented] (SPARK-8506) SparkR does not provide an easy way to depend on Spark Packages when performing init from inside of R

2015-06-20 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14594923#comment-14594923 ] holdenk commented on SPARK-8506: Sounds like a plan :) SparkR does not provide an easy

[jira] [Created] (SPARK-8506) SparkR does not provide an easy way to depend on Spark Packages when performing init from inside of R

2015-06-20 Thread holdenk (JIRA)
holdenk created SPARK-8506: -- Summary: SparkR does not provide an easy way to depend on Spark Packages when performing init from inside of R Key: SPARK-8506 URL: https://issues.apache.org/jira/browse/SPARK-8506

[jira] [Commented] (SPARK-8506) SparkR does not provide an easy way to depend on Spark Packages when performing init from inside of R

2015-06-20 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14594868#comment-14594868 ] holdenk commented on SPARK-8506: That's what I thought would be a good solution. I can go

[jira] [Commented] (SPARK-7888) Be able to disable intercept in Linear Regression in ML package

2015-06-17 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14590201#comment-14590201 ] holdenk commented on SPARK-7888: So it seems like scikit learn takes the easy approach and

[jira] [Commented] (SPARK-7888) Be able to disable intercept in Linear Regression in ML package

2015-06-17 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14591119#comment-14591119 ] holdenk commented on SPARK-7888: Cool, I did a quick prototype as well but mine doesn't

[jira] [Commented] (SPARK-7888) Be able to disable intercept in Linear Regression in ML package

2015-06-17 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14590585#comment-14590585 ] holdenk commented on SPARK-7888: So if we don't re-center but still scale all of the

[jira] [Commented] (SPARK-7674) R-like stats for ML models

2015-06-15 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14587376#comment-14587376 ] holdenk commented on SPARK-7674: I'd love to help with this if that's cool :) R-like

[jira] [Created] (SPARK-7910) Expose partitioner information in Java Python APIs.

2015-05-27 Thread holdenk (JIRA)
holdenk created SPARK-7910: -- Summary: Expose partitioner information in Java Python APIs. Key: SPARK-7910 URL: https://issues.apache.org/jira/browse/SPARK-7910 Project: Spark Issue Type:

[jira] [Updated] (SPARK-7910) Expose partitioner information in JavaRDD

2015-05-27 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk updated SPARK-7910: --- Summary: Expose partitioner information in JavaRDD (was: Expose partitioner information in Java Python

[jira] [Commented] (SPARK-6987) Node Locality is determined with String Matching instead of Inet Comparison

2015-05-27 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14562120#comment-14562120 ] holdenk commented on SPARK-6987: What do you mean by inet comparison? If the problem is

[jira] [Updated] (SPARK-7852) Set the initial weights based on the previous when GLMs are run with multiple regParams

2015-05-27 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk updated SPARK-7852: --- Summary: Set the initial weights based on the previous when GLMs are run with multiple regParams (was: Use

[jira] [Commented] (SPARK-7888) Be able to disable intercept in Linear Regression in ML package

2015-05-27 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14562187#comment-14562187 ] holdenk commented on SPARK-7888: Sounds good, I can go do some reading before I bug you

[jira] [Updated] (SPARK-7910) Expose partitioner information in Java Python APIs.

2015-05-27 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk updated SPARK-7910: --- Component/s: (was: PySpark) (was: Spark Core) Expose partitioner information in

[jira] [Commented] (SPARK-7888) Be able to disable intercept in Linear Regression in ML package

2015-05-27 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14561864#comment-14561864 ] holdenk commented on SPARK-7888: I could do this since I'm sort of poking around at this

[jira] [Commented] (SPARK-8764) StringIndexer should take option to handle unseen values

2015-07-01 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14611056#comment-14611056 ] holdenk commented on SPARK-8764: I could do this, I've got another PR with the

[jira] [Created] (SPARK-8769) toLocalIterator should mention it results in many jobs

2015-07-01 Thread holdenk (JIRA)
holdenk created SPARK-8769: -- Summary: toLocalIterator should mention it results in many jobs Key: SPARK-8769 URL: https://issues.apache.org/jira/browse/SPARK-8769 Project: Spark Issue Type:

[jira] [Commented] (SPARK-8744) StringIndexerModel should have public constructor

2015-07-01 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14611057#comment-14611057 ] holdenk commented on SPARK-8744: I could do this, I've got another PR with the

[jira] [Commented] (SPARK-8069) Add support for cutoff to RandomForestClassifier

2015-07-02 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14612387#comment-14612387 ] holdenk commented on SPARK-8069: That sounds like a good plan. For rawPrediction's I'm not

[jira] [Created] (SPARK-8771) Actor system deprecation tag uses deprecated deprecation tag

2015-07-01 Thread holdenk (JIRA)
holdenk created SPARK-8771: -- Summary: Actor system deprecation tag uses deprecated deprecation tag Key: SPARK-8771 URL: https://issues.apache.org/jira/browse/SPARK-8771 Project: Spark Issue Type:

[jira] [Commented] (SPARK-8069) Add support for cutoff to RandomForestClassifier

2015-07-01 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14611224#comment-14611224 ] holdenk commented on SPARK-8069: So I started working on doing this for the

[jira] [Commented] (SPARK-7780) The intercept in LogisticRegressionWithLBFGS should not be regularized

2015-05-22 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14555806#comment-14555806 ] holdenk commented on SPARK-7780: Rad :) Yay for long weekends :) The intercept in

[jira] [Created] (SPARK-7852) Add support for re-using weights when training with multiple lambdas

2015-05-25 Thread holdenk (JIRA)
holdenk created SPARK-7852: -- Summary: Add support for re-using weights when training with multiple lambdas Key: SPARK-7852 URL: https://issues.apache.org/jira/browse/SPARK-7852 Project: Spark

[jira] [Commented] (SPARK-7780) The intercept in LogisticRegressionWithLBFGS should not be regularized

2015-05-22 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14556823#comment-14556823 ] holdenk commented on SPARK-7780: I was thinking that since the user could override this

[jira] [Commented] (SPARK-7446) Inverse transform for StringIndexer

2015-05-21 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14555064#comment-14555064 ] holdenk commented on SPARK-7446: I can do this one :) Inverse transform for

[jira] [Commented] (SPARK-7780) The intercept in LogisticRegressionWithLBFGS should not be regularized

2015-05-21 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14555051#comment-14555051 ] holdenk commented on SPARK-7780: I can take a crack at this if that would be cool :) The

[jira] [Updated] (SPARK-9909) Move weightCol to sharedParams

2015-08-12 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk updated SPARK-9909: --- Component/s: ML Move weightCol to sharedParams -- Key:

[jira] [Created] (SPARK-9909) Move weightCol to sharedParams

2015-08-12 Thread holdenk (JIRA)
holdenk created SPARK-9909: -- Summary: Move weightCol to sharedParams Key: SPARK-9909 URL: https://issues.apache.org/jira/browse/SPARK-9909 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-9447) Update python API to include RandomForest as classifier changes.

2015-07-29 Thread holdenk (JIRA)
holdenk created SPARK-9447: -- Summary: Update python API to include RandomForest as classifier changes. Key: SPARK-9447 URL: https://issues.apache.org/jira/browse/SPARK-9447 Project: Spark Issue

[jira] [Commented] (SPARK-8069) Add support for cutoff to RandomForestClassifier

2015-08-01 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14650477#comment-14650477 ] holdenk commented on SPARK-8069: So I was looking at

[jira] [Created] (SPARK-9653) Add an invert method to the StringIndexerModel as was done to StringIndexer

2015-08-05 Thread holdenk (JIRA)
holdenk created SPARK-9653: -- Summary: Add an invert method to the StringIndexerModel as was done to StringIndexer Key: SPARK-9653 URL: https://issues.apache.org/jira/browse/SPARK-9653 Project: Spark

[jira] [Updated] (SPARK-9653) Add an invert method to the StringIndexer as was done for StringIndexerModel

2015-08-05 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk updated SPARK-9653: --- Summary: Add an invert method to the StringIndexer as was done for StringIndexerModel (was: Add an invert

[jira] [Created] (SPARK-9654) Add StringIndexer inverse in Pyspark

2015-08-05 Thread holdenk (JIRA)
holdenk created SPARK-9654: -- Summary: Add StringIndexer inverse in Pyspark Key: SPARK-9654 URL: https://issues.apache.org/jira/browse/SPARK-9654 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-9653) Add an invert method to the StringIndexer as was done for StringIndexerModel

2015-08-05 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14658826#comment-14658826 ] holdenk commented on SPARK-9653: After starting to look at implementing this I'm no longer

[jira] [Commented] (SPARK-9680) add API document for ml.feature.StopWordsRemover

2015-08-06 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14660475#comment-14660475 ] holdenk commented on SPARK-9680: I can do this if no one else is working on it yet. add

[jira] [Commented] (SPARK-9679) Add python interface for ml.feature.StopWordsRemover

2015-08-06 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14660476#comment-14660476 ] holdenk commented on SPARK-9679: I can do this if no one else is working on it yet. Add

[jira] [Commented] (SPARK-9680) add API document for ml.feature.StopWordsRemover

2015-08-06 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14660544#comment-14660544 ] holdenk commented on SPARK-9680: I assumed it was the API docs since it says 'add API

[jira] [Commented] (SPARK-9774) Add Python API for ml.regression.IsotonicRegression

2015-08-11 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14692790#comment-14692790 ] holdenk commented on SPARK-9774: I can do this if no one else is working on it. Add

[jira] [Commented] (SPARK-9769) Add Python API for ml.feature.CountVectorizerModel

2015-08-11 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14692788#comment-14692788 ] holdenk commented on SPARK-9769: I can do this if no one else is working on it. Add

[jira] [Commented] (SPARK-9654) Add StringIndexer inverse in Pyspark

2015-08-05 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14659082#comment-14659082 ] holdenk commented on SPARK-9654: See

[jira] [Created] (SPARK-9016) Make the random forest classifiers implement classification trait

2015-07-13 Thread holdenk (JIRA)
holdenk created SPARK-9016: -- Summary: Make the random forest classifiers implement classification trait Key: SPARK-9016 URL: https://issues.apache.org/jira/browse/SPARK-9016 Project: Spark Issue

[jira] [Created] (SPARK-9204) Add default params test to linear regression

2015-07-20 Thread holdenk (JIRA)
holdenk created SPARK-9204: -- Summary: Add default params test to linear regression Key: SPARK-9204 URL: https://issues.apache.org/jira/browse/SPARK-9204 Project: Spark Issue Type: Test

[jira] [Commented] (SPARK-9595) Adding API to SparkConf for kryo serializers registration

2015-10-21 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14967563#comment-14967563 ] holdenk commented on SPARK-9595: I'm not sure I get the question - do you have a bit of code which shows

[jira] [Created] (SPARK-11240) PMML export for SVM models in ML pipeline

2015-10-21 Thread holdenk (JIRA)
holdenk created SPARK-11240: --- Summary: PMML export for SVM models in ML pipeline Key: SPARK-11240 URL: https://issues.apache.org/jira/browse/SPARK-11240 Project: Spark Issue Type: Sub-task

[jira] [Created] (SPARK-11239) PMML export for ML linear regression

2015-10-21 Thread holdenk (JIRA)
holdenk created SPARK-11239: --- Summary: PMML export for ML linear regression Key: SPARK-11239 URL: https://issues.apache.org/jira/browse/SPARK-11239 Project: Spark Issue Type: Sub-task

[jira] [Commented] (SPARK-11171) PMML for Pipelines API

2015-10-21 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14967558#comment-14967558 ] holdenk commented on SPARK-11171: - Skimming the mailing list it seems that people mostly just want more

[jira] [Created] (SPARK-11237) PMML export for ML KMeans

2015-10-21 Thread holdenk (JIRA)
holdenk created SPARK-11237: --- Summary: PMML export for ML KMeans Key: SPARK-11237 URL: https://issues.apache.org/jira/browse/SPARK-11237 Project: Spark Issue Type: Sub-task Components:

[jira] [Created] (SPARK-11241) Add a common trait for PMML exportable ML pipeline models

2015-10-21 Thread holdenk (JIRA)
holdenk created SPARK-11241: --- Summary: Add a common trait for PMML exportable ML pipeline models Key: SPARK-11241 URL: https://issues.apache.org/jira/browse/SPARK-11241 Project: Spark Issue Type:

[jira] [Created] (SPARK-11332) WeightedLeastSquares should use ml features generic Instance class instead of private

2015-10-26 Thread holdenk (JIRA)
holdenk created SPARK-11332: --- Summary: WeightedLeastSquares should use ml features generic Instance class instead of private Key: SPARK-11332 URL: https://issues.apache.org/jira/browse/SPARK-11332 Project:

[jira] [Commented] (SPARK-11385) Add foreach API to MLLib's vector API

2015-10-29 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14981529#comment-14981529 ] holdenk commented on SPARK-11385: - So [~dbtsai] can probably give a bit more insight into this - it's to

[jira] [Commented] (SPARK-11372) custom UDAF with StringType throws java.lang.ClassCastException

2015-10-29 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14981596#comment-14981596 ] holdenk commented on SPARK-11372: - So you seem to be setting the buffer to Java strings and Spark SQL is

[jira] [Commented] (SPARK-10658) Could pyspark provide addJars() as scala spark API?

2015-10-27 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14977147#comment-14977147 ] holdenk commented on SPARK-10658: - So this turns out to be a bit complicated because of a variety of things

[jira] [Commented] (SPARK-11239) PMML export for ML linear regression

2015-10-27 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14977007#comment-14977007 ] holdenk commented on SPARK-11239: - Pretty much, I've got a draft PR out for it but I'm waiting on

[jira] [Created] (SPARK-11365) consolidate aggregates for summary statistics in weighted least squares

2015-10-28 Thread holdenk (JIRA)
holdenk created SPARK-11365: --- Summary: consolidate aggregates for summary statistics in weighted least squares Key: SPARK-11365 URL: https://issues.apache.org/jira/browse/SPARK-11365 Project: Spark

[jira] [Created] (SPARK-11386) Refactor appropriate uses of Vector to use the new foreach API

2015-10-28 Thread holdenk (JIRA)
holdenk created SPARK-11386: --- Summary: Refactor appropriate uses of Vector to use the new foreach API Key: SPARK-11386 URL: https://issues.apache.org/jira/browse/SPARK-11386 Project: Spark Issue

[jira] [Created] (SPARK-11385) Add foreach API to MLLib's vector API

2015-10-28 Thread holdenk (JIRA)
holdenk created SPARK-11385: --- Summary: Add foreach API to MLLib's vector API Key: SPARK-11385 URL: https://issues.apache.org/jira/browse/SPARK-11385 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-10928) Spark Mesos finegrain mode with single core CPUs, rendered useless

2015-10-28 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14979061#comment-14979061 ] holdenk commented on SPARK-10928: - So looking at the MesosSchedulerBackend it requires 1 for the mesos

[jira] [Commented] (SPARK-11138) Flaky pyspark test: test_add_py_file

2015-10-28 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14979854#comment-14979854 ] holdenk commented on SPARK-11138: - So I ran addPyFile and immediately used the result 100k times on my

[jira] [Created] (SPARK-11397) PySpark Streaming uses threadcontext class loader, may cause issues on mesos

2015-10-28 Thread holdenk (JIRA)
holdenk created SPARK-11397: --- Summary: PySpark Streaming uses threadcontext class loader, may cause issues on mesos Key: SPARK-11397 URL: https://issues.apache.org/jira/browse/SPARK-11397 Project: Spark

[jira] [Commented] (SPARK-11171) PMML for Pipelines API

2015-10-21 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14967956#comment-14967956 ] holdenk commented on SPARK-11171: - So from the implementation point of view, we could keep extending the

[jira] [Commented] (SPARK-11138) Flaky pyspark test: test_add_py_file

2015-10-28 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14979617#comment-14979617 ] holdenk commented on SPARK-11138: - So I've been digging into this issue a bit today (mostly trying to see

[jira] [Created] (SPARK-11771) Maximum memory is determined by two params but error message only lists one.

2015-11-16 Thread holdenk (JIRA)
holdenk created SPARK-11771: --- Summary: Maximum memory is determined by two params but error message only lists one. Key: SPARK-11771 URL: https://issues.apache.org/jira/browse/SPARK-11771 Project: Spark

[jira] [Commented] (SPARK-11138) Flaky pyspark test: test_add_py_file

2015-10-30 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14982135#comment-14982135 ] holdenk commented on SPARK-11138: - Makes sense, I guess I'll wait for it to fail again and grab the build

[jira] [Closed] (SPARK-11386) Refactor appropriate uses of Vector to use the new foreach API

2015-10-30 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk closed SPARK-11386. --- Resolution: Fixed > Refactor appropriate uses of Vector to use the new foreach API >

[jira] [Commented] (SPARK-11385) Add foreach API to MLLib's vector API

2015-10-30 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14982073#comment-14982073 ] holdenk commented on SPARK-11385: - ah my misunderstanding then. > Add foreach API to MLLib's vector API

[jira] [Updated] (SPARK-11385) Make foreachActive public in MLLib's vector API

2015-10-30 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk updated SPARK-11385: Description: Make foreachActive public in MLLib's vector API (was: Add a foreach API to MLLib's vector.)

[jira] [Commented] (SPARK-11138) Flaky pyspark test: test_add_py_file

2015-10-30 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14982103#comment-14982103 ] holdenk commented on SPARK-11138: - So with a non logged in view I don't see it on

[jira] [Updated] (SPARK-11385) Make foreachActive public in MLLib's vector API

2015-10-30 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk updated SPARK-11385: Summary: Make foreachActive public in MLLib's vector API (was: Add foreach API to MLLib's vector API) >

[jira] [Created] (SPARK-11421) Add the ability to add a jar to the current class loader

2015-10-30 Thread holdenk (JIRA)
holdenk created SPARK-11421: --- Summary: Add the ability to add a jar to the current class loader Key: SPARK-11421 URL: https://issues.apache.org/jira/browse/SPARK-11421 Project: Spark Issue Type:

[jira] [Updated] (SPARK-11421) Add the ability to add a jar to the current class loader

2015-10-30 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk updated SPARK-11421: Component/s: Spark Core > Add the ability to add a jar to the current class loader >

[jira] [Reopened] (SPARK-11386) Refactor appropriate uses of Vector to use the new foreach API

2015-10-30 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk reopened SPARK-11386: - closed with wrong status > Refactor appropriate uses of Vector to use the new foreach API >

[jira] [Closed] (SPARK-11386) Refactor appropriate uses of Vector to use the new foreach API

2015-10-30 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk closed SPARK-11386. --- Resolution: Invalid > Refactor appropriate uses of Vector to use the new foreach API >

[jira] [Updated] (SPARK-11444) Allow bacth seqOp combination in treeAggregate

2015-11-02 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk updated SPARK-11444: Summary: Allow bacth seqOp combination in treeAggregate (was: Allow bacth seqOp combination in

[jira] [Commented] (SPARK-11444) Allow bacth seqOp combination in treeAggregate

2015-11-02 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14985882#comment-14985882 ] holdenk commented on SPARK-11444: - Draft design:

[jira] [Updated] (SPARK-11444) Allow bacth seqOp combination in treeAggregate

2015-11-02 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk updated SPARK-11444: Description: Allow batch seqOp in treeAggregate so as to allow better integration with GPU type workloads.
