[GitHub] spark pull request: add a util method for changing the log level w...

2014-09-17 Thread holdenk
GitHub user holdenk opened a pull request: https://github.com/apache/spark/pull/2433 add a util method for changing the log level while running You can merge this pull request into a Git repository by running: $ git pull https://github.com/holdenk/spark SPARK-3444-provide

[GitHub] spark pull request: add a util method for changing the log level w...

2014-09-17 Thread holdenk
Github user holdenk commented on the pull request: https://github.com/apache/spark/pull/2433#issuecomment-55964974 Sounds good, how about I have the spark context do the conversion and call the utils method? --- If your project is set up for it, you can reply to this email and have

[GitHub] spark pull request: add a util method for changing the log level w...

2014-09-25 Thread holdenk
Github user holdenk commented on the pull request: https://github.com/apache/spark/pull/2433#issuecomment-56780109 Sure I will do that tomorrow :)

[GitHub] spark pull request: SPARK-1311 Map side distinct in collect vertex...

2014-07-29 Thread holdenk
Github user holdenk closed the pull request at: https://github.com/apache/spark/pull/21

[GitHub] spark pull request: Spark 1857 improve error message when trying p...

2014-07-30 Thread holdenk
Github user holdenk commented on the pull request: https://github.com/apache/spark/pull/831#issuecomment-50693037 Sure I should probably restart it from scratch anyways. On Wed, Jul 30, 2014 at 4:09 PM, Patrick Wendell notificati...@github.com wrote: Sure

[GitHub] spark pull request: Spark 1857 improve error message when trying p...

2014-07-30 Thread holdenk
Github user holdenk closed the pull request at: https://github.com/apache/spark/pull/831

[GitHub] spark pull request: WIP: Spark 3754 spark streaming file system ap...

2014-10-07 Thread holdenk
GitHub user holdenk opened a pull request: https://github.com/apache/spark/pull/2703 WIP: Spark 3754 spark streaming file system api callable from java This is a WIP of making the filesystem api callable for the Java version of Spark Streaming. You can merge this pull request

[GitHub] spark pull request: Spark 3754 spark streaming file system api cal...

2014-10-08 Thread holdenk
Github user holdenk commented on the pull request: https://github.com/apache/spark/pull/2703#issuecomment-58447025 @tdas this fixes the issue with the Java API I found while working on the book.

[GitHub] spark pull request: replace awaitTransformation with awaitTerminat...

2014-10-20 Thread holdenk
GitHub user holdenk opened a pull request: https://github.com/apache/spark/pull/2861 replace awaitTransformation with awaitTermination in scaladoc/javadoc You can merge this pull request into a Git repository by running: $ git pull https://github.com/holdenk/spark SPARK-4015

[GitHub] spark pull request: specify unidocGenjavadocVersion of 0.8

2014-10-22 Thread holdenk
GitHub user holdenk opened a pull request: https://github.com/apache/spark/pull/2893 specify unidocGenjavadocVersion of 0.8 Fixes an issue with being too strict generating javadoc causing errors. You can merge this pull request into a Git repository by running: $ git pull

[GitHub] spark pull request: Spark 3754 spark streaming file system api cal...

2014-10-22 Thread holdenk
Github user holdenk commented on the pull request: https://github.com/apache/spark/pull/2703#issuecomment-60113226 jenkins, retest this please

[GitHub] spark pull request: Spark 3754 spark streaming file system api cal...

2014-10-27 Thread holdenk
Github user holdenk commented on a diff in the pull request: https://github.com/apache/spark/pull/2703#discussion_r19435702 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/StreamingContext.scala --- @@ -331,8 +331,8 @@ class StreamingContext private[streaming

[GitHub] spark pull request: Spark 3754 spark streaming file system api cal...

2014-10-27 Thread holdenk
Github user holdenk commented on a diff in the pull request: https://github.com/apache/spark/pull/2703#discussion_r19435781 --- Diff: streaming/src/test/java/org/apache/spark/streaming/JavaAPISuite.java --- @@ -1703,6 +1710,65 @@ public void testTextFileStream

[GitHub] spark pull request: add a util method for changing the log level w...

2014-10-27 Thread holdenk
Github user holdenk commented on the pull request: https://github.com/apache/spark/pull/2433#issuecomment-60666140 @tdas If we switch to a different logging library it would just be a matter of calling the new log library's change log level function (and possibly translating some

[GitHub] spark pull request: Spark 1857 improve error message when trying p...

2014-07-12 Thread holdenk
Github user holdenk commented on the pull request: https://github.com/apache/spark/pull/831#issuecomment-48803493 @pwendell : That sounds like a good plan. I'll give it a shot.

[GitHub] spark pull request: Spark 615 map partitions with index callable f...

2014-03-16 Thread holdenk
Github user holdenk commented on a diff in the pull request: https://github.com/apache/spark/pull/16#discussion_r10643477 --- Diff: core/src/main/scala/org/apache/spark/api/java/JavaRDDLike.scala --- @@ -72,11 +72,12 @@ trait JavaRDDLike[T, This : JavaRDDLike[T, This]] extends

[GitHub] spark pull request: [WIP] Spark 939 allow user jars to take preced...

2014-03-24 Thread holdenk
GitHub user holdenk opened a pull request: https://github.com/apache/spark/pull/217 [WIP] Spark 939 allow user jars to take precedence over spark jars I still need to do a small bit of re-factoring [mostly the one Java file I'll switch it back to a Scala file and use it in both

[GitHub] spark pull request: [WIP] Spark 939 allow user jars to take preced...

2014-03-24 Thread holdenk
Github user holdenk commented on the pull request: https://github.com/apache/spark/pull/217#issuecomment-38508778 oh hmm... the manifest files failed the Apache RAT test. I'll see if I can make my tests work without them.

[GitHub] spark pull request: Spark 615 map partitions with index callable f...

2014-03-24 Thread holdenk
Github user holdenk commented on a diff in the pull request: https://github.com/apache/spark/pull/16#discussion_r10912961 --- Diff: core/src/main/scala/org/apache/spark/api/java/JavaRDDLike.scala --- @@ -72,11 +72,12 @@ trait JavaRDDLike[T, This : JavaRDDLike[T, This]] extends

[GitHub] spark pull request: [WIP] Spark 939 allow user jars to take preced...

2014-03-24 Thread holdenk
Github user holdenk commented on a diff in the pull request: https://github.com/apache/spark/pull/217#discussion_r10917591 --- Diff: project/SparkBuild.scala --- @@ -182,7 +182,6 @@ object SparkBuild extends Build { concurrentRestrictions in Global += Tags.limit(Tags.Test

[GitHub] spark pull request: [WIP] Spark 939 allow user jars to take preced...

2014-03-24 Thread holdenk
Github user holdenk commented on a diff in the pull request: https://github.com/apache/spark/pull/217#discussion_r10917645 --- Diff: core/src/main/scala/org/apache/spark/executor/ExecutorURLClassLoader.scala --- @@ -19,13 +19,56 @@ package org.apache.spark.executor

[GitHub] spark pull request: [WIP] Spark 939 allow user jars to take preced...

2014-03-24 Thread holdenk
Github user holdenk commented on a diff in the pull request: https://github.com/apache/spark/pull/217#discussion_r10918126 --- Diff: core/src/main/scala/org/apache/spark/executor/ExecutorURLClassLoader.scala --- @@ -19,13 +19,56 @@ package org.apache.spark.executor

[GitHub] spark pull request: [WIP] Spark 939 allow user jars to take preced...

2014-03-24 Thread holdenk
Github user holdenk commented on the pull request: https://github.com/apache/spark/pull/217#issuecomment-38533060 So when we extend a classloader and provide the parent, the loadClass function will resolve through the parent rather than the child, which is why we avoid it for the Child

[GitHub] spark pull request: [WIP] Spark 939 allow user jars to take preced...

2014-03-25 Thread holdenk
Github user holdenk commented on the pull request: https://github.com/apache/spark/pull/217#issuecomment-38534295 @AmplabJenkins retest this please

[GitHub] spark pull request: [WIP] Spark 1271 (1320) cogroup and groupby sh...

2014-03-26 Thread holdenk
GitHub user holdenk opened a pull request: https://github.com/apache/spark/pull/242 [WIP] Spark 1271 (1320) cogroup and groupby should pass iterator[x] You can merge this pull request into a Git repository by running: $ git pull https://github.com/holdenk/spark spark-1320

[GitHub] spark pull request: [WIP] Spark 1271 (1320) cogroup and groupby sh...

2014-03-26 Thread holdenk
Github user holdenk commented on the pull request: https://github.com/apache/spark/pull/242#issuecomment-38765714 Jenkins, retest this please On Wednesday, March 26, 2014, UCB AMPLab notificati...@github.com wrote: Merged build finished. -- Reply

[GitHub] spark pull request: [WIP] Spark 1271 (1320) cogroup and groupby sh...

2014-03-27 Thread holdenk
Github user holdenk commented on the pull request: https://github.com/apache/spark/pull/242#issuecomment-38773795 It would be a bit difficult to preserve the other return type since the call signatures are the same. It's probably worth calling out in the release notes given that I had

[GitHub] spark pull request: [WIP] Spark 1271 (1320) cogroup and groupby sh...

2014-03-27 Thread holdenk
Github user holdenk commented on the pull request: https://github.com/apache/spark/pull/242#issuecomment-38869216 Do we also want these changes for the python API?

[GitHub] spark pull request: [WIP] Spark 1271 (1320) cogroup and groupby sh...

2014-03-27 Thread holdenk
Github user holdenk commented on the pull request: https://github.com/apache/spark/pull/242#issuecomment-38886937 So I made the python API match as well.

[GitHub] spark pull request: [WIP] Spark 1271 (1320) cogroup and groupby sh...

2014-03-27 Thread holdenk
Github user holdenk commented on the pull request: https://github.com/apache/spark/pull/242#issuecomment-38886975 @mridulm Did you want something named legacyGroupByKey, which would have the old behaviour?

[GitHub] spark pull request: Spark 1271 (1320) cogroup and groupby should p...

2014-04-02 Thread holdenk
Github user holdenk commented on the pull request: https://github.com/apache/spark/pull/242#issuecomment-39373102 The python one is already an iterable. I'll refactor this if @rxin / @aarondav don't have any comments :)

[GitHub] spark pull request: Spark 939 allow user jars to take precedence o...

2014-04-04 Thread holdenk
Github user holdenk commented on the pull request: https://github.com/apache/spark/pull/217#issuecomment-39628786 That code looks useful :) Just a heads up though, it's probably going to be a bit ugly since: 1) I need the classes to not be in the Spark project itself (so I'll

[GitHub] spark pull request: Spark 939 allow user jars to take precedence o...

2014-04-05 Thread holdenk
Github user holdenk commented on the pull request: https://github.com/apache/spark/pull/217#issuecomment-39630440 On Friday, April 4, 2014, Patrick Wendell notificati...@github.com wrote: 1) What do you mean not be in the Spark project? Do you mean the package name

[GitHub] spark pull request: Spark 1271 (1320) cogroup and groupby should p...

2014-04-08 Thread holdenk
Github user holdenk commented on a diff in the pull request: https://github.com/apache/spark/pull/242#discussion_r11406772 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/recommendation/ALS.scala --- @@ -421,12 +421,12 @@ class ALS private ( * Compute the new feature

[GitHub] spark pull request: Spark 1271 (1320) cogroup and groupby should p...

2014-04-08 Thread holdenk
Github user holdenk commented on the pull request: https://github.com/apache/spark/pull/242#issuecomment-39892406 Jenkins, retest this please

[GitHub] spark pull request: SPARK-1310: Start adding k-fold cross validati...

2014-04-08 Thread holdenk
Github user holdenk commented on the pull request: https://github.com/apache/spark/pull/18#issuecomment-39920996 @pwendell / @mengxr If this passes the tests do you think it would be ok to merge? It fixes a bug in BernoulliSampler :)

[GitHub] spark pull request: SPARK-1242 Add aggregate to python rdd

2014-04-09 Thread holdenk
Github user holdenk commented on the pull request: https://github.com/apache/spark/pull/139#issuecomment-40014184 Should I still try and get this merged for 1.0?

[GitHub] spark pull request: SPARK-1310: Start adding k-fold cross validati...

2014-04-15 Thread holdenk
Github user holdenk commented on the pull request: https://github.com/apache/spark/pull/18#issuecomment-40523420 I merged master in and fixed the conflicts, it should be good to merge now.

[GitHub] spark pull request: SPARK-1242 Add aggregate to python rdd

2014-04-16 Thread holdenk
Github user holdenk commented on the pull request: https://github.com/apache/spark/pull/139#issuecomment-40672235 @mateiz : Do you still think we should try and get this in for 1.0 / is it in a good state to merge?

[GitHub] spark pull request: SPARK-1551: Add metrics-ganglia to the SparkBu...

2014-04-20 Thread holdenk
GitHub user holdenk opened a pull request: https://github.com/apache/spark/pull/461 SPARK-1551: Add metrics-ganglia to the SparkBuild You can merge this pull request into a Git repository by running: $ git pull https://github.com/holdenk/spark fixcodahalemetrics

[GitHub] spark pull request: SPARK-1551: Add metrics-ganglia to the SparkBu...

2014-04-21 Thread holdenk
Github user holdenk closed the pull request at: https://github.com/apache/spark/pull/461

[GitHub] spark pull request: Spark 615 map partitions with index callable f...

2014-08-23 Thread holdenk
Github user holdenk closed the pull request at: https://github.com/apache/spark/pull/16

[GitHub] spark pull request: [SPARK-2871] [PySpark] add histgram() API

2014-08-25 Thread holdenk
Github user holdenk commented on the pull request: https://github.com/apache/spark/pull/2091#issuecomment-53355210 @JoshRosen sure doing a linear scan works, the evenBuckets was because the caller knows if it's providing even buckets.
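
The trade-off mentioned in this comment can be sketched roughly as follows. This is only an illustration under assumptions: the function names and signatures are hypothetical and are not taken from the PR. When the caller knows the buckets are evenly spaced, the bucket index can be computed arithmetically; otherwise the boundaries have to be searched, here with a simple linear scan.

```python
def bucket_index_even(value, lo, hi, n_buckets):
    """Constant-time lookup; only valid when all buckets have equal width."""
    if value < lo or value > hi:
        return None  # value falls outside the histogram range
    width = (hi - lo) / n_buckets
    # The last bucket is closed on the right, so hi maps to n_buckets - 1.
    return min(int((value - lo) / width), n_buckets - 1)


def bucket_index_scan(value, boundaries):
    """Linear scan over sorted boundaries; also works for uneven buckets."""
    if value < boundaries[0] or value > boundaries[-1]:
        return None  # value falls outside the histogram range
    last = len(boundaries) - 2
    for i in range(last + 1):
        if value < boundaries[i + 1]:
            return i
    return last  # value equals the final boundary
```

Both helpers agree whenever the boundaries happen to be evenly spaced, so the even-bucket flag in this sketch only selects the cheaper path and does not change the result.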

[GitHub] spark pull request: Documentation update in addFile on how to use ...

2014-08-29 Thread holdenk
GitHub user holdenk opened a pull request: https://github.com/apache/spark/pull/2210 Documentation update in addFile on how to use SparkFiles.get Rather than specifying the path to SparkFiles we need to use the filename. You can merge this pull request into a Git repository

[GitHub] spark pull request: SPARK-3318: Documentation update in addFile on...

2014-08-30 Thread holdenk
Github user holdenk commented on the pull request: https://github.com/apache/spark/pull/2210#issuecomment-53974032 @mateiz Thanks, completely forgot to check the javadoc.

[GitHub] spark pull request: Spark-3406 add a default storage level to pyth...

2014-09-05 Thread holdenk
Github user holdenk commented on the pull request: https://github.com/apache/spark/pull/2280#issuecomment-54679472 @JoshRosen oh cool, I didn't notice that. I've updated that too.

[GitHub] spark pull request: Spark 3754 spark streaming file system api cal...

2014-11-05 Thread holdenk
Github user holdenk commented on the pull request: https://github.com/apache/spark/pull/2703#issuecomment-61897978 @tdas: Ok will do. No one should be using the old API because it isn't callable from Java.

[GitHub] spark pull request: SPARK-1242 Add aggregate to python rdd

2014-04-23 Thread holdenk
Github user holdenk commented on the pull request: https://github.com/apache/spark/pull/139#issuecomment-41229100 @mateiz I merged master in and fixed the conflicts, let me know if there is anything else you need on my end.

[GitHub] spark pull request: Spark 1857 improve error message when trying p...

2014-05-19 Thread holdenk
GitHub user holdenk opened a pull request: https://github.com/apache/spark/pull/831 Spark 1857 improve error message when trying perform a spark operation inside another spark operation This is a quick little PR that should improve error message when trying perform a spark

[GitHub] spark pull request: Spark 1857 improve error message when trying p...

2014-05-19 Thread holdenk
Github user holdenk commented on a diff in the pull request: https://github.com/apache/spark/pull/831#discussion_r12822822 --- Diff: core/src/main/scala/org/apache/spark/rdd/RDD.scala --- @@ -118,8 +118,25 @@ abstract class RDD[T: ClassTag]( // Methods and fields available

[GitHub] spark pull request: Spark 1857 improve error message when trying p...

2014-05-19 Thread holdenk
Github user holdenk commented on a diff in the pull request: https://github.com/apache/spark/pull/831#discussion_r12822810 --- Diff: core/src/main/scala/org/apache/spark/rdd/RDD.scala --- @@ -118,8 +118,25 @@ abstract class RDD[T: ClassTag]( // Methods and fields available

[GitHub] spark pull request: Spark 1857 improve error message when trying p...

2014-05-19 Thread holdenk
Github user holdenk commented on a diff in the pull request: https://github.com/apache/spark/pull/831#discussion_r12822863 --- Diff: core/src/main/scala/org/apache/spark/rdd/RDD.scala --- @@ -118,8 +118,25 @@ abstract class RDD[T: ClassTag]( // Methods and fields available

[GitHub] spark pull request: Spark 1857 improve error message when trying p...

2014-05-19 Thread holdenk
Github user holdenk commented on a diff in the pull request: https://github.com/apache/spark/pull/831#discussion_r12824928 --- Diff: core/src/main/scala/org/apache/spark/rdd/RDD.scala --- @@ -118,8 +118,25 @@ abstract class RDD[T: ClassTag]( // Methods and fields available

[GitHub] spark pull request: Spark 1857 improve error message when trying p...

2014-06-04 Thread holdenk
Github user holdenk commented on a diff in the pull request: https://github.com/apache/spark/pull/831#discussion_r13413232 --- Diff: core/src/main/scala/org/apache/spark/rdd/RDD.scala --- @@ -118,8 +118,25 @@ abstract class RDD[T: ClassTag]( // Methods and fields available

[GitHub] spark pull request: SPARK-4767: Add support for launching in a spe...

2014-12-05 Thread holdenk
GitHub user holdenk opened a pull request: https://github.com/apache/spark/pull/3623 SPARK-4767: Add support for launching in a specified placement group to spark_ec2 Placement groups are cool and all the cool kids are using them. Let's add support for them to spark_ec2.py because

[GitHub] spark pull request: [SPARK-4877] Allow user first classes to exten...

2014-12-19 Thread holdenk
Github user holdenk commented on the pull request: https://github.com/apache/spark/pull/3725#issuecomment-67676131 I can take a look tonight :) On Thursday, December 18, 2014, Stephen Haberman notificati...@github.com wrote: @holdenk https://github.com/holdenk

[GitHub] spark pull request: [SPARK-4877] Allow user first classes to exten...

2014-12-19 Thread holdenk
Github user holdenk commented on a diff in the pull request: https://github.com/apache/spark/pull/3725#discussion_r22125536 --- Diff: core/src/main/scala/org/apache/spark/executor/ExecutorURLClassLoader.scala --- @@ -39,7 +39,17 @@ private[spark] class ChildExecutorURLClassLoader

[GitHub] spark pull request: [SPARK-4877] Allow user first classes to exten...

2014-12-19 Thread holdenk
Github user holdenk commented on a diff in the pull request: https://github.com/apache/spark/pull/3725#discussion_r22125664 --- Diff: core/src/test/scala/org/apache/spark/executor/ExecutorURLClassLoaderSuite.scala --- @@ -37,6 +37,8 @@ class ExecutorURLClassLoaderSuite extends

[GitHub] spark pull request: [SPARK-4877] Allow user first classes to exten...

2014-12-19 Thread holdenk
Github user holdenk commented on the pull request: https://github.com/apache/spark/pull/3725#issuecomment-67690369 This looks good to me :)

[GitHub] spark pull request: [SPARK-4877] Allow user first classes to exten...

2014-12-19 Thread holdenk
Github user holdenk commented on a diff in the pull request: https://github.com/apache/spark/pull/3725#discussion_r22126832 --- Diff: core/src/test/scala/org/apache/spark/executor/ExecutorURLClassLoaderSuite.scala --- @@ -37,6 +37,8 @@ class ExecutorURLClassLoaderSuite extends

[GitHub] spark pull request: Spark 3754 spark streaming file system api cal...

2014-12-30 Thread holdenk
Github user holdenk commented on the pull request: https://github.com/apache/spark/pull/2703#issuecomment-68415747 I'll take a look this weekend when I have some time.

[GitHub] spark pull request: [SPARK-4877] Allow user first classes to exten...

2015-01-25 Thread holdenk
Github user holdenk commented on the pull request: https://github.com/apache/spark/pull/3725#issuecomment-71386034 @JoshRosen I don't have commit privileges so someone else will need to commit this.

[GitHub] spark pull request: Spark 3754 spark streaming file system api cal...

2015-02-10 Thread holdenk
Github user holdenk closed the pull request at: https://github.com/apache/spark/pull/2703

[GitHub] spark pull request: [SPARK-3444] Provide an easy way to change log...

2015-05-01 Thread holdenk
Github user holdenk commented on a diff in the pull request: https://github.com/apache/spark/pull/5791#discussion_r29492195 --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala --- @@ -343,6 +343,15 @@ class SparkContext(config: SparkConf) extends Logging

[GitHub] spark pull request: [SPARK-3444] Provide an easy way to change log...

2015-04-29 Thread holdenk
GitHub user holdenk opened a pull request: https://github.com/apache/spark/pull/5791 [SPARK-3444] Provide an easy way to change log level Add support for changing the log level at run time through the SparkContext. Based on an earlier PR, #2433 includes CR feedback from @pwendel
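
The pattern discussed across this PR and the earlier #2433 thread (the context converts a user-supplied level name, then calls a utility method that applies it) can be sketched as below. This is an analogy only: Spark does its logging on the JVM, whereas this sketch uses Python's standard `logging` module, and the `Context` class, `LEVELS` table, and `set_log_level_util` function are hypothetical names chosen for illustration.

```python
import logging

# Hypothetical mapping from user-facing names to stdlib logging levels.
LEVELS = {
    "DEBUG": logging.DEBUG,
    "INFO": logging.INFO,
    "WARN": logging.WARNING,
    "ERROR": logging.ERROR,
}


def set_log_level_util(level):
    """Utility step: apply an already-converted level to the root logger."""
    logging.getLogger().setLevel(level)


class Context:
    """Hypothetical stand-in for a context object; not Spark's SparkContext."""

    def set_log_level(self, level_name):
        # Conversion step: turn the string into a level, rejecting typos early.
        key = level_name.upper()
        if key not in LEVELS:
            raise ValueError("Unknown log level: %s" % level_name)
        set_log_level_util(LEVELS[key])
```

Keeping the string conversion in the context and the actual change in a utility means that a switch to a different logging backend would, in this sketch, only touch the utility function.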

[GitHub] spark pull request: [Spark-7511][MLLIB] pyspark ml seed param shou...

2015-05-16 Thread holdenk
Github user holdenk commented on the pull request: https://github.com/apache/spark/pull/6139#issuecomment-102672021 Sure I'll give that a shot tomorrow :)

[GitHub] spark pull request: [Spark-7511][MLLIB] pyspark ml seed param shou...

2015-05-15 Thread holdenk
Github user holdenk commented on the pull request: https://github.com/apache/spark/pull/6139#issuecomment-102297732 Cool, let me know if I should do something else with the RandomForestRegressor :)

[GitHub] spark pull request: [Spark-7511] pyspark ml seed param should be r...

2015-05-14 Thread holdenk
Github user holdenk commented on a diff in the pull request: https://github.com/apache/spark/pull/6139#discussion_r30362255 --- Diff: python/pyspark/ml/tests.py --- @@ -112,14 +112,15 @@ def test_pipeline(self): self.assertEqual(6, dataset.index) -class

[GitHub] spark pull request: [Spark-7511] pyspark ml seed param should be r...

2015-05-14 Thread holdenk
Github user holdenk commented on a diff in the pull request: https://github.com/apache/spark/pull/6139#discussion_r30362273 --- Diff: python/pyspark/ml/param/_shared_params_code_gen.py --- @@ -115,7 +118,7 @@ def get$Name(self): (outputCol, output column name, None

[GitHub] spark pull request: [Spark-7511] pyspark ml seed param should be r...

2015-05-14 Thread holdenk
Github user holdenk commented on a diff in the pull request: https://github.com/apache/spark/pull/6139#discussion_r30362380 --- Diff: python/pyspark/ml/tests.py --- @@ -153,10 +155,26 @@ def test_params(self): with self.assertRaises(KeyError

[GitHub] spark pull request: [Spark-7511][MLLIB] pyspark ml seed param shou...

2015-05-14 Thread holdenk
Github user holdenk commented on a diff in the pull request: https://github.com/apache/spark/pull/6139#discussion_r30369727 --- Diff: python/pyspark/ml/feature.py --- @@ -790,7 +790,7 @@ class Word2Vec(JavaEstimator, HasStepSize, HasMaxIter, HasSeed, HasInputCol, Has

[GitHub] spark pull request: [Spark-7511][MLLIB] pyspark ml seed param shou...

2015-05-14 Thread holdenk
Github user holdenk commented on the pull request: https://github.com/apache/spark/pull/6139#issuecomment-102195718 Seems like it's failing from a git issue. @AmplabJenkins retest this please.

[GitHub] spark pull request: [Spark-7511] pyspark ml seed param should be r...

2015-05-14 Thread holdenk
Github user holdenk commented on the pull request: https://github.com/apache/spark/pull/6139#issuecomment-102168506 @mengxr Thanks for reviewing this! I'll update the PR title :)

[GitHub] spark pull request: [Spark-7511][MLLIB] pyspark ml seed param shou...

2015-05-15 Thread holdenk
Github user holdenk commented on the pull request: https://github.com/apache/spark/pull/6139#issuecomment-102296436 @jkbradley: I think it's just that the seed wasn't specified in that test. I've gone ahead and specified the seed for that test and we should be good to go (works

[GitHub] spark pull request: [Spark-7511] pyspark ml seed param should be r...

2015-05-13 Thread holdenk
GitHub user holdenk opened a pull request: https://github.com/apache/spark/pull/6139 [Spark-7511] pyspark ml seed param should be random by default or 42 is quite funny but not very random You can merge this pull request into a Git repository by running: $ git pull https
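
The motivation in this PR title can be illustrated with a small sketch. It is an assumption-laden illustration, not the PR's code: the `HasSeedSketch` class and its `sample` method are hypothetical, and the way the PR actually picks its default is not shown in this snippet. The idea is that a hard-coded default such as 42 makes every "random" run identical, whereas drawing a default seed keeps runs independent unless the caller pins one.

```python
import random


class HasSeedSketch:
    """Hypothetical mixin; illustrates the idea, not PySpark's actual HasSeed."""

    def __init__(self, seed=None):
        # With no explicit seed, draw one instead of falling back to a constant.
        self.seed = random.randrange(2 ** 31) if seed is None else seed

    def sample(self, n):
        # A private generator keeps results reproducible for a given seed.
        rng = random.Random(self.seed)
        return [rng.random() for _ in range(n)]
```

Tests that rely on a particular outcome would then pass the seed explicitly, which matches the later comment in this thread about specifying the seed in a failing test.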

[GitHub] spark pull request: [Spark-7511] pyspark ml seed param should be r...

2015-05-14 Thread holdenk
Github user holdenk commented on the pull request: https://github.com/apache/spark/pull/6139#issuecomment-101950118 @JoshRosen ![image](https://cloud.githubusercontent.com/assets/59893/7627734/3cd33cae-f9cf-11e4-87a3-a482caba62ed.png) :)

[GitHub] spark pull request: [Spark-7511][MLLIB] pyspark ml seed param shou...

2015-05-15 Thread holdenk
Github user holdenk commented on the pull request: https://github.com/apache/spark/pull/6139#issuecomment-102472670 I suspect the streaming test failure is unrelated. @AmplabJenkins retest this please.

[GitHub] spark pull request: Add a startTime property to match the correspo...

2015-05-19 Thread holdenk
GitHub user holdenk opened a pull request: https://github.com/apache/spark/pull/6275 Add a startTime property to match the corresponding one in Scala You can merge this pull request into a Git repository by running: $ git pull https://github.com/holdenk/spark SPARK-771

[GitHub] spark pull request: [Spark-7511][MLLIB] pyspark ml seed param shou...

2015-05-19 Thread holdenk
Github user holdenk commented on the pull request: https://github.com/apache/spark/pull/6139#issuecomment-103710305 Sure thing

[GitHub] spark pull request: [Spark-7511][MLLIB] pyspark ml seed param shou...

2015-05-20 Thread holdenk
Github user holdenk commented on the pull request: https://github.com/apache/spark/pull/6139#issuecomment-103969539 Since we didn't change any Scala code, failing the Scala style tests is unexpected. @AmplabJenkins retest this please.

[GitHub] spark pull request: [SPARK-7711] Add a startTime property to match...

2015-05-20 Thread holdenk
Github user holdenk commented on the pull request: https://github.com/apache/spark/pull/6275#issuecomment-103969281 jenkins retest this please

[GitHub] spark pull request: [Spark-7780][MLLIB] Intercept in logisticregre...

2015-06-08 Thread holdenk
Github user holdenk commented on the pull request: https://github.com/apache/spark/pull/6386#issuecomment-110124971 So I tried to run this over the weekend with a scaling factor of 0.1 and all of the tests (both on the old and the new branches) failed with OOM. I've decreased

[GitHub] spark pull request: [Spark-7780][MLLIB] Intercept in logisticregre...

2015-06-08 Thread holdenk
Github user holdenk commented on the pull request: https://github.com/apache/spark/pull/6386#issuecomment-110179327 In config.py. I also ran into some failures because the auto build of the mllib tests in spark-perf doesn't seem to pass down the version number. I'm re-doing

[GitHub] spark pull request: [Spark-7780][MLLIB] Intercept in logisticregre...

2015-06-09 Thread holdenk
Github user holdenk commented on the pull request: https://github.com/apache/spark/pull/6386#issuecomment-110257890 Ok, looks like it ran ok, I'll do another run against master. There's a bunch of FAILs, but looking at the spark-perf issues it seems like there are expected failures

[GitHub] spark pull request: [Spark-7780][MLLIB] Intercept in logisticregre...

2015-06-02 Thread holdenk
Github user holdenk commented on the pull request: https://github.com/apache/spark/pull/6386#issuecomment-108069638 Makes sense, less of a rush then. I'll start on the follow up PR that adds the initial weights setting trait and uses it :)

[GitHub] spark pull request: [Spark-7780][MLLIB] Intercept in logisticregre...

2015-06-02 Thread holdenk
Github user holdenk commented on the pull request: https://github.com/apache/spark/pull/6386#issuecomment-108054234 @dbtsai : would it be reasonable to just run the tests in databricks/spark-perf#72 instead of waiting on getting it merged?

[GitHub] spark pull request: [Spark-7780][MLLIB] Intercept in logisticregre...

2015-06-02 Thread holdenk
Github user holdenk commented on a diff in the pull request: https://github.com/apache/spark/pull/6386#discussion_r31585441 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/classification/LogisticRegression.scala --- @@ -363,4 +371,54 @@ class LogisticRegressionWithLBFGS

[GitHub] spark pull request: [SPARK-8506] Add pakages to R context created ...

2015-06-21 Thread holdenk
Github user holdenk commented on the pull request: https://github.com/apache/spark/pull/6928#issuecomment-113981439 jenkins, retest this please.

[GitHub] spark pull request: [SPARK-7888] Be able to disable intercept in l...

2015-06-21 Thread holdenk
GitHub user holdenk opened a pull request: https://github.com/apache/spark/pull/6927 [SPARK-7888] Be able to disable intercept in linear regression in ml package You can merge this pull request into a Git repository by running: $ git pull https://github.com/holdenk/spark

[GitHub] spark pull request: [SPARK-8506] Add pakages to R context created ...

2015-06-22 Thread holdenk
Github user holdenk commented on the pull request: https://github.com/apache/spark/pull/6928#issuecomment-114014173 Well, since SparkSubmit should already be tested, could we just test the param code gen?

[GitHub] spark pull request: [SPARK-8506] Add pakages to R context created ...

2015-06-21 Thread holdenk
GitHub user holdenk opened a pull request: https://github.com/apache/spark/pull/6928 [SPARK-8506] Add pakages to R context created through init. You can merge this pull request into a Git repository by running: $ git pull https://github.com/holdenk/spark SPARK-8506-sparkr

[GitHub] spark pull request: [SPARK-8498][TUNGSTEN] fix npe in errorhandlin...

2015-06-20 Thread holdenk
Github user holdenk commented on the pull request: https://github.com/apache/spark/pull/6918#issuecomment-113727924 Sounds like a good plan.

[GitHub] spark pull request: [SPARK-8498][TUNGSTEN] fix npe in errorhandlin...

2015-06-20 Thread holdenk
Github user holdenk commented on the pull request: https://github.com/apache/spark/pull/6918#issuecomment-113728593 Would it not make more sense to just catch/rethrow throwables? (That way, if we fail during cleanup, we can pass back an exception indicating that cleanup failed

[GitHub] spark pull request: [SPARK-7888] Be able to disable intercept in l...

2015-06-22 Thread holdenk
Github user holdenk commented on a diff in the pull request: https://github.com/apache/spark/pull/6927#discussion_r32909797 --- Diff: mllib/src/test/scala/org/apache/spark/ml/regression/LinearRegressionSuite.scala --- @@ -34,14 +35,20 @@ class LinearRegressionSuite extends

[GitHub] spark pull request: [SPARK-8522][SPARK-8613][MLLIB][TRIVIAL] add p...

2015-06-25 Thread holdenk
GitHub user holdenk opened a pull request: https://github.com/apache/spark/pull/7024 [SPARK-8522][SPARK-8613][MLLIB][TRIVIAL] add param to disable linear feature scaling Add a param to disable linear feature scaling (to be implemented later in linear logistic regression). Done

[GitHub] spark pull request: [SPARK-8522][SPARK-8613][MLLIB][TRIVIAL] add p...

2015-06-26 Thread holdenk
Github user holdenk commented on the pull request: https://github.com/apache/spark/pull/7024#issuecomment-115564693 Oh yah this has both the subtask and the parent task jira, I'll switch it to just the subtask jira.

[GitHub] spark pull request: [SPARKR] [DOCS] Add documentation for RStudio ...

2015-06-23 Thread holdenk
Github user holdenk commented on the pull request: https://github.com/apache/spark/pull/6916#issuecomment-114376328 Sure, I can do that.

[GitHub] spark pull request: [SPARK-7888] Be able to disable intercept in l...

2015-06-23 Thread holdenk
Github user holdenk commented on a diff in the pull request: https://github.com/apache/spark/pull/6927#discussion_r33015319 --- Diff: mllib/src/test/scala/org/apache/spark/ml/regression/LinearRegressionSuite.scala --- @@ -34,14 +35,24 @@ class LinearRegressionSuite extends

[GitHub] spark pull request: [SPARK-8601][ML][WIP] Add an option to disable...

2015-06-26 Thread holdenk
GitHub user holdenk opened a pull request: https://github.com/apache/spark/pull/7037 [SPARK-8601][ML][WIP] Add an option to disable standardization for linear regression You can merge this pull request into a Git repository by running: $ git pull https://github.com/holdenk

[GitHub] spark pull request: [Spark-7781][MLLIB] gradient boosted trees.tra...

2015-06-15 Thread holdenk
Github user holdenk commented on the pull request: https://github.com/apache/spark/pull/6331#issuecomment-112265955 cc @jkbradley
