Github user dbtsai commented on the pull request:
https://github.com/apache/spark/pull/840#issuecomment-57439459
@debasish83 and @codedeft The weighted method for OWLQN in breeze is merged
https://github.com/scalanlp/breeze/commit/2570911026aa05aa1908ccf7370bc19cd8808a4c
I
Github user dbtsai commented on the pull request:
https://github.com/apache/spark/pull/2030#issuecomment-58183559
We had a build against the Spark master on Oct 2, and when we ran our
application with around 600GB of data, we got the following exception. Does this
PR fix this issue which
GitHub user dbtsai opened a pull request:
https://github.com/apache/spark/pull/2693
[SPARK-3832][MLlib] Upgrade Breeze dependency to 0.10
In Breeze 0.10, the L1regParam can be configured through an anonymous function
in OWLQN, and each component can be penalized differently
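The idea of penalizing each component differently can be sketched in plain Scala. This is only an illustration of the concept and does not call Breeze's OWLQN; the object name `WeightedL1`, the helper `penalty`, and the choice to leave the last component (treated here as the intercept) unpenalized are all assumptions made for the example.

```scala
object WeightedL1 {
  // Hypothetical helper (not part of Breeze): L1 penalty in which
  // component i is weighted by regFn(i).
  def penalty(weights: Array[Double], regFn: Int => Double): Double =
    weights.indices.map(i => regFn(i) * math.abs(weights(i))).sum

  def main(args: Array[String]): Unit = {
    // Assumption for this sketch: the last entry plays the role of the intercept.
    val weights = Array(0.5, -2.0, 3.0)
    val interceptIndex = weights.length - 1
    // Anonymous function: penalize every component by 0.1 except the intercept.
    val regFn: Int => Double = i => if (i == interceptIndex) 0.0 else 0.1
    println(penalty(weights, regFn))
  }
}
```

Passing the regularization strength as a function of the index, rather than as a single number, is what allows the intercept to be excluded from the penalty.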
Github user dbtsai commented on the pull request:
https://github.com/apache/spark/pull/2030#issuecomment-58214186
I thought it was a closed issue, so I moved my comment to JIRA. I ran into
this issue in spark-shell, not in a standalone application; does SPARK-3762
apply
Github user dbtsai commented on the pull request:
https://github.com/apache/spark/pull/2693#issuecomment-58276308
@dlwh David, do you know whether there is any dependency change in breeze-0.10,
and whether it is compatible with both Scala 2.10 and 2.11? Thanks.
---
If your project is set up
GitHub user dbtsai opened a pull request:
https://github.com/apache/spark/pull/1518
[SPARK-2505][MLlib] Weighted Regularizer for Generalized Linear Model
(Note: This is not ready to be merged. It needs documentation, and we need to
make sure it's backward compatible with the Spark 1.0 APIs
Github user dbtsai commented on the pull request:
https://github.com/apache/spark/pull/1379#issuecomment-49682150
I think it fails because the Apache license is not in the test file. As you
suggested, I'll change it to be generated at runtime. Would like to know the
general feedback
Github user dbtsai commented on the pull request:
https://github.com/apache/spark/pull/1425#issuecomment-49682436
`!~==` will be used in the test since `!(a~==b)` will not work: `(a~==b)`
does not return false but throws an exception carrying the message. I will
replace
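The reason a dedicated negated operator is needed can be sketched as follows. The names `approxEqOrThrow` and `notApproxEqOrThrow` are hypothetical stand-ins for `~==` and `!~==`; this is not the code from the pull request, only an illustration of a comparator that throws instead of returning false.

```scala
object ThrowingCompare {
  // Hypothetical comparator in the spirit of `~==`: returns true on success
  // and throws with a descriptive message on failure, so it never returns false.
  def approxEqOrThrow(a: Double, b: Double, eps: Double): Boolean = {
    if (math.abs(a - b) >= eps) {
      throw new AssertionError(s"Expected $a and $b to be within $eps")
    }
    true
  }

  // Hypothetical negated counterpart in the spirit of `!~==`.
  def notApproxEqOrThrow(a: Double, b: Double, eps: Double): Boolean = {
    if (math.abs(a - b) < eps) {
      throw new AssertionError(s"Did not expect $a and $b to be within $eps")
    }
    true
  }
}
```

With these definitions, `!ThrowingCompare.approxEqOrThrow(1.0, 2.0, 1e-6)` throws before the negation is ever applied, whereas `ThrowingCompare.notApproxEqOrThrow(1.0, 2.0, 1e-6)` returns true.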
Github user dbtsai commented on the pull request:
https://github.com/apache/spark/pull/1425#issuecomment-49954543
@srowen @mengxr and @dorx
Based on our discussion, I've implemented two different APIs for relative
error, and absolute error. It makes sense that test writers
Github user dbtsai commented on the pull request:
https://github.com/apache/spark/pull/1576#issuecomment-50057950
@mengxr Feel free to merge this one first. After you merge, I'll rebase
#1425 against current master, and address the conflicts.
Github user dbtsai commented on the pull request:
https://github.com/apache/spark/pull/1425#issuecomment-50064963
@mengxr `%+-` is used as an operator to indicate this is relative error.
Users can write `assert(a ~== b %+- 1E-10)` for relative error, and `assert(a
~== b +- 1E-10
Github user dbtsai commented on the pull request:
https://github.com/apache/spark/pull/1425#issuecomment-50081864
@mengxr I just rebased against master, and it passes the test. Depending on
whether we want to use `absErr`/`relErr`, `+-`/`%+-` or both, I can do further
modification
Github user dbtsai commented on a diff in the pull request:
https://github.com/apache/spark/pull/1425#discussion_r15443103
--- Diff:
mllib/src/test/scala/org/apache/spark/mllib/clustering/KMeansSuite.scala ---
@@ -40,27 +41,51 @@ class KMeansSuite extends FunSuite
Github user dbtsai commented on the pull request:
https://github.com/apache/spark/pull/1425#issuecomment-50293096
@mengxr Resolved all the conflicts after rebasing, and all the unit tests
pass. Thanks.
Github user dbtsai commented on the pull request:
https://github.com/apache/spark/pull/1518#issuecomment-50663418
I tried to make the bias really big to make the intercept smaller to avoid
being regularized. The result is still quite different from R, and very
sensitive
Github user dbtsai commented on the pull request:
https://github.com/apache/spark/pull/1379#issuecomment-50982699
@mengxr Is there any problem with asfgit? This is not finished yet, so why
did asfgit say it's merged into apache:master?
Github user dbtsai commented on a diff in the pull request:
https://github.com/apache/spark/pull/1207#discussion_r15733217
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/feature/Normalizer.scala ---
@@ -0,0 +1,58 @@
+/*
+ * Licensed to the Apache Software Foundation
Github user dbtsai commented on a diff in the pull request:
https://github.com/apache/spark/pull/1207#discussion_r15733221
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/feature/Normalizer.scala ---
@@ -0,0 +1,58 @@
+/*
+ * Licensed to the Apache Software Foundation
Github user dbtsai commented on a diff in the pull request:
https://github.com/apache/spark/pull/1207#discussion_r15733244
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/feature/StandardScaler.scala ---
@@ -0,0 +1,94 @@
+/*
+ * Licensed to the Apache Software
Github user dbtsai commented on a diff in the pull request:
https://github.com/apache/spark/pull/1207#discussion_r15733248
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/feature/VectorTransformer.scala ---
@@ -0,0 +1,47 @@
+/*
+ * Licensed to the Apache Software
Github user dbtsai commented on a diff in the pull request:
https://github.com/apache/spark/pull/1207#discussion_r15738936
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/feature/Normalizer.scala ---
@@ -0,0 +1,108 @@
+/*
+ * Licensed to the Apache Software
Github user dbtsai commented on a diff in the pull request:
https://github.com/apache/spark/pull/1207#discussion_r15740021
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/feature/Normalizer.scala ---
@@ -0,0 +1,77 @@
+/*
+ * Licensed to the Apache Software Foundation
Github user dbtsai commented on a diff in the pull request:
https://github.com/apache/spark/pull/1207#discussion_r15740240
--- Diff:
mllib/src/test/scala/org/apache/spark/mllib/feature/StandardScalerSuite.scala
---
@@ -0,0 +1,208 @@
+/*
+ * Licensed to the Apache Software
Github user dbtsai commented on the pull request:
https://github.com/apache/spark/pull/1518#issuecomment-51151346
It's too late to get into 1.1, but I'll try to make it happen in 1.2. We'll
use this in Alpine's implementation first.
GitHub user dbtsai opened a pull request:
https://github.com/apache/spark/pull/1796
[MLlib] Use this.type as return type in k-means' builder pattern
to ensure that the return object is itself.
You can merge this pull request into a Git repository by running:
$ git pull https
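A minimal sketch of why `this.type` helps in a builder pattern is given below. The classes `BaseParams` and `ExtendedParams` are hypothetical and are not the k-means classes touched by the pull request; they only show that a setter declared with `this.type` returns the receiver itself and keeps its static type through a chain of calls.

```scala
object BuilderDemo {
  // Hypothetical base class; setters return this.type so the static type of
  // the receiver is preserved through a chain of calls.
  class BaseParams {
    protected var k: Int = 2
    def setK(value: Int): this.type = { k = value; this }
    def getK: Int = k
  }

  // Hypothetical subclass adding one more setter.
  class ExtendedParams extends BaseParams {
    private var runs: Int = 1
    def setRuns(value: Int): this.type = { runs = value; this }
    def getRuns: Int = runs
  }

  def main(args: Array[String]): Unit = {
    // setK is declared on BaseParams, yet the chain still has type
    // ExtendedParams, so setRuns can be called afterwards.
    val p = new ExtendedParams().setK(5).setRuns(3)
    println(s"k=${p.getK}, runs=${p.getRuns}")
  }
}
```

If `setK` were declared to return `BaseParams` instead, the call to `setRuns` in the chain above would not compile.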
Github user dbtsai commented on a diff in the pull request:
https://github.com/apache/spark/pull/1814#discussion_r15908219
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/feature/StandardScaler.scala ---
@@ -35,38 +35,47 @@ import org.apache.spark.rdd.RDD
* @param
Github user dbtsai commented on a diff in the pull request:
https://github.com/apache/spark/pull/1814#discussion_r15908318
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/feature/StandardScaler.scala ---
@@ -35,38 +35,47 @@ import org.apache.spark.rdd.RDD
* @param
Github user dbtsai commented on a diff in the pull request:
https://github.com/apache/spark/pull/1814#discussion_r15908504
--- Diff: mllib/src/main/scala/org/apache/spark/mllib/feature/IDF.scala ---
@@ -177,18 +115,72 @@ private object IDF {
private def isEmpty: Boolean
Github user dbtsai commented on the pull request:
https://github.com/apache/spark/pull/1814#issuecomment-51511617
LGTM. Merged into both master and branch-1.1. Thanks!
GitHub user dbtsai opened a pull request:
https://github.com/apache/spark/pull/1862
[SPARK-2934][MLlib] Adding LogisticRegressionWithLBFGS Interface
for training with LBFGS Optimizer which will converge faster than SGD.
You can merge this pull request into a Git repository
Github user dbtsai commented on a diff in the pull request:
https://github.com/apache/spark/pull/1862#discussion_r16022431
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/classification/LogisticRegression.scala
---
@@ -188,3 +188,98 @@ object LogisticRegressionWithSGD
Github user dbtsai commented on a diff in the pull request:
https://github.com/apache/spark/pull/1862#discussion_r16023077
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/classification/LogisticRegression.scala
---
@@ -188,3 +188,54 @@ object LogisticRegressionWithSGD
Github user dbtsai commented on a diff in the pull request:
https://github.com/apache/spark/pull/1862#discussion_r16023299
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/classification/LogisticRegression.scala
---
@@ -188,3 +188,54 @@ object LogisticRegressionWithSGD
GitHub user dbtsai opened a pull request:
https://github.com/apache/spark/pull/1897
[SPARK-2979][MLlib] Improve the convergence rate by minimizing the condition
number
Scaling to minimize the condition number:
During the optimization process, the convergence (rate) depends
Github user dbtsai commented on a diff in the pull request:
https://github.com/apache/spark/pull/1897#discussion_r16153527
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/regression/GeneralizedLinearAlgorithm.scala
---
@@ -137,11 +154,45 @@ abstract class
GitHub user dbtsai opened a pull request:
https://github.com/apache/spark/pull/2709
Minor change in the comment of spark-defaults.conf.template
spark-defaults.conf is used in spark-shell as well, and this PR added this
into the comment.
You can merge this pull request into a Git
Github user dbtsai commented on the pull request:
https://github.com/apache/spark/pull/2712#issuecomment-58361701
Jenkins, please start the test.
Github user dbtsai commented on the pull request:
https://github.com/apache/spark/pull/2718#issuecomment-58435304
LGTM Thanks.
Github user dbtsai commented on the pull request:
https://github.com/apache/spark/pull/2712#issuecomment-58629065
Jenkins, test this please.
Github user dbtsai commented on the pull request:
https://github.com/apache/spark/pull/2712#issuecomment-58732030
It's failing at FlumeStreamSuite.scala:109 which seems to be unrelated to
this patch.
Github user dbtsai commented on the pull request:
https://github.com/apache/spark/pull/2709#issuecomment-59667207
@andrewor14 Sorry for the late reply; I was on vacation in Europe last
week. I can continue working on this after I finish my talk at IOTA conf tomorrow.
Github user dbtsai commented on the pull request:
https://github.com/apache/spark/pull/2868#issuecomment-59871504
Jenkins, please start the test!
Github user dbtsai commented on the pull request:
https://github.com/apache/spark/pull/1379#issuecomment-60813678
@BigCrunsh I'm working on this. Let's see if we can merge in Spark 1.2
GitHub user dbtsai opened a pull request:
https://github.com/apache/spark/pull/2992
[SPARK-4129][MLlib] Performance tuning in MultivariateOnlineSummarizer
In MultivariateOnlineSummarizer, breeze's activeIterator is used to loop
through the nonZero elements in the vector. However
Github user dbtsai commented on the pull request:
https://github.com/apache/spark/pull/1013#issuecomment-45551414
Tested in PivotalHD 1.1 Yarn 4 node cluster. With --addjars
file:///somePath/to/jar, launching spark application works.
Github user dbtsai commented on a diff in the pull request:
https://github.com/apache/spark/pull/1013#discussion_r13573544
--- Diff:
yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/Client.scala ---
@@ -507,12 +508,19 @@ object Client {
Apps.addToEnvironment(env
GitHub user dbtsai opened a pull request:
https://github.com/apache/spark/pull/1027
Make sure that empty string is filtered out when we get the secondary jars
from conf
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/dbtsai
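The need for the filter can be sketched in plain Scala: splitting an empty string on a comma yields a one-element array containing the empty string, not an empty array. The object `SecondaryJars` and its `parse` method are hypothetical names for this illustration and are not the code from the pull request.

```scala
object SecondaryJars {
  // Hypothetical parser for a comma-separated jar list read from configuration.
  // "".split(",") yields Array(""), so empty entries must be filtered explicitly.
  def parse(confValue: String): Seq[String] =
    confValue.split(",").map(_.trim).filter(_.nonEmpty).toSeq
}
```

For example, `SecondaryJars.parse("")` is empty, and `SecondaryJars.parse("a.jar,,b.jar")` contains only `a.jar` and `b.jar`.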
Github user dbtsai commented on a diff in the pull request:
https://github.com/apache/spark/pull/490#discussion_r13624385
--- Diff:
yarn/common/src/main/scala/org/apache/spark/deploy/yarn/ClientBase.scala ---
@@ -95,15 +96,18 @@ trait ClientBase extends Logging
Github user dbtsai commented on a diff in the pull request:
https://github.com/apache/spark/pull/490#discussion_r13624580
--- Diff:
yarn/common/src/main/scala/org/apache/spark/deploy/yarn/ClientBase.scala ---
@@ -95,15 +96,18 @@ trait ClientBase extends Logging
Github user dbtsai commented on a diff in the pull request:
https://github.com/apache/spark/pull/490#discussion_r13624615
--- Diff:
yarn/common/src/main/scala/org/apache/spark/deploy/yarn/ClientBase.scala ---
@@ -95,15 +96,18 @@ trait ClientBase extends Logging
Github user dbtsai commented on the pull request:
https://github.com/apache/spark/pull/490#issuecomment-45835283
@mengxr Do you think it's in good shape now? This is the only issue
blocking us from using vanilla Spark. Thanks.
Github user dbtsai commented on a diff in the pull request:
https://github.com/apache/spark/pull/1104#discussion_r13897737
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/optimization/LBFGS.scala ---
@@ -38,10 +38,10 @@ import org.apache.spark.mllib.linalg.{Vectors, Vector
Github user dbtsai commented on a diff in the pull request:
https://github.com/apache/spark/pull/1104#discussion_r13897825
--- Diff:
mllib/src/test/scala/org/apache/spark/mllib/optimization/LBFGSSuite.scala ---
@@ -195,4 +195,39 @@ class LBFGSSuite extends FunSuite
Github user dbtsai commented on the pull request:
https://github.com/apache/spark/pull/1104#issuecomment-46393840
I think there are two different ways to access the API for legacy reasons. As
far as I know, @mengxr is working on consolidating the interface. He probably
can talk about
Github user dbtsai commented on a diff in the pull request:
https://github.com/apache/spark/pull/1104#discussion_r13905548
--- Diff:
mllib/src/test/scala/org/apache/spark/mllib/optimization/LBFGSSuite.scala ---
@@ -195,4 +195,39 @@ class LBFGSSuite extends FunSuite
Github user dbtsai commented on the pull request:
https://github.com/apache/spark/pull/1104#issuecomment-46412293
I think it will be a problem for MIMA to change the signature.
GitHub user dbtsai opened a pull request:
https://github.com/apache/spark/pull/1207
SPARK-2272 [MLlib] Feature scaling which standardizes the range of
independent variables or features of data
Feature scaling is a method used to standardize the range of independent
variables
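One common form of feature scaling, standardizing each column to zero mean and unit sample standard deviation, can be sketched as below. This is an assumption about what "standardize" means here and is not the StandardScaler implementation from the pull request; the object name `Standardize` and the handling of constant columns are choices made for the example.

```scala
object Standardize {
  // Rescale each column to zero mean and unit (sample) standard deviation.
  // Columns with zero variance are mapped to 0.0 to avoid dividing by zero
  // (a choice made for this sketch).
  def columns(rows: Array[Array[Double]]): Array[Array[Double]] = {
    require(rows.length > 1, "need at least two rows")
    val n = rows.length
    val dim = rows.head.length
    val means = Array.tabulate(dim)(j => rows.map(_(j)).sum / n)
    val stds = Array.tabulate(dim) { j =>
      math.sqrt(rows.map(r => math.pow(r(j) - means(j), 2)).sum / (n - 1))
    }
    rows.map { r =>
      Array.tabulate(dim)(j => if (stds(j) == 0.0) 0.0 else (r(j) - means(j)) / stds(j))
    }
  }
}
```

After this transformation every non-constant column has the same scale, regardless of the units of the original feature.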
GitHub user dbtsai opened a pull request:
https://github.com/apache/spark/pull/1215
SPARK-2281 [MLlib] Simplify the duplicate code in Gradient.scala
The Gradient.compute which returns new tuple of (gradient: Vector, loss:
Double) can be constructed by in-place version
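The pattern of building the allocating version on top of the in-place version can be sketched as follows. The least-squares loss, the object name `GradientSketch`, and the method names are assumptions for this illustration; they are not the signatures in Gradient.scala.

```scala
object GradientSketch {
  // In-place version: adds the gradient for one example into cumGradient
  // and returns the loss for that example (here 0.5 * squared error).
  def computeInPlace(x: Array[Double], y: Double, w: Array[Double],
                     cumGradient: Array[Double]): Double = {
    val diff = x.zip(w).map { case (xi, wi) => xi * wi }.sum - y
    var i = 0
    while (i < x.length) {
      cumGradient(i) += diff * x(i)
      i += 1
    }
    0.5 * diff * diff
  }

  // Allocating version, built on top of the in-place one so the math
  // lives in a single place.
  def compute(x: Array[Double], y: Double, w: Array[Double]): (Array[Double], Double) = {
    val gradient = new Array[Double](x.length) // initialized to zeros
    val loss = computeInPlace(x, y, w, gradient)
    (gradient, loss)
  }
}
```

Because the tuple-returning method only allocates a zero vector and delegates, the gradient formula is not duplicated between the two entry points.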
Github user dbtsai commented on the pull request:
https://github.com/apache/spark/pull/1099#issuecomment-47250277
It seems that Jenkins is missing the Python runtime.
Github user dbtsai commented on the pull request:
https://github.com/apache/spark/pull/1110#issuecomment-47683286
We benchmarked treeReduce in our random forest implementation, and since
the trees generated from each partition are fairly large (more than 100MB), we
found
GitHub user dbtsai opened a pull request:
https://github.com/apache/spark/pull/1333
Upgrade junit_xml_listener to 0.5.1 which fixes the following issues
1) fix the class name to be fully qualified classpath
2) make sure the reporting time is in seconds, not milliseconds, which
Github user dbtsai commented on the pull request:
https://github.com/apache/spark/pull/1333#issuecomment-48417558
done.
Github user dbtsai closed the pull request at:
https://github.com/apache/spark/pull/1215
Github user dbtsai commented on a diff in the pull request:
https://github.com/apache/spark/pull/955#discussion_r14796461
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/stat/OnlineSummarizer.scala ---
@@ -0,0 +1,229 @@
+/*
+ * Licensed to the Apache Software
Github user dbtsai commented on the pull request:
https://github.com/apache/spark/pull/987#issuecomment-48762832
#560 is merged. Close this PR.
Github user dbtsai closed the pull request at:
https://github.com/apache/spark/pull/987
GitHub user dbtsai opened a pull request:
https://github.com/apache/spark/pull/1379
[SPARK-2309][MLlib] Generalize the binary logistic regression into
multinomial logistic regression
Currently, there is no multi-class classifier in mllib. Logistic regression
can be extended
GitHub user dbtsai opened a pull request:
https://github.com/apache/spark/pull/1410
[SPARK-2477][MLlib] Using appendBias for adding intercept in
GeneralizedLinearAlgorithm
Instead of using prependOne currently in GeneralizedLinearAlgorithm, we
would like to use appendBias for 1
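Appending a bias term can be sketched in plain Scala as below. The object `Bias` and its methods are local stand-ins written for this example, not Spark's own utilities; the sketch only shows that placing a constant 1.0 at the end of the feature vector makes the last weight act as the intercept.

```scala
object Bias {
  // Append a constant 1.0 as the last feature so that the last weight
  // acts as the intercept.
  def appendBias(features: Array[Double]): Array[Double] = features :+ 1.0

  // Dot product of the weights with the bias-augmented features.
  def predict(weights: Array[Double], features: Array[Double]): Double =
    appendBias(features).zip(weights).map { case (f, w) => f * w }.sum
}
```

With weights `(1.0, 1.0, 0.5)` and features `(2.0, 3.0)`, the prediction is the feature dot product plus the intercept 0.5.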
GitHub user dbtsai opened a pull request:
https://github.com/apache/spark/pull/1425
[SPARK-2479][MLlib] Comparing floating-point numbers using relative error
in UnitTests
Floating point math is not exact, and most floating-point numbers end up
being slightly imprecise due
Github user dbtsai commented on a diff in the pull request:
https://github.com/apache/spark/pull/1425#discussion_r15013544
--- Diff:
mllib/src/test/scala/org/apache/spark/mllib/classification/LogisticRegressionSuite.scala
---
@@ -81,9 +82,8 @@ class LogisticRegressionSuite
Github user dbtsai commented on a diff in the pull request:
https://github.com/apache/spark/pull/1425#discussion_r15013786
--- Diff:
mllib/src/test/scala/org/apache/spark/mllib/evaluation/BinaryClassificationMetricsSuite.scala
---
@@ -20,8 +20,20 @@ package
Github user dbtsai commented on the pull request:
https://github.com/apache/spark/pull/1425#issuecomment-49221370
@mengxr Scalatest 2.x has the tolerance feature, but it's absolute error
not relative error. For large numbers, the absolute error may not be
meaningful
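The difference between the two kinds of tolerance can be sketched with two small helpers. The names `absEq` and `relEq` and the choice to normalize by the smaller magnitude are assumptions for this example; they are not the API proposed in the pull request.

```scala
object Tolerance {
  // Absolute error check: |a - b| < eps.
  def absEq(a: Double, b: Double, eps: Double): Boolean =
    math.abs(a - b) < eps

  // Relative error check: |a - b| < eps * min(|a|, |b|).
  // Other normalizations (max, mean) are also common; this is one choice.
  def relEq(a: Double, b: Double, eps: Double): Boolean =
    math.abs(a - b) < eps * math.min(math.abs(a), math.abs(b))

  def main(args: Array[String]): Unit = {
    val a = 1.0e15
    val b = 1.0e15 + 1.0 // differs only in the 16th significant digit
    println(absEq(a, b, 1e-10)) // prints "false"
    println(relEq(a, b, 1e-10)) // prints "true"
  }
}
```

For large values an absolute tolerance of `1e-10` rejects numbers that agree to 15 significant digits, while the relative check accepts them. Note that this relative check can never succeed when either value is exactly zero, which is one reason to keep an absolute variant as well.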
Github user dbtsai commented on the pull request:
https://github.com/apache/spark/pull/1425#issuecomment-49222983
I learned `almostEquals` from the Boost library. Anyway, in this case, how do we
distinguish the one that throws with a message from the one that just returns
true/false
Github user dbtsai commented on the pull request:
https://github.com/apache/spark/pull/1425#issuecomment-49253108
@mengxr and @srowen What do you think of `assert((0.0001 !~== 0.0) +-
1E-5)`? We have `~==` and `~==` which will have the error message in the latest
commit from my co
Github user dbtsai closed the pull request at:
https://github.com/apache/spark/pull/53
GitHub user dbtsai opened a pull request:
https://github.com/apache/spark/pull/353
SPARK-1157: L-BFGS Optimizer based on Breeze's implementation.
This PR uses Breeze's L-BFGS implementation, and the Breeze dependency has already
been introduced by Xiangrui's sparse input format work
Github user dbtsai commented on a diff in the pull request:
https://github.com/apache/spark/pull/353#discussion_r11404094
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/optimization/LBFGS.scala ---
@@ -0,0 +1,251 @@
+/*
+ * Licensed to the Apache Software Foundation
Github user dbtsai commented on the pull request:
https://github.com/apache/spark/pull/353#issuecomment-39895140
@mengxr As you suggested, I moved the costFun to private CostFun class.
Github user dbtsai commented on a diff in the pull request:
https://github.com/apache/spark/pull/353#discussion_r11460767
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/optimization/LBFGS.scala ---
@@ -0,0 +1,263 @@
+/*
+ * Licensed to the Apache Software Foundation
Github user dbtsai commented on a diff in the pull request:
https://github.com/apache/spark/pull/353#discussion_r11461398
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/optimization/LBFGS.scala ---
@@ -0,0 +1,263 @@
+/*
+ * Licensed to the Apache Software Foundation
Github user dbtsai commented on a diff in the pull request:
https://github.com/apache/spark/pull/353#discussion_r11463764
--- Diff:
mllib/src/test/scala/org/apache/spark/mllib/optimization/LBFGSSuite.scala ---
@@ -0,0 +1,217 @@
+/*
+ * Licensed to the Apache Software
Github user dbtsai commented on a diff in the pull request:
https://github.com/apache/spark/pull/353#discussion_r11464280
--- Diff:
mllib/src/test/scala/org/apache/spark/mllib/optimization/LBFGSSuite.scala ---
@@ -0,0 +1,217 @@
+/*
+ * Licensed to the Apache Software
Github user dbtsai commented on a diff in the pull request:
https://github.com/apache/spark/pull/353#discussion_r11464736
--- Diff:
mllib/src/test/scala/org/apache/spark/mllib/optimization/LBFGSSuite.scala ---
@@ -0,0 +1,217 @@
+/*
+ * Licensed to the Apache Software
Github user dbtsai commented on a diff in the pull request:
https://github.com/apache/spark/pull/353#discussion_r11605070
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/optimization/LBFGS.scala ---
@@ -0,0 +1,259 @@
+/*
+ * Licensed to the Apache Software Foundation
Github user dbtsai closed the pull request at:
https://github.com/apache/spark/pull/353
Github user dbtsai commented on the pull request:
https://github.com/apache/spark/pull/353#issuecomment-40434555
Jenkins, retest this please.
GitHub user dbtsai reopened a pull request:
https://github.com/apache/spark/pull/353
[SPARK-1157][MLlib] L-BFGS Optimizer based on Breeze's implementation.
This PR uses Breeze's L-BFGS implementation, and the Breeze dependency has already
been introduced by Xiangrui's sparse input format
Github user dbtsai commented on the pull request:
https://github.com/apache/spark/pull/353#issuecomment-40434626
Jenkins, retest this please.
Github user dbtsai commented on the pull request:
https://github.com/apache/spark/pull/353#issuecomment-40434691
Timeout for the latest Jenkins run. It seems that CI is not stable now.
Github user dbtsai closed the pull request at:
https://github.com/apache/spark/pull/353
GitHub user dbtsai opened a pull request:
https://github.com/apache/spark/pull/481
MLlib doc update for breeze dependency
MLlib is now using breeze linear algebra library instead of jblas; this PR
will update the doc to help users to install the blas native libraries to have
Github user dbtsai closed the pull request at:
https://github.com/apache/spark/pull/481
Github user dbtsai commented on a diff in the pull request:
https://github.com/apache/spark/pull/422#discussion_r11841916
--- Diff: docs/mllib-guide.md ---
@@ -3,63 +3,120 @@ layout: global
title: Machine Learning Library (MLlib)
---
+MLlib is a Spark
Github user dbtsai commented on a diff in the pull request:
https://github.com/apache/spark/pull/490#discussion_r11883381
--- Diff:
yarn/common/src/main/scala/org/apache/spark/deploy/yarn/ClientBase.scala ---
@@ -77,7 +78,8 @@ trait ClientBase extends Logging {
).foreach
Github user dbtsai commented on the pull request:
https://github.com/apache/spark/pull/490#issuecomment-41114289
Jenkins, add to whitelist.
Github user dbtsai commented on the pull request:
https://github.com/apache/spark/pull/1897#issuecomment-52149162
Seems that Jenkins is not stable. Failing on issues related to akka.
Github user dbtsai commented on a diff in the pull request:
https://github.com/apache/spark/pull/1973#discussion_r16319946
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/optimization/LBFGS.scala ---
@@ -69,8 +69,17 @@ class LBFGS(private var gradient: Gradient, private var
Github user dbtsai commented on the pull request:
https://github.com/apache/spark/pull/1973#issuecomment-52381503
LGTM. Merged into both master and branch-1.1. Thanks!!
GitHub user dbtsai opened a pull request:
https://github.com/apache/spark/pull/2068
[SPARK-2841][MLlib] Documentation for feature transformations
Documentation for newly added feature transformations:
1. TF-IDF
2. StandardScaler
3. Normalizer
You can merge this pull
Github user dbtsai commented on a diff in the pull request:
https://github.com/apache/spark/pull/2068#discussion_r16561045
--- Diff: docs/mllib-feature-extraction.md ---
@@ -70,4 +70,110 @@ for((synonym, cosineSimilarity) <- synonyms) {
</div>
</div>
-## TFIDF