Repository: mahout
Updated Branches:
  refs/heads/website c81fc8b72 -> 9759e024e

http://git-wip-us.apache.org/repos/asf/mahout/blob/9759e024/website/docs/algorithms/map-reduce/index.md
----------------------------------------------------------------------
diff --git a/website/docs/algorithms/map-reduce/index.md b/website/docs/algorithms/map-reduce/index.md
index 0e55a79..88cedea 100644
--- a/website/docs/algorithms/map-reduce/index.md
+++ b/website/docs/algorithms/map-reduce/index.md
@@ -1,5 +1,5 @@
 ---
-layout: page
+layout: algorithm
 title: Deprecated Map Reduce Algorithms
 theme:
     name: mahout2

http://git-wip-us.apache.org/repos/asf/mahout/blob/9759e024/website/docs/algorithms/preprocessors/AsFactor.md
----------------------------------------------------------------------
diff --git a/website/docs/algorithms/preprocessors/AsFactor.md b/website/docs/algorithms/preprocessors/AsFactor.md
index 0a8c589..255f982 100644
--- a/website/docs/algorithms/preprocessors/AsFactor.md
+++ b/website/docs/algorithms/preprocessors/AsFactor.md
@@ -1,9 +1,35 @@
 ---
-layout: page
+layout: algorithm
 title: AsFactor
 theme:
     name: mahout2
 ---
 
-TODO: Fill this out!
-Stub
\ No newline at end of file
+
+### About
+
+The `AsFactor` preprocessor is used to turn the integer values of the columns into sparse vectors where the value is 1
+ at the index that corresponds to the 'category' of that column. This is also known as "One Hot Encoding" in many other
+ packages.
+
+
+### Parameters
+
+`AsFactor` takes no parameters.
+
+### Example
+
+```scala
+import org.apache.mahout.math.algorithms.preprocessing.AsFactor
+
+val A = drmParallelize(dense(
+  (3, 2, 1, 2),
+  (0, 0, 0, 0),
+  (1, 1, 1, 1)), numPartitions = 2)
+
+// 0 -> 2, 3 -> 5, 6 -> 9
+val factorizer: AsFactorModel = new AsFactor().fit(A)
+
+val factoredA = factorizer.transform(A)
+```
+

http://git-wip-us.apache.org/repos/asf/mahout/blob/9759e024/website/docs/algorithms/preprocessors/StandardScaler.md
----------------------------------------------------------------------
diff --git a/website/docs/algorithms/preprocessors/StandardScaler.md b/website/docs/algorithms/preprocessors/StandardScaler.md
new file mode 100644
index 0000000..5b33709
--- /dev/null
+++ b/website/docs/algorithms/preprocessors/StandardScaler.md
@@ -0,0 +1,44 @@
+---
+layout: algorithm
+title: StandardScaler
+theme:
+    name: mahout2
+---
+
+### About
+
+`StandardScaler` centers the values of each column to their mean, and scales them to unit variance.
+
+#### Relation to the `scale` function in R-base
+The `StandardScaler` is the equivalent of the R-base function [`scale`](https://stat.ethz.ch/R-manual/R-devel/library/base/html/scale.html) with
+one notable tweak. R's `scale` function (indeed all of R) calculates standard deviation with a degrees-of-freedom correction (dividing by N-1 rather than N); Mahout
+(like many other statistical packages aimed at larger data sets) does not make this adjustment. In larger datasets the difference
+is trivial; however, when testing the function on smaller datasets the practitioner may be confused by the discrepancy.
+
+To verify this function against R on an arbitrary matrix, use the following form in R to "undo" the degrees of freedom correction.
+```R
+N <- nrow(x)
+scale(x, scale= apply(x, 2, sd) * sqrt((N-1)/N))
+```
+
+### Parameters
+
+`StandardScaler` takes no parameters at this time.
+
+### Example
+
+
+```scala
+import org.apache.mahout.math.algorithms.preprocessing.StandardScaler
+
+val A = drmParallelize(dense(
+  (1, 1, 5),
+  (2, 5, -15),
+  (3, 9, -2)), numPartitions = 2)
+
+val scaler: StandardScalerModel = new StandardScaler().fit(A)
+
+val scaledA = scaler.transform(A)
+```
+
+

http://git-wip-us.apache.org/repos/asf/mahout/blob/9759e024/website/docs/algorithms/preprocessors/template.md
----------------------------------------------------------------------
diff --git a/website/docs/algorithms/preprocessors/template.md b/website/docs/algorithms/preprocessors/template.md
new file mode 100644
index 0000000..4a48829
--- /dev/null
+++ b/website/docs/algorithms/preprocessors/template.md
@@ -0,0 +1,20 @@
+---
+layout: algorithm
+title: AsFactor
+theme:
+    name: mahout2
+---
+
+TODO: Fill this out!
+Stub
+
+### About
+
+### Parameters
+
+### Example
+
+
+
+
+

http://git-wip-us.apache.org/repos/asf/mahout/blob/9759e024/website/docs/algorithms/reccomenders/intro-cooccurrence-spark.md
----------------------------------------------------------------------
diff --git a/website/docs/algorithms/reccomenders/intro-cooccurrence-spark.md b/website/docs/algorithms/reccomenders/intro-cooccurrence-spark.md
index f8b4c12..d7d0185 100644
--- a/website/docs/algorithms/reccomenders/intro-cooccurrence-spark.md
+++ b/website/docs/algorithms/reccomenders/intro-cooccurrence-spark.md
@@ -1,5 +1,5 @@
 ---
-layout: default
+layout: algorithm
 title: Intro to Cooccurrence Recommenders with Spark
 theme:
     name: retro-mahout

http://git-wip-us.apache.org/repos/asf/mahout/blob/9759e024/website/docs/algorithms/reccomenders/recommender-overview.md
----------------------------------------------------------------------
diff --git a/website/docs/algorithms/reccomenders/recommender-overview.md b/website/docs/algorithms/reccomenders/recommender-overview.md
index 13d331b..00d8ec4 100644
--- a/website/docs/algorithms/reccomenders/recommender-overview.md
+++ b/website/docs/algorithms/reccomenders/recommender-overview.md
@@ -1,5 +1,5 @@
 ---
-layout: default
+layout: algorithm
 title: Recommender Quickstart
 theme:
     name: retro-mahout

http://git-wip-us.apache.org/repos/asf/mahout/blob/9759e024/website/docs/algorithms/regression/cochrane-orcutt.md
----------------------------------------------------------------------
diff --git a/website/docs/algorithms/regression/cochrane-orcutt.md b/website/docs/algorithms/regression/cochrane-orcutt.md
index 76b28b0..a88a0a0 100644
--- a/website/docs/algorithms/regression/cochrane-orcutt.md
+++ b/website/docs/algorithms/regression/cochrane-orcutt.md
@@ -1,5 +1,5 @@
 ---
-layout: page
+layout: algorithm
 title: Cochrane-Orcutt Procedure
 theme:
     name: mahout2

http://git-wip-us.apache.org/repos/asf/mahout/blob/9759e024/website/docs/algorithms/regression/ols.md
----------------------------------------------------------------------
diff --git a/website/docs/algorithms/regression/ols.md b/website/docs/algorithms/regression/ols.md
index 63155d3..5c16d1f 100644
--- a/website/docs/algorithms/regression/ols.md
+++ b/website/docs/algorithms/regression/ols.md
@@ -1,5 +1,5 @@
 ---
-layout: page
+layout: algorithm
 title: Ordinary Least Squares Regression
 theme:
     name: mahout2

http://git-wip-us.apache.org/repos/asf/mahout/blob/9759e024/website/docs/tutorials/map-reduce/classification/bankmarketing-example.md
----------------------------------------------------------------------
diff --git a/website/docs/tutorials/map-reduce/classification/bankmarketing-example.md b/website/docs/tutorials/map-reduce/classification/bankmarketing-example.md
index 846a4ce..d7ae229 100644
--- a/website/docs/tutorials/map-reduce/classification/bankmarketing-example.md
+++ b/website/docs/tutorials/map-reduce/classification/bankmarketing-example.md
@@ -1,5 +1,5 @@
 ---
-layout: default
+layout: mr_tutorial
 title:
 theme:
     name: retro-mahout

http://git-wip-us.apache.org/repos/asf/mahout/blob/9759e024/website/docs/tutorials/map-reduce/classification/breiman-example.md
----------------------------------------------------------------------
diff --git a/website/docs/tutorials/map-reduce/classification/breiman-example.md b/website/docs/tutorials/map-reduce/classification/breiman-example.md
index d8d049e..ea5aa0d 100644
--- a/website/docs/tutorials/map-reduce/classification/breiman-example.md
+++ b/website/docs/tutorials/map-reduce/classification/breiman-example.md
@@ -1,5 +1,5 @@
 ---
-layout: default
+layout: mr_tutorial
 title: Breiman Example
 theme:
     name: retro-mahout

http://git-wip-us.apache.org/repos/asf/mahout/blob/9759e024/website/docs/tutorials/map-reduce/classification/twenty-newsgroups.md
----------------------------------------------------------------------
diff --git a/website/docs/tutorials/map-reduce/classification/twenty-newsgroups.md b/website/docs/tutorials/map-reduce/classification/twenty-newsgroups.md
index 472aaf6..178c145 100644
--- a/website/docs/tutorials/map-reduce/classification/twenty-newsgroups.md
+++ b/website/docs/tutorials/map-reduce/classification/twenty-newsgroups.md
@@ -1,5 +1,5 @@
 ---
-layout: default
+layout: mr_tutorial
 title: Twenty Newsgroups
 theme:
     name: retro-mahout

http://git-wip-us.apache.org/repos/asf/mahout/blob/9759e024/website/docs/tutorials/map-reduce/classification/wikipedia-classifier-example.md
----------------------------------------------------------------------
diff --git a/website/docs/tutorials/map-reduce/classification/wikipedia-classifier-example.md b/website/docs/tutorials/map-reduce/classification/wikipedia-classifier-example.md
index 9df07da..fd93e8b 100644
--- a/website/docs/tutorials/map-reduce/classification/wikipedia-classifier-example.md
+++ b/website/docs/tutorials/map-reduce/classification/wikipedia-classifier-example.md
@@ -1,5 +1,5 @@
 ---
-layout: default
+layout: mr_tutorial
 title: Wikipedia XML parser and Naive Bayes Example
 theme:
     name: retro-mahout

http://git-wip-us.apache.org/repos/asf/mahout/blob/9759e024/website/docs/tutorials/map-reduce/index.md
----------------------------------------------------------------------
diff --git a/website/docs/tutorials/map-reduce/index.md b/website/docs/tutorials/map-reduce/index.md
index 691bb8b..b22827c 100644
--- a/website/docs/tutorials/map-reduce/index.md
+++ b/website/docs/tutorials/map-reduce/index.md
@@ -1,10 +1,12 @@
 ---
-layout: page
+layout: mr_tutorial
 title: Deprecated Map Reduce Based Examples
 theme:
     name: mahout2
 ---
+A note about the sunsetting of our support for Map Reduce.
+
 ### Classification

http://git-wip-us.apache.org/repos/asf/mahout/blob/9759e024/website/docs/tutorials/spark-samsara/classify-a-doc-from-the-shell.md
----------------------------------------------------------------------
diff --git a/website/docs/tutorials/spark-samsara/classify-a-doc-from-the-shell.md b/website/docs/tutorials/spark-samsara/classify-a-doc-from-the-shell.md
index 0a237d1..8a49903 100644
--- a/website/docs/tutorials/spark-samsara/classify-a-doc-from-the-shell.md
+++ b/website/docs/tutorials/spark-samsara/classify-a-doc-from-the-shell.md
@@ -1,5 +1,5 @@
 ---
-layout: page
+layout: tutorial
 title: Text Classification Example
 theme:
     name: mahout2

http://git-wip-us.apache.org/repos/asf/mahout/blob/9759e024/website/docs/tutorials/spark-samsara/how-to-build-an-app.md
----------------------------------------------------------------------
diff --git a/website/docs/tutorials/spark-samsara/how-to-build-an-app.md b/website/docs/tutorials/spark-samsara/how-to-build-an-app.md
index 0ad232e..a17c189 100644
--- a/website/docs/tutorials/spark-samsara/how-to-build-an-app.md
+++ b/website/docs/tutorials/spark-samsara/how-to-build-an-app.md
@@ -1,5 +1,5 @@
 ---
-layout: page
+layout: tutorial
 title: Mahout Samsara In Core
 theme:
     name: mahout2

http://git-wip-us.apache.org/repos/asf/mahout/blob/9759e024/website/docs/tutorials/spark-samsara/play-with-shell.md
----------------------------------------------------------------------
diff --git a/website/docs/tutorials/spark-samsara/play-with-shell.md b/website/docs/tutorials/spark-samsara/play-with-shell.md
index 6b5e4a0..a01f23c 100644
--- a/website/docs/tutorials/spark-samsara/play-with-shell.md
+++ b/website/docs/tutorials/spark-samsara/play-with-shell.md
@@ -1,5 +1,5 @@
 ---
-layout: page
+layout: tutorial
 title: Mahout Samsara In Core
 theme:
     name: mahout2

http://git-wip-us.apache.org/repos/asf/mahout/blob/9759e024/website/docs/tutorials/spark-samsara/spark-naive-bayes.md
----------------------------------------------------------------------
diff --git a/website/docs/tutorials/spark-samsara/spark-naive-bayes.md b/website/docs/tutorials/spark-samsara/spark-naive-bayes.md
index 8823812..3c24fff 100644
--- a/website/docs/tutorials/spark-samsara/spark-naive-bayes.md
+++ b/website/docs/tutorials/spark-samsara/spark-naive-bayes.md
@@ -1,5 +1,5 @@
 ---
-layout: default
+layout: tutorial
 title: Spark Naive Bayes
 theme:
     name: retro-mahout

http://git-wip-us.apache.org/repos/asf/mahout/blob/9759e024/website/old_site_migration/completed/d-als.md
----------------------------------------------------------------------
diff --git a/website/old_site_migration/completed/d-als.md b/website/old_site_migration/completed/d-als.md
index 0d64697..fb4fac8 100644
--- a/website/old_site_migration/completed/d-als.md
+++ b/website/old_site_migration/completed/d-als.md
@@ -5,8 +5,6 @@ theme:
     name: retro-mahout
 ---
-Seems like someone has jacked up this page?
-# Distributed Cholesky QR
 ## Intro

http://git-wip-us.apache.org/repos/asf/mahout/blob/9759e024/website/old_site_migration/completed/d-qr.md
----------------------------------------------------------------------
diff --git a/website/old_site_migration/completed/d-qr.md b/website/old_site_migration/completed/d-qr.md
index 5c3e5b8..63f3525 100644
--- a/website/old_site_migration/completed/d-qr.md
+++ b/website/old_site_migration/completed/d-qr.md
@@ -5,9 +5,6 @@ theme:
     name: retro-mahout
 ---
-# Distributed Cholesky QR
-
-
 ## Intro
 Mahout has a distributed implementation of QR decomposition for tall thin matrices[1].
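
The `StandardScaler` page added in this commit says that Mahout scales by a standard deviation without the degrees-of-freedom correction that R's `sd` applies. As a rough check of that relationship, the sketch below uses plain Scala collections rather than the Mahout API. `ScaleCheck` and its method names are hypothetical, and dividing by N is an assumption taken from the page's description, not from the Mahout source.

```scala
// Hypothetical helper for illustration only; plain Scala, no Mahout calls.
object ScaleCheck {
  // Arithmetic mean of one column.
  def mean(xs: Seq[Double]): Double = xs.sum / xs.size

  // Standard deviation dividing by N (no degrees-of-freedom correction).
  def populationSd(xs: Seq[Double]): Double = {
    val m = mean(xs)
    math.sqrt(xs.map(x => (x - m) * (x - m)).sum / xs.size)
  }

  // Standard deviation dividing by N - 1, which is what R's `sd` computes.
  def sampleSd(xs: Seq[Double]): Double = {
    val m = mean(xs)
    math.sqrt(xs.map(x => (x - m) * (x - m)).sum / (xs.size - 1))
  }

  // Center a column to its mean and divide by the population standard deviation.
  def standardize(xs: Seq[Double]): Seq[Double] = {
    val m = mean(xs)
    val sd = populationSd(xs)
    xs.map(x => (x - m) / sd)
  }
}
```

For the first column of the example matrix, (1, 2, 3), `sampleSd` is 1 and `populationSd` is sqrt(2/3), so the two differ by the factor sqrt((N-1)/N) with N = 3. That factor is what the R snippet needs in order to undo the correction.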
