WEBSITE Triage of Old Site Migration

Project: http://git-wip-us.apache.org/repos/asf/mahout/repo
Commit: http://git-wip-us.apache.org/repos/asf/mahout/commit/9c031452
Tree: http://git-wip-us.apache.org/repos/asf/mahout/tree/9c031452
Diff: http://git-wip-us.apache.org/repos/asf/mahout/diff/9c031452

Branch: refs/heads/website
Commit: 9c0314528b4fec247ec92654bfcc1471a880e578
Parents: 3a724de
Author: Trevor <[email protected]>
Authored: Sat Apr 29 18:24:40 2017 -0500
Committer: Trevor <[email protected]>
Committed: Sat Apr 29 18:24:40 2017 -0500

----------------------------------------------------------------------
 website/README.md                               |    3 +-
 website/old_site_migration/README.md            |   41 +
 .../completed/classify-a-doc-from-the-shell.md  |  258 +
 website/old_site_migration/completed/d-als.md   |   60 +
 website/old_site_migration/completed/d-qr.md    |   59 +
 website/old_site_migration/completed/d-spca.md  |  176 +
 website/old_site_migration/completed/d-ssvd.md  |  143 +
 .../old_site_migration/completed/downloads.md   |   68 +
 .../completed/how-to-build-an-app.md            |  257 +
 .../completed/in-core-reference.md              |  304 +
 .../completed/intro-cooccurrence-spark.md       |  446 ++
 .../mailing-lists,-irc-and-archives.md          |   75 +
 .../completed/out-of-core-reference.md          |  318 +
 .../completed/privacy-policy.md                 |   28 +
 .../old_site_migration/completed/quickstart.md  |   59 +
 .../completed/release-notes.md                  |  242 +
 .../old_site_migration/completed/who-we-are.md  |   62 +
 .../dont_migrate/collections.md                 |   98 +
 .../old_site_migration/dont_migrate/glossary.md |   15 +
 .../dont_migrate/mahout-benchmarks.md           |  156 +
 .../dont_migrate/mahoutintegration.md           |    6 +
 .../dont_migrate/recommender-overview.md        |   34 +
 .../needs_work_convenience/algorithms.md        |   58 +
 .../bayesian-commandline.md                     |   64 +
 .../environment/h2o-internals.md                |   51 +
 .../environment/spark-internals.md              |   25 +
 .../needs_work_convenience/faq.md               |  105 +
 .../flinkbindings/flink-internals.md            |   50 +
 .../flinkbindings/playing-with-samsara-flink.md |  111 +
 .../classification/bankmarketing-example.md     |   53 +
 .../map-reduce/classification/bayesian.md       |  147 +
 .../classification/breiman-example.md           |   67 +
 .../classification/class-discovery.md           |  155 +
 .../classification/classifyingyourdata.md       |   27 +
 .../map-reduce/classification/collocations.md   |  385 ++
 .../gaussian-discriminative-analysis.md         |   20 +
 .../classification/hidden-markov-models.md      |  102 +
 .../independent-component-analysis.md           |   17 +
 .../locally-weighted-linear-regression.md       |   25 +
 .../classification/logistic-regression.md       |  129 +
 .../classification/mahout-collections.md        |   60 +
 .../map-reduce/classification/mlp.md            |  172 +
 .../map-reduce/classification/naivebayes.md     |   45 +
 .../map-reduce/classification/neural-network.md |   22 +
 .../classification/partial-implementation.md    |  146 +
 .../map-reduce/classification/random-forests.md |  234 +
 .../restricted-boltzmann-machines.md            |   49 +
 .../classification/support-vector-machines.md   |   43 +
 .../classification/twenty-newsgroups.md         |  179 +
 .../map-reduce/clustering/20newsgroups.md       |   11 +
 .../map-reduce/clustering/canopy-clustering.md  |  188 +
 .../map-reduce/clustering/canopy-commandline.md |   70 +
 .../map-reduce/clustering/cluster-dumper.md     |  106 +
 .../clustering-of-synthetic-control-data.md     |   53 +
 .../clustering/clustering-seinfeld-episodes.md  |   11 +
 .../map-reduce/clustering/clusteringyourdata.md |  126 +
 .../clustering/expectation-maximization.md      |   62 +
 .../clustering/fuzzy-k-means-commandline.md     |   97 +
 .../map-reduce/clustering/fuzzy-k-means.md      |  186 +
 .../clustering/hierarchical-clustering.md       |   15 +
 .../map-reduce/clustering/k-means-clustering.md |  182 +
 .../clustering/k-means-commandline.md           |   94 +
 .../clustering/latent-dirichlet-allocation.md   |  155 +
 .../map-reduce/clustering/lda-commandline.md    |   83 +
 .../clustering/llr---log-likelihood-ratio.md    |   46 +
 .../clustering/spectral-clustering.md           |   84 +
 .../map-reduce/clustering/streaming-k-means.md  |  174 +
 .../map-reduce/clustering/viewing-result.md     |   15 +
 .../map-reduce/clustering/viewing-results.md    |   49 +
 .../clustering/visualizing-sample-clusters.md   |   50 +
 .../map-reduce/misc/mr---map-reduce.md          |   19 +
 .../misc/parallel-frequent-pattern-mining.md    |  185 +
 .../map-reduce/misc/perceptron-and-winnow.md    |   41 +
 .../map-reduce/misc/testing.md                  |   46 +
 .../misc/using-mahout-with-python-via-jpype.md  |  222 +
 .../map-reduce/recommender/intro-als-hadoop.md  |   98 +
 .../recommender/intro-itembased-hadoop.md       |   54 +
 .../recommender/matrix-factorization.md         |  187 +
 .../recommender/recommender-documentation.md    |  277 +
 .../recommender/recommender-first-timer-faq.md  |   54 +
 .../recommender/userbased-5-minutes.md          |  133 +
 .../needs_work_convenience/powered-by-mahout.md |  129 +
 .../creating-vectors-from-text.md               |  291 +
 .../needs_work_priority/creating-vectors.md     |   16 +
 .../dim-reduction/dimensional-reduction.md      |  446 ++
 .../needs_work_priority/dim-reduction/ssvd.md   |  127 +
 .../dim-reduction/ssvd.page/SSVD-CLI.pdf        |  Bin 0 -> 462679 bytes
 .../dim-reduction/ssvd.page/ssvd.R              |  181 +
 .../needs_work_priority/spark-naive-bayes.md    |  132 +
 .../MahoutScalaAndSparkBindings.pptx            |  Bin 0 -> 846177 bytes
 .../sparkbindings/ScalaSparkBindings.pdf        | 6215 ++++++++++++++++++
 .../needs_work_priority/sparkbindings/faq.md    |   52 +
 .../needs_work_priority/sparkbindings/home.md   |  101 +
 .../sparkbindings/play-with-shell.md            |  199 +
 .../wikipedia-classifier-example.md             |   57 +
 .../general/books-tutorials-and-talks.md        |  121 +
 .../old_site/general/mahout-wiki.md             |  202 +
 .../old_site/general/professional-support.md    |   41 +
 .../old_site/general/reference-reading.md       |   71 +
 .../users/basics/matrix-and-vector-needs.md     |   88 +
 .../basics/principal-components-analysis.md     |   29 +
 .../svd---singular-value-decomposition.md       |   52 +
 .../users/basics/system-requirements.md         |   20 +
 ...term-frequency-inverse-document-frequency.md |   21 +
 104 files changed, 17212 insertions(+), 1 deletion(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/mahout/blob/9c031452/website/README.md
----------------------------------------------------------------------
diff --git a/website/README.md b/website/README.md
index 339df5f..0b2c225 100644
--- a/website/README.md
+++ b/website/README.md
@@ -158,4 +158,5 @@ This is a helpful tool for reference 
http://pikock.github.io/bootstrap-magic/3.0
 - [x] Sign up for google analytics
 - [ ] add links to `community/blogs`
 - [ ] would like to see `community/buidingmahout.md` cleaned up a bit (just 
coppied new instructions from README.md)
-- [ ] writeups for native solvers in /docs/native-solvers/
\ No newline at end of file
+- [ ] writeups for native solvers in /docs/native-solvers/
+- [ ] help with triage in `mahout/website/old_site_migration`
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/mahout/blob/9c031452/website/old_site_migration/README.md
----------------------------------------------------------------------
diff --git a/website/old_site_migration/README.md b/website/old_site_migration/README.md
new file mode 100644
index 0000000..794397b
--- /dev/null
+++ b/website/old_site_migration/README.md
@@ -0,0 +1,41 @@
+
+
+## Website Migration Triage
+
+
+### 1. `./old_site`
+
+The original Mahout site was transferred to `mahout/website/oldsite`, where its
+headers were replaced to be Jekyll compliant, along with some witchcraft on the
+nav-bar to make the CSS compatible with the Jekyll Bootstrap themes.
+
+All content was then moved to `mahout/website/old_site_migration/old_site`
+
+ALCON (all concerned): please go through the files and move each one to one of the following directories.
+
+### 2a. `./dont_migrate` 
+
+Content that is no longer relevant, or is in such bad shape that it needs to be redone completely, goes here.
+
+### 2b. `./needs_work_convenience`
+
+Content that should be migrated but needs to be updated with new information, or needs other work. Please leave a note
+at the top of the file describing what needs to be done. This content can be migrated at convenience, i.e. it is interesting and
+would be good to bring over, but is not critical (the site can go live without this content).
+
+`./needs_work_convenience/map-reduce` has MapReduce-related docs that may not actually need any work.
+
+### 2c. `./needs_work_priority`
+
+Content that should be migrated but needs updating. This is critical information that needs to be migrated
+before the site goes live.
+
+
+
+### 3. `./completed`
+
+When a file doesn't need work, or the work on it has been done, move a copy here AND move a copy to the appropriate
+location in `mahout/website/front` or `mahout/website/docs`.
+
+(Don't forget to add the page to the nav-bar.)
+

http://git-wip-us.apache.org/repos/asf/mahout/blob/9c031452/website/old_site_migration/completed/classify-a-doc-from-the-shell.md
----------------------------------------------------------------------
diff --git a/website/old_site_migration/completed/classify-a-doc-from-the-shell.md b/website/old_site_migration/completed/classify-a-doc-from-the-shell.md
new file mode 100644
index 0000000..8c98c53
--- /dev/null
+++ b/website/old_site_migration/completed/classify-a-doc-from-the-shell.md
@@ -0,0 +1,258 @@
+---
+layout: default
+title: 
+theme:
+   name: retro-mahout
+---
+
+# Building a text classifier in Mahout's Spark Shell
+
+This tutorial will take you through the steps used to train a Multinomial 
Naive Bayes model and create a text classifier based on that model using the 
```mahout spark-shell```. 
+
+## Prerequisites
+This tutorial assumes that you have your Spark environment variables set for the ```mahout spark-shell```; see [Playing with Mahout's Shell](http://mahout.apache.org/users/sparkbindings/play-with-shell.html).  We also assume that Mahout is running in cluster mode (i.e. with the ```MAHOUT_LOCAL``` environment variable **unset**), as we'll be reading from and writing to HDFS.
+
+## Downloading and Vectorizing the Wikipedia dataset
+*As of Mahout v. 0.10.0, we are still reliant on the MapReduce versions of ```mahout seqwiki``` and ```mahout seq2sparse``` to extract and vectorize our text.  A* [*Spark implementation of seq2sparse*](https://issues.apache.org/jira/browse/MAHOUT-1663) *is in the works for Mahout v. 0.11.* However, to download the Wikipedia dataset, extract the bodies of the documents, label each document and vectorize the text into TF-IDF vectors, we can simply run the [classify-wikipedia.sh](https://github.com/apache/mahout/blob/master/examples/bin/classify-wikipedia.sh) example.
+
+    Please select a number to choose the corresponding task to run
+    1. CBayes (may require increased heap space on yarn)
+    2. BinaryCBayes
+    3. clean -- cleans up the work area in /tmp/mahout-work-wiki
+    Enter your choice :
+
+Enter (2). This will download a large recent XML dump of the Wikipedia database into a ```/tmp/mahout-work-wiki``` directory, unzip it and place it into HDFS.  It will run a [MapReduce job to parse the Wikipedia set](http://mahout.apache.org/users/classification/wikipedia-classifier-example.html), extracting and labeling only pages with category tags for [United States] and [United Kingdom] (~11600 documents). It will then run ```mahout seq2sparse``` to convert the documents into TF-IDF vectors.  The script will also build and test a [Naive Bayes model using MapReduce](http://mahout.apache.org/users/classification/bayesian.html).  When it is completed, you should see a confusion matrix on your screen.  For this tutorial, we will ignore the MapReduce model, and build a new model using Spark based on the vectorized text output by ```seq2sparse```.
+
+## Getting Started
+
+Launch the ```mahout spark-shell```.  There is an example script: 
```spark-document-classifier.mscala``` (.mscala denotes a Mahout-Scala script 
which can be run similarly to an R script).   We will be walking through this 
script for this tutorial but if you wanted to simply run the script, you could 
just issue the command: 
+
+    mahout> :load /path/to/mahout/examples/bin/spark-document-classifier.mscala
+
+For now, let's take the script apart piece by piece.  You can cut and paste the following code blocks into the ```mahout spark-shell```.
+
+## Imports
+
+Our Mahout Naive Bayes imports:
+
+    import org.apache.mahout.classifier.naivebayes._
+    import org.apache.mahout.classifier.stats._
+    import org.apache.mahout.nlp.tfidf._
+
+Hadoop imports needed to read our dictionary:
+
+    import org.apache.hadoop.io.Text
+    import org.apache.hadoop.io.IntWritable
+    import org.apache.hadoop.io.LongWritable
+
+## Read in our full set from HDFS as vectorized by seq2sparse in 
classify-wikipedia.sh
+
+    val pathToData = "/tmp/mahout-work-wiki/"
+    val fullData = drmDfsRead(pathToData + "wikipediaVecs/tfidf-vectors")
+
+## Extract the category of each observation and aggregate those observations 
by category
+
+    val (labelIndex, aggregatedObservations) = 
SparkNaiveBayes.extractLabelsAndAggregateObservations(
+                                                                 fullData)
+
+## Build a Multinomial Naive Bayes model and self test on the training set
+
+    val model = SparkNaiveBayes.train(aggregatedObservations, labelIndex, 
false)
+    val resAnalyzer = SparkNaiveBayes.test(model, fullData, false)
+    println(resAnalyzer)
+    
+Printing the ```ResultAnalyzer``` will display the confusion matrix.
+
+## Read in the dictionary and document frequency count from HDFS
+    
+    val dictionary = sdc.sequenceFile(pathToData + 
"wikipediaVecs/dictionary.file-0",
+                                      classOf[Text],
+                                      classOf[IntWritable])
+    val documentFrequencyCount = sdc.sequenceFile(pathToData + 
"wikipediaVecs/df-count",
+                                                  classOf[IntWritable],
+                                                  classOf[LongWritable])
+
+    // setup the dictionary and document frequency count as maps
+    val dictionaryRDD = dictionary.map { 
+                                    case (wKey, wVal) => 
wKey.asInstanceOf[Text]
+                                                             .toString() -> 
wVal.get() 
+                                       }
+                                       
+    val documentFrequencyCountRDD = documentFrequencyCount.map {
+                                            case (wKey, wVal) => 
wKey.asInstanceOf[IntWritable]
+                                                                     .get() -> 
wVal.get() 
+                                                               }
+    
+    val dictionaryMap = dictionaryRDD.collect.map(x => x._1.toString -> 
x._2.toInt).toMap
+    val dfCountMap = documentFrequencyCountRDD.collect.map(x => x._1.toInt -> 
x._2.toLong).toMap
+
+## Define a function to tokenize and vectorize new text using our current 
dictionary
+
+For this simple example, our function ```vectorizeDocument(...)``` will tokenize a new document into unigrams using native Java String methods and vectorize it using our dictionary and document frequencies. You could also use a [Lucene](https://lucene.apache.org/core/) analyzer for bigrams, trigrams, etc., and integrate Apache [Tika](https://tika.apache.org/) to extract text from different document types (PDF, PPT, XLS, etc.).  Here, however, we will keep it simple, stripping and tokenizing our text using regexes and native String methods.
+
+    def vectorizeDocument(document: String,
+                            dictionaryMap: Map[String,Int],
+                            dfMap: Map[Int,Long]): Vector = {
+        val wordCounts = document.replaceAll("[^\\p{L}\\p{Nd}]+", " ")
+                                    .toLowerCase
+                                    .split(" ")
+                                    .groupBy(identity)
+                                    .mapValues(_.length)         
+        val vec = new RandomAccessSparseVector(dictionaryMap.size)
+        val totalDFSize = dfMap(-1)
+        val docSize = wordCounts.size
+        for (word <- wordCounts) {
+            val term = word._1
+            if (dictionaryMap.contains(term)) {
+                val tfidf: TermWeight = new TFIDF()
+                val termFreq = word._2
+                val dictIndex = dictionaryMap(term)
+                val docFreq = dfMap(dictIndex)
+                val currentTfIdf = tfidf.calculate(termFreq,
+                                                   docFreq.toInt,
+                                                   docSize,
+                                                   totalDFSize.toInt)
+                vec.setQuick(dictIndex, currentTfIdf)
+            }
+        }
+        vec
+    }
+
+## Setup our classifier
+
+    val labelMap = model.labelIndex
+    val numLabels = model.numLabels
+    val reverseLabelMap = labelMap.map(x => x._2 -> x._1)
+    
+    // instantiate the correct type of classifier
+    val classifier = model.isComplementary match {
+        case true => new ComplementaryNBClassifier(model)
+        case _ => new StandardNBClassifier(model)
+    }
+
+## Define an argmax function 
+
+The label with the highest score wins the classification for a given document.
+    
+    def argmax(v: Vector): (Int, Double) = {
+        var bestIdx: Int = Integer.MIN_VALUE
+        var bestScore: Double = Double.NegativeInfinity
+        for(i <- 0 until v.size) {
+            if(v(i) > bestScore){
+                bestScore = v(i)
+                bestIdx = i
+            }
+        }
+        (bestIdx, bestScore)
+    }
+
+## Define our TF(-IDF) vector classifier
+
+    def classifyDocument(clvec: Vector) : String = {
+        val cvec = classifier.classifyFull(clvec)
+        val (bestIdx, bestScore) = argmax(cvec)
+        reverseLabelMap(bestIdx)
+    }
+
+## Two sample news articles: United States Football and United Kingdom Football
+    
+    // A random United States football article
+    // 
http://www.reuters.com/article/2015/01/28/us-nfl-superbowl-security-idUSKBN0L12JR20150128
+    val UStextToClassify = new String("(Reuters) - Super Bowl security 
officials acknowledge" +
+        " the NFL championship game represents a high profile target on a 
world stage but are" +
+        " unaware of any specific credible threats against Sunday's showcase. 
In advance of" +
+        " one of the world's biggest single day sporting events, Homeland 
Security Secretary" +
+        " Jeh Johnson was in Glendale on Wednesday to review security 
preparations and tour" +
+        " University of Phoenix Stadium where the Seattle Seahawks and New 
England Patriots" +
+        " will battle. Deadly shootings in Paris and arrest of suspects in 
Belgium, Greece and" +
+        " Germany heightened fears of more attacks around the world and social 
media accounts" +
+        " linked to Middle East militant groups have carried a number of 
threats to attack" +
+        " high-profile U.S. events. There is no specific credible threat, said 
Johnson, who" + 
+        " has appointed a federal coordination team to work with local, state 
and federal" +
+        " agencies to ensure safety of fans, players and other workers 
associated with the" + 
+        " Super Bowl. I'm confident we will have a safe and secure and 
successful event." +
+        " Sunday's game has been given a Special Event Assessment Rating 
(SEAR) 1 rating, the" +
+        " same as in previous years, except for the year after the Sept. 11, 
2001 attacks, when" +
+        " a higher level was declared. But security will be tight and visible 
around Super" +
+        " Bowl-related events as well as during the game itself. All fans will 
pass through" +
+        " metal detectors and pat downs. Over 4,000 private security personnel 
will be deployed" +
+        " and the almost 3,000 member Phoenix police force will be on Super 
Bowl duty. Nuclear" +
+        " device sniffing teams will be deployed and a network of Bio-Watch 
detectors will be" +
+        " set up to provide a warning in the event of a biological attack. The 
Department of" +
+        " Homeland Security (DHS) said in a press release it had held special 
cyber-security" +
+        " and anti-sniper training sessions. A U.S. official said the 
Transportation Security" +
+        " Administration, which is responsible for screening airline 
passengers, will add" +
+        " screeners and checkpoint lanes at airports. Federal air marshals, 
behavior detection" +
+        " officers and dog teams will help to secure transportation systems in 
the area. We" +
+        " will be ramping it (security) up on Sunday, there is no doubt about 
that, said Federal"+
+        " Coordinator Matthew Allen, the DHS point of contact for planning and 
support. I have" +
+        " every confidence the public safety agencies that represented in the 
planning process" +
+        " are going to have their best and brightest out there this weekend 
and we will have" +
+        " a very safe Super Bowl.")
+    
+    // A random United Kingdom football article
+    // 
http://www.reuters.com/article/2015/01/26/manchester-united-swissquote-idUSL6N0V52RZ20150126
+    val UKtextToClassify = new String("(Reuters) - Manchester United have 
signed a sponsorship" +
+        " deal with online financial trading company Swissquote, expanding the 
commercial" +
+        " partnerships that have helped to make the English club one of the 
richest teams in" +
+        " world soccer. United did not give a value for the deal, the club's 
first in the sector," +
+        " but said on Monday it was a multi-year agreement. The Premier League 
club, 20 times" +
+        " English champions, claim to have 659 million followers around the 
globe, making the" +
+        " United name attractive to major brands like Chevrolet cars and 
sportswear group Adidas." +
+        " Swissquote said the global deal would allow it to use United's 
popularity in Asia to" +
+        " help it meet its targets for expansion in China. Among benefits from 
the deal," +
+        " Swissquote's clients will have a chance to meet United players and 
get behind the scenes" +
+        " at the Old Trafford stadium. Swissquote is a Geneva-based online 
trading company that" +
+        " allows retail investors to buy and sell foreign exchange, equities, 
bonds and other asset" +
+        " classes. Like other retail FX brokers, Swissquote was left nursing 
losses on the Swiss" +
+        " franc after Switzerland's central bank stunned markets this month by 
abandoning its cap" +
+        " on the currency. The fallout from the abrupt move put rival and West 
Ham United shirt" +
+        " sponsor Alpari UK into administration. Swissquote itself was forced 
to book a 25 million" +
+        " Swiss francs ($28 million) provision for its clients who were left 
out of pocket" +
+        " following the franc's surge. United's ability to grow revenues off 
the pitch has made" +
+        " them the second richest club in the world behind Spain's Real 
Madrid, despite a" +
+        " downturn in their playing fortunes. United Managing Director Richard 
Arnold said" +
+        " there was still lots of scope for United to develop sponsorships in 
other areas of" +
+        " business. The last quoted statistics that we had showed that of the 
top 25 sponsorship" +
+        " categories, we were only active in 15 of those, Arnold told Reuters. 
I think there is a" +
+        " huge potential still for the club, and the other thing we have seen 
is there is very" +
+        " significant growth even within categories. United have endured a 
tricky transition" +
+        " following the retirement of manager Alex Ferguson in 2013, finishing 
seventh in the" +
+        " Premier League last season and missing out on a place in the 
lucrative Champions League." +
+        " ($1 = 0.8910 Swiss francs) (Writing by Neil Maidment, additional 
reporting by Jemima" + 
+        " Kelly; editing by Keith Weir)")
+
+## Vectorize and classify our documents
+
+    val usVec = vectorizeDocument(UStextToClassify, dictionaryMap, dfCountMap)
+    val ukVec = vectorizeDocument(UKtextToClassify, dictionaryMap, dfCountMap)
+    
+    println("Classifying the news article about superbowl security (united states)")
+    classifyDocument(usVec)
+    
+    println("Classifying the news article about Manchester United (united kingdom)")
+    classifyDocument(ukVec)
+
+## Tie everything together in a new method to classify text 
+    
+    def classifyText(txt: String): String = {
+        val v = vectorizeDocument(txt, dictionaryMap, dfCountMap)
+        classifyDocument(v)
+    }
+
+## Now we can simply call our classifyText(...) method on any String
+
+    classifyText("Hello world from Queens")
+    classifyText("Hello world from London")
+    
+## Model persistence
+
+You can save the model to HDFS:
+
+    model.dfsWrite("/path/to/model")
+    
+And retrieve it with:
+
+    val model =  NBModel.dfsRead("/path/to/model")
+
+The trained model can now be embedded in an external application.
\ No newline at end of file
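
The vectorize-then-argmax flow in `classify-a-doc-from-the-shell.md` above can be tried without a cluster. The following is only a minimal sketch in plain Scala, not the Mahout implementation: `TfIdfSketch`, its method names, the toy dictionary and the `tf * ln(numDocs / docFreq)` weighting are all assumptions made for illustration, and Mahout's `TFIDF` class may weight terms differently.

```scala
object TfIdfSketch {
  // Split text into lowercase unigrams and count how often each one occurs.
  // Any run of characters that are not letters or digits acts as a separator.
  def termCounts(document: String): Map[String, Int] =
    document
      .replaceAll("[^\\p{L}\\p{Nd}]+", " ")
      .toLowerCase
      .split(" ")
      .filter(_.nonEmpty)
      .groupBy(identity)
      .map { case (term, occurrences) => term -> occurrences.length }

  // Textbook weighting (an assumption of this sketch): tf * ln(numDocs / docFreq).
  def tfIdf(termFreq: Int, docFreq: Long, numDocs: Long): Double =
    termFreq * math.log(numDocs.toDouble / docFreq)

  // Build a sparse vector (dictionary index -> weight), skipping terms that
  // are missing from the dictionary or from the document-frequency table.
  def vectorize(document: String,
                dictionary: Map[String, Int],
                docFreqs: Map[Int, Long],
                numDocs: Long): Map[Int, Double] =
    termCounts(document).toSeq.flatMap { case (term, tf) =>
      for {
        index <- dictionary.get(term)
        df    <- docFreqs.get(index)
      } yield index -> tfIdf(tf, df, numDocs)
    }.toMap

  // Return (index, score) of the highest-scoring label.
  def argmax(scores: Seq[Double]): (Int, Double) = {
    require(scores.nonEmpty, "need at least one score")
    val (best, index) = scores.zipWithIndex.maxBy(_._1)
    (index, best)
  }

  def main(args: Array[String]): Unit = {
    // Hypothetical three-term dictionary and document frequencies.
    val dictionary = Map("football" -> 0, "club" -> 1, "security" -> 2)
    val docFreqs   = Map(0 -> 50L, 1 -> 10L, 2 -> 5L)
    println(vectorize("Football club, football!", dictionary, docFreqs, numDocs = 100L))
    println(argmax(Seq(-3.0, -1.5, -2.0)))
  }
}
```

Passing the corpus size as an explicit `numDocs` argument, instead of reading it from key `-1` of the document-frequency map as the tutorial does, is a design choice of this sketch only.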

http://git-wip-us.apache.org/repos/asf/mahout/blob/9c031452/website/old_site_migration/completed/d-als.md
----------------------------------------------------------------------
diff --git a/website/old_site_migration/completed/d-als.md b/website/old_site_migration/completed/d-als.md
new file mode 100644
index 0000000..0d64697
--- /dev/null
+++ b/website/old_site_migration/completed/d-als.md
@@ -0,0 +1,60 @@
+---
+layout: default
+title: Distributed ALS
+theme:
+    name: retro-mahout
+---
+
+Seems like someone has jacked up this page?
+# Distributed Cholesky QR
+
+
+## Intro
+
+Mahout has a distributed implementation of QR decomposition for tall thin matrices[1].
+
+## Algorithm 
+
+For the classic QR decomposition of the form `\(\mathbf{A}=\mathbf{QR},\mathbf{A}\in\mathbb{R}^{m\times n}\)` a distributed version is fairly easily achieved if `\(\mathbf{A}\)` is tall and thin such that `\(\mathbf{A}^{\top}\mathbf{A}\)` fits in memory, i.e. *m* is large but *n* < ~5000. Under such circumstances, only `\(\mathbf{A}\)` and `\(\mathbf{Q}\)` are distributed matrices and `\(\mathbf{A^{\top}A}\)` and `\(\mathbf{R}\)` are in-core products. We just compute the in-core version of the Cholesky decomposition in the form of `\(\mathbf{LL}^{\top}= \mathbf{A}^{\top}\mathbf{A}\)`.  After that we take `\(\mathbf{R}= \mathbf{L}^{\top}\)` and `\(\mathbf{Q}=\mathbf{A}\left(\mathbf{L}^{\top}\right)^{-1}\)`.  The latter is easily achieved by multiplying each vertical block of `\(\mathbf{A}\)` by `\(\left(\mathbf{L}^{\top}\right)^{-1}\)`.  (There is no actual matrix inversion happening). 
+
+
+
+## Implementation
+
+Mahout `dqrThin(...)` is implemented in the mahout `math-scala` algebraic 
optimizer which translates Mahout's R-like linear algebra operators into a 
physical plan for both Spark and H2O distributed engines.
+
+    def dqrThin[K: ClassTag](drmA: DrmLike[K], checkRankDeficiency: Boolean = true): (DrmLike[K], Matrix) = {
+        if (drmA.ncol > 5000)
+            log.warn("A is too fat. A'A must fit in memory and easily broadcasted.")
+        implicit val ctx = drmA.context
+        val AtA = (drmA.t %*% drmA).checkpoint()
+        val inCoreAtA = AtA.collect // A'A is n x n, small enough to hold in-core
+        val ch = chol(inCoreAtA)
+        val inCoreR = (ch.getL cloned) t
+        if (checkRankDeficiency && !ch.isPositiveDefinite)
+            throw new IllegalArgumentException("R is rank-deficient.")
+        val bcastAtA = drmBroadcast(inCoreAtA) // ship A'A to the workers
+        val Q = drmA.mapBlock() {
+            case (keys, block) => keys -> chol(bcastAtA).solveRight(block)
+        }
+        Q -> inCoreR
+    }
+
+
+## Usage
+
+The scala `dqrThin(...)` method can easily be called in any Spark or H2O 
application built with the `math-scala` library and the corresponding `Spark` 
or `H2O` engine module as follows:
+
+    import org.apache.mahout.math._
+    import decompositions._
+    import drm._
+    
+    val (drmQ, inCoreR) = dqrThin(drmA)
+
+ 
+## References
+
+[1]: [Mahout Scala and Mahout Spark Bindings for Linear Algebra 
Subroutines](http://mahout.apache.org/users/sparkbindings/ScalaSparkBindings.pdf)
+
+[2]: [Mahout Spark and Scala 
Bindings](http://mahout.apache.org/users/sparkbindings/home.html)
+

http://git-wip-us.apache.org/repos/asf/mahout/blob/9c031452/website/old_site_migration/completed/d-qr.md
----------------------------------------------------------------------
diff --git a/website/old_site_migration/completed/d-qr.md b/website/old_site_migration/completed/d-qr.md
new file mode 100644
index 0000000..5c3e5b8
--- /dev/null
+++ b/website/old_site_migration/completed/d-qr.md
@@ -0,0 +1,59 @@
+---
+layout: default
+title: Distributed Cholesky QR
+theme:
+    name: retro-mahout
+---
+
+# Distributed Cholesky QR
+
+
+## Intro
+
+Mahout has a distributed implementation of QR decomposition for tall thin 
matrices[1].
+
+## Algorithm 
+
+For the classic QR decomposition of the form `\(\mathbf{A}=\mathbf{QR},\mathbf{A}\in\mathbb{R}^{m\times n}\)` a distributed version is fairly easily achieved if `\(\mathbf{A}\)` is tall and thin such that `\(\mathbf{A}^{\top}\mathbf{A}\)` fits in memory, i.e. *m* is large but *n* < ~5000. Under such circumstances, only `\(\mathbf{A}\)` and `\(\mathbf{Q}\)` are distributed matrices and `\(\mathbf{A^{\top}A}\)` and `\(\mathbf{R}\)` are in-core products. We just compute the in-core version of the Cholesky decomposition in the form of `\(\mathbf{LL}^{\top}= \mathbf{A}^{\top}\mathbf{A}\)`.  After that we take `\(\mathbf{R}= \mathbf{L}^{\top}\)` and `\(\mathbf{Q}=\mathbf{A}\left(\mathbf{L}^{\top}\right)^{-1}\)`.  The latter is easily achieved by multiplying each vertical block of `\(\mathbf{A}\)` by `\(\left(\mathbf{L}^{\top}\right)^{-1}\)`.  (There is no actual matrix inversion happening). 
+
+
+
+## Implementation
+
+Mahout `dqrThin(...)` is implemented in the mahout `math-scala` algebraic 
optimizer which translates Mahout's R-like linear algebra operators into a 
physical plan for both Spark and H2O distributed engines.
+
+    def dqrThin[K: ClassTag](drmA: DrmLike[K], checkRankDeficiency: Boolean = true): (DrmLike[K], Matrix) = {
+        if (drmA.ncol > 5000)
+            log.warn("A is too fat. A'A must fit in memory and easily broadcasted.")
+        implicit val ctx = drmA.context
+        val AtA = (drmA.t %*% drmA).checkpoint()
+        val inCoreAtA = AtA.collect // A'A is n x n, small enough to hold in-core
+        val ch = chol(inCoreAtA)
+        val inCoreR = (ch.getL cloned) t
+        if (checkRankDeficiency && !ch.isPositiveDefinite)
+            throw new IllegalArgumentException("R is rank-deficient.")
+        val bcastAtA = drmBroadcast(inCoreAtA) // ship A'A to the workers
+        val Q = drmA.mapBlock() {
+            case (keys, block) => keys -> chol(bcastAtA).solveRight(block)
+        }
+        Q -> inCoreR
+    }
+
+
+## Usage
+
+The scala `dqrThin(...)` method can easily be called in any Spark or H2O 
application built with the `math-scala` library and the corresponding `Spark` 
or `H2O` engine module as follows:
+
+    import org.apache.mahout.math._
+    import decompositions._
+    import drm._
+    
+    val (drmQ, inCoreR) = dqrThin(drmA)
+
+ 
+## References
+
+[1]: [Mahout Scala and Mahout Spark Bindings for Linear Algebra 
Subroutines](http://mahout.apache.org/users/sparkbindings/ScalaSparkBindings.pdf)
+
+[2]: [Mahout Spark and Scala 
Bindings](http://mahout.apache.org/users/sparkbindings/home.html)
+

http://git-wip-us.apache.org/repos/asf/mahout/blob/9c031452/website/old_site_migration/completed/d-spca.md
----------------------------------------------------------------------
diff --git a/website/old_site_migration/completed/d-spca.md 
b/website/old_site_migration/completed/d-spca.md
new file mode 100644
index 0000000..0c1ab1e
--- /dev/null
+++ b/website/old_site_migration/completed/d-spca.md
@@ -0,0 +1,176 @@
+---
+layout: default
+title: Distributed Stochastic PCA
+theme:
+    name: retro-mahout
+---
+
+# Distributed Stochastic PCA
+
+
+## Intro
+
+Mahout has a distributed implementation of Stochastic PCA[1]. This algorithm 
computes the exact equivalent of Mahout's dssvd(`\(\mathbf{A-1\mu^\top}\)`) by 
modifying the `dssvd` algorithm so as to avoid forming 
`\(\mathbf{A-1\mu^\top}\)`, which would densify a sparse input. Thus, it is 
suitable for work with both dense and sparse inputs.
+
+## Algorithm
+
+Given an *m* `\(\times\)` *n* matrix `\(\mathbf{A}\)`, a target rank *k*, 
an oversampling parameter *p*, and a number of power iterations *q*, this 
procedure computes a *k*-rank PCA by 
finding the unknowns in `\(\mathbf{A-1\mu^\top \approx U\Sigma V^\top}\)`:
+
+1. Create seed for random *n* `\(\times\)` *(k+p)* matrix `\(\Omega\)`.
+2. `\(\mathbf{s_\Omega \leftarrow \Omega^\top \mu}\)`.
+3. `\(\mathbf{Y_0 \leftarrow A\Omega − 1 {s_\Omega}^\top, Y \in 
\mathbb{R}^{m\times(k+p)}}\)`.
+4. Column-orthonormalize `\(\mathbf{Y_0} \rightarrow \mathbf{Q}\)` by 
computing thin decomposition `\(\mathbf{Y_0} = \mathbf{QR}\)`. Also, 
`\(\mathbf{Q}\in\mathbb{R}^{m\times(k+p)}, 
\mathbf{R}\in\mathbb{R}^{(k+p)\times(k+p)}\)`.
+5. `\(\mathbf{s_Q \leftarrow Q^\top 1}\)`.
+6. `\(\mathbf{B_0 \leftarrow Q^\top A: B \in \mathbb{R}^{(k+p)\times n}}\)`.
+7. `\(\mathbf{s_B \leftarrow B_0 \mu}\)`.
+8. For *i* in 1..*q* repeat (power iterations):
+    - For *j* in 1..*n* apply `\(\mathbf{(B_{i−1})_{∗j} \leftarrow 
(B_{i−1})_{∗j}−\mu_j s_Q}\)`.
+    - `\(\mathbf{Y_i \leftarrow A{B_{i−1}}^\top−1(s_B−\mu^\top \mu 
s_Q)^\top}\)`.
+    - Column-orthonormalize `\(\mathbf{Y_i} \rightarrow \mathbf{Q}\)` by 
computing thin decomposition `\(\mathbf{Y_i = QR}\)`.
+    - `\(\mathbf{s_Q \leftarrow Q^\top 1}\)`.
+    - `\(\mathbf{B_i \leftarrow Q^\top A}\)`.
+    - `\(\mathbf{s_B \leftarrow B_i \mu}\)`.
+9. Let `\(\mathbf{C \triangleq s_Q {s_B}^\top}\)`. `\(\mathbf{M \leftarrow B_q 
{B_q}^\top − C − C^\top + \mu^\top \mu s_Q {s_Q}^\top}\)`.
+10. Compute an eigensolution of the small symmetric `\(\mathbf{M = \hat{U} 
\Lambda \hat{U}^\top: M \in \mathbb{R}^{(k+p)\times(k+p)}}\)`.
+11. The singular values `\(\Sigma = \Lambda^{\circ 0.5}\)`, or, in other 
words, `\(\mathbf{\sigma_i= \sqrt{\lambda_i}}\)`.
+12. If needed, compute `\(\mathbf{U = Q\hat{U}}\)`.
+13. If needed, compute `\(\mathbf{V = {B_q}^\top \hat{U} \Sigma^{-1}}\)`.
+14. If needed, items converted to the PCA space can be computed as 
`\(\mathbf{U\Sigma}\)`.
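+
+The correction terms in steps 3, 8 and 9 can be checked by expanding the products with the mean-subtracted matrix. The derivation below is a sketch rather than a quotation from [1]; it takes `\(\mathbf{s_B}\)` to be the *(k+p)*-vector `\(\mathbf{B\mu}\)`, which is what `drmBt.t %*% mu` computes in the implementation, and the barred symbols are introduced here only for illustration.
+

```latex
\begin{aligned}
\bar{\mathbf{A}} &\triangleq \mathbf{A} - \mathbf{1}\boldsymbol{\mu}^{\top},
\qquad
\mathbf{s}_{\Omega} = \boldsymbol{\Omega}^{\top}\boldsymbol{\mu},
\quad
\mathbf{s}_{Q} = \mathbf{Q}^{\top}\mathbf{1},
\quad
\mathbf{s}_{B} = \mathbf{B}\boldsymbol{\mu},
\quad
\mathbf{C} = \mathbf{s}_{Q}\mathbf{s}_{B}^{\top} \\[4pt]
\bar{\mathbf{A}}\boldsymbol{\Omega}
  &= \mathbf{A}\boldsymbol{\Omega}
     - \mathbf{1}\left(\boldsymbol{\Omega}^{\top}\boldsymbol{\mu}\right)^{\top}
   = \mathbf{A}\boldsymbol{\Omega} - \mathbf{1}\mathbf{s}_{\Omega}^{\top} \\[4pt]
\bar{\mathbf{B}} &\triangleq \mathbf{Q}^{\top}\bar{\mathbf{A}}
   = \mathbf{Q}^{\top}\mathbf{A}
     - \left(\mathbf{Q}^{\top}\mathbf{1}\right)\boldsymbol{\mu}^{\top}
   = \mathbf{B} - \mathbf{s}_{Q}\boldsymbol{\mu}^{\top} \\[4pt]
\bar{\mathbf{A}}\bar{\mathbf{B}}^{\top}
  &= \mathbf{A}\bar{\mathbf{B}}^{\top}
     - \mathbf{1}\boldsymbol{\mu}^{\top}
       \left(\mathbf{B}^{\top} - \boldsymbol{\mu}\mathbf{s}_{Q}^{\top}\right)
   = \mathbf{A}\bar{\mathbf{B}}^{\top}
     - \mathbf{1}\left(\mathbf{s}_{B}
       - \boldsymbol{\mu}^{\top}\boldsymbol{\mu}\,\mathbf{s}_{Q}\right)^{\top} \\[4pt]
\bar{\mathbf{B}}\bar{\mathbf{B}}^{\top}
  &= \mathbf{B}\mathbf{B}^{\top}
     - \mathbf{B}\boldsymbol{\mu}\mathbf{s}_{Q}^{\top}
     - \mathbf{s}_{Q}\boldsymbol{\mu}^{\top}\mathbf{B}^{\top}
     + \boldsymbol{\mu}^{\top}\boldsymbol{\mu}\,\mathbf{s}_{Q}\mathbf{s}_{Q}^{\top}
   = \mathbf{B}\mathbf{B}^{\top} - \mathbf{C}^{\top} - \mathbf{C}
     + \boldsymbol{\mu}^{\top}\boldsymbol{\mu}\,\mathbf{s}_{Q}\mathbf{s}_{Q}^{\top}
\end{aligned}
```

+
+Every correction is a rank-one term built from small vectors, so the dense matrix `\(\mathbf{A-1\mu^\top}\)` never has to be formed.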
+
+## Implementation
+
+Mahout's `dspca(...)` is implemented in the Mahout `math-scala` algebraic 
optimizer, which translates Mahout's R-like linear algebra operators into a 
physical plan for both Spark and H2O distributed engines.
+
+    def dspca[K](drmA: DrmLike[K], k: Int, p: Int = 15, q: Int = 0): 
+    (DrmLike[K], DrmLike[Int], Vector) = {
+
+        // Some mapBlock() calls need it
+        implicit val ktag =  drmA.keyClassTag
+
+        val drmAcp = drmA.checkpoint()
+        implicit val ctx = drmAcp.context
+
+        val m = drmAcp.nrow
+        val n = drmAcp.ncol
+        assert(k <= (m min n), "k cannot be greater than smaller of m, n.")
+        val pfxed = safeToNonNegInt((m min n) - k min p)
+
+        // Actual decomposition rank
+        val r = k + pfxed
+
+        // Dataset mean
+        val mu = drmAcp.colMeans
+
+        val mtm = mu dot mu
+
+        // We represent Omega by its seed.
+        val omegaSeed = RandomUtils.getRandom().nextInt()
+        val omega = Matrices.symmetricUniformView(n, r, omegaSeed)
+
+        // This is done up front in a single-threaded fashion for now. Even though it
+        // doesn't require any memory beyond what is needed to keep mu around, it could
+        // still be parallelized to the backends for significantly big n and r. TODO
+        val s_o = omega.t %*% mu
+
+        val bcastS_o = drmBroadcast(s_o)
+        val bcastMu = drmBroadcast(mu)
+
+        var drmY = drmAcp.mapBlock(ncol = r) {
+            case (keys, blockA) ⇒
+                val s_o:Vector = bcastS_o
+                val blockY = blockA %*% Matrices.symmetricUniformView(n, r, omegaSeed)
+                for (row ← 0 until blockY.nrow) blockY(row, ::) -= s_o
+                keys → blockY
+        }
+                // Checkpoint Y
+                .checkpoint()
+
+        var drmQ = dqrThin(drmY, checkRankDeficiency = false)._1.checkpoint()
+
+        var s_q = drmQ.colSums()
+        var bcastVarS_q = drmBroadcast(s_q)
+
+        // This actually should be optimized as identically partitioned map-side A'B
+        // since A and Q should still be identically partitioned.
+        var drmBt = (drmAcp.t %*% drmQ).checkpoint()
+
+        var s_b = (drmBt.t %*% mu).collect(::, 0)
+        var bcastVarS_b = drmBroadcast(s_b)
+
+        for (i ← 0 until q) {
+
+            // These closures don't seem to live well with outside-scope vars. This doesn't
+            // record closure attributes correctly. So we create an additional set of vals
+            // for broadcast vars to properly create readonly closure attributes in this
+            // very scope.
+            val bcastS_q = bcastVarS_q
+            val bcastMuInner = bcastMu
+
+            // Fix Bt as B' -= mu cross s_q
+            drmBt = drmBt.mapBlock() {
+                case (keys, block) ⇒
+                    val s_q: Vector = bcastS_q
+                    val mu: Vector = bcastMuInner
+                    keys.zipWithIndex.foreach {
+                        case (key, idx) ⇒ block(idx, ::) -= s_q * mu(key)
+                    }
+                    keys → block
+            }
+
+            drmY.uncache()
+            drmQ.uncache()
+
+            val bCastSt_b = drmBroadcast(s_b -=: mtm * s_q)
+
+            drmY = (drmAcp %*% drmBt)
+                // Fix Y by subtracting st_b from each row of the AB'
+                .mapBlock() {
+                case (keys, block) ⇒
+                    val st_b: Vector = bCastSt_b
+                    block := { (_, c, v) ⇒ v - st_b(c) }
+                    keys → block
+            }
+            // Checkpoint Y
+            .checkpoint()
+
+            drmQ = dqrThin(drmY, checkRankDeficiency = false)._1.checkpoint()
+
+            s_q = drmQ.colSums()
+            bcastVarS_q = drmBroadcast(s_q)
+
+            // This, on the other hand, should be an inner-join-and-map A'B optimization
+            // since A and Q_i are not identically partitioned anymore.
+            drmBt = (drmAcp.t %*% drmQ).checkpoint()
+
+            s_b = (drmBt.t %*% mu).collect(::, 0)
+            bcastVarS_b = drmBroadcast(s_b)
+        }
+
+        val c = s_q cross s_b
+        val inCoreBBt = (drmBt.t %*% drmBt).checkpoint(CacheHint.NONE).collect -=:
+            c -=: c.t +=: mtm *=: (s_q cross s_q)
+        val (inCoreUHat, d) = eigen(inCoreBBt)
+        val s = d.sqrt
+
+        // Since neither drmU nor drmV are actually computed until actually used, we don't
+        // need the flags instructing compute (or not compute) either of the U,V outputs
+        // anymore. Neat, isn't it?
+        val drmU = drmQ %*% inCoreUHat
+        val drmV = drmBt %*% (inCoreUHat %*% diagv(1 / s))
+
+        (drmU(::, 0 until k), drmV(::, 0 until k), s(0 until k))
+    }
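+
+The expression `(m min n) - k min p` in the code above is easy to misread. Scala gives alphanumeric operators such as `min` the lowest precedence, so it groups as `((m min n) - k) min p`: the oversampling `p` is capped so that the decomposition rank `k + p` never exceeds the smaller dimension of the input. The sketch below restates that rule in plain Scala; `RankSketch` and `effectiveRank` are hypothetical names, and ordinary conversions stand in for Mahout's `safeToNonNegInt`.
+

```scala
// Hypothetical illustration of how the decomposition rank r is derived.
object RankSketch {
  /** Effective decomposition rank r = k + min(p, min(m, n) - k). */
  def effectiveRank(m: Long, n: Int, k: Int, p: Int): Int = {
    val smaller = math.min(m, n.toLong)
    require(k <= smaller, "k cannot be greater than smaller of m, n.")
    // Room left for oversampling once k columns are taken.
    val room = (smaller - k).toInt
    k + math.min(room, p)
  }
}
```

+
+For example, with a 1000 `\(\times\)` 50 input, `k = 40` and the default `p = 15`, only 10 oversampling columns fit, so the decomposition runs at rank 50 rather than 55.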
+
+## Usage
+
+The scala `dspca(...)` method can easily be called in any Spark, Flink, or H2O 
application built with the `math-scala` library and the corresponding `Spark`, 
`Flink`, or `H2O` engine module as follows:
+
+    import org.apache.mahout.math._
+    import decompositions._
+    import drm._
+    
+    val (drmU, drmV, s) = dspca(drmA, k=200, q=1)
+
+Note that the parameter `q` (the number of power iterations) is optional and its default value is zero.
+ 
+## References
+
+[1]: Lyubimov and Palumbo, ["Apache Mahout: Beyond MapReduce; Distributed 
Algorithm 
Design"](https://www.amazon.com/Apache-Mahout-MapReduce-Dmitriy-Lyubimov/dp/1523775785)

http://git-wip-us.apache.org/repos/asf/mahout/blob/9c031452/website/old_site_migration/completed/d-ssvd.md
----------------------------------------------------------------------
diff --git a/website/old_site_migration/completed/d-ssvd.md 
b/website/old_site_migration/completed/d-ssvd.md
new file mode 100644
index 0000000..8063fa3
--- /dev/null
+++ b/website/old_site_migration/completed/d-ssvd.md
@@ -0,0 +1,143 @@
+---
+layout: default
+title: Distributed Stochastic Singular Value Decomposition
+theme:
+    name: retro-mahout
+---
+
+# Distributed Stochastic Singular Value Decomposition
+
+
+## Intro
+
+Mahout has a distributed implementation of Stochastic Singular Value 
Decomposition [1] using the parallelization strategy comprehensively defined in 
Nathan Halko's dissertation ["Randomized methods for computing low-rank 
approximations of 
matrices"](http://amath.colorado.edu/faculty/martinss/Pubs/2012_halko_dissertation.pdf)
 [2].
+
+## Modified SSVD Algorithm
+
+Given an `\(m\times n\)`
+matrix `\(\mathbf{A}\)`, a target rank `\(k\in\mathbb{N}_{1}\)`
+, an oversampling parameter `\(p\in\mathbb{N}_{1}\)`, 
+and the number of additional power iterations `\(q\in\mathbb{N}_{0}\)`, 
+this procedure computes an `\(m\times\left(k+p\right)\)`
+SVD `\(\mathbf{A\approx U}\boldsymbol{\Sigma}\mathbf{V}^{\top}\)`:
+
+  1. Create seed for random `\(n\times\left(k+p\right)\)`
+  matrix `\(\boldsymbol{\Omega}\)`. The seed defines matrix 
`\(\mathbf{\Omega}\)`
+  using Gaussian unit vectors per one of the suggestions in [Halko, Martinsson, 
Tropp].
+
+  2. 
`\(\mathbf{Y=A\boldsymbol{\Omega}},\,\mathbf{Y}\in\mathbb{R}^{m\times\left(k+p\right)}\)`
+ 
+  3. Column-orthonormalize `\(\mathbf{Y}\rightarrow\mathbf{Q}\)`
+  by computing thin decomposition `\(\mathbf{Y}=\mathbf{Q}\mathbf{R}\)`.
+  Also, 
`\(\mathbf{Q}\in\mathbb{R}^{m\times\left(k+p\right)},\,\mathbf{R}\in\mathbb{R}^{\left(k+p\right)\times\left(k+p\right)}\)`;
 denoted as `\(\mathbf{Q}=\mbox{qr}\left(\mathbf{Y}\right).\mathbf{Q}\)`
+
+  4. 
`\(\mathbf{B}_{0}=\mathbf{Q}^{\top}\mathbf{A}:\,\,\mathbf{B}\in\mathbb{R}^{\left(k+p\right)\times
 n}\)`.
+ 
+  5. If `\(q>0\)`
+  repeat: for `\(i=1..q\)`: 
+  
`\(\mathbf{B}_{i}^{\top}=\mathbf{A}^{\top}\mbox{qr}\left(\mathbf{A}\mathbf{B}_{i-1}^{\top}\right).\mathbf{Q}\)`
+  (power iterations step).
+
+  6. Compute Eigensolution of a small Hermitian 
`\(\mathbf{B}_{q}\mathbf{B}_{q}^{\top}=\mathbf{\hat{U}}\boldsymbol{\Lambda}\mathbf{\hat{U}}^{\top}\)`,
+  
`\(\mathbf{B}_{q}\mathbf{B}_{q}^{\top}\in\mathbb{R}^{\left(k+p\right)\times\left(k+p\right)}\)`.
+ 
+  7. Singular values 
`\(\mathbf{\boldsymbol{\Sigma}}=\boldsymbol{\Lambda}^{0.5}\)`,
+  or, in other words, `\(\sigma_{i}=\sqrt{\lambda_{i}}\)`.
+ 
+  8. If needed, compute `\(\mathbf{U}=\mathbf{Q}\hat{\mathbf{U}}\)`.
+
+  9. If needed, compute 
`\(\mathbf{V}=\mathbf{B}_{q}^{\top}\hat{\mathbf{U}}\boldsymbol{\Sigma}^{-1}\)`.
+Another way is 
`\(\mathbf{V}=\mathbf{A}^{\top}\mathbf{U}\boldsymbol{\Sigma}^{-1}\)`.
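+
+Steps 6 through 9 follow from the SVD of the small matrix `\(\mathbf{B}_{q}\)`. The short derivation below is a sketch added for illustration; it assumes `\(\mathbf{Q}\)` has orthonormal columns and that `\(\mathbf{Q}\mathbf{Q}^{\top}\mathbf{A}\)` is an adequate approximation of `\(\mathbf{A}\)`, which is the premise of the randomized method.
+

```latex
\begin{aligned}
\mathbf{B}_{q} &= \hat{\mathbf{U}}\boldsymbol{\Sigma}\mathbf{V}^{\top}
\;\Longrightarrow\;
\mathbf{B}_{q}\mathbf{B}_{q}^{\top}
  = \hat{\mathbf{U}}\boldsymbol{\Sigma}\mathbf{V}^{\top}\mathbf{V}
    \boldsymbol{\Sigma}\hat{\mathbf{U}}^{\top}
  = \hat{\mathbf{U}}\boldsymbol{\Sigma}^{2}\hat{\mathbf{U}}^{\top}
\;\Longrightarrow\;
\boldsymbol{\Lambda} = \boldsymbol{\Sigma}^{2} \\[4pt]
\mathbf{V} &= \mathbf{B}_{q}^{\top}\hat{\mathbf{U}}\boldsymbol{\Sigma}^{-1} \\[4pt]
\mathbf{A} &\approx \mathbf{Q}\mathbf{Q}^{\top}\mathbf{A}
  = \mathbf{Q}\mathbf{B}_{q}
  = \left(\mathbf{Q}\hat{\mathbf{U}}\right)\boldsymbol{\Sigma}\mathbf{V}^{\top}
  = \mathbf{U}\boldsymbol{\Sigma}\mathbf{V}^{\top}
\end{aligned}
```

+
+Only the `\(\left(k+p\right)\times\left(k+p\right)\)` product `\(\mathbf{B}_{q}\mathbf{B}_{q}^{\top}\)` has to be decomposed in core, which is why the eigensolution in step 6 is cheap.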
+
+
+
+
+## Implementation
+
+Mahout's `dssvd(...)` is implemented in the Mahout `math-scala` algebraic 
optimizer, which translates Mahout's R-like linear algebra operators into a 
physical plan for both Spark and H2O distributed engines.
+
+    def dssvd[K: ClassTag](drmA: DrmLike[K], k: Int, p: Int = 15, q: Int = 0):
+        (DrmLike[K], DrmLike[Int], Vector) = {
+
+        val drmAcp = drmA.checkpoint()
+
+        val m = drmAcp.nrow
+        val n = drmAcp.ncol
+        assert(k <= (m min n), "k cannot be greater than smaller of m, n.")
+        val pfxed = safeToNonNegInt((m min n) - k min p)
+
+        // Actual decomposition rank
+        val r = k + pfxed
+
+        // We represent Omega by its seed.
+        val omegaSeed = RandomUtils.getRandom().nextInt()
+
+        // Compute Y = A*Omega.  
+        var drmY = drmAcp.mapBlock(ncol = r) {
+            case (keys, blockA) =>
+                val blockY = blockA %*% Matrices.symmetricUniformView(n, r, omegaSeed)
+                keys -> blockY
+        }
+
+        var drmQ = dqrThin(drmY.checkpoint())._1
+
+        // Checkpoint Q if last iteration
+        if (q == 0) drmQ = drmQ.checkpoint()
+
+        var drmBt = drmAcp.t %*% drmQ
+        
+        // Checkpoint B' if last iteration
+        if (q == 0) drmBt = drmBt.checkpoint()
+
+        for (i <- 0  until q) {
+            drmY = drmAcp %*% drmBt
+            drmQ = dqrThin(drmY.checkpoint())._1            
+            
+            // Checkpoint Q if last iteration
+            if (i == q - 1) drmQ = drmQ.checkpoint()
+            
+            drmBt = drmAcp.t %*% drmQ
+            
+            // Checkpoint B' if last iteration
+            if (i == q - 1) drmBt = drmBt.checkpoint()
+        }
+
+        val (inCoreUHat, d) = eigen(drmBt.t %*% drmBt)
+        val s = d.sqrt
+
+        // Since neither drmU nor drmV are actually computed until actually used,
+        // we don't need the flags instructing compute (or not compute) either
+        // of the U,V outputs
+        val drmU = drmQ %*% inCoreUHat
+        val drmV = drmBt %*% (inCoreUHat %*%: diagv(1 /: s))
+
+        (drmU(::, 0 until k), drmV(::, 0 until k), s(0 until k))
+    }
+
+Note: As a side effect of checkpointing, U and V values are returned as 
logical operators (i.e. they are neither checkpointed nor computed).  Therefore 
there is no physical work actually done to compute `\(\mathbf{U}\)` or 
`\(\mathbf{V}\)` until they are used in a subsequent expression.
+
+
+## Usage
+
+The scala `dssvd(...)` method can easily be called in any Spark or H2O 
application built with the `math-scala` library and the corresponding `Spark` 
or `H2O` engine module as follows:
+
+    import org.apache.mahout.math._
+    import decompositions._
+    import drm._
+    
+    
+    val (drmU, drmV, s) = dssvd(drmA, k = 40, q = 1)
+
+ 
+## References
+
+[1]: [Mahout Scala and Mahout Spark Bindings for Linear Algebra 
Subroutines](http://mahout.apache.org/users/sparkbindings/ScalaSparkBindings.pdf)
+
+[2]: [Randomized methods for computing low-rank
+approximations of 
matrices](http://amath.colorado.edu/faculty/martinss/Pubs/2012_halko_dissertation.pdf)
+
+[3]: [Halko, Martinsson, Tropp](http://arxiv.org/abs/0909.4061)
+
+[4]: [Mahout Spark and Scala 
Bindings](http://mahout.apache.org/users/sparkbindings/home.html)
+
+
+

http://git-wip-us.apache.org/repos/asf/mahout/blob/9c031452/website/old_site_migration/completed/downloads.md
----------------------------------------------------------------------
diff --git a/website/old_site_migration/completed/downloads.md 
b/website/old_site_migration/completed/downloads.md
new file mode 100644
index 0000000..0822d19
--- /dev/null
+++ b/website/old_site_migration/completed/downloads.md
@@ -0,0 +1,68 @@
+---
+layout: default
+title: Downloads
+theme:
+    name: retro-mahout
+---
+
+<a name="Downloads-OfficialRelease"></a>
+# Official Release
+Apache Mahout is an official Apache project and thus available from any of
+the Apache mirrors. The latest Mahout release is available for download at: 
+
+* [Download Latest](http://www.apache.org/dyn/closer.cgi/mahout/)
+* [Release Archive](http://archive.apache.org/dist/mahout/)
+
+
+# Source code for the current snapshot
+
+Apache Mahout is mirrored to [GitHub](https://github.com/apache/mahout). To 
get all the source:
+
+    git clone https://github.com/apache/mahout.git mahout
+   
+# Environment
+
+Whether you are using Mahout's Shell, running command line jobs or using it as 
a library to build your own apps, 
+you'll need to set up several environment variables. 
+Edit your environment in ```~/.bash_profile``` for Mac or ```~/.bashrc``` for 
many Linux distributions. Add the following:
+
+    export MAHOUT_HOME=/path/to/mahout
+    export MAHOUT_LOCAL=true # for running standalone on your dev machine, 
+    # unset MAHOUT_LOCAL for running on a cluster 
+
+If you are running on Spark you will also need $SPARK_HOME.
+
+Make sure to have $JAVA_HOME set as well.
+
+# Using Mahout as a Library
+
+Running any application that uses Mahout will require installing a binary or 
source version and setting the environment.  
+Then add the appropriate setting to your pom.xml or build.sbt following the 
template below.
+ 
+If you only need the math part of Mahout:
+
+    <dependency>
+        <groupId>org.apache.mahout</groupId>
+        <artifactId>mahout-math</artifactId>
+        <version>${mahout.version}</version>
+    </dependency>
+
+In case you would like to use some of our integration tooling (e.g. for 
generating vectors from Lucene):
+
+    <dependency>
+        <groupId>org.apache.mahout</groupId>
+        <artifactId>mahout-hdfs</artifactId>
+        <version>${mahout.version}</version>
+    </dependency>
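+
+For a `build.sbt`, a roughly equivalent fragment might look like the sketch below. The `mahoutVersion` value is only a placeholder standing in for `${mahout.version}` and should be replaced with the release you actually installed.
+

```scala
// build.sbt fragment (sketch); mahoutVersion is a placeholder, not a recommendation.
val mahoutVersion = "0.10.0"

libraryDependencies ++= Seq(
  // The math part of Mahout
  "org.apache.mahout" % "mahout-math" % mahoutVersion,
  // Integration tooling
  "org.apache.mahout" % "mahout-hdfs" % mahoutVersion
)
```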
+
+In case you are using Ivy, Gradle, Buildr, Grape or SBT you might want to 
directly head over to the official [Maven Repository 
search](http://mvnrepository.com/artifact/org.apache.mahout/mahout-core).
+
+
+<a name="Downloads-FutureReleases"></a>
+# Future Releases
+
+Official releases are usually created when the developers feel there are
+sufficient changes, improvements and bug fixes to warrant a release. Watch
+the <a 
href="https://mahout.apache.org/general/mailing-lists,-irc-and-archives.html">Mailing
 lists</a>
+ for latest release discussions and check the GitHub repo.
+

http://git-wip-us.apache.org/repos/asf/mahout/blob/9c031452/website/old_site_migration/completed/how-to-build-an-app.md
----------------------------------------------------------------------
diff --git a/website/old_site_migration/completed/how-to-build-an-app.md 
b/website/old_site_migration/completed/how-to-build-an-app.md
new file mode 100644
index 0000000..89fc575
--- /dev/null
+++ b/website/old_site_migration/completed/how-to-build-an-app.md
@@ -0,0 +1,257 @@
+---
+layout: default
+title: 
+theme:
+   name: retro-mahout
+---
+
+# How to create an App using Mahout
+
+This is an example of how to create a simple app using Mahout as a library. 
The source is available on GitHub in the [3-input-cooc 
project](https://github.com/pferrel/3-input-cooc) with more explanation about 
what it does (it has to do with collaborative filtering). For this tutorial we'll 
concentrate on the app rather than the data science.
+
+The app reads in three user-item interaction types and creates indicators for 
them using cooccurrence and cross-cooccurrence. The indicators will be written 
to text files in a format ready for search engine indexing in a search-engine-based 
recommender.
+
+##Setup
+In order to build and run the CooccurrenceDriver you need to install the 
following:
+
+* Install the Java 7 JDK from Oracle. Mac users look here: [Java SE 
Development Kit 
7u72](http://www.oracle.com/technetwork/java/javase/downloads/jdk7-downloads-1880260.html).
+* Install sbt (simple build tool) 0.13.x for 
[Mac](http://www.scala-sbt.org/release/tutorial/Installing-sbt-on-Mac.html), 
[Linux](http://www.scala-sbt.org/release/tutorial/Installing-sbt-on-Linux.html) 
or [manual 
installation](http://www.scala-sbt.org/release/tutorial/Manual-Installation.html).
+* Install [Spark 
1.1.1](https://spark.apache.org/docs/1.1.1/spark-standalone.html). Don't forget 
to set up SPARK_HOME.
+* Install [Mahout 0.10.0](http://mahout.apache.org/general/downloads.html). 
Don't forget to set up MAHOUT_HOME and MAHOUT_LOCAL.
+
+Why install if you are only using them as a library? Certain binaries and 
scripts are required by the libraries to get information about the environment 
like discovering where jars are located.
+
+Spark requires a set of jars on the classpath for the client side part of an 
app and another set of jars must be passed to the Spark Context for running 
distributed code. The example should discover all the necessary classes 
automatically.
+
+##Application
+Using Mahout as a library in an application will require a little Scala code. 
Scala has an App trait so we'll create an object, which inherits from ```App```:
+
+
+    object CooccurrenceDriver extends App {
+    }
+    
+
+This will look a little different than Java since ```App``` does delayed 
initialization, which causes the body to be executed when the App is launched, 
just as in Java you would create a main method.
+
+Before we can execute something on Spark we'll need to create a context. We 
could use raw Spark calls here but default values are set up for a Mahout 
context by using the Mahout helper function.
+
+    implicit val mc = mahoutSparkContext(masterUrl = "local", 
+      appName = "CooccurrenceDriver")
+    
+We need to read in three files containing different interaction types. The 
files will each be read into a Mahout IndexedDataset. This allows us to 
preserve application-specific user and item IDs throughout the calculations.
+
+For example, here is data/purchase.csv:
+
+    u1,iphone
+    u1,ipad
+    u2,nexus
+    u2,galaxy
+    u3,surface
+    u4,iphone
+    u4,galaxy
+
+Mahout has a helper function that reads text-delimited files: 
```SparkEngine.indexedDatasetDFSReadElements```. The function reads single-element 
tuples (user-id,item-id) in a distributed way to create the IndexedDataset. 
Distributed Row Matrices (DRM) and Vectors are important data types supplied by 
Mahout, and IndexedDataset is like a very lightweight Dataframe in R: it wraps a 
DRM with HashBiMaps for row and column IDs. 
+
+One important thing to note about this example is that we read in all datasets 
before we adjust the number of rows in them to match the total number of users 
in the data. This is so the math works out [(A'A, A'B, 
A'C)](http://mahout.apache.org/users/algorithms/intro-cooccurrence-spark.html): 
even if some users took one action but not another, there must be the same 
number of rows in all matrices.
+
+    /**
+     * Read files of element tuples and create IndexedDatasets, one per action.
+     * These share a userID BiMap but have their own itemID BiMaps.
+     */
+    def readActions(actionInput: Array[(String, String)]): Array[(String, IndexedDataset)] = {
+      var actions = Array[(String, IndexedDataset)]()
+
+      val userDictionary: BiMap[String, Int] = HashBiMap.create()
+
+      // The first action named in the sequence is the "primary" action and 
+      // begins to fill up the user dictionary
+      for ( actionDescription <- actionInput ) {// grab the path to actions
+        val action: IndexedDataset = SparkEngine.indexedDatasetDFSReadElements(
+          actionDescription._2,
+          schema = DefaultIndexedDatasetElementReadSchema,
+          existingRowIDs = userDictionary)
+        userDictionary.putAll(action.rowIDs)
+        // put the name in the tuple with the indexedDataset
+        actions = actions :+ (actionDescription._1, action) 
+      }
+
+      // After all actions are read in, the userDictionary will contain every user seen,
+      // even if they may not have taken all actions. Now we adjust the row rank of
+      // all IndexedDatasets to have this number of rows.
+      // Note: this is very important or the cooccurrence calc may fail
+      val numUsers = userDictionary.size() // one more than the cardinality
+
+      val resizedNameActionPairs = actions.map { a =>
+        // resize the matrix, in effect by adding empty rows
+        val resizedMatrix = a._2.create(a._2.matrix, userDictionary, a._2.columnIDs)
+          .newRowCardinality(numUsers)
+        (a._1, resizedMatrix) // return the Tuple of (name, IndexedDataset)
+      }
+      resizedNameActionPairs // return the array of Tuples
+    }
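+
+To see why the resize matters, the sketch below mimics the shared user dictionary with plain Scala collections. It is an illustration only: `UserDictionarySketch` and its methods are hypothetical stand-ins for `IndexedDataset` and the Guava `BiMap`, and the row indices are simply assigned in first-seen order.
+

```scala
// Hypothetical stand-in for the shared userID dictionary built by readActions.
object UserDictionarySketch {
  /** Parses "user,item" lines into (user, item) pairs, skipping malformed lines. */
  def parse(lines: Seq[String]): Seq[(String, String)] =
    lines.map(_.split(",").map(_.trim)).collect { case Array(u, i) => (u, i) }

  /** Assigns each distinct user a row index in first-seen order across all actions. */
  def userDictionary(actions: Seq[Seq[(String, String)]]): Map[String, Int] =
    actions.flatten.map(_._1).distinct.zipWithIndex.toMap

  /** Row indices that a single action actually populates. */
  def populatedRows(action: Seq[(String, String)], users: Map[String, Int]): Set[Int] =
    action.map { case (u, _) => users(u) }.toSet
}
```

+
+With the `data/purchase.csv` lines shown earlier plus a second action that mentions one additional user, the dictionary holds five users while the purchase action populates only four rows. Every matrix must therefore be given five rows before products such as A'B are computed.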
+
+
+Now that we have the data read in we can perform the cooccurrence calculation.
+
+    // actions.map creates an array of just the IndexedDatasets
+    val indicatorMatrices = SimilarityAnalysis.cooccurrencesIDSs(
+      actions.map(a => a._2)) 
+
+All we need to do now is write the indicators.
+
+    // zip a pair of arrays into an array of pairs, reattaching the action names
+    val indicatorDescriptions = actions.map(a => a._1).zip(indicatorMatrices)
+    writeIndicators(indicatorDescriptions)
+
+
+The ```writeIndicators``` method uses the default write function 
```dfsWrite```.
+
+    /**
+     * Write indicatorMatrices to the output dir in the default format
+     * for indexing by a search engine.
+     */
+    def writeIndicators( indicators: Array[(String, IndexedDataset)]) = {
+      for (indicator <- indicators ) {
+        // create a name based on the type of indicator
+        val indicatorDir = OutputPath + indicator._1
+        indicator._2.dfsWrite(
+          indicatorDir,
+          // Schema tells the writer to omit LLR strengths 
+          // and format for search engine indexing
+          IndexedDatasetWriteBooleanSchema) 
+      }
+    }
+ 
+
+See the GitHub project for the full source. Now we create a build.sbt to build 
the example. 
+
+    name := "cooccurrence-driver"
+
+    organization := "com.finderbots"
+
+    version := "0.1"
+
+    scalaVersion := "2.10.4"
+
+    val sparkVersion = "1.1.1"
+
+    libraryDependencies ++= Seq(
+      "log4j" % "log4j" % "1.2.17",
+      // Mahout's Spark code
+      "commons-io" % "commons-io" % "2.4",
+      "org.apache.mahout" % "mahout-math-scala_2.10" % "0.10.0",
+      "org.apache.mahout" % "mahout-spark_2.10" % "0.10.0",
+      "org.apache.mahout" % "mahout-math" % "0.10.0",
+      "org.apache.mahout" % "mahout-hdfs" % "0.10.0",
+      // Google collections, AKA Guava
+      "com.google.guava" % "guava" % "16.0")
+
+    resolvers += "typesafe repo" at "http://repo.typesafe.com/typesafe/releases/"
+
+    resolvers += Resolver.mavenLocal
+
+    packSettings
+
+    packMain := Map(
+      "cooc" -> "CooccurrenceDriver")
+
+
+##Build
+Build the example from the project's root folder:
+
+    $ sbt pack
+
+This will automatically set up some launcher scripts for the driver. To run, 
execute:
+
+    $ target/pack/bin/cooc
+    
+The driver will execute in Spark standalone mode and put the data in 
/path/to/3-input-cooc/data/indicators/*indicator-type*
+
+##Using a Debugger
+To build and run this example in a debugger like IntelliJ IDEA, install IDEA from 
the IntelliJ site and add the Scala plugin.
+
+Open IDEA and go to the menu File->New->Project from existing 
sources->SBT->/path/to/3-input-cooc. This will create an IDEA project from 
```build.sbt``` in the root directory.
+
+At this point you may create a "Debug Configuration" to run. In the menu 
choose Run->Edit Configurations. Under "Default" choose "Application". In the 
dialog hit the ellipsis button "..." to the right of "Environment Variables" and 
fill in your versions of JAVA_HOME, SPARK_HOME, and MAHOUT_HOME. In the 
configuration editor under "Use classpath from" choose the root-3-input-cooc 
module. 
+
+![image](http://mahout.apache.org/images/debug-config.png)
+
+Now choose "Application" in the left pane and hit the plus sign "+". Give the 
config a name and hit the ellipsis button to the right of the "Main class" field 
as shown.
+
+![image](http://mahout.apache.org/images/debug-config-2.png)
+
+
+After setting breakpoints you are now ready to debug the configuration. Go to 
the Run->Debug... menu and pick your configuration. This will execute using a 
local standalone instance of Spark.
+
+##The Mahout Shell
+
+For small script-like apps you may wish to use the Mahout shell. It is a Scala 
REPL type interactive shell built on the Spark shell with Mahout-Samsara 
extensions.
+
+To make the CooccurrenceDriver.scala into a script make the following changes:
+
+* You won't need the context, since it is created when the shell is launched, 
so comment that line out.
+* Replace the logger.info lines with println.
+* Remove the package info since it's not needed. This will produce the file in 
```path/to/3-input-cooc/bin/CooccurrenceDriver.mscala```. 
+
+Note the extension ```.mscala```, which indicates we are using Mahout's Scala 
extensions for math, otherwise known as 
[Mahout-Samsara](http://mahout.apache.org/users/environment/out-of-core-reference.html).
+
+To run the code, make sure the output does not exist already:
+
+    $ rm -r /path/to/3-input-cooc/data/indicators
+    
+Launch the Mahout + Spark shell:
+
+    $ mahout spark-shell
+    
+You'll see the Mahout splash:
+
+    MAHOUT_LOCAL is set, so we don't add HADOOP_CONF_DIR to classpath.
+
+                         _                 _
+             _ __ ___   __ _| |__   ___  _   _| |_
+            | '_ ` _ \ / _` | '_ \ / _ \| | | | __|
+            | | | | | | (_| | | | | (_) | |_| | |_
+            |_| |_| |_|\__,_|_| |_|\___/ \__,_|\__|  version 0.10.0
+
+      
+    Using Scala version 2.10.4 (Java HotSpot(TM) 64-Bit Server VM, Java 
1.7.0_72)
+    Type in expressions to have them evaluated.
+    Type :help for more information.
+    15/04/26 09:30:48 WARN NativeCodeLoader: Unable to load native-hadoop 
library for your platform... using builtin-java classes where applicable
+    Created spark context..
+    Mahout distributed context is available as "implicit val sdc".
+    mahout> 
+
+To load the driver type:
+
+    mahout> :load /path/to/3-input-cooc/bin/CooccurrenceDriver.mscala
+    Loading ./bin/CooccurrenceDriver.mscala...
+    import com.google.common.collect.{HashBiMap, BiMap}
+    import org.apache.log4j.Logger
+    import org.apache.mahout.math.cf.SimilarityAnalysis
+    import org.apache.mahout.math.indexeddataset._
+    import org.apache.mahout.sparkbindings._
+    import scala.collection.immutable.HashMap
+    defined module CooccurrenceDriver
+    mahout> 
+
+To run the driver type:
+
+    mahout> CooccurrenceDriver.main(args = Array(""))
+    
+You'll get some stats printed:
+
+    Total number of users for all actions = 5
+    purchase indicator matrix:
+      Number of rows for matrix = 4
+      Number of columns for matrix = 5
+      Number of rows after resize = 5
+    view indicator matrix:
+      Number of rows for matrix = 4
+      Number of columns for matrix = 5
+      Number of rows after resize = 5
+    category indicator matrix:
+      Number of rows for matrix = 5
+      Number of columns for matrix = 7
+      Number of rows after resize = 5
+    
+If you look in ```path/to/3-input-cooc/data/indicators``` you should find 
folders containing the indicator matrices.

http://git-wip-us.apache.org/repos/asf/mahout/blob/9c031452/website/old_site_migration/completed/in-core-reference.md
----------------------------------------------------------------------
diff --git a/website/old_site_migration/completed/in-core-reference.md 
b/website/old_site_migration/completed/in-core-reference.md
new file mode 100644
index 0000000..109ca1d
--- /dev/null
+++ b/website/old_site_migration/completed/in-core-reference.md
@@ -0,0 +1,304 @@
+---
+layout: default
+title: 
+theme:
+   name: retro-mahout
+---
+
+## Mahout-Samsara's In-Core Linear Algebra DSL Reference
+
+#### Imports
+
+The following imports are used to enable Mahout-Samsara's Scala DSL bindings 
for in-core Linear Algebra:
+
+    import org.apache.mahout.math._
+    import scalabindings._
+    import RLikeOps._
+    
+#### Inline initialization
+
+Dense vectors:
+
+    val denseVec1: Vector = (1.0, 1.1, 1.2)
+    val denseVec2 = dvec(1, 0, 1, 1, 1, 2)
+
+Sparse vectors:
+
+    val sparseVec1: Vector = (5 -> 1.0) :: (10 -> 2.0) :: Nil
+    val sparseVec2 = svec((5 -> 1.0) :: (10 -> 2.0) :: Nil)
+
+    // to create a vector with specific cardinality
+    val sparseVec3 = svec((5 -> 1.0) :: (10 -> 2.0) :: Nil, cardinality = 20)
+    
+Inline matrix initialization, either sparse or dense, is always done row-wise.
+
+Dense matrices:
+
+    val A = dense((1, 2, 3), (3, 4, 5))
+    
+Sparse matrices:
+
+    val A = sparse(
+              (1, 3) :: Nil,
+              (0, 2) :: (1, 2.5) :: Nil
+            )
+
+Diagonal matrix with constant diagonal elements:
+
+    diag(3.5, 10)
+
+Diagonal matrix with main diagonal backed by a vector:
+
+    diagv((1, 2, 3, 4, 5))
+    
+Identity matrix:
+
+    eye(10)
+    
+#### Slicing and Assigning
+
+Getting a vector element:
+
+    val d = vec(5)
+
+Setting a vector element:
+    
+    vec(5) = 3.0
+    
+Getting a matrix element:
+
+    val d = M(3, 5)
+    
+Setting a matrix element:
+
+    M(3,5) = 3.0
+    
+Getting a matrix row or column:
+
+    val rowVec = M(3, ::)
+    val colVec = M(::, 3)
+    
+Setting a matrix row or column via vector assignment:
+
+    M(3, ::) := (1, 2, 3)
+    M(::, 3) := (1, 2, 3)
+    
+Setting a subslice of a matrix row or column:
+
+    a(0, 0 to 1) = (3, 5)
+   
+Setting a subslice of a matrix row or column via vector assignment:
+
+    a(0, 0 to 1) := (3, 5)
+   
+Getting a contiguous block of a matrix as a matrix:
+
+    val B = A(2 to 3, 3 to 4)
+   
+Assigning a contiguous block to a matrix:
+
+    A(0 to 1, 1 to 2) = dense((3, 2), (3, 3))
+   
+Assigning a contiguous block to a matrix using the matrix assignment operator:
+
+    A(0 to 1, 1 to 2) := dense((3, 2), (3, 3))
+   
+Assignment operator used for copying between vectors or matrices:
+
+    vec1 := vec2
+    M1 := M2
+   
+Assignment operator with a function literal for a matrix:
+
+    M := ((row, col, x) => if (row == col) 1 else 0)
+    
+Assignment operator with a function literal for a vector:
+
+    vec := ((index, x) => sqrt(x))
+    
+#### BLAS-like operations
+
+Addition and subtraction, with either a vector or a numeric operand:
+
+    a + b
+    a - b
+    a + 5.0
+    a - 5.0
+    
+Hadamard (elementwise) product, with vector, matrix, or numeric operands:
+
+    a * b
+    a * 0.5
+
+Operations with assignment:
+
+    a += b
+    a -= b
+    a += 5.0
+    a -= 5.0
+    a *= b
+    a *= 5
+   
+*Some nuanced rules*: 
+
+1/x in R (where x is a vector or a matrix) is the elementwise inverse. In Scala it would be expressed as:
+
+    val xInv = 1 /: x
+
+and R's 5.0 - x would be:
+   
+    val x1 = 5.0 -: x
+    
+*note: All assignment operations, including :=, return the assignee, just like in C++*:
+
+    a -=: b
+
+assigns **a - b** to **b** (in-place) and returns **b**. Similarly for **a /=: b** or **1 /=: v**.
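+
+The `/:` and `-:` forms above work because Scala treats any operator whose name ends in a colon as right-associative, so the method is invoked on the right-hand operand. A minimal, self-contained sketch of that rule follows; `Vec` and `RightAssocDemo` are hypothetical stand-ins written for illustration and are not Mahout classes:
+
+```scala
+// Hypothetical stand-in for a vector type; NOT Mahout's Vector.
+final case class Vec(values: List[Double]) {
+  // Names ending in ':' bind to the right operand: `s /: v` is `v./:(s)`.
+  def /:(scalar: Double): Vec = Vec(values.map(scalar / _))
+  def -:(scalar: Double): Vec = Vec(values.map(scalar - _))
+}
+
+object RightAssocDemo {
+  val x = Vec(List(1.0, 2.0, 4.0))
+  val xInv = 1.0 /: x // elementwise 1 / x
+  val x1 = 5.0 -: x   // elementwise 5 - x
+
+  def main(args: Array[String]): Unit = {
+    println(xInv) // Vec(List(1.0, 0.5, 0.25))
+    println(x1)   // Vec(List(4.0, 3.0, 1.0))
+  }
+}
+```
+
+Writing the scalar on the left keeps the expression in the same order as the R original, which is presumably why the DSL adopts this spelling.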
+    
+
+Dot product:
+
+    a dot b
+    
+Matrix and vector equivalency (or non-equivalency). **Dangerous: exact equivalence is rarely useful; it is better to use norm comparisons that allow for small errors.**
+    
+    a === b
+    a !== b
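+
+The norm-based comparison recommended above can be sketched in plain Scala. `NormCompareDemo`, `diffNorm`, and the tolerance value are assumptions made for illustration, not part of the DSL; with the DSL, the analogous check would presumably compose operators listed in this reference, along the lines of `(a - b).norm < tolerance`:
+
+```scala
+object NormCompareDemo {
+  // Second (Euclidean) norm of the elementwise difference of two sequences.
+  def diffNorm(a: Seq[Double], b: Seq[Double]): Double = {
+    require(a.length == b.length, "operands must have the same cardinality")
+    math.sqrt(a.zip(b).map { case (x, y) => (x - y) * (x - y) }.sum)
+  }
+
+  val a = Seq(0.1 + 0.2, 1.0) // 0.1 + 0.2 is not exactly 0.3 in floating point
+  val b = Seq(0.3, 1.0)
+  val tolerance = 1e-9        // assumed allowance; choose per application
+
+  val exactlyEqual = a == b                    // false: rounding error
+  val closeEnough = diffNorm(a, b) < tolerance // true
+
+  def main(args: Array[String]): Unit =
+    println(s"exact: $exactlyEqual, within tolerance: $closeEnough")
+}
+```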
+    
+Matrix multiply:    
+
+    a %*% b
+    
+Optimized Right Multiply with a diagonal matrix: 
+
+    diag(5, 5) :%*% b
+   
+Optimized Left Multiply with a diagonal matrix:
+
+    A %*%: diag(5, 5)
+
+Second norm of a vector or matrix:
+
+    a.norm
+    
+Transpose:
+
+    val Mt = M.t
+    
+*note: Transposition is currently handled via a view, i.e. updating a transposed matrix updates the original.* Consequently, computing something like `\(\mathbf{X^\top}\mathbf{X}\)`:
+
+    val XtX = X.t %*% X
+
+does not incur any additional data copying.
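+
+What "handled via a view" means can be illustrated with a small, self-contained sketch. `Mat` and `TransposedView` below are hypothetical classes written for this example and do not reflect how Mahout implements its matrices; they only show a transpose that swaps indices on access instead of copying data:
+
+```scala
+object TransposeViewDemo {
+  // Hypothetical minimal dense matrix; NOT Mahout's Matrix.
+  class Mat(val nrow: Int, val ncol: Int) {
+    private val data = Array.ofDim[Double](nrow, ncol)
+    def apply(r: Int, c: Int): Double = data(r)(c)
+    def update(r: Int, c: Int, v: Double): Unit = data(r)(c) = v
+    // No copy is made: the view keeps a reference to this matrix.
+    def t: TransposedView = new TransposedView(this)
+  }
+
+  // Reads and writes are redirected to the backing matrix with swapped indices.
+  class TransposedView(m: Mat) {
+    def nrow: Int = m.ncol
+    def ncol: Int = m.nrow
+    def apply(r: Int, c: Int): Double = m(c, r)
+    def update(r: Int, c: Int, v: Double): Unit = m(c, r) = v
+  }
+
+  val M = new Mat(2, 3)
+  val Mt = M.t
+  Mt(2, 0) = 7.0 // writes through to M(0, 2)
+
+  def main(args: Array[String]): Unit = println(M(0, 2)) // 7.0
+}
+```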
+
+#### Decompositions
+
+Matrix decompositions require an additional import:
+
+    import org.apache.mahout.math.decompositions._
+
+
+All arguments in the following are matrices.
+
+**Cholesky decomposition**
+
+    val ch = chol(M)
+    
+**SVD**
+
+    val (U, V, s) = svd(M)
+    
+**EigenDecomposition**
+
+    val (V, d) = eigen(M)
+    
+**QR decomposition**
+
+    val (Q, R) = qr(M)
+    
+**Rank**: Check for rank deficiency (runs rank-revealing QR)
+
+    M.isFullRank
+   
+**In-core SSVD**
+
+    val (U, V, s) = ssvd(A, k = 50, p = 15, q = 1)
+    
+**Solving linear equation systems and matrix inversion:** these follow R semantics closely; there are three forms of invocation:
+
+
+Solve `\(\mathbf{AX}=\mathbf{B}\)`:
+
+    solve(A, B)
+   
+Solve `\(\mathbf{Ax}=\mathbf{b}\)`:
+  
+    solve(A, b)
+   
+Compute `\(\mathbf{A^{-1}}\)`:
+
+    solve(A)
+   
+#### Misc
+
+Vector cardinality:
+
+    a.length
+    
+Matrix cardinality:
+
+    m.nrow
+    m.ncol
+    
+Means and sums:
+
+    m.colSums
+    m.colMeans
+    m.rowSums
+    m.rowMeans
+    
+Copy-By-Value:
+
+    val b = a cloned
+    
+#### Random Matrices
+
+`\(\mathcal{U}\)`(0,1) random matrix view:
+
+    val inCoreA = Matrices.uniformView(m, n, seed)
+
+    
+`\(\mathcal{U}\)`(-1,1) random matrix view:
+
+    val inCoreA = Matrices.symmetricUniformView(m, n, seed)
+
+`\(\mathcal{N}\)`(0,1) random matrix view:
+
+    val inCoreA = Matrices.gaussianView(m, n, seed)
+    
+#### Iterators 
+
+Mahout-Math already exposes a number of iterators. Scala code just needs the following imports to enable implicit conversions to Scala iterators.
+
+    import collection._
+    import JavaConversions._
+    
+Iterating over rows in a Matrix:
+
+    for (row <- m) {
+      // ... do something with row
+    }
+    
+<!--Iterating over non-zero and all elements of a vector:
+*Note that Vector.Element also has some implicit syntactic sugar, e.g. to add 5.0 to every non-zero element of a matrix, the following code may be used:*
+
+    for (row <- m; el <- row.nonZero) el = 5.0 + el
+    ... or 
+    for (row <- m; el <- row.nonZero) el := 5.0 + el
+    
+Similarly **row.all** produces an iterator over all elements in a row 
(Vector). 
+-->
+
+For more information, including information on Mahout-Samsara's out-of-core linear algebra bindings, see: [Mahout Scala Bindings and Mahout Spark Bindings for Linear Algebra Subroutines](http://mahout.apache.org/users/sparkbindings/ScalaSparkBindings.pdf)
+
+
