WEBSITE final cleanup before merge to master
Project: http://git-wip-us.apache.org/repos/asf/mahout/repo
Commit: http://git-wip-us.apache.org/repos/asf/mahout/commit/3c53a6dc
Tree: http://git-wip-us.apache.org/repos/asf/mahout/tree/3c53a6dc
Diff: http://git-wip-us.apache.org/repos/asf/mahout/diff/3c53a6dc

Branch: refs/heads/master
Commit: 3c53a6dcfb0cc8e25d7da4b62e8ed056a7629b98
Parents: c4feca0
Author: rawkintrevo <[email protected]>
Authored: Thu May 4 20:13:50 2017 -0500
Committer: rawkintrevo <[email protected]>
Committed: Thu May 4 20:13:50 2017 -0500

----------------------------------------------------------------------
 website/README.md | 43 +-
 website/_layouts_old/body-old.html | 7 -
 website/_layouts_old/default.html | 6 -
 website/_layouts_old/doc.html | 0
 website/_layouts_old/footer.html | 7 -
 website/_layouts_old/front_page.html | 7 -
 website/_layouts_old/header.html | 7 -
 website/_layouts_old/homepage.html | 7 -
 website/_layouts_old/inner.html | 16 -
 website/_layouts_old/mahout.html | 16 -
 website/_layouts_old/mahoutdoc.html | 118 -
 website/_layouts_old/navbar_docs.html | 7 -
 website/_layouts_old/navbar_main.html | 7 -
 website/_layouts_old/page.html | 7 -
 website/_layouts_old/pagination.html | 7 -
 website/_layouts_old/post.html | 7 -
 website/_layouts_old/tile.html | 7 -
 website/_layouts_old/title-group.html | 7 -
 website/docs/LICENSE | 21 -
 website/docs/archive.html | 10 -
 website/docs/atom.xml | 28 -
 website/docs/categories.html | 22 -
 website/docs/pages.html | 13 -
 website/docs/rss.xml | 28 -
 website/docs/tags.html | 21 -
 website/front/History.markdown | 16 -
 website/front/LICENSE | 21 -
 .../2011-12-29-jekyll-introduction.md | 412 --
 website/front/atom.xml | 28 -
 website/front/community/blogs.md | 1 +
 website/front/pages.html | 13 -
 website/front/rss.xml | 28 -
 website/front/tags.html | 21 -
 website/old_site_migration/README.md | 41 -
 .../completed/bankmarketing-example.md | 53 -
 .../completed/breiman-example.md | 67 -
 .../completed/classification/bayesian.md | 147 -
.../completed/classification/class-discovery.md | 155 - .../classification/classifyingyourdata.md | 27 - .../completed/classification/collocations.md | 385 -- .../gaussian-discriminative-analysis.md | 20 - .../classification/hidden-markov-models.md | 102 - .../independent-component-analysis.md | 17 - .../locally-weighted-linear-regression.md | 25 - .../classification/logistic-regression.md | 129 - .../classification/mahout-collections.md | 60 - .../completed/classification/mlp.md | 172 - .../completed/classification/naivebayes.md | 45 - .../completed/classification/neural-network.md | 22 - .../classification/partial-implementation.md | 146 - .../completed/classification/random-forests.md | 234 - .../restricted-boltzmann-machines.md | 49 - .../classification/support-vector-machines.md | 43 - .../completed/classify-a-doc-from-the-shell.md | 258 - website/old_site_migration/completed/d-als.md | 58 - website/old_site_migration/completed/d-qr.md | 56 - website/old_site_migration/completed/d-spca.md | 176 - website/old_site_migration/completed/d-ssvd.md | 143 - .../old_site_migration/completed/downloads.md | 68 - .../completed/environment/h2o-internals.md | 51 - .../completed/environment/spark-internals.md | 25 - .../completed/flinkbindings/flink-internals.md | 50 - .../flinkbindings/playing-with-samsara-flink.md | 111 - .../completed/how-to-build-an-app.md | 257 - .../completed/in-core-reference.md | 304 - .../completed/intro-cooccurrence-spark.md | 446 -- .../mailing-lists,-irc-and-archives.md | 75 - .../completed/out-of-core-reference.md | 318 - .../completed/privacy-policy.md | 28 - .../old_site_migration/completed/quickstart.md | 59 - .../completed/release-notes.md | 242 - .../completed/spark-naive-bayes.md | 132 - .../MahoutScalaAndSparkBindings.pptx | Bin 846177 -> 0 bytes .../sparkbindings/ScalaSparkBindings.pdf | 6215 ------------------ .../completed/sparkbindings/faq.md | 52 - .../completed/sparkbindings/home.md | 101 - 
.../completed/sparkbindings/play-with-shell.md | 199 - .../completed/twenty-newsgroups.md | 179 - .../old_site_migration/completed/who-we-are.md | 62 - .../completed/wikipedia-classifier-example.md | 57 - .../dont_migrate/algorithms.md | 58 - .../dont_migrate/collections.md | 98 - .../old_site_migration/dont_migrate/glossary.md | 15 - .../dont_migrate/mahout-benchmarks.md | 156 - .../dont_migrate/mahoutintegration.md | 6 - .../dont_migrate/recommender-overview.md | 34 - .../bayesian-commandline.md | 64 - .../needs_work_convenience/faq.md | 105 - .../map-reduce/clustering/20newsgroups.md | 11 - .../map-reduce/clustering/canopy-clustering.md | 188 - .../map-reduce/clustering/canopy-commandline.md | 70 - .../map-reduce/clustering/cluster-dumper.md | 106 - .../clustering-of-synthetic-control-data.md | 53 - .../clustering/clustering-seinfeld-episodes.md | 11 - .../map-reduce/clustering/clusteringyourdata.md | 126 - .../clustering/expectation-maximization.md | 62 - .../clustering/fuzzy-k-means-commandline.md | 97 - .../map-reduce/clustering/fuzzy-k-means.md | 186 - .../clustering/hierarchical-clustering.md | 15 - .../map-reduce/clustering/k-means-clustering.md | 182 - .../clustering/k-means-commandline.md | 94 - .../clustering/latent-dirichlet-allocation.md | 155 - .../map-reduce/clustering/lda-commandline.md | 83 - .../clustering/llr---log-likelihood-ratio.md | 46 - .../clustering/spectral-clustering.md | 84 - .../map-reduce/clustering/streaming-k-means.md | 174 - .../map-reduce/clustering/viewing-result.md | 15 - .../map-reduce/clustering/viewing-results.md | 49 - .../clustering/visualizing-sample-clusters.md | 50 - .../map-reduce/misc/mr---map-reduce.md | 19 - .../misc/parallel-frequent-pattern-mining.md | 185 - .../map-reduce/misc/perceptron-and-winnow.md | 41 - .../map-reduce/misc/testing.md | 46 - .../misc/using-mahout-with-python-via-jpype.md | 222 - .../map-reduce/recommender/intro-als-hadoop.md | 98 - .../recommender/intro-itembased-hadoop.md | 54 - 
.../recommender/matrix-factorization.md | 187 - .../recommender/recommender-documentation.md | 277 - .../recommender/recommender-first-timer-faq.md | 54 - .../recommender/userbased-5-minutes.md | 133 - .../needs_work_convenience/powered-by-mahout.md | 129 - .../creating-vectors-from-text.md | 291 - .../needs_work_priority/creating-vectors.md | 16 - .../dim-reduction/dimensional-reduction.md | 446 -- .../needs_work_priority/dim-reduction/ssvd.md | 127 - .../dim-reduction/ssvd.page/SSVD-CLI.pdf | Bin 462679 -> 0 bytes .../dim-reduction/ssvd.page/ssvd.R | 181 - .../general/books-tutorials-and-talks.md | 121 - .../old_site/general/mahout-wiki.md | 202 - .../old_site/general/professional-support.md | 41 - .../old_site/general/reference-reading.md | 71 - .../users/basics/matrix-and-vector-needs.md | 88 - .../basics/principal-components-analysis.md | 29 - .../svd---singular-value-decomposition.md | 52 - .../users/basics/system-requirements.md | 20 - ...term-frequency-inverse-document-frequency.md | 21 - website/oldsite/Gemfile.lock | 56 - website/oldsite/History.markdown | 16 - website/oldsite/LICENSE | 21 - website/oldsite/README.md | 79 +- .../_drafts/jekyll-introduction-draft.md | 10 - website/oldsite/_layouts/default.html | 4 +- website/oldsite/_layouts/page.html | 4 +- website/oldsite/_layouts/post.html | 4 +- .../2011-12-29-jekyll-introduction.md | 412 -- website/oldsite/_site/History.markdown | 16 - website/oldsite/_site/LICENSE | 21 - .../mahout-retro/css/bootstrap-responsive.css | 3 - website/oldsite/_site/atom.xml | 443 -- .../_site/developers/buildingmahout.html | 228 +- .../_site/developers/developer-resources.html | 228 +- website/oldsite/_site/developers/github.html | 228 +- website/oldsite/_site/developers/githubPRs.html | 228 +- website/oldsite/_site/developers/gsoc.html | 228 +- .../developers/how-to-become-a-committer.html | 228 +- .../_site/developers/how-to-contribute.html | 228 +- .../_site/developers/how-to-release.html | 228 +- 
.../developers/how-to-update-the-website.html | 216 +- .../oldsite/_site/developers/issue-tracker.html | 228 +- .../_site/developers/patch-check-list.html | 228 +- .../developers/thirdparty-dependencies.html | 228 +- .../_site/developers/version-control.html | 228 +- .../general/books-tutorials-and-talks.html | 228 +- website/oldsite/_site/general/downloads.html | 228 +- website/oldsite/_site/general/faq.html | 228 +- website/oldsite/_site/general/glossary.html | 216 +- .../_site/general/mahout-benchmarks.html | 228 +- website/oldsite/_site/general/mahout-wiki.html | 228 +- .../mailing-lists,-irc-and-archives.html | 228 +- .../_site/general/powered-by-mahout.html | 228 +- .../oldsite/_site/general/privacy-policy.html | 228 +- .../_site/general/professional-support.html | 228 +- .../_site/general/reference-reading.html | 228 +- .../oldsite/_site/general/release-notes.html | 228 +- website/oldsite/_site/general/who-we-are.html | 228 +- website/oldsite/_site/index.html | 228 +- .../lessons/2011/12/29/jekyll-introduction.html | 736 --- website/oldsite/_site/overview.html | 228 +- website/oldsite/_site/pages.html | 1216 ---- website/oldsite/_site/rss.xml | 442 -- website/oldsite/_site/sitemap.xml | 16 +- website/oldsite/_site/tags.html | 424 -- .../oldsite/_site/users/algorithms/d-als.html | 228 +- .../oldsite/_site/users/algorithms/d-qr.html | 228 +- .../oldsite/_site/users/algorithms/d-spca.html | 228 +- .../oldsite/_site/users/algorithms/d-ssvd.html | 228 +- .../algorithms/intro-cooccurrence-spark.html | 228 +- .../users/algorithms/recommender-overview.html | 228 +- .../users/algorithms/spark-naive-bayes.html | 228 +- .../oldsite/_site/users/basics/algorithms.html | 216 +- .../oldsite/_site/users/basics/collections.html | 228 +- .../_site/users/basics/collocations.html | 228 +- .../basics/creating-vectors-from-text.html | 228 +- .../_site/users/basics/creating-vectors.html | 216 +- .../gaussian-discriminative-analysis.html | 228 +- 
.../basics/independent-component-analysis.html | 228 +- .../_site/users/basics/mahout-collections.html | 228 +- .../_site/users/basics/mahoutintegration.html | 210 +- .../users/basics/matrix-and-vector-needs.html | 228 +- .../basics/principal-components-analysis.html | 228 +- .../oldsite/_site/users/basics/quickstart.html | 228 +- .../svd---singular-value-decomposition.html | 228 +- .../_site/users/basics/system-requirements.html | 228 +- ...rm-frequency-inverse-document-frequency.html | 216 +- .../classification/bankmarketing-example.html | 228 +- .../classification/bayesian-commandline.html | 228 +- .../_site/users/classification/bayesian.html | 228 +- .../users/classification/breiman-example.html | 228 +- .../users/classification/class-discovery.html | 228 +- .../classification/classifyingyourdata.html | 228 +- .../classification/hidden-markov-models.html | 228 +- .../locally-weighted-linear-regression.html | 228 +- .../classification/logistic-regression.html | 228 +- .../oldsite/_site/users/classification/mlp.html | 228 +- .../_site/users/classification/naivebayes.html | 228 +- .../users/classification/neural-network.html | 228 +- .../classification/partial-implementation.html | 228 +- .../users/classification/random-forests.html | 228 +- .../restricted-boltzmann-machines.html | 228 +- .../classification/support-vector-machines.html | 228 +- .../users/classification/twenty-newsgroups.html | 228 +- .../wikipedia-classifier-example.html | 228 +- .../_site/users/clustering/20newsgroups.html | 214 +- .../users/clustering/canopy-clustering.html | 228 +- .../users/clustering/canopy-commandline.html | 228 +- .../_site/users/clustering/cluster-dumper.html | 228 +- .../clustering-of-synthetic-control-data.html | 228 +- .../clustering-seinfeld-episodes.html | 214 +- .../users/clustering/clusteringyourdata.html | 228 +- .../clustering/expectation-maximization.html | 228 +- .../clustering/fuzzy-k-means-commandline.html | 228 +- .../_site/users/clustering/fuzzy-k-means.html 
| 228 +- .../clustering/hierarchical-clustering.html | 228 +- .../users/clustering/k-means-clustering.html | 228 +- .../users/clustering/k-means-commandline.html | 228 +- .../clustering/latent-dirichlet-allocation.html | 228 +- .../_site/users/clustering/lda-commandline.html | 228 +- .../clustering/llr---log-likelihood-ratio.html | 228 +- .../users/clustering/spectral-clustering.html | 228 +- .../users/clustering/streaming-k-means.html | 228 +- .../_site/users/clustering/viewing-result.html | 216 +- .../_site/users/clustering/viewing-results.html | 228 +- .../clustering/visualizing-sample-clusters.html | 228 +- .../dim-reduction/dimensional-reduction.html | 228 +- .../oldsite/_site/users/dim-reduction/ssvd.html | 228 +- .../classify-a-doc-from-the-shell.html | 228 +- .../_site/users/environment/h2o-internals.html | 228 +- .../users/environment/how-to-build-an-app.html | 228 +- .../users/environment/in-core-reference.html | 226 +- .../environment/out-of-core-reference.html | 228 +- .../users/environment/spark-internals.html | 228 +- .../users/flinkbindings/flink-internals.html | 228 +- .../playing-with-samsara-flink.html | 228 +- .../_site/users/misc/mr---map-reduce.html | 216 +- .../misc/parallel-frequent-pattern-mining.html | 228 +- .../_site/users/misc/perceptron-and-winnow.html | 228 +- website/oldsite/_site/users/misc/testing.html | 228 +- .../using-mahout-with-python-via-jpype.html | 228 +- .../users/recommender/intro-als-hadoop.html | 228 +- .../recommender/intro-cooccurrence-spark.html | 228 +- .../recommender/intro-itembased-hadoop.html | 228 +- .../users/recommender/matrix-factorization.html | 228 +- .../_site/users/recommender/quickstart.html | 228 +- .../recommender/recommender-documentation.html | 228 +- .../recommender-first-timer-faq.html | 228 +- .../users/recommender/userbased-5-minutes.html | 228 +- .../oldsite/_site/users/sparkbindings/faq.html | 228 +- .../oldsite/_site/users/sparkbindings/home.html | 228 +- 
.../users/sparkbindings/play-with-shell.html | 228 +- website/oldsite/atom.xml | 28 - website/oldsite/pages.html | 13 - website/oldsite/rss.xml | 28 - website/oldsite/tags.html | 21 - 273 files changed, 10438 insertions(+), 37834 deletions(-) ---------------------------------------------------------------------- http://git-wip-us.apache.org/repos/asf/mahout/blob/3c53a6dc/website/README.md ---------------------------------------------------------------------- diff --git a/website/README.md b/website/README.md index 889afdf..b711140 100644 --- a/website/README.md +++ b/website/README.md @@ -140,25 +140,24 @@ If you want to edit the style edit `assets/themes/<your_theme>/css/style.css` an This is a helpful tool for reference http://pikock.github.io/bootstrap-magic/3.0/app/index.html#!/editor -## Pressing ToDos for Reboot - -- [ ] Fill out todo list -- [x] "flatten" everything (we shouldn't have a docs folder) -- [x] Port Distributed Linear Algebra pages -- [ ] Write up pages for existing 'pre canned algos' -- [ ] New themes require a lot less in `_imports` than dustin had initially, however- keeping his old files for now as they have lots of useful things that occastionally need to be scuttled. Eventaully need to delete though. -- [x] Clean up tiles on front page -- [ ] Add apache licenses all over the place - let dust settle first then we can go through methodically. -- [ ] Update of `/community/powered-by-mahout.md` -- [ ] Folks need to review their contact info in `community/professional-support.md` -- [-] Get rid of `developer/patch-check-list.md` and add it to the notes as a checkbox when opening a PR (see zeppelin) -- [x] `developer/release-notes.md` stuck on 0.12.0... bump it. 
-- [x] refactor to 'top-site' and 'docs' as we need a different jekyll build to change base path for new docs version -- [x] Sign up for google analytics -- [ ] add links to `community/blogs` -- [ ] would like to see `community/buidingmahout.md` cleaned up a bit (just coppied new instructions from README.md) -- [ ] writeups for native solvers in /docs/native-solvers/ -- [ ] help with triage in `mahout/website/old_site_migration` -- [x] Update sidebars in `mr_algo_navbar`, `mr_tutorial_navbar`, `tutorial_navbar`, to look like `docs/includes/algo_navbar.html` -- [ ] Write ups for new algos -- [ ] Search the directory (Ctrl+Shift+F in intellij) for "TODO" you'll find stuff. + + + + + + + + + + + + + + + + + + + + + http://git-wip-us.apache.org/repos/asf/mahout/blob/3c53a6dc/website/_layouts_old/body-old.html ---------------------------------------------------------------------- diff --git a/website/_layouts_old/body-old.html b/website/_layouts_old/body-old.html deleted file mode 100644 index 5aac866..0000000 --- a/website/_layouts_old/body-old.html +++ /dev/null @@ -1,7 +0,0 @@ ---- -theme : - name : mahout -layout: default ---- -{% include JB/setup %} -{% include themes/mahout/body-old.html %} http://git-wip-us.apache.org/repos/asf/mahout/blob/3c53a6dc/website/_layouts_old/default.html ---------------------------------------------------------------------- diff --git a/website/_layouts_old/default.html b/website/_layouts_old/default.html deleted file mode 100644 index fc93d46..0000000 --- a/website/_layouts_old/default.html +++ /dev/null @@ -1,6 +0,0 @@ ---- -theme : - name : mahout2 ---- -{% include JB/setup %} -{% include themes/mahout2/default.html %} http://git-wip-us.apache.org/repos/asf/mahout/blob/3c53a6dc/website/_layouts_old/doc.html ---------------------------------------------------------------------- diff --git a/website/_layouts_old/doc.html b/website/_layouts_old/doc.html deleted file mode 100644 index e69de29..0000000 
http://git-wip-us.apache.org/repos/asf/mahout/blob/3c53a6dc/website/_layouts_old/footer.html ---------------------------------------------------------------------- diff --git a/website/_layouts_old/footer.html b/website/_layouts_old/footer.html deleted file mode 100644 index 1523020..0000000 --- a/website/_layouts_old/footer.html +++ /dev/null @@ -1,7 +0,0 @@ ---- -theme : - name : mahout -layout: default ---- -{% include JB/setup %} -{% include themes/mahout/footer.html %} http://git-wip-us.apache.org/repos/asf/mahout/blob/3c53a6dc/website/_layouts_old/front_page.html ---------------------------------------------------------------------- diff --git a/website/_layouts_old/front_page.html b/website/_layouts_old/front_page.html deleted file mode 100644 index 1301a92..0000000 --- a/website/_layouts_old/front_page.html +++ /dev/null @@ -1,7 +0,0 @@ ---- -layout: front_page -theme : -name : mahout2 ---- -{% include JB/setup %} -{% include themes/mahout2/front_page.html %} http://git-wip-us.apache.org/repos/asf/mahout/blob/3c53a6dc/website/_layouts_old/header.html ---------------------------------------------------------------------- diff --git a/website/_layouts_old/header.html b/website/_layouts_old/header.html deleted file mode 100644 index 11c268b..0000000 --- a/website/_layouts_old/header.html +++ /dev/null @@ -1,7 +0,0 @@ ---- -theme : - name : mahout -layout: default ---- -{% include JB/setup %} -{% include themes/mahout/header.html %} http://git-wip-us.apache.org/repos/asf/mahout/blob/3c53a6dc/website/_layouts_old/homepage.html ---------------------------------------------------------------------- diff --git a/website/_layouts_old/homepage.html b/website/_layouts_old/homepage.html deleted file mode 100644 index 64b198a..0000000 --- a/website/_layouts_old/homepage.html +++ /dev/null @@ -1,7 +0,0 @@ ---- -theme : - name : mahout -layout: default ---- -{% include JB/setup %} -{% include themes/mahout/homepage.html %} 
http://git-wip-us.apache.org/repos/asf/mahout/blob/3c53a6dc/website/_layouts_old/inner.html ---------------------------------------------------------------------- diff --git a/website/_layouts_old/inner.html b/website/_layouts_old/inner.html deleted file mode 100644 index 6b6960c..0000000 --- a/website/_layouts_old/inner.html +++ /dev/null @@ -1,16 +0,0 @@ -{% include themes/mahout/header.html %} - - - <article> - <div class="container"> - <div class="row"> - <div class="col-md-10 col-md-offset-1"> - - - {{ content }} - </div> - </div> - </div> - </article> - -{% include themes/mahout/footer.html %} http://git-wip-us.apache.org/repos/asf/mahout/blob/3c53a6dc/website/_layouts_old/mahout.html ---------------------------------------------------------------------- diff --git a/website/_layouts_old/mahout.html b/website/_layouts_old/mahout.html deleted file mode 100644 index 16e445d..0000000 --- a/website/_layouts_old/mahout.html +++ /dev/null @@ -1,16 +0,0 @@ -{% include themes/mahout/header.html %} -{% include themes/mahout/navbar_main.html %} - - - <article> - <div class="container"> - <div class="row"> - <div class="col-md-10 col-md-offset-1"> - - {{ content }} - </div> - </div> - </div> - </article> - -{% include themes/mahout/footer.html %} http://git-wip-us.apache.org/repos/asf/mahout/blob/3c53a6dc/website/_layouts_old/mahoutdoc.html ---------------------------------------------------------------------- diff --git a/website/_layouts_old/mahoutdoc.html b/website/_layouts_old/mahoutdoc.html deleted file mode 100644 index 491f9fa..0000000 --- a/website/_layouts_old/mahoutdoc.html +++ /dev/null @@ -1,118 +0,0 @@ -{% include themes/mahout/header.html %} -<body class="{{ post.title | downcase | replace:' ','-' | replace:',','' | strip_html }}{% if page.category %} category-{{ page.category }}{% endif %}{% if page.layout-class %} layout-{{ page.layout-class }}{% endif %}"> - - <div class="navbar navbar-inverse navbar-fixed-top" role="navigation"> - <div 
class="container"> - <div class="navbar-header"> - <button type="button" class="navbar-toggle" data-toggle="collapse" data-target=".navbar-collapse"> - <span class="sr-only">Toggle navigation</span> - <span class="icon-bar"></span> - <span class="icon-bar"></span> - <span class="icon-bar"></span> - </button> - <a class="navbar-brand" href="/"> - <img src="/assets/themes/mahout/img/mahout-logo.png" width="75" alt="I'm mahout"> - Apache Mahout - </a> - </div> - <nav class="navbar-collapse collapse" role="navigation"> - <ul class="nav navbar-nav navbar-right"> - - <!-- Quick Start --> - <li id="quickstart"> - <a href="/docs/0.13.0/quickstart" >Quick Start</a> - </li> - - <li id="samsara"> - <!-- Samsara docs - old http://mahout.apache.org/users/sparkbindings/faq.html - - --> - <a href="#" data-toggle="dropdown" class="dropdown-toggle">Mahout-Samsara<b class="caret"></b></a> - <ul class="dropdown-menu"> - <li><span><b>Reference Info</b><span></li> - <li><a href="/docs/0.13.0/mahout-samsara/incore">In-core Reference</a></li> - <li><a href="/docs/0.13.0/mahout-samsara/outofcore">Out-of-core Reference</a></li> - <li><a href="/docs/0.13.0/mahout-samsara/faq">Samsara FAQ</a></li> - <li role="separator" class="divider"></li> - <li><span><b>Bindings</b><span></li> - <li><a href="/docs/0.13.0/mahout-samsara/spark-bindings">Spark Bindings</a></li> - <li><a href="/docs/0.13.0/mahout-samsara/flink-bindings">Flink Bindings</a></li> - <li><a href="/docs/0.13.0/mahout-samsara/flink-bindings">H20 Bindings</a></li> - </ul> - </li> - - <li id="tutorials"> - <!-- Tutorials --> - <a href="#" data-toggle="dropdown" class="dropdown-toggle">Tutorials<b class="caret"></b></a> - <ul class="dropdown-menu"> - <li><span><b>Spark Examples using Samsara</b><span></li> - <li><a href="/docs/0.13.0/tutorials/samsara-spark-shell">Samsara in Spark Shell</a></li> - <li><a href="/docs/0.13.0/tutorials/build-app">Samsara Spark Application</a></li> - <li><a 
href="/docs/0.13.0/tutorials/text-classification">Text Classification</a></li> - <li role="separator" class="divider"></li> - <li><span><b>Fl</b><span></li> - <li><a href="/docs/0.13.1-SNAPSHOT">tbd1</a></li> - <li><a href="/docs/0.13.1-SNAPSHOT">tbd2</a></li> - <li role="separator" class="divider"></li> - <li><span><b>MapReduce</b><span></li> - <li><a href="/docs/0.13.1-SNAPSHOT">tbd1</a></li> - <li><a href="/docs/0.13.1-SNAPSHOT">tbd2</a></li> - </ul> - </li> - - - <!-- Algorithms (Samsara / MR) --> - <li id="algorithms"> - <a href="#" data-toggle="dropdown" class="dropdown-toggle">Algorithms<b class="caret"></b></a> - <ul class="dropdown-menu"> - <li><span><b>Samsara Code</b><span></li> - <li><a href="/docs/0.13.0/algorithms/samsara/dssvd">Dist SVD</a></li> - <li><a href="/docs/0.13.0/algorithms/samsara/dspca">Dist PCA</a></li> - <li><a href="/docs/0.13.0/algorithms/samsara/dqr">Dist QR</a></li> - <li><a href="/docs/0.13.0/algorithms/samsara/dals">Dist ALS</a></li> - <li role="separator" class="divider"></li> - <li><span><b>Command Line</b><span></li> - <li><a href="/docs/0.13.0">algo1</a></li> - <li><a href="/docs/0.13.0">algo2</a></li> - <li role="separator" class="divider"></li> - <li><span><b>xxxx</b> (xxxxx)<span></li> - <li><a href="/docs/0.13.1-SNAPSHOT">xxxxx</a></li> - </ul> - </li> - - <!-- Scala Docs --> - <li id="scaladocs"> - <a href="tbd" data-toggle="dropdown" class="dropdown-toggle">Scala Docs<b class="caret"></b></a> - <ul class="dropdown-menu"> - <li class="title"><span><b>TODO - Needs update per new scaladocs</b><span></li> - <li><a href="http://apache.github.io/mahout/0.10.1/docs/mahout-math/index.html">Mahout Math</a></li> - <li><a href="http://apache.github.io/mahout/0.10.1/docs/mahout-math-scala/index.html">Mahout Math Scala bindings</a></li> - <li><a href="http://apache.github.io/mahout/0.10.1/docs/mahout-math/index.htmlhttp://apache.github.io/mahout/0.10.1/docs/mahout-spark/index.html">Mahout Spark bindings</a></li> - <li><a 
href="http://apache.github.io/mahout/0.10.1/docs/mahout-math/index.html">Mahout Spark bindings shell</a></li> - <li><a href="http://apache.github.io/mahout/0.10.1/docs/mahout-math/index.html">Mahout H2O backend Scaladoc</a></li> - <li><a href="http://apache.github.io/mahout/0.10.1/docs/mahout-math/index.html">Mahout H2O backend Javadoc</a></li> - <li><a href="http://apache.github.io/mahout/0.10.1/docs/mahout-math/index.html">Mahout HDFS</a></li> - <li><a href="http://apache.github.io/mahout/0.10.1/docs/mahout-math/index.html">Mahout Map-Reduce</a></li> - <li><a href="http://apache.github.io/mahout/0.10.1/docs/mahout-math/index.html">Mahout Examples</a></li> - <li><a href="http://apache.github.io/mahout/0.10.1/docs/mahout-math/index.html">Mahout Integration</a></li> - </ul> - </li> - - - </ul> - </nav><!--/.navbar-collapse --> - </div> - </div> - - <article> - <div class="container"> - <div class="row"> - <div class="col-md-10 col-md-offset-1"> - {{ content }} - </div> - </div> - </div> - </article> - -{% include themes/mahout/footer.html %} http://git-wip-us.apache.org/repos/asf/mahout/blob/3c53a6dc/website/_layouts_old/navbar_docs.html ---------------------------------------------------------------------- diff --git a/website/_layouts_old/navbar_docs.html b/website/_layouts_old/navbar_docs.html deleted file mode 100644 index 3341d18..0000000 --- a/website/_layouts_old/navbar_docs.html +++ /dev/null @@ -1,7 +0,0 @@ ---- -theme : - name : mahout2 -layout: default ---- -{% include JB/setup %} -{% include themes/mahout2/navbar_docs.html %} http://git-wip-us.apache.org/repos/asf/mahout/blob/3c53a6dc/website/_layouts_old/navbar_main.html ---------------------------------------------------------------------- diff --git a/website/_layouts_old/navbar_main.html b/website/_layouts_old/navbar_main.html deleted file mode 100644 index ba21274..0000000 --- a/website/_layouts_old/navbar_main.html +++ /dev/null @@ -1,7 +0,0 @@ ---- -theme : - name : mahout2 -layout: default ---- 
-{% include JB/setup %} -{% include themes/mahout2/navbar_main.html %} http://git-wip-us.apache.org/repos/asf/mahout/blob/3c53a6dc/website/_layouts_old/page.html ---------------------------------------------------------------------- diff --git a/website/_layouts_old/page.html b/website/_layouts_old/page.html deleted file mode 100644 index ce8948c..0000000 --- a/website/_layouts_old/page.html +++ /dev/null @@ -1,7 +0,0 @@ ---- -theme : - name : mahout2 - ---- -{% include JB/setup %} -{% include themes/mahout2/page.html %} http://git-wip-us.apache.org/repos/asf/mahout/blob/3c53a6dc/website/_layouts_old/pagination.html ---------------------------------------------------------------------- diff --git a/website/_layouts_old/pagination.html b/website/_layouts_old/pagination.html deleted file mode 100644 index ddb1713..0000000 --- a/website/_layouts_old/pagination.html +++ /dev/null @@ -1,7 +0,0 @@ ---- -theme : - name : mahout -layout: default ---- -{% include JB/setup %} -{% include themes/mahout/pagination.html %} http://git-wip-us.apache.org/repos/asf/mahout/blob/3c53a6dc/website/_layouts_old/post.html ---------------------------------------------------------------------- diff --git a/website/_layouts_old/post.html b/website/_layouts_old/post.html deleted file mode 100644 index 12c42af..0000000 --- a/website/_layouts_old/post.html +++ /dev/null @@ -1,7 +0,0 @@ ---- -theme : - name : mahout2 -layout: default ---- -{% include JB/setup %} -{% include themes/mahout2/post.html %} http://git-wip-us.apache.org/repos/asf/mahout/blob/3c53a6dc/website/_layouts_old/tile.html ---------------------------------------------------------------------- diff --git a/website/_layouts_old/tile.html b/website/_layouts_old/tile.html deleted file mode 100644 index 6625873..0000000 --- a/website/_layouts_old/tile.html +++ /dev/null @@ -1,7 +0,0 @@ ---- -theme : - name : mahout -layout: default ---- -{% include JB/setup %} -{% include themes/mahout/tile.html %} 
http://git-wip-us.apache.org/repos/asf/mahout/blob/3c53a6dc/website/_layouts_old/title-group.html ---------------------------------------------------------------------- diff --git a/website/_layouts_old/title-group.html b/website/_layouts_old/title-group.html deleted file mode 100644 index 2944fc0..0000000 --- a/website/_layouts_old/title-group.html +++ /dev/null @@ -1,7 +0,0 @@ ---- -theme : - name : mahout -layout: default ---- -{% include JB/setup %} -{% include themes/mahout/title-group.html %} http://git-wip-us.apache.org/repos/asf/mahout/blob/3c53a6dc/website/docs/LICENSE ---------------------------------------------------------------------- diff --git a/website/docs/LICENSE b/website/docs/LICENSE deleted file mode 100755 index 01a0839..0000000 --- a/website/docs/LICENSE +++ /dev/null @@ -1,21 +0,0 @@ -The MIT License (MIT) - -Copyright (c) 2015 Jade Dominguez - -Permission is hereby granted, free of charge, to any person obtaining a copy -of this software and associated documentation files (the "Software"), to deal -in the Software without restriction, including without limitation the rights -to use, copy, modify, merge, publish, distribute, sublicense, and/or sell -copies of the Software, and to permit persons to whom the Software is -furnished to do so, subject to the following conditions: - -The above copyright notice and this permission notice shall be included in all -copies or substantial portions of the Software. - -THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR -IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, -FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE -AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER -LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, -OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE -SOFTWARE. 
http://git-wip-us.apache.org/repos/asf/mahout/blob/3c53a6dc/website/docs/archive.html ---------------------------------------------------------------------- diff --git a/website/docs/archive.html b/website/docs/archive.html deleted file mode 100755 index dc7c054..0000000 --- a/website/docs/archive.html +++ /dev/null @@ -1,10 +0,0 @@ ---- -layout: page -title : Archive -header : Post Archive -group: navigation ---- -{% include JB/setup %} - -{% assign posts_collate = site.posts %} -{% include JB/posts_collate %} \ No newline at end of file http://git-wip-us.apache.org/repos/asf/mahout/blob/3c53a6dc/website/docs/atom.xml ---------------------------------------------------------------------- diff --git a/website/docs/atom.xml b/website/docs/atom.xml deleted file mode 100755 index 2b1f27b..0000000 --- a/website/docs/atom.xml +++ /dev/null @@ -1,28 +0,0 @@ ---- -layout: null -title : Atom Feed ---- -<?xml version="1.0" encoding="utf-8"?> -<feed xmlns="http://www.w3.org/2005/Atom"> - - <title>{{ site.title | xml_escape }}</title> - <link href="{{ site.production_url }}{{ site.JB.atom_path }}" rel="self"/> - <link href="{{ site.production_url }}"/> - <updated>{{ site.time | date_to_xmlschema }}</updated> - <id>{{ site.production_url }}</id> - <author> - <name>{{ site.author.name | xml_escape }}</name> - <email>{{ site.author.email }}</email> - </author> - - {% for post in site.posts limit:20 %} - <entry> - <title>{{ post.title | xml_escape }}</title> - <link href="{{ site.production_url }}{{ post.url }}"/> - <updated>{{ post.date | date_to_xmlschema }}</updated> - <id>{{ site.production_url }}{{ post.id }}</id> - <content type="html">{{ post.content | xml_escape }}</content> - </entry> - {% endfor %} - -</feed> http://git-wip-us.apache.org/repos/asf/mahout/blob/3c53a6dc/website/docs/categories.html ---------------------------------------------------------------------- diff --git a/website/docs/categories.html b/website/docs/categories.html deleted file mode 100755 index 
cdb8789..0000000 --- a/website/docs/categories.html +++ /dev/null @@ -1,22 +0,0 @@ ---- -layout: page -title: Categories -header: Posts By Category -group: navigation ---- -{% include JB/setup %} - -<ul class="tag_box inline"> - {% assign categories_list = site.categories %} - {% include JB/categories_list %} -</ul> - - -{% for category in site.categories %} - <h2 id="{{ category[0] }}-ref">{{ category[0] | join: "/" }}</h2> - <ul> - {% assign pages_list = category[1] %} - {% include JB/pages_list %} - </ul> -{% endfor %} - http://git-wip-us.apache.org/repos/asf/mahout/blob/3c53a6dc/website/docs/pages.html ---------------------------------------------------------------------- diff --git a/website/docs/pages.html b/website/docs/pages.html deleted file mode 100755 index bde1a32..0000000 --- a/website/docs/pages.html +++ /dev/null @@ -1,13 +0,0 @@ ---- -layout: page -title: Pages -header: Pages -group: navigation ---- -{% include JB/setup %} - -<h2>All Pages</h2> -<ul> -{% assign pages_list = site.pages %} -{% include JB/pages_list %} -</ul> http://git-wip-us.apache.org/repos/asf/mahout/blob/3c53a6dc/website/docs/rss.xml ---------------------------------------------------------------------- diff --git a/website/docs/rss.xml b/website/docs/rss.xml deleted file mode 100755 index 419c897..0000000 --- a/website/docs/rss.xml +++ /dev/null @@ -1,28 +0,0 @@ ---- -layout: null -title : RSS Feed ---- - -<?xml version="1.0" encoding="UTF-8" ?> -<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"> -<channel> - <title>{{ site.title | xml_escape }}</title> - <description>{{ site.title | xml_escape }} - {{ site.author.name | xml_escape }}</description> - <link>{{ site.production_url }}</link> - <atom:link href="{{ site.production_url }}{{ site.JB.rss_path }}" rel="self" type="application/rss+xml" /> - <lastBuildDate>{{ site.time | date_to_rfc822 }}</lastBuildDate> - <pubDate>{{ site.time | date_to_rfc822 }}</pubDate> - <ttl>60</ttl> - -{% for post in site.posts limit:20 %} 
- <item> - <title>{{ post.title | xml_escape }}</title> - <description>{{ post.content | xml_escape }}</description> - <link>{{ site.production_url }}{{ post.url }}</link> - <guid>{{ site.production_url }}{{ post.id }}</guid> - <pubDate>{{ post.date | date_to_rfc822 }}</pubDate> - </item> -{% endfor %} - -</channel> -</rss> http://git-wip-us.apache.org/repos/asf/mahout/blob/3c53a6dc/website/docs/tags.html ---------------------------------------------------------------------- diff --git a/website/docs/tags.html b/website/docs/tags.html deleted file mode 100755 index 5e216cb..0000000 --- a/website/docs/tags.html +++ /dev/null @@ -1,21 +0,0 @@ ---- -layout: page -title: Tags -header: Posts By Tag -group: navigation ---- -{% include JB/setup %} - -<ul class="tag_box inline"> - {% assign tags_list = site.tags %} - {% include JB/tags_list %} -</ul> - - -{% for tag in site.tags %} - <h2 id="{{ tag[0] }}-ref">{{ tag[0] }}</h2> - <ul> - {% assign pages_list = tag[1] %} - {% include JB/pages_list %} - </ul> -{% endfor %} http://git-wip-us.apache.org/repos/asf/mahout/blob/3c53a6dc/website/front/History.markdown ---------------------------------------------------------------------- diff --git a/website/front/History.markdown b/website/front/History.markdown deleted file mode 100755 index 5ef89c1..0000000 --- a/website/front/History.markdown +++ /dev/null @@ -1,16 +0,0 @@ -## HEAD - -### Major Enhancements - -### Minor Enahncements - * Add `drafts` folder support (#167) - * Add `excerpt` support (#168) - * Create History.markdown to help project management (#169) - -### Bug Fixes - -### Site Enhancements - -### Compatibility updates - * Update `preview` task - http://git-wip-us.apache.org/repos/asf/mahout/blob/3c53a6dc/website/front/LICENSE ---------------------------------------------------------------------- diff --git a/website/front/LICENSE b/website/front/LICENSE deleted file mode 100755 index 01a0839..0000000 --- a/website/front/LICENSE +++ /dev/null @@ -1,21 +0,0 @@ -The 
MIT License (MIT) - -Copyright (c) 2015 Jade Dominguez - -Permission is hereby granted, free of charge, to any person obtaining a copy -of this software and associated documentation files (the "Software"), to deal -in the Software without restriction, including without limitation the rights -to use, copy, modify, merge, publish, distribute, sublicense, and/or sell -copies of the Software, and to permit persons to whom the Software is -furnished to do so, subject to the following conditions: - -The above copyright notice and this permission notice shall be included in all -copies or substantial portions of the Software. - -THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR -IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, -FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE -AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER -LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, -OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE -SOFTWARE. http://git-wip-us.apache.org/repos/asf/mahout/blob/3c53a6dc/website/front/_posts/core-samples/2011-12-29-jekyll-introduction.md ---------------------------------------------------------------------- diff --git a/website/front/_posts/core-samples/2011-12-29-jekyll-introduction.md b/website/front/_posts/core-samples/2011-12-29-jekyll-introduction.md deleted file mode 100755 index 13fe3dc..0000000 --- a/website/front/_posts/core-samples/2011-12-29-jekyll-introduction.md +++ /dev/null @@ -1,412 +0,0 @@ ---- -layout: post -category : lessons -tagline: "Supporting tagline" -tags : [intro, beginner, jekyll, tutorial] ---- -{% include JB/setup %} - -This Jekyll introduction will outline specifically what Jekyll is and why you would want to use it. -Directly following the intro we'll learn exactly _how_ Jekyll does what it does. - -## Overview - -### What is Jekyll? 
- -Jekyll is a parsing engine bundled as a ruby gem used to build static websites from -dynamic components such as templates, partials, liquid code, markdown, etc. Jekyll is known as "a simple, blog aware, static site generator". - -### Examples - -This website is created with Jekyll. [Other Jekyll websites](https://github.com/mojombo/jekyll/wiki/Sites). - - - -### What does Jekyll Do? - -Jekyll is a ruby gem you install on your local system. -Once there you can call `jekyll --server` on a directory and provided that directory -is setup in a way jekyll expects, it will do magic stuff like parse markdown/textile files, -compute categories, tags, permalinks, and construct your pages from layout templates and partials. - -Once parsed, Jekyll stores the result in a self-contained static `_site` folder. -The intention here is that you can serve all contents in this folder statically from a plain static web-server. - -You can think of Jekyll as a normalish dynamic blog but rather than parsing content, templates, and tags -on each request, Jekyll does this once _beforehand_ and caches the _entire website_ in a folder for serving statically. - -### Jekyll is Not Blogging Software - -**Jekyll is a parsing engine.** - -Jekyll does not come with any content nor does it have any templates or design elements. -This is a common source of confusion when getting started. -Jekyll does not come with anything you actually use or see on your website - you have to make it. - -### Why Should I Care? - -Jekyll is very minimalistic and very efficient. -The most important thing to realize about Jekyll is that it creates a static representation of your website requiring only a static web-server. -Traditional dynamic blogs like Wordpress require a database and server-side code. -Heavily trafficked dynamic blogs must employ a caching layer that ultimately performs the same job Jekyll sets out to do; serve static content. 
- -Therefore if you like to keep things simple and you prefer the command-line over an admin panel UI then give Jekyll a try. - -**Developers like Jekyll because we can write content like we write code:** - -- Ability to write content in markdown or textile in your favorite text-editor. -- Ability to write and preview your content via localhost. -- No internet connection required. -- Ability to publish via git. -- Ability to host your blog on a static web-server. -- Ability to host freely on GitHub Pages. -- No database required. - -# How Jekyll Works - -The following is a complete but concise outline of exactly how Jekyll works. - -Be aware that core concepts are introduced in rapid succession without code examples. -This information is not intended to specifically teach you how to do anything, rather it -is intended to give you the _full picture_ relative to what is going on in Jekyll-world. - -Learning these core concepts should help you avoid common frustrations and ultimately -help you better understand the code examples contained throughout Jekyll-Bootstrap. - - -## Initial Setup - -After [installing jekyll](/index.html#start-now) you'll need to format your website directory in a way jekyll expects. -Jekyll-bootstrap conveniently provides the base directory format. - -### The Jekyll Application Base Format - -Jekyll expects your website directory to be laid out like so: - - . - |-- _config.yml - |-- _includes - |-- _layouts - | |-- default.html - | |-- post.html - |-- _posts - | |-- 2011-10-25-open-source-is-good.markdown - | |-- 2011-04-26-hello-world.markdown - |-- _site - |-- index.html - |-- assets - |-- css - |-- style.css - |-- javascripts - - -- **\_config.yml** - Stores configuration data. - -- **\_includes** - This folder is for partial views. - -- **\_layouts** - This folder is for the main templates your content will be inserted into. - You can have different layouts for different pages or page sections. 
- -- **\_posts** - This folder contains your dynamic content/posts. - the naming format is required to be `@YEAR-MONTH-DATE-title.MARKUP@`. - -- **\_site** - This is where the generated site will be placed once Jekyll is done transforming it. - -- **assets** - This folder is not part of the standard jekyll structure. - The assets folder represents _any generic_ folder you happen to create in your root directory. - Directories and files not properly formatted for jekyll will be left untouched for you to serve normally. - -(read more: <https://github.com/mojombo/jekyll/wiki/Usage>) - - -### Jekyll Configuration - -Jekyll supports various configuration options that are fully outlined here: -(<https://github.com/mojombo/jekyll/wiki/Configuration>) - - - - -## Content in Jekyll - -Content in Jekyll is either a post or a page. -These content "objects" get inserted into one or more templates to build the final output for its respective static-page. - -### Posts and Pages - -Both posts and pages should be written in markdown, textile, or HTML and may also contain Liquid templating syntax. -Both posts and pages can have meta-data assigned on a per-page basis such as title, url path, as well as arbitrary custom meta-data. - -### Working With Posts - -**Creating a Post** -Posts are created by properly formatting a file and placing it the `_posts` folder. - -**Formatting** -A post must have a valid filename in the form `YEAR-MONTH-DATE-title.MARKUP` and be placed in the `_posts` directory. -If the data format is invalid Jekyll will not recognize the file as a post. The date and title are automatically parsed from the filename of the post file. -Additionally, each file must have [YAML Front-Matter](https://github.com/mojombo/jekyll/wiki/YAML-Front-Matter) prepended to its content. -YAML Front-Matter is a valid YAML syntax specifying meta-data for the given file. - -**Order** -Ordering is an important part of Jekyll but it is hard to specify a custom ordering strategy. 
-Only reverse chronological and chronological ordering is supported in Jekyll. - -Since the date is hard-coded into the filename format, to change the order, you must change the dates in the filenames. - -**Tags** -Posts can have tags associated with them as part of their meta-data. -Tags may be placed on posts by providing them in the post's YAML front matter. -You have access to the post-specific tags in the templates. These tags also get added to the sitewide collection. - -**Categories** -Posts may be categorized by providing one or more categories in the YAML front matter. -Categories offer more significance over tags in that they can be reflected in the URL path to the given post. -Note categories in Jekyll work in a specific way. -If you define more than one category you are defining a category hierarchy "set". -Example: - - --- - title : Hello World - categories : [lessons, beginner] - --- - -This defines the category hierarchy "lessons/beginner". Note this is _one category_ node in Jekyll. -You won't find "lessons" and "beginner" as two separate categories unless you define them elsewhere as singular categories. - -### Working With Pages - -**Creating a Page** -Pages are created by properly formatting a file and placing it anywhere in the root directory or subdirectories that do _not_ start with an underscore. - -**Formatting** -In order to register as a Jekyll page the file must contain [YAML Front-Matter](https://github.com/mojombo/jekyll/wiki/YAML-Front-Matter). -Registering a page means 1) that Jekyll will process the page and 2) that the page object will be available in the `site.pages` array for inclusion into your templates. - -**Categories and Tags** -Pages do not compute categories nor tags so defining them will have no effect. - -**Sub-Directories** -If pages are defined in sub-directories, the path to the page will be reflected in the url. -Example: - - . 
- |-- people - |-- bob - |-- essay.html - -This page will be available at `http://yourdomain.com/people/bob/essay.html` - - -**Recommended Pages** - -- **index.html** - You will always want to define the root index.html page as this will display on your root URL. -- **404.html** - Create a root 404.html page and GitHub Pages will serve it as your 404 response. -- **sitemap.html** - Generating a sitemap is good practice for SEO. -- **about.html** - A nice about page is easy to do and gives the human perspective to your website. - - -## Templates in Jekyll - -Templates are used to contain a page's or post's content. -All templates have access to a global site object variable: `site` as well as a page object variable: `page`. -The site variable holds all accessible content and metadata relative to the site. -The page variable holds accessible data for the given page or post being rendered at that point. - -**Create a Template** -Templates are created by properly formatting a file and placing it in the `_layouts` directory. - -**Formatting** -Templates should be coded in HTML and contain YAML Front Matter. -All templates can contain Liquid code to work with your site's data. - -**Rending Page/Post Content in a Template** -There is a special variable in all templates named : `content`. -The `content` variable holds the page/post content including any sub-template content previously defined. -Render the content variable wherever you want your main content to be injected into your template: - -{% capture text %}... -<body> - <div id="sidebar"> ... </div> - <div id="main"> - |.{content}.| - </div> -</body> -...{% endcapture %} -{% include JB/liquid_raw %} - -### Sub-Templates - -Sub-templates are exactly templates with the only difference being they -define another "root" layout/template within their YAML Front Matter. -This essentially means a template will render inside of another template. 
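As a rough illustration of the sub-template idea described above, the following Python sketch resolves a chain of layouts by injecting already-rendered content into each parent in turn. It is not Jekyll's actual Ruby implementation: the `LAYOUTS` table, the `render` helper, and the use of plain string replacement in place of Liquid rendering are all assumptions made for the example.

```python
# Hypothetical in-memory stand-in for the files in `_layouts`.
# A "sub-template" is simply a layout whose front matter names another layout.
LAYOUTS = {
    "default": {"layout": None, "body": "<body>{{ content }}</body>"},
    "post": {"layout": "default", "body": "<article>{{ content }}</article>"},
}


def render(content, layout_name):
    """Wrap `content` in `layout_name`, then in that layout's parent, and so on."""
    seen = set()
    while layout_name is not None:
        if layout_name in seen:
            raise ValueError(f"layout cycle at {layout_name!r}")
        seen.add(layout_name)
        layout = LAYOUTS[layout_name]
        # Plain string substitution stands in for Liquid rendering here.
        content = layout["body"].replace("{{ content }}", content)
        layout_name = layout["layout"]
    return content


page_html = render("<p>Hello</p>", "post")
print(page_html)
```

The innermost layout is applied first, so the post body ends up inside `<article>`, which in turn ends up inside `<body>`; this mirrors the "template renders inside another template" behaviour the text describes.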
- -### Includes -In Jekyll you can define include files by placing them in the `_includes` folder. -Includes are NOT templates, rather they are just code snippets that get included into templates. -In this way, you can treat the code inside includes as if it was native to the parent template. - -Any valid template code may be used in includes. - - -## Using Liquid for Templating - -Templating is perhaps the most confusing and frustrating part of Jekyll. -This is mainly due to the fact that Jekyll templates must use the Liquid Templating Language. - -### What is Liquid? - -[Liquid](https://github.com/Shopify/liquid) is a secure templating language developed by [Shopify](http://shopify.com). -Liquid is designed for end-users to be able to execute logic within template files -without imposing any security risk on the hosting server. - -Jekyll uses Liquid to generate the post content within the final page layout structure and as the primary interface for working with -your site and post/page data. - -### Why Do We Have to Use Liquid? - -GitHub uses Jekyll to power [GitHub Pages](http://pages.github.com/). -GitHub cannot afford to run arbitrary code on their servers so they lock developers down via Liquid. - -### Liquid is Not Programmer-Friendly. - -The short story is liquid is not real code and its not intended to execute real code. -The point being you can't do jackshit in liquid that hasn't been allowed explicitly by the implementation. -What's more you can only access data-structures that have been explicitly passed to the template. - -In Jekyll's case it is not possible to alter what is passed to Liquid without hacking the gem or running custom plugins. -Both of which cannot be supported by GitHub Pages. - -As a programmer - this is very frustrating. - -But rather than look a gift horse in the mouth we are going to -suck it up and view it as an opportunity to work around limitations and adopt client-side solutions when possible. 
- -**Aside** -My personal stance is to not invest time trying to hack liquid. It's really unnecessary -_from a programmer's_ perspective. That is to say if you have the ability to run custom plugins (i.e. run arbitrary ruby code) -you are better off sticking with ruby. Toward that end I've built [Mustache-with-Jekyll](http://github.com/plusjade/mustache-with-jekyll) - - -## Static Assets - -Static assets are any file in the root or non-underscored subfolders that are not pages. -That is they have no valid YAML Front Matter and are thus not treated as Jekyll Pages. - -Static assets should be used for images, css, and javascript files. - - - - -## How Jekyll Parses Files - -Remember Jekyll is a processing engine. There are two main types of parsing in Jekyll. - -- **Content parsing.** - This is done with textile or markdown. -- **Template parsing.** - This is done with the liquid templating language. - -And thus there are two main types of file formats needed for this parsing. - -- **Post and Page files.** - All content in Jekyll is either a post or a page so valid posts and pages are parsed with markdown or textile. -- **Template files.** - These files go in `_layouts` folder and contain your blogs **templates**. They should be made in HTML with the help of Liquid syntax. - Since include files are simply injected into templates they are essentially parsed as if they were native to the template. - -**Arbitrary files and folders.** -Files that _are not_ valid pages are treated as static content and pass through -Jekyll untouched and reside on your blog in the exact structure and format they originally existed in. - -### Formatting Files for Parsing. - -We've outlined the need for valid formatting using **YAML Front Matter**. -Templates, posts, and pages all need to provide valid YAML Front Matter even if the Matter is empty. -This is the only way Jekyll knows you want the file processed. 
- -YAML Front Matter must be prepended to the top of template/post/page files: - - --- - layout: post - category : pages - tags : [how-to, jekyll] - --- - - ... contents ... - -Three hyphens on a new line start the Front-Matter block and three hyphens on a new line end the block. -The data inside the block must be valid YAML. - -Configuration parameters for YAML Front-Matter is outlined here: -[A comprehensive explanation of YAML Front Matter](https://github.com/mojombo/jekyll/wiki/YAML-Front-Matter) - -#### Defining Layouts for Posts and Templates Parsing. - -The `layout` parameter in the YAML Front Matter defines the template file for which the given post or template should be injected into. -If a template file specifies its own layout, it is effectively being used as a `sub-template.` -That is to say loading a post file into a template file that refers to another template file with work in the way you'd expect; as a nested sub-template. - - - - - -## How Jekyll Generates the Final Static Files. - -Ultimately, Jekyll's job is to generate a static representation of your website. -The following is an outline of how that's done: - -1. **Jekyll collects data.** - Jekyll scans the posts directory and collects all posts files as post objects. It then scans the layout assets and collects those and finally scans other directories in search of pages. - -2. **Jekyll computes data.** - Jekyll takes these objects, computes metadata (permalinks, tags, categories, titles, dates) from them and constructs one - big `site` object that holds all the posts, pages, layouts, and respective metadata. - At this stage your site is one big computed ruby object. - -3. **Jekyll liquifies posts and templates.** - Next jekyll loops through each post file and converts (through markdown or textile) and **liquifies** the post inside of its respective layout(s). - Once the post is parsed and liquified inside the the proper layout structure, the layout itself is "liquified". 
- **Liquification** is defined as follows: Jekyll initiates a Liquid template, and passes a simpler hash representation of the ruby site object as well as a simpler - hash representation of the ruby post object. These simplified data structures are what you have access to in the templates. - -3. **Jekyll generates output.** - Finally the liquid templates are "rendered", thereby processing any liquid syntax provided in the templates - and saving the final, static representation of the file. - -**Notes.** -Because Jekyll computes the entire site in one fell swoop, each template is given access to -a global `site` hash that contains useful data. It is this data that you'll iterate through and format -using the Liquid tags and filters in order to render it onto a given page. - -Remember, in Jekyll you are an end-user. Your API has only two components: - -1. The manner in which you setup your directory. -2. The liquid syntax and variables passed into the liquid templates. - -All the data objects available to you in the templates via Liquid are outlined in the **API Section** of Jekyll-Bootstrap. -You can also read the original documentation here: <https://github.com/mojombo/jekyll/wiki/Template-Data> - -## Conclusion - -I hope this paints a clearer picture of what Jekyll is doing and why it works the way it does. -As noted, our main programming constraint is the fact that our API is limited to what is accessible via Liquid and Liquid only. - -Jekyll-bootstrap is intended to provide helper methods and strategies aimed at making it more intuitive and easier to work with Jekyll =) - -**Thank you** for reading this far. - -## Next Steps - -Please take a look at [{{ site.categories.api.first.title }}]({{ BASE_PATH }}{{ site.categories.api.first.url }}) -or jump right into [Usage]({{ BASE_PATH }}{{ site.categories.usage.first.url }}) if you'd like. 
\ No newline at end of file http://git-wip-us.apache.org/repos/asf/mahout/blob/3c53a6dc/website/front/atom.xml ---------------------------------------------------------------------- diff --git a/website/front/atom.xml b/website/front/atom.xml deleted file mode 100755 index 2b1f27b..0000000 --- a/website/front/atom.xml +++ /dev/null @@ -1,28 +0,0 @@ ---- -layout: null -title : Atom Feed ---- -<?xml version="1.0" encoding="utf-8"?> -<feed xmlns="http://www.w3.org/2005/Atom"> - - <title>{{ site.title | xml_escape }}</title> - <link href="{{ site.production_url }}{{ site.JB.atom_path }}" rel="self"/> - <link href="{{ site.production_url }}"/> - <updated>{{ site.time | date_to_xmlschema }}</updated> - <id>{{ site.production_url }}</id> - <author> - <name>{{ site.author.name | xml_escape }}</name> - <email>{{ site.author.email }}</email> - </author> - - {% for post in site.posts limit:20 %} - <entry> - <title>{{ post.title | xml_escape }}</title> - <link href="{{ site.production_url }}{{ post.url }}"/> - <updated>{{ post.date | date_to_xmlschema }}</updated> - <id>{{ site.production_url }}{{ post.id }}</id> - <content type="html">{{ post.content | xml_escape }}</content> - </entry> - {% endfor %} - -</feed> http://git-wip-us.apache.org/repos/asf/mahout/blob/3c53a6dc/website/front/community/blogs.md ---------------------------------------------------------------------- diff --git a/website/front/community/blogs.md b/website/front/community/blogs.md index b169455..9076db1 100644 --- a/website/front/community/blogs.md +++ b/website/front/community/blogs.md @@ -21,3 +21,4 @@ An introduction to the new Algorithms Framework How to setup Apache Mahout in IBM's Datascience Experience Notebooking Environment, and run a few trivial programs. 
+### http://www.weatheringthroughtechdays.com/2015/04/mahout-010x-first-mahout-release-as.html \ No newline at end of file http://git-wip-us.apache.org/repos/asf/mahout/blob/3c53a6dc/website/front/pages.html ---------------------------------------------------------------------- diff --git a/website/front/pages.html b/website/front/pages.html deleted file mode 100755 index bde1a32..0000000 --- a/website/front/pages.html +++ /dev/null @@ -1,13 +0,0 @@ ---- -layout: page -title: Pages -header: Pages -group: navigation ---- -{% include JB/setup %} - -<h2>All Pages</h2> -<ul> -{% assign pages_list = site.pages %} -{% include JB/pages_list %} -</ul> http://git-wip-us.apache.org/repos/asf/mahout/blob/3c53a6dc/website/front/rss.xml ---------------------------------------------------------------------- diff --git a/website/front/rss.xml b/website/front/rss.xml deleted file mode 100755 index 419c897..0000000 --- a/website/front/rss.xml +++ /dev/null @@ -1,28 +0,0 @@ ---- -layout: null -title : RSS Feed ---- - -<?xml version="1.0" encoding="UTF-8" ?> -<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"> -<channel> - <title>{{ site.title | xml_escape }}</title> - <description>{{ site.title | xml_escape }} - {{ site.author.name | xml_escape }}</description> - <link>{{ site.production_url }}</link> - <atom:link href="{{ site.production_url }}{{ site.JB.rss_path }}" rel="self" type="application/rss+xml" /> - <lastBuildDate>{{ site.time | date_to_rfc822 }}</lastBuildDate> - <pubDate>{{ site.time | date_to_rfc822 }}</pubDate> - <ttl>60</ttl> - -{% for post in site.posts limit:20 %} - <item> - <title>{{ post.title | xml_escape }}</title> - <description>{{ post.content | xml_escape }}</description> - <link>{{ site.production_url }}{{ post.url }}</link> - <guid>{{ site.production_url }}{{ post.id }}</guid> - <pubDate>{{ post.date | date_to_rfc822 }}</pubDate> - </item> -{% endfor %} - -</channel> -</rss> 
http://git-wip-us.apache.org/repos/asf/mahout/blob/3c53a6dc/website/front/tags.html ---------------------------------------------------------------------- diff --git a/website/front/tags.html b/website/front/tags.html deleted file mode 100755 index 5e216cb..0000000 --- a/website/front/tags.html +++ /dev/null @@ -1,21 +0,0 @@ ---- -layout: page -title: Tags -header: Posts By Tag -group: navigation ---- -{% include JB/setup %} - -<ul class="tag_box inline"> - {% assign tags_list = site.tags %} - {% include JB/tags_list %} -</ul> - - -{% for tag in site.tags %} - <h2 id="{{ tag[0] }}-ref">{{ tag[0] }}</h2> - <ul> - {% assign pages_list = tag[1] %} - {% include JB/pages_list %} - </ul> -{% endfor %} http://git-wip-us.apache.org/repos/asf/mahout/blob/3c53a6dc/website/old_site_migration/README.md ---------------------------------------------------------------------- diff --git a/website/old_site_migration/README.md b/website/old_site_migration/README.md deleted file mode 100644 index 794397b..0000000 --- a/website/old_site_migration/README.md +++ /dev/null @@ -1,41 +0,0 @@ - - -## Website Migration Triage - - -### 1. `./old-site` - -The original Mahout site was transferred to `mahout/website/oldsite`, where its -headers were replaced to be Jekyll compliant, along with some witchcraft on the -nav-bar to make the CSS compatible with the Jekyll Bootstrap themes - -All content was then moved to `mahout/website/old_site_migration/old_site` - -ALCON please go through files and move them to one of the following directories - -### 2a. `./dont_migrate` - -Content that is no longer relevant or is in such bad shape that it needs to be redone completely goes here - -### 2b. `./needs_work_convenience` - -Content that should be migrated but needs to be updated with new information, or other work. Please leave a note -at the top of what needs to be done. This content can be migrated at convenience, e.g.
is interesting and -would be good to bring over, but is not critical (site can go live with out this content). - -`./needs_work_convenience/map_reduce` has mapReduce related docs that may not actually need any work. - -### 2c. `./needs_work_priority` - -Content that should be migrated but needs updated. This is critical information that needs to be migrated -before site goes live. - - - -### 3. `./completed` - -When a file doesn't need work OR the work has been done on it- move a copy here, AND move a copy to the appropriate -location in `mahout/website/front` or `mahout/website/docs` - -(don't forget to add page to nav-bar) - http://git-wip-us.apache.org/repos/asf/mahout/blob/3c53a6dc/website/old_site_migration/completed/bankmarketing-example.md ---------------------------------------------------------------------- diff --git a/website/old_site_migration/completed/bankmarketing-example.md b/website/old_site_migration/completed/bankmarketing-example.md deleted file mode 100644 index 846a4ce..0000000 --- a/website/old_site_migration/completed/bankmarketing-example.md +++ /dev/null @@ -1,53 +0,0 @@ ---- -layout: default -title: -theme: - name: retro-mahout ---- - -Notice: Licensed to the Apache Software Foundation (ASF) under one - or more contributor license agreements. See the NOTICE file - distributed with this work for additional information - regarding copyright ownership. The ASF licenses this file - to you under the Apache License, Version 2.0 (the - "License"); you may not use this file except in compliance - with the License. You may obtain a copy of the License at - . - http://www.apache.org/licenses/LICENSE-2.0 - . - Unless required by applicable law or agreed to in writing, - software distributed under the License is distributed on an - "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY - KIND, either express or implied. See the License for the - specific language governing permissions and limitations - under the License. 
- -#Bank Marketing Example - -### Introduction - -This page describes how to run Mahout's SGD classifier on the [UCI Bank Marketing dataset](http://mlr.cs.umass.edu/ml/datasets/Bank+Marketing). -The goal is to predict if the client will subscribe to a term deposit offered via a phone call. The features in the dataset consist -of information such as age, job, marital status, as well as information about the last contacts from the bank. - -### Code & Data - -The bank marketing example code lives under - -*mahout-examples/src/main/java/org.apache.mahout.classifier.sgd.bankmarketing* - -The data can be found at - -*mahout-examples/src/main/resources/bank-full.csv* - -### Code details - -This example consists of 3 classes: - - - BankMarketingClassificationMain - - TelephoneCall - - TelephoneCallParser - -When you run the main method of BankMarketingClassificationMain it parses the dataset using the TelephoneCallParser and trains -a logistic regression model with 20 runs and 20 passes. The TelephoneCallParser uses Mahout's feature vector encoder -to encode the features in the dataset into a vector. Afterwards the model is tested and the learning rate, AUC, and accuracy are printed to standard output.
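The encode-then-train flow described above can be sketched in a few lines of Python. This is only an illustration of the general technique (hashed feature encoding followed by logistic regression trained with SGD), not the code in `BankMarketingClassificationMain`: the function names, the sparse-dict vector representation, the learning rate, and the toy records are all assumptions, and the sketch does a single run of 20 passes rather than 20 runs.

```python
import math
import zlib

NUM_FEATURES = 2 ** 20  # size of the hashed feature space (arbitrary choice)


def encode(record):
    """Hash each 'name=value' pair to an index (the hashing trick).

    Returns a sparse vector as {index: value}; index 0 is a bias term.
    """
    vec = {0: 1.0}
    for name, value in record.items():
        index = 1 + zlib.crc32(f"{name}={value}".encode()) % (NUM_FEATURES - 1)
        vec[index] = vec.get(index, 0.0) + 1.0
    return vec


def predict(weights, vec):
    """Logistic regression probability for one sparse vector."""
    z = sum(weights.get(i, 0.0) * x for i, x in vec.items())
    return 1.0 / (1.0 + math.exp(-z))


def train(records, labels, passes=20, learning_rate=0.5):
    """Plain SGD on the log loss, one record at a time."""
    weights = {}
    for _ in range(passes):
        for record, label in zip(records, labels):
            vec = encode(record)
            error = label - predict(weights, vec)
            for i, x in vec.items():
                weights[i] = weights.get(i, 0.0) + learning_rate * error * x
    return weights


# Made-up toy records, not rows from bank-full.csv.
records = [
    {"job": "student", "marital": "single"},
    {"job": "retired", "marital": "married"},
    {"job": "student", "marital": "married"},
    {"job": "retired", "marital": "single"},
]
labels = [1, 0, 1, 0]  # in this toy data the label depends only on `job`

weights = train(records, labels)
print([round(predict(weights, encode(r)), 2) for r in records])
```

Hashing keeps the vector size fixed no matter how many distinct category values appear, at the cost of occasional collisions; a large feature space makes collisions unlikely for a handful of values.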
\ No newline at end of file
http://git-wip-us.apache.org/repos/asf/mahout/blob/3c53a6dc/website/old_site_migration/completed/breiman-example.md
----------------------------------------------------------------------
diff --git a/website/old_site_migration/completed/breiman-example.md b/website/old_site_migration/completed/breiman-example.md
deleted file mode 100644
index d8d049e..0000000
--- a/website/old_site_migration/completed/breiman-example.md
+++ /dev/null
@@ -1,67 +0,0 @@
----
-layout: default
-title: Breiman Example
-theme:
-    name: retro-mahout
----
-
-# Breiman Example
-
-#### Introduction
-
-This page describes how to run the Breiman example, which implements the test procedure described in [Leo Breiman's paper](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.23.3999&rep=rep1&type=pdf). The basic algorithm is as follows:
-
- * repeat *I* iterations
- * in each iteration do
-     * keep 10% of the dataset apart as a testing set
-     * build two forests using the training set, one with *m = int(log2(M) + 1)* (called Random-Input) and one with *m = 1* (called Single-Input)
-     * choose the forest that gave the lowest out-of-bag (OOB) error estimate to compute
-the test set error
-     * compute the test set error using the Single-Input forest (test error);
-this demonstrates that even with *m = 1*, Decision Forests give comparable
-results to greater values of *m*
-     * compute the mean test set error using every tree of the chosen forest
-(tree error). This should indicate how well a single Decision Tree performs
- * compute the mean test error for all iterations
- * compute the mean tree error for all iterations
-
-
-#### Running the Example
-
-The current implementation is compatible with the [UCI repository](http://archive.ics.uci.edu/ml/) file format.
We'll show how to run this example on two datasets:
-
-First, we deal with [Glass Identification](http://archive.ics.uci.edu/ml/datasets/Glass+Identification): download the [dataset](http://archive.ics.uci.edu/ml/machine-learning-databases/glass/glass.data) file called **glass.data** and store it on your local machine. Next, we must generate the descriptor file **glass.info** for this dataset with the following command:
-
-    bin/mahout org.apache.mahout.classifier.df.tools.Describe -p /path/to/glass.data -f /path/to/glass.info -d I 9 N L
-
-Substitute */path/to/* with the folder where you downloaded the dataset. The argument "I 9 N L" indicates the nature of the variables. Here it means 1
-ignored (I) attribute, followed by 9 numerical (N) attributes, followed by
-the label (L).
-
-Finally, we build and evaluate our random forest classifier as follows:
-
-    bin/mahout org.apache.mahout.classifier.df.BreimanExample -d /path/to/glass.data -ds /path/to/glass.info -i 10 -t 100
-which builds 100 trees (-t argument) and repeats the test for 10 iterations (-i
-argument).
-
-The example outputs the following results:
-
- * Selection error: mean test error for the selected forest on all iterations
- * Single Input error: mean test error for the single input forest on all
-iterations
- * One Tree error: mean single tree error on all iterations
- * Mean Random Input Time: mean build time for random input forests on all
-iterations
- * Mean Single Input Time: mean build time for single input forests on all
-iterations
-
-We can repeat this for a [Sonar](http://archive.ics.uci.edu/ml/datasets/Connectionist+Bench+%28Sonar,+Mines+vs.+Rocks%29) use case: download the [dataset](http://archive.ics.uci.edu/ml/machine-learning-databases/undocumented/connectionist-bench/sonar/sonar.all-data) file called **sonar.all-data** and store it on your local machine.
Generate the descriptor file **sonar.info** for this dataset with the following command:
-
-    bin/mahout org.apache.mahout.classifier.df.tools.Describe -p /path/to/sonar.all-data -f /path/to/sonar.info -d 60 N L
-
-The argument "60 N L" means 60 numerical (N) attributes, followed by the label (L). Analogous to the previous case, we run the evaluation as follows:
-
-    bin/mahout org.apache.mahout.classifier.df.BreimanExample -d /path/to/sonar.all-data -ds /path/to/sonar.info -i 10 -t 100
-
-
-
http://git-wip-us.apache.org/repos/asf/mahout/blob/3c53a6dc/website/old_site_migration/completed/classification/bayesian.md
----------------------------------------------------------------------
diff --git a/website/old_site_migration/completed/classification/bayesian.md b/website/old_site_migration/completed/classification/bayesian.md
deleted file mode 100644
index 51a5c74..0000000
--- a/website/old_site_migration/completed/classification/bayesian.md
+++ /dev/null
@@ -1,147 +0,0 @@
----
-layout: default
-title:
-theme:
-    name: retro-mahout
----
-
-# Naive Bayes
-
-
-## Intro
-
-Mahout currently has two Naive Bayes implementations. The first is standard Multinomial Naive Bayes. The second is an implementation of Transformed Weight-normalized Complement Naive Bayes as introduced by Rennie et al. [[1]](http://people.csail.mit.edu/jrennie/papers/icml03-nb.pdf). We refer to the former as Bayes and the latter as CBayes.
-
-Where Bayes has long been a standard in text classification, CBayes is an extension of Bayes that performs particularly well on datasets with skewed classes and has been shown to be competitive with algorithms of higher complexity such as Support Vector Machines.
-
-
-## Implementations
-Both Bayes and CBayes are currently trained via MapReduce Jobs. Testing and classification can be done via a MapReduce Job or sequentially. Mahout provides CLI drivers for preprocessing, training and testing.
A Spark implementation is currently in the works ([MAHOUT-1493](https://issues.apache.org/jira/browse/MAHOUT-1493)).
-
-## Preprocessing and Algorithm
-
-As described in [[1]](http://people.csail.mit.edu/jrennie/papers/icml03-nb.pdf), Mahout Naive Bayes is broken down into the following steps (assignments are over all possible index values):
-
-- Let `\(\vec{d}=(\vec{d_1},...,\vec{d_n})\)` be a set of documents; `\(d_{ij}\)` is the count of word `\(i\)` in document `\(j\)`.
-- Let `\(\vec{y}=(y_1,...,y_n)\)` be their labels.
-- Let `\(\alpha_i\)` be a smoothing parameter for all words in the vocabulary; let `\(\alpha=\sum_i{\alpha_i}\)`.
-- **Preprocessing** (via seq2sparse): TF-IDF transformation and L2 length normalization of `\(\vec{d}\)`
-    1. `\(d_{ij} = \sqrt{d_{ij}}\)`
-    2. `\(d_{ij} = d_{ij}\left(\log{\frac{\sum_k1}{\sum_k\delta_{ik}+1}}+1\right)\)`
-    3. `\(d_{ij} =\frac{d_{ij}}{\sqrt{\sum_k{d_{kj}^2}}}\)`
-- **Training: Bayes**`\((\vec{d},\vec{y})\)` calculate term weights `\(w_{ci}\)` as:
-    1. `\(\hat\theta_{ci}=\frac{d_{ic}+\alpha_i}{\sum_k{d_{kc}}+\alpha}\)`, where `\(d_{ic}=\sum_{j:y_j=c}d_{ij}\)`
-    2. `\(w_{ci}=\log{\hat\theta_{ci}}\)`
-- **Training: CBayes**`\((\vec{d},\vec{y})\)` calculate term weights `\(w_{ci}\)` as:
-    1. `\(\hat\theta_{ci} = \frac{\sum_{j:y_j\neq c}d_{ij}+\alpha_i}{\sum_{j:y_j\neq c}{\sum_k{d_{kj}}}+\alpha}\)`
-    2. `\(w_{ci}=-\log{\hat\theta_{ci}}\)`
-    3. `\(w_{ci}=\frac{w_{ci}}{\sum_i \lvert w_{ci}\rvert}\)`
-- **Label Assignment/Testing:**
-    1. Let `\(\vec{t}= (t_1,...,t_n)\)` be a test document; let `\(t_i\)` be the count of word `\(i\)` in `\(\vec{t}\)`.
-    2. Label the document according to `\(l(t)=\arg\max_c \sum\limits_{i} t_i w_{ci}\)`
-
-As we can see, the main difference between Bayes and CBayes is the weight calculation step. Where Bayes weighs terms more heavily based on the likelihood that they belong to class `\(c\)`, CBayes seeks to maximize term weights on the likelihood that they do not belong to any other class.
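
The training and label-assignment steps above can be sketched in a few lines of plain Python. This is a minimal illustration of the formulas only, not Mahout's MapReduce implementation: the function names (`class_term_totals`, `train_bayes`, `train_cbayes`, `classify`) and the toy data are hypothetical, the input vectors are assumed to be already preprocessed, and a single smoothing value `alpha_i` is assumed for every term so that `alpha = alpha_i * (number of terms)`.

```python
import math


def class_term_totals(docs, labels, keep):
    """Sum term weights over the documents whose label satisfies keep(label)."""
    totals = [0.0] * len(docs[0])
    for doc, y in zip(docs, labels):
        if keep(y):
            for i, value in enumerate(doc):
                totals[i] += value
    return totals


def train_bayes(docs, labels, alpha_i=1.0):
    """Bayes: w_ci = log((d_ic + alpha_i) / (sum_k d_kc + alpha))."""
    alpha = alpha_i * len(docs[0])  # assumption: same alpha_i for every term
    weights = {}
    for c in sorted(set(labels)):
        d_c = class_term_totals(docs, labels, lambda y: y == c)
        denom = sum(d_c) + alpha
        weights[c] = [math.log((d + alpha_i) / denom) for d in d_c]
    return weights


def train_cbayes(docs, labels, alpha_i=1.0):
    """CBayes: theta from all classes except c, w = -log(theta), then L1-normalize."""
    alpha = alpha_i * len(docs[0])  # assumption: same alpha_i for every term
    weights = {}
    for c in sorted(set(labels)):
        comp = class_term_totals(docs, labels, lambda y: y != c)
        denom = sum(comp) + alpha
        w = [-math.log((d + alpha_i) / denom) for d in comp]
        norm = sum(abs(x) for x in w)
        weights[c] = [x / norm for x in w]
    return weights


def classify(weights, t):
    """Return the class c that maximizes sum_i t_i * w_ci."""
    return max(weights, key=lambda c: sum(ti * wi for ti, wi in zip(t, weights[c])))


# Made-up toy data: 4 documents over a 3-term vocabulary, two classes.
docs = [[3.0, 0.0, 1.0], [2.0, 0.0, 0.0], [0.0, 4.0, 1.0], [0.0, 3.0, 0.0]]
labels = ["a", "a", "b", "b"]
print(classify(train_bayes(docs, labels), [2.0, 0.0, 0.0]))
print(classify(train_cbayes(docs, labels), [0.0, 1.0, 0.0]))
```

Because CBayes negates the log of the complement estimate, the same `arg max` rule serves both models, which is why `classify` is shared.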
-
-## Running from the command line
-
-Mahout provides CLI drivers for all of the above steps. Here we give a simple overview of the Mahout CLI commands used to preprocess the data, train the model, and assign labels to the training set. An [example script](https://github.com/apache/mahout/blob/master/examples/bin/classify-20newsgroups.sh) is given for the full process from data acquisition through classification of the classic [20 Newsgroups corpus](https://mahout.apache.org/users/classification/twenty-newsgroups.html).
-
-- **Preprocessing:**
-For a set of SequenceFile-formatted documents in PATH_TO_SEQUENCE_FILES, the [mahout seq2sparse](https://mahout.apache.org/users/basics/creating-vectors-from-text.html) command performs the TF-IDF transformations (-wt tfidf option) and L2 length normalization (-n 2 option) as follows:
-
-        mahout seq2sparse \
-          -i ${PATH_TO_SEQUENCE_FILES} \
-          -o ${PATH_TO_TFIDF_VECTORS} \
-          -nv \
-          -n 2 \
-          -wt tfidf
-
-- **Training:**
-The model is then trained using `mahout trainnb`. The default is to train a Bayes model. The -c option is given to train a CBayes model:
-
-        mahout trainnb \
-          -i ${PATH_TO_TFIDF_VECTORS} \
-          -o ${PATH_TO_MODEL}/model \
-          -li ${PATH_TO_MODEL}/labelindex \
-          -ow \
-          -c
-
-- **Label Assignment/Testing:**
-Classification and testing on a holdout set can then be performed via `mahout testnb`. Again, the -c option indicates that the model is CBayes. The -seq option tells `mahout testnb` to run sequentially:
-
-        mahout testnb \
-          -i ${PATH_TO_TFIDF_TEST_VECTORS} \
-          -m ${PATH_TO_MODEL}/model \
-          -l ${PATH_TO_MODEL}/labelindex \
-          -ow \
-          -o ${PATH_TO_OUTPUT} \
-          -c \
-          -seq
-
-## Command line options
-
-- **Preprocessing:**
-
-  Only relevant parameters used for Bayes/CBayes as detailed above are shown. Several other transformations can be performed by `mahout seq2sparse` and used as input to Bayes/CBayes.
For a full list of `mahout seq2sparse` options see the [Creating vectors from text](https://mahout.apache.org/users/basics/creating-vectors-from-text.html) page.
-
-        mahout seq2sparse
-          --output (-o) output             The directory pathname for output.
-          --input (-i) input               Path to job input directory.
-          --weight (-wt) weight            The kind of weight to use. Currently TF
-                                           or TFIDF. Default: TFIDF
-          --norm (-n) norm                 The norm to use, expressed as either a
-                                           float or "INF" if you want to use the
-                                           Infinite norm. Must be greater or equal
-                                           to 0. The default is not to normalize
-          --overwrite (-ow)                If set, overwrite the output directory
-          --sequentialAccessVector (-seq)  (Optional) Whether output vectors should
-                                           be SequentialAccessVectors. If set true
-                                           else false
-          --namedVector (-nv)              (Optional) Whether output vectors should
-                                           be NamedVectors. If set true else false
-
-- **Training:**
-
-        mahout trainnb
-          --input (-i) input               Path to job input directory.
-          --output (-o) output             The directory pathname for output.
-          --alphaI (-a) alphaI             Smoothing parameter. Default is 1.0
-          --trainComplementary (-c)        Train complementary? Default is false.
-          --labelIndex (-li) labelIndex    The path to store the label index in
-          --overwrite (-ow)                If present, overwrite the output directory
-                                           before running job
-          --help (-h)                      Print out help
-          --tempDir tempDir                Intermediate output directory
-          --startPhase startPhase          First phase to run
-          --endPhase endPhase              Last phase to run
-
-- **Testing:**
-
-        mahout testnb
-          --input (-i) input               Path to job input directory.
-          --output (-o) output             The directory pathname for output.
-          --overwrite (-ow)                If present, overwrite the output directory
-                                           before running job
-
-
-          --model (-m) model               The path to the model built during training
-          --testComplementary (-c)         Test complementary? Default is false.
-          --runSequential (-seq)           Run sequential?
-          --labelIndex (-l) labelIndex     The path to the location of the label index
-          --help (-h)                      Print out help
-          --tempDir tempDir                Intermediate output directory
-          --startPhase startPhase          First phase to run
-          --endPhase endPhase              Last phase to run
-
-
-## Examples
-
-Mahout provides an example for Naive Bayes classification:
-
-1. [Classify 20 Newsgroups](twenty-newsgroups.html)
-
-## References
-
-[1]: Jason D. M. Rennie, Lawrence Shih, Jaime Teevan, David Karger (2003). [Tackling the Poor Assumptions of Naive Bayes Text Classifiers](http://people.csail.mit.edu/jrennie/papers/icml03-nb.pdf). Proceedings of the Twentieth International Conference on Machine Learning (ICML-2003).
-
-
