http://git-wip-us.apache.org/repos/asf/mahout/blob/5112e9ec/docs/latest/algorithms/reccomenders/cco.html ---------------------------------------------------------------------- diff --git a/docs/latest/algorithms/reccomenders/cco.html b/docs/latest/algorithms/reccomenders/cco.html index 4808fb4..59baa1b 100644 --- a/docs/latest/algorithms/reccomenders/cco.html +++ b/docs/latest/algorithms/reccomenders/cco.html @@ -1,312 +1,169 @@ - - <!DOCTYPE html> -<html lang="en"> +<html lang=" en "> + <head> <meta charset="utf-8"> <meta http-equiv="X-UA-Compatible" content="IE=edge"> + <meta name="viewport" content="width=device-width, initial-scale=1"> - <title>Intro to Cooccurrence Recommenders with Spark</title> - - <meta name="author" content="Apache Mahout"> - - <!-- Enable responsive viewport --> - <meta name="viewport" content="width=device-width, initial-scale=1.0"> - - <!-- Bootstrap styles --> - <link href="/assets/themes/mahout3/css/bootstrap.min.css" rel="stylesheet"> - <!-- Optional theme --> - <link href="/assets/themes/mahout3/css/bootstrap-theme.min.css" rel="stylesheet"> - <!-- Sticky Footer --> - <link href="/assets/themes/mahout3/css/bs-sticky-footer.css" rel="stylesheet"> - - <!-- Custom styles --> - <link href="/assets/themes/mahout3/css/style.css" rel="stylesheet" type="text/css" media="all"> - - <!-- HTML5 Shim and Respond.js IE8 support of HTML5 elements and media queries --> - <!-- WARNING: Respond.js doesn't work if you view the page via file:// --> - <!--[if lt IE 9]> - <script src="https://oss.maxcdn.com/libs/html5shiv/3.7.0/html5shiv.js"></script> - <script src="https://oss.maxcdn.com/libs/respond.js/1.3.0/respond.min.js"></script> - <![endif]--> - - <!-- Fav and touch icons --> - <!-- Update these with your own images - <link rel="shortcut icon" href="images/favicon.ico"> - <link rel="apple-touch-icon" href="images/apple-touch-icon.png"> - <link rel="apple-touch-icon" sizes="72x72" href="images/apple-touch-icon-72x72.png"> - <link rel="apple-touch-icon" sizes="114x114" href="images/apple-touch-icon-114x114.png"> - --> - - <!-- atom & rss feed --> - <link href="/atom.xml" type="application/atom+xml" rel="alternate" title="Sitewide ATOM Feed"> - <link href="/rss.xml" type="application/rss+xml" rel="alternate" title="Sitewide RSS Feed"> - <script type="text/x-mathjax-config"> - MathJax.Hub.Config({ - tex2jax: { - skipTags: ['script', 'noscript', 'style', 'textarea', 'pre'] - } - }); - MathJax.Hub.Queue(function() { - var all = MathJax.Hub.getAllJax(), i; - for(i = 0; i < all.length; i += 1) { - all[i].SourceElement().parentNode.className += ' has-jax'; - } - }); - </script> - <script type="text/javascript"> - var mathjax = document.createElement('script'); - mathjax.type = 'text/javascript'; - mathjax.async = true; - - mathjax.src = ('https:' == document.location.protocol) ? - 'https://c328740.ssl.cf1.rackcdn.com/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML' : - 'http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML'; - - var s = document.getElementsByTagName('script')[0]; - s.parentNode.insertBefore(mathjax, s); - </script> -</head> - -<nav class="navbar navbar-default navbar-fixed-top"> - <div class="container-fluid"> - <!-- Brand and toggle get grouped for better mobile display --> - <div class="navbar-header"> - <button type="button" class="navbar-toggle collapsed" data-toggle="collapse" data-target="#bs-example-navbar-collapse-1" aria-expanded="false"> - <span class="sr-only">Toggle navigation</span> - <span class="icon-bar"></span> - <span class="icon-bar"></span> - <span class="icon-bar"></span> - </button> - <a class="navbar-brand" href="/"> - <img src="/assets/img/Mahout-logo-82x100.png" height="30" alt="I'm mahout"> - </a> - </div> - + <title> + Intro to Cooccurrence Recommenders with Spark + </title> + <meta name="description" content="Distributed Linear Algebra"> -<!-- Collect the nav links, forms, and other content for toggling --> -<div class="collapse navbar-collapse" id="main-navbar"> - <ul class="nav navbar-nav"> - - <!-- Quick Start --> - <li id="quickstart"> - <a href="/index.html" >Mahout Overview</a> - </li> - - <li id="dropdown"> - <a href="#" class="dropdown-toggle" data-toggle="dropdown" role="button" aria-haspopup="true" aria-expanded="false">Key Concepts<span class="caret"></span></a> - <ul class="dropdown-menu"> - <li><a href="/index.html">Mahout Overview</a></li> - <li><span><b> Scala DSL</b><span></li> - <li><a href="/mahout-samsara/in-core-reference.html">In-core Reference</a></li> - <li><a href="/mahout-samsara/out-of-core-reference.html">Out-of-core Reference</a></li> - <li><a href="/mahout-samsara/faq.html">Samsara FAQ</a></li> - <li role="separator" class="divider"></li> - <li><span> <b>Bindings</b><span></li> - <li><a href="/distributed/spark-bindings/">Spark Bindings</a></li> - <li><a href="/distributed/flink-bindings.html">Flink Bindings</a></li> - <li><a href="/distributed/flink-bindings.html">H20 Bindings</a></li> - <!--<li role="separator" class="divider"></li> - <li><span> <b>Native Solvers</b><span></li> - <li><a href="/native-solvers/viennacl.html">ViennaCL</a></li> - <li><a href="/native-solvers/viennacl-omp.html">ViennaCL-OMP</a></li> - <li><a href="/native-solvers/cuda.html">CUDA</a></li>--> - </ul> - </li> - - <li id="dropdown"> - <a href="#" class="dropdown-toggle" data-toggle="dropdown" role="button" aria-haspopup="true" aria-expanded="false">Tutorials<span class="caret"></span></a> - <ul class="dropdown-menu"> - <li><span> <b>Reccomenders</b><span></li> - <li><a href="/tutorials/cco-lastfm">CCO Example with Last.FM Data</a></li> - <li><a href="/tutorials/intro-cooccurrence-spark">Introduction to Cooccurrence in Spark</a></li> - <li role="separator" class="divider"></li> - <li><span> <b>Mahout Samsara</b><span></li> - <li><a href="/tutorials/samsara/play-with-shell.html">Playing with Samsara in Spark Shell</a></li> - <li><a href="/tutorials/samsara/playing-with-samsara-flink-batch.html">Playing with Samsara in Flink Batch</a></li> - <li><a href="/tutorials/samsara/classify-a-doc-from-the-shell.html">Text Classification (Shell)</a></li> - <li><a href="/tutorials/samsara/spark-naive-bayes.html">Spark Naive Bayes</a></li> - <li role="separator" class="divider"></li> - <li><span> <b>Misc</b><span></li> - <li><a href="/tutorials/misc/mahout-in-zeppelin">Mahout in Apache Zeppelin</a></li> - <li><a href="/tutorials/misc/contributing-algos">How To Contribute a New Algorithm</a></li> - <li><a href="/tutorials/misc/how-to-build-an-app.html">How To Build An App</a></li> - <li role="separator" class="divider"></li> - <li><span> <b>Deprecated</b><span></li> - <li><a href="/tutorials/map-reduce">MapReduce</a></li> - </ul> - </li> - - - <!-- Algorithms (Samsara / MR) --> - <li id="dropdown"> - <a href="#" class="dropdown-toggle" data-toggle="dropdown" role="button" aria-haspopup="true" aria-expanded="false">Algorithms<span class="caret"></span></a> - <ul class="dropdown-menu"> - <li><a href="/algorithms/linear-algebra">Distributed Linear Algebra</a></li> - <li><a href="/algorithms/preprocessors">Preprocessors</a></li> - <li><a href="/algorithms/regression">Regression</a></li> - <li><a href="/algorithms/reccomenders">Reccomenders</a></li> - <li role="separator" class="divider"></li> - <li><a href="/algorithms/map-reduce">MapReduce <i>(deprecated)</i></a></li> - </ul> - <!--<li><a href="/algorithms/reccomenders/recommender-overview.html">Reccomender Overview</a></li> Do we still need? seems like short version of next post--> - <!-- - <li><a href="/algorithms/reccomenders/intro-cooccurrence-spark.html">Intro to Coocurrence With Spark</a></li> - <li role="separator" class="divider"></li> - <li><span> <a href="/algorithms/map-reduce"><b>MapReduce</b> (deprecated)</a><span></li> + <link rel="stylesheet" href="/assets/css/main.css"> + <!-- Font Awesome --> + <link href="https://maxcdn.bootstrapcdn.com/font-awesome/4.7.0/css/font-awesome.min.css" rel="stylesheet" integrity="sha384-wvfXpqpZZVQGK6TAh5PVlGOfQNHSoD2xbE+QkPxCAFlNEevoEH3Sl0sibVcOQVnN" crossorigin="anonymous"> - --> - </li> + <!-- Google Fonts --> + <link href="https://fonts.googleapis.com/css?family=Maven+Pro:400,500" rel="stylesheet"> + <link href="https://fonts.googleapis.com/css?family=Muli:400,400i,700,700i" rel="stylesheet"> - <!-- Scala Docs --> - <li id="dropdown"> - <a href="#" class="dropdown-toggle" data-toggle="dropdown" role="button" aria-haspopup="true" aria-expanded="false">API Docs<span class="caret"></span></a> - <ul class="dropdown-menu"> - <li><a href="/0.13.0/api/index.html">0.13.0</a></li> - </ul> - </li> - - - </ul> - <form class="navbar-form navbar-left"> - <div class="form-group"> - <input type="text" class="form-control" placeholder="Search"> - </div> - <button type="submit" class="btn btn-default">Submit</button> - </form> - <ul class="nav navbar-nav navbar-right"> - <li><a href="http://github.com/apache/mahout">Github</a></li> - - <!-- Apache --> - <li class="dropdown"> - <a href="#" class="dropdown-toggle" data-toggle="dropdown" role="button" aria-haspopup="true" aria-expanded="false">Apache <span class="caret"></span></a> - <ul class="dropdown-menu"> - <li><a href="http://www.apache.org/foundation/how-it-works.html">Apache Software Foundation</a></li> - <li><a href="http://www.apache.org/licenses/">Apache License</a></li> - <li><a href="http://www.apache.org/foundation/sponsorship.html">Sponsorship</a></li> - <li><a href="http://www.apache.org/foundation/thanks.html">Thanks</a></li> - </ul> - </li> + <link rel="canonical" href="http://mahout.apache.org//docs/latest/algorithms/reccomenders/cco.html"> + <link rel="alternate" type="application/rss+xml" title="Apache Mahout" href="/%20/feed.xml"> - </ul> -</div><!-- /.navbar-collapse --> - </div><!-- /.container-fluid --> -</nav> +</head> + <body> -<div id="wrap"> - <body class=""> + <nav class="navbar navbar-expand-lg navbar-light bg-light navbar-mahout"> + + <div class="container"> + + <a class="navbar-brand" href="/"> + <img src="/assets/mahout-logo-blue.svg" alt=""> + </a> + + <button class="navbar-toggler" type="button" data-toggle="collapse" data-target="#navbarSupportedContent" aria-controls="navbarSupportedContent" aria-expanded="false" aria-label="Toggle navigation"> + <span class="navbar-toggler-icon"></span> + </button> + + <div class="collapse navbar-collapse" id="navbarSupportedContent"> + + <div class="navbar-nav ml-auto"> + + <!-- Quick Start --> + <li class="nav-item"> + <a class="nav-link" href="//docs/latest/" >Mahout Overview</a> + </li> + + <li class="nav-item dropdown"> + <a class="nav-link dropdown-toggle" href="" id="navbarDropdownMenuLink" data-toggle="dropdown" aria-haspopup="true" aria-expanded="false">Key Concepts</a> + <div class="dropdown-menu" aria-labelledby="navbarDropdownMenuLink"> + <a class="dropdown-item" href="/docs/latest/index.html">Mahout Overview</a> + <div class="dropdown-divider"></div> + <h6 class="dropdown-header">Scala DSL</h6> + <a class="dropdown-item" href="/docs/latest/mahout-samsara/in-core-reference.html">In-core Reference</a> + <a class="dropdown-item" href="/docs/latest/mahout-samsara/out-of-core-reference.html">Out-of-core Reference</a> + <a class="dropdown-item" href="/docs/latest/mahout-samsara/faq.html">Samsara FAQ</a> + <div class="dropdown-divider"></div> + <h6 class="dropdown-header">Distributed Engine Bindings</h6> + <a class="dropdown-item" href="/docs/latest/distributed/spark-bindings/">Spark Bindings</a> + <a class="dropdown-item" href="/docs/latest/distributed/flink-bindings.html">Flink Bindings</a> + <a class="dropdown-item" href="/docs/latest/distributed/flink-bindings.html">H20 Bindings</a> + <!--<div class="dropdown-divider"></div> + <h6 class="dropdown-header">Native Solvers</h6> + <a class="dropdown-item" href="/docs/latest/native-solvers/viennacl.html">ViennaCL</a></li> + <a class="dropdown-item" href="/docs/latest/native-solvers/viennacl-omp.html">ViennaCL-OMP</a></li> + <a class="dropdown-item" href="/docs/latest/native-solvers/cuda.html">CUDA</a></li>--> + </div> + </li> + + <li class="nav-item dropdown"> + <a class="nav-link dropdown-toggle" href="" id="navbarDropdownMenuLink" data-toggle="dropdown" aria-haspopup="true" aria-expanded="false">Tutorial</a> + <div class="dropdown-menu" aria-labelledby="navbarDropdownMenuLink"> + <div class="dropdown-divider"></div> + <h6 class="dropdown-header">Reccomenders</h6> + <a class="dropdown-item" href="/docs/latest/tutorials/cco-lastfm">CCO Example with Last.FM Data</a> + <a class="dropdown-item" href="/docs/latest/tutorials/intro-cooccurrence-spark">Introduction to Cooccurrence in Spark</a> + <div class="dropdown-divider"></div> + <h6 class="dropdown-header">Mahout Samsara</h6> + <a class="dropdown-item" href="/docs/latest/tutorials/samsara/play-with-shell.html">Playing with Samsara in Spark Shell</a> + <a class="dropdown-item" href="/docs/latest/tutorials/samsara/playing-with-samsara-flink-batch.html">Playing with Samsara in Flink Batch</a> + <a class="dropdown-item" href="/docs/latest/tutorials/samsara/classify-a-doc-from-the-shell.html">Text Classification (Shell)</a> + <a class="dropdown-item" href="/docs/latest/tutorials/samsara/spark-naive-bayes.html">Spark Naive Bayes</a> + <div class="dropdown-divider"></div> + <h6 class="dropdown-header">Misc</h6> + <a class="dropdown-item" href="/docs/latest/tutorials/misc/mahout-in-zeppelin">Mahout in Apache Zeppelin</a> + <a class="dropdown-item" href="/docs/latest/tutorials/misc/contributing-algos">How To Contribute a New Algorithm</a> + <a class="dropdown-item" href="/docs/latest/tutorials/misc/how-to-build-an-app.html">How To Build An App</a> + <div class="dropdown-divider"></div> + <h6 class="dropdown-header">Deprecated</h6> + <a class="dropdown-item" href="/docs/latest/tutorials/map-reduce">MapReduce</a> + </div> + </li> + + + <!-- Algorithms (Samsara / MR) --> + <li class="nav-item dropdown"> + <a class="nav-link dropdown-toggle" href="" id="navbarDropdownMenuLink" data-toggle="dropdown" aria-haspopup="true" aria-expanded="false">Algorithms</a> + <div class="dropdown-menu" aria-labelledby="navbarDropdownMenuLink"> + <a class="dropdown-item" href="/docs/latest/algorithms/linear-algebra">Distributed Linear Algebra</a> + <a class="dropdown-item" href="/docs/latest/algorithms/preprocessors">Preprocessors</a> + <a class="dropdown-item" href="/docs/latest/algorithms/regression">Regression</a> + <a class="dropdown-item" href="/docs/latest/algorithms/reccomenders">Reccomenders</a> + <div class="dropdown-divider"></div> + <h6 class="dropdown-header">Deprecated</h6> + <a class="dropdown-item" href="/docs/latest/algorithms/map-reduce">MapReduce <i>(deprecated)</i></a> + </div> + <!--<a class="dropdown-item" href="/docs/latest/algorithms/reccomenders/recommender-overview.html">Reccomender Overview</a></li> Do we still need? seems like short version of next post--> + <!-- + <a class="dropdown-item" href="/docs/latest/algorithms/reccomenders/intro-cooccurrence-spark.html">Intro to Coocurrence With Spark</a></li> + <li role="separator" class="divider"></li> + <li><span> <a href="/docs/latest/algorithms/map-reduce"><b>MapReduce</b> (deprecated)</a><span></li> + + + --> + </li> + + <!-- Scala /docs --> + <li class="nav-item dropdown"> + <a class="nav-link dropdown-toggle" href="" id="navbarDropdownMenuLink" data-toggle="dropdown" aria-haspopup="true" aria-expanded="false">API /docs</a> + <div class="dropdown-menu" aria-labelledby="navbarDropdownMenuLink"> + <a class="dropdown-item" href="/docs/latest/0.13.0/api/index.html">0.13.0</a> + </div> + </li> + + <!-- Apache --> + <li class="nav-item dropdown"> + <a class="nav-link dropdown-toggle" href="" id="navbarDropdownMenuLink" data-toggle="dropdown" aria-haspopup="true" aria-expanded="false">Apache</a> + <div class="dropdown-menu" aria-labelledby="navbarDropdownMenuLink"> + <a class="dropdown-item" href="http://www.apache.org/foundation/how-it-works.html">Apache Software Foundation</a> + <a class="dropdown-item" href="http://www.apache.org/licenses/">Apache License</a> + <a class="dropdown-item" href="http://www.apache.org/foundation/sponsorship.html">Sponsorship</a> + <a class="dropdown-item" href="http://www.apache.org/foundation/thanks.html">Thanks</a> + </div> + </li> - <div class="container"> - + </ul> + + <!--<form class="navbar-form navbar-left">--> + <!--<div class="form-group">--> + <!--<input type="text" class="form-control" placeholder="Search">--> + <!--</div>--> + <!--<button type="submit" class="btn btn-default">Submit</button>--> + <!--</form>--> + <!--<ul class="nav navbar-nav navbar-right">--> + <!--<a class="dropdown-item" href="http://github.com/apache/mahout">Github</a></li>--> -<div class="row"> - <div class="col-md-3"> - <div id="AlgoMenu"> - <span><b>Mahout-Samsara Algorithms</b></span> - <div class="list-group panel"> - <a href="#linalg" class="list-group-item list-group-item-success" data-toggle="collapse" data-parent="#AlgoMenu"><b>Linear Algebra</b><i class="fa fa-caret-down"></i></a> - <div class="collapse" id="linalg"> - <ul class="nav sidebar-nav"> - <li> <a href="/algorithms/linear-algebra/d-qr.html">Distributed QR Decomposition</a></li> - <li> <a href="/algorithms/linear-algebra/d-spca.html">Distributed Stochastic Principal Component Analysis</a></li> - <li> <a href="/algorithms/linear-algebra/d-ssvd.html">Distributed Stochastic Singular Value Decomposition</a></li> - </ul> - </div> - <a href="#clustering" class="list-group-item list-group-item-success" data-toggle="collapse" data-parent="#AlgoMenu"><b>Clustering</b><i class="fa fa-caret-down"></i></a> - <div class="collapse" id="clustering"> - <ul class="nav sidebar-nav"> - <li> <a href="/algorithms/clustering">Clustering Algorithms</a></li> - <li> <a href="/algorithms/clustering/distance-metrics.html">Distance Metrics</a></li> - <li> <a href="/algorithms/clustering/canopy">Canopy Clustering</a></li> - </ul> - </div> - <a href="#preprocessors" class="list-group-item list-group-item-success" data-toggle="collapse" data-parent="#AlgoMenu"><b>Preprocessors</b><i class="fa fa-caret-down"></i></a> - <div class="collapse" id="preprocessors"> - <ul class="nav sidebar-nav"> - <li> <a href="/algorithms/preprocessors/AsFactor.html">AsFactor (a.k.a. One-Hot-Encoding)</a></li> - <li> <a href="/algorithms/preprocessors/StandardScaler.html">StandardScaler</a></li> - <li> <a href="/algorithms/preprocessors/MeanCenter.html">MeanCenter</a></li> - </ul> - </div> - <a href="#regression" class="list-group-item list-group-item-success" data-toggle="collapse" data-parent="#AlgoMenu"><b>Regression</b><i class="fa fa-caret-down"></i></a> - <div class="collapse" id="regression"> - <ul class="nav sidebar-nav"> - <a href="#serial-correlation" class="list-group-item list-group-item-success" data-toggle="collapse" data-parent="#regression"><b>• Serial Correlation</b><i class="fa fa-caret-down"></i></a> - <div class="collapse" id="serial-correlation"> - <ul class="nav sidebar-nav"> - <li> <a href="/algorithms/regression/serial-correlation/cochrane-orcutt.html">Cochrane-Orcutt Procedure</a></li> - <li> <a href="/algorithms/regression/serial-correlation/dw-test.html">Durbin Watson Test</a></li> - </ul> - </div> - <li> <a href="/algorithms/regression/ols.html">Ordinary Least Squares (Closed Form)</a></li> - <li> <a href="/algorithms/regression/fittness-tests.html">Fitness Tests</a></li> - </ul> - </div> - <a href="#reccomenders" class="list-group-item list-group-item-success" data-toggle="collapse" data-parent="#AlgoMenu"><b>Reccomenders</b><i class="fa fa-caret-down"></i></a> - <div class="collapse" id="reccomenders"> - <ul class="nav sidebar-nav"> - <li> <a href="/algorithms/reccomenders">Reccomender Overview</a></li> - <li> <a href="/algorithms/reccomenders/cco.html">CCO</a></li> - <li> <a href="/algorithms/reccomenders/d-als.html">Distributed Alternating Least Squares</a></li> - </ul> - </div> - </div> - <span><b>Map Reduce Algorithms</b> (deprecated)</span> - <div class="list-group panel"> - <a href="#classification" class="list-group-item list-group-item-success" data-toggle="collapse" data-parent="#AlgoMenu"><b>Classification</b><i class="fa fa-caret-down"></i></a> - <div class="collapse" id="classification"> - <ul class="nav sidebar-nav"> - <li> <a href="/algorithms/map-reduce/classification/bayesian.html">Bayesian</a></li> - <li> <a href="/algorithms/map-reduce/classification/class-discovery.html">Class Discovery</a></li> - <li> <a href="/algorithms/map-reduce/classification/classifyingyourdata.html">Classifying Your Data</a></li> - <li> <a href="/algorithms/map-reduce/classification/collocations.html">Collocation</a></li> - <li> <a href="/algorithms/map-reduce/classification/gaussian-discriminative-analysis.html">Gaussian Discriminative Analysis</a></li> - <li> <a href="/algorithms/map-reduce/classification/hidden-markov-models.html">Hidden Markov Models</a></li> - <li> <a href="/algorithms/map-reduce/classification/independent-component-analysis.html">Independent Component Analysis</a></li> - <li> <a href="/algorithms/map-reduce/classification/locally-weighted-linear-regression.html">Locally Weighted Linear Regression</a></li> - <li> <a href="/algorithms/map-reduce/classification/logistic-regression.html">Logistic Regression</a></li> - <li> <a href="/algorithms/map-reduce/classification/mahout-collections.html">Mahout Collections</a></li> - <li> <a href="/algorithms/map-reduce/classification/mlp.html">Multilayer Perceptron</a></li> - <li> <a href="/algorithms/map-reduce/classification/naivebayes.html">Naive Bayes</a></li> - <li> <a href="/algorithms/map-reduce/classification/neural-network.html">Neural Networks</a></li> - <li> <a href="/algorithms/map-reduce/classification/partial-implementation.html">Partial Implementation</a></li> - <li> <a href="/algorithms/map-reduce/classification/random-forrests.html">Random Forrests</a></li> - <li> <a href="/algorithms/map-reduce/classification/restricted-boltzman-machines.html">Restricted Boltzman Machines</a></li> - <li> <a href="/algorithms/map-reduce/classification/support-vector-machines.html">Support Vector Machines</a></li> - </ul> - </div> - <a href="#mr-clustering" class="list-group-item list-group-item-success" data-toggle="collapse" data-parent="#AlgoMenu"><b>Clustering</b><i class="fa fa-caret-down"></i></a> - <div class="collapse" id="mr-clustering"> - <ul class="nav sidebar-nav"> - <li> <a href="/algorithms/map-reduce/clustering/canopy-clustering.html">Canopy Clustering</a></li> - <li> <a href="/algorithms/map-reduce/clustering/cluster-dumper.html">Cluster Dumper</a></li> - <li> <a href="/algorithms/map-reduce/clustering/expectation-maximization.html">Expectation Maximization</a></li> - <li> <a href="/algorithms/map-reduce/clustering/fuzzy-k-means.html">Fuzzy K-Means</a></li> - <li> <a href="/algorithms/map-reduce/clustering/hierarchical-clustering.html">Hierarchical Clustering</a></li> - <li> <a href="/algorithms/map-reduce/clustering/k-means-clustering.html">K-Means Clustering</a></li> - <li> <a href="/algorithms/map-reduce/clustering/latent-dirichlet-allocation.html">Latent Dirichlet Allocation</a></li> - <li> <a href="/algorithms/map-reduce/clustering/llr---log-likelihood-ratio.html">Log Likelihood Ratio</a></li> - <li> <a href="/algorithms/map-reduce/clustering/spectral-clustering.html">Spectral Clustering</a></li> - <li> <a href="/algorithms/map-reduce/clustering/streaming-k-means.html">Streaming K-Means</a></li> - </ul> - </div> - </div> -</div> + <!--</ul>--> + </div><!-- /.navbar-collapse --> </div> +</nav> + + <div class="container mt-5 pb-4"> + + <div class="row"> - <div class="col-md-8"> - <div class="page-header"> - <h1>Intro to Cooccurrence Recommenders with Spark </h1> - </div> - <p>Mahout provides several important building blocks for creating recommendations using Spark. <em>spark-itemsimilarity</em> can + <div class="col-lg-8"> + <p>Mahout provides several important building blocks for creating recommendations using Spark. <em>spark-itemsimilarity</em> can be used to create âother people also liked these thingsâ type recommendations and paired with a search engine can personalize recommendations for individual users. <em>spark-rowsimilarity</em> can provide non-personalized content based recommendations and when paired with a search engine can be used to personalize content based recommendations.</p> @@ -336,7 +193,7 @@ For instance they might say an item-view is 0.2 of an item purchase. In practice cross-cooccurrence is a more principled way to handle this case. In effect it scrubs secondary actions with the action you want to recommend.</p> -<pre><code>spark-itemsimilarity Mahout 1.0 +<div class="highlighter-rouge"><pre class="highlight"><code>spark-itemsimilarity Mahout 1.0 Usage: spark-itemsimilarity [options] Disconnected from the target VM, address: '127.0.0.1:64676', transport: 'socket' @@ -401,6 +258,7 @@ Spark config options: -h | --help prints this usage text </code></pre> +</div> <p>This looks daunting but defaults to simple fairly sane values to take exactly the same input as legacy code and is pretty flexible. It allows the user to point to a single text file, a directory full of files, or a tree of directories to be traversed recursively. The files included can be specified with either a regex-style pattern or filename. The schema for the file is defined by column numbers, which map to the important bits of data including IDs and values. The files can even contain filters, which allow unneeded rows to be discarded or used for cross-cooccurrence calculations.</p> @@ -410,20 +268,23 @@ Spark config options: <p>If all defaults are used the input can be as simple as:</p> -<pre><code>userID1,itemID1 +<div class="highlighter-rouge"><pre class="highlight"><code>userID1,itemID1 userID2,itemID2 ... </code></pre> +</div> <p>With the command line:</p> -<pre><code>bash$ mahout spark-itemsimilarity --input in-file --output out-dir +<div class="highlighter-rouge"><pre class="highlight"><code>bash$ mahout spark-itemsimilarity --input in-file --output out-dir </code></pre> +</div> <p>This will use the âlocalâ Spark context and will output the standard text version of a DRM</p> -<pre><code>itemID1<tab>itemID2:value2<space>itemID10:value10... +<div class="highlighter-rouge"><pre class="highlight"><code>itemID1<tab>itemID2:value2<space>itemID10:value10... </code></pre> +</div> <h3 id="how-to-use-multiple-user-actions"><a name="multiple-actions">How To Use Multiple User Actions</a></h3> @@ -439,7 +300,7 @@ to calculate the cross-cooccurrence indicator matrix.</p> <p><em>spark-itemsimilarity</em> can read separate actions from separate files or from a mixed action log by filtering certain lines. For a mixed action log of the form:</p> -<pre><code>u1,purchase,iphone +<div class="highlighter-rouge"><pre class="highlight"><code>u1,purchase,iphone u1,purchase,ipad u2,purchase,nexus u2,purchase,galaxy @@ -460,12 +321,13 @@ u4,view,iphone u4,view,ipad u4,view,galaxy </code></pre> +</div> <h3 id="command-line">Command Line</h3> <p>Use the following options:</p> -<pre><code>bash$ mahout spark-itemsimilarity \ +<div class="highlighter-rouge"><pre class="highlight"><code>bash$ mahout spark-itemsimilarity \ --input in-file \ # where to look for data --output out-path \ # root dir for output --master masterUrl \ # URL of the Spark master server @@ -475,34 +337,38 @@ u4,view,galaxy --rowIDPosition 0 \ # column that has the user ID --filterPosition 1 # column that has the filter word </code></pre> +</div> <h3 id="output">Output</h3> <p>The output of the job will be the standard text version of two Mahout DRMs. This is a case where we are calculating cross-cooccurrence so a primary indicator matrix and cross-cooccurrence indicator matrix will be created</p> -<pre><code>out-path +<div class="highlighter-rouge"><pre class="highlight"><code>out-path |-- similarity-matrix - TDF part files \-- cross-similarity-matrix - TDF part-files </code></pre> +</div> <p>The indicator matrix will contain the lines:</p> -<pre><code>galaxy\tnexus:1.7260924347106847 +<div class="highlighter-rouge"><pre class="highlight"><code>galaxy\tnexus:1.7260924347106847 ipad\tiphone:1.7260924347106847 nexus\tgalaxy:1.7260924347106847 iphone\tipad:1.7260924347106847 surface </code></pre> +</div> <p>The cross-cooccurrence indicator matrix will contain:</p> -<pre><code>iphone\tnexus:1.7260924347106847 iphone:1.7260924347106847 ipad:1.7260924347106847 galaxy:1.7260924347106847 +<div class="highlighter-rouge"><pre class="highlight"><code>iphone\tnexus:1.7260924347106847 iphone:1.7260924347106847 ipad:1.7260924347106847 galaxy:1.7260924347106847 ipad\tnexus:0.6795961471815897 iphone:0.6795961471815897 ipad:0.6795961471815897 galaxy:0.6795961471815897 nexus\tnexus:0.6795961471815897 iphone:0.6795961471815897 ipad:0.6795961471815897 galaxy:0.6795961471815897 galaxy\tnexus:1.7260924347106847 iphone:1.7260924347106847 ipad:1.7260924347106847 galaxy:1.7260924347106847 surface\tsurface:4.498681156950466 nexus:0.6795961471815897 </code></pre> +</div> <p><strong>Note:</strong> You can run this multiple times to use more than two actions or you can use the underlying SimilarityAnalysis.cooccurrence API, which will more efficiently calculate any number of cross-cooccurrence indicators.</p> @@ -511,7 +377,7 @@ SimilarityAnalysis.cooccurrence API, which will more efficiently calculate any n <p>A common method of storing data is in log files. If they are written using some delimiter they can be consumed directly by spark-itemsimilarity. For instance input of the form:</p> -<pre><code>2014-06-23 14:46:53.115\tu1\tpurchase\trandom text\tiphone +<div class="highlighter-rouge"><pre class="highlight"><code>2014-06-23 14:46:53.115\tu1\tpurchase\trandom text\tiphone 2014-06-23 14:46:53.115\tu1\tpurchase\trandom text\tipad 2014-06-23 14:46:53.115\tu2\tpurchase\trandom text\tnexus 2014-06-23 14:46:53.115\tu2\tpurchase\trandom text\tgalaxy @@ -532,10 +398,11 @@ SimilarityAnalysis.cooccurrence API, which will more efficiently calculate any n 2014-06-23 14:46:53.115\tu4\tview\trandom text\tipad 2014-06-23 14:46:53.115\tu4\tview\trandom text\tgalaxy </code></pre> +</div> <p>Can be parsed with the following CLI and run on the cluster producing the same output as the above example.</p> -<pre><code>bash$ mahout spark-itemsimilarity \ +<div class="highlighter-rouge"><pre class="highlight"><code>bash$ mahout spark-itemsimilarity \ --input in-file \ --output out-path \ --master spark://sparkmaster:4044 \ @@ -546,6 +413,7 @@ SimilarityAnalysis.cooccurrence API, which will more efficiently calculate any n --rowIDPosition 1 \ --filterPosition 2 </code></pre> +</div> <h2 id="2-spark-rowsimilarity">2. spark-rowsimilarity</h2> @@ -560,7 +428,7 @@ by a list of the most similar rows.</p> <p>The command line interface is:</p> -<pre><code>spark-rowsimilarity Mahout 1.0 +<div class="highlighter-rouge"><pre class="highlight"><code>spark-rowsimilarity Mahout 1.0 Usage: spark-rowsimilarity [options] Input, output options @@ -607,6 +475,7 @@ Spark config options: -h | --help prints this usage text </code></pre> +</div> <p>See RowSimilarityDriver.scala in Mahoutâs spark module if you want to customize the code.</p> @@ -686,32 +555,36 @@ content or metadata, not by which users interacted with them.</p> <p>For this we need input of the form:</p> -<pre><code>itemID<tab>list-of-tags +<div class="highlighter-rouge"><pre class="highlight"><code>itemID<tab>list-of-tags ... </code></pre> +</div> <p>The full collection will look like the tags column from a catalog DB. For our ecom example it might be:</p> -<pre><code>3459860b<tab>men long-sleeve chambray clothing casual +<div class="highlighter-rouge"><pre class="highlight"><code>3459860b<tab>men long-sleeve chambray clothing casual 9446577d<tab>women tops chambray clothing casual ... </code></pre> +</div> <p>Weâll use <em>spark-rowimilairity</em> because we are looking for similar rows, which encode items in this case. As with the collaborative filtering indicator and cross-cooccurrence indicator we use the âomitStrength option. The strengths created are probabilistic log-likelihood ratios and so are used to filter unimportant similarities. Once the filtering or downsampling is finished we no longer need the strengths. We will get an indicator matrix of the form:</p> -<pre><code>itemID<tab>list-of-item IDs +<div class="highlighter-rouge"><pre class="highlight"><code>itemID<tab>list-of-item IDs ... </code></pre> +</div> <p>This is a content indicator since it has found other items with similar content or metadata.</p> -<pre><code>3459860b<tab>3459860b 3459860b 6749860c 5959860a 3434860a 3477860a +<div class="highlighter-rouge"><pre class="highlight"><code>3459860b<tab>3459860b 3459860b 6749860c 5959860a 3434860a 3477860a 9446577d<tab>9446577d 9496577d 0943577d 8346577d 9442277d 9446577e ... </code></pre> +</div> <p>We now have three indicators, two collaborative filtering type and one content type.</p> @@ -723,11 +596,12 @@ For a given user, map their history of an action or content to the correct indic <p>We have 3 indicators, these are indexed by the search engine into 3 fields, weâll call them âpurchaseâ, âviewâ, and âtagsâ. We take the userâs history that corresponds to each indicator and create a query of the form:</p> -<pre><code>Query: +<div class="highlighter-rouge"><pre class="highlight"><code>Query: field: purchase; q:user's-purchase-history field: view; q:user's view-history field: tags; q:user's-tags-associated-with-purchases </code></pre> +</div> <p>The query will result in an ordered list of items recommended for purchase but skewed towards items with similar tags to the ones the user has already purchased.</p> @@ -739,11 +613,12 @@ by tagging items with some category of popularity (hot, warm, cold for instance) index that as a new indicator field and include the corresponding value in a query on the popularity field. If we use the ecom example but use the query to get âhotâ recommendations it might look like this:</p> -<pre><code>Query: +<div class="highlighter-rouge"><pre class="highlight"><code>Query: field: purchase; q:user's-purchase-history field: view; q:user's view-history field: popularity; q:"hot" </code></pre> +</div> <p>This will return recommendations favoring ones that have the intrinsic indicator âhotâ.</p> @@ -756,32 +631,25 @@ on the popularity field. If we use the ecom example but use the query to get â </ol> </div> -</div> - </div> - - -</div> -<div id="footer"> - <div class="container"> - <p>© 2017 Apache Mahout - with help from <a href="http://jekyllbootstrap.com" target="_blank" title="The Definitive Jekyll Blogging Framework">Jekyll Bootstrap</a> - and <a href="http://getbootstrap.com" target="_blank">Bootstrap</a> - </p> </div> -</div> - - +</div> + <footer class="footer bg-light"> + <div class="container text-center small"> + Copyright © 2014-2017 The Apache Software Foundation, Licensed under the Apache License, Version 2.0. + </div> +</footer> + <script src="/assets/vendor/jquery/jquery-slim.min.js"></script> + <script src="/assets/vendor/popper/popper.min.js"></script> + <script src="/assets/vendor/bootstrap/js/bootstrap.min.js"></script> + <script src="/assets/header.js"></script> + <script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.0/MathJax.js?config=TeX-AMS-MML_HTMLorMML" type="text/javascript"></script> -<!-- Latest compiled and minified JavaScript, requires jQuery 1.x (2.x not supported in IE8) --> -<!-- Placed at the end of the document so the pages load faster --> -<script src="https://ajax.googleapis.com/ajax/libs/jquery/1.10.2/jquery.min.js"></script> -<script src="/assets/themes/mahout3/js/bootstrap.min.js"></script> </body> -</html> +</html>
http://git-wip-us.apache.org/repos/asf/mahout/blob/5112e9ec/docs/latest/algorithms/reccomenders/d-als.html ---------------------------------------------------------------------- diff --git a/docs/latest/algorithms/reccomenders/d-als.html b/docs/latest/algorithms/reccomenders/d-als.html index 5dbdb6e..0db767d 100644 --- a/docs/latest/algorithms/reccomenders/d-als.html +++ b/docs/latest/algorithms/reccomenders/d-als.html @@ -1,312 +1,169 @@ - - <!DOCTYPE html> -<html lang="en"> +<html lang=" en "> + <head> <meta charset="utf-8"> <meta http-equiv="X-UA-Compatible" content="IE=edge"> + <meta name="viewport" content="width=device-width, initial-scale=1"> - <title>Mahout Samsara Distributed ALS</title> - - <meta name="author" content="Apache Mahout"> - - <!-- Enable responsive viewport --> - <meta name="viewport" content="width=device-width, initial-scale=1.0"> - - <!-- Bootstrap styles --> - <link href="/assets/themes/mahout3/css/bootstrap.min.css" rel="stylesheet"> - <!-- Optional theme --> - <link href="/assets/themes/mahout3/css/bootstrap-theme.min.css" rel="stylesheet"> - <!-- Sticky Footer --> - <link href="/assets/themes/mahout3/css/bs-sticky-footer.css" rel="stylesheet"> - - <!-- Custom styles --> - <link href="/assets/themes/mahout3/css/style.css" rel="stylesheet" type="text/css" media="all"> - - <!-- HTML5 Shim and Respond.js IE8 support of HTML5 elements and media queries --> - <!-- WARNING: Respond.js doesn't work if you view the page via file:// --> - <!--[if lt IE 9]> - <script src="https://oss.maxcdn.com/libs/html5shiv/3.7.0/html5shiv.js"></script> - <script src="https://oss.maxcdn.com/libs/respond.js/1.3.0/respond.min.js"></script> - <![endif]--> - - <!-- Fav and touch icons --> - <!-- Update these with your own images - <link rel="shortcut icon" href="images/favicon.ico"> - <link rel="apple-touch-icon" href="images/apple-touch-icon.png"> - <link rel="apple-touch-icon" sizes="72x72" href="images/apple-touch-icon-72x72.png"> - <link rel="apple-touch-icon" sizes="114x114" href="images/apple-touch-icon-114x114.png"> - --> - - <!-- atom & rss feed --> - <link href="/atom.xml" type="application/atom+xml" rel="alternate" title="Sitewide ATOM Feed"> - <link href="/rss.xml" type="application/rss+xml" rel="alternate" title="Sitewide RSS Feed"> - <script type="text/x-mathjax-config"> - MathJax.Hub.Config({ - tex2jax: { - skipTags: ['script', 'noscript', 'style', 'textarea', 'pre'] - } - }); - MathJax.Hub.Queue(function() { - var all = MathJax.Hub.getAllJax(), i; - for(i = 0; i < all.length; i += 1) { - all[i].SourceElement().parentNode.className += ' has-jax'; - } - }); - </script> - <script type="text/javascript"> - var mathjax = document.createElement('script'); - mathjax.type = 'text/javascript'; - mathjax.async = true; - - mathjax.src = ('https:' == document.location.protocol) ? - 'https://c328740.ssl.cf1.rackcdn.com/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML' : - 'http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML'; - - var s = document.getElementsByTagName('script')[0]; - s.parentNode.insertBefore(mathjax, s); - </script> -</head> - -<nav class="navbar navbar-default navbar-fixed-top"> - <div class="container-fluid"> - <!-- Brand and toggle get grouped for better mobile display --> - <div class="navbar-header"> - <button type="button" class="navbar-toggle collapsed" data-toggle="collapse" data-target="#bs-example-navbar-collapse-1" aria-expanded="false"> - <span class="sr-only">Toggle navigation</span> - <span class="icon-bar"></span> - <span class="icon-bar"></span> - <span class="icon-bar"></span> - </button> - <a class="navbar-brand" href="/"> - <img src="/assets/img/Mahout-logo-82x100.png" height="30" alt="I'm mahout"> - </a> - </div> - + <title> + Mahout Samsara Distributed ALS + </title> + <meta name="description" content="Distributed Linear Algebra"> -<!-- Collect the nav links, forms, and other content for toggling --> -<div class="collapse navbar-collapse" id="main-navbar"> - <ul class="nav navbar-nav"> - - <!-- Quick Start --> - <li id="quickstart"> - <a href="/index.html" >Mahout Overview</a> - </li> - - <li id="dropdown"> - <a href="#" class="dropdown-toggle" data-toggle="dropdown" role="button" aria-haspopup="true" aria-expanded="false">Key Concepts<span class="caret"></span></a> - <ul class="dropdown-menu"> - <li><a href="/index.html">Mahout Overview</a></li> - <li><span><b> Scala DSL</b><span></li> - <li><a href="/mahout-samsara/in-core-reference.html">In-core Reference</a></li> - <li><a href="/mahout-samsara/out-of-core-reference.html">Out-of-core Reference</a></li> - <li><a href="/mahout-samsara/faq.html">Samsara FAQ</a></li> - <li role="separator" class="divider"></li> - <li><span> <b>Bindings</b><span></li> - <li><a href="/distributed/spark-bindings/">Spark Bindings</a></li> - <li><a href="/distributed/flink-bindings.html">Flink Bindings</a></li> - <li><a href="/distributed/flink-bindings.html">H20 Bindings</a></li> - <!--<li role="separator" class="divider"></li> - <li><span> <b>Native Solvers</b><span></li> - <li><a href="/native-solvers/viennacl.html">ViennaCL</a></li> - <li><a href="/native-solvers/viennacl-omp.html">ViennaCL-OMP</a></li> - <li><a href="/native-solvers/cuda.html">CUDA</a></li>--> - </ul> - </li> - - <li id="dropdown"> - <a href="#" class="dropdown-toggle" data-toggle="dropdown" role="button" aria-haspopup="true" aria-expanded="false">Tutorials<span class="caret"></span></a> - <ul class="dropdown-menu"> - <li><span> <b>Reccomenders</b><span></li> - <li><a href="/tutorials/cco-lastfm">CCO Example with Last.FM Data</a></li> - <li><a href="/tutorials/intro-cooccurrence-spark">Introduction to Cooccurrence in Spark</a></li> - <li role="separator" class="divider"></li> - <li><span> <b>Mahout Samsara</b><span></li> - <li><a href="/tutorials/samsara/play-with-shell.html">Playing with Samsara in Spark Shell</a></li> - <li><a href="/tutorials/samsara/playing-with-samsara-flink-batch.html">Playing with Samsara in Flink Batch</a></li> - <li><a href="/tutorials/samsara/classify-a-doc-from-the-shell.html">Text Classification (Shell)</a></li> - <li><a href="/tutorials/samsara/spark-naive-bayes.html">Spark Naive Bayes</a></li> - <li role="separator" class="divider"></li> - <li><span> <b>Misc</b><span></li> - <li><a href="/tutorials/misc/mahout-in-zeppelin">Mahout in Apache Zeppelin</a></li> - <li><a href="/tutorials/misc/contributing-algos">How To Contribute a New Algorithm</a></li> - <li><a href="/tutorials/misc/how-to-build-an-app.html">How To Build An App</a></li> - <li role="separator" class="divider"></li> - <li><span> <b>Deprecated</b><span></li> - <li><a href="/tutorials/map-reduce">MapReduce</a></li> - </ul> - </li> - - - <!-- Algorithms (Samsara / MR) --> - <li id="dropdown"> - <a href="#" class="dropdown-toggle" data-toggle="dropdown" role="button" aria-haspopup="true" aria-expanded="false">Algorithms<span class="caret"></span></a> - <ul class="dropdown-menu"> - <li><a href="/algorithms/linear-algebra">Distributed Linear Algebra</a></li> - <li><a href="/algorithms/preprocessors">Preprocessors</a></li> - <li><a href="/algorithms/regression">Regression</a></li> - <li><a href="/algorithms/reccomenders">Reccomenders</a></li> - <li role="separator" class="divider"></li> - <li><a href="/algorithms/map-reduce">MapReduce <i>(deprecated)</i></a></li> - </ul> - <!--<li><a href="/algorithms/reccomenders/recommender-overview.html">Reccomender Overview</a></li> Do we still need? seems like short version of next post--> - <!-- - <li><a href="/algorithms/reccomenders/intro-cooccurrence-spark.html">Intro to Coocurrence With Spark</a></li> - <li role="separator" class="divider"></li> - <li><span> <a href="/algorithms/map-reduce"><b>MapReduce</b> (deprecated)</a><span></li> + <link rel="stylesheet" href="/assets/css/main.css"> + <!-- Font Awesome --> + <link href="https://maxcdn.bootstrapcdn.com/font-awesome/4.7.0/css/font-awesome.min.css" rel="stylesheet" integrity="sha384-wvfXpqpZZVQGK6TAh5PVlGOfQNHSoD2xbE+QkPxCAFlNEevoEH3Sl0sibVcOQVnN" crossorigin="anonymous"> - --> - </li> + <!-- Google Fonts --> + <link href="https://fonts.googleapis.com/css?family=Maven+Pro:400,500" rel="stylesheet"> + <link href="https://fonts.googleapis.com/css?family=Muli:400,400i,700,700i" rel="stylesheet"> - <!-- Scala Docs --> - <li id="dropdown"> - <a href="#" class="dropdown-toggle" data-toggle="dropdown" role="button" aria-haspopup="true" aria-expanded="false">API Docs<span class="caret"></span></a> - <ul class="dropdown-menu"> - <li><a href="/0.13.0/api/index.html">0.13.0</a></li> - </ul> - </li> - - - </ul> - <form class="navbar-form navbar-left"> - <div class="form-group"> - <input type="text" class="form-control" placeholder="Search"> - </div> - <button type="submit" class="btn btn-default">Submit</button> - </form> - <ul class="nav navbar-nav navbar-right"> - <li><a href="http://github.com/apache/mahout">Github</a></li> - - <!-- Apache --> - <li class="dropdown"> - <a href="#" class="dropdown-toggle" data-toggle="dropdown" role="button" aria-haspopup="true" aria-expanded="false">Apache <span class="caret"></span></a> - <ul class="dropdown-menu"> - <li><a href="http://www.apache.org/foundation/how-it-works.html">Apache Software Foundation</a></li> - <li><a href="http://www.apache.org/licenses/">Apache License</a></li> - <li><a href="http://www.apache.org/foundation/sponsorship.html">Sponsorship</a></li> - <li><a href="http://www.apache.org/foundation/thanks.html">Thanks</a></li> - </ul> - </li> + <link rel="canonical" href="http://mahout.apache.org//docs/latest/algorithms/reccomenders/d-als.html"> + <link rel="alternate" type="application/rss+xml" title="Apache Mahout" href="/%20/feed.xml"> - </ul> -</div><!-- /.navbar-collapse --> - </div><!-- /.container-fluid --> -</nav> +</head> + <body> -<div id="wrap"> - <body class=""> + <nav class="navbar navbar-expand-lg navbar-light bg-light navbar-mahout"> + + <div class="container"> + + <a class="navbar-brand" href="/"> + <img src="/assets/mahout-logo-blue.svg" alt=""> + </a> + + <button class="navbar-toggler" type="button" data-toggle="collapse" data-target="#navbarSupportedContent" aria-controls="navbarSupportedContent" aria-expanded="false" aria-label="Toggle navigation"> + <span class="navbar-toggler-icon"></span> + </button> + + <div class="collapse navbar-collapse" id="navbarSupportedContent"> + + <div class="navbar-nav ml-auto"> + + <!-- Quick Start --> + <li class="nav-item"> + <a class="nav-link" href="//docs/latest/" >Mahout Overview</a> + </li> + + <li class="nav-item dropdown"> + <a class="nav-link dropdown-toggle" href="" id="navbarDropdownMenuLink" data-toggle="dropdown" aria-haspopup="true" aria-expanded="false">Key Concepts</a> + <div class="dropdown-menu" aria-labelledby="navbarDropdownMenuLink"> + <a class="dropdown-item" href="/docs/latest/index.html">Mahout Overview</a> + <div class="dropdown-divider"></div> + <h6 class="dropdown-header">Scala DSL</h6> + <a class="dropdown-item" href="/docs/latest/mahout-samsara/in-core-reference.html">In-core Reference</a> + <a class="dropdown-item" href="/docs/latest/mahout-samsara/out-of-core-reference.html">Out-of-core Reference</a> + <a class="dropdown-item" href="/docs/latest/mahout-samsara/faq.html">Samsara FAQ</a> + <div class="dropdown-divider"></div> + <h6 class="dropdown-header">Distributed Engine Bindings</h6> + <a class="dropdown-item" href="/docs/latest/distributed/spark-bindings/">Spark Bindings</a> + <a class="dropdown-item" href="/docs/latest/distributed/flink-bindings.html">Flink Bindings</a> + <a class="dropdown-item" href="/docs/latest/distributed/flink-bindings.html">H20 Bindings</a> + <!--<div class="dropdown-divider"></div> + <h6 class="dropdown-header">Native Solvers</h6> + <a class="dropdown-item" href="/docs/latest/native-solvers/viennacl.html">ViennaCL</a></li> + <a class="dropdown-item" href="/docs/latest/native-solvers/viennacl-omp.html">ViennaCL-OMP</a></li> + <a class="dropdown-item" href="/docs/latest/native-solvers/cuda.html">CUDA</a></li>--> + </div> + </li> + + <li class="nav-item dropdown"> + <a class="nav-link dropdown-toggle" href="" id="navbarDropdownMenuLink" data-toggle="dropdown" aria-haspopup="true" aria-expanded="false">Tutorial</a> + <div class="dropdown-menu" aria-labelledby="navbarDropdownMenuLink"> + <div class="dropdown-divider"></div> + <h6 class="dropdown-header">Reccomenders</h6> + <a class="dropdown-item" href="/docs/latest/tutorials/cco-lastfm">CCO Example with Last.FM Data</a> + <a class="dropdown-item" href="/docs/latest/tutorials/intro-cooccurrence-spark">Introduction to Cooccurrence in Spark</a> + <div class="dropdown-divider"></div> + <h6 class="dropdown-header">Mahout Samsara</h6> + <a class="dropdown-item" href="/docs/latest/tutorials/samsara/play-with-shell.html">Playing with Samsara in Spark Shell</a> + <a class="dropdown-item" href="/docs/latest/tutorials/samsara/playing-with-samsara-flink-batch.html">Playing with Samsara in Flink Batch</a> + <a class="dropdown-item" href="/docs/latest/tutorials/samsara/classify-a-doc-from-the-shell.html">Text Classification (Shell)</a> + <a class="dropdown-item" href="/docs/latest/tutorials/samsara/spark-naive-bayes.html">Spark Naive Bayes</a> + <div class="dropdown-divider"></div> + <h6 class="dropdown-header">Misc</h6> + <a class="dropdown-item" href="/docs/latest/tutorials/misc/mahout-in-zeppelin">Mahout in Apache Zeppelin</a> + <a class="dropdown-item" href="/docs/latest/tutorials/misc/contributing-algos">How To Contribute a New Algorithm</a> + <a class="dropdown-item" href="/docs/latest/tutorials/misc/how-to-build-an-app.html">How To Build An App</a> + <div class="dropdown-divider"></div> + <h6 class="dropdown-header">Deprecated</h6> + <a class="dropdown-item" href="/docs/latest/tutorials/map-reduce">MapReduce</a> + </div> + </li> + + + <!-- Algorithms (Samsara / MR) --> + <li class="nav-item dropdown"> + <a class="nav-link dropdown-toggle" href="" id="navbarDropdownMenuLink" data-toggle="dropdown" aria-haspopup="true" aria-expanded="false">Algorithms</a> + <div class="dropdown-menu" aria-labelledby="navbarDropdownMenuLink"> + <a class="dropdown-item" href="/docs/latest/algorithms/linear-algebra">Distributed Linear Algebra</a> + <a class="dropdown-item" href="/docs/latest/algorithms/preprocessors">Preprocessors</a> + <a class="dropdown-item" href="/docs/latest/algorithms/regression">Regression</a> + <a class="dropdown-item" href="/docs/latest/algorithms/reccomenders">Reccomenders</a> + <div class="dropdown-divider"></div> + <h6 class="dropdown-header">Deprecated</h6> + <a class="dropdown-item" href="/docs/latest/algorithms/map-reduce">MapReduce <i>(deprecated)</i></a> + </div> + <!--<a class="dropdown-item" href="/docs/latest/algorithms/reccomenders/recommender-overview.html">Reccomender Overview</a></li> Do we still need? seems like short version of next post--> + <!-- + <a class="dropdown-item" href="/docs/latest/algorithms/reccomenders/intro-cooccurrence-spark.html">Intro to Coocurrence With Spark</a></li> + <li role="separator" class="divider"></li> + <li><span> <a href="/docs/latest/algorithms/map-reduce"><b>MapReduce</b> (deprecated)</a><span></li> + + + --> + </li> + + <!-- Scala /docs --> + <li class="nav-item dropdown"> + <a class="nav-link dropdown-toggle" href="" id="navbarDropdownMenuLink" data-toggle="dropdown" aria-haspopup="true" aria-expanded="false">API /docs</a> + <div class="dropdown-menu" aria-labelledby="navbarDropdownMenuLink"> + <a class="dropdown-item" href="/docs/latest/0.13.0/api/index.html">0.13.0</a> + </div> + </li> + + <!-- Apache --> + <li class="nav-item dropdown"> + <a class="nav-link dropdown-toggle" href="" id="navbarDropdownMenuLink" data-toggle="dropdown" aria-haspopup="true" aria-expanded="false">Apache</a> + <div class="dropdown-menu" aria-labelledby="navbarDropdownMenuLink"> + <a class="dropdown-item" href="http://www.apache.org/foundation/how-it-works.html">Apache Software Foundation</a> + <a class="dropdown-item" href="http://www.apache.org/licenses/">Apache License</a> + <a class="dropdown-item" href="http://www.apache.org/foundation/sponsorship.html">Sponsorship</a> + <a class="dropdown-item" href="http://www.apache.org/foundation/thanks.html">Thanks</a> + </div> + </li> - <div class="container"> - + </ul> + + <!--<form class="navbar-form navbar-left">--> + <!--<div class="form-group">--> + <!--<input type="text" class="form-control" placeholder="Search">--> + <!--</div>--> + <!--<button type="submit" class="btn btn-default">Submit</button>--> + <!--</form>--> + <!--<ul class="nav navbar-nav navbar-right">--> + <!--<a class="dropdown-item" href="http://github.com/apache/mahout">Github</a></li>--> -<div class="row"> - <div class="col-md-3"> - <div id="AlgoMenu"> - <span><b>Mahout-Samsara Algorithms</b></span> - <div class="list-group panel"> - <a href="#linalg" class="list-group-item list-group-item-success" data-toggle="collapse" data-parent="#AlgoMenu"><b>Linear Algebra</b><i class="fa fa-caret-down"></i></a> - <div class="collapse" id="linalg"> - <ul class="nav sidebar-nav"> - <li> <a href="/algorithms/linear-algebra/d-qr.html">Distributed QR Decomposition</a></li> - <li> <a href="/algorithms/linear-algebra/d-spca.html">Distributed Stochastic Principal Component Analysis</a></li> - <li> <a href="/algorithms/linear-algebra/d-ssvd.html">Distributed Stochastic Singular Value Decomposition</a></li> - </ul> - </div> - <a href="#clustering" class="list-group-item list-group-item-success" data-toggle="collapse" data-parent="#AlgoMenu"><b>Clustering</b><i class="fa fa-caret-down"></i></a> - <div class="collapse" id="clustering"> - <ul class="nav sidebar-nav"> - <li> <a href="/algorithms/clustering">Clustering Algorithms</a></li> - <li> <a href="/algorithms/clustering/distance-metrics.html">Distance Metrics</a></li> - <li> <a href="/algorithms/clustering/canopy">Canopy Clustering</a></li> - </ul> - </div> - <a href="#preprocessors" class="list-group-item list-group-item-success" data-toggle="collapse" data-parent="#AlgoMenu"><b>Preprocessors</b><i class="fa fa-caret-down"></i></a> - <div class="collapse" id="preprocessors"> - <ul class="nav sidebar-nav"> - <li> <a href="/algorithms/preprocessors/AsFactor.html">AsFactor (a.k.a. One-Hot-Encoding)</a></li> - <li> <a href="/algorithms/preprocessors/StandardScaler.html">StandardScaler</a></li> - <li> <a href="/algorithms/preprocessors/MeanCenter.html">MeanCenter</a></li> - </ul> - </div> - <a href="#regression" class="list-group-item list-group-item-success" data-toggle="collapse" data-parent="#AlgoMenu"><b>Regression</b><i class="fa fa-caret-down"></i></a> - <div class="collapse" id="regression"> - <ul class="nav sidebar-nav"> - <a href="#serial-correlation" class="list-group-item list-group-item-success" data-toggle="collapse" data-parent="#regression"><b>• Serial Correlation</b><i class="fa fa-caret-down"></i></a> - <div class="collapse" id="serial-correlation"> - <ul class="nav sidebar-nav"> - <li> <a href="/algorithms/regression/serial-correlation/cochrane-orcutt.html">Cochrane-Orcutt Procedure</a></li> - <li> <a href="/algorithms/regression/serial-correlation/dw-test.html">Durbin Watson Test</a></li> - </ul> - </div> - <li> <a href="/algorithms/regression/ols.html">Ordinary Least Squares (Closed Form)</a></li> - <li> <a href="/algorithms/regression/fittness-tests.html">Fitness Tests</a></li> - </ul> - </div> - <a href="#reccomenders" class="list-group-item list-group-item-success" data-toggle="collapse" data-parent="#AlgoMenu"><b>Reccomenders</b><i class="fa fa-caret-down"></i></a> - <div class="collapse" id="reccomenders"> - <ul class="nav sidebar-nav"> - <li> <a href="/algorithms/reccomenders">Reccomender Overview</a></li> - <li> <a href="/algorithms/reccomenders/cco.html">CCO</a></li> - <li> <a href="/algorithms/reccomenders/d-als.html">Distributed Alternating Least Squares</a></li> - </ul> - </div> - </div> - <span><b>Map Reduce Algorithms</b> (deprecated)</span> - <div class="list-group panel"> - <a href="#classification" class="list-group-item list-group-item-success" data-toggle="collapse" data-parent="#AlgoMenu"><b>Classification</b><i class="fa fa-caret-down"></i></a> - <div class="collapse" id="classification"> - <ul class="nav sidebar-nav"> - <li> <a href="/algorithms/map-reduce/classification/bayesian.html">Bayesian</a></li> - <li> <a href="/algorithms/map-reduce/classification/class-discovery.html">Class Discovery</a></li> - <li> <a href="/algorithms/map-reduce/classification/classifyingyourdata.html">Classifying Your Data</a></li> - <li> <a href="/algorithms/map-reduce/classification/collocations.html">Collocation</a></li> - <li> <a href="/algorithms/map-reduce/classification/gaussian-discriminative-analysis.html">Gaussian Discriminative Analysis</a></li> - <li> <a href="/algorithms/map-reduce/classification/hidden-markov-models.html">Hidden Markov Models</a></li> - <li> <a href="/algorithms/map-reduce/classification/independent-component-analysis.html">Independent Component Analysis</a></li> - <li> <a href="/algorithms/map-reduce/classification/locally-weighted-linear-regression.html">Locally Weighted Linear Regression</a></li> - <li> <a href="/algorithms/map-reduce/classification/logistic-regression.html">Logistic Regression</a></li> - <li> <a href="/algorithms/map-reduce/classification/mahout-collections.html">Mahout Collections</a></li> - <li> <a href="/algorithms/map-reduce/classification/mlp.html">Multilayer Perceptron</a></li> - <li> <a href="/algorithms/map-reduce/classification/naivebayes.html">Naive Bayes</a></li> - <li> <a href="/algorithms/map-reduce/classification/neural-network.html">Neural Networks</a></li> - <li> <a href="/algorithms/map-reduce/classification/partial-implementation.html">Partial Implementation</a></li> - <li> <a href="/algorithms/map-reduce/classification/random-forrests.html">Random Forrests</a></li> - <li> <a href="/algorithms/map-reduce/classification/restricted-boltzman-machines.html">Restricted Boltzman Machines</a></li> - <li> <a href="/algorithms/map-reduce/classification/support-vector-machines.html">Support Vector Machines</a></li> - </ul> - </div> - <a href="#mr-clustering" class="list-group-item list-group-item-success" data-toggle="collapse" data-parent="#AlgoMenu"><b>Clustering</b><i class="fa fa-caret-down"></i></a> - <div class="collapse" id="mr-clustering"> - <ul class="nav sidebar-nav"> - <li> <a href="/algorithms/map-reduce/clustering/canopy-clustering.html">Canopy Clustering</a></li> - <li> <a href="/algorithms/map-reduce/clustering/cluster-dumper.html">Cluster Dumper</a></li> - <li> <a href="/algorithms/map-reduce/clustering/expectation-maximization.html">Expectation Maximization</a></li> - <li> <a href="/algorithms/map-reduce/clustering/fuzzy-k-means.html">Fuzzy K-Means</a></li> - <li> <a href="/algorithms/map-reduce/clustering/hierarchical-clustering.html">Hierarchical Clustering</a></li> - <li> <a href="/algorithms/map-reduce/clustering/k-means-clustering.html">K-Means Clustering</a></li> - <li> <a href="/algorithms/map-reduce/clustering/latent-dirichlet-allocation.html">Latent Dirichlet Allocation</a></li> - <li> <a href="/algorithms/map-reduce/clustering/llr---log-likelihood-ratio.html">Log Likelihood Ratio</a></li> - <li> <a href="/algorithms/map-reduce/clustering/spectral-clustering.html">Spectral Clustering</a></li> - <li> <a href="/algorithms/map-reduce/clustering/streaming-k-means.html">Streaming K-Means</a></li> - </ul> - </div> - </div> -</div> + <!--</ul>--> + </div><!-- /.navbar-collapse --> </div> +</nav> + + <div class="container mt-5 pb-4"> + + <div class="row"> - <div class="col-md-8"> - <div class="page-header"> - <h1>Mahout Samsara Distributed ALS </h1> - </div> - <p>Seems like someone has jacked up this page? + <div class="col-lg-8"> + <p>Seems like someone has jacked up this page? TODO: Find the ALS Page</p> <h2 id="intro">Intro</h2> @@ -315,13 +172,13 @@ TODO: Find the ALS Page</p> <h2 id="algorithm">Algorithm</h2> -<p>For the classic QR decomposition of the form <code>\(\mathbf{A}=\mathbf{QR},\mathbf{A}\in\mathbb{R}^{m\times n}\)</code> a distributed version is fairly easily achieved if <code>\(\mathbf{A}\)</code> is tall and thin such that <code>\(\mathbf{A}^{\top}\mathbf{A}\)</code> fits in memory, i.e. <em>m</em> is large but <em>n</em> < ~5000 Under such circumstances, only <code>\(\mathbf{A}\)</code> and <code>\(\mathbf{Q}\)</code> are distributed matricies and <code>\(\mathbf{A^{\top}A}\)</code> and <code>\(\mathbf{R}\)</code> are in-core products. We just compute the in-core version of the Cholesky decomposition in the form of <code>\(\mathbf{LL}^{\top}= \mathbf{A}^{\top}\mathbf{A}\)</code>. After that we take <code>\(\mathbf{R}= \mathbf{L}^{\top}\)</code> and <code>\(\mathbf{Q}=\mathbf{A}\left(\mathbf{L}^{\top}\right)^{-1}\)</code>. The latter is easily achieved by multiplying each verticle block of <code>\(\mathbf{A}\)</code> by <code>\(\left(\mathbf{L}^{\top}\right)^{-1}\)</code >. (There is no actual matrix inversion happening).</p> +<p>For the classic QR decomposition of the form <code class="highlighter-rouge">\(\mathbf{A}=\mathbf{QR},\mathbf{A}\in\mathbb{R}^{m\times n}\)</code> a distributed version is fairly easily achieved if <code class="highlighter-rouge">\(\mathbf{A}\)</code> is tall and thin such that <code class="highlighter-rouge">\(\mathbf{A}^{\top}\mathbf{A}\)</code> fits in memory, i.e. <em>m</em> is large but <em>n</em> < ~5000 Under such circumstances, only <code class="highlighter-rouge">\(\mathbf{A}\)</code> and <code class="highlighter-rouge">\(\mathbf{Q}\)</code> are distributed matricies and <code class="highlighter-rouge">\(\mathbf{A^{\top}A}\)</code> and <code class="highlighter-rouge">\(\mathbf{R}\)</code> are in-core products. We just compute the in-core version of the Cholesky decomposition in the form of <code class="highlighter-rouge">\(\mathbf{LL}^{\top}= \mathbf{A}^{\top}\mathbf{A}\)</code>. After that we take <code class="highlighter-rouge">\(\mathbf{R}= \mathbf{L}^{\top}\)</co de> and <code class="highlighter-rouge">\(\mathbf{Q}=\mathbf{A}\left(\mathbf{L}^{\top}\right)^{-1}\)</code>. The latter is easily achieved by multiplying each verticle block of <code class="highlighter-rouge">\(\mathbf{A}\)</code> by <code class="highlighter-rouge">\(\left(\mathbf{L}^{\top}\right)^{-1}\)</code>. (There is no actual matrix inversion happening).</p> <h2 id="implementation">Implementation</h2> -<p>Mahout <code>dqrThin(...)</code> is implemented in the mahout <code>math-scala</code> algebraic optimizer which translates Mahoutâs R-like linear algebra operators into a physical plan for both Spark and H2O distributed engines.</p> +<p>Mahout <code class="highlighter-rouge">dqrThin(...)</code> is implemented in the mahout <code class="highlighter-rouge">math-scala</code> algebraic optimizer which translates Mahoutâs R-like linear algebra operators into a physical plan for both Spark and H2O distributed engines.</p> -<pre><code>def dqrThin[K: ClassTag](A: DrmLike[K], checkRankDeficiency: Boolean = true): (DrmLike[K], Matrix) = { +<div class="highlighter-rouge"><pre class="highlight"><code>def dqrThin[K: ClassTag](A: DrmLike[K], checkRankDeficiency: Boolean = true): (DrmLike[K], Matrix) = { if (drmA.ncol > 5000) log.warn("A is too fat. A'A must fit in memory and easily broadcasted.") implicit val ctx = drmA.context @@ -338,48 +195,43 @@ TODO: Find the ALS Page</p> Q -> inCoreR } </code></pre> +</div> <h2 id="usage">Usage</h2> -<p>The scala <code>dqrThin(...)</code> method can easily be called in any Spark or H2O application built with the <code>math-scala</code> library and the corresponding <code>Spark</code> or <code>H2O</code> engine module as follows:</p> +<p>The scala <code class="highlighter-rouge">dqrThin(...)</code> method can easily be called in any Spark or H2O application built with the <code class="highlighter-rouge">math-scala</code> library and the corresponding <code class="highlighter-rouge">Spark</code> or <code class="highlighter-rouge">H2O</code> engine module as follows:</p> -<pre><code>import org.apache.mahout.math._ +<div class="highlighter-rouge"><pre class="highlight"><code>import org.apache.mahout.math._ import decompositions._ import drm._ val(drmQ, inCoreR) = dqrThin(drma) </code></pre> +</div> <h2 id="references">References</h2> </div> -</div> - </div> - - -</div> -<div id="footer"> - <div class="container"> - <p>© 2017 Apache Mahout - with help from <a href="http://jekyllbootstrap.com" target="_blank" title="The Definitive Jekyll Blogging Framework">Jekyll Bootstrap</a> - and <a href="http://getbootstrap.com" target="_blank">Bootstrap</a> - </p> </div> -</div> - - +</div> + <footer class="footer bg-light"> + <div class="container text-center small"> + Copyright © 2014-2017 The Apache Software Foundation, Licensed under the Apache License, Version 2.0. + </div> +</footer> + <script src="/assets/vendor/jquery/jquery-slim.min.js"></script> + <script src="/assets/vendor/popper/popper.min.js"></script> + <script src="/assets/vendor/bootstrap/js/bootstrap.min.js"></script> + <script src="/assets/header.js"></script> + <script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.0/MathJax.js?config=TeX-AMS-MML_HTMLorMML" type="text/javascript"></script> -<!-- Latest compiled and minified JavaScript, requires jQuery 1.x (2.x not supported in IE8) --> -<!-- Placed at the end of the document so the pages load faster --> -<script src="https://ajax.googleapis.com/ajax/libs/jquery/1.10.2/jquery.min.js"></script> -<script src="/assets/themes/mahout3/js/bootstrap.min.js"></script> </body> -</html> +</html>
