logistic-regression.html

buildbot Sun, 29 Mar 2015 11:54:26 -0700

Author: buildbot
Date: Sun Mar 29 18:54:05 2015
New Revision: 945539

Log:
Staging update by buildbot for mahout


Modified:
    websites/staging/mahout/trunk/content/   (props changed)
    websites/staging/mahout/trunk/content/users/classification/logistic-regression.html

Propchange: websites/staging/mahout/trunk/content/
------------------------------------------------------------------------------
--- cms:source-revision (original)
+++ cms:source-revision Sun Mar 29 18:54:05 2015
@@ -1 +1 @@
-1669854
+1669950

Modified: websites/staging/mahout/trunk/content/users/classification/logistic-regression.html
==============================================================================
--- websites/staging/mahout/trunk/content/users/classification/logistic-regression.html (original)
+++ websites/staging/mahout/trunk/content/users/classification/logistic-regression.html Sun Mar 29 18:54:05 2015
@@ -261,8 +261,10 @@ production fraud detection and advertisi
 The Mahout implementation uses Stochastic Gradient Descent (SGD) to allow
 large training sets to be used.</p>
 <p>For a more detailed analysis of the approach, have a look at the <a href="http://www.autonlab.org/autonweb/14709/version/4/part/5/data/komarek:lr_thesis.pdf?branch=main&amp;language=en">thesis of
-Paul Komarek</a>.</p>
+Paul Komarek</a> [1].</p>
 <p>See MAHOUT-228 for the main JIRA issue for SGD.</p>
+<p>A more detailed overview of the Mahout Logistic Regression classifier, with a <a href="http://blog.trifork.com/2014/02/04/an-introduction-to-mahouts-logistic-regression-sgd-classifier/">detailed description of building a Logistic Regression classifier</a> for the classic <a href="http://en.wikipedia.org/wiki/Iris_flower_data_set">Iris flower dataset</a>, is also available [2].</p>
+<p>An example of training a Logistic Regression classifier for the <a href="http://mlr.cs.umass.edu/ml/datasets/Bank+Marketing">UCI Bank Marketing Dataset</a> can be found <a href="http://mahout.apache.org/users/classification/bankmarketing-example.html">on the Mahout website</a> [3].</p>
 <p><a name="LogisticRegression-Parallelizationstrategy"></a></p>
 <h2 id="parallelization-strategy">Parallelization strategy</h2>
 <p>The bad news is that SGD is an inherently sequential algorithm.  The good
@@ -298,7 +300,7 @@ include</p>
 </li>
 </ul>
 <p><a name="LogisticRegression-Featurevectorencoding"></a></p>
-<h3 id="feature-vector-encoding">Feature vector encoding</h3>
+<h2 id="feature-vector-encoding">Feature vector encoding</h2>
 <p>Because the SGD algorithms need to have fixed length feature vectors and
 because it is a pain to build a dictionary ahead of time, most SGD
 applications use the hashed feature vector encoding system that is rooted
@@ -317,7 +319,7 @@ case you are getting your training data
 <p>Here is a class diagram for the encoders package:</p>
 <p><img alt="class diagram" src="../../images/vector-class-hierarchy.png" /></p>
 <p><a name="LogisticRegression-SGDLearning"></a></p>
-<h3 id="sgd-learning">SGD Learning</h3>
+<h2 id="sgd-learning">SGD Learning</h2>
 <p>For the simplest applications, you can construct an
 OnlineLogisticRegression and be off and running.  Typically, though, it is
 nice to have running estimates of performance on held out data.  To do
@@ -338,6 +340,12 @@ so that you don't have to.</p>
 the number of twiddlable knobs is pretty large.  For some examples, see the
 TrainNewsGroups example code.</p>
 <p><img alt="sgd class diagram" src="../../images/sgd-class-hierarchy.png" /></p>
+<h2 id="references">References</h2>
+<p>[1] <a href="http://www.autonlab.org/autonweb/14709/version/4/part/5/data/komarek:lr_thesis.pdf?branch=main&amp;language=en">Thesis of
+Paul Komarek</a></p>
+<p>[2] <a href="http://blog.trifork.com/2014/02/04/an-introduction-to-mahouts-logistic-regression-sgd-classifier/">An Introduction To Mahout's Logistic Regression SGD Classifier</a></p>
+<h2 id="examples">Examples</h2>
+<p>[3] <a href="http://mahout.apache.org/users/classification/bankmarketing-example.html">SGD Bank Marketing Example</a></p>
    </div>
   </div>     
 </div>

svn commit: r945539 - in /websites/staging/mahout/trunk/content: ./ users/classification/logistic-regression.html
