Author: tdunning
Date: Wed Aug 18 17:26:38 2010
New Revision: 986798
URL: http://svn.apache.org/viewvc?rev=986798&view=rev
Log:
Added comment.
Modified:
mahout/trunk/core/src/main/java/org/apache/mahout/classifier/sgd/AdaptiveAnnealedLogisticRegression.java
Modified: mahout/trunk/core/src/main/java/org/apache/mahout/classifier/sgd/AdaptiveAnnealedLogisticRegression.java
URL: http://svn.apache.org/viewvc/mahout/trunk/core/src/main/java/org/apache/mahout/classifier/sgd/AdaptiveAnnealedLogisticRegression.java?rev=986798&r1=986797&r2=986798&view=diff
==============================================================================
--- mahout/trunk/core/src/main/java/org/apache/mahout/classifier/sgd/AdaptiveAnnealedLogisticRegression.java (original)
+++ mahout/trunk/core/src/main/java/org/apache/mahout/classifier/sgd/AdaptiveAnnealedLogisticRegression.java Wed Aug 18 17:26:38 2010
@@ -18,6 +18,13 @@ import java.util.List;
* seem that it would to maintain multiple learners in memory. Doing this adaptation on-line as we
* learn also decreases the number of learning rate parameters required and replaces the normal
* hyper-parameter search.
+ *
+ * One wrinkle is that the pool of learners that we maintain is actually a pool of CrossFoldLearners
+ * which themselves contain several OnlineLogisticRegression objects. These pools allow estimation
+ * of performance on the fly even if we make many passes through the data. This does, however, increase
+ * the cost of training since if we are using 5-fold cross-validation, each vector is used 4 times for
+ * training and once for classification. If this becomes a problem, then we should probably use a
+ * 2-way unbalanced train/test split rather than full cross validation.
*/
public class AdaptiveAnnealedLogisticRegression implements OnlineLearner {
private int record = 0;
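
Note on the cost described in the added comment: the sketch below is not Mahout code and uses no Mahout classes. It is a small, self-contained Java illustration of why 5-fold cross-validation performs four training updates and one scoring call per vector, and of how a 2-way unbalanced split compares. It assumes that each record's held-out fold is chosen round-robin (record % folds) and that the 2-way split holds out every fifth vector; both are assumptions, not taken from the commit. The names CrossFoldCostSketch, crossFoldCost and splitCost are hypothetical.

```java
// Illustrative only: counts work done, does not train any real model.
class CrossFoldCostSketch {

    // k-fold on-line cross-validation: one model per fold. Each record is
    // scored by the model that holds it out and trains the other k-1 models.
    // Returns {trainingUpdates, scoringCalls}.
    static long[] crossFoldCost(int vectors, int folds) {
        long trainCalls = 0;
        long scoreCalls = 0;
        for (int record = 0; record < vectors; record++) {
            int heldOut = record % folds; // assumed round-robin fold assignment
            for (int model = 0; model < folds; model++) {
                if (model == heldOut) {
                    scoreCalls++;
                } else {
                    trainCalls++;
                }
            }
        }
        return new long[] {trainCalls, scoreCalls};
    }

    // 2-way unbalanced split: a single model. Every testEvery-th record is
    // held out for scoring and every other record trains the model once.
    // Returns {trainingUpdates, scoringCalls}.
    static long[] splitCost(int vectors, int testEvery) {
        long trainCalls = 0;
        long scoreCalls = 0;
        for (int record = 0; record < vectors; record++) {
            if (record % testEvery == 0) {
                scoreCalls++;
            } else {
                trainCalls++;
            }
        }
        return new long[] {trainCalls, scoreCalls};
    }

    public static void main(String[] args) {
        long[] cv = crossFoldCost(1000, 5);
        long[] split = splitCost(1000, 5);
        System.out.println("5-fold: train=" + cv[0] + " score=" + cv[1]);
        System.out.println("split:  train=" + split[0] + " score=" + split[1]);
    }
}
```

Under these assumptions, 1000 vectors cost 4000 training updates with 5-fold cross-validation and 800 with the split, so the split does one fifth of the training work while scoring only a fifth of the data.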