The distributed SVD is in core, but the base implementations (both SVD variants - Lanczos and GHA) are a) totally not-Hadoop dependent, and b) not dependent on anything else. So I think keeping them in math is the right place for them - then anyone who wants them can have them with only a simple maven dependency on mahout-math, and nothing else.
-jake On Sun, Apr 4, 2010 at 10:39 PM, Robin Anil <robin.a...@gmail.com> wrote: > SVD shouldn't really be in Math. I agree its "Math" but in principle its a > core Mahout algorithm like clustering or recommendations. I know its a very > debatable thought but for me collections and Math are just tools to aid > complex algorithms in Mahout core. Maybe we can move it under core and > adding the required logging. > > Robin > > > On Mon, Apr 5, 2010 at 11:03 AM, Jake Mannix <jake.man...@gmail.com> > wrote: > > > Umm, I actually depend pretty heavily on the logging in the SVD solvers. > > They are very long-running processes, and give off a ton of useful > > information about what the heck is going on. > > > > Reducing dependencies is great, but logging? I think the math stuff > could > > really use logging. I haven't been able to follow all the JIRA tickets > > lately, things have been crazy, sorry. > > > > -jake > > > > On Sun, Apr 4, 2010 at 10:21 PM, <sro...@apache.org> wrote: > > > > > Author: srowen > > > Date: Mon Apr 5 05:21:27 2010 > > > New Revision: 930796 > > > > > > URL: http://svn.apache.org/viewvc?rev=930796&view=rev > > > Log: > > > MAHOUT-361 Hearing no objection and believing Math shouldn't have log > > > statements and seeing that they're not really used much, I commit > > > > > > Modified: > > > lucene/mahout/trunk/math/pom.xml > > > > > > > > > > lucene/mahout/trunk/math/src/main/java/org/apache/mahout/math/JsonMatrixAdapter.java > > > > > > > > > > lucene/mahout/trunk/math/src/main/java/org/apache/mahout/math/JsonVectorAdapter.java > > > > > lucene/mahout/trunk/math/src/main/java/org/apache/mahout/math/Timer.java > > > > > > > > > > lucene/mahout/trunk/math/src/main/java/org/apache/mahout/math/decomposer/hebbian/HebbianSolver.java > > > > > > > > > > lucene/mahout/trunk/math/src/main/java/org/apache/mahout/math/decomposer/lanczos/LanczosSolver.java > > > > > > > > > > lucene/mahout/trunk/math/src/main/java/org/apache/mahout/math/jet/random/sampling/RandomSampler.java > > > > > > > > > > lucene/mahout/trunk/math/src/main/java/org/apache/mahout/math/jet/stat/quantile/QuantileCalc.java > > > > > > > > > > lucene/mahout/trunk/math/src/main/java/org/apache/mahout/math/jet/stat/quantile/QuantileFinderFactory.java > > > > > > > > > > lucene/mahout/trunk/math/src/main/java/org/apache/mahout/math/matrix/DoubleFactory2D.java > > > > > > > > > > lucene/mahout/trunk/math/src/main/java/org/apache/mahout/math/matrix/doublealgo/Formatter.java > > > > > > > > > > lucene/mahout/trunk/math/src/test/java/org/apache/mahout/math/decomposer/SolverTest.java > > > > > > Modified: lucene/mahout/trunk/math/pom.xml > > > URL: > > > > > > http://svn.apache.org/viewvc/lucene/mahout/trunk/math/pom.xml?rev=930796&r1=930795&r2=930796&view=diff > > > > > > > > > ============================================================================== > > > --- lucene/mahout/trunk/math/pom.xml (original) > > > +++ lucene/mahout/trunk/math/pom.xml Mon Apr 5 05:21:27 2010 > > > @@ -100,19 +100,6 @@ > > > </dependency> > > > > > > <dependency> > > > - <groupId>org.slf4j</groupId> > > > - <artifactId>slf4j-api</artifactId> > > > - <version>1.5.8</version> > > > - </dependency> > > > - > > > - <dependency> > > > - <groupId>org.slf4j</groupId> > > > - <artifactId>slf4j-jcl</artifactId> > > > - <version>1.5.8</version> > > > - <scope>test</scope> > > > - </dependency> > > > - > > > - <dependency> > > > <groupId>junit</groupId> > > > <artifactId>junit</artifactId> > > > <scope>test</scope> > > > > > > Modified: > > > > > > lucene/mahout/trunk/math/src/main/java/org/apache/mahout/math/JsonMatrixAdapter.java > > > URL: > > > > > > http://svn.apache.org/viewvc/lucene/mahout/trunk/math/src/main/java/org/apache/mahout/math/JsonMatrixAdapter.java?rev=930796&r1=930795&r2=930796&view=diff > > > > > > > > > ============================================================================== > > > --- > > > > > > lucene/mahout/trunk/math/src/main/java/org/apache/mahout/math/JsonMatrixAdapter.java > > > (original) > > > +++ > > > > > > lucene/mahout/trunk/math/src/main/java/org/apache/mahout/math/JsonMatrixAdapter.java > > > Mon Apr 5 05:21:27 2010 > > > @@ -27,15 +27,12 @@ import com.google.gson.JsonPrimitive; > > > import com.google.gson.JsonSerializationContext; > > > import com.google.gson.JsonSerializer; > > > import com.google.gson.reflect.TypeToken; > > > -import org.slf4j.Logger; > > > -import org.slf4j.LoggerFactory; > > > > > > import java.lang.reflect.Type; > > > > > > public class JsonMatrixAdapter implements JsonSerializer<Matrix>, > > > JsonDeserializer<Matrix> { > > > > > > - private static final Logger log = > > > LoggerFactory.getLogger(JsonMatrixAdapter.class); > > > public static final String CLASS = "class"; > > > public static final String MATRIX = "matrix"; > > > > > > @@ -73,7 +70,7 @@ public class JsonMatrixAdapter implement > > > try { > > > cl = ccl.loadClass(klass); > > > } catch (ClassNotFoundException e) { > > > - log.warn("Error while loading class", e); > > > + throw new JsonParseException(e); > > > } > > > return (Matrix) gson.fromJson(matrix, cl); > > > } > > > > > > Modified: > > > > > > lucene/mahout/trunk/math/src/main/java/org/apache/mahout/math/JsonVectorAdapter.java > > > URL: > > > > > > http://svn.apache.org/viewvc/lucene/mahout/trunk/math/src/main/java/org/apache/mahout/math/JsonVectorAdapter.java?rev=930796&r1=930795&r2=930796&view=diff > > > > > > > > > ============================================================================== > > > --- > > > > > > lucene/mahout/trunk/math/src/main/java/org/apache/mahout/math/JsonVectorAdapter.java > > > (original) > > > +++ > > > > > > lucene/mahout/trunk/math/src/main/java/org/apache/mahout/math/JsonVectorAdapter.java > > > Mon Apr 5 05:21:27 2010 > > > @@ -26,15 +26,12 @@ import com.google.gson.JsonParseExceptio > > > import com.google.gson.JsonPrimitive; > > > import com.google.gson.JsonSerializationContext; > > > import com.google.gson.JsonSerializer; > > > -import org.slf4j.Logger; > > > -import org.slf4j.LoggerFactory; > > > > > > import java.lang.reflect.Type; > > > > > > public class JsonVectorAdapter implements JsonSerializer<Vector>, > > > JsonDeserializer<Vector> { > > > > > > - private static final Logger log = > > > LoggerFactory.getLogger(JsonVectorAdapter.class); > > > public static final String VECTOR = "vector"; > > > > > > public JsonElement serialize(Vector src, Type typeOfSrc, > > > @@ -61,7 +58,7 @@ public class JsonVectorAdapter implement > > > try { > > > cl = ccl.loadClass(klass); > > > } catch (ClassNotFoundException e) { > > > - log.warn("Error while loading class", e); > > > + throw new JsonParseException(e); > > > } > > > return (Vector) gson.fromJson(vector, cl); > > > } > > > > > > Modified: > > > > lucene/mahout/trunk/math/src/main/java/org/apache/mahout/math/Timer.java > > > URL: > > > > > > http://svn.apache.org/viewvc/lucene/mahout/trunk/math/src/main/java/org/apache/mahout/math/Timer.java?rev=930796&r1=930795&r2=930796&view=diff > > > > > > > > > ============================================================================== > > > --- > > > > lucene/mahout/trunk/math/src/main/java/org/apache/mahout/math/Timer.java > > > (original) > > > +++ > > > > lucene/mahout/trunk/math/src/main/java/org/apache/mahout/math/Timer.java > > Mon > > > Apr 5 05:21:27 2010 > > > @@ -8,9 +8,6 @@ It is provided "as is" without expressed > > > */ > > > package org.apache.mahout.math; > > > > > > -import org.slf4j.Logger; > > > -import org.slf4j.LoggerFactory; > > > - > > > /** > > > * A handy stopwatch for benchmarking. > > > * Like a real stop watch used on ancient running tracks you can start > > the > > > watch, stop it, > > > @@ -21,8 +18,6 @@ import org.slf4j.LoggerFactory; > > > @Deprecated > > > public class Timer extends PersistentObject { > > > > > > - private static final Logger log = > > LoggerFactory.getLogger(Timer.class); > > > - > > > private long baseTime; > > > private long elapsedTime; > > > > > > @@ -33,16 +28,6 @@ public class Timer extends PersistentObj > > > this.reset(); > > > } > > > > > > - /** > > > - * Prints the elapsed time on System.out > > > - * > > > - * @return <tt>this</tt> (for convenience only). > > > - */ > > > - public Timer display() { > > > - log.info(this.toString()); > > > - return this; > > > - } > > > - > > > /** Same as <tt>seconds()</tt>. */ > > > public float elapsedTime() { > > > return seconds(); > > > @@ -127,44 +112,6 @@ public class Timer extends PersistentObj > > > return this; > > > } > > > > > > - /** Shows how to use a timer in convenient ways. */ > > > - public static void test(int size) { > > > - //benchmark this piece > > > - Timer t = new Timer().start(); > > > - int j = 0; > > > - for (int i = 0; i < size; i++) { > > > - j++; > > > - } > > > - t.stop(); > > > - t.display(); > > > - > > > - > > > - //do something we do not want to benchmark > > > - j = 0; > > > - for (int i = 0; i < size; i++) { > > > - j++; > > > - } > > > - > > > - > > > - //benchmark another piece and add to last benchmark > > > - t.start(); > > > - j = 0; > > > - for (int i = 0; i < size; i++) { > > > - j++; > > > - } > > > - t.stop().display(); > > > - > > > - > > > - //benchmark yet another piece independently > > > - t.reset(); //set timer to zero > > > - t.start(); > > > - j = 0; > > > - for (int i = 0; i < size; i++) { > > > - j++; > > > - } > > > - t.stop().display(); > > > - } > > > - > > > /** Returns a String representation of the receiver. */ > > > public String toString() { > > > return "Time=" + Float.toString(this.elapsedTime()) + " secs"; > > > > > > Modified: > > > > > > lucene/mahout/trunk/math/src/main/java/org/apache/mahout/math/decomposer/hebbian/HebbianSolver.java > > > URL: > > > > > > http://svn.apache.org/viewvc/lucene/mahout/trunk/math/src/main/java/org/apache/mahout/math/decomposer/hebbian/HebbianSolver.java?rev=930796&r1=930795&r2=930796&view=diff > > > > > > > > > ============================================================================== > > > --- > > > > > > lucene/mahout/trunk/math/src/main/java/org/apache/mahout/math/decomposer/hebbian/HebbianSolver.java > > > (original) > > > +++ > > > > > > lucene/mahout/trunk/math/src/main/java/org/apache/mahout/math/decomposer/hebbian/HebbianSolver.java > > > Mon Apr 5 05:21:27 2010 > > > @@ -23,7 +23,6 @@ import java.util.Random; > > > > > > import java.util.ArrayList; > > > > > > -import org.apache.mahout.math.AbstractMatrix; > > > import org.apache.mahout.math.DenseMatrix; > > > import org.apache.mahout.math.DenseVector; > > > import org.apache.mahout.math.Matrix; > > > @@ -33,8 +32,6 @@ import org.apache.mahout.math.decomposer > > > import org.apache.mahout.math.function.TimesFunction; > > > import org.apache.mahout.math.Vector; > > > import org.apache.mahout.math.function.PlusMult; > > > -import org.slf4j.Logger; > > > -import org.slf4j.LoggerFactory; > > > > > > /** > > > * The Hebbian solver is an iterative, sparse, singular value > > decomposition > > > solver, based on the paper > > > @@ -44,16 +41,13 @@ import org.slf4j.LoggerFactory; > > > */ > > > public class HebbianSolver { > > > > > > - private static final Logger log = > > > LoggerFactory.getLogger(HebbianSolver.class); > > > - > > > private final EigenUpdater updater; > > > private final SingularVectorVerifier verifier; > > > private final double convergenceTarget; > > > private final int maxPassesPerEigen; > > > private final Random rng = new Random(); > > > > > > - private int numPasses = 0; > > > - private static final boolean debug = false; > > > + //private int numPasses = 0; > > > > > > /** > > > * Creates a new HebbianSolver > > > @@ -163,7 +157,6 @@ public class HebbianSolver { > > > int cols = corpus.numCols(); > > > Matrix eigens = new DenseMatrix(desiredRank, cols); > > > List<Double> eigenValues = new ArrayList<Double>(); > > > - log.info("Finding " + desiredRank + " singular vectors of matrix > > with > > > " + corpus.numRows() + " rows, via Hebbian"); > > > /** > > > * The corpusProjections matrix is a running cache of the residual > > > projection of each corpus vector against all > > > * of the previously found singular vectors. Without this, if > > multiple > > > passes over the data is made (per > > > @@ -186,17 +179,6 @@ public class HebbianSolver { > > > updater.update(currentEigen, corpus.getRow(corpusRow), > > state); > > > } > > > state.setFirstPass(false); > > > - if (debug) { > > > - if (previousEigen == null) { > > > - previousEigen = currentEigen.clone(); > > > - } else { > > > - double dot = currentEigen.dot(previousEigen); > > > - if (dot > 0) { > > > - dot /= (currentEigen.norm(2) * previousEigen.norm(2)); > > > - } > > > - // log.info("Current pass * previous pass = {}", dot); > > > - } > > > - } > > > } > > > // converged! > > > double eigenValue = > > > state.getStatusProgress().get(state.getStatusProgress().size() - > > > 1).getEigenValue(); > > > @@ -206,7 +188,6 @@ public class HebbianSolver { > > > eigens.assignRow(i, currentEigen); > > > eigenValues.add(eigenValue); > > > state.setCurrentEigenValues(eigenValues); > > > - log.info("Found eigenvector {}, eigenvalue: {}", i, > eigenValue); > > > > > > /** > > > * TODO: Persist intermediate output! > > > @@ -216,7 +197,7 @@ public class HebbianSolver { > > > state.setActivationDenominatorSquared(0); > > > state.setActivationNumerator(0); > > > state.getStatusProgress().clear(); > > > - numPasses = 0; > > > + //numPasses = 0; > > > } > > > return state; > > > } > > > @@ -253,13 +234,11 @@ public class HebbianSolver { > > > protected boolean hasNotConverged(Vector currentPseudoEigen, > > > Matrix corpus, > > > TrainingState state) { > > > - numPasses++; > > > + //numPasses++; > > > if (state.isFirstPass()) { > > > - log.info("First pass through the corpus, no need to check > > > convergence..."); > > > return true; > > > } > > > Matrix previousEigens = state.getCurrentEigens(); > > > - log.info("Have made {} passes through the corpus, checking > > > convergence...", numPasses); > > > /* > > > * Step 1: orthogonalize currentPseudoEigen by subtracting off > > eigen(i) > > > * helper.get(i) > > > * Step 2: zero-out the helper vector because it has already > helped. > > > @@ -269,20 +248,11 @@ public class HebbianSolver { > > > currentPseudoEigen.assign(previousEigen, new > > > PlusMult(-state.getHelperVector().get(i))); > > > state.getHelperVector().set(i, 0); > > > } > > > - if (debug && currentPseudoEigen.norm(2) > 0) { > > > - for (int i = 0; i < state.getNumEigensProcessed(); i++) { > > > - Vector previousEigen = previousEigens.getRow(i); > > > - log.info("dot with previous: {}", > > > (previousEigen.dot(currentPseudoEigen)) / currentPseudoEigen.norm(2)); > > > - } > > > - } > > > /* > > > * Step 3: verify how eigen-like the prospective eigen is. This is > > > potentially asynchronous. > > > */ > > > EigenStatus status = verify(corpus, currentPseudoEigen); > > > - if (status.inProgress()) { > > > - log.info("Verifier not finished, making another pass..."); > > > - } else { > > > - log.info("Has 1 - cosAngle: {}, convergence target is: {}", (1 > - > > > status.getCosAngle()), convergenceTarget); > > > + if (!status.inProgress()) { > > > state.getStatusProgress().add(status); > > > } > > > return (state.getStatusProgress().size() <= maxPassesPerEigen && 1 > - > > > status.getCosAngle() > convergenceTarget); > > > @@ -300,7 +270,6 @@ public class HebbianSolver { > > > String corpusDir = props.getProperty("solver.input.dir"); > > > String outputDir = props.getProperty("solver.output.dir"); > > > if (corpusDir == null || corpusDir.length() == 0 || outputDir == > null > > > || outputDir.length() == 0) { > > > - log.error("{} must contain values for solver.input.dir and > > > solver.output.dir", propertiesFile); > > > return; > > > } > > > int inBufferSize = > > > Integer.parseInt(props.getProperty("solver.input.bufferSize")); > > > @@ -321,11 +290,7 @@ public class HebbianSolver { > > > } else { > > > // corpus = new ParallelMultiplyingDiskBufferedDoubleMatrix(new > > > File(corpusDir), inBufferSize, numThreads); > > > } > > > - long now = System.currentTimeMillis(); > > > TrainingState finalState = solver.solve(corpus, rank); > > > - long time = (System.currentTimeMillis() - now) / 1000; > > > - log.info("Solved {} eigenVectors in {} seconds. Persisted to > {}", > > > - new Object[] > > > {finalState.getCurrentEigens().size()[AbstractMatrix.ROW], time, > > > outputDir}); > > > } > > > > > > } > > > > > > Modified: > > > > > > lucene/mahout/trunk/math/src/main/java/org/apache/mahout/math/decomposer/lanczos/LanczosSolver.java > > > URL: > > > > > > http://svn.apache.org/viewvc/lucene/mahout/trunk/math/src/main/java/org/apache/mahout/math/decomposer/lanczos/LanczosSolver.java?rev=930796&r1=930795&r2=930796&view=diff > > > > > > > > > ============================================================================== > > > --- > > > > > > lucene/mahout/trunk/math/src/main/java/org/apache/mahout/math/decomposer/lanczos/LanczosSolver.java > > > (original) > > > +++ > > > > > > lucene/mahout/trunk/math/src/main/java/org/apache/mahout/math/decomposer/lanczos/LanczosSolver.java > > > Mon Apr 5 05:21:27 2010 > > > @@ -35,8 +35,6 @@ import org.apache.mahout.math.matrix.Dou > > > import org.apache.mahout.math.matrix.DoubleMatrix2D; > > > import org.apache.mahout.math.matrix.impl.DenseDoubleMatrix2D; > > > import org.apache.mahout.math.matrix.linalg.EigenvalueDecomposition; > > > -import org.slf4j.Logger; > > > -import org.slf4j.LoggerFactory; > > > > > > /** > > > * <p>Simple implementation of the <a href=" > > > http://en.wikipedia.org/wiki/Lanczos_algorithm">Lanczos algorithm</a> > > for > > > @@ -65,8 +63,6 @@ import org.slf4j.LoggerFactory; > > > */ > > > public class LanczosSolver { > > > > > > - private static final Logger log = > > > LoggerFactory.getLogger(LanczosSolver.class); > > > - > > > public static final double SAFE_MAX = 1.0e150; > > > > > > private static final double NANOS_IN_MILLI = 1.0e6; > > > @@ -77,7 +73,7 @@ public class LanczosSolver { > > > > > > private final Map<TimingSection, Long> startTimes = new > > > EnumMap<TimingSection, Long>(TimingSection.class); > > > private final Map<TimingSection, Long> times = new > > EnumMap<TimingSection, > > > Long>(TimingSection.class); > > > - protected double scaleFactor = 0; > > > + protected double scaleFactor = 0.0; > > > > > > private static final class Scale implements UnaryFunction { > > > private final double d; > > > @@ -103,7 +99,6 @@ public class LanczosSolver { > > > Matrix eigenVectors, > > > List<Double> eigenValues, > > > boolean isSymmetric) { > > > - log.info("Finding {} singular vectors of matrix with {} rows, via > > > Lanczos", desiredRank, corpus.numRows()); > > > Vector currentVector = getInitialVector(corpus); > > > Vector previousVector = new DenseVector(currentVector.size()); > > > Matrix basis = new SparseRowMatrix(new int[]{desiredRank, > > > corpus.numCols()}); > > > @@ -114,7 +109,6 @@ public class LanczosSolver { > > > for (int i = 1; i < desiredRank; i++) { > > > startTime(TimingSection.ITERATE); > > > Vector nextVector = isSymmetric ? corpus.times(currentVector) : > > > corpus.timesSquared(currentVector); > > > - log.info("{} passes through the corpus so far...", i); > > > calculateScaleFactor(nextVector); > > > nextVector.assign(new Scale(1 / scaleFactor)); > > > nextVector.assign(previousVector, new PlusMult(-beta)); > > > @@ -128,7 +122,6 @@ public class LanczosSolver { > > > // and normalize > > > beta = nextVector.norm(2); > > > if (outOfRange(beta) || outOfRange(alpha)) { > > > - log.warn("Lanczos parameters out of range: alpha = {}, beta = > > {}. > > > Bailing out early!", alpha, beta); > > > break; > > > } > > > final double b = beta; > > > @@ -145,7 +138,6 @@ public class LanczosSolver { > > > } > > > startTime(TimingSection.TRIDIAG_DECOMP); > > > > > > - log.info("Lanczos iteration complete - now to diagonalize the > > > tri-diagonal auxiliary matrix."); > > > // at this point, have tridiag all filled out, and basis is all > > filled > > > out, and orthonormalized > > > EigenvalueDecomposition decomp = new > > EigenvalueDecomposition(triDiag); > > > > > > @@ -164,10 +156,8 @@ public class LanczosSolver { > > > } > > > realEigen = realEigen.normalize(); > > > eigenVectors.assignRow(i, realEigen); > > > - log.info("Eigenvector {} found with eigenvalue {}", i, > > > eigenVals.get(i)); > > > eigenValues.add(eigenVals.get(i)); > > > } > > > - log.info("LanczosSolver finished."); > > > endTime(TimingSection.FINAL_EIGEN_CREATE); > > > } > > > > > > > > > Modified: > > > > > > lucene/mahout/trunk/math/src/main/java/org/apache/mahout/math/jet/random/sampling/RandomSampler.java > > > URL: > > > > > > http://svn.apache.org/viewvc/lucene/mahout/trunk/math/src/main/java/org/apache/mahout/math/jet/random/sampling/RandomSampler.java?rev=930796&r1=930795&r2=930796&view=diff > > > > > > > > > ============================================================================== > > > --- > > > > > > lucene/mahout/trunk/math/src/main/java/org/apache/mahout/math/jet/random/sampling/RandomSampler.java > > > (original) > > > +++ > > > > > > lucene/mahout/trunk/math/src/main/java/org/apache/mahout/math/jet/random/sampling/RandomSampler.java > > > Mon Apr 5 05:21:27 2010 > > > @@ -10,8 +10,7 @@ package org.apache.mahout.math.jet.rando > > > > > > import org.apache.mahout.math.PersistentObject; > > > import org.apache.mahout.math.jet.random.engine.RandomEngine; > > > -import org.slf4j.Logger; > > > -import org.slf4j.LoggerFactory; > > > + > > > /** > > > * Space and time efficiently computes a sorted <i>Simple Random Sample > > > Without Replacement (SRSWOR)</i>, that is, a sorted set of <tt>n</tt> > > random > > > numbers from an interval of <tt>N</tt> numbers; > > > * Example: Computing <tt>n=3</tt> random numbers from the interval > > > <tt>[1,50]</tt> may yield the sorted random set <tt>(7,13,47)</tt>. > > > @@ -112,8 +111,6 @@ import org.slf4j.LoggerFactory; > > > @Deprecated > > > public class RandomSampler extends PersistentObject { > > > > > > - private static final Logger log = > > > LoggerFactory.getLogger(RandomSampler.class); > > > - > > > //public class RandomSampler extends Object implements > > > java.io.Serializable { > > > private long n; > > > private long N; > > > > > > Modified: > > > > > > lucene/mahout/trunk/math/src/main/java/org/apache/mahout/math/jet/stat/quantile/QuantileCalc.java > > > URL: > > > > > > http://svn.apache.org/viewvc/lucene/mahout/trunk/math/src/main/java/org/apache/mahout/math/jet/stat/quantile/QuantileCalc.java?rev=930796&r1=930795&r2=930796&view=diff > > > > > > > > > ============================================================================== > > > --- > > > > > > lucene/mahout/trunk/math/src/main/java/org/apache/mahout/math/jet/stat/quantile/QuantileCalc.java > > > (original) > > > +++ > > > > > > lucene/mahout/trunk/math/src/main/java/org/apache/mahout/math/jet/stat/quantile/QuantileCalc.java > > > Mon Apr 5 05:21:27 2010 > > > @@ -8,14 +8,9 @@ It is provided "as is" without expressed > > > */ > > > package org.apache.mahout.math.jet.stat.quantile; > > > > > > -import org.slf4j.Logger; > > > -import org.slf4j.LoggerFactory; > > > - > > > /** Computes b and k vor various parameters. */ > > > class QuantileCalc { > > > > > > - private static final Logger log = > > > LoggerFactory.getLogger(QuantileCalc.class); > > > - > > > private QuantileCalc() { > > > } > > > > > > @@ -289,77 +284,6 @@ class QuantileCalc { > > > return result; > > > } > > > > > > - public static void main(String[] args) { > > > - test_B_and_K_Calculation(args); > > > - } > > > - > > > - /** Computes b and k for different parameters. */ > > > - public static void test_B_and_K_Calculation(String[] args) { > > > - boolean known_N; > > > - if (args == null) { > > > - known_N = false; > > > - } else { > > > - known_N = Boolean.valueOf(args[0]); > > > - } > > > - > > > - int[] quantiles = {1, 1000}; > > > - > > > - long[] sizes = {100000, 1000000, 10000000, 1000000000}; > > > - > > > - double[] deltas = {0.0, 0.001, 0.0001, 0.00001}; > > > - > > > - double[] epsilons = {0.0, 0.1, 0.05, 0.01, 0.005, 0.001, > 0.0000001}; > > > - > > > - > > > - if (!known_N) { > > > - sizes = new long[]{0}; > > > - } > > > - log.info("\n\n"); > > > - if (known_N) { > > > - log.info("Computing b's and k's for KNOWN N"); > > > - } else { > > > - log.info("Computing b's and k's for UNKNOWN N"); > > > - } > > > - log.info("mem [elements/1024]"); > > > - log.info("***********************************"); > > > - > > > - for (int p : quantiles) { > > > - log.info("------------------------------"); > > > - log.info("computing for p = {}", p); > > > - for (long N : sizes) { > > > - log.info(" ------------------------------"); > > > - log.info(" computing for N = {}", N); > > > - for (double delta : deltas) { > > > - log.info(" ------------------------------"); > > > - log.info(" computing for delta = {}", delta); > > > - for (double epsilon : epsilons) { > > > - double[] returnSamplingRate = new double[1]; > > > - long[] result; > > > - if (known_N) { > > > - result = known_N_compute_B_and_K(N, epsilon, delta, p, > > > returnSamplingRate); > > > - } else { > > > - result = unknown_N_compute_B_and_K(epsilon, delta, p); > > > - } > > > - > > > - long b = result[0]; > > > - long k = result[1]; > > > - log.info(" (e,d,N,p)=({},{},{},{}) --> ", new > > > Object[] {epsilon, delta, N, p}); > > > - log.info("(b,k,mem"); > > > - if (known_N) { > > > - log.info(",sampling"); > > > - } > > > - log.info(")=({},{},{}", new Object[] {b, k, (b * k / > > 1024)}); > > > - if (known_N) { > > > - log.info(",{}", returnSamplingRate[0]); > > > - } > > > - log.info(")"); > > > - } > > > - } > > > - } > > > - } > > > - > > > - } > > > - > > > /** > > > * Computes the number of buffers and number of values per buffer > such > > > that quantiles can be determined with an > > > * approximation error no more than epsilon with a certain > probability. > > > @@ -468,7 +392,6 @@ class QuantileCalc { > > > } //end for b > > > > > > if (best_b == Long.MAX_VALUE) { > > > - log.info("Warning: Computing b and k looks like a lot of > > work!"); > > > // no solution found so far. very unlikely. Anyway, try again. > > > max_b *= 2; > > > max_h *= 2; > > > > > > Modified: > > > > > > lucene/mahout/trunk/math/src/main/java/org/apache/mahout/math/jet/stat/quantile/QuantileFinderFactory.java > > > URL: > > > > > > http://svn.apache.org/viewvc/lucene/mahout/trunk/math/src/main/java/org/apache/mahout/math/jet/stat/quantile/QuantileFinderFactory.java?rev=930796&r1=930795&r2=930796&view=diff > > > > > > > > > ============================================================================== > > > --- > > > > > > lucene/mahout/trunk/math/src/main/java/org/apache/mahout/math/jet/stat/quantile/QuantileFinderFactory.java > > > (original) > > > +++ > > > > > > lucene/mahout/trunk/math/src/main/java/org/apache/mahout/math/jet/stat/quantile/QuantileFinderFactory.java > > > Mon Apr 5 05:21:27 2010 > > > @@ -13,8 +13,7 @@ package org.apache.mahout.math.jet.stat. > > > import org.apache.mahout.math.jet.math.Arithmetic; > > > import org.apache.mahout.math.jet.random.engine.RandomEngine; > > > import org.apache.mahout.math.list.DoubleArrayList; > > > -import org.slf4j.Logger; > > > -import org.slf4j.LoggerFactory; > > > + > > > /** > > > * Factory constructing exact and approximate quantile finders for both > > > known and unknown <tt>N</tt>. > > > * Also see {...@link hep.aida.bin.QuantileBin1D}, demonstrating how this > > > package can be used. > > > @@ -95,8 +94,6 @@ import org.slf4j.LoggerFactory; > > > @Deprecated > > > public class QuantileFinderFactory { > > > > > > - private static final Logger log = > > > LoggerFactory.getLogger(QuantileFinderFactory.class); > > > - > > > /** Make this class non instantiable. Let still allow others to > > inherit. > > > */ > > > private QuantileFinderFactory() { > > > } > > > @@ -587,23 +584,20 @@ public class QuantileFinderFactory { > > > double alpha_two = (c + 2.0 * d - root) / (2.0 * d); > > > > > > // any alpha must satisfy 0<alpha<1 to yield valid solutions > > > - boolean alpha_one_OK = false; > > > - if (0.0 < alpha_one && alpha_one < 1.0) { > > > - alpha_one_OK = true; > > > - } > > > - boolean alpha_two_OK = false; > > > - if (0.0 < alpha_two && alpha_two < 1.0) { > > > - alpha_two_OK = true; > > > - } > > > + boolean alpha_one_OK = 0.0 < alpha_one && alpha_one < 1.0; > > > + boolean alpha_two_OK = 0.0 < alpha_two && alpha_two < 1.0; > > > if (alpha_one_OK || alpha_two_OK) { > > > - double alpha = alpha_one; > > > - if (alpha_one_OK && alpha_two_OK) { > > > - // take the alpha that minimizes d/alpha > > > - alpha = Math.max(alpha_one, alpha_two); > > > - } else if (alpha_two_OK) { > > > + double alpha; > > > + if (alpha_one_OK) { > > > + if (alpha_two_OK) { > > > + // take the alpha that minimizes d/alpha > > > + alpha = Math.max(alpha_one, alpha_two); > > > + } else { > > > + alpha = alpha_one; > > > + } > > > + } else { > > > alpha = alpha_two; > > > } > > > - > > > // now we have k=Ceiling(Max(d/alpha, (h+1)/(2*epsilon))) > > > long k = (long) Math.ceil(Math.max(d / alpha, (h + 1) / > (2.0 > > * > > > epsilon))); > > > if (k > 0) { // valid solution? > > > @@ -621,7 +615,6 @@ public class QuantileFinderFactory { > > > } //end for b > > > > > > if (best_b == Long.MAX_VALUE) { > > > - log.warn("Computing b and k looks like a lot of work!"); > > > // no solution found so far. very unlikely. Anyway, try again. > > > max_b *= 2; > > > max_h *= 2; > > > > > > Modified: > > > > > > lucene/mahout/trunk/math/src/main/java/org/apache/mahout/math/matrix/DoubleFactory2D.java > > > URL: > > > > > > http://svn.apache.org/viewvc/lucene/mahout/trunk/math/src/main/java/org/apache/mahout/math/matrix/DoubleFactory2D.java?rev=930796&r1=930795&r2=930796&view=diff > > > > > > > > > ============================================================================== > > > --- > > > > > > lucene/mahout/trunk/math/src/main/java/org/apache/mahout/math/matrix/DoubleFactory2D.java > > > (original) > > > +++ > > > > > > lucene/mahout/trunk/math/src/main/java/org/apache/mahout/math/matrix/DoubleFactory2D.java > > > Mon Apr 5 05:21:27 2010 > > > @@ -15,8 +15,7 @@ import org.apache.mahout.math.jet.random > > > import org.apache.mahout.math.matrix.impl.DenseDoubleMatrix2D; > > > import org.apache.mahout.math.matrix.impl.RCDoubleMatrix2D; > > > import org.apache.mahout.math.matrix.impl.SparseDoubleMatrix2D; > > > -import org.slf4j.Logger; > > > -import org.slf4j.LoggerFactory; > > > + > > > /** > > > Factory for convenient construction of 2-d matrices holding > > > <tt>double</tt> > > > cells. Also provides convenient methods to compose (concatenate) and > > > decompose > > > @@ -85,8 +84,6 @@ sample} to construct random matrices. </ > > > @Deprecated > > > public class DoubleFactory2D extends PersistentObject { > > > > > > - private static final Logger log = > > > LoggerFactory.getLogger(DoubleFactory2D.class); > > > - > > > /** A factory producing dense matrices. */ > > > public static final DoubleFactory2D dense = new DoubleFactory2D(); > > > > > > > > > Modified: > > > > > > lucene/mahout/trunk/math/src/main/java/org/apache/mahout/math/matrix/doublealgo/Formatter.java > > > URL: > > > > > > http://svn.apache.org/viewvc/lucene/mahout/trunk/math/src/main/java/org/apache/mahout/math/matrix/doublealgo/Formatter.java?rev=930796&r1=930795&r2=930796&view=diff > > > > > > > > > ============================================================================== > > > --- > > > > > > lucene/mahout/trunk/math/src/main/java/org/apache/mahout/math/matrix/doublealgo/Formatter.java > > > (original) > > > +++ > > > > > > lucene/mahout/trunk/math/src/main/java/org/apache/mahout/math/matrix/doublealgo/Formatter.java > > > Mon Apr 5 05:21:27 2010 > > > @@ -15,8 +15,7 @@ import org.apache.mahout.math.matrix.imp > > > import org.apache.mahout.math.matrix.impl.AbstractMatrix1D; > > > import org.apache.mahout.math.matrix.impl.AbstractMatrix2D; > > > import org.apache.mahout.math.matrix.impl.Former; > > > -import org.slf4j.Logger; > > > -import org.slf4j.LoggerFactory; > > > + > > > /** > > > Flexible, well human readable matrix print formatting; By default > > decimal > > > point aligned. Build on top of the C-like <i>sprintf</i> functionality > > > provided by the Format class written by Cay Horstmann. > > > @@ -272,8 +271,6 @@ import org.slf4j.LoggerFactory; > > > @Deprecated > > > public class Formatter extends AbstractFormatter { > > > > > > - private static final Logger log = > > > LoggerFactory.getLogger(Formatter.class); > > > - > > > /** Constructs and returns a matrix formatter with format > > <tt>"%G"</tt>. > > > */ > > > public Formatter() { > > > this("%G"); > > > > > > Modified: > > > > > > lucene/mahout/trunk/math/src/test/java/org/apache/mahout/math/decomposer/SolverTest.java > > > URL: > > > > > > http://svn.apache.org/viewvc/lucene/mahout/trunk/math/src/test/java/org/apache/mahout/math/decomposer/SolverTest.java?rev=930796&r1=930795&r2=930796&view=diff > > > > > > > > > ============================================================================== > > > --- > > > > > > lucene/mahout/trunk/math/src/test/java/org/apache/mahout/math/decomposer/SolverTest.java > > > (original) > > > +++ > > > > > > lucene/mahout/trunk/math/src/test/java/org/apache/mahout/math/decomposer/SolverTest.java > > > Mon Apr 5 05:21:27 2010 > > > @@ -23,16 +23,12 @@ import org.apache.mahout.math.Sequential > > > import org.apache.mahout.math.SparseRowMatrix; > > > import org.apache.mahout.math.Vector; > > > import org.apache.mahout.math.VectorIterable; > > > -import org.slf4j.Logger; > > > -import org.slf4j.LoggerFactory; > > > > > > import java.util.Random; > > > > > > > > > public abstract class SolverTest extends TestCase { > > > > > > - private static final Logger log = > > > LoggerFactory.getLogger(SolverTest.class); > > > - > > > protected SolverTest(String name) { > > > super(name); > > > } > > > @@ -75,7 +71,6 @@ public abstract class SolverTest extends > > > double dot = afterMultiply.dot(e); > > > double afterNorm = afterMultiply.getLengthSquared(); > > > double error = 1 - dot / Math.sqrt(afterNorm * > > e.getLengthSquared()); > > > - log.info("Eigenvalue({}) = {}", i, > > > Math.sqrt(afterNorm/e.getLengthSquared())); > > > assertTrue("Error margin: {" + error + " too high! (for eigen " + > i > > + > > > ')', Math.abs(error) < errorMargin); > > > } > > > } > > > > > > > > > > > >