On Fri, Dec 16, 2011 at 3:44 PM, Ramprakash Ramamoorthy <[email protected]> wrote:
> On Fri, Dec 16, 2011 at 3:40 PM, JAGANADH G <[email protected]> wrote:
>
>> > Ok. So I will basically write a Java file that calls the classifier
>> > function, with the folder path as a parameter. Should I write it in
>> > mahout-core? If not, where should I write the file?
>>
>> @Ramaprakash
>>
>> It can be done in your classifier Java code itself.
>> Create a method called listDir which returns all the .txt files in the
>> directory. Iterate over the list, open each file, and pass it to the
>> classifier. That is all. There is no need to go to mahout-core etc. Still,
>> if you find it hard, please show your code.
>>
>> --
>> **********************************
>> JAGANADH G
>> http://jaganadhg.in
>> *ILUGCBE*
>> http://ilugcbe.org.in
>
> @Jagan
>
> That is great news. Will go ahead. Thanks :)
>
> --
> With Thanks and Regards,
> Ramprakash Ramamoorthy,
> B.Tech ICT,
> SASTRA University.
> +91 9626975420

@Jagan

So far I have been running my classifier only from the command line, that is, through /bin/mahout. I have just attempted to write this Java file, which takes a single file as input.

package org.apache.mahout.classifier.bayes;

import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.util.List;

import org.apache.mahout.classifier.ClassifierResult;
import org.apache.mahout.classifier.bayes.algorithm.BayesAlgorithm;
import org.apache.mahout.classifier.bayes.common.BayesParameters;
import org.apache.mahout.classifier.bayes.datastore.InMemoryBayesDatastore;
import org.apache.mahout.classifier.bayes.exceptions.InvalidDatastoreException;
import org.apache.mahout.classifier.bayes.interfaces.Algorithm;
import org.apache.mahout.classifier.bayes.interfaces.Datastore;
import org.apache.mahout.classifier.bayes.model.ClassifierContext;
import org.apache.mahout.common.nlp.NGrams;

public class ramSample {

    /**
     * @param args
     * @throws IOException
     * @throws InvalidDatastoreException
     */
    public static void main(String[] args) throws IOException, InvalidDatastoreException {
        // Classifier settings, pointing at the trained model directory
        final BayesParameters params=new BayesParameters();
        params.setGramSize(1);
        params.setBasePath("/home/ramprakash-pt09/mahout-distribution-0.5/examples/src/main/java/org/apache/mahout/classifier/bayes/bayes-model");
        params.set( "verbose", "false" );
        params.set( "classifierType", "bayes" );
        params.set( "defaultCat", "OTHER" );
        params.set( "encoding", "UTF-8" );
        params.set( "alpha_i", "1.0" );
        params.set( "dataSource", "hdfs" );

        // Load the model into memory and set up the classifier
        Algorithm algorithm=new BayesAlgorithm();
        Datastore datastore = new InMemoryBayesDatastore( params );
        ClassifierContext classifier = new ClassifierContext( algorithm, datastore );
        classifier.initialize();

        File file=new File("/home/ramprakash-pt09/mahout-distribution-0.5/examples/src/main/java/org/apache/mahout/classifier/bayes/input.txt");

        final BufferedReader reader = new BufferedReader( new FileReader( file ) );
        String entry = reader.readLine();

        // Classify the input one line at a time
        while( entry != null ) {
            List< String > document = new NGrams( entry,
                    Integer.parseInt( params.get( "gramSize" ) ) )
                    .generateNGramsWithoutLabel();

            ClassifierResult result = classifier.classifyDocument(
                    document.toArray( new String[ document.size() ] ),
                    params.get( "defaultCat" ));

            entry = reader.readLine();
        }
    }

}

On compiling and running this code, I get the following output:

16 Dec, 2011 6:37:05 PM org.slf4j.impl.JCLLoggerAdapter info
INFO: 57425.12741460857
16 Dec, 2011 6:37:06 PM org.slf4j.impl.JCLLoggerAdapter info
INFO: pos -374948.0234153431 374948.0234153431 -1.0
16 Dec, 2011 6:37:06 PM org.slf4j.impl.JCLLoggerAdapter info
INFO: neg -236477.77478425365 374948.0234153431 -0.630694816391388

I have two categories: pos and neg. But this output lists both of them instead of a single predicted category. When I check the same input content through /bin/mahout, I get the following output:

INFO: Category for examples/ACTUAL/input.txt is ClassifierResult{category='pos', score=35.42897640254213}
16 Dec, 2011 6:43:18 PM org.slf4j.impl.JCLLoggerAdapter info

I can handle parsing the input folder with Java I/O, but running the classifier from a Java file now seems to be the bigger problem. Sorry for bugging you, and thanks for your response.

--
With Thanks and Regards,
Ramprakash Ramamoorthy,
B.Tech ICT,
SASTRA University.
+91 9626975420
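
For reference, the listDir idea suggested in the quoted reply could look roughly like the sketch below. It is only an illustration under some assumptions: Java 8 or later, a class name (ListDirSketch) and a readLines helper that are made up here, and a println standing in for the actual call to the classifier. It does not touch any Mahout API.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class; the name and layout are assumptions for illustration.
final class ListDirSketch {

    /** Returns the .txt files directly inside dir, sorted by path. */
    static List<File> listDir(File dir) {
        File[] found = dir.listFiles((d, name) -> name.endsWith(".txt"));
        List<File> files = new ArrayList<>();
        if (found != null) { // listFiles returns null if dir is not a readable directory
            Arrays.sort(found);
            files.addAll(Arrays.asList(found));
        }
        return files;
    }

    /** Reads every line of a file as UTF-8 and closes the reader afterwards. */
    static List<String> readLines(File file) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader =
                 Files.newBufferedReader(file.toPath(), StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Folder path comes from the first argument; defaults to the current directory.
        File dir = new File(args.length > 0 ? args[0] : ".");
        for (File file : listDir(dir)) {
            for (String entry : readLines(file)) {
                // Placeholder: this is where each entry would be tokenized and
                // handed to the classifier, as in the single-file program above.
                System.out.println(file.getName() + ": " + entry);
            }
        }
    }
}
```

Keeping the directory listing separate from the per-file loop means the classification code for a single file can stay unchanged and simply be called once per file.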
