On Fri, Dec 16, 2011 at 3:44 PM, Ramprakash Ramamoorthy <
[email protected]> wrote:

>
>
> On Fri, Dec 16, 2011 at 3:40 PM, JAGANADH G <[email protected]> wrote:
>
>> > Ok. So I will basically write a Java file that calls the classifier
>> > function with the folder path as a parameter. Should I write it in
>> > mahout-core? If not, where should I write the file?
>> >
>>
>>
>>  @Ramaprakash
>>
>> It can be done in your classifier Java code itself.
>> Create a method called listDir which returns all the .txt files in the
>> directory. Iterate over the list, open each file, and pass it to the
>> classifier. That is all. There is no need to go to mahout-core etc. If you
>> still find it hard, please show your code.
>>
>>
>>
>> --
>> **********************************
>> JAGANADH G
>> http://jaganadhg.in
>> *ILUGCBE*
>> http://ilugcbe.org.in
>>
>
> @Jagan
>
>        That is great news. Will go ahead. Thanks :)
>
>
>
> --
> With Thanks and Regards,
> Ramprakash Ramamoorthy,
> B.Tech ICT,
> SASTRA University.
> +91 9626975420
>
>
@Jagan

So far I have only run my classifier from the command line, that is,
through /bin/mahout.

I have just attempted to write this Java file, which takes a single file as
input.

package org.apache.mahout.classifier.bayes;

import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.util.List;

import org.apache.mahout.classifier.ClassifierResult;
import org.apache.mahout.classifier.bayes.algorithm.BayesAlgorithm;
import org.apache.mahout.classifier.bayes.common.BayesParameters;
import org.apache.mahout.classifier.bayes.datastore.InMemoryBayesDatastore;
import org.apache.mahout.classifier.bayes.exceptions.InvalidDatastoreException;
import org.apache.mahout.classifier.bayes.interfaces.Algorithm;
import org.apache.mahout.classifier.bayes.interfaces.Datastore;
import org.apache.mahout.classifier.bayes.model.ClassifierContext;
import org.apache.mahout.common.nlp.NGrams;

public class ramSample {

    /**
     * @param args
     * @throws IOException
     * @throws InvalidDatastoreException
     */
    public static void main(String[] args) throws IOException, InvalidDatastoreException {
        final BayesParameters params = new BayesParameters();
        params.setGramSize(1);
        params.setBasePath("/home/ramprakash-pt09/mahout-distribution-0.5/examples/src/main/java/org/apache/mahout/classifier/bayes/bayes-model");
        params.set("verbose", "false");
        params.set("classifierType", "bayes");
        params.set("defaultCat", "OTHER");
        params.set("encoding", "UTF-8");
        params.set("alpha_i", "1.0");
        params.set("dataSource", "hdfs");

        Algorithm algorithm = new BayesAlgorithm();
        Datastore datastore = new InMemoryBayesDatastore(params);
        ClassifierContext classifier = new ClassifierContext(algorithm, datastore);
        classifier.initialize();

        File file = new File("/home/ramprakash-pt09/mahout-distribution-0.5/examples/src/main/java/org/apache/mahout/classifier/bayes/input.txt");

        final BufferedReader reader = new BufferedReader(new FileReader(file));
        String entry = reader.readLine();

        while (entry != null) {
            List<String> document = new NGrams(entry,
                    Integer.parseInt(params.get("gramSize")))
                    .generateNGramsWithoutLabel();

            ClassifierResult result = classifier.classifyDocument(
                    document.toArray(new String[document.size()]),
                    params.get("defaultCat"));

            entry = reader.readLine();
        }
    }
}


On compiling and running this code, I get the following output :

16 Dec, 2011 6:37:05 PM org.slf4j.impl.JCLLoggerAdapter info
INFO: 57425.12741460857
16 Dec, 2011 6:37:06 PM org.slf4j.impl.JCLLoggerAdapter info
INFO: pos -374948.0234153431 374948.0234153431 -1.0
16 Dec, 2011 6:37:06 PM org.slf4j.impl.JCLLoggerAdapter info
INFO: neg -236477.77478425365 374948.0234153431 -0.630694816391388

I have two categories: pos and neg. But this output lists both of them
instead of a single category. When I check the same input content through
/bin/mahout, the following is the output.

INFO: Category for examples/ACTUAL/input.txt is ClassifierResult{category='pos', score=35.42897640254213}
16 Dec, 2011 6:43:18 PM org.slf4j.impl.JCLLoggerAdapter info

I can handle the input folder parsing with Java IO, but running the
classifier from a Java file now seems to be the bigger problem. Sorry for
bugging you, and thanks for your response.
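
For the folder parsing part, here is a rough sketch of the listDir idea suggested earlier in the thread. It is only an illustration: the class name TxtFileLister and the main method are made up for this example, it assumes Java 8 or later, and the loop body is just a placeholder for the point where each file would be handed to the classifier.

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class, not part of Mahout.
public class TxtFileLister {

    /** Returns the .txt files directly inside dir, sorted by path (no recursion). */
    public static List<File> listDir(File dir) {
        // Match the extension case-insensitively, so "a.txt" and "c.TXT" both qualify.
        File[] matches = dir.listFiles((parent, name) -> name.toLowerCase().endsWith(".txt"));
        List<File> result = new ArrayList<>();
        if (matches != null) { // listFiles returns null if dir is missing or not a directory
            Arrays.sort(matches);
            result.addAll(Arrays.asList(matches));
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the folder path is passed as the first argument.
        File dir = new File(args.length > 0 ? args[0] : ".");
        for (File file : listDir(dir)) {
            // Placeholder: each file's lines would be passed to the classifier here.
            List<String> lines = Files.readAllLines(file.toPath(), StandardCharsets.UTF_8);
            System.out.println(file.getName() + ": " + lines.size() + " line(s)");
        }
    }
}
```

The per-line classification loop from my code above could then go inside the for loop, once per file.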


-- 
With Thanks and Regards,
Ramprakash Ramamoorthy,
B.Tech ICT,
SASTRA University.
+91 9626975420
