Hi,
I am doing text classification using complementary Naïve Bayes in Mahout 0.7
and Hadoop 1.0.2. I want to export the confusion matrix as HTML.
I am running the following command:
mahout cmdump -i PIPS0-testing/part-m-0 -o PIPS0-testing -ow -html
I am getting the following exception
Hi,
I tried vectordump and it created the CSV file. Is there any easy way to
convert this file or vectors into confusion matrix? Please suggest.
Regards,
Anand.C
-Original Message-
From: Chandra Mohan, Ananda Vel Murugan
Sent: Thursday, May 30, 2013 12:11 PM
To:
Hi Suneel,
Thanks. For point 2, I tried to look into how to achieve this using Lucene but
was not able to gather much information.
It would be helpful if you could guide me to the relevant links or samples
through which I can achieve point 2.
Thanks
Stuti Awasthi
-Original Message-
Hi Suneel/Dmitriy,
I got mahout-examples-0.8-SNAPSHOT-job.jar compiled from trunk.
Now I got the -us param working for the input set, as you mentioned.
Steps followed are:
mahout arff.vector --input /mnt/cluster/t/PE_EXE/input-set.arff --output
/user/hadoop/t/input-set-vector/ --dictOut
You should be using 'pca' with ssvd
mahout ssvd -i /user/hadoop/t/input-set-vector/ -o
/user/hadoop/t/input-set-svd/ -k 50 --reduceTasks 2 -U true -V false -us
true -ow -pca true
You should be using USigma (U*Sigma); this is generated by the '-us' option.
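To make the -pca and -us advice concrete, here is a toy sketch in NumPy rather than Mahout (the matrix, the rank k, and all variable names are made up for illustration): mean-center the input, take the SVD, and keep USigma = U * Sigma as the k-dimensional representation of each row.

```python
import numpy as np

# Toy "document-term" matrix standing in for the seq2sparse vectors.
A = np.array([
    [5.0, 4.0, 0.0, 0.0],
    [4.0, 5.0, 0.0, 1.0],
    [5.0, 5.0, 1.0, 0.0],
    [0.0, 0.0, 4.0, 5.0],
    [0.0, 1.0, 5.0, 4.0],
    [1.0, 0.0, 5.0, 5.0],
])

# What '-pca true' does conceptually: mean-center the columns first.
A_centered = A - A.mean(axis=0)

k = 2  # decomposition rank, like ssvd's -k
U, sigma, Vt = np.linalg.svd(A_centered, full_matrices=False)

# What '-us true' emits conceptually: USigma = U * Sigma,
# i.e. each input row projected into the k-dimensional space.
USigma = U[:, :k] * sigma[:k]
print(USigma.shape)  # (6, 2)
```

Rows of USigma are the reduced vectors you would then cluster.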
Hello Suneel,
Thanks a lot for the quick reply about the missing param.
mahout arff.vector --input /mnt/cluster/t/input-set.arff --output
/user/hadoop/t/input-set-vector/ --dictOut /mnt/cluster/t/input-set-dict
mahout ssvd --input /user/hadoop/t/input-set-vector/ --output
/user/hadoop/t/input-set-svd/ -k
Hi,
I got it working.
I wrote a utility class which takes the classification output (part-m-0)
and creates the confusion matrix. part-m-0 was a sequence file with vectors,
and cmdump was trying to convert the Vectors into a Matrix, hence the error I
was getting. I don't know whether it is a
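For what it's worth, the counting step of such a utility is small; here is a minimal sketch in plain Python (not the actual Java class from the message above; `confusion_matrix` and the example labels are invented) that tallies (actual, predicted) label pairs:

```python
def confusion_matrix(pairs, labels):
    """Tally (actual, predicted) label pairs into counts[actual][predicted]."""
    counts = {a: {p: 0 for p in labels} for a in labels}
    for actual, predicted in pairs:
        counts[actual][predicted] += 1
    return counts

# Hypothetical classification output: (actual, predicted) per document.
pairs = [("spam", "spam"), ("spam", "ham"), ("ham", "ham"), ("ham", "ham")]
m = confusion_matrix(pairs, ["spam", "ham"])
print(m["spam"]["ham"])  # 1 (one spam misclassified as ham)
```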
Hi,
I want to do bottom-up clustering (or rather, hierarchical clustering) instead
of the top-down approach mentioned in
https://cwiki.apache.org/MAHOUT/top-down-clustering.html
(kmeans -> clusterdump -> clusterpp, and then kmeans on each cluster).
How do I take the centroids from the first phase of canopy and use them for
The input to canopy is your vectors from seq2sparse and not cluster centroids
(as you had it), hence the error message you are seeing.
The output of canopy could be fed into kmeans as input centroids.
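To make "feed canopy's output into kmeans as input centroids" concrete, here is a toy sketch in plain Python/NumPy (not Mahout's drivers; the data and the `kmeans` helper are invented for illustration): Lloyd's iterations seeded with externally supplied centers.

```python
import numpy as np

def kmeans(points, centroids, iters=10):
    """Lloyd's k-means iterations, starting from supplied centroids
    (here standing in for canopy's output)."""
    centroids = centroids.copy()
    for _ in range(iters):
        # Assign each point to its nearest centroid.
        d = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        assign = d.argmin(axis=1)
        # Recompute each centroid as the mean of its members.
        for j in range(len(centroids)):
            members = points[assign == j]
            if len(members):
                centroids[j] = members.mean(axis=0)
    return centroids, assign

pts = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])
init = np.array([[0.5, 0.5], [9.0, 9.0]])  # pretend canopy centers
c, a = kmeans(pts, init)
```

The point of the sketch is only the seeding: k-means does not pick its own starting centers here, it refines the ones canopy produced.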
From: Rajesh Nikam rajeshni...@gmail.com
To:
Hello Suneel,
I got it. The next step after canopy is to feed these centroids to kmeans and
cluster.
However, what I want is to use the centroids from these clusters and do
clustering on them, so as to find related clusters.
Thanks
Rajesh
On Thu, May 30, 2013 at 8:38 PM, Suneel Marthi
Rajesh
The streaming k-means implementation is very much like what you are asking for.
The first pass is to cluster into many, many clusters and then cluster those
clusters.
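A toy sketch of that two-pass idea (plain NumPy, not Mahout's actual StreamingKMeans; the `lloyd` helper, its seeding, and the data are simplifications invented for illustration): over-cluster first, then run k-means again on the resulting centers.

```python
import numpy as np

rng = np.random.default_rng(0)

def lloyd(points, k, iters=25):
    """Plain k-means; seeds are evenly spaced rows of the input."""
    seeds = np.linspace(0, len(points) - 1, k).astype(int)
    centroids = points[seeds].copy()
    for _ in range(iters):
        d = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        assign = d.argmin(axis=1)
        for j in range(k):
            if (assign == j).any():
                centroids[j] = points[assign == j].mean(axis=0)
    return centroids

# Two well-separated blobs of points.
data = np.vstack([rng.normal(0.0, 0.5, (50, 2)),
                  rng.normal(10.0, 0.5, (50, 2))])

# Pass 1: over-cluster into many small clusters.
many = lloyd(data, 20)
# Pass 2: cluster those cluster centers down to the final k.
final = lloyd(many, 2)
```

The second pass operates only on the 20 first-pass centers, which is the "cluster the clusters" step described above.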
Sent from my iPhone
On May 30, 2013, at 11:20, Rajesh Nikam rajeshni...@gmail.com wrote:
Hello Suneel,
I got
To add to Ted's reply, streaming k-means was recently added to Mahout (thanks
to Dan and Ted).
Here's the reference paper that talks about Streaming k-means:
http://books.nips.cc/papers/files/nips24/NIPS2011_1271.pdf
You have to be working off of trunk to use this; it's not available as part
I believe this flow describes how to use lanczos svd in mahout to arrive at
the same reduction as ssvd already provides with pca and USigma options in
one step. This flow is irrelevant when working with ssvd; it already does
it all internally for you.
On May 30, 2013 5:45 AM, Rajesh Nikam
I.e., I guess you want to run kmeans directly on the USigma output.
On May 30, 2013 9:37 AM, Dmitriy Lyubimov dlie...@gmail.com wrote:
I believe this flow describes how to use lanczos svd in mahout to arrive
at the same reduction as ssvd already provides with pca and USigma options
in one step.
Agree with Dmitriy.
From: Dmitriy Lyubimov dlie...@gmail.com
To: user@mahout.apache.org
Cc: Suneel Marthi suneel_mar...@yahoo.com
Sent: Thursday, May 30, 2013 12:39 PM
Subject: Re: Fwd: Re: convert input for SVD
I.e., I guess you want to run kmeans
Yes, how do I run canopy/kmeans on the USigma output? What is the connecting
step? Please advise.
Thanks,
Rajesh
On May 30, 2013 10:09 PM, Dmitriy Lyubimov dlie...@gmail.com wrote:
I.e., I guess you want to run kmeans directly on the USigma output.
On May 30, 2013 9:37 AM, Dmitriy Lyubimov
Hi All,
Is there a way to compute precision and recall values given a file of
recommendations and a test file of user preferences?
I know there is GenericRecommenderIRStatsEvaluator in Mahout to compute
the IR Stats but it takes a RecommenderBuilder object among others as
parameters to build a
There's nothing direct, but you can probably save yourself time by copying
the code that computes these stats and applying it to your pre-computed
values. It's not terribly complex, just counting the intersection and union
size and deriving some stats from it.
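That counting can be sketched in a few lines (a generic sketch, not Mahout's GenericRecommenderIRStatsEvaluator; the function and variable names are made up):

```python
def precision_recall(recommended, relevant):
    """Precision and recall from a recommended list and the set of
    relevant (held-out test preference) items, via the intersection."""
    rec, rel = set(recommended), set(relevant)
    hits = len(rec & rel)
    precision = hits / len(rec) if rec else 0.0
    recall = hits / len(rel) if rel else 0.0
    return precision, recall

# Hypothetical per-user data: 4 recommendations, 3 relevant items, 2 hits.
p, r = precision_recall(["a", "b", "c", "d"], ["a", "c", "e"])
print(p, r)  # precision 0.5, recall 2/3
```

Run per user over your recommendations file and test file, then average to get the aggregate IR stats.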
The split is actually based on value
On Thu, May 30, 2013 at 12:01 PM, Sean Owen sro...@gmail.com wrote:
There's nothing direct, but you can probably save yourself time by copying
the code that computes these stats and applying it to your pre-computed
values. It's not terribly complex, just counting the intersection and union
That's correct. Also, that SnowballAnalyzer implicitly converts all text to
lower case, so you could avoid that step in your computation.
All of your keywords would have to first be run through the SnowballAnalyzer,
and the same goes for your documents,
before you make the call to