date:20140422

Question Regarding Entropy calculation in Mahout

2014-04-22 Thread Darshan Sonagara

I am a final-year BE student from Gujarat, India, currently studying in the Information Technology branch. My final-year project is document clustering using Hadoop. At this stage I am able to get the final result from the cluster dump command, in which I can see the number of documents in a particular cluster and

Does Mahout handle missing values in train and test data, for Decision Forest?

2014-04-22 Thread Himanshu

In Weka it is possible to mark a field with a question mark (?) for unknown values, and these are handled. Is there a similar way to mark unknown/missing field values in Mahout training and test data as well? I would appreciate any suggestions/pointers. Breiman talks about two ways to handle missing

Re: Does Mahout handle missing values in train and test data, for Decision Forest?

2014-04-22 Thread Sean Owen

From looking at the code recently: no, it is not handled. On Tue, Apr 22, 2014 at 1:27 PM, Himanshu himanshu.ash...@gmail.com wrote: In Weka it is possible to mark a field with a question mark (?) for unknown values, and these are handled. Is there a similar way to mark unknown/missing field

Re: Is there any website documentation repository or tool for Apache Mahout?

2014-04-22 Thread tuxdna

Also, can anyone explain how the .mdtext files are eventually converted into HTML for the current Mahout website? I guess there is a static site generator written in Perl (lib/view.pm and lib/path.pm). But what actually invokes the site generation, in terms of the entry point? I was able to

Getting error in qualcluster command

2014-04-22 Thread Darshan Sonagara

I want to analyze the clusters I produced in Mahout with the k-means algorithm. The qualcluster command has a command-line argument, -c. What kind of file do I need to give as input for the k-means algorithm? I did it for streaming k-means and it worked. But every time I run qualcluster I am getting

Re: Getting error in qualcluster command

2014-04-22 Thread Suneel Marthi

What is the error you are seeing? The output from KMeans is (IntWritable, ClusterWritable), and for Streaming KMeans it is (IntWritable, CentroidWritable). QualCluster may be expecting the latter, and hence works for Streaming KMeans. Could you post the error you are seeing? On Tue, Apr 22, 2014 at 9:12 AM,

Re: Question Regarding Entropy calculation in Mahout

2014-04-22 Thread Ted Dunning

On Tue, Apr 22, 2014 at 12:11 AM, Darshan Sonagara darshan.sonag...@gmail.com wrote: But the problem is that I want to check whether my clustering is good or bad, so for that I need to calculate the entropy value. I do not have any idea how to calculate entropy in Mahout or by other
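
For readers following this thread: one common way to score a clustering with entropy is to compare cluster assignments against known document categories. The sketch below is only an illustration in plain Python and does not use any Mahout API. It assumes ground-truth labels are available for the documents, which the thread does not state. The function name clustering_entropy and its arguments are hypothetical.

```python
import math
from collections import Counter, defaultdict


def clustering_entropy(cluster_ids, labels):
    """Size-weighted average entropy (in bits) of true labels within each cluster.

    Assumption: labels holds a known category for every document.
    Lower is better; 0.0 means every cluster contains a single category.
    """
    if len(cluster_ids) != len(labels):
        raise ValueError("cluster_ids and labels must have the same length")
    if not cluster_ids:
        return 0.0

    # Count how often each label occurs inside each cluster.
    label_counts = defaultdict(Counter)
    for cluster_id, label in zip(cluster_ids, labels):
        label_counts[cluster_id][label] += 1

    total = len(cluster_ids)
    weighted = 0.0
    for counts in label_counts.values():
        size = sum(counts.values())
        # Entropy of the label distribution in this cluster.
        h = -sum((n / size) * math.log2(n / size) for n in counts.values())
        # Weight by the fraction of all documents that fall in this cluster.
        weighted += (size / total) * h
    return weighted
```

A value near zero suggests that each cluster holds mostly one category, while larger values indicate mixed clusters. Without labeled documents this measure cannot be computed, and a different quality measure would be needed.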

Re: Getting error in qualcluster command

2014-04-22 Thread Darshan Sonagara

Yes, exactly, sir: it is expecting CentroidWritable, so the error is a ClassCastException. I will send a snapshot soon. But can you tell me one thing: every time I run qualcluster for Streaming KMeans it shows different output. Why is that? And as you instructed earlier, I checked

Re: Question Regarding Entropy calculation in Mahout

2014-04-22 Thread Darshan Sonagara

Thanks for the reply, sir. Actually, I am doing clustering to gather similar kinds of documents in the same cluster as much as possible. I can see this in the cluster dump output file by observing the top terms. I also found that the result differs when I vary the distance measure. But I want some

Re: Getting error in qualcluster command

2014-04-22 Thread Darshan Sonagara

Waiting for the reply, sir. On Tue, Apr 22, 2014 at 7:00 PM, Darshan Sonagara darshan.sonag...@gmail.com wrote: Yes, exactly, sir: it is expecting CentroidWritable, so the error is a ClassCastException. I will send a snapshot soon. But can you tell me one thing: every time I run

Re: Question Regarding Entropy calculation in Mahout

2014-04-22 Thread Darshan Sonagara

Waiting for the reply, sir. On Tue, Apr 22, 2014 at 7:13 PM, Darshan Sonagara darshan.sonag...@gmail.com wrote: Thanks for the reply, sir. Actually, I am doing clustering to gather similar kinds of documents in the same cluster as much as possible. I can see this in the cluster dump output file

Re: Spark Mahout with a CLI?

2014-04-22 Thread Pat Ferrel

Sebastian created an example import around this that is really nice for several reasons, so anyone interested should check out https://issues.apache.org/jira/browse/MAHOUT-1518. Make sure to look at the patches; the comment thread is a bit cluttered. 1) Spark is awesome because of its use of

12 matches
