Am I able to run the `Decision tree` from Mahout in Eclipse without installing it?
Should I `install` Mahout on my system, or download all the `jar` dependencies
and include them in lib?
I want to know how the Decision Tree works.
Where can I find the `source code` for the Mahout Decision tree?
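For what it's worth, the decision tree / random forest code lives under the `org.apache.mahout.classifier.df` package in the mahout-core source (package name from memory, so double-check against your Mahout version). To see the basic mechanics independently of Mahout, here is a minimal self-contained sketch of how a trained decision tree classifies an instance; the tree shape, thresholds, and labels below are made up for illustration and are not Mahout's API:

```java
// Toy decision-tree classification sketch (illustrative, not Mahout's API).
// A tree is either a leaf holding a class label, or an internal node that
// tests one numeric attribute against a threshold and descends left/right.
public class ToyDecisionTree {
    interface Node { int classify(double[] features); }

    static final class Leaf implements Node {
        final int label;
        Leaf(int label) { this.label = label; }
        public int classify(double[] features) { return label; }
    }

    static final class Split implements Node {
        final int attr; final double threshold; final Node left, right;
        Split(int attr, double threshold, Node left, Node right) {
            this.attr = attr; this.threshold = threshold;
            this.left = left; this.right = right;
        }
        public int classify(double[] f) {
            // descend left when the tested attribute is below the threshold
            return f[attr] < threshold ? left.classify(f) : right.classify(f);
        }
    }

    public static void main(String[] args) {
        // if f[0] < 2.5 -> class 0; else if f[1] < 1.0 -> class 1; else class 2
        Node root = new Split(0, 2.5, new Leaf(0),
                              new Split(1, 1.0, new Leaf(1), new Leaf(2)));
        System.out.println(root.classify(new double[]{1.0, 5.0})); // 0
        System.out.println(root.classify(new double[]{3.0, 0.5})); // 1
        System.out.println(root.classify(new double[]{3.0, 2.0})); // 2
    }
}
```

Training is the interesting part (choosing attributes and thresholds by information gain or Gini impurity), but classification really is just this recursive descent.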
--
Thanks & Regards
Hi Andrew et al.,
I have the following statement in my pig script.
AU = FOREACH A GENERATE myparser.myUDF(param1, param2);
STORE AU INTO '/scratch/AU';
AU has the following format:
(userid, (item_view_history))
(27,(0,1,1,0,0))
(28,(0,0,1,0,0))
(29,(0,0,1,0,1))
(30,(1,0,1,0,1))
I will have at least
In the meantime, you might apply the patch in MAHOUT-1354, build Mahout
using `mvn package -Phadoop2 -DskipTests=true`, use that Mahout version, and
see if that works.
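Concretely, those steps might look like this (the patch file name below is illustrative; use whichever file is attached to the MAHOUT-1354 JIRA, and apply it from a Mahout source checkout):

```shell
# from the root of a Mahout source checkout, apply the MAHOUT-1354 patch
# (file name is hypothetical -- download the real one from the JIRA issue)
patch -p0 < MAHOUT-1354.patch

# build against the hadoop2 profile, skipping tests
mvn package -Phadoop2 -DskipTests=true
```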
Gokhan
On Wed, Dec 11, 2013 at 10:09 PM, Gokhan Capan wrote:
> I apologize, Suneel is right, Counter breaks the binary compatibility.
Here are the full contents of my pom file:
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
                             http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>clustertest</groupId>
  <artifactId>clustertest</artifactId>
  <version>1.0</version>
  <packaging>jar</packaging>
  <name>cluster</name>
I apologize, Suneel is right, Counter breaks the binary compatibility.
Well, I can say there is work in progress for building Mahout against
hadoop2.
Gokhan
On Wed, Dec 11, 2013 at 10:03 PM, Hi There wrote:
> Here are the full contents of my pom file:
>
> <project xmlns="http://maven.apache.org/POM/4.0.0"
Per this link, one notable incompatibility is Counter and CounterGroup.
http://hadoop.apache.org/docs/r2.2.0/hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduce_Compatibility_Hadoop1_Hadoop2.html
On Wednesday, December 11, 2013 2:46 PM, Hi There wrote:
I tried to run SparseVe
Could you check the following?
Are you sure that your hadoop cluster is hadoop 2.2.0?
Are you sure other dependencies of your project do not have a transitive
dependency to hadoop?
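One quick way to answer the second question, assuming the project builds with Maven, is the dependency plugin's tree report filtered down to hadoop artifacts; it shows exactly which dependency pulls hadoop in transitively:

```shell
# show every path through which an org.apache.hadoop artifact enters the build
mvn dependency:tree -Dincludes=org.apache.hadoop
```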
Gokhan
On Wed, Dec 11, 2013 at 9:46 PM, Hi There wrote:
I tried to run SparseVectorsFromSequenceFiles, specifying a directory with
sequence files, and I got the following error:
java.lang.Exception: java.lang.IncompatibleClassChangeError: Found interface
org.apache.hadoop.mapreduce.Counter, but class was expected
Here is a relevant snippet of my pom
Hi Zoltan,
I am saying that hadoop2-stable and hadoop1 are binary compatible. I don't know
what version of hadoop is used in cdh4-mr2 but I guess it was hadoop2 alpha,
since bigtop was at hadoop 2.0.6 alpha last time I checked, which was last week.
Just try it and let us know if you experience
Sebastian,
Are we still using SplitInputJob? It seems like it has been replaced by a much
newer SplitInput.
Do you think this needs to be purged from the codebase for 0.9? It's been marked
as deprecated anyway.
On Wednesday, December 11, 2013 2:08 PM, Suneel Marthi
wrote:
A quick search through the codebase shows the following still using the old mapred API:
DistributedRowMatrix
SplitInputJob
MatrixMultiplicationJob
BtJob
TransposeJob
TimesSquaredJob
ABtJob
ABtDenseOutJob
BtJob
QJob
QRFirstStep
On Wednesday, December 11, 2013 2:01 PM, Sebastian Schelter
wrote:
I think there are still parts of the code (e.g. in DistributedRowMatrix)
that use the old API.
--sebastian
On 11.12.2013 19:56, Suneel Marthi wrote:
> Mahout is using the newer mapreduce API and not the older mapred API.
> Was that what you were looking for?
Mahout is using the newer mapreduce API and not the older mapred API.
Was that what you were looking for?
On Wednesday, December 11, 2013 1:53 PM, Zoltan Prekopcsak
wrote:
Hi Gokhan,
Thank you for the clarification.
Does it mean that Mahout is using the mapred API everywhere and there is
no mapreduce API left? As far as I know, code built against the mapreduce API
needs to be recompiled, and I remember needing to recompile Mahout for CDH4
when it first came out.
Thanks, Zoltan
This is not right. The sequential version would have finished long before
this for any reasonable value of k.
I do note, however, that you have set k = 200,000 where you only have
300,000 documents. Depending on which value you set (I don't have the code
handy), this may actually be increased in
Hi,
I first tried Streaming K-means with about 5000 news stories, and it worked
just fine. Then I tried it over 300,000 news stories and gave it 10GB of
RAM. After more than 43 hours, it was still in the last merge-pass when I
eventually decided to stop it.
I set K to 20 and KM 2522308 (its f
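As background for readers following along: streaming k-means makes a single pass over the points, keeping many more than k provisional centroids and growing a distance cutoff as it goes; the final merge-pass then reduces those centroids down to k. A toy single-pass sketch of that idea follows; it is illustrative only, not Mahout's StreamingKMeans, and the cutoff-growth policy and constants are made up:

```java
import java.util.ArrayList;
import java.util.List;

// Toy one-pass streaming clustering sketch (illustrative, not Mahout's
// StreamingKMeans). Each point either folds into its nearest centroid or,
// if it is farther away than the current cutoff, becomes a new centroid.
// A real implementation also collapses centroids when too many accumulate;
// here we only grow the cutoff to coarsen future assignments.
public class StreamingSketch {
    final List<double[]> centroids = new ArrayList<>();
    final List<Integer> weights = new ArrayList<>();
    double cutoff;
    final int maxCentroids;

    StreamingSketch(double initialCutoff, int maxCentroids) {
        this.cutoff = initialCutoff;
        this.maxCentroids = maxCentroids;
    }

    static double dist(double[] a, double[] b) {
        double s = 0;
        for (int i = 0; i < a.length; i++) s += (a[i] - b[i]) * (a[i] - b[i]);
        return Math.sqrt(s);
    }

    void add(double[] p) {
        int best = -1;
        double bestD = Double.MAX_VALUE;
        for (int i = 0; i < centroids.size(); i++) {
            double d = dist(p, centroids.get(i));
            if (d < bestD) { bestD = d; best = i; }
        }
        if (best >= 0 && bestD <= cutoff) {
            // fold the point into its nearest centroid (weighted average)
            double[] c = centroids.get(best);
            int w = weights.get(best);
            for (int i = 0; i < c.length; i++) c[i] = (c[i] * w + p[i]) / (w + 1);
            weights.set(best, w + 1);
        } else {
            centroids.add(p.clone());
            weights.add(1);
            if (centroids.size() > maxCentroids) cutoff *= 1.5; // coarsen
        }
    }

    public static void main(String[] args) {
        StreamingSketch s = new StreamingSketch(1.0, 10);
        // two well-separated 1-D blobs -> expect 2 surviving centroids
        for (double x : new double[]{0.0, 0.1, 0.2, 10.0, 10.1, 10.2})
            s.add(new double[]{x});
        System.out.println(s.centroids.size()); // 2
    }
}
```

The point of the sketch: the pass itself is cheap and linear; the cost concentrates in the final merge of provisional centroids, which is consistent with the run above stalling in the last merge-pass.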
I am currently using naive bayes for text classification.
I prefer NB over SVM because:
- SVM has long training time
- NB can be incremental
- NB can be fully parallel
The main decisions you should make while using NB are whether to use tf or
tf-idf weighting, and whether to use binary or multinomial NB.
if you classify short te
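To make the tf vs. tf-idf choice concrete, here is a minimal sketch using the standard textbook formulas (tf = raw term count in the document, idf = log(N/df) where df is the number of documents containing the term); the toy corpus is made up, and this is not Mahout's vectorizer:

```java
import java.util.*;

// Minimal tf-idf weighting sketch for a toy corpus (standard formulas,
// not Mahout's vectorizer): weight(term) = tf * log(N / df).
public class TfIdfSketch {
    public static Map<String, Double> tfidf(List<String[]> corpus, String[] doc) {
        int n = corpus.size();
        Map<String, Double> out = new HashMap<>();
        for (String term : new HashSet<>(Arrays.asList(doc))) {
            long tf = Arrays.stream(doc).filter(term::equals).count();
            long df = corpus.stream()
                    .filter(d -> Arrays.asList(d).contains(term)).count();
            out.put(term, tf * Math.log((double) n / df));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String[]> corpus = Arrays.asList(
                new String[]{"cheap", "pills", "cheap"},
                new String[]{"meeting", "agenda"},
                new String[]{"cheap", "meeting"});
        Map<String, Double> w = tfidf(corpus, corpus.get(0));
        // "cheap" occurs in 2 of 3 docs, "pills" in only 1: pills gets the
        // higher weight even though cheap has the higher raw tf here
        System.out.println(w);
    }
}
```

Binary NB would clamp each tf to 0/1 before weighting, which often helps on short texts where repeated terms carry little extra signal.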