If you want to use Hadoop 0.23, there is no point in specifying 0.22 (a mostly abandoned branch), or 0.20 (an old version of the stable branch, but something I thought you didn't want to use for some reason). So I would simply stop bothering with any of that. Don't use SNAPSHOTs of anything.
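A quick way to see whether any SNAPSHOT versions are still pinned is to grep the pom for them. This is only a rough sketch: `list_snapshots` is a made-up helper name, not part of Maven or Mahout, and it does a plain text match, so it knows nothing about profiles or inherited properties.

```shell
#!/bin/sh
# Sketch: list every line in a pom that mentions a SNAPSHOT version.
# list_snapshots is a hypothetical helper, not a Maven or Mahout command.
list_snapshots() {
  pom="${1:-pom.xml}"   # default to pom.xml in the current directory
  # -n prints line numbers; -e lets the pattern start with a dash.
  # grep exits non-zero when nothing matches.
  grep -n -e '-SNAPSHOT' "$pom"
}
```

Run it as `list_snapshots pom.xml` from the source root. If you are building from trunk, the project's own version will probably show up as well; that one is fine, it is the Hadoop dependency lines that matter.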
examples / integration depend on core, but if core works, they should work. You have to 'mvn install' your core artifact locally so that the other modules build against it. Your error may be caused by that.

Why do you want to use 0.23 in the first place? 1.1.x and 2.0.x are the best stable / experimental branches now.

On Tue, Oct 30, 2012 at 11:27 AM, Diego Ceccarelli <[email protected]> wrote:

> Thanks Sean,
>
> So I first tried commenting out the hadoop-core dependency but it did not work,
> then I added a different version for hadoop-core (0.22.0-SNAPSHOT)
> and I was able to compile the mahout core ( mvn -P hadoop-0.23 install
> -DskipTests)
> I had errors with the integration and examples modules (and it
> seems that I also need to compile them to run mahout). (integration
> [1]), (examples errors: [2])
>
> So I set hadoop-core version to 0.20.2, and I was able to compile
> everything except the integration module (which I excluded from the reactor).
> When I ran mahout anyway I got the same initial error.
> So I used hadoop-core 0.22.0-SNAPSHOT and I compiled
> mahout examples separately with the 0.20.2 version
>
> Then I tried to run lda on my twitter dataset:
>
> bin/mahout cvb -i /user/diegolo/twitter/tweets-rowid -o
> /user/diegolo/twitter/text_lda -k 100 -dict
> /user/diegolo/twitter/dictionary.file-0 --maxIter 20
>
> The job started but I got this error:
>
> 12/10/30 11:19:44 INFO mapreduce.Job: Running job: job_1351559192903_4948
> 12/10/30 11:19:55 INFO mapreduce.Job: Job job_1351559192903_4948 running in uber mode : false
> 12/10/30 11:19:55 INFO mapreduce.Job: map 0% reduce 0%
> 12/10/30 11:20:07 INFO mapreduce.Job: Task Id : attempt_1351559192903_4948_m_000001_0, Status : FAILED
> Error: java.lang.ClassCastException: org.apache.hadoop.io.Text cannot be cast to org.apache.mahout.math.VectorWritable
>     at org.apache.mahout.clustering.lda.cvb.CachingCVB0Mapper.map(CachingCVB0Mapper.java:55)
>     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:725)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
>     at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:157)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:396)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1212)
>     at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:152)
>
> do you think it is due to the dirty mix I did? why does bin/mahout need the
> examples folder?
>
> Thanks,
> Diego
>
> [1] http://pastebin.com/q6VsSAFB
> [2] http://pastebin.com/YvcegjBZ
>
> On Mon, Oct 29, 2012 at 11:20 PM, Sean Owen <[email protected]> wrote:
>
> > I haven't tried it, I don't know if it works. From reading the pom.xml it
> > looks like it should not consider hadoop-core a dependency if you select
> > the other profile. If not, I don't know why. You could always just delete
> > all the hadoop-core bits and do away with the alternate profile, that would
> > work.
> >
> > On Mon, Oct 29, 2012 at 10:07 PM, Diego Ceccarelli <[email protected]> wrote:
> >
> >> > But, most of all note that you are not looking for hadoop-core but
> >> > hadoop-common
> >>
> >> Sorry, but it's 11 pm here and I'm a bit tired ;) I don't understand the
> >> above sentence:
> >> in the main pom.xml hadoop-core and hadoop-common are imported with the
> >> same placeholder $hadoop.version, and the problem that I have is that I
> >> can't compile because maven does not find the version 0.23.3/4 of hadoop-core.
> >> You are telling me that I have to exclude hadoop core? or to use an
> >> older version for the core?
> >> Sorry again :(
> >>
> >> cheers
> >> Diego
>
> --
> Computers are useless. They can only give you answers.
> (Pablo Picasso)
> _______________
> Diego Ceccarelli
> High Performance Computing Laboratory
> Information Science and Technologies Institute (ISTI)
> Italian National Research Council (CNR)
> Via Moruzzi, 1
> 56124 - Pisa - Italy
>
> Phone: +39 050 315 3055
> Fax: +39 050 315 2040
> ________________________________________
