[ https://issues.apache.org/jira/browse/MAHOUT-822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13225834#comment-13225834 ]
jirapos...@reviews.apache.org commented on MAHOUT-822: ------------------------------------------------------ bq. On 2012-03-08 23:42:23, Dmitriy Lyubimov wrote: bq. > trunk/core/src/main/java/org/apache/mahout/common/HadoopUtil.java, line 242 bq. > <https://reviews.apache.org/r/4237/diff/1/?file=88913#file88913line242> bq. > bq. > Extra white spaces generate style warnings but are notoriously hard to eliminate. I have no problems with them but it was a subject of review notes from other committers before, I am just reproducing collective mind here. Good point - I'll address this before committing. bq. On 2012-03-08 23:42:23, Dmitriy Lyubimov wrote: bq. > trunk/core/src/main/java/org/apache/mahout/common/HadoopUtil.java, line 247 bq. > <https://reviews.apache.org/r/4237/diff/1/?file=88913#file88913line247> bq. > bq. > is this really an issue with Hadoop 0.20.203? i don't think we care about classic hadoop 0.20.2 anymore. Hmm, I am not sure, but I think this is a comment we can omit. Without the new listStatus methods (but with the changes to other files), there is at least one test that won't compile. I'm willing to figure out who added this to the patch and ping them about it, but before I do, what should I ask them to clarify? Exactly which version of Hadoop they had in mind when making the changes to HadoopUtil? bq. On 2012-03-08 23:42:23, Dmitriy Lyubimov wrote: bq. > trunk/core/src/test/java/org/apache/mahout/clustering/meanshift/TestMeanShift.java, line 356 bq. > <https://reviews.apache.org/r/4237/diff/1/?file=88920#file88920line356> bq. > bq. > This and some other changes look suspiciously like a functional change rather than a hadoop compatibility change. I think it would help somebody else more familiar with this code to assure it is indeed benign.. The changes in tests were needed because some tests were requiring outputs to appear in a certain order. That is to say, we'd need to output "cluster 1 contains items a, b and d; cluster 2 contains c, e and f"; finding the same set of clusters but calling {c,e,f} "cluster 1" would not do. Under Hadoop 0.23.1, sometimes these outputs come in a different order. Most of the changes to tests simply allow permutations in the expected output. Mean shift was it's own special case - it is very sensitive to these ordering effects, and it is an iterative process. I didn't dig into the implementation enough to convince myself it was possible to devise a new test that would be immune to ordering (though it's likely this is true); I just found an input permutation that gave a consistent answer between 0.20.203 and 0.23.1. In the other cases I think I made the tests a little better, but in this case I made the test pass. Is there someone you'd nominate as an additional reviewer? - tom ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4237/#review5754 ----------------------------------------------------------- On 2012-03-08 01:24:21, tom pierce wrote: bq. bq. ----------------------------------------------------------- bq. This is an automatically generated e-mail. To reply, visit: bq. https://reviews.apache.org/r/4237/ bq. ----------------------------------------------------------- bq. bq. (Updated 2012-03-08 01:24:21) bq. bq. bq. Review request for mahout and Dmitriy Lyubimov. bq. bq. bq. Summary bq. ------- bq. bq. This is the current patch for MAHOUT-822 (as posted by Bilung Lee). bq. bq. bq. This addresses bug MAHOUT-822. bq. https://issues.apache.org/jira/browse/MAHOUT-822 bq. bq. bq. Diffs bq. ----- bq. bq. trunk/core/pom.xml 1296318 bq. trunk/core/src/main/java/org/apache/mahout/common/HadoopUtil.java 1296318 bq. trunk/core/src/test/java/org/apache/mahout/classifier/df/mapreduce/partial/MockContext.java 1296318 bq. trunk/core/src/test/java/org/apache/mahout/classifier/df/mapreduce/partial/PartialSequentialBuilder.java 1296318 bq. trunk/core/src/test/java/org/apache/mahout/classifier/df/mapreduce/partial/Step1MapperTest.java 1296318 bq. trunk/core/src/test/java/org/apache/mahout/clustering/canopy/TestCanopyCreation.java 1296318 bq. trunk/core/src/test/java/org/apache/mahout/clustering/classify/ClusterClassificationDriverTest.java 1296318 bq. trunk/core/src/test/java/org/apache/mahout/clustering/kmeans/TestKmeansClustering.java 1296318 bq. trunk/core/src/test/java/org/apache/mahout/clustering/meanshift/TestMeanShift.java 1296318 bq. trunk/core/src/test/java/org/apache/mahout/common/DummyCounter.java 1296318 bq. trunk/core/src/test/java/org/apache/mahout/common/DummyRecordWriter.java 1296318 bq. trunk/core/src/test/java/org/apache/mahout/common/DummyStatusReporter.java 1296318 bq. trunk/core/src/test/java/org/apache/mahout/math/hadoop/TestDistributedRowMatrix.java 1296318 bq. trunk/integration/src/test/java/org/apache/mahout/clustering/TestClusterDumper.java 1296318 bq. trunk/pom.xml 1296318 bq. bq. Diff: https://reviews.apache.org/r/4237/diff bq. bq. bq. Testing bq. ------- bq. bq. Passes unit tests under default config as well as under hadoop 0.23.1. bq. bq. bq. Thanks, bq. bq. tom bq. bq. > Mahout needs to be made compatible with Hadoop .23 releases > ----------------------------------------------------------- > > Key: MAHOUT-822 > URL: https://issues.apache.org/jira/browse/MAHOUT-822 > Project: Mahout > Issue Type: Improvement > Components: build > Affects Versions: 0.6 > Reporter: Roman Shaposhnik > Labels: bigtop > Attachments: MAHOUT-822-build.patch.txt, MAHOUT-822.patch, > MAHOUT-822.patch, MAHOUT-822.patch, MAHOUT-822.patch, MAHOUT-822.patch, > MAHOUT-822.patch, MAHOUT-822.patch, MAHOUT-822.patch > > > As part of the Hadoop stack integration project (Apache Bigtop) we are now > trying to compile Mahout's upcoming 0.6 release against Hadoop 0.22 and 0.23. > I'm attaching the patch to Mahout's Maven build system that made it possible. > I would also like to request help in solving the real issues that poped > up when we tried to compile Mahout: > http://bigtop01.cloudera.org:8080/job/Bigtop-hadoop22/COMPONENT=mahout,label=centos5/6/console > {noformat} > [ERROR] > /mnt/jenkins/workspace/workspace/Bigtop-hadoop22/COMPONENT/mahout/label/centos5/build/mahout/rpm/BUILD/apache-mahout-c298f70/core/src/test/java/org/apache/mahout/df/mapreduce/partial/Step0JobTest.java:[182,33] > org.apache.hadoop.mapreduce.TaskAttemptContext is abstract; cannot be > instantiated > [ERROR] > /mnt/jenkins/workspace/workspace/Bigtop-hadoop22/COMPONENT/mahout/label/centos5/build/mahout/rpm/BUILD/apache-mahout-c298f70/core/src/test/java/org/apache/mahout/df/mapreduce/partial/Step0JobTest.java:[218,9] > org.apache.mahout.df.mapreduce.partial.Step0JobTest.Step0Context is not > abstract and does not override abstract method getInputSplit() in > org.apache.hadoop.mapreduce.MapContext > [ERROR] > /mnt/jenkins/workspace/workspace/Bigtop-hadoop22/COMPONENT/mahout/label/centos5/build/mahout/rpm/BUILD/apache-mahout-c298f70/core/src/test/java/org/apache/mahout/df/mapreduce/partial/Step0JobTest.java:[229,12] > cannot find symbol > [ERROR] symbol : constructor > Context(org.apache.hadoop.conf.Configuration,org.apache.hadoop.mapreduce.TaskAttemptID,<nulltype>,<nulltype>,<nulltype>,<nulltype>,<nulltype>) > [ERROR] location: class org.apache.hadoop.mapreduce.Mapper.Context > [ERROR] > /mnt/jenkins/workspace/workspace/Bigtop-hadoop22/COMPONENT/mahout/label/centos5/build/mahout/rpm/BUILD/apache-mahout-c298f70/core/src/test/java/org/apache/mahout/common/DummyRecordWriter.java:[68,18] > org.apache.hadoop.mapreduce.Mapper.Context is abstract; cannot be > instantiated > [ERROR] > /mnt/jenkins/workspace/workspace/Bigtop-hadoop22/COMPONENT/mahout/label/centos5/build/mahout/rpm/BUILD/apache-mahout-c298f70/core/src/test/java/org/apache/mahout/common/DummyRecordWriter.java:[77,19] > org.apache.hadoop.mapreduce.Reducer.Context is abstract; cannot be > instantiated > [ERROR] > /mnt/jenkins/workspace/workspace/Bigtop-hadoop22/COMPONENT/mahout/label/centos5/build/mahout/rpm/BUILD/apache-mahout-c298f70/core/src/test/java/org/apache/mahout/df/mapreduce/partial/PartialSequentialBuilder.java:[110,30] > org.apache.hadoop.mapreduce.TaskAttemptContext is abstract; cannot be > instantiated > [ERROR] > /mnt/jenkins/workspace/workspace/Bigtop-hadoop22/COMPONENT/mahout/label/centos5/build/mahout/rpm/BUILD/apache-mahout-c298f70/core/src/test/java/org/apache/mahout/df/mapreduce/partial/PartialSequentialBuilder.java:[206,28] > org.apache.hadoop.mapreduce.JobContext is abstract; cannot be instantiated > [ERROR] > /mnt/jenkins/workspace/workspace/Bigtop-hadoop22/COMPONENT/mahout/label/centos5/build/mahout/rpm/BUILD/apache-mahout-c298f70/core/src/test/java/org/apache/mahout/df/mapreduce/partial/PartialSequentialBuilder.java:[227,30] > org.apache.hadoop.mapreduce.TaskAttemptContext is abstract; cannot be > instantiated > [ERROR] > /mnt/jenkins/workspace/workspace/Bigtop-hadoop22/COMPONENT/mahout/label/centos5/build/mahout/rpm/BUILD/apache-mahout-c298f70/core/src/test/java/org/apache/mahout/df/mapreduce/partial/MockContext.java:[30,6] > org.apache.mahout.df.mapreduce.partial.MockContext is not abstract and does > not override abstract method getInputSplit() in > org.apache.hadoop.mapreduce.MapContext > [ERROR] > /mnt/jenkins/workspace/workspace/Bigtop-hadoop22/COMPONENT/mahout/label/centos5/build/mahout/rpm/BUILD/apache-mahout-c298f70/core/src/test/java/org/apache/mahout/df/mapreduce/partial/MockContext.java:[38,10] > cannot find symbol > {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira