I've made changes (patch in MAHOUT-167e.patch) to migrate the WikipediaDatasetCreatorEtc to 0.20.2; the changes compile and the existing unit tests all run. But I had to port new 0.20 versions of MultipleOutputFormat and MultipleTextOutputFormat to do this, and there are no unit tests for any of the Wikipedia code in this package. Further, the code snippets to run the full example in the wiki (https://cwiki.apache.org/MAHOUT/wikipediabayesexample.html) are obsolete, and build-deprecated.xml is no longer in trunk. This makes verifying the correctness of my port pretty difficult, for me at least, since this is all unfamiliar code. What shall I do?

A. commit it, since the unit tests all run, and hope somebody else will verify the example
B. get help to run the example to verify it is correct, then commit it
C. leave the patch in jira and move on to utils

I'm loath to do A and would prefer to do B; however, C is what I'm going to have to do in the short term due to my schedule.

Jeff
