Hi all,
https://issues.apache.org/jira/browse/MAHOUT-1309
https://issues.apache.org/jira/browse/MAHOUT-1310
https://issues.apache.org/jira/browse/MAHOUT-1311
These tickets are part of porting Mahout to Windows. After this change,
Mahout must compile and build, and all Mahout example scripts must run
without exceptions on Windows.
To summarize the general progress for Mahout on Windows, we have worked on
the following tasks:
- Ported the *.sh scripts to cmd scripts
- Fixed all failing unit tests to achieve a 100% pass rate (results are
  visible on Jenkins)
- Created an install/uninstall script for Mahout and integrated it into
  the HDP MSI installer
- Tested the Mahout installation and ran regression tests on a local
  machine
- Merged all changes from the 0.9 branch to 0.7
- Helped configure the system test runs on Jenkins
- Completed corrections to the system tests and updates to the system
  code to achieve a 100% pass rate (results are visible on Jenkins)
Regarding the list of changes:
Product code changes:
- Changed hadoop-core to version 1.2 for Windows.
- Added the following dependencies: commons-cli:commons-cli:1.2,
  org.apache.commons:commons-math:2.1, commons-lang:commons-lang:2.4,
  commons-configuration:commons-configuration:1.9,
  commons-httpclient:commons-httpclient:3.1, commons-io:commons-io:2.4,
  com.google.guava:guava:r09, org.uncommons.maths:uncommons-maths:1.2.2.
  These dependencies are required to run Mahout on Windows and to pass
  the unit tests.
- Added install scripts for Mahout
- Ported bin/mahout, examples/bin/build-20news-bayes,
  examples/bin/build-cluster-syntheticcontrol, examples/bin/build-reuters
  and examples/bin/classify-20newsgroups from shell to cmd scripts
- Added a winpkg module that packages Mahout and the installation scripts
  into an archive ready for use in the HDP MSI
- Added the com.google.code.maven-replacer-plugin:replacer:1.5.2 plugin to
  the winpkg assembly; it sets the correct Mahout version in the
  installation script.
- In the TrainNewsGroups and SGDHelper classes, changed the model saving
  path from /tmp/ to C:/tmp/
- Added a Mahout smoke test that performs the following checks:
  - Checks for the presence of the Mahout system variable
  - Checks for the presence of all jars required for it to work correctly
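
For anyone who wants to reproduce the smoke test idea locally, here is a minimal sketch of the two checks in Python. It is not the script from the patch: the variable name MAHOUT_HOME, the helper names and the jar patterns in the usage note are assumptions made for illustration only.

```python
import os
from pathlib import Path


def missing_jars(mahout_home, required_patterns):
    """Return the glob patterns that match no file directly under mahout_home."""
    home = Path(mahout_home)
    return [p for p in required_patterns if not any(home.glob(p))]


def smoke_check(env=None, required_patterns=(), var_name="MAHOUT_HOME"):
    """Return a list of problems; an empty list means both checks passed.

    var_name is a hypothetical name for the "Mahout system variable".
    """
    env = os.environ if env is None else env
    mahout_home = env.get(var_name)
    # Check 1: the system variable must be present.
    if not mahout_home:
        return [f"environment variable {var_name} is not set"]
    if not Path(mahout_home).is_dir():
        return [f"{var_name} points to a missing directory: {mahout_home}"]
    # Check 2: every required jar pattern must match at least one file.
    return [f"no jar matching {p}" for p in missing_jars(mahout_home, required_patterns)]
```

Calling smoke_check(required_patterns=["mahout-core-*.jar", "mahout-math-*.jar"]) on an installed machine would return an empty list if everything is in place; the patterns here are only examples, the real list of required jars is in the patch.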
System test modifications:
- test_recommendation.py
  - Changed the separator used when creating and parsing strings from
    \r\n to \n
- test_classification.py
  - Changed the path for reading models from /tmp/news-group.model to
    Machine.getTempDir()/news-group.model
  - Added removal of files generated during a previous test run
- test_clustering.py
  - Added copying of the Reuters data to HDFS after extracting it.
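
To illustrate the kind of portability fixes made in the system tests, here is a rough Python sketch. It is not the code from the patch: tempfile.gettempdir() stands in for the test framework's Machine.getTempDir(), all function names are made up for this example, and split_records accepts both line endings, which is a more tolerant variant than simply switching the separator to \n.

```python
import os
import tempfile


def model_path(file_name="news-group.model", temp_dir=None):
    """Build the model path under the platform temp directory instead of /tmp."""
    # tempfile.gettempdir() is a stand-in for Machine.getTempDir() (assumption).
    return os.path.join(temp_dir or tempfile.gettempdir(), file_name)


def remove_stale(paths):
    """Delete files left over from a previous run; return the paths removed."""
    removed = []
    for path in paths:
        if os.path.isfile(path):
            os.remove(path)
            removed.append(path)
    return removed


def split_records(text):
    """Split command output into lines, accepting both CRLF and LF endings."""
    return text.replace("\r\n", "\n").strip("\n").split("\n")
```

A test could call remove_stale([model_path()]) in its setup so that a model left behind by an earlier run cannot hide a failure in the current one.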