[ https://issues.apache.org/jira/browse/MAHOUT-301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12837345#action_12837345 ]
Jake Mannix commented on MAHOUT-301:
------------------------------------

Hey Drew, thanks for looking at this. Problems you saw are probably what are known as "bugs". :)

{quote}
Did some testing, here's a patch to clean some of these things up + a couple questions:

Could we load the default driver.classes.props from the classpath? If it was loaded that way the default would work regardless of where the mahout script is run from (it currently only works if ./bin/mahout is run, not ./mahout for example) and regardless of whether we're running from a binary release or the dev environment. (included in patch)
{quote}

YES! We should indeed load from classpath. My most recent version of this patch (which isn't posted because it conflicts with yours; I'm trying to resolve that now) changes it so that you just supply a single directory in which the driver.classes.props and shortNames.props files are located.

{quote}
Something else I noticed is that the 'mahout' script doesn't add the classes in $MAHOUT_HOME/lib/*.jar to the classpath. This breaks the binary release in that it can't run anything, e.g.:

./mahout vectordump
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/commons/cli2/OptionException
Caused by: java.lang.ClassNotFoundException: org.apache.commons.cli2.OptionException

(fixed in patch)
{quote}

This wasn't a problem with my patch, right? That was an issue with the mahout script in trunk itself?

{quote}
Using -core in the context of a dev build should work properly, but leaving out -core will cause the script to error unless run in the context of a release - this is the way it should work, right?
{quote}

What is the -core option for? I've never used it. How does it work?

{quote}
Also added a help message for the 'run' argument.
{quote}

Where did you add that?

{quote}
Does executing './mahout run --help' hang for anyone else or is it something specific to my environment? (didn't track this one down)
{quote}

I didn't have the --help option in there; you added it. Do you know where it's hanging?

> Improve command-line shell script by allowing default properties files
> ----------------------------------------------------------------------
>
>                 Key: MAHOUT-301
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-301
>             Project: Mahout
>          Issue Type: New Feature
>          Components: Utils
>    Affects Versions: 0.3
>            Reporter: Jake Mannix
>            Assignee: Jake Mannix
>            Priority: Minor
>             Fix For: 0.4
>
>         Attachments: MAHOUT-301-drew.patch, MAHOUT-301.patch, MAHOUT-301.patch, MAHOUT-301.patch
>
>
> Snippet from javadoc gives the idea:
> {code}
> /**
>  * General-purpose driver class for Mahout programs. Utilizes org.apache.hadoop.util.ProgramDriver to run
>  * main methods of other classes, but first loads up default properties from a properties file.
>  *
>  * Usage: run on Hadoop like so:
>  *
>  * $HADOOP_HOME/bin/hadoop -jar path/to/job org.apache.mahout.driver.MahoutDriver [classes.props file] shortJobName \
>  *   [default.props file for this class] [over-ride options, all specified in long form: --input, --jarFile, etc]
>  *
>  * TODO: set the Main-Class to just be MahoutDriver, so that this option isn't needed?
>  *
>  * (note: using the current shell script, this could be modified to be just
>  * $MAHOUT_HOME/bin/mahout [classes.props file] shortJobName [default.props file] [over-ride options]
>  * )
>  *
>  * Works like this: by default, the file "core/src/main/resources/driver.classes.props" is loaded, which
>  * defines a mapping between short names like "VectorDumper" and fully qualified class names. This file may
>  * instead be overridden on the command line by having the first argument be some string of the form *classes.props.
>  *
>  * The next argument to the Driver is supposed to be the short name of the class to be run (as defined in the
>  * driver.classes.props file).
>  * After this, if the next argument ends in ".props" / ".properties", it is taken to
>  * be the file to use as the default properties file for this execution, and key-value pairs are built up from that:
>  * if the file contains
>  *
>  * input=/path/to/my/input
>  * output=/path/to/my/output
>  *
>  * Then the class which will be run will have its main called with
>  *
>  * main(new String[] { "--input", "/path/to/my/input", "--output", "/path/to/my/output" });
>  *
>  * After all the "default" properties are loaded from the file, any further command-line arguments are taken in,
>  * and over-ride the defaults.
>  */
> {code}
> Could be cleaned up, as it's kinda ugly with the whole "file named in .props", but gives the idea. Really helps cut down on repetitive long command lines, and also lets defaults be put in props files instead of being locked into the code.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
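
For illustration only, the props-to-arguments behaviour described in the javadoc (defaults read from a properties file, turned into long-form options, then overridden by anything given on the command line) could look roughly like the sketch below. This is not the code from any of the attached patches: the class name PropsToArgs and the method buildArgs are hypothetical, and the sketch assumes that every override arrives as a "--name value" pair, so flags without a value are not handled. Keys are sorted only to make the output order predictable.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import java.util.TreeSet;

// Hypothetical helper, not part of Mahout: shows one way defaults and overrides could be merged.
final class PropsToArgs {

  /** Flattens default properties into "--key value" pairs, letting command-line pairs win. */
  static String[] buildArgs(Properties defaults, String[] overrides) {
    Map<String, String> merged = new LinkedHashMap<>();

    // Properties has no defined iteration order, so sort the keys for a stable result.
    for (String key : new TreeSet<>(defaults.stringPropertyNames())) {
      merged.put("--" + key, defaults.getProperty(key));
    }

    // Assumption: overrides are well-formed "--name value" pairs; a trailing lone flag is ignored.
    for (int i = 0; i + 1 < overrides.length; i += 2) {
      merged.put(overrides[i], overrides[i + 1]);
    }

    List<String> args = new ArrayList<>();
    for (Map.Entry<String, String> entry : merged.entrySet()) {
      args.add(entry.getKey());
      args.add(entry.getValue());
    }
    return args.toArray(new String[0]);
  }

  public static void main(String[] argv) throws IOException {
    // Stand-in for the contents of a default .props file.
    Properties defaults = new Properties();
    defaults.load(new StringReader("input=/path/to/my/input\noutput=/path/to/my/output\n"));

    // One option overridden on the command line.
    String[] args = buildArgs(defaults, new String[] {"--output", "/tmp/other"});
    System.out.println(String.join(" ", args));
    // prints: --input /path/to/my/input --output /tmp/other
  }
}
```

Keeping the merge in a map keyed by option name means an override replaces the default in place instead of appending a second, conflicting "--output" to the argument list.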