What I'm passing in MAHOUT_OPTS are parameters of the same nature as those being set in bin/mahout:

MAHOUT_OPTS="$MAHOUT_OPTS -Dhadoop.log.dir=$MAHOUT_LOG_DIR"
MAHOUT_OPTS="$MAHOUT_OPTS -Dhadoop.log.file=$MAHOUT_LOGFILE"
MAHOUT_OPTS="$MAHOUT_OPTS -Dmapred.min.split.size=512MB"
MAHOUT_OPTS="$MAHOUT_OPTS -Dmapred.map.child.java.opts=-Xmx4096m"
MAHOUT_OPTS="$MAHOUT_OPTS -Dmapred.reduce.child.java.opts=-Xmx4096m"
MAHOUT_OPTS="$MAHOUT_OPTS -Dmapred.output.compress=true"
MAHOUT_OPTS="$MAHOUT_OPTS -Dmapred.compress.map.output=true"
MAHOUT_OPTS="$MAHOUT_OPTS -Dmapred.map.tasks=1"
MAHOUT_OPTS="$MAHOUT_OPTS -Dmapred.reduce.tasks=1"
MAHOUT_OPTS="$MAHOUT_OPTS -Dio.sort.factor=30"
MAHOUT_OPTS="$MAHOUT_OPTS -Dio.sort.mb=1024"
MAHOUT_OPTS="$MAHOUT_OPTS -Dio.file.buffer.size=32786"

I have a beefy dev box, so I can afford to tune those values. With the
current exec call, those parameters are not applied to the tasks launched
by org.apache.mahout.driver.MahoutDriver.

I can look at this in more detail when I'm back in the office on Monday and
submit a JIRA ticket and patch (depending on how involved the right fix
turns out to be).

Cheers,

Mario


> On Sat, Aug 31, 2013 at 2:34 PM, Harsh J <[email protected]> wrote:
>
>> I don't quite know what its used for, but that order change can be
>> considered incompatible, mainly cause in its current form it is (and
>> doubles up) applying directly to the JVM that launches Mahout, but the
>> changed form makes it into application-only arguments.
>>
>> On Sun, Sep 1, 2013 at 1:05 AM, Gokhan Capan <[email protected]> wrote:
>> > Hi Mario,
>> >
>> > Could you create a JIRA ticket for that, and submit your diff as a
>> > patch if possible?
>> > http://issues.apache.org/jira/browse/MAHOUT
>> >
>> > Best,
>> > Gokhan
>> >
>> >
>> > On Sat, Aug 31, 2013 at 8:56 PM, Mario Rodriguez <[email protected]> wrote:
>> >
>> >> Hi everyone,
>> >>
>> >> It seems MAHOUT_OPTS is not getting picked up when running mahout
>> >> locally (MAHOUT_LOCAL=true).
>> >> This can be fixed by switching the order in which MAHOUT_OPTS is
>> >> passed in bin/mahout from:
>> >>
>> >> exec "$JAVA" $JAVA_HEAP_MAX $MAHOUT_OPTS -classpath "$CLASSPATH" $CLASS "$@"
>> >>
>> >> to:
>> >>
>> >> exec "$JAVA" $JAVA_HEAP_MAX -classpath "$CLASSPATH" $CLASS "$@" $MAHOUT_OPTS
>> >>
>> >>
>> >> I cant guarantee it wont break some other way of running it; it does
>> >> not look like it will, but I have not tested it.
>> >>
>> >> Cheers,
>> >>
>> >> Mario
>> >>
>>
>> --
>> Harsh J
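
For anyone following the ordering discussion in this thread, here is a rough sketch of why the position of MAHOUT_OPTS matters. It does not run Java or Mahout. classify_args is a hypothetical helper (not part of bin/mahout) that mimics, in simplified form, how the java launcher splits its command line: everything before the main class is treated as a JVM option, and everything after it is handed to the application's main(). The "cp" classpath value and the "seqdirectory" program argument are placeholders chosen for illustration.

```shell
#!/bin/sh
# Hypothetical helper: labels each argument "jvm" or "app" depending on
# whether it appears before or after the main class name. This is a
# simplification of what the java launcher does, not Mahout code.
# The body runs in a subshell so its variables do not leak.
classify_args() (
  main_class="$1"
  shift
  target="jvm"
  for arg in "$@"; do
    if [ "$arg" = "$main_class" ]; then
      target="app"   # everything after the class name goes to main()
      continue
    fi
    printf '%s: %s\n' "$target" "$arg"
  done
)

MAHOUT_OPTS="-Dmapred.reduce.tasks=1"
CLASS="org.apache.mahout.driver.MahoutDriver"

# MAHOUT_OPTS is left unquoted on purpose so it word-splits,
# the same way it does in the exec lines quoted above.
echo "Current order (MAHOUT_OPTS before the class):"
classify_args "$CLASS" $MAHOUT_OPTS -classpath "cp" "$CLASS" "seqdirectory"

echo "Proposed order (MAHOUT_OPTS after the program arguments):"
classify_args "$CLASS" -classpath "cp" "$CLASS" "seqdirectory" $MAHOUT_OPTS
```

In the first call the -D option is labeled "jvm", in the second it is labeled "app". That is the incompatibility Harsh describes: after the change the values would only be visible to code that parses the application arguments, and no longer to the JVM that launches Mahout.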
