Hi Harsh,

Yes, I agree with you that there needs to be a way to pass args to the JVM. As for app args, it isn't strictly necessary to have another env-var for them, since users can just define them in their own scripts. However, adding the env-var you suggest, MAHOUT_APP_OPTS, would help clear up some confusion, IMO: right now, what bin/mahout puts into MAHOUT_OPTS misleads users into thinking that those settings will actually be used.
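
For reference, a rough sketch of how the launch line might look with both variables. MAHOUT_APP_OPTS is only the name proposed in this thread, not something bin/mahout reads today, and build_mahout_cmd and the sample values are made up for illustration. Placing the app options after "$@" is an assumption carried over from my original patch. The function echoes the command instead of exec'ing it so the ordering can be checked without a JVM:

```shell
#!/bin/sh
# Hypothetical sketch, not the actual bin/mahout.
# MAHOUT_OPTS     -> JVM arguments, kept before the class name (current behaviour).
# MAHOUT_APP_OPTS -> proposed variable for application arguments, after the class name.

build_mahout_cmd() {
  # Unquoted expansions are intentional: each option must become its own word.
  # In a real launcher this echo would be an exec.
  echo "$JAVA" $JAVA_HEAP_MAX $MAHOUT_OPTS -classpath "$CLASSPATH" $CLASS "$@" $MAHOUT_APP_OPTS
}

# Sample values, assumptions for the demo only.
JAVA=java
JAVA_HEAP_MAX=-Xmx3g
MAHOUT_OPTS="-Dhadoop.log.dir=/tmp/mahout-logs"
MAHOUT_APP_OPTS="-Dmapred.reduce.tasks=1 -Dio.sort.mb=1024"
CLASSPATH=/opt/mahout/conf
CLASS=org.apache.mahout.driver.MahoutDriver

build_mahout_cmd kmeans -i input
```

With this layout, existing uses of MAHOUT_OPTS (heap size, system properties) keep working as before, and only the new variable ends up in the arguments passed to the driver class.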

On Wed, Sep 4, 2013 at 1:23 AM, Harsh J <[email protected]> wrote:

> Here's what am trying to say: In most of the other projects, such as
> Hadoop, Pig, Sqoop, Flume, etc., the PROJECT_OPTS is used to specify
> "Additional JVM arguments" rather than application arguments. It has
> been the same in Mahout too, so MAHOUT_OPTS wasn't ever intended to be
> a way to pass application options/configs to the runtime, but rather
> to control heap space/system properties/etc..
>
> The change you're proposing moves it AFTER the class invocation, which
> would break other uses relying on its right use today, so instead you
> could introduce a new env-var MAHOUT_APP_OPTS which goes after the
> classname and can accept all that -D generic conf params.
>
> On Sun, Sep 1, 2013 at 4:06 AM, Mario Rodriguez <[email protected]> wrote:
> > What I'm passing in MAHOUT_OPTS are parameters of the same nature of those
> > being set in bin/mahout:
> >
> > MAHOUT_OPTS="$MAHOUT_OPTS -Dhadoop.log.dir=$MAHOUT_LOG_DIR"
> > MAHOUT_OPTS="$MAHOUT_OPTS -Dhadoop.log.file=$MAHOUT_LOGFILE"
> > MAHOUT_OPTS="$MAHOUT_OPTS -Dmapred.min.split.size=512MB"
> > MAHOUT_OPTS="$MAHOUT_OPTS -Dmapred.map.child.java.opts=-Xmx4096m"
> > MAHOUT_OPTS="$MAHOUT_OPTS -Dmapred.reduce.child.java.opts=-Xmx4096m"
> > MAHOUT_OPTS="$MAHOUT_OPTS -Dmapred.output.compress=true"
> > MAHOUT_OPTS="$MAHOUT_OPTS -Dmapred.compress.map.output=true"
> > MAHOUT_OPTS="$MAHOUT_OPTS -Dmapred.map.tasks=1"
> > MAHOUT_OPTS="$MAHOUT_OPTS -Dmapred.reduce.tasks=1"
> > MAHOUT_OPTS="$MAHOUT_OPTS -Dio.sort.factor=30"
> > MAHOUT_OPTS="$MAHOUT_OPTS -Dio.sort.mb=1024"
> > MAHOUT_OPTS="$MAHOUT_OPTS -Dio.file.buffer.size=32786"
> >
> > I have a beefy dev box, and so can afford to tune those values.
> >
> > In the current exec call, those parameters are not considered in the tasks
> > being launched by org.apache.mahout.driver.MahoutDriver.
> >
> > I can look at this in more detail when Im back in the office on monday and
> > submit a JIRA ticket and patch (depending on how involved the right fix
> > turns out to be).
> >
> > Cheers,
> >
> > Mario
> >
> >> On Sat, Aug 31, 2013 at 2:34 PM, Harsh J <[email protected]> wrote:
> >>
> >>> I don't quite know what its used for, but that order change can be
> >>> considered incompatible, mainly cause in its current form it is (and
> >>> doubles up) applying directly to the JVM that launches Mahout, but the
> >>> changed form makes it into application-only arguments.
> >>>
> >>> On Sun, Sep 1, 2013 at 1:05 AM, Gokhan Capan <[email protected]> wrote:
> >>> > Hi Mario,
> >>> >
> >>> > Could you create a JIRA ticket for that, and submit your diff as a patch if
> >>> > possible?
> >>> > http://issues.apache.org/jira/browse/MAHOUT
> >>> >
> >>> > Best,
> >>> > Gokhan
> >>> >
> >>> > On Sat, Aug 31, 2013 at 8:56 PM, Mario Rodriguez <[email protected]> wrote:
> >>> >
> >>> >> Hi everyone,
> >>> >>
> >>> >> It seems MAHOUT_OPTS is not getting picked up when running mahout locally
> >>> >> (MAHOUT_LOCAL=true). This can be fixed by switching the order in which
> >>> >> MAHOUT_OPTS is passed in bin/mahout from:
> >>> >>
> >>> >> exec "$JAVA" $JAVA_HEAP_MAX $MAHOUT_OPTS -classpath "$CLASSPATH" $CLASS "$@"
> >>> >>
> >>> >> to:
> >>> >>
> >>> >> exec "$JAVA" $JAVA_HEAP_MAX -classpath "$CLASSPATH" $CLASS "$@" $MAHOUT_OPTS
> >>> >>
> >>> >> I cant guarantee it wont break some other way of running it; it does not
> >>> >> look like it will, but I have not tested it.
> >>> >>
> >>> >> Cheers,
> >>> >>
> >>> >> Mario
> >>>
> >>> --
> >>> Harsh J
>
> --
> Harsh J
