Hi Harsh,

Yes, I agree with you that there needs to be a way to pass args to the JVM.
As for app args, technically it isn't necessary to have another env-var
for them, since users can just define them in their own scripts.  However,
adding the env-var you suggest, MAHOUT_APP_OPTS, would help clear up some
confusion, IMO, as right now, what is set in bin/mahout for MAHOUT_OPTS
misleads users into thinking that those settings will actually be used.
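
To make the ordering concrete, here is a rough sketch of what I understand
the proposal to be. It is not the actual bin/mahout code: MAHOUT_APP_OPTS
is only the name suggested in this thread, build_cmd is a made-up helper,
and the heap size, classpath and option values are placeholders. I put
MAHOUT_APP_OPTS after "$@" to mirror the reordered exec line from my first
mail; whether it belongs there or right after the class name is still open.

```shell
#!/bin/sh
# Sketch only. MAHOUT_APP_OPTS is the env-var proposed in this thread and is
# not read by bin/mahout today; all values below are placeholders.
JAVA="java"
JAVA_HEAP_MAX="-Xmx3g"
CLASSPATH="."
CLASS="org.apache.mahout.driver.MahoutDriver"

# JVM-level settings (heap, system properties) stay before the class name.
MAHOUT_OPTS="-Dhadoop.log.dir=/tmp/mahout-logs"

# Application-level -D conf params would be passed after the class name.
MAHOUT_APP_OPTS="-Dmapred.reduce.tasks=1"

# Hypothetical helper: prints the command line instead of exec'ing it,
# so the position of each variable is easy to see.
build_cmd() {
  echo "$JAVA" $JAVA_HEAP_MAX $MAHOUT_OPTS -classpath "$CLASSPATH" $CLASS "$@" $MAHOUT_APP_OPTS
}

build_cmd kmeans
```

With something like this, existing uses of MAHOUT_OPTS keep their meaning,
and the job-level settings get a separate, clearly named place.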


On Wed, Sep 4, 2013 at 1:23 AM, Harsh J <[email protected]> wrote:

> Here's what I am trying to say: In most of the other projects, such as
> Hadoop, Pig, Sqoop, Flume, etc., the PROJECT_OPTS is used to specify
> "Additional JVM arguments" rather than application arguments. It has
> been the same in Mahout too, so MAHOUT_OPTS wasn't ever intended to be
> a way to pass application options/configs to the runtime, but rather
> to control heap space/system properties/etc.
>
> The change you're proposing moves it AFTER the class invocation, which
> would break other uses relying on its correct use today, so instead you
> could introduce a new env-var MAHOUT_APP_OPTS which goes after the
> classname and can accept all those -D generic conf params.
>
> On Sun, Sep 1, 2013 at 4:06 AM, Mario Rodriguez <[email protected]>
> wrote:
> > What I'm passing in MAHOUT_OPTS are parameters of the same nature as
> > those being set in bin/mahout:
> >
> > MAHOUT_OPTS="$MAHOUT_OPTS -Dhadoop.log.dir=$MAHOUT_LOG_DIR"
> > MAHOUT_OPTS="$MAHOUT_OPTS -Dhadoop.log.file=$MAHOUT_LOGFILE"
> > MAHOUT_OPTS="$MAHOUT_OPTS -Dmapred.min.split.size=512MB"
> > MAHOUT_OPTS="$MAHOUT_OPTS -Dmapred.map.child.java.opts=-Xmx4096m"
> > MAHOUT_OPTS="$MAHOUT_OPTS -Dmapred.reduce.child.java.opts=-Xmx4096m"
> > MAHOUT_OPTS="$MAHOUT_OPTS -Dmapred.output.compress=true"
> > MAHOUT_OPTS="$MAHOUT_OPTS -Dmapred.compress.map.output=true"
> > MAHOUT_OPTS="$MAHOUT_OPTS -Dmapred.map.tasks=1"
> > MAHOUT_OPTS="$MAHOUT_OPTS -Dmapred.reduce.tasks=1"
> > MAHOUT_OPTS="$MAHOUT_OPTS -Dio.sort.factor=30"
> > MAHOUT_OPTS="$MAHOUT_OPTS -Dio.sort.mb=1024"
> > MAHOUT_OPTS="$MAHOUT_OPTS -Dio.file.buffer.size=32786"
> >
> >
> > I have a beefy dev box, and so can afford to tune those values.
> >
> > In the current exec call, those parameters are not considered in the
> > tasks being launched by org.apache.mahout.driver.MahoutDriver.
> >
> > I can look at this in more detail when I'm back in the office on Monday
> > and submit a JIRA ticket and patch (depending on how involved the right
> > fix turns out to be).
> >
> > Cheers,
> >
> > Mario
> >
> >>
> >>
> >> On Sat, Aug 31, 2013 at 2:34 PM, Harsh J <[email protected]> wrote:
> >>
> >>> I don't quite know what it's used for, but that order change can be
> >>> considered incompatible, mainly because in its current form it is (and
> >>> doubles up) applying directly to the JVM that launches Mahout, but the
> >>> changed form turns it into application-only arguments.
> >>>
> >>> On Sun, Sep 1, 2013 at 1:05 AM, Gokhan Capan <[email protected]> wrote:
> >>> > Hi Mario,
> >>> >
> >>> > Could you create a JIRA ticket for that, and submit your diff as a
> >>> > patch if possible?
> >>> > http://issues.apache.org/jira/browse/MAHOUT
> >>> >
> >>> > Best,
> >>> > Gokhan
> >>> >
> >>> >
> >>> > On Sat, Aug 31, 2013 at 8:56 PM, Mario Rodriguez <[email protected]> wrote:
> >>> >
> >>> >> Hi everyone,
> >>> >>
> >>> >> It seems MAHOUT_OPTS is not getting picked up when running mahout
> >>> >> locally (MAHOUT_LOCAL=true).  This can be fixed by switching the order
> >>> >> in which MAHOUT_OPTS is passed in bin/mahout from:
> >>> >>
> >>> >> exec "$JAVA" $JAVA_HEAP_MAX $MAHOUT_OPTS -classpath "$CLASSPATH" $CLASS "$@"
> >>> >>
> >>> >> to:
> >>> >>
> >>> >> exec "$JAVA" $JAVA_HEAP_MAX -classpath "$CLASSPATH" $CLASS "$@" $MAHOUT_OPTS
> >>> >>
> >>> >>
> >>> >> I can't guarantee it won't break some other way of running it; it
> >>> >> does not look like it will, but I have not tested it.
> >>> >>
> >>> >> Cheers,
> >>> >>
> >>> >> Mario
> >>> >>
> >>>
> >>>
> >>>
> >>> --
> >>> Harsh J
> >>>
> >>
> >>
>
>
>
> --
> Harsh J
>
