[
https://issues.apache.org/jira/browse/MAHOUT-502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Joe Prasanna Kumar updated MAHOUT-502:
--------------------------------------
Attachment: MAHOUT-502.patch
The patch modifies org.apache.mahout.common.CommandLineUtil to add a footer note to the help output.
> Adding footer note to command line utility
> ------------------------------------------
>
> Key: MAHOUT-502
> URL: https://issues.apache.org/jira/browse/MAHOUT-502
> Project: Mahout
> Issue Type: Improvement
> Components: Utils
> Reporter: Joe Prasanna Kumar
> Priority: Trivial
> Attachments: MAHOUT-502.patch
>
>
> Hi all,
> Since ClusterDumper doesn't seem to have detailed documentation, I just created
> a page: https://cwiki.apache.org/confluence/display/MAHOUT/Cluster+Dumper
> While playing around with the clusterdump utility, I learned that it can be run
> on Hadoop or as a standalone Java program.
> As most of you are aware, when it is executed on Hadoop, seqFileDir and
> pointsDir must be HDFS locations; otherwise they must be local filesystem paths.
> Since some of the clustering-related wiki pages say that we can get the
> output from HDFS and then run clusterdump, I had assumed that
> clusterdump would always read data from the local FS.
> I am not sure whether newcomers would make the same assumption, so I was
> wondering whether we should make this explicit by changing the help text of
> clusterdump.
> Currently ClusterDumper.java has:
> addOption(SEQ_FILE_DIR_OPTION, "s",
>     "The directory containing Sequence Files for the Clusters", true);
> Should we specify something like
> addOption(SEQ_FILE_DIR_OPTION, "s",
>     "The directory (HDFS if using Hadoop / Local filesystem if on standalone mode) "
>     + "containing Sequence Files for the Clusters", true);
> and so on.
> The problem with this approach is that it's repetitive: we'd need to change
> the text in quite a few places (I believe vectordump follows the same principle).
> Or
> should we modify CommandLineUtil to add a generic message to the help output,
> stating that when running on Hadoop the directories should
> reference HDFS locations, and otherwise the local FS?
> How about adding it to the footer like
> formatter.setFooter("Specify HDFS directories while running hadoop; "
>     + "else specify local File System directories");
> formatter.printFooter();
> I'd appreciate your feedback and thoughts.
> Thanks,
> Joe.
> from Jeff Eastman <[email protected]>
> reply-to [email protected]
> to [email protected]
> date Fri, Sep 3, 2010 at 2:45 PM
> subject Re: ClusterDumper - Hadoop or standalone ?
> mailed-by mahout.apache.org
> +1 to generic message approach
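
For illustration only, the generic-footer approach discussed above could be sketched as follows. This is a minimal, dependency-free sketch and not the attached patch: HelpFooterSketch and renderHelp are hypothetical names, the pointsDir description is made-up wording, and the real change would presumably go through the help formatter already used by CommandLineUtil. The point is that the filesystem note lives in one place and is printed once after all option descriptions, so individual option texts stay unchanged.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a shared help printer; NOT Mahout's CommandLineUtil.
final class HelpFooterSketch {

    // Footer wording taken from the proposal in this issue.
    static final String FS_FOOTER =
        "Specify HDFS directories while running hadoop; "
        + "else specify local File System directories";

    // Renders one line per option, then the shared footer exactly once.
    static String renderHelp(Map<String, String> options, String footer) {
        StringBuilder sb = new StringBuilder("Options:\n");
        for (Map.Entry<String, String> e : options.entrySet()) {
            sb.append("  --").append(e.getKey())
              .append("  ").append(e.getValue()).append('\n');
        }
        sb.append('\n').append(footer).append('\n');
        return sb.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the options in insertion order.
        Map<String, String> options = new LinkedHashMap<>();
        options.put("seqFileDir",
            "The directory containing Sequence Files for the Clusters");
        // Illustrative description only; not the actual ClusterDumper text.
        options.put("pointsDir",
            "The directory containing the clustered points");
        System.out.print(renderHelp(options, FS_FOOTER));
    }
}
```

Any utility that prints its help through the same routine (clusterdump, vectordump, and so on) would then pick up the note without edits to its own option descriptions.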
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.