[
https://issues.apache.org/jira/browse/MAHOUT-651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13015548#comment-13015548
]
Sean Owen commented on MAHOUT-651:
----------------------------------
Completely agree. It's likely a source of subtle bugs. I adjusted a lot of
methods recently along these lines. I'll try to commit your patch shortly.
> Pass hadoop configuration to methods that use FileSystem operations, even if
> they don't invoke map/reduce jobs
> --------------------------------------------------------------------------------------------------------------
>
> Key: MAHOUT-651
> URL: https://issues.apache.org/jira/browse/MAHOUT-651
> Project: Mahout
> Issue Type: Improvement
> Components: Clustering
> Affects Versions: 0.4
> Reporter: Robert Mahfoud
> Fix For: 0.5
>
> Attachments: patch-mahout-651.txt
>
>
> Some classes in the Clustering component use Hadoop's FileSystem class
> internally, but they instantiate the Hadoop configuration locally in the
> method using {{new Configuration()}}. This limits the ability to
> integrate these tools within applications that manage and enrich their own
> configuration rather than rely on the default Hadoop resources that get
> loaded when calling {{new Configuration()}}.
> The fix is simply to make these methods take a {{Configuration}} parameter
> rather than creating a new instance when needed. An example of a method that
> creates a new {{Configuration}} instance is:
> {{org.apache.mahout.clustering.kmeans.KMeansUtil.configureWithClusterInfo(Path,
> List<Cluster>)}}
> This problem could also exist beyond the Clustering module, but this issue
> only addresses the Clustering code.
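
For illustration only, here is a minimal, dependency-free sketch of the
before/after pattern the issue describes. It is not the attached patch and
does not use the real Hadoop API: the Conf class is a hypothetical stand-in
for org.apache.hadoop.conf.Configuration, and the method names and the
hdfs://namenode:8020 URI are made up for the example. Only the key
fs.default.name is taken from Hadoop.

```java
import java.util.HashMap;
import java.util.Map;

public class ConfigPassingSketch {

  /** Hypothetical stand-in for org.apache.hadoop.conf.Configuration. */
  public static class Conf {
    private final Map<String, String> values = new HashMap<>();

    public Conf() {
      // Mimics loading the default resources: a fresh instance
      // only knows the built-in default file system.
      values.put("fs.default.name", "file:///");
    }

    public void set(String key, String value) {
      values.put(key, value);
    }

    public String get(String key) {
      return values.get(key);
    }
  }

  // Before: the method builds its own configuration, so anything the
  // calling application has set on its own instance is invisible here.
  public static String resolveFileSystemBefore() {
    Conf conf = new Conf();
    return conf.get("fs.default.name");
  }

  // After: the caller passes its configuration in, so its settings are used.
  public static String resolveFileSystemAfter(Conf conf) {
    return conf.get("fs.default.name");
  }

  public static void main(String[] args) {
    // The application enriches its own configuration (hypothetical URI).
    Conf appConf = new Conf();
    appConf.set("fs.default.name", "hdfs://namenode:8020");

    System.out.println(resolveFileSystemBefore());        // prints file:///
    System.out.println(resolveFileSystemAfter(appConf));  // prints hdfs://namenode:8020
  }
}
```

The "before" variant silently falls back to the defaults, which is the kind
of subtle bug mentioned in the comment above. The "after" variant leaves the
choice of configuration entirely to the caller.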