Jenkins build is unstable: Mahout-Quality #1405
See https://builds.apache.org/job/Mahout-Quality/1405/changes
[jira] [Commented] (MAHOUT-981) Refactor KMeans Clustering into a separate post process with outlier pruning
[ https://issues.apache.org/jira/browse/MAHOUT-981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13234184#comment-13234184 ]

Hudson commented on MAHOUT-981:

Integrated in Mahout-Quality #1405 (See [https://builds.apache.org/job/Mahout-Quality/1405/])
Mahout-981, Fixing test cases which are keeping clusters-*-final in the same directory for canopy and kmeans. (Revision 1303282)

Result = SUCCESS

Refactor KMeans Clustering into a separate post process with outlier pruning

Key: MAHOUT-981
URL: https://issues.apache.org/jira/browse/MAHOUT-981
Project: Mahout
Issue Type: Sub-task
Components: Classification, Clustering
Affects Versions: 0.6
Reporter: Paritosh Ranjan
Assignee: Paritosh Ranjan
Labels: classification, clustering
Fix For: 0.7
Attachments: MAHOUT-981.txt

Use ClusterClassificationDriver to refactor clustering out of KMeansDriver with outlier pruning support.

--
This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (MAHOUT-994) mahout script shouldn't rely on HADOOP_HOME since that was deprecated in all major Hadoop branches
mahout script shouldn't rely on HADOOP_HOME since that was deprecated in all major Hadoop branches

Key: MAHOUT-994
URL: https://issues.apache.org/jira/browse/MAHOUT-994
Project: Mahout
Issue Type: Bug
Components: Integration
Affects Versions: 0.6
Reporter: Roman Shaposhnik

Mahout should follow the Pig and Hive example and not rely explicitly on HADOOP_HOME and HADOOP_CONF_DIR.
[jira] [Commented] (MAHOUT-994) mahout script shouldn't rely on HADOOP_HOME since that was deprecated in all major Hadoop branches
[ https://issues.apache.org/jira/browse/MAHOUT-994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13234521#comment-13234521 ]

Dmitriy Lyubimov commented on MAHOUT-994:

What should it rely on in new Hadoop branches to find the Hadoop client libraries and settings?
[jira] [Issue Comment Edited] (MAHOUT-994) mahout script shouldn't rely on HADOOP_HOME since that was deprecated in all major Hadoop branches
[ https://issues.apache.org/jira/browse/MAHOUT-994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13234521#comment-13234521 ]

Dmitriy Lyubimov edited comment on MAHOUT-994 at 3/21/12 5:13 PM:

What should it rely on in new Hadoop branches to find the Hadoop client libraries and settings? Could you please describe the solution?

was (Author: dlyubimov):
What should it rely on in new Hadoop branches to find the Hadoop client libraries and settings?
MatrixVectorView Considered Harmful
Can anyone tell me what the following implementation choice will cause:

    public class MatrixVectorView extends AbstractVector {
      // ...
      public Iterator<Element> iterateNonZero() {
        return iterator();
      }
      // ...
    }

Note that MatrixVectorView is returned from every call to viewRow(), and getRow() was removed in the last release.

--
-jake
Re: MatrixVectorView Considered Harmful
This causes implementations that don't override that method to lose the benefits of sparsity when iterating through rows: iterateNonZero() visits every element, zeros included, instead of only the stored non-zero entries.

I deduce from the existence of your email that important sparse matrix implementations suffer from this defect.

On Wed, Mar 21, 2012 at 10:32 PM, Jake Mannix jake.man...@gmail.com wrote:

> Can anyone tell me what the following implementation choice will cause:
>
>     public class MatrixVectorView extends AbstractVector {
>       // ...
>       public Iterator<Element> iterateNonZero() {
>         return iterator();
>       }
>       // ...
>     }
>
> Note that MatrixVectorView is returned from every call to viewRow(), and getRow() was removed in the last release.
>
> --
> -jake
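To make the cost concrete, here is a minimal, self-contained Java sketch of the effect described in this thread. It does not use Mahout's classes: `SparseRow`, `NaiveRowView`, `SparseAwareRowView`, and `SparsityDemo` are hypothetical stand-ins, and the iterators yield plain indices rather than Mahout `Element` objects. The only point it illustrates is that a view whose `iterateNonZero()` delegates to `iterator()` walks every position in the row, while a view that delegates to the underlying sparse storage walks only the stored entries.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.TreeMap;

// Hypothetical sparse row: stores only non-zero entries. NOT Mahout's Vector API.
class SparseRow {
    private final int size;
    private final TreeMap<Integer, Double> nonZeros = new TreeMap<>();

    SparseRow(int size) { this.size = size; }

    void set(int index, double value) {
        if (value == 0.0) nonZeros.remove(index); else nonZeros.put(index, value);
    }

    // Dense iteration: yields every index from 0 to size - 1, zeros included.
    Iterator<Integer> iterator() {
        return new Iterator<Integer>() {
            private int next = 0;
            @Override public boolean hasNext() { return next < size; }
            @Override public Integer next() {
                if (next >= size) throw new NoSuchElementException();
                return next++;
            }
        };
    }

    // Sparse iteration: yields only the indices that hold a non-zero value.
    Iterator<Integer> iterateNonZero() { return nonZeros.keySet().iterator(); }
}

// Mirrors the pattern quoted in the thread: iterateNonZero() falls back to iterator().
class NaiveRowView {
    private final SparseRow row;
    NaiveRowView(SparseRow row) { this.row = row; }
    Iterator<Integer> iterator() { return row.iterator(); }
    Iterator<Integer> iterateNonZero() { return iterator(); }
}

// Alternative (an assumption, not the project's actual fix): delegate to the sparse storage.
class SparseAwareRowView {
    private final SparseRow row;
    SparseAwareRowView(SparseRow row) { this.row = row; }
    Iterator<Integer> iterateNonZero() { return row.iterateNonZero(); }
}

class SparsityDemo {
    // Counts how many elements an iterator yields.
    static int countSteps(Iterator<Integer> it) {
        int steps = 0;
        while (it.hasNext()) { it.next(); steps++; }
        return steps;
    }

    public static void main(String[] args) {
        SparseRow row = new SparseRow(1_000_000);
        row.set(3, 1.0);
        row.set(500_000, 2.0);
        System.out.println(countSteps(new NaiveRowView(row).iterateNonZero()));       // prints 1000000
        System.out.println(countSteps(new SparseAwareRowView(row).iterateNonZero())); // prints 2
    }
}
```

With two non-zero entries in a row of one million columns, the naive view does one million iteration steps per row where two would suffice, so any algorithm that loops over rows via viewRow() scales with the matrix width instead of the number of non-zeros.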
Jenkins build is still unstable: Mahout-Quality #1406
See https://builds.apache.org/job/Mahout-Quality/changes
[jira] [Commented] (MAHOUT-984) Refactor Fuzzy K Means Clustering into a separate post process with outlier pruning
[ https://issues.apache.org/jira/browse/MAHOUT-984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13235364#comment-13235364 ]

Saikat Kanjilal commented on MAHOUT-984:

Paritosh, I'm running into a strange issue. I've refactored FuzzyKMeansDriver along the lines of KMeansDriver to use FuzzyKMeansClusteringPolicy, with the other logic pretty much the same. The unit test for FuzzyKMeansDriver passes when run individually, but fails when I run all the unit tests together. I am attaching the clusterData function here; any ideas on this?

Regards

    public static void clusterData(Path input, Path clustersIn, Path output,
        DistanceMeasure measure, double convergenceDelta, float m,
        boolean emitMostLikely, double threshold, boolean runSequential)
        throws IOException, ClassNotFoundException, InterruptedException {
      if (log.isInfoEnabled()) {
        log.info("Running Clustering");
        log.info("Input: {} Clusters In: {} Out: {} Distance: {}",
            new Object[] {input, clustersIn, output, measure});
      }
      // Write the fuzzy k-means policy next to the input clusters, then classify.
      ClusterClassifier.writePolicy(
          new FuzzyKMeansClusteringPolicy((double) m, convergenceDelta), clustersIn);
      ClusterClassificationDriver.run(input, output,
          new Path(output, CLUSTERED_POINTS_DIRECTORY), threshold, true, runSequential);
    }

Refactor Fuzzy K Means Clustering into a separate post process with outlier pruning

Key: MAHOUT-984
URL: https://issues.apache.org/jira/browse/MAHOUT-984
Project: Mahout
Issue Type: Sub-task
Components: Clustering
Affects Versions: 0.6
Reporter: Paritosh Ranjan
Assignee: Paritosh Ranjan
Labels: clustering
Fix For: 0.7

Use ClusterClassificationDriver to refactor clustering out of FuzzyKMeansDriver with outlier pruning support.
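One possible cause, not confirmed anywhere in this thread, of a test that passes alone but fails in the full suite is several tests writing to the same output directory. The MAHOUT-981 commit above (Revision 1303282) fixed test cases for exactly that reason. The Java sketch below shows the general idea of giving each test its own output directory using only the JDK. `IsolatedOutputDirs`, `freshOutputDir`, `writeMarker`, and the `mahout-test-` prefix are hypothetical names for illustration, not part of Mahout or its test utilities.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper class; not part of Mahout.
class IsolatedOutputDirs {

    // Creates a new, uniquely named temporary directory for one test.
    static Path freshOutputDir(String testName) {
        try {
            return Files.createTempDirectory("mahout-test-" + testName + "-");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Creates an empty file in the given directory, standing in for a job's output.
    static Path writeMarker(Path dir, String name) {
        try {
            return Files.createFile(dir.resolve(name));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path fuzzyOut = freshOutputDir("fuzzykmeans");
        Path kmeansOut = freshOutputDir("kmeans");

        // Output written by one test is invisible to the other.
        writeMarker(fuzzyOut, "clusters-1-final");
        System.out.println(Files.exists(kmeansOut.resolve("clusters-1-final"))); // prints false
    }
}
```

If shared directories are the cause here, leftover clusters-*-final files from an earlier test would be picked up by a later one, which would explain why the failure only appears when the tests run together.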