Off the top of my head:

On Sunday, March 2, 2014 1:16 PM, Sebastian Schelter <[email protected]> wrote:

Hi all,

A long time ago, we decided to move unresolved issues that we did not expect to make it into the next release to a special fix version called "Backlog". We wanted to keep those issues open to give contributors a chance to return and finish their work. A lot has accumulated there and work on it has stalled, mostly because people did not come back with updated patches or did not fix issues with their code.

I think the "backlog" approach has not proven to work, so I suggest we simply close all those issues so that they do not "pollute" our JIRA. We can always reopen them if someone wants to resume working on them.

I went through the issues and compiled a list of things that we should close. If there are issues that you want to pick up and fix, please shout; otherwise I will close the tickets one week from now.

Here's the list:

MAHOUT-525 Implement LatentFactorLogLinear models
    Code for this is available in Ted's GitHub repo. Given that this JIRA is more than 4 years old, is it still relevant today?

MAHOUT-627 Baum-Welch Algorithm on Map-Reduce for Parallel Hidden Markov Model Training.
    We have this in trunk, but documentation and examples are still pending; this needs to be addressed.

MAHOUT-668 Adding knn support to Mahout classifiers
    Do we still need this in light of Streaming k-means?

MAHOUT-772 Refactor Matrix/Vector implementation with linear operators
    (Not sure what this is about)

MAHOUT-836 On donating my Robust PCA Java code to Mahout
    This could be marked 'Do not Fix' or 'Invalid', given that Mahout already has a robust SSVD-PCA.

MAHOUT-918 Implement SGD based classifiers using MapReduce

MAHOUT-928 Add the ARFF data loader/converter on DF
    How important is this? Maybe mark this as 'Will not fix'?

MAHOUT-943 Improve the way to make the split point on DF.
    May not be needed in light of the recent fix by Sean; not sure, though.

MAHOUT-968 Classifier based on restricted boltzmann machines
    My $0.02: we should first stabilize what's already present in Mahout before we attempt this 'Deep Learning' stuff.

MAHOUT-975 Bug in Gradient Machine - Computation of the gradient
    Not needed in light of the MLP that will be part of Mahout soon.

MAHOUT-1153 Implement streaming random forests
    Andy Twigg already has an implementation for this (based on Spark).

MAHOUT-1177 GSOC 2013: Reform and simplify the clustering APIs
    Definitely important, but we need to address Sean's points first to get to this.

MAHOUT-1178 GSOC 2013: Improve Lucene support in Mahout

MAHOUT-1179 GSOC 2013: Refactor and improve the classification APIs
    Definitely important, but we need to address Sean's points first to get to this.

MAHOUT-1193 We may want a BlockSparseMatrix
    Gokhan, is this still needed?

MAHOUT-1204 Rewrite Benchmarks using Caliper
    I feel we should close this for now as 'Won't fix'. Create a new JIRA if there's renewed interest.

MAHOUT-1206 Add density-based clustering algorithms to mahout
    'Won't fix'

MAHOUT-1257 performance improvement to LogLikelihood
    'Won't fix'. This patch provides only a marginal change in performance.

MAHOUT-716 Implement Boosting
    Would be great to have, but the patch is from pre-0.6 and needs cleaning up.

MAHOUT-732 Implement ranking autoencoder on top of gradient machine
    My $0.02: we should first stabilize what's already present in Mahout before we attempt this 'Deep Learning' stuff. Not to mention that this patch is from pre-0.6 and the code needs to be cleaned up.

MAHOUT-880 Add some matrix method(like addition, subtraction, norm ... etc) to DistributedRowMatrix

MAHOUT-932 RandomForest quits with ArrayIndexOutOfBoundsException while running sample
    May have been addressed by Sean in the recent fix for RDF. Need to cross-check.

MAHOUT-1004 Distributed User-based Collaborative Filtering
    'Won't fix'

MAHOUT-1022 Process Mining Algorithm Example in Mahout
    'Won't fix'

MAHOUT-953 ArffVectorIterable does not gracefully handle duplicate attribute name
    The present patch is hacky; if someone could contribute a cleaner patch, that would be great.

Best,
Sebastian
