Inline comments.
On Sun, Mar 2, 2014 at 10:39 AM, Suneel Marthi <[email protected]> wrote:

> From the top of my head:
>
> MAHOUT-525 Implement LatentFactorLogLinear models
>
> Code is available for this at Ted's github repo. Given that this JIRA is more than 4 years old, is it still relevant today?

Nuke it. The code is very old and doesn't much apply any more.

> MAHOUT-627 Baum-Welch Algorithm on Map-Reduce for Parallel Hidden Markov Model Training.
>
> We have this in trunk, but it's pending documentation and examples; this needs to be addressed.

No opinion.

> MAHOUT-668 Adding knn support to Mahout classifiers
>
> Do we still need this in light of Streaming kmeans?

Let's refactor existing classifiers first. Knn is not a mainstream requirement. RF and neural networks should handle the needs pretty well.

> MAHOUT-772 Refactor Matrix/Vector implementation with linear operators
>
> (Not sure what this is about)

Old idea. Might come back some day, but we can nuke this for now.

> MAHOUT-836 On donating my Robust PCA Java code to Mahout
>
> This could be marked 'Do not Fix' or 'Invalid', given that Mahout already has a robust SSVD-PCA.

SSVD isn't really a robust PCA, but I still agree. Will not fix.

> MAHOUT-918 Implement SGD based classifiers using MapReduce

Newer JIRAs cover this. Close.

> MAHOUT-928 Add the ARFF data loader/converter on DF
>
> How important is this? Maybe mark this as 'Will not fix'?

Not very. DNF for me.

> MAHOUT-943 Improve the way to make the split point on DF.
>
> May not be needed in light of the recent fix by Sean, not sure though.

I think that Sean's fix is sufficient for now.

> MAHOUT-968 Classifier based on restricted boltzmann machines
>
> My $0.02, we should first stabilize what's already present in Mahout before we attempt this 'Deep Learning' stuff.

Agreed.

> MAHOUT-975 Bug in Gradient Machine - Computation of the gradient
>
> Not needed in light of the MLP that will be part of Mahout soon.

Agreed.

> MAHOUT-1153 Implement streaming random forests
>
> Andy Twigg already has an implementation for this (based on Spark).

Andy hasn't been around Mahout for some time. DNF for now.

> MAHOUT-1177 GSOC 2013: Reform and simplify the clustering APIs
>
> Definitely important, but need to address Sean's points first to get to this.

Should be closed as is due to the tie to GSOC.

> MAHOUT-1178 GSOC 2013: Improve Lucene support in Mahout

Again close since this is a GSOC project.

> MAHOUT-1179 GSOC 2013: Refactor and improve the classification APIs

Likewise.

> Definitely important, but need to address Sean's points first to get to this.
>
> MAHOUT-1193 We may want a BlockSparseMatrix
>
> Gokhan, is this still needed?

Only weakly needed in my experience.

> MAHOUT-1204 Rewrite Benchmarks using Caliper
>
> I feel we should close this for now as 'Won't fix'. Create a new JIRA if there's renewed interest.

Last I checked caliper wasn't available via Maven, but that seems to have changed.

> MAHOUT-1206 Add density-based clustering algorithms to mahout
>
> 'Won't fix'

Agree.

> MAHOUT-1257 performance improvement to LogLikehood
>
> 'Won't fix'. This patch only provides a very small marginal change in performance.

Agree.

> MAHOUT-716 Implement Boosting
>
> Would be great to have; the patch is from pre-0.6 and needs cleaning up.

Agree.

> MAHOUT-732 Implement ranking autoencoder on top of gradient machine
>
> My $0.02, we should first stabilize what's already present in Mahout before we attempt this 'Deep Learning' stuff.
> Not to mention this patch is from pre-0.6 and the code needs to be cleaned up.

Yes. Let's close WONTFIX and open a new one if a coder shows up.

> MAHOUT-880 Add some matrix method(like addition, subtraction, norm ... etc) to DistributedRowMatrix

Close this and open a new one as needed.

> MAHOUT-932 RandomForest quits with ArrayIndexOutOfBoundsException while running sample
>
> May have been addressed by Sean in the recent fix for RDF. Need to cross check.

Good idea.

> MAHOUT-1004 Distributed User-based Collaborative Filtering
>
> 'Won't fix'

Agree.

> MAHOUT-1022 Process Mining Algorithm Example in Mahout
>
> 'Won't fix'

Agree.

> MAHOUT-953 ArffVectorIterable does not gracefully handle duplicate attribute name
>
> Present patch is hacky; if someone could contribute a cleaner patch that would be great.

If a better patch doesn't show up soon, WONTFIX is the right answer.
