As promised, I removed the mentioned backlog issues.

I think we should return to what Sean did when he was still PMC chair: if there is no work on an issue for quite some time, close it. Keeping it lingering around and waiting months for people to come back just "pollutes" our JIRA and makes us lose focus. If people come back, we can always reopen issues.

--sebastian




On 03/02/2014 08:48 PM, Ted Dunning wrote:
Inline comments.


On Sun, Mar 2, 2014 at 10:39 AM, Suneel Marthi <[email protected]> wrote:

Off the top of my head:








MAHOUT-525 Implement LatentFactorLogLinear models


Code is available for this at Ted's GitHub repo. Given that this JIRA is >4
years old, is it still relevant today?


Nuke it.  The code is very old and doesn't much apply any more.


MAHOUT-627 Baum-Welch Algorithm on Map-Reduce for Parallel Hidden Markov
Model Training.


We have this in trunk, but it's pending documentation and examples; this
needs to be addressed.


No opinion.



MAHOUT-668 Adding knn support to Mahout classifiers


Do we still need this in light of Streaming kmeans?


Let's refactor existing classifiers first.  Knn is not a mainstream
requirement.  RF and neural networks should handle the needs pretty well.



MAHOUT-772 Refactor Matrix/Vector implementation with linear operators


(Not sure what this is about)


Old idea.  Might come back some day, but we can nuke this for now.



MAHOUT-836 On donating my Robust PCA Java code to Mahout


This could be marked 'Do not Fix' or 'Invalid', given that Mahout already
has a robust SSVD-PCA.


SSVD isn't really a robust PCA, but I still agree.  Will not fix.



MAHOUT-918 Implement SGD based classifiers using MapReduce



Newer JIRAs cover this.  Close.


MAHOUT-928 Add the ARFF data loader/converter on DF


How important is this? Maybe mark this as 'Will not fix'?


Not very.

DNF for me.



MAHOUT-943 Improve the way to make the split point on DF.


May not be needed in light of the recent fix by Sean, not sure though.


I think that Sean's fix is sufficient for now.



MAHOUT-968 Classifier based on restricted boltzmann machines


My $0.02, we should first stabilize what's already present in Mahout
before we attempt this 'Deep Learning' stuff.


Agreed.


MAHOUT-975 Bug in Gradient Machine - Computation of the gradient


Not needed in light of the MLP that will be part of Mahout soon.


Agreed.



MAHOUT-1153 Implement streaming random forests


Andy Twigg already has an implementation for this (based on Spark).


Andy hasn't been around Mahout for some time.  DNF for now.

MAHOUT-1177 GSOC 2013: Reform and simplify the clustering APIs

Definitely important, but need to address Sean's points first to get to
this.


Should be closed as is due to the tie to GSOC.


MAHOUT-1178 GSOC 2013: Improve Lucene support in Mahout


Again close since this is a GSOC project.



MAHOUT-1179 GSOC 2013: Refactor and improve the classification APIs


Likewise.


Definitely important, but need to address Sean's points first to get to
this.

MAHOUT-1193 We may want a BlockSparseMatrix


Gokhan, is this still needed?


Only weakly needed in my experience.



MAHOUT-1204 Rewrite Benchmarks using Caliper


I feel we should close this for now as 'Won't fix'.  Create a new JIRA if
there's renewed interest.


Last I checked caliper wasn't available via Maven, but that seems to have
changed.



MAHOUT-1206 Add density-based clustering algorithms to mahout


'Won't fix'


Agree.



MAHOUT-1257 performance improvement to LogLikelihood


'Won't fix'. This patch only provides a very small marginal change in
performance.


Agree.



MAHOUT-716 Implement Boosting


Would be great to have; the patch is from pre-0.6 and needs cleaning up.


Agree.



MAHOUT-732 Implement ranking autoencoder on top of gradient machine


My $0.02, we should first stabilize what's already present in Mahout
before we attempt this 'Deep Learning' stuff.
Not to mention this patch is from pre-0.6 and the code needs to be cleaned
up.


Yes.  Let's close WONTFIX and open a new one if a coder shows up.


MAHOUT-880 Add some matrix method(like addition, subtraction, norm ... etc)
to DistributedRowMatrix



Close this and open a new one as needed.



MAHOUT-932 RandomForest quits with ArrayIndexOutOfBoundsException while
running sample

May have been addressed by Sean in the recent fix for RDF. Need to cross
check.


Good idea.



MAHOUT-1004 Distributed User-based Collaborative Filtering


'Won't fix'


Agree.



MAHOUT-1022 Process Mining Algorithm Example in Mahout


'Won't fix'


Agree.



MAHOUT-953 ArffVectorIterable does not gracefully handle duplicate
attribute name


The present patch is hacky; if someone could contribute a cleaner patch, that
would be great.


If a better patch doesn't show up soon, WONTFIX is the right answer.

