Inline comments.

On Sun, Mar 2, 2014 at 10:39 AM, Suneel Marthi <[email protected]> wrote:

> Off the top of my head:
>
> MAHOUT-525 Implement LatentFactorLogLinear models
>
>
> Code is available for this at Ted's GitHub repo. Given that this JIRA is
> more than 4 years old, is it still relevant today?
>

Nuke it.  The code is very old and doesn't much apply any more.


> MAHOUT-627 Baum-Welch Algorithm on Map-Reduce for Parallel Hidden Markov
> Model Training.
>
>
> We have this in trunk, but it's pending documentation and examples; this
> needs to be addressed.
>

No opinion.


>
> MAHOUT-668 Adding knn support to Mahout classifiers
>
>
> Do we still need this in light of Streaming kmeans?
>

Let's refactor existing classifiers first.  Knn is not a mainstream
requirement.  RF and neural networks should handle the needs pretty well.


>
> MAHOUT-772 Refactor Matrix/Vector implementation with linear operators
>
>
> (Not sure what this is about)
>

Old idea.  Might come back some day, but we can nuke this for now.


>
> MAHOUT-836 On donating my Robust PCA Java code to Mahout
>
>
> This could be marked 'Do not Fix' or 'Invalid', given that Mahout already
> has a robust SSVD-PCA.
>

SSVD isn't really a robust PCA, but I still agree.  Will not fix.


>
> MAHOUT-918 Implement SGD based classifiers using MapReduce
>
>
>
Newer JIRAs cover this.  Close.


> MAHOUT-928 Add the ARFF data loader/converter on DF
>
>
> How important is this? Maybe mark this as 'Will not fix'?
>

Not very.

DNF for me.


>
> MAHOUT-943 Improve the way to make the split point on DF.
>
>
> May not be needed in light of the recent fix by Sean, not sure though.
>

I think that Sean's fix is sufficient for now.


>
> MAHOUT-968 Classifier based on restricted boltzmann machines
>
>
> My $0.02, we should first stabilize what's already present in Mahout
> before we attempt this 'Deep Learning' stuff.
>

Agreed.


> MAHOUT-975 Bug in Gradient Machine - Computation of the gradient
>
>
> Not needed in light of the MLP that will be part of Mahout soon.
>

Agreed.


>
> MAHOUT-1153 Implement streaming random forests
>
>
> Andy Twigg already has an implementation for this (based on Spark).
>

Andy hasn't been around Mahout for some time.  DNF for now.

> MAHOUT-1177 GSOC 2013: Reform and simplify the clustering APIs
>
> Definitely important, but need to address Sean's points first to get to
> this.
>

Should be closed as is due to the tie to GSOC.


> MAHOUT-1178 GSOC 2013: Improve Lucene support in Mahout
>
>
Again close since this is a GSOC project.


>
> MAHOUT-1179 GSOC 2013: Refactor and improve the classification APIs
>
>
Likewise.


> Definitely important, but need to address Sean's points first to get to
> this.
>
> MAHOUT-1193 We may want a BlockSparseMatrix
>
>
> Gokhan, is this still needed?
>

Only weakly needed in my experience.


>
> MAHOUT-1204 Rewrite Benchmarks using Caliper
>
>
> I feel we should close this for now as 'Won't fix'.  Create a new JIRA if
> there's renewed interest.
>

Last I checked caliper wasn't available via Maven, but that seems to have
changed.


>
> MAHOUT-1206 Add density-based clustering algorithms to mahout
>
>
> 'Won't fix'
>

Agree.


>
> MAHOUT-1257 performance improvement to LogLikelihood
>
>
> 'Won't fix'. This patch only provides a very small marginal change in
> performance.
>

Agree.


>
> MAHOUT-716 Implement Boosting
>
>
> Would be great to have, but the patch is from pre-0.6 and needs cleaning up.
>

Agree.


>
> MAHOUT-732 Implement ranking autoencoder on top of gradient machine
>
>
> My $0.02, we should first stabilize what's already present in Mahout
> before we attempt this 'Deep Learning' stuff.
> Not to mention this patch is from pre-0.6 and the code needs to be cleaned
> up.
>

Yes.  Let's close WONTFIX and open a new one if a coder shows up.


> MAHOUT-880 Add some matrix method(like addition, subtraction, norm ...
> etc) to DistributedRowMatrix
>
>
>
Close this and open a new one as needed.


>
> MAHOUT-932 RandomForest quits with ArrayIndexOutOfBoundsException while
> running sample
>
>
> May have been addressed by Sean in the recent fix for RDF. Need to cross
> check.
>

Good idea.


>
> MAHOUT-1004 Distributed User-based Collaborative Filtering
>
>
> 'Won't fix'
>

Agree.


>
> MAHOUT-1022 Process Mining Algorithm Example in Mahout
>
>
> 'Won't fix'
>

Agree.


>
> MAHOUT-953 ArffVectorIterable does not gracefully handle duplicate
> attribute name
>
>
> The present patch is hacky; if someone could contribute a cleaner patch,
> that would be great.
>

If a better patch doesn't show up soon, WONTFIX is the right answer.
