Deneb (Giorgio),

The code involved is really quite heinous, and we haven't been able to find volunteers to maintain it in the past.
It might be possible to maintain a few selected algorithms, but we really have to move forward.

On Sun, Apr 13, 2014 at 10:09 AM, Giorgio Zoppi <[email protected]> wrote:

> The best thing would be to make a plan and see how much effort this
> needs. Then find volunteers to accomplish the task. I am quite sure that
> there are a lot of people out there who are willing to help out.
>
> BR,
> deneb.
>
>
> 2014-04-13 18:45 GMT+02:00 Sebastian Schelter <[email protected]>:
>
> > Hi,
> >
> > I took some days to let the latest discussion about the state and
> > future of Mahout go through my head. I think the most important thing
> > to address right now is the MapReduce "legacy" codebase. A lot of the
> > MR algorithms are currently unmaintained, documentation is outdated and
> > the original authors have abandoned Mahout. For some algorithms it is
> > hard to get even questions answered on the mailing list (e.g.
> > RandomForest). I agree with Sean's comments that letting the code
> > linger around is no option and will continue to harm Mahout.
> >
> > In the previous discussion, I suggested making a radical move and
> > aiming to delete this codebase, but there were serious objections from
> > committers and users that convinced me that there is still usage of and
> > interest in that codebase.
> >
> > That puts us into a "legacy dilemma". We cannot delete the code without
> > harming our userbase. On the other hand, I don't see anyone willing to
> > rework the codebase. Further, the code cannot linger around anymore as
> > it is doing now, especially when we fail to answer questions or don't
> > provide documentation.
> >
> > *We have to make a move*!
> >
> > I suggest the following actions with regard to the MR codebase. I hope
> > that they find agreement. If there are objections, please give
> > alternatives, *keeping everything as-is is not an option*:
> >
> > * reject any future MR algorithm contributions, prominently state this
> >   on the website and in talks
> > * make all existing algorithm code compatible with Hadoop 2; if there
> >   is no one willing to make an existing algorithm compatible, remove
> >   the algorithm
> > * deprecate the existing MR algorithms, yet still take bug fix
> >   contributions
> > * remove Random Forest, as we cannot even answer questions about the
> >   implementation on the mailing list
> >
> > There are two more actions that I would like to see, but would be
> > willing to give up if there are objections:
> >
> > * move the MR algorithms into a separate maven module
> > * remove Frequent Pattern Mining again (we already aimed for that in
> >   0.9 but had one user who shouted but never returned to us)
> >
> > Let me know what you think.
> >
> > --sebastian
> >
>
>
> --
> I want to be the ray of sunlight that wakes you each day
> to make you breathe and live in me.
> "Favola -Moda".
>
