On 03/06/2015 12:44 PM, Pat Ferrel wrote:
This is great.

So we’ve talked about a name change and shortly we’ll be forced to come up with 
something that describes what Mahout has become. Most past users think of it as 
a scalable ML library on Hadoop. That may describe Mahout-Legacy but it seems 
like we need a name for the Scala DSL/Spark/other? part of the project. Lots of 
projects have sub-projects so we know there is no issue with naming 
sub-projects. So my question to everyone is:

Should (or can) the Top Level Project be renamed? If so to what?
I don't like the idea of a top-level name change. I think it would be a much better idea to direct our resources at polishing and developing what we have now. Also, especially for this release, I think a rename would do a disservice to the "legacy" components (which, as you point out, have not been deprecated), with ~45 completed bugfixes and several more in the pipe.


If we don’t rename the TLP then what should we call legacy (not very appealing) 
and scala/DSL (not a name really)?
Agreed. Legacy is not the most appealing name. Maybe something like Mahout-MapReduce? Though that could cause some confusion given the "no new MapReduce code" policy.

My opinion:
Since we are deemphasizing legacy I’m not sure there is a need to call 
attention to it by giving it a subproject name. However it is not deprecated so 
we need to include it in releases and even fix the minimum of critical bugs for 
some time to come.
Agreed regarding fixing critical legacy bugs. Looking through the issues last night, there didn't seem to be a lot of critical bugs, and a good number of issues can probably be closed out as "won't fix" or "not an issue".

Mahout is getting beat up in the circles of those who talk about such things 
and much of this is because people don’t understand what it has become. 
Therefore I’d like to see a project rename to reset expectations. Leave the 
name Mahout for legacy stuff and give a new name to the Scala environment. 
Split the builds and create new docs for the Scala stuff. This would seem to 
make it easier to document since legacy is most of what the CMS documents; we 
could create a whole new template for the new project name.
What is the upside to splitting the builds? I'm not against it; I'm just not sure I understand.

Failing this, many of the same benefits could be gained by creating legacy and 
scala sub-projects with better names. This I know we can do and recall that 
things like MLlib are generally not tied to Spark when speaking about them. So 
a subproject could have very much its own identity.

Looking at the long history of Mahout it seems like the current generality was 
hard gained through implementing many special purpose algorithms, some of which 
were grad student projects. This is where MLlib is today in some ways. So a 
general framework and environment makes a lot of sense as the evolution of 
Mahout. Let’s give it a name, something better than DSL.
I think that a pretty clear description of what the other side of the project is has been emerging recently. IMO we need to start getting it out there. A good start would probably be to update the front page of the Mahout site.

I don't have any good ideas regarding names for this side of the project.


On Mar 5, 2015, at 7:43 PM, Andrew Musselman <[email protected]> wrote:

Thanks AP

On Thursday, March 5, 2015, Andrew Palumbo <[email protected]> wrote:

I went through all of the unresolved JIRA issues and marked each with at
least a "legacy" or "scala" label ("scala" for lack of a better name for all
that is not legacy). Hopefully I got them all.

Some are labelled with both (math, build, documentation related to both or
neither, etc.)

"legacy" issues:

https://issues.apache.org/jira/browse/MAHOUT-1522?jql=project%20%3D%20MAHOUT%20AND%20resolution%20%3D%20Unresolved%20AND%20labels%20%3D%20legacy%20ORDER%20BY%20priority%20DESC

"scala" issues:

https://issues.apache.org/jira/browse/MAHOUT-1522?jql=project%20%3D%20MAHOUT%20AND%20resolution%20%3D%20Unresolved%20AND%20labels%20%3D%20scala%20ORDER%20BY%20priority%20DESC

Hopefully this will help us get started closing up some old issues. I'll
try to make another pass over them tomorrow and find some that can be
closed out.

