Grant, I don't mean to belabor this, but I hate to have this public record of us misunderstanding each other quite so relentlessly. So I'm going to try one more time to see if I can phrase my point of view in such a way that we will be better aligned, and if I fail (or if we are indeed really poorly aligned), then so be it.

This starts with a question of the mission of Mahout, TLP or not. If the mission of Mahout is to focus on algorithms that are expressed as map-reduce on Hadoop, then, honestly, I don't think this code belongs. I've studied this in depth (and done a weak implementation), Jethran's done two implementations, my friend and colleague Dr. Scott Miller has done a few, and none of us think that this algorithm is going to fit. If, in addition, the project wants to stick with Java programs, even more so. This particular algorithm is one in which none of us see a way to make map-reduce parallelism compensate for the fundamental limitations of Java floating point speed. There may be another way to cluster based on MI that can exploit map-reduce, but this isn't it.

Once I get the code posted somewhere, I'll let you all know where, and you are welcome to argue at that point.

My net impression is that the Mahout team might want to incorporate code that is outside the map-reduce corral but is complementary to the broad mission of NLP algorithms, but that the team isn't excited about doing so right now.

Then comes the process issue. I will write at the outset that I was making an incoherent and pretty unreasonable proposal about committer status. Because Java Map-Reduce technology is not applicable, at the moment, to the things we are doing at our place of business, Jethran and I are not well-positioned to pass through the standard procedure for earning committer status on the project just now.

It is true that other Apache projects have adopted committers in nonstandard ways, but, upon reflection, I don't see that as a valid analogy to the situation at hand. If you are curious, I can fill you in off-line as to the amusing tale of how I became a committer on WS-COMMONS.

I confess that I'm puzzled about your comment about proxy commits. Committers commit other people's work from JIRAs constantly, so that can't be what you are talking about.
If the problem is someone misrepresenting work as their own, then that wouldn't arise in this case. If I gave the impression that I planned to mislead someone, I apologize; I didn't mean to. In any case, I think the issue is moot, since I will explore what seems reasonable to the labs with the labs PMC.

Regards, benson

On Fri, Nov 20, 2009 at 8:44 AM, Grant Ingersoll <[email protected]> wrote:

> On Nov 19, 2009, at 7:49 PM, Benson Margulies wrote:
>
> > Grant,
> >
> > As someone who was given committer status on a project in a nonstandard way, I have, perhaps, more flexible morals about the principles than you do.
>
> I said we could evaluate it, but so far I'm not really for it, as much as I think the code would be useful to have as a part of Mahout, since it seems the author of the code cannot speak on their own behalf, which, as Ted points out, usually bodes ill for the sustainability of the project.
>
> > But I certainly have no intent to try to convince you that Mahout should deviate from normal community procedures if you are not so inclined.
> >
> > This is a high-speed implementation of http://acl.ldc.upenn.edu/J/J92/J92-4003.pdf. Both Brown and Mercer are on the author list.
> >
> > Yes, the author is not I. He works with me.
> >
> > It may be that the best solution here is for me to park it at the labs using my status, and act as Jethran's agent.
>
> The ASF really frowns on proxy commits. I've seen people have creds revoked over such a thing, so I would encourage you not to go that route.
>
> I don't understand why Jethran can't just put up the patch himself so that we can have a look at it and evaluate whether it fits here or not. It's hard for me to imagine a project being successful where the main author doesn't have commit rights. If you're saying you have time for the commits and that Jethran has time to do the work, then you and Jethran would certainly have time to put up three or so quality patches to Mahout (you've already done a few and we know you get how the ASF works, since you are a Member and a committer, so it would not be a hard sell to the PMC). Given Mahout is pretty young, it really does not take long to get committer status. I frankly don't see why it is a big deal to go the normal route.
>
> > That would be a lot less work than teeing it up here, and I think that laziness will win the day here.
>
> Understood. I'd probably just start a Google Code project in this case, if it were me. I really do think it will fit under Mahout as TLP at some point, so maybe we can come back to this discussion at that point.
