Re 1 and 2: I suppose the criteria for any contribution are the same as they have always been. In order of importance:

* absolutely doesn't break anything existing
* contains a pledge of future support
* coherent in architecture and compatible in input/output
* has potential for user demand
* attracts developers and not just users, which is a big extra plus for the project at this point

In that sense, I suppose there can be no other plan than to do what we have always done, i.e. filter any new JIRA through this list as it takes the shape of an issue or a patch. I think we have a fair understanding of what the H2O product is, but we have no idea what the contribution patches might be. There has to be a difference between the two, since just putting two things side by side under the same name makes no functional difference compared to where things stand now. Some of the Spark issues, on the other hand, are already in and have been discussed (perhaps not in hangouts, but in JIRA and in person) for about half a year now.

Regarding 1248, I am not optimistic it is going to happen with the MR versions. We discussed hyperparameter search at length when the ALS stuff was first introduced to Mahout, and it did not yield any ideas for a sensible iterative implementation. Yes, you could run a grid search in parallel, but that is usually a disaster in big data settings (you don't want to multiply your "big" by the "grid"), so you need something more iterative, and iterative is a disaster on MR. So I doubt anyone will really take it on; there is very little practical sense in doing it. Non-grid iterative search for hyperparameters is actually much more promising for the Spark methods, in comparison.

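As a rough illustration of the cost argument above, here is a minimal Python sketch. It is not Mahout code: the grid sizes, the toy loss function, and the helper names are all made up for the example, and golden-section search stands in for "something more iterative". The point is only that a grid needs one full training run per grid cell, while a sequential search needs a chain of runs where each one depends on the previous result.

```python
import math


def count_grid_evaluations(grid_sizes):
    """Number of full training runs a complete grid search needs."""
    total = 1
    for size in grid_sizes:
        total *= size
    return total


def golden_section_search(loss, lo, hi, tol=1e-2):
    """Minimize a unimodal 1-D loss on [lo, hi].

    Returns (approximate argmin, number of loss evaluations).
    Every evaluation after the first two depends on the previous
    comparison, so the runs cannot all be launched up front.
    """
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = loss(c), loss(d)
    evaluations = 2
    while abs(b - a) > tol:
        if fc < fd:
            # Minimum lies in [a, d]; reuse c as the new upper probe.
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = loss(c)
        else:
            # Minimum lies in [c, b]; reuse d as the new lower probe.
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = loss(d)
        evaluations += 1
    return (a + b) / 2, evaluations


def toy_loss(log_lambda):
    """Hypothetical validation loss over log10(lambda), minimum at -1.5.

    In practice each call would be one full training job over the data.
    """
    return (log_lambda + 1.5) ** 2


# Assumed grid: 10 regularization values x 10 ranks.
grid_runs = count_grid_evaluations([10, 10])
best_log_lambda, sequential_runs = golden_section_search(toy_loss, -4.0, 2.0)

print("grid search training runs:", grid_runs)
print("sequential search training runs:", sequential_runs)
print("best log10(lambda) found:", round(best_log_lambda, 2))
```

The grid runs are independent and can be parallelized, but the total work is the data size times the number of cells. The sequential runs form a dependency chain, which is exactly the pattern that suits an engine with cheap iteration and hurts on MR, where every step is a separate job.
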
On Tue, Mar 18, 2014 at 8:45 PM, Saikat Kanjilal <[email protected]> wrote:

> Hi Guys,
> I read through the email threads with the weigh ins for the inclusion of
> H2O as well as spark and wanted to circle back on the plan for folks to
> meet around 1.0, so a few questions:
>
> 1) How does the inclusion of H2O and spark weigh in importance versus the
> current JIRA items that are existing for potentially new feature work to be
> done in mahout (in my case JIRA 1248/1249)
> 2) From reading all the responses it doesn't seem like there's full
> consensus on what the next steps are for h2o and how that relates to the
> roadmap around 1.0, please correct me if I'm misunderstanding, can someone
> outline whether any concrete decisions have been made on whether or not
> mahout 1.0 will include h2o bindings
> 3) Are we moving forward with the google hangout , I didnt receive
> anything about this yet
>
>
> Thanks in advance.
>
