On Sun, Apr 6, 2014 at 8:54 AM, Sean Owen <sro...@gmail.com> wrote:

> On Sun, Apr 6, 2014 at 4:16 PM, Andrew Musselman
> <andrew.mussel...@gmail.com> wrote:
> > Seems to me there has been a renewed effort to eat our broccoli,
> > along with the other ideas people have been bringing on board.
> >
> > What are you proposing to put in the board report?
>
> I have not seen significant activity to unify or update the existing
> code. It's still the same different chunks with different styles,
> input/output, distributed/not, etc. The doc updates look very
> positive. To be fair the task of really addressing the technical debt
> is very large, so even making said dent would be a lot of work. A
> clean-slate reboot therefore actually seems like a good plan, but
> that's another question...
>
> Concretely, in a board report, I personally would not agree with
> representing the Spark or H2O work as an agreed future plan or
> roadmap, right now. Being in the board report makes that impression,
> as have recent articles/tweets I've seen, so it deserves care. That's
> why I chimed in, maybe tilting at windmills.
>
> From where I sit with customers, the overall impression is negative
> among those that have tried to use the code, and usage has gone from
> few to almost none. I doubt my sample is so different from the whole
> user population. Much of it is consistency/quality, but some of it's
> just an interest in non-M/R frameworks.
>
> So, I think that current state and set of problems is far more
> important to acknowledge in a board report than just mentioning some
> future possibilities, and the latter was the impression I got of the
> likely content. In fact, it makes the talk about large upcoming
> possible changes make so much more sense.
>
I agree with you that bringing the existing code up to a state that's approachable and consistent for new people and their clients should be a priority. My team have plenty (dozens) of large clients who are just getting up to speed with "old" Hadoop tech, and I see a long life ahead for a system that still uses map-reduce and keeps other methods in an "internal incubation" or contrib phase.