Thanks for the update, Joey. Could someone close to the NSA disclose what has changed recently that makes contributing to open source easier?
On Fri, Sep 2, 2011 at 12:30 PM, Joey Echeverria <[email protected]> wrote:

> To add to what Todd said, I actually worked with those guys for the
> last 3 years and have used Accumulo in production. It's true that it
> would have been better if they had been able to contribute to HBase
> rather than go on their own, but it's not easy to contribute to open
> source, either officially or unofficially when you work at NSA. I
> think there is precedence for competing and/or "duplicate" Apache
> projects, Avro/Thrift and HBase/Cassandra come to mind. I'm mostly
> interested in this project setting a precedent for other work at NSA
> to be developed as open source.
>
> -Joey
>
> On Fri, Sep 2, 2011 at 3:09 PM, Todd Lipcon <[email protected]> wrote:
> > Hey folks,
> >
> > <wearing my Todd hat and not my Cloudera hat!>
> >
> > I've been in touch with this team for the last 18 months or so.
> > They're good people, smart, and have a healthy respect for HBase and
> > our team. Though they haven't contributed code or participated on the
> > lists, I can vouch that they do follow our development and generally
> > do understand HBase as well as what makes their system different. In
> > the context of the incubator proposal, they're trying to explain why
> > their system is different than HBase, and not trying to knock our
> > project. They do borrow our ideas, and in the future we'll be able to
> > borrow some of theirs. Iterator trees, for example, are distinct from
> > coprocessors and have some really nice capabilities which I'm looking
> > forward to adapting into HBase.
> >
> > There are a couple things to keep in mind about the story here:
> > - they first evaluated HBase 3 years ago. HBase at that point was not
> > usable for their application - I think several of us here remember the
> > state of HBase at the time and might have made the same decision. So,
> > they started their own project with an internal team of 5-6 people.
> > - contributing to open source from within the NSA is not easy, for
> > obvious reasons. They've jumped through many hoops to open source
> > this, and we should be thankful for that. Now that they're out in open
> > source land, I think we'll see them collaborating with us much more
> > openly.
> >
> > I for one look forward to working with these folks, and maybe merging
> > the projects some time down the road as the feature lists converge.
> >
> > -Todd
> >
> > On Fri, Sep 2, 2011 at 11:40 AM, Gary Helmling <[email protected]> wrote:
> >> Some comments on the proposal and differentiation vs HBase:
> >>
> >> Access Labels:
> >>
> >> The proposal claims that this is "unlikely to be adopted [in HBase]". This
> >> is completely untrue. This has been discussed many times in the past in
> >> relation to our security implementation. It's just been deferred at the
> >> moment due to a need to focus on the initial implementation. But it's
> >> certainly viewed as a potentially important feature for a future iteration.
> >> Contributions always welcome!
> >>
> >> see HBASE-3435: Provide per-column-qualifier and per-key-value security for
> >> HBASE-3025
> >>
> >>
> >> Iterators:
> >>
> >> What do these provide that RegionObservers don't? I'm speculating since the
> >> proposal provides little in the way of details, but if these are "unlikely
> >> to be adopted" it's only because coprocessors already offer more extensive
> >> functionality.
> >>
> >>
> >> "Flexibility" aka online schema changes and locality groups
> >>
> >> Locality groups seem to be the only meaningful differentiation in this
> >> entire comparison.
> >>
> >>
> >> Testing
> >>
> >> Performance under "some configurations and conditions" and unsubstantiated
> >> "greater data integrity" is not meaningful differentiation.
> >>
> >>
> >> Apache Brand
> >>
> >> Claims a relationship with HBase. Is there overlapping code or is this just
> >> the duplication of functionality?
> >> There's no community relationship that
> >> I'm aware of. I haven't seen any of the proposed committers on the HBase
> >> user and dev lists to this point, so that doesn't set much of a precedent
> >> for community interaction.
> >>
> >>
> >> Overall I see no meaningful differentiation vs HBase as an existing project,
> >> no past attempts to interact with the most relevant Apache community, and
> >> only an, until now, private "community" of government users. I think it's
> >> great that they want to open source this. I don't want to discourage that
> >> -- go for it! But I don't see what the benefit is of ASF incubating this.
> >> I only see the potential for community fragmentation and market confusion
> >> over such closely similar projects.
> >>
> >>
> >> Gary
> >>
> >>
> >> On Fri, Sep 2, 2011 at 11:06 AM, Stack <[email protected]> wrote:
> >>
> >>> See here for the incubator proposal:
> >>> http://wiki.apache.org/incubator/AccumuloProposal
> >>>
> >>> Reactions probably better belong over on the incubator mailing list
> >>> but I thought a discussion here first might be useful developing a
> >>> stance.
> >>>
> >>> Initial reaction, not having seen the code, is that it seems to be close to
> >>> HBase; so close, they call HBase out explicitly in their proposal.
> >>>
> >>> The cell based 'access labels' seem like a matter of adding
> >>> an extra field to KV and their Iterators seem like a specialization on
> >>> Coprocessors. The ability to add column families on the fly seems too
> >>> minor a difference to call out especially if online schema edits are
> >>> now (soon) supported. They talk of locality group like functionality
> >>> too -- that
> >>> could be a significant difference. We would have to see the code but at
> >>> first blush, differences look small.
> >>>
> >>> Yet another BT implementation further divides this contended space.
> >>> If there were to be an effort integrating HBase into Accumulo or vice
> >>> versa, its likely to distract significantly from project forward motion (If
> >>> the Accumulo fellows were interested in integrating the two projects,
> >>> I'd have thought they'd have tried to talk to us before this so thats
> >>> probably not their intent).
> >>>
> >>> On other hand, if their once-secret project is out in the open, we can
> >>> steal the Apache-licensed good bits and....
> >>>
> >>> What do folks think?
> >>>
> >>> St.Ack
> >>>
> >>
> >
> >
> >
> > --
> > Todd Lipcon
> > Software Engineer, Cloudera
> >
>
>
> --
> Joseph Echeverria
> Cloudera, Inc.
> 443.305.9434
>
