Yeah, in many large installations, turning off constraints makes a lot of sense (do the checking before you put the data over the wire, rather than server side). However, on multi-tenant systems, or where you are required to enforce certain parameters (constraints) on the data no matter what, due to company policy or whatever, server-side constraints are still worth having.
There is an example of how to do Constraints as a jar with CPs already attached to the ticket, and it's pretty simple. However, the ticket goes into the pluses and minuses of a top-level versus a basic CP-based implementation.

For me, the best reason for top level is to make HBase easy to use and have certain built-in features. Yeah, we can do security, but you have to include the jars, make sure it works, etc., as opposed to _certain_ systems where security is built in. Similar arguments can be made for things like constraints: it's just _easier_ to have it built in, and let people use them (or not) as they choose.

The ticket also talks about abstracting out some of the CP things to make it easier to add other top-level features, which would be a win too. Yeah, they would be backed by CPs, but that doesn't mean it doesn't make sense for people to be able to use the stuff really (as in dead simple) easily.

-Jesse

On Mon, Oct 17, 2011 at 10:00 PM, lars hofhansl <[email protected]> wrote:

> My $0.02...
>
> I'd rather include an example of how to do this with a coprocessor (similar to what we do with the aggregation client), rather than a new HBase feature. If the example is easy to extend and to compile to a jar we have achieved almost the same.
>
> Also - as an anecdote - every semi-large relational database I worked with professionally had constraints turned off because of performance reasons and rather implemented constraints at the application layer.
>
> -- Lars
>
> ________________________________
> From: Jesse Yates <[email protected]>
> To: [email protected]
> Sent: Monday, October 17, 2011 11:27 AM
> Subject: Re: adding constraints
>
> Added HBASE-4605 <https://issues.apache.org/jira/browse/HBASE-4605> (and approach comments) for this issue.
>
> -Jesse
>
> On Mon, Oct 17, 2011 at 11:10 AM, Ted Yu <[email protected]> wrote:
>
> > Jesse:
> > I agree with your observations.
> >
> > Constraint, defined for a single table, would be useful.
> >
> > Please file a JIRA and describe your strategy there.
> >
> > Thanks
> >
> > On Mon, Oct 17, 2011 at 11:04 AM, Jesse Yates <[email protected]> wrote:
> >
> > > On Mon, Oct 17, 2011 at 11:00 AM, Ted Yu <[email protected]> wrote:
> > >
> > > > Jesse:
> > > > This is a nice initiative.
> > > > Looks like the Constraint you define below is per table. Meaning it is not cross-table referential integrity.
> > >
> > > Theoretically we could support doing this. And if people were really cheeky with the current implementation, they could access other tables to enforce it (though it would kill you on access time). Even so, doing the cross-table checks is going to be rough on run time (cross-server locking is always bad news bears ;), so thinking this should definitely be a later consideration.
> > >
> > > > Cheers
> > > >
> > > > On Mon, Oct 17, 2011 at 10:45 AM, Jesse Yates <[email protected]> wrote:
> > > >
> > > > > Hey everyone,
> > > > >
> > > > > TL;DR Adding classic DB constraints as a system-level coprocessor to help simplify using HBase and ease adoption.
> > > > >
> > > > > Coprocessors are a really powerful mechanism and are incredibly useful for a variety of things. However, I feel like the mechanism for using them can be very daunting and, for certain features, could do with some simplification.
> > > > >
> > > > > What I would like to propose is a simple interface that people can use to implement a 'constraint' (matching the classic database definition). This would help ease of adoption by helping HBase more easily check that box, help minimize code duplication across organizations, and lead to easier adoption.
> > > > >
> > > > > Essentially, people would implement a 'Constraint' interface for checking keys before they are put into a table. Puts that are valid get written to the table, but if not, people can throw an exception that gets propagated back to the client explaining why the put was invalid.
> > > > >
> > > > > Constraints would be set on a per-table basis and the user would be expected to ensure the jars containing the constraint are present on the machines serving that table.
> > > > >
> > > > > Yes, people could roll their own mechanism for doing this via coprocessors each time, but this would make it easier to do so, so you only have to implement a very minimal interface and not worry about the specifics.
> > > > >
> > > > > If people are interested, I would like to open a Jira on the feature. I've got a basic implementation, but would like to expand it to be a more integrated, top-level element of the code. I just don't want to waste my time doing a full-blown impl and then not have at least general consensus on it being a good feature.
> > > > >
> > > > > One of the complaints I commonly hear about HBase is that, to outsiders, it is difficult to figure out and use (though once you do, it's solid). This would be a step to make it easier to use and adopt.
> > > > >
> > > > > Thanks,
> > > > > Jesse Yates
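
For anyone skimming the thread, here is a rough sketch of the put-time check proposed in the original message. It is not the code attached to HBASE-4605 and it does not use the real HBase or coprocessor APIs: `Constraint`, `ConstraintException`, and `ConstrainedTable` are hypothetical names, the exception is unchecked only to keep the sketch short, and the "table" is an in-memory map standing in for a region server. It only illustrates the flow: every put is run through the table's constraints, valid puts are written, and an invalid put raises an exception that carries the reason back to the caller.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical exception carrying the reason a put was rejected. */
class ConstraintException extends RuntimeException {
    ConstraintException(String message) {
        super(message);
    }
}

/** Hypothetical minimal interface a user would implement. */
interface Constraint {
    /** Throws ConstraintException if the put must not be written. */
    void check(String rowKey, Map<String, String> columns);
}

/** In-memory stand-in for a table with per-table constraints (an assumption). */
class ConstrainedTable {
    private final List<Constraint> constraints = new ArrayList<>();
    private final Map<String, Map<String, String>> rows = new LinkedHashMap<>();

    public void addConstraint(Constraint constraint) {
        constraints.add(constraint);
    }

    /** Runs every constraint first; the row is stored only if all of them pass. */
    public void put(String rowKey, Map<String, String> columns) {
        for (Constraint constraint : constraints) {
            constraint.check(rowKey, columns);
        }
        rows.put(rowKey, new LinkedHashMap<>(columns));
    }

    public boolean contains(String rowKey) {
        return rows.containsKey(rowKey);
    }
}

class ConstraintSketch {
    public static void main(String[] args) {
        ConstrainedTable table = new ConstrainedTable();

        // Example constraint: the "age" column, when present, must be a non-negative integer.
        table.addConstraint((rowKey, columns) -> {
            String age = columns.get("age");
            if (age != null && !age.matches("\\d+")) {
                throw new ConstraintException("age must be a non-negative integer, got: " + age);
            }
        });

        Map<String, String> valid = new LinkedHashMap<>();
        valid.put("age", "42");
        table.put("row1", valid); // passes the check and is stored

        Map<String, String> invalid = new LinkedHashMap<>();
        invalid.put("age", "-1");
        try {
            table.put("row2", invalid); // rejected before anything is written
        } catch (ConstraintException e) {
            System.out.println("put rejected: " + e.getMessage());
        }
    }
}
```

In a real deployment the check would run server side (for example from a coprocessor hook on the put path), and the constraint classes would need to be on the classpath of the machines serving the table, as noted in the proposal.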
