On Fri, Oct 1, 2010 at 1:54 PM, Ryan Rawson <ryano...@gmail.com> wrote:
> I think it's tricky, because we don't really encourage people to think
> of regions, but think of rows instead. The fact that regions exist is
> a bit of an implementation detail, although like indexes in databases
> a critical and crucial one that we can't really ignore in the end.
> Requiring people to specify regions by start/stop row but without
> giving them a lot of support to discover and identify regions might
> not be such a good idea. Especially if we allow people to specify
> arbitrary row ranges without regard to regions, and the rest is 'just
> implementation'... it might be better to stick to something like a
> pair of row keys or something.
>

Thanks for the reply. The comparison to indexes is interesting, since DBs often allow you to specify hints to the query planner to force use of specific indexes, but it's definitely not general usage.

We could provide a region-based interface, with helpers to retrieve the appropriate region lists, but it's more complexity and I think maybe I'm over-anticipating future use cases. And having clients handle region names directly seems bound to lead to future headaches when region names change due to splits, etc.

I like the idea of cutting back to just start and end row keys, with clarified documentation of how they're used. So the interface would be something like:

    exec(Class protocol, byte[] start, byte[] end, Call method)

I think dropping the List<Row> and RowRange args simplifies things and eliminates some of the row-oriented expectations they raise. You lose the ability to invoke on disjoint regions in a single call, but that could be re-evaluated and added as a future change if it turns out to be important.

Anyone have objections or see something I'm missing? If not, I'll update the current patch with this approach.
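
To make the semantics concrete, here is a rough, self-contained Java sketch of how a range-based exec might pick regions. Only the argument order of exec comes from the proposal above. Everything else is an assumption made for illustration: the ToyTable, Region, and RowCountProtocol names, the shape of the Call callback, treating the end key as exclusive, and treating an empty key as unbounded. It is a toy model, not the patch's implementation.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class RangeExecSketch {
    /** Hypothetical callback standing in for the "Call method" argument. */
    interface Call<T, R> {
        R call(T instance);
    }

    /** Hypothetical protocol that each toy region exposes. */
    interface RowCountProtocol {
        long rowCount();
    }

    /** Toy region covering [startKey, endKey); an empty key means unbounded. */
    static final class Region implements RowCountProtocol {
        final byte[] startKey;
        final byte[] endKey;
        final long rows;

        Region(String startKey, String endKey, long rows) {
            this.startKey = bytes(startKey);
            this.endKey = bytes(endKey);
            this.rows = rows;
        }

        @Override
        public long rowCount() {
            return rows;
        }

        /** True if this region intersects the requested [start, end) range. */
        boolean overlaps(byte[] start, byte[] end) {
            boolean rangeStartsBeforeRegionEnd =
                endKey.length == 0 || compare(start, endKey) < 0;
            boolean regionStartsBeforeRangeEnd =
                end.length == 0 || compare(startKey, end) < 0;
            return rangeStartsBeforeRegionEnd && regionStartsBeforeRangeEnd;
        }
    }

    /** Toy table: an ordered list of regions, with no real RPC involved. */
    static final class ToyTable {
        private final List<Region> regions;

        ToyTable(List<Region> regions) {
            this.regions = regions;
        }

        /** Invokes the callback once per region overlapping [start, end). */
        <T, R> List<R> exec(Class<T> protocol, byte[] start, byte[] end,
                            Call<T, R> method) {
            List<R> results = new ArrayList<>();
            for (Region region : regions) {
                if (region.overlaps(start, end) && protocol.isInstance(region)) {
                    results.add(method.call(protocol.cast(region)));
                }
            }
            return results;
        }
    }

    /** Unsigned lexicographic comparison of two row keys. */
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    /** Three made-up regions: ["", "g"), ["g", "p"), ["p", ""). */
    static ToyTable demoTable() {
        List<Region> regions = new ArrayList<>();
        regions.add(new Region("", "g", 10));
        regions.add(new Region("g", "p", 20));
        regions.add(new Region("p", "", 30));
        return new ToyTable(regions);
    }

    public static void main(String[] args) {
        List<Long> counts = demoTable().exec(
            RowCountProtocol.class, bytes("b"), bytes("n"),
            RowCountProtocol::rowCount);
        System.out.println(counts);
    }
}
```

The caller only ever names a start and end row key. Which regions get invoked falls out of the overlap check, so a split that changes region names or boundaries would not change the calling code, only the number of results that come back.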