I'd expect performance to be slightly better with separate tables than with locality groups, because managing locality groups, while relatively cheap, is not entirely free.
Namespaces work like a table prefix, but also provide a means to easily configure all the tables in a namespace at once. So, they're either slightly or significantly better than a table prefix, depending on your needs.

--
Christopher L Tubbs II
http://gravatar.com/ctubbsii

On Mon, Aug 17, 2015 at 3:01 PM, z11373 <[email protected]> wrote:
> Thanks Christopher for valuable insight.
> Right now we don't have scenario which it needs to query data from multiple
> customers at once. Perhaps some time in the future, and that 'future' seems
> could be years from now (or perhaps never), so I think I am inclined to
> implement them as separate tables for now.
>
> Though they are in separate tables, I will still apply visibility column for
> each row in the table. The visibility string could be something like
> customer id. The caller will be another app of ours, so we can trust it
> (still need to pass that customer id as authz string).
>
> In term of scan performance, is it true that if we shard by column family or
> different table, it won't matter much since I'd think we also can create
> separate locality group for different column family)?
>
> Thanks for the tips on using namespace, originally I'd think of using prefix
> the table names with customer id. I guess they are no difference, right?
>
> Thanks,
> Z
>
> --
> View this message in context:
> http://apache-accumulo.1065345.n5.nabble.com/sharding-via-different-tables-tp14884p14893.html
> Sent from the Developers mailing list archive at Nabble.com.
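
For readers comparing the two layouts discussed in this thread, here is a minimal sketch of how each one might be named. It deliberately avoids the Accumulo client library so it runs on its own. The class `ShardingLayout`, its method names, and the `lg_` group prefix are all hypothetical and do not come from the thread. The only Accumulo facts it relies on are that a table inside a namespace is addressed as `namespace.table`, and that locality groups are described as a map from group name to a set of column families.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper for illustration; none of these names appear in the thread.
final class ShardingLayout {

    // Layout A: one table per customer inside a shared namespace.
    // Accumulo addresses such a table as "<namespace>.<table>".
    static String tableFor(String namespace, String customerId) {
        return namespace + "." + customerId;
    }

    // Layout B: one shared table, with one column family and one
    // locality group per customer. The result maps group name to
    // column families. Accumulo's TableOperations.setLocalityGroups
    // takes a map of this shape, with Set<Text> in place of Set<String>.
    static Map<String, Set<String>> localityGroupsFor(List<String> customerIds) {
        Map<String, Set<String>> groups = new LinkedHashMap<>();
        for (String id : customerIds) {
            Set<String> families = new TreeSet<>();
            families.add(id);              // column family named after the customer
            groups.put("lg_" + id, families); // assumed naming scheme for the group
        }
        return groups;
    }
}
```

With layout A, properties set on the namespace apply to every customer table in it. With layout B, the locality group map has to be updated whenever a customer is added.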
