The problem is, AFAIK, that the replication count is not necessarily the distribution count, so you can't guarantee that all tablet servers will have a copy.
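To illustrate the point, here is a minimal toy sketch in Python. It is not Kudu's actual placement logic: the function names, the server names, and the random placement policy are all assumptions made up for illustration. It only shows that when the replication factor is smaller than the number of tablet servers, some servers necessarily hold no copy of a given tablet.

```python
import random


def place_replicas(tablet_servers, num_tablets, replication_factor, seed=0):
    """Toy model (hypothetical, not Kudu's real policy): each tablet gets
    `replication_factor` replicas on distinct, randomly chosen servers."""
    if replication_factor > len(tablet_servers):
        raise ValueError("replication factor cannot exceed the number of tablet servers")
    rng = random.Random(seed)
    return {
        tablet: set(rng.sample(tablet_servers, replication_factor))
        for tablet in range(num_tablets)
    }


def servers_with_full_copy(placement, tablet_servers):
    """Return the servers that hold a replica of every tablet of the table."""
    return [
        server
        for server in tablet_servers
        if all(server in replicas for replicas in placement.values())
    ]


# Hypothetical five-node cluster.
servers = ["ts1", "ts2", "ts3", "ts4", "ts5"]

# Single-tablet table with the default replication factor of 3:
# exactly three of the five servers hold the data.
default_rf = place_replicas(servers, num_tablets=1, replication_factor=3)
print(servers_with_full_copy(default_rf, servers))

# Replication factor equal to the server count: every server holds a copy.
full_rf = place_replicas(servers, num_tablets=1, replication_factor=len(servers))
print(servers_with_full_copy(full_rf, servers))
```

In this model a table only ends up on every server when the replication factor equals the server count, and that stops being true as soon as a node is added to the cluster.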

On Mar 16, 2018 1:41 PM, Boris Tyukin <bo...@boristyukin.com> wrote:

I'm new to Kudu, but we are also going to use Impala mostly with Kudu. We have a few tables that are small but used a lot. My plan is to replicate them more than 3 times. When you create a Kudu table, you can specify the number of replicated copies (3 by default), and I guess you can put a number there that corresponds to the node count in your cluster. The downside is that you cannot change that number unless you recreate the table.

On Fri, Mar 16, 2018 at 10:42 AM, Cliff Resnick <cre...@gmail.com> wrote:

We will soon be moving our analytics from AWS Redshift to Impala/Kudu. One Redshift feature that we will miss is its ALL distribution, where a copy of a table is maintained on each server. We define a number of metadata tables this way, since they are used in nearly every query. We are considering using Parquet in HDFS cache for these. Kudu would be a much better fit for the update semantics, but we are worried about the additional contention. I'm wondering whether a Broadcast, or ALL, tablet replication might be an easy feature to add to Kudu?

-Cliff