I'm new to Kudu, but we are also going to use Impala mostly with Kudu. We have a few tables that are small but heavily used. My plan is to replicate them more than 3 times. When you create a Kudu table, you can specify the number of replicas (3 by default), and I guess you can set it to a number matching the node count of your cluster. The downside is that you cannot change that number without recreating the table.
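For what it's worth, here is a rough sketch of how the replica count could be set at creation time through Impala, using the `kudu.num_tablet_replicas` table property. The helper name, the table name, and the columns are all made up for illustration, and the odd-number check reflects my understanding that Kudu expects an odd replication factor. The cluster may also cap the maximum value, so treat this as a starting point rather than a tested recipe.

```python
def kudu_create_table_ddl(table, columns, primary_key, hash_partitions, replicas=3):
    """Build an Impala CREATE TABLE statement for a Kudu table.

    Hypothetical helper: it only assembles the DDL string and does not
    talk to Impala or Kudu.
    """
    # Assumption: Kudu expects an odd replication factor (1, 3, 5, ...).
    if replicas < 1 or replicas % 2 == 0:
        raise ValueError("replication factor should be a positive odd number")

    column_defs = ",\n  ".join(f"{name} {ctype}" for name, ctype in columns)
    key_list = ", ".join(primary_key)
    return (
        f"CREATE TABLE {table} (\n"
        f"  {column_defs},\n"
        f"  PRIMARY KEY ({key_list})\n"
        f")\n"
        f"PARTITION BY HASH ({key_list}) PARTITIONS {hash_partitions}\n"
        f"STORED AS KUDU\n"
        f"TBLPROPERTIES ('kudu.num_tablet_replicas' = '{replicas}');"
    )


if __name__ == "__main__":
    # Made-up small lookup table, replicated 5 times instead of the default 3.
    print(
        kudu_create_table_ddl(
            table="dim_country",
            columns=[("country_id", "INT"), ("country_name", "STRING")],
            primary_key=["country_id"],
            hash_partitions=2,
            replicas=5,
        )
    )
```

The generated statement can be pasted into impala-shell. Since the value is fixed when the table is created, picking a different number later means recreating the table, as noted above.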

On Fri, Mar 16, 2018 at 10:42 AM, Cliff Resnick <[email protected]> wrote:

> We will soon be moving our analytics from AWS Redshift to Impala/Kudu. One
> Redshift feature that we will miss is its ALL Distribution, where a copy of
> a table is maintained on each server. We define a number of metadata tables
> this way since they are used in nearly every query. We are considering
> using parquet in HDFS cache for these, and Kudu would be a much better fit
> for the update semantics but we are worried about the additional
> contention. I'm wondering if having a Broadcast, or ALL, tablet
> replication might be an easy feature to add to Kudu?
>
> -Cliff
>
