We will soon be moving our analytics from AWS Redshift to Impala/Kudu. One Redshift feature that we will miss is its ALL distribution style, where a full copy of a table is maintained on every node. We define a number of metadata tables this way, since they are used in nearly every query. We are considering storing these as Parquet files in the HDFS cache. Kudu would be a much better fit for the update semantics, but we are worried about the additional contention. Would a broadcast, or ALL, tablet replication mode be an easy feature to add to Kudu?
-Cliff
