The docs do a pretty good job of explaining why Kudu was created,
and there are tons of posts on the subject, including my post.

Mainly, Kudu gives you mutable data and faster seeks. So if your table
does not need to be updated in real time, and you are fine with batch
reprocessing and managing partitions, you will be happy with Impala/HDFS.
That is what we do in such cases: we use Hive for the processing, and
our users then query the results with Impala.
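
As a rough sketch of that rule of thumb (not anything official): the
function below just encodes the "mutable data or fast seeks means Kudu"
decision, and the SQL strings show what each choice tends to look like in
Impala. The table name `events`, the staging table, the column names and
the partition count are all made up for illustration, and the batch
statement could just as well be run from Hive.

```python
# Rule-of-thumb sketch; every table/column name below is a hypothetical example.

def choose_storage(needs_realtime_updates: bool, needs_fast_seeks: bool) -> str:
    """Return 'kudu' when rows must be mutable or looked up quickly, else 'hdfs'."""
    if needs_realtime_updates or needs_fast_seeks:
        return "kudu"
    return "hdfs"


# Impala SQL one might run for each choice (illustrative only).
EXAMPLE_SQL = {
    "kudu": [
        "CREATE TABLE events (id BIGINT, payload STRING, PRIMARY KEY (id)) "
        "PARTITION BY HASH (id) PARTITIONS 16 STORED AS KUDU",
        # Row-level change in place; no partition has to be rewritten.
        "UPSERT INTO events VALUES (42, 'updated')",
    ],
    "hdfs": [
        "CREATE TABLE events (id BIGINT, payload STRING) "
        "PARTITIONED BY (day STRING) STORED AS PARQUET",
        # Batch reprocessing: rebuild one whole partition at a time.
        "INSERT OVERWRITE TABLE events PARTITION (day='2019-12-14') "
        "SELECT id, payload FROM events_staging",
    ],
}

if __name__ == "__main__":
    choice = choose_storage(needs_realtime_updates=False, needs_fast_seeks=False)
    print(choice)
    for stmt in EXAMPLE_SQL[choice]:
        print(stmt + ";")
```

For a large table that is only ever appended to and reprocessed in batches,
both flags are false and the sketch lands on the partitioned Parquet layout.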

On Sat, Dec 14, 2019 at 8:53 AM l vic <> wrote:

> Naive question: when using of Impala/hdfs would be preferable over
> Impala/kudu? In particular: what would makes more sense for large tables
> (> 1TB)?
> Thanks...
