In cases like this, having multiple partitions often doesn't provide any
advantage.  Each tablet carries fixed-cost overhead, so if the tablets are
small, those costs can outweigh the benefit.  Additionally, if you aren't
actively writing to the table, there is no write parallelism to gain from
extra partitions.  As for join performance, that's usually dominated by the
join strategy the SQL engine uses, and by whether the engine can take
advantage of the size disparity between the tables.
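
To put rough numbers on the fixed-cost point, here is a small Python sketch
that counts the tablet replicas the 250 small tables would create.  It
assumes a replication factor of 3 (Kudu's default; yours may differ), and
the helper name tablet_replicas is made up for illustration.  It only
counts tablets and says nothing about how much each one actually costs.

```python
# Back-of-the-envelope tablet count for the small tables in the question.
# Assumption: replication factor 3 (Kudu's default); adjust if yours differs.

def tablet_replicas(tables: int, partitions_per_table: int,
                    replication_factor: int = 3) -> int:
    """Total tablet replicas the cluster must host for these tables."""
    return tables * partitions_per_table * replication_factor

small_tables = 250  # from the question: 250 of the 300 tables are small

three_partitions = tablet_replicas(small_tables, 3)  # 3 partitions per table
one_partition = tablet_replicas(small_tables, 1)     # a single partition per table

print(three_partitions)  # 2250
print(one_partition)     # 750
```

Under these assumptions, three partitions per table triples the number of
tablets the cluster has to manage without making any of them meaningfully
busier.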

- Dan

On Mon, Oct 15, 2018 at 10:44 AM Boris Tyukin <bo...@boristyukin.com> wrote:

> Out of 300 tables I need to ingest into Kudu, 250 are really small: fewer
> than 500k rows, and each will fit in a single 1 GB partition. Does it still
> make sense to create 3 partitions, or to have no partitions at all?
>
> Some of these tables are frequently joined to very large 1-10B row
> tables...
>
> Thanks,
> Boris
>
