Thanks for replying, Adar. I did some math, and in our case we are hitting
another Kudu limit: 60 tablets per node. We use high-density nodes with 2
24-core CPUs, so we have 88 hyperthreaded cores per node, or 88*24=2112
cores total across our 24 nodes. But I cannot create more than 60*24=1440
tablets per table. It looks like the tablets for my largest table will be
around 8-10 GB each. Should I be worried, since the recommendation is to
keep tablets at about 1 GB in size?
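
For reference, the arithmetic above can be reproduced with a short Python
sketch. The function name and the 12960 GB table size are hypothetical,
chosen only so that the result lands inside the 8-10 GB range mentioned
above; the node count, thread count, and per-node tablet limit are the
figures from this message.

```python
def tablet_sizing(nodes, threads_per_node, max_tablets_per_node, table_size_gb):
    """Return (total threads, max tablets per table, GB per tablet).

    Hypothetical helper for illustration only; it is not part of Kudu or Impala.
    """
    total_threads = nodes * threads_per_node        # cluster-wide hyperthreads
    max_tablets = nodes * max_tablets_per_node      # per-table tablet ceiling
    tablet_size_gb = table_size_gb / max_tablets    # average size of one tablet
    return total_threads, max_tablets, tablet_size_gb


# 24 nodes, 88 hyperthreads each, 60 tablets per node.
# table_size_gb=12960 is an assumed value, not a measured one.
threads, tablets, size_gb = tablet_sizing(24, 88, 60, 12960)
print(threads, tablets, size_gb)
```

With these inputs the tablet ceiling (1440) stays below the thread count
(2112), so the per-node limit, not the core count, is what bounds the tablet
count here.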

On Wed, Oct 17, 2018 at 8:06 PM Adar Lieber-Dembo <a...@cloudera.com> wrote:

> Hi Boris,
>
> > Also, when they say tablets - I assume this is before replication? so in
> reality, it is number of nodes x cpu cores / replication factor? If this is
> the case, it is not looking good...
>
> No, I think this is post-replication. The underlying assumption is
> that you want to maximize parallelism for large tables, and since
> Impala only uses one read thread per tablet, that means ensuring the
> number of tablets is close or equal to the overall number of cores.
> However, during a scan Impala will choose one of the tablet's replicas
> to read from, so you don't need to "reserve" a core for the other
> replicas.
>
> >> can someone clarify if this recommendation below - does it mean
> physical or hyper-threaded CPU cores? quite a big difference...
>
> I think this refers to hyper-threaded CPU cores (i.e. a CPU unit
> capable of executing an OS thread). But I'd be curious to hear if your
> workload is substantially more or less performant either way.
>
