Hi Team,

This is regarding a performance issue my team and I are facing with a large
data load in Kudu. We are hoping you can guide us toward a solution to the
concerns described below.

We are loading 212 million records into Kudu. Currently, when loading
through Impala, query processing and loading take 47 seconds overall. We
used the default configurations in Kudu and Impala on a 6-node cluster to
get these numbers. We have achieved the performance we expected at the Kudu
level; however, in Impala we haven't reached the performance we expected.

What can we do to reduce the time spent loading these 212 million records
through Impala from 47 seconds to 10 seconds?
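
To put the target in perspective, here is a rough sketch of the throughput
these numbers imply. It assumes each of the 212 million items is a single
record; the constant and function names are only illustrative.

```python
# Figures taken from this message. Treating each "data load" as one
# record is an assumption made for the calculation.
RECORDS = 212_000_000
CURRENT_SECONDS = 47
TARGET_SECONDS = 10


def throughput(records, seconds):
    """Return the average number of records processed per second."""
    return records / seconds


current_rate = throughput(RECORDS, CURRENT_SECONDS)
target_rate = throughput(RECORDS, TARGET_SECONDS)
speedup = CURRENT_SECONDS / TARGET_SECONDS

print(f"current: {current_rate:,.0f} records/s")
print(f"target:  {target_rate:,.0f} records/s")
print(f"required speedup: {speedup:.1f}x")
```

That is roughly 4.5 million records per second today against about 21.2
million per second for the 10-second target, a 4.7x speedup.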

We would be much obliged if you could provide us with some solutions.

Thank You!

Best Regards
Sumudu Madushanka
