[
https://issues.apache.org/jira/browse/KUDU-2638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16725276#comment-16725276
]
Adar Dembo commented on KUDU-2638:
----------------------------------
Thank you for the log.
Although each server has 12 disks, it would seem that Kudu is configured to use
just one for its data directories:
bq. --fs_data_dirs=/data1/data/kudu/tserver-new
This will have a dramatic impact on overall Kudu performance. Firstly, Kudu
will only bootstrap one tablet at a time (see the documentation for
{{--num_tablets_to_open_simultaneously}}), which helps explain why your tablets
take so long to bootstrap. Secondly, your overall disk bandwidth is very low,
so maintenance manager flush/compact operations are much slower than they
otherwise would be.
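As a rough illustration of spreading data directories across all disks, the sketch below builds a comma-separated {{--fs_data_dirs}} value with one directory per disk. It assumes the 12 disks are mounted at /data1 through /data12; only /data1 appears in the log, so the other mount points and the helper name {{build_fs_data_dirs}} are hypothetical. It only constructs the flag string and does not cover how to apply a directory change to an existing tserver.

```python
def build_fs_data_dirs(mount_points, subdir):
    """Return a --fs_data_dirs flag with one data directory per mount point."""
    # Normalize slashes so "/data1/" and "/data1" produce the same path.
    dirs = [f"{mount.rstrip('/')}/{subdir.strip('/')}" for mount in mount_points]
    # The flag takes a comma-separated list of directories.
    return "--fs_data_dirs=" + ",".join(dirs)


# Hypothetical mount points: only /data1 is confirmed by the log.
mounts = [f"/data{i}" for i in range(1, 13)]
flag = build_fs_data_dirs(mounts, "data/kudu/tserver-new")
print(flag)
```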
If you upgrade to Kudu 1.7 or 1.8 and rebuild your tservers (one at a time),
Kudu's metadata will be stored on the same disk as the WALs rather than the
first data directory. In your case, with only one data directory, having the
metadata colocated with all of Kudu's data is going to make all flush/compact
operations slower (as they need to rewrite the tablet superblocks).
Another thing that stands out to me is the relative size of each Kudu data
block:
{quote}
1 data directories: /data1/data/kudu/tserver-new/data
Total live blocks: 19299871
Total live bytes: 102086799764
Total live bytes (after alignment): 176281313280
Total number of LBM containers: 226 (17 full)
{quote}
This works out to roughly 5 KB per data block (about 9 KB after alignment).
Ideally data blocks would be larger, closer to 1 MB each. Having so many small
data blocks means more overhead elsewhere in the system.
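The per-block figure can be checked directly from the numbers quoted above. This is a minimal sketch; the variable and function names are illustrative, not Kudu identifiers.

```python
# Values copied from the quoted block manager summary.
live_blocks = 19299871
live_bytes = 102086799764
aligned_bytes = 176281313280


def avg_block_size(total_bytes, block_count):
    """Average number of bytes per data block."""
    return total_bytes / block_count


avg_live = avg_block_size(live_bytes, live_blocks)
avg_aligned = avg_block_size(aligned_bytes, live_blocks)

# Compare against the ~1 MB per block suggested as a better target.
target = 1024 * 1024
print(f"average live block size:    {avg_live:.0f} bytes")
print(f"average aligned block size: {avg_aligned:.0f} bytes")
print(f"target / actual ratio:      {target / avg_live:.0f}x")
```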
Finally, as you pointed out, the number of delta compaction operations is quite
high, as is the number of DMS flushes. What kind of workload is this? It seems
to be dominated by UPDATEs, which isn't optimal for Kudu.
> kudu cluster restart very long time to reused
> ---------------------------------------------
>
> Key: KUDU-2638
> URL: https://issues.apache.org/jira/browse/KUDU-2638
> Project: Kudu
> Issue Type: Improvement
> Reporter: jiaqiyang
> Priority: Major
> Fix For: n/a
>
> Attachments: kudu16.tc.tablet.png, tserverLog.tar.gz
>
>
> When I restart my Kudu cluster, all tablets become unavailable.
> Running kudu cluster ksck shows:
> Table Summary
>
> Name | Status      | Total Tablets | Healthy | Under-replicated | Unavailable
> -----+-------------+---------------+---------+------------------+------------
> t1   | HEALTHY     | 1             | 1       | 0                | 0
> t2   | UNAVAILABLE | 5             | 0       | 1                | 4
> t3   | UNAVAILABLE | 6             | 2       | 0                | 4
> t3   | UNAVAILABLE | 3             | 0       | 0                | 3