[jira] [Commented] (KUDU-2638) kudu cluster restart very long time to reused

2018-12-12 Thread jiaqiyang (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719839#comment-16719839
 ] 

jiaqiyang commented on KUDU-2638:
-

From the source code, I see this comment on MaintenanceManager::FindBestOp():
// - If there's an Op that we can run quickly that frees log retention, we run it.
// - If we've hit the overall process memory limit (note: this includes memory
//   that the Ops cannot free), we run the Op with the highest RAM usage.
// - If there are Ops that are retaining logs past our target replay size, we
//   run the one that has the highest retention (and if many qualify, then we
//   run the one that also frees up the most RAM).
// - Finally, if there's nothing else that we really need to do, we run the Op
//   that will improve performance the most.

I think the op selection falls through to the last rule: "Finally, if there's
nothing else that we really need to do, we run the Op that will improve
performance the most."

If this is true, then restarting the cluster when there are many delta data
files will take a very long time before the tables become available.
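
The priority order quoted above can be sketched roughly as follows. This is only an illustration of the four rules as the comment states them, not Kudu's actual implementation: the `OpStats` class, its field names, and the tie-breaking choices are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class OpStats:
    """Hypothetical per-op statistics; the field names are illustrative only."""
    name: str
    ram_anchored: int = 0            # bytes of RAM that running the op would free
    logs_retained: int = 0           # bytes of WAL the op currently holds back
    perf_improvement: float = 0.0    # unitless score, higher is better
    quick_log_release: bool = False  # cheap to run and frees log retention


def find_best_op(ops: List[OpStats], under_memory_pressure: bool,
                 target_replay_size: int) -> Optional[OpStats]:
    # Rule 1: a quick op that frees log retention always wins.
    # (Picking the one with the most retained log is an assumption.)
    quick = [op for op in ops if op.quick_log_release and op.logs_retained > 0]
    if quick:
        return max(quick, key=lambda op: op.logs_retained)

    # Rule 2: under memory pressure, run the op with the highest RAM usage.
    # (Falling through when no op anchors RAM is an assumption.)
    if under_memory_pressure:
        ram_ops = [op for op in ops if op.ram_anchored > 0]
        if ram_ops:
            return max(ram_ops, key=lambda op: op.ram_anchored)

    # Rule 3: ops retaining logs past the target replay size; highest
    # retention first, with freed RAM as the tie-breaker.
    over = [op for op in ops if op.logs_retained > target_replay_size]
    if over:
        return max(over, key=lambda op: (op.logs_retained, op.ram_anchored))

    # Rule 4: nothing urgent, so pick the best performance improvement.
    perf_ops = [op for op in ops if op.perf_improvement > 0]
    if perf_ops:
        return max(perf_ops, key=lambda op: op.perf_improvement)
    return None
```

In this sketch, a compaction op is selected only when there is no quick log-releasing op, no memory pressure, and no op retaining logs past the target, which is the situation the comment above suspects.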

> kudu cluster restart very long time to reused
> ---------------------------------------------
>
> Key: KUDU-2638
> URL: https://issues.apache.org/jira/browse/KUDU-2638
> Project: Kudu
> Issue Type: Improvement
> Reporter: jiaqiyang
> Priority: Major
>
> When I restart my Kudu cluster, all tablets are unavailable.
> Running kudu cluster ksck shows:
> Table Summary
> Name | Status      | Total Tablets | Healthy | Under-replicated | Unavailable
> -----+-------------+---------------+---------+------------------+------------
> t1   | HEALTHY     | 1             | 1       | 0                | 0
> t2   | UNAVAILABLE | 5             | 0       | 1                | 4
> t3   | UNAVAILABLE | 6             | 2       | 0                | 4
> t3   | UNAVAILABLE | 3             | 0       | 0                | 3



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (KUDU-2638) kudu cluster restart very long time to reused

2018-12-12 Thread jiaqiyang (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719815#comment-16719815
 ] 

jiaqiyang edited comment on KUDU-2638 at 12/13/18 7:00 AM:
---

{code:java}
I1121 17:04:53.100796 165214 ts_tablet_manager.cc:909] T 5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: Loading tablet metadata
I1121 17:07:06.116400 165214 ts_tablet_manager.cc:1082] T 5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: Registered tablet (data state: TABLET_DATA_READY)
I1121 17:15:29.870625 168167 ts_tablet_manager.cc:932] T 5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: Bootstrapping tablet
I1121 17:15:29.870635 168167 tablet_bootstrap.cc:437] T 5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: Bootstrap starting.
I1121 17:16:57.754650 168167 tablet_bootstrap.cc:616] T 5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: Time spent opening tablet: real 87.881s user 0.908s sys 0.340s
I1121 17:16:59.455792 168167 log.cc:644] T 5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: Max segment size reached. Starting new segment allocation
I1121 17:16:59.455893 168167 tablet_bootstrap.cc:437] T 5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: Bootstrap replayed 1/14 log segments. Stats: ops{read=1614 overwritten=0 applied=1613 ignored=1476} inserts{seen=65 ignored=0} mutations{seen=423 ignored=0} orphaned_commits=1. Pending: 1 replicates
I1121 17:16:59.456018 168167 log.cc:571] T 5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: Rolled over to a new log segment at /data/data/kudu/tserver-new/wals/5aae5dc9e6f4468aaf00c060152d4fed/wal-2
I1121 17:17:02.018604 168167 log.cc:644] T 5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: Max segment size reached. Starting new segment allocation
I1121 17:17:02.018836 168167 log.cc:571] T 5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: Rolled over to a new log segment at /data/data/kudu/tserver-new/wals/5aae5dc9e6f4468aaf00c060152d4fed/wal-3
I1121 17:17:02.018995 168167 tablet_bootstrap.cc:437] T 5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: Bootstrap replayed 2/14 log segments. Stats: ops{read=3892 overwritten=0 applied=3891 ignored=3256} inserts{seen=718 ignored=0} mutations{seen=2327 ignored=0} orphaned_commits=1. Pending: 1 replicates
I1121 17:17:03.023664 168167 tablet_bootstrap.cc:437] T 5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: Bootstrap replaying log segment 3/14 (2.46M/7.99M this segment, stats: ops{read=4487 overwritten=0 applied=4487 ignored=3705} inserts{seen=881 ignored=0} mutations{seen=2898 ignored=0} orphaned_commits=1)
I1121 17:17:04.08 168167 log.cc:644] T 5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: Max segment size reached. Starting new segment allocation
I1121 17:17:04.889019 168167 log.cc:571] T 5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: Rolled over to a new log segment at /data/data/kudu/tserver-new/wals/5aae5dc9e6f4468aaf00c060152d4fed/wal-4
I1121 17:17:04.889173 168167 tablet_bootstrap.cc:437] T 5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: Bootstrap replayed 3/14 log segments. Stats: ops{read=5397 overwritten=0 applied=5396 ignored=4259} inserts{seen=1392 ignored=0} mutations{seen=4399 ignored=0} orphaned_commits=1. Pending: 1 replicates
I1121 17:17:07.373458 168167 log.cc:644] T 5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: Max segment size reached. Starting new segment allocation
I1121 17:17:07.373601 168167 tablet_bootstrap.cc:437] T 5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: Bootstrap replayed 4/14 log segments. Stats: ops{read=7365 overwritten=0 applied=7364 ignored=5769} inserts{seen=1779 ignored=0} mutations{seen=6078 ignored=0} orphaned_commits=1. Pending: 1 replicates
I1121 17:17:07.373723 168167 log.cc:571] T 5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: Rolled over to a new log segment at /data/data/kudu/tserver-new/wals/5aae5dc9e6f4468aaf00c060152d4fed/wal-5
I1121 17:17:08.071877 168167 tablet_bootstrap.cc:437] T 5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: Bootstrap replaying log segment 5/14 (2.36M/8.00M this segment, stats: ops{read=7680 overwritten=0 applied=7680 ignored=5940} inserts{seen=1972 ignored=0} mutations{seen=6778 ignored=0} orphaned_commits=1)
I1121 17:17:09.348376 168167 log.cc:644] T 5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: Max segment size reached. Starting new segment allocation
I1121 17:17:09.348531 168167 tablet_bootstrap.cc:437] T 

[jira] [Comment Edited] (KUDU-2638) kudu cluster restart very long time to reused

2018-12-12 Thread jiaqiyang (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719815#comment-16719815
 ] 

jiaqiyang edited comment on KUDU-2638 at 12/13/18 6:50 AM:
---


[jira] [Comment Edited] (KUDU-2638) kudu cluster restart very long time to reused

2018-12-12 Thread jiaqiyang (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719815#comment-16719815
 ] 

jiaqiyang edited comment on KUDU-2638 at 12/13/18 6:45 AM:
---

{code:java}
// code placeholder


{code}



[jira] [Commented] (KUDU-2638) kudu cluster restart very long time to reused

2018-12-12 Thread jiaqiyang (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719818#comment-16719818
 ] 

jiaqiyang commented on KUDU-2638:
-

See the log in my previous comment.

That log covers a single tablet, 5aae5dc9e6f4468aaf00c060152d4fed, on one tserver.

From the full log, I found that it took 7 hours for all the tablets on that
tserver to become available.

If I stop compaction, will the tablets become available sooner?
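
To see where the startup time goes for one tablet, the gaps between consecutive log entries can be computed from the glog timestamps. The sketch below is a minimal example under stated assumptions: `parse_line` and `phase_gaps` are hypothetical helper names, the regular expression only covers the glog prefix format seen in the log in this thread, and lines that do not match it (wrapped or garbled ones) are skipped.

```python
import re
from datetime import datetime

# Matches the glog prefix "I1121 17:04:53.100796 165214 file.cc:909] message".
GLOG_RE = re.compile(
    r"^[IWEF](\d{4}) (\d{2}:\d{2}:\d{2}\.\d{6}) +\d+ ([\w.]+):(\d+)\] (.*)$")


def parse_line(line):
    """Return (timestamp, message) for a glog line, or None if it does not match."""
    m = GLOG_RE.match(line.strip())
    if m is None:
        return None
    # glog omits the year, so strptime falls back to its default year.
    ts = datetime.strptime(m.group(1) + " " + m.group(2), "%m%d %H:%M:%S.%f")
    return ts, m.group(5)


def phase_gaps(lines):
    """Return (previous message, message, seconds between them) for each pair."""
    events = [e for e in (parse_line(l) for l in lines) if e is not None]
    return [(prev_msg, msg, (ts - prev_ts).total_seconds())
            for (prev_ts, prev_msg), (ts, msg) in zip(events, events[1:])]


SAMPLE = [
    "I1121 17:04:53.100796 165214 ts_tablet_manager.cc:909] T 5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: Loading tablet metadata",
    "I1121 17:07:06.116400 165214 ts_tablet_manager.cc:1082] T 5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: Registered tablet (data state: TABLET_DATA_READY)",
    "I1121 17:15:29.870625 168167 ts_tablet_manager.cc:932] T 5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: Bootstrapping tablet",
    "I1121 17:16:57.754650 168167 tablet_bootstrap.cc:616] T 5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: Time spent opening tablet: real 87.881s user 0.908s sys 0.340s",
]

for prev_msg, msg, seconds in phase_gaps(SAMPLE):
    print(f"{seconds:10.3f}s  until: {msg.split(': ', 1)[-1]}")
```

On these four entries, the largest gap (about 504 seconds) falls between "Registered tablet" and "Bootstrapping tablet", while loading metadata takes about 133 seconds and opening the tablet about 88 seconds.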



[jira] [Commented] (KUDU-2638) kudu cluster restart very long time to reused

2018-12-12 Thread Adar Dembo (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719812#comment-16719812
 ] 

Adar Dembo commented on KUDU-2638:
--

Do you mean to say that, on bootstrap, your tablets undergo major delta 
compaction?

Can you attach a tserver log exhibiting the slow startup?




[jira] [Commented] (KUDU-2638) kudu cluster restart very long time to reused

2018-12-12 Thread jiaqiyang (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719815#comment-16719815
 ] 

jiaqiyang commented on KUDU-2638:
-

{code:java}
I1121 17:04:53.100796 165214 ts_tablet_manager.cc:909] T 
5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: Loading 
tablet metadata

I1121 17:07:06.116400 165214 ts_tablet_manager.cc:1082] T 
5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: Registered 
tablet (data state: TABLET_DATA_READY)

I1121 17:15:29.870625 168167 ts_tablet_manager.cc:932] T 
5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: 
Bootstrapping tablet

I1121 17:15:29.870635 168167 tablet_bootstrap.cc:437] T 
5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: Bootstrap 
starting.

I1121 17:16:57.754650 168167 tablet_bootstrap.cc:616] T 
5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: Time spent 
opening tablet: real 87.881s user 0.908s sys 0.340s

I1121 17:16:59.455792 168167 log.cc:644] T 5aae5dc9e6f4468aaf00c060152d4fed P 
510015b8e3d2462e9d52965cfa306af7: Max segment size reached. Starting new 
segment allocation

I1121 17:16:59.455893 168167 tablet_bootstrap.cc:437] T 
5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: Bootstrap 
replayed 1/14 log segments. Stats: ops{read=1614 overwritten=0 applied=1613 
ignored=1476} inserts{seen=65 ignored=0} mutations{seen=423 ignored=0} 
orphaned_commits=1. Pending: 1 replicates

I1121 17:16:59.456018 168167 log.cc:571] T 5aae5dc9e6f4468aaf00c060152d4fed P 
510015b8e3d2462e9d52965cfa306af7: Rolled over to a new log segment at 
/data/data/kudu/tserver-new/wals/5aae5dc9e6f4468aaf00c060152d4fed/wal-2

I1121 17:17:02.018604 168167 log.cc:644] T 5aae5dc9e6f4468aaf00c060152d4fed P 
510015b8e3d2462e9d52965cfa306af7: Max segment size reached. Starting new 
segment allocation

I1121 17:17:02.018836 168167 log.cc:571] T 5aae5dc9e6f4468aaf00c060152d4fed P 
510015b8e3d2462e9d52965cfa306af7: Rolled over to a new log segment at 
/data/data/kudu/tserver-new/wals/5aae5dc9e6f4468aaf00c060152d4fed/wal-3

I1121 17:17:02.018995 168167 tablet_bootstrap.cc:437] T 
5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: Bootstrap 
replayed 2/14 log segments. Stats: ops{read=3892 overwritten=0 applied=3891 
ignored=3256} inserts{seen=718 ignored=0} mutations{seen=2327 ignored=0} 
orphaned_commits=1. Pending: 1 replicates

I1121 17:17:03.023664 168167 tablet_bootstrap.cc:437] T 
5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: Bootstrap 
replaying log segment 3/14 (2.46M/7.99M this segment, stats: ops{read=4487 
overwritten=0 applied=4487 ignored=3705} inserts{seen=881 ignored=0} 
mutations{seen=2898 ignored=0} orphaned_commits=1)

I1121 17:17:04.08 168167 log.cc:644] T 5aae5dc9e6f4468aaf00c060152d4fed P 
510015b8e3d2462e9d52965cfa306af7: Max segment size reached. Starting new 
segment allocation

I1121 17:17:04.889019 168167 log.cc:571] T 5aae5dc9e6f4468aaf00c060152d4fed P 
510015b8e3d2462e9d52965cfa306af7: Rolled over to a new log segment at 
/data/data/kudu/tserver-new/wals/5aae5dc9e6f4468aaf00c060152d4fed/wal-4

I1121 17:17:04.889173 168167 tablet_bootstrap.cc:437] T 
5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: Bootstrap 
replayed 3/14 log segments. Stats: ops{read=5397 overwritten=0 applied=5396 
ignored=4259} inserts{seen=1392 ignored=0} mutations{seen=4399 ignored=0} 
orphaned_commits=1. Pending: 1 replicates

I1121 17:17:07.373458 168167 log.cc:644] T 5aae5dc9e6f4468aaf00c060152d4fed P 
510015b8e3d2462e9d52965cfa306af7: Max segment size reached. Starting new 
segment allocation

I1121 17:17:07.373601 168167 tablet_bootstrap.cc:437] T 
5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: Bootstrap 
replayed 4/14 log segments. Stats: ops{read=7365 overwritten=0 applied=7364 
ignored=5769} inserts{seen=1779 ignored=0} mutations{seen=6078 ignored=0} 
orphaned_commits=1. Pending: 1 replicates

I1121 17:17:07.373723 168167 log.cc:571] T 5aae5dc9e6f4468aaf00c060152d4fed P 
510015b8e3d2462e9d52965cfa306af7: Rolled over to a new log segment at 
/data/data/kudu/tserver-new/wals/5aae5dc9e6f4468aaf00c060152d4fed/wal-5

I1121 17:17:08.071877 168167 tablet_bootstrap.cc:437] T 
5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: Bootstrap 
replaying log segment 5/14 (2.36M/8.00M this segment, stats: ops{read=7680 
overwritten=0 applied=7680 ignored=5940} inserts{seen=1972 ignored=0} 
mutations{seen=6778 ignored=0} orphaned_commits=1)

I1121 17:17:09.348376 168167 log.cc:644] T 5aae5dc9e6f4468aaf00c060152d4fed P 
510015b8e3d2462e9d52965cfa306af7: Max segment size reached. Starting new 
segment allocation

I1121 17:17:09.348531 168167 tablet_bootstrap.cc:437] T 
5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: 

[jira] [Comment Edited] (KUDU-2638) kudu cluster restart very long time to reused

2018-12-12 Thread jiaqiyang (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719810#comment-16719810
 ] 

jiaqiyang edited comment on KUDU-2638 at 12/13/18 6:30 AM:
---

So the tablet lifecycle, from initialization to running, is very slow.

During bootstrap, there are many major compactions on the tserver.

How can we reduce the time it takes for the tablets to become available?




[jira] [Updated] (KUDU-2638) kudu cluster restart very long time to reused

2018-12-12 Thread jiaqiyang (JIRA)


 [ 
https://issues.apache.org/jira/browse/KUDU-2638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jiaqiyang updated KUDU-2638:

Description: 
When I restart my Kudu cluster, all tablets are unavailable.

Running kudu cluster ksck shows:

Table Summary
Name | Status      | Total Tablets | Healthy | Under-replicated | Unavailable
-----+-------------+---------------+---------+------------------+------------
t1   | HEALTHY     | 1             | 1       | 0                | 0
t2   | UNAVAILABLE | 5             | 0       | 1                | 4
t3   | UNAVAILABLE | 6             | 2       | 0                | 4
t3   | UNAVAILABLE | 3             | 0       | 0                | 3

  was:
When I restart my Kudu cluster, all tablets are unavailable.

Running kudu cluster ksck shows:

Table Summary
Name | Status      | Total Tablets | Healthy | Under-replicated | Unavailable
-----+-------------+---------------+---------+------------------+------------
     | HEALTHY     | 1             | 1       | 0                | 0
     | UNAVAILABLE | 5             | 0       | 1                | 4
     | UNAVAILABLE | 6             | 2       | 0                | 4
     | UNAVAILABLE | 3             | 0       | 0                | 3




[jira] [Created] (KUDU-2638) kudu cluster restart very long time to reused

2018-12-12 Thread jiaqiyang (JIRA)
jiaqiyang created KUDU-2638:
---

 Summary: kudu cluster restart very long time to reused
 Key: KUDU-2638
 URL: https://issues.apache.org/jira/browse/KUDU-2638
 Project: Kudu
  Issue Type: Improvement
Reporter: jiaqiyang


When I restart my Kudu cluster, all tablets are unavailable.

Running kudu cluster ksck shows:

Table Summary
Name | Status      | Total Tablets | Healthy | Under-replicated | Unavailable
-----+-------------+---------------+---------+------------------+------------
     | HEALTHY     | 1             | 1       | 0                | 0
     | UNAVAILABLE | 5             | 0       | 1                | 4
     | UNAVAILABLE | 6             | 2       | 0                | 4
     | UNAVAILABLE | 3             | 0       | 0                | 3





[jira] [Commented] (KUDU-2637) Add a note about leadership imbalance in the faq

2018-12-12 Thread Alexey Serbin (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719564#comment-16719564
 ] 

Alexey Serbin commented on KUDU-2637:
-

It probably makes sense to add a data-backed explanation to illustrate that 
the actual leader-related overhead is not that big.  Running a {{kudu perf loadgen}} 
workload against a small Kudu cluster (even with all tablet servers on the same 
node) and capturing metrics like {{cpu_utime}}, {{cpu_stime}}, and 
{{threads_running}}, plus a snapshot of the {{MALLOC}} section of the {{/memz}} 
page generated by the embedded web server, might be good enough.
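
A rough way to capture the named metrics before and after such a run is to save the server's metrics as JSON and compare the two snapshots. The sketch below makes several assumptions: that the metrics dump is a JSON list of entities, each with a "metrics" list of objects carrying "name" and "value" keys; that `pick_metrics` and `metric_diff` are hypothetical helper names; and that the numbers in the sample snapshots are made up for illustration, not measurements.

```python
import json

WANTED = {"cpu_utime", "cpu_stime", "threads_running"}


def pick_metrics(metrics_json, wanted=WANTED):
    """Extract the wanted metric values from a metrics dump (assumed shape)."""
    found = {}
    for entity in json.loads(metrics_json):
        for metric in entity.get("metrics", []):
            if metric.get("name") in wanted and "value" in metric:
                found[metric["name"]] = metric["value"]
    return found


def metric_diff(before, after):
    """Difference per metric between two snapshots taken around a workload."""
    return {name: after[name] - before[name] for name in before if name in after}


# Made-up sample snapshots in the assumed shape; values are illustrative only.
BEFORE = json.dumps([{"type": "server", "id": "kudu.tabletserver", "metrics": [
    {"name": "cpu_utime", "value": 1000},
    {"name": "cpu_stime", "value": 400},
    {"name": "threads_running", "value": 50},
    {"name": "unrelated_metric", "value": 7},
]}])
AFTER = json.dumps([{"type": "server", "id": "kudu.tabletserver", "metrics": [
    {"name": "cpu_utime", "value": 1600},
    {"name": "cpu_stime", "value": 550},
    {"name": "threads_running", "value": 52},
]}])

print(metric_diff(pick_metrics(BEFORE), pick_metrics(AFTER)))
```

Comparing such differences between a tablet server that hosts many leaders and one that hosts few would give the kind of data-backed illustration suggested here.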

> Add a note about leadership imbalance in the faq
> ------------------------------------------------
>
> Key: KUDU-2637
> URL: https://issues.apache.org/jira/browse/KUDU-2637
> Project: Kudu
> Issue Type: Improvement
> Components: documentation
> Reporter: Grant Henke
> Priority: Minor
>
> There have been a few questions on leadership imbalance and whether or not it 
> is important to monitor and fix. We should update the FAQ section to address 
> this. 





[jira] [Created] (KUDU-2637) Add a note about leadership imbalance in the faq

2018-12-12 Thread Grant Henke (JIRA)
Grant Henke created KUDU-2637:
-

 Summary: Add a note about leadership imbalance in the faq
 Key: KUDU-2637
 URL: https://issues.apache.org/jira/browse/KUDU-2637
 Project: Kudu
  Issue Type: Improvement
  Components: documentation
Reporter: Grant Henke


There have been a few questions on leadership imbalance and whether or not it 
is important to monitor and fix. We should update the FAQ section to address 
this. 





[jira] [Commented] (KUDU-1575) Backup and restore procedures

2018-12-12 Thread tim geary (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-1575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719129#comment-16719129
 ] 

tim geary commented on KUDU-1575:
-

Have we made any progress on implementing this since May, or do we have a 
target version?

Thanks

> Backup and restore procedures
> -----------------------------
>
> Key: KUDU-1575
> URL: https://issues.apache.org/jira/browse/KUDU-1575
> Project: Kudu
> Issue Type: Improvement
> Components: master, tserver
> Reporter: Mike Percy
> Assignee: Mike Percy
> Priority: Major
>
> Kudu needs backup and restore procedures, both for data and for metadata.





[jira] [Updated] (KUDU-2636) LBM supports deleting the full container which is dead after hole punch

2018-12-12 Thread HeLifu (JIRA)


 [ 
https://issues.apache.org/jira/browse/KUDU-2636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

HeLifu updated KUDU-2636:
-
External issue URL:   (was: https://issues.apache.org/jira/browse/KUDU-2014)

> LBM supports deleting the full container which is dead after hole punch
> -----------------------------------------------------------------------
>
> Key: KUDU-2636
> URL: https://issues.apache.org/jira/browse/KUDU-2636
> Project: Kudu
> Issue Type: Improvement
> Components: fs, util
> Affects Versions: 1.8.0
> Reporter: HeLifu
> Priority: Major
>
> Right now, the LBM does not support deleting the full container which is dead 
> after hole punching, and after running for some time, there will be lots of 
> dead containers that will affect the startup time. So, it is necessary to 
> delete these files while hole punching.





[jira] [Created] (KUDU-2636) LBM supports deleting the full container which is dead after hole punch

2018-12-12 Thread HeLifu (JIRA)
HeLifu created KUDU-2636:


 Summary: LBM supports deleting the full container which is dead 
after hole punch
 Key: KUDU-2636
 URL: https://issues.apache.org/jira/browse/KUDU-2636
 Project: Kudu
  Issue Type: Improvement
  Components: fs, util
Affects Versions: 1.8.0
Reporter: HeLifu


Right now, the LBM does not support deleting the full container which is dead 
after hole punching, and after running for some time, there will be lots of 
dead containers that will affect the startup time. So, it is necessary to 
delete these files while hole punching.


