[jira] [Commented] (KUDU-2646) kudu restart the tablets stats from INITIALIZED change to RUNNING cost a few days

2018-12-22 Thread jiaqiyang (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16727818#comment-16727818
 ] 

jiaqiyang commented on KUDU-2646:
-

 

https://issues.apache.org/jira/browse/KUDU-2638

 

> kudu restart the tablets stats from INITIALIZED change to RUNNING cost a few 
> days
> -
>
> Key: KUDU-2646
> URL: https://issues.apache.org/jira/browse/KUDU-2646
> Project: Kudu
>  Issue Type: Bug
>Reporter: qinzl_1
>Priority: Major
> Fix For: n/a
>
> Attachments: kudu-tserver (1).INFO.gz
>
>
> [^kudu-tserver (1).INFO.gz] I installed Kudu from Cloudera Manager. I have 3 
> masters and 4 tablet servers, with no special configuration. When I restart 
> the servers, they cannot offer service. I found that all tablet servers are 
> INITIALIZED, and it takes a long time for them to change to RUNNING.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KUDU-2646) kudu restart the tablets stats from INITIALIZED change to RUNNING cost a few days

2018-12-22 Thread jiaqiyang (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16727819#comment-16727819
 ] 

jiaqiyang commented on KUDU-2646:
-

cat kudu-tserver-jira.info | grep ' Time spent bootstrapping tablet:' > a

awk '{sum+=$14} END {print sum}' a

26915.1 s

 

From your log: bootstrapping all of your tablets took 26915.1 s in total.
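
The grep/awk pipeline above can also be sketched in Python. This is only a
minimal sketch: it assumes each matching log line contains
"Time spent bootstrapping tablet: real <seconds>s", the same layout as the
"Time spent opening tablet: real 87.881s" lines in tserver logs. The function
name total_bootstrap_seconds and the sample lines are hypothetical and are not
taken from the attached log.

```python
import re

# Assumed log layout:
# "... Time spent bootstrapping tablet: real 87.881s user 0.908s sys 0.340s"
BOOTSTRAP_RE = re.compile(r"Time spent bootstrapping tablet: real ([0-9.]+)s")


def total_bootstrap_seconds(lines):
    """Sum the 'real' bootstrap time over all matching log lines."""
    total = 0.0
    for line in lines:
        match = BOOTSTRAP_RE.search(line)
        if match:
            total += float(match.group(1))
    return total


# Hypothetical sample lines, made up for illustration only.
sample = [
    "I1121 17:16:57.754650 168167 tablet_bootstrap.cc:616] T aaa P bbb: "
    "Time spent bootstrapping tablet: real 87.881s user 0.908s sys 0.340s",
    "I1121 17:18:00.000000 168167 tablet_bootstrap.cc:616] T ccc P bbb: "
    "Time spent bootstrapping tablet: real 12.119s user 0.100s sys 0.050s",
    "I1121 17:18:01.000000 168167 log.cc:644] T ccc P bbb: Max segment size reached.",
]

total = total_bootstrap_seconds(sample)
print(f"{total:.1f} s ({total / 3600:.2f} h)")
```

For scale, 26915.1 s is about 7.5 hours. The figure is a sum over tablets, so
the wall-clock time also depends on how many tablets are bootstrapped in
parallel.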



[jira] [Commented] (KUDU-2646) kudu restart the tablets stats from INITIALIZED change to RUNNING cost a few days

2018-12-22 Thread jiaqiyang (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16727816#comment-16727816
 ] 

jiaqiyang commented on KUDU-2646:
-

1 data directories: /cdh/kudu/tserver/fdd/data

Total live blocks: 33688986

Total live bytes: 134814618486

Total live bytes (after alignment): 259383992320

Total number of LBM containers: 21179 (8196 full)

Did not check for missing blocks

Did not check for orphaned blocks

 

 

From your tserver log, there is only one data disk.

What kind of hardware is being used for the tserver's metadata directory?

See the documentation for {{--num_tablets_to_open_simultaneously}}, which helps 
explain why your tablets take so long to bootstrap.
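
A quick back-of-the-envelope calculation on the block manager figures quoted
above gives a feel for how small the blocks are. This is only a sketch; the
variable names are illustrative, and the numbers are copied from the report.

```python
# Figures copied from the block manager report quoted above.
live_blocks = 33_688_986
live_bytes = 134_814_618_486
live_bytes_aligned = 259_383_992_320
containers = 21_179

# Average block size before and after alignment, in bytes.
avg_block = live_bytes / live_blocks
avg_block_aligned = live_bytes_aligned / live_blocks

# How much the aligned footprint exceeds the raw footprint.
alignment_overhead = live_bytes_aligned / live_bytes

# Average number of live blocks stored in each LBM container.
blocks_per_container = live_blocks / containers

print(f"avg block:             {avg_block:.0f} bytes")
print(f"avg block (aligned):   {avg_block_aligned:.0f} bytes")
print(f"alignment overhead:    {alignment_overhead:.2f}x")
print(f"blocks per container:  {blocks_per_container:.0f}")
```

The average block is roughly 4 KB before alignment and about 7.7 KB after, so
alignment nearly doubles the on-disk footprint for this data directory.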



[jira] [Comment Edited] (KUDU-2638) kudu cluster restart very long time to reused

2018-12-20 Thread jiaqiyang (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16720962#comment-16720962
 ] 

jiaqiyang edited comment on KUDU-2638 at 12/21/18 3:05 AM:
---

Yes. First, thank you very much.

I know the log I provided is not enough.

Thank you for your attention!

I will provide the full tserver log.


was (Author: jiaqiyang):
Yes. First, thank you very much.

I know the log I provided is not enough.

Thank you for your attention!

I will provide the full tserver log.

I am very interested in Kudu!

> kudu cluster restart very long time to reused
> -
>
> Key: KUDU-2638
> URL: https://issues.apache.org/jira/browse/KUDU-2638
> Project: Kudu
>  Issue Type: Improvement
>Reporter: jiaqiyang
>Priority: Major
> Fix For: n/a
>
> Attachments: kudu16.tc.tablet.png, tserverLog.tar.gz
>
>
> When I restart my Kudu cluster, all tablets are unavailable.
> Running kudu cluster ksck shows:
> Table Summary
> Name | Status      | Total Tablets | Healthy | Under-replicated | Unavailable
> -----+-------------+---------------+---------+------------------+------------
> t1   | HEALTHY     | 1             | 1       | 0                | 0
> t2   | UNAVAILABLE | 5             | 0       | 1                | 4
> t3   | UNAVAILABLE | 6             | 2       | 0                | 4
> t3   | UNAVAILABLE | 3             | 0       | 0                | 3





[jira] [Commented] (KUDU-2638) kudu cluster restart very long time to reused

2018-12-19 Thread jiaqiyang (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16725603#comment-16725603
 ] 

jiaqiyang commented on KUDU-2638:
-

Yes, thank you very much!

I will now build a new Kudu cluster whose tservers each have 12 SSD disks.



[jira] [Comment Edited] (KUDU-2638) kudu cluster restart very long time to reused

2018-12-19 Thread jiaqiyang (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16725550#comment-16725550
 ] 

jiaqiyang edited comment on KUDU-2638 at 12/20/18 4:06 AM:
---

Thank you for your attention!

I'm sorry, there is only one disk on my tserver.

From your suggestions, I understand there are three problems in my current 
Kudu cluster:

First, with one disk and the default --num_tablets_to_open_simultaneously, only 
one tablet is bootstrapped at a time. Second, the disk is very busy, with I/O 
utilization up to 100%. Third, there are many small blocks.

But I have some questions:

1. Is the time a tablet takes to go from INITIALIZED to RUNNING affected by 
major compaction?

2. Manually triggering a major compaction could reduce the number of small 
blocks, but compaction/flush ops are managed by the MaintenanceManager.

 

Use case:

There are many UPDATEs in my case:

MySQL binlog is synchronized to Kudu in real time, so there are many UPDATE 
events.

We want to use Kudu as a real-time OLAP engine.


was (Author: jiaqiyang):
Thank you for your attention!

Yes, I think so. There are many UPDATEs in my case:

MySQL binlog is synchronized to Kudu in real time, so there are many UPDATE 
events.

We want to use Kudu as a real-time OLAP engine.



[jira] [Comment Edited] (KUDU-2638) kudu cluster restart very long time to reused

2018-12-19 Thread jiaqiyang (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16725550#comment-16725550
 ] 

jiaqiyang edited comment on KUDU-2638 at 12/20/18 3:16 AM:
---

Thank you for your attention!

Yes, I think so. There are many UPDATEs in my case:

MySQL binlog is synchronized to Kudu in real time, so there are many UPDATE 
events.

We want to use Kudu as a real-time OLAP engine.


was (Author: jiaqiyang):
Yes, I think so. There are many UPDATEs in my case:

MySQL binlog is synchronized to Kudu in real time, so there are many UPDATE 
events.

We want to use Kudu as a real-time OLAP engine.



[jira] [Commented] (KUDU-2638) kudu cluster restart very long time to reused

2018-12-19 Thread jiaqiyang (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16725550#comment-16725550
 ] 

jiaqiyang commented on KUDU-2638:
-

Yes, I think so. There are many UPDATEs in my case:

MySQL binlog is synchronized to Kudu in real time, so there are many UPDATE 
events.

We want to use Kudu as a real-time OLAP engine.



[jira] [Commented] (KUDU-2638) kudu cluster restart very long time to reused

2018-12-19 Thread jiaqiyang (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724859#comment-16724859
 ] 

jiaqiyang commented on KUDU-2638:
-

In this code:
{code:java}
MaintenanceOp* MaintenanceManager::FindBestOp() {
  TRACE_EVENT0("maintenance", "MaintenanceManager::FindBestOp");

  size_t free_threads = num_threads_ - running_ops_;
  if (free_threads == 0) {
    VLOG_AND_TRACE("maintenance", 1) << LogPrefix()
        << "There are no free threads, so we can't run anything.";
    return nullptr;
  }

  int64_t low_io_most_logs_retained_bytes = 0;
  MaintenanceOp* low_io_most_logs_retained_bytes_op = nullptr;

  uint64_t most_mem_anchored = 0;
  MaintenanceOp* most_mem_anchored_op = nullptr;

  int64_t most_logs_retained_bytes = 0;
  int64_t most_logs_retained_bytes_ram_anchored = 0;
  MaintenanceOp* most_logs_retained_bytes_op = nullptr;

  int64_t most_data_retained_bytes = 0;
  MaintenanceOp* most_data_retained_bytes_op = nullptr;

  double best_perf_improvement = 0;
  MaintenanceOp* best_perf_improvement_op = nullptr;
  for (OpMapTy::value_type &val : ops_) {
    MaintenanceOp* op(val.first);
    MaintenanceOpStats& stats(val.second);
    VLOG_WITH_PREFIX(3) << "Considering MM op " << op->name();
    // Update op stats.
    stats.Clear();
    op->UpdateStats(&stats);
    if (op->cancelled() || !stats.valid() || !stats.runnable()) {
      continue;
    }
    if (stats.logs_retained_bytes() > low_io_most_logs_retained_bytes &&
        op->io_usage() == MaintenanceOp::LOW_IO_USAGE) {
      low_io_most_logs_retained_bytes_op = op;
      low_io_most_logs_retained_bytes = stats.logs_retained_bytes();
      VLOG_AND_TRACE("maintenance", 2) << LogPrefix() << "Op " << op->name() << " can free "
          << stats.logs_retained_bytes() << " bytes of logs";
    }

    if (stats.ram_anchored() > most_mem_anchored) {
      most_mem_anchored_op = op;
      most_mem_anchored = stats.ram_anchored();
    }
    // We prioritize ops that can free more logs, but when it's the same we pick the one that
    // also frees up the most memory.
    if (stats.logs_retained_bytes() > 0 &&
        (stats.logs_retained_bytes() > most_logs_retained_bytes ||
         (stats.logs_retained_bytes() == most_logs_retained_bytes &&
          stats.ram_anchored() > most_logs_retained_bytes_ram_anchored))) {
      most_logs_retained_bytes_op = op;
      most_logs_retained_bytes = stats.logs_retained_bytes();
      most_logs_retained_bytes_ram_anchored = stats.ram_anchored();
    }

    if (stats.data_retained_bytes() > most_data_retained_bytes) {
      most_data_retained_bytes_op = op;
      most_data_retained_bytes = stats.data_retained_bytes();
      VLOG_AND_TRACE("maintenance", 2) << LogPrefix() << "Op " << op->name() << " can free "
          << stats.data_retained_bytes() << " bytes of data";
    }

    if ((!best_perf_improvement_op) ||
        (stats.perf_improvement() > best_perf_improvement)) {
      best_perf_improvement_op = op;
      best_perf_improvement = stats.perf_improvement();
    }
  }

  // Look at ops that we can run quickly that free up log retention.
  if (low_io_most_logs_retained_bytes_op) {
    if (low_io_most_logs_retained_bytes > 0) {
      VLOG_AND_TRACE("maintenance", 1) << LogPrefix()
          << "Performing " << low_io_most_logs_retained_bytes_op->name() << ", "
          << "because it can free up more logs "
          << "at " << low_io_most_logs_retained_bytes
          << " bytes with a low IO cost";
      return low_io_most_logs_retained_bytes_op;
    }
  }

  // Look at free memory. If it is dangerously low, we must select something
  // that frees memory-- the op with the most anchored memory.
  double capacity_pct;
  if (memory_pressure_func_(&capacity_pct)) {
    if (!most_mem_anchored_op) {
      std::string msg = StringPrintf("we have exceeded our soft memory limit "
          "(current capacity is %.2f%%).  However, there are no ops currently "
          "runnable which would free memory.", capacity_pct);
      LOG_WITH_PREFIX(INFO) << msg;
      return nullptr;
    }
    VLOG_AND_TRACE("maintenance", 1) << LogPrefix() << "We have exceeded our soft memory limit "
        << "(current capacity is " << capacity_pct << "%).  Running the op "
        << "which anchors the most memory: " << most_mem_anchored_op->name();
    return most_mem_anchored_op;
  }

  if (most_logs_retained_bytes_op &&
      most_logs_retained_bytes / 1024 / 1024 >= FLAGS_log_target_replay_size_mb) {
    VLOG_AND_TRACE("maintenance", 1) << LogPrefix()
        << "Performing " << most_logs_retained_bytes_op->name() << ", "
        << "because it can free up more logs (" << most_logs_retained_bytes
        << " bytes)";
    return most_logs_retained_bytes_op;
  }

  // Look at 

[jira] [Updated] (KUDU-2638) kudu cluster restart very long time to reused

2018-12-19 Thread jiaqiyang (JIRA)


 [ 
https://issues.apache.org/jira/browse/KUDU-2638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jiaqiyang updated KUDU-2638:

Attachment: kudu16.tc.tablet.png



[jira] [Comment Edited] (KUDU-2646) kudu restart the tablets stats from INITIALIZED change to RUNNING cost a few days

2018-12-18 Thread jiaqiyang (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724769#comment-16724769
 ] 

jiaqiyang edited comment on KUDU-2646 at 12/19/18 7:44 AM:
---

I have the same problem. I have attached a tserver log to KUDU-2638.

Kudu version: 1.6.0


was (Author: jiaqiyang):
I have the same problem. I have attached a tserver log to KUDU-2638.



[jira] [Commented] (KUDU-2646) kudu restart the tablets stats from INITIALIZED change to RUNNING cost a few days

2018-12-18 Thread jiaqiyang (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724769#comment-16724769
 ] 

jiaqiyang commented on KUDU-2646:
-

I have the same problem. I have attached a tserver log to KUDU-2638.



[jira] [Commented] (KUDU-2638) kudu cluster restart very long time to reused

2018-12-18 Thread jiaqiyang (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724766#comment-16724766
 ] 

jiaqiyang commented on KUDU-2638:
-

I am now analyzing what blocks the tablet lifecycle.

I think the problem is in the MaintenanceManager module.



[jira] [Comment Edited] (KUDU-2638) kudu cluster restart very long time to reused

2018-12-18 Thread jiaqiyang (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724757#comment-16724757
 ] 

jiaqiyang edited comment on KUDU-2638 at 12/19/18 7:40 AM:
---

This is the log from one tserver in the Kudu cluster:

 

[^tserverLog.tar.gz]

 

The cluster has 19 tservers and 3 masters in total;

12 SATA disks per server;

200+ tablets per tserver;

3 replicas per tablet.

 

I have one question:

Why does it take so long after a cluster restart for the tables to become 
available? In the log I see that bootstrap is very quick, but a very long time 
is spent on major compaction. How can we stop the compaction at startup and 
then run it administratively during idle time, like HBase compaction?


was (Author: jiaqiyang):
This is the log from one tserver in the Kudu cluster:

 

[^tserverLog.tar.gz]

 

The cluster has 19 tservers and 3 masters in total;

12 SATA disks per server;

200+ tablets per tserver.

 

I have one question:

Why does it take so long after a cluster restart for the tables to become 
available? In the log I see that bootstrap is very quick, but a very long time 
is spent on major compaction. How can we stop the compaction at startup and 
then run it administratively during idle time, like HBase compaction?



[jira] [Commented] (KUDU-2638) kudu cluster restart very long time to reused

2018-12-18 Thread jiaqiyang (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724757#comment-16724757
 ] 

jiaqiyang commented on KUDU-2638:
-

This is the log from one tserver in the Kudu cluster:

 

[^tserverLog.tar.gz]

 

The cluster has 19 tservers and 3 masters in total;

12 SATA disks per server;

200+ tablets per tserver.

 

I have one question:

Why does it take so long after a cluster restart for the tables to become 
available? In the log I see that bootstrap is very quick, but a very long time 
is spent on major compaction. How can we stop the compaction at startup and 
then run it administratively during idle time, like HBase compaction?



[jira] [Updated] (KUDU-2638) kudu cluster restart very long time to reused

2018-12-18 Thread jiaqiyang (JIRA)


 [ 
https://issues.apache.org/jira/browse/KUDU-2638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jiaqiyang updated KUDU-2638:

Attachment: tserverLog.tar.gz



[jira] [Commented] (KUDU-2638) kudu cluster restart very long time to reused

2018-12-13 Thread jiaqiyang (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16720962#comment-16720962
 ] 

jiaqiyang commented on KUDU-2638:
-

Yes. First, thank you very much.

I know the log I provided is not enough.

Thank you for your attention!

I will provide the full tserver log.

I am very interested in Kudu!



[jira] [Commented] (KUDU-2638) kudu cluster restart very long time to reused

2018-12-12 Thread jiaqiyang (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719839#comment-16719839
 ] 

jiaqiyang commented on KUDU-2638:
-

From the source code, I see that [MaintenanceManager::FindBestOp()] follows 
these rules:
 * If there's an Op that we can run quickly that frees log retention, we run it.
 * If we've hit the overall process memory limit (note: this includes memory 
that the Ops cannot free), we run the Op with the highest RAM usage.
 * If there are Ops that are retaining logs past our target replay size, we 
run the one that has the highest retention (and if many qualify, then we run 
the one that also frees up the most RAM).
 * Finally, if there's nothing else that we really need to do, we run the Op 
that will improve performance the most.

 

I think the op selection falls through to the last rule: finally, if there's 
nothing else that we really need to do, we run the Op that will improve 
performance the most.

 

If this is true, then restarting a cluster that has many delta data files will 
take a very long time before the tables become available.
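
The four rules listed above can be sketched as a small priority function. This
is a simplified illustration in Python, not the Kudu implementation: OpStats
and find_best_op are hypothetical names, and the sketch ignores thread
availability, cancelled ops, and any checks not named in the four rules.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class OpStats:
    """Hypothetical, simplified stand-in for the per-op stats."""
    name: str
    logs_retained_bytes: int = 0
    ram_anchored: int = 0
    perf_improvement: float = 0.0
    low_io: bool = False


def find_best_op(ops: List[OpStats], under_memory_pressure: bool,
                 log_target_replay_size_mb: int) -> Optional[OpStats]:
    if not ops:
        return None

    # Rule 1: a low-IO op that frees log retention runs first.
    low_io = [op for op in ops if op.low_io and op.logs_retained_bytes > 0]
    if low_io:
        return max(low_io, key=lambda op: op.logs_retained_bytes)

    # Rule 2: under memory pressure, run the op anchoring the most RAM,
    # or nothing at all if no op would free memory.
    if under_memory_pressure:
        anchoring = [op for op in ops if op.ram_anchored > 0]
        if not anchoring:
            return None
        return max(anchoring, key=lambda op: op.ram_anchored)

    # Rule 3: an op retaining logs past the target replay size;
    # ties on retained bytes are broken by anchored RAM.
    retaining = [op for op in ops if op.logs_retained_bytes > 0]
    if retaining:
        best = max(retaining,
                   key=lambda op: (op.logs_retained_bytes, op.ram_anchored))
        if best.logs_retained_bytes // (1024 * 1024) >= log_target_replay_size_mb:
            return best

    # Rule 4: otherwise, the op with the largest performance improvement.
    return max(ops, key=lambda op: op.perf_improvement)


ops = [
    OpStats("flush", logs_retained_bytes=5 * 1024 * 1024, ram_anchored=10),
    OpStats("compact", perf_improvement=0.9),
]
print(find_best_op(ops, under_memory_pressure=False,
                   log_target_replay_size_mb=1024).name)
```

With a large replay-size target and no memory pressure, the selection falls
through to the last rule and picks the op with the best performance score,
which is the case described above.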



[jira] [Comment Edited] (KUDU-2638) kudu cluster restart very long time to reused

2018-12-12 Thread jiaqiyang (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719815#comment-16719815
 ] 

jiaqiyang edited comment on KUDU-2638 at 12/13/18 7:00 AM:
---

{code:java}
I1121 17:04:53.100796 165214 ts_tablet_manager.cc:909] T 
5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: Loading 
tablet metadata

I1121 17:07:06.116400 165214 ts_tablet_manager.cc:1082] T 
5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: Registered 
tablet (data state: TABLET_DATA_READY)

I1121 17:15:29.870625 168167 ts_tablet_manager.cc:932] T 
5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: 
Bootstrapping tablet

I1121 17:15:29.870635 168167 tablet_bootstrap.cc:437] T 
5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: Bootstrap 
starting.

I1121 17:16:57.754650 168167 tablet_bootstrap.cc:616] T 
5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: Time spent 
opening tablet: real 87.881s user 0.908s sys 0.340s

I1121 17:16:59.455792 168167 log.cc:644] T 5aae5dc9e6f4468aaf00c060152d4fed P 
510015b8e3d2462e9d52965cfa306af7: Max segment size reached. Starting new 
segment allocation

I1121 17:16:59.455893 168167 tablet_bootstrap.cc:437] T 
5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: Bootstrap 
replayed 1/14 log segments. Stats: ops{read=1614 overwritten=0 applied=1613 
ignored=1476} inserts{seen=65 ignored=0} mutations{seen=423 ignored=0} 
orphaned_commits=1. Pending: 1 replicates

I1121 17:16:59.456018 168167 log.cc:571] T 5aae5dc9e6f4468aaf00c060152d4fed P 
510015b8e3d2462e9d52965cfa306af7: Rolled over to a new log segment at 
/data/data/kudu/tserver-new/wals/5aae5dc9e6f4468aaf00c060152d4fed/wal-2

I1121 17:17:02.018604 168167 log.cc:644] T 5aae5dc9e6f4468aaf00c060152d4fed P 
510015b8e3d2462e9d52965cfa306af7: Max segment size reached. Starting new 
segment allocation

I1121 17:17:02.018836 168167 log.cc:571] T 5aae5dc9e6f4468aaf00c060152d4fed P 
510015b8e3d2462e9d52965cfa306af7: Rolled over to a new log segment at 
/data/data/kudu/tserver-new/wals/5aae5dc9e6f4468aaf00c060152d4fed/wal-3

I1121 17:17:02.018995 168167 tablet_bootstrap.cc:437] T 
5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: Bootstrap 
replayed 2/14 log segments. Stats: ops{read=3892 overwritten=0 applied=3891 
ignored=3256} inserts{seen=718 ignored=0} mutations{seen=2327 ignored=0} 
orphaned_commits=1. Pending: 1 replicates

I1121 17:17:03.023664 168167 tablet_bootstrap.cc:437] T 
5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: Bootstrap 
replaying log segment 3/14 (2.46M/7.99M this segment, stats: ops{read=4487 
overwritten=0 applied=4487 ignored=3705} inserts{seen=881 ignored=0} 
mutations{seen=2898 ignored=0} orphaned_commits=1)

I1121 17:17:04.08 168167 log.cc:644] T 5aae5dc9e6f4468aaf00c060152d4fed P 
510015b8e3d2462e9d52965cfa306af7: Max segment size reached. Starting new 
segment allocation

I1121 17:17:04.889019 168167 log.cc:571] T 5aae5dc9e6f4468aaf00c060152d4fed P 
510015b8e3d2462e9d52965cfa306af7: Rolled over to a new log segment at 
/data/data/kudu/tserver-new/wals/5aae5dc9e6f4468aaf00c060152d4fed/wal-4

I1121 17:17:04.889173 168167 tablet_bootstrap.cc:437] T 
5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: Bootstrap 
replayed 3/14 log segments. Stats: ops{read=5397 overwritten=0 applied=5396 
ignored=4259} inserts{seen=1392 ignored=0} mutations{seen=4399 ignored=0} 
orphaned_commits=1. Pending: 1 replicates

I1121 17:17:07.373458 168167 log.cc:644] T 5aae5dc9e6f4468aaf00c060152d4fed P 
510015b8e3d2462e9d52965cfa306af7: Max segment size reached. Starting new 
segment allocation

I1121 17:17:07.373601 168167 tablet_bootstrap.cc:437] T 
5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: Bootstrap 
replayed 4/14 log segments. Stats: ops{read=7365 overwritten=0 applied=7364 
ignored=5769} inserts{seen=1779 ignored=0} mutations{seen=6078 ignored=0} 
orphaned_commits=1. Pending: 1 replicates

I1121 17:17:07.373723 168167 log.cc:571] T 5aae5dc9e6f4468aaf00c060152d4fed P 
510015b8e3d2462e9d52965cfa306af7: Rolled over to a new log segment at 
/data/data/kudu/tserver-new/wals/5aae5dc9e6f4468aaf00c060152d4fed/wal-5

I1121 17:17:08.071877 168167 tablet_bootstrap.cc:437] T 
5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: Bootstrap 
replaying log segment 5/14 (2.36M/8.00M this segment, stats: ops{read=7680 
overwritten=0 applied=7680 ignored=5940} inserts{seen=1972 ignored=0} 
mutations{seen=6778 ignored=0} orphaned_commits=1)

I1121 17:17:09.348376 168167 log.cc:644] T 5aae5dc9e6f4468aaf00c060152d4fed P 
510015b8e3d2462e9d52965cfa306af7: Max segment size reached. Starting new 
segment allocation

I1121 17:17:09.348531 168167 tablet_bootstrap.cc:437] T 

[jira] [Comment Edited] (KUDU-2638) kudu cluster restart very long time to reused

2018-12-12 Thread jiaqiyang (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719815#comment-16719815
 ] 

jiaqiyang edited comment on KUDU-2638 at 12/13/18 6:50 AM:
---

{code:java}
// code placeholder

I1121 17:04:53.100796 165214 ts_tablet_manager.cc:909] T 
5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: Loading 
tablet metadata

I1121 17:07:06.116400 165214 ts_tablet_manager.cc:1082] T 
5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: Registered 
tablet (data state: TABLET_DATA_READY)

I1121 17:15:29.870625 168167 ts_tablet_manager.cc:932] T 
5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: 
Bootstrapping tablet

I1121 17:15:29.870635 168167 tablet_bootstrap.cc:437] T 
5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: Bootstrap 
starting.

I1121 17:16:57.754650 168167 tablet_bootstrap.cc:616] T 
5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: Time spent 
opening tablet: real 87.881s user 0.908s sys 0.340s

I1121 17:16:59.455792 168167 log.cc:644] T 5aae5dc9e6f4468aaf00c060152d4fed P 
510015b8e3d2462e9d52965cfa306af7: Max segment size reached. Starting new 
segment allocation

I1121 17:16:59.455893 168167 tablet_bootstrap.cc:437] T 
5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: Bootstrap 
replayed 1/14 log segments. Stats: ops{read=1614 overwritten=0 applied=1613 
ignored=1476} inserts{seen=65 ignored=0} mutations{seen=423 ignored=0} 
orphaned_commits=1. Pending: 1 replicates

I1121 17:16:59.456018 168167 log.cc:571] T 5aae5dc9e6f4468aaf00c060152d4fed P 
510015b8e3d2462e9d52965cfa306af7: Rolled over to a new log segment at 
/data/data/kudu/tserver-new/wals/5aae5dc9e6f4468aaf00c060152d4fed/wal-2

I1121 17:17:02.018604 168167 log.cc:644] T 5aae5dc9e6f4468aaf00c060152d4fed P 
510015b8e3d2462e9d52965cfa306af7: Max segment size reached. Starting new 
segment allocation

I1121 17:17:02.018836 168167 log.cc:571] T 5aae5dc9e6f4468aaf00c060152d4fed P 
510015b8e3d2462e9d52965cfa306af7: Rolled over to a new log segment at 
/data/data/kudu/tserver-new/wals/5aae5dc9e6f4468aaf00c060152d4fed/wal-3

I1121 17:17:02.018995 168167 tablet_bootstrap.cc:437] T 
5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: Bootstrap 
replayed 2/14 log segments. Stats: ops{read=3892 overwritten=0 applied=3891 
ignored=3256} inserts{seen=718 ignored=0} mutations{seen=2327 ignored=0} 
orphaned_commits=1. Pending: 1 replicates

I1121 17:17:03.023664 168167 tablet_bootstrap.cc:437] T 
5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: Bootstrap 
replaying log segment 3/14 (2.46M/7.99M this segment, stats: ops{read=4487 
overwritten=0 applied=4487 ignored=3705} inserts{seen=881 ignored=0} 
mutations{seen=2898 ignored=0} orphaned_commits=1)

I1121 17:17:04.08 168167 log.cc:644] T 5aae5dc9e6f4468aaf00c060152d4fed P 
510015b8e3d2462e9d52965cfa306af7: Max segment size reached. Starting new 
segment allocation

I1121 17:17:04.889019 168167 log.cc:571] T 5aae5dc9e6f4468aaf00c060152d4fed P 
510015b8e3d2462e9d52965cfa306af7: Rolled over to a new log segment at 
/data/data/kudu/tserver-new/wals/5aae5dc9e6f4468aaf00c060152d4fed/wal-4

I1121 17:17:04.889173 168167 tablet_bootstrap.cc:437] T 
5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: Bootstrap 
replayed 3/14 log segments. Stats: ops{read=5397 overwritten=0 applied=5396 
ignored=4259} inserts{seen=1392 ignored=0} mutations{seen=4399 ignored=0} 
orphaned_commits=1. Pending: 1 replicates

I1121 17:17:07.373458 168167 log.cc:644] T 5aae5dc9e6f4468aaf00c060152d4fed P 
510015b8e3d2462e9d52965cfa306af7: Max segment size reached. Starting new 
segment allocation

I1121 17:17:07.373601 168167 tablet_bootstrap.cc:437] T 
5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: Bootstrap 
replayed 4/14 log segments. Stats: ops{read=7365 overwritten=0 applied=7364 
ignored=5769} inserts{seen=1779 ignored=0} mutations{seen=6078 ignored=0} 
orphaned_commits=1. Pending: 1 replicates

I1121 17:17:07.373723 168167 log.cc:571] T 5aae5dc9e6f4468aaf00c060152d4fed P 
510015b8e3d2462e9d52965cfa306af7: Rolled over to a new log segment at 
/data/data/kudu/tserver-new/wals/5aae5dc9e6f4468aaf00c060152d4fed/wal-5

I1121 17:17:08.071877 168167 tablet_bootstrap.cc:437] T 
5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: Bootstrap 
replaying log segment 5/14 (2.36M/8.00M this segment, stats: ops{read=7680 
overwritten=0 applied=7680 ignored=5940} inserts{seen=1972 ignored=0} 
mutations{seen=6778 ignored=0} orphaned_commits=1)

I1121 17:17:09.348376 168167 log.cc:644] T 5aae5dc9e6f4468aaf00c060152d4fed P 
510015b8e3d2462e9d52965cfa306af7: Max segment size reached. Starting new 
segment allocation

I1121 17:17:09.348531 168167 tablet_bootstrap.cc:437] T 
5aae5dc9e6f4468aaf00c060152d4fed 

[jira] [Comment Edited] (KUDU-2638) kudu cluster restart very long time to reused

2018-12-12 Thread jiaqiyang (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719815#comment-16719815
 ] 

jiaqiyang edited comment on KUDU-2638 at 12/13/18 6:45 AM:
---

{code:java}
// code placeholder


{code}


was (Author: jiaqiyang):
{code:java}
// code placeholder

I1121 17:04:53.100796 165214 ts_tablet_manager.cc:909] T 
5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: Loading 
tablet metadata

I1121 17:07:06.116400 165214 ts_tablet_manager.cc:1082] T 
5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: Registered 
tablet (data state: TABLET_DATA_READY)

I1121 17:15:29.870625 168167 ts_tablet_manager.cc:932] T 
5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: 
Bootstrapping tablet

I1121 17:15:29.870635 168167 tablet_bootstrap.cc:437] T 
5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: Bootstrap 
starting.

I1121 17:16:57.754650 168167 tablet_bootstrap.cc:616] T 
5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: Time spent 
opening tablet: real 87.881s user 0.908s sys 0.340s

I1121 17:16:59.455792 168167 log.cc:644] T 5aae5dc9e6f4468aaf00c060152d4fed P 
510015b8e3d2462e9d52965cfa306af7: Max segment size reached. Starting new 
segment allocation

I1121 17:16:59.455893 168167 tablet_bootstrap.cc:437] T 
5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: Bootstrap 
replayed 1/14 log segments. Stats: ops{read=1614 overwritten=0 applied=1613 
ignored=1476} inserts{seen=65 ignored=0} mutations{seen=423 ignored=0} 
orphaned_commits=1. Pending: 1 replicates

I1121 17:16:59.456018 168167 log.cc:571] T 5aae5dc9e6f4468aaf00c060152d4fed P 
510015b8e3d2462e9d52965cfa306af7: Rolled over to a new log segment at 
/data/data/kudu/tserver-new/wals/5aae5dc9e6f4468aaf00c060152d4fed/wal-2

I1121 17:17:02.018604 168167 log.cc:644] T 5aae5dc9e6f4468aaf00c060152d4fed P 
510015b8e3d2462e9d52965cfa306af7: Max segment size reached. Starting new 
segment allocation

I1121 17:17:02.018836 168167 log.cc:571] T 5aae5dc9e6f4468aaf00c060152d4fed P 
510015b8e3d2462e9d52965cfa306af7: Rolled over to a new log segment at 
/data/data/kudu/tserver-new/wals/5aae5dc9e6f4468aaf00c060152d4fed/wal-3

I1121 17:17:02.018995 168167 tablet_bootstrap.cc:437] T 
5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: Bootstrap 
replayed 2/14 log segments. Stats: ops{read=3892 overwritten=0 applied=3891 
ignored=3256} inserts{seen=718 ignored=0} mutations{seen=2327 ignored=0} 
orphaned_commits=1. Pending: 1 replicates

I1121 17:17:03.023664 168167 tablet_bootstrap.cc:437] T 
5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: Bootstrap 
replaying log segment 3/14 (2.46M/7.99M this segment, stats: ops{read=4487 
overwritten=0 applied=4487 ignored=3705} inserts{seen=881 ignored=0} 
mutations{seen=2898 ignored=0} orphaned_commits=1)

I1121 17:17:04.08 168167 log.cc:644] T 5aae5dc9e6f4468aaf00c060152d4fed P 
510015b8e3d2462e9d52965cfa306af7: Max segment size reached. Starting new 
segment allocation

I1121 17:17:04.889019 168167 log.cc:571] T 5aae5dc9e6f4468aaf00c060152d4fed P 
510015b8e3d2462e9d52965cfa306af7: Rolled over to a new log segment at 
/data/data/kudu/tserver-new/wals/5aae5dc9e6f4468aaf00c060152d4fed/wal-4

I1121 17:17:04.889173 168167 tablet_bootstrap.cc:437] T 
5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: Bootstrap 
replayed 3/14 log segments. Stats: ops{read=5397 overwritten=0 applied=5396 
ignored=4259} inserts{seen=1392 ignored=0} mutations{seen=4399 ignored=0} 
orphaned_commits=1. Pending: 1 replicates

I1121 17:17:07.373458 168167 log.cc:644] T 5aae5dc9e6f4468aaf00c060152d4fed P 
510015b8e3d2462e9d52965cfa306af7: Max segment size reached. Starting new 
segment allocation

I1121 17:17:07.373601 168167 tablet_bootstrap.cc:437] T 
5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: Bootstrap 
replayed 4/14 log segments. Stats: ops{read=7365 overwritten=0 applied=7364 
ignored=5769} inserts{seen=1779 ignored=0} mutations{seen=6078 ignored=0} 
orphaned_commits=1. Pending: 1 replicates

I1121 17:17:07.373723 168167 log.cc:571] T 5aae5dc9e6f4468aaf00c060152d4fed P 
510015b8e3d2462e9d52965cfa306af7: Rolled over to a new log segment at 
/data/data/kudu/tserver-new/wals/5aae5dc9e6f4468aaf00c060152d4fed/wal-5

I1121 17:17:08.071877 168167 tablet_bootstrap.cc:437] T 
5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: Bootstrap 
replaying log segment 5/14 (2.36M/8.00M this segment, stats: ops{read=7680 
overwritten=0 applied=7680 ignored=5940} inserts{seen=1972 ignored=0} 
mutations{seen=6778 ignored=0} orphaned_commits=1)

I1121 17:17:09.348376 168167 log.cc:644] T 5aae5dc9e6f4468aaf00c060152d4fed P 
510015b8e3d2462e9d52965cfa306af7: Max segment size reached. Starting new 
segment allocation

I1121 

[jira] [Commented] (KUDU-2638) kudu cluster restart very long time to reused

2018-12-12 Thread jiaqiyang (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719818#comment-16719818
 ] 

jiaqiyang commented on KUDU-2638:
-

The log I posted is for one tablet, 5aae5dc9e6f4468aaf00c060152d4fed, on one 
tserver.

From the full log, it took about 7 hours for all tablets on the tserver to 
become available.

If I stop compaction, will the tablets become available sooner?
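
To see where the time goes for a single tablet, the gaps between the startup milestones in the log can be measured directly. The following is a minimal sketch, not a Kudu tool: the function names `parse_ts` and `phase_durations` are hypothetical, the sample lines are the ones from the log in this thread re-joined onto one line each, and the year 2018 is an assumption taken from the dates of this thread, since glog timestamps carry no year.

```python
from datetime import datetime

# Log lines copied from this thread, re-joined onto one line each.
LOG = """\
I1121 17:04:53.100796 165214 ts_tablet_manager.cc:909] T 5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: Loading tablet metadata
I1121 17:07:06.116400 165214 ts_tablet_manager.cc:1082] T 5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: Registered tablet (data state: TABLET_DATA_READY)
I1121 17:15:29.870625 168167 ts_tablet_manager.cc:932] T 5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: Bootstrapping tablet
I1121 17:16:57.754650 168167 tablet_bootstrap.cc:616] T 5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: Time spent opening tablet: real 87.881s user 0.908s sys 0.340s
"""

# Message fragments that mark each startup milestone, in log order.
MILESTONES = (
    "Loading tablet metadata",
    "Registered tablet",
    "Bootstrapping tablet",
    "Time spent opening tablet",
)

ASSUMED_YEAR = "2018"  # assumption: glog prefixes omit the year


def parse_ts(line):
    """Parse the 'mmdd hh:mm:ss.uuuuuu' part of a glog prefix such as 'I1121 17:04:53.100796'."""
    return datetime.strptime(ASSUMED_YEAR + line[1:21], "%Y%m%d %H:%M:%S.%f")


def phase_durations(log_text, milestones):
    """Return (from, to, seconds) for each pair of consecutive milestones found."""
    stamps = {}
    for line in log_text.splitlines():
        for name in milestones:
            if name in line and name not in stamps:
                stamps[name] = parse_ts(line)
    found = [name for name in milestones if name in stamps]
    return [
        (a, b, (stamps[b] - stamps[a]).total_seconds())
        for a, b in zip(found, found[1:])
    ]


for start, end, seconds in phase_durations(LOG, MILESTONES):
    print(f"{start} -> {end}: {seconds:.3f} s")
```

For this excerpt, the longest gap is the one between "Registered tablet" and "Bootstrapping tablet". The same function can be run over the lines of any other tablet once they are filtered by tablet ID.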

> kudu cluster restart very long time to reused
> -
>
> Key: KUDU-2638
> URL: https://issues.apache.org/jira/browse/KUDU-2638
> Project: Kudu
>  Issue Type: Improvement
>Reporter: jiaqiyang
>Priority: Major
>
> When I restart my Kudu cluster, all tablets are unavailable.
> Running kudu cluster ksck shows:
> Table Summary
> Name | Status      | Total Tablets | Healthy | Under-replicated | Unavailable
> -----+-------------+---------------+---------+------------------+------------
> t1   | HEALTHY     | 1             | 1       | 0                | 0
> t2   | UNAVAILABLE | 5             | 0       | 1                | 4
> t3   | UNAVAILABLE | 6             | 2       | 0                | 4
> t3   | UNAVAILABLE | 3             | 0       | 0                | 3



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KUDU-2638) kudu cluster restart very long time to reused

2018-12-12 Thread jiaqiyang (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719815#comment-16719815
 ] 

jiaqiyang commented on KUDU-2638:
-

{code:java}
// code placeholder

I1121 17:04:53.100796 165214 ts_tablet_manager.cc:909] T 
5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: Loading 
tablet metadata

I1121 17:07:06.116400 165214 ts_tablet_manager.cc:1082] T 
5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: Registered 
tablet (data state: TABLET_DATA_READY)

I1121 17:15:29.870625 168167 ts_tablet_manager.cc:932] T 
5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: 
Bootstrapping tablet

I1121 17:15:29.870635 168167 tablet_bootstrap.cc:437] T 
5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: Bootstrap 
starting.

I1121 17:16:57.754650 168167 tablet_bootstrap.cc:616] T 
5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: Time spent 
opening tablet: real 87.881s user 0.908s sys 0.340s

I1121 17:16:59.455792 168167 log.cc:644] T 5aae5dc9e6f4468aaf00c060152d4fed P 
510015b8e3d2462e9d52965cfa306af7: Max segment size reached. Starting new 
segment allocation

I1121 17:16:59.455893 168167 tablet_bootstrap.cc:437] T 
5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: Bootstrap 
replayed 1/14 log segments. Stats: ops{read=1614 overwritten=0 applied=1613 
ignored=1476} inserts{seen=65 ignored=0} mutations{seen=423 ignored=0} 
orphaned_commits=1. Pending: 1 replicates

I1121 17:16:59.456018 168167 log.cc:571] T 5aae5dc9e6f4468aaf00c060152d4fed P 
510015b8e3d2462e9d52965cfa306af7: Rolled over to a new log segment at 
/data/data/kudu/tserver-new/wals/5aae5dc9e6f4468aaf00c060152d4fed/wal-2

I1121 17:17:02.018604 168167 log.cc:644] T 5aae5dc9e6f4468aaf00c060152d4fed P 
510015b8e3d2462e9d52965cfa306af7: Max segment size reached. Starting new 
segment allocation

I1121 17:17:02.018836 168167 log.cc:571] T 5aae5dc9e6f4468aaf00c060152d4fed P 
510015b8e3d2462e9d52965cfa306af7: Rolled over to a new log segment at 
/data/data/kudu/tserver-new/wals/5aae5dc9e6f4468aaf00c060152d4fed/wal-3

I1121 17:17:02.018995 168167 tablet_bootstrap.cc:437] T 
5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: Bootstrap 
replayed 2/14 log segments. Stats: ops{read=3892 overwritten=0 applied=3891 
ignored=3256} inserts{seen=718 ignored=0} mutations{seen=2327 ignored=0} 
orphaned_commits=1. Pending: 1 replicates

I1121 17:17:03.023664 168167 tablet_bootstrap.cc:437] T 
5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: Bootstrap 
replaying log segment 3/14 (2.46M/7.99M this segment, stats: ops{read=4487 
overwritten=0 applied=4487 ignored=3705} inserts{seen=881 ignored=0} 
mutations{seen=2898 ignored=0} orphaned_commits=1)

I1121 17:17:04.08 168167 log.cc:644] T 5aae5dc9e6f4468aaf00c060152d4fed P 
510015b8e3d2462e9d52965cfa306af7: Max segment size reached. Starting new 
segment allocation

I1121 17:17:04.889019 168167 log.cc:571] T 5aae5dc9e6f4468aaf00c060152d4fed P 
510015b8e3d2462e9d52965cfa306af7: Rolled over to a new log segment at 
/data/data/kudu/tserver-new/wals/5aae5dc9e6f4468aaf00c060152d4fed/wal-4

I1121 17:17:04.889173 168167 tablet_bootstrap.cc:437] T 
5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: Bootstrap 
replayed 3/14 log segments. Stats: ops{read=5397 overwritten=0 applied=5396 
ignored=4259} inserts{seen=1392 ignored=0} mutations{seen=4399 ignored=0} 
orphaned_commits=1. Pending: 1 replicates

I1121 17:17:07.373458 168167 log.cc:644] T 5aae5dc9e6f4468aaf00c060152d4fed P 
510015b8e3d2462e9d52965cfa306af7: Max segment size reached. Starting new 
segment allocation

I1121 17:17:07.373601 168167 tablet_bootstrap.cc:437] T 
5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: Bootstrap 
replayed 4/14 log segments. Stats: ops{read=7365 overwritten=0 applied=7364 
ignored=5769} inserts{seen=1779 ignored=0} mutations{seen=6078 ignored=0} 
orphaned_commits=1. Pending: 1 replicates

I1121 17:17:07.373723 168167 log.cc:571] T 5aae5dc9e6f4468aaf00c060152d4fed P 
510015b8e3d2462e9d52965cfa306af7: Rolled over to a new log segment at 
/data/data/kudu/tserver-new/wals/5aae5dc9e6f4468aaf00c060152d4fed/wal-5

I1121 17:17:08.071877 168167 tablet_bootstrap.cc:437] T 
5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: Bootstrap 
replaying log segment 5/14 (2.36M/8.00M this segment, stats: ops{read=7680 
overwritten=0 applied=7680 ignored=5940} inserts{seen=1972 ignored=0} 
mutations{seen=6778 ignored=0} orphaned_commits=1)

I1121 17:17:09.348376 168167 log.cc:644] T 5aae5dc9e6f4468aaf00c060152d4fed P 
510015b8e3d2462e9d52965cfa306af7: Max segment size reached. Starting new 
segment allocation

I1121 17:17:09.348531 168167 tablet_bootstrap.cc:437] T 
5aae5dc9e6f4468aaf00c060152d4fed P 510015b8e3d2462e9d52965cfa306af7: 

[jira] [Comment Edited] (KUDU-2638) kudu cluster restart very long time to reused

2018-12-12 Thread jiaqiyang (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719810#comment-16719810
 ] 

jiaqiyang edited comment on KUDU-2638 at 12/13/18 6:30 AM:
---

So the tablet lifecycle from init to running is very slow.

During bootstrap, there are many major compactions on the tserver.

How can we reduce the time it takes for the tablets to become available?


was (Author: jiaqiyang):
So the tablet lifecycle from init to running is very slow.

During bootstrap, there are many major compactions on the tserver. How can we 
reduce the time it takes for the tablets to become available?

> kudu cluster restart very long time to reused
> -
>
> Key: KUDU-2638
> URL: https://issues.apache.org/jira/browse/KUDU-2638
> Project: Kudu
>  Issue Type: Improvement
>Reporter: jiaqiyang
>Priority: Major
>
> When I restart my Kudu cluster, all tablets are unavailable.
> Running kudu cluster ksck shows:
> Table Summary
> Name | Status      | Total Tablets | Healthy | Under-replicated | Unavailable
> -----+-------------+---------------+---------+------------------+------------
> t1   | HEALTHY     | 1             | 1       | 0                | 0
> t2   | UNAVAILABLE | 5             | 0       | 1                | 4
> t3   | UNAVAILABLE | 6             | 2       | 0                | 4
> t3   | UNAVAILABLE | 3             | 0       | 0                | 3





[jira] [Updated] (KUDU-2638) kudu cluster restart very long time to reused

2018-12-12 Thread jiaqiyang (JIRA)


 [ 
https://issues.apache.org/jira/browse/KUDU-2638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jiaqiyang updated KUDU-2638:

Description: 
When I restart my Kudu cluster, all tablets are unavailable.

Running kudu cluster ksck shows:

Table Summary

Name | Status      | Total Tablets | Healthy | Under-replicated | Unavailable
-----+-------------+---------------+---------+------------------+------------
t1   | HEALTHY     | 1             | 1       | 0                | 0
t2   | UNAVAILABLE | 5             | 0       | 1                | 4
t3   | UNAVAILABLE | 6             | 2       | 0                | 4
t3   | UNAVAILABLE | 3             | 0       | 0                | 3

  was:
When I restart my Kudu cluster, all tablets are unavailable.

Running kudu cluster ksck shows:

Table Summary

Name | Status      | Total Tablets | Healthy | Under-replicated | Unavailable
-----+-------------+---------------+---------+------------------+------------
     | HEALTHY     | 1             | 1       | 0                | 0
     | UNAVAILABLE | 5             | 0       | 1                | 4
     | UNAVAILABLE | 6             | 2       | 0                | 4
     | UNAVAILABLE | 3             | 0       | 0                | 3


> kudu cluster restart very long time to reused
> -
>
> Key: KUDU-2638
> URL: https://issues.apache.org/jira/browse/KUDU-2638
> Project: Kudu
>  Issue Type: Improvement
>Reporter: jiaqiyang
>Priority: Major
>
> When I restart my Kudu cluster, all tablets are unavailable.
> Running kudu cluster ksck shows:
> Table Summary
> Name | Status      | Total Tablets | Healthy | Under-replicated | Unavailable
> -----+-------------+---------------+---------+------------------+------------
> t1   | HEALTHY     | 1             | 1       | 0                | 0
> t2   | UNAVAILABLE | 5             | 0       | 1                | 4
> t3   | UNAVAILABLE | 6             | 2       | 0                | 4
> t3   | UNAVAILABLE | 3             | 0       | 0                | 3





[jira] [Created] (KUDU-2638) kudu cluster restart very long time to reused

2018-12-12 Thread jiaqiyang (JIRA)
jiaqiyang created KUDU-2638:
---

 Summary: kudu cluster restart very long time to reused
 Key: KUDU-2638
 URL: https://issues.apache.org/jira/browse/KUDU-2638
 Project: Kudu
  Issue Type: Improvement
Reporter: jiaqiyang


When I restart my Kudu cluster, all tablets are unavailable.

Running kudu cluster ksck shows:

Table Summary

Name | Status      | Total Tablets | Healthy | Under-replicated | Unavailable
-----+-------------+---------------+---------+------------------+------------
     | HEALTHY     | 1             | 1       | 0                | 0
     | UNAVAILABLE | 5             | 0       | 1                | 4
     | UNAVAILABLE | 6             | 2       | 0                | 4
     | UNAVAILABLE | 3             | 0       | 0                | 3
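
To put one number on the outage, the per-table rows of the summary can be added up. This is a minimal sketch that assumes the rows are available as pipe-delimited text. The function name `tally` and the column keys are hypothetical, and the table names t1 to t3 are the ones used in the later revision of this description.

```python
# Rows of the ksck table summary from this issue, one table per line.
SUMMARY = """\
t1 | HEALTHY | 1 | 1 | 0 | 0
t2 | UNAVAILABLE | 5 | 0 | 1 | 4
t3 | UNAVAILABLE | 6 | 2 | 0 | 4
t3 | UNAVAILABLE | 3 | 0 | 0 | 3
"""

# Keys for the four numeric columns, in the order they appear in each row.
COLUMNS = ("total", "healthy", "under_replicated", "unavailable")


def tally(summary_text):
    """Sum the numeric columns over all rows that have exactly six cells."""
    totals = dict.fromkeys(COLUMNS, 0)
    for row in summary_text.splitlines():
        cells = [cell.strip() for cell in row.split("|")]
        if len(cells) != 6:
            continue  # skip headers, separators, and blank lines
        for column, value in zip(COLUMNS, cells[2:]):
            totals[column] += int(value)
    return totals


totals = tally(SUMMARY)
print(totals)
print(f"{totals['unavailable'] / totals['total']:.0%} of tablets unavailable")
```

For the rows above, 11 of the 15 tablets are unavailable.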


