[ 
https://issues.apache.org/jira/browse/KUDU-2453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

HeLifu updated KUDU-2453:
-------------------------
    Description: 
I hit this problem again on 2018/10/26, this time on Kudu 1.7.2.

The kudu-master log is as follows:
{code:java}
I1031 16:21:21.644222 180146 catalog_manager.cc:2922] Sending DeleteTablet(TABLET_DATA_DELETED) for tablet d1fd56be8eef44e782d509a0eeae9c15 on 39f15fcf42ef45bba0c95a3223dc25ee (kudu2.lt.163.org:7050) (Replaced by ff4fd0a538944d69b8a6beea81e5bb01 at 2018-10-24 12:39:17 CST)
W1031 16:21:21.644421 180146 catalog_manager.cc:2892] TS 39f15fcf42ef45bba0c95a3223dc25ee (kudu2.lt.163.org:7050): delete failed for tablet d1fd56be8eef44e782d509a0eeae9c15 with error code TABLET_NOT_RUNNING: Already present: State transition of tablet d1fd56be8eef44e782d509a0eeae9c15 already in progress: creating tablet
I1031 16:21:21.644436 180146 catalog_manager.cc:2700] Scheduling retry of d1fd56be8eef44e782d509a0eeae9c15 Delete Tablet RPC for TS=39f15fcf42ef45bba0c95a3223dc25ee with a delay of 553 ms (attempt = 6)
{code}
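
To see how many tablets are stuck in this retry loop, the master log can be scanned for the "Scheduling retry" lines. Below is a minimal sketch, assuming each log entry sits on a single line in the actual log file and follows the wording of the excerpt above; the function name `summarize_delete_retries` is only illustrative and is not part of Kudu.

```python
import re

# Matches the master's retry lines as they appear in the excerpt above.
# The exact wording is taken from that excerpt; other Kudu versions may differ.
RETRY_RE = re.compile(
    r"Scheduling retry of (?P<tablet>[0-9a-f]+) Delete Tablet RPC for "
    r"TS=(?P<ts>[0-9a-f]+) with a delay of (?P<delay_ms>\d+) ms "
    r"\(attempt = (?P<attempt>\d+)\)"
)


def summarize_delete_retries(lines):
    """Return {(tablet_id, ts_uuid): highest retry attempt seen}."""
    attempts = {}
    for line in lines:
        m = RETRY_RE.search(line)
        if m:
            key = (m["tablet"], m["ts"])
            attempts[key] = max(attempts.get(key, 0), int(m["attempt"]))
    return attempts


# The retry line from the log excerpt, joined onto one line.
sample = [
    "I1031 16:21:21.644436 180146 catalog_manager.cc:2700] Scheduling retry of "
    "d1fd56be8eef44e782d509a0eeae9c15 Delete Tablet RPC for "
    "TS=39f15fcf42ef45bba0c95a3223dc25ee with a delay of 553 ms (attempt = 6)",
]
result = summarize_delete_retries(sample)
```

Running this over the full master log (for example `summarize_delete_retries(open(path))`, with `path` pointing at the log file) would give one entry per tablet/tserver pair still being retried.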
The kudu-tserver log is as follows:
{code:java}
I1031 16:21:22.197888 137341 tablet_service.cc:799] Processing DeleteTablet for tablet d1fd56be8eef44e782d509a0eeae9c15 with delete_type TABLET_DATA_DELETED (Replaced by ff4fd0a538944d69b8a6beea81e5bb01 at 2018-10-24 12:39:17 CST) from {username='kudu'} at 10.120.219.118:50247
I1031 16:21:22.230309 137131 maintenance_manager.cc:492] P 39f15fcf42ef45bba0c95a3223dc25ee: FlushDeltaMemStoresOp(70499bc0f9ac4d8196ae5a0be6ef0b8b) complete. Timing: real 0.416s      user 0.404s     sys 0.008s Metrics: {"fdatasync":3,"fdatasync_us":2583,"lbm_write_time_us":29,"lbm_writes_lt_1ms":4}
I1031 16:21:22.321700 137341 tablet_service.cc:799] Processing DeleteTablet for tablet 74a30181dea9400a9bcfaeb56f83f379 with delete_type TABLET_DATA_DELETED (Replaced by 31e350fddea443048946f5a20d3171bd at 2018-10-31 16:21:13 CST) from {username='kudu'} at 10.120.219.118:50247
I1031 16:21:22.350440 137341 tablet_service.cc:799] Processing DeleteTablet for tablet 7c864af01309432c9a2a4d1c88bbe52b with delete_type TABLET_DATA_DELETED (Replaced by ec4b733818d940e0af32c51bda3c7^C
{code}
 

-----------------------------------------------------------------------

We had modified the flag '{color:#FF0000}max_create_tablets_per_ts{color}' (2000) 
in master.conf, and the Kudu cluster was under some load. Then someone else 
created a big table with tens of thousands of tablets from impala-shell 
(it was a mistake).

The creation took a long time, so he pressed "ctrl+c". But we found that the 
number of tablets in 'INITIALIZED' status kept growing rapidly; half an hour 
later it had reached 350,000 :(

We deleted this table with the kudu client tool, and found that the number of 
'INITIALIZED' tablets was going down slowly. By a rough estimate, it will take 
10+ days to get back to normal. Luckily, the application systems are not 
affected.
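
The cleanup rate implied by these numbers can be worked out as below. This is only a rough sketch: it assumes that about 350,000 tablets had to be removed (the count when the delete actually started is not recorded here) and uses 10 days as the duration, so the result is an approximate upper bound on the rate.

```python
SECONDS_PER_DAY = 86_400


def implied_drain_rate(tablets, days):
    """Average tablets removed per second if `tablets` drain over `days` days."""
    return tablets / (days * SECONDS_PER_DAY)


# Figures from the description: 350,000 'INITIALIZED' tablets, 10+ days to drain.
rate = implied_drain_rate(350_000, 10)
print(f"{rate:.2f} tablets/s, about {rate * 60:.0f} per minute")
```

Under those assumptions the master is removing well under one tablet per second on average.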

 

 

  was:
I hit this problem again on 2018/10/26. The Kudu version is 1.7.2.

Once there are more than 2000 (a threshold value) tablets on one tserver and, 
at the same time, the queue for the consensus service is full, creating a new 
table causes the number of new tablets on the tserver to grow without bound.

I think it is related to the tablet-creation logic on the master, especially 
the replacement operation.

 


> kudu will create tablet infinitely while there are more than 2000 tables on 
> the tserver
> ---------------------------------------------------------------------------------------
>
>                 Key: KUDU-2453
>                 URL: https://issues.apache.org/jira/browse/KUDU-2453
>             Project: Kudu
>          Issue Type: Bug
>          Components: master, tserver
>    Affects Versions: 1.4.0, 1.7.2
>            Reporter: HeLifu
>            Priority: Major
>



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
