[jira] [Created] (KUDU-3536) Could not remove renamed recovery dir(nfs) when kudu restarts

2023-12-27 Thread daicheng (Jira)
daicheng created KUDU-3536:
--

 Summary: Could not remove renamed recovery dir(nfs) when kudu 
restarts
 Key: KUDU-3536
 URL: https://issues.apache.org/jira/browse/KUDU-3536
 Project: Kudu
  Issue Type: Bug
Affects Versions: 1.16.0
Reporter: daicheng


With the Kudu directories configured on NFS on k8s and some data inserted into Kudu, the Kudu tserver fails to bootstrap after Kudu is restarted, with an error like:
{code:java}
IO error: Could not remove renamed recovery dir /var/lib/kudu/tserver/wals/1bb9b2f91c3f48d7a97fb974112dedd6.recovery-1703662995452637: /var/lib/kudu/tserver/wals/1bb9b2f91c3f48d7a97fb974112dedd6.recovery-1703662995452637: One or more errors occurred
{code}
The issue does not occur when the directories are on local disk.
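
The tablet server log in this report shows the directory removal failing with `Directory not empty (error 39)`, which is `ENOTEMPTY` on Linux. As a diagnostic aid, the sketch below (Python, not part of Kudu) attempts the same `rmdir` and, if it fails that way, lists what is still inside the directory. The helper names `try_rmdir` and `nfs_placeholders` are hypothetical. The `.nfs` prefix check relies on the Linux NFS client's habit of renaming a deleted-but-still-open file to `.nfsXXXX`; whether that is what happens here is an assumption that this report does not confirm.

```python
import errno
import os


def try_rmdir(path):
    """Try to remove a directory; if it is not empty, return what is left.

    Returns an empty list when the directory was removed. POSIX allows
    rmdir() to report a non-empty directory as ENOTEMPTY or EEXIST, so
    both are treated the same way. Any other error is re-raised.
    """
    try:
        os.rmdir(path)
        return []
    except OSError as e:
        if e.errno not in (errno.ENOTEMPTY, errno.EEXIST):
            raise
        return sorted(os.listdir(path))


def nfs_placeholders(entries):
    """Pick out names that look like NFS client placeholders (.nfsXXXX).

    Assumption: a leftover '.nfs*' entry means a deleted file is still
    held open by some process on the NFS client.
    """
    return [name for name in entries if name.startswith(".nfs")]
```

Running `try_rmdir` against one of the `*.recovery-*` directories named in the error (on a copy, or with the tserver stopped) would show whether leftover entries exist and whether any of them match the `.nfs` pattern.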

Here are some error details:
{code:java}
 Config source |        Replicas        | Current term | Config index | Committed?
---------------+------------------------+--------------+--------------+------------
 master        | A*  B                  |              |              | Yes
 A             | [config not available] |              |              | 
 B             | [config not available] |              |              | 
Tablet 1bb9b2f91c3f48d7a97fb974112dedd6 of table 'impala::test.test_kudu' is unavailable: 2 replica(s) not RUNNING
  1bf087d776394884b2031385cd7e8b82 (kudu-tserver-0.kudu-tservers.qilu-local.svc.cluster.local:7050): not running
    State:       FAILED
    Data state:  TABLET_DATA_READY
    Last status: IO error: Could not remove renamed recovery dir /var/lib/kudu/tserver/wals/1bb9b2f91c3f48d7a97fb974112dedd6.recovery-1703663028897150: /var/lib/kudu/tserver/wals/1bb9b2f91c3f48d7a97fb974112dedd6.recovery-1703663028897150: One or more errors occurred
  ea0e0a381c284877aa234228ed81a24f (kudu-tserver-1.kudu-tservers.qilu-local.svc.cluster.local:7050): not running [LEADER]
    State:       FAILED
    Data state:  TABLET_DATA_READY
    Last status: IO error: Could not remove renamed recovery dir /var/lib/kudu/tserver/wals/1bb9b2f91c3f48d7a97fb974112dedd6.recovery-1703662995452637: /var/lib/kudu/tserver/wals/1bb9b2f91c3f48d7a97fb974112dedd6.recovery-1703662995452637: One or more errors occurred
{code}
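
Logs pasted through a web log viewer and then re-wrapped by mail, like the tablet server log in this report, can end up with glog entries run together and a viewer timestamp such as `Wed, Dec 27 2023 3:43:15 pm` attached to each one. The following is a minimal Python sketch for splitting such text back into one glog entry per item. It assumes the usual glog prefix (severity letter, `MMDD`, time, thread id, `file:line]`) and the viewer timestamp format seen here; the function name `split_glog` is hypothetical and this is not a Kudu tool.

```python
import re

# Zero-width match at the start of a glog entry, e.g.
# "W1227 07:43:15.222187 74 env_posix.cc:2337] ".
GLOG_START = re.compile(
    r"(?=[IWEF]\d{4} \d{2}:\d{2}:\d{2}\.\d+ +\d+ [\w.]+:\d+\] )"
)

# Viewer timestamp left at the end of a chunk, e.g.
# "Wed, Dec 27 2023 3:43:15 pm" (format assumed from this report).
VIEWER_TS = re.compile(
    r"[A-Z][a-z]{2}, [A-Z][a-z]{2} \d{1,2} \d{4} \d{1,2}:\d{2}:\d{2} [ap]m\s*$"
)


def split_glog(text):
    """Split wrapped or fused log text into a list of glog entries."""
    # Undo the mail wrapping: collapse all whitespace runs to one space.
    flat = " ".join(text.split())
    entries = []
    # Splitting on an empty (lookahead) match needs Python 3.7+.
    for chunk in GLOG_START.split(flat):
        chunk = VIEWER_TS.sub("", chunk).strip()
        if chunk:
            entries.append(chunk)
    return entries
```

Filtering the result for entries that start with `W` or `E` then gives only the warnings and errors, for example `[e for e in split_glog(text) if e[0] in "WE"]`.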
{code:java}
W1227 07:43:15.222187 74 env_posix.cc:2337] Could not delete directory: IO error: /var/lib/kudu/tserver/wals/3b734a27abc74768ad6cff599b66f0f1.recovery-1703662995205917: Directory not empty (error 39)
Wed, Dec 27 2023 3:43:15 pm W1227 07:43:15.19 74 env_posix.cc:2063] Error running callback with file /var/lib/kudu/tserver/wals/3b734a27abc74768ad6cff599b66f0f1.recovery-1703662995205917 during walk: IO error: /var/lib/kudu/tserver/wals/3b734a27abc74768ad6cff599b66f0f1.recovery-1703662995205917: Directory not empty (error 39)
Wed, Dec 27 2023 3:43:15 pm E1227 07:43:15.261075 74 ts_tablet_manager.cc:1378] T 3b734a27abc74768ad6cff599b66f0f1 P ea0e0a381c284877aa234228ed81a24f: Tablet failed to bootstrap: IO error: Could not remove renamed recovery dir /var/lib/kudu/tserver/wals/3b734a27abc74768ad6cff599b66f0f1.recovery-1703662995205917: /var/lib/kudu/tserver/wals/3b734a27abc74768ad6cff599b66f0f1.recovery-1703662995205917: One or more errors occurred
Wed, Dec 27 2023 3:43:15 pm I1227 07:43:15.261124 74 ts_tablet_manager.cc:1356] T 3b734a27abc74768ad6cff599b66f0f1 P ea0e0a381c284877aa234228ed81a24f: Time spent bootstrapping tablet: real 0.213s user 0.070s sys 0.035s
Wed, Dec 27 2023 3:43:15 pm I1227 07:43:15.261147 74 tablet_replica.cc:323] stopping tablet replica
Wed, Dec 27 2023 3:43:15 pm I1227 07:43:15.261160 74 raft_consensus.cc:2227] T 3b734a27abc74768ad6cff599b66f0f1 P ea0e0a381c284877aa234228ed81a24f [term 1 FOLLOWER]: Raft consensus shutting down.
Wed, Dec 27 2023 3:43:15 pm I1227 07:43:15.261169 74 raft_consensus.cc:2256] T 3b734a27abc74768ad6cff599b66f0f1 P ea0e0a381c284877aa234228ed81a24f [term 1 FOLLOWER]: Raft consensus is shut down!
Wed, Dec 27 2023 3:43:15 pm I1227 07:43:15.261204 74 tablet_bootstrap.cc:492] T 1bb9b2f91c3f48d7a97fb974112dedd6 P ea0e0a381c284877aa234228ed81a24f: Bootstrap starting.
Wed, Dec 27 2023 3:43:15 pm I1227 07:43:15.452575 74 tablet_bootstrap.cc:492] T 1bb9b2f91c3f48d7a97fb974112dedd6 P ea0e0a381c284877aa234228ed81a24f: Bootstrap replayed 1/1 log segments. Stats: ops{read=4406 overwritten=0 applied=4406 ignored=2} inserts{seen=0 ignored=0} mutations{seen=0 ignored=0} orphaned_commits=0. Pending: 0 replicates
Wed, Dec 27 2023 3:43:15 pm W1227 07:43:15.469259 74 env_posix.cc:2337] Could not delete directory: IO error: /var/lib/kudu/tserver/wals/1bb9b2f91c3f48d7a97fb974112dedd6.recovery-1703662995452637: Directory not empty (error 39)
Wed, Dec 27 2023 3:43:15 pm W1227 07:43:15.469303 74 env_posix.cc:2063] Error running callback with file /var/lib/kudu/tserver/wals/1bb9b2f91c3f48d7a97fb974112dedd6.recovery-1703662995452637 during walk: IO error: /var/lib/kudu/tserver/wals/1bb9b2f91c3f48d7a97fb97

[jira] [Created] (KUDU-3537) Could not remove renamed recovery dir(nfs) when kudu restarts

2023-12-27 Thread daicheng (Jira)
daicheng created KUDU-3537:
--

 Summary: Could not remove renamed recovery dir(nfs) when kudu 
restarts
 Key: KUDU-3537
 URL: https://issues.apache.org/jira/browse/KUDU-3537
 Project: Kudu
  Issue Type: Bug
Affects Versions: 1.16.0
 Environment: kudu on k8s
Reporter: daicheng



[jira] [Closed] (KUDU-3537) Could not remove renamed recovery dir(nfs) when kudu restarts

2023-12-27 Thread daicheng (Jira)


 [ 
https://issues.apache.org/jira/browse/KUDU-3537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

daicheng closed KUDU-3537.
--
Resolution: Duplicate

> Could not remove renamed recovery dir(nfs) when kudu restarts
> -
>
> Key: KUDU-3537
> URL: https://issues.apache.org/jira/browse/KUDU-3537
> Project: Kudu
>  Issue Type: Bug
>Affects Versions: 1.16.0
> Environment: kudu on k8s
>Reporter: daicheng
>Priority: Major
>

[jira] [Commented] (KUDU-3536) Could not remove renamed recovery dir(nfs) when kudu restarts

2023-12-27 Thread daicheng (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-3536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17800715#comment-17800715
 ] 

daicheng commented on KUDU-3536:



[jira] (KUDU-3536) Could not remove renamed recovery dir(nfs) when kudu restarts

2023-12-27 Thread daicheng (Jira)


[ https://issues.apache.org/jira/browse/KUDU-3536 ]


daicheng deleted comment on KUDU-3536:


was (Author: dachn):
With the Kudu directories configured on NFS on k8s and some data inserted into Kudu, the Kudu tserver fails to bootstrap after Kudu is restarted, with an error like:
{code:java}
IO error: Could not remove renamed recovery dir /var/lib/kudu/tserver/wals/1bb9b2f91c3f48d7a97fb974112dedd6.recovery-1703662995452637: /var/lib/kudu/tserver/wals/1bb9b2f91c3f48d7a97fb974112dedd6.recovery-1703662995452637: One or more errors occurred
{code}
The issue does not occur when the directories are on local disk.

Here are some error details:
{code:java}
 Config source |        Replicas        | Current term | Config index | Committed?
---------------+------------------------+--------------+--------------+------------
 master        | A*  B                  |              |              | Yes
 A             | [config not available] |              |              | 
 B             | [config not available] |              |              | 
Tablet 1bb9b2f91c3f48d7a97fb974112dedd6 of table 'impala::test.test_kudu' is unavailable: 2 replica(s) not RUNNING
  1bf087d776394884b2031385cd7e8b82 (kudu-tserver-0.kudu-tservers.qilu-local.svc.cluster.local:7050): not running
    State:       FAILED
    Data state:  TABLET_DATA_READY
    Last status: IO error: Could not remove renamed recovery dir /var/lib/kudu/tserver/wals/1bb9b2f91c3f48d7a97fb974112dedd6.recovery-1703663028897150: /var/lib/kudu/tserver/wals/1bb9b2f91c3f48d7a97fb974112dedd6.recovery-1703663028897150: One or more errors occurred
  ea0e0a381c284877aa234228ed81a24f (kudu-tserver-1.kudu-tservers.qilu-local.svc.cluster.local:7050): not running [LEADER]
    State:       FAILED
    Data state:  TABLET_DATA_READY
    Last status: IO error: Could not remove renamed recovery dir /var/lib/kudu/tserver/wals/1bb9b2f91c3f48d7a97fb974112dedd6.recovery-1703662995452637: /var/lib/kudu/tserver/wals/1bb9b2f91c3f48d7a97fb974112dedd6.recovery-1703662995452637: One or more errors occurred
{code}
{code:java}
W1227 07:43:15.222187 74 env_posix.cc:2337] Could not delete directory: IO error: /var/lib/kudu/tserver/wals/3b734a27abc74768ad6cff599b66f0f1.recovery-1703662995205917: Directory not empty (error 39)
Wed, Dec 27 2023 3:43:15 pm W1227 07:43:15.19 74 env_posix.cc:2063] Error running callback with file /var/lib/kudu/tserver/wals/3b734a27abc74768ad6cff599b66f0f1.recovery-1703662995205917 during walk: IO error: /var/lib/kudu/tserver/wals/3b734a27abc74768ad6cff599b66f0f1.recovery-1703662995205917: Directory not empty (error 39)
Wed, Dec 27 2023 3:43:15 pm E1227 07:43:15.261075 74 ts_tablet_manager.cc:1378] T 3b734a27abc74768ad6cff599b66f0f1 P ea0e0a381c284877aa234228ed81a24f: Tablet failed to bootstrap: IO error: Could not remove renamed recovery dir /var/lib/kudu/tserver/wals/3b734a27abc74768ad6cff599b66f0f1.recovery-1703662995205917: /var/lib/kudu/tserver/wals/3b734a27abc74768ad6cff599b66f0f1.recovery-1703662995205917: One or more errors occurred
Wed, Dec 27 2023 3:43:15 pm I1227 07:43:15.261124 74 ts_tablet_manager.cc:1356] T 3b734a27abc74768ad6cff599b66f0f1 P ea0e0a381c284877aa234228ed81a24f: Time spent bootstrapping tablet: real 0.213s user 0.070s sys 0.035s
Wed, Dec 27 2023 3:43:15 pm I1227 07:43:15.261147 74 tablet_replica.cc:323] stopping tablet replica
Wed, Dec 27 2023 3:43:15 pm I1227 07:43:15.261160 74 raft_consensus.cc:2227] T 3b734a27abc74768ad6cff599b66f0f1 P ea0e0a381c284877aa234228ed81a24f [term 1 FOLLOWER]: Raft consensus shutting down.
Wed, Dec 27 2023 3:43:15 pm I1227 07:43:15.261169 74 raft_consensus.cc:2256] T 3b734a27abc74768ad6cff599b66f0f1 P ea0e0a381c284877aa234228ed81a24f [term 1 FOLLOWER]: Raft consensus is shut down!
Wed, Dec 27 2023 3:43:15 pm I1227 07:43:15.261204 74 tablet_bootstrap.cc:492] T 1bb9b2f91c3f48d7a97fb974112dedd6 P ea0e0a381c284877aa234228ed81a24f: Bootstrap starting.
Wed, Dec 27 2023 3:43:15 pm I1227 07:43:15.452575 74 tablet_bootstrap.cc:492] T 1bb9b2f91c3f48d7a97fb974112dedd6 P ea0e0a381c284877aa234228ed81a24f: Bootstrap replayed 1/1 log segments. Stats: ops{read=4406 overwritten=0 applied=4406 ignored=2} inserts{seen=0 ignored=0} mutations{seen=0 ignored=0} orphaned_commits=0. Pending: 0 replicates
Wed, Dec 27 2023 3:43:15 pm W1227 07:43:15.469259 74 env_posix.cc:2337] Could not delete directory: IO error: /var/lib/kudu/tserver/wals/1bb9b2f91c3f48d7a97fb974112dedd6.recovery-1703662995452637: Directory not empty (error 39)
Wed, Dec 27 2023 3:43:15 pm W1227 07:43:15.469303 74 env_posix.cc:2063] Error running callback with file /var/lib/kudu/tserver/wals/1bb9b2f91c3f48d7a97fb974112dedd6.recovery-1703662995452637 during walk: IO error: /var/lib/kudu/tserver/wals/1bb9b2f91c3f48d7a97fb974112dedd6.recovery-1703662995452637: Directory not empty (error 39)
Wed, Dec 27 2023 3:43:15 pm E1227 07:43:15.504146 74 ts_tablet_manager.cc:1378] T 1bb9b2f91c3f48d7a97fb974112dedd6 P ea0e0a381c2848

[jira] [Updated] (KUDU-3536) Could not remove renamed recovery dir(nfs) when kudu restarts

2023-12-27 Thread daicheng (Jira)


 [ 
https://issues.apache.org/jira/browse/KUDU-3536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

daicheng updated KUDU-3536:
---
Attachment: image-2023-12-27-17-53-49-704.png

> Could not remove renamed recovery dir(nfs) when kudu restarts
> -
>
> Key: KUDU-3536
> URL: https://issues.apache.org/jira/browse/KUDU-3536
> Project: Kudu
>  Issue Type: Bug
>Affects Versions: 1.16.0
>Reporter: daicheng
>Priority: Major
> Attachments: image-2023-12-27-17-53-49-704.png
>
>

[jira] [Updated] (KUDU-3536) Could not remove renamed recovery dir(nfs) when kudu restarts

2023-12-27 Thread daicheng (Jira)


 [ 
https://issues.apache.org/jira/browse/KUDU-3536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

daicheng updated KUDU-3536:
---
Attachment: image-2023-12-27-17-53-58-982.png

> Could not remove renamed recovery dir(nfs) when kudu restarts
> -
>
> Key: KUDU-3536
> URL: https://issues.apache.org/jira/browse/KUDU-3536
> Project: Kudu
>  Issue Type: Bug
>Affects Versions: 1.16.0
>Reporter: daicheng
>Priority: Major
> Attachments: image-2023-12-27-17-53-49-704.png, 
> image-2023-12-27-17-53-58-982.png
>
>
> Configured kudu directories to NFS on k8s , and insert some data to 
> kudu,after restart kudu, the kudu tserver  fails to bootstrap with error like 
> :
> {code:java}
> IO error: Could not remove renamed recovery dir 
> /var/lib/kudu/tserver/wals/1bb9b2f91c3f48d7a97fb974112dedd6.recovery-1703662995452637:
>  
> /var/lib/kudu/tserver/wals/1bb9b2f91c3f48d7a97fb974112dedd6.recovery-1703662995452637:
>  One or more errors occurred {code}
> while the issue didn't comes when the directory on local disk.
> here some error details:
> {code:java}
>  Config source |        Replicas        | Current term | Config index | Committed?
> ---------------+------------------------+--------------+--------------+------------
>  master        | A*  B                  |              |              | Yes
>  A             | [config not available] |              |              | 
>  B             | [config not available] |              |              | 
> Tablet 1bb9b2f91c3f48d7a97fb974112dedd6 of table 'impala::test.test_kudu' is unavailable: 2 replica(s) not RUNNING
>   1bf087d776394884b2031385cd7e8b82 (kudu-tserver-0.kudu-tservers.qilu-local.svc.cluster.local:7050): not running
>     State:       FAILED
>     Data state:  TABLET_DATA_READY
>     Last status: IO error: Could not remove renamed recovery dir /var/lib/kudu/tserver/wals/1bb9b2f91c3f48d7a97fb974112dedd6.recovery-1703663028897150: /var/lib/kudu/tserver/wals/1bb9b2f91c3f48d7a97fb974112dedd6.recovery-1703663028897150: One or more errors occurred
>   ea0e0a381c284877aa234228ed81a24f (kudu-tserver-1.kudu-tservers.qilu-local.svc.cluster.local:7050): not running [LEADER]
>     State:       FAILED
>     Data state:  TABLET_DATA_READY
>     Last status: IO error: Could not remove renamed recovery dir /var/lib/kudu/tserver/wals/1bb9b2f91c3f48d7a97fb974112dedd6.recovery-1703662995452637: /var/lib/kudu/tserver/wals/1bb9b2f91c3f48d7a97fb974112dedd6.recovery-1703662995452637: One or more errors occurred
> {code}
> {code:java}
> W1227 07:43:15.222187 74 env_posix.cc:2337] Could not delete directory: IO error: /var/lib/kudu/tserver/wals/3b734a27abc74768ad6cff599b66f0f1.recovery-1703662995205917: Directory not empty (error 39)
> Wed, Dec 27 2023 3:43:15 pm W1227 07:43:15.19 74 env_posix.cc:2063] Error running callback with file /var/lib/kudu/tserver/wals/3b734a27abc74768ad6cff599b66f0f1.recovery-1703662995205917 during walk: IO error: /var/lib/kudu/tserver/wals/3b734a27abc74768ad6cff599b66f0f1.recovery-1703662995205917: Directory not empty (error 39)
> Wed, Dec 27 2023 3:43:15 pm E1227 07:43:15.261075 74 ts_tablet_manager.cc:1378] T 3b734a27abc74768ad6cff599b66f0f1 P ea0e0a381c284877aa234228ed81a24f: Tablet failed to bootstrap: IO error: Could not remove renamed recovery dir /var/lib/kudu/tserver/wals/3b734a27abc74768ad6cff599b66f0f1.recovery-1703662995205917: /var/lib/kudu/tserver/wals/3b734a27abc74768ad6cff599b66f0f1.recovery-1703662995205917: One or more errors occurred
> Wed, Dec 27 2023 3:43:15 pm I1227 07:43:15.261124 74 ts_tablet_manager.cc:1356] T 3b734a27abc74768ad6cff599b66f0f1 P ea0e0a381c284877aa234228ed81a24f: Time spent bootstrapping tablet: real 0.213s user 0.070s sys 0.035s
> Wed, Dec 27 2023 3:43:15 pm I1227 07:43:15.261147 74 tablet_replica.cc:323] stopping tablet replica
> Wed, Dec 27 2023 3:43:15 pm I1227 07:43:15.261160 74 raft_consensus.cc:2227] T 3b734a27abc74768ad6cff599b66f0f1 P ea0e0a381c284877aa234228ed81a24f [term 1 FOLLOWER]: Raft consensus shutting down.
> Wed, Dec 27 2023 3:43:15 pm I1227 07:43:15.261169 74 raft_consensus.cc:2256] T 3b734a27abc74768ad6cff599b66f0f1 P ea0e0a381c284877aa234228ed81a24f [term 1 FOLLOWER]: Raft consensus is shut down!
> Wed, Dec 27 2023 3:43:15 pm I1227 07:43:15.261204 74 tablet_bootstrap.cc:492] T 1bb9b2f91c3f48d7a97fb974112dedd6 P ea0e0a381c284877aa234228ed81a24f: Bootstrap starting.
> Wed, Dec 27 2023 3:43:15 pm I1227 07:43:15.452575 74 tablet_bootstrap.cc:492] T 1bb9b2f91c3f48d7a97fb974112dedd6 P ea0e0a381c284877aa234228ed81a24f: Bootstrap replayed 1/1 log segments. Stats: ops{read=4406 overwritten=0 applied=4406 ignored=2} inserts{seen=0 ignored=0} mutations{seen=0 ignored=0} orphaned_commits=0. Pending: 0 replicat
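
For readers unfamiliar with the `Directory not empty (error 39)` message in the log above: on Linux, errno 39 is ENOTEMPTY, which rmdir(2) returns when the target directory still contains entries. The Python sketch below reproduces that error on an ordinary temporary directory. It does not involve Kudu or NFS, and the function and file names are hypothetical, chosen only for illustration.

```python
import errno
import os
import tempfile


def rmdir_errno(path):
    """Try to remove `path` with rmdir; return the errno on failure, else None."""
    try:
        os.rmdir(path)
    except OSError as e:
        return e.errno
    return None


def demo():
    """Show that rmdir fails while an entry remains and succeeds once it is gone."""
    with tempfile.TemporaryDirectory() as root:
        # Hypothetical stand-in for a renamed recovery directory.
        recovery = os.path.join(root, "tablet.recovery-0")
        os.mkdir(recovery)
        leftover = os.path.join(recovery, "wal-000000001")
        open(leftover, "w").close()

        first = rmdir_errno(recovery)   # fails: the directory still has an entry
        os.remove(leftover)
        second = rmdir_errno(recovery)  # succeeds: the directory is now empty
        return first, second
```

Calling `demo()` returns the errno from the failed attempt followed by `None` for the successful one. Any entry that reappears or lingers in the directory between the file deletions and the final rmdir would produce the same failure.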

[jira] [Updated] (KUDU-3536) Could not remove renamed recovery dir(nfs) when kudu restarts

2023-12-27 Thread daicheng (Jira)


 [ 
https://issues.apache.org/jira/browse/KUDU-3536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

daicheng updated KUDU-3536:
---
Attachment: image-2023-12-27-17-56-03-991.png


[jira] [Updated] (KUDU-3536) Could not remove renamed recovery dir(nfs) when kudu restarts

2023-12-27 Thread daicheng (Jira)


 [ 
https://issues.apache.org/jira/browse/KUDU-3536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

daicheng updated KUDU-3536:
---
Attachment: image-2023-12-27-17-57-17-795.png


[jira] [Updated] (KUDU-3536) Could not remove renamed recovery dir(nfs) when kudu restarts

2023-12-27 Thread daicheng (Jira)


 [ 
https://issues.apache.org/jira/browse/KUDU-3536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

daicheng updated KUDU-3536:
---
Attachment: image-2023-12-27-17-58-56-351.png


[jira] [Commented] (KUDU-3536) Could not remove renamed recovery dir(nfs) when kudu restarts

2023-12-27 Thread daicheng (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-3536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17800730#comment-17800730
 ] 

daicheng commented on KUDU-3536:


(1) Here are some directories before the restart:

data dir:
{code:java}
kudu@kudu-tserver-1:/var/lib/kudu/tserver/data$ ls
35  48  block_manager_instance
kudu@kudu-tserver-1:/var/lib/kudu/tserver/data$ ls 35/05/11/00582958901988
0058295890198802  0058295890198805  0058295890198808  0058295890198812  0058295890198815  0058295890198819  0058295890198823  0058295890198826
0058295890198803  0058295890198806  0058295890198810  0058295890198813  0058295890198816  0058295890198820  0058295890198824
0058295890198804  0058295890198807  0058295890198811  0058295890198814  0058295890198818  0058295890198821  0058295890198825
{code}
wals dir:

!image-2023-12-27-17-53-58-982.png!

After the restart:

Data still exists in the data dir, like this:

!image-2023-12-27-17-56-03-991.png!

and the wals dir contains many directories like these:

!image-2023-12-27-17-57-17-795.png!

!image-2023-12-27-17-58-56-351.png!
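
To see what is keeping the leftover recovery directories non-empty, their remaining entries can be listed. The Python sketch below is only a diagnostic aid under stated assumptions: it treats any directory whose name contains `.recovery-` as a renamed recovery directory (matching the paths in the error messages), and it flags entries starting with `.nfs`, which is the name pattern NFS clients use when a file is unlinked while still open. Whether such placeholder files are the cause here is an assumption and is not confirmed by this report. The function names are hypothetical.

```python
import os


def leftover_entries(wal_root):
    """Map each '*.recovery-*' directory under wal_root to its remaining entries."""
    result = {}
    for name in sorted(os.listdir(wal_root)):
        path = os.path.join(wal_root, name)
        if ".recovery-" in name and os.path.isdir(path):
            result[name] = sorted(os.listdir(path))
    return result


def nfs_placeholders(entries):
    """Return the entries that look like NFS placeholder files ('.nfs' prefix)."""
    return [e for e in entries if e.startswith(".nfs")]


def report(wal_root):
    """Print every leftover recovery directory with its entries."""
    for name, entries in leftover_entries(wal_root).items():
        print(name, entries, "nfs placeholders:", nfs_placeholders(entries))
```

For the setup in this issue, the call would be `report("/var/lib/kudu/tserver/wals")`, run inside the tserver pod.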


[jira] [Comment Edited] (KUDU-3536) Could not remove renamed recovery dir(nfs) when kudu restarts

2023-12-27 Thread daicheng (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-3536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17800730#comment-17800730
 ] 

daicheng edited comment on KUDU-3536 at 12/27/23 10:02 AM:
---

(1) Here are some directories before the restart:

data dir:
{code:java}
kudu@kudu-tserver-1:/var/lib/kudu/tserver/data$ ls
35  48  block_manager_instance
kudu@kudu-tserver-1:/var/lib/kudu/tserver/data$ ls 35/05/11/00582958901988
0058295890198802  0058295890198805  0058295890198808  0058295890198812  0058295890198815  0058295890198819  0058295890198823  0058295890198826
0058295890198803  0058295890198806  0058295890198810  0058295890198813  0058295890198816  0058295890198820  0058295890198824
0058295890198804  0058295890198807  0058295890198811  0058295890198814  0058295890198818  0058295890198821  0058295890198825
{code}
wals dir:

!image-2023-12-27-17-53-58-982.png!

(2) After the restart:

Data still exists in the data dir, like this:

!image-2023-12-27-17-56-03-991.png!

and the wals dir contains many directories like these:

!image-2023-12-27-17-57-17-795.png!

!image-2023-12-27-17-58-56-351.png!



[jira] [Comment Edited] (KUDU-3536) Could not remove renamed recovery dir(nfs) when kudu restarts

2023-12-27 Thread daicheng (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-3536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17800730#comment-17800730
 ] 

daicheng edited comment on KUDU-3536 at 12/27/23 10:05 AM:
---

(1) Here are some directories before the restart:

data dir:
{code:java}
kudu@kudu-tserver-1:/var/lib/kudu/tserver/data$ ls
35  48  block_manager_instance
kudu@kudu-tserver-1:/var/lib/kudu/tserver/data$ ls 35/05/11/00582958901988
0058295890198802  0058295890198805  0058295890198808  0058295890198812  0058295890198815  0058295890198819  0058295890198823  0058295890198826
0058295890198803  0058295890198806  0058295890198810  0058295890198813  0058295890198816  0058295890198820  0058295890198824
0058295890198804  0058295890198807  0058295890198811  0058295890198814  0058295890198818  0058295890198821  0058295890198825
{code}
wals dir:

!image-2023-12-27-17-53-58-982.png!

(2) After the restart:

Data still exists in the data dir, like this:

!image-2023-12-27-17-56-03-991.png!

and the wals dir contains many directories like these:

!image-2023-12-27-17-57-17-795.png!

!image-2023-12-27-17-58-56-351.png!

 

*Errors are reported in the log and repeat in a loop:*
|I1227 10:01:02.639056 3479 tablet_replica.cc:323] stopping tablet replica|
|Wed, Dec 27 2023 6:01:02 pm|I1227 10:01:02.639060 3479 raft_consensus.cc:2227] T db50b1f175124cd5a3dd66362164fe9c P ea0e0a381c284877aa234228ed81a24f [term 1 LEARNER]: Raft consensus shutting down.|
|Wed, Dec 27 2023 6:01:02 pm|I1227 10:01:02.639070 3479 raft_consensus.cc:2256] T db50b1f175124cd5a3dd66362164fe9c P ea0e0a381c284877aa234228ed81a24f [term 1 LEARNER]: Raft consensus is shut down!|
|Wed, Dec 27 2023 6:01:02 pm|I1227 10:01:02.842792 131 tablet_service.cc:1508] Processing DeleteTablet for tablet 6890f8aee0934b5e8a9c84460516d6e4 with delete_type TABLET_DATA_TOMBSTONED (TS ea0e0a381c284877aa234228ed81a24f not found in new config with opid_index 581) from \{username='kudu'} at 10.244.26.209:43330|
|Wed, Dec 27 2023 6:01:02 pm|I1227 10:01:02.842931 3486 ts_tablet_manager.cc:1822] T 6890f8aee0934b5e8a9c84460516d6e4 P ea0e0a381c284877aa234228ed81a24f: Deleting tablet data with delete state TABLET_DATA_TOMBSTONED|
|Wed, Dec 27 2023 6:01:02 pm|I1227 10:01:02.970611 3486 ts_tablet_manager.cc:1835] T 6890f8aee0934b5e8a9c84460516d6e4 P ea0e0a381c284877aa234228ed81a24f: tablet deleted with delete type TABLET_DATA_TOMBSTONED: last-logged OpId unknown|
|Wed, Dec 27 2023 6:01:02 pm|I1227 10:01:02.970773 3486 log.cc:1192] T 6890f8aee0934b5e8a9c84460516d6e4 P ea0e0a381c284877aa234228ed81a24f: Deleting WAL directory at /var/lib/kudu/tserver/wals/6890f8aee0934b5e8a9c84460516d6e4|
|Wed, Dec 27 2023 6:01:02 pm|I1227 10:01:02.987134 131 tablet_service.cc:1508] Processing DeleteTablet for tablet db50b1f175124cd5a3dd66362164fe9c with delete_type TABLET_DATA_TOMBSTONED (TS ea0e0a381c284877aa234228ed81a24f not found in new config with opid_index 618) from \{username='kudu'} at 10.244.26.209:43330|
|Wed, Dec 27 2023 6:01:03 pm|I1227 10:01:03.001976 3486 ts_tablet_manager.cc:1822] T db50b1f175124cd5a3dd66362164fe9c P ea0e0a381c284877aa234228ed81a24f: Deleting tablet data with delete state TABLET_DATA_TOMBSTONED|
|Wed, Dec 27 2023 6:01:03 pm|I1227 10:01:03.087684 3486 ts_tablet_manager.cc:1835] T db50b1f175124cd5a3dd66362164fe9c P ea0e0a381c284877aa234228ed81a24f: tablet deleted with delete type TABLET_DATA_TOMBSTONED: last-logged OpId unknown|
|Wed, Dec 27 2023 6:01:03 pm|I1227 10:01:03.087839 3486 log.cc:1192] T db50b1f175124cd5a3dd66362164fe9c P ea0e0a381c284877aa234228ed81a24f: Deleting WAL directory at /var/lib/kudu/tserver/wals/db50b1f175124cd5a3dd66362164fe9c|
|Wed, Dec 27 2023 6:01:03 pm|I1227 10:01:03.266434 3487 ts_tablet_manager.cc:891] T 6890f8aee0934b5e8a9c84460516d6e4 P ea0e0a381c284877aa234228ed81a24f: Initiating tablet copy from peer a2af138dcf6c4b3fb51c02262be08333 (kudu-tserver-2.kudu-tservers.qilu-local.svc.cluster.local:7050)|
|Wed, Dec 27 2023 6:01:03 pm|I1227 10:01:03.266579 3487 tablet_copy_client.cc:250] T 6890f8aee0934b5e8a9c84460516d6e4 P ea0e0a381c284877aa234228ed81a24f: tablet copy: overwriting existing tombstoned replica with an unknown last-logged opid|
|Wed, Dec 27 2023 6:01:03 pm|I1227 10:01:03.267592 3487 tablet_copy_client.cc:287] T 6890f8aee0934b5e8a9c84460516d6e4 P ea0e0a381c284877aa234228ed81a24f: tablet copy: Beginning tablet copy session from remote peer at address kudu-tserver-2.kudu-tservers.qilu-local.svc.cluster.local:7050|
|Wed, Dec 27 2023 6:01:03 pm|I1227 10:01:03.269829 3487 ts_tablet_manager.cc:1822] T 6890f8aee0934b5e8a9c84460516d6e4 P ea0e0a381c284877aa234228ed81a24f: Deleting tablet data with delete state TABLET_DATA_COPYING|
|Wed, Dec 27 2023 6:01:03 pm|I1227 10:01:03.326943 3487 
ts_tablet_manager.cc:1835] T 6890f8aee0934b5e8a9c84460516d6e4 P 
ea0e0a381c284877aa234228ed81a24f: tablet deleted with delete type 
TABLET_DATA_COPYING: