liutang123 opened a new issue #3545:
URL: https://github.com/apache/incubator-doris/issues/3545
Background
---
In our cluster, we sometimes need to move the FEs from one set of machines to another.
Suppose there are 3 FEs. The replacement proceeds as follows:
A, B, C -> D, E, F
Stop A, drop A (`ALTER SYSTEM DROP FOLLOWER "{A's ip:A's port}";`);
add D (`ALTER SYSTEM ADD FOLLOWER "{D's ip:D's port}";`);
stop B, drop B;
add E;
stop C, drop C;
add F.
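The sequence above can be written out as a small helper that emits the statements in order. This is only a sketch: the function name `replacement_plan` and the host strings in the usage note are hypothetical, and only the two `ALTER SYSTEM` statements come from the steps above.

```python
def replacement_plan(old_fes, new_fes):
    """Return the ordered steps for replacing FE followers one at a time.

    Each old FE is stopped and dropped before its replacement is added,
    mirroring the A -> D, B -> E, C -> F order described above.
    """
    if len(old_fes) != len(new_fes):
        raise ValueError("expected exactly one new FE per old FE")
    steps = []
    for old, new in zip(old_fes, new_fes):
        # Stopping the process happens outside SQL, so it is emitted as a comment.
        steps.append(f"-- stop the FE process on {old}")
        steps.append(f'ALTER SYSTEM DROP FOLLOWER "{old}";')
        steps.append(f'ALTER SYSTEM ADD FOLLOWER "{new}";')
    return steps
```

For example, `replacement_plan(["a:9010"], ["d:9010"])` yields one stop comment, one `DROP FOLLOWER` and one `ADD FOLLOWER` statement. Note that nothing in this plan prevents a dropped FE such as A from being started again later with its old meta, which is what went wrong here.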
Error occurs
---
For some reason, A restarts and becomes the master (it still thinks the current
FEs are A, B and C):
```
2020-05-07 15:53:50,290 WARN 74 [BDBStateChangeListener.stateChange():42]
transfer from UNKNOWN to MASTER
2020-05-07 15:53:50,297 INFO 84 [BDBHA.fencing():68] start fencing, epoch
number is 28
```
A tries to connect to B and C but fails.
```
2020-05-07 15:54:25,813 INFO 33 [HeartbeatMgr.runOneCycle():131] get
heartbeat response: type: FRONTEND, status: BAD, msg: got exception, name:
10.23.105.20_9010_1560853397130, queryPort: 0, rpcPort: 0, replayedJournalId: 0
2020-05-07 15:54:25,813 INFO 33 [HeartbeatMgr.runOneCycle():131] get
heartbeat response: type: FRONTEND, status: BAD, msg: got exception, name:
10.178.72.6_9010_1572596783035, queryPort: 0, rpcPort: 0, replayedJournalId: 0
```
A tries to connect to the BEs and succeeds.
```
2020-05-07 15:54:25,803 INFO 33 [HeartbeatMgr.runOneCycle():131] get
heartbeat response: type: BACKEND, status: OK, msg: null
```
The BEs treat A as the new master and report their tablets to A:
```
I0507 15:54:25.785990 71263 heartbeat_server.cpp:117] master change. new
master host: 10.22.86.18. port: 9020. epoch: 28
I0507 15:54:25.786010 71263 heartbeat_server.cpp:151] Master FE is changed
or restarted. report tablet and disk info immediately
W0507 15:54:28.283879 2016 heartbeat_server.cpp:120] epoch is not greater
than local. ignore heartbeat. host: 10.22.86.18 port: 9020 local epoch: 28
received epoch: 26
```
**So we can see that the epoch can decrease.**
Because A's meta is too old, it deletes the tablet replicas which are not in
its meta.
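The epoch comparison visible in the BE log can be modelled roughly as follows. This is a simplified sketch inferred from the log messages only, not the actual `heartbeat_server.cpp` code; the class name, the host strings and the handling of heartbeats from the same host are assumptions.

```python
from dataclasses import dataclass


@dataclass
class BackendMasterInfo:
    """What a BE remembers about the master FE (hypothetical model)."""
    host: str = ""
    epoch: int = 0

    def on_heartbeat(self, host: str, epoch: int) -> bool:
        """Return True if the heartbeat is accepted, False if it is ignored."""
        if host != self.host and epoch <= self.epoch:
            # Corresponds to "epoch is not greater than local. ignore heartbeat."
            return False
        # Corresponds to "master change. new master host: ... epoch: ..."
        # Assumption: a heartbeat from the already known host is always accepted.
        self.host = host
        self.epoch = max(self.epoch, epoch)
        return True
```

With this model, a BE that accepts a heartbeat carrying epoch 28 will afterwards ignore a heartbeat from a different host carrying epoch 26. The epoch alone therefore does not protect the BEs from a stale FE that comes back with a larger epoch number.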
```
2020-05-07 16:22:42,879 INFO 159 [ReportHandler.deleteFromBackend():639]
failed add to meta. tablet[35652129], backend[10002]. db[-1] does not exist
```
Try to let Doris recover by itself
---
We kill A and restart the BEs. The FE finds that many tablet replicas are
missing but cannot repair them, because for many tablets all replicas have been
deleted.
```
2020-05-07 17:53:16,222 INFO 1747901 [TabletSchedCtx.finishCloneTask():916]
clone finished: tablet id: 5591104, fe.log.20200507-2:2020-05-07 17:53:16,077
INFO 1614067 [TabletSchedCtx.finishCloneTask():916] clone finished: tablet id:
4831455, status: REPLICA_MISSING, state: FINISHED, type: REPAIR. from backend:
10002, src path hash: -4547742653113016317. to backend: 25550562, dest path
hash: -7051031248689971305. err: unable to find source replica
```
**So the FE does not attempt to recover data from the BE's trash.**
First attempt to recover data manually
---
We consult
[tablet-restore-tool](http://doris.apache.org/master/zh-CN/administrator-guide/operation/tablet-restore-tool.html)
to repair the tablets on the BEs, but it fails. The reason is that a tablet is
set to TABLET_SHUTDOWN when it is deleted, and TabletManager does not load
TABLET_SHUTDOWN tablets.
```
I0507 16:27:10.622463 3849 tablet_manager.cpp:1364] set tablet to shutdown
state and remove it from memory tablet_id=28312052 schema_hash=2083600167
tablet path=/data5/olap/data/739/28312052/2083600167
I0508 01:59:58.980805 45923 restore_tablet_action.cpp:157] tablet path in
trash:/data5/olap/trash/20200507162852.693857/28312052/2083600167
I0508 01:59:59.047287 45923 tablet_manager.cpp:833] tablet is to be deleted,
skip load it tablet id = 28312052 schema hash = 2083600167
```
We modify the source code to skip this check, and part of the data is restored.
**So, when repairing a tablet, the BE should skip the tablet_state check and
set the tablet's state to TABLET_RUNNING.**
Second attempt to recover data manually
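The behaviour described above, and the change being asked for, can be illustrated with a minimal sketch. It is not the BE's C++ code: `load_tablet`, the dict-based meta and the `restore` flag are hypothetical names, and only the two state names come from the text.

```python
from enum import Enum


class TabletState(Enum):
    RUNNING = "TABLET_RUNNING"
    SHUTDOWN = "TABLET_SHUTDOWN"


def load_tablet(meta: dict, restore: bool = False):
    """Return the tablet meta to load, or None if the tablet is skipped.

    Without `restore`, a TABLET_SHUTDOWN tablet is skipped, like the
    "tablet is to be deleted, skip load it" log line. With `restore`
    (the proposed behaviour), it is loaded and set back to TABLET_RUNNING.
    """
    if meta["state"] is TabletState.SHUTDOWN:
        if not restore:
            return None
        # Copy so the caller's meta is not modified in place.
        meta = {**meta, "state": TabletState.RUNNING}
    return meta
```

Keeping the default at `restore=False` preserves the normal startup behaviour; only an explicit restore request would bypass the state check.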
---
For tablets of some empty partitions created by an old version of Doris, the
tablet's version is 2 while the partition's version is 1.
When the FE receives a tablet report and does not find the tablet replica in
its meta, it tries to add the replica info to its meta, but refuses to do so
when the tablet's version is greater than the partition's version.
```
2020-05-08 04:25:55,914 WARN 98 [ReportHandler.deleteFromBackend():652]
failed add to meta. tablet[35501614], backend[10007]. version is invalid.
tablet[2-0], partition's max version [1]
```
We drop these empty partitions, and all data is finally restored.
**So I think Doris should identify this case when adding a replica's info to
its meta.**
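As an illustration of the check and of one possible way to recognise this case, here is a small sketch. The function name, the `tablet_row_count` parameter and the `allow_empty_legacy` flag are all hypothetical; treating "zero rows" as the marker of such a legacy tablet is an assumption, not something Doris is known to do.

```python
def accept_reported_replica(tablet_version: int,
                            partition_version: int,
                            tablet_row_count: int,
                            allow_empty_legacy: bool = False) -> bool:
    """Decide whether a reported replica may be added back to the FE meta."""
    if tablet_version <= partition_version:
        return True
    # Existing behaviour: a tablet newer than its partition is rejected
    # ("version is invalid"). Hypothetical relaxation: accept it anyway
    # when the tablet holds no data, as with the old empty partitions.
    return allow_empty_legacy and tablet_row_count == 0
```

With a tablet version of 2 and a partition version of 1, the replica is rejected by default and accepted only when the flag is set and the tablet is empty.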