SWJTU-ZhangLei opened a new issue, #18766:
URL: https://github.com/apache/doris/issues/18766

   ### Search before asking
   
   - [X] I had searched in the 
[issues](https://github.com/apache/doris/issues?q=is%3Aissue) and found no 
similar issues.
   
   
   ### Version
   
   root@VM-0-46-ubuntu:/mnt/hdd01/STRESS_ENV/be# ./lib/doris_be --version
   doris-0.0.0-branch-1.2(AVX2) RELEASE (build 
git://VM-0-22-ubuntu@62b20b126ff94284b3b9b84a6c2e0e931c157565)
   Built on Fri, 07 Apr 2023 21:45:48 CST by VM-0-22-ubuntu
   
   
   ### What's Wrong?
   
   1. The FE cannot start.
   2. fe.out:
   `[2023-04-10 10:52:59] notify new FE type transfer: UNKNOWN
   [2023-04-10 10:52:59] notify new FE type transfer: FOLLOWER
   [2023-04-10 10:52:59] notify new FE type transfer: UNKNOWN
   [2023-04-10 10:52:59] notify new FE type transfer: FOLLOWER
   [2023-04-10 10:52:59] this node is DETACHED
   java.lang.NullPointerException
           at 
com.sleepycat.je.rep.InsufficientLogException.initRepImpl(InsufficientLogException.java:268)
           at 
com.sleepycat.je.rep.InsufficientLogException.getRepImpl(InsufficientLogException.java:361)
           at com.sleepycat.je.rep.NetworkRestore.init(NetworkRestore.java:171)
           at 
com.sleepycat.je.rep.NetworkRestore.execute(NetworkRestore.java:281)
           at 
org.apache.doris.journal.bdbje.BDBJEJournal.reSetupBdbEnvironment(BDBJEJournal.java:358)
           at 
org.apache.doris.journal.bdbje.BDBJEJournal.open(BDBJEJournal.java:343)
           at org.apache.doris.persist.EditLog.open(EditLog.java:1038)
           at org.apache.doris.catalog.Env.initialize(Env.java:863)
           at org.apache.doris.PaloFe.start(PaloFe.java:138)
           at org.apache.doris.PaloFe.main(PaloFe.java:73)`
   
   3. fe.log:
   `2023-04-10 10:52:59,166 INFO (UNKNOWN 172.21.0.68_9310_1680101228114(-1)|1) 
[BDBEnvironment.setup():162] add helper[172.21.0.68:9310] as 
ReplicationGroupAdmin
   2023-04-10 10:52:59,170 WARN (UNKNOWN 172.21.0.68_9310_1680101228114(-1)|1) 
[Env.notifyNewFETypeTransfer():2373] notify new FE type transfer: UNKNOWN
   2023-04-10 10:52:59,189 WARN (RepNode 172.21.0.68_9310_1680101228114(-1)|62) 
[Env.notifyNewFETypeTransfer():2373] notify new FE type transfer: FOLLOWER
   2023-04-10 10:52:59,198 WARN (REPLICA 172.21.0.68_9310_1680101228114(1)|62) 
[Env.notifyNewFETypeTransfer():2373] notify new FE type transfer: UNKNOWN
   2023-04-10 10:52:59,214 WARN (UNKNOWN 172.21.0.68_9310_1680101228114(1)|62) 
[Env.notifyNewFETypeTransfer():2373] notify new FE type transfer: FOLLOWER
   2023-04-10 10:52:59,228 WARN (REPLICA 172.21.0.68_9310_1680101228114(1)|62) 
[BDBStateChangeListener.stateChange():57] this node is DETACHED
   2023-04-10 10:52:59,219 WARN (UNKNOWN 172.21.0.68_9310_1680101228114(-1)|1) 
[BDBJEJournal.reSetupBdbEnvironment():349] catch insufficient log exception. 
will recover and try again.
   com.sleepycat.je.rep.InsufficientLogException: (JE 18.3.12) Environment must 
be closed, caused by: com.sleepycat.je.rep.InsufficientLogException: 
Environment invalid because of previous exception: (JE 18.3.12) 
172.21.0.68_9310_1680101228114(1):/mnt/hdd01/STRESS_ENV/fe/doris-meta/bdb 
INSUFFICIENT_LOG: Log files at this node are obsolete. Environment is invalid 
and must be closed.refreshVLSN=null logProviders=null repImpl=null props=null
           at 
com.sleepycat.je.rep.InsufficientLogException.wrapSelf(InsufficientLogException.java:340)
 ~[je-18.3.14-doris-SNAPSHOT.jar:18.3.14-doris-SNAPSHOT]
           at 
com.sleepycat.je.dbi.EnvironmentImpl.checkIfInvalid(EnvironmentImpl.java:1835) 
~[je-18.3.14-doris-SNAPSHOT.jar:18.3.14-doris-SNAPSHOT]
           at com.sleepycat.je.log.LogManager.getLogEntry(LogManager.java:848) 
~[je-18.3.14-doris-SNAPSHOT.jar:18.3.14-doris-SNAPSHOT]
           at com.sleepycat.je.log.LogManager.getLogEntry(LogManager.java:802) 
~[je-18.3.14-doris-SNAPSHOT.jar:18.3.14-doris-SNAPSHOT]
           at 
com.sleepycat.je.log.LogManager.getLogEntryHandleNotFound(LogManager.java:956) 
~[je-18.3.14-doris-SNAPSHOT.jar:18.3.14-doris-SNAPSHOT]
           at 
com.sleepycat.je.dbi.DiskOrderedScanner.fetchEntry(DiskOrderedScanner.java:2068)
 ~[je-18.3.14-doris-SNAPSHOT.jar:18.3.14-doris-SNAPSHOT]
           at 
com.sleepycat.je.dbi.DiskOrderedScanner.fetchAndProcessBINs(DiskOrderedScanner.java:1640)
 ~[je-18.3.14-doris-SNAPSHOT.jar:18.3.14-doris-SNAPSHOT]
           at 
com.sleepycat.je.dbi.DiskOrderedScanner.scanSerial(DiskOrderedScanner.java:789) 
~[je-18.3.14-doris-SNAPSHOT.jar:18.3.14-doris-SNAPSHOT]
           at 
com.sleepycat.je.dbi.DiskOrderedScanner.scan(DiskOrderedScanner.java:708) 
~[je-18.3.14-doris-SNAPSHOT.jar:18.3.14-doris-SNAPSHOT]
           at com.sleepycat.je.dbi.DatabaseImpl.count(DatabaseImpl.java:1510) 
~[je-18.3.14-doris-SNAPSHOT.jar:18.3.14-doris-SNAPSHOT]
           at com.sleepycat.je.Database.count(Database.java:2042) 
~[je-18.3.14-doris-SNAPSHOT.jar:18.3.14-doris-SNAPSHOT]
           at 
org.apache.doris.journal.bdbje.BDBJEJournal.getMaxJournalId(BDBJEJournal.java:257)
 ~[doris-fe.jar:1.2-SNAPSHOT]
           at 
org.apache.doris.journal.bdbje.BDBJEJournal.open(BDBJEJournal.java:339) 
~[doris-fe.jar:1.2-SNAPSHOT]
           at org.apache.doris.persist.EditLog.open(EditLog.java:1038) 
~[doris-fe.jar:1.2-SNAPSHOT]
           at org.apache.doris.catalog.Env.initialize(Env.java:863) 
~[doris-fe.jar:1.2-SNAPSHOT]
           at org.apache.doris.PaloFe.start(PaloFe.java:138) 
~[doris-fe.jar:1.2-SNAPSHOT]
           at org.apache.doris.PaloFe.main(PaloFe.java:73) 
~[doris-fe.jar:1.2-SNAPSHOT]
   Caused by: com.sleepycat.je.rep.InsufficientLogException: Environment 
invalid because of previous exception: (JE 18.3.12) 
172.21.0.68_9310_1680101228114(1):/mnt/hdd01/STRESS_ENV/fe/doris-meta/bdb 
INSUFFICIENT_LOG: Log files at this node are obsolete. Environment is invalid 
and must be closed. Originally thrown by HA thread: REPLICA 
172.21.0.68_9310_1680101228114(1) Originally thrown by HA thread: REPLICA 
172.21.0.68_9310_1680101228114(1)
           at 
com.sleepycat.je.rep.stream.ReplicaFeederSyncup.setupLogRefresh(ReplicaFeederSyncup.java:706)
 ~[je-18.3.14-doris-SNAPSHOT.jar:18.3.14-doris-SNAPSHOT]
           at 
com.sleepycat.je.rep.stream.ReplicaFeederSyncup.verifyRollback(ReplicaFeederSyncup.java:355)
 ~[je-18.3.14-doris-SNAPSHOT.jar:18.3.14-doris-SNAPSHOT]
           at 
com.sleepycat.je.rep.stream.ReplicaFeederSyncup.execute(ReplicaFeederSyncup.java:164)
 ~[je-18.3.14-doris-SNAPSHOT.jar:18.3.14-doris-SNAPSHOT]
           at 
com.sleepycat.je.rep.impl.node.Replica.initReplicaLoop(Replica.java:732) 
~[je-18.3.14-doris-SNAPSHOT.jar:18.3.14-doris-SNAPSHOT]
           at 
com.sleepycat.je.rep.impl.node.Replica.runReplicaLoopInternal(Replica.java:485) 
~[je-18.3.14-doris-SNAPSHOT.jar:18.3.14-doris-SNAPSHOT]
           at 
com.sleepycat.je.rep.impl.node.Replica.runReplicaLoop(Replica.java:412) 
~[je-18.3.14-doris-SNAPSHOT.jar:18.3.14-doris-SNAPSHOT]
           at com.sleepycat.je.rep.impl.node.RepNode.run(RepNode.java:1869) 
~[je-18.3.14-doris-SNAPSHOT.jar:18.3.14-doris-SNAPSHOT]`
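
As a rough triage aid, a sketch like the following could scan an fe.log for the two markers seen in the excerpt above (`INSUFFICIENT_LOG` and `this node is DETACHED`) and report the timestamp of the log entry they belong to. The function name `find_markers` is hypothetical, and the timestamp pattern is only assumed from the lines shown here; it is not part of Doris.

```python
import re
from typing import Iterable, List, Tuple

# Markers copied from the log excerpt above; the helper itself is hypothetical.
MARKERS = ("INSUFFICIENT_LOG", "this node is DETACHED")

# Assumed fe.log entry prefix, e.g. "2023-04-10 10:52:59,219".
TIMESTAMP = re.compile(r"^\s*(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})")


def find_markers(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (timestamp, marker) pairs for every marker occurrence.

    Wrapped or continuation lines carry no timestamp of their own, so each
    hit is attributed to the most recent timestamp seen before it.
    """
    hits: List[Tuple[str, str]] = []
    last_ts = "unknown"
    for line in lines:
        match = TIMESTAMP.match(line)
        if match:
            last_ts = match.group(1)
        for marker in MARKERS:
            if marker in line:
                hits.append((last_ts, marker))
    return hits
```

It can be fed an open file object, for example `find_markers(open(path, errors="replace"))`, where `path` points at the fe.log of the FE that fails to start.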
   
   ### What You Expected?
   
   The FE should start normally.
   
   ### How to Reproduce?
   
   This is hard to reproduce. The following steps are how we ran into the problem in our environment.
   
   1. Build a cluster with 3 FEs and 3 BEs.
   2. Continuously import and query (SELECT) data through the IP of a follower FE.
   3. At some point, the master FE ran out of memory (OOM).
   4. About ten hours later, we tried to start the master FE again and found that it could not start.
   
   ### Anything Else?
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://www.apache.org/foundation/policies/conduct)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
