morningman opened a new issue #3889:
URL: https://github.com/apache/incubator-doris/issues/3889


   **Describe the bug**
   `fe.log` contains the following error:
   
   ```
   2020-06-16 04:55:17,211 ERROR 27 [PublishVersionDaemon.runAfterCatalogReady():57] errors while publish version to all backends
   java.util.NoSuchElementException: No value present
           at java.util.Optional.get(Optional.java:135) ~[?:1.8.0_161]
           at org.apache.doris.load.routineload.RoutineLoadJob.afterVisible(RoutineLoadJob.java:806) ~[palo-fe.jar:?]
           at org.apache.doris.transaction.TransactionState.afterStateTransform(TransactionState.java:409) ~[palo-fe.jar:?]
           at org.apache.doris.transaction.TransactionState.afterStateTransform(TransactionState.java:392) ~[palo-fe.jar:?]
           at org.apache.doris.transaction.DatabaseTransactionMgr.finishTransaction(DatabaseTransactionMgr.java:762) ~[palo-fe.jar:?]
           at org.apache.doris.transaction.GlobalTransactionMgr.finishTransaction(GlobalTransactionMgr.java:224) ~[palo-fe.jar:?]
           at org.apache.doris.transaction.PublishVersionDaemon.publishVersion(PublishVersionDaemon.java:208) ~[palo-fe.jar:?]
           at org.apache.doris.transaction.PublishVersionDaemon.runAfterCatalogReady(PublishVersionDaemon.java:55) [palo-fe.jar:?]
           at org.apache.doris.common.util.MasterDaemon.runOneCycle(MasterDaemon.java:58) [palo-fe.jar:?]
           at org.apache.doris.common.util.Daemon.run(Daemon.java:116) [palo-fe.jar:?]
   ```
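
   The top frames of the trace show `Optional.get()` being called on an empty `Optional`. The following is a minimal sketch of how that can arise, assuming the task is looked up in a list by transaction id. `TaskInfo`, `findByTxnId`, and the id `1001` are hypothetical stand-ins for illustration, not the actual Doris code.

   ```java
   import java.util.ArrayList;
   import java.util.List;
   import java.util.NoSuchElementException;
   import java.util.Optional;

   class EmptyOptionalSketch {
       // Hypothetical stand-in for a per-task record; not the real Doris class.
       static final class TaskInfo {
           final long txnId;

           TaskInfo(long txnId) {
               this.txnId = txnId;
           }
       }

       // Assumed lookup shape: first task whose transaction id matches.
       static Optional<TaskInfo> findByTxnId(List<TaskInfo> tasks, long txnId) {
           return tasks.stream().filter(t -> t.txnId == txnId).findFirst();
       }

       public static void main(String[] args) {
           List<TaskInfo> tasks = new ArrayList<>();
           tasks.add(new TaskInfo(1001L));

           // Something else removes the task before the lookup happens.
           tasks.removeIf(t -> t.txnId == 1001L);

           Optional<TaskInfo> found = findByTxnId(tasks, 1001L);
           try {
               found.get(); // throws because the Optional is empty
           } catch (NoSuchElementException e) {
               System.out.println("caught: " + e.getMessage());
           }
       }
   }
   ```

   Checking `isPresent()` before calling `get()` would avoid the exception, but it would not explain why the task is missing in the first place.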
   
   **Debug**
   
   
https://github.com/apache/incubator-doris/blob/0224d49842f99b58125bd07294af35bbe4a700c7/fe/src/main/java/org/apache/doris/load/routineload/RoutineLoadJob.java#L927-L936
   
   The call `routineLoadTaskInfo.setTxnStatus(txnStatus);` on line 928 should be 
moved before the check `if (state == JobState.RUNNING) {` on line 927. Otherwise 
the error can occur through the following steps:
   
   1. A routine load task's transaction becomes COMMITTED, and 
`RoutineLoadJob.executeTaskOnTxnStatusChanged()` is called.
   
   2. The routine load job is PAUSED because of some other error, such as a BE 
going down.
   
   3. Execution reaches line 927 and finds that the job's state is not RUNNING, 
so it does not update the transaction status saved in the `routineLoadTaskInfo`.
   
   4. The transaction is COMMITTED but, for some reason, cannot become VISIBLE 
for a long time. Because the transaction status in `routineLoadTaskInfo` (not 
the status of the transaction itself) is still PENDING, the routine load task 
timeout checker considers it a timed-out task.
   
   
https://github.com/apache/incubator-doris/blob/0224d49842f99b58125bd07294af35bbe4a700c7/fe/src/main/java/org/apache/doris/load/routineload/RoutineLoadJob.java#L483-L489
   
   5. The timeout checker removes the `routineLoadTaskInfo` from the job.
   
   6. The PUBLISH of the transaction finally succeeds, but when `afterVisible()` 
is called, it cannot find the `routineLoadTaskInfo` in the job, so an exception 
is thrown on line 806.
   
   
https://github.com/apache/incubator-doris/blob/0224d49842f99b58125bd07294af35bbe4a700c7/fe/src/main/java/org/apache/doris/load/routineload/RoutineLoadJob.java#L804-L807
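
   The six steps above can be replayed in a small single-threaded model to compare the two orderings. This is only a sketch under simplifying assumptions: `Job`, `TaskInfo`, the enums, and `taskSurvives` are hypothetical names, the timeout checker is reduced to "drop every task whose recorded status is still PENDING", and the rest of the real method body is elided.

   ```java
   import java.util.ArrayList;
   import java.util.List;
   import java.util.Optional;

   class TaskStatusOrderingSketch {
       // Hypothetical, reduced state sets; names follow the wording of the report.
       enum JobState { RUNNING, PAUSED }
       enum TxnStatus { PENDING, COMMITTED, VISIBLE }

       static final class TaskInfo {
           final long txnId;
           TxnStatus txnStatus = TxnStatus.PENDING;

           TaskInfo(long txnId) {
               this.txnId = txnId;
           }
       }

       static final class Job {
           JobState state = JobState.RUNNING;
           final List<TaskInfo> tasks = new ArrayList<>();
           final boolean setStatusBeforeStateCheck;

           Job(boolean setStatusBeforeStateCheck) {
               this.setStatusBeforeStateCheck = setStatusBeforeStateCheck;
           }

           // Models the status-changed callback with either ordering.
           void onTxnStatusChanged(TaskInfo task, TxnStatus status) {
               if (setStatusBeforeStateCheck) {
                   task.txnStatus = status; // proposed order: always record the status
               }
               if (state == JobState.RUNNING) {
                   if (!setStatusBeforeStateCheck) {
                       task.txnStatus = status; // reported order: only while RUNNING
                   }
                   // further work of the real method is elided
               }
           }

           // Simplified timeout checker: every task still PENDING counts as timed out.
           void removeTimedOutTasks() {
               tasks.removeIf(t -> t.txnStatus == TxnStatus.PENDING);
           }

           Optional<TaskInfo> findTask(long txnId) {
               return tasks.stream().filter(t -> t.txnId == txnId).findFirst();
           }
       }

       // Replays the scenario and reports whether the task is still in the job.
       static boolean taskSurvives(boolean setStatusBeforeStateCheck) {
           Job job = new Job(setStatusBeforeStateCheck);
           TaskInfo task = new TaskInfo(42L);
           job.tasks.add(task);

           job.state = JobState.PAUSED;                       // step 2
           job.onTxnStatusChanged(task, TxnStatus.COMMITTED); // steps 1 and 3
           job.removeTimedOutTasks();                         // steps 4 and 5
           return job.findTask(42L).isPresent();              // lookup in step 6
       }

       public static void main(String[] args) {
           System.out.println("reported order, task still present: " + taskSurvives(false));
           System.out.println("proposed order, task still present: " + taskSurvives(true));
       }
   }
   ```

   In this model the task is dropped only when the status is recorded inside the RUNNING check, which matches the reasoning behind moving `setTxnStatus` ahead of that check.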

