[ 
https://issues.apache.org/jira/browse/HDDS-3894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17646462#comment-17646462
 ] 

Mohammad Arafat Khan commented on HDDS-3894:
--------------------------------------------

*+Summary of the discussion+ :-*  

The Ozone Manager logs were quite Noisy with a lot of unimportant error 
messages being thrown, this is because of recon. Recon makes these calls to OM 
to update the delta info. For Delta updates, \{*}recon uses sequence 
numbers{*}. If Recon's sequence number is out-of-date and the iterator is 
invalid, it will throw {*}SequenceNumberNotFoundException{*}. This could happen 
after the OM restart or it could happen during the startup of a {*}fresh 
cluster{*}. 

So initially the plan was to tell OM it can ignore this exception by 
suppressing the log messages, which might not be the right thing to do 
\{*}because it is important to know what is happening to the state of updates 
to Recon{*}. Hence, we should not suppress these messages and let them be. 
Currently, for the assumption that these log messages flood the user console, 
the frequency of these Delta calls is around 10 mins, and the chances of the 
exception being thrown will only be during *OM restart* or the *startup of a 
fresh cluster, hence they shouldn't show up on the User console much.*

And even if this exception occurs, it will be handled as well because if the 
*SequenceNumberNotFoundException* is thrown, Recon will then fall back to 
fetching the entire snapshot rather than the delta updates.

 

[~weichiu] [~smeng] [~swagle] can we do something else on this, or can we close 
it?

> Noisy log at OM when the requested sequence is not written into DB
> ------------------------------------------------------------------
>
>                 Key: HDDS-3894
>                 URL: https://issues.apache.org/jira/browse/HDDS-3894
>             Project: Apache Ozone
>          Issue Type: Bug
>          Components: Ozone Manager, Ozone Recon
>            Reporter: Rakesh Radhakrishnan
>            Assignee: Mohammad Arafat Khan
>            Priority: Minor
>
> Too many logs at OM, which is noisy. It looks like Recon is making this call 
> to OM to update the delta info. OM can ignore this exception and reduce the 
> log priority to DEBUG or so.
> *Ozone-om.log*
> {code:java}
> 2020-06-26 10:02:52,963 INFO 
> org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogWorker: 
> 5dca58f5-6231-4858-9589-4e64b1435aa4@group-C5BA1605619E-SegmentedRaftLogWorker:
>  Rolled log segment from 
> /data/3/jun27_new1/hadoop-ozone/om/ratis2/bf265839-605b-3f16-9796-c5ba1605619e/current/log_inprogress_106536
>  to 
> /data/3/jun27_new1/hadoop-ozone/om/ratis2/bf265839-605b-3f16-9796-c5ba1605619e/current/log_106536-106603
> 2020-06-26 10:02:52,982 INFO 
> org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogWorker: 
> 5dca58f5-6231-4858-9589-4e64b1435aa4@group-C5BA1605619E-SegmentedRaftLogWorker:
>  created new log segment 
> /data/3/jun27_new1/hadoop-ozone/om/ratis2/bf265839-605b-3f16-9796-c5ba1605619e/current/log_inprogress_106604
> 2020-06-26 10:09:54,828 ERROR org.apache.hadoop.hdds.utils.db.RDBStore: 
> Unable to get delta updates since sequenceNumber 11268864
> org.rocksdb.RocksDBException: Requested sequence not yet written in the db
>         at org.rocksdb.RocksDB.getUpdatesSince(Native Method)
>         at org.rocksdb.RocksDB.getUpdatesSince(RocksDB.java:3588)
>         at 
> org.apache.hadoop.hdds.utils.db.RDBStore.getUpdatesSince(RDBStore.java:339)
>         at 
> org.apache.hadoop.ozone.om.OzoneManager.getDBUpdates(OzoneManager.java:3422)
>         at 
> org.apache.hadoop.ozone.protocolPB.OzoneManagerRequestHandler.getOMDBUpdates(OzoneManagerRequestHandler.java:257)
>         at 
> org.apache.hadoop.ozone.protocolPB.OzoneManagerRequestHandler.handleReadRequest(OzoneManagerRequestHandler.java:194)
>         at 
> org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitReadRequestToOM(OzoneManagerProtocolServerSideTranslatorPB.java:171)
>         at 
> org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.processRequest(OzoneManagerProtocolServerSideTranslatorPB.java:109)
>         at 
> org.apache.hadoop.hdds.server.OzoneProtocolMessageDispatcher.processRequest(OzoneProtocolMessageDispatcher.java:74)
>         at 
> org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequest(OzoneManagerProtocolServerSideTranslatorPB.java:99)
>         at 
> org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos$OzoneManagerService$2.callBlockingMethod(OzoneManagerProtocolProtos.java)
>         at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:528)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070)
>         at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:985)
>         at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:913)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:422)
>         at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1876)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2882)
> 2020-06-26 10:11:08,442 INFO 
> org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogWorker: 
> 5dca58f5-6231-4858-9589-4e64b1435aa4@group-C5BA1605619E-SegmentedRaftLogWorker:
>  Rolling segment log-106604_106637 to index:106637
> 2020-06-26 10:11:08,443 INFO 
> org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogWorker: 
> 5dca58f5-6231-4858-9589-4e64b1435aa4@group-C5BA1605619E-SegmentedRaftLogWorker:
>  Rolling segment log-106638_106654 to index:106654
> 2020-06-26 10:11:08,444 INFO 
> org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogWorker: 
> 5dca58f5-6231-4858-9589-4e64b1435aa4@group-C5BA1605619E-SegmentedRaftLogWorker:
>  Rolling segment log-106655_106671 to index:106671
> {code}
> *Recon.log*
> {code:java}
> 2020-06-26 10:08:55,883 INFO 
> org.apache.hadoop.ozone.recon.fsck.MissingContainerTask: Missing Container 
> task Thread took 88 milliseconds for processing 0 containers.
> 2020-06-26 10:09:54,824 INFO 
> org.apache.hadoop.ozone.recon.spi.impl.OzoneManagerServiceProviderImpl: 
> Syncing data from Ozone Manager.
> 2020-06-26 10:09:54,824 INFO 
> org.apache.hadoop.ozone.recon.spi.impl.OzoneManagerServiceProviderImpl: 
> Obtaining delta updates from Ozone Manager
> 2020-06-26 10:09:54,828 INFO 
> org.apache.hadoop.ozone.recon.spi.impl.OzoneManagerServiceProviderImpl: 
> Number of updates received from OM : 0
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to