[
https://issues.apache.org/jira/browse/HDDS-3894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17646462#comment-17646462
]
Mohammad Arafat Khan commented on HDDS-3894:
--------------------------------------------
*+Summary of the discussion+ :-*
The Ozone Manager logs were quite Noisy with a lot of unimportant error
messages being thrown, this is because of recon. Recon makes these calls to OM
to update the delta info. For Delta updates, \{*}recon uses sequence
numbers{*}. If Recon's sequence number is out-of-date and the iterator is
invalid, it will throw {*}SequenceNumberNotFoundException{*}. This could happen
after the OM restart or it could happen during the startup of a {*}fresh
cluster{*}.
So initially the plan was to tell OM it can ignore this exception by
suppressing the log messages, which might not be the right thing to do
\{*}because it is important to know what is happening to the state of updates
to Recon{*}. Hence, we should not suppress these messages and let them be.
Currently, for the assumption that these log messages flood the user console,
the frequency of these Delta calls is around 10 mins, and the chances of the
exception being thrown will only be during *OM restart* or the *startup of a
fresh cluster, hence they shouldn't show up on the User console much.*
And even if this exception occurs, it will be handled as well because if the
*SequenceNumberNotFoundException* is thrown, Recon will then fall back to
fetching the entire snapshot rather than the delta updates.
[~weichiu] [~smeng] [~swagle] can we do something else on this, or can we close
it?
> Noisy log at OM when the requested sequence is not written into DB
> ------------------------------------------------------------------
>
> Key: HDDS-3894
> URL: https://issues.apache.org/jira/browse/HDDS-3894
> Project: Apache Ozone
> Issue Type: Bug
> Components: Ozone Manager, Ozone Recon
> Reporter: Rakesh Radhakrishnan
> Assignee: Mohammad Arafat Khan
> Priority: Minor
>
> Too many logs at OM, which is noisy. It looks like Recon is making this call
> to OM to update the delta info. OM can ignore this exception and reduce the
> log priority to DEBUG or so.
> *Ozone-om.log*
> {code:java}
> 2020-06-26 10:02:52,963 INFO
> org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogWorker:
> 5dca58f5-6231-4858-9589-4e64b1435aa4@group-C5BA1605619E-SegmentedRaftLogWorker:
> Rolled log segment from
> /data/3/jun27_new1/hadoop-ozone/om/ratis2/bf265839-605b-3f16-9796-c5ba1605619e/current/log_inprogress_106536
> to
> /data/3/jun27_new1/hadoop-ozone/om/ratis2/bf265839-605b-3f16-9796-c5ba1605619e/current/log_106536-106603
> 2020-06-26 10:02:52,982 INFO
> org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogWorker:
> 5dca58f5-6231-4858-9589-4e64b1435aa4@group-C5BA1605619E-SegmentedRaftLogWorker:
> created new log segment
> /data/3/jun27_new1/hadoop-ozone/om/ratis2/bf265839-605b-3f16-9796-c5ba1605619e/current/log_inprogress_106604
> 2020-06-26 10:09:54,828 ERROR org.apache.hadoop.hdds.utils.db.RDBStore:
> Unable to get delta updates since sequenceNumber 11268864
> org.rocksdb.RocksDBException: Requested sequence not yet written in the db
> at org.rocksdb.RocksDB.getUpdatesSince(Native Method)
> at org.rocksdb.RocksDB.getUpdatesSince(RocksDB.java:3588)
> at
> org.apache.hadoop.hdds.utils.db.RDBStore.getUpdatesSince(RDBStore.java:339)
> at
> org.apache.hadoop.ozone.om.OzoneManager.getDBUpdates(OzoneManager.java:3422)
> at
> org.apache.hadoop.ozone.protocolPB.OzoneManagerRequestHandler.getOMDBUpdates(OzoneManagerRequestHandler.java:257)
> at
> org.apache.hadoop.ozone.protocolPB.OzoneManagerRequestHandler.handleReadRequest(OzoneManagerRequestHandler.java:194)
> at
> org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitReadRequestToOM(OzoneManagerProtocolServerSideTranslatorPB.java:171)
> at
> org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.processRequest(OzoneManagerProtocolServerSideTranslatorPB.java:109)
> at
> org.apache.hadoop.hdds.server.OzoneProtocolMessageDispatcher.processRequest(OzoneProtocolMessageDispatcher.java:74)
> at
> org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequest(OzoneManagerProtocolServerSideTranslatorPB.java:99)
> at
> org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos$OzoneManagerService$2.callBlockingMethod(OzoneManagerProtocolProtos.java)
> at
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:528)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:985)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:913)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1876)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2882)
> 2020-06-26 10:11:08,442 INFO
> org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogWorker:
> 5dca58f5-6231-4858-9589-4e64b1435aa4@group-C5BA1605619E-SegmentedRaftLogWorker:
> Rolling segment log-106604_106637 to index:106637
> 2020-06-26 10:11:08,443 INFO
> org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogWorker:
> 5dca58f5-6231-4858-9589-4e64b1435aa4@group-C5BA1605619E-SegmentedRaftLogWorker:
> Rolling segment log-106638_106654 to index:106654
> 2020-06-26 10:11:08,444 INFO
> org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogWorker:
> 5dca58f5-6231-4858-9589-4e64b1435aa4@group-C5BA1605619E-SegmentedRaftLogWorker:
> Rolling segment log-106655_106671 to index:106671
> {code}
> *Recon.log*
> {code:java}
> 2020-06-26 10:08:55,883 INFO
> org.apache.hadoop.ozone.recon.fsck.MissingContainerTask: Missing Container
> task Thread took 88 milliseconds for processing 0 containers.
> 2020-06-26 10:09:54,824 INFO
> org.apache.hadoop.ozone.recon.spi.impl.OzoneManagerServiceProviderImpl:
> Syncing data from Ozone Manager.
> 2020-06-26 10:09:54,824 INFO
> org.apache.hadoop.ozone.recon.spi.impl.OzoneManagerServiceProviderImpl:
> Obtaining delta updates from Ozone Manager
> 2020-06-26 10:09:54,828 INFO
> org.apache.hadoop.ozone.recon.spi.impl.OzoneManagerServiceProviderImpl:
> Number of updates received from OM : 0
> {code}
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]