When NFS is unavailable and a client tries to write to a log, the lgs
server puts the write request into a queue together with the time at
which it was queued. This information is checkpointed to the standby.
After a switch-over, once NFS is available again, the new active node
computes the difference between the current time and the time stored
with the queued element. The clocks on the two nodes can differ, so the
current time may be less than the stored queue time, which causes a
coredump.
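The failure can be reproduced in isolation. The following is a minimal sketch under stated assumptions, not the OpenSAF code: QueuedItem, MonotonicNowNs, ElapsedNs and FromCheckpoint are hypothetical names introduced only for illustration. It assumes the queue time is a CLOCK_MONOTONIC reading, which counts from an unspecified starting point and is therefore not comparable between hosts.

```cpp
#include <cassert>
#include <cstdint>
#include <ctime>

// Hypothetical stand-in for a cached queue element (not the OpenSAF type).
struct QueuedItem {
  uint64_t queue_at_ns;  // monotonic timestamp taken when the item was queued
};

// Read the local monotonic clock and convert it to nanoseconds.
inline uint64_t MonotonicNowNs() {
  timespec ts;
  clock_gettime(CLOCK_MONOTONIC, &ts);
  return static_cast<uint64_t>(ts.tv_sec) * 1000000000ull +
         static_cast<uint64_t>(ts.tv_nsec);
}

// Compute how long the item has been queued. Returns false, leaving *out
// untouched, when the stored timestamp is ahead of the supplied clock value,
// which is the case that arises when the timestamp came from another node.
inline bool ElapsedNs(const QueuedItem& item, uint64_t now_ns, uint64_t* out) {
  assert(out != nullptr);
  if (now_ns < item.queue_at_ns) return false;
  *out = now_ns - item.queue_at_ns;
  return true;
}

// Build the local copy of a checkpointed item. The peer's timestamp is
// deliberately ignored and replaced with a reading of the local clock.
inline QueuedItem FromCheckpoint(const QueuedItem& /*from_peer*/,
                                 uint64_t local_now_ns) {
  return QueuedItem{local_now_ns};
}
```

With a peer timestamp of 5 s and a local clock at 1 s, ElapsedNs reports failure for the inherited item, while the item returned by FromCheckpoint yields a valid, non-negative elapsed time on the local node.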

Once a node becomes active, it must update the queue time of each
element so that it is consistent with the local node's clock.
---
 src/log/logd/lgs_cache.cc | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/src/log/logd/lgs_cache.cc b/src/log/logd/lgs_cache.cc
index d6a282e48..e3583e97c 100644
--- a/src/log/logd/lgs_cache.cc
+++ b/src/log/logd/lgs_cache.cc
@@ -124,7 +124,11 @@ Cache::Data::Data(std::shared_ptr<WriteAsyncInfo> info,
 Cache::Data::Data(const CkptPushAsync* data) {
   TRACE_ENTER();
   param_ = std::make_shared<WriteAsyncInfo>(data);
-  queue_at_ = data->queue_at;
+  // Don't inherit the queue_at_ from the active node,
+  // since the timer on both nodes may be different.
+  // Queue_at now is about when the element is actually
+  // put into the queue of each logsv instance.
+  queue_at_ = base::TimespecToNanos(base::ReadMonotonicClock());
   seq_id_ = data->seq_id;
   log_record_ = strdup(data->log_record);
   size_ = strlen(log_record_);
-- 
2.17.1