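A possible workaround is to tell RH (robinhood) not to clear "older"
records, i.e. to skip a clear request when records have already been
cleared beyond the requested record id.

To avoid flooding the robinhood log, you may want to replace LVL_VERB with
LVL_DEBUG or LVL_MAJOR.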

index 6f4dc853..360120f8 100644
--- a/src/chglog_reader/chglog_reader.c
+++ b/src/chglog_reader/chglog_reader.c
@@ -210,6 +210,16 @@ static int clear_changelog_records(reader_thr_info_t *p_info)
         return 0;
     }
 
+    // Make sure record id is higher than currently cleared record ids
+    if (p_info->last_clear.rec_id > p_info->last_commit.rec_id) {
+        DisplayLog(LVL_VERB, CHGLOG_TAG,
+                   "%s: ChangeLog backward clear records up to #%"PRIu64
+                   " already cleared up to #%"PRIu64,
+                   p_info->mdtdevice, p_info->last_commit.rec_id,
+                   p_info->last_clear.rec_id);
+        return 0;
+    }
+
     reader_id = cl_reader_config.mdt_def[p_info->thr_index].reader_id;


On Thu, 2019-07-18 at 09:39 +0200, Carsten Beyer wrote:
> Hi Thomas,
> 
> Yes, a changelog reader is registered on both MDTs. The filesystem has
> 5 MDTs in total, and the other three are connected to a second RBH
> server (same problem/error).
> 
> I also got a notice from our storage vendor; it is already listed in
> JIRA:
> 
> Failure to clear the changelog for user 1 on MDT => 
> https://jira.whamcloud.com/browse/LU-11205
> 
> Looks like we hit this.
> 
> Cheers,
> Carsten
> 
> 
> On 17.07.19 16:16, thomas.leibov...@cea.fr wrote:
> > > 2019/07/17 14:20:09 robinhood@mrh0[40836/3] ChangeLog | ERROR: llapi_changelog_clear("lustre01-MDT0001", "cl1", 42031384423) returned -22
> > 
> > Hello,
> > 
> > Did you register a changelog reader on each MDT?
> > Here it seems your filesystem has at least 2 MDTs (MDT0001 must be
> > the second one).
> > 
> > Regards,
> > Thomas
> > 
> > -----Original Message-----
> > From: Carsten Beyer [mailto:be...@dkrz.de]
> > Sent: Wednesday, 17 July 2019 15:56
> > To: robinhood
> > Subject: [robinhood-support] ChangeLog | ERROR: llapi_changelog_clear("lustre01-MDT0000", "cl1", 11515556002) returned -22
> > 
> > Hi @all,
> > 
> > Is anyone using Lustre 2.11 (server side / client side) with
> > Robinhood v3.1.5?
> > 
> > We have 3 Lustre systems (one test system, 2 production systems), and
> > I get error messages for llapi_changelog_clear when I start Robinhood.
> > This started after updating the Lustre filesystems from v2.5 to v2.11.
> > The errors occur only on the production systems, not on the test
> > system, so it may be related to the load on the filesystems. Has
> > anybody seen the same issue on other Lustre versions?
> > 
> > [root@mrh0 robinhood]# tail -f robinhood.log
> > 2019/07/17 14:20:09 robinhood@mrh0[40836/1] CheckFS | '/mnt/lustre01' matches mount point '/mnt/lustre01', type=lustre, fs=10.50.32.53@o2ib:10.50.32.54@o2ib:/lustre01
> > 2019/07/17 14:20:09 robinhood@mrh0[40836/1] ListMgr | Table does not exist: 'SELECT value FROM VARS WHERE varname='VersionFunctionSet'' (Table 'robinhood_lustre01.VARS' doesn't exist)
> > 2019/07/17 14:20:09 robinhood@mrh0[40836/1] ListMgr | No function versioning (expected: 1.6). Existing functions will be dropped and re-created.
> > 2019/07/17 14:20:09 robinhood@mrh0[40836/1] ListMgr | Table does not exist: 'SELECT value FROM VARS WHERE varname='VersionTriggerSet'' (Table 'robinhood_lustre01.VARS' doesn't exist)
> > 2019/07/17 14:20:09 robinhood@mrh0[40836/1] ListMgr | No trigger versioning (expected: 1.4). Existing triggers will be dropped and re-created.
> > 2019/07/17 14:20:09 robinhood@mrh0[40836/1] ListMgr | table VARS does not exist (or wrong version): creating it.
> > 2019/07/17 14:20:09 robinhood@mrh0[40836/1] ListMgr | table ENTRIES does not exist (or wrong version): creating it.
> > 2019/07/17 14:20:09 robinhood@mrh0[40836/1] ListMgr | table NAMES does not exist (or wrong version): creating it.
> > 2019/07/17 14:20:09 robinhood@mrh0[40836/1] ListMgr | table ANNEX_INFO does not exist (or wrong version): creating it.
> > 2019/07/17 14:20:09 robinhood@mrh0[40836/1] ListMgr | function sz_range does not exist (or wrong version): creating it.
> > 2019/07/17 14:20:09 robinhood@mrh0[40836/1] ListMgr | table ACCT_STAT does not exist (or wrong version): creating it.
> > 2019/07/17 14:20:09 robinhood@mrh0[40836/1] ListMgr | Populating accounting table from existing DB contents. This can take a while...
> > 2019/07/17 14:20:09 robinhood@mrh0[40836/1] ListMgr | table STRIPE_INFO does not exist (or wrong version): creating it.
> > 2019/07/17 14:20:09 robinhood@mrh0[40836/1] ListMgr | table STRIPE_ITEMS does not exist (or wrong version): creating it.
> > 2019/07/17 14:20:09 robinhood@mrh0[40836/1] ListMgr | table SOFT_RM does not exist (or wrong version): creating it.
> > 2019/07/17 14:20:09 robinhood@mrh0[40836/1] ListMgr | trigger ACCT_ENTRY_INSERT does not exist (or wrong version): creating it.
> > 2019/07/17 14:20:09 robinhood@mrh0[40836/1] ListMgr | trigger ACCT_ENTRY_DELETE does not exist (or wrong version): creating it.
> > 2019/07/17 14:20:09 robinhood@mrh0[40836/1] ListMgr | trigger ACCT_ENTRY_UPDATE does not exist (or wrong version): creating it.
> > 2019/07/17 14:20:09 robinhood@mrh0[40836/1] ListMgr | function one_path does not exist (or wrong version): creating it.
> > 2019/07/17 14:20:09 robinhood@mrh0[40836/1] ListMgr | function this_path does not exist (or wrong version): creating it.
> > 2019/07/17 14:20:09 robinhood@mrh0[40836/1] llapi | warning: llapi_changelog_start() called without CHANGELOG_FLAG_EXTRA_FLAGS
> > 2019/07/17 14:20:09 robinhood@mrh0[40836/1] Main | Daemon started (running modules: log_reader)
> > 2019/07/17 14:20:09 robinhood@mrh0[40836/2] ChangeLog | LU-1331 is fixed in this version of Lustre.
> > 2019/07/17 14:20:09 robinhood@mrh0[40836/3] llapi | cannot purge records for 'cl1'
> > 2019/07/17 14:20:09 robinhood@mrh0[40836/3] ChangeLog | ERROR: llapi_changelog_clear("lustre01-MDT0001", "cl1", 42031384423) returned -22
> > 2019/07/17 14:20:09 robinhood@mrh0[40836/3] EntryProc | Error -22 performing callback at stage STAGE_CHGLOG_CLR.
> > 
> > 
> > [root@mrh0 robinhood]# egrep '(ChangeLog \| ERROR|STAGE_CHGLOG_CLR)' robinhood.log
> > 2019/07/17 14:20:09 robinhood@mrh0[40836/3] ChangeLog | ERROR: llapi_changelog_clear("lustre01-MDT0001", "cl1", 42031384423) returned -22
> > 2019/07/17 14:20:09 robinhood@mrh0[40836/3] EntryProc | Error -22 performing callback at stage STAGE_CHGLOG_CLR.
> > 2019/07/17 14:20:11 robinhood@mrh0[40836/31] ChangeLog | ERROR: llapi_changelog_clear("lustre01-MDT0000", "cl1", 11515555990) returned -22
> > 2019/07/17 14:20:11 robinhood@mrh0[40836/31] EntryProc | Error -22 performing callback at stage STAGE_CHGLOG_CLR.
> > 2019/07/17 14:20:11 robinhood@mrh0[40836/6] ChangeLog | ERROR: llapi_changelog_clear("lustre01-MDT0000", "cl1", 11515555991) returned -22
> > 2019/07/17 14:20:11 robinhood@mrh0[40836/6] EntryProc | Error -22 performing callback at stage STAGE_CHGLOG_CLR.
> > 2019/07/17 14:20:11 robinhood@mrh0[40836/33] ChangeLog | ERROR: llapi_changelog_clear("lustre01-MDT0000", "cl1", 11515555992) returned -22
> > 2019/07/17 14:20:11 robinhood@mrh0[40836/33] EntryProc | Error -22 performing callback at stage STAGE_CHGLOG_CLR.
> > 2019/07/17 14:20:11 robinhood@mrh0[40836/8] ChangeLog | ERROR: llapi_changelog_clear("lustre01-MDT0000", "cl1", 11515555993) returned -22
> > 2019/07/17 14:20:11 robinhood@mrh0[40836/8] EntryProc | Error -22 performing callback at stage STAGE_CHGLOG_CLR.
> > 2019/07/17 14:20:11 robinhood@mrh0[40836/5] ChangeLog | ERROR: llapi_changelog_clear("lustre01-MDT0000", "cl1", 11515555994) returned -22
> > 2019/07/17 14:20:11 robinhood@mrh0[40836/5] EntryProc | Error -22 performing callback at stage STAGE_CHGLOG_CLR.
> > 
> > Robinhood is on RHEL6 with Lustre 2.11 client / MariaDB / RBH v3.1.5
> > 
> > [root@mrh0 robinhood]# rpm -qa | egrep -i '(lustre-client|robinhood|mariadb)' | sort
> > kmod-lustre-client-2.11.0-1_2.6.32_754.14.2.el6.x86_64
> > lustre-client-2.11.0-1_2.6.32_754.14.2.el6.x86_64
> > MariaDB-client-10.2.11-1.el6.x86_64
> > MariaDB-common-10.2.11-1.el6.x86_64
> > MariaDB-compat-10.2.11-1.el6.x86_64
> > MariaDB-devel-10.2.11-1.el6.x86_64
> > MariaDB-server-10.2.11-1.el6.x86_64
> > MariaDB-shared-10.2.11-1.el6.x86_64
> > robinhood-adm-3.1.5-1.x86_64
> > robinhood-lustre-3.1.5-1.lustre2.11.el6.x86_64
> > [root@mrh0 robinhood]#
> > 
> > 
> > Cheers,
> > Carsten
> > 
> > 
> 
> 
> _______________________________________________
> robinhood-support mailing list
> robinhood-support@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/robinhood-support
