Hi,

We’re running Robinhood 2.5.5-2 against several of our Lustre file systems 
using changelogs.  We’ve been having a problem that seems to mostly involve 
actively written files.  We had a user today who was writing out over a 
petabyte of data, but this was not reflected in the Robinhood database.  The 
files all existed in the database, but their file sizes were significantly 
smaller than on disk.  An rbh-diff on the user’s directory brought the 
database back in sync.
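
In case it’s useful, this is roughly the command we ran (the config file 
path and the directory are just examples from our setup, not a 
recommendation):

    # compare the Robinhood DB with the filesystem under the user's
    # directory, and apply the differences back to the database
    rbh-diff -f /etc/robinhood.d/lustre/fs01.conf \
             --scan=/lustre/fs01/users/jdoe --apply=db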

I assume this is because no changelog events are produced while a file is open 
and being actively written to.  Under the db_update_policy stanza in our 
Robinhood configuration, we left the default of md_update=always.  Should I 
change that to on_event_periodic to keep up with these changing files?  If so, 
what would be a reasonable min_interval and max_interval for file systems with 
10-20 million files?  Do the intervals mean that Robinhood won’t update an 
entry any sooner than min_interval (even if an event occurs), and will 
refresh an entry once it hasn’t been updated within max_interval?  If I set 
max_interval to 1h, does that mean every file in the file system will be 
checked for updates once an hour?
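
For reference, this is the kind of stanza I have in mind (the interval 
values are just guesses on my part, not anything from the docs):

    db_update_policy
    {
        # update metadata on changelog events, but at most once per
        # min_interval for a given entry, and force a refresh if an
        # entry hasn't been updated within max_interval
        md_update = on_event_periodic(1min, 1h);
    }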

What is the difference between “always” and “on_event”?  Does “on_event” 
update an entry only for certain event types, whereas “always” updates it 
on any event related to that entry?

Thanks in advance,
Shawn