Jessica,

We see the same issue. I posted my question back in October but haven’t heard anything back: https://sourceforge.net/p/robinhood/mailman/message/35435513/
We are in a better place than before, but I still see some discrepancy over time. We haven’t done a scan since late October and are staying reasonably (though not perfectly) in sync. I’ve done a couple of things that are helping us stay more in sync:

1) In the Robinhood configuration file, I set md_update = always. From looking through the source code, it appears this setting forces a metadata update at every opportunity, which is more frequent than the default. I don’t believe this has made a big impact, but it helps (https://github.com/cea-hpc/robinhood/blob/5274d237105e04f20fc81258a927d9c4311ebc77/src/common/update_params.c#L428).

2) I also enabled ATIME changelogs. This is not recommended for production by Lustre or Robinhood developers, but our metadata load is low enough that this doesn’t cause problems. There seems to be a hierarchy to when/if changelogs are reported (http://lists.lustre.org/pipermail/lustre-discuss-lustre.org/2016-March/013376.html), and I believe this change is the one that has helped us keep in sync the most.

Both of these changes have performance ramifications and should be done at your own risk, but in our situation they’ve seemed to help. I’d be very happy to hear expert advice on keeping Robinhood in sync, however.

Thanks,
Shawn

On 1/24/17, 9:24 AM, "Jessica Otey" <jo...@nrao.edu> wrote:

All,

I have been observing for a while now a difference between how full robinhood believes our filesystem is and how full lfs df -h reports it is. I am wondering if anyone has any insight into this.

A bit of history... I run frequent reports against the robinhood database, which also include the output of the lfs df -h command. Historically, the totals from rbh-du and lfs df were essentially identical. Lately, they seem to keep growing apart; what seems to bring them back together is doing a full scan of the file system.
As time passes after a scan (changelogs are on), 'lfs df' reports steadily higher usage than rbh-du does. Indeed, I recently reinstalled (upgraded) my robinhood and noticed that when the report ran right after the file system scan and before the service was activated (so no changelogs were being consumed), the difference was precisely zero. I believe that is a clue... but I don't know why this is happening (when there was a lengthy period where it wasn't happening) or what to do to fix it.

Also, it might help to know that a non-negligible amount of file movement from one OST to another is taking place.

Thanks,
Jessica

--
Jessica Otey
System Administrator II
North American ALMA Science Center (NAASC)
National Radio Astronomy Observatory (NRAO)
Charlottesville, Virginia (USA)

_______________________________________________
robinhood-support mailing list
robinhood-support@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/robinhood-support
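
For anyone who wants to track the drift discussed in this thread, here is a minimal sketch of how the gap between the lfs df figure and the rbh-du figure could be computed for each report. It is not taken from either poster's setup. The helper names parse_size and drift are hypothetical, the inputs are assumed to be used-space values copied from the two tools' output, and single-letter suffixes are assumed to be binary units (K = 1024 bytes).

```python
# Hypothetical helpers: neither function is part of Robinhood or Lustre.
# Inputs are assumed to be "used space" figures copied from `lfs df -h`
# and `rbh-du` output, written either as plain bytes or with a K/M/G/T/P suffix.
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4, "P": 1024**5}


def parse_size(text):
    """Convert a size such as '1.5T' into bytes (binary units assumed)."""
    text = text.strip().upper()
    if text and text[-1] in UNITS:
        return float(text[:-1]) * UNITS[text[-1]]
    return float(text)


def drift(lfs_used, rbh_used):
    """Return (difference in bytes, difference as a % of the lfs df figure).

    A positive result means lfs df reports more usage than rbh-du.
    """
    lfs_bytes = parse_size(lfs_used)
    rbh_bytes = parse_size(rbh_used)
    diff = lfs_bytes - rbh_bytes
    pct = 100.0 * diff / lfs_bytes if lfs_bytes else 0.0
    return diff, pct


if __name__ == "__main__":
    # Made-up example values, not measurements from this thread.
    diff, pct = drift("1.5T", "1.2T")
    print(f"lfs df exceeds rbh-du by {diff / 1024**4:.2f} TiB ({pct:.1f}%)")
```

Logging this pair of numbers with each report would show whether the gap grows steadily between scans or jumps after specific events such as OST-to-OST file migration.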