To add my data point to the discussion: I have been able to leave the CHANGELOG running on the storage metadata server while a full robinhood scan runs.
Cheers,
megan

On Mon, Aug 23, 2021 at 10:08 AM Nathan Gregg - NOAA Affiliate via robinhood-support <[email protected]> wrote:
> Thanks Thomas for the excellent feedback. I am going to give this a try.
>
> This is probably a silly question, but is it OK to leave the changelog
> scans running while I do another full scan in parallel?
>
> Thanks for the help.
>
> Nate
>
> On Mon, Aug 16, 2021 at 3:57 AM [email protected] <[email protected]> wrote:
> >
> > Hello Nathan,
> >
> > Requests on subtrees of the filesystem are what make the query very
> > slow, because such a request builds and matches the path of every
> > entry in the DB.
> > A possible solution we can imagine to optimize your query is to define
> > fileclasses for the parts of the filesystem you want to query.
> > e.g.
> > fileclass projectA {
> >     definition { tree == /fs/subdirA }
> > }
> > fileclass projectB {
> >     definition { tree == /fs/subdirB }
> > }
> > ...
> > Note you will need to rescan the FS to update the fileclass of all the
> > entries.
> >
> > Then
> >     rbh-report --top-users=1000 --filter-class=projectA
> > should be faster than using -P.
> >
> > Of course, this supposes you know in advance the set of directories on
> > which you want to get stats.
> >
> > I hope this helps,
> > Regards,
> > Thomas
> >
> > > -----Original Message-----
> > > From: Nathan Gregg - NOAA Affiliate via robinhood-support [mailto:robinhood-[email protected]]
> > > Sent: Monday, August 9, 2021 19:46
> > > To: [email protected]
> > > Subject: [robinhood-support] Robinhood Report Performance
> > >
> > > Hello All,
> > >
> > > We successfully have Robinhood up and running and ingesting data from
> > > changelogs from two Lustre file systems. Everything seems to perform
> > > well other than when we want to run reports that are not part of the
> > > accounting table. For example, if we want to run a report such as
> > > `rbh-report --top-users=1000 -P /fs/subdir`, it takes 1.5 days to
> > > complete.
> > >
> > > Our system has SSD drives and 384 GB of RAM. The IO load looks to be
> > > very low on the box. I am sure more memory would help some, but I am
> > > not sure how much. Is there anything else we can do to try to
> > > dramatically reduce our reporting times for such queries?
> > >
> > > We are running `mysqltuner` and keeping up with its suggestions, but
> > > so far reports such as the one above are painfully slow.
> > >
> > > Thanks in advance for your support.
> > >
> > > Nate
> > >
> > > _______________________________________________
> > > robinhood-support mailing list
> > > [email protected]
> > > https://lists.sourceforge.net/lists/listinfo/robinhood-support
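
For anyone applying Thomas's suggestion to more than a couple of directories, the fileclass stanzas and the matching report commands could be generated rather than typed by hand. What follows is only a minimal Python sketch: the helper names `fileclass_stanza` and `report_command` are hypothetical and not part of Robinhood, and the stanza layout and the `--filter-class` option are copied as-is from the reply quoted above. Check the generated text against your own Robinhood configuration before using it.

```python
# Sketch only: builds config text and command lines, it does not run Robinhood.

def fileclass_stanza(name: str, tree: str) -> str:
    """Return one fileclass block in the form shown in the quoted reply."""
    return (
        f"fileclass {name} {{\n"
        f"    definition {{ tree == {tree} }}\n"
        f"}}\n"
    )


def report_command(name: str, top_users: int = 1000) -> list[str]:
    """Return the per-class report command as an argument list."""
    return ["rbh-report", f"--top-users={top_users}", f"--filter-class={name}"]


if __name__ == "__main__":
    # Class names and paths are the illustrative ones from the thread.
    classes = {"projectA": "/fs/subdirA", "projectB": "/fs/subdirB"}
    for name, tree in classes.items():
        print(fileclass_stanza(name, tree))
    for name in classes:
        print(" ".join(report_command(name)))
```

As noted in the reply, the filesystem has to be rescanned after the fileclasses are added so that every entry gets its class before the per-class reports are run.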
