Robinhood in scan mode seems to randomly reset the machine after running for some time (anywhere from ~1 to ~10 hours). This has been observed on at least 3 different nodes.
The only messages in the log before the node resets are the following:

kernel: fuse init (API version 7.14)
kernel: iTCO_wdt: Unexpected close, not stopping watchdog!
kernel: lp: driver loaded but no devices found
kernel: ppdev: user-space parallel port driver
kernel: PPP generic driver version 2.4.2
kernel: tun: Universal TUN/TAP device driver, 1.6
kernel: tun: (C) 1999-2004 Max Krasnyansky <m...@qualcomm.com>

If I remove the *_wdt watchdog modules, the scan completes successfully.

The nodes run SL 6.7, kernel 2.6.32-642.6.2.el6.x86_64, and Lustre 2.4.3, and we are using the pre-built robinhood RPMs, version 3.0. The Lustre file system being scanned has ~50M files.

Any ideas why this is happening?

Gizo

--
Dr. Gizo Nanava
Leibniz Universitaet IT Services
Leibniz Universitaet Hannover
Schlosswender Str. 5
D-30159 Hannover
Tel +49 511 762 7919085
http://www.luis.uni-hannover.de

_______________________________________________
robinhood-support mailing list
robinhood-support@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/robinhood-support