[robinhood-support] Robinhood 3.0.1 rebooting unexpectedly on initial scan with Lustre 2.8.0.8

2017-07-11 Thread Mervini, Joseph A
Hello, I have a problem similar to https://sourceforge.net/p/robinhood/mailman/message/35883907/ in which the robinhood server running mariadb-5.5.52-1.el7.x86_64 and lustre 2.8.0.8 client will reboot when the initial scan is run. I am running this in a testbed environment prior to deployment

[robinhood-support] Scan appears to have completed but is still running

2017-10-16 Thread Mervini, Joseph A
Hi, I have a fairly large Lustre file system (3.5PB) on which I started an initial scan; it has been running without any real activity for more than a day, and I am wondering if this is a normal condition. I have not turned on changelogs on the MDS nor do I have anything set up to automatically

Re: [robinhood-support] [EXTERNAL] Robinhood 3.0.1 rebooting unexpectedly on initial scan with Lustre 2.8.0.8

2017-08-23 Thread Mervini, Joseph A
chdog timer starts during the scan when path2fid does an open. The > watchdog timer isn’t terminated until a special flag is specified, and Lustre > doesn’t know about the flag, so the watchdog times out and reboots the system. > > - Justin Miller > >> On Aug 23, 2017, at 9:55 AM,

Re: [robinhood-support] [EXTERNAL] Robinhood 3.0.1 rebooting unexpectedly on initial scan with Lustre 2.8.0.8

2017-08-23 Thread Mervini, Joseph A
0 jame...@sandia.gov > On Aug 23, 2017, at 11:50 AM, Mervini, Joseph A <jame...@sandia.gov> wrote: > > I am able to verify that the cause of the failures was in fact due to > /dev/watchdog and /dev/watchdog0 existing on the lustre file system. The > problem was easily duplicated b
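
For anyone hitting the same failure, one way to check a file system for stray device nodes before scanning is sketched below. It is a minimal sketch, not part of Robinhood or Lustre: the function name find_char_devices is hypothetical, and it assumes that calling lstat on a device node is harmless because it never opens the node (the replies in this thread attribute the reboot to an open on the watchdog device).

```python
import os
import stat


def find_char_devices(root):
    """Yield paths under root that are character device nodes.

    Only os.lstat() is used, so no device node is ever opened.
    The function name and approach are illustrative assumptions,
    not something taken from Robinhood or Lustre.
    """
    # os.walk reports every non-directory entry (including device
    # nodes) in `filenames`; unreadable directories are skipped.
    for dirpath, _dirnames, filenames in os.walk(root, followlinks=False):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                mode = os.lstat(path).st_mode
            except OSError:
                # Entry vanished or is not accessible; skip it.
                continue
            if stat.S_ISCHR(mode):
                yield path
```

Calling list(find_char_devices(mount_point)) on the client mount point (mount_point being whatever path the file system is mounted on) would list candidates such as copied watchdog nodes; whether to delete them or exclude them from the scan is a site decision.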

Re: [robinhood-support] [EXTERNAL] Robinhood 3.0.1 rebooting unexpectedly on initial scan with Lustre 2.8.0.8

2017-08-23 Thread Mervini, Joseph A
Trying again - apparently the message is too long. Joe Mervini Sandia National Laboratories High Performance Computing 505.844.6770 jame...@sandia.gov On Aug 23, 2017, at 9:22 AM, Mervini, Joseph A <jame...@sandia.gov<mailto:jame...@sandia

Re: [robinhood-support] [EXTERNAL] Robinhood 3.0.1 rebooting unexpectedly on initial scan with Lustre 2.8.0.8

2017-08-23 Thread Mervini, Joseph A
ex resumed> ) = 0 Write failed: Broken pipe NOTE: I captured the entire output from strace in a screen log. Joe Mervini Sandia National Laboratories High Performance Computing 505.844.6770 jame...@sandia.gov On Aug 22, 2017, at 9:17 AM, Mervini,