Hello,
I have a problem similar to
https://sourceforge.net/p/robinhood/mailman/message/35883907/ in which the
robinhood server running mariadb-5.5.52-1.el7.x86_64 and lustre 2.8.0.8 client
will reboot when the initial scan is run. I am running this in a testbed
environment prior to deployment
Hi,
I have a fairly large Lustre file system (3.5PB) on which I started an initial
scan. The scan has been running without any real activity for more than a day,
and I am wondering if that is normal. I have not turned on changelogs on
the MDS, nor do I have anything set up to automatically
watchdog timer starts during the scan when path2fid does an open. The
> watchdog timer isn’t terminated until a special flag is specified, and Lustre
> doesn’t know about the flag, so the watchdog times out and reboots the system.
>
> - Justin Miller
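
Since the reboot described above is triggered by opening a watchdog device node that lives on the scanned file system, one way to check a tree before scanning is to list its character device nodes without ever opening them. The following is only a rough sketch under that assumption: the function names and the command-line usage are hypothetical, it relies solely on os.lstat() and os.walk() from the Python standard library, and it is not part of robinhood or Lustre.

```python
#!/usr/bin/env python3
"""List character device nodes under a directory tree without opening them."""
import os
import stat
import sys
from typing import Iterator


def is_char_device(path: str) -> bool:
    """Return True if path is a character device; lstat() never opens it."""
    try:
        mode = os.lstat(path).st_mode
    except OSError:
        # Entry vanished or is not accessible: treat it as "not a device".
        return False
    return stat.S_ISCHR(mode)


def find_char_devices(root: str) -> Iterator[str]:
    """Yield every character device node found below root.

    os.walk() reports device nodes among the non-directory entries and
    does not follow symlinked directories by default.
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if is_char_device(path):
                yield path


if __name__ == "__main__":
    # Hypothetical usage: pass the mount point (or a subtree) to inspect.
    for device in find_char_devices(sys.argv[1] if len(sys.argv) > 1 else "."):
        print(device)
```

Any paths it prints (for example stray copies of /dev/watchdog) could then be removed or excluded before the initial scan is started. Note that walking a multi-petabyte file system this way is itself a full metadata traversal, so it may be more practical to point it at the subtree suspected of holding a copied /dev.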
>
>> On Aug 23, 2017, at 9:55 AM,
jame...@sandia.gov
> On Aug 23, 2017, at 11:50 AM, Mervini, Joseph A <jame...@sandia.gov> wrote:
>
> I am able to verify that the cause of the failures was in fact due to
> /dev/watchdog and /dev/watchdog0 existing on the lustre file system. The
> problem was easily duplicated b
Trying again - apparently the message is too long.
Joe Mervini
Sandia National Laboratories
High Performance Computing
505.844.6770
jame...@sandia.gov
On Aug 23, 2017, at 9:22 AM, Mervini, Joseph A
<jame...@sandia.gov<mailto:jame...@sandia
ex resumed> ) = 0
Write failed: Broken pipe
NOTE: I captured the entire output from strace in a screen log.
Joe Mervini
Sandia National Laboratories
High Performance Computing
505.844.6770
jame...@sandia.gov
On Aug 22, 2017, at 9:17 AM, Mervini,