Andreas Dilger wrote:
> On 2010-11-23, at 05:01, Frederik Ferner wrote:
>> during a planned MDT fail over today, we got a number of these
>> messages below, can anyone explain what this could be?
>>
>>> Nov 23 08:33:26 cs04r-sc-mds01-01 kernel: Lustre:
>>> 21054:0:(mds_open.c:367:mds_create_objects()) Bad lmm_size during
>>> open replay for inode 111003141
>
> This means that the client (trying to recreate a file that was not
> saved to disk during the MDS failover) sent the layout information,
> but the size it reported for the layout information did not match the
> size that the MDS thought it should be for that kind of layout.
>
> Unfortunately, the error message doesn't report what those sizes are,
> so it is hard to know what the impact might be. The message is only
> a warning, and it is not necessarily a problem if the
> client-specified size is larger than the size expected, but it might
> be a problem if the client-specified size is smaller than expected
> (which I think is the less likely case).

Thanks for this. I don't think I'll worry too much about it now, as the
clients were all fairly quiet at the time of the failover, so I don't
think many important files were being written then. We tried to suspend
all cluster jobs about 10 minutes before the failover, yet at least some
of the files/inodes now seem to belong to cluster jobs. So I'm not sure
whether the inodes are still the same files, or what was going on then.

Does this relate to the stripe layout? Most files should have a stripe
count of 1; would this make a difference?

>> This is using Lustre 1.8.3-ddn3.3 on all servers and most clients,
>> some clients use 1.8.4.
>
> I can't comment on what changes are in the DDN release, so I don't
> know if this is specific to that release or not. In any case, I've
> never seen these messages before.

I'll test this later on our test file system, but no promises that I'll
be able to reproduce similar conditions.

>> So far we've not noticed any ill effect but would like to know what
>> that message is and if we can safely ignore it.
>
> It would only affect the listed inodes, if at all.

Unfortunately I don't have the full list of inodes, as syslog has
skipped some 'similar messages', but as mentioned above, I'm not that
worried at the moment.

Thanks,
Frederik

-- 
Frederik Ferner
Computer Systems Administrator   phone: +44 1235 77 8624
Diamond Light Source Ltd.        mob:   +44 7917 08 5110
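
For anyone wanting to collect the inode numbers that did make it into syslog, here is a minimal sketch. It assumes the warning text is exactly as quoted in this thread; the function name `affected_inodes` and the idea of feeding it lines from a log file are illustrative only, not part of any Lustre tool. It cannot recover entries that syslog collapsed into 'similar messages'.

```python
import re

# Pattern taken from the warning text quoted in this thread (an assumption:
# other Lustre versions may word the message differently).
BAD_LMM_RE = re.compile(r"Bad lmm_size during open replay for inode (\d+)")


def affected_inodes(lines):
    """Return the sorted, de-duplicated inode numbers named in the warnings."""
    inodes = set()
    for line in lines:
        match = BAD_LMM_RE.search(line)
        if match:
            inodes.add(int(match.group(1)))
    return sorted(inodes)


if __name__ == "__main__":
    sample = [
        "Nov 23 08:33:26 cs04r-sc-mds01-01 kernel: Lustre: "
        "21054:0:(mds_open.c:367:mds_create_objects()) Bad lmm_size during "
        "open replay for inode 111003141",
        "Nov 23 08:33:27 cs04r-sc-mds01-01 kernel: some unrelated message",
    ]
    for inode in affected_inodes(sample):
        print(inode)
```

In practice the `lines` argument could be an open file object for the MDS syslog, since iterating a file yields one line at a time.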
