I posted this on Slack, but it’s serious enough that I want to make sure 
everyone sees it. Does anyone, from IBM or otherwise, have any more information 
about this/whether it was even announced anyplace? Thanks!

A little late, but we ran into a relatively serious problem at our site with 
5.0.2.3. The symptom is an mmfsd crash/segfault related to 
fs/dirop.C:4548. We ran into this sporadically, but it was repeatable on the 
problem workload. From IBM Support:

2. This is a known defect.
The problem has been fixed through
D.1073563: CTM_A_XW_FOR_DATA_IN_INODE related assert in DirLTE::lock
A companion fix is
D.1073753: Assert that the lock mode in DirLTE::lock is strong enough


The rep further said "It's not an APAR since it's found in internal testing. 
It's an internal function at a place it should not assert but a part of the 
condition as the code path is specific to the DIR_UPDATE_LOCKMODE optimization 
code... The assert was meant for certain file creation code path, but the 
condition wasn't set strictly for that code path that some other code path 
could also run into the assert. So we cannot predict on which node it would 
happen."

The fix was setting disableAssert="dirop.C:4548", which can be done live. Anyone 
seen anything else about this anyplace? The bug is fixed in 5.0.3.x and was 
introduced in 5.0.2.0/1 (not sure what this version number means; I’ve seen 
them listed X.X.X.X.X.X, X.X.X-X.X, and others).
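
For anyone who wants to check a list of node versions against that range, here 
is a rough Python sketch. It assumes the affected range is 5.0.2.0 up to (but 
not including) 5.0.3.0, which is only my reading of what Support said, and it 
assumes the version strings look like 5.0.2.3 or 5.0.2-3. The function names 
are made up for illustration and are not part of any IBM tooling.

```python
import re

# Assumption taken from the post: introduced in 5.0.2.0 (or possibly 5.0.2.1),
# fixed somewhere in 5.0.3.x. Adjust these bounds if IBM says otherwise.
FIRST_AFFECTED = (5, 0, 2, 0)
FIRST_FIXED = (5, 0, 3, 0)


def parse_version(text):
    """Return the first four numeric fields of a version string as a tuple.

    Accepts both dotted ("5.0.2.3") and dashed ("5.0.2-3") spellings.
    Shorter strings such as "5.0.3" are padded with zeros.
    """
    fields = [int(f) for f in re.findall(r"\d+", text)]
    return tuple((fields + [0, 0, 0, 0])[:4])


def possibly_affected(text):
    """True if the version falls inside the assumed affected range."""
    return FIRST_AFFECTED <= parse_version(text) < FIRST_FIXED


if __name__ == "__main__":
    for v in ["5.0.1.2", "5.0.2.0", "5.0.2-3", "5.0.3.1"]:
        print(v, possibly_affected(v))
```

This only compares version numbers; it says nothing about whether a given node 
will actually hit the assert.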

--
____
|| \\UTGERS,     |---------------------------*O*---------------------------
||_// the State  |         Ryan Novosielski - [email protected]
|| \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus
||  \\    of NJ  | Office of Advanced Research Computing - MSB C630, Newark
     `'

_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
