Re: [Lustre-discuss] md and mdadm in 1.8.7-wc1

2012-03-19 Thread Robin Humble
On Mon, Mar 19, 2012 at 10:05:42PM -0700, Samuel Aparicio wrote:
>I am wondering if anyone has experienced issues with md / mdadm in the
>1.8.7-wc1 patched server kernels?

I've seen an issue:
  http://jira.whamcloud.com/browse/LU-1115
although it looks quite different from your problem... still, you might
hit that one next.

>we have historically used software raid on our OSS machines because it
>provided a 20-30% throughput improvement in our hands over RAID provided
>by our storage arrays (Coraid ATA-over-Ethernet shelves). In 1.8.5 this has
>worked more or less flawlessly, but we now have new storage, with 3TB
>rather than 2TB disks, and new servers with 1.8.7-wc1 patched kernels.

I don't have any 3TB disks to test with, but I think you need to use a
newer superblock format for 3TB devices.
eg. use
   mdadm -e 1.2 ...
see 'man mdadm', which notes the 0.90 format's ~2TB per-device limit.
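a quick way to see what's on disk now, and what a newer-format create looks
like (a rough sketch - the device and md names are lifted from your log, but
the raid level and member list are just placeholders):

    # show which metadata version the existing superblocks use
    mdadm --examine /dev/etherd/e14.16 | grep -i version
    # when (re)creating, ask for the 1.2 format explicitly
    # (note this is destructive - it writes new superblocks)
    mdadm --create /dev/md140 --metadata=1.2 --level=6 --raid-devices=17 \
        /dev/etherd/e14.*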

also, I'm not quite sure how to read the log below, but it kinda looks like
you have 17 3TB disks in a single raid? that's a lot... I thought
ldiskfs was only ok up to 24TB these days?
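(back of the envelope: if that's eg. raid6, 17 disks leaves ~15 for data,
and 15 x 3TB is ~45TB - well past that limit)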

>md is unable to reliably shut down and restart arrays after the machines
>have been rebooted (cleanly) - the disks are no longer recognized as part
>of the arrays they were created in. In the kernel log we have seen the
>messages below, which include:
>
> md: bug in file drivers/md/md.c, line 1677

that's this check in drivers/md/md.c:

    if (!mddev->events) {
        /*
         * oops, this 64-bit counter should never wrap.
         * Either we are in around ~1 trillion A.C., assuming
         * 1 reboot per second, or we have a bug:
         */
        MD_BUG();
        mddev->events --;
    }


so it looks like your md superblock is corrupted. that's consistent with
needing a newer superblock version.

other less likely possibilities (a couple of quick checks are sketched below):
 - could it also be that your coraid devices have problems with >2TB?
 - if you are running 32-bit kernels, something could be wrong there.
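a couple of quick checks for those (just a sketch - the device name is one
from your log):

    # is this a 64-bit kernel?
    uname -m
    # does the block layer see the full >2TB size of a member device?
    blockdev --getsize64 /dev/etherd/e14.16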

cheers,
robin
--
Dr Robin Humble, HPC Systems Analyst, NCI National Facility

>looking through the mdadm changelogs, it seems like there are some possible
>patches for md in 2.6.18 kernels, but I cannot tell if they are applied
>here, or whether this is even relevant.
>
>I am not clear whether this is an issue with 3TB disks or something else
>related to mdadm and the patched server kernel. My suspicion is that
>something has broken with >2.2TB disks.
>
>Does anyone have any ideas about this?
>
>thanks
>sam aparicio
>
>---
>Mar 19 21:34:48 OST3 kernel: md:**
>Mar 19 21:34:48 OST3 kernel: 
>Mar 19 21:35:20 OST3 kernel: md: bug in file drivers/md/md.c, line 1677
>Mar 19 21:35:20 OST3 kernel: 
>Mar 19 21:35:20 OST3 kernel: md:**
>Mar 19 21:35:20 OST3 kernel: md:*  *
>Mar 19 21:35:20 OST3 kernel: md:**
>Mar 19 21:35:20 OST3 kernel: md142: 
>Mar 19 21:35:20 OST3 kernel: md141: 
>Mar 19 21:35:20 OST3 kernel: md140: 
>/e14.5>
>Mar 19 21:35:20 OST3 kernel: md: rdev etherd/e14.16, SZ:2930265344 F:0 S:0 
>DN:16
>Mar 19 21:35:20 OST3 kernel: md: rdev superblock:
>Mar 19 21:35:20 OST3 kernel: md:  SB: (V:1.0.0) 
>ID:<9859f274.34313a61.0030.> CT:5d3314af
>Mar 19 21:35:20 OST3 kernel: md: L234772919 S861164367 ND:1970037550 
>RD:1919251571 md1667457582 LO:65536 CS:196610
>Mar 19 21:35:20 OST3 kernel: md: UT:0800 ST:0 AD:1565563648 WD:1 FD:8 
>SD:0 CSUM: E:
>Mar 19 21:35:20 OST3 kernel:  D  0:  DISK
>Mar 19 21:35:20 OST3 kernel:  D  1:  DISK
>Mar 19 21:35:20 OST3 kernel:  D  2:  DISK
>Mar 19 21:35:20 OST3 kernel:  D  3:  DISK
>Mar 19 21:35:20 OST3 kernel: md: THIS:  DISK
>< output truncated >
>
>Professor Samuel Aparicio BM BCh PhD FRCPath
>Nan and Lorraine Robertson Chair UBC/BC Cancer Agency
>675 West 10th, Vancouver V5Z 1L3, Canada.
>office: +1 604 675 8200 lab website http://molonc.bccrc.ca
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


[Lustre-discuss] md and mdadm in 1.8.7-wc1

2012-03-19 Thread Samuel Aparicio
I am wondering if anyone has experienced issues with md / mdadm in the
1.8.7-wc1 patched server kernels?
we have historically used software raid on our OSS machines because it
provided a 20-30% throughput improvement in our hands over RAID provided by
our storage arrays (Coraid ATA-over-Ethernet shelves). In 1.8.5 this has
worked more or less flawlessly, but we now have new storage, with 3TB rather
than 2TB disks, and new servers with 1.8.7-wc1 patched kernels.

md is unable to reliably shut down and restart arrays after the machines
have been rebooted (cleanly) - the disks are no longer recognized as part of
the arrays they were created in. In the kernel log we have seen the messages
below, which include:

 md: bug in file drivers/md/md.c, line 1677

looking through the mdadm changelogs, it seems like there are some possible
patches for md in 2.6.18 kernels, but I cannot tell if they are applied here,
or whether this is even relevant.

I am not clear whether this is an issue with 3TB disks or something else
related to mdadm and the patched server kernel. My suspicion is that
something has broken with >2.2TB disks.

Does anyone have any ideas about this?

thanks
sam aparicio

---
Mar 19 21:34:48 OST3 kernel: md:**
Mar 19 21:34:48 OST3 kernel: 
Mar 19 21:35:20 OST3 kernel: md: bug in file drivers/md/md.c, line 1677
Mar 19 21:35:20 OST3 kernel: 
Mar 19 21:35:20 OST3 kernel: md:**
Mar 19 21:35:20 OST3 kernel: md:*  *
Mar 19 21:35:20 OST3 kernel: md:**
Mar 19 21:35:20 OST3 kernel: md142: 
Mar 19 21:35:20 OST3 kernel: md141: 
Mar 19 21:35:20 OST3 kernel: md140: 

Mar 19 21:35:20 OST3 kernel: md: rdev etherd/e14.16, SZ:2930265344 F:0 S:0 DN:16
Mar 19 21:35:20 OST3 kernel: md: rdev superblock:
Mar 19 21:35:20 OST3 kernel: md:  SB: (V:1.0.0) 
ID:<9859f274.34313a61.0030.> CT:5d3314af
Mar 19 21:35:20 OST3 kernel: md: L234772919 S861164367 ND:1970037550 
RD:1919251571 md1667457582 LO:65536 CS:196610
Mar 19 21:35:20 OST3 kernel: md: UT:0800 ST:0 AD:1565563648 WD:1 FD:8 
SD:0 CSUM: E:
Mar 19 21:35:20 OST3 kernel:  D  0:  DISK
Mar 19 21:35:20 OST3 kernel:  D  1:  DISK
Mar 19 21:35:20 OST3 kernel:  D  2:  DISK
Mar 19 21:35:20 OST3 kernel:  D  3:  DISK
Mar 19 21:35:20 OST3 kernel: md: THIS:  DISK
< output truncated >

Professor Samuel Aparicio BM BCh PhD FRCPath
Nan and Lorraine Robertson Chair UBC/BC Cancer Agency
675 West 10th, Vancouver V5Z 1L3, Canada.
office: +1 604 675 8200 lab website http://molonc.bccrc.ca

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] ll_ost thread soft lockup

2012-03-19 Thread Robin Humble
On Mon, Mar 19, 2012 at 07:28:22AM -0600, Kevin Van Maren wrote:
>You are running 1.8.5, which does not have the fix for the known MD raid5/6 
>rebuild corruption bug.  That fix was released in the Oracle Lustre 1.8.7 
>kernel patches.  Unless you already applied that patch, you might want to run 
>a check of your raid arrays and consider an upgrade (at least patch your 
>kernel with that fix).
>
>The fix is md-avoid-corrupted-ldiskfs-after-rebuild.patch in the
>2.6-rhel5.series (note that this bug is NOT specific to rhel5). This fix does
>NOT appear to have been picked up by Whamcloud.

as you say, the md rebuild bug is in all kernels < 2.6.32
  http://marc.info/?l=linux-raid&m=130192650924540&w=2

the Whamcloud fix is LU-824, which landed in git a tad after 1.8.7-wc1.

I also asked RedHat nicely, and they added the same patch to RHEL5.8
kernels, which IMHO is the correct place for a fundamental md fix.

so once Lustre supports RHEL5.8 servers, the patch in Lustre won't be needed
any more.
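
(for anyone wanting to run the array check Kevin suggests before upgrading,
the md sysfs interface can do it - a rough sketch, where md0 is just an
example array name):

    # kick off a consistency check
    echo check > /sys/block/md0/md/sync_action
    # watch progress
    cat /proc/mdstat
    # once it goes idle again, a non-zero count here means mismatched stripes
    cat /sys/block/md0/md/mismatch_cnt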

cheers,
robin
--
Dr Robin Humble, HPC Systems Analyst, NCI National Facility
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss