Hi all,

I got this from a crash on an OSS (Lustre 1.6.3):

[EMAIL PROTECTED] ~]# cat /proc/fs/lustre/health_check
device lustre-OST0012 reported unhealthy
device lustre-OST0014 reported unhealthy
device lustre-OST0016 reported unhealthy
NOT HEALTHY

In /var/log/messages we have:

Dec 17 14:40:56 oss01 kernel: LDISKFS-fs error (device dm-15): read_block_bitmap: Invalid block bitmap - block_group = 10648, block = 348913664
Dec 17 14:40:56 oss01 kernel: Remounting filesystem read-only
Dec 17 14:40:56 oss01 kernel: LDISKFS-fs error (device dm-17): mb_free_blocks: double-free of inode 232644664's block 930695936(bit 19200 in group 28402)
Dec 17 14:40:56 oss01 kernel:
Dec 17 14:40:56 oss01 kernel: Remounting filesystem read-only
Dec 17 14:40:56 oss01 kernel: LDISKFS-fs error (device dm-17): mb_free_blocks: double-free of inode 232644664's block 930695937(bit 19201 in group 28402)
Dec 17 14:40:56 oss01 kernel:
Dec 17 14:40:56 oss01 kernel: LDISKFS-fs error (device dm-17): mb_free_blocks: double-free of inode 232644664's block 930695938(bit 19202 in group 28402)
Dec 17 14:40:56 oss01 kernel:
Dec 17 14:40:56 oss01 kernel: LDISKFS-fs error (device dm-17): mb_free_blocks: double-free of inode 232644664's block 930695939(bit 19203 in group 28402)
Dec 17 14:40:56 oss01 kernel:
Dec 17 14:40:56 oss01 kernel: LDISKFS-fs error (device dm-17): mb_free_blocks: double-free of inode 232644664's block 930695940(bit 19204 in group 28402)
Dec 17 14:40:56 oss01 kernel:
Dec 17 14:40:56 oss01 kernel: LDISKFS-fs error (device dm-17): mb_free_blocks: double-free of inode 232644664's block 930695941(bit 19205 in group 28402)
Dec 17 14:40:56 oss01 kernel:
Dec 17 14:40:56 oss01 kernel: LDISKFS-fs error (device dm-17): mb_free_blocks: double-free of inode 232644664's block 930695942(bit 19206 in group 28402)
Dec 17 14:40:56 oss01 kernel:
Dec 17 14:40:56 oss01 kernel: LDISKFS-fs error (device dm-17): mb_free_blocks: double-free of inode 232644664's block 930695943(bit 19207 in group 28402)
Dec 17 14:40:56 oss01 kernel:
Dec 17 14:40:56 oss01 kernel: LDISKFS-fs error (device dm-17): mb_free_blocks: double-free of inode 232644664's block 930695944(bit 19208 in group 28402)
Dec 17 14:40:56 oss01 kernel:
Dec 17 14:40:56 oss01 kernel: LDISKFS-fs error (device dm-17): mb_free_blocks: double-free of inode 232644664's block 930695945(bit 19209 in group 28402)
Dec 17 14:40:56 oss01 kernel:
Dec 17 14:40:56 oss01 kernel: LDISKFS-fs error (device dm-17): mb_free_blocks: double-free of inode 232644664's block 930695946(bit 19210 in group 28402)
Dec 17 14:40:56 oss01 kernel:
Dec 17 14:40:56 oss01 kernel: LDISKFS-fs error (device dm-17): mb_free_blocks: double-free of inode 232644664's block 930695947(bit 19211 in group 28402)
Dec 17 14:40:56 oss01 kernel:
Dec 17 14:40:56 oss01 kernel: LDISKFS-fs error (device dm-17): mb_free_blocks: double-free of inode 232644664's block 930695948(bit 19212 in group 28402)
Dec 17 14:40:56 oss01 kernel:
Dec 17 14:40:56 oss01 kernel: LDISKFS-fs error (device dm-17): mb_free_blocks: double-free of inode 232644664's block 930695949(bit 19213 in group 28402)
Dec 17 14:40:56 oss01 kernel:
Dec 17 14:40:56 oss01 kernel: LDISKFS-fs error (device dm-17): mb_free_blocks: double-free of inode 232644664's block 930695950(bit 19214 in group 28402)
Dec 17 14:40:56 oss01 kernel:
Dec 17 14:40:56 oss01 kernel: LDISKFS-fs error (device dm-17): mb_free_blocks: double-free of inode 232644664's block 930695951(bit 19215 in group 28402)
(....)
Dec 17 14:41:17 oss01 kernel:
Dec 17 14:41:17 oss01 kernel: LDISKFS-fs error (device dm-16): mb_free_blocks: double-free of inode 214925368's block 859725308(bit 24060 in group 26236)
Dec 17 14:41:17 oss01 kernel:
Dec 17 14:41:17 oss01 kernel: LDISKFS-fs error (device dm-16): mb_free_blocks: double-free of inode 214925368's block 859725309(bit 24061 in group 26236)
Dec 17 14:41:17 oss01 kernel:
Dec 17 14:41:17 oss01 kernel: LDISKFS-fs error (device dm-16): mb_free_blocks: double-free of inode 214925368's block 859725310(bit 24062 in group 26236)
Dec 17 14:41:17 oss01 kernel:
Dec 17 14:41:17 oss01 kernel: LDISKFS-fs error (device dm-16): mb_free_blocks: double-free of inode 214925368's block 859725311(bit 24063 in group 26236)
Dec 17 14:41:17 oss01 kernel:
Dec 17 14:41:17 oss01 kernel: LustreError: 759:0:(fsfilt-ldiskfs.c: 281:fsfilt_ldiskfs_start()) error starting handle for op 1 (120 credits): rc -30
Dec 17 15:22:13 oss01 heartbeat: [26083]: info: Checking status of STONITH device [external/ipmi ]
Dec 17 15:22:13 oss01 heartbeat: [32011]: info: Exiting STONITH-stat process 26083 returned rc 0.
Dec 17 15:35:05 oss01 kernel: LustreError: 675:0:(ldlm_resource.c: 651:ldlm_resource_add()) lvbo_init failed for resource 94: rc -2
Dec 17 15:35:05 oss01 kernel: LustreError: 726:0:(ldlm_resource.c: 651:ldlm_resource_add()) lvbo_init failed for resource 95: rc -2
Dec 17 15:35:05 oss01 kernel: LustreError: 726:0:(ldlm_resource.c: 651:ldlm_resource_add()) Skipped 1 previous similar message
Dec 17 15:35:05 oss01 kernel: LustreError: 698:0:(ldlm_resource.c: 651:ldlm_resource_add()) lvbo_init failed for resource 94: rc -2
Dec 17 15:35:05 oss01 kernel: LustreError: 698:0:(ldlm_resource.c: 651:ldlm_resource_add()) Skipped 1 previous similar message
Dec 17 15:35:05 oss01 kernel: LustreError: 739:0:(ldlm_resource.c: 651:ldlm_resource_add()) lvbo_init failed for resource 97: rc -2
Dec 17 15:35:05 oss01 kernel: LustreError: 739:0:(ldlm_resource.c: 651:ldlm_resource_add()) Skipped 4 previous similar messages
Dec 17 15:35:05 oss01 kernel: LustreError: 712:0:(ldlm_resource.c: 651:ldlm_resource_add()) lvbo_init failed for resource 95: rc -2
Dec 17 15:35:05 oss01 kernel: LustreError: 712:0:(ldlm_resource.c: 651:ldlm_resource_add()) Skipped 4 previous similar messages
Dec 17 15:35:05 oss01 kernel: LustreError: 670:0:(ldlm_resource.c: 651:ldlm_resource_add()) lvbo_init failed for resource 96: rc -2
Dec 17 15:35:05 oss01 kernel: LustreError: 670:0:(ldlm_resource.c: 651:ldlm_resource_add()) Skipped 14 previous similar messages
Dec 17 15:35:16 oss01 kernel: LustreError: 639:0:(ldlm_resource.c: 651:ldlm_resource_add()) lvbo_init failed for resource 98: rc -2
Dec 17 15:35:16 oss01 kernel: LustreError: 639:0:(ldlm_resource.c: 651:ldlm_resource_add()) Skipped 6 previous similar messages
Dec 17 15:54:53 oss01 kernel: LustreError: 777:0:(fsfilt-ldiskfs.c: 281:fsfilt_ldiskfs_start()) error starting handle for op 8 (49 credits): rc -30
Dec 17 15:54:53 oss01 kernel: LustreError: 799:0:(fsfilt-ldiskfs.c: 281:fsfilt_ldiskfs_start()) error starting handle for op 8 (49 credits): rc -30
Dec 17 15:54:53 oss01 kernel: LustreError: 799:0:(fsfilt-ldiskfs.c: 281:fsfilt_ldiskfs_start()) Skipped 2 previous similar messages
Dec 17 15:54:53 oss01 kernel: LustreError: 830:0:(fsfilt-ldiskfs.c: 281:fsfilt_ldiskfs_start()) error starting handle for op 8 (49 credits): rc -30
Dec 17 15:54:53 oss01 kernel: LustreError: 830:0:(fsfilt-ldiskfs.c: 281:fsfilt_ldiskfs_start()) Skipped 2 previous similar messages
Dec 17 15:54:53 oss01 kernel: LustreError: 860:0:(fsfilt-ldiskfs.c: 281:fsfilt_ldiskfs_start()) error starting handle for op 8 (49 credits): rc -30
Dec 17 15:54:53 oss01 kernel: LustreError: 860:0:(fsfilt-ldiskfs.c: 281:fsfilt_ldiskfs_start()) Skipped 2 previous similar messages
Dec 17 15:54:54 oss01 kernel: LustreError: 809:0:(fsfilt-ldiskfs.c: 281:fsfilt_ldiskfs_start()) error starting handle for op 8 (49 credits): rc -30
Dec 17 15:54:54 oss01 kernel: LustreError: 809:0:(fsfilt-ldiskfs.c: 281:fsfilt_ldiskfs_start()) Skipped 2 previous similar messages
Dec 17 15:54:54 oss01 kernel: LustreError: 859:0:(fsfilt-ldiskfs.c: 281:fsfilt_ldiskfs_start()) error starting handle for op 8 (49 credits): rc -30
Dec 17 15:54:54 oss01 kernel: LustreError: 859:0:(fsfilt-ldiskfs.c: 281:fsfilt_ldiskfs_start()) Skipped 5 previous similar messages

I solved the problem by unmounting and remounting the 3 OSTs.

Is this a bug related to 1.6.3? Or to ext4? What is the status in 1.6.4.1?

Best Regards,
Franck

_______________________________________________
Lustre-discuss mailing list
[email protected]
https://mail.clusterfs.com/mailman/listinfo/lustre-discuss
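
For triage, a log excerpt like the one quoted above can be condensed into per-device error counts. The following is only a minimal sketch, not part of the original report: it assumes syslog lines shaped like the quoted ones, and the names `summarize`, `rc_name` and `SAMPLE` are hypothetical. The return codes are decoded with Python's standard `errno` table, where on Linux 30 is EROFS (read-only file system, which matches the "Remounting filesystem read-only" messages) and 2 is ENOENT.

```python
import errno
import re
from collections import Counter

# Matches e.g. "LDISKFS-fs error (device dm-17): mb_free_blocks: double-free ..."
LDISKFS_RE = re.compile(r"LDISKFS-fs error \(device (?P<dev>[^)]+)\): (?P<func>\w+):")
# Matches the trailing return code of a LustreError line, e.g. "rc -30".
RC_RE = re.compile(r"LustreError: .*\brc (?P<rc>-?\d+)\s*$")


def summarize(lines):
    """Count LDISKFS errors per (device, function) and Lustre return codes."""
    fs_errors = Counter()
    rc_codes = Counter()
    for line in lines:
        m = LDISKFS_RE.search(line)
        if m:
            fs_errors[(m.group("dev"), m.group("func"))] += 1
            continue
        m = RC_RE.search(line)
        if m:
            rc_codes[int(m.group("rc"))] += 1
    return fs_errors, rc_codes


def rc_name(rc):
    """Map a negative kernel return code to its errno symbol, if known."""
    return errno.errorcode.get(abs(rc), "unknown")


# A few lines copied from the log excerpt; in practice one would iterate
# over the lines of /var/log/messages instead.
SAMPLE = [
    "Dec 17 14:40:56 oss01 kernel: LDISKFS-fs error (device dm-15): read_block_bitmap: Invalid block bitmap - block_group = 10648, block = 348913664",
    "Dec 17 14:40:56 oss01 kernel: Remounting filesystem read-only",
    "Dec 17 14:40:56 oss01 kernel: LDISKFS-fs error (device dm-17): mb_free_blocks: double-free of inode 232644664's block 930695936(bit 19200 in group 28402)",
    "Dec 17 14:41:17 oss01 kernel: LDISKFS-fs error (device dm-16): mb_free_blocks: double-free of inode 214925368's block 859725308(bit 24060 in group 26236)",
    "Dec 17 14:41:17 oss01 kernel: LustreError: 759:0:(fsfilt-ldiskfs.c: 281:fsfilt_ldiskfs_start()) error starting handle for op 1 (120 credits): rc -30",
    "Dec 17 15:35:05 oss01 kernel: LustreError: 675:0:(ldlm_resource.c: 651:ldlm_resource_add()) lvbo_init failed for resource 94: rc -2",
]

fs_errors, rc_codes = summarize(SAMPLE)
for (dev, func), count in sorted(fs_errors.items()):
    print(f"{dev} {func}: {count}")
for rc, count in sorted(rc_codes.items()):
    print(f"rc {rc} ({rc_name(rc)}): {count}")
```

Grouping by device makes it easier to see which device-mapper volumes (dm-15, dm-16, dm-17 in the excerpt) reported on-disk inconsistencies and are therefore candidates for a filesystem check.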
