Re: [Cooker] Re: Kernel and Samba
Norman Zhang wrote:
> Hi Buchan,
>
>>>> My Samba box stalls after some usage, mapped drives disappear and
>>>> users can't write or read from drives. The stalls happen randomly.
>>>> I'm running 2.4.19-16mdksmp and Samba 2.2.7a-9.2mdk. May I ask is
>>>> this a kernel bug or Samba bug? Does anyone know a fix for it? I
>>>> checked the memory from BIOS, they didn't report any errors.
>>>
>>> BIOS memory check is (mostly) useless. Use memtest86 or similar.
>>
>> Sure I will test it with memtest86 and report back. I have been
>> running LM9.0 with Samba on this box for 3/4 year now. The problem
>> only arose in the last 2 months, at random. I swapped in brand new
>> Crucial Micron ECC DDR266 SDRAM, but the problem still persists.
>>
>> BTW, the BIOS memory check is quite extensive (Intel claims to scan
>> it block by block). It takes about 1 to 2 minutes for it to scan the
>> memory. Not sure how this compares to memtest86. I guess I will wait
>> until after hours before I can run memtest86.
>
> I ran memtest and found no error. Do you have other suggestions for
> how I can further troubleshoot this? There are no cards plugged into
> the system. The system just runs software RAID. Thus it seems to be
> either an XFS, md or samba bug.

I would guess XFS. You may want to try a more recent kernel? (Thomas
hinted that earlier kernels may have had some issues with XFS). But I
think I'm running on XFS on the only production Winbind box I have at
present (running 8.2 still!), with no problems. But the only box I
have running XFS with an smp kernel runs 9.1.

> Maybe I could try upgrading samba to 2.2.8a-2mdk from your web
> server.

I would prefer if you used one of the samba FTP mirrors, which you can
get set up easily at http://plf.zarb.org/~nanardon/?minor=1 (choose a
Samba medium).

> Are there potential gotchas that I should watch out for?

Not that I know of. In fact, I had reports that Squid authentication
via winbind works with these but not the version you have.
Regards,
Buchan

-- 
|--Another happy Mandrake Club member--|
Buchan Milne                Mechanical Engineer, Network Manager
Cellphone * Work            +27 82 472 2231 * +27 21 8828820x202
Stellenbosch Automotive Engineering         http://www.cae.co.za
GPG Key                   http://ranger.dnsalias.com/bgmilne.asc
1024D/60D204A7 2919 E232 5610 A038 87B1 72D6 AC92 BA50 60D2 04A7
[Cooker] Re: Kernel and Samba
Hi Buchan,

>>>> My Samba box stalls after some usage, mapped drives disappear and
>>>> users can't write or read from drives. The stalls happen randomly.
>>>> I'm running 2.4.19-16mdksmp and Samba 2.2.7a-9.2mdk. May I ask is
>>>> this a kernel bug or Samba bug? Does anyone know a fix for it? I
>>>> checked the memory from BIOS, they didn't report any errors.
>>>
>>> BIOS memory check is (mostly) useless. Use memtest86 or similar.
>>
>> I ran memtest and found no error. Do you have other suggestions for
>> how I can further troubleshoot this? There are no cards plugged into
>> the system. The system just runs software RAID. Thus it seems to be
>> either an XFS, md or samba bug.
>
> I would guess XFS. You may want to try a more recent kernel? (Thomas
> hinted that earlier kernels may have had some issues with XFS). But I
> think I'm running on XFS on the only production Winbind box I have at
> present (running 8.2 still!), with no problems. But the only box I
> have running XFS with an smp kernel runs 9.1.

I did upgrade samba to 2.2.8a-2mdk, but the problem still persists. I
have now upgraded to kernel-smp-2.4.19-35mdk-1-1mdk as per
MDKSA-2003:074. I will report back on my findings.

BTW, how do I find out what XFS revision is included in
kernel-smp-2.4.19?

Regards,
Norman
Re: [Cooker] Re: Kernel and Samba
Norman Zhang wrote (sent Thursday 30 October 2003 06:08):
> Hi Buchan,
>
>>>>> My Samba box stalls after some usage, mapped drives disappear and
>>>>> users can't write or read from drives. The stalls happen randomly.
>>>>> I'm running 2.4.19-16mdksmp and Samba 2.2.7a-9.2mdk. May I ask is
>>>>> this a kernel bug or Samba bug? Does anyone know a fix for it? I
>>>>> checked the memory from BIOS, they didn't report any errors.
>>>>
>>>> BIOS memory check is (mostly) useless. Use memtest86 or similar.
>>>
>>> I ran memtest and found no error. Do you have other suggestions for
>>> how I can further troubleshoot this? There are no cards plugged into
>>> the system. The system just runs software RAID. Thus it seems to be
>>> either an XFS, md or samba bug.
>>
>> I would guess XFS. You may want to try a more recent kernel? (Thomas
>> hinted that earlier kernels may have had some issues with XFS). But I
>> think I'm running on XFS on the only production Winbind box I have at
>> present (running 8.2 still!), with no problems. But the only box I
>> have running XFS with an smp kernel runs 9.1.
>
> I did upgrade samba to 2.2.8a-2mdk, but the problem still persists. I
> have now upgraded to kernel-smp-2.4.19-35mdk-1-1mdk as per
> MDKSA-2003:074. I will report back on my findings.
>
> BTW, how do I find out what XFS revision is included in
> kernel-smp-2.4.19?

# dmesg | grep xfs

-- 
Regards
Thomas
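
A slightly more forgiving variant of the command above, offered only as a sketch: the XFS driver's boot banner is often capitalised in the kernel log, so a case-insensitive match is less likely to miss it. The function name xfs_lines is made up for illustration, and whether the matching lines carry a usable revision string depends on how the kernel was built.

```shell
# Filter kernel log text arriving on stdin for XFS-related lines.
# -i makes the match case-insensitive, so "XFS" and "xfs" both match.
xfs_lines() {
    grep -i 'xfs'
}

# Usage (reading the kernel ring buffer may need root on some systems):
#   dmesg | xfs_lines
```

Keeping the filter as a function that reads stdin means the same check can be run against a saved log file as well as live dmesg output.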
[Cooker] Re: Kernel and Samba
Hi Buchan,

>>> My Samba box stalls after some usage, mapped drives disappear and
>>> users can't write or read from drives. The stalls happen randomly.
>>> I'm running 2.4.19-16mdksmp and Samba 2.2.7a-9.2mdk. May I ask is
>>> this a kernel bug or Samba bug? Does anyone know a fix for it? I
>>> checked the memory from BIOS, they didn't report any errors.
>>
>> BIOS memory check is (mostly) useless. Use memtest86 or similar.
>
> Sure I will test it with memtest86 and report back. I have been
> running LM9.0 with Samba on this box for 3/4 year now. The problem
> only arose in the last 2 months, at random. I swapped in brand new
> Crucial Micron ECC DDR266 SDRAM, but the problem still persists.
>
> BTW, the BIOS memory check is quite extensive (Intel claims to scan
> it block by block). It takes about 1 to 2 minutes for it to scan the
> memory. Not sure how this compares to memtest86. I guess I will wait
> until after hours before I can run memtest86.

I ran memtest and found no error. Do you have other suggestions for
how I can further troubleshoot this? There are no cards plugged into
the system. The system just runs software RAID. Thus it seems to be
either an XFS, md or samba bug.

Maybe I could try upgrading samba to 2.2.8a-2mdk from your web server.
Are there potential gotchas that I should watch out for?

>>> /var/log/kernel/warnings:
>>>
>>> Oct 27 09:19:12 smbserver kernel: xfs_force_shutdown(md(9,5),0x8) called from line 1039 of file xfs_trans.c. Return address = 0xe08ae312
>>> Oct 27 09:19:12 smbserver kernel: Corruption of in-memory data detected. Shutting down filesystem: md(9,5)
>>> Oct 27 09:19:12 smbserver kernel: Please umount the filesystem, and rectify the problem(s)
>>
>> This seems to point quite strongly to either hardware (most likely
>> memory) or kernel (xfs driver or md driver, it seems you are running
>> software raid?)
>>
>> If the kernel has problems with a filesystem, there's nothing much
>> samba can do about it ...
>
> I'm using software RAID. Do you know if there are recent updates to
> the Mandrake kernel that may fix bugs in XFS and md drivers?
>
> Funny thing is that only Samba dies. SSH and others still work.

>>> /var/log/kernel/errors:
>>>
>>> Oct 27 10:36:44 smbserver kernel: Unknown bridge resource 2: assuming transparent
>>> Oct 27 10:36:44 smbserver kernel: PCI: Unable to handle 64-bit address space for
>>> Oct 27 10:36:44 smbserver kernel: PCI: Unable to handle 64-bit address space for
>>> Oct 27 10:36:44 smbserver kernel: Unknown bridge resource 2: assuming transparent
>>> Oct 27 10:36:44 smbserver kernel: PCI: Device 00:1f.1 not available because of resource collisions
>>
>> You need to give some more information on the hardware on this
>> machine, but something does not look right ... what's in
>> /proc/interrupts ?
>
> I'm using Intel SE7500WV2S Server Board. BIOS Version: 2.01 Build
> 0483. My /proc/interrupts are as follows. I have seen the boot screen
> complain about resource collisions, but couldn't find the cause. I've
> disabled all unnecessary ports in the BIOS (e.g., USB).
>
>            CPU0       CPU1       CPU2       CPU3
>   0:    1385866          0          0          0    IO-APIC-edge  timer
>   1:          7          0          0          0    IO-APIC-edge  keyboard
>   2:          0          0          0          0          XT-PIC  cascade
>   8:          1          0          0          0    IO-APIC-edge  rtc
>  12:        197          0          0          0    IO-APIC-edge  PS/2 Mouse
>  15:          5          0          0          0    IO-APIC-edge  ide1
>  30:     677193          0          0          0   IO-APIC-level  eth1
>  31:     923339          0          0          0   IO-APIC-level  eth0
>  49:      38985          0          0          0   IO-APIC-level  aic7xxx
>  50:         16          0          0          0   IO-APIC-level  aic7xxx
> NMI:          0          0          0          0
> LOC:    1385678    1385677    1385676    1385676
> ERR:          0
> MIS:          0

Regards,
Norman
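
In the /proc/interrupts listing above, every numbered device interrupt is counted on CPU0. As a rough way to total the per-CPU columns when comparing listings taken at different times, something like the following could be used. It is only a sketch: the function name irq_totals is hypothetical, and it assumes the usual layout of one header row of CPU names followed by rows that start with an IRQ number and a colon.

```shell
# Sum device interrupts per CPU from /proc/interrupts-style text on stdin.
irq_totals() {
    awk '
        NR == 1 { ncpu = NF; next }      # header row: one field per CPU
        $1 ~ /^[0-9]+:$/ {               # numbered IRQ rows only (skips NMI, LOC, ERR, MIS)
            for (i = 1; i <= ncpu; i++) total[i] += $(i + 1)
        }
        END { for (i = 1; i <= ncpu; i++) printf "CPU%d %d\n", i - 1, total[i] }
    '
}

# Usage:
#   irq_totals < /proc/interrupts
```

The summary rows (NMI, LOC, ERR, MIS) are deliberately left out so that the totals reflect device interrupts only.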
[Cooker] Re: Kernel and Samba
Hi,

>> My Samba box stalls after some usage, mapped drives disappear and
>> users can't write or read from drives. The stalls happen randomly.
>> I'm running 2.4.19-16mdksmp and Samba 2.2.7a-9.2mdk. May I ask is
>> this a kernel bug or Samba bug? Does anyone know a fix for it? I
>> checked the memory from BIOS, they didn't report any errors.
>
> BIOS memory check is (mostly) useless. Use memtest86 or similar.

Sure I will test it with memtest86 and report back. I have been
running LM9.0 with Samba on this box for 3/4 year now. The problem
only arose in the last 2 months, at random. I swapped in brand new
Crucial Micron ECC DDR266 SDRAM, but the problem still persists.

BTW, the BIOS memory check is quite extensive (Intel claims to scan it
block by block). It takes about 1 to 2 minutes for it to scan the
memory. Not sure how this compares to memtest86. I guess I will wait
until after hours before I can run memtest86.

>> /var/log/kernel/warnings:
>>
>> Oct 27 09:19:12 smbserver kernel: xfs_force_shutdown(md(9,5),0x8) called from line 1039 of file xfs_trans.c. Return address = 0xe08ae312
>> Oct 27 09:19:12 smbserver kernel: Corruption of in-memory data detected. Shutting down filesystem: md(9,5)
>> Oct 27 09:19:12 smbserver kernel: Please umount the filesystem, and rectify the problem(s)
>
> This seems to point quite strongly to either hardware (most likely
> memory) or kernel (xfs driver or md driver, it seems you are running
> software raid?)
>
> If the kernel has problems with a filesystem, there's nothing much
> samba can do about it ...

I'm using software RAID. Do you know if there are recent updates to
the Mandrake kernel that may fix bugs in XFS and md drivers?

Funny thing is that only Samba dies. SSH and others still work.

>> /var/log/kernel/errors:
>>
>> Oct 27 10:36:44 smbserver kernel: Unknown bridge resource 2: assuming transparent
>> Oct 27 10:36:44 smbserver kernel: PCI: Unable to handle 64-bit address space for
>> Oct 27 10:36:44 smbserver kernel: PCI: Unable to handle 64-bit address space for
>> Oct 27 10:36:44 smbserver kernel: Unknown bridge resource 2: assuming transparent
>> Oct 27 10:36:44 smbserver kernel: PCI: Device 00:1f.1 not available because of resource collisions
>
> You need to give some more information on the hardware on this
> machine, but something does not look right ... what's in
> /proc/interrupts ?

I'm using Intel SE7500WV2S Server Board. BIOS Version: 2.01 Build
0483. My /proc/interrupts are as follows. I have seen the boot screen
complain about resource collisions, but couldn't find the cause. I've
disabled all unnecessary ports in the BIOS (e.g., USB).

           CPU0       CPU1       CPU2       CPU3
  0:    1385866          0          0          0    IO-APIC-edge  timer
  1:          7          0          0          0    IO-APIC-edge  keyboard
  2:          0          0          0          0          XT-PIC  cascade
  8:          1          0          0          0    IO-APIC-edge  rtc
 12:        197          0          0          0    IO-APIC-edge  PS/2 Mouse
 15:          5          0          0          0    IO-APIC-edge  ide1
 30:     677193          0          0          0   IO-APIC-level  eth1
 31:     923339          0          0          0   IO-APIC-level  eth0
 49:      38985          0          0          0   IO-APIC-level  aic7xxx
 50:         16          0          0          0   IO-APIC-level  aic7xxx
NMI:          0          0          0          0
LOC:    1385678    1385677    1385676    1385676
ERR:          0
MIS:          0

>> /var/log/samba/log.winbindd:
>>
>> [2003/10/27 10:37:23, 0] nsswitch/winbindd.c:process_loop(626)
>>   process_loop: Invalid request size (1701996389) sent, should be (1304)
>
> Some winbind users have reported winbind in 2.2.8a works a bit better,
> you can find packages on the samba FTP mirrors for all supported
> releases (hmm, except ldap-enabled packages for 9.2 ... I must do
> this still ...).

Thanks. I will wait for your updates. BTW, http://ranger.dnsalias.com/
is one of my most-watched sites. Too bad Samba 3.0 had some nasty
bugs. Otherwise, I would love to upgrade and test it out.

Regards,
Norman
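
Since the stalls are described as random, it may help to list when the XFS forced shutdowns quoted above actually occurred and compare those times with the reports from users. The following is a minimal sketch, not a tested diagnostic: the function name xfs_shutdowns is hypothetical, and it assumes syslog-style lines like the ones quoted from /var/log/kernel/warnings, with month, day and time as the first three fields.

```shell
# Print the timestamp and call text of each xfs_force_shutdown event
# found in syslog-style text on stdin.
xfs_shutdowns() {
    awk '/xfs_force_shutdown\(/ {
        # Locate the field holding the call, wherever it sits on the line.
        for (i = 1; i <= NF; i++)
            if ($i ~ /^xfs_force_shutdown\(/) print $1, $2, $3, $i
    }'
}

# Usage (path as given in the message above):
#   xfs_shutdowns < /var/log/kernel/warnings
```

For the first warning line quoted above, this prints the date and time followed by xfs_force_shutdown(md(9,5),0x8), which also names the md device involved.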