Re: [Cooker] Re: Kernel and Samba

2003-10-29 Thread Buchan Milne

Norman Zhang wrote:
 Hi Buchan,


 My Samba box stalls after some usage, mapped drives disappear and
 users can't write or read from drives. The stalls happen randomly.
 I'm running 2.4.19-16mdksmp and Samba 2.2.7a-9.2mdk. May I ask is
 this a kernel bug or Samba bug? Does anyone know a fix for it? I
 checked the memory from BIOS, they didn't report any errors.

 BIOS memory check is (mostly) useless. Use memtest86 or similar.

 Sure I will test it with memtest86 and report back. I have been
 running LM9.0 with Samba on this box for 3/4 year now. The problem
 only arose in the last 2 months by random. I swapped brand new
 Crucial Micron ECC DDR266 SDRAM, but the problem still persists. BTW,
 the BIOS memory check is quite extensive (Intel claims to scan it
 block by block). It takes about 1 to 2 minutes for it to scan the
 memory. Not sure how this compares to memtest86. I guess I will wait
 after hours before I can run a memtest86.


 I ran memtest and found no error. Do you have other suggestions that I can
 further troubleshoot this? There are no cards plugged to the system. The
 system just runs software RAID. Thus it seems to be either XFS, md or
 samba bug.

I would guess XFS. You may want to try a more recent kernel? (Thomas
hinted that earlier kernels may have had some issues with XFS). But I
think I'm running on XFS on the only production Winbind box I have at
present (running 8.2 still!), with no problems. But the only box I have
running XFS with an smp kernel runs 9.1.

 Maybe I could try upgrading samba to 2.2.8a-2mdk from your web server.

I would prefer it if you used one of the Samba FTP mirrors; you can get
set up easily at http://plf.zarb.org/~nanardon/?minor=1 (choose a Samba
medium).

 Are there potential gotchas that I should watch out for?

Not that I know of. In fact, I have had reports that Squid authentication
via winbind works with these packages but not with the version you have.

Regards,
Buchan

-- 
|--Another happy Mandrake Club member--|
Buchan Milne                Mechanical Engineer, Network Manager
Cellphone * Work            +27 82 472 2231 * +27 21 8828820x202
Stellenbosch Automotive Engineering         http://www.cae.co.za
GPG Key                   http://ranger.dnsalias.com/bgmilne.asc
1024D/60D204A7 2919 E232 5610 A038 87B1 72D6 AC92 BA50 60D2 04A7




[Cooker] Re: Kernel and Samba

2003-10-29 Thread Norman Zhang
Hi Buchan,

 My Samba box stalls after some usage, mapped drives disappear and
 users can't write or read from drives. The stalls happen randomly.
 I'm running 2.4.19-16mdksmp and Samba 2.2.7a-9.2mdk. May I ask is
 this a kernel bug or Samba bug? Does anyone know a fix for it? I
 checked the memory from BIOS, they didn't report any errors.

 BIOS memory check is (mostly) useless. Use memtest86 or similar.

 I ran memtest and found no error. Do you have other suggestions that
 I can further troubleshoot this? There are no cards plugged to the
 system. The system just runs software RAID. Thus it seems to be
 either XFS, md or samba bug.

 I would guess XFS. You may want to try a more recent kernel? (Thomas
 hinted that earlier kernels may have had some issues with XFS). But I
 think I'm running on XFS on the only production Winbind box I have at
 present (running 8.2 still!), with no problems. But the only box I
 have running XFS with an smp kernel runs 9.1.

I did upgrade Samba to 2.2.8a-2mdk, but the problem still persists. I have now
upgraded to kernel-smp-2.4.19-35mdk-1-1mdk as per MDKSA-2003:074. I will report
back on my findings. BTW, how do I find out what XFS revision is included in
kernel-smp-2.4.19?

Regards,
Norman







Re: [Cooker] Re: Kernel and Samba

2003-10-29 Thread Thomas Backlund
Norman Zhang wrote in his message (sent Thursday 30 October 2003 06:08):
 Hi Buchan,

  My Samba box stalls after some usage, mapped drives disappear and
  users can't write or read from drives. The stalls happen randomly.
  I'm running 2.4.19-16mdksmp and Samba 2.2.7a-9.2mdk. May I ask is
  this a kernel bug or Samba bug? Does anyone know a fix for it? I
  checked the memory from BIOS, they didn't report any errors.
 
  BIOS memory check is (mostly) useless. Use memtest86 or similar.
 
  I ran memtest and found no error. Do you have other suggestions that
  I can further troubleshoot this? There are no cards plugged to the
  system. The system just runs software RAID. Thus it seems to be
  either XFS, md or samba bug.
 
  I would guess XFS. You may want to try a more recent kernel? (Thomas
  hinted that earlier kernels may have had some issues with XFS). But I
  think I'm running on XFS on the only production Winbind box I have at
  present (running 8.2 still!), with no problems. But the only box I
  have running XFS with an smp kernel runs 9.1.

 I did upgrade Samba to 2.2.8a-2mdk, but the problem still persists. I have
 now upgraded to kernel-smp-2.4.19-35mdk-1-1mdk as per MDKSA-2003:074. I will
 report back on my findings. BTW, how do I find out what XFS revision is
 included in kernel-smp-2.4.19?


# dmesg | grep -i xfs

-- 
Regards

Thomas




[Cooker] Re: Kernel and Samba

2003-10-28 Thread Norman Zhang
Hi Buchan,

 My Samba box stalls after some usage, mapped drives disappear and
 users can't write or read from drives. The stalls happen randomly.
 I'm running 2.4.19-16mdksmp and Samba 2.2.7a-9.2mdk. May I ask is
 this a kernel bug or Samba bug? Does anyone know a fix for it? I
 checked the memory from BIOS, they didn't report any errors.

 BIOS memory check is (mostly) useless. Use memtest86 or similar.

 Sure I will test it with memtest86 and report back. I have been
 running LM9.0 with Samba on this box for 3/4 year now. The problem
 only arose in the last 2 months by random. I swapped brand new
 Crucial Micron ECC DDR266 SDRAM, but the problem still persists. BTW,
 the BIOS memory check is quite extensive (Intel claims to scan it
 block by block). It takes about 1 to 2 minutes for it to scan the
 memory. Not sure how this compares to memtest86. I guess I will wait
 after hours before I can run a memtest86.

I ran memtest and found no error. Do you have other suggestions that I can
further troubleshoot this? There are no cards plugged to the system. The
system just runs software RAID. Thus it seems to be either XFS, md or samba
bug. Maybe I could try upgrading samba to 2.2.8a-2mdk from your web server.
Are there potential gotchas that I should watch out for?

 /var/log/kernel/warnings
 
 Oct 27 09:19:12 smbserver kernel: xfs_force_shutdown(md(9,5),0x8)
 called from line 1039 of file xfs_trans.c.  Return address =
 0xe08ae312
 Oct 27 09:19:12 smbserver kernel: Corruption of in-memory data
 detected. Shutting down filesystem: md(9,5)
 Oct 27 09:19:12 smbserver kernel: Please umount the filesystem, and
 rectify the problem(s)

 This seems to point quite strongly to either hardware (most likely
 memory) or kernel (xfs driver or md driver, it seems you are running
 software raid?) If the kernel has problems with a filesystem, there's
 nothing much samba can do about it ...

 I'm using software RAID. Do you know if there are recent updates to
 the Mandrake kernel that may fix bugs in XFS and md drivers? Funny
 thing is that only Samba dies. SSH and others still work.

 /var/log/kernel/errors
 --
 Oct 27 10:36:44 smbserver kernel: Unknown bridge resource 2:
 assuming transparent
 Oct 27 10:36:44 smbserver kernel: PCI: Unable to handle 64-bit
 address space for
 Oct 27 10:36:44 smbserver kernel: PCI: Unable to handle 64-bit
 address space for
 Oct 27 10:36:44 smbserver kernel: Unknown bridge resource 2:
 assuming transparent
 Oct 27 10:36:44 smbserver kernel: PCI: Device 00:1f.1 not available
 because of resource collisions

 You need to give some more information on the hardware on this
 machine, but something does not look right ... what's in
 /proc/interrupts ?

 I'm using Intel SE7500WV2S Server Board. BIOS Version: 2.01 Build
 0483. My /proc/interrupts are as follows. I have seen the boot screen
 complain about resource collisions, but couldn't find out the cause.
 I've disabled all unnecessary ports in the BIOS (e.g., USB).

            CPU0       CPU1       CPU2       CPU3
   0:    1385866          0          0          0    IO-APIC-edge  timer
   1:          7          0          0          0    IO-APIC-edge  keyboard
   2:          0          0          0          0          XT-PIC  cascade
   8:          1          0          0          0    IO-APIC-edge  rtc
  12:        197          0          0          0    IO-APIC-edge  PS/2 Mouse
  15:          5          0          0          0    IO-APIC-edge  ide1
  30:     677193          0          0          0   IO-APIC-level  eth1
  31:     923339          0          0          0   IO-APIC-level  eth0
  49:      38985          0          0          0   IO-APIC-level  aic7xxx
  50:         16          0          0          0   IO-APIC-level  aic7xxx
 NMI:          0          0          0          0
 LOC:    1385678    1385677    1385676    1385676
 ERR:          0
 MIS:          0
Regards,
Norman






[Cooker] Re: Kernel and Samba

2003-10-27 Thread Norman Zhang
Hi,

 My Samba box stalls after some usage, mapped drives disappear and
 users can't write or read from drives. The stalls happen randomly.
 I'm running 2.4.19-16mdksmp and Samba 2.2.7a-9.2mdk. May I ask is
 this a kernel bug or Samba bug? Does anyone know a fix for it? I
 checked the memory from BIOS, they didn't report any errors.

 BIOS memory check is (mostly) useless. Use memtest86 or similar.

Sure I will test it with memtest86 and report back. I have been running
LM9.0 with Samba on this box for 3/4 year now. The problem only arose in the
last 2 months by random. I swapped brand new Crucial Micron ECC DDR266
SDRAM, but the problem still persists. BTW, the BIOS memory check is quite
extensive (Intel claims to scan it block by block). It takes about 1 to 2
minutes for it to scan the memory. Not sure how this compares to memtest86.
I guess I will wait after hours before I can run a memtest86.

 /var/log/kernel/warnings
 
 Oct 27 09:19:12 smbserver kernel: xfs_force_shutdown(md(9,5),0x8)
 called from line 1039 of file xfs_trans.c.  Return address =
 0xe08ae312
 Oct 27 09:19:12 smbserver kernel: Corruption of in-memory data
 detected. Shutting down filesystem: md(9,5)
 Oct 27 09:19:12 smbserver kernel: Please umount the filesystem, and
 rectify the problem(s)

 This seems to point quite strongly to either hardware (most likely
 memory) or kernel (xfs driver or md driver, it seems you are running
 software raid?) If the kernel has problems with a filesystem, there's
 nothing much samba can do about it ...

I'm using software RAID. Do you know if there are recent updates to the
Mandrake kernel that may fix bugs in XFS and md drivers? Funny thing is that
only Samba dies. SSH and others still work.
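
For anyone wanting to check how often this happens, here is a minimal Python sketch that pulls the xfs_force_shutdown events out of a kernel log such as the /var/log/kernel/warnings excerpt quoted above. The function name and the regular expression are my own assumptions, modelled only on the message format shown in this thread; other kernel versions may word the message differently.

```python
import re

# Pattern modelled on the single message format quoted in this thread
# (an assumption; it is not taken from the kernel sources).
SHUTDOWN_RE = re.compile(
    r"xfs_force_shutdown\((?P<device>.+?),(?P<flags>0x[0-9a-fA-F]+)\)\s+"
    r"called from line (?P<line>\d+) of file (?P<file>\S+?)\.\s+"
    r"Return address = (?P<addr>0x[0-9a-fA-F]+)"
)


def find_xfs_shutdowns(log_text):
    """Return one dict per xfs_force_shutdown message found in log_text."""
    # Mail clients wrap long log lines, so collapse all whitespace runs first.
    flat = " ".join(log_text.split())
    return [m.groupdict() for m in SHUTDOWN_RE.finditer(flat)]


# The message exactly as it was wrapped in the mail above.
SAMPLE = """\
 Oct 27 09:19:12 smbserver kernel: xfs_force_shutdown(md(9,5),0x8)
 called from line 1039 of file xfs_trans.c.  Return address =
 0xe08ae312
"""

for event in find_xfs_shutdowns(SAMPLE):
    print(event["device"], event["flags"], event["file"], event["line"])
```

To run it against the real log, pass open("/var/log/kernel/warnings").read() instead of SAMPLE.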

 /var/log/kernel/errors
 --
 Oct 27 10:36:44 smbserver kernel: Unknown bridge resource 2: assuming
 transparent
 Oct 27 10:36:44 smbserver kernel: PCI: Unable to handle 64-bit
 address space for
 Oct 27 10:36:44 smbserver kernel: PCI: Unable to handle 64-bit
 address space for
 Oct 27 10:36:44 smbserver kernel: Unknown bridge resource 2: assuming
 transparent
 Oct 27 10:36:44 smbserver kernel: PCI: Device 00:1f.1 not available
 because of resource collisions

 You need to give some more information on the hardware on this
 machine, but something does not look right ... what's in
 /proc/interrupts ?

I'm using Intel SE7500WV2S Server Board. BIOS Version: 2.01 Build 0483. My
/proc/interrupts are as follows. I have seen the boot screen complain about
resource collisions, but couldn't find out the cause. I've disabled all
unnecessary ports in the BIOS (e.g., USB).

           CPU0       CPU1       CPU2       CPU3
  0:    1385866          0          0          0    IO-APIC-edge  timer
  1:          7          0          0          0    IO-APIC-edge  keyboard
  2:          0          0          0          0          XT-PIC  cascade
  8:          1          0          0          0    IO-APIC-edge  rtc
 12:        197          0          0          0    IO-APIC-edge  PS/2 Mouse
 15:          5          0          0          0    IO-APIC-edge  ide1
 30:     677193          0          0          0   IO-APIC-level  eth1
 31:     923339          0          0          0   IO-APIC-level  eth0
 49:      38985          0          0          0   IO-APIC-level  aic7xxx
 50:         16          0          0          0   IO-APIC-level  aic7xxx
NMI:          0          0          0          0
LOC:    1385678    1385677    1385676    1385676
ERR:          0
MIS:          0
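
If it helps to compare this table between boots, the following Python sketch parses /proc/interrupts-style text and sums the counts per CPU for the numbered IRQs. The function names are hypothetical, and the parser assumes only the layout shown above (a CPU header line, then "label: counts... description"); the sample rows are copied from my table.

```python
def parse_interrupts(text):
    """Parse /proc/interrupts-style text into (cpu_names, rows).

    rows maps each label ("0", "12", "NMI", ...) to (counts, description).
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    cpus = lines[0].split()
    rows = {}
    for ln in lines[1:]:
        label, _, rest = ln.partition(":")
        fields = rest.split()
        counts = []
        # Take at most one numeric field per CPU; the rest is the description.
        for field in fields:
            if len(counts) < len(cpus) and field.isdigit():
                counts.append(int(field))
            else:
                break
        rows[label.strip()] = (counts, " ".join(fields[len(counts):]))
    return cpus, rows


def per_cpu_totals(cpus, rows):
    """Sum the counts of the numbered IRQ lines for each CPU."""
    totals = [0] * len(cpus)
    for label, (counts, _desc) in rows.items():
        if label.isdigit() and len(counts) == len(cpus):
            for i, count in enumerate(counts):
                totals[i] += count
    return totals


# A few rows copied from the table above.
SAMPLE = """\
           CPU0       CPU1       CPU2       CPU3
  0:    1385866          0          0          0    IO-APIC-edge  timer
 12:        197          0          0          0    IO-APIC-edge  PS/2 Mouse
 30:     677193          0          0          0   IO-APIC-level  eth1
 31:     923339          0          0          0   IO-APIC-level  eth0
ERR:          0
"""

cpus, rows = parse_interrupts(SAMPLE)
print(dict(zip(cpus, per_cpu_totals(cpus, rows))))
```

On a live system the same functions can be fed open("/proc/interrupts").read().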

 /var/log/samba/log.winbindd
 ---
 [2003/10/27 10:37:23, 0] nsswitch/winbindd.c:process_loop(626)
   process_loop: Invalid request size (1701996389) sent, should be
 (1304)

 Some winbind users have reported winbind in 2.2.8a works a bit
 better, you can find packages on the samba FTP mirrors for all
 supported releases (hmm, except ldap-enabled packages for 9.2 ... I
 must do this still ...).
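
One quick sanity check on the "Invalid request size (1701996389)" line is to look at that number as four raw bytes rather than as a length. The Python sketch below does only that; the helper names are my own, and the little-endian byte order is an assumption based on this being an x86 box.

```python
import struct


def size_as_bytes(value):
    """Return the four bytes of a 32-bit value in little-endian (x86) order."""
    return struct.pack("<I", value)


def looks_like_text(raw):
    """True if every byte is a printable ASCII character."""
    return all(0x20 <= b < 0x7F for b in raw)


raw = size_as_bytes(1701996389)
print(raw, looks_like_text(raw))  # prints: b'egre' True
```

The four bytes are all printable ASCII, which hints that something other than a well-formed winbind request reached winbindd. That is only a hint and does not by itself identify the sender or the cause.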

Thanks. I will wait for your updates. BTW, http://ranger.dnsalias.com/ is
one of my most watched sites. Too bad Samba 3.0 had some nasty bugs.
Otherwise, I would love to upgrade and test it out.

Regards,
Norman