date:20050809

Re: bge watchdog errors in -STABLE

2005-08-09 Thread Uzi Klein



Brad Miele wrote:

Hi,

Last week, I cvsupped two HP ProLiant DL380 G4 machines to the most 
recent stable, which replaced the bge driver from mid-May with the one 
committed on or around June 24. Since updating, my cards (Broadcom 
BCM5704C) have been throwing occasional watchdog timeout errors on their 
shared 1000basetx net. These errors do not currently seem to be 
affecting much, and there have been only 6 since last Friday, but I 
would like to clear them up.


I have been seeing the same issue on exactly the same hardware ever since 
5.4-RELEASE-p3, with no real ill effects.





I noted that the current version of the bge driver had code backed out 
from the previous one, and an option BGE_FAKE_AUTONEG was added. I 
have tried enabling this option, as it seems to pertain to this chipset, 
but the errors are still occurring. I am thinking of switching the cards 
from autoselect to explicit 1000basetx, with one of them set to link0.


Please confirm that this is a reasonable next step and let me know any 
other information that will be useful.


Regards,

Brad

  Brad Miele
  Technology Director
  IPNStock
  (866) 476-7862 x902
  [EMAIL PROTECTED]



--
Uzi Klein
Software Development Executive
B.M.B.Y Software Systems Ltd.



Re: ad10: WARNING - READ_DMA UDMA ICRC error (retrying request) LBA=11441599

2005-08-09 Thread O. Hartmann


Mike Tancsa wrote:

At 08:25 PM 08/08/2005, O. Hartmann wrote:


Hello.

My box is an ASUS A8N-SLI Deluxe based AMD64 machine running FreeBSD 
6.0-BETA2 (see dmesg).
One of my SATA disks, the SAMSUNG SP2004C, seems to show errors during 
operation (and also showed them under 5.4-RELEASE-p3).

Sometimes I get this error:
ad10: WARNING - READ_DMA UDMA ICRC error (retrying request) LBA=11441599
while the machine still keeps working.
Other days the box crashes completely.

Is this an operating system bug, or is this message evidence of 
defective hardware?



You can probably confirm a hardware issue with the smartmon tools.  
(/usr/ports/sysutils/smartmontools).


It was quite handy the other day for us to narrow down a problem between 
a drive tray and the actual drive.  We started to see


Aug  3 02:02:49 verify1 kernel: ad0: TIMEOUT - READ_DMA retrying (2 retries left) LBA=391423
Aug  3 02:03:00 verify1 kernel: ad0: TIMEOUT - READ_DMA retrying (2 retries left) LBA=2304319
Aug  3 02:03:10 verify1 kernel: ad0: TIMEOUT - READ_DMA retrying (2 retries left) LBA=2312927
Aug  3 02:03:17 verify1 kernel: ad0: TIMEOUT - READ_DMA retrying (2 retries left) LBA=2308639
Aug  3 02:03:26 verify1 kernel: ad0: TIMEOUT - READ_DMA retrying (2 retries left) LBA=2309855
Aug  3 02:03:37 verify1 kernel: ad0: TIMEOUT - READ_DMA retrying (2 retries left) LBA=2348359
Aug  4 12:12:37 verify1 kernel: ad0: TIMEOUT - READ_DMA retrying (2 retries left) LBA=1528639
Aug  4 12:13:04 verify1 kernel: ad0: TIMEOUT - READ_DMA retrying (2 retries left) LBA=1530031
Aug  4 12:13:04 verify1 kernel: ad0: TIMEOUT - READ_DMA retrying (1 retry left) LBA=1528639

Aug  4 12:13:04 verify1 kernel: ad0: FAILURE - READ_DMA timed out
Aug  4 12:13:04 verify1 kernel: spec_getpages:(ad0s1a) I/O read failure: (error=5) bp 0xd630b4fc vp 0xc2640d68


Yet when we read the actual error info off the drive via smartctl -a 
ad0, it was clean.  That pointed to the drive tray, which we swapped, and 
all was well.  In other situations, however, the SMART info will often 
tell you if the drive is starting to fail.  It's not 100% reliable, but 
since we started using it, it has generally given us some sort of heads-up 
as to whether or not a drive is in trouble.
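
When comparing kernel messages against what the drive itself reports, it can help to tally the ATA events per device first. The following is a minimal sketch, not part of any FreeBSD tool: the function name and the regular expression are illustrative assumptions, written against the message shapes quoted in this thread.

```python
import re
from collections import Counter

# Matches kernel messages shaped like the ones quoted in this thread, e.g.
#   "... kernel: ad0: TIMEOUT - READ_DMA retrying (2 retries left) LBA=391423"
#   "... kernel: ad0: FAILURE - READ_DMA timed out"
ATA_EVENT = re.compile(r"\b(ad\d+): (TIMEOUT|WARNING|FAILURE) - (\S+)")

def tally_ata_events(lines):
    """Count (device, severity, command) triples found in syslog lines."""
    counts = Counter()
    for line in lines:
        match = ATA_EVENT.search(line)
        if match:
            counts[match.groups()] += 1
    return counts

sample = [
    "Aug  3 02:02:49 verify1 kernel: ad0: TIMEOUT - READ_DMA retrying (2 retries left) LBA=391423",
    "Aug  4 12:13:04 verify1 kernel: ad0: TIMEOUT - READ_DMA retrying (1 retry left) LBA=1528639",
    "Aug  4 12:13:04 verify1 kernel: ad0: FAILURE - READ_DMA timed out",
]
for (device, severity, command), n in sorted(tally_ata_events(sample).items()):
    print(device, severity, command, n)
```

A device that logs many timeouts while its SMART error log stays clean is the pattern that pointed at the tray rather than the drive here.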



---Mike


Dear Mike.
Thanks a lot for this info.
I will use this tool and try to report what I find.

I also use trays for my drives (like I did with SCSI and SCA2 on our 
servers at the lab). Maybe this could be an issue.


Oliver
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to [EMAIL PROTECTED]

Xorg, Radeon X800PRO problem.

2005-08-09 Thread Yann Golanski

I am stuck on getting this graphics card working with stable.  

Attached are xorg.conf and Xorg.log.0.  The versions are:

# uname -a
FreeBSD rubicon.york.ac.uk 5.4-STABLE FreeBSD 5.4-STABLE #0: Mon Aug  8
16:18:46 BST 2005 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/SMP  i386

# pkg_info -Ia | grep xorg
xorg-6.8.2  X.Org distribution metaport

Any help would be most welcome.

-- 
[EMAIL PROTECTED]  -=*=-  www.kierun.org
PGP:   009D 7287 C4A7 FD4F 1680  06E4 F751 7006 9DE2 6318
Section ServerLayout
Identifier X.org Configured
Screen  0  Screen0 0 0
InputDevice Mouse0 CorePointer
InputDevice Keyboard0 CoreKeyboard
EndSection

Section Files
RgbPath  /usr/X11R6/lib/X11/rgb
ModulePath   /usr/X11R6/lib/modules
FontPath /usr/X11R6/lib/X11/fonts/misc/
FontPath /usr/X11R6/lib/X11/fonts/TTF/
FontPath /usr/X11R6/lib/X11/fonts/Speedo/
FontPath /usr/X11R6/lib/X11/fonts/Type1/
FontPath /usr/X11R6/lib/X11/fonts/CID/
FontPath /usr/X11R6/lib/X11/fonts/75dpi/
FontPath /usr/X11R6/lib/X11/fonts/100dpi/
FontPath /usr/X11R6/lib/X11/fonts/webfonts/
FontPath /usr/X11R6/lib/X11/fonts/freefont
FontPath /usr/X11R6/lib/X11/fonts/local
FontPath /usr/home/yann/.fonts/proggy
EndSection

Section Module
Load  dbe
Load  dri
load  extmod
Load  glx
Load  record
Load  xtrap
Load  freetype
Load  speedo
Load  type1
EndSection

Section InputDevice
Identifier  Keyboard0
Driver  keyboard
Option  XkbLayout gb
EndSection

Section InputDevice
Identifier  Mouse0
Driver  mouse
Option  Protocol auto
Option  Device /dev/sysmouse
Option  ZAxisMapping 4 5
option  Buttons 6
EndSection

Section Monitor
Identifier   Monitor0
VendorName   Sony
ModelName    LCD
HorizSync    28.0 - 80.0
VertRefresh  48.0 - 75.0
EndSection

Section Device
Identifier  Card0
Driver  ati
#VendorName  ATI Technologies Inc
#BoardName   Radeon X800PRO
#BusID   PCI:5:0:0
EndSection

Section Screen
Identifier Screen0
Device Card0
Monitor    Monitor0
DefaultDepth 8
#SubSection Display
#   Depth 1
#EndSubSection
#SubSection Display
#   Depth 4
#EndSubSection
SubSection Display
Depth 8
Modes   800x600
EndSubSection
SubSection Display
Depth 15
Modes   1152x764
EndSubSection
SubSection Display
Depth 16
Modes 1152x764
EndSubSection
SubSection Display
Depth 24
Modes 1280x1024
EndSubSection
EndSection

Release Date: 18 December 2003
X Protocol Version 11, Revision 0, Release 6.7
Build Operating System: FreeBSD 5.3 i386 [ELF] 
Current Operating System: FreeBSD rubicon.york.ac.uk 5.4-STABLE FreeBSD 5.4-STABLE #0: Mon Aug  8 16:18:46 BST 2005 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/SMP i386
Build Date: 16 October 2004
Before reporting problems, check http://wiki.X.Org
to make sure that you have the latest version.
Module Loader present
Markers: (--) probed, (**) from config file, (==) default setting,
(++) from command line, (!!) notice, (II) informational,
(WW) warning, (EE) error, (NI) not implemented, (??) unknown.
(==) Log file: /var/log/Xorg.0.log, Time: Thu Sep  8 09:49:51 2005
(==) Using config file: /etc/X11/xorg.conf
(==) ServerLayout X.org Configured
(**) |--Screen Screen0 (0)
(**) |   |--Monitor Monitor0
(**) |   |--Device Card0
(**) |--Input Device Mouse0
(**) |--Input Device Keyboard0
(**) Option XkbLayout gb
(**) XKB: layout: gb
(==) Keyboard: CustomKeycode disabled
(WW) `fonts.dir' not found (or not valid) in /usr/X11R6/lib/X11/fonts/Speedo/.
Entry deleted from font path.
(Run 'mkfontdir' on /usr/X11R6/lib/X11/fonts/Speedo/).
(WW) `fonts.dir' not found (or not valid) in /usr/X11R6/lib/X11/fonts/CID/.
Entry deleted from font path.
(Run 'mkfontdir' on /usr/X11R6/lib/X11/fonts/CID/).
(WW) The directory /usr/X11R6/lib/X11/fonts/webfonts/ does not exist.
Entry deleted from font path.
(WW) The directory /usr/X11R6/lib/X11/fonts/freefont does not exist.
Entry deleted from font path.
(WW) `fonts.dir' not found (or not valid) in /usr/X11R6/lib/X11/fonts/local.
Entry deleted from font path.
(Run 'mkfontdir' on /usr/X11R6/lib/X11/fonts/local).
(WW) The directory /usr/home/yann/.fonts/proggy does not exist.
Entry deleted from font path.
(**) FontPath set to

Re: 4-5 libmilter.a install failure

2005-08-09 Thread Peter Jeremy

On Mon, 2005-Aug-08 14:25:35 -1000, Randy Bush wrote:
my error.  i hacked /etc/make.conf between build and install.
so i can use ed to unhack it, but when i try to install again

--
 Making hierarchy
--
cd /usr/src; /usr/obj/usr/src/make.i386/make -f Makefile.inc1 hierarchy
cd /usr/src/etc;/usr/obj/usr/src/make.i386/make distrib-dirs
mtree -eU  -f /usr/src/etc/mtree/BSD.root.dist -p /
/usr/libexec/ld-elf.so.1: Shared object libmd.so.2 not found, required by 
mtree

Apart from the other suggested solutions:  At the beginning of the
installworld, make stashes a selection of utilities (including
mtree) into a temporary directory (${INSTALLTMP} in Makefile.inc1).
The exact name varies but is something like /tmp/install.
If you haven't lost the directory from the first installworld, you can
always copy mtree (and any other over-written utilities) back where
they started.

-- 
Peter Jeremy

Re: 5-STABLE cpufreq hotter than est from ports

2005-08-09 Thread Bruno Ducrot

On Tue, Aug 02, 2005 at 12:22:02AM +0200, Tijl Coosemans wrote:
 A couple days ago I updated my system and was excited to see cpufreq
 and powerd in 5-stable. Since then however I noticed that my laptop
 temperature is about 5°C higher than with est and estctrl. I found that
 cpufreq when setting 200MHz for example set the absolute frequency to
 1600MHz (max for this laptop) and the relative frequency (p4tcc) to
 12.5% instead of using a more power conserving setting like 800MHz/25%.
 
 The problem is that cpufreq_expand_set() (sys/kern/kern_cpu.c)
 traverses freq levels from high to low when adding relative levels and
 skips duplicates. When it wants to add 800MHz/25% it sees this setting
 as a duplicate of 1600MHz/12.5% it has found before. This can be fixed
 by letting cpufreq_expand_set() traverse freq levels in reverse order
 (and still skipping duplicates). Then each frequency level has the
 lowest possible absolute setting. This is a one line change in
 sys/kern/kern_cpu.c (line 653).

It's a well-known bug.  Someday I think I will have enough time to fix
that one, if Nate doesn't bite me.
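
The ordering effect described in the quoted paragraph can be reproduced with a small model. This is only a sketch of the idea, not the code in sys/kern/kern_cpu.c: the function name, the set of absolute levels, and the assumption that p4tcc throttles in eight 12.5% steps are illustrative choices.

```python
def expand_levels(absolute_mhz, relative_eighths=range(8, 0, -1)):
    """Map each effective frequency to the first (absolute, relative %) pair found.

    Absolute levels are visited in the order given, and an effective
    frequency that is already in the table is skipped as a duplicate.
    """
    table = {}
    for absolute in absolute_mhz:
        for eighths in relative_eighths:
            effective = absolute * eighths // 8   # throttling in 12.5% steps
            if effective not in table:            # duplicate: keep the first hit
                table[effective] = (absolute, eighths * 12.5)
    return table

ABSOLUTE = [1600, 1400, 800, 600]   # assumed absolute levels in MHz

high_to_low = expand_levels(ABSOLUTE)
low_to_high = expand_levels(reversed(ABSOLUTE))
print(high_to_low[200])   # (1600, 12.5)
print(low_to_high[200])   # (800, 25.0)
```

Visiting the absolute levels from low to high makes the 200MHz entry resolve to 800MHz/25% instead of 1600MHz/12.5%, which is the behaviour the proposed one-line change is after.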

 With this patch temperature is almost as low as with est again (only
 1°C hotter). However, there are still such levels like 1400/12.5
 (175MHz) which are lower than let's say 600/37.5 (225MHz), but consume
 a lot more power. On my laptop this problem doesn't really occur
 because of the way powerd works, only the absolute levels 1600, 800 and
 600 are ever used. I can imagine somebody with a 1700MHz cpu not being
 so lucky though. So, I've worked out a patch (attached) that makes sure
 that a lower frequency level has at most the same absolute setting
 (preferably less of course). This eliminates quite a few levels so
 somebody with a better knowledge of cpufreq should check if this patch
 really does something good. This is the first time I've taken a look at
 FreeBSD source code by the way.

It has been on my to-do list for so long that I must admit I am to
blame for not having fixed it already.
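
The second idea in the quoted text, that a lower frequency level should never use a higher absolute setting than a faster level, can be sketched as a filter. The attached patch is not reproduced in this archive, so the code below is an independent illustration under that stated rule; the function name and the sample levels are assumptions.

```python
def drop_wasteful_levels(levels):
    """Keep only levels whose absolute setting does not rise as frequency falls.

    `levels` holds (effective_mhz, absolute_mhz) pairs.
    """
    kept = []
    lowest_absolute = None
    for effective, absolute in sorted(levels, reverse=True):
        if lowest_absolute is None or absolute <= lowest_absolute:
            kept.append((effective, absolute))
            lowest_absolute = absolute
    return kept

# Hypothetical table containing the 175MHz-at-1400MHz case from the text.
levels = [(1600, 1600), (800, 800), (600, 600), (225, 600), (175, 1400), (150, 600)]
print(drop_wasteful_levels(levels))
```

With this sample the 1400/12.5% entry (175MHz) is dropped, because the faster 600/37.5% entry (225MHz) already runs at a lower absolute frequency.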

 Also, somewhat related, the p4tcc driver doesn't recognise
 acpi_throttle, which means that when you load the cpufreq module after
 booting, the freq levels are messed up. I'm not sure what the best
 solution for this is. Let p4tcc detect acpi_throttle and don't attach
 if it's present (like acpi_throttle does now if it finds p4tcc) or
 detach it before attaching? Or maybe p4tcc and acpi_throttle should be
 merged into one driver?
 
 Finally, is the kernel config option CPU_ENABLE_TCC still relevant?
 Because it's still listed in NOTES.

Right.  I forgot to kill that option.

-- 
Bruno Ducrot

--  Which is worse:  ignorance or apathy?
--  Don't know.  Don't care.

Firewire oddity..

2005-08-09 Thread Daniel O'Connor

I got the following while trying to rsync a large number of files over 
the network to a firewire HD (FAT32 FS) on a 6.0-BETA1 system.

Aug  9 12:48:17 inchoate kernel: firewire0: split transaction timeout dst=0xffc0 tl=0x36 state=3
Aug  9 12:48:17 inchoate kernel: sbp_orb_pointer_callback: xfer-resp = 60
Aug  9 12:48:17 inchoate kernel: fw_rcv: unknown response WRES(2) src=0xffc0 tl=0x36 rt=1 data=0x84513864
Aug  9 12:48:17 inchoate kernel: try ad-hoc work around!!
Aug  9 12:48:17 inchoate kernel: no use...
Aug  9 12:48:19 inchoate kernel: firewire0: split transaction timeout dst=0xffc0 tl=0x8 state=3
Aug  9 12:48:19 inchoate kernel: sbp_orb_pointer_callback: xfer-resp = 60
Aug  9 12:48:19 inchoate kernel: fw_rcv: unknown response WRES(2) src=0xffc0 tl=0x8 rt=1 data=0x84518aad
Aug  9 12:48:19 inchoate kernel: try ad-hoc work around!!
Aug  9 12:48:19 inchoate kernel: no use...

da0 at sbp0 bus 0 target 0 lun 0
da0: Oxford 911D 0037 Fixed Simplified Direct Access SCSI-4 device
da0: 50.000MB/s transfers
da0: 114473MB (234441648 512 byte sectors: 255H 63S/T 14593C)

My interactive performance has gone down the tube too :(

here is a systat -vmstat snapshot
2 usersLoad 13.14  7.58  4.75  Aug  9 16:06

Mem:KBREALVIRTUAL VN PAGER  SWAP PAGER
Tot   Share  TotShareFree in  out in  out
Act  273168   43380   749572   105288   20760 count1
All  497356   51660 17265484   156920 pages2
  zfod   Interrupts
Proc:r  p  d  s  wCsw  Trp  Sys  Int  Sof  Fltcow1514 total
 1 1134  1505   44 2374 2005  2401  69176 wire   1007 0: clk
   273368 act 1: atkb
92.4%Sys   1.5%Intr  2.3%User  0.0%Nice  3.8%Idl   137800 inact   3: sio1
||||||||||  19780 cache   4: sio0
==+  980 free7: ppc0
  daefr   129 8: rtc
Namei Name-cacheDir-cache prcfr   9: pcm0
Calls hits% hits% react   350 11: nvi
 2276 2189   96   pdwak   12: psm
  pdpgs   13: npx
Disks   ad0   da0   cd0 pass0 pass1 1 intrn28 14: ata
KB/t   4.07  4.00  0.00  0.00  0.00 61024 buf 15: ata
tps  2870 0 0 0   268 dirtybuf
MB/s   0.11  0.27  0.00  0.00  0.00 35360 desiredvnodes
% busy   6932 0 0 0 10098 numvnodes
 6637 freevnodes


IRQ 11 is my super fun IRQ with 90% of my hardware attached..

I am copying about 17000 45k files. The controller is the one built into my
Dell Inspiron 8600 -
Aug  9 11:29:24 inchoate kernel: fwohci0: 1394 Open Host Controller Interface mem 0xfaffd800-0xfaffdfff,0xfaff8000-0xfaffbfff irq 11 at device 1.1 on pci2
Aug  9 11:29:24 inchoate kernel: fwohci0: [GIANT-LOCKED]
Aug  9 11:29:24 inchoate kernel: fwohci0: OHCI version 1.10 (ROM=0)
Aug  9 11:29:24 inchoate kernel: fwohci0: No. of Isochronous channels is 4.
Aug  9 11:29:24 inchoate kernel: fwohci0: EUI64 43:4f:c0:00:08:17:a4:38
Aug  9 11:29:24 inchoate kernel: fwohci0: Phy 1394a available S400, 2 ports.
Aug  9 11:29:24 inchoate kernel: fwohci0: Link S400, max_rec 2048 bytes.

When  I tried this earlier with 65000 much larger files I managed to get 
rsync to use 125% of the CPU (no HT/SMP here)..

-- 
Daniel O'Connor software and network engineer
for Genesis Software - http://www.gsoft.com.au
The nice thing about standards is that there
are so many of them to choose from.
  -- Andrew Tanenbaum
GPG Fingerprint - 5596 B766 97C0 0E94 4347 295E E593 DC20 7B3F CE8C



Summary about nve network interface driver

2005-08-09 Thread Kövesdán Gábor


Hi,

I've experienced serious errors with the nve driver, and I've seen that 
quite a few people would like to use it but run into the same errors. 
There are open PRs, but unfortunately nobody has volunteered to fix this 
issue: the PRs still have the default Responsible field. 
Maxime Henrion has tried to fix the crashes, however, and asked me 
to test his commits in HEAD, but unfortunately things haven't become 
better. I've also seen that Quinton Dolan, the original developer of 
this driver, has committed something to HEAD; he was the last 
committer to that file when I checked. Unfortunately, at the moment 
both RELENG_6 and HEAD have this buggy version; I've checked both. The 
problem is complicated, and the errors people experience are quite different:


- Sometimes it crashes with an attempted use of a free mbuf. See: 
  http://www.freebsd.org/cgi/query-pr.cgi?pr=83943
  I've experienced and reported this issue in RELENG_6.
  There is another report by Dmitry Selin: 
  http://www.freebsd.org/cgi/query-pr.cgi?pr=amd64/82555

- I've also experienced a general protection fault, and I suspect it 
  is related to the nve driver. See: 
  http://www.freebsd.org/cgi/query-pr.cgi?pr=84133
  I've experienced this issue in RELENG_6 and in HEAD.

- Some people have device timeouts with nve. See: 
  http://www.freebsd.org/cgi/query-pr.cgi?pr=amd64/84027
  I've also experienced this but haven't reported it; I wanted to get the 
  crashes fixed first.



I'm very interested in getting this driver fixed, because I would like to 
use FreeBSD 6.0 as my desktop OS, but unfortunately I don't have the 
knowledge to fix it myself. If somebody would volunteer to take care of it, 
I'd do the testing in RELENG_6 or HEAD, or whatever else I am told to 
do. It would be really nice if it got fixed by the time 6.0-RELEASE is 
released.


Cheers,

Gábor Kövesdán

Re: 5-STABLE cpufreq hotter than est from ports

2005-08-09 Thread Nate Lawson


Tijl Coosemans wrote:

A couple days ago I updated my system and was excited to see cpufreq
and powerd in 5-stable. Since then however I noticed that my laptop
temperature is about 5°C higher than with est and estctrl. I found that
cpufreq when setting 200MHz for example set the absolute frequency to
1600MHz (max for this laptop) and the relative frequency (p4tcc) to
12.5% instead of using a more power conserving setting like 800MHz/25%.

The problem is that cpufreq_expand_set() (sys/kern/kern_cpu.c)
traverses freq levels from high to low when adding relative levels and
skips duplicates. When it wants to add 800MHz/25% it sees this setting
as a duplicate of 1600MHz/12.5% it has found before. This can be fixed
by letting cpufreq_expand_set() traverse freq levels in reverse order
(and still skipping duplicates). Then each frequency level has the
lowest possible absolute setting. This is a one line change in
sys/kern/kern_cpu.c (line 653).


You have some valid issues but I need some time to analyze your patch to 
be sure it doesn't introduce new problems.  There may be some issues 
with traversing the list backwards.



Also, somewhat related, the p4tcc driver doesn't recognise
acpi_throttle, which means that when you load the cpufreq module after
booting, the freq levels are messed up. I'm not sure what the best
solution for this is. Let p4tcc detect acpi_throttle and don't attach
if it's present (like acpi_throttle does now if it finds p4tcc) or
detach it before attaching? Or maybe p4tcc and acpi_throttle should be
merged into one driver?


cpufreq is not set up for loading after boot currently.  It must be 
loaded at boot.  There are architectural issues that need to be solved 
to make this happen, namely real arbitration between loaded drivers that 
support the same feature through different mechanisms.  p4tcc and 
acpi_throttle on some architectures are such a combo and need special 
attention.



Finally, is the kernel config option CPU_ENABLE_TCC still relevant?
Because it's still listed in NOTES.


The old option should be removed.

I'll try to review this patch and commit it sometime soon.

--
Nate

Re: Xorg, Radeon X800PRO problem.

2005-08-09 Thread Matthias Buelow

Section Device
Identifier  Card0
Driver  ati
#VendorName  ATI Technologies Inc
#BoardName   Radeon X800PRO
#BusID   PCI:5:0:0
EndSection
...
(--) Assigning device section with no busID to primary device
(WW) RADEON: No matching Device section for instance (BusID PCI:5:0:1) found
(EE) No devices detected.

Have you tried specifying that BusID as in the commented out line
above (only PCI:5:0:1 instead of PCI:5:0:0)?

mkb.

Re: Xorg, Radeon X800PRO problem.

2005-08-09 Thread Matthias Buelow

Yann Golanski wrote:

Yes, it gives me the same error.  The warning changes the PCI slot to
PCI:5:0:0...

Anyway, I used an older card.  It works.  *shrugs*

Odd.. I've got an X800SE and it works without problems; don't think
there's so much of a difference concerning the Xorg driver with the
two cards..

...

Errm! Upon looking at your original mail again.. you're running
Xorg 6.7? I don't think it'll work with that version. I had to
apply patches back then to get 6.7 to work with my card. Try upgrading
to the current version that is in ports (6.8.2 iirc).

mkb.

Re: kernel panic on 5.4-RELEASE-p6

2005-08-09 Thread Kris Kennaway

On Sun, Aug 07, 2005 at 06:51:09PM +0200, Petr Holub wrote:
 Hi all,
 
 I've encountered the following kernel panic on 5.4-RELEASE-p6. However,
 as the machine is production and not development one, I don't have
 debugger analysis. The panic might *theoretically* be due to problems
 accessing faulty CD-R media in the ATAPI DVD/CD-RW drive (that was the
 only unusual thing which happened before the panic). Kernel config
 and dmesg are attached below, the machine is IBM T41p laptop.
 
 Fatal trap 12: page fault while in kernel mode
 fault virtual address   = 0x300f0
 fault code  = supervisor read, page not present
 instruction pointer = 0x8:0x300f0
 stack pointer   = 0x10:0xe67cdb1c
 frame pointer   = 0x10:0xe67cdb34
 code segment= base 0x0, limit 0xf, type 0x1b
 = DPL 0, pres 1, def32 1, gran 1
 processor eflags= interrupt enabled, resume, IOPL = 0
 current process = 1337 (vim)
 trap number = 12
 panic: page fault
 Uptime: 14m43s.

You need to follow the instructions in the chapter on kernel debugging
in the developers handbook in order for anyone to be able to begin
investigating this.

Kris



Re: Panic on FreeBSD 6.0BETA1

2005-08-09 Thread Kris Kennaway

On Wed, Aug 03, 2005 at 08:47:43AM -0400, Derrick Edwards wrote:
 Hi all,
 I decided to try and help with testing 6-BETA1. After updating sources and 
 recompiling, I get the following during boot. I tried booting without 
 hyperthreading enabled in the BIOS and I still get the same panic.
 
 Fatal trap 12: page fault while in kernel mode
 cpuid = 0; apic id =00
 fault virtual address = 0x480008
 fault code = supervisor read, page not present
 instruction pointer = 0x20:0xc06923cc
 stack pointer = 0x28:0xc10208ec
 
 Code segment = base 0x0, limit 0xf, type 0x1b
              = DPL 0, pres 1, def32 1, gran 1
 
 processor eflags = interrupt enabled, resume, IOPL = 0
 current process = 0 (swapper)
 [thread pid 0 tid 0]
 Stopped at strlen+0x8: cmpb $0,0(%edx)
 
 I left all the debugging features in the current; however, I am not sure 
 exactly how to trace this problem. If someone could point me to any doc that 
 would allow me to provide more information, that would be great. I updated my 
 pen 400MHZ using the same procedures and all went well. Please help

See the chapter on kernel debugging in the developers handbook.

Kris

P.S. I need to make a macro to type the answer to this FAQ. 




RE: kernel panic on 5.4-RELEASE-p6

2005-08-09 Thread Petr Holub

 You need to follow the instructions in the chapter on kernel debugging
 in the developers handbook in order for anyone to be able to begin
 investigating this.

I thought so. Alas, at the time of the panic that machine didn't have
a dump partition configured. :(( Anyway, I've configured that partition,
built kernel.debug, and I'm just waiting to see if it happens again.

Thanks,
Petr


Petr Holub
CESNET z.s.p.o.   Supercomputing Center Brno
Zikova 4 Institute of Compt. Science
162 00 Praha 6, CZMasaryk University
Czech Republic Botanicka 68a, 60200 Brno, CZ 
e-mail: [EMAIL PROTECTED]   phone: +420-549493944
 fax: +420-541212747
   e-mail: [EMAIL PROTECTED]

 


Re: kernel panic on 5.4-RELEASE-p6

2005-08-09 Thread Kris Kennaway

On Tue, Aug 09, 2005 at 07:22:39PM +0200, Petr Holub wrote:
  You need to follow the instructions in the chapter on kernel debugging
  in the developers handbook in order for anyone to be able to begin
  investigating this.
 
 I thought so. Alas, at the time of the panic that machine didn't have
 a dump partition configured. :(( Anyway, I've configured that partition,
 built kernel.debug, and I'm just waiting to see if it happens again.

That chapter also describes a fallback you can use if you don't have a
kernel.debug or dump to at least get the name of the function in which
the panic occurred.
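
Assuming the fallback in question is the usual lookup of the instruction pointer in an address-sorted symbol listing (the kind of output nm -n produces for the kernel that panicked), the search itself looks roughly like the sketch below. The symbol table is hypothetical except for strlen, whose address is derived from the "strlen+0x8" trace at 0xc06923cc in the 6.0-BETA1 panic report elsewhere in this digest.

```python
import bisect

def function_for_address(symbols, address):
    """Return (name, offset) for the symbol with the nearest lower-or-equal address.

    `symbols` must be a list of (address, name) pairs sorted by address.
    """
    addresses = [addr for addr, _ in symbols]
    index = bisect.bisect_right(addresses, address) - 1
    if index < 0:
        return None   # address lies below the first known symbol
    start, name = symbols[index]
    return name, address - start

# Hypothetical excerpt of a sorted symbol listing; only strlen is derived
# from the trace quoted in this digest.
symbols = [
    (0xc0692390, "strcpy"),
    (0xc06923c4, "strlen"),
    (0xc06923e0, "strncmp"),
]
print(function_for_address(symbols, 0xc06923cc))   # ('strlen', 8)
```

An instruction pointer such as 0x300f0, as in the 5.4-RELEASE-p6 report, falls below any kernel symbol, which is itself a hint that the kernel jumped through a bad pointer.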

Kris



RE: twa0 errors and system lockup on amd64

2005-08-09 Thread Vinod Kashyap

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Jung-uk Kim
 Sent: Monday, August 08, 2005 2:29 PM
 To: freebsd-stable@FreeBSD.org
 Cc: Josh Endries
 Subject: Re: twa0 errors and system lockup on amd64

 On Monday 08 August 2005 05:03 pm, Josh Endries wrote:
  Hello,

  Just a little while ago I got this on a test 5.4-stable dual Opteron 
  box I'm setting up (9500S-LP RAID 5 with a hot spare):

  Aug  8 15:58:21 kernel: twa0: ERROR: (0x05: 0x210b): Request timed out!: request = 0x80a67700
  Aug  8 15:58:21 kernel: twa0: INFO: (0x16: 0x1108): Resetting controller...:
  Aug  8 15:58:21 kernel: twa0: ERROR: (0x15: 0x110b): Can't drain AEN queue after reset: error = 60
  Aug  8 15:58:21 kernel: twa0: ERROR: (0x16: 0x1105): Controller reset failed: error = 60; attempt 1

  It attempted twice and then just sat there after that. I couldn't log 
  in at all so I did a hard reset after probably 30+ minutes. I didn't 
  find much online other than driver source code or twe(4) man pages, 
  which suggests that it's a problem between the driver and card. Has 
  anyone else seen this problem? Is it a sign of a flaky card or could 
  it be something else? Maybe it's something to do with AMD64? This 
  system was supposed to go into production tomorrow. I guess it's good 
  that it died today...

This should have nothing to do with amd64.  The firmware seems to have
gotten into a bad state.  Can you reproduce this consistently?  If you
can, please run 'tw_cli /cX show diag', and send the output.

 I have seen it with 9500S-8, which is the same controller 
 with 8 ports.  In fact, I am seeing other problems.

 twa0: INFO: (0x04: 0x000b): Rebuild started: unit=0
 twa0: ERROR: (0x04: 0x0002): Degraded unit: unit=0, port=5
 twa0: INFO: (0x04: 0x000b): Rebuild started: unit=0
 twa0: INFO: (0x04: 0x003b): Rebuild paused: unit=0
 twa0: INFO: (0x04: 0x000b): Rebuild started: unit=0
 twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=5
 twa0: INFO: (0x04: 0x003b): Rebuild paused: unit=0
 twa0: INFO: (0x04: 0x000b): Rebuild started: unit=0
 twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=5
 twa0: INFO: (0x04: 0x0005): Rebuild completed: unit=0
 twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=5
 twa0: ERROR: (0x04: 0x0002): Degraded unit: unit=0, port=5

 I have rebuilt this array many times but it's happening again 
 and again.  It seems this controller/driver has issues with 
 amd64.  FYI, UP kernel or replacing cables didn't fix the problem.

You seem to have a bad drive/cable at port 5.
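
The reasoning behind that diagnosis is that every ERROR event in the quoted log names the same port. A minimal sketch of that grouping follows; the regular expression and function name are illustrative assumptions matched to the message format quoted above, not anything provided by the twa driver or tw_cli.

```python
import re
from collections import Counter

# Matches lines such as:
#   "twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=5"
#   "twa0: ERROR: (0x04: 0x0002): Degraded unit: unit=0, port=5"
PORT_ERROR = re.compile(
    r"twa\d+: ERROR: \(0x[0-9a-f]+: 0x[0-9a-f]+\): (.+?): .*\bport=(\d+)"
)

def errors_by_port(lines):
    """Count ERROR events per (port, message)."""
    counts = Counter()
    for line in lines:
        match = PORT_ERROR.search(line)
        if match:
            counts[(int(match.group(2)), match.group(1))] += 1
    return counts

sample = [
    "twa0: INFO: (0x04: 0x000b): Rebuild started: unit=0",
    "twa0: ERROR: (0x04: 0x0002): Degraded unit: unit=0, port=5",
    "twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=5",
    "twa0: INFO: (0x04: 0x003b): Rebuild paused: unit=0",
    "twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=5",
]
for (port, message), n in sorted(errors_by_port(sample).items()):
    print(port, message, n)
```

If the counts were spread across several ports, the controller or driver would be a more likely suspect than a single drive or cable.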

 Good luck,

 Jung-uk Kim

  Josh


Re: ad10: WARNING - READ_DMA UDMA ICRC error (retrying request) LBA=11441599

2005-08-09 Thread Chuck Swiger


O. Hartmann wrote:
[ ... ]
One of  my SATA disks, the SAMSUNG SP2004C seems to show errors during 
operation (and also showd under 5.4-RELEASE-p3).

Sometimes I get this error:
ad10: WARNING - READ_DMA UDMA ICRC error (retrying request) LBA=11441599
while the machine still keeps working.
Other days the box crashes completely.

Is this a operating system bug or is this message an evidence of 
defective hardware?


Back up any data you care about now.  Use the smartmontools port or hunt down a 
utility from Samsung which'll do a surface test (read only, nondestructive).


You can also run a dd if=/dev/ad10 of=/dev/null bs=8192 to do a full read 
test under FreeBSD, and see how many CRC errors show up.
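
For anyone who wants the same kind of sequential read pass with a byte count at the end, the loop can be sketched as below. This is an illustration of what the dd command does, not a replacement for it; the function name is an assumption, and like dd without conv=noerror it stops at the first hard read error.

```python
def read_test(path, block_size=8192):
    """Read `path` front to back; return (bytes_read, error_or_None)."""
    total = 0
    with open(path, "rb", buffering=0) as device:
        while True:
            try:
                chunk = device.read(block_size)
            except OSError as exc:
                return total, exc     # stop at the first hard read error
            if not chunk:
                return total, None    # clean end of the device or file
            total += len(chunk)
```

Calling read_test("/dev/ad10") as root would walk the whole disk. Errors that the kernel retries successfully never reach the caller, so the ICRC warnings still have to be counted from the console or system log.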


--
-Chuck


Re: ad10: WARNING - READ_DMA UDMA ICRC error (retrying request) LBA=11441599

2005-08-09 Thread J. T. Farmer


Chuck Swiger wrote:


O. Hartmann wrote:
[ ... ]

One of  my SATA disks, the SAMSUNG SP2004C seems to show errors 
during operation (and also showd under 5.4-RELEASE-p3).

Sometimes I get this error:
ad10: WARNING - READ_DMA UDMA ICRC error (retrying request) LBA=11441599
while the machine still keeps working.
Other days the box crashes completely.

Is this a operating system bug or is this message an evidence of 
defective hardware?


Back up any data you care about now.  Use the smartmontools port or 
hunt down a utility from Samsung which'll do a surface test (read 
only, nondestructive).


You can also run a dd if=/dev/ad10 of=/dev/null bs=8192 to do a full 
read test under FreeBSD, and see how many CRC errors show up.



Actually, I would go with it being an operating system error.  That's 
exactly the same error quite a few people (myself included) have been 
reporting under 5.4 and 5-STABLE.  In my case, I just installed the 
smartmontools port and it's reporting that my drive is behaving perfectly.

This is 6.0-Beta?  I have a couple spare drives here (destined to be part of
a RAID array on another machine).  Perhaps after I get rid of some of the
current RL overloads, I will try it on this machine.

John

--
John T. Farmer        Owner & CTO        GoldSword Systems
[EMAIL PROTECTED] 865-691-6498   Knoxville TN
   Consulting, Design, & Development of Networks & Software

Re: 4-5 libmilter.a install failure

2005-08-09 Thread Randy Bush

fwiw, we gave up and are cold installing 6

randy

Re: ad10: WARNING - READ_DMA UDMA ICRC error (retrying request) LBA=11441599

2005-08-09 Thread Karl Denninger

Post your dmesg output from boot.

If the SATA controller has a SII chipset, you're in trouble.

Get that board out of there - or if it's on the motherboard, get something
else in there and use it instead.
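
For picking the relevant lines out of the boot messages, something like the following might do. It is only a sketch: the device-name prefixes (atapci, ata, ad) follow the names seen in this thread, and matching "SiI" or "Silicon Image" in the controller description is an assumption about how the driver labels those chips.

```sh
#!/bin/sh
# Print the ATA/SATA controller, channel and disk probe lines
# from dmesg-style text on stdin.
ata_probe_lines() {
    grep -E '^(atapci|ata|ad)[0-9]+:'
}

# Succeed if any controller line on stdin appears to name a
# Silicon Image part (the matched strings are an assumption).
has_sii_controller() {
    grep -E '^atapci[0-9]+:' | grep -Eqi 'SiI|Silicon Image'
}

# Example:
#   dmesg | ata_probe_lines
#   dmesg | has_sii_controller && echo "SiI controller found"
```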

SII chipsets were ok in 4.x, but the newer ATA code broke badly with them.
I've had a PR open on this since February, and many others have reported
similar issues.  The problems still exist in the 6.x-BETA releases I've
checked out, and are in some cases MORE severe (for me anyway) than they are
in 5.4.

You CAN AND WILL lose data if you're not careful.  

Be careful!

A BIG disappointment to me is that FreeBSD has not CONSPICUOUSLY stated in
the hardware notes for 5.4 (and beyond) that these controllers DO NOT work 
reliably with 5.x and later, or undertaken to do whatever is needed to 
make them work as they did in 4.x (I had no problems with the same 
hardware under the 4.x releases).  As these controllers are widely 
available and very inexpensive, not to mention showing up in all manner
of boards from various vendors (including Adaptec, Bustek and on some
motherboards!) this is quite a disappointment.

I finally gave up kvetching on the list and bought a 3ware 8502 card.  
Same disks, same system, same load, no problems.

Smartmontools declared all my disks as perfectly healthy in my case, and has
for several others as well.

YMMV.

-- 
Karl Denninger ([EMAIL PROTECTED]) Internet Consultant & Kids Rights Activist
http://www.denninger.net        My home on the net - links to everything I do!
http://scubaforum.org           Your UNCENSORED place to talk about DIVING!
http://genesis3.blogspot.com    Musings Of A Sentient Mind

Re: Panic on FreeBSD 6.0BETA1

2005-08-09 Thread Panagiotis Christias

On 8/9/05, Kris Kennaway [EMAIL PROTECTED] wrote:
 On Wed, Aug 03, 2005 at 08:47:43AM -0400, Derrick Edwards wrote:
  Hi all,
  I decided to try and help with testing 6-BETA1. After updating sources and
  recompiling, I get the following during boot. I tried booting without
  hyperthreading enabled in the BIOS and I still get the same panic.
 
  Fatal trap 12: page fault while in kernel mode
  cpuid = 0; apic id = 00
  fault virtual address = 0x480008
  fault code = supervisor read, page not present
  instruction pointer = 0x20:0xc06923cc
  stack pointer = 0x28:0xc10208ec
 
  Code segment = base 0x0, limit 0xf, type 0x1b
               = DPL 0, pres 1, def32 1, gran 1
 
  processor eflags = interrupt enabled, resume, IOPL = 0
  current process = 0 (swapper)
  [thread pid 0 tid 0]
  Stopped at strlen+0x8: cmpb $0,0(%edx)
 
  I left all the debugging features in the current; however, I am not sure
  exactly how to trace this problem. If someone could point me to any doc that
  would allow me to provide more information, that would be great. I updated my
  pen 400MHZ using the same procedures and all went well. Please help.
 
 See the chapter on kernel debugging in the Developers' Handbook.
 
 Kris
 
 P.S. I need to make a macro to type the answer to this FAQ.

IMHO the average FreeBSD user/fan could use some help when he/she
deals with kernel panics, crashes, dumps, kgdb, kernel.debug, etc.

First step: the debug kernel and modules could be included in the base
system. What is it, 50 to 60MB of disk space? That should be no(?)
problem. Having them there, correctly built by The Source, would be the
first step for the average user (not the "I wrote the kernel"
developer) to use them and provide correct, complete and thus useful
feedback. The kind that you developers ask for.

Second step: extracting the useful data from the crash dumps using
the "wat doez diz program do?" kgdb program is not something that can
efficiently be left in the hands of the average user. He/she is probably
a great guy (after all, he/she is using FreeBSD) and certainly willing
to contribute, but dealing with kgdb is too much to ask for, unless
he/she has the guidance of a kernel developer through this mailing list
or others. That is not an efficient way to investigate and report
serious problems.

I am wondering if it is possible to write some kgdb-wrapper scripts which
would do the dirty job of analyzing the crash dump using kgdb and the
proper commands (backtrace, list xx, print yy, up nn, down mm, voodoo
etc.). This procedure can be as detailed as needed or wanted.  The
results could be sent automatically (send-pr) to the FreeBSD Team.

That way, every user would be able to easily send proper reports
without mistakes or missing data and benefit the FreeBSD Project.
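
To make the wrapper idea concrete, here is a minimal sketch of what such a script might look like. It assumes kgdb reads commands from standard input when run non-interactively, as gdb does; the function names, the command list and the example paths are hypothetical, and the automatic send-pr step is left out.

```sh
#!/bin/sh
# Emit the debugger commands to run against a crash dump, one per line.
# The list is only a starting point; extend it as needed.
kgdb_commands() {
    printf '%s\n' 'backtrace' 'info registers' 'quit'
}

# Run kgdb non-interactively and save the whole transcript.
# Usage: crash_report <debug-kernel> <vmcore> <report-file>
# Assumes kgdb accepts commands on stdin the way gdb does.
crash_report() {
    kernel="$1"
    core="$2"
    report="$3"
    kgdb_commands | kgdb "$kernel" "$core" > "$report" 2>&1
}

# Example (both paths are placeholders for the local setup):
#   crash_report /usr/obj/usr/src/sys/GENERIC/kernel.debug \
#       /var/crash/vmcore.0 /tmp/crash-report.txt
```

The resulting text file could then be attached to a problem report by hand, which keeps a human in the loop before anything is sent.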

Any thoughts?
Panagiotis

Re: ad10: WARNING - READ_DMA UDMA ICRC error (retrying request) LBA=11441599

2005-08-09 Thread Matthias Buelow

Karl Denninger wrote:

SII chipsets were ok in 4.x, but the newer ATA code broke badly with them.
I've had a PR open on this since February, and many others have reported
similar issues.  The problems still exist in the 6.x-BETA releases I've
checked out, and are in some cases MORE severe (for me anyway) than they are
in 5.4.

Well, it doesn't affect just the SII chips. I see the same on an
Intel ICH6 chipset, but never after the kernel has mounted the root
fs. Sometimes it takes several attempts until it manages to do so,
though. The machine works w/o any such problems on other OSes. I've
deferred the update of another machine (which is a hosted box and
cannot afford random hangs at boot) because of the general flakiness of
the ATA/SATA code in 5.4 (significantly worse than with 5.3, imho). If
these issues don't go away completely soon (in 6.x) I'll have to
look for some alternative system which doesn't make such a fuss
with mainstream hardware.

mkb.

Re: Xorg, Radeon X800PRO problem.

2005-08-09 Thread Sascha Holzleiter

On Tue, 2005-08-09 at 09:57 +0100, Yann Golanski wrote:
 I am stuck on getting this graphics card working with stable.  
 

I had the same problem with the standard server port. Replacing it with
the xorg-server-snap port solved it. Seems these cards are just too new
for the stable release ;)

Thanks to our xorg maintainers for providing up-to-date servers!


-- 
  Sascha


