Otto,

I know the cables are all right; I'm using them with another hard drive.
The hard drive is new, but I will format it and check whether it
shows any errors.
I hope it is hardware related; I would be rather scared otherwise.
Do you need me to try anything else with this filesystem?

Regards,
Marcos

----- Original Message ----- 
From: "Otto Moerbeek" <[EMAIL PROTECTED]>
To: "Marcos Laufer" <[EMAIL PROTECTED]>
Cc: <misc@openbsd.org>
Sent: Friday, July 13, 2007 4:46 PM
Subject: Re: fsck Segmentation fault on 4.1


On Fri, 13 Jul 2007, Marcos Laufer wrote:

> Otto,
>
> This is the error I get:
> It starts booting and runs fsck, which fails on /dev/rwd0e and /dev/rwd0h.
>
> (Once, when it finished, I saw this:)
> fsck_ffs in free():  error: free_page: pointer to wrong page
> fsck: /dev/rwd0h: Abort trap
>
> I rebooted many times and that message did not appear again.
>
>
> When I run fsck manually as you suggested, I get:
>
> # ulimit -d unlimited
> # fsck -y /dev/rwd0e
>
> INCONSISTENT CGSIZE=16384
>
> FIX? yes
>
> * * Last mounted on /usr
> * * Phase 1 - Check Blocks and Sizes
> * * Phase 2 - Check pathnames
> * * Phase 3 - Check Connectivity
> * * Phase 4 - Check Reference Counts
> * * Phase 5 - Check Cyl Groups
>
> CANNOT READ: BLK 64
>
> CONTINUE? yes
>
> fsck: /dev/rwd0e: Segmentation Fault

This is not an out of memory situation.

It looks like fsck_ffs has problems getting data from your disk,
probably because of hardware failure or bad cabling.  Sometimes it
detects it cannot read the data (the CANNOT READ: BLK 64 case), but it
is possible it gets corrupted data in other cases.

Sadly, this can cause fsck_ffs to do the wrong thing and access wrong
memory and corrupt its internal data. During the last year I've fixed
some stuff in this area, but there still remain cases that can go
wrong.

-Otto
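
The failure mode described above (a size or index read from a failing disk being trusted and then used to access memory) can be illustrated with a small sketch. This is not fsck_ffs code: it is written in Python for brevity, and the header layout, the magic number, and the names read_exact and checked_payload_size are all made up for the example. The point is only that a value coming off the disk gets checked against known bounds before anything uses it.

```python
import io
import struct

HEADER_FORMAT = "<II"                  # hypothetical layout: magic, payload size
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)
HEADER_MAGIC = 0x00C0FFEE              # made-up magic number for this sketch


def read_exact(dev, offset, length):
    """Return exactly `length` bytes from `offset`, or None on a failed or short read."""
    try:
        dev.seek(offset)
        data = dev.read(length)
    except OSError:
        return None
    return data if len(data) == length else None


def checked_payload_size(dev, offset, max_size):
    """Read the header at `offset`; return its size field only if it is plausible."""
    raw = read_exact(dev, offset, HEADER_SIZE)
    if raw is None:
        return None                    # the detectable "cannot read" case
    magic, size = struct.unpack(HEADER_FORMAT, raw)
    if magic != HEADER_MAGIC:
        return None                    # data arrived, but it is not a header
    if size > max_size:
        return None                    # a corrupted size would overrun the buffer
    return size


good = io.BytesIO(struct.pack(HEADER_FORMAT, HEADER_MAGIC, 4096))
bad = io.BytesIO(struct.pack(HEADER_FORMAT, HEADER_MAGIC, 0xFFFFFFFF))
print(checked_payload_size(good, 0, 16384))   # 4096
print(checked_payload_size(bad, 0, 16384))    # None
```

A read that fails outright is the easy case; the harder one is a read that succeeds but returns garbage, which is why the size is compared against a limit even after the magic number matches.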


> # _
>
>
> The dmesg is:
>
> OpenBSD 4.1-stable (GENERIC) #0: Mon May 14 14:02:47 ART 2007
>     [EMAIL PROTECTED]:/u/system/src/sys/arch/i386/compile/GENERIC
> cpu0: Intel(R) Pentium(R) 4 CPU 2.80GHz ("GenuineIntel" 686-class) 2.81 GHz
> cpu0: FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,SBF,CNXT-ID,xTPR
> real mem  = 1064857600 (1039900K)
> avail mem = 964222976 (941624K)
> using 4278 buffers containing 53366784 bytes (52116K) of memory
> mainbus0 (root)
> bios0 at mainbus0: AT/286+ BIOS, date 09/15/03, BIOS32 rev. 0 @ 0xfbbd0, SMBIOS rev. 2.2 @ 0xf0800 (39 entries)
> bios0: MICRO-STAR INTL, CO.,LTD. MS-6743
> apm0 at bios0: Power Management spec V1.2
> apm0: AC on, battery charge unknown
> apm0: flags 70102 dobusy 1 doidle 1
> pcibios0 at bios0: rev 2.1 @ 0xf0000/0xdf84
> pcibios0: PCI IRQ Routing Table rev 1.0 @ 0xfdeb0/176 (9 entries)
> pcibios0: PCI Exclusive IRQs: 3 4 5 7 10 11
> pcibios0: PCI Interrupt Router at 000:31:0 ("Intel 82371SB ISA" rev 0x00)
> pcibios0: PCI bus #1 is the last bus
> bios0: ROM list: 0xc0000/0xa600 0xcc000/0x1800
> acpi at mainbus0 not configured
> cpu0 at mainbus0
> pci0 at mainbus0 bus 0: configuration mode 1 (no bios)
> pchb0 at pci0 dev 0 function 0 "Intel 82865G/PE/P CPU-I/0-1" rev 0x02
> vga1 at pci0 dev 2 function 0 "Intel 82865G Video" rev 0x02: aperture at 0xf0000000, size 0x8000000
> wsdisplay0 at vga1 mux 1: console (80x25, vt100 emulation)
> wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
> ppb0 at pci0 dev 30 function 0 "Intel 82801BA AGP" rev 0xc2
> pci1 at ppb0 bus 1
> fxp0 at pci1 dev 8 function 0 "Intel PRO/100 VE" rev 0x02, i82562: irq 10, address 00:0c:76:b5:8a:85
> inphy0 at fxp0 phy 1: i82562ET 10/100 PHY, rev. 0
> ichpcib0 at pci0 dev 31 function 0 "Intel 82801EB/ER LPC" rev 0x02
> pciide0 at pci0 dev 31 function 2 "Intel 82801EB SATA" rev 0x02: DMA, channel 0 configured to compatibility, channel 1 configured to compatibility
> wd0 at pciide0 channel 0 drive 1: <WDC WD800JD-00MSA1>
> wd0: 16-sector PIO, LBA48, 76319MB, 156301488 sectors
> wd0(pciide0:0:1): using PIO mode 4, Ultra-DMA mode 5
> atapiscsi0 at pciide0 channel 1 drive 1
> scsibus0 at atapiscsi0: 2 targets
> cd0 at scsibus0 targ 0 lun 0: <LITEON, CD-ROM LTN526D, 9S01> SCSI0 5/cdrom removable
> cd0(pciide0:1:1): using PIO mode 4, Ultra-DMA mode 2
> ichiic0 at pci0 dev 31 function 3 "Intel 82801EB/ER SMBus" rev 0x02: irq 4
> iic0 at ichiic0
> iic0: addr 0x2f 04=00 06=0a 07=00 0c=00 0d=07 0e=85 0f=00 10=c0 11=11 12=00 13=60 14=14 15=62 16=01 17=06
> isa0 at ichpcib0
> isadma0 at isa0
> pckbc0 at isa0 port 0x60/5
> pckbd0 at pckbc0 (kbd slot)
> pckbc0: using irq 1 for kbd slot
> wskbd0 at pckbd0: console keyboard, using wsdisplay0
> pmsi0 at pckbc0 (aux slot)
> pckbc0: using irq 12 for aux slot
> wsmouse0 at pmsi0 mux 0
> pcppi0 at isa0 port 0x61
> midi0 at pcppi0: <PC speaker>
> spkr0 at pcppi0
> lm0 at isa0 port 0x290/8: W83627THF
> npx0 at isa0 port 0xf0/16: reported by CPUID; using exception 16
> biomask ebfd netmask effd ttymask ffff
> pctr: user-level cycle counter enabled
> dkcsum: wd0 matches BIOS drive 0x80
> root on wd0a
> rootdev=0x0 rrootdev=0x300 rawdev=0x302
>
>
> ----- Original Message ----- 
> From: "Otto Moerbeek" <[EMAIL PROTECTED]>
> To: "Marcos Laufer" <[EMAIL PROTECTED]>
> Cc: <misc@openbsd.org>
> Sent: Friday, July 13, 2007 3:38 PM
> Subject: Re: fsck Segmentation fault on 4.1
>
>
> On Fri, 13 Jul 2007, Marcos Laufer wrote:
>
> > Hello,
> >
> > I want to report a problem I experienced while testing OpenBSD 4.1.
> > I installed it, increased VM_PHYSSEG_MAX to 16
> > in /usr/src/sys/arch/i386/include/vmparam.h to make
> > it work with this particular motherboard, and built a
> > stable release.
> > I installed a server with it, and it has been working fine as an MX
> > for a few months until now.
> > The machine crashed, with no error on the screen, and the keyboard
> > did not respond. I rebooted, it started to fsck, and
> > the fsck failed on /usr. So I ran fsck manually (fsck -y), but
> > it crashes with a segmentation fault, so I can't mount the filesystem
> > or start the server.
> > I read in the archives that this can be caused by running out
> > of swap, but I had made a 2 GB swap partition. Despite that,
> > I added a 64 MB file as swap and tried fsck again, but no luck.
> > This time it was easy for me to reinstall everything on a new hard disk, but
> > I am keeping the old one because I would like to learn how to fix
> > this. If anyone wants me to run some tests or has
> > any ideas about what is going on, let me know.
>
> Start by showing the error message. A segmentation fault is something
> different from running out of memory.
>
> If fsck segfaults, I need a proper error report.
> See http://www.openbsd.org/report.html
>
> If fsck runs out of memory, increasing ulimit -d might help, like:
>
> # ulimit -d unlimited
> # fsck ...
>
> That reminds me to cook a diff to do this automatically. With
> filesystems getting larger and larger, more people will run into
> out-of-mem situations.
>
> -Otto
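
For illustration, the manual "ulimit -d" step quoted above could in principle be done by a program itself before it starts allocating. The sketch below is not the diff Otto mentions and is not fsck code. It is a Python example using the standard resource module, and the function name raise_data_limit is made up. It raises only the soft data-segment limit up to the existing hard limit, which an unprivileged process is allowed to do; "ulimit -d unlimited" run as root may go further than that.

```python
import resource


def raise_data_limit():
    """Raise the soft data-segment limit to the hard limit; return the new (soft, hard) pair."""
    soft, hard = resource.getrlimit(resource.RLIMIT_DATA)
    if soft != hard:
        # Raising the soft limit up to the hard limit needs no special privilege.
        resource.setrlimit(resource.RLIMIT_DATA, (hard, hard))
    return resource.getrlimit(resource.RLIMIT_DATA)


soft, hard = raise_data_limit()
print("data limit now:", soft, hard)
```

A child process started afterwards inherits the raised limit, so a wrapper that makes this call and then runs the real tool would have the same effect as typing the ulimit command first.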
