Hi everyone,

I got this half an hour ago. It left some processes in D state, namely
soffice.bin and two instances of procmail, since it happened on my /home
LV:

kernel BUG at /usr/src/sources/linux-2.6.16-rc5/fs/reiser4/plugin/file/tail_conversion.c:29!
invalid opcode: 0000 [#1]
PREEMPT 
Modules linked in: mga drm w83781d hwmon_vid hwmon i2c_isa snd_seq_midi 
snd_pcm_oss snd_mixer_oss snd_seq_oss snd_seq_midi_event snd_seq snd_cmipci 
snd_opl3_lib snd_hwdep snd_mpu401_uart ohci_hcd floppy sr_mod cdrom pata_via 
i2c_viapro aic7xxx scsi_transport_spi ehci_hcd uhci_hcd 3c59x mii snd_ens1370 
gameport snd_rawmidi snd_seq_device snd_pcm snd_timer snd_ak4531_codec snd 
soundcore snd_page_alloc via_agp agpgart usbcore xfs exportfs reiser4 ext2 loop 
lp parport_pc parport rtc psmouse reiserfs dm_mod raid5 raid1 xor md_mod 
pata_pdc2027x libata sd_mod scsi_mod unix
CPU:    0
EIP:    0060:[<f2daa02d>]    Not tainted VLI
EFLAGS: 00010286   (2.6.16-rc5 #10) 
EIP is at get_exclusive_access+0x31/0x44 [reiser4]
eax: b26d6c04   ebx: 00000000   ecx: ec54bbf4   edx: b736fdc0
esi: 3dbf3000   edi: 00006c85   ebp: 00006c85   esp: ded36f0c
ds: 007b   es: 007b   ss: 0068
Process soffice.bin (pid: 12533, threadinfo=ded36000 task=b3b7b070)
Stack: <0>f2da83da 00000000 c52ca544 e52935a8 00007000 b014c75f c52ca544 e7b3cc80 
b0151cd1 b6b54354 b6b5434c 3dbf3000 ed113360 e6a02160 b26d6bc0 ec54bc4c 
ec54bbf4 00000000 00006c85 00000001 00000000 ec54bc00 00000000 00006c85 
Call Trace:
[<f2da83da>] write_unix_file+0x1ba/0x60c [reiser4]
[<f2da8220>] write_unix_file+0x0/0x60c [reiser4]
[<b0101135>] syscall_call+0x7/0xb
Code: ff 21 e0 8b 00 8b 80 b0 04 00 00 8b 40 40 8b 50 08 85 d2 75 16 ba 01 00 
ff ff 89 c8 0f c1 10 85 d2 75 12 c7 41 24 01 00 00 00 c3 <0f> 0b 1d 00 04 8c dc 
f2 eb e0 51 e8 13 ac 35 bd 59 eb e5 55 89 


I had another occurrence of something that looked similar at first
glance; it repeatedly ground my laptop to a halt while I was on a trip.
The only way to make it go away was to wipe the device by dd'ing
/dev/zero to it. Not even a tar backup followed by mkfs did the job,
otherwise I could have left out the word "repeatedly"...

The only explanation I can imagine other than a serious problem in the
reiser4 code is a "soft" bad block relocated by the drive upon write, but
there was nothing like a read error in the logs. In any case, I wanted
the gurus to know, since this has occurred to me more than once.


Thanks for your time!
Chris



FYI, here is some information about the disk, including the SMART error log:


/dev/sda:

ATA device, with non-removable media
        Model Number:       SAMSUNG SV1203N                         
        Serial Number:      S01CJ10Y410901      
        Firmware Revision:  TQ100-30
Standards:
        Supported: 7 6 5 4 
        Likely used: 7
Configuration:
        Logical         max     current
        cylinders       16383   16383
        heads           16      16
        sectors/track   63      63
        --
        CHS current addressable sectors:   16514064
        LBA    user addressable sectors:  234493056
        LBA48  user addressable sectors:  234493056
        device size with M = 1024*1024:      114498 MBytes
        device size with M = 1000*1000:      120060 MBytes (120 GB)
Capabilities:
        LBA, IORDY(can be disabled)
        Queue depth: 1
        Standby timer values: spec'd by Standard, no device specific minimum
        R/W multiple sector transfer: Max = 16  Current = 16
        Recommended acoustic management value: 254, current value: 254
        DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 udma5 *udma6 
             Cycle time: min=120ns recommended=120ns
        PIO: pio0 pio1 pio2 pio3 pio4 
             Cycle time: no flow control=240ns  IORDY flow control=120ns
Commands/features:
        Enabled Supported:
           *    READ BUFFER cmd
           *    WRITE BUFFER cmd
           *    Host Protected Area feature set
           *    Look-ahead
           *    Write cache
           *    Power Management feature set
                Security Mode feature set
           *    SMART feature set
           *    FLUSH CACHE EXT command
           *    Mandatory FLUSH CACHE command 
           *    Device Configuration Overlay feature set 
           *    48-bit Address feature set 
           *    Automatic Acoustic Management feature set 
                SET MAX security extension
           *    DOWNLOAD MICROCODE cmd
           *    SMART self-test 
           *    SMART error logging 
Security: 
        Master password revision code = 65534
                supported
        not     enabled
        not     locked
        not     frozen
        not     expired: security count
                supported: enhanced erase
        56min for SECURITY ERASE UNIT. 56min for ENHANCED SECURITY ERASE UNIT.
HW reset results:
        CBLID- above Vih
        Device num = 0 determined by the jumper
Checksum: correct



smartctl version 5.33 [i386-pc-linux-gnu] Copyright (C) 2002-4 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF READ SMART DATA SECTION ===
SMART Error Log Version: 1
ATA Error Count: 8 (device log contains only the most recent five errors)
        CR = Command Register [HEX]
        FR = Features Register [HEX]
        SC = Sector Count Register [HEX]
        SN = Sector Number Register [HEX]
        CL = Cylinder Low Register [HEX]
        CH = Cylinder High Register [HEX]
        DH = Device/Head Register [HEX]
        DC = Device Command Register [HEX]
        ER = Error register [HEX]
        ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 8 occurred at disk power-on lifetime: 1595 hours (66 days + 11 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  04 51 00 00 00 00 a0  Error: ABRT

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  b1 c0 00 00 00 00 a0 00      16:23:07.813  DEVICE CONFIGURATION RESTORE
  b1 c2 00 00 00 00 a0 00      16:23:07.813  DEVICE CONFIGURATION IDENTIFY
  9a 23 04 00 02 00 a0 00      16:23:07.813  [VENDOR SPECIFIC]
  9a 23 04 00 02 00 a0 00      16:23:07.750  [VENDOR SPECIFIC]
  9a 23 01 00 02 00 a0 00      16:23:07.750  [VENDOR SPECIFIC]

Error 7 occurred at disk power-on lifetime: 1595 hours (66 days + 11 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  02 51 3f 00 00 00 e0  Error: TK0NF

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  10 00 3f 00 00 00 e0 00      16:10:46.563  RECALIBRATE [OBS-4]
  91 00 3f 3f ff 3f e0 00      16:10:46.563  INITIALIZE DEVICE PARAMETERS [OBS-6]
  ef 03 45 01 00 00 a0 00      16:10:46.563  SET FEATURES [Set transfer mode]
  ef 03 0c 01 00 00 a0 00      16:10:46.563  SET FEATURES [Set transfer mode]
  ec 00 00 01 00 00 a0 00      16:10:45.813  IDENTIFY DEVICE

Error 6 occurred at disk power-on lifetime: 1595 hours (66 days + 11 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  02 51 00 00 00 00 e0  Error: TK0NF

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  10 00 00 00 00 00 e0 00      16:10:32.625  RECALIBRATE [OBS-4]
  00 00 01 01 00 00 a0 00      16:10:32.625  NOP [Abort queued commands]

Error 5 occurred at disk power-on lifetime: 1595 hours (66 days + 11 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  02 51 00 00 00 00 e0  Error: TK0NF

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  10 00 00 00 00 00 e0 00      16:10:32.063  RECALIBRATE [OBS-4]
  00 da 01 01 00 00 a0 00      16:10:32.063  NOP [Reserved subcommand]
  b0 da 10 01 4f c2 a0 00      16:10:25.688  SMART RETURN STATUS
  b0 d8 10 01 4f c2 a0 00      16:10:25.625  SMART ENABLE OPERATIONS
  c6 03 10 01 00 00 a0 00      16:10:25.625  SET MULTIPLE MODE

Error 4 occurred at disk power-on lifetime: 483 hours (20 days + 3 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  04 51 00 00 4f c2 e0  Error: ABRT

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  b0 da 00 00 4f c2 e0 00  21d+14:59:16.250  SMART RETURN STATUS
  ec 00 00 ac 87 e4 e0 00  21d+14:59:16.188  IDENTIFY DEVICE
  ef 82 00 00 00 00 e0 00  21d+14:47:39.438  SET FEATURES [Disable write cache]
  ef 02 00 00 00 00 e0 00  21d+14:44:35.625  SET FEATURES [Enable write cache]
  ef 42 fe 00 00 00 e0 00  21d+09:37:08.125  SET FEATURES [Enable AAM]
