Re: bad bread

2006-05-09 Thread Valdis . Kletnieks
On Tue, 09 May 2006 00:18:32 +0200, PFC said:

> Linux RAID has a special option for that: you can trigger a check, which
> will re-read the entire disks and, if a read error occurs, re-write the
> failing sector with good data from the other drives in the RAID. The drive
> with the bad sector will then remap it to another sector.

If you have 2 mirrored disks, and are replacing one, you don't have a good
block to read it from.  The failure mode was a RAID controller that didn't
properly handle re-writing the bad block on the first disk, so when the
second disk got a bad block, you were screwed.
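
For reference, the check described in the quoted text applies to Linux software RAID (md), not to a hardware controller. It is normally started by writing `check` to the array's `sync_action` file in sysfs. Below is a minimal Python sketch under these assumptions: the array is called `md0` (a hypothetical name), the caller is root, and the kernel exposes `sync_action` and `mismatch_cnt`. The `sysfs_root` parameter is not part of any API; it exists only so the functions can be tried against a scratch directory.

```python
from pathlib import Path


def start_md_check(array: str = "md0", sysfs_root: str = "/sys/block") -> Path:
    """Ask the md driver to read-check every member of the array.

    Writing "check" to sync_action starts a background scrub (needs root).
    """
    action = Path(sysfs_root) / array / "md" / "sync_action"
    action.write_text("check\n")
    return action


def read_mismatch_count(array: str = "md0", sysfs_root: str = "/sys/block") -> int:
    """Return the mismatch counter that the md driver keeps for the array."""
    counter = Path(sysfs_root) / array / "md" / "mismatch_cnt"
    return int(counter.read_text().strip())
```

On a real system, `start_md_check("md0")` followed later by `read_mismatch_count("md0")` would be the typical sequence. Progress of the scrub can be watched in /proc/mdstat.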





bug when accessing files .

2006-05-09 Thread Henti Smith
Hi all. 

I'm getting a kernel BUG error when accessing a set of files on my FS.

Here is the kernel output. 

------------[ cut here ]------------
kernel BUG at fs/reiserfs/journal.c:2809!
invalid operand:  [#1]
PREEMPT SMP
Modules linked in:
CPU:0
EIP:0060:[]Not tainted VLI
EFLAGS: 00010246   (2.6.12-gentoo-r6)
EIP is at journal_begin+0xe3/0xf0
eax:    ebx: cba3dea8   ecx: cba3def8   edx: dfd22c00
esi: cba3def8   edi: cba3c000   ebp: dfd22c00   esp: cba3de68
ds: 007b   es: 007b   ss: 0068
Process mc (pid: 6112, threadinfo=cba3c000 task=dba120a0)
Stack:     d520c80c d520c80c  cba3dea8
   c01aa029 cba3dea8 dfd22c00 0012  cdff3000
      e09c2000 cba3def8  cba3def8 cba3df68
Call Trace:
 [] remove_save_link+0x39/0x110
 [] journal_end+0xa7/0x100
 [] reiserfs_delete_inode+0xcc/0xe0
 [] reiserfs_delete_inode+0x0/0xe0
 [] generic_delete_inode+0x73/0x110
 [] iput+0x62/0x90
 [] sys_unlink+0x10d/0x140
 [] syscall_call+0x7/0xb
Code: 74 29 40 b9 09 00 00 00 89 df 89 46 04 f3 a5 83 7b 04 01 7e 04 31 
c0 eb c9 c7 44 24 04 60 e4 39 c0 89 2c 24 e8 1f 0f ff ff eb ea <0f> 0b 
f9 0a 48 3c 39 c0 eb cd 8d 76 00 55 31 ed 57 56 53 83 ec
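
When comparing oopses like the one above, it can help to pull out just the BUG location and the chain of symbols in the call trace. The following is a rough Python sketch, assuming trace lines keep the `symbol+0xoffset/0xsize` shape shown here. The function names `bug_location` and `call_chain` are made up for illustration and are not part of any existing tool.

```python
import re

# "kernel BUG at <file>:<line>!" as printed by BUG()
BUG_RE = re.compile(r"kernel BUG at (\S+):(\d+)!")
# One trace frame, e.g. "sys_unlink+0x10d/0x140"
FRAME_RE = re.compile(r"(\w+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def bug_location(oops: str):
    """Return (source file, line number) of the BUG, or None if absent."""
    m = BUG_RE.search(oops)
    return (m.group(1), int(m.group(2))) if m else None


def call_chain(oops: str):
    """Return the symbols listed after 'Call Trace:', in printed order."""
    _, sep, tail = oops.partition("Call Trace:")
    if not sep:
        return []
    frames = []
    for line in tail.splitlines():
        if "Code:" in line:  # the hex dump marks the end of the trace
            break
        m = FRAME_RE.search(line)
        if m:
            frames.append(m.group(1))
    return frames
```

Applied to the report above, `bug_location` would point at fs/reiserfs/journal.c line 2809, and `call_chain` would show that the trace passes through `sys_unlink` and `reiserfs_delete_inode`.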

Machine information : 

Gentoo info since it usually has some handy bits of info :P 

Portage 2.0.53 (default-linux/x86/2005.0, gcc-3.2.3,
glibc-2.3.4.20041102-r1, 2.6.12-gentoo-r6 i686)
=================================================================
System uname: 2.6.12-gentoo-r6 i686 Pentium III (Coppermine)
Gentoo Base System version 1.4.16
dev-lang/python:     2.2.3-r1, 2.3.5
sys-apps/sandbox:1.2.10
sys-devel/autoconf:  2.13, 2.59-r6
sys-devel/automake:  1.4_p6, 1.5, 1.6.3, 1.7.9-r1, 1.8.5-r3, 1.9.5
sys-devel/binutils:  2.15.92.0.2-r7
sys-devel/libtool:   1.4.3-r1, 1.5.16
virtual/os-headers:  2.4.19-r1, 2.6.8.1-r2
ACCEPT_KEYWORDS="x86"
AUTOCLEAN="yes"
CBUILD="i686-pc-linux-gnu"
CFLAGS="-O3 -march=pentium3 -fprefetch-loop-arrays -funroll-loops -pipe"
CHOST="i686-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/kde/2/share/config /usr/kde/3.3/env 
/usr/kde/3.3/share/config /usr/kde/3.3/shutdown /usr/kde/3/share/config 
/usr/lib/X11/xkb /usr/share/config /var/qmail/control"
CONFIG_PROTECT_MASK="/etc/gconf /etc/terminfo /etc/env.d"
CXXFLAGS="-O3 -march=pentium3 -fprefetch-loop-arrays -funroll-loops -pipe"
DISTDIR="/usr/portage/distfiles"
FEATURES="autoconfig distlocks sandbox sfperms strict"
GENTOO_MIRRORS="ftp://ftp.is.co.za/linux/distributions/gentoo
http://gentoo.oregonstate.edu/
http://www.ibiblio.org/pub/Linux/distributions/gentoo"
MAKEOPTS="-j3"
PKGDIR="/usr/portage/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
SYNC="rsync://rsync.gentoo.org/gentoo-portage"
USE="x86 acl apm arts audiofile avi berkdb bitmap-fonts bzip2 crypt
curl eds emboss encode expat foomaticdb fortran gdbm gif gmp gpm
gstreamer gtk2 idn imlib ipv6 java jpeg kde lcms libg++ libwww mad
mikmod mng motif mp3 mpeg ncurses nls ogg oggvorbis opengl oss pam pcre
pdflib perl png quicktime readline samba sdl slang spell ssl tcpd tiff
truetype truetype-fonts type1-fonts udev vorbis winbind xmms xv zlib
userland_GNU kernel_linux elibc_glibc"
Unset:  ASFLAGS, CTARGET, LANG, LC_ALL, LDFLAGS, LINGUAS, PORTDIR_OVERLAY

After the error, all disk access is dead and the machine has to be reset.

I'm busy upgrading to the latest kernel to see if this solves the problem. I
will also get the latest version of reiserfsck and scan the drive just in case.

-- 
Henti Smith
[EMAIL PROTECTED]
+27 82 958 2525
http://www.geekware.co.za

DISCLAIMER : 

Unauthorised use of characters, images, sounds, odors, severed limbs,
noodles, weird dreams, strange looking fruit, oxygen, and certain parts
of Jupiter are strictly forbidden.  If I find you violating, or
molesting my property in any way, I will employ a pair of burly
convicts to find you, kidnap you, and perform god-awful sexual
experiments on you until you lose the ability to sound out vowels.  I
don't know why you are still reading this, but by doing so you have
proven that you have far too much time on your hands, and you should go
plant a tree, or read a book or something.
- http://www.ctrlaltdel-online.com/


Flat files, databases and creating structure

2006-05-09 Thread Kristian Koehntopp

In the context of this year's MySQL Users Conference, Tim O'Reilly has started 
a nice article series on O'Reilly Radar under the topic of "Database War 
Stories". There is a common theme to all these postings, and it is "For some 
things, flat files rule; for others, databases do it better."

In 
http://radar.oreilly.com/archives/2006/05/brian_aker_of_mysql_responds.html, 
Brian Aker of MySQL summarizes this, and his closing remark is along the 
lines of:

    Everyone arrives at certain truths, flat files with multiple dimensions 
    don't scale, you will need to partition your data in some manner, and 
    in the end caching is a requirement.

The "flat files with multiple dimensions" remark reminded me of some texts by 
Hans, notably his observations in http://www.namesys.com/whitepaper.html.

Some of the articles in the series, e.g. the Flickr article on the tag clouds 
problem (for why this is a problem, see 
http://www.pui.ch/phred/archives/2005/04/tags-database-schemas.html and 
http://www.pui.ch/phred/archives/2005/06/tagsystems-performance-tests.html), 
and the remarks elsewhere regarding full text indexing, also strike me as 
relevant to the goals of the Reiserfs project.
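
To make the "multiple dimensions" point a bit more concrete, here is a small Python sketch of a tag lookup over a many-to-many schema. It is an illustration only and is not taken from the linked articles: the three-table layout, the table names, and the helper functions `tag_item` and `items_with_all` are all assumptions. It uses an in-memory SQLite database from the standard library.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE tags (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
CREATE TABLE item_tags (item_id INTEGER, tag_id INTEGER,
                        PRIMARY KEY (item_id, tag_id));
""")


def tag_item(name, tag_names):
    """Store one item and attach every tag in tag_names to it."""
    item_id = conn.execute(
        "INSERT INTO items (name) VALUES (?)", (name,)).lastrowid
    for tag in tag_names:
        conn.execute("INSERT OR IGNORE INTO tags (name) VALUES (?)", (tag,))
        tag_id = conn.execute(
            "SELECT id FROM tags WHERE name = ?", (tag,)).fetchone()[0]
        conn.execute(
            "INSERT OR IGNORE INTO item_tags VALUES (?, ?)", (item_id, tag_id))


def items_with_all(tag_names):
    """Return names of items carrying every requested tag (intersection)."""
    wanted = sorted(set(tag_names))
    marks = ",".join("?" * len(wanted))
    rows = conn.execute(
        f"""SELECT i.name FROM items i
            JOIN item_tags it ON it.item_id = i.id
            JOIN tags t ON t.id = it.tag_id
            WHERE t.name IN ({marks})
            GROUP BY i.id
            HAVING COUNT(DISTINCT t.name) = ?
            ORDER BY i.name""",
        (*wanted, len(wanted)),
    )
    return [row[0] for row in rows]
```

Each additional tag in the query is another dimension to intersect, and every lookup goes through two joins plus a grouping step. With flat files there is no obvious place to put such an index at all.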

I know that this is not strictly productive in the sense of code and patches, 
but maybe you have ideas or remarks that you want to share and that fit into 
the framework of this series. If so, you should probably talk to Tim 
O'Reilly and add to this series.

Kristian

-- 
Kristian Köhntopp <[EMAIL PROTECTED]>


Re: reiserfs crash

2006-05-09 Thread Devel
On Sat, 06 May 2006 11:43:58 +0400
"Vladimir V. Saveliev" <[EMAIL PROTECTED]> wrote:

> Hello
> 
> On Fri, 2006-05-05 at 16:34 +0200, Devel wrote:
> > On Fri, 05 May 2006 10:43:26 +0400
> > "Vladimir V. Saveliev" <[EMAIL PROTECTED]> wrote:
> > 
> > > Hello
> > > 
> > > On Thu, 2006-05-04 at 19:06 +0200, Devel wrote:
> > > > Hi All,
> > > > I'm testing reiser4 on a Linux box with kernel 2.6.16. This box writes
> > > > a lot of images to the reiser4 partition and deletes them afterwards.
> > > > After a while everything goes wrong and dmesg gives me this oops:
> > > > 
> > > > 
> > > > <4>reiser4[image_eraser.pl(2374)]: cbk_level_lookup 
> > > > (fs/reiser4/search.c:971)[vs-3533]:
> > > > WARNING: Keys are inconsistent. Fsck?
> > > > <4>reiser4[image_eraser.pl(2374)]: key_warning
> > > > (fs/reiser4/plugin/file_plugin_common.c:514)[nikita-717]: WARNING:
> > > > Error for inode 47326534 (-5)
> > > > Unable to handle kernel NULL pointer
> 
> -5 is the error code indicating an I/O error (EIO): a disk block could not
> be read from or written to the device.
> That is why I guessed that the hard drive is not reliable.
> 

I ran fsck on the partition and now the disk seems to work fine.
Maybe the disk was inconsistent?
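
As a quick way to check what a code like the -5 in the warning means, the number can be looked up in the errno table. This is a small Python sketch. The helper name `describe_kernel_error` is made up here, and the message text returned by `os.strerror` depends on the platform's C library.

```python
import errno
import os


def describe_kernel_error(code: int) -> str:
    """Map a kernel-style negative return code to 'NAME: message'."""
    number = abs(code)  # the kernel reports errors as negated errno values
    name = errno.errorcode.get(number, "unknown")
    return f"{name}: {os.strerror(number)}"


print(describe_kernel_error(-5))
```

The name printed for -5 is EIO, which matches the I/O error explanation in the quoted reply.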


> > > > dereference at virtual address  printing eip:
> > > > 
> > > > *pde = 
> > > > Oops:  [#1]
> > > > Modules linked in: bttv video_buf firmware_class compat_ioctl32
> > > > i2c_algo_bit v4l2_common btcx_risc ir_common tveeprom i2c_core videodev
> > > > video
> > > > CPU:0
> > > > EIP:0060:[<>]Not tainted VLI
> > > > EFLAGS: 00010282   (2.6.16.5 #1)
> > > > EIP is at rest_init+0x3feffde0/0x1e
> > > > eax:    ebx: d80c3d84   ecx: da670afc   edx: c03ee8e0
> > > > esi:    edi:    ebp: c01b74f1   esp: d80c3b58
> > > > ds: 007b   es: 007b   ss: 0068
> > > > Process image_eraser.pl (pid: 2374, threadinfo=d80c2000 task=df863a30)
> > > > Stack: <0>c01b74b5 d80c3d84   da670afc d80c3e38
> > > > d80c3bbc d80c3bbc c01b746c d80c3c18 c01b750e d80c3d84  
> > > > da670afc d80c3e38 d80c3bbc c01b77a4 d80c3d84 da670afc d80c3e38 d80c3bbc
> > > > 0002
> > > > Call Trace:
> > > >  [] kill_units+0x49/0x53
> > > >  [] kill_units+0x0/0x53
> > > >  [] kill_head+0x1d/0x24
> > > >  [] prepare_for_compact+0x1e2/0x406
> > > >  [] reiser4_get_neighbor+0x75/0x261
> > > >  [] jload_gfp+0x112/0x124
> > > >  [] kill_node40+0x23/0x9a
> > > >  [] lock_carry_node_tail+0x16/0x18
> > > >  [] carry_cut+0x3f/0x53
> > > >  [] carry_on_level+0x30/0xaa
> > > >  [] carry+0x79/0x169
> > > >  [] kill_node_content+0x125/0x13e
> > > >  [] cut_tree_worker_common+0x196/0x2e8
> > > >  [] cut_tree_worker_common+0x0/0x2e8
> > > >  [] cut_tree_object+0xae/0x149
> > > >  [] create_item_node40+0x1fc/0x258
> > > >  [] znode_make_dirty+0x40/0x50
> > > >  [] cut_file_items+0xdb/0x174
> > > >  [] shorten_file+0x29/0x1d7
> > > >  [] update_file_size+0x0/0x61
> > > >  [] truncate_file_body+0x63/0x6f
> > > >  [] delete_object_unix_file+0x33/0xa6
> > > >  [] reiser4_delete_inode+0x83/0x9d
> > > >  [] reiser4_delete_inode+0x0/0x9d
> > > >  [] generic_delete_inode+0x56/0xb4
> > > >  [] iput+0x63/0x66
> > > >  [] do_unlinkat+0xb4/0xf9
> > > >  [] sys_unlink+0xb/0xe
> > > >  [] sysenter_past_esp+0x54/0x75
> > > > Code:  Bad EIP value.
> > > > 
> > > > If I reboot the machine, this error repeats again!
> > > > Do I have to run fsck to resolve this error?
> > > > Thanks
> > > > 
> > > 
> > > It looks like your harddrive is not very reliable. Would you please try
> > > to experiment with another box/harddrive?
> > > 
> > > 
> > 
> > The hard drive is a Maxtor, so I tested it with Maxtor's PowerMax hard
> > drive tools, and it passed. I also ran the SMART long self-test
> > (smartctl -t long /dev/hda), which passed as well. Why do you say "It
> > looks like your harddrive is not very reliable"?
> > 
> 
> OK. But nevertheless I would like you to try to reproduce the problem on
> other hardware to be sure that it is a software bug.
> 
> 

I will set up another Linux box to try to reproduce the errors!
Thanks a lot and good work!