Re: mbuf cache
Petri Helenius wrote:
> > This also has the desirable side effect that stack processing will occur on the same CPU as the interrupt processing occurred. This avoids inter-CPU memory bus arbitration cycles, and ensures that you won't engage in a lot of unnecessary L1 cache busting. Hence I prefer this method to polling.
>
> Anywhere I could read up on the associated overhead and how the whole stuff works out in the worst case where data is DMAd into memory, read up to CPU1 and then to CPU2 and then discarded and if there would be any roads that can be taken to optimize this.

Not really. If there were a good resource on this, people would have read it already, and some of the code that has been rewritten or replaced would never have been written the way it was in the first place. 8-).

You can read technical papers on a lot of topics. Some contain information that has been known to the academic community since their publication, but has yet to make it into a commercial OS, let alone a free one like Linux or FreeBSD.

Basically, it's an experience thing. John Lemon, who did the direct dispatch work, is a Cisco employee. He knows what he knows because he has a lot of experience. Luigi Rizzo, who did the polling code, is a tenured university professor in Italy. He knows what he knows because he has a lot of experience. I did the soft interrupt coalescing code, and did a couple of the patches to add polling support to some of the ethernet drivers, etc. I'm a voracious reader of research papers, and I've been a Novell, Artisoft, Whistle Communications, IBM, ClickArray, etc. employee. Bill Paul, who did most of the network drivers, did nothing but eat, breathe, and sleep network drivers for years. Etc., etc.

If you are asking for paper references, then I can at least tell you where to start; go to:

http://citeseer.nj.nec.com/cs

and look for Jeff Mogul, DEC Western Research Laboratories, Mohit Aron, Peter Druschel, Sally Floyd, Van Jacobson, SCALA, TCP Rate halving, Receiver Livelock, Rice University, Duke University, University of Utah. That will at least get you most of the papers. Then follow the references to the other papers.

> > You will get much better load capacity scaling out of two cheaper boxes, if you implement correctly, IMO.
>
> Synchronization of the unformatted data can probably never get as good as it gets if you optimize the system for your case. But I agree it should be better than it is now, however it does not really seem to get any better. (unless you consider the EV7 and Opteron approaches better than the current Intel approach)

It's a lot of work to do it right. SVR4.2 and up doesn't do it right, despite their indirect claims to scale to 32 CPUs in the SCO vs. IBM suit recently filed. The secret recipe, if there is one, is probably lock avoidance through algorithm choice, rather than better locking or finer grained locking, etc.

Even then, you are usually talking a scaling factor of almost 10 times on any stall external to the CPU chip itself, because of bus speeds, and that's on the best hardware. For a 3GHz CPU with a 133MHz front side bus, that's more like 23 times. If it's I/O bus, then you are talking 46 times.

It's really, really ugly once you get out of the L1 cache...

-- Terry

To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
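The 23x figure in the message above is simply the CPU clock divided by the bus clock. A minimal sketch of that arithmetic follows, in case it helps to plug in other clock rates. The function name is made up for illustration, and the 66 MHz I/O bus clock is an assumption: the post gives only the resulting factor of 46, which is 2 x 23, and a 66 MHz bus works out to roughly 45.

```sh
#!/bin/sh
# Rounded ratio of CPU clock to bus clock, both given in MHz.
# Integer arithmetic only; adding half the divisor rounds to nearest.
# clock_ratio is a hypothetical helper, not something from the thread.
clock_ratio() {
    cpu_mhz=$1
    bus_mhz=$2
    echo $(( (2 * cpu_mhz + bus_mhz) / (2 * bus_mhz) ))
}

# Figures from the post: 3 GHz CPU, 133 MHz front side bus.
clock_ratio 3000 133

# Assumption: an I/O bus clocked at about half the FSB rate (66 MHz).
clock_ratio 3000 66
```

Each stalled bus cycle therefore costs on the order of that many CPU cycles, which is the point being made about leaving the L1 cache.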
Re: Create linker.hints at boot
On Fri, Mar 14, 2003 at 07:27:42PM -0800, Peter Wemm wrote:
> Crist J. Clark wrote:
> > Perhaps it would be a good idea to build a linker.hints file with kldxref(8) at boot time. At least, I can't think of any really good reasons why _not_ to do it.
>
> Yes, we need to do this, but your patch needs a little more work. Specifically.. There is a linker.hints file in each directory in the module path, not just /boot/kernel. You need to look at the kern.module_path sysctl to find the search path.
>
> [EMAIL PROTECTED]:26pm]~-101 sysctl -n kern.module_path
> /boot/kernel;/boot/kernel;/boot/modules;/modules

Ah. There that is.

> This also needs to be robust in the case where /boot might be another file system or readonly or NFSROOT or not even mounted, or something.

I think the easiest thing to do here is that if the kldxref(8) command fails, it shouldn't kill the boot. Specifically, force the script to always exit with success. Anything failing here is simply not bad enough that we should interrupt the boot.

In the case of a read-only filesystem, we get an error message like,

  Building /foo/modules/linker.hints
  kldxref: /foo/lhint.caF5Wl: Read-only file system

For non-existent directories, kldxref(8) actually doesn't return a failure anyway,

  # kldxref /nonexistent
  # echo $?
  0

Someone who mounts root read-only or boots from a partition that doesn't get mounted is someone who had better know what they are doing. If they don't like the error messages, they have the knob to disable the script completely, or they can use the option I have added for an rc.conf(5)-specified directory list in place of kern.module_path.

There is always going to be that small fraction of real or imagined users doing some wild things that won't fit into the startup scripts no matter how many knobs we give them. Making things overly complicated for that small percentile just confuses the bulk of the users and makes more room for more bugs. KISS.
Just turn this off if you don't want to or can't use it.

-- 
Crist J. Clark | [EMAIL PROTECTED] | [EMAIL PROTECTED]
http://people.freebsd.org/~cjc/ | [EMAIL PROTECTED]

Index: src/etc/defaults/rc.conf
===================================================================
RCS file: /export/freebsd/ncvs/src/etc/defaults/rc.conf,v
retrieving revision 1.170
diff -u -r1.170 rc.conf
--- src/etc/defaults/rc.conf	15 Mar 2003 08:14:42 -0000	1.170
+++ src/etc/defaults/rc.conf	17 Mar 2003 08:25:09 -0000
@@ -28,6 +28,9 @@
 apmd_enable=NO		# Run apmd to handle APM event from userland.
 apmd_flags=		# Flags to apmd (if enabled).
 devd_enable=NO		# Run devd, to trigger programs on device tree changes.
+kldxref_enable=NO	# Build linker.hints files with kldxref(8).
+kldxref_clobber=NO	# Overwrite old linker.hints at boot.
+kldxref_module_path=	# Override kern.module_path. A ';'-delimited list.
 pccard_enable=NO	# Set to YES if you want to configure PCCARD devices.
 pccard_mem=DEFAULT	# If pccard_enable=YES, this is card memory address.
 pccard_beep=2		# pccard beep type.
Index: src/etc/rc.d/network1
===================================================================
RCS file: /export/freebsd/ncvs/src/etc/rc.d/network1,v
retrieving revision 1.145
diff -u -r1.145 network1
--- src/etc/rc.d/network1	12 Feb 2003 04:26:10 -0000	1.145
+++ src/etc/rc.d/network1	15 Mar 2003 00:36:05 -0000
@@ -4,7 +4,7 @@
 #
 
 # PROVIDE: network1
-# REQUIRE: atm1 ipfilter mountcritlocal pccard serial sppp sysctl tty
+# REQUIRE: atm1 ipfilter kldxref mountcritlocal pccard serial sppp sysctl tty
 # KEYWORD: FreeBSD
 
 . /etc/rc.subr
Index: src/etc/rc.d/kldxref
===================================================================
RCS file: src/etc/rc.d/kldxref
diff -N src/etc/rc.d/kldxref
--- /dev/null	1 Jan 1970 00:00:00 -0000
+++ src/etc/rc.d/kldxref	17 Mar 2003 08:23:09 -0000
@@ -0,0 +1,35 @@
+#!/bin/sh
+#
+# $FreeBSD:$
+#
+
+# PROVIDE: kldxref
+# REQUIRE: root
+# BEFORE: network1
+# KEYWORD: FreeBSD
+
+. /etc/rc.subr
+
+rcvar=kldxref_enable
+name=kldxref
+stop_cmd=:
+start_cmd=kldxref_start
+
+kldxref_start () {
+	if [ -z $kldxref_module_path ]; then
+		MODULE_PATHS=`sysctl -n kern.module_path`
+	else
+		MODULE_PATHS=$kldxref_module_path
+	fi
+	IFS=';'
+	for KERNDIR in $MODULE_PATHS; do
+		if [ ! -f $KERNDIR/linker.hints ] ||
+		    checkyesno kldxref_clobber; then
+			echo Building $KERNDIR/linker.hints
+			kldxref $KERNDIR
+		fi
+	done
+}
+
+load_rc_config $name
+run_rc_command $1
Re: Create linker.hints at boot
On Mon, Mar 17, 2003 at 12:28:34AM -0800, Crist J. Clark wrote:
> Index: src/etc/rc.d/kldxref
> ===================================================================
> RCS file: src/etc/rc.d/kldxref
> diff -N src/etc/rc.d/kldxref
> --- /dev/null	1 Jan 1970 00:00:00 -0000
> +++ src/etc/rc.d/kldxref	17 Mar 2003 08:23:09 -0000
> @@ -0,0 +1,35 @@
> +#!/bin/sh
> +#
> +# $FreeBSD:$
> +#
> +
> +# PROVIDE: kldxref
> +# REQUIRE: root
> +# BEFORE: network1
> +# KEYWORD: FreeBSD

This has to require mountcritlocal if it is ever to be useful on ia64, because /boot is the mount point for the EFI filesystem where we have the kernel and modules. This is standard behaviour and one of the cases Peter mentioned.

-- 
Marcel Moolenaar  USPA: A-39004  [EMAIL PROTECTED]
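Pulling the review comments in this thread together, the loop from the patch could be sketched roughly as below. This is not the committed script, only an illustration under stated assumptions: the function name rebuild_hints and the KLDXREF_CMD override are hypothetical, the rc.subr plumbing is left out so the fragment runs anywhere, and the quoting, the restored IFS, and the skipping of missing directories are hardening choices of this sketch rather than anything posted. The REQUIRE line follows Marcel's comment, and the "never fail" behaviour follows Crist's.

```sh
#!/bin/sh
# PROVIDE: kldxref
# REQUIRE: mountcritlocal   (as suggested above, so /boot is mounted on ia64)
# BEFORE: network1
# KEYWORD: FreeBSD

# Hypothetical override so the loop can be exercised without kldxref(8).
: "${KLDXREF_CMD:=kldxref}"

# rebuild_hints "<';'-delimited module path>" "<YES|NO clobber>"
# Rebuilds linker.hints in every existing directory of the path.
rebuild_hints() {
    module_paths=$1     # e.g. the output of: sysctl -n kern.module_path
    clobber=$2          # YES overwrites an existing linker.hints

    old_ifs=$IFS
    IFS=';'
    set -f              # do not glob while splitting on ';'
    for kerndir in $module_paths; do
        IFS=$old_ifs    # the word list is already split; restore early
        [ -d "$kerndir" ] || continue   # not mounted or not present
        if [ ! -f "$kerndir/linker.hints" ] || [ "$clobber" = "YES" ]; then
            echo "Building $kerndir/linker.hints"
            # kldxref prints its own error (read-only fs, for example);
            # a failure here must not interrupt the boot.
            "$KLDXREF_CMD" "$kerndir" || true
        fi
    done
    set +f
    IFS=$old_ifs
    return 0
}
```

In an rc.d script the call would presumably look like `rebuild_hints "$(sysctl -n kern.module_path)" NO`, with the second argument derived from kldxref_clobber.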
source upgrade broken?
Gentlemen,

Please correct me if I am wrong, but it appears that the source upgrade path from 4.* to 5.0 is broken. I haven't played with it much, but it appears that building the kernel depends on some things new to the -current compiler, and the compiler is dependent on stuff in the 5-current kernel. I realize that with all the stuff that's been ripped out of 5.0 and added, a clean install is probably the best way to go. I am just curious whether it truly is borked or whether it is just me.

Robert Garrett
[no subject]
subscribe Buurtnet Oldambt 69 3524 BD Utrecht 030-2898094 06-17058904 [EMAIL PROTECTED] To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-current in the body of the message
Re: source upgrade broken?
I second that problem. I tried doing an upgrade yesterday, and it didn't work: a missing libc.so.4 error was given during make installworld.

Scott

----- Original Message -----
From: Robert Garrett [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Sunday, March 16, 2003 3:53 AM
Subject: source upgrade broken?

> Gentlemen,
>
> Please correct me if I am wrong, but it appears that the source upgrade path from 4.* to 5.0 is broken. I haven't played with it much, but it appears that building the kernel depends on some things new to the -current compiler, and the compiler is dependent on stuff in the 5-current kernel. I realize that with all the stuff that's been ripped out of 5.0 and added, a clean install is probably the best way to go. I am just curious whether it truly is borked or whether it is just me.
>
> Robert Garrett
Re: mbuf cache
> If you are asking for paper references, then I can at least tell you where to start; go to: http://citeseer.nj.nec.com/cs and look for Jeff Mogul, DEC Western Research Laboratories, Mohit Aron, Peter Druschel, Sally Floyd, Van Jacobson, SCALA, TCP Rate halving, Receiver Livelock, RICE University, Duke University, University of Utah. That will at least get you most of the papers. Then follow the references to the other papers.

These seem quite network-heavy. I was more interested in references on SMP: how coherency is maintained, what the overhead of maintaining it is for read and write operations, and how alignment helps or hurts you with different word sizes on the IA32 architecture.

Writing a coarse SMP memory benchmark should be easy; I wonder if it has been done?

Judging from the profiling I've done on both kernel and userland things, copying memory around is among the most expensive things to do on modern multi-GHz machines. Doing arithmetic to decrease memory bandwidth requirements pays off very well. The thing I'm still wondering about is how expensive writing is compared to reading.

Pete
Re: Hang on Boot (still)
On Sun, 16 Mar 2003, Lucas Reddinger wrote:
> The one alternative would be to compile a stripped kernel on another machine, and install off of it. I did this, but I do not have enough knowledge of the 5.x kernel/modules to be able to do this myself. If someone could give me some help with this instead, it would be greatly appreciated.

Try setting this from the loader:

hw.eisa_slots=0 (or 1).

-- 
| Matthew N. Dodd | '78 Datsun 280Z | '75 Volvo 164E | FreeBSD/NetBSD |
| [EMAIL PROTECTED] | 2 x '84 Volvo 245DL | ix86,sparc,pmax |
| http://www.jurai.net/~winter | For Great Justice! | ISO8802.5 4ever |
Re: source upgrade broken?
Are you guys precisely following the instructions in src/UPDATING?

On Mon, 17 Mar 2003, Scott Sipe wrote:
> I second that problem. Tried doing an upgrade yesterday, and it didn't work--missing libc.so.4 error given during make installworld.
>
> Scott
>
> ----- Original Message -----
> From: Robert Garrett [EMAIL PROTECTED]
> To: [EMAIL PROTECTED]
> Sent: Sunday, March 16, 2003 3:53 AM
> Subject: source upgrade broken?
>
> > Gentlemen,
> >
> > Please correct me if I am wrong, but it appears that the source upgrade path from 4.* to 5.0 is broken. I haven't played with it much, but it appears that building the kernel depends on some things new to the -current compiler, and the compiler is dependent on stuff in the 5-current kernel. I realize that with all the stuff that's been ripped out of 5.0 and added, a clean install is probably the best way to go. I am just curious whether it truly is borked or whether it is just me.
> >
> > Robert Garrett

-- 
This .signature sanitized for your protection
Re: crash: bwrite: need chained iodone
On 2003-03-12, Jeff Roberson wrote:
> Can you please print bp? I'd like to know what all of the members are. A cluster buf should NEVER have BX_BKGRDWRITE set. This is totally bogus.

Got that crash again, with sync-on-panic disabled. The interesting thing is that the stack trace might be corrupted or inaccurate (maybe some tail recursion optimisation or inlining is going on): although it seems to indicate that the panic is the one from "bwrite: need chained iodone" (which is absurd, as we saw, since bp->b_xflags == 0), the panic message is "buffer is not busy???" from bwrite, and indeed we can see that this is the case (see the print of bp->b_lock).

Thomas.

Script started on Mon Mar 17 12:06:57 2003
This is ZSH 4.0.6 on a xterm-color ([EMAIL PROTECTED])
/var/crash # gdb -k /usr/obj/usr/src/sys/MALEVIL/kernel.debug vmcore.7
GNU gdb 5.2.1 (FreeBSD)
Copyright 2002 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-undermydesk-freebsd"...
panic: bwrite: buffer is not busy???
panic messages:
---
panic: bwrite: buffer is not busy???
Uptime: 3d17h25m5s
Dumping 511 MB
ata0: resetting devices .. done
 16 32 48 64 80 96 112 128 144 160 176 192 208 224 240 256 272 288 304 320 336 352 368 384 400 416 432 448 464 480 496
---
#0  doadump () at /usr/src/sys/kern/kern_shutdown.c:239
239		dumping++;
(kgdb) bt
#0  doadump () at /usr/src/sys/kern/kern_shutdown.c:239
#1  0xc01f4698 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:371
#2  0xc01f4903 in panic () at /usr/src/sys/kern/kern_shutdown.c:542
#3  0xc0232072 in bwrite (bp=0xce531228) at /usr/src/sys/kern/vfs_bio.c:795
#4  0xc0232a7c in bawrite (bp=0x0) at /usr/src/sys/kern/vfs_bio.c:1138
#5  0xc023a02b in cluster_wbuild (vp=0xc5ff8db0, size=16384, start_lbn=21, len=4)
    at /usr/src/sys/kern/vfs_cluster.c:996
#6  0xc02396ff in cluster_write (bp=0xce6237a8, filesize=378368, seqcount=4)
    at /usr/src/sys/kern/vfs_cluster.c:596
#7  0xc02e3fec in ffs_write (ap=0xe5d49be0) at /usr/src/sys/ufs/ffs/ffs_vnops.c:728
#8  0xc024e1b2 in vn_write (fp=0xc5f56618, uio=0xe5d49c7c, active_cred=0xc5f91480,
    flags=0, td=0xc4170a50) at vnode_if.h:417
#9  0xc0214008 in dofilewrite (td=0xc4170a50, fp=0xc5f56618, fd=0, buf=0x8c1a400,
    nbyte=0, offset=0, flags=0) at file.h:239
#10 0xc0213e49 in write (td=0xc4170a50, uap=0xe5d49d10)
    at /usr/src/sys/kern/sys_generic.c:329
#11 0xc033a68e in syscall (frame=
      {tf_fs = 47, tf_es = 148963375, tf_ds = -1078001617, tf_edi = 677204256,
       tf_esi = 0, tf_ebp = -1077939928, tf_isp = -439050892, tf_ebx = 677216484,
       tf_edx = 20, tf_ecx = 0, tf_eax = 4, tf_trapno = 0, tf_err = 2,
       tf_eip = 677548851, tf_cs = 31, tf_eflags = 518, tf_esp = -1077939988,
       tf_ss = 47}) at /usr/src/sys/i386/i386/trap.c:1030
#12 0xc032a89d in Xint0x80_syscall () at {standard input}:138
---Can't read userspace from dump, or kernel process---
(kgdb) fr 3
#3  0xc0232072 in bwrite (bp=0xce531228) at /usr/src/sys/kern/vfs_bio.c:795
795				panic("bwrite: need chained iodone");
(kgdb) list
790		    (bp->b_flags & B_ASYNC) &&
791		    !vm_page_count_severe() &&
792		    !buf_dirty_count_severe()) {
793			if (bp->b_iodone != NULL) {
794				printf("bp->b_iodone = %p\n", bp->b_iodone);
795				panic("bwrite: need chained iodone");
796			}
797	
798			/* get a new block */
799			newbp = geteblk(bp->b_bufsize);
(kgdb) print *bp
$1 = {b_io = {bio_cmd = 1, bio_dev = 0x, bio_disk = 0x0, bio_blkno = 18520608,
    bio_offset = 926699520, bio_bcount = 32768, bio_data = 0xd42ba000 ,
    bio_flags = 0, bio_error = 0, bio_resid = 0, bio_done = 0xc0235db0 <bufdonebio>,
    bio_driver1 = 0x0, bio_driver2 = 0x0, bio_caller1 = 0x0, bio_caller2 = 0xce531228,
    bio_queue = {tqe_next = 0x0, tqe_prev = 0x0}, bio_attribute = 0x0, bio_from = 0x0,
    bio_to = 0x0, bio_length = 0, bio_completed = 0, bio_children = 259, bio_inbed = 0,
    bio_parent = 0x0, bio_t0 = {sec = 0, frac = 0}, bio_task = 0, bio_task_arg = 0x0,
    bio_pblkno = 0}, b_op = 0xc03a89f8, b_magic = 280038160,
  b_iodone = 0xc0239320 <cluster_callback>, b_offset = 311296,
  b_vnbufs = {tqe_next = 0x0, tqe_prev = 0x0}, b_left = 0x0, b_right = 0x0,
  b_vflags = 0, b_freelist = {tqe_next = 0xce531fe8, tqe_prev = 0xc03dcb3c},
  b_qindex = 0, b_flags = 1677721604, b_xflags = 0 '\0',
  b_lock = {lk_interlock = 0xc03d74d8, lk_flags = 0, lk_sharecount = 0,
    lk_waitcount = 0, lk_exclusivecount = 0, lk_prio = 80,
    lk_wmesg = 0xc0379b53 "bufwait", lk_timo = 0, lk_lockholder = 0x,
    lk_newlock = 0x0},
Re: Create linker.hints at boot
Crist J. Clark wrote:
> Also, what's the best way/is there a way to figure out the boot directory rather than hardwire /boot/kernel?

dirname `sysctl -n kern.bootfile`

-- 
Daniel C. Sobral (8-DCS)
Gerencia de Operacoes
Divisao de Comunicacao de Dados
Coordenacao de Seguranca
TCO
Fones: 55-61-313-7654/Cel: 55-61-9618-0904
E-mail: [EMAIL PROTECTED] [EMAIL PROTECTED] [EMAIL PROTECTED]
Outros: [EMAIL PROTECTED] [EMAIL PROTECTED] [EMAIL PROTECTED]

To the systems programmer, users and applications serve only to provide a test load.
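The one-liner above works because kern.bootfile holds the full path of the booted kernel, so stripping the last path component leaves its directory. A small sketch follows that separates the path handling from the sysctl call so it can be tried on any system; the function name is an assumption for illustration, and the sample path is just the conventional default.

```sh
#!/bin/sh
# Directory holding the booted kernel, derived from the kernel's path.
# boot_kernel_dir is a hypothetical name used only in this sketch.
boot_kernel_dir() {
    # On a live FreeBSD system the argument would come from:
    #   sysctl -n kern.bootfile
    dirname "$1"
}

boot_kernel_dir /boot/kernel/kernel    # prints /boot/kernel
```

On the machine itself this would be called as `boot_kernel_dir "$(sysctl -n kern.bootfile)"`, which also covers kernels booted from a non-default directory.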
Re: Create linker.hints at boot
Crist J. Clark wrote:
> Perhaps it would be a good idea to build a linker.hints file with kldxref(8) at boot time. At least, I can't think of any really good reasons why _not_ to do it.
> [...]
> This is my first stab at rc-ng for a long while, so please be gentle if I've not handled that the best way. Patches attached.

Now that I've read it, I wonder what will happen in the cases where you have libraries nfs-mounted. This runs before any remote fs is mounted.

-- 
Daniel C. Sobral (8-DCS)
Re: Create linker.hints at boot
Daniel C. Sobral wrote:
> Crist J. Clark wrote:
> > Perhaps it would be a good idea to build a linker.hints file with kldxref(8) at boot time. At least, I can't think of any really good reasons why _not_ to do it.
> > [...]
> > This is my first stab at rc-ng for a long while, so please be gentle if I've not handled that the best way. Patches attached.
>
> Now that I've read it, I wonder what will happen in the cases where you have libraries nfs-mounted. This runs before any remote fs is mounted.

Err... M... Too early in the morning, not enough caffeine. Just ignore me... :-)

-- 
Daniel C. Sobral (8-DCS)
Re: wi driver
In message: [EMAIL PROTECTED] Sam Leffler [EMAIL PROTECTED] writes:
: * Alfred Perlstein [EMAIL PROTECTED] [030316 21:19] wrote:
: um..
:
: ...
: 840 _FLAGS_OUTRANGE) {
: 841         WI_UNLOCK(sc);
: 842         return;
: 843 }
: 844 KASSERT((ifp->if_flags & IFF_OACTIVE) == 0,
: 845     ("wi_start: if_flags %x\n", ifp->if_flags));
: 846
: 847 memset(&frmhdr, 0, sizeof(frmhdr));
:
:
: What's up here?
:
: It's a race, we shouldn't be inspecting the ifp without a lock.
:
: I think this kassert should be removed.
:
: Do you guys concur?
:
: Warner has this pending with some other fixes; perhaps he can accelerate
: doing the commit? The assert is actually just bogus (if_start can be called
: under certain conditions with IFF_OACTIVE set).

This is part of my commit. The KASSERT is totally bogus.

Warner
Re: wi driver
I've just committed the right fix for this (which is to nuke the bogus KASSERT). I have one or two other fixes in the pipe for lucent cards, but had hoped to get them 'perfect' rather than 'a lot better' before committing them. Since my time has been short, I'll go ahead and try to commit the 'better' ones today and work towards making those more perfect in the fullness of time.

Warner
Re: source upgrade broken?
On Mon, Mar 17, 2003 at 03:10:08AM -0800, Doug Barton wrote:
> Are you guys precisely following the instructions in src/UPDATING?

Most definitely. The new compiler depends on new syscalls in the kernel, and the kernel depends on new options in the compiler. I could not find anything about this situation in UPDATING. I'm doing a fresh checkout of the -current tree while I'm writing this.

I believe the procedure I followed was:

cvs co src -t .

The gperf rebuild bombed due to missing includes, so I went ahead and tried make buildworld, which bombed on "tar is a directory" during the clean phase. I wiped my obj directory:

rm -rf /usr/obj
cd /usr/src
make buildworld

and it bombed on:

rm -f tar addext.o argmatch.o backupfile.o basename.o dirname.o error.o exclude.o full-write.o getdate.o getline.o getopt.o getopt1.o getstr.o hash.o human.o mktime.o modechange.o prepargs.o print-copyr.o quotearg.o safe-read.o save-cwd.o savedir.o unicodeio.o xgetcwd.o xmalloc.o xstrdup.o xstrtoul.o xstrtoumax.o buffer.o compare.o create.o delete.o extract.o incremen.o list.o mangle.o misc.o names.o rtapelib.o tar.o update.o tar.1.cat
rm: tar: is a directory
*** Error code 1

This is, of course, today's -current as of 5:30 a.m. CST.

Rob

> On Mon, 17 Mar 2003, Scott Sipe wrote:
> > I second that problem. Tried doing an upgrade yesterday, and it didn't work--missing libc.so.4 error given during make installworld.
> >
> > Scott
> >
> > ----- Original Message -----
> > From: Robert Garrett [EMAIL PROTECTED]
> > To: [EMAIL PROTECTED]
> > Sent: Sunday, March 16, 2003 3:53 AM
> > Subject: source upgrade broken?
> >
> > > Gentlemen,
> > >
> > > Please correct me if I am wrong, but it appears that the source upgrade path from 4.* to 5.0 is broken. I haven't played with it much, but it appears that building the kernel depends on some things new to the -current compiler, and the compiler is dependent on stuff in the 5-current kernel. I realize that with all the stuff that's been ripped out of 5.0 and added, a clean install is probably the best way to go. I am just curious whether it truly is borked or whether it is just me.
> > >
> > > Robert Garrett
>
> --
> This .signature sanitized for your protection
Re: mbuf cache
Petri Helenius wrote:
> > [ ... Citeseer search terms for professional strength networking ... ]
>
> These seem quite network-heavy, I was more interested in references of SMP stuff and how the coherency is maintained and what is the overhead of maintaining the coherency in read/write operations and how alignment helps/screws you with different word-sizes in IA32 architecture.

Ah. I misunderstood. I thought you meant networking specifically, because of the receiver livelock discussion context. Let me change my answer... 8-).

Generally, there are no reference works available online unless you are an IEEE member. One of my favorite printed references is a special order title, which is a collection of IEEE proceedings directly on the topic:

  Scheduling and Load Balancing in Parallel and Distributed Systems
  Behrooz A. Shirazi (Editor), Ali R. Hurson (Editor), Krishna M. Kavi (Editor)
  Wiley-IEEE Press; 1st edition (April 30, 1995)
  ISBN: 0818665874

It usually costs about US$30. A few other good books (more general, but some coverage) are:

  UNIX(R) Systems for Modern Architectures: Symmetric Multiprocessing and Caching for Kernel Programmers
  Curt Schimmel
  Addison-Wesley Pub Co; 1st edition (June 30, 1994)
  ISBN: 0201633388

  UNIX Internals: The New Frontiers
  Uresh Vahalia
  Prentice Hall; 1st edition (October 23, 1995)
  ISBN: 0131019082

  Solaris Internals: Core Kernel Architecture
  Jim Mauro, Richard McDougall
  Prentice Hall PTR; 1st edition (October 5, 2000)
  ISBN: 0130224960

The first is usually about US$70; the second is usually about US$70; the third is usually about US$60. Note: I am biased about the second; I did technical review on it for Prentice Hall before it was published, and am mentioned in it, so fair warning ;^).

You might also check out:

  The Magic Garden Explained: The Internals of Unix System V Release 4: An Open Systems Design
  Berny Goodheart, James Cox, John R. Mashey
  Prentice Hall; (August 1994)
  ASIN: 0130981389

I know Mashey in passing, and I know the other two by reputation; the book is a good SVR4 book, but the index royally sucks: every time I went to look something up that I was interested in seeing, I couldn't find it. Also, IMO, SVR4 is not that hot. I'm sure this book will end up introduced into evidence, though ;^).

> Writing a coarse SMP memory benchmark should be easy, I wonder if it has been done?

Sure. Intel has written a lot of them in assembly language in their server product division, and then shared only the results. 8-).

Actually, Intel has a couple of good publications on compiler design for the P4 (basically they say "don't do what GCC does"); they would also apply to higher level design, I think. You can find them as PDFs on their web site under P4 programming for Hyperthreading.

> Judging from the profiling I've done on both kernel and userland things, copying memory around is among the most expensive things to do in modern multi-GHz machines. Doing arithmetic to decrease memory bandwidth requirements pays off very well. The thing I'm still wondering about is how expensive is writing compared to reading.

Depends on your cache configuration; write-through sucks, if the other CPU's L1 is aware of the memory you are touching. The L2 cache can hide some of it otherwise, up to the point that it has to flush pages over the 133MHz DRAM bus. 8-(.

You should expect data copying to be *the* most expensive thing, short of device I/O, BTW. Arbitration is a bugger, and moving data anywhere other than from L1 to L1 on the same CPU is going to cost you an arm and a leg. 8-(. This is why people are so hot-to-trot about zero copy TCP, and zero copy NFS and zero copy floor wax... ;^).

-- Terry
Re: source upgrade broken?
On Mon, Mar 17, 2003 at 07:44:02AM -0600, Robert Garrett wrote:
> On Mon, Mar 17, 2003 at 03:10:08AM -0800, Doug Barton wrote:
> > Are you guys precisely following the instructions in src/UPDATING?
>
> most definately, the new compiler depends on new syscalls in the kernel, and the kernel depends on new options in the compiler. I could not find anything about this situation in UPDATING. I'm doing a fresh checkout of the -current tree while I'm writing this.
>
> I believe the procedure I followed was
> cvs co src -t .
> gperf rebuild bombed due to missing includes.. went ahead and tried make buildworld
> bombed on tar is a directory during the clean phase..
> wasted my obj directory..
> rm -rf /usr/obj
> cd /usr/src
> make buildworld
> and bombed on
> rm -f tar addext.o argmatch.o backupfile.o basename.o dirname.o error.o exclude.o full-write.o getdate.o getline.o getopt.o getopt1.o getstr.o hash.o human.o mktime.o modechange.o prepargs.o print-copyr.o quotearg.o safe-read.o save-cwd.o savedir.o unicodeio.o xgetcwd.o xmalloc.o xstrdup.o xstrtoul.o xstrtoumax.o buffer.o compare.o create.o delete.o extract.o incremen.o list.o mangle.o misc.o names.o rtapelib.o tar.o update.o tar.1.cat
> rm: tar: is a directory
> *** Error code 1
>
> this is of course todays current as of 5:30 a.m CST

You need to either do your checkouts with the -P (prune) option or do an update afterwards with it (i.e. cvs update -d -P) so tar isn't bogusly a directory.

-- Brooks

-- 
Any statement of the form "X is the one, true Y" is FALSE.
PGP fingerprint 655D 519C 26A7 82E7 2529 9BF0 5D8E 8BE9 F238 1AD4
Re: source upgrade broken?
On Mon, Mar 17, 2003 at 07:51:34AM -0800, Brooks Davis wrote:
> You need to either do your checkouts with the -P (prune) option or do an update afterwards with it (i.e. cvs update -d -P) so tar isn't bogusly a directory.

Yes, I forgot the -P option for cvs, and I also used the old method of building a kernel. Using make buildkernel has apparently corrected the issue that I was having... old habits die hard :)

Rob
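For anyone wanting to see what -P would clean up before running it, here is a minimal sketch. It assumes that a prunable directory is one whose only entry is the CVS/ administrative subdirectory; the function name list_prunable_dirs is made up for illustration and is not part of cvs.

```shell
# list_prunable_dirs ROOT   (hypothetical helper, not a cvs command)
# Print every directory under ROOT that contains nothing except a CVS/
# administrative subdirectory, i.e. the kind of leftover that
# "cvs update -d -P" is expected to prune.
list_prunable_dirs() {
    # -prune keeps find from descending into the CVS/ directories themselves.
    find "$1" -name CVS -prune -o -type d -print |
    while IFS= read -r dir; do
        # Count entries other than CVS. grep -c exits 1 when the count
        # is zero, hence the "|| true".
        others=$(ls -A "$dir" | grep -cv '^CVS$' || true)
        if [ "$others" -eq 0 ] && [ -d "$dir/CVS" ]; then
            printf '%s\n' "$dir"
        fi
    done
}
```

Running list_prunable_dirs /usr/src on a tree checked out without -P should list leftovers like the empty tar directory described above, assuming that directory holds only its CVS/ subdirectory.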
HEADS UP: Don't upgrade your Alphas!
Hi! Hold off upgrading your Alphas for a moment. Something broke libc recently that results in (at least) floating point exceptions from awk(1) (this is not related to today's awk upgrade). I've been able to reproduce this on beast.freebsd.org by building a fresh libc.a, linking awk with it, and running a test case. I haven't been able to reproduce this with the 8th March libc, so the time window for the breakage is small. I suspect the recent gdtoa commit to libc; we will know in less than an hour.

Cheers,
-- 
Ruslan Ermilov          Sysadmin and DBA,
[EMAIL PROTECTED]       Sunbay Software AG,
[EMAIL PROTECTED]       FreeBSD committer,
+380.652.512.251        Simferopol, Ukraine

http://www.FreeBSD.org  The Power To Serve
http://www.oracle.com   Enabling The Information Age
Re: HEADS UP: Don't upgrade your Alphas!
Yes, as I suspected, the gdtoa change is responsible for the breakage. A libc built from this lib/libc works:

cvs -q up -P -d -D'2003/03/12 20:20:00'
: Using /home/ru/w/f/usr.bin/awk/nawk nawk...

This version, together with contrib/gdtoa, doesn't:

cvs -q up -P -d -D'2003/03/12 20:29:59'
: Using /home/ru/w/f/usr.bin/awk/nawk nawk...
: nawk: floating point exception 8
: input record number 325, file
: source line number 84

To see the breakage, install the new libc and (assuming that /usr/bin/nawk is dynamically linked) run make in usr.bin/truss; this will run the awk(1) script that exhibits one of these FPEs.

P.S. Hmm, I didn't test this on i386, as I found this bug when attempting to produce a cross-release of i386 on Alpha, so i386 may be affected too. Will see.

On Sat, Mar 15, 2003 at 01:47:05AM -0800, David Schultz wrote:
> das 2003/03/15 01:47:05 PST
>
> FreeBSD src repository
>
> Removed files:
>   lib/libc/stdlib strtod.c
> Log:
> The gdtoa import apparently hasn't caused anything or anyone to explode, so nix the old strtod() / dtoa(). This change is part of the gdtoa patches reviewed on [EMAIL PROTECTED]
>
> Revision  Changes    Path
> 1.26      +0 -2429   src/lib/libc/stdlib/strtod.c (dead)

Cheers,
-- Ruslan Ermilov
Re: HEADS UP: Don't upgrade your Alphas!
On Mon, Mar 17, 2003 at 06:40:37PM +0200, Ruslan Ermilov wrote:
> Yes, as I have suspected, the gdtoa change is responsible for a breakage. libc corresponding to this lib/libc works:
>
> cvs -q up -P -d -D'2003/03/12 20:20:00'
> : Using /home/ru/w/f/usr.bin/awk/nawk nawk...
>
> This version, together with contrib/gdtoa, doesn't:
>
> cvs -q up -P -d -D'2003/03/12 20:29:59'

Aye, I should have added PST to the -D specifications.

Cheers,
-- Ruslan Ermilov
Re: HEADS UP: Don't upgrade your Alphas!
On Mon, Mar 17, 2003 at 06:43:11PM +0200, Ruslan Ermilov wrote:
> Aye, should have added the PST to the D specifications.

*Blush*. I've mangled the dates. The correct date specifiers that I tested with are:

86 8:22am cvs -q up -P -d -D'2003/03/12 12:23:00 PST'
98 8:26am cvs -q up -P -d -D'2003/03/12 12:32:00 PST'

Doesn't change anything except this though. ;)

Cheers,
-- Ruslan Ermilov
Re: Why did INVARIANTS hide the geom bug?
Poul-Henning Kamp wrote:
> In message [EMAIL PROTECTED], walt writes:
> > If inclusion of INVARIANTS serves to disguise bugs in the kernel, I wonder if kernel committers should be using this option routinely?
>
> Please check into our current reality :-)

Hm. How do I parse that sentence? If you are implying (as it says in NOTES) that INVARIANTS are not enabled by default then my question is certainly a stupid one. However, when I look at the GENERIC kernel config file I see

options INVARIANTS
options INVARIANT_SUPPORT

so what am I to think? Do most kernel committers run a GENERIC kernel as the FBSD website says? Does anyone take a poll occasionally? Did I miss your point entirely?

> Suggest you check what INVARIANTS actually do.

Looking at the code thru my amateur eyes it appears that defining INVARIANTS allows the programmer to add whatever code he wishes with an ifdef statement. That covers a lot of territory. Looking thru sys/geom I don't see any such ifdefs in your code, so I still don't know why the recent geom bug was hidden by INVARIANTS.

Hope you're feeling better :-)
Re: Create linker.hints at boot
On Mon, Mar 17, 2003 at 12:28:34AM -0800, Crist J. Clark wrote:
> +kldxref_start () {
> +	if [ -z "$kldxref_module_path" ]; then
> +		MODULE_PATHS=`sysctl -n kern.module_path`
> +	else
> +		MODULE_PATHS=$kldxref_module_path
> +	fi

Please change the logic to positive logic:

	if [ -n "$kldxref_module_path" ]; then
		MODULE_PATHS=$kldxref_module_path
	else
		MODULE_PATHS=`sysctl -n kern.module_path`
	fi
.2 isn't a valid double and other problems
hi!

cvsup and build (kernel + userland, empty /usr/obj):

FreeBSD ds9.webonaut.com 5.0-CURRENT FreeBSD 5.0-CURRENT #0: Sun Mar 16 17:53:22 CET 2003 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/DS9 i386

Since my update from yesterday (above), fontconfig can't read its configuration file. It claims that ".2" isn't a valid double (line 228 of fonts.conf). metacity also seems unable to read its config correctly when numbers like 0.7 are used. There are also problems with script-fu from gimp-devel which cause a segfault:

... many similar lines like the one below deleted ...
gimp-1.3: Corrupt segment 1 in gradient file '/usr/X11R6/share/gimp/gradients/Yellow_Orange.ggr'.
(gimp-1.3:58219): Gimp-Core-WARNING **: (): no matching segment for position 0,067
gimp-1.3: fatal error: Segmentation fault

I'm currently rebuilding everything with an older date.

franz.
-- 
WEBONAUT.com http://webonaut.com mailto:[EMAIL PROTECTED]
bwrite panics solved.
> Got that crash again, with sync-on-panic disabled. The interesting thing is that the stack trace might be corrupted or inaccurate (maybe some tail recursion optimisation or inlining is going on around): although it seems to indicate that the panic is the one from "bwrite: need chained iodone" (which is absurd, as we saw, since bp->b_xflags == 0), the panic message is "buffer is not busy???" from bwrite, and indeed we can see that this is the case (see print of bp->b_lock).

Thomas, excellent! I found the bug with this information. I improperly unlocked the cluster buf instead of the target buf in a case where the target buf could not be clustered. It's fixed now though. Thanks for the debugging help!

Cheers,
Jeff
mdconfig/mdmfs problems - kernel panic
OS: FreeBSD 5.0-CURRENT, JPSNAP20030314

I'm running a Dual Xeon system with 1GB DDR RAM, and trying to create a RAM disk to compile under, specifically to compile the kernel. I've tried several methods, involving creating one 512MB disk with either mdconfig or mdmfs. No matter what options I specify, the mounted mfs works fine until I start filling it up more. For instance, I can usually copy the entire /usr/src/sys to /mnt and make depend, but a while after I run make the kernel panics as a result of the RAM disk (specifically citing malloc errors; one time it specifically spat out a number on the order of 251XX and indicated a malloc bucket limit exceeded or something like that).

I've also tried making several (3x192MB) RAM disks, mapping them as a single device with ccdconfig, then using /dev/ccd0 mounted with the appropriate newfs/mount commands. I get a different error (in fact, rather than a straight kernel panic, I get several errors directly attributed to units on ccd0).

I apologize for not having the kernel panic info, but I haven't been able to record it as of yet, and I don't have time to cause another panic right now. If that will help I'll be glad to try it again later this week and post the exact info.

Thanks.
-John
Re: mdconfig/mdmfs problems - kernel panic
In message [EMAIL PROTECTED], John Stockdale writes:
> OS: FreeBSD 5.0-CURRENT, JPSNAP20030314
>
> I'm running a Dual Xeon system with 1GB DDR RAM, and trying to create a RAM disk to compile under, specifically to compile the kernel. I've tried several methods, involving creating one 512MB disk with either mdconfig or mdmfs. No matter what options I specify, the mounted mfs works fine until I start filling it up more. For instance, I can usually copy the entire /usr/src/sys to /mnt and make depend, but a while after I run make the kernel panics as a result of the RAM disk (specifically citing malloc errors; one time it specifically spat out a number on the order of 251XX and indicated a malloc bucket limit exceeded or something like that).

Quote from md(4):

    malloc  Backing store is allocated using malloc(9). Only one malloc-bucket is used, which means that all md devices with malloc backing must share the malloc-per-bucket-quota. The exact size of this quota varies, in particular with the amount of RAM in the system. The exact value can be determined with vmstat(8).

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
[EMAIL PROTECTED]       | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.
RE: mdconfig/mdmfs problems - kernel panic
Ahh, that explains why the multiple /dev/md* devices didn't help. I'm looking into the vmstat options, but I can't figure out how to extract the malloc-per-bucket-quota limit for the system. (I've read man vmstat and tried vmstat -z and vmstat -m, but the only Limit listed is under vmstat -z, and nothing indicates whether any of the displayed limits are relevant to this discussion.)

Additionally, if I am hitting this limit, how can I increase it, and what kind of impact would increasing it have on the system, other than allowing me to use larger /dev/md* devices?

Thanks.
-John

-----Original Message-----
From: Poul-Henning Kamp [mailto:[EMAIL PROTECTED]]
Sent: Monday, March 17, 2003 11:27 AM
To: John Stockdale
Cc: [EMAIL PROTECTED]
Subject: Re: mdconfig/mdmfs problems - kernel panic

Quote from md(4):

    malloc  Backing store is allocated using malloc(9). Only one malloc-bucket is used, which means that all md devices with malloc backing must share the malloc-per-bucket-quota. The exact size of this quota varies, in particular with the amount of RAM in the system. The exact value can be determined with vmstat(8).
Anyone working on fsck?
Is there anyone working on fsck? Recent timings on a fast machine with 1TB filesystems show that it takes about 6 hours to fsck such a filesystem (on a fast array with a lot of RAM). This is with a version of fsck that already has some locally developed speedups and changes. I have not dared time the standard one yet.

Pass 1 takes 2 hours on its own, but the CPU is only about 5% busy. Looking at the access patterns on the drive suggests that there MAY be some chance to speed this up using such things as reading in the active cylinder group in advance, and performing the indirect block checks in another thread/process. Does anyone have any plans to work on this sort of thing?

On another issue: the memory requirements for fsck are about 700MB+ per TB of filesystem on FreeBSD 4.x. With the advent of UFS2 this increases, and the filesystems can also get bigger. This means that a 2+TB UFS2 filesystem will be IMPOSSIBLE to check on an i386 machine.

One solution is to implement an offline, non-in-place filesystem checker using all those techniques we all learned in CS101 relating to mag-tape merge sorts etc. Basically you'd have to have a small disk partition set aside to hold working files, to which the checker would write files of records detailing block numbers etc. Then you would do merge or block sorts on those files to get them in various orders (depending on fields in the records), and recombine them to find such things as multiply referenced blocks and other file inconsistencies.

It wouldn't be super fast but at least it COULD be used to check a 30TB array, where the in-memory version would need a process VM space of 24MB which is clearly impossible on a x86.

Julian
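As a rough illustration of the sort-and-recombine idea, the sketch below finds multiply referenced blocks from a working file. It is only a sketch under stated assumptions: the one-record-per-line "block inode" format and the name find_dup_blocks are invented here and are not an fsck format. sort(1) implementations typically spill to temporary files and merge them when the input outgrows memory, which is the tape-merge technique in miniature.

```shell
# find_dup_blocks RECORDS   (hypothetical helper)
# RECORDS holds one "block inode" pair per line, as an offline checker
# might write while walking the inodes. Prints each block number that
# is referenced more than once.
find_dup_blocks() {
    # Keep only the block number, sort numerically so equal blocks become
    # adjacent, then let uniq -d report one copy of each repeated value.
    awk '{ print $1 }' "$1" | sort -n | uniq -d
}
```

A second pass could join the reported block numbers back against the records file to list the inodes involved, keeping memory use small regardless of filesystem size.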
Re: Anyone working on fsck?
* Julian Elischer [EMAIL PROTECTED] [030317 12:22] wrote:
> Is there anyone working on fsck? Recent timings on a fast machine with 1TB filesystems show that it takes about 6 hours to fsck such a filesystem (on a fast array with a lot of RAM). This is with a version of fsck that already has some locally developed speedups and changes. I have not dared time the standard one yet.

Is this with or without the intentional delay introduced in order to avoid monopolizing the disk in background mode?

-Alfred
Re: Anyone working on fsck?
This is a full, 100% foreground fsck -y.

On Mon, 17 Mar 2003, Alfred Perlstein wrote:
> Is this with or without the intentional delay introduced in order to avoid monopolizing the disk in background mode?
>
> -Alfred
Re: Create linker.hints at boot
On Mon, Mar 17, 2003 at 09:11:12AM -0800, David O'Brien wrote:
> Please change the logic to positive logic:
>
> 	if [ -n "$kldxref_module_path" ]; then
> 		MODULE_PATHS=$kldxref_module_path
> 	else
> 		MODULE_PATHS=`sysctl -n kern.module_path`
> 	fi

Is there a technical reason for that or is it just a style issue?

-- 
Crist J. Clark | [EMAIL PROTECTED] | [EMAIL PROTECTED]
http://people.freebsd.org/~cjc/ | [EMAIL PROTECTED]
Re: Anyone working on fsck?
I might add that the test filesystem was 95% full with about 8,000,000 directories on it. It was populated with multiple copies of /bin and /etc as a test set :-)

On Mon, 17 Mar 2003, Julian Elischer wrote:
> this is a full 100% foreground fsck -y
Re: Anyone working on fsck?
On Mon, 17 Mar 2003 12:22:33 -0800 (PST) Julian Elischer [EMAIL PROTECTED] wrote:

Howdy,

> It wouldn't be super fast but at least it COULD be used to check a 30TB array, where the in-memory version would need a process VM space of 24MB which is clearly impossible on a x86.

I'm sure most of us have computers with more than 24MB :-P, sorry, couldn't resist. It's an interesting problem anyway. Does anyone have insight on how other OSs and filesystems handle this kind of problem? E.g. Solaris with vxfs, or Linux with XFS/JFS/this week's FS.

Cheers,
-- 
Miguel Mendez - [EMAIL PROTECTED]
GPG Public Key :: http://energyhq.homeip.net/files/pubkey.txt
EnergyHQ :: http://www.energyhq.tk
Of course it runs NetBSD!
Tired of Spam? - http://www.trustic.com
Re: Create linker.hints at boot
On Mon, Mar 17, 2003 at 12:43:19PM -0800, Crist J. Clark wrote:
> Is there a technical reason for that or is it just a style issue?

Style: easier to read out loud, easier to understand w/o having to think. In the same way, this is hard to read. It certainly doesn't do what one reads out loud ("if not string compare equal"):

	if (!strcmp(a, b)) {
		printf("same\n");
	}
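For reference, here is a minimal, standalone sketch of the positive-logic form discussed in this thread. The function name choose_module_paths and its two arguments are hypothetical stand-ins for $kldxref_module_path and the output of sysctl -n kern.module_path, so the snippet runs on any system; it is not the committed rc script.

```shell
# choose_module_paths OVERRIDE FALLBACK   (hypothetical helper)
# Print OVERRIDE when it is non-empty, otherwise FALLBACK.
choose_module_paths() {
    # The quotes matter: with an empty, unquoted value the test collapses
    # to `[ -n ]`, which succeeds because it checks the literal string "-n".
    if [ -n "$1" ]; then
        printf '%s\n' "$1"
    else
        printf '%s\n' "$2"
    fi
}
```

In an rc script the same choice can also be written with parameter expansion, since ${var:-word} substitutes word whenever var is unset or empty.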
Re: NFS file unlocking problem
On Sun, 16 Mar 2003, Steve Sizemore wrote:
> Sorry - I was trying to be too helpful. I actually did capture the raw dump but appended the decoded output. This time, I've attached a real raw dump.

The dump doesn't seem to be attached. However, I note that the request being sent is SETLKW, which blocks until the lock is granted. If the server thinks the file is already locked, it will hang *and* that is the proper behavior.

What is the result of running this locally on the NFS server and attempting to lock the underlying file? If rpc.lockd is hanging onto a lock, running that perl script locally on the actual file (not an NFS mounted image of it) should also hang.

As a side note, you probably want to create a C executable to do this kind of fcntl fiddling when attempting to test NFS. That way you can use a locally mounted binary and you won't wind up with all of the Perl access calls on the NFS wire. Or, at least, use a local copy of Perl.

-a
`make buildworld' failed
Hello, current! How are you?

I have: FreeBSD 5.0-CURRENT 24 Feb 2003 21:55:43 MSK. I have very recent sources of -CURRENT (updated 17 Mar 2003 about 20:00 MSK (GMT+3)). `make buildworld' failed (only the very tail of the output is here):

cc -pg -O -pipe -march=pentiumpro -DTERMIOS -DANSI_SOURCE -I/usr/src/secure/lib/libcrypto/../../../crypto/openssl -I/usr/src/secure/lib/libcrypto/../../../crypto/openssl/crypto -I/usr/obj/usr/src/secure/lib/libcrypto -DL_ENDIAN -c /usr/src/crypto/openssl/crypto/x509v3/v3err.c -o v3err.po
cc -fpic -DPIC -O -pipe -march=pentiumpro -DTERMIOS -DANSI_SOURCE -I/usr/src/secure/lib/libcrypto/../../../crypto/openssl -I/usr/src/secure/lib/libcrypto/../../../crypto/openssl/crypto -I/usr/obj/usr/src/secure/lib/libcrypto -DL_ENDIAN -c /usr/src/crypto/openssl/crypto/x509v3/v3err.c -o v3err.So
building profiled crypto library
building static crypto library
building shared library libcrypto.so.3
ranlib libcrypto.a
ranlib libcrypto_p.a
sh /usr/src/tools/install.sh -C -o root -g wheel -m 444 libcrypto.a /usr/obj/usr/src/i386/usr/lib
sh /usr/src/tools/install.sh -C -o root -g wheel -m 444
  /usr/src/secure/lib/libcrypto/../../../crypto/openssl/crypto/crypto.h
  /usr/src/secure/lib/libcrypto/../../../crypto/openssl/crypto/ebcdic.h
  /usr/src/secure/lib/libcrypto/../../../crypto/openssl/crypto/opensslv.h
  /usr/src/secure/lib/libcrypto/../../../crypto/openssl/crypto/ossl_typ.h
  /usr/src/secure/lib/libcrypto/../../../crypto/openssl/crypto/symhacks.h
  /usr/src/secure/lib/libcrypto/../../../crypto/openssl/crypto/tmdiff.h
  /usr/src/secure/lib/libcrypto/../../../crypto/openssl/crypto/../e_os.h
  /usr/src/secure/lib/libcrypto/../../../crypto/openssl/crypto/../e_os2.h
  /usr/src/secure/lib/libcrypto/../../../crypto/openssl/crypto/aes/aes.h
  /usr/src/secure/lib/libcrypto/../../../crypto/openssl/crypto/aes/aes_locl.h
  /usr/src/secure/lib/libcrypto/../../../crypto/openssl/crypto/asn1/asn1.h
  /usr/src/secure/lib/libcrypto/../../../crypto/openssl/crypto/asn1/asn1_mac.h
  /usr/src/secure/lib/libcrypto/../../../crypto/openssl/crypto/asn1/asn1t.h
  /usr/src/secure/lib/libcrypto/../../../crypto/openssl/crypto/bf/blowfish.h
  /usr/src/secure/lib/libcrypto/../../../crypto/openssl/crypto/bio/bio.h
  /usr/src/secure/lib/libcrypto/../../../crypto/openssl/crypto/bn/bn.h
  /usr/src/secure/lib/libcrypto/../../../crypto/openssl/crypto/buffer/buffer.h
  /usr/src/secure/lib/libcrypto/../../../crypto/openssl/crypto/cast/cast.h
  /usr/src/secure/lib/libcrypto/../../../crypto/openssl/crypto/comp/comp.h
  /usr/src/secure/lib/libcrypto/../../../crypto/openssl/crypto/conf/conf.h
  /usr/src/secure/lib/libcrypto/../../../crypto/openssl/crypto/conf/conf_api.h
  /usr/src/secure/lib/libcrypto/../../../crypto/openssl/crypto/des/des.h
  /usr/src/secure/lib/libcrypto/../../../crypto/openssl/crypto/des/des_old.h
  /usr/src/secure/lib/libcrypto/../../../crypto/openssl/crypto/dh/dh.h
  /usr/src/secure/lib/libcrypto/../../../crypto/openssl/crypto/dsa/dsa.h
  /usr/src/secure/lib/libcrypto/../../../crypto/openssl/crypto/dso/dso.h
  /usr/src/secure/lib/libcrypto/../../../crypto/openssl/crypto/ec/ec.h
  /usr/src/secure/lib/libcrypto/../../../crypto/openssl/crypto/engine/eng_int.h
  /usr/src/secure/lib/libcrypto/../../../crypto/openssl/crypto/engine/engine.h
  /usr/src/secure/lib/libcrypto/../../../crypto/openssl/crypto/engine/hw_4758_cca_err.h
  /usr/src/secure/lib/libcrypto/../../../crypto/openssl/crypto/engine/hw_aep_err.h
  /usr/src/secure/lib/libcrypto/../../../crypto/openssl/crypto/engine/hw_atalla_err.h
  /usr/src/secure/lib/libcrypto/../../../crypto/openssl/crypto/engine/hw_cswift_err.h
  /usr/src/secure/lib/libcrypto/../../../crypto/openssl/crypto/engine/hw_ncipher_err.h
  /usr/src/secure/lib/libcrypto/../../../crypto/openssl/crypto/engine/hw_nuron_err.h
  /usr/src/secure/lib/libcrypto/../../../crypto/openssl/crypto/engine/hw_sureware_err.h
  /usr/src/secure/lib/libcrypto/../../../crypto/openssl/crypto/engine/hw_ubsec_err.h
  /usr/src/secure/lib/libcrypto/../../../crypto/openssl/crypto/err/err.h
  /usr/src/secure/lib/libcrypto/../../../crypto/openssl/crypto/hmac/hmac.h
  /usr/src/secure/lib/libcrypto/../../../crypto/openssl/crypto/idea/idea.h
  /usr/src/secure/lib/libcrypto/../../../crypto/openssl/crypto/krb5/krb5_asn.h
  /usr/src/secure/lib/libcrypto/../../../crypto/openssl/crypto/lhash/lhash.h
  /usr/src/secure/lib/libcrypto/../../../crypto/openssl/crypto/md2/md2.h
  /usr/src/secure/lib/libcrypto/../../../crypto/openssl/crypto/md4/md4.h
  /usr/src/secure/lib/libcrypto/../../../crypto/openssl/crypto/md5/md5.h
  /usr/src/secure/lib/libcrypto/../../../crypto/openssl/crypto/mdc2/mdc2.h
  /usr/src/secure/lib/libcrypto/../../../crypto/openssl/crypto/objects/objects.h
  /usr/src/secure/lib/libcrypto/../../../crypto/openssl/crypto/objects/obj_mac.h
  /usr/src/secure/lib/libcrypto/../../../crypto/openssl/crypto/ocsp/ocsp.h
  /usr/src/secure/lib/libcrypto/../../../crypto/openssl/crypto/pem/pem.h
Re: Anyone working on fsck?
UFS is the real problem here, not fsck. Its tradeoffs for improving normal access latencies may have been right in the past but not for modern big disks. Seek time and RPM have not improved very much in the past 20 years, while disk capacity has increased by a factor of about 20,000 (and GB/$ even more). IMHO there is not much you can do at the fsck level -- you still have to visit all the cyl groups and what not. Even a factor of 10 improvement in fsck means 36 minutes, which is far too long.

Keeping track of 67M to 132M blocks and (assuming avg file size of 8k to 16k) something like 60M to 80M files takes quite a bit of time when you are also seeking all over the disk.

A few ideas:

When you have about 67M (2^26) files, ideally you want to *avoid* checking as many as you can. Given access times, you are only going to be able to do a few hundred disk accesses at most in a minute. So you are going to have only a few files/dirs that may be inconsistent in case of a crash. Why not keep track of that somehow?

If you need about 1GB of space to store the state of a TB file system that needs to be checked, maybe it _should_ be *stored* in a contiguous area on the FS itself. 1GB is about 0.1% of space.

Typically only a few cyl grps may be inconsistent in case of a crash. Maybe some info about which cyl groups need to be checked can be stored so that brute force checking of all grps can be avoided.

Typically a file will be stored in one or a small number of cyl groups. If that info is stored somewhere it can speed things up.

Extent-based allocation will reduce the number of indirect blocks. But maybe this is not such a big issue if most of your files fit in a few blocks.

Anyway, support for all of these has to be done in the filesystem first before fsck can benefit. If instead you spend time optimizing just fsck, you will likely make it far more complex (and potentially harder to get right).
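The round numbers above can be checked with a few lines of shell arithmetic. This is only a back-of-the-envelope sketch: it assumes 1TB means 2^30 KB, and the helper names are made up for illustration. The 6 hour figure is the fsck time reported earlier in the thread.

```shell
# blocks_in_fs FS_KB BLOCK_KB   (hypothetical helper)
# Number of blocks in a filesystem of FS_KB kilobytes.
blocks_in_fs() {
    echo $(( $1 / $2 ))
}

# minutes_after_speedup HOURS FACTOR   (hypothetical helper)
# Run time in minutes after speeding up an HOURS-long job by FACTOR.
minutes_after_speedup() {
    echo $(( $1 * 60 / $2 ))
}

blocks_in_fs 1073741824 16     # 1TB in 16k blocks: 67108864 (2^26)
blocks_in_fs 1073741824 8      # 1TB in 8k blocks: 134217728 (2^27)
minutes_after_speedup 6 10     # 6 hours sped up 10x: 36
```

The block counts land in the same range as the 67M to 132M quoted above, and the 36 minutes matches the factor-of-10 estimate.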
Re: Anyone working on fsck?
In message [EMAIL PROTECTED], Bakul Shah writes:
> UFS is the real problem here, not fsck. Its tradeoffs for improving normal access latencies may have been right in the past but not for modern big disks. Seek time and RPM have not improved very much in the past 20 years, while disk capacity has increased by a factor of about 20,000 (and GB/$ even more). IMHO there is not much you can do at the fsck level -- you still have to visit all the cyl groups and what not. Even a factor of 10 improvement in fsck means 36 minutes, which is far too long.

Now, before we go off and design YABFS, can we just get real for a second?

I have been tending UNIX computers of all sorts for many years and there is one bit of wisdom that has yet to fail me:

    Every now and then, boot in single-user and run full fsck on all filesystems.

If this had failed to be productive, I would have given up the habit years ago, but it is still a good idea it seems.

Personally, I think background-fsck is close to the ideal situation since I can skip the "boot in single-user" part of the above prophylactic.

If you start to implement any sort of journaling (that is what you talked about in your email), you might as well just stop right at the "clean bit", and avoid the complexity.

Optimizing fsck is a valid project, I just wish it would be somebody who would also finish the last 30% who would do it.

Poul-Henning

-- Poul-Henning Kamp
Re: Anyone working on fsck?
Thanks for your thoughts. Some good points.

On Mon, 17 Mar 2003, Bakul Shah wrote:

UFS is the real problem here, not fsck. Its tradeoffs for improving normal access latencies may have been right in the past but not for modern big disks. The seek time and RPM have not improved very much in the past 20 years while disk capacity has increased by a factor of about 20,000 (and GB/$ even more). IMHO there is not much you can do at the fsck level -- you still have to visit all the cyl groups and what not. Even a factor of 10 improvement in fsck means 36 minutes which is far too long. Keeping track of 67M to 132M blocks and (assuming avg file size of 8k to 16k) something like 60M to 80M files takes quite a bit of time when you are also seeking all over the disk.

Putting indirect blocks nearer the cylinder group metadata for the group that contains the inode would be nice :-)

A few ideas: When you have about 67M (2^26) files, ideally you want to *avoid* checking as many as you can. Given access times, you are only going to be able to do a few hundred disk accesses at most in a minute. So you are going to have only a few files/dirs that may be inconsistent in case of a crash. Why not keep track of that somehow?

In the case of HW failure on a raid you REALLY need to check everything. You don't trust anything.

If you need about 1GB of space to store the state of a TB file system that needs to be checked, maybe it _should_ be *stored* in a contiguous area on the FS itself. 1GB is about 0.1% of space.

It is no trouble for us to set aside a separate filesystem or partition for this.

Typically only a few cyl grps may be inconsistent in case of a crash. Maybe some info about which cyl groups need to be checked can be stored so that brute force checking of all grps can be avoided. Typically a file will be stored in one or a small number of cyl groups. If that info is stored somewhere it can speed things up. Extent based allocation will reduce the number of indirect blocks. But maybe this is not such a big issue if most of your files fit in a few blocks. Anyway, support for all of these has to be done in the filesystem first before fsck can benefit. If instead you spend time optimizing just fsck, you will likely make it far more complex (and potentially harder to get right).

To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-current in the body of the message
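The suggestion above of recording which cylinder groups might need checking can be pictured as a one-bit-per-group map. The following is only a sketch of that idea in C: UFS/FFS has no such structure, and every name in it (cg_dirty_map, cg_mark_dirty, and so on) is hypothetical. A real design would also have to get the map safely to disk before the metadata it describes, which this sketch does not attempt.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical in-memory map: one bit per cylinder group. */
struct cg_dirty_map {
	unsigned ncg;		/* number of cylinder groups tracked */
	unsigned char *bits;	/* bit i set => group i needs checking */
};

/* Allocate a map with every group marked clean. Returns 0 on success. */
int
cg_map_init(struct cg_dirty_map *m, unsigned ncg)
{
	assert(m != NULL && ncg > 0);
	m->ncg = ncg;
	m->bits = calloc((ncg + CHAR_BIT - 1) / CHAR_BIT, 1);
	return m->bits == NULL ? -1 : 0;
}

/* Release the storage held by the map. */
void
cg_map_free(struct cg_dirty_map *m)
{
	free(m->bits);
	m->bits = NULL;
}

/* Record that group `cg` has metadata updates in flight. */
void
cg_mark_dirty(struct cg_dirty_map *m, unsigned cg)
{
	assert(cg < m->ncg);
	m->bits[cg / CHAR_BIT] |= (unsigned char)(1u << (cg % CHAR_BIT));
}

/* Record that all updates to group `cg` have reached the disk. */
void
cg_mark_clean(struct cg_dirty_map *m, unsigned cg)
{
	assert(cg < m->ncg);
	m->bits[cg / CHAR_BIT] &= (unsigned char)~(1u << (cg % CHAR_BIT));
}

/* Return 1 if group `cg` would have to be checked after a crash. */
int
cg_is_dirty(const struct cg_dirty_map *m, unsigned cg)
{
	assert(cg < m->ncg);
	return (m->bits[cg / CHAR_BIT] >> (cg % CHAR_BIT)) & 1;
}

/* Count the groups a checker would have to visit. */
unsigned
cg_count_dirty(const struct cg_dirty_map *m)
{
	unsigned cg, n = 0;

	for (cg = 0; cg < m->ncg; cg++)
		n += (unsigned)cg_is_dirty(m, cg);
	return n;
}
```

With such a map a checker would visit only the groups for which cg_is_dirty() returns 1. As noted in the message above, after a hardware failure nothing can be trusted, so a full check would still be needed in that case.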
Re: Anyone working on fsck?
On Mon, 17 Mar 2003, Bakul Shah wrote: Anyway, support for all of these have to be done in the filesystem first before fsck can benefit. yep If instead you spend time optimizing just fsck, you will likely make it far more complex (and potentially harder to get right). You talk like I have a choice :-) I cannot change ufs/ffs and even if I could the clients wouldn't go for it. The problem space is Fsck of UFS/FFS partitions is too slow for 200GB+ filesystems. The solution space can not contain any answer that includes redefining UFS/FFS. Welcome to the real world. :-) To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-current in the body of the message
Re: Anyone working on fsck?
Now, before we go off and design YABFS, can we just get real for a second ?

I leave it to others to design YAFS, I just wanted to complain about this one :-) Every few years I seriously look at speeding up fsck but give up. I remember even asking about it a few years ago on one of these groups.

I have been tending UNIX computers of all sorts for many years and there is one bit of wisdom that has yet to fail me: Every now and then, boot in single-user and run full fsck on all filesystems. If this had failed to be productive, I would have given up the habit years ago, but it is still a good idea it seems.

Even now I use fsck in foreground since the background fsck was not stable enough the last time I used it. But I remember thinking fsck was taking too long for as long as I have used it (since 1981).

Personally, I think background-fsck is close to the ideal situation since I can skip the boot in single-user part of the above prophylactic.

Anything that runs for half an hour or more in fg is likely to take longer in bg. What happens if the system crashes again before it finishes? Will bg fsck handle that? Am I right in thinking that it can not save files in /lost+found? In general I am very uneasy with bg fsck -- when I am validating something I don't want to be creating new stuff.

If you start to implement any sort of journaling (that is what you talked about in your email), you might as well just stop right at the clean bit, and avoid the complexity.

No, I didn't suggest journaling, I suggested storing all state in a contiguous area (or a small number of such areas): indirect blocks, keeping track of free blocks, etc. You can still do a completely exhaustive fsck but it won't be exhausting to you. Journaling is also a good idea :-)

Optimizing fsck is a valid project, I just wish it would be somebody who would also finish the last 30% who would do it.
I am skeptical you will get more than a factor of 2 improvement without changing the FS (but hey, that is 3 hours for Julian so I am sure he will be happy with that!). To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-current in the body of the message
Re: NFS file unlocking problem
On Mon, Mar 17, 2003 at 01:21:19PM -0800, Andrew P. Lentvorski, Jr. wrote: On Sun, 16 Mar 2003, Steve Sizemore wrote: The dump doesn't seem to be attached. However, I note that the request It appears that there are problems sending the raw dump. I've tried twice - once 2 minutes after I sent the original message, and once again when I got this from you. Neither has shown up on the list. I can find another way to make it available if you need to see it. being sent is SETLKW which is a blocking wait until lock is granted. If the server thinks the file is already locked, it will hang *and* that is the proper behavior. What is the result of running this locally on the NFS server and attempting to lock the underlying file? If rpc.lockd is hanging onto a lock, running that perl script locally on the actual file (not an NFS mounted image of it) should also hang. It seems to work as expected (at least as I expect) on the server. If no other process has a lock, then the program locks the file, unlocks it, and exits immediately. If the remote client is trying to lock/unlock the file, then running the same program on the server also hangs. One other twist - recently, the behavior is less predictable. A couple of times in the last 24 hours, the lock/unlock on the client has actually worked as it should. The first time it happened, I was so surprised, that I thought I must have locked a local file rather than an NFS mounted file. On other occasions, the program has succeeded after very long hangs, .e.g % time plock xxx Locking xxx Unlocking xxx Done 0.21u 0.05s 55:35.33 0.0% This makes me wonder whether waiting indefinitely would succeed in all cases. (Note, however, that I've frequently waited more than an hour before killing the process or giving up.) As a side note, you probably want to create a C executable to do this kind of fcntl fiddling when attempting to test NFS. 
That way you can use a locally mounted binary and you won't wind up with all of the Perl access calls on the NFS wire. Or, at least, use a local copy of Perl. If I trusted my C skills as much as I trust my perl skills, I would do that. The perl stuff is all mounted locally, so there shouldn't be any perl nfs traffic on the wire. Let me know if you still need to see the dump. Steve -- Steve Sizemore [EMAIL PROTECTED], (510) 642-8570 Unix System Manager Dept. of Mathematics and College of Letters and Science University of California, Berkeley To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-current in the body of the message
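A C test program of the kind suggested above might look roughly like the following. This is a minimal sketch, not the Perl script from the thread: it assumes a blocking exclusive lock over the whole file (F_SETLKW with F_WRLCK), and the function names are made up for illustration. It prints the same three progress lines as the plock run quoted earlier.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Apply one whole-file lock operation, blocking until it is granted. */
static int
whole_file_lock(int fd, short type)
{
	struct flock fl = {
		.l_type = type,		/* F_WRLCK to lock, F_UNLCK to unlock */
		.l_whence = SEEK_SET,
		.l_start = 0,
		.l_len = 0,		/* 0 means "through end of file" */
	};

	return fcntl(fd, F_SETLKW, &fl);
}

/* Lock and then unlock `path`. Returns 0 on success, -1 on failure. */
int
plock(const char *path)
{
	int fd;

	assert(path != NULL);
	fd = open(path, O_RDWR);
	if (fd < 0) {
		perror("open");
		return -1;
	}
	printf("Locking %s\n", path);
	if (whole_file_lock(fd, F_WRLCK) != 0) {
		perror("fcntl(F_SETLKW, F_WRLCK)");
		close(fd);
		return -1;
	}
	printf("Unlocking %s\n", path);
	if (whole_file_lock(fd, F_UNLCK) != 0) {
		perror("fcntl(F_SETLKW, F_UNLCK)");
		close(fd);
		return -1;
	}
	printf("Done\n");
	close(fd);
	return 0;
}
```

A main() that calls plock(argv[1]) turns this into a standalone binary. Building it on a local filesystem and pointing it at a file on the NFS mount keeps everything except the lock requests off the wire.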
Re: .2 isn't a valid double and other problems
Thus spake Franz Klammer [EMAIL PROTECTED]:

cvsup and build (kernel + userland, empty /usr/obj):
FreeBSD ds9.webonaut.com 5.0-CURRENT FreeBSD 5.0-CURRENT #0: Sun Mar 16 17:53:22 CET 2003 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/DS9 i386

since my update from yesterday (above), fontconfig can't read its configuration file. it claims that .2 isn't a valid double (line 228 of fonts.conf). metacity also seems unable to read its config correctly if numbers like 0.7 are used. there are also problems with script-fu from gimp-devel which cause a segfault:

... many lines similar to the one below deleted ...
gimp-1.3: Corrupt segment 1 in gradient file '/usr/X11R6/share/gimp/gradients/Yellow_Orange.ggr'.
(gimp-1.3:58219): Gimp-Core-WARNING **: (): no matching segment for position 0,067
gimp-1.3: fatal error: Segmentation fault

You're using a locale in which the decimal point character is a comma. The breakage is my fault. I'm waiting to hear back from the vendor about a (trivial) patch, but I may just check the fix in sooner anyway. For the moment, please use the following:

Index: gdtoaimp.h
===
RCS file: /cvs/src/contrib/gdtoa/gdtoaimp.h,v
retrieving revision 1.2
diff -u -r1.2 contrib/gdtoa/gdtoaimp.h
--- gdtoaimp.h 12 Mar 2003 20:20:22 - 1.2
+++ gdtoaimp.h 15 Mar 2003 23:14:12 -
@@ -203,6 +203,7 @@
 #endif
 #define INFNAN_CHECK
+#define USE_LOCALE
 #undef IEEE_Arith
 #undef Avoid_Underflow
Index: contrib/gdtoa/strtodg.c
===
RCS file: /cvs/src/contrib/gdtoa/strtodg.c,v
retrieving revision 1.1.1.1
diff -u -r1.1.1.1 strtodg.c
--- contrib/gdtoa/strtodg.c 12 Mar 2003 20:18:18 - 1.1.1.1
+++ contrib/gdtoa/strtodg.c 16 Mar 2003 00:27:41 -
@@ -337,6 +337,9 @@
 	int j, k, nbits, nd, nd0, nf, nz, nz0, rd, rvbits, rve, rve1, sign;
 	int sudden_underflow;
 	CONST char *s, *s0, *s1;
+#ifdef USE_LOCALE
+	CONST char *s2;
+#endif
 	double adj, adj0, rv, tol;
 	Long L;
 	ULong y, z;

To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-current in the body of the message
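The locale dependence behind this report can be seen with a few lines of standard C. The sketch below is not taken from gdtoa or fontconfig, and the helper names are hypothetical. It only shows that strtod(3) is expected to accept the decimal point reported by localeconv(3), so a string such as 0,067 written under a comma locale is read back fully only when the parser honours that same locale.

```c
#include <assert.h>
#include <locale.h>
#include <stdlib.h>

/*
 * Parse `text` as a double under the current LC_NUMERIC locale.
 * Returns 1 when the whole string was consumed, 0 otherwise.
 */
int
parse_double(const char *text, double *out)
{
	char *end;

	assert(text != NULL && out != NULL);
	*out = strtod(text, &end);
	return end != text && *end == '\0';
}

/* The decimal point string strtod(3) should accept in this locale. */
const char *
decimal_point(void)
{
	return localeconv()->decimal_point;
}
```

In the default "C" locale decimal_point() is "." and parse_double(".2", &d) succeeds, while "0,067" is consumed only up to the comma. After a setlocale(LC_NUMERIC, ...) call that selects a comma locale, assuming one is installed, the two cases should swap.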
Re: Anyone working on fsck?
You talk like I have a choice :-) I cannot change ufs/ffs and even if I could the clients wouldn't go for it. What about changing the size of block size or cyl grp size? Do they change things much? The problem space is Fsck of UFS/FFS partitions is too slow for 200GB+ filesystems. The solution space can not contain any answer that includes redefining UFS/FFS. Welcome to the real world. :-) I am so glad I have a separate machine for every few GB of disk space :-) So may be you can have on a multi-processor solution. I'll try to come up with more useful suggestions given your constraints To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-current in the body of the message
Re: HEADS UP: Don't upgrade your Alphas!
Thus spake Ruslan Ermilov [EMAIL PROTECTED]: Hold off upgrading your Alphas for a moment. Something broke libc recently that results in (at least) floating point exceptions from awk(1) (this is not related to today's awk upgrade). I've been able to reproduce this on beast.freebsd.org by building the fresh libc.a and linking awk with it, and running a test case. I haven't been able to reproduce this with 8th March libc, so the time window for the breakage is low. I suspect the recent gtdoa commit to libc; we will know that is less than an hour. Whups. You're probably using a locale in which the decimal point is not a period. In that case, please use the patches I just posted to the following thread, which I just CC'd you. If it really is an Alpha issue and not a locale issue, awk is probably dying on one of the scripts used by the kernel build. If you could send me the command line that causes awk to die, that would be helpful. I'm running a kernel build on beast right now to see if I can reproduce a problem. I have a meeting in a few minutes, but I'll be back in five hours or so to follow up on this. To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-current in the body of the message
Re: Anyone working on fsck?
On Mon, Mar 17, 2003 at 12:45:15PM -0800, Julian Elischer wrote: I might add that the test filesystem was 95% full with about 8,000,000 directories on it. It was populated with multiple copies of /bin and /etc as a test set :-) How much like you're real file mix does this look? If your real mix doesn't require this many files it may not be so bad. I've got an 800GB SCSI-IDE RAID box[0] with a single UFS file system with about 520GB of mirrors on it and it fscks in ~40min. I am still intrested in improvements to fsck since I'm planning to buy several systems with two 1.4TB IDE RAID5 arrays in them soon. -- Brooks Promise UltraTrak RM8000 with 8 120GB disks in a RAID5. -- Any statement of the form X is the one, true Y is FALSE. PGP fingerprint 655D 519C 26A7 82E7 2529 9BF0 5D8E 8BE9 F238 1AD4 pgp0.pgp Description: PGP signature
Re: HEADS UP: Don't upgrade your Alphas!
On Mon, Mar 17, 2003 at 02:12:31PM -0800, David Schultz wrote: Thus spake Ruslan Ermilov [EMAIL PROTECTED]: Hold off upgrading your Alphas for a moment. Something broke libc recently that results in (at least) floating point exceptions from awk(1) (this is not related to today's awk upgrade). I've been able to reproduce this on beast.freebsd.org by building the fresh libc.a and linking awk with it, and running a test case. I haven't been able to reproduce this with 8th March libc, so the time window for the breakage is low. I suspect the recent gtdoa commit to libc; we will know that is less than an hour. Whups. You're probably using a locale in which the decimal point is not a period. In that case, please use the patches I just posted to the following thread, which I just CC'd you. Ruslan has been using my Alphas to test this and there is no locale set on those. If it really is an Alpha issue and not a locale issue, awk is probably dying on one of the scripts used by the kernel build. If you could send me the command line that causes awk to die, that would be helpful. I'm running a kernel build on beast right now to see if I can reproduce a problem. I have a meeting in a few minutes, but I'll be back in five hours or so to follow up on this. -- | / o / /_ _ [EMAIL PROTECTED] |/|/ / / /( (_) Bulte To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-current in the body of the message
Re: Anyone working on fsck?
On Mon, 17 Mar 2003, Bakul Shah wrote: Now, before we go off and design YABFS, can we just get real for a second ? I am skeptical you will get more than a factor of 2 improvement without changing the FS (but hey, that is 3 hours for Julian so I am sure he will be happy with that!). I doubt I can get even 30% but I did just cut 12% off pass1 by adding a pre-read of the cylinder group (2 hours -> 1 hour 45 minutes). It's a start. To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-current in the body of the message
Re: Create linker.hints at boot
On Mon, Mar 17, 2003 at 01:07:53PM -0800, David O'Brien wrote:
On Mon, Mar 17, 2003 at 12:43:19PM -0800, Crist J. Clark wrote:
On Mon, Mar 17, 2003 at 09:11:12AM -0800, David O'Brien wrote:
On Mon, Mar 17, 2003 at 12:28:34AM -0800, Crist J. Clark wrote:

+kldxref_start () {
+	if [ -z "$kldxref_module_path" ]; then
+		MODULE_PATHS=`sysctl -n kern.module_path`
+	else
+		MODULE_PATHS=$kldxref_module_path
+	fi

Please change the logic to positive logic:

	if [ -n "$kldxref_module_path" ]; then
		MODULE_PATHS=$kldxref_module_path
	else
		MODULE_PATHS=`sysctl -n kern.module_path`
	fi

Is there a technical reason for that or is it just a style issue?

Style, easier to read out loud, easier to understand w/o having to think. Just like this is hard to read. It certainly doesn't do what one reads out loud: "if not string compare equal".

	if (!strcmp(a, b)) { printf("same\n"); }

I don't see what that code snippet has to do with the script (but I am in the camp that would go ahead and waste the four bytes of source code to write that as (strcmp(a, b) == 0)).

I _did_ write the original script the way I was thinking/would say it: "If $kldxref_module_path is empty, use the sysctl(8), otherwise, use its contents." I guess I was thinking of '-z' as a positive, "is empty", rather than a negative, "is not filled". But whatever. I've changed it to the positive in my repository. I'll commit the latest version later and everyone can make their own fixes/additions/changes/PRs.

--
Crist J. Clark | [EMAIL PROTECTED] | [EMAIL PROTECTED]
http://people.freebsd.org/~cjc/ | [EMAIL PROTECTED]

To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-current in the body of the message
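The readability point about strcmp() above comes down to its return convention: zero means equal, so a leading "!" reads as "not" while actually testing for equality. A small sketch of the explicit form preferred in the reply follows; the wrapper name is made up for illustration.

```c
#include <assert.h>
#include <string.h>

/* Returns 1 when the two strings are equal, 0 otherwise. */
int
same_string(const char *a, const char *b)
{
	assert(a != NULL && b != NULL);
	/* strcmp() returns 0 on equality, so "== 0" reads as "is equal". */
	return strcmp(a, b) == 0;
}
```

Both spellings compile to the same test; the difference is only in how the condition reads aloud.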
Re: Anyone working on fsck?
On Mon, 17 Mar 2003, Brooks Davis wrote: On Mon, Mar 17, 2003 at 12:45:15PM -0800, Julian Elischer wrote: I might add that the test filesystem was 95% full with about 8,000,000 directories on it. It was populated with multiple copies of /bin and /etc as a test set :-) How much like you're real file mix does this look? If your real mix doesn't require this many files it may not be so bad. I've got an 800GB SCSI-IDE RAID box[0] with a single UFS file system with about 520GB of mirrors on it and it fscks in ~40min. I am still intrested in improvements to fsck since I'm planning to buy several systems with two 1.4TB IDE RAID5 arrays in them soon. For these types of systems doing a block caching layer with a prefetch that understands how many spindles there are would be a huge benefit. To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-current in the body of the message
crashdump when compile java/jdk14
Hello,

$ gdb -k /sys/i386/compile/HP6100/kernel.debug /usr/crash/vmcore.0
GNU gdb 5.2.1 (FreeBSD)
Copyright 2002 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as i386-undermydesk-freebsd...
panic: bremfree: removing a buffer not on a queue
panic messages:
---
panic: softdep_disk_io_initiation: read
syncing disks, buffers remaining... panic: bremfree: removing a buffer not on a queue
Uptime: 3h24m39s
Dumping 255 MB
ata0: resetting devices ..
ata0: pre reset mask=03 ostat0=50 ostat2=00
ata0-slave: ATAPI 00 00
ad0: ATAPI 00 00
ata0: after reset mask=03 stat0=50 stat1=00
ad0: ATA 01 a5
ata0: devices=01
ad0: success setting BIOSPIO on Intel ICH3 chip
done
 16 32 48 64 80 96 112 128 144 160 176 192 208 224 240
---
#0 doadump () at ../../../kern/kern_shutdown.c:239
239 dumping++;
(kgdb) bt
#0 doadump () at ../../../kern/kern_shutdown.c:239
#1 0xc01d897d in boot (howto=260) at ../../../kern/kern_shutdown.c:371
#2 0xc01d8c83 in panic () at ../../../kern/kern_shutdown.c:542
#3 0xc021cec0 in bremfreel (bp=0xc77afc60) at ../../../kern/vfs_bio.c:637
#4 0xc021cdd2 in bremfree (bp=0x0) at ../../../kern/vfs_bio.c:619
#5 0xc021ee4c in vfs_bio_awrite (bp=0x0) at ../../../kern/vfs_bio.c:1688
#6 0xc02933c2 in ffs_fsync (ap=0xddd019d0) at ../../../ufs/ffs/ffs_vnops.c:255
#7 0xc02924d7 in ffs_sync (mp=0xc26b, waitfor=2, cred=0xc0eb2f00, td=0xc0354800) at vnode_if.h:612
#8 0xc02340fb in sync (td=0xc0354800, uap=0x0) at ../../../kern/vfs_syscalls.c:138
#9 0xc01d849f in boot (howto=256) at ../../../kern/kern_shutdown.c:280
#10 0xc01d8c83 in panic () at ../../../kern/kern_shutdown.c:542
#11 0xc028b71c in softdep_disk_io_initiation (bp=0xc77afc60) at ../../../ufs/ffs/ffs_softdep.c:3466
#12 0xc0225b7f in cluster_wbuild (vp=0xc3129124, size=16384, start_lbn=5455, len=2) at buf.h:422
#13 0xc021ee42 in vfs_bio_awrite (bp=0xc77e9ac8) at ../../../kern/vfs_bio.c:1682
#14 0xc02933c2 in ffs_fsync (ap=0xddd01c18) at ../../../ufs/ffs/ffs_vnops.c:255
#15 0xc02924d7 in ffs_sync (mp=0xc26b, waitfor=3, cred=0xc0eb2f00, td=0xc2695000) at vnode_if.h:612
#16 0xc023389a in sync_fsync (ap=0xddd01cc4) at ../../../kern/vfs_subr.c:3493
---Type <return> to continue, or q <return> to quit---
#17 0xc022f9ba in sched_sync () at vnode_if.h:612
#18 0xc01c3523 in fork_exit (callout=0xc022f7e0 <sched_sync>, arg=0x0, frame=0x0) at ../../../kern/kern_fork.c:875
(kgdb)

Any idea?

--
Rgdz,                   /\  ASCII RIBBON CAMPAIGN
Sergey Osokin aka oZZ,  \ / AGAINST HTML MAIL
http://ozz.pp.ru/        X  AND NEWS
                        / \

To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-current in the body of the message
Re: Anyone working on fsck?
At 10:39 PM +0100 2003/03/17, Poul-Henning Kamp wrote: Optimizing fsck is a valid project, I just wish it would be somebody who would also finish the last 30% who would do it. Just what are you saying? Is Julian Elischer not the right person to be working on this, because he has a history of not finishing the last 30% of something? -- Brad Knowles, [EMAIL PROTECTED] They that can give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety. -Benjamin Franklin, Historical Review of Pennsylvania. GCS/IT d+(-) s:+(++): a C++(+++)$ UMBSHI$ P+++ L+ !E-(---) W+++(--) N+ !w--- O- M++ V PS++(+++) PE- Y+(++) PGP+++ t+(+++) 5++(+++) X++(+++) R+(+++) tv+(+++) b+() DI+() D+(++) G+() e++ h--- r---(+++)* z(+++) To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-current in the body of the message
Re: Anyone working on fsck?
On Monday, 17 March 2003 at 22:39:02 +0100, Poul-Henning Kamp wrote: In message [EMAIL PROTECTED], Bakul Shah writes: UFS is the real problem here, not fsck. Its tradeoffs for improving normal access latencies may have been right in the past but not for modern big disks. The seek time RPM have not improved very much in the past 20 years while disk capacity has increased by a factor of about 20,000 (and GB/$ even more). IMHO there is not much you can do at the fsck level -- you stil have to visit all the cyl groups and what not. Even a factor of 10 improvement in fsck means 36 minutes which is far too long. Now, before we go off and design YABFS, can we just get real for a second ? I have been tending UNIX computers of all sorts for many years and there is one bit of wisdom that has yet to fail me: Every now and then, boot in single-user and run full fsck on all filesystems. If this had failed to be productive, I would have given up the habit years ago, but it is still a good idea it seems. Personally, I think background-fsck is close to the ideal situation since I can skip the boot in single-user part of the above profylactic. If you start to implement any sort of journaling (that is what you talked about in your email), you might as well just stop right at the clean bit, and avoid the complexity. Optimizing fsck is a valid project, I just wish it would be somebody who would also finish the last 30% who would do it. Poul-Henning, how can you justify the second half of that sentence? I take exception to the implications. In case anybody is in any doubt, I've heard you say this sort of thing about julian before. Please don't do it again. This is without my core hat. As most people here know, core has warned you about this kind of behaviour multiple times before. What I say here in no way prejudices what core may decide to do about the incident. Greg -- See complete headers for address and phone numbers pgp0.pgp Description: PGP signature
Re: Hang on Boot (still)
:D I booted the two floppies again. I tried what you said: hw.eisa_slots=0. It got farther this time. I got a full dmesg. However, it was followed by this:

Fatal trap 12: page fault while in kernel mode
fault virtual address = 0x7
fault code = supervisor read, page not present
instruction pointer = 0x8:0xc01f0cc9
stack pointer = 0x10:0xcf2a3788
frame pointer = 0x10:0xcf2a3794
code segment = base 0x0, limit 0xf, type 0x1b
             = DPL 0, pres 1, def32 1, gran 1
processor eflags = interrupt enabled, resume, IOPL=0
current process = 1 (sysinstall)
trap number = 12
panic: page fault

Doing a hw.eisa_slots=1 resulted in the same problem. So I decided to try a 5.0-RELEASE CD. I tried hw.eisa_slots=0, which resulted in the same crappy error I always got with 5.0-R. Then I remembered that ACPI wasn't disabled in 5.0-R by default, so I used the hint and disabled it. I got a lot farther, but it eventually hung with a message like "cbb0: Card not supported", then it would reboot itself after a few seconds. So I decided to try a few other hints. It ended up that this combination of variables on my 5.0-RELEASE would get me booted and into sysinstall:

hw.eisa_slots=0
hint.acpi.0.disabled=1
hint.pcic.0.disabled=1

So, disabling pcic0 did the trick. Then, I decided to go back to my snapshot and try this. However, I got the same page fault as described above. So what happened between 5.0-R and -C? I'll get my 5.0-R installed, and then I'll try to make world.

Lucas Reddinger

On Sun, 16 Mar 2003, Lucas Reddinger wrote:

The one alternative would be to compile a stripped kernel on another machine, and install off of it. I did this, but I do not have enough knowledge of the 5.x kernel/modules to be able to do this myself. If someone could give me some help with this instead, it would be greatly appreciated.

Try setting this from the loader: hw.eisa_slots=0 (or 1).

--
| Matthew N. Dodd | '78 Datsun 280Z | '75 Volvo 164E | FreeBSD/NetBSD |
| [EMAIL PROTECTED] | 2 x '84 Volvo 245DL | ix86,sparc,pmax |
| http://www.jurai.net/~winter | For Great Justice! | ISO8802.5 4ever |

--
Lucas Reddinger
Customer Service
Winged Leopard Web Designs

To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-current in the body of the message
Re: Anyone working on fsck?
On Mon, 17 Mar 2003, Brooks Davis wrote: On Mon, Mar 17, 2003 at 12:45:15PM -0800, Julian Elischer wrote: I might add that the test filesystem was 95% full with about 8,000,000 directories on it. It was populated with multiple copies of /bin and /etc as a test set :-) How much like you're real file mix does this look? If your real mix doesn't require this many files it may not be so bad. I've got an 800GB SCSI-IDE RAID box[0] with a single UFS file system with about 520GB of mirrors on it and it fscks in ~40min. This is typical of the real file mix, actually it's probably a bit pesimistic but only by a small margin.. (I needed 'worst case' figures) I am still intrested in improvements to fsck since I'm planning to buy several systems with two 1.4TB IDE RAID5 arrays in them soon. -- Brooks Promise UltraTrak RM8000 with 8 120GB disks in a RAID5. -- Any statement of the form X is the one, true Y is FALSE. PGP fingerprint 655D 519C 26A7 82E7 2529 9BF0 5D8E 8BE9 F238 1AD4 To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-current in the body of the message
Re: .2 isn't a valid double and other problems
Am Mo, 2003-03-17 um 23.08 schrieb David Schultz: Thus spake Franz Klammer [EMAIL PROTECTED]: cvsup and build (kernel + userland, empty /usr/obj): FreeBSD ds9.webonaut.com 5.0-CURRENT FreeBSD 5.0-CURRENT #0: Sun Mar 16 17:53:22 CET 2003 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/DS9 i386 since my update from yesterday (above), fontconfig can't read his configuration file. it claims the .2 isn't a valid double. (line 228 of fonts.conf) metacity seems can't also not correct read his config if numbers like 0.7 used. there are also problems with script-fu from gimp-devel wich cause a segfault: ... many similar line like the blow deletet ... gimp-1.3: Corrupt segment 1 in gradient file '/usr/X11R6/share/gimp/gradients/Yellow_Orange.ggr'. (gimp-1.3:58219): Gimp-Core-WARNING **: (): no matching segment for position 0,067 gimp-1.3: fatal error: Segmentation fault You're using a locale in which the decimal point character is a comma. The breakage is my fault. I'm waiting to hear back from the vendor about a (trivial) patch, but I may just check the fix in sooner anyway. For the moment, please use the following: yes! everything is ok now. thanks! franz. -- WEBONAUT.com http://webonaut.com mailto:[EMAIL PROTECTED] To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-current in the body of the message
pptp mpd under 5.0
I'm trying to run pptp under 5.0-CURRENT (first time with mpd, so probably some config issue). I get these errors:

mpd: pid 1102, version 3.13 ([EMAIL PROTECTED] 09:35 17-Mar-2003)
[pptp0] can't create socket node: No such file or directory
mpd: local IP address for PPTP is 10.23.0.3
[pptp0] using interface
[pptp1] can't create socket node: No such file or directory
[pptp1] using interface

I tried running mpd under truss to see what it's trying to access. I'm guessing it's missing a tun0 or ng0 or some such from /dev. What's the replacement for MAKEDEV to make them?

... and I can't seem to get truss to run at all on any program:

bash-2.05b# truss mpd
truss: cannot open /proc/curproc/mem: No such file or directory
truss: cannot open /proc/1107/mem: No such file or directory
bash-2.05b#

To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-current in the body of the message
nfs panic: lockmgr: locking against myself
I'm seeing the following panic under heavy NFS client usage on an SMP w/kernel sources from Weds. evening. Has this been fixed?

Thanks,
Drew

panic: lockmgr: locking against myself
cpuid = 0; lapic.id =
Debugger(panic)
Stopped at Debugger+0x55: xchgl %ebx,in_Debugger.0
db> t
Debugger(c0355e8c,0,c0354b5c,d8e2793c,1) at Debugger+0x55
panic(c0354b5c,0,c0354b22,ef,c41d6400) at panic+0x11f
lockmgr(ce6abce4,2090022,c470136c,c41d6400,26b) at lockmgr+0x491
BUF_TIMELOCK(ce6abc18,10022,c470136c,c035d2bf,0) at BUF_TIMELOCK+0x80
nfs_flush(c470136c,c443e480,1,c41d6400,1) at nfs_flush+0x607
nfs_fsync(d8e27a9c,0,c035a68f,45a,c1df8858) at nfs_fsync+0x31
vinvalbuf(c470136c,1,c443e480,c41d6400,0) at vinvalbuf+0xe4
nfs_vinvalbuf(c470136c,1,c443e480,c41d6400,1) at nfs_vinvalbuf+0x191
nfs_setattr(d8e27b60,20002,c41d6400,0,0) at nfs_setattr+0x1f5
setutimes(c41d6400,c470136c,d8e27ca8,2,0) at setutimes+0x1c4
kern_utimes(c41d6400,bfbff8e8,0,bfbfece0,0) at kern_utimes+0x9c
utimes(c41d6400,d8e27d10,c03667a3,404,2) at utimes+0x31
syscall(2f,2f,2f,bfbff8e8,4) at syscall+0x24e
Xint0x80_syscall() at Xint0x80_syscall+0x1d
--- syscall (138, FreeBSD ELF32, utimes), eip = 0x8049e8f, esp = 0xbfbfecbc, ebp = 0xbfbfed08 ---
db>

To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-current in the body of the message
md(4) crashes in current
After configuring an md in current (as of last Friday), I will later panic. Unfortunately (or not), the kernel on this system was built without symbols, witness, invariants, or debugger. So I built a new kernel with all of those things and...no more panics. Makes debugging a bit difficult. I then built a new kernel without debugging and tried again. The panics resumed. The ONLY differences in the kernels were symbols, witness, invariants, and debugger. I can get a dump and I'll try building a kernel with symbols but no WITNESS or INVARIANTS and see if that crashes. In the meantime, has anyone else seen this?

% mdconfig -a -t vnode -f /D/file.iso -u 1
% Do some stuff which may or may not involve the md device.

I normally mount the device and later dismount it, but I have had panics without doing anything else. Panics often come very soon after I de-configure the device (mdconfig -d -u 1). I'll submit a PR if I can get enough information to put together something that looks useful.

R. Kevin Oberman, Network Engineer
Energy Sciences Network (ESnet)
Ernest O. Lawrence Berkeley National Laboratory (Berkeley Lab)
E-mail: [EMAIL PROTECTED] Phone: +1 510 486-8634

To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-current in the body of the message
Re: nfs panic: lockmgr: locking against myself
Andrew Gallatin wrote: I'm seeing the following panic under heavy NFS client usage on an SMP w/kernel sources from Weds. evening. Has this been fixed? If I'm not mistaken, this is the problem Jeff fixed in revision 1.134 of vfs_cluster.c. Cheers, Maxime To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-current in the body of the message
Re: nfs panic: lockmgr: locking against myself
On Mon, 17 Mar 2003, Andrew Gallatin wrote:

I'm seeing the following panic under heavy NFS client usage on an SMP w/kernel sources from Weds. evening. Has this been fixed?

Thanks,
Drew

I believe that is fixed in nfs_vnops.c 1.200.

panic: lockmgr: locking against myself
cpuid = 0; lapic.id =
Debugger(panic)
Stopped at Debugger+0x55: xchgl %ebx,in_Debugger.0
db> t
Debugger(c0355e8c,0,c0354b5c,d8e2793c,1) at Debugger+0x55
panic(c0354b5c,0,c0354b22,ef,c41d6400) at panic+0x11f
lockmgr(ce6abce4,2090022,c470136c,c41d6400,26b) at lockmgr+0x491
BUF_TIMELOCK(ce6abc18,10022,c470136c,c035d2bf,0) at BUF_TIMELOCK+0x80
nfs_flush(c470136c,c443e480,1,c41d6400,1) at nfs_flush+0x607
nfs_fsync(d8e27a9c,0,c035a68f,45a,c1df8858) at nfs_fsync+0x31
vinvalbuf(c470136c,1,c443e480,c41d6400,0) at vinvalbuf+0xe4
nfs_vinvalbuf(c470136c,1,c443e480,c41d6400,1) at nfs_vinvalbuf+0x191
nfs_setattr(d8e27b60,20002,c41d6400,0,0) at nfs_setattr+0x1f5
setutimes(c41d6400,c470136c,d8e27ca8,2,0) at setutimes+0x1c4
kern_utimes(c41d6400,bfbff8e8,0,bfbfece0,0) at kern_utimes+0x9c
utimes(c41d6400,d8e27d10,c03667a3,404,2) at utimes+0x31
syscall(2f,2f,2f,bfbff8e8,4) at syscall+0x24e
Xint0x80_syscall() at Xint0x80_syscall+0x1d
--- syscall (138, FreeBSD ELF32, utimes), eip = 0x8049e8f, esp = 0xbfbfecbc, ebp = 0xbfbfed08 ---
db>

To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-current in the body of the message
Re: nfs panic: lockmgr: locking against myself
Maxime Henrion writes:
> Andrew Gallatin wrote:
> > I'm seeing the following panic under heavy NFS client usage on an SMP w/kernel sources from Weds. evening. Has this been fixed?
>
> If I'm not mistaken, this is the problem Jeff fixed in revision 1.134 of vfs_cluster.c.

Great! Thanks!

Drew
Re: NFS file unlocking problem
Andrew P. Lentvorski, Jr. wrote:
> The dump doesn't seem to be attached. However, I note that the request being sent is SETLKW, which is a blocking wait until the lock is granted. If the server thinks the file is already locked, it will hang *and* that is the proper behavior.

It is, to ensure FIFO ordering of request grants. You could also implement this as a retry. If you do it the first way, you end up potentially deadlocking the server when a single client has badly behaved code that locks against itself. If you do it the second way, you end up with timing dependent starvation deadlocks for individual client processes.

Note that the first deadlock is normal -- it would happen if the file were local, as well... no help for badly written code -- but I mention it as important because we are talking about blocking multiple clients.

I don't know what the process is, but a threaded process can cause a deadlock when it should be a grant/upgrade/downgrade of an existing lock overlap. This is because there is no such thing as a thread ID in the NFS protocol, and if process IDs are different for different threads, and the requests come from the same system ID, then you can get a deadlock when none should be present.

To avoid this, either manage all locks in an apartment or rental model (queue all requests to a single thread, and have it do the locking by proxy) OR make sure that all requests from any thread in a given process are in fact given the same proxy process ID on the wire.

[ ... This last is not likely your problem, but I mention it, in case you are using rfork() or Linux threads ... ]

> What is the result of running this locally on the NFS server and attempting to lock the underlying file? If rpc.lockd is hanging onto a lock, running that perl script locally on the actual file (not an NFS mounted image of it) should also hang.

That was my next question, as well: does it happen on a local FS as well as an NFS FS?
Personally, I would *NOT* recommend running it on the server, but mount a local FS on the client instead; the fewer variables, the better. On the other hand, this is clearly a deadlock that requires an existing, conflicting lock -- IFF you are correct about the delayed locking behaviour.

> As a side note, you probably want to create a C executable to do this kind of fcntl fiddling when attempting to test NFS. That way you can use a locally mounted binary and you won't wind up with all of the Perl access calls on the NFS wire.

Or, at least, use a local copy of Perl.

I recommend a pared down test case. I suspect that the problem is that something that is expected to have the same ID is locking against itself. Does the failure occur with the same values in all cases in the F_RSETLKW? If so, I suggest you capture *all* locking packets on your wire, and then find who is conflicting. This may be a simple lock order reversal (deadly embrace deadlock) due to poor application performance. You may also find that you have multiple process IDs, when it should be a single process ID, for the proxy PID for the conflicting request. At worst, it would be nice to know the system that caused it.

Actually, for a lock you know is there, you *can* diagnose the problem (somewhat) by writing a program on the server, and using F_GETLK on the range for the hanging lock on the server -- this will return a struct flock, which will give you range and PID information. Do it on the Solaris box, though.

The reason you want to do this on the Solaris box is that the struct flock on FreeBSD fails to include the l_rsysid -- the remote system ID.
Actually, given this, I don't understand how FreeBSD server side proxy locking can actually work at all; it would incorrectly coalesce locks with local locks when the l_pid matched, which would be *all* locks in the lockd, and then incorrectly release them when a local process exited, or any process on any remote system unlocked an overlapping range (possibly in error).

You are using FreeBSD as the NFS client in this case, right? If so, that's probably not an issue for you...

-- Terry
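The PID-keyed behaviour described in this message can be seen with plain POSIX record locks on a local file. What follows is a minimal sketch in Python, not anything from rpc.lockd: the helper names (try_lock, probe_from_other_pid, demo) are made up for illustration, and it assumes a Unix-like system with fork() and a local filesystem that supports fcntl locks. A second request from the same PID over an overlapping range is simply granted, while the same request from a different PID is refused.

```python
import fcntl
import os
import tempfile


def try_lock(fd, length=10, start=0):
    """Try a non-blocking exclusive record lock; return True if granted."""
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB, length, start, os.SEEK_SET)
        return True
    except OSError:
        return False


def probe_from_other_pid(path):
    """Fork a child (a different PID) and report whether it got the lock."""
    r, w = os.pipe()
    pid = os.fork()
    if pid == 0:  # child: open the file afresh and try the same range
        os.close(r)
        fd = os.open(path, os.O_RDWR)
        os.write(w, b"1" if try_lock(fd) else b"0")
        os._exit(0)
    os.close(w)
    answer = os.read(r, 1)
    os.close(r)
    os.waitpid(pid, 0)
    return answer == b"1"


def demo():
    with tempfile.NamedTemporaryFile() as f:
        fd = f.fileno()
        first = try_lock(fd)                  # initial grant
        again = try_lock(fd, length=20)       # same PID, overlapping range
        other = probe_from_other_pid(f.name)  # different PID, same range
        return first, again, other


if __name__ == "__main__":
    print(demo())
```

If threads of one client process show up on the wire with different proxy PIDs, the server sees the third case where the second was intended, which is the deadlock scenario described above.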
Re: pptp mpd under 5.0
From: Aaron Wohl [EMAIL PROTECTED]
> replacement for MAKEDEV to make them?
>
> ... and I can't seem to get truss to run at all on any program...
>
> bash-2.05b# truss mpd
> truss: cannot open /proc/curproc/mem: No such file or directory
> truss: cannot open /proc/1107/mem: No such file or directory
> bash-2.05b#

procfs isn't mounted by default on 5-CURRENT; you will need to mount it before running truss.

Scot
Re: Anyone working on fsck?
Bakul Shah wrote:
> UFS is the real problem here, not fsck. Its tradeoffs for improving normal access latencies may have been right in the past but not for modern big disks. The seek time and RPM have not improved very much in the past 20 years while disk capacity has increased by a factor of about 20,000 (and GB/$ even more). IMHO there is not much you can do at the fsck level -- you still have to visit all the cyl groups and what not. Even a factor of 10 improvement in fsck means 36 minutes which is far too long. Keeping track of 67M to 132M blocks and (assuming avg file size of 8k to 16k) something like 60M to 80M files takes quite a bit of time when you are also seeking all over the disk.

Sorry, but the track-to-track seek latency optimizations you are referring to are turned off, given the newfs defaults, and have been for a very long time. Please see revision 1.7 of /usr/src/sbin/newfs/newfs.c:

revision 1.7
date: 1995/02/05 08:42:31; author: phk; state: Exp; lines: +13 -2

Basically, the problem is that for a BG fsck, it's not possible to lock access on a cylinder group basis, and then there is a hell of a lot of data on that drive.

What Julian needs is a Journalling or Log-structured FS. Even that will not save him in some failure cases, unless the hardware has a CMOS data area that can be written with the failure cause, even if it's a double panic (must distinguish power failure from kernel panic, as kernel panics result from corrupt kernel data, which means you must check all files).

> When you have about 67M (2^26) files, ideally you want to *avoid* checking as many as you can. Given access times, you are only going to be able to do a few hundred disk accesses at most in a minute. So you are going to have only a few files/dirs that may be inconsistent in case of a crash. Why not keep track of that somehow?

The BG fsck code specifically *only* checks the CG bitmap allocations.
Basically, this means that anything that might be using the CG has to be checked to see if it references blocks in the bitmaps.

The easiest way to do this is to provide some sort of domain control, so that you limit the files you have to check to a smaller set of files per CG, so that you don't have to check all the files to check a particular CG -- and then preload the CG's for the sets of files you *do* have to check.

Another way of saying this is... don't put all your space in a single FS. 8-).

The problem with having to (effectively) read every inode and direct block on the disk is really insurmountable, I think.

There are some FS design changes that could be made to fix the issue, by partitioning CG's by locality, and then providing locality forward references as a separate bitmap, and forcing files to stick to their locality until they can't (basically, the AIX GFS PP and LP volume management trick), but that would require a layout change and an additional dependency insertion.

Too bad the soft updates implementation isn't a generic graph dependency mechanism, with node/node dependency resolvers that get registered for edges, or you could do this easily (as well as implying edges between stacking layers, and exporting a transaction interface to user space).

> Typically only a few cyl grps may be inconsistent in case of a crash. May be some info about which cyl groups need to be checked can be stored so that brute force checking of all grps can be avoided.

This would work well... under laboratory test conditions. In the field, if the machine has a more general workload (or even a heterogeneous workload, but with a hell of a lot of files, like a big mail server), this falls down, as the number of bits marking unupdated cylinder groups becomes large.

...AND...

The problem is still that you must scan everything on the disk (practically) to identify the inode or indirect block that references the block on the cylinder group in question, and THAT's the problem.
If you knew a small set of CG's that needed checking, vs. all of them, it would *still* bust the cache, which is what takes all the time.

Assume, on average, each file goes into indirect blocks. That's basically an N^3 algorithm to find a given block for a given CG; and the problem is the cache busting. Reducing this to N^2*M, where M < N, or even M << N, will not significantly speed up the process; e.g.:

,---.*
|+++|*   * Where you miss
|+++,.   + Where you have to repeat
`---||
    ||
    ||
    `'

...So even if you pick your repetition area right (you could have accidentally picked the big one instead of the small one, if the files were very large, or there was a very large number of small ones), you don't save a lot.

> Typically a file will be stored in one or a small number of cyl groups. If that info. is stored somewhere it can speed things up.

The problem is reverse
Re: Anyone working on fsck?
Julian Elischer wrote:
> The problem space is: fsck of UFS/FFS partitions is too slow for 200GB+ filesystems.
>
> The solution space cannot contain any answer that includes redefining UFS/FFS.
>
> Welcome to the real world. :-)

Use smaller than 200GB+ filesystems. 8-).

-- Terry
Re: Anyone working on fsck?
Bakul Shah wrote:
> > I have been tending UNIX computers of all sorts for many years and there is one bit of wisdom that has yet to fail me: Every now and then, boot in single-user and run full fsck on all filesystems. If this had failed to be productive, I would have given up the habit years ago, but it is still a good idea it seems.
>
> Even now I use fsck in foreground since the background fsck was not stable enough the last time I used it. But I remember thinking fsck was taking too long for as long as I have used it (since 1981).

If your problem is availability, then any time > 0 counts as too long. Taking the whole system offline, while the rest of the world has gone into an electronic transaction frenzy because of some world event, and running a BG fsck is not really an option.

> Anything that runs for half an hour or more in fg is likely to take longer in bg. What happens if the system crashes again before it finishes? Will bg fsck handle that? Am I right in thinking that it cannot save files in /lost+found?

It can, but it has to be changed; it's pretty ugly. Your best bet is to precreate lost+found, and then modify the code to allow you to ftruncate(2) a directory large, forcibly allocating backing blocks for it (that was my last workaround to the problem).

> > Optimizing fsck is a valid project, I just wish it would be somebody who would also finish the last 30% who would do it.
>
> I am skeptical you will get more than a factor of 2 improvement without changing the FS (but hey, that is 3 hours for Julian so I am sure he will be happy with that!).

I'm skeptical you will get a factor of 2.

-- Terry
Re: Anyone working on fsck?
Jeff Roberson wrote:
> On Mon, 17 Mar 2003, Brooks Davis wrote:
> > I am still interested in improvements to fsck since I'm planning to buy several systems with two 1.4TB IDE RAID5 arrays in them soon.
>
> For these types of systems doing a block caching layer with a prefetch that understands how many spindles there are would be a huge benefit.

I call that layer Vinum or RAIDFrame, since that's a job I expect that code to do for me. 8-).

-- Terry
Re: Anyone working on fsck?
On Mon, 17 Mar 2003, Terry Lambert wrote:
> Jeff Roberson wrote:
> > On Mon, 17 Mar 2003, Brooks Davis wrote:
> > > I am still interested in improvements to fsck since I'm planning to buy several systems with two 1.4TB IDE RAID5 arrays in them soon.
> >
> > For these types of systems doing a block caching layer with a prefetch that understands how many spindles there are would be a huge benefit.
>
> I call that layer Vinum or RAIDFrame, since that's a job I expect that code to do for me. 8-).

They are not responsible for data caching. Only informing the upper layers how many spindles they have. Software RAID should be a transform only in my opinion. There is no reason to have duplicate block caches in system memory.

Cheers,
Jeff
Re: Anyone working on fsck?
Jeff Roberson wrote:
> On Mon, 17 Mar 2003, Terry Lambert wrote:
> > Jeff Roberson wrote:
> > > On Mon, 17 Mar 2003, Brooks Davis wrote:
> > > > I am still interested in improvements to fsck since I'm planning to buy several systems with two 1.4TB IDE RAID5 arrays in them soon.
> > >
> > > For these types of systems doing a block caching layer with a prefetch that understands how many spindles there are would be a huge benefit.
> >
> > I call that layer Vinum or RAIDFrame, since that's a job I expect that code to do for me. 8-).
>
> They are not responsible for data caching. Only informing the upper layers how many spindles they have. Software RAID should be a transform only in my opinion. There is no reason to have duplicate block caches in system memory.

Let's turn that around: There is no reason to have duplicate spindle knowledge.

Actually, I was not suggesting duplicate block caches, I was suggesting cache attribution by spindle by the code that knows what block lives on what spindle.

Even so, for RAID, this is generally problematic, because there are multiple locations for the block: where it lives, where it's mirrored, where its parity block lives, etc.. Ideally, these are all different spindles, so the problem can't be fixed by a simple cache. 8-(. You would need a chain of spindle references, supplied by the code that knows the spindle, hung off the cache object. Gets ugly fast.

-- Terry
Re: Anyone working on fsck?
On Mon, Mar 17, 2003 at 23:02:38 -0500, Jeff Roberson wrote:
> On Mon, 17 Mar 2003, Terry Lambert wrote:
> > Jeff Roberson wrote:
> > > On Mon, 17 Mar 2003, Brooks Davis wrote:
> > > > I am still interested in improvements to fsck since I'm planning to buy several systems with two 1.4TB IDE RAID5 arrays in them soon.
> > >
> > > For these types of systems doing a block caching layer with a prefetch that understands how many spindles there are would be a huge benefit.
> >
> > I call that layer Vinum or RAIDFrame, since that's a job I expect that code to do for me. 8-).
>
> They are not responsible for data caching. Only informing the upper layers how many spindles they have. Software RAID should be a transform only in my opinion. There is no reason to have duplicate block caches in system memory.

There are times when a software RAID layer should do some caching. Hopefully your software RAID layer will be integrated enough into the system that it can avoid doing any copies, though.

The place where you really want some sort of caching is when you try to coalesce buffers to get full stripe writes with RAID-5. Otherwise, you have to do read-modify-write, which is more expensive.

There are other cases where caching is needed for RAID-1 and RAID-5, but they generally require specialized hardware. (Hint: why do many controllers that support RAID-1 and RAID-5 have battery backed caches?)

Other than stripe coalescing, and those cases that come up when you have specialized hardware, you're right, RAID should be a transform that doesn't require copying. (Caching is another story.)

Ken
--
Kenneth Merry [EMAIL PROTECTED]
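To illustrate the stripe-coalescing point: with XOR parity, updating a single block means reading back the old data and the old parity before the new parity can be written (read-modify-write), whereas a full-stripe write can compute parity from the new data alone. The following is a small Python sketch with made-up block contents and hypothetical helper names; it is not taken from Vinum or RAIDFrame.

```python
from functools import reduce


def xor_blocks(*blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def full_stripe_parity(data_blocks):
    """Parity for a full-stripe write: needs only the new data, no reads."""
    return xor_blocks(*data_blocks)


def rmw_parity(old_parity, old_data, new_data):
    """Parity for a single-block update: old parity and old data must be read first."""
    return xor_blocks(old_parity, old_data, new_data)


# A toy three-disk data stripe with two-byte blocks (values are arbitrary).
stripe = [b"\x01\x02", b"\x10\x20", b"\xaa\xbb"]
parity = full_stripe_parity(stripe)

# Overwrite the middle block only.
new_block = b"\xff\x00"
updated = [stripe[0], new_block, stripe[2]]

# Both paths must agree on the resulting parity.
same = rmw_parity(parity, stripe[1], new_block) == full_stripe_parity(updated)
```

Both paths produce the same parity; the difference is that the read-modify-write path has to fetch two blocks from disk before it can write anything, which is the cost that buffering up a full stripe avoids.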
Re: Anyone working on fsck?
> > UFS is the real problem here, not fsck. Its tradeoffs for improving normal access latencies may have been right in the past but not for modern big disks. ...
>
> Sorry, but the track-to-track seek latency optimizations you are referring to are turned off, given the newfs defaults, and have been for a very long time.

I was thinking of the basic idea of cylinder groups as good for normal load, not so good for fsck when you have too many CGs. I wasn't thinking of what fsck does or does not do.

> Basically, the problem is that for a BG fsck, it's not possible to lock access on a cylinder group basis, and then there is a hell of a lot of data on that drive.

Note that Julian said 6 hours to fsck a TB in the normal foreground mode.

> What Julian needs is a Journalling or Log-structured FS.

All that Julian wants is a faster fsck without mucking with the FS!

While I agree with you that you do need a full consistency check, it is worth thinking about how one can avoid that whenever possible. For example, if you can know where the disk head is at the time of crash (based on what blocks were being written) it should be possible to avoid a full check.

> The easiest way to do this is to provide some sort of domain control, so that you limit the files you have to check to a smaller set of files per CG, so that you don't have to check all the files to check a particular CG -- and then preload the CG's for the sets of files you *do* have to check.

If you have to visit a CG (during fsck) you have already paid the cost of the seek and rotational latency. Journalling wouldn't help here if you still have a zillion CGs.

> Another way of saying this is... don't put all your space in a single FS. 8-).

Or in effect treat each CG (or a group of CGs) as a self contained filesystem (for the purpose of physical allocation) and maintain explicit import/export lists for files that span them.
> The problem with having to (effectively) read every inode and direct block on the disk is really insurmountable, I think.

That is why I was suggesting putting them in one (or small number of) contiguous area(s). On a modern ATA100 or better disk you can read a GB in under a minute. Once the data is in-core you can divide up the checking to multiple processors. This is sort of like a distributed graph collection: you only need to worry about graphs that cross a node boundary. Most structures will be contained in one node.

Even for UFS it is probably worth dividing fsck in two or more processes, one doing IO, one or more doing computation.

> > Typically only a few cyl grps may be inconsistent in case of a crash. May be some info about which cyl groups need to be checked can be stored so that brute force checking of all grps can be avoided.
>
> This would work well... under laboratory test conditions. In the field, if the machine has a more general workload (or even a heterogeneous workload, but with a hell of a lot of files, like a big mail server), this falls down, as the number of bits marking unupdated cylinder groups becomes large.

Possible -- it is one of the ideas I can think of. I'd have to actually model it or simulate it beyond handwaving to know one way or other. May be useful in conjunction with other ideas.

> ...AND... The problem is still that you must scan everything on the disk (practically) to identify the inode or indirect block that references the block on the cylinder group in question, and THAT's the problem. If you knew a small set of CG's, that needed checked, vs. all of them, it would *still* bust the cache, which is what takes all the time. Assume, on average, each file goes into indirect blocks.

On my machine the average file size is 21KB (averaged over 4,000,000 files). Even with 8KB blocksize very few will have indirect blocks.
I don't know how typical my file size distribution is but I suspect the average case is probably smaller files (I store lots of datasheets, manuals, databases, PDFs, MP3s, cvs repositories, compressed tars of old stuff).

But in any case wouldn't going forward from inodes make more sense? This is like a standard tracing garbage collection algorithm. Blocks that are not reachable are free. Even for a 1 TB system with 8K blocks you need a 2^(40-13-3) == 16 Mbyte bitmap, or some multiple if you want more than 1 bit of state.

> The problem is reverse mapping a bit in a CG bitmap to the file that reference it... 8^p.

Why would you want to do that?!

> Multithreading fsck would be an incredibly bad idea.

It depends on the actual algorithm.

> Personally, I recommend using a different FS, if you are going to create a honking big FS as a single partition. 8-(.

There are other issues with smaller partitions. I'd rather have a single logical file system where all the space can be used. If physically it is implemented as a number of smaller systems that is okay. Also note that now people can create big honking files with video streaming at the
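The bitmap arithmetic and the forward pass from inodes in this message can be sketched as follows. This is a toy model in Python, not fsck code: the inode table is a plain dict of block lists, indirect blocks are ignored, and every function name here is invented for the example.

```python
def bitmap_bytes(fs_bytes_log2, block_bytes_log2, bits_per_block=1):
    """Size in bytes of a bitmap holding bits_per_block bits for every block."""
    blocks = 1 << (fs_bytes_log2 - block_bytes_log2)
    return blocks * bits_per_block // 8


def trace_reachable(inodes, total_blocks):
    """Walk forward from inodes and mark every block they reference.

    inodes: dict mapping inode number -> list of block numbers (toy model).
    Returns a bytearray bitmap; any block whose bit is clear is free.
    """
    bitmap = bytearray((total_blocks + 7) // 8)
    for block_list in inodes.values():
        for block in block_list:
            bitmap[block >> 3] |= 1 << (block & 7)
    return bitmap


def is_allocated(bitmap, block):
    """True if the traced bitmap has the bit for this block set."""
    return bool(bitmap[block >> 3] & (1 << (block & 7)))


# 1 TB (2^40 bytes) with 8 KB (2^13 byte) blocks, one bit per block.
size = bitmap_bytes(40, 13)  # 2^(40-13-3) bytes

# Tiny example: two inodes on a 16-block "disk".
traced = trace_reachable({2: [0, 5], 3: [9]}, total_blocks=16)
```

The traced bitmap can then be compared against the on-disk allocation bitmaps in one pass; with two bits per block instead of one, the same structure would need `bitmap_bytes(40, 13, bits_per_block=2)` bytes.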
ACPI suspend problem (ThinkPad X23)
Hi,

My -CURRENT(2003/03/12) laptop(ThinkPad X23) can't be suspended. When I try

# acpiconf -s 1

I have console message

'acpi0: AcpiGetSleepTypeData failed - AE_NOT_FOUND'

How can I solve this? dmesg output is attached.

Regards,

Copyright (c) 1992-2003 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD 5.0-CURRENT-20030312-JPSNAP #1: Tue Mar 18 01:26:29 JST 2003
[EMAIL PROTECTED]:/usr/src/sys/i386/compile/JORDAN
Preloaded elf kernel /boot/kernel/kernel at 0xc0802000.
Preloaded elf module /boot/kernel/acpi.ko at 0xc08020a8.
Timecounter i8254 frequency 1193182 Hz
Timecounter TSC frequency 865934929 Hz
CPU: Intel(R) Pentium(R) III Mobile CPU 866MHz (865.93-MHz 686-class CPU)
Origin = GenuineIntel Id = 0x6b1 Stepping = 1
Features=0x383f9ff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,SSE>
real memory = 670498816 (639 MB)
avail memory = 642658304 (612 MB)
Allocating major#253 to net
Allocating major#252 to pci
Pentium Pro MTRR support enabled
npx0: math processor on motherboard
npx0: INT 16 interface
acpi0: IBMTP-1D on motherboard
ACPI-0625: *** Info: GPE Block0 defined as GPE0 to GPE15
ACPI-0625: *** Info: GPE Block1 defined as GPE16 to GPE31
pcibios: BIOS version 2.10
Using $PIR table, 14 entries at 0xc00fdeb0
ACPI-1287: *** Error: Method execution failed, AE_NOT_EXIST
ACPI-1287: *** Error: Method execution failed, AE_NOT_EXIST
acpi0: power button is handled as a fixed feature programming model.
Timecounter ACPI-fast frequency 3579545 Hz
acpi_timer0: 24-bit timer at 3.579545MHz port 0x1008-0x100b on acpi0
acpi_cpu0: CPU port 0x530-0x537 on acpi0
acpi_tz0: thermal zone port 0x530-0x537 on acpi0
acpi_lid0: Control Method Lid Switch on acpi0
ACPI-1287: *** Error: Method execution failed, AE_NOT_EXIST
acpi_button0: Sleep Button on acpi0
pcib0: ACPI Host-PCI bridge port 0xcf8-0xcff on acpi0
pci0: ACPI PCI bus on pcib0
agp0: Intel 82830 host to AGP bridge mem 0xd000-0xdfff at device 0.0 on pci0
pcib1: ACPI PCI-PCI bridge at device 1.0 on pci0
pci1: ACPI PCI bus on pcib1
pci1: display, VGA at device 0.0 (no driver attached)
uhci0: Intel 82801CA/CAM (ICH3) USB controller USB-A port 0x1800-0x181f irq 11 at device 29.0 on pci0
usb0: Intel 82801CA/CAM (ICH3) USB controller USB-A on uhci0
usb0: USB revision 1.0
uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhci1: Intel 82801CA/CAM (ICH3) USB controller USB-B port 0x1820-0x183f irq 11 at device 29.1 on pci0
usb1: Intel 82801CA/CAM (ICH3) USB controller USB-B on uhci1
usb1: USB revision 1.0
uhub1: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
uhci2: Intel 82801CA/CAM (ICH3) USB controller USB-C port 0x1840-0x185f irq 11 at device 29.2 on pci0
usb2: Intel 82801CA/CAM (ICH3) USB controller USB-C on uhci2
usb2: USB revision 1.0
uhub2: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub2: 2 ports with 2 removable, self powered
pcib2: ACPI PCI-PCI bridge at device 30.0 on pci0
pci2: ACPI PCI bus on pcib2
cbb0: RF5C476 PCI-CardBus Bridge mem 0x5000-0x5fff irq 11 at device 3.0 on pci2
start (5000) sc-membase (c020)
start (5000) sc-pmembase (e800)
cardbus0: CardBus bus on cbb0
pccard0: 16-bit PCCard bus on cbb0
cbb1: RF5C476 PCI-CardBus Bridge mem 0x5010-0x50100fff irq 11 at device 3.1 on pci2
start (5010) sc-membase (c020)
start (5010) sc-pmembase (e800)
cardbus1: CardBus bus on cbb1
pccard1: 16-bit PCCard bus on cbb1
fwohci0: Ricoh R5C552 mem 0xc0201000-0xc02017ff irq 11 at device 3.2 on pci2
fwohci0: PCI bus latency was changing to 250.
fwohci0: OHCI version 1.0 (ROM=0)
fwohci0: No. of Isochronous channel is 4.
fwohci0: EUI64 00:06:1b:00:20:01:b0:a0
fwohci0: Phy 1394a available S400, 2 ports.
fwohci0: Link S400, max_rec 2048 bytes.
firewire0: IEEE1394(FireWire) bus on fwohci0
if_fwe0: Ethernet over FireWire on firewire0
if_fwe0: Fake Ethernet address: 02:06:1b:01:b0:a0
sbp0: SBP2/SCSI over firewire on firewire0
fwohci0: Initiate bus reset
fwohci0: BUS reset
fwohci0: node_id = 0xc800ffc0, CYCLEMASTER mode
firewire0: 1 nodes, maxhop = 0, cable IRM = 0 (me)
wi0: Intersil Prism2.5 mem 0xf000-0xffff irq 11 at device 5.0 on pci2
wi0: 802.11 address: 00:20:e0:8a:71:c8
wi0: using RF:PRISM2.5 MAC:ISL3874A(Mini-PCI)
wi0: Intersil Firmware: Primary (1.1.0), Station (1.4.2)
wi0: supported rates: 1Mbps 2Mbps 5.5Mbps 11Mbps
fxp0: Intel 82801CAM (ICH3) Pro/100 VE Ethernet port 0x8000-0x803f mem 0xc020-0xc0200fff irq 11 at device 8.0 on pci2
fxp0: Ethernet address 00:d0:59:aa:4d:f3
inphy0: i82562ET 10/100 media interface on miibus0
inphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
isab0: PCI-ISA bridge at device 31.0 on pci0
isa0: ISA bus on isab0
atapci0: Intel ICH3 UDMA100
Re: Anyone working on fsck?
In message [EMAIL PROTECTED], Brad Knowles writes:
> At 10:39 PM +0100 2003/03/17, Poul-Henning Kamp wrote:
> > Optimizing fsck is a valid project, I just wish it would be somebody who would also finish the last 30% who would do it.
>
> Just what are you saying? Is Julian Elischer not the right person to be working on this, because he has a history of not finishing the last 30% of something?

Yes.

--
Poul-Henning Kamp | UNIX since Zilog Zeus 3.20
[EMAIL PROTECTED] | TCP/IP since RFC 956
FreeBSD committer | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.
Re: ACPI suspend problem (ThinkPad X23)
In message [EMAIL PROTECTED], FUJITA Kazutoshi wrote:
> Hi,
>
> My -CURRENT(2003/03/12) laptop(ThinkPad X23) can't be suspended. When I try
> # acpiconf -s 1
> I have console message
> 'acpi0: AcpiGetSleepTypeData failed - AE_NOT_FOUND'
>
> How can I solve this? dmesg output is attached.

It seems that your machine does not support S1 sleep. See the result of

# acpidump | grep _S1

If there is no line like

Name(\_S1_, Package(0x4){

use S2 or S3 instead.
Re: Anyone working on fsck?
In message [EMAIL PROTECTED], Greg 'groggy' Lehey writes:
> > Optimizing fsck is a valid project, I just wish it would be somebody who would also finish the last 30% who would do it.
>
> Poul-Henning, how can you justify the second half of that sentence? I take exception to the implications.
>
> In case anybody is in any doubt, I've heard you say this sort of thing about julian before. Please don't do it again.

I'll stop as soon as KSE is finished, fair?

--
Poul-Henning Kamp | UNIX since Zilog Zeus 3.20
[EMAIL PROTECTED] | TCP/IP since RFC 956
FreeBSD committer | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.
Re: Anyone working on fsck?
Bakul Shah wrote:
> > Sorry, but the track-to-track seek latency optimizations you are referring to are turned off, given the newfs defaults, and have been for a very long time.
>
> I was thinking of the basic idea of cylinder groups as good for normal load, not so good for fsck when you have too many CGs. I wasn't thinking of what fsck does or does not do.

Number of CG's is irrelevant, really. You can read 10 of the bitmaps at a time and precache them, or you can read 1000 of them and precache them, the difference is going to be minimal. The thing that takes the time is the forward traversal to do the reverse lookup.

Julian has the right idea about precaching; I suggested it, too, before reading his message, but after a certain level, there is a point of diminishing returns.

Getting rid of cylinder groups just pushes the problem up a level, it doesn't eliminate it. No cylinder group bitmap to be the last thing updated? OK, then whatever becomes the last thing updated, instead of that, becomes the reverse lookup bottleneck.

> > Basically, the problem is that for a BG fsck, it's not possible to lock access on a cylinder group basis, and then there is a hell of a lot of data on that drive.
>
> Note that Julian said 6 hours to fsck a TB in the normal foreground mode.

Yes, I know. I'm aware. He has a lot of data to transfer in from the disk in order to do the reverse lookup, with that much data on the disk.

He could change the code so that everything that wasn't in use was zeroed. That would help, since then he could just create a second CG map in an empty file by traversing all inodes and indirect blocks for non-zero values, not caring if the indirect blocks were referenced by inodes. Assuming indirect blocks and data blocks were distinguishable. Like I said, there are a lot of FS changes that could help.

> > What Julian needs is a Journalling or Log-structured FS.
>
> All that Julian wants is a faster fsck without mucking with the FS!
And I'd like to regrow all my teeth that have ever had dental work done on them. There are some things you can't have. 8-).

> While I agree with you that you do need a full consistency check, it is worth thinking about how one can avoid that whenever possible. For example, if you can know where the disk head is at the time of crash (based on what blocks were being written) it should be possible to avoid a full check.

You can buy hardware that can do this. Sun sold it in the 1980's, and people have made NVRAM-backed caching disk controllers ever since.

> > The easiest way to do this is to provide some sort of domain control, so that you limit the files you have to check to a smaller set of files per CG, so that you don't have to check all the files to check a particular CG -- and then preload the CG's for the sets of files you *do* have to check.
>
> If you have to visit a CG (during fsck) you have already paid the cost of the seek and rotational latency.

No. The cost of the fsck is *not* the reading of the cylinder group bitmaps. You can cache a fixed number of those. The problem is in not knowing which inodes and indirect blocks contain direct and indirect block references to the cylinder groups you happen to have in cache. In other words, the cost is in enumerating all the allocated blocks in all of the files.

The *reason* it doesn't matter in the CG bitmap case is that a bitmap can only tell allocated vs. unallocated; there is no third state that means "I haven't checked this bit yet". So you can only load up however many bitmaps you are going to check in a given pass (hopefully they all fit in memory, but if they don't, fitting in the address space does you no good, because they still aren't tri-state), and then iterate every single bit reference in existence, and compare them to the ones you have, and clear them if they aren't referenced by *anyone*.

In other words, this is *always* a cylinder group bitmap bit-major operation (as in row-major vs.
column-major arrays in C and FORTRAN).

Journalling wouldn't help here if you still have a zillion CGs.

Yes, it would. If you had Journalling, you wouldn't have CG bitmaps to worry about, no matter how many CG's you had. The *only* thing you would have that you cared about is the "operation started" and "operation complete" stamps on a journal entry, and you'd *only* care whether the former was greater or less than the latter. And then you'd write a field of "last valid journal entry" flip-flop into the oldest of the two flip-flop fields with a journal reference and the current date stamp, and you'd be done. The only thing you'd care about was how many "last valid" flip-flops you had, which would be totally unrelated to cylinder groups, because it's a file-major idea (and you have to traverse the files anyway).

Another way of saying this is... don't put all your space in a single FS. 8-).

Or in effect treat each CG (or a group of CGs) as a self
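The bitmap-major pass described earlier in this message (cache a window of cylinder group bitmaps, walk every block reference in the filesystem, then clear bits nobody referenced) can be sketched on a toy model. This is only an illustration under stated assumptions, not fsck code: check_cg_window(), BLOCKS_PER_CG and the flat refs[] array are hypothetical names, and a real filesystem would produce the references by traversing inodes and indirect blocks, which is the expensive part.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy value; real cylinder groups are far larger. */
enum { BLOCKS_PER_CG = 64 };

static int
bit_get(const uint8_t *m, size_t i)
{
	return (m[i / 8] >> (i % 8)) & 1;
}

static void
bit_set(uint8_t *m, size_t i)
{
	m[i / 8] |= (uint8_t)(1u << (i % 8));
}

static void
bit_clr(uint8_t *m, size_t i)
{
	m[i / 8] &= (uint8_t)~(1u << (i % 8));
}

/*
 * One pass over a window of ncg cylinder groups starting at first_cg.
 * bitmap holds the cached allocation bits for the window only; seen is
 * scratch space of the same size.  refs[] stands in for "every block
 * reference in every file", so it must be walked in full no matter how
 * small the window is.  Returns the number of bits cleared because no
 * reference pointed at them.
 */
size_t
check_cg_window(uint8_t *bitmap, uint8_t *seen, size_t first_cg, size_t ncg,
    const uint32_t *refs, size_t nrefs)
{
	size_t lo = first_cg * BLOCKS_PER_CG;
	size_t nbits = ncg * BLOCKS_PER_CG;
	size_t i, cleared = 0;

	assert(bitmap != NULL && seen != NULL);
	memset(seen, 0, nbits / 8);

	/* Full traversal: only references inside the window are recorded. */
	for (i = 0; i < nrefs; i++) {
		if (refs[i] >= lo && refs[i] - lo < nbits)
			bit_set(seen, refs[i] - lo);
	}

	/* Bitmap-major sweep: allocated but never referenced => clear. */
	for (i = 0; i < nbits; i++) {
		if (bit_get(bitmap, i) && !bit_get(seen, i)) {
			bit_clr(bitmap, i);
			cleared++;
		}
	}
	return cleared;
}
```

Because the bitmap has only two states, the separate seen[] array plays the role of the missing "not yet checked" state. Note that shrinking or growing the window changes only the memory used, not the length of the refs[] traversal, which matches the point that the number of CGs is not what makes the check slow.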
Software RAID caching? (was: Anyone working on fsck?)
On Monday, 17 March 2003 at 23:02:38 -0500, Jeff Roberson wrote:
On Mon, 17 Mar 2003, Terry Lambert wrote:
Jeff Roberson wrote:
On Mon, 17 Mar 2003, Brooks Davis wrote:

I am still interested in improvements to fsck since I'm planning to buy several systems with two 1.4TB IDE RAID5 arrays in them soon.

For these types of systems doing a block caching layer with a prefetch that understands how many spindles there are would be a huge benefit.

I call that layer Vinum or RAIDFrame, since that's a job I expect that code to do for me. 8-).

They are not responsible for data caching. Only informing the upper layers how many spindles they have. Software RAID should be a transform only in my opinion. There is no reason to have duplicate block caches in system memory.

Agreed. Vinum doesn't cache. There is one case, though, where it could be argued that it's worthwhile, namely in RAID-[45] parity blocks.

Greg
--
See complete headers for address and phone numbers
Re: NFS file unlocking problem
Hi, Terry -

On Mon, Mar 17, 2003 at 07:02:31PM -0800, Terry Lambert wrote:
Andrew P. Lentvorski, Jr. wrote:

being sent is SETLKW which is a blocking wait until lock is granted. If the server thinks the file is already locked, it will hang *and* that is the proper behavior.

It is, to ensure FIFO ordering of request grants. You could also implement this as a retry. If you do it the first way, you end up potentially deadlocking the server when a single client has badly behaved code that locks against itself. If you do it the second way, you end up with timing dependent starvation deadlocks for individual client processes. Note that the first deadlock is normal -- it would happen if the file were local, as well... no help for badly written code -- but I mention it as important because we are talking about blocking multiple clients.

I don't know what the process is, but a threaded process can cause a deadlock when it should be a grant/upgrade/downgrade of an existing lock overlap. This is because there is no such thing as a thread ID in the NFS protocol, and if process IDs are different for different threads, and the requests come from the same system ID, then you can get a deadlock when none should be present. To avoid this, either manage all locks in an apartment or rental model (queue all requests to a single thread, and have it do the locking by proxy) OR make sure that all requests from any thread in a given process in fact are given the same proxy process ID on the wire.

[ ... This last is not likely your problem, but I mention it, in case you are using rfork() or Linux threads ... ]

Thanks for the explanation. If I were a programmer, it would be very useful. As it is, it's still interesting. I have no way of judging the quality of the code in question, other than the empirical result that it works in most cases.

What is the result of running this locally on the NFS server and attempting to lock the underlying file?
If rpc.lockd is hanging onto a lock, running that perl script locally on the actual file (not an NFS mounted image of it) should also hang.

That was my next question, as well: does it happen on a local FS as well as an NFS FS? Personally, I would *NOT* recommend running it on the server, but mount a local FS on the client instead; the fewer variables, the better.

Works fine on the client on a local file system. Works fine on the server.

On the other hand, this is clearly a deadlock that requires an existing, conflicting lock -- IFF you are correct about the delayed locking behaviour.

Not sure I understand this.

As a side note, you probably want to create a C executable to do this kind of fcntl fiddling when attempting to test NFS. That way you can use a locally mounted binary and you won't wind up with all of the Perl access calls on the NFS wire. Or, at least, use a local copy of Perl.

I recommend a pared down test case. I suspect that the problem is that something that is expected to have the same ID is locking against itself.

I can't pare it down any further using perl. If someone better at C than I am gives me a sample C program, I'll be happy to try it.

Does the failure occur with the same values in all cases in the F_RSETLKW? If so, I suggest you capture *all* locking packets on your wire, and then find who is conflicting. This may be a simple lock order reversal (deadly embrace deadlock) due to poor application performance. You may also find that you have multiple process IDs, when it should be a single process ID, for the proxy PID for the conflicting request. At worst, it would be nice to know the system that caused it. Actually, for a lock you know is there, you *can* diagnose the problem (somewhat) by writing a program on the server, and using F_GETLK on the range for the hanging lock on the server -- this will return a struct flock, which will give you range and PID information. Do it on the Solaris box, though.
The reason you want to do this on the Solaris box is that the struct flock on FreeBSD fails to include the l_rsysid -- the remote system ID.

Sorry, but I don't understand any of that.

Actually, given this, I don't understand how FreeBSD server side proxy locking can actually work at all; it would incorrectly coalesce locks with local locks when the l_pid matched, which would be *all* locks in the lockd, and then incorrectly release them when a local process exited, or any process on any remote system unlocked an overlapping range (possibly in error).

So you're suggesting that when it works, it's just lucky? But others have said that it works for them, and it seems to work OK between FreeBSD systems.

You are using FreeBSD as the NFS client in this case, right? If so, that's probably not an issue for you...

No. I think that you may be trying to solve a problem I don't have. First - I'm not a programmer. I'm not trying to write any
Re: Anyone working on fsck?
I'll stop as soon as KSE is finished, fair ?

I'm very disappointed in this response. Poul, everything else I've read from you to date has been reasonable except for this posting. I would think that you, yourself, should be especially sensitive to criticism of unfinished projects.

Things such as KSE, SMP, and GEOM itself are each huge projects that require a great deal of perseverance over a long period of time to reach a state where, not just in the majority of cases but even in all the hard-to-get-to edge cases, the bugs and gotchas are put to rest. Each just affects too many areas of the system, and too many users; and too many changes in various areas of the system are interrelated. The QA involved is difficult enough in and of its own without personal dissension.

Unlike the ports (where I generally work), I well understand that it is often difficult to work on kernel problems in isolation from work on other problems. Large undertakings such as these are areas where the many eyes development paradigm is stressed to its maximum. These are the areas where cooperation and collegiality are the most needed to overcome the significant intrinsic technical hurdles. And this is exactly the area where I believe you, in the above posting, have let the project down.

But it's your personal reputation, not mine, that is at stake here. If this is the way you wish to have yourself represented in public, it's not my problem, but your own. I just wish you had reached for the delete key before sending this post.

Mark Linimon
Re: HEADS UP: Don't upgrade your Alphas!
Thus spake Ruslan Ermilov [EMAIL PROTECTED]:

Yes, as I have suspected, the gdtoa change is responsible for a breakage. libc corresponding to this lib/libc works:

cvs -q up -P -d -D'2003/03/12 20:20:00'

: Using /home/ru/w/f/usr.bin/awk/nawk nawk...

This version, together with contrib/gdtoa, doesn't:

cvs -q up -P -d -D'2003/03/12 20:29:59'

: Using /home/ru/w/f/usr.bin/awk/nawk nawk...
: nawk: floating point exception 8
:  input record number 325, file
:  source line number 84

To see the breakage, one needs to install new libc, and run (assuming that /usr/bin/nawk is dynamically linked) make in usr.bin/truss; this will run the awk(1) script that exhibits one of these FPEs.

P.S. Hmm, I didn't test this on i386, as I found this bug when attempting to produce a cross-release of i386 on Alpha, so i386's may be affected too. Will see.

The bug is Alpha specific; makenewsyscalls.sh will die if you are using an awk compiled with the new sources, regardless of what architecture you are building for. This is because floating point support on Alpha is broken unless you specifically tell gcc to unbreak it by specifying -mieee.

This week is really bad for me, so unless there's a quick fix, I won't get a chance to look into it further until Thursday night.

Here's a way (other than using -mieee when compiling awk) to hack around the problem. It may not work with higher optimization levels.

Index: lib.c
===================================================================
RCS file: /home/ncvs/src/contrib/one-true-awk/lib.c,v
retrieving revision 1.1.1.3
diff -u -r1.1.1.3 lib.c
--- lib.c 17 Mar 2003 07:59:58 - 1.1.1.3
+++ lib.c 18 Mar 2003 07:09:45 -
@@ -678,7 +678,7 @@
 	char *ep;
 	errno = 0;
 	r = strtod(s, &ep);
-	if (ep == s || r == HUGE_VAL || errno == ERANGE)
+	if (ep == s || isinf(r) || errno == ERANGE)
 		return 0;
 	while (*ep == ' ' || *ep == '\t' || *ep == '\n')
 		ep++;

That said, can someone out there with Alpha FP clue let me know why silent NaN's are broken? The architecture guide says that software support is needed to support quiet vs.
signalling NaNs, but the default gcc settings seem to do this incorrectly. For instance, the following is wrong:

[EMAIL PROTECTED]:~ cat foo.c
#include <math.h>
int
main ()
{
	return (NAN == HUGE_VAL);
}
[EMAIL PROTECTED]:~ gcc foo.c
[EMAIL PROTECTED]:~ ./a.out
Floating exception
[EMAIL PROTECTED]:~

FWIW, the bug is caused by the fact that our strtod() didn't used to understand the string "nan" as specified in ANSI C99, and now it does. awk detects whether a given token is a number using strtod(), and since quiet NaNs seem to incorrectly cause exceptions on Alpha, it chokes on the nanosleep symbol in syscalls.master. I'm hoping there's a better solution than disabling support for NaNs in strtod().
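As a rough illustration of the failure mode described in this message, here is a minimal sketch of a strtod()-based "is this token a number?" check in the style of the patched awk code. The function name looks_like_number() is hypothetical, and the explicit isnan() test is an assumption added for the example rather than part of the posted patch. It shows why a C99-aware strtod() changes the outcome for a token such as nanosleep: the leading "nan" is consumed and a NaN comes back, where an older strtod() would have consumed nothing.

```c
#include <assert.h>
#include <errno.h>
#include <math.h>
#include <stdlib.h>

/*
 * Hypothetical helper, loosely modelled on the check in the patch above.
 * Returns 1 if s parses completely as a finite number (optionally
 * followed by blanks, tabs or newlines), 0 otherwise.
 */
int
looks_like_number(const char *s)
{
	char *ep;
	double r;

	assert(s != NULL);

	errno = 0;
	r = strtod(s, &ep);
	if (ep == s || errno == ERANGE)
		return 0;	/* nothing parsed, or out of range */

	/*
	 * Classify the result instead of comparing it with ==.
	 * Rejecting NaN here is an assumption of this sketch.
	 */
	if (isinf(r) || isnan(r))
		return 0;

	while (*ep == ' ' || *ep == '\t' || *ep == '\n')
		ep++;
	return *ep == '\0';
}
```

Using the classification macros avoids writing an equality comparison against a value that may be a NaN. Whether isnan() and isinf() are themselves safe on Alpha when compiled without -mieee is not something this sketch can confirm; that is exactly the open question raised above.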
Re: NFS file unlocking problem
Steve Sizemore wrote:

Thanks for the explanation. If I were a programmer, it would be very useful. As it is, it's still interesting. I have no way of judging the quality of the code in question, other than the empirical result that it works in most cases.

Well, then you are stuck with the code you have that someone else wrote. Hopefully that's not your problem, or you are in trouble. 8-).

What is the result of running this locally on the NFS server and attempting to lock the underlying file? If rpc.lockd is hanging onto a lock, running that perl script locally on the actual file (not an NFS mounted image of it) should also hang.

That was my next question, as well: does it happen on a local FS as well as an NFS FS? Personally, I would *NOT* recommend running it on the server, but mount a local FS on the client instead; the fewer variables, the better.

Works fine on the client on a local file system. Works fine on the server.

OK, then it isn't an intra-program deadlock, which is something. It could still be inter-program, but if it is, it's not going to be easy to find; you will need to find someone who *is* a programmer. FWIW, this happens when:

Program 1                          Program 2
LOCK A                             LOCK B
LOCK B (Waiting for Program 2)     LOCK A (Waiting for Program 1 waiting for me)

On the other hand, this is clearly a deadlock that requires an existing, conflicting lock -- IFF you are correct about the delayed locking behaviour.

Not sure I understand this.

If someone didn't already have it locked, your lock which waits for the region to be able to lock it would not need to wait: it would just give you the lock, and you wouldn't have the problem.

Does the failure occur with the same values in all cases in the F_RSETLKW? If so, I suggest you capture *all* locking packets on your wire, and then find who is conflicting. This may be a simple lock order reversal (deadly embrace deadlock) due to poor application performance.
You may also find that you have multiple process IDs, when it should be a single process ID, for the proxy PID for the conflicting request. At worst, it would be nice to know the system that caused it. Actually, for a lock you know is there, you *can* diagnose the problem (somewhat) by writing a program on the server, and using F_GETLK on the range for the hanging lock on the server -- this will return a struct flock, which will give you range and PID information. Do it on the Solaris box, though. The reason you want to do this on the Solaris box is that the struct flock on FreeBSD fails to include the l_rsysid -- the remote system ID.

Sorry, but I don't understand any of that.

You need to find out why it's waiting. If it's waiting, it's waiting for somebody. You need to know who that somebody is. Once you know that, you can go hit them over the head with a large baseball bat. 8-). I have attached the program to run on your Solaris box. You may have to look in /usr/include/sys/fcntl.h to see the right name, if it complains about l_rsysid (might be l_sysid, or whatever).

Actually, given this, I don't understand how FreeBSD server side proxy locking can actually work at all; it would incorrectly coalesce locks with local locks when the l_pid matched, which would be *all* locks in the lockd, and then incorrectly release them when a local process exited, or any process on any remote system unlocked an overlapping range (possibly in error).

So you're suggesting that when it works, it's just lucky? But others have said that it works for them, and it seems to work OK between FreeBSD systems.

I would have to look at the locking code in FreeBSD for the NFS case. I wrote some NFS locking code for FreeBSD in 1995 that was not used for the implementation. There are ways around the problem in userspace, but they're very hard to make efficient or get correct.
They also make it very hard to debug easily, because you can't get the system ID for systems that have outstanding locks. 8-(.

You are using FreeBSD as the NFS client in this case, right? If so, that's probably not an issue for you...

No. I think that you may be trying to solve a problem I don't have. First - I'm not a programmer. I'm not trying to write any program at all, except as necessary to diagnose this problem. I'll summarize the situation briefly. The issue cropped up in a commercial program (Xinet) which was working on Solaris 2.6 client and server. I'm replacing the server with a FreeBSD box (RELENG_5_0) and the program stopped working. Xinet tech support diagnosed it as an NFS locking problem, which I've confirmed by my simple perl program.

Client   Server   Result
======   ======   ======
Solaris  Solaris  Works
FreeBSD  Solaris  Works
FreeBSD
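The attachment mentioned in this message is not part of the archive. As a stand-in, here is a minimal sketch of the F_GETLK diagnosis it describes, using only the portable POSIX fields of struct flock. The function names find_lock_holder() and print_lock_holder() are hypothetical and this is not the attached program. The remote system ID field discussed above is deliberately left out because its name is platform specific.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Ask the kernel whether any lock would block a write lock on the byte
 * range [start, start + len) of the open file fd (len == 0 means "to
 * end of file").  Returns 1 and fills *holder if a conflicting lock
 * exists, 0 if the range is free, -1 on error (errno set by fcntl).
 */
int
find_lock_holder(int fd, off_t start, off_t len, struct flock *holder)
{
	struct flock fl;

	assert(holder != NULL);

	memset(&fl, 0, sizeof(fl));
	fl.l_type = F_WRLCK;	/* conflicts with read and write locks */
	fl.l_whence = SEEK_SET;
	fl.l_start = start;
	fl.l_len = len;

	if (fcntl(fd, F_GETLK, &fl) == -1)
		return -1;
	if (fl.l_type == F_UNLCK)
		return 0;	/* nobody is in the way */

	*holder = fl;
	return 1;
}

/*
 * Print the portable fields.  On Solaris the structure also carries a
 * system ID field (the message above suggests checking sys/fcntl.h for
 * its exact name); it is omitted here so the sketch compiles elsewhere.
 */
void
print_lock_holder(const struct flock *fl)
{
	printf("%s lock: start %lld, len %lld, pid %ld\n",
	    fl->l_type == F_WRLCK ? "write" : "read",
	    (long long)fl->l_start, (long long)fl->l_len, (long)fl->l_pid);
}
```

A caller would open the file in question on the server, call find_lock_holder(fd, 0, 0, &holder) to cover the whole file, and print the result when it returns 1. A process never conflicts with its own locks, so the query has to be run from a process other than the one suspected of holding the lock.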
Re: HEADS UP: Don't upgrade your Alphas!
David Schultz wrote:

This is because floating point support on Alpha is broken unless you specifically tell gcc to unbreak it by specifying -mieee.

Sounds like the ability to turn -mieee off at all, let alone making it the default, is bad? If so, why is that the way it is configured?

I'm hoping there's a better solution than disabling support for NaNs in strtod().

Make -mieee on by default?

-- Terry
Re: Anyone working on fsck?
Poul-Henning Kamp wrote:
In message [EMAIL PROTECTED], Greg 'groggy' Lehey writes:

Optimizing fsck is a valid project, I just wish it would be somebody who would also finish the last 30% who would do it.

Poul-Henning, how can you justify the second half of that sentence? I take exception to the implications. In case anybody is in any doubt, I've heard you say this sort of thing about julian before. Please don't do it again.

I'll stop as soon as KSE is finished, fair ?

Poul-Henning.. this is a bit of a cheap shot. Your point may be valid, but this isn't the way to express it as it just turns into a 'phk is being mean again' flamewar and the message gets lost in the noise.

Anyway, the deadline for KSE to be demonstrated as robust in order to avoid getting disabled for 5.x is getting closer. I'm glad it was going to be finished inside 2 months, starting about 18 months ago. (See the 5.x release milestones for the actual deadline, June 30 if I recall correctly.)

Cheers,
-Peter
--
Peter Wemm - [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]
All of this is for nothing if we don't go to the stars - JMS/B5