Re: Broken locale after upgrade to 6-STABLE from 5-STABLE

2007-05-24 Thread Stanislaw Halik
On Thu, May 24, 2007, Artem Kuchin wrote:
 What I don't understand is how the appropriate .so is selected.
 How does FreeBSD know which one to load,
 this
 libc.so.5
 or this
 libc.so.6
 ?
 Did you recompile Perl after the last installworld? If not, do so.
 No, I did not do it explicitly. However, I thought buildworld should
 do it, or is it not a part of the world anymore?

There's no Perl in the base system anymore; symlinking after a library
bump leads to errors, as the ABI has changed.

-- 
Whenever you find that you are on the side of the majority, it is time
to reform.
-- Mark Twain
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: Broken locale after upgrade to 6-STABLE from 5-STABLE

2007-05-23 Thread Stanislaw Halik
On Thu, May 24, 2007, Artem Kuchin wrote:
 What I don't understand is how the appropriate .so is selected.
 How does FreeBSD know which one to load,
 this
 libc.so.5
 or this
 libc.so.6
 ?

Did you recompile Perl after the last installworld? If not, do so.

-- 
Whenever you find that you are on the side of the majority, it is time
to reform.
-- Mark Twain


Re: large system date skew on RELENG_6 changes causes select() failures

2006-09-07 Thread Stanislaw Halik
On Tue, Sep 05, 2006, Mark Andrews wrote:

 A while ago, by accident, I've changed the system date back to the '98
 using date(1). To my astonishment, screen(1) barfed about EINVAL in
 select() and died. Programs, including opera (native FreeBSD-6 binary)
 kept spinning the CPU until I killed them.

 I have no means for debugging it.

 Is this somehow expected? If not (i.e. it's a bug), is it known?

 Probably they calculated timeouts which magically became negative, which
 isn't a valid timeout, and none of the programs are written well enough
 to handle the case, so they exhibited the behavior that you saw...

   Nope.  Just a simple limit in itimerfix.

 int
 itimerfix(struct timeval *tv)
 {

         if (tv->tv_sec < 0 || tv->tv_sec > 100000000 ||
             tv->tv_usec < 0 || tv->tv_usec >= 1000000)
                 return (EINVAL);
         if (tv->tv_sec == 0 && tv->tv_usec != 0 && tv->tv_usec < tick)
                 tv->tv_usec = tick;
         return (0);
 }

   date -j 9809051630 +%s -> 904977000
   date +%s -> 1157438219
   1157438219 - 904977000 -> 252461219 which is greater than 100000000

The loop in GNU screen, which invokes select() looks like this:

{
  struct timeval t;

  t.tv_sec = (long) (msec / 1000);
  t.tv_usec = (long) ((msec % 1000) * 1000);
  select(0, (fd_set *)0, (fd_set *)0, (fd_set *)0, &t);
}

There's no time_t subtraction at all.

I dare to say that it's a bug.
/me ducks
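
For anyone who wants to poke at the numbers from userland, here is a
minimal sketch of the range check from the itimerfix() excerpt quoted
above, next to the millisecond conversion from the screen snippet. The
limits (100000000 seconds, 1000000 microseconds) are what I believe
RELENG_6 uses and should be treated as assumptions; the helper names
timeout_check() and msec_to_timeval() are made up for illustration, and
the rounding up to one clock tick is left out because `tick' is a
kernel variable.

```c
#include <errno.h>
#include <sys/time.h>

/* Assumed limits, taken from the itimerfix() excerpt quoted in the thread. */
#define MAX_TIMEOUT_SEC 100000000L
#define USEC_PER_SEC    1000000L

/*
 * Hypothetical userland helper: returns 0 if the timeout would pass the
 * quoted range check, EINVAL otherwise.
 */
int
timeout_check(const struct timeval *tv)
{
        if (tv->tv_sec < 0 || tv->tv_sec > MAX_TIMEOUT_SEC ||
            tv->tv_usec < 0 || tv->tv_usec >= USEC_PER_SEC)
                return (EINVAL);
        return (0);
}

/* Same conversion as the screen snippet: milliseconds to a timeval. */
struct timeval
msec_to_timeval(long msec)
{
        struct timeval t;

        t.tv_sec = msec / 1000;
        t.tv_usec = (msec % 1000) * 1000;
        return (t);
}
```

A relative timeout built by msec_to_timeval() passes the check for any
sane msec value, while a tv_sec of 252461219 (the size of the date
skew) does not, which is why the EINVAL looks like it comes from
something other than screen's own arithmetic.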


Re: 6.1-STABLE spontaneously reboots!

2006-09-05 Thread Stanislaw Halik
On Tue, Sep 05, 2006, [EMAIL PROTECTED] wrote:

 It's now back in production and for no cause that I can find, it reboots
 itself roughly twice a day.  Nothing in the syslog or console log [...]

Try setting `dumpdev' in your rc.conf to the swap device. After the
machine reboots, you might get a dump in /var/crash.


large system date skew on RELENG_6 changes causes select() failures

2006-09-04 Thread Stanislaw Halik
Hello,

A while ago I accidentally set the system date back to 1998 using
date(1). To my astonishment, screen(1) barfed about EINVAL in
select() and died. Programs, including opera (native FreeBSD-6 binary),
kept spinning the CPU until I killed them.

I have no means of debugging it.

Is this somehow expected? If not (i.e. it's a bug), is it known?


Re: Network often not responding

2006-08-09 Thread Stanislaw Halik
On Wed, Aug 09, 2006, Dominic Marks wrote:
 Aug  9 15:09:16 cache kernel: xl0: transmission error: 90
 Aug  9 15:09:16 cache kernel: xl0: tx underrun, increasing tx start 
 threshold to 120 bytes

 dc%d: TX underrun -- increasing TX threshold  The device generated a
 transmit underrun error while attempting to DMA and transmit a packet.
 This happens if the host is not able to DMA the packet data into the
 NIC's FIFO fast enough.  The driver will dynamically increase the
 transmit start threshold so that more data must be DMAed into the
 FIFO before the NIC will start transmitting it onto the wire.

 So it would seem like the card cannot keep pace with the system. What NICs
 have you tried?

Based on the quoted text, isn't it the opposite?

-- 
I saw `cout' being shifted Hello world times to the left and stopped right
there. -- Steve Gonedes




Re: trap 12: supervisor write, page not present on 6.1-STABLE Tue May 16 2006

2006-07-03 Thread Stanislaw Halik
On Fri, Jun 30, 2006, Robert Watson wrote:
 Thanks for testing the patch -- it looks like there's a more pressing 
 logical problem in this code!  Could you try the following simpler patch:

 http://www.watson.org/~robert/freebsd/netperf/ip_ctloutput.diff

 The IP option code seems not to know that (in RELENG_6 and before) the pcb 
 is discarded on disconnect, and the application is querying the TTL after a 
 disconnect.  In FreeBSD 7.x, the pcb is preserved after disconnect so this 
 succeeds.

I've been running with the patch applied for 3 days straight and the
machine hasn't crashed once. Please consider merging it to RELENG_6.




Re: trap 12: supervisor write, page not present on 6.1-STABLE Tue May 16 2006

2006-06-30 Thread Stanislaw Halik
On Wed, Jun 28, 2006, Robert Watson wrote:

 6.1-STABLE crashed on me. I'm providing a backtrace. Could any of you,
 experienced people, suggest me if it's a hardware problem or is it an
 error inside the OS?
 This is a known bug in the TCP code; a large set of outstanding changes 
 is present in 7.x that will fix the problem when merged.  However, I 
 recently had push-back on merging the larger batch of changes, so am 
 looking at merging a workaround that will also correct the problem 
 without the larger set of architectural changes.  I hope to have a chance 
 to look at that in detail this weekend.

 I'm glad to know that it isn't either unknown or hardware-related. Thank 
 you for your prompt reply!

 Per my earlier e-mail, I had hoped to merge a larger set of changes from 
 HEAD that resolve the underlying problem here (that inpcb's can be detached 
 from a socket while the socket is still in use), but right now I'm 
 deferring merging those changes as they are somewhat risky (as they are 
 large).  Instead, I've produced a candidate work-around patch, now attached 
 to kern/97095.  This does not fix the underlying problem, but seeks to 
 narrow the window for the race to be exercised by avoiding caching a 
 volatile pointer across user memory copying, which under load can result in 
 blocking I/O.  I would be quite interested in knowing if this resolves the 
 problem in practice -- if so, it's a definite short-term merge candidate to 
 reduce the symptoms of this problem until the proper fix can be merged.

Unfortunately, it still happens to crash in the same code path:

(kgdb) up 7
#7  0xc058e947 in ip_ctloutput (so=0x0, sopt=0xd67f2c80) at
/usr/src/sys/netinet/ip_output.c:1216
1216                    inp->inp_ip_tos = optval;
(kgdb) l /usr/src/sys/netinet/ip_output.c:1216
1211                    break;
1212
1213            inp = sotoinpcb(so);
1214            switch (sopt->sopt_name) {
1215            case IP_TOS:
1216                    inp->inp_ip_tos = optval;
1217                    break;
1218
1219            case IP_TTL:
1220                    inp->inp_ip_ttl = optval;
(kgdb) p inp
$1 = (struct inpcb *) 0x0

I'll be happy to test any other patches when they're available.




Re: trap 12: supervisor write, page not present on 6.1-STABLE Tue May 16 2006

2006-06-30 Thread Stanislaw Halik
On Fri, Jun 30, 2006, Robert Watson wrote:
 Unfortunately, it still happens to crash in the same code path:
 snip
 I'll be happy to test any other patches when they're available.

 Thanks for testing the patch -- it looks like there's a more pressing 
 logical problem in this code!  Could you try the following simpler patch:

 http://www.watson.org/~robert/freebsd/netperf/ip_ctloutput.diff

 The IP option code seems not to know that (in RELENG_6 and before) the pcb 
 is discarded on disconnect, and the application is querying the TTL after a 
 disconnect.  In FreeBSD 7.x, the pcb is preserved after disconnect so this 
 succeeds.

 It could be we actually need both patches, but let's try this one by itself 
 first.

Thanks. I'll report back in a few days after testing the patch.




Re: trap 12: supervisor write, page not present on 6.1-STABLE Tue May 16 2006

2006-06-28 Thread Stanislaw Halik
On Wed, Jun 28, 2006, Robert Watson wrote:

 6.1-STABLE crashed on me. I'm providing a backtrace. Could any of you,
 experienced people, suggest me if it's a hardware problem or is it an
 error inside the OS?
 This is a known bug in the TCP code; a large set of outstanding changes 
 is present in 7.x that will fix the problem when merged.  However, I 
 recently had push-back on merging the larger batch of changes, so am 
 looking at merging a workaround that will also correct the problem 
 without the larger set of architectural changes.  I hope to have a chance 
 to look at that in detail this weekend.
 I'm glad to know that it isn't either unknown or hardware-related. Thank 
 you for your prompt reply!
 Per my earlier e-mail, I had hoped to merge a larger set of changes from 
 HEAD that resolve the underlying problem here (that inpcb's can be detached 
 from a socket while the socket is still in use), but right now I'm 
 deferring merging those changes as they are somewhat risky (as they are 
 large).  Instead, I've produced a candidate work-around patch, now attached 
 to kern/97095.  This does not fix the underlying problem, but seeks to 
 narrow the window for the race to be exercised by avoiding caching a 
 volatile pointer across user memory copying, which under load can result in 
 blocking I/O.  I would be quite interested in knowing if this resolves the 
 problem in practice -- if so, it's a definite short-term merge candidate to 
 reduce the symptoms of this problem until the proper fix can be merged.

 http://www.watson.org/~robert/freebsd/netperf/20060628-ip_ctloutput.diff

Thank you for the patch. I'll let you know in a few days if the crash
occurs again. It's quite reproducible (it crashed yesterday in the same
code path).




Re: trap 12: supervisor write, page not present on 6.1-STABLE Tue May 16 2006

2006-06-27 Thread Stanislaw Halik
On Tue, Jun 27, 2006, Robert Watson wrote:
 6.1-STABLE crashed on me. I'm providing a backtrace. Could any of you, 
 experienced people, suggest me if it's a hardware problem or is it an 
 error inside the OS?
 This is a known bug in the TCP code; a large set of outstanding changes is 
 present in 7.x that will fix the problem when merged.  However, I recently 
 had push-back on merging the larger batch of changes, so am looking at 
 merging a workaround that will also correct the problem without the larger 
 set of architectural changes.  I hope to have a chance to look at that in 
 detail this weekend.

I'm glad to know that it isn't either unknown or hardware-related. Thank
you for your prompt reply!




trap 12: supervisor write, page not present on 6.1-STABLE Tue May 16 2006

2006-06-26 Thread Stanislaw Halik
Hello,

6.1-STABLE crashed on me. I'm providing a backtrace. Could any of you
experienced people tell me whether it's a hardware problem or an
error inside the OS?


Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0x58
fault code  = supervisor write, page not present
instruction pointer = 0x20:0xc058e01a
stack pointer   = 0x28:0xd68d5acc
frame pointer   = 0x28:0xd68d5b04
code segment= base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags= interrupt enabled, resume, IOPL = 0
current process = 42435 (rtorrent)
trap number = 12
panic: page fault
Uptime: 24d18h34m6s
Dumping 511 MB (2 chunks)
  chunk 0: 1MB (160 pages) ... ok
  chunk 1: 511MB (130816 pages) 496 480 464 448 432 416 400 384 368 352 336 320 
304 288 272 256 240 224 208 192 176 160 144 128 112 96 80 64 48 32 16

#0  doadump () at pcpu.h:165
165             __asm __volatile("movl %%fs:0,%0" : "=r" (td));
(kgdb) bt
#0  doadump () at pcpu.h:165
#1  0xc04d609c in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:409
#2  0xc04d63e9 in panic (fmt=0xc06817e7 "%s") at 
/usr/src/sys/kern/kern_shutdown.c:565
#3  0xc066347c in trap_fatal (frame=0xd68d5a8c, eva=0) at 
/usr/src/sys/i386/i386/trap.c:836
#4  0xc0663152 in trap_pfault (frame=0xd68d5a8c, usermode=0, eva=88) at 
/usr/src/sys/i386/i386/trap.c:744
#5  0xc0662d0f in trap (frame=
  {tf_fs = 892993544, tf_es = -1014235096, tf_ds = -1024327640, tf_edi = 0, 
tf_esi = 0, tf_ebp = -695379196, tf_isp = -695379272, tf_ebx = -695378816, 
tf_edx = -695378544, tf_ecx = 0, tf_eax = 8, tf_trapno = 12, tf_err = 2, tf_eip 
= -1067917286, tf_cs = 32, tf_eflags = 2163335, tf_esp = -695378816, tf_ss = 
-695379220}) at /usr/src/sys/i386/i386/trap.c:434
#6  0xc0653cfa in calltrap () at /usr/src/sys/i386/i386/exception.s:139
#7  0xc058e01a in ip_ctloutput (so=0xd68d5d90, sopt=0xd68d5c80) at 
/usr/src/sys/netinet/ip_output.c:1210
#8  0xc059f7df in tcp_ctloutput (so=0xc35fb6f4, sopt=0xd68d5c80) at 
/usr/src/sys/netinet/tcp_usrreq.c:1038
#9  0xc051d867 in sosetopt (so=0xc35fb6f4, sopt=0xd68d5c80) at 
/usr/src/sys/kern/uipc_socket.c:1560
#10 0xc05246b9 in kern_setsockopt (td=0xc38c6780, s=8, level=8, name=8, 
val=0xbfbfe61c, valseg=UIO_USERSPACE, valsize=0)
at /usr/src/sys/kern/uipc_syscalls.c:1351
#11 0xc05245be in setsockopt (td=0x8, uap=0xd68d5d90) at 
/usr/src/sys/kern/uipc_syscalls.c:1307
#12 0xc0663870 in syscall (frame=
  {tf_fs = 139198523, tf_es = 138412091, tf_ds = -1078001605, tf_edi = 
-1077942700, tf_esi = -1077942700, tf_ebp = -1077942744, tf_isp = -695378588, 
tf_ebx = 673057632, tf_edx = 0, tf_ecx = 0, tf_eax = 105, tf_trapno = 0, tf_err 
= 2, tf_eip = 676107131, tf_cs = 51, tf_eflags = 2097734, tf_esp = -1077942788, 
tf_ss = 59}) at /usr/src/sys/i386/i386/trap.c:981
#13 0xc0653d4f in Xint0x80_syscall () at /usr/src/sys/i386/i386/exception.s:200
#14 0x0033 in ?? ()
Previous frame inner to this frame (corrupt stack?)


Thanks in advance for any feedback.

-- 
Stanislaw Halik




Re: trap 12: supervisor write, page not present on 6.1-STABLE Tue May 16 2006

2006-06-26 Thread Stanislaw Halik
On Tue, Jun 27, 2006, Stanislaw Halik wrote:
 6.1-STABLE crashed on me. I'm providing a backtrace. Could any of you,
 experienced people, suggest me if it's a hardware problem or is it an
 error inside the OS?
[...]

More info follows:

#7  0xc058e01a in ip_ctloutput (so=0xd68d5d90, sopt=0xd68d5c80) at
/usr/src/sys/netinet/ip_output.c:1210
1210                    inp->inp_ip_tos = optval;
Current language:  auto; currently c
(kgdb) p inp
$1 = (struct inpcb *) 0x0
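
To illustrate why a NULL inp shows up as a fault at a small address
rather than at 0, here is a minimal sketch. Writing a member through a
NULL struct pointer touches the address equal to that member's offset,
which would fit the fault virtual address of 0x58 in the panic message.
The structures below are hypothetical stand-ins (the padding is chosen
only so the field lands at 0x58; the real struct inpcb layout is not
reproduced), and the NULL check with EINVAL is an assumption for
illustration, not the actual fix.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for struct inpcb. Only the offset of ip_tos
 * matters here; everything before it is opaque padding.
 */
struct fake_inpcb {
        char          pad[0x58];
        unsigned char ip_tos;
        unsigned char ip_ttl;
};

/* Hypothetical stand-in for struct socket. */
struct fake_socket {
        struct fake_inpcb *so_pcb;   /* NULL once the pcb has been discarded */
};

/* Set the TOS byte, refusing to touch a socket whose pcb is gone. */
int
fake_set_tos(struct fake_socket *so, int optval)
{
        struct fake_inpcb *inp = so->so_pcb;

        if (inp == NULL)
                return (EINVAL);     /* assumed error code, illustration only */
        inp->ip_tos = (unsigned char)optval;
        return (0);
}
```

With so_pcb set to NULL the helper returns an error instead of writing
to offset 0x58 from address zero, which is the write the backtrace
shows at ip_output.c:1210.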





Re: GELI issues ? (Re: Increase in panics under 6.1)

2006-05-26 Thread Stanislaw Halik
On Thu, May 25, 2006, Fabian Keil wrote:
 Interestingly enough , i had some nasty issues todays on same laptop.
 I had  2 x 6 GB GELI vnodes, running mtree -K md5digest to compare
 contents. Disk IO was high as expected...but then it just died down
 (but the mtree hadnt finished). 
 (swap is also GELI)

 Any subsequent process trying to access the encrypted mount points
 simply stalled for as long as I cared to wait (10 minutes). The
 processes even stalled a shutdown -r. 
 I'm not sure if it's related, but I lately see this behaviour on
 NFS mounts if the server is not responding.

 Doing cd /mnt/mydeadnfsmount/[tab for autocompletion]
 is enough to render the current console unresponsive.

Isn't that normal and desired for `hard' mounts?

-- 
Kocia galeria - http://koty.foo.pl/




Re: possible tcp problem

2006-05-20 Thread Stanislaw Halik
On Fri, May 19, 2006, Andras Got wrote:
 I'm using freebsd 6.1 and _sometimes_ (one for every ~30-40 minutes) I
 get mysql connect errors with permission denied. The mysql_connect
 returns error code 1, which is permission denied.

Quite certainly not true:

 Errors:
 sendmail[37085]: gethostbyaddr(IP) failed: 1
   ^
 Can't connect to MySQL server on 'IP' (1)

/usr/include/netdb.h:#define HOST_NOT_FOUND 1 /* Authoritative Answer Host not found */

HTH,

-- sh




Re: possible tcp problem

2006-05-20 Thread Stanislaw Halik
On Sun, May 21, 2006, Andras Got wrote:
[top-posting corrected]
 On Fri, May 19, 2006, Andras Got wrote:
 I'm using freebsd 6.1 and _sometimes_ (one for every ~30-40 minutes) I
 get mysql connect errors with permission denied. The mysql_connect
 returns error code 1, which is permission denied.

 Quite certainly not true:

 Errors:
 sendmail[37085]: gethostbyaddr(IP) failed: 1
 ^
 Can't connect to MySQL server on 'IP' (1)

 /usr/include/netdb.h:#define HOST_NOT_FOUND 1 /* Authoritative Answer Host not found */

 HTH,

 Yes... This was because a bad setting in pf.conf. The state
 rules/buffers or something filled from time to time.

If the state limit is being approached, try adaptive.{start,end} and/or
limiting the total number of states for offending connections.

 The sendmail error wasn't connected to this; they were just in the same
 jail with almost identical errors. I thought 1 meant the same across
 programs. :(

For errno -- yes. In that case, it's h_errno being set.
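
A minimal sketch of the difference, assuming a libc that provides
hstrerror() in netdb.h (FreeBSD and glibc both do, as far as I know).
The same integer 1 is looked up in two unrelated tables: strerror()
for errno values, hstrerror() for resolver h_errno values. The wrapper
names as_errno() and as_h_errno() are made up for illustration.

```c
#define _DEFAULT_SOURCE   /* asks glibc to declare hstrerror(); harmless elsewhere */
#include <errno.h>
#include <netdb.h>
#include <string.h>

/* Text for a value read as an errno code (1 is EPERM on FreeBSD and Linux). */
const char *
as_errno(int code)
{
        return (strerror(code));
}

/* Text for the same value read as a resolver h_errno code (1 is HOST_NOT_FOUND). */
const char *
as_h_errno(int code)
{
        return (hstrerror(code));
}
```

Calling both with 1 gives two different messages, so a bare "failed: 1"
in a log only makes sense once you know which of the two namespaces the
program was reporting from.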

-- sh




Re: GCC in 6.0 fails to compile latest MySQL port

2006-04-08 Thread Stanislaw Halik
On Sat, Apr 08, 2006, Václav Haisman wrote:

 It could help me confirm/exclude that possibility if somebody could
 try to compile the preprocessed source with the pasted command.

doesn't ICE on 6.1-PRERELEASE from Apr 3.

HTH,

-- sh




sys/compat/linux/linux_socket.c - log() message without trailing \n

2005-12-23 Thread Stanislaw Halik
hello,

I got this entry mailed to me in the last periodic run:

obsolete pre-RFC2553 sockaddr_in6 rejectedarplookup 62.21.28.103 failed: host 
is not on local network

I've grepped through sys/compat/linux/linux_socket.c and it looks like
there should be a \n on line 157.
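
A minimal sketch of the effect, assuming the logger appends each
message exactly as given and adds no line terminator of its own. The
append_msg() helper and the buffer are hypothetical stand-ins, not the
kernel's log() implementation.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical stand-in for a message buffer: appends msg to buf exactly
 * as given, without adding a newline.
 */
void
append_msg(char *buf, size_t cap, const char *msg)
{
        size_t used = strlen(buf);

        if (used < cap)
                snprintf(buf + used, cap - used, "%s", msg);
}
```

Appending "rejected" without a trailing \n and then "arplookup ...\n"
leaves "rejectedarplookup ..." on a single line, which matches the
entry above.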

regards,

-- 
Stanisław Halik, http://tehran.lain.pl




Re: indefinite wait buffer: Does this indicate hardware issue?

2005-12-16 Thread Stanislaw Halik
Xin LI [EMAIL PROTECTED] wrote:
 I have a box indicating the following sometimes:
 swap_pager: indefinite wait buffer: bufobj: 0, blkno: 262169, size: 4096

 It's running FreeBSD 6.0-RELEASE, with two Maxtor 7Y250P0 hard disks
 attached to Intel ICH5 UDMA100 controller with hw.ata.wc disabled.

 Does this indicate a hardware issue, or some bugs elsewhere?

Are you using swap files? Swapping to a file performs somewhat worse
than swapping to a partition, and that might be the cause. I've seen
the same messages on a server with a swap file, but nothing that
affected stability.

-- 
Stanisław Halik, http://tehran.lain.pl

