Re: Incorrect file size?

2008-06-23 Thread Krassimir Slavchev

Oliver Fromme wrote:
 Ivan Voras wrote:
   Rink Springer wrote:
The 'vscan' user leads me to assume this is SpamAssassin - I've seen this
behaviour at work, where our scripts were trying to back up a 1TB file
(which actually was ~vscan/.spamassassin/auto-whitelist). The result was
that the backup script died due to lack of disk space on the backup
server (as we don't use compression).

When I was investigating why the file could be so large, it turned
out the file was only a few hundred 'real' MB, so that is why I assume
this person is having the same issue as we do. The file is a Berkeley DB
file, by the way, so there's nothing textfile about it ;-)
   
   I learn something every day :)
   Didn't know BDB was smart enough to create sparse files.
 
 BTW, you can use ls -ls to display the number of physical
 blocks allocated to the file, so you can easily see whether
 a file is sparse or not:
 
 $ dd if=/dev/zero of=foo1 bs=1m count=1
 $ truncate -s 1m foo2
 $ ls -ls foo1 foo2
 1040 -rw---  1 olli  olli  1048576 Jun 20 22:43 foo1
   32 -rw---  1 olli  olli  1048576 Jun 20 22:43 foo2

# ls -lsk
total 1247288
 664064 -rw---  1 vscan  vscan  4398199488512 Jun 23 09:39 auto-whitelist
     88 -rw---  1 vscan  vscan          89976 Jun 23 09:39 bayes_journal
 566704 -rw---  1 vscan  vscan  1099639861248 Jun 23 09:39 bayes_seen
  16432 -rw---  1 vscan  vscan       21454848 Jun 23 09:39 bayes_toks

 
 As you can see, the file size is the same, but the block
 counts are different (I have BLOCKSIZE=K in my environment,
 so the blocks are displayed in 1KB units).
 
 I've written a small script that can be used to detect
 sparse files (it even displays the sparseness percentage):
 
 http://www.secnetix.de/olli/scripts/sparsecheck
 
 Best regards
Oliver
 
 PS:  Of course it is still possible that a file system is
 corrupt and needs fsck, no matter whether those files are
 sparse or not.
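
The comparison that ls -ls makes visible can also be done programmatically. What follows is a minimal sketch, not the sparsecheck script linked above: it assumes a POSIX system where st_blocks is reported in 512-byte units, and the function names sparse_percent and check are made up for illustration.

```python
import os


def sparse_percent(apparent_bytes: int, allocated_bytes: int) -> float:
    """Percentage of the apparent size that has no blocks allocated."""
    if apparent_bytes <= 0:
        return 0.0
    # Allocation can exceed the apparent size (indirect blocks, rounding up
    # to whole blocks), so clamp the result at 0.
    return max(0.0, 100.0 * (1 - allocated_bytes / apparent_bytes))


def check(path: str) -> tuple[int, int, float]:
    """Return (apparent size, allocated bytes, sparseness %) for a file."""
    st = os.stat(path)
    allocated = st.st_blocks * 512  # assumption: st_blocks counts 512-byte units
    return st.st_size, allocated, sparse_percent(st.st_size, allocated)
```

Fed the numbers from the listings above, foo2 (1048576 bytes, 32 KB allocated) comes out at about 96.9% sparse, and the auto-whitelist file (4398199488512 bytes, 664064 KB allocated) at well over 99.9%.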
 


___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: Incorrect file size?

2008-06-23 Thread Krassimir Slavchev

Hi,

Ivan Voras wrote:
 Rink Springer wrote:
 On Fri, Jun 20, 2008 at 03:54:22PM +0200, Ivan Voras wrote:
 Except that the file in question should be, judging by the filename, a
 simple text file. I don't really see how a whitelist could grow to such
 monstrous sizes :) Most likely it's a file system corruption - fsck
 should be the first thing to try.
 The 'vscan' user leads me to assume this is SpamAssassin - I've seen this
 behaviour at work, where our scripts were trying to back up a 1TB file
 (which actually was ~vscan/.spamassassin/auto-whitelist). The result was
 that the backup script died due to lack of disk space on the backup
 server (as we don't use compression).

Yes, it is SpamAssassin and I have the same problem with the backups.
Amanda complains with:
 xxx  /  lev 1  FAILED [dump larger than available tape space, 307256178 KB, skipping incremental]
  / lev 4 FAILED [dump larger than tape, 354005617 KB, skipping incremental]

I don't think this is file system corruption, because there is no
reason for it. I have also seen this on different machines.

Any ideas on how to fix this?


 When I was investigating why the file could be so large, it turned
 out the file was only a few hundred 'real' MB, so that is why I assume
 this person is having the same issue as we do. The file is a Berkeley DB
 file, by the way, so there's nothing textfile about it ;-)
 
 I learn something every day :)
 Didn't know BDB was smart enough to create sparse files.
 





Re: Incorrect file size?

2008-06-23 Thread Eugene Grosbein
On Mon, Jun 23, 2008 at 09:56:48AM +0300, Krassimir Slavchev wrote:

 I don't think this is a file system corruption because there are no
 reasons for this. Also I have seen this on different machines.
 
 Any ideas how to fix this?

Either use backup software that is aware of sparse files
or eliminate software that uses such files :-)

Eugene


Re: Incorrect file size?

2008-06-23 Thread Jeremy Chadwick
On Mon, Jun 23, 2008 at 09:56:48AM +0300, Krassimir Slavchev wrote:
 Ivan Voras wrote:
  Rink Springer wrote:
  On Fri, Jun 20, 2008 at 03:54:22PM +0200, Ivan Voras wrote:
  Except that the file in question should be, judging by the filename, a
  simple text file. I don't really see how a whitelist could grow to such
  monstrous sizes :) Most likely it's a file system corruption - fsck
  should be the first thing to try.
  The 'vscan' user leads me to assume this is SpamAssassin - I've seen this
  behaviour at work, where our scripts were trying to back up a 1TB file
  (which actually was ~vscan/.spamassassin/auto-whitelist). The result was
  that the backup script died due to lack of disk space on the backup
  server (as we don't use compression).

I've used SpamAssassin for a few years now, and I've never seen this
happen (including during migration from RELENG_6 to RELENG_7, and
between SpamAssassin versions, from 3.0 to 3.2.x).  I cannot even begin
to imagine how that file reached such a size, which is why I believe the
issue may be filesystem corruption.

I will take a moment to point out something I do whenever SpamAssassin
gets upgraded, though: I always delete bayes_* and auto-whitelist files
from .spamassassin, on every account.  I do this because there have been
cases in the past where the data format has changed in the DB, and SA
hasn't done a good job of seamlessly migrating them.

 Yes, it is SpamAssassin and I have the same problem with the backups.
 Amanda complains with:
  xxx  /  lev 1  FAILED [dump larger than available tape space, 307256178 KB, skipping incremental]
   / lev 4 FAILED [dump larger than tape, 354005617 KB, skipping incremental]

There isn't any way to solve this problem.  The file, as far as dump is
concerned, is indeed 3TB.

 I don't think this is a file system corruption because there are no
 reasons for this. Also I have seen this on different machines.

I can point you to all of the threads in the past year where users have
had mysterious problems with their filesystems, which have been fixed by
booting single-user and fsck'ing.  That may not be the problem here,
but it doesn't hurt to do it just in case, now does it?

If it happens on multiple machines, I've only two recommendations:

1) Try what I said above (delete the bayes_* and auto-whitelist files).
If SpamAssassin recreates them as 3TB, then there is indeed a problem
with your system(s), because mine do not do that.

2) Consider migrating to dspam instead, which supposedly has a much
greater success rate of blocking spam.

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |



Sysctl knob(s) to set TCP 'nagle' time-out?

2008-06-23 Thread Jerahmy Pocott

Hi,

I'm wondering if anything exists to set this. When you create an INET
socket without the 'TCP_NODELAY' flag, the network layer does 'Nagling'
on your transmitted data. Sometimes with hosts that use delayed ACK
(net.inet.tcp.delayed_ack) it creates a dead-lock where the host will not
ACK until it gets another packet and the client will not send another
packet until it gets an ACK.

The dead-lock gets broken by a time-out, which I think is around 200ms?

But I would like to change that time-out, if possible, to something
lower, yet I can't really see any sysctl knobs that have a name that
suggests they do that.

So does anyone know IF this can be tuned and if so by what?

Cheers,
Jerahmy.

(And yes, you could solve it by setting the TCP_NODELAY flag on the
socket, but not everything has programmed-in options to set it and you
don't always have access to the source. Besides, setting a sysctl value
would be much simpler than recompiling stuff.)


Patch Failures during Portupgrade

2008-06-23 Thread Warren Liddell
Below is the list of ports that continually fail to upgrade due to problems
with patching. I use the command portupgrade -aDkp -m BATCH=yes

Any assistance with this would be greatly appreciated.


** Listing the failed packages (-:ignored / *:skipped / !:failed)
! lang/ruby18 (ruby-1.8.6.111_1,1) (patch error)
! databases/mysql50-client (mysql-client-5.0.45_1) (patch error)
! graphics/OpenEXR (OpenEXR-1.6.0) (patch error)
! devel/libvolume_id (libvolume_id-0.75.0_1) (patch error)
! graphics/dri (dri-7.0.1,2) (install error)
! net/samba-libsmbclient (samba-libsmbclient-3.0.28) (patch error)
! graphics/libGLU (libGLU-7.0.1) (patch error)
! x11-fonts/xfs (xfs-1.0.5,1) (patch error)
! x11/xorg-libraries (xorg-libraries-7.3_1) (unknown build error)
! x11-toolkits/gtk20 (gtk-2.12.1_1) (patch error)
! irc/xchat (xchat-2.8.4_3) (patch error)
! sysutils/hal (hal-0.5.8.20070909) (configure error)
! x11/kdebase3 (kdebase-3.5.8) (patch error)
! x11-servers/xorg-server (xorg-server-1.4_3,1) (patch error)
! x11-drivers/xorg-drivers (xorg-drivers-7.3) (uninstall error)



=== Patching for ruby-1.8.6.111_2,1
=== Applying FreeBSD patches for ruby-1.8.6.111_2,1
Ignoring previously applied (or reversed) patch.
1 out of 1 hunks ignored--saving rejects to ext/tk/tkutil/extconf.rb.rej
= Patch patch-ext_tk_tkutil_extconf.rb failed to apply cleanly.
*** Error code 1

Stop in /usr/ports/lang/ruby18.
*** Error code 1

Stop in /usr/ports/lang/ruby18.


Re: Sysctl knob(s) to set TCP 'nagle' time-out?

2008-06-23 Thread Matthew Dillon

:Hi,
:
:I'm wondering if anything exists to set this.. When you create an INET
:socket without the 'TCP_NODELAY' flag the network layer does 'Nagling' on
:your transmitted data. Sometimes with hosts that use Delayed_ACK
:(net.inet.tcp.delayed_ack) it creates a dead-lock where the host will not
:ACK until it gets another packet and the client will not send another
:packet until it gets an ACK..
:
:The dead-lock gets broken by a time-out, which I think is around 200ms?
:
:But I would like to change that time-out if possible to something lower,
:yet I can't really see any sysctl knobs that have a name that suggests
:they do that..
:
:So does anyone know IF this can be tuned and if so by what?
:
:Cheers,
:Jerahmy.
:
:(And yes you could solve it by setting the TCP_NODELAY flag on the socket,
:but not everything has programmed in options to set it and you don't
:always have access to the source, besides setting a sysctl value would be
:much simpler than recompiling stuff)

There is a sysctl which adjusts the delayed-ack timing; it's
called net.inet.tcp.delacktime.  The default is 1/10 of a second
(100 == 100 ms = 1/10 of a second).

BUT, it shouldn't be possible for nagle to deadlock against delayed acks
unless the TCP implementation is broken somehow.  A delayed ack is
simply that... the ack is delayed 100 ms in order to improve its
chances of being piggy-backed on return data.  The ack is not blocked
completely, just delayed, and certain events (such as the receiving
end turning around and sending data back, which is typical for an
interactive connection)... certain events will cause the delayed ack
to be aborted and for the ack to be immediately sent with the return data.

Can it break down and cause excessive lag?  Yes, it can.  Interactive
games almost universally have to disable Nagle because the lag is
actually due to the data relay from client 1 -> server, then relaying
the interactive event to client 2.  Without an immediate interactive
response to client 1 the ack gets delayed and the next event from
client 1 hits Nagle and stops dead in the water until the first event
reaches client 2 and client 2 reacts to it (then client 2 -> server ->
(abort delayed ack and send) -> client 1; client 1's Nagle now allows
the second event to be transmitted).  That isn't a deadlock, just
really poor interactive performance in that particular situation.

Delayed acks also have a safety valve.  The spec says that an ack
cannot be delayed more than two packets.  In a batch link when the
second (unacked) packet is received, the delayed ack is aborted and
an ack is immediately returned to the sender.  This is to prevent
congestion control (which is based on acks) from getting completely
out of whack and also to prevent the TCP window from getting exhausted.

In any case, the usual solution is to disable Nagle rather than mess
with delayed acks.  What we need is a new Nagle that understands the
new reality for interactive connections... something that doesn't break
performance in the 'server in the middle' data relaying case.
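
For readers who do control the application source, disabling Nagle is a one-line per-socket option. A minimal sketch in Python follows; the helper name make_nodelay_socket is made up for illustration, and the socket is never connected, so this only shows that the option is set and can be read back.

```python
import socket


def make_nodelay_socket() -> socket.socket:
    """Create a TCP socket with Nagle's algorithm disabled."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # TCP_NODELAY is a per-socket option at the IPPROTO_TCP level;
    # a non-zero value turns Nagle off for this socket only.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


sock = make_nodelay_socket()
# Read the option back to confirm the kernel accepted it.
nodelay_enabled = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0
sock.close()
```

Because the option lives on the socket, it has to be applied by each program that wants it, which is exactly the limitation raised in the original question.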

-Matt



Re: Incorrect file size?

2008-06-23 Thread Krassimir Slavchev

Jeremy Chadwick wrote:
 On Mon, Jun 23, 2008 at 09:56:48AM +0300, Krassimir Slavchev wrote:
 Ivan Voras wrote:
 Rink Springer wrote:
 On Fri, Jun 20, 2008 at 03:54:22PM +0200, Ivan Voras wrote:
 Except that the file in question should be, judging by the filename, a
 simple text file. I don't really see how a whitelist could grow to such
 monstrous sizes :) Most likely it's a file system corruption - fsck
 should be the first thing to try.
 The 'vscan' user leads me to assume this is SpamAssassin - I've seen this
 behaviour at work, where our scripts were trying to back up a 1TB file
 (which actually was ~vscan/.spamassassin/auto-whitelist). The result was
 that the backup script died due to lack of disk space on the backup
 server (as we don't use compression).
 
 I've used SpamAssassin for a few years now, and I've never seen this
 happen (including during migration from RELENG_6 to RELENG_7, and
 between SpamAssassin versions, from 3.0 to 3.2.x).  I cannot even begin
 to imagine how that file reached such a size, which is why I believe the
 issue may be filesystem corruption.
 
 I will take a moment to point out something I do whenever SpamAssassin
 gets upgraded, though: I always delete bayes_* and auto-whitelist files
 from .spamassassin, on every account.  I do this because there have been
 cases in the past where the data format has changed in the DB, and SA
 hasn't done a good job of seamlessly migrating them.

I have made several upgrades but have never deleted these files.
 
 Yes, it is SpamAssassin and I have the same problem with the backups.
 Amanda complains with:
  xxx  /  lev 1  FAILED [dump larger than available tape space, 307256178 KB, skipping incremental]
   / lev 4 FAILED [dump larger than tape, 354005617 KB, skipping incremental]
 
 There isn't any way to solve this problem.  The file, as far as dump is
 concerned, is indeed 3TB.
 
 I don't think this is a file system corruption because there are no
 reasons for this. Also I have seen this on different machines.
 
 I can point you to all of the threads in the past year where users have
 had mysterious problems with their filesystems, which have been fixed by
 booting single-user and fsck'ing.  That may not be the problem here,
 but it doesn't hurt to do it just in case, now does it?
 
 If it happens on multiple machines, I've only two recommendations:
 
 1) Try what I said above (delete the bayes_* and auto-whitelist files).
 If SpamAssassin recreates them as 3TB, then there is indeed a problem
 with your system(s), because mine do not do that.

I have deleted these files and SpamAssassin recreated them
correctly (16384 bytes initial size).
 
 2) Consider migrating to dspam instead, which supposedly has a much
 greater success rate of blocking spam.

Yes, I will do this ASAP. Thanks for the pointer.
 

Thanks for all the responses!

Best Regards




Re: Incorrect file size?

2008-06-23 Thread David Malone
On Mon, Jun 23, 2008 at 12:19:28AM -0700, Jeremy Chadwick wrote:
 I've used SpamAssassin for a few years now, and I've never seen this
 happen (including during migration from RELENG_6 to RELENG_7, and
 between SpamAssassin versions, from 3.0 to 3.2.x).  I cannot even begin
 to imagine how that file reached such a size, which is why I believe the
 issue may be filesystem corruption.

I've seen SpamAssassin create large sparse files regularly and do
other strange things, like getting the database into a state where it
spins if it tries to clean it out. That's even when starting with a clean
database.

 There isn't any way to solve this problem.  The file, as far as dump is
 concerned, is indeed 3TB.

Dump does understand sparse files, but doesn't store them in the
most efficient way possible. Dumping a filesystem with a 1TB file
seems to use about 2GB.

David.

# truncate -s 1T /var/tmp/bigfile
# dump 0Lf - /var | wc -c
  DUMP: Date of this level 0 dump: Mon Jun 23 09:50:50 2008
  DUMP: Date of last level 0 dump: the epoch
  DUMP: Dumping snapshot of /dev/ad4s3d (/var) to standard output
  DUMP: mapping (Pass I) [regular files]
  DUMP: mapping (Pass II) [directories]
  DUMP: estimated 2212849 tape blocks.
  DUMP: dumping (Pass III) [directories]
  DUMP: dumping (Pass IV) [regular files]
  DUMP: DUMP: 2212776 tape blocks
  DUMP: finished in 59 seconds, throughput 37504 KBytes/sec
  DUMP: DUMP IS DONE
 2265876480
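
To put those numbers in perspective, here is a small arithmetic sketch. It assumes that `truncate -s 1T` means 2**40 bytes, and it attributes the whole dump output to the sparse file, which overstates the cost because /var holds other data as well.

```python
apparent_bytes = 1 << 40   # assumed size of /var/tmp/bigfile (1T taken as 2**40 bytes)
dump_bytes = 2265876480    # byte count reported by `wc -c` in the run above

dump_gib = dump_bytes / 2**30
fraction = dump_bytes / apparent_bytes

print(f"dump output: {dump_gib:.2f} GiB")
print(f"fraction of apparent file size: {fraction:.4%}")
```

So the dump is a small fraction of a percent of the file's apparent size, though still far more than the blocks a freshly truncated file actually occupies.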



Re: Sysctl knob(s) to set TCP 'nagle' time-out?

2008-06-23 Thread David Malone
On Mon, Jun 23, 2008 at 05:25:49PM +1000, Jerahmy Pocott wrote:
 So does anyone know IF this can be tuned and if so by what?

You can tune it with net.inet.tcp.delacktime - it should be in ms.

David.


Re: Sysctl knob(s) to set TCP 'nagle' time-out?

2008-06-23 Thread Jerahmy Pocott


On 23/06/2008, at 6:27 PM, Matthew Dillon wrote:

   Can it break down and cause excessive lag?  Yes, it can.  Interactive
   games almost universally have to disable Nagle because the lag is
   actually due to the data relay from client 1 -> server, then relaying
   the interactive event to client 2.  Without an immediate interactive
   response to client 1 the ack gets delayed and the next event from
   client 1 hits Nagle and stops dead in the water until the first event
   reaches client 2 and client 2 reacts to it (then client 2 -> server ->
   (abort delayed ack and send) -> client 1; client 1's Nagle now allows
   the second event to be transmitted).  That isn't a deadlock, just
   really poor interactive performance in that particular situation.

Yeah, that's what I'm talking about.

True, it's not really a dead-lock, but it's terribly slow! The
interaction can cause a 200ms delay on a LAN, as can be seen with samba
if you disable tcp_nodelay.

   In any case, the usual solution is to disable Nagle rather than mess
   with delayed acks.  What we need is a new Nagle that understands the
   new reality for interactive connections... something that doesn't break
   performance in the 'server in the middle' data relaying case.

Exactly, there is nothing really wrong with delayed acks. But with
sysctl I CAN disable and mess with the delayed acks, but I can't seem
to do anything to Nagle.

That's why I was thinking that if I could change the Nagle time-out to
0ms it would effectively disable it.

Cheers.
J.


Re: Sysctl knob(s) to set TCP 'nagle' time-out?

2008-06-23 Thread Jerahmy Pocott


On 23/06/2008, at 7:00 PM, David Malone wrote:


On Mon, Jun 23, 2008 at 05:25:49PM +1000, Jerahmy Pocott wrote:

So does anyone know IF this can be tuned and if so by what?


You can tune it with net.inet.tcp.delacktime - it should be in ms.


Yeah I saw that one. But that only changes the delayed ack...

The default value of 100ms seems fairly reasonable unless you're
talking about a LAN..

I guess what I really want to do is disable Nagle in the TCP stack, but
since you do that with a setsockopt call on a per-socket basis I'm
guessing there isn't any system-wide tunable for it.

Thanks,
Jerahmy.


Re: Sysctl knob(s) to set TCP 'nagle' time-out?

2008-06-23 Thread Stefan Eßer

Matthew Dillon wrote:

In any case, the usual solution is to disable Nagle rather than mess
with delayed acks.  What we need is a new Nagle that understands the
new reality for interactive connections... something that doesn't break
performance in the 'server in the middle' data relaying case.


One possibility I see is a statistic about DelACKs per TCP connection,
counting those that were rightfully delayed (with hindsight). I.e.,
if an ACK is delayed, but there was no chance to piggy-back it or to
combine it with another ACK, it could have been sent without delay.
Only those delayed ACKs that reduce load are good, all others cause
additional state to be maintained and may increase latencies for no
good reason.

Therefore, I thought about starting with Nagle enabled, but give up
on delaying ACKs, when doing so is found to be ineffective.

The only problem with this approach is that once TCP_NODELAY is
implicitly set due to measured behavior of the communication, a
situation that would benefit from delayed ACKs can no longer be
detected. (Well, you could measure the delay between an ACK and
the next data sent to the same destination, and disable TCP_NODELAY
if ACKs could have been piggy-backed on data packets without too
much delay. Maybe we could really have TCP auto-tune with respect
to use of delayed ACKs ...)

I had suggested this years back, when the issue was discussed, but
the consensus was that you should just set TCP_NODELAY. But automatic
adjustment could also (implicitly) take RTT and window size into
consideration. And to me, automatic setting of TCP_NODELAY seems
more useful than automatic clearing (after delayed ACKs had been
found to be of no use for a window of say 8 or 16 ACKs).

The implementation would be quite simple: Whenever a delayed ACK
is sent, check whether it is sent on its own (bad) or whether it
could be piggy-backed (good). If, say, 7 of 8 delayed ACKs had to
be sent as ACK-only packets, anyway, set TCP_NODELAY and do not
bother to keep on deciding whether delayed ACKs had become useful
in a different phase of the communication. If you want to be able
to automatically disable TCP_NODELAY, then just set a time-stamp
whenever an ACK is sent and when the next data is sent through
this same socket, check whether delaying the ACK would have allowed it
to be sent with that data packet (i.e. the delay was less than the
maximum hold time of the delayed ACK). If it had been beneficial
to delay ACKs (say 3 out of a window of 4) then clear TCP_NODELAY.

I have no idea whether SMP locking would be problematic, but I
guess the checks and counter updates could be put in sections
that are appropriately locked, anyway.
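
The bookkeeping described above can be sketched as a toy model. This is only an illustration in Python, not kernel code and not an existing FreeBSD interface: the class name DelayedAckTuner and its methods are hypothetical, the window sizes and thresholds are the examples from the text (7 of 8, 3 of 4), the 0.1 s hold time is an assumption taken from the default delacktime mentioned earlier in the thread, and a single delay_acks flag stands in for the per-connection setting.

```python
from collections import deque


class DelayedAckTuner:
    """Toy model of a per-connection delayed-ACK auto-tuning heuristic."""

    def __init__(self, off_window=8, off_threshold=7,
                 on_window=4, on_threshold=3, max_hold=0.1):
        self.delay_acks = True        # start with delayed ACKs enabled
        self.max_hold = max_hold      # assumed max delayed-ACK hold time, seconds
        self.off_threshold = off_threshold
        self.on_threshold = on_threshold
        # Outcome of recent delayed ACKs: True = piggy-backed on data.
        self._delayed = deque(maxlen=off_window)
        # Outcome of recent immediate ACKs: True = data followed soon
        # enough that a delayed ACK could have ridden along with it.
        self._immediate = deque(maxlen=on_window)

    def record_delayed_ack(self, piggybacked: bool) -> None:
        """Call when a delayed ACK finally goes out."""
        if not self.delay_acks:
            return
        self._delayed.append(piggybacked)
        window_full = len(self._delayed) == self._delayed.maxlen
        if window_full and self._delayed.count(False) >= self.off_threshold:
            # Most delayed ACKs went out alone: delaying bought nothing.
            self.delay_acks = False
            self._immediate.clear()

    def record_immediate_ack(self, gap_to_next_data: float) -> None:
        """Call with the time between an undelayed ACK and the next data sent."""
        if self.delay_acks:
            return
        self._immediate.append(gap_to_next_data < self.max_hold)
        window_full = len(self._immediate) == self._immediate.maxlen
        if window_full and self._immediate.count(True) >= self.on_threshold:
            # Data usually followed quickly: delaying would have helped.
            self.delay_acks = True
            self._delayed.clear()
```

Clearing the opposite window on each switch keeps stale samples from one phase of the connection from triggering an immediate flip back; that is a design choice of this sketch, not something stated in the proposal.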

Regards, STefan


Re: FreeBSD 7-STABLE deadlock!

2008-06-23 Thread John Baldwin
On Monday 16 June 2008 07:21:15 am Lev Serebryakov wrote:
 Lev Serebryakov wrote:
   It seems to be an ATA/SATA or UFS2 problem: the computer is now in a state
  where 4 iozone processes are hung in disk wait state, and I cannot
  cd to the filesystem being tested by iozone.
   But I can create processes, work on the system, etc., as long as I don't
  touch this filesystem.

 I can reproduce it by creating a gmirror on 5 disks (yes, not a very useful
 configuration, but I started from a non-base-system RAID5 and need to
 exclude it), a filesystem with 64KB blocks, and 4 threads of iozone with a
 mixed workload (-i 8 -+p 70).

All 5 disks are ICH9DO-based, SATA-II WD5000AAKS HDDs.

Try getting the 'ps' output from ddb.  Also, get a crash dump if you can.

-- 
John Baldwin


Re: AGP bridge detected as pcib

2008-06-23 Thread John Baldwin
On Tuesday 17 June 2008 07:50:34 am Daniel O'Connor wrote:
 Hi,
 I have an Epox 8HDAIPRO motherboard -
 http://www.epox.com/usA/product.asp?ID=EP-8HDAIPRO and its AGP slot is
 detected as pcib rather than agp as I would expect. I do have agp in the
 kernel -

In 7.0 agp0 will be a child device of the hostbX device.  pciconf -lcv might 
be useful.

-- 
John Baldwin


Re: CLARITY re: challenge: end of life for 6.2 is premature with buggy 6.3

2008-06-23 Thread John Baldwin
On Sunday 08 June 2008 07:49:35 am Andy Kosela wrote:
[ much snippage.. ]

 there is time to rethink FreeBSD's overall strategy and goals. Major
 companies using FreeBSD in their infrastructure like Yahoo! or Juniper
 Networks would definitely benefit from such moves focused on long term
 support of stable releases. I honestly think it is in their interest to
 support, even financially

FWIW, Yahoo! tracks -stable branches, not point releases.

-- 
John Baldwin


Re[2]: FreeBSD 7-STABLE deadlock!

2008-06-23 Thread Lev Serebryakov
Hello, John.
You wrote on 23 June 2008, 18:47:33:

 On Monday 16 June 2008 07:21:15 am Lev Serebryakov wrote:
 Lev Serebryakov wrote:
   It seems to be an ATA/SATA or UFS2 problem: the computer is now in a state
  where 4 iozone processes are hung in disk wait state, and I cannot
  cd to the filesystem being tested by iozone.
   But I can create processes, work on the system, etc., as long as I don't
  touch this filesystem.

 I can reproduce it by creating a gmirror on 5 disks (yes, not a very useful
 configuration, but I started from a non-base-system RAID5 and need to
 exclude it), a filesystem with 64KB blocks, and 4 threads of iozone with a
 mixed workload (-i 8 -+p 70).

All 5 disks are ICH9DO-based, SATA-II WD5000AAKS HDDs.
 Try getting the 'ps' output from ddb.  Also, get a crash dump if you can.
  It was tracked down to a known deadlock in the buffer allocator that occurs
 when the buffer map is fragmented (thanks to [EMAIL PROTECTED]). The
 workaround is known: don't use filesystems with 16KB and 64KB blocks on the
 same system at the same time. A 16/32 mixture works well :)


-- 
// Black Lion AKA Lev Serebryakov [EMAIL PROTECTED]



Failure building apache22 and mysql51

2008-06-23 Thread Sorin Pânca

Hello people!
I recently upgraded an amd64 machine from FreeBSD-6.2-RELEASE-p11 to
FreeBSD-7.0-RELEASE-p2 using the tutorial found at
http://www.daemonology.net/blog/2007-11-11-freebsd-major-version-upgrade.html

All went well with the base system.
Now with the ports...
mysql51-server was installed and I noticed it fails to start (I mean it
dies within a few seconds after the restart).
First I tried running portupgrade -af a couple of times, but after
compiling everything I was still out of luck. Then I tried portupgrade -aPPR,
also to no avail.




Then:
# cd /usr/ports/databases/mysql51-server
# portupgrade -f
---  Reinstalling 'mysql-server-5.1.25' (databases/mysql51-server)
---  Building '/usr/ports/databases/mysql51-server'
===  Cleaning for mysql-server-5.1.25
===  Vulnerability check disabled, database not found

You may use the following build options:

WITH_CHARSET=charset    Define the primary built-in charset (latin1).
WITH_XCHARSET=list      Define other built-in charsets (may be 'all').
WITH_COLLATION=collate  Define default collation (latin1_swedish_ci).
WITH_OPENSSL=yes        Enable secure connections.
WITH_LINUXTHREADS=yes   Use the linuxthreads pthread library.
WITH_PROC_SCOPE_PTH=yes Use process scope threads
                        (try it if you use libpthread).
BUILD_OPTIMIZED=yes     Enable compiler optimizations
                        (use it if you need speed).
BUILD_STATIC=yes        Build a static version of mysqld.
                        (use it if you need even more speed).
WITH_NDB=yes            Enable support for NDB Cluster.

===  Extracting for mysql-server-5.1.25
= MD5 Checksum OK for mysql-5.1.25-rc.tar.gz.
= SHA256 Checksum OK for mysql-5.1.25-rc.tar.gz.
[ ... ]
(cd .libs  rm -f libndb.la  ln -s ../libndb.la libndb.la)
if /usr/local/bin/libtool --preserve-dup-deps --tag=CC --mode=compile cc 
-DMYSQL_SERVER -DDEFAULT_MYSQL_HOME=\/usr/local\ 
-DDATADIR=\/var/db/mysql\ -DSHAREDIR=\/usr/local/share/mysql\ 
-DPLUGINDIR=\/usr/local/lib/mysql/plugin\ -DHAVE_EVENT_SCHEDULER 
-DHAVE_CONFIG_H -I. -I. -I../include -I../include -I../include 
-I../regex -I. -O2 -fno-strict-aliasing -pipe -MT udf_example.lo 
-MD -MP -MF .deps/udf_example.Tpo -c -o udf_example.lo udf_example.c; \
then mv -f .deps/udf_example.Tpo .deps/udf_example.Plo; 
else rm -f .deps/udf_example.Tpo; exit 1; fi
 cc -DMYSQL_SERVER -DDEFAULT_MYSQL_HOME=\/usr/local\ 
-DDATADIR=\/var/db/mysql\ -DSHAREDIR=\/usr/local/share/mysql\ 
-DPLUGINDIR=\/usr/local/lib/mysql/plugin\ -DHAVE_EVENT_SCHEDULER 
-DHAVE_CONFIG_H -I. -I. -I../include -I../include -I../include 
-I../regex -I. -O2 -fno-strict-aliasing -pipe -MT udf_example.lo -MD -MP 
-MF .deps/udf_example.Tpo -c udf_example.c  -fPIC -DPIC -o 
.libs/udf_example.o
 cc -DMYSQL_SERVER -DDEFAULT_MYSQL_HOME=\/usr/local\ 
-DDATADIR=\/var/db/mysql\ -DSHAREDIR=\/usr/local/share/mysql\ 
-DPLUGINDIR=\/usr/local/lib/mysql/plugin\ -DHAVE_EVENT_SCHEDULER 
-DHAVE_CONFIG_H -I. -I. -I../include -I../include -I../include 
-I../regex -I. -O2 -fno-strict-aliasing -pipe -MT udf_example.lo -MD -MP 
-MF .deps/udf_example.Tpo -c udf_example.c -o udf_example.o /dev/null 21
/usr/local/bin/libtool --preserve-dup-deps --tag=CC --mode=link cc  -O2 
-fno-strict-aliasing -pipe   -o udf_example.la  -module -rpath 
/usr/local/lib/mysql udf_example.lo  -pthread -lcrypt -lm  -pthread
cc -shared  .libs/udf_example.o  -pthread -lcrypt -lm -pthread  -pthread 
-pthread -pthread -pthread -pthread -pthread -Wl,-soname 
-Wl,udf_example.so.0 -o .libs/udf_example.so.0
/usr/bin/ld: /usr/lib/libpthread.a(thr_mutex.o): relocation R_X86_64_32 
can not be used when making a shared object; recompile with -fPIC

/usr/lib/libpthread.a: could not read symbols: Bad value
gmake[3]: *** [udf_example.la] Error 1
gmake[3]: Leaving directory 
`/usr/ports/databases/mysql51-server/work/mysql-5.1.25-rc/sql'

gmake[2]: *** [all-recursive] Error 1
gmake[2]: Leaving directory 
`/usr/ports/databases/mysql51-server/work/mysql-5.1.25-rc/sql'

gmake[1]: *** [all] Error 2
gmake[1]: Leaving directory 
`/usr/ports/databases/mysql51-server/work/mysql-5.1.25-rc/sql'

gmake: *** [all-recursive] Error 1
*** Error code 2

Stop in /usr/ports/databases/mysql51-server.
** Command failed [exit code 1]: /usr/bin/script -qa 
/tmp/portupgrade.94200.0 env UPGRADE_TOOL=portupgrade 
UPGRADE_PORT=mysql-server-5.1.25 UPGRADE_PORT_VER=5.1.25 make

** Fix the problem and try again.
** Listing the failed packages (-:ignored / *:skipped / !:failed)
! databases/mysql51-server (mysql-server-5.1.25)  (new compiler error)

and again, with portupgrade -N www/apache22 this time:
[ ... ]
/bin/sh /usr/ports/www/apache22/work/httpd-2.2.8/srclib/apr/libtool 
--silent --mode=compile cc   -O2 -fno-strict-aliasing -pipe 
-I/usr/local/include/mysql -DHAVE_MYSQL_H -I/usr/include -DHAVE_CONFIG_H 
   -I./include 

Re: FreeBSD 7-STABLE deadlock!

2008-06-23 Thread Kris Kennaway

Lev Serebryakov wrote:

Hello, John.
You wrote on 23 June 2008, 18:47:33:


On Monday 16 June 2008 07:21:15 am Lev Serebryakov wrote:

Lev Serebryakov wrote:

 It seems to be an ATA/SATA or UFS2 problem: the computer is now in a state
where 4 iozone processes are hung in disk-wait state, and I cannot
cd to the filesystem that iozone is testing.
 But I can still create processes, work on the system, etc., as long as I
don't touch this filesystem.

   I can reproduce it by creating a gmirror on 5 disks (not a very useful
configuration, I know, but I started from a non-base-system RAID5 and need to
rule it out), an FS with 64 KB blocks, and 4 threads of iozone with a mixed
workload (-i 8 -+p 70).

   All 5 disks are SATA-II WD5000AAKS HDDs on an ICH9DO controller.

Try getting the 'ps' output from ddb.  Also, get a crash dump if you can.

  It was tracked down to a known deadlock in the buffer allocator that occurs
 when the buffer map is fragmented (thanks to [EMAIL PROTECTED]). The workaround
 is known: don't use filesystems with 16 KB and 64 KB blocks on the same system
 at the same time. A 16/32 mixture works well :)




Is there a PR filed with this bug?  Having the specific information 
recorded will be very useful.


Kris

___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re[2]: FreeBSD 7-STABLE deadlock!

2008-06-23 Thread Lev Serebryakov
Hello, Kris.
You wrote on 23 June 2008, 19:56:14:

 Is there a PR filed with this bug?  Having the specific information
 recorded will be very useful.
 Kostik (kib@) says that I don't need to file a PR for this issue...

-- 
// Black Lion AKA Lev Serebryakov [EMAIL PROTECTED]



Re: Sysctl knob(s) to set TCP 'nagle' time-out?

2008-06-23 Thread Matthew Dillon

:One possibility I see is a statistic about DelACKs per TCP connection,
:counting those that were rightfully delayed (with hindsight). I.e.,
:if an ACK is delayed, but there was no chance to piggy-back it or to
:combine it with another ACK, it could have been sent without delay.
:Only those delayed ACKs that reduce load are good, all others cause
:additional state to be maintained and may increase latencies for no
:good reason.
:
:...
:consideration. And to me, automatic setting of TCP_NODELAY seems
:more useful than automatic clearing (after delayed ACKs had been
:found to be of no use for a window of say 8 or 16 ACKs).
:
:The implementation would be quite simple: Whenever a delayed ACK
:is sent, check whether it is sent on its own (bad) or whether it
:could be piggy-backed (good). If, say, 7 of 8 delayed ACKs had to
:be sent as ACK-only packets, anyway, set TCP_NODELAY and do not
:bother to keep on deciding whether delayed ACKs had become useful
:in a different phase of the communication. If you want to be able
:to automatically disable TCP_NODELAY, then just set a time-stamp
:...
:Regards, STefan

That's an interesting approach.  I think it would catch some
of the cases, but not enough of them.  If the round-trip in
the server-relaying case is less than the delayed-ack time, the acks
will still wind up piggy-backed on return traffic but the latency
will also still remain horrible.
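
For reference, the quoted heuristic could be sketched roughly as below.
This is only a toy model in Python, not kernel code: the class and method
names are made up for illustration, and the window of 8 with a threshold
of 7 is simply taken from the figures in the quoted mail.

```python
from collections import deque


class DelayedAckTracker:
    """Toy model of the quoted heuristic: track whether recent delayed ACKs
    were piggy-backed (useful) or sent on their own (wasted delay)."""

    def __init__(self, window=8, threshold=7):
        self.window = window          # number of recent delayed ACKs examined
        self.threshold = threshold    # ACK-only count that triggers the switch
        self.history = deque(maxlen=window)
        self.nodelay = False          # stands in for "set TCP_NODELAY"

    def record_delayed_ack(self, piggybacked):
        """Record one delayed ACK; return True once the switch has been made."""
        if self.nodelay:
            return True               # per the proposal, stop evaluating
        self.history.append(piggybacked)
        ack_only = sum(1 for p in self.history if not p)
        if len(self.history) == self.window and ack_only >= self.threshold:
            self.nodelay = True
        return self.nodelay
```

In the server-relaying case described above, every call would pass
piggybacked=True, so the switch would never trip even though the latency
is poor.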

It should be noted that Nagle can cause high latencies even when
delayed acks are turned off.  Nagle's delay is not timed... in its
simplest description it prevents packets from being transmitted
for new data coming from userland if the data already in the
sockbuf (and presumably already transmitted) has not yet been
acknowledged.

For interactive traffic this means that Nagle is putting the screws
on the packet stream even if the acks aren't delayed, simply from the
ack latency.  With delayed acks turned off the latency is lower, but
not 0, so interactive traffic is still being held up by Nagle.  The
effect is noticeable even on a LAN.  Jerahmy brought up Samba... that
is an excellent example.  NFS-over-TCP would be another good example.

Any protocol which multiplexes multiple commands from different
sources over the same connection gets really messed up (slowed down)
by Nagle.

On the flip side, Nagle can't just be turned off by default because
it would cause streaming connections from user programs which do tiny
writes to generate a lot of unnecessarily tiny packets.  This can become
apparent when using SSH over a slow link.  Numerous programs run from
a shell generate fairly inefficient packets which could have easily
been batched when operating over SSH.  The result can be sludgy
performance for output which ought to be batched up by TCP but isn't because
SSH turns off Nagle unconditionally.
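
For an application that wants to opt out per connection, a minimal sketch
in Python of turning Nagle off on one socket and batching small writes in
userland instead. The two helper names are hypothetical; TCP_NODELAY itself
is the standard socket option.

```python
import socket


def disable_nagle(sock):
    """Turn off Nagle's algorithm on one TCP socket."""
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)


def send_batched(sock, chunks):
    """Join many small writes into a single send so that, with Nagle off,
    they do not each go out as a tiny packet."""
    sock.sendall(b"".join(chunks))
```

The idea is that a program which disables Nagle takes over the batching
itself, rather than handing the kernel a stream of tiny writes.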

-Matt
Matthew Dillon 
[EMAIL PROTECTED]


Re: FreeBSD 7-STABLE deadlock!

2008-06-23 Thread Kris Kennaway

Lev Serebryakov wrote:

Hello, Kris.
You wrote on 23 June 2008, 19:56:14:


Is there a PR filed with this bug?  Having the specific information
recorded will be very useful.

 Kostik (kib@) says that I don't need to file a PR for this issue...



OK, that is good enough for me :)

Kris


Re: Failure building apache22 and mysql51

2008-06-23 Thread Jeremy Chadwick
On Mon, Jun 23, 2008 at 06:43:04PM +0300, Sorin Pânca wrote:
 Hello people!
 I recently upgraded a amd64 machine from FreeBSD-6.2-RELEASE-p11 to  
 FreeBSD-7.0-RELEASE-p2 using the tutorial found at  
 http://www.daemonology.net/blog/2007-11-11-freebsd-major-version-upgrade.html
 All went well with the base system.

I'm doubting that greatly.

Both ports you're trying to build rely on pthread, and both die in the
same way:

 /usr/bin/ld: /usr/lib/libpthread.a(thr_mutex.o): relocation R_X86_64_32 can 
 not be used when making a shared object; recompile with -fPIC
 /usr/lib/libpthread.a: could not read symbols: Bad value

 /usr/bin/ld: /usr/lib/libpthread.a(thr_syscalls.o): relocation R_X86_64_32S 
 can not be used when making a shared object; recompile with -fPIC
 /usr/lib/libpthread.a: could not read symbols: Bad value
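
To pull the offending objects out of a long build log, something along
these lines would do. It is only a sketch in Python: the function name and
the regular expression are illustrative, and it assumes the log wraps its
lines the way the excerpts above do.

```python
import re

# Matches GNU ld's complaint about a non-PIC object being linked
# into a shared object.
PIC_ERROR = re.compile(
    r"(?P<archive>\S+)\((?P<member>[^)]+)\): relocation (?P<reloc>\S+) "
    r"can not be used when making a shared object"
)


def find_non_pic_objects(log_text):
    """Return sorted (archive, member, relocation) tuples found in a build log."""
    # Collapse wrapped lines so a message split across lines still matches.
    flat = " ".join(log_text.split())
    return sorted({(m["archive"], m["member"], m["reloc"])
                   for m in PIC_ERROR.finditer(flat)})
```

On the two excerpts above it reports /usr/lib/libpthread.a with thr_mutex.o
and thr_syscalls.o, i.e. the static pthread archive is being pulled into a
shared link.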

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |



Re: Failure building apache22 and mysql51

2008-06-23 Thread Sorin Pânca

Jeremy Chadwick wrote:

On Mon, Jun 23, 2008 at 06:43:04PM +0300, Sorin Pânca wrote:

Hello people!
I recently upgraded a amd64 machine from FreeBSD-6.2-RELEASE-p11 to  
FreeBSD-7.0-RELEASE-p2 using the tutorial found at  
http://www.daemonology.net/blog/2007-11-11-freebsd-major-version-upgrade.html

All went well with the base system.


I'm doubting that greatly.

Well... they compiled successfully twice (once by doing portupgrade -a, 
and once by doing portupgrade -af; there were some complaints about 
some Perl modules that we installed via cpan) and then a portupgrade 
-aPPR finished successfully...
Anyway, how do I check that everything is clean and working, or how do I 
recompile the pthread part (library) of the system to ensure it's linked 
against the proper libraries?



Both ports you're trying to build rely on pthread, and both die in the
same way:


/usr/bin/ld: /usr/lib/libpthread.a(thr_mutex.o): relocation R_X86_64_32 can not 
be used when making a shared object; recompile with -fPIC
/usr/lib/libpthread.a: could not read symbols: Bad value

/usr/bin/ld: /usr/lib/libpthread.a(thr_syscalls.o): relocation R_X86_64_32S can 
not be used when making a shared object; recompile with -fPIC
/usr/lib/libpthread.a: could not read symbols: Bad value



I agree. That's why I posted both of them.

Is my question good or bad? I'm not a programmer, so I really try hard 
to understand what is happening...


Sorin.


Re: AMD Geode LX crypto accelerator (glxsb)

2008-06-23 Thread Patrick Lamaizière
On Sun, 22 Jun 2008 21:20:02 +0200,
Ivan Voras [EMAIL PROTECTED] wrote:

Hi, 

 The 'numbers' are in 1000s of bytes per second processed.
 type            16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
 aes-128 cbc     5359.57k     5577.49k     5654.53k     5639.81k     5679.65k
 aes-128-cbc      394.62k     1471.97k     5457.89k    15097.21k    25895.72k

I've got the same results. The encryption of a file of 360 MBytes takes
around 20s with the hardware and 1m10s by software.
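
As a rough check of those figures, here is the arithmetic in Python. It
assumes the timings are taken at face value (20 s and 70 s) and treats
"MBytes" loosely, so the results are only approximate.

```python
FILE_MBYTES = 360
HW_SECONDS = 20
SW_SECONDS = 70  # 1m10s

hw_rate = FILE_MBYTES / HW_SECONDS   # throughput with the accelerator
sw_rate = FILE_MBYTES / SW_SECONDS   # throughput in software
speedup = SW_SECONDS / HW_SECONDS    # ratio of the two timings

print(f"hardware: {hw_rate:.1f} MB/s")
print(f"software: {sw_rate:.1f} MB/s")
print(f"speedup:  {speedup:.1f}x")
```

That works out to about 18 MB/s against about 5 MB/s, a factor of 3.5, which
is in the same range as the large-block rows of the quoted benchmark.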

I have been trying to overload my box (a Soekris net5501) with ping floods
over IPsec (hmac-md5 and rijndael) from a modern computer.

With four 'ping -f -s 3000', 'top' reports 
CPU 0.4% user 0.0 nice 1.6% system, 90.3% interrupt,  7.8% idle.

With five 'ping', top no longer runs, and the kernel stops displaying
the message 'limiting icmp ping response from 300 to 200' as well
(on the serial console).

With the hardware, I can use 8 flood pings without any problem.
Top shows 
CPU:  0.0% user,  0.0% nice, 33.5% system, 12.5% interrupt, 54.1% idle

And the kernel displays 'limiting icmp ping response from 900 to 200
packets/s' instead of '300 to 200'.

So it seems there is a real improvement.

Regards.


Re: bus_dmamem_alloc failed to align memory properly

2008-06-23 Thread Jeff Blank
On Tue, Jun 17, 2008 at 11:43:42AM -0400, Jeff Blank wrote:
 I've installed a PCI (not PCIE) video card in a FreeBSD 7-STABLE
 (20080616 ~19:00 UTC) amd64 system, and when Xorg starts, the kernel
 logs the message in the subject.  Context:
 
 Jun 17 11:27:30 bender kernel: drm0: ATI Radeon RV280 9250 on vgapci0
 Jun 17 11:27:30 bender kernel: info: [drm] Initialized radeon 1.25.0 20060524
 Jun 17 11:27:33 bender kernel: info: [drm] Setting GART location based on new 
 memory map
 Jun 17 11:27:33 bender kernel: bus_dmamem_alloc failed to align memory 
 properly.
 Jun 17 11:27:33 bender kernel: info: [drm] Loading R200 Microcode
 Jun 17 11:27:33 bender kernel: info: [drm] writeback test succeeded in 2 usecs
 Jun 17 11:27:33 bender kernel: drm0: [ITHREAD]
 
 Is this anything to worry about?  Part of the reason I ask is that the
 machine locked up hard (had to cut power) the first time I hit
 ctrl-alt-bs to kill Xorg, though I can't reproduce that reliably (or,
 really, at all just yet).

Still hoping for an answer, as the PC locked up hard a few more times,
always when trying to exit the X server (normal logout/all clients
terminated, ctl-alt-bs, init 6).  The PC is a Dell Optiplex 740.  Its
onboard video is an NVidia NVS 210S (nVidia GeForce 6150), but that is
disabled when an add-in video card is present (it wasn't present in the
'pciconf -lv' I posted last week).

What can I do to determine whether the X11 problem is
"bus_dmamem_alloc failed to align memory properly" or something
else with the card?

thanks,
Jeff


Re: FreeBSD 6.3 deadlock (vm_map?) with DDB output

2008-06-23 Thread John Baldwin
On Thursday 19 June 2008 11:57:51 am James Gritton wrote:
 John Baldwin wrote:
  On Sunday 15 June 2008 07:23:19 am Stef Walter wrote:

  I've been trying to track down a deadlock on some newish production
  servers running FreeBSD 6.3-RELEASE-p2. The deadlock occurs on a
  specific (although mundane) hardware configuration, and each of several
  servers running this hardware deadlock about once per week.
 
  Although I suspect that this is not hardware related, from a (naive)
  perusal of the attached stack traces.
 
  Forgive me if my interpretation of this is all wrong, but I'm pretty
  desperate for help. So here's my basic understanding of the deadlock:
 
  These processes seem to be waiting on the page queue mutex:
   sendmail (in vm_mmap > vm_map_find > vm_map_insert > vm_map_pmap_enter)
   bsnmpd (in malloc, uma_large_malloc > page_alloc > kmem_malloc)
   httpd (in trap > trap_pfault > vm_fault)
   [g_up] (in g_vfs_done > bufdone)
 
  The page queue mutex is held by rsync process:
   rsync (in trap > trap_pfault > vm_fault > pmap_enter)
 
  Rsync kernel process (in pmap_enter) was interrupted while holding the
  page queue lock?
 
 
  Giant is enabled in loader.conf due to the needs of the pf firewall when
  dealing with user credentials lookups. I do not believe that Giant plays
  into this deadlock. Kernel config attached.
 
  Any and all help or info is welcome. Thanks in advance.
  
 
  Try this change:
 
  jhb 2007-10-27 22:07:40 UTC
 
FreeBSD src repository
 
Modified files:
  sys/kern sched_4bsd.c
Log:
Change the roundrobin implementation in the 4BSD scheduler to trigger a
userland preemption directly from hardclock() via sched_clock() when a
thread uses up a full quantum instead of using a periodic timeout to cause
a userland preemption every so often.  This fixes a potential deadlock
when IPI_PREEMPTION isn't enabled where softclock blocks on a lock held
by a thread pinned or bound to another CPU.  The current thread on that
CPU will never be preempted while softclock is blocked.
 
Note that ULE already drives its round-robin userland preemption from
sched_clock() as well and always enables IPI_PREEMPT.
 
MFC after:  1 week
 
Revision  Changes    Path
1.108     +8 -29     src/sys/kern/sched_4bsd.c
 
  We use it at work on 6.x.  W/o this fix, round-robin stops working on 4BSD 
  when softclock() (swi4: clock) blocks on a lock like Giant.

 
 I've been seeing similar troubles on 6.2 and I'll have to give this a 
 try as we upgrade to 6.3.  I notice MFC after: 1 week in the log; it's 
 been a week - any chance of seeing this fix rolled into 6.x?

If people confirm it fixes issues I will MFC it.  There was some pushback when 
I first committed it so I waited on the MFC.
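
For anyone unfamiliar with the idea in that commit log, a toy illustration
in Python of driving round-robin from the per-tick clock handler rather
than from a separate periodic timeout. This is not the kernel code: the
class, the quantum of 10 ticks and the immediate switch are all assumptions
made to keep the example short.

```python
from collections import deque

QUANTUM_TICKS = 10  # assumed quantum length in clock ticks, illustration only


class ToyScheduler:
    """Toy model: preemption is requested from the per-tick clock handler
    instead of from a separate periodic timeout."""

    def __init__(self, threads):
        self.runq = deque(threads)
        self.current = self.runq.popleft()
        self.ticks_used = 0

    def sched_clock(self):
        """Called once per clock tick on behalf of the running thread."""
        self.ticks_used += 1
        if self.ticks_used >= QUANTUM_TICKS:
            self.preempt()

    def preempt(self):
        """Put the current thread at the tail and switch to the next one."""
        self.runq.append(self.current)
        self.current = self.runq.popleft()
        self.ticks_used = 0
```

Because the accounting happens in the tick handler itself, the rotation
does not depend on any other thread getting to run, which is the property
the commit log relies on.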

-- 
John Baldwin


Re: FreeBSD 6.3 deadlock (vm_map?) with DDB output

2008-06-23 Thread James Gritton

John Baldwin wrote:

On Thursday 19 June 2008 11:57:51 am James Gritton wrote:
  

John Baldwin wrote:


On Sunday 15 June 2008 07:23:19 am Stef Walter wrote:
  
  

I've been trying to track down a deadlock on some newish production
servers running FreeBSD 6.3-RELEASE-p2. The deadlock occurs on a
specific (although mundane) hardware configuration, and each of several
servers running this hardware deadlock about once per week.

Although I suspect that this is not hardware related, from a (naive)
perusal of the attached stack traces.

Forgive me if my interpretation of this is all wrong, but I'm pretty
desperate for help. So here's my basic understanding of the deadlock:

These processes seem to be waiting on the page queue mutex:
 sendmail (in vm_mmap > vm_map_find > vm_map_insert > vm_map_pmap_enter)
 bsnmpd (in malloc, uma_large_malloc > page_alloc > kmem_malloc)
 httpd (in trap > trap_pfault > vm_fault)
 [g_up] (in g_vfs_done > bufdone)

The page queue mutex is held by rsync process:
 rsync (in trap > trap_pfault > vm_fault > pmap_enter)

Rsync kernel process (in pmap_enter) was interrupted while holding the
page queue lock?


Giant is enabled in loader.conf due to the needs of the pf firewall when
dealing with user credentials lookups. I do not believe that Giant plays
into this deadlock. Kernel config attached.

Any and all help or info is welcome. Thanks in advance.



Try this change:

jhb 2007-10-27 22:07:40 UTC

  FreeBSD src repository

  Modified files:
sys/kern sched_4bsd.c
  Log:
  Change the roundrobin implementation in the 4BSD scheduler to trigger a
  userland preemption directly from hardclock() via sched_clock() when a
  thread uses up a full quantum instead of using a periodic timeout to cause
  a userland preemption every so often.  This fixes a potential deadlock
  when IPI_PREEMPTION isn't enabled where softclock blocks on a lock held
  by a thread pinned or bound to another CPU.  The current thread on that
  CPU will never be preempted while softclock is blocked.

  Note that ULE already drives its round-robin userland preemption from
  sched_clock() as well and always enables IPI_PREEMPT.

  MFC after:  1 week

  Revision  Changes    Path
  1.108     +8 -29     src/sys/kern/sched_4bsd.c

We use it at work on 6.x.  W/o this fix, round-robin stops working on 4BSD 
when softclock() (swi4: clock) blocks on a lock like Giant.
  
  
I've been seeing similar troubles on 6.2 and I'll have to give this a 
try as we upgrade to 6.3.  I notice MFC after: 1 week in the log; it's 
been a week - any chance of seeing this fix rolled into 6.x?



If people confirm it fixes issues I will MFC it.  There was some pushback when 
I first committed it so I waited on the MFC.


I can confirm that on 6.3 I can recreate the deadlock without the patch, 
and can't recreate it with the patch.



Re: Management interface for cards powered by the mfi driver?

2008-06-23 Thread Vivek Khera


On Jun 18, 2008, at 12:15 AM, Karl Denninger wrote:

No management tool = el-sucko, because you can't rebuild a failed disk or
even shut the alarm on the board off!


This is precisely the reason I have dropped using Adaptec  
controllers.  The most recent ones cannot be managed with the FreeBSD  
tools.


What I've ended up using is LSI Fibre Channel 4Gb cards (PCI-e for  
newer servers, but I have two with PCI-x) attached to an external RAID  
box.  My external RAIDs are custom built by Partners Data Systems and  
use an Areca controller.  All configuration and monitoring is done via  
web interface over ethernet so no proprietary software is needed.   
They fully support FreeBSD too.




Re: bus_dmamem_alloc failed to align memory properly

2008-06-23 Thread Clifton Royston
On Mon, Jun 23, 2008 at 01:53:35PM -0400, Jeff Blank wrote:
 On Tue, Jun 17, 2008 at 11:43:42AM -0400, Jeff Blank wrote:
  I've installed a PCI (not PCIE) video card in a FreeBSD 7-STABLE
  (20080616 ~19:00 UTC) amd64 system, and when Xorg starts, the kernel
  logs the message in the subject.  Context:
  
  Jun 17 11:27:30 bender kernel: drm0: ATI Radeon RV280 9250 on vgapci0
  Jun 17 11:27:30 bender kernel: info: [drm] Initialized radeon 1.25.0 
  20060524
  Jun 17 11:27:33 bender kernel: info: [drm] Setting GART location based on 
  new memory map
  Jun 17 11:27:33 bender kernel: bus_dmamem_alloc failed to align memory 
  properly.
  Jun 17 11:27:33 bender kernel: info: [drm] Loading R200 Microcode
  Jun 17 11:27:33 bender kernel: info: [drm] writeback test succeeded in 2 
  usecs
  Jun 17 11:27:33 bender kernel: drm0: [ITHREAD]
  
  Is this anything to worry about?  Part of the reason I ask is that the
  machine locked up hard (had to cut power) the first time I hit
  ctrl-alt-bs to kill Xorg, though I can't reproduce that reliably (or,
  really, at all just yet).
 
 Still hoping for an answer, as the PC locked up hard a few more times,
 always when trying to exit the X server (normal logout/all clients
 terminated, ctl-alt-bs, init 6).  The PC is a Dell Optiplex 740.  Its
 onboard video is an NVidia NVS 210S (nVidia GeForce 6150), but that is
 disabled when an add-in video card is present (it wasn't present in the
 'pciconf -lv' I posted last week).
 
 What can I do to determine whether the X11 problem is
 "bus_dmamem_alloc failed to align memory properly" or something
 else with the card?

  I can not be completely sure, but believe this may be a cross-OS
problem with Xorg support of this specific ATI card family.  I have had
a nearly identical problem with X lockups on a Debian server using the
ATI Radeon 9200 Pro (PCI) which uses the same RV280 processor and the
same X driver, with similar startup messages:

$ dmesg | grep drm
[drm] Initialized drm 1.0.1 20051102
[drm] Initialized radeon 1.25.0 20060524 on minor 0
[drm] Setting GART location based on new memory map
[drm] Loading R200 Microcode
[drm] writeback test succeeded in 2 usecs

  Next time it appears to freeze, see if you can ssh into it; my
experience on the Debian box was that this continued to work even
though the console was completely unresponsive following the freeze.

  If you don't want to spend a lot of time on this, I'd try a different
video card; I have a strong suspicion that your problems are not
FreeBSD related and would go away with that change.

  -- Clifton

-- 
Clifton Royston  --  [EMAIL PROTECTED] / [EMAIL PROTECTED]
   President  - I and I Computing * http://www.iandicomputing.com/
 Custom programming, network design, systems and network consulting services


Re: DVD-RW doesn't write

2008-06-23 Thread Jerahmy Pocott


On 11/06/2008, at 3:28 AM, Sean C. Farley wrote:


I had problems with burncd and my DVD drive when burning CD-RWs.  When
I tried atapicam and cdrecord, it gave me problems.  I believe it was
using burncd prior to atapicam that caused it because it works now if I
do not use burncd first.  You could try a reboot and use atapicam first;
the DVD drive may be in a funny state.  Just a guess.


Yes, I've noticed that too. But using atacontrol to reset the channel
that the drive is on seems to return it to a working state. Trying to
reset the device doesn't fix it (maybe it locks up the whole channel?).

I've managed to get the drive working with cdrecord. It still reports
device errors to the console and seems slower than it should be, but it
*does* produce readable DVDs, which is the main thing!

However, growisofs (well, the mkisofs part) won't write files over 2 GB?
But I've seen single archives written to span the whole DVD. Do you need
to use -iso-level 3 or 4 to get this functionality? How well supported is
reading from disks created with level 4? (Level 4 equates to
ISO-9660:1999, or version 2, according to the man page.)

Cheers.


Re: AGP bridge detected as pcib

2008-06-23 Thread Daniel O'Connor
On Tue, 24 Jun 2008, John Baldwin wrote:
 On Tuesday 17 June 2008 07:50:34 am Daniel O'Connor wrote:
  Hi,
  I have an Epox 8HDAIPRO motherboard -
  http://www.epox.com/usA/product.asp?ID=EP-8HDAIPRO and its AGP slot
  is detected as pcib rather than agp as I would expect. I do have
  agp in the kernel -

 In 7.0 agp0 will be a child device of the hostbX device.  pciconf
 -lcv might be useful.

Here you go!

[EMAIL PROTECTED]:0:0:0:  class=0x06 card=0x02821106 chip=0x02821106 
rev=0x00 hdr=0x00
vendor = 'VIA Technologies Inc'
device = 'K8T880Pro CPU to PCI Bridge'
class  = bridge
subclass   = HOST-PCI
cap 02[80] = AGP v3 SBA disabled
cap 01[50] = powerspec 2  supports D0 D3  current D0
cap 08[60] = HT slave
cap 08[58] = HT interrupt
[EMAIL PROTECTED]:0:0:1:  class=0x06 card=0x chip=0x12821106 
rev=0x00 hdr=0x00
vendor = 'VIA Technologies Inc'
device = 'K8T880Pro CPU to PCI Bridge'
class  = bridge
subclass   = HOST-PCI
[EMAIL PROTECTED]:0:0:2:  class=0x06 card=0x chip=0x22821106 
rev=0x00 hdr=0x00
vendor = 'VIA Technologies Inc'
device = 'K8T880Pro CPU to PCI Bridge'
class  = bridge
subclass   = HOST-PCI
[EMAIL PROTECTED]:0:0:3:  class=0x06 card=0x chip=0x32821106 
rev=0x00 hdr=0x00
vendor = 'VIA Technologies Inc'
device = 'K8T880Pro CPU to PCI Bridge'
class  = bridge
subclass   = HOST-PCI
[EMAIL PROTECTED]:0:0:4:  class=0x06 card=0x chip=0x42821106 
rev=0x00 hdr=0x00
vendor = 'VIA Technologies Inc'
device = 'K8T880Pro CPU to PCI Bridge'
class  = bridge
subclass   = HOST-PCI
[EMAIL PROTECTED]:0:0:7:  class=0x06 card=0x chip=0x72821106 
rev=0x00 hdr=0x00
vendor = 'VIA Technologies Inc'
device = 'K8T880Pro CPU to PCI Bridge'
class  = bridge
subclass   = HOST-PCI
[EMAIL PROTECTED]:0:1:0:   class=0x060400 card=0x chip=0xb1881106 
rev=0x00 hdr=0x01
vendor = 'VIA Technologies Inc'
device = 'VT8237 K8HTB CPU to AGP 2.0/3.0 Bridge'
class  = bridge
subclass   = PCI-PCI
cap 01[80] = powerspec 2  supports D0 D1 D3  current D0
[EMAIL PROTECTED]:0:8:0: class=0x0c0010 card=0x58c1 chip=0x58c1 
rev=0x61 hdr=0x00
vendor = 'Lucent/Agere Systems (Was: ATT MicroElectronics)'
device = 'FW322 1394A PCI PHY/Link Open Host Ctrlr I/F'
class  = serial bus
subclass   = FireWire
cap 01[44] = powerspec 2  supports D0 D1 D2 D3  current D0
[EMAIL PROTECTED]:0:9:0:   class=0x04 card=0x chip=0x036e109e 
rev=0x11 hdr=0x00
vendor = 'Conexant (Was: Brooktree Corp)'
device = 'Bt878/Fusion 878A Mediastream Controller'
class  = multimedia
subclass   = video
cap 03[44] = VPD
cap 01[4c] = powerspec 2  supports D0 D3  current D0
[EMAIL PROTECTED]:0:9:1:   class=0x048000 card=0x chip=0x0878109e 
rev=0x11 hdr=0x00
vendor = 'Conexant (Was: Brooktree Corp)'
device = '7610144DREV_02\41F7DBC9F009F0 TV Video Capture'
class  = multimedia
cap 03[44] = VPD
cap 01[4c] = powerspec 2  supports D0 D3  current D0
[EMAIL PROTECTED]:0:10:0:   class=0x010400 card=0x100113c1 chip=0x100113c1 
rev=0x01 hdr=0x00
vendor = '3ware Inc.'
device = '7000/8000 series ATA-133 Storage Controller'
class  = mass storage
subclass   = RAID
cap 01[40] = powerspec 1  supports D0 D1 D3  current D0
[EMAIL PROTECTED]:0:11:0:   class=0x02 card=0x00408086 chip=0x12298086 
rev=0x0c hdr=0x00
vendor = 'Intel Corporation'
device = '82550/1/7/8/9 EtherExpress PRO/100(B) Ethernet Adapter'
class  = network
subclass   = ethernet
cap 01[dc] = powerspec 2  supports D0 D1 D2 D3  current D0
[EMAIL PROTECTED]:0:13:0:class=0x02 card=0x90011695 chip=0x813910ec 
rev=0x10 hdr=0x00
vendor = 'Realtek Semiconductor'
device = 'RT8139 (A/B/C/810x/813x/C+) Fast Ethernet Adapter'
class  = network
subclass   = ethernet
cap 01[50] = powerspec 2  supports D0 D1 D2 D3  current D0
[EMAIL PROTECTED]:0:15:0:class=0x01018f card=0x300c1695 chip=0x31491106 
rev=0x80 hdr=0x00
vendor = 'VIA Technologies Inc'
device = 'VT8237  VT6410 SATA RAID Controller'
class  = mass storage
subclass   = ATA
cap 01[c0] = powerspec 2  supports D0 D3  current D0
[EMAIL PROTECTED]:0:15:1:class=0x01018a card=0x300c1695 chip=0x05711106 
rev=0x06 hdr=0x00
vendor = 'VIA Technologies Inc'
device = 'VT82C586A/B/VT82C686/A/B/VT823x/A/C Bus Master IDE Controller'
class  = mass storage
subclass   = ATA
cap 01[c0] = powerspec 2  supports D0 D3  current D0
[EMAIL PROTECTED]:0:16:0:  class=0x0c0300 card=0x300c1695 chip=0x30381106 
rev=0x81 hdr=0x00
vendor = 'VIA Technologies Inc'
device = 'VT83C572,