Re: Poor CARP Interface Performance with NAT
On Tue, Jan 21, 2014 at 03:51:23PM -0800, Gabriel Kuri wrote: > I am running obsd 5.4 as my NAT router. I decided to setup a second obsd > box and run carp between the two for the external NATed interface (facing > the ISP). After I setup everything and switched pf to NAT using the address > on the carp interface, I'm seeing about 12Mbps - 13Mbps on the download, I > have a 60Mbps pipe (down). When I switch pf back to NAT using the address > on the physical interface, I get my full 60Mbps. Any ideas as to what I > could be doing wrong that would limit performance through the carp > interface to around 12Mbps - 13Mbps ? You might want to try posting this to the pf mailing list: http://www.benzedrine.cx/mailinglist.html Maybe somebody there will have a suggestion?
Re: ntpd switching between synced and unsynced since snap from 22nd jan
On Tue, 28 Jan 2014 23:32:30 +0100, Markus Lude wrote: > Hello, > > since updating to the latest snapshot on sparc64 from 22nd january > ntpd switches back and forth between synced and unsynced clock every > few minutes. Does anyone notice similar behavior? I have the same on amd64, but it appears only since the snapshot from 24th. > my ntpd.conf: > servers de.pool.ntp.org > > Regards, > Markus > -- Vigdis
ntpd switching between synced and unsynced since snap from 22nd jan
Hello, since updating to the latest snapshot on sparc64 from 22nd january ntpd switches back and forth between synced and unsynced clock every few minutes. Does anyone notice similar behavior? my ntpd.conf: servers de.pool.ntp.org Regards, Markus
Re: NAT reliability in light of recent checksum changes
On 2014-01-28 12:45, Stuart Henderson wrote:
>> This analysis is bullshit. You need to take into account the fact that
>> checksums are verified before regenerating them. That is, you need to
>> compare a) verifying + regenerating vs b) updating. If there's an
>> undetectable error, you're going to propagate it no matter whether you
>> do a) or b).
>
> Checksums are, in many cases, only verified *on the NIC*.
>
> Consider this scenario, which has happened in real life.
>
> - NIC supports checksum offloading, verified checksum is OK.
>
> - PCI transfers are broken (in my case it affected multiple machines
> of a certain type, so most likely a motherboard bug), causing some
> corruption in the payload, but the machine won't detect them because
> it doesn't look at checksums itself, just trusts the NIC's "rx csum
> good" flag.
>
> In this situation, packets which have been NATted that are corrupt
> now get a new checksum that is valid; so the final endpoint can not
> detect the breakage.
>
> I'm not sure if this is common enough to be worth worrying about
> here, but the analysis is not bullshit.

You're right. I was in the rough, sorry, and thanks for the explanation.

I don't think this scenario is worth worrying about though.

Simon
Re: NAT reliability in light of recent checksum changes
Em 28-01-2014 15:45, Stuart Henderson escreveu: > On 2014-01-28, Simon Perreault wrote: >> Le 2014-01-28 03:39, Richard Procter a écrit : >>> In order to hide payload corruption the update code would >>> have to modify the checksum to exactly account for it. But >>> that would have to happen by accident, as it never considers >>> the payload. It's not impossible, but, on the other hand, >>> checksum regeneration guarantees to hide any bad data. >>> So updates are more reliable. >> This analysis is bullshit. You need to take into account the fact that >> checksums are verified before regenerating them. That is, you need to >> compare a) verifying + regenerating vs b) updating. If there's an >> undetectable error, you're going to propagate it no matter whether you >> do a) or b). >> >> Simon >> >> > Checksums are, in many cases, only verified *on the NIC*. > > Consider this scenario, which has happened in real life. > > - NIC supports checksum offloading, verified checksum is OK. > > - PCI transfers are broken (in my case it affected multiple machines > of a certain type, so most likely a motherboard bug), causing some > corruption in the payload, but the machine won't detect them because > it doesn't look at checksums itself, just trusts the NIC's "rx csum > good" flag. > > In this situation, packets which have been NATted that are corrupt > now get a new checksum that is valid; so the final endpoint can not > detect the breakage. > > I'm not sure if this is common enough to be worth worrying about > here, but the analysis is not bullshit. > Stuart, It is more common than you might think. I had some gigabit motherboards in which some models always would corrupt the packets when using the onboard nic. I believe that in these cases there isn't much that the OS can do. Unfortunately, it's always the application job to detect if it is receiving good or bad data. Cheers, -- Giancarlo Razzolini GPG: 4096R/77B981BC
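The point that the application ultimately has to detect bad data can be illustrated with a minimal sketch. It assumes the sender can publish a SHA-256 digest of the payload through some trusted channel; the function names and the sample payload are hypothetical and do not come from the thread.

```python
import hashlib
import hmac


def sha256_hex(payload: bytes) -> str:
    """Return the SHA-256 digest of payload as a hex string."""
    return hashlib.sha256(payload).hexdigest()


def payload_intact(payload: bytes, expected_hex: str) -> bool:
    """Compare the received payload's digest with the one the sender published."""
    return hmac.compare_digest(sha256_hex(payload), expected_hex)


# Hypothetical payload; the digest is assumed to arrive out of band.
sent = b"important application data"
published_digest = sha256_hex(sent)

# Simulate a single bit flipped somewhere between the NIC and the application.
received_bad = bytearray(sent)
received_bad[5] ^= 0x01

print(payload_intact(sent, published_digest))
print(payload_intact(bytes(received_bad), published_digest))
```

Unlike the 16-bit TCP checksum, such a check is computed and verified only at the two endpoints, so nothing on the path (a NIC, a bad bus, or a NAT box) can silently "repair" it.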
Re: NAT reliability in light of recent checksum changes
On 2014-01-28, Simon Perreault wrote: > Le 2014-01-28 03:39, Richard Procter a écrit : >> In order to hide payload corruption the update code would >> have to modify the checksum to exactly account for it. But >> that would have to happen by accident, as it never considers >> the payload. It's not impossible, but, on the other hand, >> checksum regeneration guarantees to hide any bad data. >> So updates are more reliable. > > This analysis is bullshit. You need to take into account the fact that > checksums are verified before regenerating them. That is, you need to > compare a) verifying + regenerating vs b) updating. If there's an > undetectable error, you're going to propagate it no matter whether you > do a) or b). > > Simon > > Checksums are, in many cases, only verified *on the NIC*. Consider this scenario, which has happened in real life. - NIC supports checksum offloading, verified checksum is OK. - PCI transfers are broken (in my case it affected multiple machines of a certain type, so most likely a motherboard bug), causing some corruption in the payload, but the machine won't detect them because it doesn't look at checksums itself, just trusts the NIC's "rx csum good" flag. In this situation, packets which have been NATted that are corrupt now get a new checksum that is valid; so the final endpoint can not detect the breakage. I'm not sure if this is common enough to be worth worrying about here, but the analysis is not bullshit.
Re: dead disk
On Tue, Jan 28, 2014 at 05:33, Andres Perera wrote: > do you understand that disks have write caches that don't give a hoot > about posix mkdir() rename() and so on? > > can bit rot change a inode type from directory to file, and vice versa? > > do you want the kernel to figure these out after the fact and > retroactively panic() for each occurence, neatly queueing them boot > after boot or do you want to grow a pair of balls instead? ./ffs/ffs_alloc.c: panic("ffs_alloc: bad size"); ./ffs/ffs_alloc.c: panic("ffs_alloc: missing credential"); ./ffs/ffs_alloc.c: panic("ffs_realloccg: bad size"); ./ffs/ffs_alloc.c: panic("ffs_realloccg: missing credential"); ./ffs/ffs_alloc.c: panic("ffs_realloccg: bad bprev"); ./ffs/ffs_alloc.c: panic("ffs_realloccg: bad blockno"); ./ffs/ffs_alloc.c: panic("ffs_realloccg: small buf"); ./ffs/ffs_alloc.c: panic("ffs_realloccg: bad optim"); ./ffs/ffs_alloc.c: panic("ffs_realloccg: small buf 2"); ./ffs/ffs_alloc.c: panic("ffs1_reallocblks: unallocated block 1"); ./ffs/ffs_alloc.c: panic("ffs1_reallocblks: non-logical cluster"); ./ffs/ffs_alloc.c: panic("ffs1_reallocblks: non-physical cluster %d", i); ./ffs/ffs_alloc.c: panic("ffs1_reallocblk: start == end"); ./ffs/ffs_alloc.c: panic("ffs1_reallocblks: unallocated block 2"); ./ffs/ffs_alloc.c: panic("ffs1_reallocblks: alloc mismatch"); ./ffs/ffs_alloc.c: panic("ffs1_reallocblks: unallocated block 3"); ./ffs/ffs_alloc.c: panic("ffs2_reallocblks: unallocated block 1"); ./ffs/ffs_alloc.c: panic("ffs2_reallocblks: non-logical cluster"); ./ffs/ffs_alloc.c: panic("ffs2_reallocblks: non-physical cluster %d", i); ./ffs/ffs_alloc.c: panic("ffs2_reallocblk: start == end"); ./ffs/ffs_alloc.c: panic("ffs2_reallocblks: unallocated block 2"); ./ffs/ffs_alloc.c: panic("ffs2_reallocblks: alloc mismatch"); ./ffs/ffs_alloc.c: panic("ffs2_reallocblks: unallocated block 3"); ./ffs/ffs_alloc.c: panic("ffs_valloc: dup alloc"); ./ffs/ffs_alloc.c: panic("ffs_clusteralloc: map mismatch"); ./ffs/ffs_alloc.c: 
panic("ffs_clusteralloc: allocated out of group"); ./ffs/ffs_alloc.c: panic("ffs_clusteralloc: lost block"); ./ffs/ffs_alloc.c: panic("ffs_nodealloccg: map corrupted"); ./ffs/ffs_alloc.c: panic("ffs_nodealloccg: block not in map"); ./ffs/ffs_alloc.c: panic("ffs_blkfree: bad size"); ./ffs/ffs_alloc.c: panic("ffs_blkfree: freeing free block"); ./ffs/ffs_alloc.c: panic("ffs_blkfree: freeing free frag"); ./ffs/ffs_alloc.c: panic("ffs_freefile: range: dev = 0x%x, ino = %d, fs = %s", ./ffs/ffs_alloc.c: panic("ffs_freefile: freeing free inode"); ./ffs/ffs_alloc.c: panic("ffs_checkblk: bad size"); ./ffs/ffs_alloc.c: panic("ffs_checkblk: bad block %lld", (long long)bno); ./ffs/ffs_alloc.c: panic("ffs_checkblk: partially free fragment"); ./ffs/ffs_alloc.c: * It is a panic if a request is made to find a block if none are ./ffs/ffs_alloc.c: panic("ffs_alloccg: map corrupted"); ./ffs/ffs_alloc.c: panic("ffs_alloccg: block not in map"); ./ffs/ffs_balloc.c: panic("ffs1_balloc: blk too big"); ./ffs/ffs_balloc.c: panic ("ffs1_balloc: ufs_bmaparray returned indirect block"); ./ffs/ffs_balloc.c: panic("Could not unwind indirect block, error %d", r); ./ffs/ffs_balloc.c: panic("ffs2_balloc: block too big"); ./ffs/ffs_balloc.c: panic("ffs2_balloc: ufs_bmaparray returned indirect block"); ./ffs/ffs_balloc.c: panic("ffs2_balloc: unwind failed"); ./ffs/ffs_inode.c: panic("ffs_update: bad link cnt"); ./ffs/ffs_inode.c: panic("ffs_truncate: partial truncate of symlink"); ./ffs/ffs_inode.c: panic("ffs_truncate: newspace"); ./ffs/ffs_inode.c: panic("ffs_truncate1"); ./ffs/ffs_inode.c: panic("ffs_truncate2"); ./ffs/ffs_inode.c: panic("ffs_indirtrunc: bad buffer size"); ./ffs/ffs_subr.c:__dead void panic(const char *, ...); ./ffs/ffs_subr.c: panic("Disk buffer overlap"); ./ffs/ffs_vfsops.c: panic("ffs_reload: dirty2"); ./ffs/ffs_vfsops.c: panic("ffs_reload: dirty1"); ./ffs/ffs_vfsops.c: panic("ffs_statfs"); ./ffs/ffs_vfsops.c: panic("ffs_statfs"); ./ffs/ffs_vfsops.c:
Re: yacc references
On Mon, Jan 27, 2014 at 01:57:48PM +0100, Jan Stary wrote:
> The diff below moves the journal volume
> reference into the .Rs block and adds pages.
>
> Jan
>

as discussed with jan, the diff below got committed.
jmc

Index: yacc.1
===
RCS file: /cvs/src/usr.bin/yacc/yacc.1,v
retrieving revision 1.27
diff -u -r1.27 yacc.1
--- yacc.1 15 Aug 2013 16:26:59 - 1.27
+++ yacc.1 28 Jan 2014 15:42:36 -
@@ -187,7 +187,10 @@
 .%A T. J. Pennello
 .%D 1982
 .%J TOPLAS
+.%N Issue 4
+.%P pp. 615\(en649
 .%T Efficient Computation of LALR(1) Look-Ahead Sets
+.%V Volume 4
 .Re
 .Sh STANDARDS
 The
@@ -212,7 +215,7 @@
 Berkeley Yacc is based on the excellent algorithm for computing LALR(1)
 lookaheads developed by Tom Pennello and Frank DeRemer.
 The algorithm is described in their almost impenetrable article in
-TOPLAS 4,4.
+TOPLAS (see above).
 .Pp
 Finally, much credit must go to those who pointed out deficiencies
 of earlier releases.
Re: worm diff
On Thu, Jan 23, 2014 at 01:39:33PM +0100, Jan Stary wrote:
> Time for today's silly diff yet?
> The following augments the worm(6) manpage to
> answer the burning question "what happens if
> I make the worm too long initially"?
>
> Jan
>

i've just committed a fix from paul janzen that answers your burning
question: on -current (i.e. as of 10 minutes ago), worm bails out with
an error message if you make your initial worm too long.

paul also contributed a second fix to have the cursor start on the
worm's head.

happy worming!
jmc

>
> Index: games/worm/worm.6
> ===
> RCS file: /cvs/src/games/worm/worm.6,v
> retrieving revision 1.12
> diff -u -p -u -p -r1.12 worm.6
> --- games/worm/worm.6 22 Oct 2009 18:07:31 - 1.12
> +++ games/worm/worm.6 23 Jan 2014 11:14:45 -
> @@ -62,3 +62,6 @@ The current score
>  is kept in the upper right corner of the screen.
>  .Pp
>  The optional argument, if present, is the initial length of the worm.
> +If the specified initial length would cause the worm to cover
> +more then roughly one third of the screen, the default length
> +will be used instead.
Re: NAT reliability in light of recent checksum changes
On 2014-01-28 03:39, Richard Procter wrote:
> In order to hide payload corruption the update code would
> have to modify the checksum to exactly account for it. But
> that would have to happen by accident, as it never considers
> the payload. It's not impossible, but, on the other hand,
> checksum regeneration guarantees to hide any bad data.
> So updates are more reliable.

This analysis is bullshit. You need to take into account the fact that
checksums are verified before regenerating them. That is, you need to
compare a) verifying + regenerating vs b) updating. If there's an
undetectable error, you're going to propagate it no matter whether you
do a) or b).

Simon
Re: NAT reliability in light of recent checksum changes
On 2014-01-27 21:21, Geoff Steckel wrote:
> It would be good if when data protected by a checksum is modified,
> the current checksum is validated and some appropriate? action is done
> (drop? produce invalid new checksum?) when proceeding.

This is exactly what's being done. Don't you listen when Henning speaks?

Simon
Re: dead disk
On Tue, Jan 28, 2014 at 6:12 AM, Philip Guenther wrote: > On Tue, Jan 28, 2014 at 2:03 AM, Andres Perera wrote: >> On Tue, Jan 28, 2014 at 4:55 AM, Philip Guenther wrote: > ... >>> I'm no expert on softdeps, so maybe you have a better explanation for >>> why Kirk made the choice he did to have it panic in some cases? >> >> well, i'm no expert either. now that we have presented our >> credentials, let's go back to what was already conjecture > ... >> do you want the kernel to figure these out after the fact and >> retroactively panic() for each occurence, neatly queueing them boot >> after boot or do you want to grow a pair of balls instead? > > You ignore my pointer to the actual engineering and logic in this area > and prefer to expand upon the conjecture. I cannot help in that area > and am unwilling to have you reorder my TODO list to suit your > pleasure. > the comments pertain to your misrepresentation of McKusick's softdep paper. this being a public forum, your todo list is your personal business, and in any case not for others to be shoehorned into when blatant mistakes need correction the paper does not support the notion that metadata cache flushing failures lead to complete system instability meriting a panic. quote the relevant text or stop pretending that it's there. meanwhile, there are cases where synchronous writing of metadata can also allow the unavailability and corruption of a previously succesful system call's pervasive effects. the onus is on you, or in your imaginary representation of the paper, to prove that halting the system is justifiable in BOTH circumstances. the paper does not discuss alternatives, eg, mounting read only and preserving references to unflushed data until unmount... so thread along if you find looking for a solution uninteresting. that's better than lying. > Instead I look forward to your diff fixing this bug in softdeps. > Please send that diff to the list and not me directly, as I find your > submissions uninteresting. 
> > > Philip Guenther
Re: bwi0: intr fatal TX/RX ([01]) error 0x00001000 (continuously streaming)
On Wed, Jan 15, 2014 at 9:41 AM, Martin Pieuchot wrote: > On 15/01/14(Wed) 15:27, Stefan Sperling wrote: >> It seems the bwi driver lacks support for PIO mode which the >> linux b43 driver falls back to in case of DMA errors such as >> this one. >> >> Two related linux commits: >> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=9e3bd9190800e8209b4a3e1d724c35f0738dcad2 >> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=5100d5ac81b9330dc57e35adbe50923ba6107b8f >> I'm not sure why the latter commit talks about PCMCIA but it mentions >> the powerbook G4. > > That might help but that's just the top of the iceberg. At some point > this driver worked with my iBook G4 (I remember using it a lot in 4.7). > > Now and for quite some time it produces the same intr error. Same > problem with my PowerBook G4 12''. However it works well on my > PowerBook 15'' ... I finally had a moment to retrieve and installed 4.7-RELEASE to test this theory. After installing the bwi-firmware package, the same behavior is exhibited when one issues "ifconfig bwi0 up", so it wouldn't seem to be a regression. Martin, perhaps your iBook G4 has a slightly different card in it? Also: Stefan, would you be interested in working on this if I sent you the machine? If so, contact me offlist with shipping info. -slr
Re: Serial terminal (not console)
On Tue, Jan 28, 2014 at 03:27:55AM -0500, Hugo Villeneuve wrote:
> On Mon, Jan 27, 2014 at 10:57:59AM +, Zé Loff wrote:
> > On Thu, Jan 23, 2014 at 10:47:05AM +, Zé Loff wrote:
> > > Sorry if it's a dumb question, but I'm stuck...
> > >
> > > I have two machines (yesterday's current, one i386, one amd64) and I can
> > > properly setup a serial console on the i386 and access it on the amd64
> > > (i.e. changes in boot.conf and /etc/ttys as per the FAQ). This tells me
> > > that the cable is OK, that there are no IRQ issues going on, that it all
> > > works at the set baud rate (9600), etc...
> > >
> > > However, I can't get a simple terminal on the same tty device (i.e. not
> > > the system's console). Without a /etc/boot.conf file, and an otherwise
> > > vanilla /etc/ttys except for the following line
> > >
> > > tty00 "/usr/libexec/getty std.9600" vt100 on secure
> > >
> > > I don't get a login prompt when connecting on the other machine, even if
> > > ps shows that there is a getty instance running for tty00.
> > >
> > > The ultimate goal is to be able to manage a couple of "com
> > > port-impaired" HP MicroServers using a couple of ATEN USB serial
> > > adapters, which can't be used for serial consoles for obvious reasons.
> > > That's why I need to get a login on a serial terminal other than the
> > > system's console...
> > >
> > > Any clues?
> > > Thanks in advance
> > > Zé
> > >
> >
> > Sorry for insisting, but no one has a hint of why with this /etc/ttys
> >
> > # name getty type status comments
> > console "/usr/libexec/getty std.9600" vt100 on secure <---
> > tty00 "/usr/libexec/getty std.9600" vt220 off <---
> > ...
> >
> > and this /etc/boot.conf
> >
> > stty com0 9600
> > set tty com0
> >
> > I can remotely connect, but with this /etc/ttys
> >
> > # name getty type status comments
> > console "/usr/libexec/getty std.9600" vt220 off secure <---
> > tty00 "/usr/libexec/getty std.9600" vt100 on secure <---
> > ...
> >
> > and no /boot.conf I can't? Cluebats welcome.
> >
>
> I think what is tagged as "console" automatically gets the "local"
> flag set. Although, my whole experience with serial console is with
> sparc/vax not i386/amd64, things may differ in that world.
>
> Try in /etc/ttys:
>
> tty00 "/usr/libexec/getty std.9600" vt100 on secure local
>
>
> If you don't want to reboot, look at ttyflags(8).
>
> Good luck.
>

That did it, thanks.

--
Re: dead disk
On Tue, Jan 28, 2014 at 2:03 AM, Andres Perera wrote: > On Tue, Jan 28, 2014 at 4:55 AM, Philip Guenther wrote: ... >> I'm no expert on softdeps, so maybe you have a better explanation for >> why Kirk made the choice he did to have it panic in some cases? > > well, i'm no expert either. now that we have presented our > credentials, let's go back to what was already conjecture ... > do you want the kernel to figure these out after the fact and > retroactively panic() for each occurence, neatly queueing them boot > after boot or do you want to grow a pair of balls instead? You ignore my pointer to the actual engineering and logic in this area and prefer to expand upon the conjecture. I cannot help in that area and am unwilling to have you reorder my TODO list to suit your pleasure. Instead I look forward to your diff fixing this bug in softdeps. Please send that diff to the list and not me directly, as I find your submissions uninteresting. Philip Guenther
Re: dead disk
On Tue, Jan 28, 2014 at 4:55 AM, Philip Guenther wrote: > On Tue, Jan 28, 2014 at 12:27 AM, Andres Perera wrote: >> On Sun, Jan 26, 2014 at 5:07 PM, Philip Guenther wrote: >>> On Sun, Jan 26, 2014 at 11:40 AM, emigrant wrote: My Master machine is dead, exactly HDD(thank you God for CARP+pfsync) :). root@master[/etc]wd0(pciide0:0:0): timeout type: ata c_bcount: 16384 c_skip: 0 >>> ... /: got error 5 while accessing filesystem panic: softdep_deallocate_dependencies: unrecovered I/O error Stopped at Debugger+0x4: popl%ebp RUN AT LEAST 'trace' AND 'ps' AND INCLUDE OUTPUT WHEN REPORTING THIS PANIC! DO NOT EVEN BOTHER REPORTING THIS WITHOUT INCLUDING THAT INFORMATION! ddb> >>> >>> This is a fundamental problem of softdeps:it can delay an operation to >>> a point where other operations depend on it in a such a way that if >>> the I/O for that first operation fails, the dependent operations >>> cannot be undone and the failure propagated up safely. Rather than >>> live a lie, it'll panic the system and die. >> >> the way the decision to panic() was stated implies that the course of >> action is justified, when detaching the disk/hub, or forcefully >> mounting it read only, are alternatives that could be explored. > > How do those alternative actions, which can only fail in-progress and > future operation, satisfactorily resolve the case of operations WHICH > HAVE ALREADY RETURNED SUCCESS but whose effects will actually be lost > and not durable? > > I'm no expert on softdeps, so maybe you have a better explanation for > why Kirk made the choice he did to have it panic in some cases? well, i'm no expert either. now that we have presented our credentials, let's go back to what was already conjecture do you understand that disks have write caches that don't give a hoot about posix mkdir() rename() and so on? can bit rot change a inode type from directory to file, and vice versa? 
do you want the kernel to figure these out after the fact and retroactively panic() for each occurence, neatly queueing them boot after boot or do you want to grow a pair of balls instead? > > > Philip Guenther
Re: dead disk
On Tue, Jan 28, 2014 at 12:27 AM, Andres Perera wrote: > On Sun, Jan 26, 2014 at 5:07 PM, Philip Guenther wrote: >> On Sun, Jan 26, 2014 at 11:40 AM, emigrant wrote: >>> My Master machine is dead, exactly HDD(thank you God for CARP+pfsync) :). >>> >>> root@master[/etc]wd0(pciide0:0:0): timeout >>> type: ata >>> c_bcount: 16384 >>> c_skip: 0 >> ... >>> /: got error 5 while accessing filesystem >>> panic: softdep_deallocate_dependencies: unrecovered I/O error >>> Stopped at Debugger+0x4: popl%ebp >>> RUN AT LEAST 'trace' AND 'ps' AND INCLUDE OUTPUT WHEN REPORTING THIS PANIC! >>> DO NOT EVEN BOTHER REPORTING THIS WITHOUT INCLUDING THAT INFORMATION! >>> ddb> >> >> This is a fundamental problem of softdeps:it can delay an operation to >> a point where other operations depend on it in a such a way that if >> the I/O for that first operation fails, the dependent operations >> cannot be undone and the failure propagated up safely. Rather than >> live a lie, it'll panic the system and die. > > the way the decision to panic() was stated implies that the course of > action is justified, when detaching the disk/hub, or forcefully > mounting it read only, are alternatives that could be explored. How do those alternative actions, which can only fail in-progress and future operation, satisfactorily resolve the case of operations WHICH HAVE ALREADY RETURNED SUCCESS but whose effects will actually be lost and not durable? I'm no expert on softdeps, so maybe you have a better explanation for why Kirk made the choice he did to have it panic in some cases? Philip Guenther
Re: NAT reliability in light of recent checksum changes
On 28/01/2014, at 4:19 AM, Simon Perreault wrote:
> On 2014-01-25 14:40, Richard Procter wrote:
>> I'm not saying the calculation is bad. I'm saying it's being
>> calculated from the wrong copy of the data and by the wrong
>> device. And it's not just me saying it: I'm quoting the guys
>> who designed TCP.
>
> Those guys didn't envision NAT.
>
> If you want end-to-end checksum purity, don't do NAT.

Let's look at the options.

The world needs more addresses than IPv4 provides and NAT gives them to us. There's IPv6, which has about a hundred million addresses for every bacterium estimated to live on the planet[0], but it's not looking to replace IPv4 any time soon. So NAT is here to stay for a good while longer.

Perhaps I can at least stop using NAT on my own network. In my case I can't, but let's assume I do. This eliminates one source of error. But my TCP streams may still have now-undetected one-bit errors (at least) if there are routers out there regenerating checksums. As long as there are, good checksums no longer mean as much by themselves, and if I want at least some assurance the network did its job, I still need some other way (e.g., checking that the network path contains no such routers, either by inspection or statistically, or reimplementing an end-to-end checksum at a higher layer, etc.). Regenerated checksums affect me whether or not I use NAT myself.

Another option is to always update the checksum, as versions prior to 5.4 did. It's reasonable to ask: is updating any more reliable than recomputing the checksum as 5.4 does? That is, can the old update code hide payload corruption, too?

In order to hide payload corruption the update code would have to modify the checksum to exactly account for it. But that would have to happen by accident, as it never considers the payload. It's not impossible, but, on the other hand, checksum regeneration guarantees to hide any bad data. So updates are more reliable. A lot more reliable, in fact, as you'd require precisely those memory errors necessary to in effect compute the correct update, or some freak fault in the ALU that did the same thing, or some combination of both. And as that has nothing to do with the update code, it is in principle possible for non-NAT connections, too.

For the hardware, updates are just an extra load/modify/store, and so the chances of a checksum update hiding a corrupted payload are in practical terms equivalent to those of normal forwarding.

So your statement holds only if checksums are being regenerated. In general, NAT needn't compromise end-to-end TCP payload checksum integrity, and in versions prior to 5.4, it didn't.

best,
Richard.

[0] "Prokaryotes: The unseen majority" Proc Natl Acad Sci U S A. 1998 June 9; 95(12): 6578–6583. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC33863/
2^128 IPv6 addresses = ~ 10^38
~ 10^38 IPv6 addresses / ~ 10^30 bacteria cells = ~ 10^8 addresses per cell.

[1] RFC1071 "Computing the Internet Checksum" p21 "If anything, [this end-to-end property] is the most powerful feature of the TCP checksum!". Page 15 also touches on the end-to-end preserving properties of checksum update.
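The difference between regenerating and updating a checksum can be sketched with a toy packet. This is only an illustration under stated assumptions, not pf's code: the "packet" is a 4-byte address followed by a payload, with no real TCP pseudo-header, and the addresses, payload and helper names are all made up. The incremental update follows equation 3 of RFC 1624, HC' = ~(~HC + ~m + m').

```python
import struct


def fold(total: int) -> int:
    """Fold carries back in until the value fits in 16 bits."""
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def ones_sum(data: bytes) -> int:
    """16-bit one's-complement sum of data, zero-padded to an even length."""
    if len(data) % 2:
        data += b"\x00"
    return fold(sum(struct.unpack(f"!{len(data) // 2}H", data)))


def checksum(data: bytes) -> int:
    """Internet checksum: complement of the one's-complement sum."""
    return ~ones_sum(data) & 0xFFFF


def verify(data: bytes, csum: int) -> bool:
    """Receiver-side check: data plus checksum must sum to 0xFFFF."""
    return fold(ones_sum(data) + csum) == 0xFFFF


def update(csum: int, old_word: int, new_word: int) -> int:
    """Incremental update for one changed 16-bit word (RFC 1624, eqn. 3)."""
    return ~fold((~csum & 0xFFFF) + (~old_word & 0xFFFF) + new_word) & 0xFFFF


# Illustrative values only.
old_addr = bytes([192, 168, 1, 10])
new_addr = bytes([203, 0, 113, 5])
payload = b"end-to-end data"
sender_csum = checksum(old_addr + payload)

# The payload is damaged inside the router, after the NIC reported "rx csum good".
damaged = bytearray(payload)
damaged[3] ^= 0x04
damaged = bytes(damaged)

# Option a) regenerate: compute a fresh checksum over whatever is in memory.
regenerated = checksum(new_addr + damaged)

# Option b) update: adjust the sender's checksum for the address words only.
updated = sender_csum
for old_w, new_w in zip(struct.unpack("!2H", old_addr), struct.unpack("!2H", new_addr)):
    updated = update(updated, old_w, new_w)

regen_passes = verify(new_addr + damaged, regenerated)   # corruption hidden
update_passes = verify(new_addr + damaged, updated)      # corruption detected
clean_passes = verify(new_addr + payload, updated)       # undamaged packet still valid
print(regen_passes, update_passes, clean_passes)
```

In this sketch the regenerated checksum validates the damaged packet, while the updated one still reflects the sender's original payload, so the receiver's check fails. A regenerating router only avoids this if it verifies the old checksum in software, over the same copy of the data, before computing the new one.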
Re: Serial terminal (not console)
On Mon, Jan 27, 2014 at 10:57:59AM +, Zé Loff wrote:
> On Thu, Jan 23, 2014 at 10:47:05AM +, Zé Loff wrote:
> > Sorry if it's a dumb question, but I'm stuck...
> >
> > I have two machines (yesterday's current, one i386, one amd64) and I can
> > properly setup a serial console on the i386 and access it on the amd64
> > (i.e. changes in boot.conf and /etc/ttys as per the FAQ). This tells me
> > that the cable is OK, that there are no IRQ issues going on, that it all
> > works at the set baud rate (9600), etc...
> >
> > However, I can't get a simple terminal on the same tty device (i.e. not
> > the system's console). Without a /etc/boot.conf file, and an otherwise
> > vanilla /etc/ttys except for the following line
> >
> > tty00 "/usr/libexec/getty std.9600" vt100 on secure
> >
> > I don't get a login prompt when connecting on the other machine, even if
> > ps shows that there is a getty instance running for tty00.
> >
> > The ultimate goal is to be able to manage a couple of "com
> > port-impaired" HP MicroServers using a couple of ATEN USB serial
> > adapters, which can't be used for serial consoles for obvious reasons.
> > That's why I need to get a login on a serial terminal other than the
> > system's console...
> >
> > Any clues?
> > Thanks in advance
> > Zé
> >
>
> Sorry for insisting, but no one has a hint of why with this /etc/ttys
>
> # name getty type status comments
> console "/usr/libexec/getty std.9600" vt100 on secure <---
> tty00 "/usr/libexec/getty std.9600" vt220 off <---
> ...
>
> and this /etc/boot.conf
>
> stty com0 9600
> set tty com0
>
> I can remotely connect, but with this /etc/ttys
>
> # name getty type status comments
> console "/usr/libexec/getty std.9600" vt220 off secure <---
> tty00 "/usr/libexec/getty std.9600" vt100 on secure <---
> ...
>
> and no /boot.conf I can't? Cluebats welcome.
>

I think what is tagged as "console" automatically gets the "local"
flag set. Although my whole experience with serial consoles is with
sparc/vax, not i386/amd64, so things may differ in that world.

Try in /etc/ttys:

tty00 "/usr/libexec/getty std.9600" vt100 on secure local


If you don't want to reboot, look at ttyflags(8).

Good luck.
Re: dead disk
On Sun, Jan 26, 2014 at 5:07 PM, Philip Guenther wrote: > On Sun, Jan 26, 2014 at 11:40 AM, emigrant wrote: >> My Master machine is dead, exactly HDD(thank you God for CARP+pfsync) :). >> >> root@master[/etc]wd0(pciide0:0:0): timeout >> type: ata >> c_bcount: 16384 >> c_skip: 0 > ... >> /: got error 5 while accessing filesystem >> panic: softdep_deallocate_dependencies: unrecovered I/O error >> Stopped at Debugger+0x4: popl%ebp >> RUN AT LEAST 'trace' AND 'ps' AND INCLUDE OUTPUT WHEN REPORTING THIS PANIC! >> DO NOT EVEN BOTHER REPORTING THIS WITHOUT INCLUDING THAT INFORMATION! >> ddb> > > This is a fundamental problem of softdeps:it can delay an operation to > a point where other operations depend on it in a such a way that if > the I/O for that first operation fails, the dependent operations > cannot be undone and the failure propagated up safely. Rather than > live a lie, it'll panic the system and die. the way the decision to panic() was stated implies that the course of action is justified, when detaching the disk/hub, or forcefully mounting it read only, are alternatives that could be explored the other day I unplugged the power connector from a malfunctioning DVD-RW drive that was being too darn noisy. the kernel proceeded to detach the ahci device hosting the aformentioned drive and an sd with mounted ffs partitions i could've unplugged the power connector in the middle of a string of metadata writes to the sd. does that entail that the system should panic? hopefully the current outcome will remain, because it's way more useful than my pc throwing a hissy's fits. > > I don't know exactly which operations can lead to that; if you need to > know that you should go read the softdeps papers on Kirk McKusick's > site. > > > Philip Guenther