Re: MySQL connection error after upgrade 4.9-5.0
One more step towards OT, and yet on the spot. Now it is about 'beauty'. At shutdown, we always get:

stopping package daemons:/etc/rc[260]: /etc/rc.d/: cannot execute - Is a directory

Is there still anything wrong with our configuration?

Uwe
MySQL connection error after upgrade 4.9-5.0
I have this unfortunate occurrence on one of my production machines:

Database Error: Unable to connect to the database:Could not connect to MySQL

I studied the Upgrade Guide 4.9 to 5.0 intensely before and after, but can't find what went wrong. I just did the upgrade, and made the links as proposed (so I hope):

# pwd
/etc/php-5.2
# ls -l
total 0
lrwxr-xr-x 1 root wheel 26 Mar 14 17:26 gd.ini -> /etc/php-5.2.sample/gd.ini
lrwxr-xr-x 1 root wheel 29 Mar 14 17:25 mysql.ini -> /etc/php-5.2.sample/mysql.ini
# ls -l /var/www/conf/modules
total 0
lrwxr-xr-x 1 root daemon 41 Mar 14 16:26 php.conf -> /var/www/conf/modules.sample/php-5.2.conf

I also tried to copy the modified php.ini from /var/www/conf to /etc/php-5.2.ini. Then the result is:

Database Error: Unable to connect to the database:The MySQL adapter mysql is not available.

pkg_info looks okay:

mini_sendmail-chroot-1.3.6p1 static mini_sendmail for chrooted apache
mysql-client-5.1.54p0 multithreaded SQL database (client)
mysql-server-5.1.54p9 multithreaded SQL database (server)
nano-2.2.6 Pico editor clone with enhancements
[...]
pfstat-2.3p1 packet filter statistics visualization
php-5.2.17p5 server-side HTML-embedded scripting language
php-gd-5.2.17p4 image manipulation extensions for php5
php-mysql-5.2.17p3 mysql database access extensions for php5
png-1.5.4 library for manipulating PNG images

I am very grateful for any help or advice on how to further debug this.

Uwe
Re: MySQL connection error after upgrade 4.9-5.0
On Wed, Mar 14, 2012 at 7:27 PM, Norman Golisz li...@zcat.de wrote: However, did you change any values in php.ini from default? no, not yet. I wasn't actually expecting everything to be up 100%, but to be up, with the 50-default php-5.2.ini. Or, with the previous php.ini in that place. I also looked at the 'diff' before my post, but the differences are just huge. Which values did you have in mind for changing in the default php.ini? Uwe
Re: MySQL connection error after upgrade 4.9-5.0
On Wed, Mar 14, 2012 at 6:55 PM, Fred Crowson fred.crow...@gmail.com wrote: What does your logs say in /var/mysql/ ? hth Yes, Fred, very much! - It is obvious that I failed - and still fail - to understand the new startup system. Can anyone point me to a complete overview to read up on it? I don't get yet which utility starts and controls the package scripts in rc.d. And slightly OT: I have stared at the pkg_scripts=${pkg_scripts} postfix in the Upgrade Guide, and still don't grasp what this is supposed to do, and where; since I am running postscript. Thanks again, Uwe
Re: MySQL connection error after upgrade 4.9-5.0
On Wed, Mar 14, 2012 at 8:35 PM, Rodolfo Gouveia rgouv...@cosmico.net wrote:

And slightly OT: I have stared at the pkg_scripts=${pkg_scripts} postfix in the Upgrade Guide, and still don't grasp what this is supposed to do, and where; since I am running postscript.

http://www.openbsd.org/faq/faq10.html#rc
man rc.d

I had read those. And yet, I don't understand that line. It doesn't look like it should be written into rc.conf / rc.conf.local, does it? Correct me if I'm wrong. It looks like a shell variable that has 'postfix' appended.

Uwe
Re: MySQL connection error after upgrade 4.9-5.0
On Wed, Mar 14, 2012 at 10:22 PM, Uwe Dippel udip...@gmail.com wrote:

I had read those. And yet, I don't understand that line. It doesn't look like it should be written into rc.conf / rc.conf.local, does it? Correct me if I'm wrong. It looks like a shell variable that has 'postfix' appended.

Ooops, I think I finally got it. The shell variable is defined in rc.conf, and actually has 'postfix' appended when rc.conf.local is being run. Personally, I would not have expected the variable to be created in rc.conf, because since 4.something that file has been considered 'clean' of user entries at upgrade. Would it not be better to add package start strings in rc.conf.local only?

Uwe
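The appending behaviour described here can be tried out in a plain shell, away from the real boot scripts. The following is only a sketch under assumptions: it imitates the order "rc.conf first, rc.conf.local second" as understood in this thread, it does not reproduce what /etc/rc really does, and 'mysqld' is used purely as an illustrative second script name.

```sh
#!/bin/sh
# Sketch only: imitates the order in which the two config files are read,
# as described in this thread. It does not touch /etc/rc or /etc/rc.d.

# 1. What rc.conf would do: create the variable, empty by default.
pkg_scripts=

# 2. What a line in rc.conf.local would do: append to whatever is set.
pkg_scripts="${pkg_scripts} postfix"

# 3. A further package is appended the same way
#    (mysqld is an illustrative name, not taken from the Upgrade Guide).
pkg_scripts="${pkg_scripts} mysqld"

# The unquoted expansion splits on whitespace, so the leading blank is harmless.
for _script in ${pkg_scripts}; do
    echo "would run: /etc/rc.d/${_script} start"
done
```

Because every line appends to the current value, the order of the lines is the order in which the scripts would be started.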
etc/nsd.conf with wrong group after upgrade
I did the upgrade 4.8-4.9 as outlined in http://www.openbsd.org/faq/upgrade49.html

Now I get in the daily insecurity:

Checking special files and directories.
Output format is:
filename: criteria (shouldbe, reallyis)
etc/nsd.conf: gid (97, 0)

Did I miss anything? (I don't think so.) Yes, I'd know how to correct this, though I'd rather make sure that everything is done in the proper manner on both sides. I think that when one does

cd /tmp/etc
cp daily disktab man.conf monthly netstart nsd.conf pf.os rc rc.conf weekly /etc

as instructed in the Upgrade Guide, root:wheel is the outcome.

Uwe
Re: Check for localtime?
On Fri, May 6, 2011 at 3:33 PM, Otto Moerbeek o...@drijf.net wrote:

The original files didn't differ, but the install replaced Singapore, to which /etc/localtime symlinks. So after resolving symlinks /etc/localtime did change. Your copy in your postfix install wasn't changed.

Thanks, Otto, for the explanation. Topic closed.

Uwe
softraid - best practice?
Just had a problem with softraid on a 4.6 box. No, I am not asking to have it solved; it needed urgent replacement, and so I did that. What I would like to ask for is advice on best practices for softraid under OpenBSD, to prevent similar things from happening again; hints on how to set it up better, and mostly: how to recover it better.

What happened was that some slices in a softraid simply went away after a power surge. In detail: sd1 and sd2 were set to RAID, and the ensuing RAID1 (sd3) was sliced up into a number of /usr/, /var, /home/, /var/www, /var/mail, swap. After the reboot following the power surge, two of the slices (/var/mail, sd3g and /home, sd3h) were simply unavailable, couldn't be 'mount -a'-ed at reboot, and the system fell back to only '/' being mounted (on sd0). Strangely, though, disklabel sd3 showed the slices, as sd3g, sd3h. But they could not be accessed at all, and were not visible under /dev/. Still, an unexpected behaviour as far as I am concerned, even more so since sysctl and bioctl showed an 'OK' and 'Online' softraid.

I tried a few things, like fsck_ffs on these two disappeared slices, as well as on the 'good' ones. The good ones were good, also with fsck_ffs -f. But the two that had gone missing were just not available (as devices). Then I made, I guess, a big mistake, and instead of ripping out one of the drives, I 'bioctl -d'-ed sd3; leaving 2 drives with a RAID file system on them. Over.

Now, please, any suggestions on how to do better next time something like this happens?

Thanks in advance,
Uwe
Re: Installer bug? - Upgrade 4.6 to 4.7 failed to upgrade base47, on i386 and amd64
Tony Abernethy tony at servasoftware.com writes: Might be better to read and comprehend ``man patch'' before assuming limitations on the scope of patch's reach. It is always so nice to trample on the person lying on the ground, ain't it! Where in 'man patch' is the underlying problem addressed? - oh, yeah, maybe mine is the old version, again ... .
Re: Installer bug? - Upgrade 4.6 to 4.7 failed to upgrade base47, on i386 and amd64
Philip Guenther guenther at gmail.com writes: Please point to the part of the Upgrade Guide which talks about building from source, untarring the src tar file, or applying errata. I can't seem to find any such reference, but I'm sure it's in there somewhere, because you originally said that you did the upgrade exactly following the upgrade guide and, as we found out later, your steps included building from source. You misread what I did. I was following the Upgrade Guide to the dot, following Applying patches in OpenBSD to the dot, and then the instructions in the patch files. To the dot. This is where my unfortunate quarrel with Jacob came from, when he said I was building userland, and I insisted I was applying errata. From what I have learned, Applying patches in OpenBSD should be removed from the FAQ, since we would all have been spared this thread. It obviously has carried a number of people away. Uwe
Re: Installer bug? - Upgrade 4.6 to 4.7 failed to upgrade base47, on i386 and amd64
Philip Guenther guenther at gmail.com writes:

You now have and now it seems the core discussion is just about whether (or where) an additional rm -rf /usr/obj/* should be added to help people that know enough to set up the source tree for building/patching by untaring src.tar.gz but don't know to remove the obj tree at the same time.

So, no diff here, but a suggestion: If one needs to avoid stale stuff lying around in /usr/obj when applying a patch, the only logical consequence is to clean out all /obj totally, even before applying a single patch. If I am correct, the instructions should be clear for 00N_ThisApp.patch:

Apply by doing:
cd /usr/src
patch -p0 < 00N_ThisApp.patch

Clean the build directories by issuing the command
/usr/sbin/mk_build_clean

And then rebuild and install the library and statically-linked binaries that depend upon it:
cd lib/libThisApp
make obj
make depend
make includes
make
make install
cd ../../sbin
make obj
make depend
make
make install

where mk_build_clean is just the set of steps pointed out in 'man release', respectively in FAQ5. To me, and I guess Richard Toohey, the case is solved. Everyone who can read, and likes following instructions, can read and follow this easily.

Uwe
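The proposed mk_build_clean does not exist in OpenBSD; it is only a name suggested in this message. As a rough sketch of what such a helper might contain, assuming it does nothing more than the rm -rf /usr/obj/* mentioned earlier in the thread, the function below (clean_objdir is a hypothetical name) empties an object directory and leaves the directory itself in place.

```sh
#!/bin/sh
# Hypothetical helper, not part of OpenBSD: empties an object directory
# (default /usr/obj) so that no stale build results survive a patch run.
clean_objdir() {
    objdir="${1:-/usr/obj}"

    if [ ! -d "$objdir" ]; then
        echo "clean_objdir: no such directory: $objdir" >&2
        return 1
    fi

    # ${objdir:?} aborts instead of expanding to "/*" should the
    # variable ever be empty. Dotfiles at the top level are left alone.
    rm -rf "${objdir:?}"/*
}
```

On a real system it would be run as root, once, before the first patch of a new release is applied: clean_objdir /usr/obj. The subsequent 'make obj' steps from the errata instructions recreate what is needed.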
Re: Installer bug? - Upgrade 4.6 to 4.7 failed to upgrade base47, on i386 and amd64
On Thu, Jun 3, 2010 at 7:18 PM, Richard Toohey richardtoo...@paradise.net.nz wrote: OK, I've tried it and cannot reproduce what you see. I've never done an upgrade from bsd.rd before, so wanted to give it a go. Obviously something different with your set-up, or where you got the files from, or factor X - but as other people have said, they can't guess what. In short - the basic bsd.rd follow these instructions worked for me here. OK, I start with 4.6 amd64 (either 4.6 or just pre-4.6 release) uname -a OpenBSD dellamd64.home 4.6 GENERIC#0 amd64 But before I upgrade, what's /sbin/pfctl? $ ls -l /sbin/pfctl -r-xr-xr-x 1 root bin 492664 Dec 3 23:12 /sbin/pfctl $ md5 /sbin/pfctl MD5 (/sbin/pfctl) = 3e1fa4f69809adff432f9da62010a6a7 http://openbsd.org/faq/upgrade47.html One easy way to boot from the install kernel is to place the 4.7 version of bsd.rd in the root of your boot drive, then instruct the boot loader to boot using this new bsd.rd file. On amd64 and i386, you do this by entering boot bsd.rd at the initial boot prompt. OK, I'll get the bsd.rd from the 4.7 release CD (but could have used FTP.) /usr/bin/su root mv /bsd.rd /bsd46.rd mount /dev/cd0a /mnt/ cp /mnt/4.7/amd64/bsd.rd /bsd.rd umount /mnt eject /dev/cd0a reboot ... boot boot bsd.rd ... Welcome to the OpenBSD/amd64 4.7 installation program. ... I choose upgrade ... take defaults all the way until ... Location of sets? [What do I do here? I'll try http, and take the defaults. What did YOU do here?] bsd, bsd.rd, base47.tgz, misc47.tgz, comp47.tgz, man47.tgz, game47.tgz, xbase47.tgz xshare47.tgz, xfont47.tgz, xserv47.tgz ... all get to 100% no errors. ... rest of install, reboot ... $ uname -a OpenBSD dellamd64.home 4.7 GENERIC#112 amd64 $ ls -l /sbin/pfctl -r-xr-xr-x 1 root bin 500856 Mar 18 15:36 /sbin/pfctl $ md5 /sbin/pfctl MD5 (/sbin/pfctl) = 7720c9a4dc100fe29d2d3c4a16954eb4 Thanks, Richard. No, you couldn't encounter it. It comes in later. 
I now have the whole upgrade session of my third machine; the log is 2 MB. Whenever I rebooted, it was okay:

1. reboot to start bsd.rd - okay
2. reboot directly after bsd.rd upgrade - okay
3. reboot after 'Final steps', before pkg_add - okay
4. reboot after 'Upgrading packages' - okay
5. reboot after patching - old files and wrong timestamps - bummer, as Theo might say.

I wonder if I can put the file up in the open, or if it contains security-related matter? As bz2 it is just 91 k; I will of course make it available to individuals on request.

Uwe
Re: Installer bug? - Upgrade 4.6 to 4.7 failed to upgrade base47, on i386 and amd64
On Fri, Jun 4, 2010 at 2:18 PM, Uwe Dippel udip...@gmail.com wrote: 5. reboot after patching - old files and wrong timestamps - bummer, as Theo might say. Sorry, guys, (too quick as too often), just cat-grep pfctl shows where the old one comes in: pfctl: pf already enabled # ls -l # ls -l /etc/# ls -l /sbin/pfctl -r-xr-xr-x 1 root bin 500856 Mar 18 10:36 /sbin/pfctl # ls -l # ls -l /sbin/ # ls -l # ls -l /sbin/pfctl # ls -l /sbin/pfctl -r-xr-xr-x 1 root bin 500856 Mar 18 10:36 /sbin/pfctl -r-xr-xr-x 1 root bin500856 Mar 18 10:36 pfctl # pfctl -f # pfctl -f /etc/# pfctl -f # pfctl -f /etc/pf.conf# pfctl -f /etc/pf.conf # # pfctl -e pfctl: pf already enabled # # pfctl -d # # pfctl -e === pfctl /usr/src/sbin/pfctl/obj - /usr/obj/sbin/pfctl === pfctl === pfctl nroff -Tascii -mandoc /usr/src/sbin/pfctl/pfctl.8 pfctl.cat8 === pfctl install -c -s -o root -g bin -m 555 pfctl /sbin/pfctl install -c -o root -g bin -m 444 pfctl.cat8 /usr/share/man/cat8/pfctl.0 pfctl: DIOCBEGINADDRS: Operation not supported by device pfctl: DIOCBEGINADDRS: Operation not supported by device -r-xr-xr-x 1 root bin492664 Jun 4 13:28 pfctl Through patching outdated(?) source files; though with the proper time stamp: # cd /home/ftp/pub/OpenBSD/4.7 # ls -l -r-xr-xr-x 1 root ftp 131759003 Mar 21 19:17 src.tar.gz -rw-r--r-- 1 root ftp 20668814 Mar 21 19:17 sys.tar.gz # md5 src.tar.gz MD5 (src.tar.gz) = 5214cd951cac5b7fbd89c968d1b5f859 # md5 sys.tar.gz MD5 (sys.tar.gz) = 566c0cd7c3d2f28b17a9795324ead6ff Maybe TeXitoi was right, after all, when he mentioned corrupted files on some mirrors? I wouldn't bet on it, but usually our fastest mirror here is ftp://ftp.jaist.ac.jp. Uwe
Re: Installer bug? - Upgrade 4.6 to 4.7 failed to upgrade base47, on i386 and amd64
On Fri, Jun 4, 2010 at 2:41 PM, patrick keshishian pkesh...@gmail.com wrote:

you mean applying the errata47.html patches? If so, are you certain your source tree is tagged OPENBSD_4_7 and not anything else?

Do I understand you correctly? I am not building releases. I am installing/downloading the sets; then I do all the stuff in 'Upgrade guide', then

rm -Rf * in /usr/src
rm -Rf * in /usr/xenocara
rm -Rf * in /usr/ports

and then tar ... the source files meticulously as pointed out in the guide:

# cd /usr/src
# tar xzf ../sys.tar.gz
# tar xzf ../src.tar.gz
# cd /usr
# tar xzf xenocara.tar.gz
# tar xzf ports.tar.gz

and then download the patches into /usr/src, then apply them, and this is what you would see in the serial log. So I don't tag, because I don't cvs; what I do is just download the 4 files. (Also see my other post, it points clearly to the sequence and the reboots done, with always checking pfctl after each reboot.)

Uwe
Re: Installer bug? - Upgrade 4.6 to 4.7 failed to upgrade base47, on i386 and amd64
On Fri, Jun 4, 2010 at 3:00 PM, Richard Toohey richardtoo...@paradise.net.nz wrote: I think we are getting closer, aren't we? So, NOTHING to do with the actual upgrade, is it? No absolutely nothing. I withdraw the subject with regret. At least the 'base47'-part thereof. Or the ports/packages. I guess, not. It is something to do with how you are PATCHING after an upgrade. You don't mention where/when you get the source you patch? Because that would be a separate step, wouldn't it? (I usually install from CD, so I scrub /usr/src load from src.tar.gz on the CD.) Exactly. Just explained it in the previous post, and don't want to repeat myself. Except that I download, and then the actual files used were: # md5 /usr/src.tar.gz MD5 (/usr/src.tar.gz) = 5214cd951cac5b7fbd89c968d1b5f859 # md5 /usr/sys.tar.gz MD5 (/usr/sys.tar.gz) = 566c0cd7c3d2f28b17a9795324ead6ff (Here, contrary to the earlier post, at the actual location in /usr of the target machine before extraction: # ls -l /usr/s* -rw-r--r-- 1 root wheel 131759003 Mar 21 19:17 /usr/src.tar.gz -rw-r--r-- 1 root wheel 20668814 Mar 21 19:17 /usr/sys.tar.gz) Uwe
Re: Installer bug? - Upgrade 4.6 to 4.7 failed to upgrade base47, on i386 and amd64
On Fri, Jun 4, 2010 at 4:32 PM, patrick keshishian pkesh...@gmail.com wrote: Where did you get those tar-balls from? Those are most likely not 4.7 sources. I gave the potential link and their md5 sums further up. Our link here is sooo slooow; I *am* currently downloading the archives from ftp://ftp.openbsd.org/pub/OpenBSD/4.7 to compare the checksums. That would explain a lot (though not everything, since the kernel looks pretty correct: 4.7). Maybe someone is faster and can confirm or refute the authenticity of the archives? Uwe
Re: Installer bug? - Upgrade 4.6 to 4.7 failed to upgrade base47, on i386 and amd64
On Fri, Jun 4, 2010 at 4:22 PM, Eric Faurot e...@faurot.net wrote:

Don't you have old stuff lying around in /usr/obj that gets installed over your new binaries?

That's probably the critical question now. Though, sorry to say, it is nowhere written that you have to rm -Rf it when you
- upgrade
- patch

Actually, I wasn't even aware of the existence of this directory until several minutes ago. (I expected it to be cleaned when wiping the source directories.) Then, according to what is written by a number of people further up, a number of people could be hit by this. I for one would expect the time stamps to take care of that. And, to stress it again: When you are under 'quality control', and responsible for the uptime of a system, you would never do anything outside the scope of the instructions, naturally; especially not some rm -Rf * in a directory of your arbitrary choice. ;)

And don't point me to man release, please. I am not doing releases, I am not doing stable; I do, like many others, 'Upgrade Guide X.Y to X.Z', and then get and apply the errata from http://openbsd.org/errataXZ.html, according to their instructions. If this happens to be wrong, by all means, then I am making a mistake, and have been making this mistake for 5 years. So, rm -Rf * in /usr/obj is necessary?

Uwe
Re: Installer bug? - Upgrade 4.6 to 4.7 failed to upgrade base47, on i386 and amd64
Jacob Meuser jakemsr at sdf.lonestar.org writes:

oh good grief. you had a dirty /usr/obj. just look at the pfctl snippet of the log you posted. do you see pfctl being built? do you see pfctl being installed from /usr/obj?

Oh, yes. So the blame is on my side, I guess. Mea culpa maxima! I didn't know that the object directories need to be cleaned manually. Until yesterday, I would have taken a bet that the object directories lie within the source trees (/usr/xenocara, /usr/src), and are cleaned when cleaning the sources. Now I am aware that I need to know the location of the object directories and clean them manually. I was totally unaware that, in case of a patch, the installer would take the next best file of the correct name from there, irrespective of the underlying version. Though I feel in good company. I guess a great number of people on this list were in a similar situation.

Knowing the 'social contract' of OpenBSD, I only have myself to blame for my ignorance. Still, may I suggest that the next Upgrade Guide gets an extra line, with a remark pointing out the existence of /usr/obj, and the suggestion to clean it? Also, with respect to the 'errata', the patches: they describe in detail what needs to be done. Maybe here, too, it could be suggested that before applying the first patch of a new version of OpenBSD, /usr/obj should be cleaned, or be verified to be clean?

Thanks to the various people who patiently helped me analyse this problem to the very end!

Uwe
Re: Installer bug? - Upgrade 4.6 to 4.7 failed to upgrade base47, on i386 and amd64
Theo de Raadt deraadt at cvs.openbsd.org writes:

A chit-chat on a public mailing list isn't going to find this supposed bug. Why discuss it? Why not just keep prove it happened.

Yes, Theo. Though: How? This is what I tried to find out. I showed the list of files. Do you assume I tinkered with it? Why should I? pfctl wasn't working correctly. Without the help of the list, I wouldn't have been able to drill it down to some 70 files being of the previous version. Thanks to everyone who helped!

Don't you see how tiring it is to discuss it when we've seen no evidence?

It might be tiring, but what evidence do you want? Here, I want to solve a problem of files missing. Since I followed the Upgrade guide to the dot, rebooted to bsd.rd in the beginning, rebooted at the command prompt, we (I) need to find what went wrong. That's all. I don't even mind if the mistake was on my side; then I could learn. So, please, specify the evidence that you need.

Uwe
Re: Installer bug? - Upgrade 4.6 to 4.7 failed to upgrade base47, on i386 and amd64
On Thu, Jun 3, 2010 at 3:20 PM, Tony Abernethy t...@servasoftware.com wrote:

The error message(s) you are suppressing (or maybe didn't see) About the only way you can get some files but not all files from a tarball is some fatal error in the extraction of the tarball. Any such error tends to give an error message. I don't think this list likes to play guessing games as to exactly what mistakes you have made or what evidence you are suppressing.

Oh, how beautiful! This is a sign of mutual trust. I documented everything from the first pfctl after reboot from upgrade not working, for which I am chided; and still, I am supposed to 'suppress evidence'. How nice of you! And if I present a serial log, will I be suspected of having tampered with it? No, that seriously turns me off.

I have given everything in detail that I came across; I have not been silent about any additional message, any unusual activity. I have stated a few times that I followed the upgrade procedure to the dot, I have confirmed that nothing unusual showed. Over.

I might have made some mistake, yes. Even though these same boxes have been upgraded since 3.8, never mind. I could at all times have made a mistake or overlooked something. But to start kind of asking for 'proof', that's what's ridiculous, to cite Theo. I am willing to give individuals unprivileged access to the boxes, I did this before, to look around.

When you have a box that is relevant to your company, and you are responsible for it, and you noticed something unusual, why would you not want to come screaming to the list (like I did, my excuses), to look for help, but 'conveniently avoid' mentioning that serious error message during the upgrade? You need the box to be up and running, and adding this error message can only help; so why would you suppress it, maybe preventing efficient help from being offered? I fully agree that what I report sounds highly unlikely.
But it is true, and by now I have confirmed exactly the same having happened on that other box (i386). If I suppressed anything, why should I add to the improbability? Yes, it happened, and I applied the same method and have by now tar xzvphf-ed the 70+ sbin-files that were there and - identically to the previous amd64 - are from version 4.6. It is not excluded, as I wrote earlier, that the upgrade itself does everything 100% correctly. Who knows, there can always be a rogue package. Not that I'm saying this has happened, but theoretically, any package could contain 4.6-files and write them back at pkg_add.

Uwe
Re: pfctl not working in 4.7: DIOCBEGINADDRS and DIOCXCOMMIT
Damon McMahon damon.mcmahon at gmail.com writes:

Probably no help, but I had similar happen to me upgrading 4.5-4.6 a few months ago. Similar problem with pfctl after a diligent upgrade, and like you I have been following the upgrade procedure diligently since 3.something. I checked the timestamp on pfctl, it didn't seem right so I built from source and installed and the issue went away, so I assumed I had something wrong and thought nothing of it as generally if OpenBSD f*cks up it's down to me and not the developers

Thanks so much! You saved my upcoming weekend. So I am not hallucinating. Of course, I never expected the developers to solve *my* problems; I only wanted to exclude a bug hitting others. You're better than me, and I only learned yesterday that the install-set files come back with the time stamp of creation. Had I known this, a lot could have been spared, because my second post already showed a lot of bad time stamps in /sbin. This 'strange' time stamp (see my earlier post) of May 31 20:28 still prevails in a number of files on my box:

# ls -lR | grep 'May 31 20:28' | wc -l
153

*after* I replaced the differing files in /sbin.

I'd be interested to see if there's a common thread here, particularly before I upgrade this box to 4.7 which (like yours) is a remote box which will be upgraded over serial.

I guess so. By now I'll have the list of the files affecting amd64 and i386 in these cases, and you should be able to correct everything using these lists. Maybe interesting enough, this timestamp was *not* when I upgraded the sets. It was the time when I upgraded the packages, respectively applied the patches. But from now on I'll be mum (I hope I can!) on this topic until a complete explanation is available.

Uwe
Re: Installer bug? - Upgrade 4.6 to 4.7 failed to upgrade base47, on i386 and amd64
Getting closer ... Extracted the archive being used for the upgrade to amd64 into my user-directory and calculated all 7484 md5 for the files included in base47, and redirected those into a file. Then, I calculated all the md5 for the files *installed* in the upgraded machine; the file names taken from the same base47. Then, I had two files of 7484 md5s each, and could diff them. Further down is the result. I'm stumped. Why would these files (around 100) not be the 4.7 version, but previous (4.6 I guess; I haven't checked all). (Add /etc/pfctl to the list of different files; I had already manually copied it to /sbin from the archive to get the firewall working) So *all* sets were installed, in principle, but some files were not. Huh? I am sure, some people with more insight can help me further to explain what is going on here. What makes or made these files below so 'special' that they fail to be 'just there' after the upgrade on amd64? Thanks for any further hint, Uwe $ diff md5sums_archive md5sums_install 1c1 ./usr/lib/libasn1.so.17.0 f07fcaad530dd9632ef7de1491ed6bd3 --- ./usr/lib/libasn1.so.17.0 aa2c929c805b55008bba1bc942483b01 3,4c3,4 ./usr/lib/libcom_err.so.17.0 f07fcaad530dd9632ef7de1491ed6bd3 ./usr/lib/libcrypto.so.18.0 4280f48657120e382c01ca4c1a8aafc4 --- ./usr/lib/libcom_err.so.17.0 aa2c929c805b55008bba1bc942483b01 ./usr/lib/libcrypto.so.18.0 5f38e49397b845acdf818c520953eb0e 14,15c14,15 ./usr/lib/libkafs.so.17.0 f07fcaad530dd9632ef7de1491ed6bd3 ./usr/lib/libkrb5.so.17.0 f07fcaad530dd9632ef7de1491ed6bd3 --- ./usr/lib/libkafs.so.17.0 aa2c929c805b55008bba1bc942483b01 ./usr/lib/libkrb5.so.17.0 aa2c929c805b55008bba1bc942483b01 34c34 ./usr/lib/libssl.so.15.1 baa09f0512fbe6ecb1519de10ed6a8a4 --- ./usr/lib/libssl.so.15.1 e3dcfdfc876252231bd8994c3f0c6f1d 192,195c192,195 ./sbin/atactl 6ba0fc88a2cf2ad11bf5bdfee76b5bc5 ./sbin/badsect 4914ef0ea057a00d8b9ae91eaf894af5 ./sbin/bioctl ce8467b4415309be9b447405938fcda0 ./sbin/ccdconfig 08123bd59420bf542011b5a72c6f896c --- 
./sbin/atactl e214991d640840544a90595b661b6378 ./sbin/badsect a87ca3adea353c650c15ff7db2059c94 ./sbin/bioctl 6737ac6873c92917779282cf3f3e8cb5 ./sbin/ccdconfig af51c2c19995759fcfd4f8ed6c7ab64d 197,198c197,198 ./sbin/clri 721d8795ff051d72ee67175902c53dd6 ./sbin/dhclient 3e1f06ca19aa5aceec214d2891fa504e --- ./sbin/clri afef2851d33038cb9de7c21f2d6dc037 ./sbin/dhclient 5f6a023ce04f9fc3611a56c1752c5c30 200,219c200,219 ./sbin/disklabel 3162a316caada7f5ebdd8b07d5722cb2 ./sbin/dmesg 3acb23453982bdf9974c4f3abb8d6dfd ./sbin/dump e24875bd59e468780c4f53dc8685befc ./sbin/dumpfs 5cc5b40775147860423cddf45e264fac ./sbin/fdisk 6a1a875a41cceed2874e2b1e861b0257 ./sbin/fsck 6fd26d79982bfe5d91d2f450ea495a19 ./sbin/fsck_ext2fs 5dbb372ffbdb958bcfbd8d24811c4f87 ./sbin/fsck_ffs 7bba80258056a3a40b51d24a63d4e5de ./sbin/fsck_msdos 85eacde50cecb85043601e05aac5a606 ./sbin/fsdb e46a8fa824d753af715cae0f8e4a8049 ./sbin/fsirand 65018fa13f98e1de12f5dfdcfc59cafd ./sbin/growfs 7e7ba9034167529de5cef12497e2228b ./sbin/halt e8612dfe7b7703188cb887852b073fe7 ./sbin/ifconfig bc731472da980771e922604a7f76bb7e ./sbin/init a6e6bf349857e9addcce114f5cbeebea ./sbin/iopctl b1ffd69049a845e749f1fdff490045be ./sbin/ipsecctl acdee246db653efa457193d9d7be195b ./sbin/isakmpd 6e8462f8a4c3cfc2901dbe3163c9f857 ./sbin/kbd b7da651953889ab863042dd1e05976dc ./sbin/ldattach b0b97a2496c0c2593c437842cb29d9df --- ./sbin/disklabel b6455e58788253af334bda563c12ca12 ./sbin/dmesg 4a9f96f0a968f616a4dda156ec1572f4 ./sbin/dump bdbfcd38d79289f81f23059cfb6156ea ./sbin/dumpfs 847cde118bbff6e12981ec92270aabcc ./sbin/fdisk 7b0d0a7788e323811c91c92761c7244f ./sbin/fsck 8105d9fc124a57dd343ba97d19c9fc48 ./sbin/fsck_ext2fs ed161578a1777c598c10bb6963d0b7b4 ./sbin/fsck_ffs 1b978655ccdcbf54e78c8febc2b8808b ./sbin/fsck_msdos 43bc067c65f648041f8ade25ddd077d1 ./sbin/fsdb 8f720b110108c74f55b69935a20adfa6 ./sbin/fsirand d39bf0252bfaad9aa256dbf294ede7da ./sbin/growfs d129af4e9526b87992de226da5f1e184 ./sbin/halt 2d0046c3e383d785b856d1cb0dbe7e5a ./sbin/ifconfig 
35e192bac398bf47ddf8e0a190f6b06a ./sbin/init 37d5ca74a94642c48f2278c17420bf76 ./sbin/iopctl 04b18862d04525f6a53324694180592f ./sbin/ipsecctl 0f78f6df80715707bcd0dca44199debe ./sbin/isakmpd 9093d66c257145221ce33f4114ca3507 ./sbin/kbd d0e6b82ecadad09eab297ce032fe1d70 ./sbin/ldattach 04eace371d1dc317b289da273a311c10 221,242c221,242 ./sbin/lmccontrol 2c9a1f7a4cb9af7d9ceaf47d9482eb8b ./sbin/mkfifo 7ebd0d605fb65d8acdce0b1542b7a949 ./sbin/mknod 7ebd0d605fb65d8acdce0b1542b7a949 ./sbin/modload bca677810f776226d24832fa2a118609 ./sbin/modunload 95e4adeda57e7c52f240f136e092eb7b ./sbin/mount 6903ddec325432d73f65c80a56a9aef3 ./sbin/mount_cd9660 cb343f92845ad398d2cc3e4262934030 ./sbin/mount_ext2fs dcb91a3f42126fb96ece261f9a3db010 ./sbin/mount_ffs 6fbd41195622e084f3b9ace630d73d2d ./sbin/mount_mfs c707d5acad7bc11fc5feeb4f4841a1e0 ./sbin/mount_msdos c1d5742cc20e13b2b195f9d8d32c7769 ./sbin/mount_nfs
Re: Installer bug? - Upgrade 4.6 to 4.7 failed to upgrade base47, on i386 and amd64
Now, with

$ diff md5sums_archive md5sums_install | grep ^ | cut -d ' ' -f2

these are the files that differ on amd64 between what the archive supplied and what the installer left behind:

./usr/lib/libasn1.so.17.0
./usr/lib/libcom_err.so.17.0
./usr/lib/libcrypto.so.18.0
./usr/lib/libkafs.so.17.0
./usr/lib/libkrb5.so.17.0
./usr/lib/libssl.so.15.1
./sbin/atactl
./sbin/badsect
./sbin/bioctl
./sbin/ccdconfig
./sbin/clri
./sbin/dhclient
./sbin/disklabel
./sbin/dmesg
./sbin/dump
./sbin/dumpfs
./sbin/fdisk
./sbin/fsck
./sbin/fsck_ext2fs
./sbin/fsck_ffs
./sbin/fsck_msdos
./sbin/fsdb
./sbin/fsirand
./sbin/growfs
./sbin/halt
./sbin/ifconfig
./sbin/init
./sbin/iopctl
./sbin/ipsecctl
./sbin/isakmpd
./sbin/kbd
./sbin/ldattach
./sbin/lmccontrol
./sbin/mkfifo
./sbin/mknod
./sbin/modload
./sbin/modunload
./sbin/mount
./sbin/mount_cd9660
./sbin/mount_ext2fs
./sbin/mount_ffs
./sbin/mount_mfs
./sbin/mount_msdos
./sbin/mount_nfs
./sbin/mount_nnpfs
./sbin/mount_ntfs
./sbin/mount_portal
./sbin/mount_procfs
./sbin/mount_udf
./sbin/mount_vnd
./sbin/mountd
./sbin/ncheck
./sbin/ncheck_ffs
./sbin/newfs
./sbin/newfs_msdos
./sbin/nfsd
./sbin/nologin
./sbin/pfctl
./sbin/pflogd
./sbin/ping
./sbin/ping6
./sbin/quotacheck
./sbin/raidctl
./sbin/rdump
./sbin/reboot
./sbin/restore
./sbin/route
./sbin/rrestore
./sbin/rtsol
./sbin/savecore
./sbin/scan_ffs
./sbin/scsi
./sbin/shutdown
./sbin/slattach
./sbin/swapctl
./sbin/swapon
./sbin/sysctl
./sbin/ttyflags
./sbin/tunefs
./sbin/umount
./sbin/vnconfig
./sbin/wpa-psk
./sbin/wsconsctl
./usr/libexec/kdc
./usr/sbin/sysctl

Let's assume for a moment that the differences in the Kerberos and crypto files are a result of the patches and packages; everything else that differs is the majority of the files in /sbin. A closer inspection of the differences there confirms what was assumed before: all files in /sbin, except /sbin/ifconfig, /sbin/ipsecctl and /sbin/isakmpd, are the files of the 4.6 release.

Waiting for some further enlightenment about what was going on, and what happened to those 4.7 files,

Uwe
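The comparison of the two checksum lists can also be sketched as a small shell function. This is only a minimal sketch, not the pipeline actually used above: it assumes each list holds one "path checksum" pair per line with no spaces in the path, and the name changed_files is made up for illustration.

```shell
# changed_files ARCHIVE_LIST INSTALL_LIST
# Prints, sorted, every path whose checksum differs between the two lists,
# plus every path that appears in the first list but not in the second.
# Assumed (hypothetical) line format: "<path> <checksum>", no spaces in paths.
changed_files() {
    awk '
        # First file: remember the checksum recorded for each path.
        NR == FNR { sum[$1] = $2; next }
        # Second file: report paths whose checksum changed.
        # Appending "" forces a string comparison of the checksums.
        ($1 in sum) && (sum[$1] "") != ($2 "") { print $1 }
        { seen[$1] = 1 }
        # Paths listed in the first file that never showed up in the second.
        END { for (p in sum) if (!(p in seen)) print p }
    ' "$1" "$2" | sort
}
```

Called as `changed_files md5sums_archive md5sums_install`, it would print one path per line, provided the lists really have the assumed layout.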
Re: Installer bug? - Upgrade 4.6 to 4.7 failed to upgrade base47, on i386 and amd64
Nick Holland nick at holland-consulting.net writes:

>> There is one more machine (amd64) that needs to be upgraded. Before I do this, I rather solicit suggestions on how to log the upgrade process, debug it, or otherwise.
>
> serial console. Log everything from the first chars out the serial port to the reboot. In fact, log the reboot. Don't edit anything. Use a public mirror or an official CD for the install, make sure it is obvious which. Stick the resulting file on a webserver.

Thanks, Nick. Based on the latest results, the problem seems to exist only for most of the /sbin files. So the upgrade runs through as programmed. With a public mirror, it will take hours. I really hope SHA256 is good enough to confirm the integrity of the archives.

A serial console seems a good idea; I have to use it in any case. What I have in mind is to use the command prompt before the reboot to check the files in the /sbin-to-be. I have a hunch that they'll be there then. Then I'll do the same after the reboot, and once again after the package upgrade. Should the phenomenon show again, I can by now imagine that the changes happen some time later. We'll see ...

Uwe
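One way to make the three planned checks (before the reboot, after the reboot, after the package upgrade) comparable is to record a checksum listing each time. A minimal sketch, assuming find, cksum, awk and sort are available where it runs, which may not hold at the installer's shell prompt; snapshot is a hypothetical helper name, and paths are assumed to contain no spaces:

```shell
# snapshot DIR
# Prints one "<path> <crc> <size>" line per regular file below DIR,
# sorted by path, so that two listings can be compared with diff(1).
snapshot() {
    # cksum prints "<crc> <size> <path>"; awk moves the path to the front.
    find "$1" -type f -exec cksum {} + | awk '{ print $3, $1, $2 }' | sort
}
```

For example, `snapshot /mnt/sbin > sbin.before` ahead of the reboot and `snapshot /sbin > sbin.after` afterwards (both output file names are placeholders); note that the two listings then carry different path prefixes, which has to be accounted for when comparing them.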
Re: pfctl not working in 4.7: DIOCBEGINADDRS and DIOCXCOMMIT
On 06/01/2010 05:32 AM, Philip Guenther wrote:

> Was there a common thread to what did turn up? My recall is that basically every time people get "Operation not supported by device" errors from pfctl, it's because their userland and kernel don't match. Review your upgrade procedure, because it's clearly broken.

Thanks for your help, seriously. And I don't want to start arguing, not at all, but this is one of my production boxes, without access, and I have been running the 'boot bsd.rd' upgrades twice a year since 3.8. Being production, I watched diligently, and saw with my own eyes the asterisks advancing. I can only say that I followed standard procedures, if just for my own sanity.

I *am* losing the latter, because it seems that all files in /sbin are identical to those on my box still on 4.6, though something happened to them yesterday.

(This is my 4.6 box, upgraded only on April 19th:)

$ ls -l /sbin/p*
-r-xr-xr-x 1 root bin 492664 Apr 19 13:44 /sbin/pfctl
-r-xr-xr-x 1 root bin 390264 Apr 19 13:44 /sbin/pflogd
-r-sr-xr-x 1 root bin 210040 Apr 19 13:44 /sbin/ping
-r-sr-xr-x 1 root bin 234616 Apr 19 13:44 /sbin/ping6

(This is my box upgraded yesterday, May 31st, to 4.7:)

# ls -l /sbin/p*
-r-xr-xr-x 1 root bin 492664 May 31 20:28 /sbin/pfctl
-r-xr-xr-x 1 root bin 390264 May 31 20:28 /sbin/pflogd
-r-sr-xr-x 1 root bin 210040 May 31 20:28 /sbin/ping
-r-sr-xr-x 1 root bin 234616 May 31 20:28 /sbin/ping6

So it did something, but from where did it get the old files? I guess not from a mistake on my side, because I accepted the upgrade path in the upgrade shell. Plus:

OpenBSD 4.7 (GENERIC.MP) #130: Wed Mar 17 20:48:50 MDT 2010
    dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP

I never copied any files down here myself. As I mentioned: production, upgraded twice per year through the serial console.

And now my sanity seems to fade: I did the same to one of my i386 boxen, and exactly the same happens there!! (Please, now I am starting to lose the ground under my feet!)

This is after the update to 4.7, i386, in front of the screen! (mnt is /altroot, mounted just now to check, since pfctl did the same thing again here.)

# ls -l /mnt/sbin/p*
-r-xr-xr-x 1 root bin 422648 Apr 19 12:51 /mnt/sbin/pfctl
-r-xr-xr-x 1 root bin 328440 Apr 19 12:51 /mnt/sbin/pflogd
-r-sr-xr-x 1 root bin 180984 Apr 19 12:51 /mnt/sbin/ping
-r-sr-xr-x 1 root bin 197368 Apr 19 12:51 /mnt/sbin/ping6
# ls -l /sbin/p*
-r-xr-xr-x 1 root bin 422648 Jun 1 12:54 /sbin/pfctl
-r-xr-xr-x 1 root bin 328440 Jun 1 12:54 /sbin/pflogd
-r-sr-xr-x 1 root bin 180984 Jun 1 12:54 /sbin/ping
-r-sr-xr-x 1 root bin 197368 Jun 1 12:54 /sbin/ping6

A mix-up of versions? I don't think so, because

$ tar xzf /home/ftp/pub/OpenBSD/4.7/amd64/base47.tgz ./sbin/pfctl
$ md5 sbin/pfctl
MD5 (sbin/pfctl) = 7720c9a4dc100fe29d2d3c4a16954eb4

is exactly what you had. I can no longer exclude a bug. Maybe under some circumstances the files are not overwritten but only touched, or whatnot?

This leaves me with two questions:

1. How to debug what goes on?
2. (and more important for me): What to do? Should I tar xzvphf {file}47.tgz, or try a new upgrade?

Uwe
Re: pfctl not working in 4.7: DIOCBEGINADDRS and DIOCXCOMMIT
Joachim Schipper joachim at joachimschipper.nl writes:

> Just untarring the release should work, but it's still odd. At least the md5sum of pfctl matches what I just downloaded, so that seems fine; did you actually use *that* tarball, though? (Note that the right pfctl binary is 500856 bytes long.) Are you sure that you upgraded the right disk?

Yep. When I untar the files (I have them locally on a webserver: ftp://metalab.uniten.edu.my/pub/OpenBSD/4.7/), all files come out perfectly well, as above. I did the upgrades using this URL; I am sure it was these files, because they only exist once locally (the speed with which the updates were done is proof that I used these local resources, downloaded by myself). In the upgrade procedure I only added the (internal) IP for that server, accepting all else. And it can't be 4.6 that I used, because the installed (upgraded) kernel is 4.7.

I need to repeat, this is a remote production machine with serial access. I have no desire ever to do anything that is not along clear procedures, and I followed the Upgrade Guide 4.6-4.7 meticulously (system administration is part of my job description), even ticking off point after point on the printout of the upgrade guide.

So something was done to the files: at least they have the new time stamp, and some files have actually been installed correctly (kernels), as the hash sums show. So, finally, I *was* in the right directory and installed to the correct disk. Here are the kernels on the first machine, which seemingly has the previous 'base' throughout:

# cksum -a sha256 bsd
SHA256 (bsd) = e2af09ed48d1d94bec27aa4c18ffa6172d8435a190c3abecae53d26940ed9536
# cksum -a sha256 bsd.sp
SHA256 (bsd.sp) = a34175b766d6ea9cefcc0903efa51c4dc3d87018b1e2f85c2333133ed25e9ff4

Now I wonder if the problem was with the untar? Maybe not all sets have been installed properly? Next, I will have to identify a sample file for each and every set, and check whether it is the previous one or the recent one.

Very, very strange ...

Thanks so much, you did actually help me a step further,

Uwe
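The per-set check planned above (pick a sample file from each set and see whether the installed copy is the one from the archive) could be sketched as follows. This is a minimal sketch under assumptions: same_as_set is a hypothetical helper name, and it relies on tar accepting -C and on mktemp -d and cmp being present.

```shell
# same_as_set SET_TGZ MEMBER [ROOT]
# Exit status 0 if ROOT/MEMBER is byte-identical to MEMBER inside SET_TGZ,
# 1 if the two differ, 2 if the member could not be extracted.
# ROOT defaults to "/" (the installed system).
same_as_set() {
    set_tgz=$1 member=$2 root=${3:-/}
    tmp=$(mktemp -d) || return 2
    # Extract only the requested member into a scratch directory.
    if ! tar -xzf "$set_tgz" -C "$tmp" "$member"; then
        rm -rf "$tmp"
        return 2
    fi
    # cmp -s is silent and reports the result through its exit status.
    cmp -s "$tmp/$member" "$root/$member"
    rc=$?
    rm -rf "$tmp"
    return $rc
}
```

With the paths used in this thread, `same_as_set /home/ftp/pub/OpenBSD/4.7/amd64/base47.tgz ./sbin/pfctl` would be one such check, repeated with a sample member for each of the other sets.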
Installer bug? - Upgrade 4.6 to 4.7 failed to upgrade base47, on i386 and amd64
[I consider it better to open a new thread, since the title and part of the content of the previous one have been superseded.]

Having upgraded one machine (amd64) from 4.6 to 4.7, following the normal upgrade procedure as outlined in http://openbsd.org/faq/upgrade47.html to the letter, it turned out after the reboot that the files contained in base47.tgz were not installed. All other sets apparently did get installed. This led to some out-of-sync problems with pfctl, by which this failure was noticed. (http://article.gmane.org/gmane.os.openbsd.misc/174272)

While waiting for some details, and trying to recapitulate what went wrong, I applied the same procedure to an i386 machine, apparently with an identical result: the base47 files are missing; the previous files from base46 are still on the machine. As far as I can make out, all other sets installed perfectly well.

Since this happened on two physical machines of different architecture, and in both cases while meticulously following the upgrade guide with the 'boot bsd.rd' mechanism, it cannot be excluded that there is a weakness in the installer. There was no error message; on the contrary, everything looked okay, including the installation of the sets.

There is one more machine (amd64) that needs to be upgraded. Before I do this, I rather solicit suggestions on how to log the upgrade process, debug it, or otherwise.

I do not claim at all that the observed behaviour affects all installations, nor anyone else. It still points to a potential weakness, since both machines run nothing but standard OpenBSD, and have been upgraded through all previous versions using the same method since 3.8, until now without fail.

Uwe

As preliminary evidence, I have attached randomly selected files from the tar sets that were found to differ between 4.6 and 4.7.
The first line shows the extraction of those files from the archives used for installation, the subsequent lines show the calculations of their md5 sums, firstly in the user directory; followed by the md5 sums of those files as installed by the upgrade process. (For misc and xfonts no files were found (non-exhaustive search) that differed in 4.6 and 4.7.) The '$' prompt is on the machine containing the archives, followed by the '#' prompt, which is on the respective machine to which 4.7 was installed.

amd64:

base:
$ tar xzf /home/ftp/pub/OpenBSD/4.7/amd64/base47.tgz ./sbin/pfctl
$ md5 sbin/pfctl
MD5 (sbin/pfctl) = 7720c9a4dc100fe29d2d3c4a16954eb4 == archive
# md5 /sbin/pfctl
MD5 (/sbin/pfctl) = 3e1fa4f69809adff432f9da62010a6a7 == upgraded
$ md5 /sbin/pfctl
MD5 (/sbin/pfctl) = 3e1fa4f69809adff432f9da62010a6a7 == 4.6

comp:
$ tar xzf /home/ftp/pub/OpenBSD/4.7/amd64/comp47.tgz ./var/db/libc.tags
$ md5 var/db/libc.tags
MD5 (var/db/libc.tags) = ef05ce6515665eff14618c02c4678edc == archive
# md5 /var/db/libc.tags
MD5 (/var/db/libc.tags) = ef05ce6515665eff14618c02c4678edc == upgraded
$ md5 /var/db/libc.tags
MD5 (/var/db/libc.tags) = d3e2a489d70cd3f0d91fef538f4ebfd1 == 4.6

games:
$ tar xzf /home/ftp/pub/OpenBSD/4.7/amd64/game47.tgz ./usr/games/morse
$ md5 usr/games/morse
MD5 (usr/games/morse) = 61157239de35061df71e7be2e17a9471 == archive
# md5 /usr/games/morse
MD5 (/usr/games/morse) = 61157239de35061df71e7be2e17a9471 == upgraded
$ md5 /usr/games/morse
MD5 (/usr/games/morse) = ee30c2129ceac343438ea03a9efa2fe5 == 4.6

man:
$ tar xzf /home/ftp/pub/OpenBSD/4.7/amd64/man47.tgz ./usr/share/man/ps8/loongson
$ md5 usr/share/man/ps8/loongson/
MD5 (usr/share/man/ps8/loongson/) = d41d8cd98f00b204e9800998ecf8427e == archive
# md5 /usr/share/man/ps8/loongson/
MD5 (/usr/share/man/ps8/loongson/) = d41d8cd98f00b204e9800998ecf8427e == upgraded
$ md5 /usr/share/man/ps8/loongson/
md5: cannot open /usr/share/man/ps8/loongson/: No such file or directory == 4.6

misc: ??

xbase:
$ tar xzf /home/ftp/pub/OpenBSD/4.7/amd64/xbase47.tgz ./usr/X11R6/man/whatis.db
$ md5 usr/X11R6/man/whatis.db
MD5 (usr/X11R6/man/whatis.db) = a6ebdd66fe58b66136c9fdfc9eca1c5d == archive
# md5 /usr/X11R6/man/whatis.db
MD5 (/usr/X11R6/man/whatis.db) = a6ebdd66fe58b66136c9fdfc9eca1c5d == upgraded
$ md5 /usr/X11R6/man/whatis.db
MD5 (/usr/X11R6/man/whatis.db) = 01e11bb37c523bc6fe8c37e139f6fe41 == 4.6

xetc:
$ tar xzf /home/ftp/pub/OpenBSD/4.7/amd64/xetc47.tgz ./var/db/sysmerge/xetcsum
$ md5 var/db/sysmerge/xetcsum
MD5 (var/db/sysmerge/xetcsum) = 374865b6f2b5a34b64148bfe6746cfd0 == archive
# md5 /var/db/sysmerge/xetcsum
md5: cannot open /var/db/sysmerge/xetcsum: No such file or directory == upgraded
$ md5 /var/db/sysmerge/xetcsum
md5: cannot open /var/db/sysmerge/xetcsum: No such file or directory == 4.6

xfonts: ??

xserv:
$ tar xzf /home/ftp/pub/OpenBSD/4.7/amd64/xserv47.tgz ./usr/X11R6/lib/modules/dri/r300_dri.so
$ md5 usr/X11R6/lib/modules/dri/r300_dri.so
MD5 (usr/X11R6/lib/modules/dri/r300_dri.so) = e7caa1ee3691a40c40f994dfb210738c == archive
# md5 /usr/X11R6/lib/modules/dri/r300_dri.so
MD5 (/usr/X11R6/lib/modules/dri/r300_dri.so) =
Re: pfctl not working in 4.7: DIOCBEGINADDRS and DIOCXCOMMIT
Joachim Schipper joachim at joachimschipper.nl writes:

> Just untarring the release should work, but it's still odd. At least the md5sum of pfctl matches what I just downloaded, so that seems fine; did you actually use *that* tarball, though? (Note that the right pfctl binary is 500856 bytes long.)

Consider this thread closed. I manually 'upgraded' (only) the file /sbin/pfctl from exactly the archive used for the 4.6-4.7 upgrade procedure, and everything 'pfctl' works fine. This leaves us with the problem of the failed upgrade procedure, twice now.

Thanks to all contributors in this thread!

Uwe
pfctl not working in 4.7: DIOCBEGINADDRS and DIOCXCOMMIT
(I searched Google, but not much turned up.)

Since I upgraded to 4.7, what I get is:

# pwd
/etc
# cat pf.conf
# works?
# pfctl -f pf.conf.47
pfctl: DIOCBEGINADDRS: Operation not supported by device
# pfctl -f /etc/pf.conf
pfctl: DIOCXCOMMIT: Device busy

Huh? (Actually I had used the original pf.conf, with the alterations for anchors done. Then I got this; and it seems pfctl is doing some nonsense: even a comment is not read and executed.)

# which pfctl
/sbin/pfctl
# md5 /sbin/pfctl
MD5 (/sbin/pfctl) = 3e1fa4f69809adff432f9da62010a6a7

(this is amd64)

No changes of the hardware; I have no access to the machine.

Any hint will be appreciated,

Uwe

OpenBSD 4.7 (GENERIC.MP) #130: Wed Mar 17 20:48:50 MDT 2010
    dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 2146381824 (2046MB)
avail mem = 2079801344 (1983MB)
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.3 @ 0xec000 (78 entries)
bios0: vendor HP version D19 date 07/16/2007
bios0: HP ProLiant ML350 G4p
acpi0 at bios0: rev 2
acpi0: tables DSDT FACP SPCR MCFG APIC
acpi0: wakeup devices
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Xeon(TM) CPU 3.20GHz, 3200.58 MHz
cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,SBF,SSE3,MWAIT,DS-CPL,CNXT-ID,CX16,xTPR,LONG
cpu0: 2MB 64b/line 8-way L2 cache
cpu0: apic clock running at 200MHz
cpu1 at mainbus0: apid 6 (application processor)
cpu1: Intel(R) Xeon(TM) CPU 3.20GHz, 3200.12 MHz
cpu1: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,SBF,SSE3,MWAIT,DS-CPL,CNXT-ID,CX16,xTPR,LONG
cpu1: 2MB 64b/line 8-way L2 cache
cpu2 at mainbus0: apid 1 (application processor)
cpu2: Intel(R) Xeon(TM) CPU 3.20GHz, 3200.12 MHz
cpu2: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,SBF,SSE3,MWAIT,DS-CPL,CNXT-ID,CX16,xTPR,LONG
cpu2: 2MB 64b/line 8-way L2 cache
cpu3 at mainbus0: apid 7 (application processor)
cpu3: Intel(R) Xeon(TM) CPU 3.20GHz, 3200.12 MHz
cpu3: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,SBF,SSE3,MWAIT,DS-CPL,CNXT-ID,CX16,xTPR,LONG
cpu3: 2MB 64b/line 8-way L2 cache
ioapic0 at mainbus0: apid 8 pa 0xfec0, version 20, 24 pins
ioapic1 at mainbus0: apid 9 pa 0xfec1, version 20, 24 pins
ioapic1: misconfigured as apic 0, remapped to apid 9
ioapic2 at mainbus0: apid 10 pa 0xfec8, version 20, 24 pins
ioapic3 at mainbus0: apid 11 pa 0xfec80400, version 20, 24 pins
acpiprt0 at acpi0: bus 1 (IP2P)
acpiprt1 at acpi0: bus 2 (IPXB)
acpiprt2 at acpi0: bus 6 (PCXA)
acpiprt3 at acpi0: bus 9 (PCXB)
acpiprt4 at acpi0: bus 5 (PTA0)
acpiprt5 at acpi0: bus 13 (PTB0)
acpiprt6 at acpi0: bus 16 (PTC0)
acpiprt7 at acpi0: bus 0 (PCI0)
acpicpu0 at acpi0
acpicpu1 at acpi0
acpicpu2 at acpi0
acpicpu3 at acpi0
acpitz0 at acpi0: critical temperature 31 degC
pci0 at mainbus0 bus 0
pchb0 at pci0 dev 0 function 0 Intel E7520 Host rev 0x0c
ppb0 at pci0 dev 2 function 0 Intel E7520 PCIE rev 0x0c
pci1 at ppb0 bus 5
ppb1 at pci1 dev 0 function 0 Intel PCIE-PCIE rev 0x09
pci2 at ppb1 bus 6
ppb2 at pci1 dev 0 function 2 Intel PCIE-PCIE rev 0x09
pci3 at ppb2 bus 9
rl0 at pci3 dev 2 function 0 D-Link Systems 530TX+ rev 0x10: apic 11 int 0 (irq 3), address 00:11:95:5e:50:ba
rlphy0 at rl0 phy 0: RTL internal PHY
ppb3 at pci0 dev 4 function 0 Intel E7520 PCIE rev 0x0c
pci4 at ppb3 bus 13
ppb4 at pci0 dev 6 function 0 Intel E7520 PCIE rev 0x0c
pci5 at ppb4 bus 16
ppb5 at pci0 dev 28 function 0 Intel 6300ESB PCIX rev 0x02
pci6 at ppb5 bus 2
mpi0 at pci6 dev 3 function 0 Symbios Logic 53c1030 rev 0x08: apic 9 int 0 (irq 10)
scsibus0 at mpi0: 16 targets, initiator 7
sd0 at scsibus0 targ 0 lun 0: COMPAQ, BF1468A4CC, HPB5 SCSI3 0/direct fixed
sd0: 140014MB, 512 bytes/sec, 286749488 sec total
sd1 at scsibus0 targ 2 lun 0: COMPAQ, BF03688284, HPB3 SCSI3 0/direct fixed
sd1: 34732MB, 512 bytes/sec, 71132000 sec total
sd2 at scsibus0 targ 5 lun 0: COMPAQ, BD1468A4C5, HPB4 SCSI3 0/direct fixed
sd2: 140014MB, 512 bytes/sec, 286749488 sec total
mpi0: target 0 Sync at 160MHz width 16bit offset 63 QAS 1 DT 1 IU 1
mpi0: target 2 Sync at 160MHz width 16bit offset 63 QAS 1 DT 1 IU 1
mpi0: target 5 Sync at 160MHz width 16bit offset 63 QAS 1 DT 1 IU 1
mpi1 at pci6 dev 3 function 1 Symbios Logic 53c1030 rev 0x08: apic 9 int 1 (irq 10)
scsibus1 at mpi1: 16 targets, initiator 7
uhci0 at pci0 dev 29 function 0 Intel 6300ESB USB rev 0x02: apic 8 int 16 (irq 3)
uhci1 at pci0 dev 29 function 1 Intel 6300ESB USB rev 0x02: apic 8 int 19 (irq 5)
Intel 6300ESB WDT rev 0x02 at pci0 dev 29 function 4 not configured
Intel 6300ESB APIC rev 0x02 at pci0 dev 29 function 5 not configured
ehci0 at pci0 dev 29 function 7 Intel 6300ESB USB rev 0x02: apic 8 int 23
No networking with cd46.iso on qemu?
Trying to install a virtual OpenBSD on OpenBSD 4.6 on amd64, I did:

# env ETHER=em0 qemu -net nic,model=rtl8139 -net tap -m 32 -monitor stdio -no-fd-bootchk -hda virtual.img \
    -cdrom cd46.iso -boot d

as described in http://www.openbsd.org/cgi-bin/cvsweb/ports/emulators/qemu/files/README.OpenBSD?rev=1.5;content-type=text%2Fx-cvsweb-markup

It does start the install and everything, but I can't seem to get any network connection. dhcp times out. I restarted, and tried addresses from the usual qemu 10.0.2.0/24 network as fixed addresses (as given on that site), but no luck. I restarted again and set some addresses from the network in which the host runs, 192.168.1.0/24. With those, at least, I could ping the (real) gateway of the host, and even an outside server holding the installation files. Strangely enough, though, the ftp to get the .tgz closes immediately with some 'connection refused', although I can ping that server from the guest-to-be-installed, as well as ftp to it from the host.

I used qemu on Knoppix before, and it always offered dhcp out of the box. Where is my mistake?

Uwe
Re: No networking with cd46.iso on qemu?
Rares Aioanei wrote:

> On 04/28/2010 04:03 PM, Uwe Dippel wrote:
>> Trying to install a virtual OpenBSD on OpenBSD 4.6 on amd64, I did:
>> # env ETHER=em0 qemu -net nic,model=rtl8139 -net tap -m 32 -monitor stdio -no-fd-bootchk -hda virtual.img \
>>     -cdrom cd46.iso -boot d
>> as described in http://www.openbsd.org/cgi-bin/cvsweb/ports/emulators/qemu/files/README.OpenBSD?rev=1.5;content-type=text%2Fx-cvsweb-markup
>
> You're not mistaking, Realtek is. Stay away from them, virtual or not, and try the other NICs qemu has to offer. It will work.

No, sorry. All the same. No dhcp (dhclient) ever. I tried pcnet, ne2k_pci, i82551 (the latter segfaults, see below). What I did:

# env ETHER=bge0 qemu -net nic,model=ne2k_pci -net tap -m 32 -monitor stdio -no-fd-bootchk -hda virtual.img \
    -cdrom cd46.iso -boot d

I guess there is something wrong with that 'ether' thing. I had tried em0 as written in that OpenBSD cvsweb, but then it works even less. I see no em0 coming up, only pcn0 (pcnet), ne3 (ne2k_pci), fxp0 (i82551); no vlan, no tun. Just lo and the respective network card.

I always boot cd46.iso, go to (S)hell immediately, and do the network setup, first trying dhcp, then manually. Every dhclient {pcn0|ne3} times out. All manual settings fail to connect to ftp as well. dhclient fxp0 segfaults qemu reproducibly:

# env ETHER=bge0 qemu -net nic,model=i82551 -net tap -m 32 -monitor stdio -no- -cdrom cd46.iso -boot d
{tun0 (bridge0 - bge0)}
QEMU 0.9.1 monitor - type 'help' for more information
(qemu) assertion !feature is missing in this emulation: unknown word read failed: file /usr/obj/ports/qemu-0.9.1p10/qemu-0.9.1/hw/eepro100.c, line 1202 , function eepro100_read2
Abort trap (core dumped)

It *must* be a mistake on my side, if the description on the OpenBSD site is correct. What can I do?

Uwe
Re: No networking with cd46.iso on qemu?
Uwe Dippel wrote:

> It *must* be a mistake on my side, if the description on the OpenBSD site is correct. What can I do?

Let me add some remarks, after trying to debug it further. Trying the install the conventional way gives the result that is to be expected on amd64:

qemu -m 32 -monitor stdio -no-fd-bootchk -hda virtual.img \
    -cdrom cd45.iso -boot d

segfaults reproducibly when entering the root password (why there?)

Okay, so I followed the '2. tap mode' further down. Still no success. The interfaces are not created as expected/described:

# ifconfig tun0 link0
# ifconfig bridge0 create
# brconfig bridge0 add tun0 add em0 up
brconfig: bridge0: em0: No such file or directory

No luck here, either. What to do next?

Uwe

Here is the dmesg, in case:

OpenBSD 4.6 (GENERIC.MP) #0: Mon Apr 26 18:00:52 SGT 2010
    udip...@mybox.myorg.my:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 3756994560 (3582MB)
avail mem = 3633987584 (3465MB)
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.3 @ 0xec000 (62 entries)
bios0: vendor HP version D17 date 07/16/2007
bios0: HP ProLiant ML350 G4
acpi0 at bios0: rev 2
acpi0: tables DSDT FACP SPCR MCFG APIC
acpi0: wakeup devices
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Xeon(TM) CPU 3.00GHz, 3000.50 MHz
cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,SBF,SSE3,MWAIT,DS-CPL,CNXT-ID,CX16,xTPR,LONG
cpu0: 1MB 64b/line 8-way L2 cache
cpu0: apic clock running at 200MHz
cpu1 at mainbus0: apid 6 (application processor)
cpu1: Intel(R) Xeon(TM) CPU 3.00GHz, 3000.11 MHz
cpu1: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,SBF,SSE3,MWAIT,DS-CPL,CNXT-ID,CX16,xTPR,LONG
cpu1: 1MB 64b/line 8-way L2 cache
cpu2 at mainbus0: apid 1 (application processor)
cpu2: Intel(R) Xeon(TM) CPU 3.00GHz, 3000.12 MHz
cpu2: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,SBF,SSE3,MWAIT,DS-CPL,CNXT-ID,CX16,xTPR,LONG
cpu2: 1MB 64b/line 8-way L2 cache
cpu3 at mainbus0: apid 7 (application processor)
cpu3: Intel(R) Xeon(TM) CPU 3.00GHz, 3000.12 MHz
cpu3: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,SBF,SSE3,MWAIT,DS-CPL,CNXT-ID,CX16,xTPR,LONG
cpu3: 1MB 64b/line 8-way L2 cache
ioapic0 at mainbus0 apid 8 pa 0xfec0, version 20, 24 pins
ioapic1 at mainbus0 apid 9 pa 0xfec1, version 20, 24 pins
ioapic1: misconfigured as apic 0, remapped to apid 9
ioapic2 at mainbus0 apid 10 pa 0xfec8, version 20, 24 pins
ioapic3 at mainbus0 apid 11 pa 0xfec80400, version 20, 24 pins
acpiprt0 at acpi0: bus 1 (IP2P)
acpiprt1 at acpi0: bus 2 (IPXB)
acpiprt2 at acpi0: bus 6 (PCXA)
acpiprt3 at acpi0: bus 9 (PCXB)
acpiprt4 at acpi0: bus 5 (PTA0)
acpiprt5 at acpi0: bus 13 (PTB0)
acpiprt6 at acpi0: bus 16 (PTC0)
acpiprt7 at acpi0: bus 0 (PCI0)
acpicpu0 at acpi0
acpicpu1 at acpi0
acpicpu2 at acpi0
acpicpu3 at acpi0
acpitz0 at acpi0: critical temperature 31 degC
pci0 at mainbus0 bus 0
pchb0 at pci0 dev 0 function 0 Intel E7520 Host rev 0x0c
ppb0 at pci0 dev 2 function 0 Intel E7520 PCIE rev 0x0c
pci1 at ppb0 bus 5
ppb1 at pci1 dev 0 function 0 Intel PCIE-PCIE rev 0x09
pci2 at ppb1 bus 6
ppb2 at pci1 dev 0 function 2 Intel PCIE-PCIE rev 0x09
pci3 at ppb2 bus 9
ppb3 at pci0 dev 4 function 0 Intel E7520 PCIE rev 0x0c
pci4 at ppb3 bus 13
ppb4 at pci0 dev 6 function 0 Intel E7520 PCIE rev 0x0c
pci5 at ppb4 bus 16
ppb5 at pci0 dev 28 function 0 Intel 6300ESB PCIX rev 0x02
pci6 at ppb5 bus 2
mpi0 at pci6 dev 3 function 0 Symbios Logic 53c1030 rev 0x08: apic 9 int 0 (irq 5)
scsibus0 at mpi0: 16 targets, initiator 7
sd0 at scsibus0 targ 0 lun 0: COMPAQ, BF03688284, HPB3 SCSI3 0/direct fixed
sd0: 34732MB, 512 bytes/sec, 71132000 sec total
sd1 at scsibus0 targ 3 lun 0: COMPAQ, BF3008AFEC, HPB1 SCSI3 0/direct fixed
sd1: 286102MB, 512 bytes/sec, 585937500 sec total
sd2 at scsibus0 targ 5 lun 0: COMPAQ, BF3008AFEC, HPB1 SCSI3 0/direct fixed
sd2: 286102MB, 512 bytes/sec, 585937500 sec total
mpi0: target 0 Sync at 160MHz width 16bit offset 63 QAS 1 DT 1 IU 1
mpi0: target 3 Sync at 160MHz width 16bit offset 127 QAS 1 DT 1 IU 1
mpi0: target 5 Sync at 160MHz width 16bit offset 127 QAS 1 DT 1 IU 1
mpi1 at pci6 dev 3 function 1 Symbios Logic 53c1030 rev 0x08: apic 9 int 1 (irq 5)
scsibus1 at mpi1: 16 targets, initiator 7
uhci0 at pci0 dev 29 function 0 Intel 6300ESB USB rev 0x02: apic 8 int 16 (irq 3)
uhci1 at pci0 dev 29 function 1 Intel 6300ESB USB rev 0x02: apic 8 int 19 (irq 3)
Intel 6300ESB WDT rev 0x02 at pci0 dev 29 function 4 not configured
Intel 6300ESB APIC rev 0x02 at pci0 dev 29 function 5 not configured
ehci0 at pci0 dev 29 function 7 Intel 6300ESB USB rev 0x02: apic 8 int 23 (irq 3)
usb0 at ehci0: USB revision 2.0
uhub0 at usb0 Intel
Problem after upgrade 4.5 to 4.6: ERR M
Having done upgrades from 4.0 onwards on an OpenBSD-only server (amd64), this time something must have gone wrong. The upgrade was done remotely via serial console (I have no physical access) and was 'successful' (no error messages). When I was asked to reboot, I did, as always. Alas, it came up with

Attempting Boot From Floppy Drive (A:)
Attempting Boot From CD-ROM
Attempting Boot From Hard Drive (C:)
Using drive 0, partition 3.
Loading...
ERR M

on an HP ML350G4p. From all I know, it is a problem with the MBR.

What I'd really like to know, before I drive there and get access, is the best and most straightforward way to solve this problem. Talking about what went wrong can wait, since this is a production machine and should be back as soon as possible.

Thanks in advance,

Uwe
Re: Problem after upgrade 4.5 to 4.6: ERR M
Tobias Ulmer wrote:

> As explained above, no, you likely moved around/corrupted /boot in a way that doesn't work for biosboot.

Hmm. Actually I didn't. Through the serial console, I had rebooted the server, just 'to make sure', before booting to bsd.rd, and everything went through. I rebooted again, immediately, to bsd.rd, and went through the very normal and standard procedure like umpteen times before.

One exception: bsd.mp was shown as corrupted by its sha256 hash. The install program, however, continued, so that I could not rectify this. Being on a multi-CPU box, in the end it automatically copied the (corrupted) bsd.mp to bsd, which then had a size of 1.3 MB. Therefore, at the very end, after the device nodes, at the 'reboot now' prompt, I ftp-ed a correct version in from another location and cp-ed it to bsd. Then, strangely enough, there suddenly appeared a bsd.sp of size 0, which had not been there before.

I found this quite strange: the installer going through despite the wrong hash; even more so the (new?) automatic move of bsd.mp to bsd on a multicore machine although the size was wrong; and in the end a '0'-sized bsd.sp after moving in a healthy bsd.mp. I would not totally exclude an interference of this (new?) code that led to the described situation. Honestly, nothing at all was done in that session between the 2 boots aside from what I wrote. I guess nothing of what I did should hurt /boot?

Thanks for the reply. I'll go there next to try what has been proposed. Before I try: in case

# /usr/mdec/installboot -v boot /usr/mdec/biosboot sd0

does NOT work, what else could I do? (I am asking because it is a server room quite far away, with little chance for me to communicate, and difficult to get to.) So, is there any alternative or additional solution to fall back on when I am there and installboot doesn't cut it?

Uwe
Re: Problem after upgrade 4.5 to 4.6: ERR M [Solved]
Nick Holland wrote:

>> And in the end, a '0'-sized bsd.sp after moving in a healthy bsd.mp. I would not totally exclude an interference of this (new?) code that lead to the described situation. Honestly, nothing at all done in that session aside from what I wrote, between the 2 boots. I guess, nothing of what I did should hurt the /boot?
>
> well, something did damage /boot. I doubt it was anything you did intentionally, but something also caused a bad bsd.sp to be copied over. Possibly related. May indicate system problems of some type.

Okay, back. Works!

But we should not stop here. When mounting my / on /mnt, I noticed that /boot had also shrunk to zero size. Like that bsd.sp, which was okay, but ended up with size 0 after copying bsd.mp to bsd. What would now make /boot zero?

/usr/mdec/installboot -v boot /usr/mdec/biosboot sd0

says something like

boot: boot proto: /usr/mdec/biosboot device:/dev/rsd0c

No error message, but /boot is still '0'. Then I removed /boot, and an error message came up:

... boot: No such file or directory

Meaning, it couldn't rectify the /boot of size 0. Last chance: I copied /usr/mdec/boot to /mnt/. Again:

/usr/mdec/installboot -v boot /usr/mdec/biosboot sd0
boot: boot proto: /usr/mdec/biosboot device:/dev/rsd0c
boot is 3 blocks x 16384 bytes
fs block shift 2; offset 63; inode block 24, offset 936
using MBR partition 3: type 0xA6 offset 63

Now, this looked promising and actually worked.

I still take a bet on a round of drinks that there is a bug in the recent install/upgrade code that has a tendency to render files to zero size.

Thanks for all the input to get this production box back!

Uwe
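Since zero-sized files were the symptom here, a quick check before answering the 'reboot now' prompt might have caught it. A minimal sketch, assuming a shell with the test builtin ([ -s ]) at hand; report_empty is a hypothetical helper name and the file list in the usage line is only an example:

```shell
# report_empty FILE...
# Prints every FILE that is missing or has zero length.
# Exit status is 1 if at least one such file was found, 0 otherwise.
report_empty() {
    bad=0
    for f in "$@"; do
        # [ -s ] is true only for an existing file with size > 0.
        if [ ! -s "$f" ]; then
            echo "$f"
            bad=1
        fi
    done
    return $bad
}
```

With the root file system mounted on /mnt as described above, `report_empty /mnt/boot /mnt/bsd /mnt/bsd.mp` would list whichever of those is empty or absent.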
Re: strange (?) ssh user
Paul de Weerd wrote:

> Hi Uwe,
>
>> Yes. Like
>> Accepted password for isuser from XXX.XX.XX.XX port 61802 ssh2
>
> And this XXX.XX.XX.XX is the address of a machine you know ?

Yes

> The user is a well known user to you,

Yes

> some system account perhaps ?

No

>> To be clear, the user exists, and logged on the last time three days ago as far as 'last' is concerned.
>
> This does not really match up with your previous statements of who never logged on, is not visible with 'last'.

Sorry, my shoddy way of saying things. 'Never' meant 'never while there were processes running under his user-ID in the last hours'. So his last 'last' is 3 days old.

> What is this user doing ? Any other processes running under his uid ?

No, only the root- and user-id of ssh.

> If he's back immediately after a reboot, it sounds like an automated log in (using password auth; that may be interesting). What exactly do you want to know here ? How to log in without showing up in finger/w/last/etc ? Try `while :; do ssh ${HOST} read A; done`, it does exactly what you describe. Are you sure that account is not compromised and your machine is not sending out lots of e-mail ?

Hmm. How would I know? The daily security report gives out a reasonable number of mails, top looks okay to me, low as usual.

> Cheers,

Thanks,

Uwe
Re: strange (?) ssh user
Edd Barrett wrote:

> Hi,
>
> On Fri, Aug 21, 2009 at 6:54 AM, Uwe Dippel udip...@uniten.edu.my wrote:
>> Yes. Like
>> Accepted password for isuser from XXX.XX.XX.XX port 61802 ssh2
>> To be clear, the user exists, and logged on the last time three days ago as far as 'last' is concerned.
>
> This sounds very fishy. I would start backing up if I were you.

Did this.

> You said first that last says the user had not logged on, but now that it has 3 days ago? Is the user covering up his/her traces or was that a typo?

(See my other mail; my ambiguity: the last record in 'last' is from 3 days ago.)

> See what the user is doing and what is in his/her home directory.

Doing: nothing except ssh. Home directory: nothing much. The usual few files. Nothing in hidden files.

> Try to find information about the machine which it is coming from.

It is an inside (LAN) machine, a standard workstation/desktop.

> I would be interested to know.

Me too! ;)

Uwe
Re: strange (?) ssh user
Iñigo Ortiz de Urbina wrote:

> As it's not clear to me if isuser is a user you trust, created or needed for your services,

'Trusted', created by myself, needs a local account.

> I would say your machine might have been compromised. What kind of traffic is isuser generating?

Difficult to find out if I assume I can not trust my box any longer.

> Is it just a reverse ssh shell?

Could very well be. Would this not show in 'last' or 'w'? It is interesting to me that no pseudo-terminal is associated with the activities (ssh), contrary to a usual local logon.

> Can you shutdown his account or set his/her/its shell to nologin(8)?

I'll try this next when I see her activities: nologin.

> Next install you might consider following the advices of mtree(8) as the output of previous and current `mtree -cK sha1digest` would be really useful here.

I'll have to study this first. Thanks!
Re: strange (?) ssh user
Paul de Weerd wrote: tcpdump(8) will tell you a lot, I suppose ;) I guess the best way to make sure the account is not compromised is talking to your user and asking him if he can explain what is going on. Again, my current guess is TCP forwarding, but it could be a lot of other things too. Ask your user and see if he knows about this. I can't as of now (weekend). But I can see it reoccurring, kind of: Aug 21 18:31:25 mybox sshd[31888]: Accepted password for isuser from XXX.XX.XX.XX port 57519 ssh2 in authlog, reflected pretty well by isuser ttyp0 172.16.0.35 Fri Aug 21 18:31 - 18:31 (00:00) in 'last'; though still busy sending stuff forth and back: isuser 16994 0.0 0.8 3176 1992 ?? S 6:31PM 0:00.13 sshd: isuser There are a bunch of logons of that user, of 00:00 logon duration during the last weeks. The only thing running from this user at this moment is the ssh. That would mean, one can log on, spawn a process, log off, and the process keeps running? Then everything could be 'fine', and the system not compromised, only exploited to run some ssh-tunnel or so. Though this behaviour of the system would be unexpected by myself. Uwe
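The short 00:00 logons mentioned above can be listed per user straight from authlog. The following is only a minimal sketch, assuming the syslog line layout quoted in this thread; the function name accepted_logins is made up for illustration.

```shell
# Print timestamp and source address of every accepted sshd login for one user.
# Reads authlog-style text on stdin; assumes the field layout quoted in this thread:
#   Mon DD HH:MM:SS host sshd[pid]: Accepted <method> for <user> from <addr> port <n> ssh2
accepted_logins() {
    awk -v user="$1" '$5 ~ /^sshd\[/ && $6 == "Accepted" && $9 == user { print $1, $2, $3, $11 }'
}
# usage: accepted_logins isuser < /var/log/authlog
```

Comparing this list with the output of 'last' for the same user shows which accepted logins left only a zero-length session record.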
Re: strange (?) ssh user
Robert C Wittig wrote: Have you considered adding a PF rule that would drop all incoming login requests from this specific user? Yes. But it won't work, because there is a NAT-address-rewrite in between that changes the source address. Also, that user has plenty of machines to log on to. It seems by now that it is not a compromise, but something else, rather 'abuse'. Uwe
Re: strange (?) ssh user
Paul de Weerd wrote: You could check for the presence of forwarded TCP sessions with fstat, an example looks like this : weerd sshd 29016 11* internet stream tcp 0x40009ab33d0 127.0.0.1:44410 -- 127.0.0.1:3128 If you open an ssh session to a remote machine with a forwarded port, then open the forwarded port and once the connection over the forwarded port has been established ^D the initial session, you'll get the behaviour you just described. The established TCP session over the forwarded connection keeps the SSH session alive but the user is shown as logged out (and no processes show other than the sshd's you mentioned). Now I am pretty sure that this is what we see here. It also makes sense, since all those users sit on a tightly controlled LAN; while that machine is 'further out'. So that restricted services can be accessed through some tunneling. Now: How to prevent it?? I have hundreds of users, who can log on from hundreds of machines, and all need access to ssh, and easily 30 at the same time. So, filtering IP addresses is out, nologin is out, no ssh is out. Of course, I can politely ask, but I would not necessarily trust it to be followed. I'd much rather disallow it technically. At least, have an easy access to the record (e.g. in 'last'). But since it doesn't require logon, what to do? And how to prevent this?? Any suggestion appreciated, Uwe
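The fstat check Paul describes can be narrowed to the interesting lines with a small filter. This is a sketch only, assuming the column layout of the example line above (user, command, pid, descriptor, then the socket description); the function name sshd_tcp_sockets is hypothetical.

```shell
# Keep only the TCP socket lines that belong to sshd processes.
# Reads fstat(1)-style output on stdin; assumes the command name is the second column.
sshd_tcp_sockets() {
    awk '$2 == "sshd" && /internet stream tcp/'
}
# usage on the server, for one account: fstat -u isuser | sshd_tcp_sockets
```

An sshd owned by a user who is not shown as logged in, but which still holds a TCP socket towards a local service port, would fit the forwarding explanation.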
Re: strange (?) ssh user
Johan Beisser wrote: Read the man page for ssh_config(5) and sshd_config(5), and look at restricting what your users can do. Specifically: AllowTcpForwarding, PermitOpen and PermitTunnel, combined with Match. Thanks everyone for a great number of enlightening and helpful replies to my post! I have learned a lot. Last but not least, and again, how biased I can think: When I noticed some activities by a user who was not logged on, I feared a compromise. That led me away from the solution: reading the man pages of ssh, as I did not expect this to be 'normal' or even legal. Thanks again! Uwe
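To make Johan's pointer concrete, here is a rough sketch of what such a restriction could look like. It only stages a candidate fragment in a scratch file for review and installs nothing. The group name tunnelusers is invented for illustration, and the port 3128 is simply the one from Paul's fstat example; which users, if any, should keep forwarding is a local policy decision.

```shell
# Stage a candidate sshd_config fragment in a scratch file; nothing is installed.
fragment=$(mktemp)
cat > "$fragment" <<'EOF'
# Default for everyone: no TCP forwarding.
AllowTcpForwarding no

# Hypothetical group that may still forward, but only to one destination.
Match Group tunnelusers
    AllowTcpForwarding yes
    PermitOpen 127.0.0.1:3128
EOF
cat "$fragment"
```

sshd_config(5) itself cautions that disabling TCP forwarding does not improve security unless users are also denied shell access, since they can always install their own forwarders, so this limits casual tunneling rather than a determined user.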
strange (?) ssh user
Recently, I noticed an ssh user on one of my machines, who never logged on, is not visible with 'last', seems to have no terminal active, and is back immediately after a reboot. Hmm. root 13415 0.0 0.9 3280 2420 ?? Ss 12:04PM 0:00.08 sshd: isuser isuser 702 0.0 0.7 3280 1824 ?? S 12:04PM 0:00.00 sshd: isuser Whatever I do with finger, w, last, no trace of any activity; not even a login. I tried to kill the processes, and they are gone, but the next second another pair is up. Could anyone help me to explain what is going on here? Uwe
Re: strange (?) ssh user
Ryan Flannery wrote: On Fri, Aug 21, 2009 at 1:19 AM, Uwe Dippel udip...@uniten.edu.my wrote: Recently, I noticed an ssh user on one of my machines, who never logged on, is not visible with 'last', seems to have no terminal active, and is back immediately after a reboot. Hmm. root 13415 0.0 0.9 3280 2420 ?? Ss 12:04PM 0:00.08 sshd: isuser isuser 702 0.0 0.7 3280 1824 ?? S 12:04PM 0:00.00 sshd: isuser Whatever I do with finger, w, last, no trace of any activity; not even a login. Just to be clear here, do you see anything in /var/log/authlog? Yes. Like Accepted password for isuser from XXX.XX.XX.XX port 61802 ssh2 To be clear, the user exists, and logged on the last time three days ago as far as 'last' is concerned.
4.6 will be released on October 1st?
At least, that's what the website says at http://openbsd.org/46.html True or typo? (I'd expect November 1st.) Uwe
Re: Panic at install of amd64 on HP nx6320
Uwe Dippel udippel at uniten.edu.my writes: Marco Peereboom wrote: we need a trace; this is worthless. Thought so. Here are the screens, in the attachment. Hope, it goes through! Uwe [demime 1.01d removed an attachment of type image/jpeg which had a name of IMG_0623.JPG] [demime 1.01d removed an attachment of type image/jpeg which had a name of IMG_0624.JPG] [demime 1.01d removed an attachment of type image/jpeg which had a name of IMG_0625.JPG] So they didn't go through, as to be expected. Here is a link: http://metalab.uniten.edu.my/~udippel/
Re: Panic at install of amd64 on HP nx6320
Jonathan Gray jsg at goblin.cx writes: Nowhere do you state which release you are running. Similar problems have been fixed in -current some months ago, so what are you running? My fault. I'm running 4.5 stable. Would those fixes have made it into 4.5? If yes, -current is no alternative. What exactly do you mean, so that I could check the changelog? Uwe
Re: Panic at install of amd64 on HP nx6320
Jonathan Gray jsg at goblin.cx writes: No, this will never be in 4.5. The acpi parser has changed significantly since 4.5 which made many hp machines much happier. You need to run a snapshot to get the newer parser to resolve this problem. Correct, guys, thanks so much! I ran the -current of August 7th, and it runs through, and reboots properly. And X comes up without problem with 'startx'. Looks good to me, so far. And a new installer. Somewhat confusing, though: Layout [A]utomatic (or so) has a lower case default at the line end:[a] At the end, it finds an MP kernel and says 'using bsd.mp instead'. It might be better to formulate it in a manner to clearly state - I will use or - You might want to use Actually, I'd prefer the second version: asking, with .mp as default. Timezone might better go at the beginning, with ntp. One slash is missing in between, when the newly created directories are shown. There it looks something like /mnt /mnt/usr /mnt/home etc. I was looking for '/', and it seems to be missing in the first line. Thanks again!! Uwe
Panic at install of amd64 on HP nx6320
What I did: Install into wd0, second DOS partition, 20G. Everything looked good. At reboot, the panic happens, always. ps is easy: ddb ps * 0 -1 0 0 7 0x80200 swapper ddb trace Debugger() at Debugger+0x5 panic() at panic+0x122 _aml_die() at _aml_die+0xdb aml_xconvert() at aml_xconvert+0x68 [...] config_attach() at config_attach+0x11b cpu_configure() at cpu_configure+0x1c main() at main+0x3c5 end trace frame:0x0, count: -31 (If someone will ask for the complete trace: I'll take a screenshot with a camera if need be.) Any recommendation? (The machine works well at its other boots: XP and Ubuntu.) Uwe
Re: Panic at install of amd64 on HP nx6320
Marco Peereboom wrote: we need a trace; this is worthless. Thought so. Here are the screens, in the attachment. Hope, it goes through! Uwe [demime 1.01d removed an attachment of type image/jpeg which had a name of IMG_0623.JPG] [demime 1.01d removed an attachment of type image/jpeg which had a name of IMG_0624.JPG] [demime 1.01d removed an attachment of type image/jpeg which had a name of IMG_0625.JPG]
Re: softraid
Janne Johansson jj at it.su.se writes: Isn't that the case with all fstab entries right now? You get the computer to list some drive before other disks, raid or no raid, and fstab breaks on you. No, you didn't read it carefully enough. fstab breaks on me when I shove in a drive 'before', true. But when I add a 'higher' one, everything is fine. Believe me, I have a bunch. softraid is different, because ANY added physical drive will increase the count, and push the softraid drive one notch higher. Uwe
Re: softraid
Marco Peereboom slash at peereboom.us writes: Then keep asking! I do have the impression, what I wanted, is what you already had in mind: a broken mirror simply remains dead and broken, and the machine runs happily before and after reboot on the sane drive. Correct? Correct. If this isn't the case then I need to see a dmesg before after rebooting and bioctl output before and after reboot. Since we (that's I, sorry) seem to discuss the whole bunch (not a bad idea after, all hoping to get things into their places and finally enjoying a really beautiful and functioning softraid), I allow myself to add another question, real life, on a to-be-production system: Okay, now I have a broken harddisk, one half of the mirror is gone. Then I will have to dump the partitions, create a new mirror, and restore, correct? Next problem: There are quite a number of bays available in my box, so that I can plug another drive for a local 'dump'. But irrespective where I plug it, it won't come up: softraid0 at root softraid0: roaming device sd1b - sd2b softraid0: roaming device sd2b - sd3b softraid0: roaming device sd1b - sd2b softraid0: roaming device sd2b - sd3b scsibus3 at softraid0: 1 targets sd4 at scsibus3 targ 0 lun 0: OPENBSD, SR RAID 1, 003 SCSI2 0/direct fixed sd4: 285789MB, 512 bytes/sec, 585296066 sec total softraid0: volume sd4 is roaming, it used to be sd3, updating metadata root on sd0a swap on sd0b dump on sd0b Automatic boot in progress: starting file system checks. /dev/rsd0a: file system is clean; not checking Can't open /dev/rsd3h: Device not configured CAN'T CHECK FILE SYSTEM. /dev/rsd3h: UNEXPECTED INCONSISTENCY; RUN fsck_ffs MANUALLY. Can't open /dev/rsd3d: Device not configured CAN'T CHECK FILE SYSTEM. /dev/rsd3d: UNEXPECTED INCONSISTENCY; RUN fsck_ffs MANUALLY. Can't open /dev/rsd3f: Device not configured CAN'T CHECK FILE SYSTEM. /dev/rsd3f: UNEXPECTED INCONSISTENCY; RUN fsck_ffs MANUALLY. Can't open /dev/rsd3e: Device not configured CAN'T CHECK FILE SYSTEM. 
/dev/rsd3e: UNEXPECTED INCONSISTENCY; RUN fsck_ffs MANUALLY. Can't open /dev/rsd3g: Device not configured CAN'T CHECK FILE SYSTEM. /dev/rsd3g: UNEXPECTED INCONSISTENCY; RUN fsck_ffs MANUALLY. Can't open /dev/rsd3i: Device not configured CAN'T CHECK FILE SYSTEM. /dev/rsd3i: UNEXPECTED INCONSISTENCY; RUN fsck_ffs MANUALLY. Can't open /dev/rsd3j: Device not configured CAN'T CHECK FILE SYSTEM. /dev/rsd3j: UNEXPECTED INCONSISTENCY; RUN fsck_ffs MANUALLY. THE FOLLOWING FILE SYSTEMS HAD AN UNEXPECTED INCONSISTENCY: ffs: /dev/rsd3h (/home), ffs: /dev/rsd3d (/tmp), ffs: /dev/rsd3f (/usr), ffs: /dev/rsd3e (/var), ffs: /dev/rsd3g (/var/mail), ffs: /dev/rsd3i (/var/www), ffs: /dev/rsd3j (/backup) Automatic file system check failed; help! Enter pathname of shell or RETURN for sh: Now that roaming is in the way. I wonder if softraid really should do this: You plug an additional disk, higher or lower, and automatically it will roam the mirror to drives of its liking, and inevitably fail the RAID thereby.
Re: softraid
Marco Peereboom slash at peereboom.us writes: This one the pulled drive still contains the same metadata as the surviving members. Since you are running a home made kernel I have no idea what code you are running. This scenario should work with the code I committed a couple of weeks ago. From the looks of it this is a bug or you are running old code. 4.5 stable. I have no clue what 'home made kernel' implies, it is just the recompiled (according to FAQ) standard, generic kernel; needed for the patches issued in 4.5 until now. Zero any other item. Uwe
Re: softraid
Marco Peereboom slash at peereboom.us writes: This is a repeat of the can't bring up a raid set with missing members Yes, exactly. This can be closed; it was just to demonstrate that I am not the only person, who sees broken mirrors being re-attached.
Re: softraid
Marco Peereboom slash at peereboom.us writes: This is currently correct because I am working on this particular case. This one has proved to be very hairy hence it isn't in the tree yet. Good to know, thanks for the heads-up, I keep waiting then for 4.6, I guess? I'd expect the softraid, in order to be useful, to reboot on its sane leg. See previous comment. This is incomplete code. Thanks for the info. Complete is -current or will it be in 4.6? Uwe
Re: softraid
Marco Peereboom slash at peereboom.us writes: Next problem: There are quite a number of bays available in my box, so that I can plug another drive for a local 'dump'. But irrespective where I plug it, it won't come up: Your trace shows that it comes up just fine. softraid0: volume sd4 is roaming, it used to be sd3, updating metadata You inserted a disk in front of others. From where I am sitting it is all working as designed. Hmm. I plugged the fourth 'before' and 'after'; or the tray above and the tray below that pair. Both acted the same: breaking the mirror. To me it *looks* as if any physical disk would go before the RAID, and therefore the RAID always be incremented whenever one inserts another drive. And since the RAID gets roamed, while fstab isn't, it will always break. No? This is not a softraid problem this is an OpenBSD problem. We don't have uuids or labels on disks so when disks move around bad things happen. This is a fact of life today that needs to be dealt with accordingly. Yes and no. 'Move around' is not what I am doing, when I plug an extra drive *after* sd2 (in my case). However, the system makes that one sd3, and voilà, the s*** hits the fan, because the RAID volume is moved up one index. To me this seems a result of the sequence at boot: at first we identify the physical drives, that is sd0, sd1, sd2 and sd3 in this case, and only later do we get softraid up, sensibly roaming the RAID one up. Sensibly? Because fstab can't know and will want to mount partitions of a lower number (sd3 in this case), which is always impossible.
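Until disks carry stable identifiers, one manual way to deal with the mismatch described above is to renumber the affected fstab entries after the volume has roamed. The following is a sketch under that assumption, not something proposed in the thread; the function name renumber_fstab is hypothetical, and the result should be reviewed before it replaces /etc/fstab.

```shell
# Rewrite the disk name in fstab-style text read from stdin, e.g. sd3 -> sd4.
# Only entries that start with /dev/<old><partition letter> are touched.
renumber_fstab() {
    old=$1 new=$2
    sed "s|^/dev/${old}\([a-p]\)|/dev/${new}\1|"
}
# usage: renumber_fstab sd3 sd4 < /etc/fstab > /tmp/fstab.new
```

This does nothing for an unattended reboot, which is the case Uwe cares about; it only shortens the repair once someone has a shell on the box.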
Re: softraid
Uwe Dippel udippel at uniten.edu.my writes: To me this seems a result of the sequence at boot: at first we identify the physical drives, that is sd0, sd1, sd2 and sd3 in this case, and only later do we get softraid up, sensibly roaming the RAID one up. Sensibly? Because fstab can't know and will want to mount partitions of a lower number (sd3 in this case), which is always impossible. I do understand the problem of 'no labels'/'no UUID', but the current working will break boot whatever happens: any extra drive, in any slot, will be discovered at boot time before softraid is activated. So it will break 100%, right? There is no real solution without disk IDs, though a hackish one: If softraid was configured at sd3 (assembled from sd1 and sd2 in this case), the kernel needs to be aware of this fact when it goes into drive discovery at boot. So that when one plugs another drive into a higher controller, it will discover: sd0 - sd1 - sd2 - sd3_is_taken - sd4. Then fstab will be correct w.r.t. sd0 to sd3, and one can use sd4, the new drive, for whatever purpose it had been intended. And if sd0 was removed from the original configuration, it would find sd0 - sd1 - sd3_is_taken. Then roaming can still do sd0-sd1 and sd1-sd2, and the RAID will come up properly, again. That's the best I could think of now, anything but perfect, but always better than a 100% breakage. What do you think?
Re: OpenBSD on OpenBSD with qemu through the network only?
Stuart Henderson stu at spacehopper.org writes: Are you trying to boot an amd64 kernel? If so, you need qemu-system-x86_64. Chances are, that I did. Downloaded the i386-cd45.iso, and followed the 'tap mode' path: ifconfig tun0 link0 ifconfig bridge0 create brconfig bridge0 add tun0 add bge0 up All went through, and I have now: tun0: flags=9903<UP,BROADCAST,PROMISC,SIMPLEX,LINK0,MULTICAST> mtu 1500 lladdr 00:bd:f5:ab:6c:01 priority: 0 groups: tun inet6 fe80::2bd:f5ff:feab:6c01%tun0 prefixlen 64 scopeid 0x7 bridge0: flags=41<UP,RUNNING> mtu 1500 priority: 0 groups: bridge Next, going back to Quick Start, I type qemu -m 32 -monitor stdio -no-fd-bootchk -hda virtual.img \ -cdrom cd45.iso -boot d which actually, does boot! and gets me into the installer, all hunky dory. Only when downloading of the packages is supposed to start, once I have entered IP-address and confirmed the directory, everything segfaults. Always. I guess something networking-wise is still not okay. Can someone help me to point out the mistake here? Uwe
Re: softraid
Marco Peereboom slash at peereboom.us writes: The plugging in of the disk is a non-event. The disk is dead to the OS and by extension to softraid. Let me follow up on this topic, please, and report some more experiments and results and thoughts. I recreated the mirror from scratch, and put /tmp, /var, /usr, /home and /backup directories on it. (No need to point out this is kind of stupid.) Running for 2 days. Hot unplugged drive A. Then 'echo Nonsense /backup/testo' Good outcome, though not tested intensely yet: the system keeps running on B as if nothing had happened. Shutdown and plugged A back, restart. Fails at file check, with 'help!' and dropping to a shell at /var. Problem is, that the .pid had been properly removed on B, but not on A; and I needed to delete those one by one at fsck. I also fsck-ed all other partitions, and as to be expected, the 'testo' was on B, not on A, and therefore it needed to be deleted. Reboot, alas, ending in a hangman. Reboot. Another time /var drops to a shell, it has some trouble with 'lost+found', another manual fsck is needed, reboot. Finally, the mirror comes up properly. Next, I'd like to do a real test on a production machine. What scares me, is the lack of physical access, so the hangman and the drop to shell for fsck are not good. And, on a production box here, there might be thousands of files accumulating on the plugged drive that won't be available on the unplugged one, and I will be asked to delete those. Also, this is not good. My question/suggestion: I for one would be happy if the state after reboot would by default be identical to the (degraded) state before the reboot: Because then I would hope to get the system started without the earlier defunct drive; that means, hopefully starting okay, and more relevant, not require me to do anything, not to delete any files. Simply start with the sane drive of the broken mirror as it was shut down. 
Then I could dump and restore the data to a freshly created RAID, without any further ado. Then, at least, a broken drive, a flimsy controller would not interfere into the proper running and restarting of the box; and giving me the chance to retrieve all, including the most recent, data. Does this make sense? Uwe
Re: softraid
Marco Peereboom slash at peereboom.us writes: Upon reboot the mirror should be brought up with only the surviving member. If this isn't the case please show me a trace so that I can go fix that bug. 'trace' means what here? Yes, I unplugged a drive of a working mirror as I wrote, halt, plugged it back with power off, rebooted, and had the described state of fsck problems, fortunately telling me exactly about the files that were available on A ('.pid') and B ('testo') only. Drive A is DEAD. Do not EVER use it again. That's fine, except it was tried to re-insert it into the mirror at reboot. Not on my account, but on its own, as one might say. YOU CAN NEVER USE THE UNPLUGGED DRIVE EVER EVER EVER EVER EVER AGAIN! IT IS DEAD AND IS CORRUPT AND PUPPIES DIE WHEN YOU USE IT!! Hope this sinks in. Fine. (Should I really say, it required some file checks first, and then, and now, it automagically is a member again of my - by now - online raid? Without me adding it? If the dead drive becomes a participating member of a raid set something is broken; very badly broken. Show me a trace, including bioctl output, if this is the case. Again, bioctl I do understand. 'trace' only at panic. Does this make sense? Not sure. I have a hard time following what you are doing vs. what your expectations are. Then keep asking! I do have the impression, what I wanted, is what you already had in mind: a broken mirror simply remains dead and broken, and the machine runs happily before and after reboot on the sane drive. Correct? Uwe
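Since bioctl output before and after the reboot is what Marco asks for, a small filter can make a degraded state stand out when comparing the two. This is a sketch only, assuming the output layout shown elsewhere in this thread (one header line, then one line per volume and member with an Online status when healthy); the function name not_online is made up.

```shell
# Print every volume or member line whose status is not Online.
# Reads `bioctl softraid0` output on stdin; the first line is taken to be the header.
# An empty result means everything reports Online.
not_online() {
    awk 'NR > 1 && !/ Online /'
}
# usage: bioctl softraid0 | not_online
```

Note that this only reflects what softraid has noticed so far; as Marco points out in this thread, a failure is detected only after an I/O fails.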
Re: softraid
Marco Peereboom slash at peereboom.us writes: Correct. If this isn't the case then I need to see a dmesg before after rebooting and bioctl output before and after reboot. Keep in mind that softraid can only detect failure AFTER an io fails. This is key, because you could fail a drive and go undetected by softraid. Clear. This is why I tested with 'echo Nonsense testo'. Here is what I did, I hope it explains what is going on. If not, just ask! [rebooted] # bioctl softraid0 Volume Status Size Device softraid0 0 Online 299671585280 sd3 RAID1 0 Online 299671585280 0:0.0 noencl sd1b 1 Online 299671585280 0:1.0 noencl sd2b # df -h /dev/sd0a 300M108M177M38%/ /dev/sd3h 9.8G730M8.6G 8%/home /dev/sd3d 1008M6.0K958M 0%/tmp /dev/sd3f 7.9G2.7G4.8G36%/usr /dev/sd3e 492M 17.1M450M 4%/var /dev/sd3g 2.0G1.4M1.9G 0%/var/mail /dev/sd3i 7.9G3.3M7.5G 0%/var/www /dev/sd3j 246G 95.5G138G41%/backup # cd /backup # ls -l total 200231436 [some files listed] # echo Nonsense testo_b4 # ls -l testo_b4 -rw-r--r-- 1 root wheel 9 May 22 11:57 testo_b4 # bioctl softraid0 Volume Status Size Device softraid0 0 Online 299671585280 sd3 RAID1 0 Online 299671585280 0:0.0 noencl sd1b 1 Online 299671585280 0:1.0 noencl sd2b # [pull drive] # dmesg OpenBSD 4.5 (GENERIC.MP) #0: Thu May 14 18:57:01 SGT 2009 r...@claude2.uwe.uniten.edu.my:/usr/src/sys/arch/amd64 /compile/GENERIC.MP real mem = 3756994560 (3582MB) avail mem = 3634552832 (3466MB) mainbus0 at root bios0 at mainbus0: SMBIOS rev. 
2.3 @ 0xec000 (62 entries) bios0: vendor HP version D17 date 07/16/2007 bios0: HP ProLiant ML350 G4 acpi0 at bios0: rev 2 acpi0: tables DSDT FACP SPCR MCFG APIC acpi0: wakeup devices acpitimer0 at acpi0: 3579545 Hz, 24 bits acpimadt0 at acpi0 addr 0xfee0: PC-AT compat cpu0 at mainbus0: apid 0 (boot processor) cpu0: Intel(R) Xeon(TM) CPU 3.00GHz, 3000.53 MHz cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36, CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,SBF,SSE3,MWAIT,DS-CPL, CNXT-ID,CX16,xTPR,LONG cpu0: 1MB 64b/line 8-way L2 cache cpu0: apic clock running at 200MHz cpu1 at mainbus0: apid 6 (application processor) cpu1: Intel(R) Xeon(TM) CPU 3.00GHz, 3000.11 MHz cpu1: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36, CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,SBF,SSE3,MWAIT,DS-CPL, CNXT-ID,CX16,xTPR,LONG cpu1: 1MB 64b/line 8-way L2 cache ioapic0 at mainbus0 apid 8 pa 0xfec0, version 20, 24 pins ioapic1 at mainbus0 apid 9 pa 0xfec1, version 20, 24 pins ioapic1: misconfigured as apic 0, remapped to apid 9 ioapic2 at mainbus0 apid 10 pa 0xfec8, version 20, 24 pins ioapic3 at mainbus0 apid 11 pa 0xfec80400, version 20, 24 pins acpiprt0 at acpi0: bus 1 (IP2P) acpiprt1 at acpi0: bus 2 (IPXB) acpiprt2 at acpi0: bus 6 (PCXA) acpiprt3 at acpi0: bus 9 (PCXB) acpiprt4 at acpi0: bus 5 (PTA0) acpiprt5 at acpi0: bus 13 (PTB0) acpiprt6 at acpi0: bus 16 (PTC0) acpiprt7 at acpi0: bus 0 (PCI0) acpicpu0 at acpi0 acpicpu1 at acpi0 acpitz0 at acpi0: critical temperature 31 degC pci0 at mainbus0 bus 0: configuration mode 1 pchb0 at pci0 dev 0 function 0 Intel E7520 Host rev 0x0c ppb0 at pci0 dev 2 function 0 Intel E7520 PCIE rev 0x0c pci1 at ppb0 bus 5 ppb1 at pci1 dev 0 function 0 Intel PCIE-PCIE rev 0x09 pci2 at ppb1 bus 6 ppb2 at pci1 dev 0 function 2 Intel PCIE-PCIE rev 0x09 pci3 at ppb2 bus 9 ppb3 at pci0 dev 4 function 0 Intel E7520 PCIE rev 0x0c pci4 at ppb3 bus 13 ppb4 at pci0 dev 6 function 0 Intel E7520 PCIE rev 0x0c pci5 at ppb4 
bus 16 ppb5 at pci0 dev 28 function 0 Intel 6300ESB PCIX rev 0x02 pci6 at ppb5 bus 2 mpi0 at pci6 dev 3 function 0 Symbios Logic 53c1030 rev 0x08: apic 9 int 0 (irq 5) scsibus0 at mpi0: 16 targets, initiator 7 sd0 at scsibus0 targ 0 lun 0: COMPAQ, BF03688284, HPB3 SCSI3 0/direct fixed sd0: 34732MB, 512 bytes/sec, 71132000 sec total sd1 at scsibus0 targ 3 lun 0: COMPAQ, BF3008AFEC, HPB1 SCSI3 0/direct fixed sd1: 286102MB, 512 bytes/sec, 585937500 sec total sd2 at scsibus0 targ 5 lun 0: COMPAQ, BF3008AFEC, HPB1 SCSI3 0/direct fixed sd2: 286102MB, 512 bytes/sec, 585937500 sec total mpi0: target 0 Sync at 160MHz width 16bit offset 63 QAS 1 DT 1 IU 1 mpi0: target 3 Sync at 160MHz width 16bit offset 127 QAS 1 DT 1 IU 1 mpi0: target 5 Sync at 160MHz width 16bit offset 127 QAS 1 DT 1 IU 1 mpi1 at pci6 dev 3 function 1 Symbios Logic 53c1030 rev 0x08: apic 9 int 1 (irq 5) scsibus1 at mpi1: 16 targets, initiator 7 uhci0 at pci0 dev 29 function 0 Intel 6300ESB USB rev 0x02: apic 8 int 16 (irq 5) uhci1 at pci0 dev 29 function 1 Intel 6300ESB USB rev 0x02: apic 8 int 19 (irq 5) Intel 6300ESB WDT rev 0x02 at pci0 dev 29 function
Re: softraid
Marco Peereboom slash at peereboom.us writes: Correct. If this isn't the case then I need to see a dmesg before after rebooting and bioctl output before and after reboot. This is as well supported by the post http://vext01.blogspot.com/2007/11/playing-with-new-softraid-driver-in.html [...] Bioctl is the utility used for managing both hardware and software RAID in OpenBSD, the transparency is superb. # bioctl softraid0 Volume Status Size Device softraid0 0 Online 1023009 sd0 RAID1 0 Online 1023009 0:0.0 noencl 1 Online 1023009 0:1.0 noencl 2 Online 1023009 0:2.0 noencl Lets break things and see what happens. First I will simulate a missing disk at boot, by detaching wd3. After a reboot I see this: # dmesg | grep softraid0 softraid0 at root softraid0: not assembling partial disk that used to be volume 0 # bioctl softraid0 # Our RAID array was not registered by the kernel, as a disk was missing. I imagine this will be changed at some point. As I said, the softraid driver is not finished. Shutdown the system and put the disk back: # dmesg | grep softraid softraid0 at root scsibus0 at softraid0: 1 targets # bioctl softraid0 Volume Status Size Device softraid0 0 Online 1023009 sd0 RAID1 0 Online 1023009 0:0.0 noencl 1 Online 1023009 0:1.0 noencl 2 Online 1023009 0:2.0 noencl It's back. Isn't this what you said it shouldn't? (Be back 'Online' after an earlier breakage of the mirror) Uwe
Re: softraid
Marco Peereboom slash at peereboom.us writes: Then keep asking! I do have the impression, what I wanted, is what you already had in mind: a broken mirror simply remains dead and broken, and the machine runs happily before and after reboot on the sane drive. Correct? Correct. If this isn't the case then I need to see a dmesg before after rebooting and bioctl output before and after reboot. Alas, it doesn't (run happily ever after). :( My next experiment: Everything healthy, according to bioctl: # bioctl softraid0 Volume Status Size Device softraid0 0 Online 299671585280 sd3 RAID1 0 Online 299671585280 0:0.0 noencl sd1b 1 Online 299671585280 0:1.0 noencl sd2b # [pull drive] [...] [new situation: NOT putting the drive back, ever - simulating a dead drive, maybe spindle or head gone] (System operates fine, read/write without any problem) [reboot - as mentioned NOT pushing the drive back] [...] ugen0 at uhub2 port 1 American Power Conversion Back-UPS RS 1000 FW:7.g8 .I USB FW:g8 rev 1.10/1.06 addr 2 softraid0 at root softraid0: roaming device sd2b - sd1b softraid0: not assembling partial disk that used to be volume 0 root on sd0a swap on sd0b dump on sd0b Automatic boot in progress: starting file system checks. /dev/rsd0a: file system is clean; not checking Can't open /dev/rsd3h: Device not configured CAN'T CHECK FILE SYSTEM. /dev/rsd3h: UNEXPECTED INCONSISTENCY; RUN fsck_ffs MANUALLY. Can't open /dev/rsd3d: Device not configured CAN'T CHECK FILE SYSTEM. /dev/rsd3d: UNEXPECTED INCONSISTENCY; RUN fsck_ffs MANUALLY. Can't open /dev/rsd3f: Device not configured CAN'T CHECK FILE SYSTEM. /dev/rsd3f: UNEXPECTED INCONSISTENCY; RUN fsck_ffs MANUALLY. Can't open /dev/rsd3e: Device not configured CAN'T CHECK FILE SYSTEM. /dev/rsd3e: UNEXPECTED INCONSISTENCY; RUN fsck_ffs MANUALLY. Can't open /dev/rsd3g: Device not configured CAN'T CHECK FILE SYSTEM. /dev/rsd3g: UNEXPECTED INCONSISTENCY; RUN fsck_ffs MANUALLY. Can't open /dev/rsd3i: Device not configured CAN'T CHECK FILE SYSTEM. 
/dev/rsd3i: UNEXPECTED INCONSISTENCY; RUN fsck_ffs MANUALLY. Can't open /dev/rsd3j: Device not configured CAN'T CHECK FILE SYSTEM. /dev/rsd3j: UNEXPECTED INCONSISTENCY; RUN fsck_ffs MANUALLY. THE FOLLOWING FILE SYSTEMS HAD AN UNEXPECTED INCONSISTENCY: ffs: /dev/rsd3h (/home), ffs: /dev/rsd3d (/tmp), ffs: /dev/rsd3f (/usr), ffs: /dev/rsd3e (/var), ffs: /dev/rsd3g (/var/mail), ffs: /dev/rsd3i (/var/www), ffs: /dev/rsd3j (/backup) Automatic file system check failed; help! Enter pathname of shell or RETURN for sh: Here, at least in production environment, and according to the situation of lacking physical access, I really would want the drive/system to come back. Yes. To me, lacking of '-R' is no big deal. But what is the whole thing 'softraid' about, if it doesn't survive a reboot, on a single, before 100% sane, drive? See, it was sane, and working, and saving my files until reboot. Then, after reboot (can always happen), all is 'lost'. Not quite, but I simply can't go there any time of day or night to resolve the problem manually. I'd expect the softraid, in order to be useful, to reboot on its sane leg. Uwe
softraid - speed
I tried again, setting up RAID1 on 2 U320 drives, 15k, as described in softraid(4). Now I find the speed to be too slow. Writing to a single file is kind of okay: [everything/pwd is /mnt, which is a softraid drive, /dev/sd3f] # bioctl sd3 Volume Status Size Device softraid0 0 Online 299671585280 sd3 RAID1 0 Online 299671585280 0:0.0 noencl sd1b 1 Online 299671585280 0:1.0 noencl sd2b dump and restore is the task. It is not fast: DUMP: Volume 1 took 0:00:07 DUMP: Volume 1 transfer rate: 2147 KB/s DUMP: Date this dump completed: Wed May 20 16:31:08 2009 DUMP: Average transfer rate: 2147 KB/s 7 seconds for 14 MB. But data transfer itself is okay: # dump -0ua -f testo /dev/sd0e DUMP: Volume 1 took 0:00:01 DUMP: Volume 1 transfer rate: 15039 KB/s DUMP: Date this dump completed: Wed May 20 16:49:53 2009 DUMP: Average transfer rate: 15039 KB/s DUMP: level 0 dump on Wed May 20 16:49:51 2009 It is writing that takes the time: # date restore rf testo date Wed May 20 16:51:48 SGT 2009 Wed May 20 16:51:54 SGT 2009 The raw speed is good: # dd if=/dev/zero of=nonsense.img bs=1m count=5000 5000+0 records in 5000+0 records out 5242880000 bytes transferred in 100.534 secs (52149868 bytes/sec) But a dump restore of /usr is a tad sick: (/dev/sd0f 7.9G 2.4G 5.1G 32% /usr) # dump -0ua -f - /dev/sd0f | restore rf - DUMP: Date of this level 0 dump: Wed May 20 16:53:46 2009 DUMP: Date of last level 0 dump: the epoch DUMP: Dumping /dev/rsd0f (/usr) to standard output DUMP: mapping (Pass I) [regular files] DUMP: mapping (Pass II) [directories] DUMP: estimated 2549189 tape blocks. 
DUMP: Volume 1 started at: Wed May 20 16:53:48 2009 DUMP: dumping (Pass III) [directories] DUMP: dumping (Pass IV) [regular files] DUMP: 4.42% done, finished in 3:48 DUMP: 36.44% done, finished in 0:27 DUMP: 40.42% done, finished in 0:30 DUMP: 52.60% done, finished in 0:23 DUMP: 64.08% done, finished in 0:17 DUMP: 77.57% done, finished in 0:10 DUMP: 92.19% done, finished in 0:03 DUMP: 2717062 tape blocks DUMP: Date of this level 0 dump: Wed May 20 16:53:46 2009 DUMP: Volume 1 completed at: Wed May 20 17:36:48 2009 DUMP: Volume 1 took 0:43:00 DUMP: Volume 1 transfer rate: 1053 KB/s DUMP: Date this dump completed: Wed May 20 17:36:48 2009 DUMP: Average transfer rate: 1053 KB/s DUMP: level 0 dump on Wed May 20 16:53:46 2009 DUMP: DUMP IS DONE The LEDs of the drives were kind of continuously on. I also tried to mount 'softdep', but that didn't make much of a difference. When I do 'df -h' in another console, I can see at times that the data amount transferred is huge, at other times it is moving by steps of 0.1-0.2 MB/s. Probably it is a problem of number of files, not of size. Any idea what to do to improve the performance? Uwe
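To put the two measurements side by side, the figures reported above can be reduced to the same unit. This is only a back-of-the-envelope sketch using the numbers from the dump and dd output; it assumes dump's tape blocks are 1 KB each, which matches the KB/s rate dump itself prints.

```shell
# Figures copied from the dump and dd output quoted above.
dump_blocks=2717062      # "2717062 tape blocks", assumed to be 1 KB each
dump_secs=$((43 * 60))   # "Volume 1 took 0:43:00"
dd_rate=52149868         # bytes/sec reported by dd for one large file

dump_kbs=$((dump_blocks / dump_secs))   # integer KB/s for dump | restore
dd_kbs=$((dd_rate / 1024))              # integer KB/s for the sequential dd write
echo "dump|restore: ${dump_kbs} KB/s, dd: ${dd_kbs} KB/s, ratio: $((dd_kbs / dump_kbs))x"
```

The sequential write is roughly 48 times faster than the dump-and-restore of /usr, which is consistent with the suspicion that the number of files, not the amount of data, is the limiting factor.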
OpenBSD on OpenBSD with qemu through the network only?
I would like to set up a virtual/emulated OpenBSD machine on an existing OpenBSD box; using qemu. The guest will only be accessible through the network (ssh), under an address different from the host IP; though through the same physical NIC. I wonder if anyone has some experience or a link about this; otherwise I'll have to step-by-step myself through this. Also, it seems that I can't install a qemu-harddisk-image on my OpenBSD boxen, since it needs graphics, right? So I'll have to use Ubuntu (e.g.) to create that image, and transfer it and start it on the console-only OpenBSD. Would that work? Uwe
Re: OpenBSD on OpenBSD with qemu through the network only?
Abel Camarillo acamari at the00z.org writes: that scenario is explained in the README.OpenBSD installed with qemu. That's a really great hint! Thanks a bunch! (Though it still looks like the installation needs X, contrary to what someone stated off-group: 3. Install the os: [...] NOTE: start this inside an xterm or equivalent. If I do it in a console, I get WSCONS error ... SDL) Uwe
Re: OpenBSD on OpenBSD with qemu through the network only?
Abel Camarillo acamari at the00z.org writes: that scenario is explained in the README.OpenBSD installed with qemu. Sorry, it still won't work. It crashes, though I followed the text (correctly, I believe; correct me if I am wrong). What I typed is: # env ETHER=bge0 qemu -net nic,model=rtl8139 -net tap -m 32 \ -monitor stdio -no-fd-bootchk -hda virtual.img \ -cdrom cd44.iso -boot d {tun0 (bridge0 - bge0)} QEMU 0.9.1 monitor - type 'help' for more information (qemu) qemu: fatal: triple fault EAX=e001003b EBX=00b1c7f8 ECX=c080 EDX= ESI=00b1c000 EDI=00b33000 EBP=005f7408 ESP=005f765e EIP=001004d6 EFL=0002 [---] CPL=0 II=0 A20=1 SMM=0 HLT=0 ES =0010 00cf9300 CS =0008 00cf9f00 SS =0010 00cf9300 DS =0010 00cf9300 FS =0010 00cf9300 GS =0010 00cf9300 LDT= 8000 TR = 8000 GDT= 00040c70 0027 IDT= 000409d8 027f CR0=e001003b CR2=00040a18 CR3=00b1c000 CR4=06b0 CCS=1000 CCD=e001003b CCO=LOGICL FCW=037f FSW= [ST=0] FTW=00 MXCSR=1f80 FPR0= FPR1= FPR2= FPR3= FPR4= FPR5= FPR6= FPR7= XMM00= XMM01= XMM02= XMM03= XMM04= XMM05= XMM06= XMM07= Abort trap What's wrong? Uwe
Re: softraid
Marco Peereboom wrote: [push the disk back in] Stale metadata, disk will remain unused from now on. check [pull the other disk] You lose. all data is gone (for all intents and purposes). check # ls -l total 4 -rw-r--r-- 1 root wheel 9 May 13 12:00 testo [everything okay until here] Nope, this comes out of cache. # rm testo rm: testo: Input/output error [I still guess this may happen] Shall happen. Yes. And no. Maybe I wasn't all too clear? My expectation is not (yet) the automatic recovery of the respective half mirror! Sure not! I don't expect miracles. What I do expect, though, is a consistent, defined and predictable state. Please, try to view it from a different perspective. Nobody would voluntarily pull out disk A, plug it back after 20 seconds, expecting it to recover the mirror, pull out disk B after another 10 seconds, and plug it back after 20 seconds, and still expect a full mirror! But, and that's a big 'but' for me: some fault might do exactly that, a flimsy controller, a faulty power supply. And then I want neither I/O errors nor a panic at reboot. My expectations are much lower, but based on consistency: 0. Running sane raid 1. One drive goes offline What I'd expect, personally, would basically be minimally: A. Immediate info about a drive lost. B. 2 half mirrors remaining that I can plug into another box, at least to access the data on either. C. No further attempt to use that drive that went offline any longer, at least not until a reboot. D. That means, I won't have I/O errors, but the system running happily from the active drive, E. And it means that a reboot will go through smoothly. I am aware that this implies that, when the second drive goes offline as well, NO drive is available any more (even if either came back!). As I mentioned, I request consistency of data, not necessarily uptime. 
I want to be able to retrieve the data from the drive that went offline first, and I want to be able to retrieve data from the drive that went offline later. Personally, to me RAID is not failover, or availability, but access to the data up to and until that moment when a drive goes offline. And I want a clean reboot, irrespective of all ups and downs of the drives. Please, correct me if I am wrong! Uwe
Re: softraid
Marco Peereboom slash at peereboom.us writes: Maybe I wasn't all too clear? My expectation is not (yet) the automatic recovery of the respective half mirror! Sure not! I don't expect miracles. What I do expect, though, is a consistent, defined and predictable state. Your expectations are out of whack with reality. Marco, I hope not. I think this is why I am using OpenBSD, e.g. Predictability is number one for security, so consistency is a must as well. Though we seem to agree what softraid should be doing from the statements below. I propose to tar all photos and send them to you privately. 0. Running sane raid 1. One drive goes offline What I'd expect, personally, would basically be minimally: A. Immediate info about a drive lost. That is there. And you haven't shown me any evidence it isn't. You are right. I simply could not read the most obvious from the man page: that the state is displayed without any options (and stupid me tried almost all the options!). So I guess it still takes a cronjob to scan for 'degraded'? B. 2 half mirrors remaining that I can plug into another box, at least to access the data on either. No, 1 half mirror; the other one is basically lost. You got an IO error for some reason. There is no telling what didn't get written to it after the remaining chunk continued on its merry way. Good to know. C. No further attempt to use that drive that went offline any longer, at least not until a reboot. Right, and softraid will detect that it went tits up prior and ignore it. Good D. That means, I won't have I/O errors, but the system running happily from the active drive, Right. Good E. And it means that a reboot will go through smoothly. Right. Good. Meaning that we agree, and I'm looking forward to trying again! Uwe
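The cron idea raised above could look roughly like the sketch below. It assumes that plain `bioctl sd3` prints a header line starting with "Volume" followed by one line per volume and chunk, each carrying the literal word "Online" when healthy, as in the `bioctl sd3` output quoted elsewhere in this thread. The function name `raid_problems` is made up, and nothing here is taken from bioctl(8) beyond that observed output.

```shell
#!/bin/sh
# Read `bioctl <volume>` output on stdin and print every volume or chunk
# line whose status is not "Online". Exit status 0 means all lines were
# Online; non-zero means at least one line was printed.
# Assumption: the header line starts with "Volume" and healthy lines
# contain the word "Online" as a separate field.
raid_problems() {
	awk '
		$1 == "Volume" { next }          # skip the header line
		NF == 0        { next }          # skip empty lines
		{
			ok = 0
			for (i = 1; i <= NF; i++)
				if ($i == "Online")
					ok = 1
			if (!ok) {
				print
				bad = 1
			}
		}
		END { exit bad }'
}

# Hypothetical crontab use (volume name is a placeholder). cron mails any
# output to the owner of the crontab, so silence means "all online":
#   */10 * * * * bioctl sd3 | raid_problems
```

The check is deliberately phrased as "anything that is not Online" rather than matching a particular failure word, so it does not depend on knowing every status string bioctl can print.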
Re: softraid
Raimo Niskanen raimo+openbsd at erix.ericsson.se writes: Does bioctl really say nothing? Try bioctl sd3, bioctl softraid0, bioctl -q sd3, bioctl -q softraid0. Thanks, the first two do it; the fault was on my side. I did try the latter two, and both, well, I dunno what they tell me: # bioctl -q sd3 sd3: OPENBSD, SR RAID 1, 003, serial OPENBSD SR RAID 1 003 # bioctl -q softraid0 bioctl: DIOCINQ: No such file or directory Thanks again for the pointer to the first. Maybe an example could be added to the man pages? The existing repair option as I recall it (again, search the archives) is to backup the still working filesystems sd3a and sd3b on the broken mirror, re-create the array from scratch, and restore them. Yes, this is what I was thinking, and it is fine with me. (Though shoving in a new drive followed by -R would definitely be huge progress.) That sounds fatal. You should repair the RAID mirror, not break the working half. Now both mirror halves are probably regarded as broken. Your RAID is doomed. Sure. Agreed. But reboot ought to go through, as we discussed elsewhere. Thanks again, Uwe
softraid
Beautiful, as it looks! I tried here on two 300 GB U320 drives, and the setup went through without any warnings (?? most users encounter some?). What I did was (my system disk is sd0): fdisk -iy sd1 fdisk -iy sd2 printf "a\n\n\n\nRAID\nw\nq\n\n" | disklabel -E sd1 printf "a\n\n\n\nRAID\nw\nq\n\n" | disklabel -E sd2 bioctl -c 1 -l /dev/sd1a,/dev/sd2a softraid0 dd if=/dev/zero of=/dev/rsd3c bs=1m count=1 disklabel sd3 (creating my partitions/slices) newfs /dev/rsd3a newfs /dev/rsd3b mount /dev/sd3b /mnt/ cd /mnt/ [pull one hot-swap out] echo Nonsense > testo [push the disk back in] [pull the other disk] # ls -l total 4 -rw-r--r-- 1 root wheel 9 May 13 12:00 testo [everything okay until here] # rm testo rm: testo: Input/output error [I still guess this may happen] But now my question: All posts say all info is in 'man softraid' and 'man bioctl'. There is nothing about *warnings* in there. I also tried bioctl -a/-q, but neither would indicate that anything was wrong when one of the drives was pulled. This will be a production server, but it can take downtime if need be. However: 1. I *need to know* when a disk goes offline 2. I need to know, in real life(!), if I can simply use the broken mirror to save my data; how I can mount it in another machine. Alas, softraid and bioctl are silent about these two. Another reason for asking: Next I issued 'reboot'; and could play hangman :( After the reboot, I got: ... softraid0 at root softraid0: sd3 was not shutdown properly scsibus3 at softraid0: 1 targets, initiator 1 sd3 at scsibus3 targ 0 lun 0: OPENBSD, SR RAID 1, 003 SCSI2 0/direct fixed sd3: 286094MB, 36471 cyl, 255 head, 63 sec, 512 bytes/sec, 585922538 sec total Now I wonder what to do. Will a traditional fsck do, or do I have to recreate the softraid? Can anyone please help me further? Uwe
Re: softraid
Uwe Dippel udippel at uniten.edu.my writes: Now I wonder what to do. Will a traditional fsck do, or do I have to recreate the softraid? I guess I can answer this myself in the meantime: I ran fsck on the softraid volumes sd3a and sd3b (the first one was clean, as expected; the second was not, but fsck marked it clean). Then I mounted it again: # fsck /dev/sd3b ** /dev/rsd3b ** Last Mounted on /mnt ** Phase 1 - Check Blocks and Sizes ** Phase 2 - Check Pathnames ** Phase 3 - Check Connectivity ** Phase 4 - Check Reference Counts ** Phase 5 - Check Cyl groups SUMMARY INFORMATION BAD SALVAGE? [Fyn?] SALVAGE? [Fyn?] y BLK(S) MISSING IN BIT MAPS SALVAGE? [Fyn?] y 1 files, 1 used, 128675303 free (15 frags, 16084411 blocks, 0.0% fragmentation) MARK FILE SYSTEM CLEAN? [Fyn?] y * FILE SYSTEM WAS MODIFIED * # mount /dev/sd3b /mnt/ # cd /mnt/ # ls -l # [that's okay, I never put anything there] # pwd /mnt # echo Nonsense > testo [that's not okay, because it got me another hangman] If anyone is interested, I have all the 'trace'es and 'ps'es on camera. I guess it is time for a dmesg as well: OpenBSD 4.4 (GENERIC.MP) #0: Fri Jan 23 14:33:38 SGT 2009 r...@claude2.uwe.uniten.edu.my:/usr/src/sys/arch/amd64/compile/GENERIC.MP real mem = 4214460416 (4019MB) avail mem = 4092596224 (3903MB) mainbus0 at root bios0 at mainbus0: SMBIOS rev. 
2.3 @ 0xec000 (62 entries) bios0: vendor HP version D17 date 07/16/2007 bios0: HP ProLiant ML350 G4 acpi0 at bios0: rev 2 acpi0: tables DSDT FACP SPCR MCFG APIC acpi0: wakeup devices acpitimer0 at acpi0: 3579545 Hz, 24 bits acpimadt0 at acpi0 addr 0xfee0: PC-AT compat cpu0 at mainbus0: apid 0 (boot processor) cpu0: Intel(R) Xeon(TM) CPU 3.00GHz, 3000.52 MHz cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36, CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,SBF,SSE3,MWAIT,DS-CPL, CNXT-ID,CX16,xTPR,LONG cpu0: 1MB 64b/line 8-way L2 cache cpu0: apic clock running at 200MHz cpu1 at mainbus0: apid 6 (application processor) cpu1: Intel(R) Xeon(TM) CPU 3.00GHz, 3000.11 MHz cpu1: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36, CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,SBF,SSE3,MWAIT,DS-CPL, CNXT-ID,CX16,xTPR,LONG cpu1: 1MB 64b/line 8-way L2 cache ioapic0 at mainbus0 apid 8 pa 0xfec0, version 20, 24 pins ioapic1 at mainbus0 apid 9 pa 0xfec1, version 20, 24 pins ioapic1: misconfigured as apic 0, remapped to apid 9 ioapic2 at mainbus0 apid 10 pa 0xfec8, version 20, 24 pins ioapic3 at mainbus0 apid 11 pa 0xfec80400, version 20, 24 pins acpiprt0 at acpi0: bus 1 (IP2P) acpiprt1 at acpi0: bus 2 (IPXB) acpiprt2 at acpi0: bus 6 (PCXA) acpiprt3 at acpi0: bus 9 (PCXB) acpiprt4 at acpi0: bus 5 (PTA0) acpiprt5 at acpi0: bus 13 (PTB0) acpiprt6 at acpi0: bus 16 (PTC0) acpiprt7 at acpi0: bus 0 (PCI0) acpicpu0 at acpi0 acpicpu1 at acpi0 acpitz0 at acpi0: critical temperature 31 degC pci0 at mainbus0 bus 0: configuration mode 1 pchb0 at pci0 dev 0 function 0 Intel E7520 Host rev 0x0c ppb0 at pci0 dev 2 function 0 Intel E7520 PCIE rev 0x0c pci1 at ppb0 bus 5 ppb1 at pci1 dev 0 function 0 Intel PCIE-PCIE rev 0x09 pci2 at ppb1 bus 6 ppb2 at pci1 dev 0 function 2 Intel PCIE-PCIE rev 0x09 pci3 at ppb2 bus 9 ppb3 at pci0 dev 4 function 0 Intel E7520 PCIE rev 0x0c pci4 at ppb3 bus 13 ppb4 at pci0 dev 6 function 0 Intel E7520 PCIE rev 0x0c pci5 at ppb4 
bus 16 ppb5 at pci0 dev 28 function 0 Intel 6300ESB PCIX rev 0x02 pci6 at ppb5 bus 2 mpi0 at pci6 dev 3 function 0 Symbios Logic 53c1030 rev 0x08: apic 9 int 0 (irq 5) scsibus0 at mpi0: 16 targets, initiator 7 sd0 at scsibus0 targ 0 lun 0: COMPAQ, BF03688284, HPB3 SCSI3 0/direct fixed sd0: 34732MB, 50824 cyl, 2 head, 699 sec, 512 bytes/sec, 71132000 sec total sd1 at scsibus0 targ 3 lun 0: COMPAQ, BF3008AFEC, HPB1 SCSI3 0/direct fixed sd1: 286102MB, 82594 cyl, 8 head, 886 sec, 512 bytes/sec, 585937500 sec total sd2 at scsibus0 targ 5 lun 0: COMPAQ, BF3008AFEC, HPB1 SCSI3 0/direct fixed sd2: 286102MB, 82594 cyl, 8 head, 886 sec, 512 bytes/sec, 585937500 sec total mpi0: target 0 Sync at 160MHz width 16bit offset 63 QAS 1 DT 1 IU 1 mpi0: target 3 Sync at 160MHz width 16bit offset 127 QAS 1 DT 1 IU 1 mpi0: target 5 Sync at 160MHz width 16bit offset 127 QAS 1 DT 1 IU 1 mpi1 at pci6 dev 3 function 1 Symbios Logic 53c1030 rev 0x08: apic 9 int 1 (irq 5) scsibus1 at mpi1: 16 targets, initiator 7 uhci0 at pci0 dev 29 function 0 Intel 6300ESB USB rev 0x02: apic 8 int 16 (irq 5) uhci1 at pci0 dev 29 function 1 Intel 6300ESB USB rev 0x02: apic 8 int 19 (irq 5) Intel 6300ESB WDT rev 0x02 at pci0 dev 29 function 4 not configured Intel 6300ESB APIC rev 0x02 at pci0 dev 29 function 5 not configured ehci0 at pci0 dev 29 function 7 Intel 6300ESB USB rev 0x02: apic 8 int 23 (irq 5) usb0 at ehci0: USB revision 2.0 uhub0 at usb0 Intel EHCI root hub rev 2.00/1.00 addr 1 ppb6
Re: spam from chrooted CMSes
Matthew Weigel unique at idempot.net writes: Huh? I'm talking about the CMS itself authenticating to the SMTP server, and giving each application a single set of credentials. chroot is the name, and isolation is the game. This should be set in the CMS's config files, much like database credentials. Again, I didn't write or install them. Then I configure that board's software to connect to my SMTP server to send mail, and it has to authenticate as board at idempot.net to send any mail. Now, if my server starts sending out spam, I can check the logs and see if the spam is coming from the user board at idempot.net to verify that the particular board software I'm using is the compromised software or not. And here we come to something! This makes sense, compared to me looking through users' code: A hook that allows the insertion of a filter either in php before calling mini_sendmail, or in mini_sendmail itself. postfix is the wrong answer, because the default sender from chrooted mini_sendmail would be 'root', and postfix needs to accept mail from root. So that filter would do something like deny all allow cms_legal allow cms_department allow cms_conference In case anybody had some snippets, I'd be grateful to receive those. Thanks, Uwe
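The "deny all, allow cms_legal ..." filter described above could be sketched as follows. This is only an illustration: the sender names are the placeholders from the post, the function names are made up, and the check looks at the local part of the first From: header, which, as pointed out elsewhere in this thread, can be forged by anyone able to upload code.

```shell
#!/bin/sh
# Default-deny sender check for a message read from stdin.
# The allowed names are the placeholders used in the post.
ALLOWED_SENDERS="cms_legal cms_department cms_conference"

# Print the local part (the text before "@") of the first From: header.
# Stops at the first empty line, i.e. at the end of the headers.
from_localpart() {
	awk '
		/^$/ { exit }
		tolower($0) ~ /^from:/ {
			addr = $0
			sub(/^[^:]*:[ \t]*/, "", addr)   # drop the header name
			sub(/^.*</, "", addr)            # drop a display name up to "<"
			sub(/[@>].*$/, "", addr)         # keep only the local part
			print addr
			exit
		}'
}

# Return 0 if the given sender is on the allow list, 1 otherwise.
sender_allowed() {
	sender=$1
	for ok in $ALLOWED_SENDERS; do
		if [ "$sender" = "$ok" ]; then
			return 0
		fi
	done
	return 1    # deny all by default
}
```

A wrapper placed in front of mini_sendmail could save the message to a temporary file, run `from_localpart` on it, and only hand the file on when `sender_allowed` succeeds; anything else, including mail with no From: header at all, would be dropped or logged.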
Re: spam from chrooted CMSes
Vadim Zhukov wrote: Do your clients have the ability to connect to external hosts? If yes then you should not even bother logging PHP mail() calls or such. If outgoing connections are closed then you should have different system users (i.e., different UIDs) for each client; otherwise it'll be easily possible for a hacker to spoof the sender: nothing stops him from modifying other clients' scripts or just implementing an SMTP server entirely in PHP. Exactly. That's what I have, and what everyone has who hosts users' web sites. If someone can hack into it, she can write some basic SMTP easily. But when you have 200+ users, and 10+ run some php code, and your postfix spews spam to all and sundry, a filter on 'From:' - before reaching postfix, because 'root' does not send from chrooted Apache - can conveniently block all mails with illegal senders' addresses. And only if both requirements passed then you can improve your antispam security either by 1) modifying mini_sendmail, or 2) writing a simple Perl wrapper that parses input data (bundled and/or in-ports Perl modules should make it very easy) and then passes data to real mini_sendmail. IMHO, it's much easier to make mini_sendmail log mail, or add a specific header to each letter that may help you in debugging. In the latter case you may even put some limits for mail based on your header knowledge in your real MTA, which mini_sendmail will forward letters to. You do not need big programming skills to do that, just some basic C knowledge. If you do not know C at all, ask a friend of yours to do this work for beer (or mineral water, if he doesn't like alcohol ;) ). I don't mind paying for a drink, I don't even mind cobbling something together myself. But maybe something like this already exists, and then I could simply save my time. I guess I'm not the only one who runs official CMSes on a server that need to send mail, and who wants to block everyone else's websites hosted there from sending mail. Thanks, Uwe
Re: spam from chrooted CMSes
When dealing with web based submission, the best thing I have found is to make sure the web based submission adds its own headers like what it is and where the user came from and such so when diagnosing the problem one can easily block based on that information. If there is an account involved, you should include that info as well. Dear Todd, I'm sorry, but I lack the experience to understand what you mean. I have 200+ users, several of them having set up (sorry, yes, written!), who can install any CMS of their liking, using ftp; or any other script that sends mail. Some of them are official websites, so I can not shut down the whole mini_sendmail business in the chrooted Apache. I also cannot read, study, hundreds of thousands of lines of code to find out how and where a web-page hosted by me allows an attacker to inject a message of her own, to a recipient of her own choice. Since mini_sendmail receives it through php from Apache, I wonder how I could log e.g. the website from which it was sent, or at least easily limit the number of calls of mini_sendmail. Again, your idea being fine for an application developer, which I am not. I wouldn't know how to add the account to which the application belongs that can be abused. The only two places where I, IMHO, can see a chance would be with an extended log or check of Apache or php; whenever a mail-call is logged, from which directory, e.g. If you're really cracking this nut properly, you'd include heuristics to temporarily block if too many messages are sent in a given time period, and permanently block pending review if too many temporary blocks occur within a given time period. Yes. But that's a complete coder's work, isn't it? I wonder if there is no other solution, as mentioned above. sendmail_path = /bin/mini_sendmail -t -i is what I have in php.ini. I wonder, if there are no logging features for mini_sendmail or so. I read the man-page online, but didn't see any. 
If it doesn't exist, it surely would be a good enhancement if the path of the application from which it is sent was carried through, so that a filter can be written, to allow or drop depending on the path of that application. Uwe
Re: spam from chrooted CMSes
Matthew Weigel unique at idempot.net writes: Then you have grown your userbase too fast with a terrible setup, and now you're caught in the middle of fixing the problem or avoiding downtime. Are you sure this is not a misunderstanding? When you host user accounts, on a tight, default, setup of OpenBSD (or any other OS), and allow them to ftp into their web-directories, how could one prevent them from uploading code that mail()-s something? Aside from removing mini_sendmail, that is. Sure, if you go through and find every line of code where mail() is called, you can add logging at that point. But so far you've refused to make any changes to the applications. Are you sure that this is not a misunderstanding? Which sysadmin can 'make changes to the applications' that his 200+ users run?? His idea is the right one. Most PHP applications I've dealt with support, at least through plugins or extensions, SMTP + AUTH for sending mail instead of PHP's mail(). Are you sure that this is not a misunderstanding? If you host, for example, any CMS, it should offer the remote user, registered with that CMS, the functionality to request a password reset. Which SMTP+AUTH do you want to use here?? AFAICS, here we need to allow a straightforward SMTP. The userbase is registered in the various databases of the CMSes. And again, no sysadmin will re-write all user-supplied applications to extract all those remote users for SMTP-authentication. Get real, please! The only two places where I, IMHO, can see a chance would be with an extended log or check of Apache or php; whenever a mail-call is logged, from which directory, e.g. I don't think PHP ever changes the working directory except explicitly; probably every call to mail() (which leads to mini_sendmail) occurs in the chroot /. Exactly. So how to log it?? Yes. But that's a complete coder's work, isn't it? I wonder if there is no other solution, as mentioned above. 
There are, but they require you to set the parameters of how web apps can work in your environment so as to enforce a minimum of auditability. Yes, this is the crucial point. I'd be more than happy to learn how to set this, for example in php.ini! Any suggestion will be appreciated! sendmail_path = /bin/mini_sendmail -t -i is what I have in php.ini. I wonder, if there are no logging features for mini_sendmail or so. I read the man-page online, but didn't see any. Well, mini_sendmail is an external package... talk to the authors about that, but I think they'll tell you they can't really track what you need tracked. So, how to solve the problem, then?? Thanks anyway, Uwe
Re: spam from chrooted CMSes
Chris Bennett wrote: This could be helpful, possibly. First, you can maintain a functional mini_sendmail by putting another script at /bin/mini_sendmail, this script could do some sort of logging and then pass things on to the real mini_sendmail, located somewhere else, different (hidden) name. This is what I tried already! But it seems it is written in a different manner than what I was hoping for. Something like echo $0 $1 $2 /tmp/loggo /bin/mini_sendmail $1 $2 doesn't work. No, it is not the mini_sendmail that doesn't work, the first line already doesn't log anything. Since all this goes through php first, setting up this script is easy. Maybe you could give me a hint? I am not a PHP-coder, actually. Uwe
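A minimal sketch of the logging wrapper Chris describes might look like this. The names are assumptions for illustration only: `/bin/mini_sendmail.real` is a hypothetical location for the renamed real binary, and `/tmp/loggo` is the log file from the attempt above. Both paths are resolved inside the Apache chroot, so the script needs a `/bin/sh` and a log directory writable by the web server user inside that chroot; a missing one of those is one possible (unverified) reason for a wrapper that logs nothing.

```shell
#!/bin/sh
# Logging wrapper meant to sit where PHP's sendmail_path points.
# REAL_SENDMAIL and MAIL_LOG are hypothetical defaults; adjust to taste.
REAL_SENDMAIL=${REAL_SENDMAIL:-/bin/mini_sendmail.real}
MAIL_LOG=${MAIL_LOG:-/tmp/loggo}

# Append one line per call: timestamp, working directory, arguments.
log_call() {
	printf '%s cwd=%s args=%s\n' \
		"$(date '+%Y-%m-%d %H:%M:%S')" "$(pwd)" "$*" >> "$MAIL_LOG"
}

# Log the call, then replace this process with the real program.
# "$@" forwards every argument unchanged; stdin (the message itself)
# is inherited by the real program untouched.
main() {
	log_call "$@"
	exec "$REAL_SENDMAIL" "$@"
}

# The installed script would end with the line:
#   main "$@"
```

Using `>>` appends to the log instead of overwriting it on every mail, and quoting `"$@"` keeps arguments with spaces intact, which the unquoted `$1 $2` form would not. Whether the working directory recorded this way identifies the calling CMS depends on how PHP invokes the program, as discussed elsewhere in this thread.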
spam from chrooted CMSes
I'm running postfix as MTA on a machine with several CMSes, on a chrooted Apache. Recently, a huge amount of spam has been sent from there, alas. When I scan the postfix logs, all of it comes from 'root', meaning it doesn't come in through port 25. I run OpenBSD with mini_sendmail. Is there any chance to find out from which CMS the spam is sent? Thanks, Uwe
Apache failed at 'graceful'
This is the first time in years; and I wonder what might be wrong. I am running OpenBSD 4.4, nothing specific. I have rotated the logs daily through these years, with the script as below, and no problem until today. The machine has an uptime of 50 days. Still after the cron-ed logrotation this morning it didn't come up with 'graceful'. This is what I have gathered until now from the error-log: [Wed Mar 18 04:21:06 2009] [error] [client 77.37.220.140] File does not exist: /htdocs/v2/intl/zh-CN/ [Wed Mar 18 04:21:06 2009] [error] [client 77.37.220.140] File does not exist: /htdocs/v2/error_404.html [Wed Mar 18 04:21:10 2009] [error] PHP Notice: Undefined index: HTTP_HOST in /htdocs/v2/libraries/joomla/environment/uri.php on line 154 [Wed Mar 18 04:21:10 2009] [error] PHP Notice: Undefined index: HTTP_USER_AGENT in /htdocs/v2/libraries/joomla/html/html/behavior.php on line 51 [Wed Mar 18 04:21:10 2009] [error] PHP Notice: Undefined index: HTTP_USER_AGENT in /htdocs/v2/templates/ja_purity/ja_templatetools.php on line 200 [Wed Mar 18 04:21:10 2009] [error] PHP Notice: Undefined index: HTTP_USER_AGENT in /htdocs/v2/templates/ja_purity/ja_templatetools.php on line 249 [Wed Mar 18 04:22:44 2009] [warn] child process 23069 still did not exit, sending a SIGTERM [Wed Mar 18 04:22:44 2009] [warn] child process 11890 still did not exit, sending a SIGTERM [Wed Mar 18 04:22:44 2009] [warn] child process 9049 still did not exit, sending a SIGTERM [Wed Mar 18 04:22:44 2009] [warn] child process 17620 still did not exit, sending a SIGTERM [Wed Mar 18 04:22:44 2009] [warn] child process 6571 still did not exit, sending a SIGTERM [Wed Mar 18 04:22:52 2009] [notice] caught SIGTERM, shutting down PHP Warning: Module 'mysql' already loaded in Unknown on line 0 PHP Warning: Module 'mysql' already loaded in Unknown on line 0 [Wed Mar 18 18:05:43 2009] [notice] Initializing etag from /var/www/logs/etag-state apachectl stop newsyslog -f /var/www/conf/newhttpdlog.conf apachectl graceful are 
the commands in the crontab. When I run the commands now, they work. It looks like a glitch. Still, it ought not to happen. Please let me know if more information is required. Uwe
Re: Is this panic related to +ExecCGI?
Slightly off-topic: Would it rather be perl committing all that memory or httpd? add some instrumentation and you'll find out. symon can be good for this sort of thing (you can have it monitor memory/cpu use of specific processes at a frequent interval and graph them). Can this be prevented one way or another? login.conf So I thought. I had read the man page a few times, but still lack the understanding where to best put the limits. Currently, the user sits on default, with datasize-max=512M:\ datasize-cur=512M:\ :maxproc-max=128:\ :maxproc-cur=64:\ openfiles-cur=128:\ stacksize-cur=4M:\ The datasize doesn't seem to cut it, the free mem+swap usually is around 3.5G. Now I wonder which of the following is best employed limiting the system usage below die-off: cputime, filesize, memoryuse, vmemoryuse? Is there any further description, a link or a document available with a formula on 'how to prevent one's system from running out of resources at all cost'? That would be the greatest and best; then I could put that user into this class, and she could never bring down the system, right? Thanks, Uwe
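One thing worth keeping in mind when picking values: limits such as datasize apply per process, so with datasize at 512M and maxproc at 128, the theoretical total is 128 x 512M = 64G, far above the 3.5G of memory plus swap mentioned above. Besides login.conf, limits can also be tightened for a single program with the shell's `ulimit` builtin just before starting it. The sketch below assumes a shell whose `ulimit` accepts `-t` and `-d` (ksh, bash and dash do; POSIX itself only specifies `-f`); the function name and all numbers are placeholders, not tuned recommendations.

```shell
#!/bin/sh
# Run a command under tighter per-process limits than the login class
# grants. The limits are set in a subshell, so the calling shell keeps
# its own limits. All values are illustrative placeholders.
run_limited() {
	(
		ulimit -t 30        # CPU time in seconds
		ulimit -d 262144    # data segment size in kilobytes (256 MB)
		exec "$@"
	)
}

# Hypothetical use as a CGI launcher (script path is a placeholder):
#   run_limited /usr/bin/perl /htdocs/cgi-bin/some_script.pl
```

Whether this would have prevented the panic depends on what actually consumed the memory; it only caps each process started through the wrapper, not the sum over all of them.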
Is this panic related to +ExecCGI?
Dear all, for the first time ever I had one of my production boxes crash. And that was just a few hours after I allowed one of my users ExecCGI in her home, in the chrooted, default, Apache. The user was deploying some perl-script. Can anyone with insight please look at the trace and point out to me if there is a link between the +ExecCGI and the panic? Was I naive when I allowed the deployment of perl as a non-privileged user in the chrooted Apache, thinking that Apache and perl would handle any exception well enough to not crash? Thanks for advice, Uwe root on sd0a swap on sd0b dump on sd0b panic: kernel diagnostic assertion uvmexp.swpgonly = uvmexp.swpages failed: file /usr/src/sys/uvm/uvm_pdaemon.c, line 573 Stopped at Debugger+0x5: leave RUN AT LEAST 'trace' AND 'ps' AND INCLUDE OUTPUT WHEN REPORTING THIS PANIC! DO NOT EVEN BOTHER REPORTING THIS WITHOUT INCLUDING THAT INFORMATION! ddb{1} trace Debugger() at Debugger+0x5 panic() at panic+0x122 __assert() at __assert+0x21 uvm_aiodone_daemon() at uvm_aiodone_daemon+0x2fd uvm_aiodone_daemon() at uvm_aiodone_daemon+0x9e8 uvm_pageout() at uvm_pageout+0xca end trace frame: 0x0, count: -6 ddb{1} PID PPID PGRP UID S FLAGS WAIT COMMAND 11671 7706 11671 0 7 0 cron 7706 12712 12712 0 3 0x280 piperd cron 10810 18329 18329 67 3 0x2000180 netcon httpd 7948 18329 18329 67 3 0x2000180 netio httpd 7911 18329 18329 67 3 0x2000180 netcon httpd 26768 18329 18329 67 3 0x2000180 netcon httpd 22062 18329 18329 67 2 0x100 httpd 1169 18329 18329 67 3 0x2000180 netcon httpd 22624 18329 18329 67 2 0x100 httpd 18841 18329 18329 67 2 0x100 httpd 11509 18329 18329 67 7 0x100 httpd 22935 18329 18329 67 2 0x100 httpd 9398 18329 18329 67 2 0x100 httpd 23859 6501 23859 0 2 0x4000 sshd 10659 6501 10659 0 2 0x4000 sshd 27338 1 27338 1231 2 0x4002 vi 26143 18329 18329 67 2 0x100 httpd 5820 18329 18329 67 3 0x2000180 netio httpd 5758 18329 18329 67 2 0x100 httpd 12944 18329 18329 67 3 0x2000180 netio httpd 2760 18329 18329 67 2 0x100 httpd 
17215 18329 18329 67 3 0x2000180 netio httpd 26725 18329 18329 67 2 0x100 httpd 13133 18329 18329 67 3 0x2000180 netio httpd 4211 18329 18329 67 2 0x100 httpd 3040 18329 18329 67 3 0x2000180 netio httpd 19165 18329 18329 67 2 0x100 httpd 30747 18329 18329 67 2 0x100 httpd 2122 18329 18329 67 3 0x2000180 netio httpd 30941 18329 18329 67 3 0x2000180 netio httpd 10089 18329 18329 67 3 0x2000180 netio httpd 8959 18329 18329 67 2 0x100 httpd 24254 18329 18329 67 2 0x100 httpd 23026 19319 19319507 3 0x2004180 kqread pickup 30424 18329 18329 67 2 0x100 httpd 24054 18329 18329 67 2 0x100 httpd 18990 18329 18329 67 2 0x100 httpd 17856 18329 18329 67 2 0x100 httpd 29787 18329 18329 67 2 0x100 httpd 15163 18329 18329
Re: Is this panic related to +ExecCGI?
Ted Unangst wrote: You ran out of swap. Thanks, Ted, for the fast reply and immediate insight. So it must have to do with that perl program, since what I usually see is something like: Memory: Real: 107M/347M act/tot Free: 1637M Swap: 0K/2151M used/tot Slightly off-topic: Would it rather be perl committing all that memory or httpd? Can this be prevented one way or another? I understand the development was done on some Ubuntu flavour, and then transferred to OpenBSD. Can we have something like a sandbox for such situations, so that applications are running on the system, but with limited resources? The underlying application was supposed to run on a production machine with 100+ users, and at some point it needed to be tested, after it ran successfully on a laptop. What would be a better strategy? Thanks again, Uwe P.S.: And Hurray! to the serial console. I could do the trace and reboot an otherwise inaccessible and remote machine!
4.4 is getting stuck
Yesterday I upgraded my last production box (remote) from 4.3 to 4.4, without any hitch, rebooted, and so forth. Last night at some innocuous time, it stopped accepting incoming mail (postfix). This morning, courier-imap still worked well, until I used an existing ssh session like this: # pwd /usr/src/usr.sbin/httpd # cd /var/log/ # /usr/local/sbin/post postalias postfix postkick postqueue postcat postfix-disable postlock postsuper postconf postfix-enable postlog postdrop postfix-install postmap # /usr/local/sbin/postfix status ^C^Z Now it has been stuck like this for an hour or so. It still takes keyboard input, though. Courier-imap also does not respond any longer. But nmap is still somewhat okay: $ nmap -sV 172.16.0.4 Starting Nmap 4.68 ( http://nmap.org ) at 2009-01-10 08:58 SGT Interesting ports on 172.16.0.4: Not shown: 1707 closed ports PORT STATE SERVICE VERSION 13/tcp open daytime 22/tcp open ssh? 25/tcp open smtp? 37/tcp open time (32 bits) 53/tcp open domain? 80/tcp open http Apache httpd 110/tcp open pop3? 993/tcp open imaps? daytime works fine, http works very well, but domain, pop3 and smtp time out; or worse: all get stuck like here: $ telnet 172.16.0.4 25 Trying 172.16.0.4... Connected to 172.16.0.4. Escape character is '^]'. helo mail from:m...@gmail.com quit ^C^Z $ telnet 172.16.0.4 110 Trying 172.16.0.4... Connected to 172.16.0.4. Escape character is '^]'. user udippel Why am I writing in: 1. I have no access. It is a remote production server. If only I could stop that 'hanging' postfix, I might be able to issue a 'reboot' 2. Any further attempt to ssh into it also gets stuck like this: $ ssh -v 172.16.0.4 OpenSSH_5.1, OpenSSL 0.9.7j 04 May 2006 debug1: Reading configuration data /etc/ssh/ssh_config debug1: Connecting to 172.16.0.4 [172.16.0.4] port 22. debug1: Connection established. 
debug1: identity file /home/users/udippel/.ssh/identity type -1 debug1: identity file /home/users/udippel/.ssh/id_rsa type -1 debug1: identity file /home/users/udippel/.ssh/id_dsa type -1 after which I can only leave by killing the session on the client. 3. Even if I went there with a huge effort, and some time delay, how can I debug the problem, so that it won't occur again? Thanks for all ideas, Uwe
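The symptom in this post is that ports still accept connections while the daemons behind them never answer. A small probe with a timeout can tell those two states apart without leaving a stuck telnet session behind. What follows is only a sketch: the function name and the three labels are invented for illustration, and it only makes sense for protocols where the server speaks first (SMTP, POP3, IMAP, SSH), not for DNS.

```python
import socket


def probe_banner(host, port, timeout=5.0):
    """Classify a TCP service as 'unreachable', 'hung', 'closed' or 'responsive'.

    'hung' means the connection was accepted but no greeting arrived within
    the timeout, which is what the telnet sessions in this post look like.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.settimeout(timeout)
            try:
                data = conn.recv(128)
            except socket.timeout:
                # Connected, but the daemon never sent its greeting.
                return "hung"
            # An empty read means the peer closed the connection right away.
            return "responsive" if data else "closed"
    except OSError:
        # Connection refused, host unreachable, or connect timed out.
        return "unreachable"
```

Called as probe_banner("172.16.0.4", 25) from a monitoring host, it would return within the timeout instead of hanging the terminal.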
Upgrade woes with httpd at 4.3-4.4 on amd64
After the reboot I find the following:

# apachectl start
/usr/sbin/httpd:/usr/lib/libm.so.2.3: undefined symbol 'isinf'
/usr/sbin/httpd:/usr/lib/libm.so.2.3: undefined symbol 'isnan'
/usr/sbin/httpd:/usr/local/lib/php/libphp5.so: undefined symbol 'isinf'
/usr/sbin/httpd:/usr/local/lib/php/libphp5.so: undefined symbol 'isnan'
Syntax error on line 1 of /var/www/conf/modules/php5.conf:
Cannot load /usr/local/lib/php/libphp5.so into server: Cannot load specified object
/usr/sbin/apachectl start: httpd could not be started

Some details:

# ls -l /usr/local/lib/php/libphp5.so
-r--r--r-- 1 root bin 4580796 Mar 11 2008 /usr/local/lib/php/libphp5.so
# ls -l /usr/lib/libm.so.2.3
-r--r--r-- 1 root bin 613515 Mar 13 2008 /usr/lib/libm.so.2.3
# cat /var/www/conf/modules/php5.conf
LoadModule php5_module /usr/local/lib/php/libphp5.so
<IfModule mod_php5.c>
AddType application/x-httpd-php .php .phtml .php3
AddType application/x-httpd-php-source .phps
# Most php configs require this
DirectoryIndex index.php
</IfModule>

I searched the archives and Google, and now I wonder what mistake I made. (I *think* I followed the upgrade guide meticulously.) How can I get the services back?

Thanks for helping out, Uwe
Re: Upgrade woes with httpd at 4.3-4.4 on amd64
Uwe Dippel wrote: Here after reboot I find the following: # apachectl start /usr/sbin/httpd:/usr/lib/libm.so.2.3: undefined symbol 'isinf' /usr/sbin/httpd:/usr/lib/libm.so.2.3: undefined symbol 'isnan' /usr/sbin/httpd:/usr/local/lib/php/libphp5.so: undefined symbol 'isinf' /usr/sbin/httpd:/usr/local/lib/php/libphp5.so: undefined symbol 'isnan' Syntax error on line 1 of /var/www/conf/modules/php5.conf: Cannot load /usr/local/lib/php/libphp5.so into server: Cannot load specified object /usr/sbin/apachectl start: httpd could not be started

Never mind. It was sorted out after the full package upgrade, applying all the softlinks and so on, and another reboot.

Uwe
Re: dhcpd on 4.4 is problematic
Kenneth R Westerback wrote: This (untested) diff might help. Unfortunately I have no Solaris to test against and I'm off to work now. Test reports welcome, or better fixes. You lack the Solaris, and my firewall lacks the sources and stuff, so I can't compile there. But if the dhcpd is a simple, stand-alone executable, I am willing to just plug a modified version for i386 and try it out. Uwe
Re: dhcpd on 4.4 is problematic
Theo de Raadt wrote: Oh, we are supposed to ask? Please, get real. If you want to give us all the information you can file a bug report. By now you should know we won't bend over backwards to ask for information. You want this fixed as much as we do.

Sorry, Theo, the message to gnats is on the way. I hope it goes through all our filters ... Here is some relevant info, just in case:

OpenBSD 4.4 (GENERIC) #1021: Tue Aug 12 17:16:55 MDT 2008
    [EMAIL PROTECTED]:/usr/src/sys/arch/i386/compile/GENERIC
cpu0: Intel Pentium/MMX (GenuineIntel 586-class) 233 MHz
cpu0: FPU,V86,DE,PSE,TSC,MSR,MCE,CX8,MMX
cpu0: F00F bug workaround installed
real mem = 66678784 (63MB)
avail mem = 55009280 (52MB)
mainbus0 at root
bios0 at mainbus0: AT/286+ BIOS, date 07/07/97, BIOS32 rev. 0 @ 0xfd920
pcibios0 at bios0: rev 2.1 @ 0xf/0x1
pcibios0: PCI BIOS has 6 Interrupt Routing table entries
pcibios0: PCI Interrupt Router at 000:07:0 (Intel 82371SB ISA rev 0x00)
pcibios0: PCI bus #0 is the last bus
bios0: ROM list: 0xc/0x8000 0xea000/0x2000
cpu0 at mainbus0
pci0 at mainbus0 bus 0: configuration mode 1 (bios)
pchb0 at pci0 dev 0 function 0 Intel 82437VX rev 0x02
pcib0 at pci0 dev 7 function 0 Intel 82371SB ISA rev 0x01
pciide0 at pci0 dev 7 function 1 Intel 82371SB IDE rev 0x00: DMA, channel 0 wired to compatibility, channel 1 wired to compatibility
wd0 at pciide0 channel 0 drive 0: WDC AC12500R
wd0: 16-sector PIO, LBA, 2441MB, 4999680 sectors
wd0(pciide0:0:0): using PIO mode 4, DMA mode 2
pciide0: channel 1 ignored (disabled)
vga1 at pci0 dev 8 function 0 S3 Trio32/64 rev 0x54
wsdisplay0 at vga1 mux 1: console (80x25, vt100 emulation)
wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
drm at vga1 unsupported
rl0 at pci0 dev 17 function 0 Realtek 8139 rev 0x10: irq 9, address 00:40:95:00:21:a8
rlphy0 at rl0 phy 0: RTL internal PHY
xl0 at pci0 dev 19 function 0 3Com 3c905 100Base-TX rev 0x00: irq 9, address 00:60:97:73:55:1a
nsphy0 at xl0 phy 24: DP83840 10/100 PHY, rev. 1
isa0 at pcib0
isadma0 at isa0
com0 at isa0 port 0x3f8/8 irq 4: ns16550a, 16 byte fifo
com1 at isa0 port 0x2f8/8 irq 3: ns16550a, 16 byte fifo
pckbc0 at isa0 port 0x60/5
pckbd0 at pckbc0 (kbd slot)
pckbc0: using irq 1 for kbd slot
wskbd0 at pckbd0: console keyboard, using wsdisplay0
ne1 at isa0 port 0x300/32 irq 10, NE2000, address 00:50:ba:c0:41:17
pcppi0 at isa0 port 0x61
midi0 at pcppi0: PC speaker
spkr0 at pcppi0
lpt0 at isa0 port 0x378/4 irq 7
npx0 at isa0 port 0xf0/16: reported by CPUID; using exception 16
fdc0 at isa0 port 0x3f0/6 irq 6 drq 2
fd0 at fdc0 drive 0: 1.44MB 80 cyl, 2 head, 18 sec
biomask f965 netmask ff65 ttymask
softraid0 at root
root on wd0a swap on wd0b dump on wd0b

/etc/dhcpd.conf:

shared-network LOCAL-NET {
        option domain-name my.domain.com;
        option domain-name-servers 192.168.116.200;
        option netbios-name-servers 172.16.3.247;

        subnet 192.168.116.0 netmask 255.255.255.0 {
                range 192.168.116.101 192.168.116.199;
                default-lease-time 86400;
                max-lease-time 259200;
                option broadcast-address 192.168.116.255;
                option routers 192.168.116.200;
        }
}

ps aux | grep dhcp:

_dhcp 2000 0.0 1.5 452 972 ?? Is 3:13PM 0:00.05 dhcpd xl0

ifconfig:

lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> mtu 33204
        groups: lo
        inet 127.0.0.1 netmask 0xff00
        inet6 ::1 prefixlen 128
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x5
rl0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        lladdr 00:40:95:00:21:a8
        groups: egress
        media: Ethernet autoselect (100baseTX full-duplex)
        status: active
        inet 172.20.16.207 netmask 0xff00 broadcast 172.20.16.255
        inet6 fe80::240:95ff:fe00:21a8%rl0 prefixlen 64 scopeid 0x1
xl0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        lladdr 00:60:97:73:55:1a
        media: Ethernet autoselect (100baseTX full-duplex)
        status: active
        inet 192.168.116.200 netmask 0xff00 broadcast 192.168.116.255
        inet6 fe80::260:97ff:fe73:551a%xl0 prefixlen 64 scopeid 0x2
ne1: flags=8822<BROADCAST,NOTRAILERS,SIMPLEX,MULTICAST> mtu 1500
        lladdr 00:50:ba:c0:41:17
        media: Ethernet manual
enc0: flags=0<> mtu 1536
pflog0: flags=141<UP,RUNNING,PROMISC> mtu 33204
        groups: pflog

Uwe
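Before blaming the daemon, the address plan in the dhcpd.conf above can be sanity-checked with a few lines. This is a rough sketch using only the values quoted in this message; the function name is made up for illustration.

```python
import ipaddress


def check_pool(subnet, netmask, first, last, router, broadcast):
    """Return a list of problems found in a dhcpd.conf style address plan."""
    network = ipaddress.ip_network("{}/{}".format(subnet, netmask))
    lo, hi = ipaddress.ip_address(first), ipaddress.ip_address(last)
    gateway = ipaddress.ip_address(router)
    problems = []
    if lo > hi:
        problems.append("range is reversed")
    if lo not in network or hi not in network:
        problems.append("range leaves the subnet")
    if gateway not in network:
        problems.append("router is outside the subnet")
    if lo <= gateway <= hi:
        problems.append("router address lies inside the dynamic range")
    if ipaddress.ip_address(broadcast) != network.broadcast_address:
        problems.append("broadcast address does not match the netmask")
    return problems


# Values copied from the configuration quoted in this message.
issues = check_pool("192.168.116.0", "255.255.255.0",
                    "192.168.116.101", "192.168.116.199",
                    "192.168.116.200", "192.168.116.255")
print(issues or "address plan looks consistent")
```

For this configuration the check comes back clean, which fits the observation that Knoppix gets a lease from the same server.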
Re: dhcpd on 4.4 is problematic
Robert Blacquiere wrote: Missing info would be output from dhcpd in the /var/log/daemon. Please grep there on dhcpd and send this. Also add the mac address of the failing machine. This will give us atleast info about the request and offers to this box. Sure. It is a tad long, but maybe, maybe, it adds information. In between you can see the successful transaction with the Knoppix 5.3.1 to which I booted the machine for debugging purposes, to exclude hardware. I remember I saw the 167 in the Knoppix terminal. Nov 4 14:00:23 firewall dhcpd[28296]: DHCPDISCOVER from 00:20:ed:ee:ed:14 via xl0 Nov 4 14:00:23 firewall dhcpd[28296]: DHCPOFFER on 192.168.116.162 to 00:20:ed:ee:ed:14 via xl0 Nov 4 14:01:28 firewall dhcpd[28296]: DHCPDISCOVER from 00:20:ed:ee:ed:14 via xl0 Nov 4 14:01:28 firewall dhcpd[28296]: DHCPOFFER on 192.168.116.162 to 00:20:ed:ee:ed:14 via xl0 Nov 4 14:02:33 firewall dhcpd[28296]: DHCPDISCOVER from 00:20:ed:ee:ed:14 via xl0 Nov 4 14:02:33 firewall dhcpd[28296]: DHCPOFFER on 192.168.116.162 to 00:20:ed:ee:ed:14 via xl0 Nov 4 14:03:36 firewall dhcpd[28296]: DHCPDISCOVER from 00:20:ed:ee:ed:14 via xl0 Nov 4 14:03:36 firewall dhcpd[28296]: DHCPOFFER on 192.168.116.162 to 00:20:ed:ee:ed:14 via xl0 Nov 4 14:04:41 firewall dhcpd[28296]: DHCPDISCOVER from 00:20:ed:ee:ed:14 via xl0 Nov 4 14:04:41 firewall dhcpd[28296]: DHCPOFFER on 192.168.116.162 to 00:20:ed:ee:ed:14 via xl0 Nov 4 14:05:45 firewall dhcpd[28296]: DHCPDISCOVER from 00:20:ed:ee:ed:14 via xl0 Nov 4 14:05:45 firewall dhcpd[28296]: DHCPOFFER on 192.168.116.162 to 00:20:ed:ee:ed:14 via xl0 Nov 4 14:06:50 firewall dhcpd[28296]: DHCPDISCOVER from 00:20:ed:ee:ed:14 via xl0 Nov 4 14:06:50 firewall dhcpd[28296]: DHCPOFFER on 192.168.116.162 to 00:20:ed:ee:ed:14 via xl0 Nov 4 14:07:53 firewall dhcpd[28296]: DHCPDISCOVER from 00:20:ed:ee:ed:14 via xl0 Nov 4 14:07:53 firewall dhcpd[28296]: DHCPOFFER on 192.168.116.162 to 00:20:ed:ee:ed:14 via xl0 Nov 4 14:08:57 firewall dhcpd[28296]: DHCPDISCOVER from 
00:20:ed:ee:ed:14 via xl0 Nov 4 14:08:57 firewall dhcpd[28296]: DHCPOFFER on 192.168.116.162 to 00:20:ed:ee:ed:14 via xl0 Nov 4 14:10:00 firewall dhcpd[28296]: DHCPDISCOVER from 00:20:ed:ee:ed:14 via xl0 Nov 4 14:10:00 firewall dhcpd[28296]: DHCPOFFER on 192.168.116.162 to 00:20:ed:ee:ed:14 via xl0 Nov 4 14:11:05 firewall dhcpd[28296]: DHCPDISCOVER from 00:20:ed:ee:ed:14 via xl0 Nov 4 14:11:05 firewall dhcpd[28296]: DHCPOFFER on 192.168.116.162 to 00:20:ed:ee:ed:14 via xl0 Nov 4 14:12:09 firewall dhcpd[28296]: DHCPDISCOVER from 00:20:ed:ee:ed:14 via xl0 Nov 4 14:12:09 firewall dhcpd[28296]: DHCPOFFER on 192.168.116.162 to 00:20:ed:ee:ed:14 via xl0 Nov 4 14:13:13 firewall dhcpd[28296]: DHCPDISCOVER from 00:20:ed:ee:ed:14 via xl0 Nov 4 14:13:13 firewall dhcpd[28296]: DHCPOFFER on 192.168.116.162 to 00:20:ed:ee:ed:14 via xl0 Nov 4 14:14:17 firewall dhcpd[28296]: DHCPDISCOVER from 00:20:ed:ee:ed:14 via xl0 Nov 4 14:14:17 firewall dhcpd[28296]: DHCPOFFER on 192.168.116.162 to 00:20:ed:ee:ed:14 via xl0 Nov 4 14:15:21 firewall dhcpd[28296]: DHCPDISCOVER from 00:20:ed:ee:ed:14 via xl0 Nov 4 14:15:21 firewall dhcpd[28296]: DHCPOFFER on 192.168.116.162 to 00:20:ed:ee:ed:14 via xl0 Nov 4 14:16:25 firewall dhcpd[28296]: DHCPDISCOVER from 00:20:ed:ee:ed:14 via xl0 Nov 4 14:16:25 firewall dhcpd[28296]: DHCPOFFER on 192.168.116.162 to 00:20:ed:ee:ed:14 via xl0 Nov 4 14:17:30 firewall dhcpd[28296]: DHCPDISCOVER from 00:20:ed:ee:ed:14 via xl0 Nov 4 14:17:30 firewall dhcpd[28296]: DHCPOFFER on 192.168.116.162 to 00:20:ed:ee:ed:14 via xl0 Nov 4 14:18:34 firewall dhcpd[28296]: DHCPDISCOVER from 00:20:ed:ee:ed:14 via xl0 Nov 4 14:18:34 firewall dhcpd[28296]: DHCPOFFER on 192.168.116.162 to 00:20:ed:ee:ed:14 via xl0 Nov 4 14:19:38 firewall dhcpd[28296]: DHCPDISCOVER from 00:20:ed:ee:ed:14 via xl0 Nov 4 14:19:38 firewall dhcpd[28296]: DHCPOFFER on 192.168.116.162 to 00:20:ed:ee:ed:14 via xl0 Nov 4 14:20:41 firewall dhcpd[28296]: DHCPDISCOVER from 00:20:ed:ee:ed:14 via xl0 Nov 4 14:20:41 
firewall dhcpd[28296]: DHCPOFFER on 192.168.116.162 to 00:20:ed:ee:ed:14 via xl0 Nov 4 14:21:46 firewall dhcpd[28296]: DHCPDISCOVER from 00:20:ed:ee:ed:14 via xl0 Nov 4 14:21:46 firewall dhcpd[28296]: DHCPOFFER on 192.168.116.162 to 00:20:ed:ee:ed:14 via xl0 Nov 4 14:22:49 firewall dhcpd[28296]: DHCPDISCOVER from 00:20:ed:ee:ed:14 via xl0 Nov 4 14:22:49 firewall dhcpd[28296]: DHCPOFFER on 192.168.116.162 to 00:20:ed:ee:ed:14 via xl0 Nov 4 14:23:52 firewall dhcpd[28296]: DHCPDISCOVER from 00:20:ed:ee:ed:14 via xl0 Nov 4 14:23:52 firewall dhcpd[28296]: DHCPOFFER on 192.168.116.162 to 00:20:ed:ee:ed:14 via xl0 Nov 4 14:24:56 firewall dhcpd[28296]: DHCPDISCOVER from 00:20:ed:ee:ed:14 via xl0 Nov 4 14:24:56 firewall dhcpd[28296]: DHCPOFFER on 192.168.116.162 to 00:20:ed:ee:ed:14 via xl0 Nov 4 14:25:59 firewall dhcpd[28296]: DHCPDISCOVER from 00:20:ed:ee:ed:14 via xl0 Nov 4 14:25:59 firewall
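A log this long is easier to read as a summary per client. The sketch below counts DHCP message types per MAC address and lists clients that were offered an address but never acknowledged. It is only an illustration: the function names are invented, and it assumes that dhcpd logs a successful lease with a line starting with DHCPACK, which does not appear in the excerpt above.

```python
import re
from collections import Counter, defaultdict

# A message type followed (lazily) by the next MAC address in the text.
ENTRY = re.compile(r"(DHCP[A-Z]+)\b.*?((?:[0-9a-f]{2}:){5}[0-9a-f]{2})", re.S)


def summarise(log_text):
    """Count DHCP message types per client MAC address."""
    counts = defaultdict(Counter)
    for kind, mac in ENTRY.findall(log_text):
        counts[mac][kind] += 1
    return counts


def stalled_clients(counts):
    """Clients that received offers but never completed with an ACK."""
    return sorted(mac for mac, seen in counts.items()
                  if seen["DHCPOFFER"] and not seen["DHCPACK"])


# Two entries copied from the log in this message.
sample = (
    "Nov 4 14:00:23 firewall dhcpd[28296]: DHCPDISCOVER from 00:20:ed:ee:ed:14 via xl0 "
    "Nov 4 14:00:23 firewall dhcpd[28296]: DHCPOFFER on 192.168.116.162 to 00:20:ed:ee:ed:14 via xl0"
)
print(stalled_clients(summarise(sample)))
```

Run over the whole of /var/log/daemon, this would show at a glance which machines loop between DISCOVER and OFFER about once a minute, as the OpenSolaris box does here.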
dhcpd on 4.4 is problematic
I read the upgrade guide, followed it, and now have a 4.4 router in front of me. Alas, it no longer hands out an IP address to an OpenSolaris client (nv98). It used to do so before, without fail. Immediately after the upgrade to 4.4, it fails 100% of the time. It does hand out IP addresses to Knoppix 5.3.1 on the very same interface of the very same machine when booted into Knoppix. I have also restarted dhcpd with the interface on which I want the addresses to be given out ('dhcpd xl0'), to no avail. 'ifconfig nge0 dhcp' on OpenSolaris also times out every time. I only need to run 'ifconfig nge0 ..' and 'route add ..' on the OpenSolaris box, though, to connect to the router. Therefore it is not a hardware problem. It looks much more like a compatibility problem between OpenBSD 4.4 and (Open?)Solaris, which did not exist before and which must not happen. Any detail can be furnished on request, Uwe
Re: dhcpd on 4.4 is problematic
Deraj Puma wrote: This same thing happened to me last night between me and my ISP. I deleted /var/db/dhclient.leases.if and rebooted which worked.

No cigar. Of course, I have no /var/db/dhclient.leases, but I did move dhcpd.leases out of the way and rebooted. It was recreated, but no IP address was handed out. Further, I also got the firewall out of the way with 'pfctl -d'. Still no success. I tried to find the location of the DHCP leases on Solaris, but without success. I found the only config file for the dhcpagent, and it has only one uncommented line for inet4:

# By default, a parameter request list requesting a subnet mask (1),
# router (3), DNS server (6), hostname (12), DNS domain (15), broadcast
# address (28), and encapsulated vendor options (43), is sent to the DHCP
# server when the DHCP agent sends requests. However, if desired, this
# can be changed by altering the following parameter-value pair. The
# numbers correspond to the values defined in the IANA bootp-dhcp-parameters
# registry at the time of this writing.
#
PARAM_REQUEST_LIST=1,3,6,12,15,28,43

Then I continued debugging on the client side, and that was more promising. Here is the session on the admin side:

# ifconfig nge0 dhcp status
Interface  State      Sent  Recv  Declined  Flags
nge0       SELECTING     7     0         0
# ifconfig nge0 dhcp status
Interface  State      Sent  Recv  Declined  Flags
nge0       SELECTING     8     0         0
# ifconfig nge0 dhcp drop
# ifconfig nge0 dhcp status
ifconfig: nge0: interface is not under DHCP control
# ifconfig nge0 dhcp start
^C
# /sbin/dhcpagent -d 2 -v
# ps -eaf | grep dhcp
root 319 1 0 11:06:37 ? 0:00 /sbin/dhcpagent
# kill 319
# ifconfig nge0 192.168.116.91
# route add default 192.168.116.200
add net default: gateway 192.168.116.200

In the first part, one can see the counter of unsuccessful attempts incrementing. Then I stop the DHCP client and restart it; then I stop it again and ask for verbose output.

This is the dmesg, and it clearly shows a compatibility problem, in the default as well as in the verbose state:

Nov 5 11:07:07 solN /sbin/dhcpagent[319]: [ID 566172 daemon.warning] recv_pkt: bad option overload
Nov 5 11:13:18 solN last message repeated 17 times
Nov 5 11:13:50 solN /sbin/dhcpagent[319]: [ID 566172 daemon.warning] recv_pkt: bad option overload
Nov 5 11:15:50 solN last message repeated 11 times
Nov 5 11:16:28 solN /sbin/dhcpagent[1156]: [ID 787751 daemon.error] dhcp_ipc_init: cannot bind to port 4999 (agent already running?)
Nov 5 11:16:53 solN /sbin/dhcpagent[319]: [ID 566172 daemon.warning] recv_pkt: bad option overload

What 'recv_pkt: bad option overload' means is beyond me, but it shows that Solaris is not going to accept offers from the dhcpd of 4.4. Alas, and this is my fault, I have no backup of 4.3, but exactly the same setup worked flawlessly up to and including 4.3.

Uwe
Re: dhcpd on 4.4 is problematic
Here is what Stuart requested. I hope the attachment goes through! Uwe 12:10:18.698196 00:20:ed:df:a7:28 ff:ff:ff:ff:ff:ff 0800 342: 0.0.0.0.68 255.255.255.255.67: [udp sum ok] xid:0x8e0c275e vend-rfc1048 DHCP:DISCOVER MSZ:1472 LT:4294967295 VC:83.85.78.87.46.105.56.54.112.99 PR:SM+DG+NS+HN+DN+BR+VO (DF) (ttl 255, id 43389, len 328) : 4500 0148 a97d 4000 ff11 d127 E..H)[EMAIL PROTECTED]' 0010: 0044 0043 0134 0ce5 0101 0600 .D.C.4.e 0020: 8e0c 275e ..'^ 0030: 0020 eddf a728 . m_'(.. 0040: 0050: 0060: 0070: 0080: 0090: 00a0: 00b0: 00c0: 00d0: 00e0: 00f0: 0100: 6382 5363 3501 0139 c.Sc5..9 0110: 0205 c033 04ff ff3c 0a53 554e 572e [EMAIL PROTECTED].SUNW. 0120: 6938 3670 6337 0701 0306 0c0f 1c2b ff00 i86pc7...+. 0130: 0140: 12:10:18.700699 00:60:97:73:55:1a 00:20:ed:df:a7:28 0800 362: 192.168.116.200.67 192.168.116.102.68: [udp sum ok] xid:0x8e0c275e Y:192.168.116.102 S:192.168.116.200 vend-rfc1048 OO:0 DHCP:OFFER SID:192.168.116.200 LT:259200 SM:255.255.255.0 DG:192.168.116.200 NS:192.168.116.200 DN:uwe.uniten.edu.my BR:192.168.116.255 RN:129600 RB:226800 WNS:172.16.3.247 [tos 0x10] (ttl 16, id 0, len 348) : 4510 015c 1011 3f02 c0a8 74c8 E..\..?.@(tH 0010: c0a8 7466 0043 0044 0148 82d5 0201 0600 @(tf.C.D.H.U 0020: 8e0c 275e c0a8 7466 ..'^@(tf 0030: c0a8 74c8 0020 eddf a728 @(tH. m_'(.. 0040: 0050: 0060: 0070: 0080: 0090: 00a0: 00b0: 00c0: 00d0: 00e0: 00f0: 0100: 6382 5363 3401 0035 c.Sc4..5 0110: 0102 3604 c0a8 74c8 3304 0003 f480 0104 ..6.@(tH3...t... 
0120: ff00 0304 c0a8 74c8 0604 c0a8 74c8 ...@(tH..@(tH 0130: 0f11 7577 652e 756e 6974 656e 2e65 6475 ..uwe.uniten.edu 0140: 2e6d 791c 04c0 a874 ff3a 0400 01fa 403b .my..@(t:...z@; 0150: 0400 0375 f02c 04ac 1003 f7ff...up,.,..w 12:10:23.217543 00:20:ed:df:a7:28 ff:ff:ff:ff:ff:ff 0800 342: 0.0.0.0.68 255.255.255.255.67: [udp sum ok] xid:0x8e0c275e secs:5 vend-rfc1048 DHCP:DISCOVER MSZ:1472 LT:4294967295 VC:83.85.78.87.46.105.56.54.112.99 PR:SM+DG+NS+HN+DN+BR+VO (DF) (ttl 255, id 43390, len 328) : 4500 0148 a97e 4000 ff11 d126 E..H)[EMAIL PROTECTED] 0010: 0044 0043 0134 0ce0 0101 0600 .D.C.4.` 0020: 8e0c 275e 0005 ..'^ 0030: 0020 eddf a728 . m_'(.. 0040: 0050: 0060: 0070: 0080: 0090: 00a0: 00b0: 00c0: 00d0: 00e0: 00f0: 0100: 6382 5363 3501 0139 c.Sc5..9 0110:
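The decoded OFFER above shows "OO:0", and the hex dump has the bytes 34 01 00 right after the magic cookie 6382 5363: option 52 (option overload), length 1, value 0. RFC 2132 only defines the values 1, 2 and 3 for that option, so this may well be what the Solaris agent rejects as "bad option overload". The sketch below walks a DHCP options field and flags such a value. It is a sketch only: the function names are invented, and the test packet is rebuilt by hand from the intact leading bytes of the dump (cookie, options 52, 53, 54 and 51) with an end marker appended, not taken from the full capture.

```python
MAGIC_COOKIE = bytes.fromhex("63825363")
OPTION_OVERLOAD = 52  # RFC 2132: legal values are 1 (file), 2 (sname), 3 (both)


def parse_options(data):
    """Split a DHCP options field into (code, value) pairs."""
    if data[:4] != MAGIC_COOKIE:
        raise ValueError("missing DHCP magic cookie")
    options, i = [], 4
    while i < len(data):
        code = data[i]
        if code == 255:   # end option
            break
        if code == 0:     # pad option, single byte without length
            i += 1
            continue
        length = data[i + 1]
        options.append((code, data[i + 2:i + 2 + length]))
        i += 2 + length
    return options


def overload_problem(options):
    """Return a description if option 52 is present with an illegal value."""
    for code, value in options:
        if code == OPTION_OVERLOAD:
            if len(value) != 1 or value[0] not in (1, 2, 3):
                return "option 52 carries invalid value 0x" + value.hex()
    return None


# Leading options of the OFFER as they appear in the dump, plus an end marker.
offer = bytes.fromhex("63825363" "340100" "350102" "3604c0a874c8" "33040003f480" "ff")
print(overload_problem(parse_options(offer)))
```

If this reading is right, the fix belongs in dhcpd (do not emit option 52 unless the file or sname fields are really overloaded), but that is an inference from one capture and still needs confirmation from the developers.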
Re: Slow file access on Compact Flash [SOLVED]
Uwe Dippel wrote: 4.2 runs out of the box, but with very slow access of files. The CF is reasonably fast, though, with ~6MB at 'dd'. But once it has to access files for r/w, it gets very slow. Any hint welcome, I got a really great hint. Let me start with the results: tar -C /tmp -xzphf etc43.tgz takes exactly 3 min 16 sec With softdep on, it takes exactly 2 seconds. Thanks so much! Now CF is as fast as hard disk. (I use a 133x Kingston) Uwe
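For the archive: soft updates are switched on per filesystem with the softdep mount option in /etc/fstab. The helper below shows the edit on a single fstab line. It is a sketch under assumptions: the function name is made up and the sample line is hypothetical (only the device name wd0a comes from the dmesg in this thread).

```python
def add_mount_option(fstab_line, option="softdep"):
    """Append a mount option to the fourth field of an fstab line."""
    fields = fstab_line.split()
    # Leave comments and malformed lines untouched.
    if len(fields) < 4 or fstab_line.lstrip().startswith("#"):
        return fstab_line
    options = fields[3].split(",")
    if option not in options:
        options.append(option)
    fields[3] = ",".join(options)
    return " ".join(fields)


# Hypothetical root filesystem entry.
print(add_mount_option("/dev/wd0a / ffs rw 1 1"))
```

With the timings reported above, 3 min 16 s against 2 s, that one option makes the extraction roughly a hundred times faster on this card.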
Slow file access on Compact Flash
[I read all postings in the archive AFAIK] Just started with CF on embedded hardware advertised to run OpenBSD; ARInfoTek. It does run OpenBSD very well! Now I want the embedded system to run off CF; the board has a CF socket to be wd0. 4.2 runs out of the box, but with very slow access of files. The CF is reasonably fast, though, with ~6MB at 'dd'. But once it has to access files for r/w, it gets very slow. I found some postings that 4.3 would be better, but the install of 4.3 here mainly -stalled- and took a good hour, from a local ftp-site. locate.updatedb is incredibly fast, while some file extraction takes ages. It looks like a large, single, file copies very fast, similar to 'dd'. But opening a file for r/w seems to take ages. Something like tar -C /tmp -xzphf etc43.tgz takes a minute, easily. And etc43.tgz is only 1.2MB. Copying of this file is quick: $ date cp etc43.tgz demo date Mon Oct 20 11:29:15 SGT 2008 Mon Oct 20 11:29:16 SGT 2008 Any hint welcome, Uwe OpenBSD 4.3 (GENERIC) #698: Wed Mar 12 11:07:05 MDT 2008 [EMAIL PROTECTED]:/usr/src/sys/arch/i386/compile/GENERIC cpu0: Geode(TM) Integrated Processor by AMD PCS (AuthenticAMD 586-class) 500 MHz cpu0: FPU,DE,PSE,TSC,MSR,CX8,SEP,PGE,CMOV,CFLUSH,MMX real mem = 527785984 (503MB) avail mem = 502276096 (479MB) mainbus0 at root bios0 at mainbus0: AT/286+ BIOS, date 05/23/08, BIOS32 rev. 0 @ 0xfaf00 apm0 at bios0: Power Management spec V1.2 (slowidle) apm0: AC on, battery charge unknown pcibios0 at bios0: rev 2.1 @ 0xf/0xdb74 pcibios0: PCI IRQ Routing Table rev 1.0 @ 0xfdaf0/128 (6 entries) pcibios0: PCI Exclusive IRQs: 5 7 10 11 pcibios0: no compatible PCI ICU found: ICU vendor 0x1022 product 0x2090 pcibios0: Warning, unable to fix up PCI interrupt routing pcibios0: PCI bus #1 is the last bus bios0: ROM list: 0xc/0x8000 0xef000/0x1000! 
cpu0 at mainbus0 pci0 at mainbus0 bus 0: configuration mode 1 (no bios) pchb0 at pci0 dev 1 function 0 AMD Geode LX rev 0x31 vga1 at pci0 dev 1 function 1 AMD Geode LX Video rev 0x00 wsdisplay0 at vga1 mux 1: console (80x25, vt100 emulation) wsdisplay0: screen 1-5 added (80x25, vt100 emulation) glxsb0 at pci0 dev 1 function 2 AMD Geode LX Crypto rev 0x00: RNG AES glxpcib0 at pci0 dev 15 function 0 AMD CS5536 ISA rev 0x03: rev 0, 32-bit 3579545Hz timer, watchdog, gpio gpio0 at glxpcib0: 32 pins pciide0 at pci0 dev 15 function 2 AMD CS5536 IDE rev 0x01: DMA, channel 0 wired to compatibility, channel 1 wired to compatibility wd0 at pciide0 channel 0 drive 0: ULTIMATE CF CARD 4GB wd0: 1-sector PIO, LBA, 3967MB, 8124480 sectors wd0(pciide0:0:0): using PIO mode 4, DMA mode 2 pciide0: channel 1 ignored (disabled) ohci0 at pci0 dev 15 function 4 AMD CS5536 USB rev 0x02: irq 7, version 1.0, legacy support ehci0 at pci0 dev 15 function 5 AMD CS5536 USB rev 0x02: irq 7 usb0 at ehci0: USB revision 2.0 uhub0 at usb0 AMD EHCI root hub rev 2.00/1.00 addr 1 ppb0 at pci0 dev 18 function 0 Pericom PCI-PCI rev 0x00 pci1 at ppb0 bus 1 fxp0 at pci1 dev 12 function 0 Intel 82559ER rev 0x10, i82551: irq 5, address 00:14:b7:00:26:6a inphy0 at fxp0 phy 1: i82555 10/100 PHY, rev. 4 fxp1 at pci1 dev 13 function 0 Intel 82559ER rev 0x10, i82551: irq 11, address 00:14:b7:00:26:6b inphy1 at fxp1 phy 1: i82555 10/100 PHY, rev. 4 fxp2 at pci1 dev 14 function 0 Intel 82559ER rev 0x10, i82551: irq 10, address 00:14:b7:00:26:6c inphy2 at fxp2 phy 1: i82555 10/100 PHY, rev. 4 fxp3 at pci0 dev 19 function 0 Intel 82559ER rev 0x10, i82551: irq 11, address 00:14:b7:00:26:69 inphy3 at fxp3 phy 1: i82555 10/100 PHY, rev. 
4 isa0 at glxpcib0 isadma0 at isa0 pckbc0 at isa0 port 0x60/5 pckbdprobe: reset response 0xfa pcppi0 at isa0 port 0x61 midi0 at pcppi0: PC speaker spkr0 at pcppi0 npx0 at isa0 port 0xf0/16: reported by CPUID; using exception 16 pccom0 at isa0 port 0x3f8/8 irq 4: ns16550a, 16 byte fifo usb1 at ohci0: USB revision 1.0 uhub1 at usb1 AMD OHCI root hub rev 1.00/1.00 addr 1 biomask f3cf netmask ffef ttymask ffef mtrr: K6-family MTRR support (2 registers) uplcom0 at uhub1 port 1 Prolific Technology Inc. USB-Serial Controller rev 1.10/3.00 addr 2 ucom0 at uplcom0 uhidev0 at uhub1 port 2 configuration 1 interface 0 NOVATEK USB Multimedia Keyboard rev 1.10/1.00 addr 3 uhidev0: iclass 3/1 ukbd0 at uhidev0: 8 modifier keys, 6 key codes, country code 33 wskbd0 at ukbd0 mux 1 wskbd0: connecting to wsdisplay0 uhidev1 at uhub1 port 2 configuration 1 interface 1 NOVATEK USB Multimedia Keyboard rev 1.10/1.00 addr 3 uhidev1: iclass 3/0, 3 report ids uhid0 at uhidev1 reportid 2: input=1, output=0, feature=0 uhid1 at uhidev1 reportid 3: input=3, output=0, feature=0 softraid0 at root root on wd0a swap on wd0b dump on wd0b
Re: apc Back-UPS ES 525
On Wed, 16 Jul 2008 19:41:55 +0700, sonjaya wrote: i have small ups seri APC / Back-UPS ES 525 , how to joint and control with openbsd , i try using apc-upsd when test not working. then i try nut but unknown driver. if any sucsess story can share to me :)

Yes, but not with ports this time. Here I use apcupsd from http://www.apcupsd.org/. Check it out: they have a (slightly) outdated section on how to install it on OpenBSD.

Uwe
Re: Postfix race condition at boot
On Mon, 14 Jul 2008 12:47:40 -0500, Karl O. Pinc wrote: I've an OpenBSD box that's been running postfix for a few years, strictly as a send-only mta, and every night the box gets rebooted. Every couple of months postfix does not come up on reboot. All that shows up in the logs is: snip postfix/postfix-script[3005]: fatal: Postfix integrity check failed! My suspicion is that syslogd has not yet finished making the log socket and the postfix check that happens at postfix start fails. (/etc/rc.conf.local has: syslogd_flags=-a /var/spool/postfix/dev/log ) I can always log in and start postfix manually using the same sendmail command that the rc scripts use. Any suggestions as to how to confirm the problem and/or what to do about it? Does anyone else have this problem? Should I be talking to the postfix port maintainer?

Alright. I have exactly the same problem. I asked ports@ and got only one off-list mail confirming this, plus one from a chap who has a similar problem with another application. I wonder why there was nothing on the list, though. I know all too well that the people here care about correctness, yet the start sequence seems to falter, or is maybe unclear? I can also confirm that the problem appears only on my smallest and oldest box: 1.7 GHz, 256 MB.

Solution? Remove the sendmail flags from rc.conf.local and put a 'postfix start' at the end of rc.local. That should help.

Uwe
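To confirm or rule out Karl's suspicion, one could wait explicitly for the log socket before starting postfix and record whether the wait was ever needed. The following is a minimal sketch of such a wait; the function name and the default timings are assumptions, and the path is the one from the syslogd_flags quoted above.

```python
import os
import stat
import time


def wait_for_socket(path, timeout=30.0, interval=0.5):
    """Poll until a Unix socket exists at path; return False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            if stat.S_ISSOCK(os.stat(path).st_mode):
                return True
        except OSError:
            # Not there yet (or a parent directory is missing): keep polling.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A wrapper called from rc.local could run wait_for_socket("/var/spool/postfix/dev/log"), log how long it took, and only then run 'postfix start'. If postfix still fails its integrity check after the socket was present, the race lies elsewhere.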
Re: Postfix race condition at boot
On Sun, 20 Jul 2008 20:19:05 +1000, Damien Miller wrote: My suspicion is that syslogd has not yet finished making the log socket and the postfix check that happens at postfix start fails. That shouldn't happen, because syslogd delays its exit until after its log sockets have been established.

Damien, I am not so sure that it is syslog that fails. I have something else failing before that; please see my maillog as posted to ports@:

Jul 11 11:56:19 claude authdaemond: modules=authuserdb authpwd authpgsql authldap authmysql authpipe, daemons=5
Jul 11 11:56:19 claude authdaemond: Installing libauthuserdb
Jul 11 11:56:19 claude authdaemond: File not found
Jul 11 11:56:19 claude authdaemond: Installing libauthpwd
Jul 11 11:56:19 claude authdaemond: Installation complete: authpwd
Jul 11 11:56:19 claude authdaemond: Installing libauthpgsql
Jul 11 11:56:19 claude authdaemond: File not found
Jul 11 11:56:19 claude authdaemond: Installing libauthldap
Jul 11 11:56:19 claude authdaemond: File not found
Jul 11 11:56:19 claude authdaemond: Installing libauthmysql
Jul 11 11:56:19 claude authdaemond: File not found
Jul 11 11:56:19 claude authdaemond: Installing libauthpipe
Jul 11 11:56:19 claude authdaemond: Installation complete: authpipe
Jul 11 11:56:20 claude postfix/postfix-script[17841]: fatal: Postfix integrity check failed!

I am not aware that I use courier-authlib for that postfix, but who knows what it checks? In any case, postfix seems to wait for something that slower machines cannot provide fast enough. If you have any idea how to debug this and find out *what* it can't find, let me know,

Uwe
httpd-problem after upgrade 4.2 - 4.3
After the successful upgrade of the first machine, I have some trouble with the second. Chances are that the trouble is my fault, but I would still appreciate a clue: Apache reacts very slowly. Despite a load of 0.5, lynx 127.0.0.1 (as root) takes more than 5-10 seconds until the static

-rwxr-x--- 1 root www 2236 Dec 12 2006 /var/www/htdocs/index.html

pops up. Any other task on the system completes instantaneously. From other machines on the same network, it takes a similar time to see that page. But it is not a data-rate problem, because after the long wait, the data itself comes down to the clients at close to 100Mb/sec. Downloading a file of 60M on the same subnet takes about 20 seconds to connect to the IP-address/subdir and 7 seconds for the transfer.

pf is disabled, /etc/hosts is

::1 localhost.uniten.edu.my localhost
127.0.0.1 localhost.uniten.edu.my localhost
::1 metalab.uniten.edu.my metalab
172.16.0.2 metalab.uniten.edu.my metalab

Apache has been restarted; it stops and restarts 'graceful' within a second or two. top says

Memory: Real: 76M/423M act/tot Free: 1562M Swap: 0K/2151M used/tot

I am stumped, Uwe
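The numbers in this post can be turned into a quick back-of-the-envelope check that the delay sits in connection setup and not in the transfer. This is plain arithmetic, assuming that "60M" means 60 megabytes; the helper name is made up.

```python
def throughput_mbit(size_mbytes, seconds):
    """Average transfer rate in Mbit/s for a payload given in megabytes."""
    return size_mbytes * 8 / seconds


connect_s, transfer_s, size_mb = 20, 7, 60   # figures from the post

rate = throughput_mbit(size_mb, transfer_s)
waiting_share = connect_s / (connect_s + transfer_s)

print("transfer rate: {:.1f} Mbit/s".format(rate))
print("share of wall time spent waiting to connect: {:.0%}".format(waiting_share))
```

Roughly 69 Mbit/s once the data flows is in the range of a healthy 100 Mbit link, while about three quarters of the total time passes before the first byte arrives. That points at whatever handles new connections, not at the network path.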
Re: httpd-problem after upgrade 4.2 - 4.3
On Thu, 08 May 2008 09:41:23 +0800, Uwe Dippel wrote: Apache reacts very slow. Despite of a load 0.5, lynx 127.0.0.1 (as root) takes more than 5-10 seconds until the static -rwxr-x--- 1 root www 2236 Dec 12 2006 /var/www/htdocs/index.html props up. Any other task on the system is done instantaneous. From other machines, on the same network, it takes a similar time to see that page.

Sorry, guys, brown bag. I seem to be very noisy these days. The reason for this was just a little DDoS! Once that was out of the way, everything behaved fine. My apologies again, Uwe