Re: MySQL connection error after upgrade 4.9-5.0

2012-03-15 Thread Uwe Dippel
One more step towards OT, and yet on the spot. Now it is on 'beauty'.
At shutdown, we always get
stopping package daemons:/etc/rc[260]: /etc/rc.d/: cannot execute - Is a directory

Is there anything wrong, still, with our configuration?

Uwe



MySQL connection error after upgrade 4.9-5.0

2012-03-14 Thread Uwe Dippel
I have this unfortunate occurrence on one of my production machines:
Database Error: Unable to connect to the database:Could not connect to MySQL
I studied the Upgrade Guide 4.9 to 5.0 intensely before and after, but
can't find what went wrong. I just did the upgrade, and made the links
as proposed (so I hope)

# pwd
/etc/php-5.2
# ls -l
total 0
lrwxr-xr-x  1 root  wheel  26 Mar 14 17:26 gd.ini -> /etc/php-5.2.sample/gd.ini
lrwxr-xr-x  1 root  wheel  29 Mar 14 17:25 mysql.ini -> /etc/php-5.2.sample/mysql.ini

# ls -l /var/www/conf/modules
total 0
lrwxr-xr-x  1 root  daemon  41 Mar 14 16:26 php.conf -> /var/www/conf/modules.sample/php-5.2.conf

I also tried to copy the modified php.ini from /var/www/conf to /etc/php-5.2.ini
Then the result is:
Database Error: Unable to connect to the database:The MySQL adapter
mysql is not available.

pkg_info looks okay:
mini_sendmail-chroot-1.3.6p1 static mini_sendmail for chrooted apache
mysql-client-5.1.54p0 multithreaded SQL database (client)
mysql-server-5.1.54p9 multithreaded SQL database (server)
nano-2.2.6  Pico editor clone with enhancements
[...]
pfstat-2.3p1 packet filter statistics visualization
php-5.2.17p5 server-side HTML-embedded scripting language
php-gd-5.2.17p4 image manipulation extensions for php5
php-mysql-5.2.17p3  mysql database access extensions for php5
png-1.5.4   library for manipulating PNG images

I would be very grateful for any help or advice on how to debug this further.

Uwe



Re: MySQL connection error after upgrade 4.9-5.0

2012-03-14 Thread Uwe Dippel
On Wed, Mar 14, 2012 at 7:27 PM, Norman Golisz li...@zcat.de wrote:

> However, did you change any values in php.ini from default?

No, not yet. I wasn't actually expecting everything to work 100%, but
at least to come up with the 5.0 default php-5.2.ini, or with the previous
php.ini in that place.
I also looked at the 'diff' before my post, but the differences are just huge.

Which values did you have in mind for changing in the default php.ini?

Uwe



Re: MySQL connection error after upgrade 4.9-5.0

2012-03-14 Thread Uwe Dippel
On Wed, Mar 14, 2012 at 6:55 PM, Fred Crowson fred.crow...@gmail.com wrote:

> What do your logs say in /var/mysql/ ?
>
> hth

Yes, Fred, very much! - It is obvious that I failed - and still fail -
to understand the new startup system. Can anyone point me to a
complete overview to read up on it?
I don't yet understand which utility starts and controls the package scripts in rc.d.
And slightly OT: I have stared at the
pkg_scripts=${pkg_scripts} postfix in the Upgrade Guide, and still
don't grasp what this is supposed to do, and where; since I am running
postscript.

Thanks again,

Uwe



Re: MySQL connection error after upgrade 4.9-5.0

2012-03-14 Thread Uwe Dippel
On Wed, Mar 14, 2012 at 8:35 PM, Rodolfo Gouveia rgouv...@cosmico.net wrote:

> > And slightly OT: I have stared at the
> > pkg_scripts=${pkg_scripts} postfix in the Upgrade Guide, and still
> > don't grasp what this is supposed to do, and where; since I am running
> > postscript.
>
> http://www.openbsd.org/faq/faq10.html#rc
> man rc.d

I had read those. And yet, I don't understand that line. It doesn't
look like it should be written into rc.conf / rc.conf.local, does it?
Correct me if I'm wrong. It looks like a shell variable that has
'postfix' appended.

Uwe



Re: MySQL connection error after upgrade 4.9-5.0

2012-03-14 Thread Uwe Dippel
On Wed, Mar 14, 2012 at 10:22 PM, Uwe Dippel udip...@gmail.com wrote:

> I had read those. And yet, I don't understand that line. It doesn't
> look like it should be written into rc.conf / rc.conf.local, does it?
> Correct me if I'm wrong. It looks like a shell variable that has
> 'postfix' appended.

Oops, I think I finally got it. The shell variable is defined in
rc.conf, and actually has 'postfix' appended when rc.conf.local is
being run.
Personally, I would not have expected the variable to be created in
rc.conf, because since 4.something that file has been considered 'clean' of
user entries at upgrade.
Would it not be better to add package start strings in rc.conf.local only?
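
The append behaviour described here can be tried out in any sh. This is only a sketch: the empty starting value stands in for whatever rc.conf really sets, the quoting is how one would normally write such a line in sh, and the second daemon name (mysqld) is a hypothetical extra entry.

```shell
# Sketch only: mimics the order in which the two files are read.
# Assumption: rc.conf starts out with an empty list of package scripts.
pkg_scripts=

# What a line in rc.conf.local then does: append to whatever is already set.
pkg_scripts="${pkg_scripts} postfix"
pkg_scripts="${pkg_scripts} mysqld"   # hypothetical second entry

# Word splitting on the unquoted variable yields one name per daemon.
for script in ${pkg_scripts}; do
    echo "would start: ${script}"
done
```

Because each line only appends, the order of the lines is the order of the resulting list.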

Uwe



etc/nsd.conf with wrong group after upgrade

2011-05-06 Thread Uwe Dippel
I did the upgrade 4.8 to 4.9 as outlined in
http://www.openbsd.org/faq/upgrade49.html

Now I get this in the daily insecurity output:

Checking special files and directories.
Output format is:
   filename:
   criteria (shouldbe, reallyis)
etc/nsd.conf:
   gid (97, 0)

Did I miss anything? (I don't think so.)
Yes, I know how to correct this, but I'd rather make sure that
everything is done properly on both sides. I think that when
one does

cd /tmp/etc
cp daily disktab man.conf monthly netstart nsd.conf pf.os rc rc.conf weekly /etc

as instructed in the Upgrade Guide, root:wheel is the outcome.
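
A minimal sketch of the kind of check behind that report, assuming that `ls -n` prints the numeric gid in the fourth column. The function names are made up for illustration and are not part of the security script.

```shell
# Print the numeric group id of a file (4th column of `ls -ldn`).
gid_of() {
    ls -ldn "$1" | awk '{ print $4 }'
}

# Compare a file's gid against the expected one and report a mismatch in the
# style of the "criteria (shouldbe, reallyis)" output. Non-zero on mismatch.
check_gid() {
    file=$1
    expected=$2
    actual=$(gid_of "$file")
    if [ "$actual" != "$expected" ]; then
        echo "$file: gid ($expected, $actual)"
        return 1
    fi
}
```

With the numbers from the report, `check_gid /etc/nsd.conf 97` would print the mismatch; if 97 really is the intended group, `chgrp 97 /etc/nsd.conf` as root brings the file back in line.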

Uwe



Re: Check for localtime?

2011-05-06 Thread Uwe Dippel
On Fri, May 6, 2011 at 3:33 PM, Otto Moerbeek o...@drijf.net wrote:

> The original files didn't differ, but the install replaced Singapore
> to which /etc/localtime syms. So after resolving symlinks
> /etc/localtime did change.
>
> Your copy in your postfix install wasn't changed.

Thanks, Otto, for the explanation. Topic closed.

Uwe



softraid - best practice?

2010-11-18 Thread Uwe Dippel
Just had a problem with softraid on a 4.6 box. No, I am not asking anyone to
solve it; it needed urgent replacement, and so I replaced it.
What I would like to ask for is advice on best practices for softraid
under OpenBSD, to prevent similar things from happening again: hints on
how to set it up better, and above all, how to recover it better.


What happened was that some slices in a softraid simply went away after
a power surge.
In detail: sd1 and sd2 were set to RAID, and the resulting RAID1 (sd3) was
sliced up into /usr, /var, /home, /var/www, /var/mail and swap.
After the reboot following the power surge, two of the slices (/var/mail on
sd3g and /home on sd3h) were simply unavailable, couldn't be 'mount -a'-ed at
boot, and the system fell back to only '/' being mounted (on sd0).
Strangely, though, disklabel sd3 showed the slices as sd3g and sd3h. But
they could not be accessed at all, and were not visible under /dev/.
Unexpected behaviour as far as I am concerned, even more so
since sysctl and bioctl showed an 'OK' and 'Online' softraid.
I tried a few things, like fsck_ffs on these two disappeared slices, as
well as on the 'good' ones. The good ones were good, also with fsck_ffs -f.
But the two that had gone missing were just not available (as devices). Then I
made, I guess, a big mistake: instead of pulling one of the
drives, I ran bioctl -d on sd3, leaving 2 drives with RAID file system on
them. Over.


Now, please, any suggestions on how to do better next time something 
like this happens?

Thanks in advance,

Uwe



Re: Installer bug? - Upgrade 4.6 to 4.7 failed to upgrade base47, on i386 and amd64

2010-06-05 Thread Uwe Dippel
Tony Abernethy tony at servasoftware.com writes:

> Might be better to read and comprehend ``man patch'' before assuming
> limitations on the scope of patch's reach.

It is always so nice to trample on the person lying on the ground, ain't it!
Where in 'man patch' is the underlying problem addressed? - oh, yeah, maybe mine
is the old version, again ... .



Re: Installer bug? - Upgrade 4.6 to 4.7 failed to upgrade base47, on i386 and amd64

2010-06-05 Thread Uwe Dippel
Philip Guenther guenther at gmail.com writes:

> Please point to the part of the Upgrade Guide which talks about
> building from source, untarring the src tar file, or applying errata.
> I can't seem to find any such reference, but I'm sure it's in there
> somewhere, because you originally said that you did the upgrade
> exactly following the upgrade guide and, as we found out later, your
> steps included building from source.

You misread what I did. I was following the Upgrade Guide to the dot, following
'Applying patches in OpenBSD' to the dot, and then the instructions in the patch
files. To the dot. This is where my unfortunate quarrel with Jacob came from,
when he said I was building userland, and I insisted I was applying errata.
From what I have learned, 'Applying patches in OpenBSD' should be removed from
the FAQ, since that would have spared us all this thread. It has obviously
led a number of people astray.

Uwe



Re: Installer bug? - Upgrade 4.6 to 4.7 failed to upgrade base47, on i386 and amd64

2010-06-05 Thread Uwe Dippel
Philip Guenther guenther at gmail.com writes:

> You now have and now it
> seems the core discussion is just about whether (or where) an
> additional rm -rf /usr/obj/* should be added to help people that
> know enough to set up the source tree for building/patching by
> untaring src.tar.gz but don't know to remove the obj tree at the same
> time.

So, no diff here, but a suggestion:

If one needs to avoid stale stuff lying around in /usr/obj when applying a patch,
the only logical consequence is to clean out /usr/obj completely, before
applying a single patch.
If I am correct, the instructions for 00N_ThisApp.patch should read:

Apply by doing:
 cd /usr/src
 patch -p0 < 00N_ThisApp.patch

Clean the build directories by issuing the command /usr/sbin/mk_build_clean

And then rebuild and install the library and statically-linked binaries
that depend upon it:

 cd lib/libThisApp
 make obj
 make depend
 make includes
 make
 make install
 cd ../../sbin
 make obj
 make depend
 make
 make install
   

where mk_build_clean would just be the set of steps pointed out in 'man release'
and in FAQ 5.
To me, and I guess to Richard Toohey, the case is solved.

Everyone who can read, and likes following instructions, can read and follow
this easily.
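
The thread never spells out what the proposed mk_build_clean would contain, so the following is only a guess at its core: empty the object tree before rebuilding, and check that it really is empty. Both function names and the directory argument are assumptions; on a real system the argument would be /usr/obj and the commands would run as root.

```shell
# Hypothetical stand-in for the proposed mk_build_clean helper.
# Removes everything below the given object directory, not the directory itself.
clean_obj_dir() {
    dir=$1
    # Refuse to run without an existing directory, so a typo cannot expand to "/*".
    [ -n "$dir" ] && [ -d "$dir" ] || { echo "usage: clean_obj_dir DIR" >&2; return 1; }
    rm -rf "${dir:?}"/*
}

# Succeeds only if the directory contains no (non-hidden) entries.
obj_dir_is_clean() {
    [ -z "$(ls "$1")" ]
}
```

After `clean_obj_dir /usr/obj && obj_dir_is_clean /usr/obj`, the `make obj` step from the instructions above recreates the object directories for whatever is being rebuilt.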

Uwe



Re: Installer bug? - Upgrade 4.6 to 4.7 failed to upgrade base47, on i386 and amd64

2010-06-04 Thread Uwe Dippel
On Thu, Jun 3, 2010 at 7:18 PM, Richard Toohey
richardtoo...@paradise.net.nz wrote:

> OK, I've tried it and cannot reproduce what you see.  I've never done
> an upgrade from bsd.rd before, so wanted to give it a go.
>
> Obviously something different with your set-up, or where you got the files
> from, or factor X - but as other people have said, they can't guess what.
>
> In short - the basic bsd.rd follow these instructions worked for me here.
>
> OK, I start with 4.6 amd64 (either 4.6 or just pre-4.6 release)
>
> uname -a
> OpenBSD dellamd64.home 4.6 GENERIC#0 amd64
>
> But before I upgrade, what's /sbin/pfctl?
>
> $ ls -l /sbin/pfctl
> -r-xr-xr-x  1 root  bin  492664 Dec  3 23:12 /sbin/pfctl
> $ md5 /sbin/pfctl
> MD5 (/sbin/pfctl) = 3e1fa4f69809adff432f9da62010a6a7
>
> http://openbsd.org/faq/upgrade47.html
>
> One easy way to boot from the install kernel is to place the 4.7 version of
> bsd.rd in the root of your boot drive, then instruct the
> boot loader to boot using this new bsd.rd file. On amd64 and i386, you do
> this by entering boot bsd.rd at the initial boot prompt.
>
> OK, I'll get the bsd.rd from the 4.7 release CD (but could have used FTP.)
>
> /usr/bin/su root
> mv /bsd.rd /bsd46.rd
> mount /dev/cd0a /mnt/
> cp /mnt/4.7/amd64/bsd.rd /bsd.rd
> umount /mnt
> eject /dev/cd0a
> reboot
> ... boot> boot bsd.rd
> ... Welcome to the OpenBSD/amd64 4.7 installation program.
> ... I choose upgrade ... take defaults all the way until ...
> Location of sets?  [What do I do here?  I'll try http, and take the
> defaults.  What did YOU do here?]
> bsd, bsd.rd, base47.tgz, misc47.tgz, comp47.tgz, man47.tgz, game47.tgz,
> xbase47.tgz
> xshare47.tgz, xfont47.tgz, xserv47.tgz ... all get to 100% no errors.
> ... rest of install, reboot ...
> $ uname -a
> OpenBSD dellamd64.home 4.7 GENERIC#112 amd64
> $ ls -l /sbin/pfctl
> -r-xr-xr-x  1 root  bin  500856 Mar 18 15:36 /sbin/pfctl
> $ md5 /sbin/pfctl
> MD5 (/sbin/pfctl) = 7720c9a4dc100fe29d2d3c4a16954eb4

Thanks, Richard.

No, you couldn't encounter it.
It comes in later.
I have now the whole upgrade session of my third machine, the log is  2 MB.
Whenever I rebooted, it was okay:
1. reboot to start bsd.rd - okay
2. reboot directly after bsd.rd upgrade - okay
3. reboot after 'Final steps', before pkg_add - okay
4. reboot after 'Upgrading packages' - okay
5. reboot after patching - old files and wrong timestamps - bummer, as
Theo might say.

I wonder if I can put the file up in the open, or if it contains
security-related material?
As bz2 it is just 91 k; I will of course make it available to
individuals on request.

Uwe



Re: Installer bug? - Upgrade 4.6 to 4.7 failed to upgrade base47, on i386 and amd64

2010-06-04 Thread Uwe Dippel
On Fri, Jun 4, 2010 at 2:18 PM, Uwe Dippel udip...@gmail.com wrote:

> 5. reboot after patching - old files and wrong timestamps - bummer, as
> Theo might say.

Sorry, guys, (too quick as too often), just cat-grep pfctl shows where
the old one comes in:

pfctl: pf already enabled
# ls -l # ls -l /etc/# ls -l /sbin/pfctl
-r-xr-xr-x  1 root  bin  500856 Mar 18 10:36 /sbin/pfctl
# ls -l # ls -l /sbin/ # ls -l # ls -l /sbin/pfctl # ls -l
/sbin/pfctl
-r-xr-xr-x  1 root  bin  500856 Mar 18 10:36 /sbin/pfctl
-r-xr-xr-x  1 root  bin  500856 Mar 18 10:36 pfctl
# pfctl -f # pfctl -f /etc/# pfctl -f # pfctl -f /etc/pf.conf# pfctl
-f /etc/pf.conf
# # pfctl -e
pfctl: pf already enabled
# # pfctl -d
# # pfctl -e
===> pfctl
/usr/src/sbin/pfctl/obj -> /usr/obj/sbin/pfctl
===> pfctl
===> pfctl
nroff -Tascii -mandoc /usr/src/sbin/pfctl/pfctl.8 > pfctl.cat8
===> pfctl
install -c -s -o root -g bin  -m 555 pfctl /sbin/pfctl
install -c -o root -g bin -m 444 pfctl.cat8 /usr/share/man/cat8/pfctl.0
pfctl: DIOCBEGINADDRS: Operation not supported by device
pfctl: DIOCBEGINADDRS: Operation not supported by device
-r-xr-xr-x  1 root  bin  492664 Jun  4 13:28 pfctl

Through patching outdated(?) source files; though with the proper time stamp:
# cd /home/ftp/pub/OpenBSD/4.7
# ls -l
-r-xr-xr-x  1 root  ftp  131759003 Mar 21 19:17 src.tar.gz
-rw-r--r--  1 root  ftp   20668814 Mar 21 19:17 sys.tar.gz
# md5 src.tar.gz
MD5 (src.tar.gz) = 5214cd951cac5b7fbd89c968d1b5f859
# md5 sys.tar.gz
MD5 (sys.tar.gz) = 566c0cd7c3d2f28b17a9795324ead6ff

Maybe TeXitoi was right, after all, when he mentioned corrupted files
on some mirrors?
I wouldn't bet on it, but usually our fastest mirror here is
ftp://ftp.jaist.ac.jp.

Uwe



Re: Installer bug? - Upgrade 4.6 to 4.7 failed to upgrade base47, on i386 and amd64

2010-06-04 Thread Uwe Dippel
On Fri, Jun 4, 2010 at 2:41 PM, patrick keshishian pkesh...@gmail.com wrote:

> you mean applying the errata47.html patches? If so, are you certain
> your source tree is tagged OPENBSD_4_7 and not anything else?

Do I understand you correctly? I am not building releases. I am
installing/downloading the sets; then I do all the stuff in 'Upgrade
guide', then

rm -Rf * in /usr/src
rm -Rf * in /usr/xenocara
rm -Rf * in /usr/ports,
and then tar ... the source files meticulously as pointed out in the guide:
# cd /usr/src
# tar xzf ../sys.tar.gz
# tar xzf ../src.tar.gz
# cd /usr
# tar xzf xenocara.tar.gz
# tar xzf ports.tar.gz
and then download the patches into /usr/src and apply them,
and this is what you would see in the serial log.
So I don't tag, because I don't use cvs; what I do is just download the 4
files. (Also see my other post; it clearly shows the sequence and
the reboots done, with pfctl checked after each reboot.)

Uwe



Re: Installer bug? - Upgrade 4.6 to 4.7 failed to upgrade base47, on i386 and amd64

2010-06-04 Thread Uwe Dippel
On Fri, Jun 4, 2010 at 3:00 PM, Richard Toohey
richardtoo...@paradise.net.nz wrote:

> I think we are getting closer, aren't we?
>
> So, NOTHING to do with the actual upgrade, is it?

No, absolutely nothing. I withdraw the subject with regret. At least
the 'base47'-part thereof.

> Or the ports/packages.

I guess not.

> It is something to do with how you are PATCHING after an upgrade.
>
> You don't mention where/when you get the source you patch?
>
> Because that would be a separate step, wouldn't it?
>
> (I usually install from CD, so I scrub /usr/src & load from src.tar.gz on the
> CD.)

Exactly. Just explained it in the previous post, and don't want to
repeat myself. Except that I download, and then the actual files used
were:
# md5 /usr/src.tar.gz
MD5 (/usr/src.tar.gz) = 5214cd951cac5b7fbd89c968d1b5f859
# md5 /usr/sys.tar.gz
MD5 (/usr/sys.tar.gz) = 566c0cd7c3d2f28b17a9795324ead6ff
(Here, contrary to the earlier post, at the actual location in /usr of
the target machine before extraction:
# ls -l /usr/s*
-rw-r--r--  1 root  wheel  131759003 Mar 21 19:17 /usr/src.tar.gz
-rw-r--r--  1 root  wheel   20668814 Mar 21 19:17 /usr/sys.tar.gz)

Uwe



Re: Installer bug? - Upgrade 4.6 to 4.7 failed to upgrade base47, on i386 and amd64

2010-06-04 Thread Uwe Dippel
On Fri, Jun 4, 2010 at 4:32 PM, patrick keshishian pkesh...@gmail.com wrote:

> Where did you get those tar-balls from? Those are most likely not 4.7 sources.

I gave the potential link and their md5 sums further up. Our link here
is sooo slooow; I *am* currently downloading the archives from
ftp://ftp.openbsd.org/pub/OpenBSD/4.7 to compare the checksums. That
would explain a lot (though not everything, since the kernel looks
pretty correct: 4.7).
Maybe someone is faster and can confirm or refute the authenticity of
the archives?
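
For the comparison itself, a small helper along these lines might do. It is a sketch under assumptions: it expects either an OpenBSD-style `md5` that accepts `-q`, or a GNU-style `md5sum`, and both function names are made up for illustration.

```shell
# Print the MD5 digest of a file using whichever tool is installed.
# Assumption: either md5 (with -q for digest-only output) or md5sum exists.
md5_of() {
    if command -v md5 >/dev/null 2>&1; then
        md5 -q "$1"
    else
        md5sum "$1" | cut -d ' ' -f 1
    fi
}

# Succeed only if the file's digest matches the expected value.
verify_md5() {
    [ "$(md5_of "$1")" = "$2" ]
}
```

The expected value has to come from a trusted copy, for example the archives on ftp.openbsd.org; the digests posted earlier in this thread only show what the local copies hash to.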

Uwe



Re: Installer bug? - Upgrade 4.6 to 4.7 failed to upgrade base47, on i386 and amd64

2010-06-04 Thread Uwe Dippel
On Fri, Jun 4, 2010 at 4:22 PM, Eric Faurot e...@faurot.net wrote:

> Don't you have old stuff lying around in /usr/obj that gets installed
> over your new binaries?

That's probably the critical question now. Though, sorry to say,
nowhere is it written that you have to rm -Rf it when you
 - upgrade
 - patch
Actually, I wasn't even aware of the existence of this directory until
several minutes ago. (I expected it to be cleaned when wiping the
source directories.)
Then, according to what a number of people have written further up,
a number of people could be hit by this. I for one would expect the
time stamps to take care of that.

And, to stress it again:
When you are under 'quality control', and responsible for the
uptime of a system, you would naturally never do anything outside the scope
of the instructions, especially not some rm -Rf * in a directory
of your arbitrary choice.  ;)
And don't point me to man release, please. I am not doing releases, I
am not doing stable, I do, like many others, 'Upgrade Guide X.Y to
X.Z', and then get and apply the errata from
http://openbsd.org/errataXZ.html; according to their instructions.

If this happens to be wrong, by all means, then I make a mistake, and
have been making this mistake for 5 years. So, rm -Rf * in /usr/obj is
necessary?

Uwe



Re: Installer bug? - Upgrade 4.6 to 4.7 failed to upgrade base47, on i386 and amd64

2010-06-04 Thread Uwe Dippel
Jacob Meuser jakemsr at sdf.lonestar.org writes:

> oh good grief.  you had a dirty /usr/obj.
>
> just look at the pfctl snippet of the log you posted.  do you see pfctl
> being built?  do you see pfctl being installed from /usr/obj?

Oh, yes. So the blame is on my side, I guess. Mea culpa maxima!
I didn't know that the object directories need to be cleaned manually. Until
yesterday, I would have bet that the object directories lie within the
source trees (/usr/xenocara, /usr/src) and are cleaned when cleaning the
sources. Now I am aware that I need to know the location of the object
directories and clean them manually.
I was totally unaware that, in case of a patch, the installer would take the
next best file of the correct name from there; irrespective of the underlying
version.
Though I feel in good company. I guess, a great number of people on this list
were in a similar situation. Knowing the 'social contract' of OpenBSD, I only
have to blame myself for ignorance.
Still, may I suggest, that the next Upgrade Guide gets an extra line, with a
remark pointing out the existence of /usr/obj; and the suggestion to clean it?
Also, with respect to the 'errata', the patches, they describe in detail what
needs to be done. Maybe here, it could as well be suggested, that before
applying the first patch of a new version of OpenBSD, /usr/obj should be
cleaned, or be verified to be clean?

Thanks to the various people who patiently helped me analyse this problem
to the very end!

Uwe



Re: Installer bug? - Upgrade 4.6 to 4.7 failed to upgrade base47, on i386 and amd64

2010-06-03 Thread Uwe Dippel
Theo de Raadt deraadt at cvs.openbsd.org writes:

> A chit-chat on a public mailing list isn't going to find this supposed
> bug.  Why discuss it?  Why not just keep prove it happened.

Yes, Theo. Though: How? This is what I tried to find out. 
I showed the list of files. Do you assume I tinkered with it? Why should I?
pfctl wasn't working correctly. Without the help of the list, I wouldn't have 
been able to drill it down to some 70 files being of the previous version.
Thanks to everyone who helped!

> Don't you see
> how tiring it is to discuss it when we've seen no evidence?

It might be tiring, but what evidence do you want? Here, I want to solve a
problem of files missing. Since I followed the Upgrade guide to the dot,
rebooted to bsd.rd in the beginning, rebooted at the command prompt, we (I) need
to find what went wrong. That's all. I don't even mind if the mistake was on my
side, then I could learn.

So, please, specify the evidence that you need.

Uwe



Re: Installer bug? - Upgrade 4.6 to 4.7 failed to upgrade base47, on i386 and amd64

2010-06-03 Thread Uwe Dippel
On Thu, Jun 3, 2010 at 3:20 PM, Tony Abernethy t...@servasoftware.com wrote:

> The error message(s) you are suppressing (or maybe didn't see)
>
> About the only way you can get some files but not all files
> from a tarball is some fatal error in the extraction of the
> tarball. Any such error tends to give an error message.
> I don't think this list likes to play guessing games as to exactly
> what mistakes you have made or what evidence you are suppressing.

Oh, how beautiful! This is a sign of mutual trust. I documented
everything from the first pfctl after reboot from upgrade not working,
for what I am chided; and still, I am supposed to 'suppress evidence'.
How nice of you!
And if I present a serial log, will I then be suspected of having
tampered with it?

No, that seriously turns me off. I have given everything in detail
that I came across, I have not been silent about any additional
message, any unusual activity. I have stated a few times that I
followed the upgrade procedure to the dot, I have confirmed that
nothing unusual showed.
Over. I might have made some mistake, yes. Even though these same
boxes have been upgraded since 3.8, nevermind. I could at all times
have made a mistake or overlooked something.
But to start kind of asking for 'proof', that's what's ridiculous, to
cite Theo. I am willing to give individuals unprivileged access to the
boxes, I did this before, to look around.
When you have a box that is relevant to your company, and you are
responsible for it, and you noticed something unusual, why would you
come screaming to the list (like I did, my apologies) to
look for help, but 'conveniently avoid' mentioning a serious error
message during the upgrade? You need the box to be up and running, and
adding this error message can only help; so why would you suppress it,
maybe preventing efficient help from being offered?

I fully agree that what I report sounds highly unlikely. But it is
true, and by now I have confirmed that exactly the same happened on that
other box (i386). If I suppressed anything, why should I add to the
improbability? Yes, it happened, and I applied the same method and
have by now tar xzvphf-ed the 70+ sbin-files that were there and -
identically to the previous amd64 - are from version 4.6.
It is not excluded, as I wrote earlier, that the upgrade itself does
everything 100% correct. Who knows, there can always be a rogue
package. Not that I'm saying this has happened, but theoretically, any
package could contain 4.6-files and write them back at pkg_add.

Uwe



Re: pfctl not working in 4.7: DIOCBEGINADDRS and DIOCXCOMMIT

2010-06-03 Thread Uwe Dippel
Damon McMahon damon.mcmahon at gmail.com writes:


> Probably no help, but I had similar happen to me upgrading 4.5-4.6 a
> few months ago. Similar problem with pfctl after a diligent upgrade,
> and like you I have been following the upgrade procedure diligently
> since 3.something. I checked the timestamp on pfctl, it didn't seem
> right so I built from source and installed and the issue went away, so
> I assumed I had something wrong and thought nothing of it as generally
> if OpenBSD f*cks up it's down to me and not the developers

Thanks so much! You saved my upcoming weekend. So I am not hallucinating.
Of course, never expected the developers to solve *my* problems; only wanted to
exclude a bug hitting others.

You're better than me, and I only learned yesterday that the install-set files
come back with the time stamp of creation. Had I known this, a lot could have
been spared, because my second post already showed a lot of bad time stamps in
/sbin. 

This 'strange' time stamp (see my earlier post) of May 31 20:28 still prevails
in a number of files on my box:
# ls -lR | grep 'May 31 20:28' | wc -l
 153
, *after* I replaced the differing files in /sbin.
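
Grepping `ls -lR` output for a date string works, but it matches on text only. A sketch of an alternative, assuming a `find` with `-newer` and `-path` and a `touch -t` that accepts `[[CC]YY]MMDDhhmm[.SS]` stamps: bracket the minute of interest with two reference files and count what falls in between. The function name and the temporary reference files are purely illustrative.

```shell
# Count regular files below DIR whose mtime lies in the window (START, END].
# START and END use the touch -t format [[CC]YY]MMDDhhmm[.SS].
count_modified_between() {
    dir=$1
    start=$2
    end=$3
    ref=$(mktemp -d) || return 1
    touch -t "$start" "$ref/start"
    touch -t "$end" "$ref/end"
    # Best effort: skip the reference files themselves in case DIR
    # happens to contain the temporary directory.
    find "$dir" -type f -newer "$ref/start" ! -newer "$ref/end" \
        ! -path "$ref/*" | wc -l | tr -d ' '
    rm -rf "$ref"
}
```

For the stamp in question, and assuming the year 2010, `count_modified_between /sbin 201005312027.59 201005312028.59` would count the files in /sbin last modified during that minute.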
 
> I'd be interested to see if there's a common thread here, particularly
> before I upgrade this box to 4.7 which (like yours) is a remote box
> which will be upgraded over serial.

I guess so. By now I'll have the list of the files affecting amd64 and i386 in
these cases, and you should be able to correct everything using these lists.

Maybe interesting enough, this timestamp was *not* when I upgraded the sets. It
was the time when I upgraded the packages, respectively applied the patches.

But from now on I'll be mum (I hope I can!) on this topic until a complete
explanation is available. 

Uwe



Re: Installer bug? - Upgrade 4.6 to 4.7 failed to upgrade base47, on i386 and amd64

2010-06-02 Thread Uwe Dippel

Getting closer ...

Extracted the archive being used for the upgrade to amd64 into my 
user-directory and calculated all 7484 md5 for the files included in 
base47, and redirected those into a file.
Then, I calculated all the md5 for the files *installed* in the upgraded 
machine; the file names taken from the same base47.

Then, I had two files of 7484 md5s each, and could diff them.
Further down is the result. I'm stumped. Why would these files (around
100) not be the 4.7 version, but a previous one (4.6 I guess; I haven't
checked all)?
(Add /etc/pfctl to the list of different files; I had already manually 
copied it to /sbin from the archive to get the firewall working)

So *all* sets were installed, in principle, but some files were not. Huh?
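
The comparison described above can be sketched roughly as follows. It is not the exact procedure used here: md5 is swapped for the POSIX `cksum` utility so that the sketch runs unchanged on other systems, and the function name and its arguments are made up for illustration.

```shell
# Print "path checksum" for every regular file named in a list,
# looked up below a given root directory.
# Assumption: paths contain no spaces, as in the base set listing.
sum_tree() {
    root=$1
    list=$2
    while IFS= read -r path; do
        [ -f "$root/$path" ] || continue
        sum=$(cksum < "$root/$path" | cut -d ' ' -f 1)
        printf '%s %s\n' "$path" "$sum"
    done < "$list"
}
```

With the set extracted below a directory such as ./base47 and the file names in files.txt (both names hypothetical), `sum_tree ./base47 files.txt > md5sums_archive` and `sum_tree / files.txt > md5sums_install` give two listings that diff can compare line by line.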

I am sure, some people with more insight can help me further to explain 
what is going on here. What makes or made these files below so 'special' 
that they fail to be 'just there' after the upgrade on amd64?


Thanks for any further hint,

Uwe


$ diff md5sums_archive md5sums_install
1c1
< ./usr/lib/libasn1.so.17.0 f07fcaad530dd9632ef7de1491ed6bd3
---
> ./usr/lib/libasn1.so.17.0 aa2c929c805b55008bba1bc942483b01
3,4c3,4
< ./usr/lib/libcom_err.so.17.0 f07fcaad530dd9632ef7de1491ed6bd3
< ./usr/lib/libcrypto.so.18.0 4280f48657120e382c01ca4c1a8aafc4
---
> ./usr/lib/libcom_err.so.17.0 aa2c929c805b55008bba1bc942483b01
> ./usr/lib/libcrypto.so.18.0 5f38e49397b845acdf818c520953eb0e
14,15c14,15
< ./usr/lib/libkafs.so.17.0 f07fcaad530dd9632ef7de1491ed6bd3
< ./usr/lib/libkrb5.so.17.0 f07fcaad530dd9632ef7de1491ed6bd3
---
> ./usr/lib/libkafs.so.17.0 aa2c929c805b55008bba1bc942483b01
> ./usr/lib/libkrb5.so.17.0 aa2c929c805b55008bba1bc942483b01
34c34
< ./usr/lib/libssl.so.15.1 baa09f0512fbe6ecb1519de10ed6a8a4
---
> ./usr/lib/libssl.so.15.1 e3dcfdfc876252231bd8994c3f0c6f1d
192,195c192,195
< ./sbin/atactl 6ba0fc88a2cf2ad11bf5bdfee76b5bc5
< ./sbin/badsect 4914ef0ea057a00d8b9ae91eaf894af5
< ./sbin/bioctl ce8467b4415309be9b447405938fcda0
< ./sbin/ccdconfig 08123bd59420bf542011b5a72c6f896c
---
> ./sbin/atactl e214991d640840544a90595b661b6378
> ./sbin/badsect a87ca3adea353c650c15ff7db2059c94
> ./sbin/bioctl 6737ac6873c92917779282cf3f3e8cb5
> ./sbin/ccdconfig af51c2c19995759fcfd4f8ed6c7ab64d
197,198c197,198
< ./sbin/clri 721d8795ff051d72ee67175902c53dd6
< ./sbin/dhclient 3e1f06ca19aa5aceec214d2891fa504e
---
> ./sbin/clri afef2851d33038cb9de7c21f2d6dc037
> ./sbin/dhclient 5f6a023ce04f9fc3611a56c1752c5c30
200,219c200,219
< ./sbin/disklabel 3162a316caada7f5ebdd8b07d5722cb2
< ./sbin/dmesg 3acb23453982bdf9974c4f3abb8d6dfd
< ./sbin/dump e24875bd59e468780c4f53dc8685befc
< ./sbin/dumpfs 5cc5b40775147860423cddf45e264fac
< ./sbin/fdisk 6a1a875a41cceed2874e2b1e861b0257
< ./sbin/fsck 6fd26d79982bfe5d91d2f450ea495a19
< ./sbin/fsck_ext2fs 5dbb372ffbdb958bcfbd8d24811c4f87
< ./sbin/fsck_ffs 7bba80258056a3a40b51d24a63d4e5de
< ./sbin/fsck_msdos 85eacde50cecb85043601e05aac5a606
< ./sbin/fsdb e46a8fa824d753af715cae0f8e4a8049
< ./sbin/fsirand 65018fa13f98e1de12f5dfdcfc59cafd
< ./sbin/growfs 7e7ba9034167529de5cef12497e2228b
< ./sbin/halt e8612dfe7b7703188cb887852b073fe7
< ./sbin/ifconfig bc731472da980771e922604a7f76bb7e
< ./sbin/init a6e6bf349857e9addcce114f5cbeebea
< ./sbin/iopctl b1ffd69049a845e749f1fdff490045be
< ./sbin/ipsecctl acdee246db653efa457193d9d7be195b
< ./sbin/isakmpd 6e8462f8a4c3cfc2901dbe3163c9f857
< ./sbin/kbd b7da651953889ab863042dd1e05976dc
< ./sbin/ldattach b0b97a2496c0c2593c437842cb29d9df
---
> ./sbin/disklabel b6455e58788253af334bda563c12ca12
> ./sbin/dmesg 4a9f96f0a968f616a4dda156ec1572f4
> ./sbin/dump bdbfcd38d79289f81f23059cfb6156ea
> ./sbin/dumpfs 847cde118bbff6e12981ec92270aabcc
> ./sbin/fdisk 7b0d0a7788e323811c91c92761c7244f
> ./sbin/fsck 8105d9fc124a57dd343ba97d19c9fc48
> ./sbin/fsck_ext2fs ed161578a1777c598c10bb6963d0b7b4
> ./sbin/fsck_ffs 1b978655ccdcbf54e78c8febc2b8808b
> ./sbin/fsck_msdos 43bc067c65f648041f8ade25ddd077d1
> ./sbin/fsdb 8f720b110108c74f55b69935a20adfa6
> ./sbin/fsirand d39bf0252bfaad9aa256dbf294ede7da
> ./sbin/growfs d129af4e9526b87992de226da5f1e184
> ./sbin/halt 2d0046c3e383d785b856d1cb0dbe7e5a
> ./sbin/ifconfig 35e192bac398bf47ddf8e0a190f6b06a
> ./sbin/init 37d5ca74a94642c48f2278c17420bf76
> ./sbin/iopctl 04b18862d04525f6a53324694180592f
> ./sbin/ipsecctl 0f78f6df80715707bcd0dca44199debe
> ./sbin/isakmpd 9093d66c257145221ce33f4114ca3507
> ./sbin/kbd d0e6b82ecadad09eab297ce032fe1d70
> ./sbin/ldattach 04eace371d1dc317b289da273a311c10
221,242c221,242
< ./sbin/lmccontrol 2c9a1f7a4cb9af7d9ceaf47d9482eb8b
< ./sbin/mkfifo 7ebd0d605fb65d8acdce0b1542b7a949
< ./sbin/mknod 7ebd0d605fb65d8acdce0b1542b7a949
< ./sbin/modload bca677810f776226d24832fa2a118609
< ./sbin/modunload 95e4adeda57e7c52f240f136e092eb7b
< ./sbin/mount 6903ddec325432d73f65c80a56a9aef3
< ./sbin/mount_cd9660 cb343f92845ad398d2cc3e4262934030
< ./sbin/mount_ext2fs dcb91a3f42126fb96ece261f9a3db010
< ./sbin/mount_ffs 6fbd41195622e084f3b9ace630d73d2d
< ./sbin/mount_mfs c707d5acad7bc11fc5feeb4f4841a1e0
< ./sbin/mount_msdos c1d5742cc20e13b2b195f9d8d32c7769
< ./sbin/mount_nfs

Re: Installer bug? - Upgrade 4.6 to 4.7 failed to upgrade base47, on i386 and amd64

2010-06-02 Thread Uwe Dippel

Now, with

$ diff md5sums_archive md5sums_install | grep '^<' | cut -d ' ' -f2
these are the files that differ on amd64 between what the archive
supplied and what the installer left behind:


./usr/lib/libasn1.so.17.0
./usr/lib/libcom_err.so.17.0
./usr/lib/libcrypto.so.18.0
./usr/lib/libkafs.so.17.0
./usr/lib/libkrb5.so.17.0
./usr/lib/libssl.so.15.1
./sbin/atactl
./sbin/badsect
./sbin/bioctl
./sbin/ccdconfig
./sbin/clri
./sbin/dhclient
./sbin/disklabel
./sbin/dmesg
./sbin/dump
./sbin/dumpfs
./sbin/fdisk
./sbin/fsck
./sbin/fsck_ext2fs
./sbin/fsck_ffs
./sbin/fsck_msdos
./sbin/fsdb
./sbin/fsirand
./sbin/growfs
./sbin/halt
./sbin/ifconfig
./sbin/init
./sbin/iopctl
./sbin/ipsecctl
./sbin/isakmpd
./sbin/kbd
./sbin/ldattach
./sbin/lmccontrol
./sbin/mkfifo
./sbin/mknod
./sbin/modload
./sbin/modunload
./sbin/mount
./sbin/mount_cd9660
./sbin/mount_ext2fs
./sbin/mount_ffs
./sbin/mount_mfs
./sbin/mount_msdos
./sbin/mount_nfs
./sbin/mount_nnpfs
./sbin/mount_ntfs
./sbin/mount_portal
./sbin/mount_procfs
./sbin/mount_udf
./sbin/mount_vnd
./sbin/mountd
./sbin/ncheck
./sbin/ncheck_ffs
./sbin/newfs
./sbin/newfs_msdos
./sbin/nfsd
./sbin/nologin
./sbin/pfctl
./sbin/pflogd
./sbin/ping
./sbin/ping6
./sbin/quotacheck
./sbin/raidctl
./sbin/rdump
./sbin/reboot
./sbin/restore
./sbin/route
./sbin/rrestore
./sbin/rtsol
./sbin/savecore
./sbin/scan_ffs
./sbin/scsi
./sbin/shutdown
./sbin/slattach
./sbin/swapctl
./sbin/swapon
./sbin/sysctl
./sbin/ttyflags
./sbin/tunefs
./sbin/umount
./sbin/vnconfig
./sbin/wpa-psk
./sbin/wsconsctl
./usr/libexec/kdc
./usr/sbin/sysctl

Let's assume for a moment that the differences in the Kerberos and crypto 
stuff are a result of the patches and packages; everything else that differs 
is the majority of the files in /sbin.
A yet closer inspection of the differences there confirms what was assumed 
before: all files in /sbin, except /sbin/ifconfig, /sbin/ipsecctl and 
/sbin/isakmpd, are the files of the 4.6 Release.
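
The comparison of the two checksum lists can be sketched as a small shell helper. This is only a sketch under assumptions: the function name changed_paths is made up for illustration, and it assumes that both lists (such as md5sums_archive and md5sums_install) contain one "path checksum" pair per line.

```shell
# Hypothetical helper: list paths whose checksum differs between two lists.
# Assumes each list has lines of the form "<path> <checksum>".
changed_paths() {
    # First file: remember the checksum of every path.
    # Second file: print paths present in both lists with a different checksum.
    awk 'NR == FNR { sum[$1] = $2; next }
         ($1 in sum) && sum[$1] != $2 { print $1 }' "$1" "$2"
}
```

Called as `changed_paths md5sums_archive md5sums_install`, it would print only the paths, one per line. Paths containing blanks are not handled.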


Waiting for some further enlightenment about what was going on, and what 
happened to those 4.7 files,


Uwe



Re: Installer bug? - Upgrade 4.6 to 4.7 failed to upgrade base47, on i386 and amd64

2010-06-02 Thread Uwe Dippel
Nick Holland nick at holland-consulting.net writes:

  There is one more machine (amd64) that needs to be upgraded. Before I do 
  this, I rather solicit suggestions on how to log the upgrade process, 
  debug it, or otherwise.
 
 serial console.
 Log everything from the first chars out the serial port to the reboot.
  In fact, log the reboot.
 Don't edit anything.
 Use a public mirror or an official CD for the install, make sure it is
 obvious which.
 
 Stick the resulting file on a webserver.

Thanks, Nick.

Based on the latest results, the problem seems to exist only for most of the
/sbin files. So, the upgrade runs through as programmed. 
With a public mirror, it will take hours. I really hope SHA256 is good enough to
confirm the integrity of the archives. Serial console seems a good idea; I have
to use it in any case. 
What I have in mind, is, before the reboot, to use the command prompt to check
the files in the /sbin-to-be. I have a hunch, that they'll be there, then. Then
I'll do the same after the reboot, and once again, after the package upgrade.
Should the phenomenon show again, by now I can imagine that the changes are
happening some time later. We'll see ...
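
The plan of checking /sbin at several stages could be sketched like this. It is a sketch only: the function name snapshot_dir is hypothetical, and it assumes that find, cksum and sort are available in the environment where it is run, which may not hold for the install shell.

```shell
# Hypothetical helper: record a checksum for every regular file below a
# directory. Paths are relative to that directory, so a snapshot of /mnt/sbin
# taken before the reboot can be compared with one of /sbin taken afterwards.
# Output uses the POSIX cksum format "<crc> <size> <path>", sorted by path.
snapshot_dir() {
    (cd "$1" && find . -type f -exec cksum {} \; | sort -k 3)
}
```

One possible use is `snapshot_dir /mnt/sbin > sbin.before` at the prompt before the reboot and `snapshot_dir /sbin > sbin.after` later, followed by `diff sbin.before sbin.after`. The output file names are placeholders, and the first snapshot has to be stored somewhere that survives the reboot.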

Uwe



Re: pfctl not working in 4.7: DIOCBEGINADDRS and DIOCXCOMMIT

2010-06-01 Thread Uwe Dippel

On 06/01/2010 05:32 AM, Philip Guenther wrote:


Was there a common thread to what did turn up?  My recall is that
basically every time people get Operation not supported by device
errors from pfctl, it's because their userland and kernel don't match.



Review your upgrade procedure, because it's clearly broken.


Thanks for your help, seriously. And I don't want to start arguing, not 
at all, but this is one of my production boxes, without access, and I 
have been running the boot.bsd.rd updates since 3.8 twice a year.
Being production, I diligently watched, and saw with my own eyes the 
asterisks advancing. I can only say, I followed standard procedures; if 
just for my own sanity.
I *am* losing the latter, because it seems that all files in /sbin are 
identical to my box still on 4.6; though something has happened to them 
yesterday:


(this is my 4.6-box, upgraded only on April 19th:)
$ ls -l /sbin/p*
-r-xr-xr-x  1 root  bin  492664 Apr 19 13:44 /sbin/pfctl
-r-xr-xr-x  1 root  bin  390264 Apr 19 13:44 /sbin/pflogd
-r-sr-xr-x  1 root  bin  210040 Apr 19 13:44 /sbin/ping
-r-sr-xr-x  1 root  bin  234616 Apr 19 13:44 /sbin/ping6

(This is my box upgraded yesterday, May 31st, to 4.7:)
# ls -l /sbin/p*
-r-xr-xr-x  1 root  bin  492664 May 31 20:28 /sbin/pfctl
-r-xr-xr-x  1 root  bin  390264 May 31 20:28 /sbin/pflogd
-r-sr-xr-x  1 root  bin  210040 May 31 20:28 /sbin/ping
-r-sr-xr-x  1 root  bin  234616 May 31 20:28 /sbin/ping6

So it did something, but where did it get the old files from? I guess not 
from a mistake on my side, because I accepted the upgrade path in the 
Upgrade shell. Plus:

OpenBSD 4.7 (GENERIC.MP) #130: Wed Mar 17 20:48:50 MDT 2010
dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
I never copied any myself down here. As I mentioned, production, upgrade 
twice per year through serial console.


And now my sanity seems to fade: I did the same to one of my i386-boxen, 
and exactly the same happens there!! (Please, now I am starting to lose 
ground under my feet!)

This is after the update to 4.7, i386, in front of the screen!:

(mnt is /altroot, mounted just now to check; since pfctl did the same 
thing, again, here)
# ls -l /mnt/sbin/p*
-r-xr-xr-x  1 root  bin  422648 Apr 19 12:51 /mnt/sbin/pfctl
-r-xr-xr-x  1 root  bin  328440 Apr 19 12:51 /mnt/sbin/pflogd
-r-sr-xr-x  1 root  bin  180984 Apr 19 12:51 /mnt/sbin/ping
-r-sr-xr-x  1 root  bin  197368 Apr 19 12:51 /mnt/sbin/ping6
# ls -l /sbin/p*
-r-xr-xr-x  1 root  bin  422648 Jun  1 12:54 /sbin/pfctl
-r-xr-xr-x  1 root  bin  328440 Jun  1 12:54 /sbin/pflogd
-r-sr-xr-x  1 root  bin  180984 Jun  1 12:54 /sbin/ping
-r-sr-xr-x  1 root  bin  197368 Jun  1 12:54 /sbin/ping6

A mix-up of versions? I don't think so, because
$ tar xzf /home/ftp/pub/OpenBSD/4.7/amd64/base47.tgz ./sbin/pfctl
$ md5 sbin/pfctl
MD5 (sbin/pfctl) = 7720c9a4dc100fe29d2d3c4a16954eb4
exactly what you had.

Now I can no longer exclude a bug. Maybe under some 
circumstances, the files are not overwritten but only touched, or whatnot?


This leaves me with two questions:

1. How to debug what goes on?

2. (and more important for me): What to do? Should I tar xzvphf 
{file}47.tgz; or try a new upgrade?


Uwe



Re: pfctl not working in 4.7: DIOCBEGINADDRS and DIOCXCOMMIT

2010-06-01 Thread Uwe Dippel

Joachim Schipper joachim at joachimschipper.nl writes:

 Just untarring the release should work, but it's still odd. At least 
 the md5sum of pfctl matches what I just downloaded, so that seems

 fine; did you actually use *that* tarball, though? (Note that the
 right pfctl binary is 500856 bytes long.)

 Are you sure that you upgraded the right disk?

Yep.

When I untar the files (I have them locally on a webserver:
ftp://metalab.uniten.edu.my/pub/OpenBSD/4.7/),
all files come out perfectly well, as above. I did the upgrades using 
this URL; I am sure it was these files, because they only exist once 
locally (the speed with which the updates were done is proof that I used 
these local resources, downloaded by myself). In the Upgrade procedure I 
only added the (internal) IP for that server, accepting all else. And it 
can't be 4.6 that I used, kind of, because the installed (upgraded) 
kernel is 4.7.
I need to repeat, this is a remote production machine with serial 
access. I have no desire ever to do anything not along clear procedures, 
and I followed the Upgrade Guide 4.6-4.7 meticulously (system 
administration is part of my job description), even ticking off point 
after point on the printout of the upgrade guide.
So something was done to the files, at least they have the new time 
stamp, and some files have actually been installed correctly (kernels); 
as the hashsums show. So, finally, I *was* in the right directory and 
installed to the correct disk.

Here are the kernels, on the first machine, that has seemingly the
previous 'base' throughout:
# cksum -a sha256 bsd
SHA256 (bsd) = e2af09ed48d1d94bec27aa4c18ffa6172d8435a190c3abecae53d26940ed9536

# cksum -a sha256 bsd.sp
SHA256 (bsd.sp) = a34175b766d6ea9cefcc0903efa51c4dc3d87018b1e2f85c2333133ed25e9ff4


Now I wonder if the problem was with the untar? Maybe all sets have not 
been installed properly? Next, I will have to identify for each and 
every set, a sample file, and check if it is the previous one or the 
recent one.
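
Checking one sample file per set could be sketched as follows. This is a sketch only: the function name matches_archive is hypothetical, and it assumes that the member is stored in the set with a leading "./" (as in the tar commands used in this thread) and that mktemp, tar and cmp are available.

```shell
# Hypothetical helper: succeed only if a file extracted from a release set is
# byte-identical to the installed copy.
#   $1 = set archive (e.g. base47.tgz)
#   $2 = member path without the leading "./" (e.g. sbin/pfctl)
#   $3 = root of the installed system (defaults to /)
matches_archive() {
    tmpdir=$(mktemp -d) || return 2
    # Extract just the one member into a scratch directory.
    if ! tar xzf "$1" -C "$tmpdir" "./$2"; then
        rm -rf "$tmpdir"
        return 2
    fi
    # cmp -s is silent; its exit status says whether the files are identical.
    cmp -s "$tmpdir/$2" "${3:-/}/$2"
    result=$?
    rm -rf "$tmpdir"
    return $result
}
```

For example, `matches_archive base47.tgz sbin/pfctl && echo new || echo old` would report whether the installed pfctl is the one from the set. An exit status of 2 means the extraction itself failed.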


Very, very strange ...

Thanks so much, you did actually help me a step further,

Uwe



Installer bug? - Upgrade 4.6 to 4.7 failed to upgrade base47, on i386 and amd64

2010-06-01 Thread Uwe Dippel
[I consider it better to open a new thread, since the title, and part of 
the content, of the previous one was superseded.]


Having upgraded one machine (amd64) from 4.6 to 4.7, using the normal 
upgrade procedure as outlined in http://openbsd.org/faq/upgrade47.html 
to the dot, it turned out after the reboot that the files contained in 
base47.tgz were not installed. All other sets apparently did get 
installed. This led to some out-of-sync problems with pfctl, by which 
this failure was noticed. 
(http://article.gmane.org/gmane.os.openbsd.misc/174272)


While waiting for some details, and trying to recapitulate what went 
wrong, I did the same procedure to an i386 machine, apparently with 
identical result: The base47 files are missing; there are still the 
previous files from base46 in the machine. As far as I can make out, all 
other sets installed perfectly well.


Since this happened on two physical machines of different architecture, 
and in both cases by meticulously following the upgrade guide with the 
boot bsd.rd mechanism, it cannot be excluded that there is a weakness in 
the installer. There was no error message, on the contrary, all looked 
okay, including the installation of the sets.


There is one more machine (amd64) that needs to be upgraded. Before I do 
this, I rather solicit suggestions on how to log the upgrade process, 
debug it, or otherwise.


I do not state at all, that the observed behaviour affects all 
installations, nor anyone else. It still points to a potential weakness, 
since both machines run nothing but standard OpenBSD, and have been 
upgraded through all previous versions using the same method since 3.8; 
until now without fail.


Uwe


As preliminary evidence, I have attached randomly selected files from 
the tar-sets that were found to differ in 4.6 and 4.7.
The first line shows the extraction of those files from the archives 
used for installation, the subsequent lines show the calculations of 
their md5 sums, firstly in the user directory; followed by the md5 sums 
of those files as installed by the upgrade process.
(For misc and xfonts no files were found (non-exhaustive search) that 
differed in 4.6 and 4.7)
The '$' prompt is on the machine containing the archives, followed by 
the '#' prompt, which is on the respective machine to which 4.7 was 
installed.


amd64:

base:
$ tar xzf /home/ftp/pub/OpenBSD/4.7/amd64/base47.tgz ./sbin/pfctl
$ md5 sbin/pfctl
MD5 (sbin/pfctl) = 7720c9a4dc100fe29d2d3c4a16954eb4 == archive
# md5 /sbin/pfctl
MD5 (/sbin/pfctl) = 3e1fa4f69809adff432f9da62010a6a7 == upgraded
$ md5 /sbin/pfctl
MD5 (/sbin/pfctl) = 3e1fa4f69809adff432f9da62010a6a7 == 4.6


comp:
$ tar xzf /home/ftp/pub/OpenBSD/4.7/amd64/comp47.tgz ./var/db/libc.tags
$ md5 var/db/libc.tags
MD5 (var/db/libc.tags) = ef05ce6515665eff14618c02c4678edc == archive
# md5 /var/db/libc.tags
MD5 (/var/db/libc.tags) = ef05ce6515665eff14618c02c4678edc == upgraded
$ md5 /var/db/libc.tags
MD5 (/var/db/libc.tags) = d3e2a489d70cd3f0d91fef538f4ebfd1 == 4.6

games:
$ tar xzf /home/ftp/pub/OpenBSD/4.7/amd64/game47.tgz ./usr/games/morse
$ md5 usr/games/morse
MD5 (usr/games/morse) = 61157239de35061df71e7be2e17a9471 == archive
# md5 /usr/games/morse
MD5 (/usr/games/morse) = 61157239de35061df71e7be2e17a9471 == upgraded
$ md5 /usr/games/morse
MD5 (/usr/games/morse) = ee30c2129ceac343438ea03a9efa2fe5 == 4.6

man:
$ tar xzf /home/ftp/pub/OpenBSD/4.7/amd64/man47.tgz ./usr/share/man/ps8/loongson
$ md5 usr/share/man/ps8/loongson/
MD5 (usr/share/man/ps8/loongson/) = d41d8cd98f00b204e9800998ecf8427e == archive
# md5 /usr/share/man/ps8/loongson/
MD5 (/usr/share/man/ps8/loongson/) = d41d8cd98f00b204e9800998ecf8427e == upgraded
$ md5 /usr/share/man/ps8/loongson/
md5: cannot open /usr/share/man/ps8/loongson/: No such file or directory == 4.6

misc:
??

xbase:
$ tar xzf /home/ftp/pub/OpenBSD/4.7/amd64/xbase47.tgz ./usr/X11R6/man/whatis.db
$ md5 usr/X11R6/man/whatis.db
MD5 (usr/X11R6/man/whatis.db) = a6ebdd66fe58b66136c9fdfc9eca1c5d == archive
# md5 /usr/X11R6/man/whatis.db
MD5 (/usr/X11R6/man/whatis.db) = a6ebdd66fe58b66136c9fdfc9eca1c5d == upgraded
$ md5 /usr/X11R6/man/whatis.db
MD5 (/usr/X11R6/man/whatis.db) = 01e11bb37c523bc6fe8c37e139f6fe41 == 4.6

xetc:
$ tar xzf /home/ftp/pub/OpenBSD/4.7/amd64/xetc47.tgz ./var/db/sysmerge/xetcsum
$ md5 var/db/sysmerge/xetcsum
MD5 (var/db/sysmerge/xetcsum) = 374865b6f2b5a34b64148bfe6746cfd0 == archive
# md5 /var/db/sysmerge/xetcsum
md5: cannot open /var/db/sysmerge/xetcsum: No such file or directory == upgraded
$ md5 /var/db/sysmerge/xetcsum
md5: cannot open /var/db/sysmerge/xetcsum: No such file or directory == 4.6

xfonts:
??

xserv:
$ tar xzf /home/ftp/pub/OpenBSD/4.7/amd64/xserv47.tgz ./usr/X11R6/lib/modules/dri/r300_dri.so
$ md5 usr/X11R6/lib/modules/dri/r300_dri.so
MD5 (usr/X11R6/lib/modules/dri/r300_dri.so) = e7caa1ee3691a40c40f994dfb210738c == archive
# md5 /usr/X11R6/lib/modules/dri/r300_dri.so
MD5 (/usr/X11R6/lib/modules/dri/r300_dri.so) = 

Re: pfctl not working in 4.7: DIOCBEGINADDRS and DIOCXCOMMIT

2010-06-01 Thread Uwe Dippel
Joachim Schipper joachim at joachimschipper.nl writes:

 Just untarring the release should work, but it's still odd. At least the
 md5sum of pfctl matches what I just downloaded, so that seems fine; did
 you actually use *that* tarball, though? (Note that the right pfctl
 binary is 500856 bytes long.)

Consider this thread closed.
I manually 'upgraded' (only) the file /sbin/pfctl from exactly the archive used
for the 4.6-4.7 upgrade procedure, and everything 'pfctl' works fine.
This leaves us with the problem of the failed upgrade procedure, twice now.

Thanks to all contributors in this thread!

Uwe



pfctl not working in 4.7: DIOCBEGINADDRS and DIOCXCOMMIT

2010-05-31 Thread Uwe Dippel

(I searched Google, but not much turned up.)

Since I upgraded to 4.7; what I get is:
# pwd
/etc
# cat pf.conf
# works?
# pfctl -f pf.conf.47
pfctl: DIOCBEGINADDRS: Operation not supported by device
# pfctl -f /etc/pf.conf
pfctl: DIOCXCOMMIT: Device busy

Huh?
(Actually I had used the original pf.conf, with the alterations for 
anchors done. Then I got this; and it seems pfctl is doing some 
nonsense: even a comment is not read and executed.)


# which pfctl
/sbin/pfctl
# md5 /sbin/pfctl
MD5 (/sbin/pfctl) = 3e1fa4f69809adff432f9da62010a6a7
(this is amd64)

No changes of the hardware; I have no access to the machine.
Any hint will be appreciated,

Uwe


OpenBSD 4.7 (GENERIC.MP) #130: Wed Mar 17 20:48:50 MDT 2010
dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 2146381824 (2046MB)
avail mem = 2079801344 (1983MB)
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.3 @ 0xec000 (78 entries)
bios0: vendor HP version D19 date 07/16/2007
bios0: HP ProLiant ML350 G4p
acpi0 at bios0: rev 2
acpi0: tables DSDT FACP SPCR MCFG APIC
acpi0: wakeup devices
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Xeon(TM) CPU 3.20GHz, 3200.58 MHz
cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,SBF,SSE3,MWAIT,DS-CPL,CNXT-ID,CX16,xTPR,LONG
cpu0: 2MB 64b/line 8-way L2 cache
cpu0: apic clock running at 200MHz
cpu1 at mainbus0: apid 6 (application processor)
cpu1: Intel(R) Xeon(TM) CPU 3.20GHz, 3200.12 MHz
cpu1: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,SBF,SSE3,MWAIT,DS-CPL,CNXT-ID,CX16,xTPR,LONG
cpu1: 2MB 64b/line 8-way L2 cache
cpu2 at mainbus0: apid 1 (application processor)
cpu2: Intel(R) Xeon(TM) CPU 3.20GHz, 3200.12 MHz
cpu2: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,SBF,SSE3,MWAIT,DS-CPL,CNXT-ID,CX16,xTPR,LONG
cpu2: 2MB 64b/line 8-way L2 cache
cpu3 at mainbus0: apid 7 (application processor)
cpu3: Intel(R) Xeon(TM) CPU 3.20GHz, 3200.12 MHz
cpu3: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,SBF,SSE3,MWAIT,DS-CPL,CNXT-ID,CX16,xTPR,LONG
cpu3: 2MB 64b/line 8-way L2 cache
ioapic0 at mainbus0: apid 8 pa 0xfec0, version 20, 24 pins
ioapic1 at mainbus0: apid 9 pa 0xfec1, version 20, 24 pins
ioapic1: misconfigured as apic 0, remapped to apid 9
ioapic2 at mainbus0: apid 10 pa 0xfec8, version 20, 24 pins
ioapic3 at mainbus0: apid 11 pa 0xfec80400, version 20, 24 pins
acpiprt0 at acpi0: bus 1 (IP2P)
acpiprt1 at acpi0: bus 2 (IPXB)
acpiprt2 at acpi0: bus 6 (PCXA)
acpiprt3 at acpi0: bus 9 (PCXB)
acpiprt4 at acpi0: bus 5 (PTA0)
acpiprt5 at acpi0: bus 13 (PTB0)
acpiprt6 at acpi0: bus 16 (PTC0)
acpiprt7 at acpi0: bus 0 (PCI0)
acpicpu0 at acpi0
acpicpu1 at acpi0
acpicpu2 at acpi0
acpicpu3 at acpi0
acpitz0 at acpi0: critical temperature 31 degC
pci0 at mainbus0 bus 0
pchb0 at pci0 dev 0 function 0 Intel E7520 Host rev 0x0c
ppb0 at pci0 dev 2 function 0 Intel E7520 PCIE rev 0x0c
pci1 at ppb0 bus 5
ppb1 at pci1 dev 0 function 0 Intel PCIE-PCIE rev 0x09
pci2 at ppb1 bus 6
ppb2 at pci1 dev 0 function 2 Intel PCIE-PCIE rev 0x09
pci3 at ppb2 bus 9
rl0 at pci3 dev 2 function 0 D-Link Systems 530TX+ rev 0x10: apic 11 int 0 (irq 3), address 00:11:95:5e:50:ba
rlphy0 at rl0 phy 0: RTL internal PHY
ppb3 at pci0 dev 4 function 0 Intel E7520 PCIE rev 0x0c
pci4 at ppb3 bus 13
ppb4 at pci0 dev 6 function 0 Intel E7520 PCIE rev 0x0c
pci5 at ppb4 bus 16
ppb5 at pci0 dev 28 function 0 Intel 6300ESB PCIX rev 0x02
pci6 at ppb5 bus 2
mpi0 at pci6 dev 3 function 0 Symbios Logic 53c1030 rev 0x08: apic 9 int 0 (irq 10)
scsibus0 at mpi0: 16 targets, initiator 7
sd0 at scsibus0 targ 0 lun 0: COMPAQ, BF1468A4CC, HPB5 SCSI3 0/direct fixed
sd0: 140014MB, 512 bytes/sec, 286749488 sec total
sd1 at scsibus0 targ 2 lun 0: COMPAQ, BF03688284, HPB3 SCSI3 0/direct fixed
sd1: 34732MB, 512 bytes/sec, 71132000 sec total
sd2 at scsibus0 targ 5 lun 0: COMPAQ, BF1468A4C5, HPB4 SCSI3 0/direct fixed
sd2: 140014MB, 512 bytes/sec, 286749488 sec total
mpi0: target 0 Sync at 160MHz width 16bit offset 63 QAS 1 DT 1 IU 1
mpi0: target 2 Sync at 160MHz width 16bit offset 63 QAS 1 DT 1 IU 1
mpi0: target 5 Sync at 160MHz width 16bit offset 63 QAS 1 DT 1 IU 1
mpi1 at pci6 dev 3 function 1 Symbios Logic 53c1030 rev 0x08: apic 9 int 1 (irq 10)
scsibus1 at mpi1: 16 targets, initiator 7
uhci0 at pci0 dev 29 function 0 Intel 6300ESB USB rev 0x02: apic 8 int 16 (irq 3)
uhci1 at pci0 dev 29 function 1 Intel 6300ESB USB rev 0x02: apic 8 int 19 (irq 5)
Intel 6300ESB WDT rev 0x02 at pci0 dev 29 function 4 not configured
Intel 6300ESB APIC rev 0x02 at pci0 dev 29 function 5 not configured
ehci0 at pci0 dev 29 function 7 Intel 6300ESB USB rev 0x02: apic 8 int 23

No networking with cd46.iso on qemu?

2010-04-28 Thread Uwe Dippel

Trying to install a virtual OpenBSD on OpenBSD 4.6 on amd64, I did:

# env ETHER=em0 qemu -net nic,model=rtl8139 -net tap -m 32 -monitor stdio \
-no-fd-bootchk -hda virtual.img \
-cdrom cd46.iso -boot d

as described in 
http://www.openbsd.org/cgi-bin/cvsweb/ports/emulators/qemu/files/README.OpenBSD?rev=1.5;content-type=text%2Fx-cvsweb-markup
It does start the install, everything, but I can't seem to get any network connection. dhcp times out.
I restarted, and tried the usual qemu 10.0.2.0/24 as fixed addresses (as given on that site); but no luck.
Restarted again and set some addresses of the network in which the host runs, 192.168.1.0/24. At least, I could ping the (real) gateway of the host,
and even an outside server holding the installation files. Strangely enough, though, the ftp to get the .tgz closes immediately
with some 'connection refused', though I can ping the server from the guest being installed, as well as ftp to it from the host.

I used qemu on Knoppix before, and it always offered dhcp out of the box.

Where is my mistake?

Uwe



Re: No networking with cd46.iso on qemu?

2010-04-28 Thread Uwe Dippel

Rares Aioanei wrote:

On 04/28/2010 04:03 PM, Uwe Dippel wrote:
  

Trying to install a virtual OpenBSD on OpenBSD 4.6 on amd64, I did:

# env ETHER=em0 qemu -net nic,model=rtl8139 -net tap -m 32 -monitor 
stdio -no-fd-bootchk -hda virtual.img \

-cdrom cd46.iso -boot d

as described in 
http://www.openbsd.org/cgi-bin/cvsweb/ports/emulators/qemu/files/README.OpenBSD?rev=1.5;content-type=text%2Fx-cvsweb-markup 



You're not mistaking, Realtek is. Stay away from them, virtual or not,
and try the other NICs qemu has to offer. It will work.
  

No, sorry. All the same. No dhcp (dhclient) ever.
I tried pcnet, ne2k_pci, i82551 (the latter segfaults, see below).
What I did:

# env ETHER=bge0 qemu -net nic,model=ne2k_pci -net tap -m 32 -monitor stdio \
-no-fd-bootchk -hda virtual.img \
-cdrom cd46.iso -boot d

I guess there is something wrong with that 'ether' thing. I had tried 
em0 as written in that OpenBSD cvsweb, but then it works even less. I 
see no em0 coming up, only the pcn0 (pcnet), ne3 (ne2k_pci), fxp0 
(i82551); no vlan, tun. Just lo and that respective network card.


I always boot cd46.iso and go to (S)hell immediately, and do the network 
setting, first trying dhcp, then manually. All dhclient {pcn0| ne3} time 
out. All manual settings fail to connect to ftp, as well.


dhclient fxp0 segfaults qemu reproducibly:
# env ETHER=bge0 qemu -net nic,model=i82551 -net tap -m 32 -monitor 
stdio -no-
 -cdrom cd46.iso -boot 
d 
{tun0 (bridge0 - bge0)}

QEMU 0.9.1 monitor - type 'help' for more information
(qemu) assertion !feature is missing in this emulation:  unknown word read
failed: file /usr/obj/ports/qemu-0.9.1p10/qemu-0.9.1/hw/eepro100.c, line 1202, function eepro100_read2
Abort trap (core dumped)

It *must* be a mistake on my side, if the description on the OpenBSD 
site is correct.

What can I do?

Uwe



Re: No networking with cd46.iso on qemu?

2010-04-28 Thread Uwe Dippel

Uwe Dippel wrote:


It *must* be a mistake on my side, if the description on the OpenBSD 
site is correct.

What can I do?


Let me add some remarks, after trying to debug it further:

Trying the install the conventional way gives the result to be expected on amd64:

qemu -m 32 -monitor stdio -no-fd-bootchk -hda virtual.img \
-cdrom cd45.iso -boot d

segfaults reproducibly when entering the root password (why there?)

Okay, so I followed option 2, tap mode, further down. Still, no success. 
The interfaces are not created as expected/described:


# ifconfig tun0 link0
# ifconfig bridge0 create
# brconfig bridge0 add tun0 add em0 up
brconfig: bridge0: em0: No such file or directory

No luck here, either.

What to do next?

Uwe

Here is the dmesg, in case:

OpenBSD 4.6 (GENERIC.MP) #0: Mon Apr 26 18:00:52 SGT 2010
   udip...@mybox.myorg.my:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 3756994560 (3582MB)
avail mem = 3633987584 (3465MB)
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.3 @ 0xec000 (62 entries)
bios0: vendor HP version D17 date 07/16/2007
bios0: HP ProLiant ML350 G4
acpi0 at bios0: rev 2
acpi0: tables DSDT FACP SPCR MCFG APIC
acpi0: wakeup devices
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Xeon(TM) CPU 3.00GHz, 3000.50 MHz
cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,SBF,SSE3,MWAIT,DS-CPL,CNXT-ID,CX16,xTPR,LONG
cpu0: 1MB 64b/line 8-way L2 cache
cpu0: apic clock running at 200MHz
cpu1 at mainbus0: apid 6 (application processor)
cpu1: Intel(R) Xeon(TM) CPU 3.00GHz, 3000.11 MHz
cpu1: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,SBF,SSE3,MWAIT,DS-CPL,CNXT-ID,CX16,xTPR,LONG
cpu1: 1MB 64b/line 8-way L2 cache
cpu2 at mainbus0: apid 1 (application processor)
cpu2: Intel(R) Xeon(TM) CPU 3.00GHz, 3000.12 MHz
cpu2: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,SBF,SSE3,MWAIT,DS-CPL,CNXT-ID,CX16,xTPR,LONG
cpu2: 1MB 64b/line 8-way L2 cache
cpu3 at mainbus0: apid 7 (application processor)
cpu3: Intel(R) Xeon(TM) CPU 3.00GHz, 3000.12 MHz
cpu3: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,SBF,SSE3,MWAIT,DS-CPL,CNXT-ID,CX16,xTPR,LONG
cpu3: 1MB 64b/line 8-way L2 cache
ioapic0 at mainbus0 apid 8 pa 0xfec0, version 20, 24 pins
ioapic1 at mainbus0 apid 9 pa 0xfec1, version 20, 24 pins
ioapic1: misconfigured as apic 0, remapped to apid 9
ioapic2 at mainbus0 apid 10 pa 0xfec8, version 20, 24 pins
ioapic3 at mainbus0 apid 11 pa 0xfec80400, version 20, 24 pins
acpiprt0 at acpi0: bus 1 (IP2P)
acpiprt1 at acpi0: bus 2 (IPXB)
acpiprt2 at acpi0: bus 6 (PCXA)
acpiprt3 at acpi0: bus 9 (PCXB)
acpiprt4 at acpi0: bus 5 (PTA0)
acpiprt5 at acpi0: bus 13 (PTB0)
acpiprt6 at acpi0: bus 16 (PTC0)
acpiprt7 at acpi0: bus 0 (PCI0)
acpicpu0 at acpi0
acpicpu1 at acpi0
acpicpu2 at acpi0
acpicpu3 at acpi0
acpitz0 at acpi0: critical temperature 31 degC
pci0 at mainbus0 bus 0
pchb0 at pci0 dev 0 function 0 Intel E7520 Host rev 0x0c
ppb0 at pci0 dev 2 function 0 Intel E7520 PCIE rev 0x0c
pci1 at ppb0 bus 5
ppb1 at pci1 dev 0 function 0 Intel PCIE-PCIE rev 0x09
pci2 at ppb1 bus 6
ppb2 at pci1 dev 0 function 2 Intel PCIE-PCIE rev 0x09
pci3 at ppb2 bus 9
ppb3 at pci0 dev 4 function 0 Intel E7520 PCIE rev 0x0c
pci4 at ppb3 bus 13
ppb4 at pci0 dev 6 function 0 Intel E7520 PCIE rev 0x0c
pci5 at ppb4 bus 16
ppb5 at pci0 dev 28 function 0 Intel 6300ESB PCIX rev 0x02
pci6 at ppb5 bus 2
mpi0 at pci6 dev 3 function 0 Symbios Logic 53c1030 rev 0x08: apic 9 int 0 (irq 5)
scsibus0 at mpi0: 16 targets, initiator 7
sd0 at scsibus0 targ 0 lun 0: COMPAQ, BF03688284, HPB3 SCSI3 0/direct fixed
sd0: 34732MB, 512 bytes/sec, 71132000 sec total
sd1 at scsibus0 targ 3 lun 0: COMPAQ, BF3008AFEC, HPB1 SCSI3 0/direct fixed
sd1: 286102MB, 512 bytes/sec, 585937500 sec total
sd2 at scsibus0 targ 5 lun 0: COMPAQ, BF3008AFEC, HPB1 SCSI3 0/direct fixed
sd2: 286102MB, 512 bytes/sec, 585937500 sec total
mpi0: target 0 Sync at 160MHz width 16bit offset 63 QAS 1 DT 1 IU 1
mpi0: target 3 Sync at 160MHz width 16bit offset 127 QAS 1 DT 1 IU 1
mpi0: target 5 Sync at 160MHz width 16bit offset 127 QAS 1 DT 1 IU 1
mpi1 at pci6 dev 3 function 1 Symbios Logic 53c1030 rev 0x08: apic 9 int 1 (irq 5)
scsibus1 at mpi1: 16 targets, initiator 7
uhci0 at pci0 dev 29 function 0 Intel 6300ESB USB rev 0x02: apic 8 int 16 (irq 3)
uhci1 at pci0 dev 29 function 1 Intel 6300ESB USB rev 0x02: apic 8 int 19 (irq 3)
Intel 6300ESB WDT rev 0x02 at pci0 dev 29 function 4 not configured
Intel 6300ESB APIC rev 0x02 at pci0 dev 29 function 5 not configured
ehci0 at pci0 dev 29 function 7 Intel 6300ESB USB rev 0x02: apic 8 int 23 (irq 3)
usb0 at ehci0: USB revision 2.0
uhub0 at usb0 Intel

Problem after upgrade 4.5 to 4.6: ERR M

2010-03-22 Thread Uwe Dippel
Having done upgrades from 4.0 onwards on an OpenBSD-only server (amd64), 
this time something must have gone wrong: despite the 'successful' upgrade 
(done remotely via serial console, since I have no physical access; no error 
messages), when I was asked to reboot, I did, as always. Alas, it came 
up with


Attempting Boot From Floppy Drive (A:)
Attempting Boot From CD-ROM
Attempting Boot From Hard Drive (C:)
Using drive 0, partition 3.
Loading...
ERR M

on an HP ML350G4p.

As far as I know, this is a problem with the MBR.
What I'd really like to know, before I drive there and get access, is how 
best to solve this problem in the most straightforward way. Talking about what 
went wrong can wait, since this is a production machine and should be 
back as soon as possible.


Thanks in advance,

Uwe



Re: Problem after upgrade 4.5 to 4.6: ERR M

2010-03-22 Thread Uwe Dippel

Tobias Ulmer wrote:


As explained above, no, you likely moved around/corrupted /boot in a way
that doesn't work for biosboot.
  


Hmm. Actually I didn't. Through serial console, I had rebooted the 
server, just 'to make sure', before booting to bsd.rd, and everything 
went through. I rebooted again, immediately, to bsd.rd, and went through 
the very normal and standard procedure like umpteen times before. One 
exception: the bsd.mp was shown as corrupted by its sha256 hash. The 
install program, however, continued, so that I could not rectify this. 
This being a multi-CPU box, in the end it automatically copied the 
(corrupted) bsd.mp to bsd, which then had a size of 1.3 MB. Therefore, 
at the very end, after the device nodes, at the 'reboot now' prompt, I 
ftp-ed a correct version from another location into there, and cp-ed it 
into bsd. Then, strangely enough, suddenly there appeared a bsd.sp of 
size 0, which had not been there before.
I found this quite strange: both the installer going through despite 
the wrong hash, and more so the (new?) automatic move of bsd.mp to bsd 
on a multicore machine, though the size was wrong. And in the end, a 
'0'-sized bsd.sp after moving in a healthy bsd.mp.
I would not totally exclude an interference of this (new?) code that 
led to the described situation. Honestly, nothing at all was done in that 
session aside from what I wrote, between the 2 boots. I guess nothing 
of what I did should hurt the /boot?


Thanks for the reply. I'll go there next to try what has been proposed. 
Before I try, in case the

# /usr/mdec/installboot -v boot /usr/mdec/biosboot sd0
does NOT work, what else could I do? (I am asking, because it is a 
server room quite far away, with little chance for me to communicate, 
and difficult to go.) So, is there any alternative, or additional, 
solution to fall back to, when I am there, and installboot doesn't cut it?


Uwe



Re: Problem after upgrade 4.5 to 4.6: ERR M [Solved]

2010-03-22 Thread Uwe Dippel

Nick Holland wrote:
And in the end, a 
'0'-sized bsd.sp after moving in a healthy bsd.mp.
I would not totally exclude an interference of this (new?) code that 
lead to the described situation. Honestly, nothing at all done in that 
session aside from what I wrote, between the 2 boots. I guess, nothing 
of what I did should hurt the /boot?



well, something did damage /boot.  I doubt it was anything you did 
intentionally, but something also caused a bad bsd.sp to be copied over. 
  Possibly related.  May indicate system problems of some type.
  


Okay, back. Works!

But we should not stop here.
Because on mounting my / on /mnt, I noticed that /boot had also 
shrunk to zero size. Like that bsd.sp, which was okay, but became 0 bytes 
after copying bsd.mp to bsd. What would now make /boot zero?


/usr/mdec/installboot -v boot /usr/mdec/biosboot sd0
says something like
boot: boot proto: /usr/mdec/biosboot device:/dev/rsd0c
no error message, but /boot is still '0'.
Then I removed /boot, and then an error message came up:
... boot: No such file or directory
Meaning, it couldn't rectify the /boot of size 0.
Last chance: I copied /usr/mdec/boot to /mnt/
Again:
/usr/mdec/installboot -v boot /usr/mdec/biosboot sd0
boot: boot proto: /usr/mdec/biosboot device:/dev/rsd0c
boot is 3 blocks x 16384 bytes
fs block shift 2; offset 63; inode block 24, offset 936
using MBR partition 3: type 0xA6 offset 63
Now, this looked promising and actually worked.
I still take a bet on a round of drinks that there is a bug in the 
recent install/upgrade code that has a tendency to render files to zero 
size.
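
A check for zero-sized boot files before rebooting could be sketched like this. It is a sketch only: the function name check_nonempty is made up for illustration, and the file names in the usage note are simply the ones mentioned in this thread.

```shell
# Hypothetical helper: report every given file that is missing or has size 0.
# Returns 0 only if all files exist and are non-empty.
check_nonempty() {
    rc=0
    for f in "$@"; do
        # [ -s file ] is true only for an existing file larger than 0 bytes.
        if [ ! -s "$f" ]; then
            echo "empty or missing: $f"
            rc=1
        fi
    done
    return $rc
}
```

With the root filesystem mounted on /mnt, something like `check_nonempty /mnt/boot /mnt/bsd /mnt/bsd.mp` at the 'reboot now' prompt would have flagged the empty /boot. It only tests the size, not whether the contents are correct.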


Thanks for all the input to get this production box back!

Uwe



Re: strange (?) ssh user

2009-08-21 Thread Uwe Dippel

Paul de Weerd wrote:

Hi Uwe,

  


Yes. Like
Accepted password for isuser from XXX.XX.XX.XX port 61802 ssh2



And this XXX.XX.XX.XX is the address of a machine you know ?


Yes


 The user
is a well known user to you,


Yes


 some system account perhaps ?
  


No

  
To be clear, the user exists, and logged on the last time three days ago  
as far as 'last' is concerned.



This does not really match up with your previous statements of who
never logged on, is not visible with 'last'.
  


Sorry, my shoddy way of saying things. 'Never' meant 'never while there 
were processes running under his user-ID in the last hours'

So his last 'last' is 3 days old.


What is this user doing ? Any other processes running under his uid ?
  


No, only the root- and user-id of ssh.


If he's back immediately after a reboot, it sounds like an automated
log in (using password auth; that may be interesting).

What exactly do you want to know here ? How to log in without showing
up in finger/w/last/etc ? Try `while :; do ssh ${HOST} read A; done`,
it does exactly what you describe.

Are you sure that account is not compromised and your machine is not
sending out lots of e-mail ?
  


Hmm. How would I know? The daily security report gives out a reasonable 
number of mails, top looks okay to me, low as usual.



Cheers,
  


Thanks,

Uwe



Re: strange (?) ssh user

2009-08-21 Thread Uwe Dippel

Edd Barrett wrote:

Hi,

On Fri, Aug 21, 2009 at 6:54 AM, Uwe Dippel udip...@uniten.edu.my wrote:
  

Yes. Like
Accepted password for isuser from XXX.XX.XX.XX port 61802 ssh2

To be clear, the user exists, and logged on the last time three days ago as
far as 'last' is concerned.



This sounds very fishy. I would start backing up if I were you.
  


Did this.


You said first that last says the user had not logged on, but now that
it has 3 days ago? Is the user covering up his/her traces or was that
a typo?
  


(See my other mail, my ambiguity: Last record in 'last' of 3 days ago.)


See what the user is doing and what is in his/her home directory.


Nothing except ssh. Nothing much there: the usual few files, nothing in 
hidden files.



 Try
to find information about the machine which it is coming from.
  


It is an inside (LAN) machine, standard workstation/desktop


I would be interested to know.
  


Me too!  ;)

Uwe



Re: strange (?) ssh user

2009-08-21 Thread Uwe Dippel

Iñigo Ortiz de Urbina wrote:


As it's not clear to me if isuser is a user you trust, created or 
needed for your services,


'Trusted', created by myself, needs a local account.

I would say your machine might have been compromised. What kind of 
traffic is isuser generating?


Difficult to find out if I assume I could not trust my box any longer.


Is it just a reverse ssh shell?


Could very well be.
Would this not show in 'last' or 'w'?
Interesting to me, that no pseudo-terminal is associated with the 
activities (ssh), contrary to a usual local logon.



Can you shutdown his account or set his/her/its shell to nologin(8)?


I'll try this next when I see her activities: nologin.


Next install you might consider following the advice of mtree(8), as 
the output of previous and current `mtree -cK sha1digest` would be 
really useful here.


I'll have to study this first.

Thanks!



Re: strange (?) ssh user

2009-08-21 Thread Uwe Dippel

Paul de Weerd wrote:



tcpdump(8) will tell you a lot, I suppose ;) I guess the best way to
make sure the account is not compromised is talking to your user and
asking him if he can explain what is going on. Again, my current guess
is TCP forwarding, but it could be a lot of other things too. Ask your
user and see if he knows about this.


I can't as of now (weekend).

But I can see it reoccurring, kind of:
Aug 21 18:31:25 mybox sshd[31888]: Accepted password for isuser from 
XXX.XX.XX.XX port 57519 ssh2

in authlog, reflected pretty well by
isuser    ttyp0    172.16.0.35      Fri Aug 21 18:31 - 18:31  (00:00)
in 'last'; though still busy sending stuff forth and back:
isuser 16994  0.0  0.8  3176  1992 ??  S      6:31PM    0:00.13 sshd: isuser

There are a bunch of logons of that user, of 00:00 logon duration during 
the last weeks. The only thing running from this user at this moment is 
the ssh.
That would mean, one can log on, spawn a process, log off, and the 
process keeps running?
Then everything could be 'fine', and the system not compromised, only 
exploited to run some ssh-tunnel or so.

Though I would not have expected this behaviour of the system.

Uwe
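
The comparison made above, authlog entries against what 'last' reports, can be sketched as a small filter. This is only an illustration: accepted_logins is a made-up helper name, and the log format is assumed to match the authlog line quoted in this message.

```sh
# Count "Accepted password" lines for user $1 in authlog text read
# from stdin. Prints the count; 0 means no such line was found.
accepted_logins() {
    grep -c "Accepted password for $1 from "
}
```

Usage would be `accepted_logins isuser < /var/log/authlog`. A count well above the number of sessions of non-zero duration in `last isuser` points to logins that open a connection and drop the terminal right away.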



Re: strange (?) ssh user

2009-08-21 Thread Uwe Dippel

Robert C Wittig wrote:

Have you considered adding a PF rule that would drop all incoming
login requests from this specific user?
  


Yes. But it won't work, because there is a NAT-address-rewrite in 
between that changes the source address. Also, that user has plenty of 
machines to log on to.
It seems by now that it is not a compromise, but something else, rather 
'abuse'.


Uwe



Re: strange (?) ssh user

2009-08-21 Thread Uwe Dippel

Paul de Weerd wrote:


You could check for the presence of forwarded TCP sessions with fstat,
an example looks like this:

weerd    sshd   29016   11* internet stream tcp 0x40009ab33d0 127.0.0.1:44410 -- 127.0.0.1:3128

If you open an ssh session to a remote machine with a forwarded port,
then open the forwarded port and once the connection over the
forwarded port has been established ^D the initial session, you'll get
the behaviour you just described. The established TCP session over the
forwarded connection keeps the SSH session alive but the user is shown
as logged out (and no processes show other than the sshd's you
mentioned).
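
The fstat check described above can be sketched as a filter. This is a rough illustration, not a definitive test: sshd_tcp_sockets is a made-up helper name, and the column layout (user, command, pid, ...) is assumed from the sample line quoted in this message.

```sh
# Read fstat(1) output on stdin and keep only TCP sockets held by sshd.
# Such a line for a user without a visible login suggests a connection
# kept alive by a forwarded port.
sshd_tcp_sockets() {
    awk '$2 == "sshd" && /internet stream tcp/'
}
```

Usage would be `fstat -u isuser | sshd_tcp_sockets`, where isuser is the account from this thread.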
  


Now I am pretty sure that this is what we see here.
It also makes sense, since all those users sit on a tightly controlled 
LAN; while that machine is 'further out'. So that restricted services 
can be accessed through some tunneling.
Now: How to prevent it?? I have hundreds of users, who can log on from 
hundreds of machines, and all need access to ssh, and easily 30 at the 
same time.

So, filtering IP addresses is out, nologin is out, no ssh is out.
Of course, I can politely ask, but I would not necessarily trust it to 
be followed. I'd much rather disallow it technically. At least, have an 
easy access to the record (e.g. in 'last'). But since it doesn't require 
logon, what to do? And how to prevent this??


Any suggestion appreciated,

Uwe



Re: strange (?) ssh user

2009-08-21 Thread Uwe Dippel

Johan Beisser wrote:



Read the man page for ssh_config(5) and sshd_config(5), and look at
restricting what your users can do.

Specifically: AllowTcpForwarding, PermitOpen and PermitTunnel,
combined with Match.
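
A minimal sketch of the kind of restriction meant here, assuming forwarding is to be switched off for a single account. forwarding_off_for is a made-up helper name; it only prints a fragment and writes nothing to disk. Whether PermitOpen or PermitTunnel are accepted inside a Match block depends on the OpenSSH version, so only AllowTcpForwarding is shown.

```sh
# Print an sshd_config(5) fragment that turns off TCP forwarding for
# user $1. Review the output before appending it by hand.
forwarding_off_for() {
    cat <<EOF
Match User $1
    AllowTcpForwarding no
EOF
}
```

The output of `forwarding_off_for isuser` would go at the end of /etc/ssh/sshd_config, since a Match block applies until the next Match line or the end of the file. `sshd -t` checks the file for errors before sshd is reloaded. Note that sshd_config(5) itself warns that disabling TCP forwarding does not improve security unless users are also denied shell access, as they can always install their own forwarders.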
  


Thanks everyone for a great number of enlightening and helpful replies 
to my post!
I have learned a lot. Last but not least, and once again, how biased my 
thinking can be: when I noticed some activities by a user who was not 
logged on, I feared a compromise. That led me away from the solution: 
reading the man pages of ssh, as I did not expect this to be 'normal' or 
even legal.


Thanks again!

Uwe



strange (?) ssh user

2009-08-20 Thread Uwe Dippel
Recently, I noticed an ssh user on one of my machines, who never logged 
on, is not visible with 'last', seems to have no terminal active, and is 
back immediately after a reboot.

Hmm.
root   13415  0.0  0.9  3280  2420 ??  Ss    12:04PM    0:00.08 sshd: isuser
isuser   702  0.0  0.7  3280  1824 ??  S     12:04PM    0:00.00 sshd: isuser
Whatever I do with finger, w, last, no trace of any activity; not even a 
login.
I tried to kill the processes, and they are gone, but the next second 
another pair is up.


Could anyone help me to explain what is going on here?

Uwe



Re: strange (?) ssh user

2009-08-20 Thread Uwe Dippel

Ryan Flannery wrote:

On Fri, Aug 21, 2009 at 1:19 AM, Uwe Dippel udip...@uniten.edu.my wrote:
  

Recently, I noticed an ssh user on one of my machines, who never logged on,
is not visible with 'last', seems to have no terminal active, and is back
immediately after a reboot.
Hmm.
root   13415  0.0  0.9  3280  2420 ??  Ss    12:04PM    0:00.08 sshd: isuser
isuser   702  0.0  0.7  3280  1824 ??  S     12:04PM    0:00.00 sshd: isuser
Whatever I do with finger, w, last, no trace of any activity; not even a
login.



Just to be clear here, do you see anything in /var/log/authlog?
  


Yes. Like
Accepted password for isuser from XXX.XX.XX.XX port 61802 ssh2

To be clear, the user exists, and logged on the last time three days ago 
as far as 'last' is concerned.




4.6 will be released on October 1st?

2009-08-12 Thread Uwe Dippel

At least, that's what the website says at http://openbsd.org/46.html
True or typo? (I'd expect November 1st.)

Uwe



Re: Panic at install of amd64 on HP nx6320

2009-08-08 Thread Uwe Dippel
Uwe Dippel udippel at uniten.edu.my writes:

 
 Marco Peereboom wrote:
  we need a trace; this is worthless.

 
 Thought so. Here are the screens, in the attachment.
 Hope, it goes through!
 
 Uwe
 
 [demime 1.01d removed an attachment of type image/jpeg which had a name of
IMG_0623.JPG]
 
 [demime 1.01d removed an attachment of type image/jpeg which had a name of
IMG_0624.JPG]
 
 [demime 1.01d removed an attachment of type image/jpeg which had a name of
IMG_0625.JPG]
 


So they didn't go through, as to be expected.
Here is a link:
http://metalab.uniten.edu.my/~udippel/



Re: Panic at install of amd64 on HP nx6320

2009-08-08 Thread Uwe Dippel
Jonathan Gray jsg at goblin.cx writes:

 Nowhere do you state which release you are running.  Similar problems
 have been fixed in -current some months ago, so what are you running?

My fault. I'm running 4.5 stable. Would those fixes have made it into 4.5? If
yes, -current is no alternative. What exactly do you mean, so that I could check
the changelog?

Uwe



Re: Panic at install of amd64 on HP nx6320

2009-08-08 Thread Uwe Dippel
Jonathan Gray jsg at goblin.cx writes:

 No, this will never be in 4.5.  The acpi parser has changed significantly
 since 4.5 which made many hp machines much happier.  You need
 to run a snapshot to get the newer parser to resolve this problem.

Correct, guys, thanks so much!
I ran the -current of August 7th, and it runs through, and reboots properly.
And X comes up without problem with 'startx'. Looks good to me, so far.
And a new installer. Somewhat confusing, though:
Layout [A]utomatic (or so) has a lower case default at the line end:[a]
At the end, it finds an MP kernel and says 'using bsd.mp instead'. It might be
better to formulate it in a manner to clearly state
 - I will use
or
 - You might want to use
Actually, I'd prefer the second version: asking, with .mp as default.
Timezone might better go at the beginning, with ntp.
One slash is missing in between, when the newly created directories are shown.
There it looks something like
/mnt
/mnt/usr
/mnt/home
etc.
I was looking for '/', and it seems to be missing in the first line.

Thanks again!!

Uwe



Panic at install of amd64 on HP nx6320

2009-08-07 Thread Uwe Dippel
What I did: Install into wd0, second DOS partition, 20G. Everything 
looked good. At reboot, the panic happens, always.


ps is easy:
ddb> ps
* 0 -1 0 0 7 0x80200 swapper
ddb> trace
Debugger() at Debugger+0x5
panic() at panic+0x122
_aml_die() at _aml_die+0xdb
aml_xconvert() at aml_xconvert+0x68
[...]
config_attach() at config_attach+0x11b
cpu_configure() at cpu_configure+0x1c
main() at main+0x3c5
end trace frame:0x0, count: -31
(If someone will ask for the complete trace: I'll take a screenshot with 
a camera if need be.)


Any recommendation? (The machine works well at its other boots: XP and 
Ubuntu.)


Uwe



Re: Panic at install of amd64 on HP nx6320

2009-08-07 Thread Uwe Dippel
Marco Peereboom wrote:
 we need a trace; this is worthless.
   

Thought so. Here are the screens, in the attachment.
Hope, it goes through!

Uwe

[demime 1.01d removed an attachment of type image/jpeg which had a name of 
IMG_0623.JPG]

[demime 1.01d removed an attachment of type image/jpeg which had a name of 
IMG_0624.JPG]

[demime 1.01d removed an attachment of type image/jpeg which had a name of 
IMG_0625.JPG]



Re: softraid

2009-05-23 Thread Uwe Dippel
Janne Johansson jj at it.su.se writes:

 Isn't that the case with all fstab entries right now?
 
 You get the computer to list some drive before other disks, raid or no 
 raid, and fstab breaks on you.

No, you didn't read it carefully enough. fstab breaks on me when I shove in a
drive 'before', true. But when I add a 'higher' one, everything is fine. Believe
me, I have a bunch. softraid is different, because ANY added physical drive will
increase the count, and push the softraid drive one notch higher.

Uwe



Re: softraid

2009-05-22 Thread Uwe Dippel
Marco Peereboom slash at peereboom.us writes:


  Then keep asking!
  I do have the impression, what I wanted, is what you already had in mind:
  a broken mirror simply remains dead and broken, and the machine runs 
  happily 
  before and after reboot on the sane drive. Correct?
 
 Correct.  If this isn't the case then I need to see a dmesg before and
 after rebooting and bioctl output before and after reboot.

Since we (that's I, sorry) seem to discuss the whole bunch (not a bad idea
after all, hoping to get things into their places and finally enjoy a 
really beautiful and functioning softraid), I allow myself to add another
question, real life, on a to-be-production system:
Okay, now I have a broken harddisk, one half of the mirror is gone.
Then I will have to dump the partitions, create a new mirror, and restore,
correct?
Next problem: There are quite a number of bays available in my box, 
so that I can plug another drive for a local 'dump'. But irrespective 
where I plug it, it won't come up:

softraid0 at root

softraid0: roaming device sd1b -> sd2b

softraid0: roaming device sd2b -> sd3b

softraid0: roaming device sd1b -> sd2b

softraid0: roaming device sd2b -> sd3b

scsibus3 at softraid0: 1 targets

sd4 at scsibus3 targ 0 lun 0: OPENBSD, SR RAID 1, 003 SCSI2 0/direct fixed

sd4: 285789MB, 512 bytes/sec, 585296066 sec total

softraid0: volume sd4 is roaming, it used to be sd3, updating metadata

root on sd0a swap on sd0b dump on sd0b

Automatic boot in progress: starting file system checks.
/dev/rsd0a: file system is clean; not checking
Can't open /dev/rsd3h: Device not configured
CAN'T CHECK FILE SYSTEM.
/dev/rsd3h: UNEXPECTED INCONSISTENCY; RUN fsck_ffs MANUALLY.
Can't open /dev/rsd3d: Device not configured
CAN'T CHECK FILE SYSTEM.
/dev/rsd3d: UNEXPECTED INCONSISTENCY; RUN fsck_ffs MANUALLY.
Can't open /dev/rsd3f: Device not configured
CAN'T CHECK FILE SYSTEM.
/dev/rsd3f: UNEXPECTED INCONSISTENCY; RUN fsck_ffs MANUALLY.
Can't open /dev/rsd3e: Device not configured
CAN'T CHECK FILE SYSTEM.
/dev/rsd3e: UNEXPECTED INCONSISTENCY; RUN fsck_ffs MANUALLY.
Can't open /dev/rsd3g: Device not configured
CAN'T CHECK FILE SYSTEM.
/dev/rsd3g: UNEXPECTED INCONSISTENCY; RUN fsck_ffs MANUALLY.
Can't open /dev/rsd3i: Device not configured
CAN'T CHECK FILE SYSTEM.
/dev/rsd3i: UNEXPECTED INCONSISTENCY; RUN fsck_ffs MANUALLY.
Can't open /dev/rsd3j: Device not configured
CAN'T CHECK FILE SYSTEM.
/dev/rsd3j: UNEXPECTED INCONSISTENCY; RUN fsck_ffs MANUALLY.
THE FOLLOWING FILE SYSTEMS HAD AN UNEXPECTED INCONSISTENCY:

ffs: /dev/rsd3h (/home), ffs: /dev/rsd3d (/tmp), ffs: /dev/rsd3f 
(/usr), ffs: /dev/rsd3e (/var), ffs: /dev/rsd3g (/var/mail), ffs: /dev/rsd3i
(/var/www),
ffs: /dev/rsd3j (/backup)
Automatic file system check failed; help!

Enter pathname of shell or RETURN for sh: 

Now that roaming is in the way. I wonder if softraid really should do this:
You plug an additional disk, higher or lower, and automatically it will 
roam the mirror to drives of its liking, and inevitably fail the RAID thereby.



Re: softraid

2009-05-22 Thread Uwe Dippel
Marco Peereboom slash at peereboom.us writes:

 
 This one the pulled drive still contains the same metadata as the
 surviving members.  Since you are running a home made kernel I have no
 idea what code you are running.  This scenario should work with the code
 I committed a couple of weeks ago.  From the looks of it this is a bug
 or you are running old code.

4.5 stable.
I have no clue what 'home made kernel' implies, it is just the recompiled 
(according to FAQ)  standard, generic kernel; needed for the patches issued 
in 4.5 until now. Zero any other item.

Uwe



Re: softraid

2009-05-22 Thread Uwe Dippel
Marco Peereboom slash at peereboom.us writes:

 
 This is a repeat of the can't bring up a raid set with missing
 members

Yes, exactly. This can be closed; it was just to demonstrate that I am not
the only person, who sees broken mirrors being re-attached.



Re: softraid

2009-05-22 Thread Uwe Dippel
Marco Peereboom slash at peereboom.us writes:


 This is currently correct because I am working on this particular case.
 This one has proved to be very hairy hence it isn't in the tree yet.

Good to know, thanks for the heads-up, I keep waiting then for 4.6, I guess?


  I'd expect the
  softraid, in order to be useful, to reboot on its sane leg.
 
 See previous comment.  This is incomplete code.

Thanks for the info. Complete is -current or will it be in 4.6?

Uwe



Re: softraid

2009-05-22 Thread Uwe Dippel
Marco Peereboom slash at peereboom.us writes:

  Next problem: There are quite a number of bays available in my box, 
  so that I can plug another drive for a local 'dump'. But irrespective 
  where I plug it, it won't come up:
 
 Your trace shows that it comes up just fine.

  softraid0: volume sd4 is roaming, it used to be sd3, updating metadata
 
 You inserted a disk in front of others.  From where I am sitting it is
 all working as designed.

Hmm. I plugged the fourth 'before' and 'after'; or the tray above and the tray
below that pair. Both acted the same: breaking the mirror.
To me it *looks* as if any physical disk would go before the RAID, and therefore
the RAID number is always incremented whenever one inserts another drive. And
since the RAID gets roamed while fstab is not updated, it will always break. No?


 This is not a softraid problem this is an OpenBSD problem.  We don't
 have uuids or labels on disks so when disks move around bad things
 happen.  This is a fact of life today that needs to be dealt with
 accordingly.

Yes and no. 'Move around' is not what I am doing, when I plug an extra drive
*after* sd2 (in my case). However, the system makes that one sd3, and voilà, 
the s*** hits the fan, because the RAID volume is moved up one index. 
To me this seems a result of the sequence at boot: at first we identify the
physical drives, that is sd0, sd1, sd2 and sd3 in this case, and only later
do we get softraid up, sensibly roaming the RAID one up. Sensibly? Because fstab
can't know and will want to mount partitions of a lower number (sd3 in this
case), which is always impossible.
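
As a stopgap from the rescue shell, the stale device names in fstab can be rewritten to follow the roamed volume. The sketch below is only an illustration of that manual workaround, not a fix for the ordering problem: renumber_fstab is a made-up helper name, and it assumes fstab entries start with /dev/sdNx followed by whitespace.

```sh
# Rewrite fstab text on stdin so that partitions of disk $1 point at
# disk $2. Only device names at the start of a line are touched, and
# the result goes to stdout so it can be reviewed first.
renumber_fstab() {
    sed "s|^/dev/$1\([a-p][[:space:]]\)|/dev/$2\1|"
}
```

Usage would be `renumber_fstab sd3 sd4 < /etc/fstab > /tmp/fstab.new`, followed by a manual comparison before replacing /etc/fstab. The change has to be undone again once the extra drive is removed and the volume moves back to sd3.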



Re: softraid

2009-05-22 Thread Uwe Dippel
Uwe Dippel udippel at uniten.edu.my writes:

 To me this seems a result of the sequence at boot: at first we identify the
 physical drives, that is sd0, sd1, sd2 and sd3 in this case, and only later
 do we get softraid up, sensibly roaming the RAID one up. Sensibly? Because
 fstab can't know and will want to mount partitions of a lower number 
 (sd3 in this case), which is always impossible.

I do understand the problem of 'no labels'/'no UUID', but the current behaviour
will break boot whatever happens: any extra drive, in any slot, will be
discovered at boot time before softraid is activated. So it will break 100%,
right? There is no real solution without disk IDs, though a hackish one: 
If softraid was configured at sd3 (assembled from sd1 and sd2 in this case), 
the kernel needs to be aware of this fact when it goes into drive discovery 
at boot.
So that when one plugs another drive into a higher controller, it will discover:
sd0 - sd1 - sd2 - sd3_is_taken - sd4. Then fstab will be correct w.r.t. sd0 to
sd3, and one can use sd4, the new drive, for whatever purpose it had been
intended. And if sd0 was removed from the original configuration, it would find
sd0 - sd1 - sd3_is_taken. Then roaming can still do sd0-sd1 and sd1-sd2, and
the RAID will come up properly, again.
That's the best I could think of now, anything but perfect, but always better
than a 100% breakage. 
What do you think?



Re: OpenBSD on OpenBSD with qemu through the network only?

2009-05-21 Thread Uwe Dippel
Stuart Henderson stu at spacehopper.org writes:

 Are you trying to boot an amd64 kernel? If so, you need qemu-system-x86_64.

Chances are, that I did. Downloaded the i386-cd45.iso, and followed the 
'tap mode' path:
ifconfig tun0 link0
ifconfig bridge0 create
brconfig bridge0 add tun0 add bge0 up
All went through, and I have now:
tun0: flags=9903<UP,BROADCAST,PROMISC,SIMPLEX,LINK0,MULTICAST> mtu 1500
lladdr 00:bd:f5:ab:6c:01
priority: 0
groups: tun
inet6 fe80::2bd:f5ff:feab:6c01%tun0 prefixlen 64 scopeid 0x7
bridge0: flags=41<UP,RUNNING> mtu 1500
priority: 0
groups: bridge

Next, going back to Quick Start, I type
qemu -m 32 -monitor stdio -no-fd-bootchk -hda virtual.img \
   -cdrom cd45.iso -boot d
which actually, does boot! and gets me into the installer, all hunky dory. Only
when downloading of the packages is supposed to start, once I have entered
IP-address and confirmed the directory, everything segfaults. Always.
I guess something networking-wise is still not okay. Can someone help me to
point out the mistake here?

Uwe



Re: softraid

2009-05-21 Thread Uwe Dippel
Marco Peereboom slash at peereboom.us writes:


 The plugging in of the disk is a non-event.  The disk is dead to the
 OS and by extension to softraid.

Let me follow up on this topic, please, and report some more experiments and
results and thoughts.

I recreated the mirror from scratch, and put /tmp, /var, /usr, /home and 
/backup directories on it. (No need to point out this is kind of stupid.)
Running for 2 days. 
Hot unplugged drive A. Then 'echo Nonsense > /backup/testo'
Good outcome, though not tested intensely yet: the system keeps running on 
B as if nothing had happened. 
Shutdown and plugged A back, restart.
Fails at file check, with 'help!' and dropping to a shell at /var. 
Problem is, that the .pid had been properly removed on B, but not on A; 
and I needed to delete those one by one at fsck. I also fsck-ed all other
partitions, and as to be expected, the 'testo' was on B, not on A, 
and therefore it needed to be deleted.
Reboot, alas, ending in a hang. Reboot.
Another time /var drops to a shell, it has some trouble with 'lost+found',
another manual fsck is needed, reboot.
Finally, the mirror comes up properly. 

Next, I'd like to do a real test on a production machine. What scares me, 
is the lack of physical access, so the hang and the drop to shell for 
fsck are not good. And, on a production box here, there might be thousands 
of files accumulating on the plugged drive that won't be available on the 
unplugged one,
and I will be asked to delete those. Also, this is not good. 
My question/suggestion: I for one would be happy if the state after reboot 
would by default be identical to the (degraded) state before the reboot: 
Because then I would hope to get the system started without the earlier 
defunct drive; that means, hopefully starting okay, and more relevant, 
not require me to do anything, not to delete any files. Simply start with 
the sane drive of the broken mirror as it was shut down. Then I could 
dump and restore the data to a freshly created RAID, without any further ado.
Then, at least, a broken drive, a flimsy controller would not interfere 
into the proper running and restarting of the box; and giving me the 
chance to retrieve all, including the most recent, data.

Does this make sense?

Uwe



Re: softraid

2009-05-21 Thread Uwe Dippel
Marco Peereboom slash at peereboom.us writes:

 Upon reboot the mirror should be brought up with only the surviving
 member.  If this isn't the case please show me a trace so that I can go
 fix that bug.

'trace' means what here? Yes, I unplugged a drive of a working mirror as I
wrote, halt, plugged it back with power off, rebooted, and had the described 
state of fsck problems, fortunately telling me exactly about the files that
were available on A ('.pid') and B ('testo') only.

 Drive A is DEAD.  Do not EVER use it again.

That's fine, except that the system tried to re-insert it into the mirror at reboot.
Not on my account, but on its own, as one might say.

 YOU CAN NEVER USE THE UNPLUGGED DRIVE EVER EVER EVER EVER EVER
 AGAIN!  IT IS DEAD AND IS CORRUPT AND PUPPIES DIE WHEN YOU USE IT!!
 
 Hope this sinks in.

Fine. (Should I really say, it required some file checks first, and then,
and now, it automagically is a member again of my - by now - online raid?
Without me adding it?)

 If the dead drive becomes a participating member of a raid set
 something is broken; very badly broken.  Show me a trace, including
 bioctl output,  if this is the case.

Again, bioctl I do understand. 'trace' only at panic.

  Does this make sense?
 
 Not sure.  I have a hard time following what you are doing vs. what your
 expectations are.

Then keep asking!
I do have the impression, what I wanted, is what you already had in mind:
a broken mirror simply remains dead and broken, and the machine runs happily 
before and after reboot on the sane drive. Correct?

Uwe



Re: softraid

2009-05-21 Thread Uwe Dippel
Marco Peereboom slash at peereboom.us writes:


 Correct.  If this isn't the case then I need to see a dmesg before and
 after rebooting and bioctl output before and after reboot.
 
 Keep in mind that softraid can only detect failure AFTER an io fails.
 This is key, because you could fail a drive and go undetected by
 softraid.

Clear. This is why I tested with 'echo Nonsense > testo'.
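
Since softraid only notices a failure after an I/O fails, a check of the array state is only meaningful after a write like the one above. A minimal sketch of such a check follows. not_online is a made-up helper name, the column layout is assumed from the bioctl output shown in this thread, and status words other than "Online" are not taken from the thread.

```sh
# Read `bioctl softraid0` output on stdin and print every volume or
# chunk line whose status column is not "Online". The header is skipped.
not_online() {
    awk 'NR > 1 && NF > 0 {
        # Volume lines start with the controller name; chunk lines start with an index.
        status = ($1 ~ /^[0-9]+$/) ? $2 : $3
        if (status != "Online") print
    }'
}
```

Usage would be `bioctl softraid0 | not_online`; empty output means every line reports Online.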


Here is what I did, I hope it explains what is going on. 
If not, just ask!

[rebooted]
# bioctl softraid0 
Volume  Status   Size Device  
softraid0 0 Online   299671585280 sd3 RAID1
  0 Online   299671585280 0:0.0   noencl sd1b
  1 Online   299671585280 0:1.0   noencl sd2b
# df -h
/dev/sd0a      300M    108M    177M    38%    /
/dev/sd3h      9.8G    730M    8.6G     8%    /home
/dev/sd3d     1008M    6.0K    958M     0%    /tmp
/dev/sd3f      7.9G    2.7G    4.8G    36%    /usr
/dev/sd3e      492M   17.1M    450M     4%    /var
/dev/sd3g      2.0G    1.4M    1.9G     0%    /var/mail
/dev/sd3i      7.9G    3.3M    7.5G     0%    /var/www
/dev/sd3j      246G   95.5G    138G    41%    /backup
# cd /backup
# ls -l
total 200231436
[some files listed]
# echo Nonsense > testo_b4
# ls -l testo_b4   
-rw-r--r--  1 root  wheel  9 May 22 11:57 testo_b4
# bioctl softraid0 
Volume  Status   Size Device  
softraid0 0 Online   299671585280 sd3 RAID1
  0 Online   299671585280 0:0.0   noencl sd1b
  1 Online   299671585280 0:1.0   noencl sd2b
# [pull drive]
# dmesg
OpenBSD 4.5 (GENERIC.MP) #0: Thu May 14 18:57:01 SGT 2009
r...@claude2.uwe.uniten.edu.my:/usr/src/sys/arch/amd64
/compile/GENERIC.MP
real mem = 3756994560 (3582MB)
avail mem = 3634552832 (3466MB)
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.3 @ 0xec000 (62 entries)
bios0: vendor HP version D17 date 07/16/2007
bios0: HP ProLiant ML350 G4
acpi0 at bios0: rev 2
acpi0: tables DSDT FACP SPCR MCFG APIC
acpi0: wakeup devices
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Xeon(TM) CPU 3.00GHz, 3000.53 MHz
cpu0:
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,
CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,SBF,SSE3,MWAIT,DS-CPL,
CNXT-ID,CX16,xTPR,LONG
cpu0: 1MB 64b/line 8-way L2 cache
cpu0: apic clock running at 200MHz
cpu1 at mainbus0: apid 6 (application processor)
cpu1: Intel(R) Xeon(TM) CPU 3.00GHz, 3000.11 MHz
cpu1:
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,
CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,SBF,SSE3,MWAIT,DS-CPL,
CNXT-ID,CX16,xTPR,LONG
cpu1: 1MB 64b/line 8-way L2 cache
ioapic0 at mainbus0 apid 8 pa 0xfec0, version 20, 24 pins
ioapic1 at mainbus0 apid 9 pa 0xfec1, version 20, 24 pins
ioapic1: misconfigured as apic 0, remapped to apid 9
ioapic2 at mainbus0 apid 10 pa 0xfec8, version 20, 24 pins
ioapic3 at mainbus0 apid 11 pa 0xfec80400, version 20, 24 pins
acpiprt0 at acpi0: bus 1 (IP2P)
acpiprt1 at acpi0: bus 2 (IPXB)
acpiprt2 at acpi0: bus 6 (PCXA)
acpiprt3 at acpi0: bus 9 (PCXB)
acpiprt4 at acpi0: bus 5 (PTA0)
acpiprt5 at acpi0: bus 13 (PTB0)
acpiprt6 at acpi0: bus 16 (PTC0)
acpiprt7 at acpi0: bus 0 (PCI0)
acpicpu0 at acpi0
acpicpu1 at acpi0
acpitz0 at acpi0: critical temperature 31 degC
pci0 at mainbus0 bus 0: configuration mode 1
pchb0 at pci0 dev 0 function 0 Intel E7520 Host rev 0x0c
ppb0 at pci0 dev 2 function 0 Intel E7520 PCIE rev 0x0c
pci1 at ppb0 bus 5
ppb1 at pci1 dev 0 function 0 Intel PCIE-PCIE rev 0x09
pci2 at ppb1 bus 6
ppb2 at pci1 dev 0 function 2 Intel PCIE-PCIE rev 0x09
pci3 at ppb2 bus 9
ppb3 at pci0 dev 4 function 0 Intel E7520 PCIE rev 0x0c
pci4 at ppb3 bus 13
ppb4 at pci0 dev 6 function 0 Intel E7520 PCIE rev 0x0c
pci5 at ppb4 bus 16
ppb5 at pci0 dev 28 function 0 Intel 6300ESB PCIX rev 0x02
pci6 at ppb5 bus 2
mpi0 at pci6 dev 3 function 0 Symbios Logic 53c1030 rev 0x08: 
apic 9 int 0 (irq 5)
scsibus0 at mpi0: 16 targets, initiator 7
sd0 at scsibus0 targ 0 lun 0: COMPAQ, BF03688284, HPB3 SCSI3 0/direct fixed
sd0: 34732MB, 512 bytes/sec, 71132000 sec total
sd1 at scsibus0 targ 3 lun 0: COMPAQ, BF3008AFEC, HPB1 SCSI3 0/direct fixed
sd1: 286102MB, 512 bytes/sec, 585937500 sec total
sd2 at scsibus0 targ 5 lun 0: COMPAQ, BF3008AFEC, HPB1 SCSI3 0/direct fixed
sd2: 286102MB, 512 bytes/sec, 585937500 sec total
mpi0: target 0 Sync at 160MHz width 16bit offset 63 QAS 1 DT 1 IU 1
mpi0: target 3 Sync at 160MHz width 16bit offset 127 QAS 1 DT 1 IU 1
mpi0: target 5 Sync at 160MHz width 16bit offset 127 QAS 1 DT 1 IU 1
mpi1 at pci6 dev 3 function 1 Symbios Logic 53c1030 rev 0x08: 
apic 9 int 1 (irq 5)
scsibus1 at mpi1: 16 targets, initiator 7
uhci0 at pci0 dev 29 function 0 Intel 6300ESB USB rev 0x02: 
apic 8 int 16 (irq 5)
uhci1 at pci0 dev 29 function 1 Intel 6300ESB USB rev 0x02: 
apic 8 int 19 (irq 5)
Intel 6300ESB WDT rev 0x02 at pci0 dev 29 function 

Re: softraid

2009-05-21 Thread Uwe Dippel
Marco Peereboom slash at peereboom.us writes:


 Correct.  If this isn't the case then I need to see a dmesg before and
 after rebooting and bioctl output before and after reboot.

This is as well supported by the post
http://vext01.blogspot.com/2007/11/playing-with-new-softraid-driver-in.html

[...]
Bioctl is the utility used for managing both hardware and software RAID in
OpenBSD, the transparency is superb.

# bioctl softraid0
Volume Status Size Device
softraid0 0 Online 1023009 sd0 RAID1
0 Online 1023009 0:0.0 noencl
1 Online 1023009 0:1.0 noencl
2 Online 1023009 0:2.0 noencl

Let's break things and see what happens. First I will simulate a missing disk at
boot, by detaching wd3. After a reboot I see this:

# dmesg | grep softraid0
softraid0 at root
softraid0: not assembling partial disk that used to be volume 0
# bioctl softraid0
#

Our RAID array was not registered by the kernel, as a disk was missing. I
imagine this will be changed at some point. As I said, the softraid driver is
not finished.

Shutdown the system and put the disk back:

# dmesg | grep softraid
softraid0 at root
scsibus0 at softraid0: 1 targets
# bioctl softraid0
Volume Status Size Device
softraid0 0 Online 1023009 sd0 RAID1
0 Online 1023009 0:0.0 noencl
1 Online 1023009 0:1.0 noencl
2 Online 1023009 0:2.0 noencl

It's back.

Isn't this what you said it shouldn't? (Be back 'Online' after an earlier
breakage of the mirror)

Uwe



Re: softraid

2009-05-21 Thread Uwe Dippel
Marco Peereboom slash at peereboom.us writes:

 
  Then keep asking!
  I do have the impression, what I wanted, is what you already had in mind:
  a broken mirror simply remains dead and broken, and the machine runs 
  happily 
  before and after reboot on the sane drive. Correct?
 
 Correct.  If this isn't the case then I need to see a dmesg before and
 after rebooting and bioctl output before and after reboot.

Alas, it doesn't (run happily ever after).  :(

My next experiment:
Everything healthy, according to bioctl:

# bioctl softraid0 
Volume  Status   Size Device  
softraid0 0 Online   299671585280 sd3 RAID1
  0 Online   299671585280 0:0.0   noencl sd1b
  1 Online   299671585280 0:1.0   noencl sd2b
# [pull drive]

[...]

[new situation: NOT putting the drive back, ever - simulating a dead drive,
maybe spindle or head gone]

(System operates fine, read/write without any problem)

[reboot - as mentioned NOT pushing the drive back]

[...]

ugen0 at uhub2 port 1 American Power Conversion Back-UPS RS 1000 FW:7.g8 .I USB FW:g8 rev 1.10/1.06 addr 2
softraid0 at root
softraid0: roaming device sd2b -> sd1b
softraid0: not assembling partial disk that used to be volume 0
root on sd0a swap on sd0b dump on sd0b

Automatic boot in progress: starting file system checks.
/dev/rsd0a: file system is clean; not checking
Can't open /dev/rsd3h: Device not configured
CAN'T CHECK FILE SYSTEM.
/dev/rsd3h: UNEXPECTED INCONSISTENCY; RUN fsck_ffs MANUALLY.
Can't open /dev/rsd3d: Device not configured
CAN'T CHECK FILE SYSTEM.
/dev/rsd3d: UNEXPECTED INCONSISTENCY; RUN fsck_ffs MANUALLY.
Can't open /dev/rsd3f: Device not configured
CAN'T CHECK FILE SYSTEM.
/dev/rsd3f: UNEXPECTED INCONSISTENCY; RUN fsck_ffs MANUALLY.
Can't open /dev/rsd3e: Device not configured
CAN'T CHECK FILE SYSTEM.
/dev/rsd3e: UNEXPECTED INCONSISTENCY; RUN fsck_ffs MANUALLY.
Can't open /dev/rsd3g: Device not configured
CAN'T CHECK FILE SYSTEM.
/dev/rsd3g: UNEXPECTED INCONSISTENCY; RUN fsck_ffs MANUALLY.
Can't open /dev/rsd3i: Device not configured
CAN'T CHECK FILE SYSTEM.
/dev/rsd3i: UNEXPECTED INCONSISTENCY; RUN fsck_ffs MANUALLY.
Can't open /dev/rsd3j: Device not configured
CAN'T CHECK FILE SYSTEM.
/dev/rsd3j: UNEXPECTED INCONSISTENCY; RUN fsck_ffs MANUALLY.
THE FOLLOWING FILE SYSTEMS HAD AN UNEXPECTED INCONSISTENCY:

ffs: /dev/rsd3h (/home), ffs: /dev/rsd3d (/tmp), ffs: /dev/rsd3f (/usr),
ffs: /dev/rsd3e (/var), ffs: /dev/rsd3g (/var/mail), ffs: /dev/rsd3i (/var/www),
ffs: /dev/rsd3j (/backup)
Automatic file system check failed; help!

Enter pathname of shell or RETURN for sh: 

Here, at least in a production environment where I lack physical access,
I really do want the system to come back. Yes.
To me, the lack of '-R' is no big deal. But what is the whole point of softraid
if it doesn't survive a reboot on a single drive that was 100% sane before?
See, it was sane, and working, and saving my files until the reboot. Then, after
a reboot (which can always happen), all is 'lost'. Not quite, but I simply can't
go there at any time of day or night to resolve the problem manually. For
softraid to be useful, I'd expect it to reboot on its sane leg.

Uwe



softraid - speed

2009-05-20 Thread Uwe Dippel
I tried again, setting up RAID1 on 2 U320 drives, 15k, as described in 
softraid(4).
Now I find the speed to be too slow. Writing to a single file is kind of 
okay: [everything/pwd is /mnt, which is a softraid drive, /dev/sd3f]

# bioctl sd3
Volume  Status   Size Device
softraid0 0 Online   299671585280 sd3 RAID1
  0 Online   299671585280 0:0.0   noencl sd1b
  1 Online   299671585280 0:1.0   noencl sd2b

dump and restore is the task. It is not fast:
DUMP: Volume 1 took 0:00:07
  DUMP: Volume 1 transfer rate: 2147 KB/s
  DUMP: Date this dump completed:  Wed May 20 16:31:08 2009
  DUMP: Average transfer rate: 2147 KB/s
7 seconds for 14 MB. But data transfer itself is okay:
# dump -0ua -f testo /dev/sd0e
DUMP: Volume 1 took 0:00:01
  DUMP: Volume 1 transfer rate: 15039 KB/s
  DUMP: Date this dump completed:  Wed May 20 16:49:53 2009
  DUMP: Average transfer rate: 15039 KB/s
  DUMP: level 0 dump on Wed May 20 16:49:51 2009
It is writing that takes the time:
# date && restore rf testo && date
Wed May 20 16:51:48 SGT 2009
Wed May 20 16:51:54 SGT 2009

The raw speed is good:
# dd if=/dev/zero of=nonsense.img bs=1m count=5000
5000+0 records in
5000+0 records out
5242880000 bytes transferred in 100.534 secs (52149868 bytes/sec)

But a dump & restore of /usr is a tad sick:
(/dev/sd0f  7.9G  2.4G  5.1G  32%  /usr)
# dump -0ua -f - /dev/sd0f | restore rf -
  DUMP: Date of this level 0 dump: Wed May 20 16:53:46 2009
  DUMP: Date of last level 0 dump: the epoch
  DUMP: Dumping /dev/rsd0f (/usr) to standard output
  DUMP: mapping (Pass I) [regular files]
  DUMP: mapping (Pass II) [directories]
  DUMP: estimated 2549189 tape blocks.
  DUMP: Volume 1 started at: Wed May 20 16:53:48 2009
  DUMP: dumping (Pass III) [directories]
  DUMP: dumping (Pass IV) [regular files]
  DUMP: 4.42% done, finished in 3:48
  DUMP: 36.44% done, finished in 0:27
  DUMP: 40.42% done, finished in 0:30
  DUMP: 52.60% done, finished in 0:23
  DUMP: 64.08% done, finished in 0:17
  DUMP: 77.57% done, finished in 0:10
  DUMP: 92.19% done, finished in 0:03
  DUMP: 2717062 tape blocks
  DUMP: Date of this level 0 dump: Wed May 20 16:53:46 2009
  DUMP: Volume 1 completed at: Wed May 20 17:36:48 2009
  DUMP: Volume 1 took 0:43:00
  DUMP: Volume 1 transfer rate: 1053 KB/s
  DUMP: Date this dump completed:  Wed May 20 17:36:48 2009
  DUMP: Average transfer rate: 1053 KB/s
  DUMP: level 0 dump on Wed May 20 16:53:46 2009
  DUMP: DUMP IS DONE
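
The rate that dump reports can be cross-checked from its own numbers. A minimal sketch, assuming that one dump "tape block" is 1 KB (an assumption, though it agrees with the figures above); `rate_kbs` is a hypothetical helper name:

```shell
# rate_kbs BLOCKS SECONDS
# Average transfer rate in KB/s, assuming 1 KB per tape block.
# Integer division, so the result is rounded down.
rate_kbs() {
    echo $(( $1 / $2 ))
}

# 2717062 tape blocks in 0:43:00 (43 * 60 = 2580 seconds)
rate_kbs 2717062 $((43 * 60))
```

This prints 1053, the same figure dump reports for the run. The raw dd write above ran at roughly 52 MB/s, so the restore side of the pipe is slower by a factor of about fifty.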

The LEDs of the drives were kind of continuously on.

I also tried to mount with 'softdep', but that didn't make much of a
difference. When I do 'df -h' in another console, I can see that at times
the amount of data transferred is huge, while at other times it is moving in
steps of 0.1-0.2 MB/s. Probably it is a problem of the number of files, not
of size.
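
One way to test the "number of files, not size" guess is to write the same amount of data as many small files and compare the elapsed time with the single-file dd run. A rough sketch; the function name, the target directory and the file count are all hypothetical:

```shell
# make_small_files DIR COUNT
# Create COUNT files of 1 KB each under DIR.
make_small_files() {
    dir=$1
    count=$2
    i=0
    mkdir -p "$dir" || return 1
    while [ "$i" -lt "$count" ]; do
        dd if=/dev/zero of="$dir/f$i" bs=1k count=1 2>/dev/null || return 1
        i=$((i + 1))
    done
}

# Example, run on the softraid mount and then on a plain disk:
#   time make_small_files /mnt/smalltest 5000
```

If the softraid volume is much slower than a plain disk for this test but not for the large dd write, that would point at per-file (metadata) overhead rather than raw throughput.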


Any idea what to do to improve the performance?

Uwe



OpenBSD on OpenBSD with qemu through the network only?

2009-05-20 Thread Uwe Dippel
I would like to set up a virtual/emulated OpenBSD machine on an existing 
OpenBSD box; using qemu. The guest will only be accessible through the 
network (ssh), under an address different from the host IP; though 
through the same physical NIC.
I wonder if anyone has some experience or a link about this; otherwise
I'll have to work through this step by step myself.
Also, it seems that I can't install a qemu-harddisk-image on my OpenBSD 
boxen, since it needs graphics, right? So I'll have to use Ubuntu (e.g.) 
to create that image, and transfer it and start it on the console-only 
OpenBSD. Would that work?


Uwe



Re: OpenBSD on OpenBSD with qemu through the network only?

2009-05-20 Thread Uwe Dippel
Abel Camarillo acamari at the00z.org writes:

 
> that scenario is explained in the README.OpenBSD installed with qemu.

That's a really great hint! Thanks a bunch!

(Though it still looks like installation needs X:
3. Install the os:
   [...]
   NOTE: start this inside an xterm or equivalent
contrary to what someone stated off-group. If I do it in a console, I get WSCONS
error ... SDL)

Uwe



Re: OpenBSD on OpenBSD with qemu through the network only?

2009-05-20 Thread Uwe Dippel
Abel Camarillo acamari at the00z.org writes:

 
> that scenario is explained in the README.OpenBSD installed with qemu.

Sorry, it still won't work. It crashes, though I followed the text (correctly,
I believe; correct me if I am wrong):

So what I typed is:
# env ETHER=bge0 qemu -net nic,model=rtl8139 -net tap -m 32 \
    -monitor stdio -no-fd-bootchk -hda virtual.img \
    -cdrom cd44.iso -boot d
 {tun0 (bridge0 - bge0)}
QEMU 0.9.1 monitor - type 'help' for more information
(qemu) qemu: fatal: triple fault
EAX=e001003b EBX=00b1c7f8 ECX=c080 EDX=
ESI=00b1c000 EDI=00b33000 EBP=005f7408 ESP=005f765e
EIP=001004d6 EFL=0002 [---] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0010   00cf9300
CS =0008   00cf9f00
SS =0010   00cf9300
DS =0010   00cf9300
FS =0010   00cf9300
GS =0010   00cf9300
LDT=   8000
TR =   8000
GDT= 00040c70 0027
IDT= 000409d8 027f
CR0=e001003b CR2=00040a18 CR3=00b1c000 CR4=06b0
CCS=1000 CCD=e001003b CCO=LOGICL  
FCW=037f FSW= [ST=0] FTW=00 MXCSR=1f80
FPR0= FPR1=
FPR2= FPR3=
FPR4= FPR5=
FPR6= FPR7=
XMM00= XMM01=
XMM02= XMM03=
XMM04= XMM05=
XMM06= XMM07=
Abort trap 

What's wrong?

Uwe



Re: softraid

2009-05-13 Thread Uwe Dippel

Marco Peereboom wrote:

>> [push the disk back in]
>
> Stale metadata, disk will remain unused from now on.

check

>> [pull the other disk]
>
> You lose.  all data is gone (for all intents and purposes).

check

>> # ls -l
>> total 4
>> -rw-r--r--  1 root  wheel  9 May 13 12:00 testo
>> [everything okay until here]
>
> Nope, this comes out of cache.
>
>> # rm testo
>> rm: testo: Input/output error
>> [I still guess this may happen]
>
> Shall happen.

Yes. And no.
Maybe I wasn't all too clear? My expectation is not (yet) the automatic 
recovery of the respective half mirror! Sure not! I don't expect 
miracles. What I do expect, though, is a consistent, defined and 
predictable state.


Please, try to view it from a different perspective. Nobody would
voluntarily pull out disk A, plug it back after 20 seconds, expecting it
to recover the mirror, pull out disk B after another 10 seconds, and
plug it back after 20 seconds, and still expect a full mirror!
But, and that's a big 'but' for me: some fault might do exactly that, a
flimsy controller, a faulty power supply. And then I want neither I/O
errors nor a panic at reboot. My expectations are much lower,
but based on consistency:

0. Running sane raid
1. One drive goes offline
What I'd expect, personally, would basically be minimally:
A. Immediate info about a drive lost.
B. 2 half mirrors remaining that I can plug into another box, at least 
to access the data on either.
C. No further attempt to use that drive that went offline any longer, at 
least not until a reboot.
D. That means, I won't have I/O errors, but the system running happily
from the active drive.
E. And it means that a reboot will go through smoothly.

I am aware that this implies that when the second drive goes offline as
well, NO more drive is available (even if either came back!). As I
mentioned, I request consistency of data, not necessarily uptime. I want
to be able to retrieve the data from the drive that went offline first,
and I want to be able to retrieve data from the drive that went offline
later. Personally, to me RAID is not failover, or availability, but
access to the data up to and until that moment when a drive goes offline.

And I want a clean reboot, irrespective of all ups and downs of the drives.

Please, correct me if I am wrong!

Uwe



Re: softraid

2009-05-13 Thread Uwe Dippel
Marco Peereboom slash at peereboom.us writes:

>> Maybe I wasn't all too clear? My expectation is not (yet) the automatic
>> recovery of the respective half mirror! Sure not! I don't expect
>> miracles. What I do expect, though, is a consistent, defined and
>> predictable state.
>
> Your expectations are out of whack with reality.

Marco, I hope not. I think this is why I am using OpenBSD, e.g.
Predictability is number one for security, so consistency is a must as well.

Though we seem to agree what softraid should be doing from the statements below.

I propose to tar all photos and send them to you privately.

>> 0. Running sane raid
>> 1. One drive goes offline
>> What I'd expect, personally, would basically be minimally:
>> A. Immediate info about a drive lost.
>
> That is there.  And you haven't shown me any evidence it isn't.

You are right. I simply could not read from the man page the most obvious: that
the state is displayed without any options (and stupid me tried almost all
options!).
So I guess it still takes a cron job to scan for 'degraded'?
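
Such a cron job could be sketched roughly as follows. This is only an illustration that assumes the `bioctl` output layout shown earlier in this thread (a header line, one volume line, then one line per chunk); `degraded_lines` is a hypothetical helper name:

```shell
# degraded_lines
# Read `bioctl <device>` output on stdin and print every volume or chunk
# line whose status column is not "Online". Prints nothing when healthy.
degraded_lines() {
    awk '
        $1 == "Volume"   { next }                               # header line
        $1 ~ /^softraid/ { if ($3 != "Online") print; next }    # volume line
        $1 ~ /^[0-9]+$/  { if ($2 != "Online") print }          # chunk line
    '
}

# Possible use from root's crontab, relying on cron to mail any output:
#   bioctl sd3 | degraded_lines
```

Because the function is silent for a healthy array, cron would send mail only when a volume or chunk reports something other than Online.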

>> B. 2 half mirrors remaining that I can plug into another box, at least
>> to access the data on either.
>
> No, 1 half mirror; the other one is basically lost.  You got an IO error
> for some reason.  There is no telling what didn't get written to it
> after the remaining chunk continued on its merry way.

Good to know.

>> C. No further attempt to use that drive that went offline any longer, at
>> least not until a reboot.
>
> Right, and softraid will detect that it went tits up prior and ignore
> it.

Good

>> D. That means, I won't have I/O errors, but the system running happily
>> from the active drive,
>
> Right.

Good

>> E. And it means that a reboot will go through smoothly.
>
> Right.

Good.

Meaning that we agree, and I'm looking forward to try again!

Uwe



Re: softraid

2009-05-13 Thread Uwe Dippel
Raimo Niskanen raimo+openbsd at erix.ericsson.se writes:


> Does not really bioctl say nothing? Try bioctl sd3
> bioctl softraid0, bioctl -q sd3, bioctl -q softraid0.

Thanks, the first two do it; the fault was on my side. I did try the latter
two, and both, well, I dunno what they tell me:
# bioctl -q sd3
sd3: OPENBSD, SR RAID 1, 003, serial OPENBSD SR RAID 1 003
# bioctl -q softraid0
bioctl: DIOCINQ: No such file or directory

Thanks again for the pointer to the first. Maybe an example could be added to
the man pages?

> The existing repair option as I recall it (again, search
> the archives) is to backup the still working filesystems
> sd3a and sd3b on the broken mirror, re-create the array
> from scratch, and restore them.

Yes, this is what I was thinking, and it is fine with me. (Though shoving in a
new drive followed by -R would definitely be huge progress.)

> That sounds fatal. You should repair the RAID mirror,
> not break the working half. Now both mirror halves
> are probably regarded as broken. Your RAID is doomed.

Sure. Agreed. But reboot ought to go through, as we discussed elsewhere.

Thanks again,

Uwe



softraid

2009-05-12 Thread Uwe Dippel

Beautiful, by the look of it!

I tried here on two 300 GB U320 drives, and the setup went through without any
warnings (?? most users encounter some?).

What I did was: (my system disk is sd0)

fdisk -iy sd1
fdisk -iy sd2

printf "a\n\n\n\nRAID\nw\nq\n\n" | disklabel -E sd1
printf "a\n\n\n\nRAID\nw\nq\n\n" | disklabel -E sd2

bioctl -c 1 -l /dev/sd1a,/dev/sd2a softraid0

dd if=/dev/zero of=/dev/rsd3c bs=1m count=1
disklabel sd3 (creating my partitions/slices)

newfs /dev/rsd3a
newfs /dev/rsd3b

mount /dev/sd3b /mnt/
cd /mnt/
[pull one hot-swap out]
echo Nonsense > testo
[push the disk back in]
[pull the other disk]
# ls -l
total 4
-rw-r--r--  1 root  wheel  9 May 13 12:00 testo
[everything okay until here]
# rm testo
rm: testo: Input/output error
[I still guess this may happen]

But now my question: All posts say all info is in 'man softraid' and 
'man bioctl'. There is nothing about *warnings* in there. I also tried 
bioctl -a/-q, but none would indicate that anything was wrong when one 
of the drives was pulled.


This will be a production server, but it can tolerate downtime if need be.
However:
1. I *need to know* when a disk goes offline
2. I need to know, in real life(!), if I can simply use the broken
mirror to save my data, and how I can mount it in another machine. Alas,
softraid and bioctl are silent about these two.


Another reason for asking:
Next I issued 'reboot'; and could play hangman :(

After the reboot, I got:
...
softraid0 at root
softraid0: sd3 was not shutdown properly
scsibus3 at softraid0: 1 targets, initiator 1
sd3 at scsibus3 targ 0 lun 0: OPENBSD, SR RAID 1, 003 SCSI2 0/direct fixed
sd3: 286094MB, 36471 cyl, 255 head, 63 sec, 512 bytes/sec, 585922538 sec total


Now I wonder what to do. Will a traditional fsck do, or do I have to 
recreate the softraid?


Can anyone please help me further?

Uwe



Re: softraid

2009-05-12 Thread Uwe Dippel
Uwe Dippel udippel at uniten.edu.my writes:

> Now I wonder what to do. Will a traditional fsck do, or do I have to
> recreate the softraid?

I guess I can answer this myself in the meantime:
I ran fsck on the softraid partitions sd3a and sd3b
(the first one was clean, as expected; the second was not,
but fsck marked it clean).
Then I mounted it again:

# fsck /dev/sd3b 
** /dev/rsd3b
** Last Mounted on /mnt
** Phase 1 - Check Blocks and Sizes

** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
SUMMARY INFORMATION BAD
SALVAGE? [Fyn?] SALVAGE? [Fyn?] y

BLK(S) MISSING IN BIT MAPS
SALVAGE? [Fyn?] y

1 files, 1 used, 128675303 free 
(15 frags, 16084411 blocks, 0.0% fragmentation)

MARK FILE SYSTEM CLEAN? [Fyn?] y


* FILE SYSTEM WAS MODIFIED *
# mount /dev/sd3b /mnt/
# cd /mnt/ 
# ls -l
# 
[that's okay, I never put anything there]
# pwd
/mnt
# echo Nonsense > testo
[that's not okay, because it got me another hangman]

If anyone is interested, I have all the 'trace' and 'ps' output on camera.

I guess it is time for a dmesg as well:
OpenBSD 4.4 (GENERIC.MP) #0: Fri Jan 23 14:33:38 SGT 2009
r...@claude2.uwe.uniten.edu.my:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 4214460416 (4019MB)
avail mem = 4092596224 (3903MB)
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.3 @ 0xec000 (62 entries)
bios0: vendor HP version D17 date 07/16/2007
bios0: HP ProLiant ML350 G4
acpi0 at bios0: rev 2
acpi0: tables DSDT FACP SPCR MCFG APIC
acpi0: wakeup devices
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Xeon(TM) CPU 3.00GHz, 3000.52 MHz
cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,SBF,SSE3,MWAIT,DS-CPL,CNXT-ID,CX16,xTPR,LONG
cpu0: 1MB 64b/line 8-way L2 cache
cpu0: apic clock running at 200MHz
cpu1 at mainbus0: apid 6 (application processor)
cpu1: Intel(R) Xeon(TM) CPU 3.00GHz, 3000.11 MHz
cpu1: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,SBF,SSE3,MWAIT,DS-CPL,CNXT-ID,CX16,xTPR,LONG
cpu1: 1MB 64b/line 8-way L2 cache
ioapic0 at mainbus0 apid 8 pa 0xfec0, version 20, 24 pins
ioapic1 at mainbus0 apid 9 pa 0xfec1, version 20, 24 pins
ioapic1: misconfigured as apic 0, remapped to apid 9
ioapic2 at mainbus0 apid 10 pa 0xfec8, version 20, 24 pins
ioapic3 at mainbus0 apid 11 pa 0xfec80400, version 20, 24 pins
acpiprt0 at acpi0: bus 1 (IP2P)
acpiprt1 at acpi0: bus 2 (IPXB)
acpiprt2 at acpi0: bus 6 (PCXA)
acpiprt3 at acpi0: bus 9 (PCXB)
acpiprt4 at acpi0: bus 5 (PTA0)
acpiprt5 at acpi0: bus 13 (PTB0)
acpiprt6 at acpi0: bus 16 (PTC0)
acpiprt7 at acpi0: bus 0 (PCI0)
acpicpu0 at acpi0
acpicpu1 at acpi0
acpitz0 at acpi0: critical temperature 31 degC
pci0 at mainbus0 bus 0: configuration mode 1
pchb0 at pci0 dev 0 function 0 Intel E7520 Host rev 0x0c
ppb0 at pci0 dev 2 function 0 Intel E7520 PCIE rev 0x0c
pci1 at ppb0 bus 5
ppb1 at pci1 dev 0 function 0 Intel PCIE-PCIE rev 0x09
pci2 at ppb1 bus 6
ppb2 at pci1 dev 0 function 2 Intel PCIE-PCIE rev 0x09
pci3 at ppb2 bus 9
ppb3 at pci0 dev 4 function 0 Intel E7520 PCIE rev 0x0c
pci4 at ppb3 bus 13
ppb4 at pci0 dev 6 function 0 Intel E7520 PCIE rev 0x0c
pci5 at ppb4 bus 16
ppb5 at pci0 dev 28 function 0 Intel 6300ESB PCIX rev 0x02
pci6 at ppb5 bus 2
mpi0 at pci6 dev 3 function 0 Symbios Logic 53c1030 rev 0x08: apic 9 int 0 (irq 5)
scsibus0 at mpi0: 16 targets, initiator 7
sd0 at scsibus0 targ 0 lun 0: COMPAQ, BF03688284, HPB3 SCSI3 0/direct fixed
sd0: 34732MB, 50824 cyl, 2 head, 699 sec, 512 bytes/sec, 71132000 sec total
sd1 at scsibus0 targ 3 lun 0: COMPAQ, BF3008AFEC, HPB1 SCSI3 0/direct fixed
sd1: 286102MB, 82594 cyl, 8 head, 886 sec, 512 bytes/sec, 585937500 sec total
sd2 at scsibus0 targ 5 lun 0: COMPAQ, BF3008AFEC, HPB1 SCSI3 0/direct fixed
sd2: 286102MB, 82594 cyl, 8 head, 886 sec, 512 bytes/sec, 585937500 sec total
mpi0: target 0 Sync at 160MHz width 16bit offset 63 QAS 1 DT 1 IU 1
mpi0: target 3 Sync at 160MHz width 16bit offset 127 QAS 1 DT 1 IU 1
mpi0: target 5 Sync at 160MHz width 16bit offset 127 QAS 1 DT 1 IU 1
mpi1 at pci6 dev 3 function 1 Symbios Logic 53c1030 rev 0x08: apic 9 int 1 (irq 5)
scsibus1 at mpi1: 16 targets, initiator 7
uhci0 at pci0 dev 29 function 0 Intel 6300ESB USB rev 0x02: apic 8 int 16 (irq 5)
uhci1 at pci0 dev 29 function 1 Intel 6300ESB USB rev 0x02: apic 8 int 19 (irq 5)
Intel 6300ESB WDT rev 0x02 at pci0 dev 29 function 4 not configured
Intel 6300ESB APIC rev 0x02 at pci0 dev 29 function 5 not configured
ehci0 at pci0 dev 29 function 7 Intel 6300ESB USB rev 0x02: apic 8 int 23 (irq 5)
usb0 at ehci0: USB revision 2.0
uhub0 at usb0 Intel EHCI root hub rev 2.00/1.00 addr 1
ppb6

Re: spam from chrooted CMSes

2009-04-12 Thread Uwe Dippel
Matthew Weigel unique at idempot.net writes:

> Huh?  I'm talking about the CMS itself authenticating to the SMTP server,
> and giving each application a single set of credentials.

chroot is the name, and isolation is the game.

> This should be set in
> the CMS's config files, much like database credentials.

Again, I didn't write or install them.

> Then I configure that board's software to connect to my SMTP server to
> send mail, and it has to authenticate as board at idempot.net to send
> any mail.  Now, if my server starts sending out spam, I can check the
> logs and see if the spam is coming from the user board at idempot.net
> to verify that the particular board software I'm using is the
> compromised software or not.

And here we come to something! This makes sense, compared to me looking 
through users' code: A hook that allows the insertion of a filter either
in php before calling mini_sendmail, or in mini_sendmail itself. 
postfix is the wrong answer, because the default sender from chrooted
mini_sendmail would be 'root', and postfix needs to accept mail from root.
So that filter would do something like
deny all
allow cms_legal
allow cms_department
allow cms_conference

In case anybody had some snippets, I'd be grateful to receive those.
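
The deny-all / allow-list idea above could be sketched as a small shell function placed in front of the real mailer. This is only an illustration under assumptions: the allowlist file format (one address per line), the function name `sender_allowed` and the example.org addresses are hypothetical, and a From: header is of course trivial for an attacker to forge.

```shell
# sender_allowed ALLOWLIST_FILE
# Read a mail message on stdin, take the first From: header, and succeed
# only if its address is listed (one per line) in ALLOWLIST_FILE.
# Anything not listed is rejected, i.e. "deny all" is the default.
sender_allowed() {
    allowlist=$1
    # The header section ends at the first empty line; stop reading there.
    from=$(sed -n -e '/^$/q' -e 's/^[Ff]rom:[[:space:]]*//p' | head -n 1)
    # Reduce "Name <addr>" to addr; a bare address passes through unchanged.
    addr=$(printf '%s\n' "$from" | sed -e 's/.*<\(.*\)>.*/\1/')
    [ -n "$addr" ] && grep -Fqx -- "$addr" "$allowlist"
}
```

A wrapper would save the message to a temporary file, run `sender_allowed` on it, and hand the file to mini_sendmail only on success. An allowlist for the three CMSes above would then contain lines such as `cms_legal@example.org`.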

Thanks,

Uwe



Re: spam from chrooted CMSes

2009-04-12 Thread Uwe Dippel

Vadim Zhukov wrote:


> Do your clients have ability to connect to external hosts? If yes then
> you should not even bother logging PHP mail() calls or such.
>
> If outgoing connections are closed then you should have different system
> users (i.e., different UIDs) for each client; otherwise it'll be easily
> possible for hacker to spoof sender: nothing stops him from modifying
> other client's scripts or just implementing SMTP server entire in PHP.
  


Exactly. That is the situation I have, and that everyone has who hosts
users' web sites.

If someone can hack into it, she can write some basic SMTP easily.
But when you have 200+ users, and 10+ run some php code, and your
postfix spews spam to all and sundry, a filter on 'From:', applied before
the mail reaches postfix (because 'root' does not send from the chrooted
Apache), can conveniently block all mails with illegal sender addresses.
> And only if both requirements passed then you can improve your antispam
> security either by 1) modifying mini_sendmail, or 2) writing a simple
> Perl wrapper that parses input data (bundled and/or in-ports Perl
> modules should make it very easy) and then passes data to real
> mini_sendmail.
>
> IMHO, it's much easier to make mini_sendmail log mail, or add a specific
> header to each letter that may help you in debugging. In the latter case
> you may even put some limits for mail based on your header knowledge in
> your real MTA, which mini_sendmail will forward letters to. You do not
> need big programming skills to do that, just some basic C knowledge. If
> you do not know C at all, ask some your friend to do this work for beer
> (or mineral water, if he doesn't like alcohol ;) ).


  


I don't mind paying for a drink, I don't even mind cobbling something
together myself. But maybe something like this already exists, and then I
could simply save my time. I guess I'm not the only one who runs official
CMSes that need to send mail on a server, and who wants to block all the
other websites hosted there from sending mail.


Thanks,

Uwe



Re: spam from chrooted CMSes

2009-04-11 Thread Uwe Dippel

> When dealing with web based submission, the best thing I have found is
> to make sure the web based submission adds its own headers like what it
> is and where the user came from and such so when diagnosing the problem
> one can easily block based on that information. If there is an account
> involved, you should include that info as well.


Dear Todd,

I'm sorry, but I lack the experience to understand what you mean.
I have 200+ users, several of them having set up (sorry, yes, written!)
their own code, who can install any CMS of their liking, using ftp; or any
other script that sends mail. Some of them are official websites, so I cannot
shut down the whole mini_sendmail business in the chrooted Apache. I also
cannot read and study hundreds of thousands of lines of code to find out how
and where a web page hosted by me allows an attacker to inject a message of
her own, to a recipient of her own choice.


Since mini_sendmail receives it through php from Apache, I wonder how I 
could log e.g. the website from which it was sent, or at least easily 
limit the number of calls of mini_sendmail.

Again, your idea is fine for an application developer, which I am not.
I wouldn't know how to add the account to which the application belongs
that can be abused.

The only two places where, IMHO, I can see a chance are Apache and php:
an extended log or check that records, whenever mail is sent, e.g. which
directory the call came from.



> If you're really cracking this nut properly, you'd include heuristics
> to temporarily block if too many messages are sent in a given time period,
> and permanently block pending review if too many temporary blocks occur
> within a given time period.


Yes. But that's a complete coder's work, isn't it? I wonder if there is no
other solution, as mentioned above. 
sendmail_path = /bin/mini_sendmail -t -i

is what I have in php.ini. I wonder, if there are no logging features for
mini_sendmail or so. I read the man-page online, but didn't see any. 


If it doesn't exist, it surely would be a good enhancement if the path of
the application from which the mail is sent were carried through, so that a
filter can be written to allow or drop depending on the path of that
application.

Uwe



Re: spam from chrooted CMSes

2009-04-11 Thread Uwe Dippel
Matthew Weigel unique at idempot.net writes:

> Then you have grown your userbase too fast with a terrible setup, and now
> you're caught in the middle of fixing the problem or avoiding downtime.

Are you sure this is not a misunderstanding? When you host user accounts, on a
tight, default, setup of OpenBSD (or any other OS), and allow them to ftp into
their web-directories, how could one prevent them from uploading code that
mail()-s something? Aside of removing mini_sendmail, that is.

> Sure, if you go through and find every line of code where mail() is called,
> you can add logging at that point.  But so far you've refused to make any
> changes to the applications.

Are you sure that this is not a misunderstanding? Which sysadmin can 'make
changes to the applications' that his 200+ users run??

> His idea is the right one.  Most PHP applications I've dealt with support, at
> least through plugins or extensions, SMTP + AUTH for sending mail instead of
> PHP's mail().

Are you sure that this is not a misunderstanding? If you host, for example, any
CMS, it should have the functionality to the remote user, registered with that
CMS, to request a password reset. Which SMTP+AUTH do you want to use here??
AFAICS, here we need to allow a straightforward SMTP. The userbase is registered
in the various databases of the CMSes. And again, no sysadmin will re-write all
user-supplied applications to extract all those remote users for
SMTP-authentication. Get real, please!

>> The only two places where I, IMHO, can see a chance would be with an
>> extended log or check of Apache or php; whenever a mail-call is logged,
>> from which directory, e.g.
>
> I don't think PHP ever changes the working directory except explicitly;
> probably every call to mail() (which leads to mini_sendmail) occurs in the
> chroot /.

Exactly. So how to log it??

>> Yes. But that's a complete coder's work, isn't it? I wonder if there is no
>> other solution, as mentioned above.
>
> There are, but they require you to set the parameters of how web apps can work
> in your environment so as to enforce a minimum of auditability.

Yes, this is the crucial point. I'd be more than happy to learn how to set this,
for example in php.ini! Any suggestion will be appreciated!

>> sendmail_path = /bin/mini_sendmail -t -i
>> is what I have in php.ini. I wonder, if there are no logging features for
>> mini_sendmail or so. I read the man-page online, but didn't see any.
>
> Well, mini_sendmail is an external package... talk to the authors about that,
> but I think they'll tell you they can't really track what you need tracked.

So, how to solve the problem, then??

Thanks anyway,

Uwe



Re: spam from chrooted CMSes

2009-04-11 Thread Uwe Dippel

Chris Bennett wrote:


> This could be helpful, possibly.  First, you can maintain a functional
> mini_sendmail by putting another script at /bin/mini_sendmail, this
> script could do some sort of logging and then pass things on to the real
> mini_sendmail, located somewhere else, different (hidden) name.
  


This is what I tried already! But it seems it is written in a different
manner than what I was hoping for.

Something like
echo $0 $1 $2 >> /tmp/loggo
/bin/mini_sendmail $1 $2
doesn't work. No, it is not the mini_sendmail call that fails; the
first line already doesn't log anything.
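
A slightly more defensive version of such a wrapper might look like the sketch below. It is an illustration only: the function name `log_and_forward` and the renamed real binary are hypothetical. One thing worth checking in a chrooted Apache (an assumption about this setup, not something the thread confirms) is that /tmp/loggo resolves to a path below the chroot directory, which must exist there and be writable by the web server user.

```shell
# log_and_forward LOGFILE REAL_MAILER [ARGS...]
# Append one line per call to LOGFILE, then hand stdin and all remaining
# arguments unchanged to REAL_MAILER.
log_and_forward() {
    logfile=$1
    mailer=$2
    shift 2
    printf '%s pwd=%s args=%s\n' \
        "$(date '+%Y-%m-%d %H:%M:%S')" "$PWD" "$*" >> "$logfile"
    "$mailer" "$@"
}

# In the wrapper script installed as /bin/mini_sendmail (inside the chroot):
#   log_and_forward /tmp/loggo /bin/mini_sendmail.real "$@"
```

Using "$@" instead of $1 $2 passes every argument through intact, and the message itself still reaches the real mailer because stdin is left untouched.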



> Since all this goes through php first, setting up this script is easy.
  


Maybe you could give me a hint? I am not a PHP-coder, actually.

Uwe



spam from chrooted CMSes

2009-04-09 Thread Uwe Dippel
I'm running postfix as MTA on a machine with several CMSes, on a chrooted
Apache.  Recently, a huge amount of spam has been sent from there,
alas. When I scan the postfix logs, all of it comes from 'root', meaning
it doesn't come in through port 25. I run OpenBSD with mini_sendmail. Is
there any chance to find out from which CMS the spam is sent?


Thanks,

Uwe



Apache failed at 'graceful'

2009-03-18 Thread Uwe Dippel
This is the first time in years, and I wonder what might be wrong. I
am running OpenBSD 4.4, nothing specific. I have rotated the logs
daily through these years, with the script as below, and no problem
until today. The machine has an uptime of 50 days.
Still, after the cron-ed log rotation this morning, Apache didn't come
back up with 'graceful'.
This is what I have gathered so far from the error log:

[Wed Mar 18 04:21:06 2009] [error] [client 77.37.220.140] File does
not exist: /htdocs/v2/intl/zh-CN/
[Wed Mar 18 04:21:06 2009] [error] [client 77.37.220.140] File does
not exist: /htdocs/v2/error_404.html
[Wed Mar 18 04:21:10 2009] [error] PHP Notice:  Undefined index:
HTTP_HOST in /htdocs/v2/libraries/joomla/environment/uri.php on line
154
[Wed Mar 18 04:21:10 2009] [error] PHP Notice:  Undefined index:
HTTP_USER_AGENT in /htdocs/v2/libraries/joomla/html/html/behavior.php
on line 51
[Wed Mar 18 04:21:10 2009] [error] PHP Notice:  Undefined index:
HTTP_USER_AGENT in /htdocs/v2/templates/ja_purity/ja_templatetools.php
on line 200
[Wed Mar 18 04:21:10 2009] [error] PHP Notice:  Undefined index:
HTTP_USER_AGENT in /htdocs/v2/templates/ja_purity/ja_templatetools.php
on line 249
[Wed Mar 18 04:22:44 2009] [warn] child process 23069 still did not
exit, sending a SIGTERM
[Wed Mar 18 04:22:44 2009] [warn] child process 11890 still did not
exit, sending a SIGTERM
[Wed Mar 18 04:22:44 2009] [warn] child process 9049 still did not
exit, sending a SIGTERM
[Wed Mar 18 04:22:44 2009] [warn] child process 17620 still did not
exit, sending a SIGTERM
[Wed Mar 18 04:22:44 2009] [warn] child process 6571 still did not
exit, sending a SIGTERM
[Wed Mar 18 04:22:52 2009] [notice] caught SIGTERM, shutting down
PHP Warning:  Module 'mysql' already loaded in Unknown on line 0
PHP Warning:  Module 'mysql' already loaded in Unknown on line 0
[Wed Mar 18 18:05:43 2009] [notice] Initializing etag from
/var/www/logs/etag-state

apachectl stop
newsyslog -f /var/www/conf/newhttpdlog.conf
apachectl graceful
are the commands in the crontab.
When I run the commands now, it works. It looks like a glitch. Still,
it ought not to happen.
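
One possible reading of the log is that the children were slow to exit (the SIGTERM warnings at 04:22:44, shutdown at 04:22:52), so 'graceful' ran while the old server was still going down. If so, the rotation script could wait for the old process to disappear first. A sketch under that assumption; `wait_for_exit` is a hypothetical helper, and the pid file path in the usage comment is an assumption about this installation:

```shell
# wait_for_exit PIDFILE TIMEOUT
# Return 0 once the process named in PIDFILE is gone (or PIDFILE does not
# exist); return 1 if it is still alive after about TIMEOUT seconds.
wait_for_exit() {
    pidfile=$1
    timeout=$2
    waited=0
    while [ -f "$pidfile" ] && kill -0 "$(cat "$pidfile")" 2>/dev/null; do
        [ "$waited" -ge "$timeout" ] && return 1
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Possible use in the rotation script (run as root):
#   apachectl stop
#   wait_for_exit /var/www/logs/httpd.pid 60 || exit 1
#   newsyslog -f /var/www/conf/newhttpdlog.conf
#   apachectl start
```

The script then either starts Apache against a clean slate or exits non-zero, so cron reports the failure instead of leaving the server silently down.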

Please, inform me if more is required.

Uwe



Re: Is this panic related to +ExecCGI?

2009-02-07 Thread Uwe Dippel

 Slightly off-topic:
 Would it rather be perl committing all that memory or httpd?

add some instrumentation and you'll find out. symon can be good
for this sort of thing (you can have it monitor memory/cpu use of
specific processes at a frequent interval and graph them).

 Can this be prevented one way or another?

login.conf
  


So I thought. I had read the man page a few times, but still lack the 
understanding where to best put the limits. Currently, the user sits on 
default, with

:datasize-max=512M:\
:datasize-cur=512M:\
:maxproc-max=128:\
:maxproc-cur=64:\
:openfiles-cur=128:\
:stacksize-cur=4M:\

The datasize limit doesn't seem to cut it; free memory plus swap is
usually around 3.5G. Now I wonder which of the following is best employed
to keep system usage below die-off:

cputime, filesize, memoryuse, vmemoryuse?

Is there any further description, a link or a document available with a 
formula on 'how to prevent one's system from running out of resources at 
all cost'? That would be the greatest and best; then I could put that 
user into this class, and she could never bring down the system, right?
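
A rough piece of arithmetic may show why datasize alone does not protect the box. It assumes the datasize limit applies to each process separately and that, in the worst case, every allowed process grows to its limit. The figures are the ones quoted above; the function names are made up for illustration.

```python
MIB_PER_GIB = 1024


def worst_case_commit_mib(maxproc, datasize_mib):
    """Upper bound on data memory if every process reaches its datasize limit."""
    return maxproc * datasize_mib


def procs_that_fit(available_mib, datasize_mib):
    """How many processes at the full datasize limit fit into available memory."""
    return available_mib // datasize_mib


maxproc_max = 128                           # :maxproc-max=128:
datasize_max_mib = 512                      # :datasize-max=512M:
available_mib = int(3.5 * MIB_PER_GIB)      # "free mem+swap ... around 3.5G"

worst = worst_case_commit_mib(maxproc_max, datasize_max_mib)
print(worst // MIB_PER_GIB, "GiB worst case")
print(procs_that_fit(available_mib, datasize_max_mib), "processes fit at the full limit")
```

Under these assumptions the product of the two limits, not either limit alone, is what has to stay below memory plus swap.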


Thanks,

Uwe



Is this panic related to +ExecCGI?

2009-02-06 Thread Uwe Dippel

Dear all,

for the first time ever I had one of my production boxes crashing. And 
that was just a few hours after I allowed one of my users ExecCGI in her 
home, in the chrooted, default, Apache.

The user was deploying some perl-script.

Can anyone with insight please look at the trace and point out to me if 
there is a link between the +ExecCGI and the panic?


Was I blue-eyed when I allowed the deployment of perl as a 
non-privileged user in the chrooted Apache, thinking that Apache and 
perl would handle any exception well enough to not crash?


Thanks for advice,

Uwe


root on sd0a swap on sd0b dump on sd0b
panic: kernel diagnostic assertion uvmexp.swpgonly = uvmexp.swpages failed: file /usr/src/sys/uvm/uvm_pdaemon.c, line 573
Stopped at  Debugger+0x5:   leave
RUN AT LEAST 'trace' AND 'ps' AND INCLUDE OUTPUT WHEN REPORTING THIS PANIC!
DO NOT EVEN BOTHER REPORTING THIS WITHOUT INCLUDING THAT INFORMATION!

ddb{1} trace
Debugger() at Debugger+0x5
panic() at panic+0x122
__assert() at __assert+0x21
uvm_aiodone_daemon() at uvm_aiodone_daemon+0x2fd
uvm_aiodone_daemon() at uvm_aiodone_daemon+0x9e8
uvm_pageout() at uvm_pageout+0xca
end trace frame: 0x0, count: -6

ddb{1} 

  PID   PPID   PGRP    UID  S       FLAGS  WAIT    COMMAND
11671   7706  11671      0  7           0          cron
 7706  12712  12712      0  3       0x280  piperd  cron
10810  18329  18329     67  3   0x2000180  netcon  httpd
 7948  18329  18329     67  3   0x2000180  netio   httpd
 7911  18329  18329     67  3   0x2000180  netcon  httpd
26768  18329  18329     67  3   0x2000180  netcon  httpd
22062  18329  18329     67  2       0x100          httpd
 1169  18329  18329     67  3   0x2000180  netcon  httpd
22624  18329  18329     67  2       0x100          httpd
18841  18329  18329     67  2       0x100          httpd
11509  18329  18329     67  7       0x100          httpd
22935  18329  18329     67  2       0x100          httpd
 9398  18329  18329     67  2       0x100          httpd
23859   6501  23859      0  2      0x4000          sshd
10659   6501  10659      0  2      0x4000          sshd
27338      1  27338   1231  2      0x4002          vi
26143  18329  18329     67  2       0x100          httpd
 5820  18329  18329     67  3   0x2000180  netio   httpd
 5758  18329  18329     67  2       0x100          httpd
12944  18329  18329     67  3   0x2000180  netio   httpd
 2760  18329  18329     67  2       0x100          httpd
17215  18329  18329     67  3   0x2000180  netio   httpd
26725  18329  18329     67  2       0x100          httpd
13133  18329  18329     67  3   0x2000180  netio   httpd
 4211  18329  18329     67  2       0x100          httpd
 3040  18329  18329     67  3   0x2000180  netio   httpd
19165  18329  18329     67  2       0x100          httpd
30747  18329  18329     67  2       0x100          httpd
 2122  18329  18329     67  3   0x2000180  netio   httpd
30941  18329  18329     67  3   0x2000180  netio   httpd
10089  18329  18329     67  3   0x2000180  netio   httpd
 8959  18329  18329     67  2       0x100          httpd
24254  18329  18329     67  2       0x100          httpd
23026  19319  19319    507  3   0x2004180  kqread  pickup
30424  18329  18329     67  2       0x100          httpd
24054  18329  18329     67  2       0x100          httpd
18990  18329  18329     67  2       0x100          httpd
17856  18329  18329     67  2       0x100          httpd
29787  18329  18329     67  2       0x100          httpd
15163  18329  18329

Re: Is this panic related to +ExecCGI?

2009-02-06 Thread Uwe Dippel

Ted Unangst wrote:

You ran out of swap.


Thanks, Ted, for the fast reply and immediate insight. So it must have 
to do with that perl-program, since what I usually see is something like:

Memory: Real: 107M/347M act/tot  Free: 1637M  Swap: 0K/2151M used/tot

Slightly off-topic:
Would it rather be perl committing all that memory or httpd?
Can this be prevented one way or another?
I understand the development was done on some Ubuntu-flavour, and then 
transferred to OpenBSD. Can we have something like a sandbox for such 
situations, so that applications are running on the system, but with 
limited resources? The underlying application was supposed to run on a 
production machine with 100+ users, and at one moment in time it did 
need to be tested, after it ran successfully on a laptop. What would be 
a better strategy?


Thanks again,

Uwe
P.S.: And Hurray! to the serial console. I could do the trace and reboot
an otherwise inaccessible and remote machine!




4.4 is getting stuck

2009-01-09 Thread Uwe Dippel
Yesterday I upgraded my last production box (remote) from 4.3 to 4.4,
without any hitch, rebooted, and so forth.
Last night at some innocuous time, it stopped accepting incoming mail 
(postfix). This morning, it did courier-imap well, until I used an 
existing ssh-session like this:



# pwd
/usr/src/usr.sbin/httpd
# cd /var/log/
# /usr/local/sbin/post
postalias        postfix          postkick         postqueue
postcat          postfix-disable  postlock         postsuper
postconf         postfix-enable   postlog
postdrop         postfix-install  postmap
# /usr/local/sbin/postfix status
^C^Z


Now it has been stuck like this for an hour or so. It still takes keyboard
input, though.
Courier-imap also does not respond any longer. But nmap is still
somewhat okay:



$ nmap -sV 172.16.0.4

Starting Nmap 4.68 ( http://nmap.org ) at 2009-01-10 08:58 SGT
Interesting ports on 172.16.0.4:
Not shown: 1707 closed ports
PORT    STATE SERVICE VERSION
13/tcp  open  daytime
22/tcp  open  ssh?
25/tcp  open  smtp?
37/tcp  open  time    (32 bits)
53/tcp  open  domain?
80/tcp  open  http    Apache httpd
110/tcp open  pop3?
993/tcp open  imaps?


daytime works fine, http works very well, but domain, pop3 and smtp time 
out; or worse: all get stuck like here:



$ telnet 172.16.0.4 25
Trying 172.16.0.4...
Connected to 172.16.0.4.
Escape character is '^]'.
helo
mail from:m...@gmail.com
quit
^C^Z



$ telnet 172.16.0.4 110
Trying 172.16.0.4...
Connected to 172.16.0.4.
Escape character is '^]'.
user udippel


Why do I write in:

1. I have no access. It is a remote production server. If only I could
stop that 'hanging' postfix, I might be able to issue a 'reboot'.


2. Any further attempt to ssh into it also gets stuck like this:

$ ssh -v 172.16.0.4
OpenSSH_5.1, OpenSSL 0.9.7j 04 May 2006
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: Connecting to 172.16.0.4 [172.16.0.4] port 22.
debug1: Connection established.
debug1: identity file /home/users/udippel/.ssh/identity type -1
debug1: identity file /home/users/udippel/.ssh/id_rsa type -1
debug1: identity file /home/users/udippel/.ssh/id_dsa type -1

after which I can only leave by killing the session on the client.

3. Even if I went there with a huge effort, and some time delay, how can 
I debug the problem, so that it won't occur again?



Thanks for all ideas,

Uwe



Upgrade woes with httpd at 4.3-4.4 on amd64

2008-11-23 Thread Uwe Dippel

Here after reboot I find the following:

# apachectl start 


/usr/sbin/httpd:/usr/lib/libm.so.2.3: undefined symbol 'isinf'
/usr/sbin/httpd:/usr/lib/libm.so.2.3: undefined symbol 'isnan'
/usr/sbin/httpd:/usr/local/lib/php/libphp5.so: undefined symbol 'isinf'
/usr/sbin/httpd:/usr/local/lib/php/libphp5.so: undefined symbol 'isnan'
Syntax error on line 1 of /var/www/conf/modules/php5.conf:
Cannot load /usr/local/lib/php/libphp5.so into server: Cannot load specified object

/usr/sbin/apachectl start: httpd could not be started

Some details:

# ls -l /usr/local/lib/php/libphp5.so
-r--r--r--  1 root  bin  4580796 Mar 11  2008 /usr/local/lib/php/libphp5.so
# ls -l /usr/lib/libm.so.2.3
-r--r--r--  1 root  bin  613515 Mar 13  2008 /usr/lib/libm.so.2.3
# cat /var/www/conf/modules/php5.conf
LoadModule php5_module /usr/local/lib/php/libphp5.so

<IfModule mod_php5.c>
AddType application/x-httpd-php .php .phtml .php3
AddType application/x-httpd-php-source .phps
# Most php configs require this
DirectoryIndex index.php
</IfModule>



I tried the archives and Google, and now I wonder what mistake I made.
(I *think* I followed the upgrade guide meticulously.)


How can I get the services back?

Thanks for helping out,

Uwe



Re: Upgrade woes with httpd at 4.3-4.4 on amd64

2008-11-23 Thread Uwe Dippel

Uwe Dippel wrote:


Here after reboot I find the following:

# apachectl start
/usr/sbin/httpd:/usr/lib/libm.so.2.3: undefined symbol 'isinf'
/usr/sbin/httpd:/usr/lib/libm.so.2.3: undefined symbol 'isnan'
/usr/sbin/httpd:/usr/local/lib/php/libphp5.so: undefined symbol 'isinf'
/usr/sbin/httpd:/usr/local/lib/php/libphp5.so: undefined symbol 'isnan'
Syntax error on line 1 of /var/www/conf/modules/php5.conf:
Cannot load /usr/local/lib/php/libphp5.so into server: Cannot load 
specified object

/usr/sbin/apachectl start: httpd could not be started


Never mind. It was sorted out after the full package upgrade, applying
all the softlinks and stuff, and another reboot.
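
For anyone hitting similar loader errors, a small sketch of grouping the 'undefined symbol' lines by the object that reports them. It only parses messages of the exact shape quoted above; the function name is hypothetical. In the listing earlier in this thread both objects carry March 2008 dates, which suggests (as an assumption) that they predated the upgrade.

```python
import re
from collections import defaultdict

# Matches lines such as:
# /usr/sbin/httpd:/usr/lib/libm.so.2.3: undefined symbol 'isinf'
LINE = re.compile(
    r"^(?P<prog>[^:]+):(?P<obj>[^:]+): undefined symbol '(?P<sym>[^']+)'$"
)


def undefined_by_object(output):
    """Map each shared object to the list of symbols it could not resolve."""
    result = defaultdict(list)
    for line in output.splitlines():
        match = LINE.match(line.strip())
        if match:
            result[match.group("obj")].append(match.group("sym"))
    return dict(result)


sample = """\
/usr/sbin/httpd:/usr/lib/libm.so.2.3: undefined symbol 'isinf'
/usr/sbin/httpd:/usr/lib/libm.so.2.3: undefined symbol 'isnan'
/usr/sbin/httpd:/usr/local/lib/php/libphp5.so: undefined symbol 'isinf'
/usr/sbin/httpd:/usr/local/lib/php/libphp5.so: undefined symbol 'isnan'
"""

for obj, symbols in undefined_by_object(sample).items():
    print(obj, symbols)
```

Every object listed this way is a candidate for having been left behind by the upgrade.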


Uwe



Re: dhcpd on 4.4 is problematic

2008-11-05 Thread Uwe Dippel

Kenneth R Westerback wrote:

This (untested) diff might help. Unfortunately I have no Solaris to
test against and I'm off to work now. Test reports welcome, or better
fixes.
  


You lack the Solaris, and my firewall lacks the sources and stuff, so I
can't compile there. But if the dhcpd is a simple, stand-alone
executable, I am willing to just plug a modified version for i386 and
try it out.

Uwe



Re: dhcpd on 4.4 is problematic

2008-11-04 Thread Uwe Dippel

Theo de Raadt wrote:


Oh, we are supposed to ask?  Please, get real.  If you want to give us
all the information you can file a bug report.  By now you should know
we won't bend over backwards to ask for information.  You want this
fixed as much as we do.


Sorry, Theo,

the message to gnats is on the way. I hope it goes through all our 
filters ...


Here is some relevant info, just in case:


OpenBSD 4.4 (GENERIC) #1021: Tue Aug 12 17:16:55 MDT 2008
[EMAIL PROTECTED]:/usr/src/sys/arch/i386/compile/GENERIC
cpu0: Intel Pentium/MMX (GenuineIntel 586-class) 233 MHz
cpu0: FPU,V86,DE,PSE,TSC,MSR,MCE,CX8,MMX
cpu0: F00F bug workaround installed
real mem  = 66678784 (63MB)
avail mem = 55009280 (52MB)
mainbus0 at root
bios0 at mainbus0: AT/286+ BIOS, date 07/07/97, BIOS32 rev. 0 @ 0xfd920
pcibios0 at bios0: rev 2.1 @ 0xf/0x1
pcibios0: PCI BIOS has 6 Interrupt Routing table entries
pcibios0: PCI Interrupt Router at 000:07:0 (Intel 82371SB ISA rev 0x00)
pcibios0: PCI bus #0 is the last bus
bios0: ROM list: 0xc/0x8000 0xea000/0x2000
cpu0 at mainbus0
pci0 at mainbus0 bus 0: configuration mode 1 (bios)
pchb0 at pci0 dev 0 function 0 Intel 82437VX rev 0x02
pcib0 at pci0 dev 7 function 0 Intel 82371SB ISA rev 0x01
pciide0 at pci0 dev 7 function 1 Intel 82371SB IDE rev 0x00: DMA, channel 0 
wired to compatibility, channel 1 wired to compatibility
wd0 at pciide0 channel 0 drive 0: WDC AC12500R
wd0: 16-sector PIO, LBA, 2441MB, 4999680 sectors
wd0(pciide0:0:0): using PIO mode 4, DMA mode 2
pciide0: channel 1 ignored (disabled)
vga1 at pci0 dev 8 function 0 S3 Trio32/64 rev 0x54
wsdisplay0 at vga1 mux 1: console (80x25, vt100 emulation)
wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
drm at vga1 unsupported
rl0 at pci0 dev 17 function 0 Realtek 8139 rev 0x10: irq 9, address 
00:40:95:00:21:a8
rlphy0 at rl0 phy 0: RTL internal PHY
xl0 at pci0 dev 19 function 0 3Com 3c905 100Base-TX rev 0x00: irq 9, address 
00:60:97:73:55:1a
nsphy0 at xl0 phy 24: DP83840 10/100 PHY, rev. 1
isa0 at pcib0
isadma0 at isa0
com0 at isa0 port 0x3f8/8 irq 4: ns16550a, 16 byte fifo
com1 at isa0 port 0x2f8/8 irq 3: ns16550a, 16 byte fifo
pckbc0 at isa0 port 0x60/5
pckbd0 at pckbc0 (kbd slot)
pckbc0: using irq 1 for kbd slot
wskbd0 at pckbd0: console keyboard, using wsdisplay0
ne1 at isa0 port 0x300/32 irq 10, NE2000, address 00:50:ba:c0:41:17
pcppi0 at isa0 port 0x61
midi0 at pcppi0: PC speaker
spkr0 at pcppi0
lpt0 at isa0 port 0x378/4 irq 7
npx0 at isa0 port 0xf0/16: reported by CPUID; using exception 16
fdc0 at isa0 port 0x3f0/6 irq 6 drq 2
fd0 at fdc0 drive 0: 1.44MB 80 cyl, 2 head, 18 sec
biomask f965 netmask ff65 ttymask 
softraid0 at root
root on wd0a swap on wd0b dump on wd0b



/etc/dhcpd.conf:


shared-network LOCAL-NET {
        option  domain-name "my.domain.com";
        option  domain-name-servers 192.168.116.200;
        option  netbios-name-servers 172.16.3.247;

        subnet 192.168.116.0 netmask 255.255.255.0 {
                range 192.168.116.101 192.168.116.199;
                default-lease-time 86400;
                max-lease-time 259200;
                option broadcast-address 192.168.116.255;
                option routers 192.168.116.200;
        }
}



ps aux | grep dhcp:


_dhcp     2000  0.0  1.5   452   972 ??  Is     3:13PM    0:00.05 dhcpd xl0



ifconfig:


lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> mtu 33204
        groups: lo
        inet 127.0.0.1 netmask 0xff000000
        inet6 ::1 prefixlen 128
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x5
rl0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        lladdr 00:40:95:00:21:a8
        groups: egress
        media: Ethernet autoselect (100baseTX full-duplex)
        status: active
        inet 172.20.16.207 netmask 0xffffff00 broadcast 172.20.16.255
        inet6 fe80::240:95ff:fe00:21a8%rl0 prefixlen 64 scopeid 0x1
xl0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        lladdr 00:60:97:73:55:1a
        media: Ethernet autoselect (100baseTX full-duplex)
        status: active
        inet 192.168.116.200 netmask 0xffffff00 broadcast 192.168.116.255
        inet6 fe80::260:97ff:fe73:551a%xl0 prefixlen 64 scopeid 0x2
ne1: flags=8822<BROADCAST,NOTRAILERS,SIMPLEX,MULTICAST> mtu 1500
        lladdr 00:50:ba:c0:41:17
        media: Ethernet manual
enc0: flags=0<> mtu 1536
pflog0: flags=141<UP,RUNNING,PROMISC> mtu 33204
        groups: pflog



Uwe



Re: dhcpd on 4.4 is problematic

2008-11-04 Thread Uwe Dippel

Robert Blacquiere wrote:


Missing info would be output from dhcpd in the /var/log/daemon. Please
grep there on dhcpd and send this. Also add the mac address of the
failing machine. This will give us atleast info about the request and
offers to this box.


Sure. It is a tad long, but maybe, maybe, it adds information. In
between you can see the successful transaction with the Knoppix 5.3.1 to
which I booted the machine for debugging purposes, to exclude hardware.
I remember I saw the 167 in the Knoppix terminal.


Nov  4 14:00:23 firewall dhcpd[28296]: DHCPDISCOVER from 00:20:ed:ee:ed:14 via 
xl0
Nov  4 14:00:23 firewall dhcpd[28296]: DHCPOFFER on 192.168.116.162 to 
00:20:ed:ee:ed:14 via xl0
Nov  4 14:01:28 firewall dhcpd[28296]: DHCPDISCOVER from 00:20:ed:ee:ed:14 via 
xl0
Nov  4 14:01:28 firewall dhcpd[28296]: DHCPOFFER on 192.168.116.162 to 
00:20:ed:ee:ed:14 via xl0
Nov  4 14:02:33 firewall dhcpd[28296]: DHCPDISCOVER from 00:20:ed:ee:ed:14 via 
xl0
Nov  4 14:02:33 firewall dhcpd[28296]: DHCPOFFER on 192.168.116.162 to 
00:20:ed:ee:ed:14 via xl0
Nov  4 14:03:36 firewall dhcpd[28296]: DHCPDISCOVER from 00:20:ed:ee:ed:14 via 
xl0
Nov  4 14:03:36 firewall dhcpd[28296]: DHCPOFFER on 192.168.116.162 to 
00:20:ed:ee:ed:14 via xl0
Nov  4 14:04:41 firewall dhcpd[28296]: DHCPDISCOVER from 00:20:ed:ee:ed:14 via 
xl0
Nov  4 14:04:41 firewall dhcpd[28296]: DHCPOFFER on 192.168.116.162 to 
00:20:ed:ee:ed:14 via xl0
Nov  4 14:05:45 firewall dhcpd[28296]: DHCPDISCOVER from 00:20:ed:ee:ed:14 via 
xl0
Nov  4 14:05:45 firewall dhcpd[28296]: DHCPOFFER on 192.168.116.162 to 
00:20:ed:ee:ed:14 via xl0
Nov  4 14:06:50 firewall dhcpd[28296]: DHCPDISCOVER from 00:20:ed:ee:ed:14 via 
xl0
Nov  4 14:06:50 firewall dhcpd[28296]: DHCPOFFER on 192.168.116.162 to 
00:20:ed:ee:ed:14 via xl0
Nov  4 14:07:53 firewall dhcpd[28296]: DHCPDISCOVER from 00:20:ed:ee:ed:14 via 
xl0
Nov  4 14:07:53 firewall dhcpd[28296]: DHCPOFFER on 192.168.116.162 to 
00:20:ed:ee:ed:14 via xl0
Nov  4 14:08:57 firewall dhcpd[28296]: DHCPDISCOVER from 00:20:ed:ee:ed:14 via 
xl0
Nov  4 14:08:57 firewall dhcpd[28296]: DHCPOFFER on 192.168.116.162 to 
00:20:ed:ee:ed:14 via xl0
Nov  4 14:10:00 firewall dhcpd[28296]: DHCPDISCOVER from 00:20:ed:ee:ed:14 via 
xl0
Nov  4 14:10:00 firewall dhcpd[28296]: DHCPOFFER on 192.168.116.162 to 
00:20:ed:ee:ed:14 via xl0
Nov  4 14:11:05 firewall dhcpd[28296]: DHCPDISCOVER from 00:20:ed:ee:ed:14 via 
xl0
Nov  4 14:11:05 firewall dhcpd[28296]: DHCPOFFER on 192.168.116.162 to 
00:20:ed:ee:ed:14 via xl0
Nov  4 14:12:09 firewall dhcpd[28296]: DHCPDISCOVER from 00:20:ed:ee:ed:14 via 
xl0
Nov  4 14:12:09 firewall dhcpd[28296]: DHCPOFFER on 192.168.116.162 to 
00:20:ed:ee:ed:14 via xl0
Nov  4 14:13:13 firewall dhcpd[28296]: DHCPDISCOVER from 00:20:ed:ee:ed:14 via 
xl0
Nov  4 14:13:13 firewall dhcpd[28296]: DHCPOFFER on 192.168.116.162 to 
00:20:ed:ee:ed:14 via xl0
Nov  4 14:14:17 firewall dhcpd[28296]: DHCPDISCOVER from 00:20:ed:ee:ed:14 via 
xl0
Nov  4 14:14:17 firewall dhcpd[28296]: DHCPOFFER on 192.168.116.162 to 
00:20:ed:ee:ed:14 via xl0
Nov  4 14:15:21 firewall dhcpd[28296]: DHCPDISCOVER from 00:20:ed:ee:ed:14 via 
xl0
Nov  4 14:15:21 firewall dhcpd[28296]: DHCPOFFER on 192.168.116.162 to 
00:20:ed:ee:ed:14 via xl0
Nov  4 14:16:25 firewall dhcpd[28296]: DHCPDISCOVER from 00:20:ed:ee:ed:14 via 
xl0
Nov  4 14:16:25 firewall dhcpd[28296]: DHCPOFFER on 192.168.116.162 to 
00:20:ed:ee:ed:14 via xl0
Nov  4 14:17:30 firewall dhcpd[28296]: DHCPDISCOVER from 00:20:ed:ee:ed:14 via 
xl0
Nov  4 14:17:30 firewall dhcpd[28296]: DHCPOFFER on 192.168.116.162 to 
00:20:ed:ee:ed:14 via xl0
Nov  4 14:18:34 firewall dhcpd[28296]: DHCPDISCOVER from 00:20:ed:ee:ed:14 via 
xl0
Nov  4 14:18:34 firewall dhcpd[28296]: DHCPOFFER on 192.168.116.162 to 
00:20:ed:ee:ed:14 via xl0
Nov  4 14:19:38 firewall dhcpd[28296]: DHCPDISCOVER from 00:20:ed:ee:ed:14 via 
xl0
Nov  4 14:19:38 firewall dhcpd[28296]: DHCPOFFER on 192.168.116.162 to 
00:20:ed:ee:ed:14 via xl0
Nov  4 14:20:41 firewall dhcpd[28296]: DHCPDISCOVER from 00:20:ed:ee:ed:14 via 
xl0
Nov  4 14:20:41 firewall dhcpd[28296]: DHCPOFFER on 192.168.116.162 to 
00:20:ed:ee:ed:14 via xl0
Nov  4 14:21:46 firewall dhcpd[28296]: DHCPDISCOVER from 00:20:ed:ee:ed:14 via 
xl0
Nov  4 14:21:46 firewall dhcpd[28296]: DHCPOFFER on 192.168.116.162 to 
00:20:ed:ee:ed:14 via xl0
Nov  4 14:22:49 firewall dhcpd[28296]: DHCPDISCOVER from 00:20:ed:ee:ed:14 via 
xl0
Nov  4 14:22:49 firewall dhcpd[28296]: DHCPOFFER on 192.168.116.162 to 
00:20:ed:ee:ed:14 via xl0
Nov  4 14:23:52 firewall dhcpd[28296]: DHCPDISCOVER from 00:20:ed:ee:ed:14 via 
xl0
Nov  4 14:23:52 firewall dhcpd[28296]: DHCPOFFER on 192.168.116.162 to 
00:20:ed:ee:ed:14 via xl0
Nov  4 14:24:56 firewall dhcpd[28296]: DHCPDISCOVER from 00:20:ed:ee:ed:14 via 
xl0
Nov  4 14:24:56 firewall dhcpd[28296]: DHCPOFFER on 192.168.116.162 to 
00:20:ed:ee:ed:14 via xl0
Nov  4 14:25:59 firewall dhcpd[28296]: DHCPDISCOVER from 00:20:ed:ee:ed:14 via 
xl0
Nov  4 14:25:59 firewall 
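
A short sketch for summarising a log like the one above: it counts DHCP message types per MAC address and lists clients that were offered a lease but never acknowledged. It assumes the dhcpd syslog format shown here (including mail-wrapped lines) and relies on the usual DISCOVER, OFFER, REQUEST, ACK sequence; the function names are hypothetical.

```python
import re
from collections import Counter

# One log entry: "dhcpd[PID]: DHCPxxx ... <MAC>"; the MAC may sit on a wrapped line.
ENTRY = re.compile(
    r"dhcpd\[\d+\]: (DHCP[A-Z]+) .*?((?:[0-9a-f]{2}:){5}[0-9a-f]{2})"
)


def summarise(log_text):
    """Count (mac, message type) pairs, tolerating lines wrapped by the mailer."""
    joined = " ".join(log_text.split())
    return Counter((m.group(2), m.group(1)) for m in ENTRY.finditer(joined))


def stuck_clients(counts):
    """MAC addresses that received an OFFER but never show up in a DHCPACK."""
    offered = {mac for (mac, kind) in counts if kind == "DHCPOFFER"}
    acked = {mac for (mac, kind) in counts if kind == "DHCPACK"}
    return sorted(offered - acked)


sample = """\
Nov  4 14:00:23 firewall dhcpd[28296]: DHCPDISCOVER from 00:20:ed:ee:ed:14 via
xl0
Nov  4 14:00:23 firewall dhcpd[28296]: DHCPOFFER on 192.168.116.162 to
00:20:ed:ee:ed:14 via xl0
"""

counts = summarise(sample)
print(dict(counts))
print(stuck_clients(counts))
```

A client that keeps cycling through DISCOVER and OFFER, as in the excerpt, ends up in the stuck list.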

dhcpd on 4.4 is problematic

2008-11-04 Thread Uwe Dippel

I read the upgrade guide, followed it, and have a 4.4-router in front of me.
Alas, it does not at all dish out an IP-address to an OpenSolaris client 
(nv98). It used to do so before, without any fail at all, ever. 
Immediately after the upgrade to 4.4, it fails 100%.
It does dish out IP-addresses to Knoppix 5.3.1 on the very same 
interface of the very same machine when booted to Knoppix.
I have as well restarted the dhcpd with the interface on which I want 
the address to be given out ('dhcpd xl0'), to no avail.
'ifconfig nge0 dhcp' on OpenSolaris also times out all the time. I only
need to 'ifconfig nge0 ..' and 'route add ..' on the OpenSolaris machine,
though, to connect to the router. Therefore it is not a hardware problem.
It looks much more like a compatibility problem between OpenBSD 4.4 and 
(Open?)Solaris, which didn't exist before and which must not happen.


Any detail can be furnished on request,

Uwe



Re: dhcpd on 4.4 is problematic

2008-11-04 Thread Uwe Dippel

Deraj Puma wrote:


This same thing happened to me last night between me and my ISP. I
deleted /var/db/dhclient.leases.if and rebooted which worked.


No cigar.
Of course, I have no /var/db/dhclient.leases, but I did move
dhcpd.leases out of the way and rebooted. It was recreated, but no IP
dished out.
Further, I also 'pfctl -d' the firewall out of the way. Still no success.

I tried to find the location of the dhcp leases on Solaris, but no success.

I found the only config file for the dhcpagent, and it has only one
uncommented line for inet4:


# By default, a parameter request list requesting a subnet mask (1),
# router (3), DNS server (6), hostname (12), DNS domain (15), broadcast
# address (28), and encapsulated vendor options (43), is sent to the DHCP
# server when the DHCP agent sends requests.  However, if desired, this
# can be changed by altering the following parameter-value pair.  The
# numbers correspond to the values defined in the IANA bootp-dhcp-parameters
# registry at the time of this writing.
#
PARAM_REQUEST_LIST=1,3,6,12,15,28,43


Then I continued debugging on the client side, and that was more promising.
Here is the session on the admin side:


# ifconfig nge0 dhcp status
Interface  State         Sent  Recv  Declined  Flags
nge0       SELECTING        7     0         0
# ifconfig nge0 dhcp status
Interface  State         Sent  Recv  Declined  Flags
nge0       SELECTING        8     0         0
# ifconfig nge0 dhcp drop
# ifconfig nge0 dhcp status
ifconfig: nge0: interface is not under DHCP control
# ifconfig nge0 dhcp start
^C



# /sbin/dhcpagent -d 2 -v
# ps -eaf | grep dhcp
root   319 1   0 11:06:37 ?   0:00 /sbin/dhcpagent
# kill 319



# ifconfig nge0 192.168.116.91
# route add default 192.168.116.200
add net default: gateway 192.168.116.200


In the first part, one can see the counter of unsuccessful attempts
incrementing.

Then I stop the DHCP client (dhcpagent) and restart it; then I stop it
again and ask for verbose output.

These are the logged messages, and they clearly show a compatibility
problem, under the default as well as under the verbose setting:



Nov  5 11:07:07 solN /sbin/dhcpagent[319]: [ID 566172 daemon.warning] recv_pkt: 
bad option overload
Nov  5 11:13:18 solN last message repeated 17 times
Nov  5 11:13:50 solN /sbin/dhcpagent[319]: [ID 566172 daemon.warning] recv_pkt: 
bad option overload
Nov  5 11:15:50 solN last message repeated 11 times
Nov  5 11:16:28 solN /sbin/dhcpagent[1156]: [ID 787751 daemon.error] 
dhcp_ipc_init: cannot bind to port 4999 (agent already running?)
Nov  5 11:16:53 solN /sbin/dhcpagent[319]: [ID 566172 daemon.warning] recv_pkt: 
bad option overload


What 'recv_pkt: bad option overload' means is beyond my horizon, but it
shows that Solaris is not going to accept offers from the dhcpd of 4.4.
Alas, my fault, I have no backup of 4.3, but exactly the same setup had
worked flawlessly up to and including 4.3.
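
For context on the message: RFC 2132 defines DHCP option 52 (Option Overload) with the legal values 1, 2 and 3. The decoded OFFER posted elsewhere in this thread shows 'OO:0', and its options begin with the bytes 34 01 00, i.e. option 52 with value 0. A minimal sketch of the check a strict client might apply follows. It is an illustration under those assumptions, not the Solaris dhcpagent code, and the byte string is a shortened excerpt of that dump (end marker added).

```python
MAGIC_COOKIE = bytes.fromhex("63825363")
OPT_PAD, OPT_END, OPT_OVERLOAD = 0, 255, 52


def parse_options(data):
    """Return a list of (code, value) pairs from a DHCP options field."""
    if data[:4] != MAGIC_COOKIE:
        raise ValueError("missing DHCP magic cookie")
    options, i = [], 4
    while i < len(data):
        code = data[i]
        if code == OPT_END:
            break
        if code == OPT_PAD:
            i += 1
            continue
        length = data[i + 1]
        options.append((code, data[i + 2:i + 2 + length]))
        i += 2 + length
    return options


def overload_problem(options):
    """Return a description if option 52 carries an illegal value, else None."""
    for code, value in options:
        if code == OPT_OVERLOAD:
            if len(value) != 1 or value[0] not in (1, 2, 3):
                return "bad option overload: " + value.hex()
    return None


# Cookie, option 52 = 0, message type OFFER, server id, lease time, end.
OFFER_EXCERPT = bytes.fromhex(
    "63825363" "340100" "350102" "3604c0a874c8" "33040003f480" "ff"
)

print(overload_problem(parse_options(OFFER_EXCERPT)))
```

If this reading is right, the offer would be acceptable to such a client once the server either omits option 52 or sends it with a legal value.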


Uwe



Re: dhcpd on 4.4 is problematic

2008-11-04 Thread Uwe Dippel
Here is what Stuart requested.
I hope the attachment goes through!

Uwe
12:10:18.698196 00:20:ed:df:a7:28 ff:ff:ff:ff:ff:ff 0800 342: 0.0.0.0.68  
255.255.255.255.67: [udp sum ok] xid:0x8e0c275e vend-rfc1048 DHCP:DISCOVER 
MSZ:1472 LT:4294967295 VC:83.85.78.87.46.105.56.54.112.99 
PR:SM+DG+NS+HN+DN+BR+VO (DF) (ttl 255, id 43389, len 328)
  : 4500 0148 a97d 4000 ff11 d127    E..H)[EMAIL PROTECTED]'
  0010:   0044 0043 0134 0ce5 0101 0600  .D.C.4.e
  0020: 8e0c 275e        ..'^
  0030:     0020 eddf a728   . m_'(..
  0040:          
  0050:          
  0060:          
  0070:          
  0080:          
  0090:          
  00a0:          
  00b0:          
  00c0:          
  00d0:          
  00e0:          
  00f0:          
  0100:     6382 5363 3501 0139  c.Sc5..9
  0110: 0205 c033 04ff  ff3c 0a53 554e 572e  [EMAIL PROTECTED].SUNW.
  0120: 6938 3670 6337 0701 0306 0c0f 1c2b ff00  i86pc7...+.
  0130:          
  0140:      

12:10:18.700699 00:60:97:73:55:1a 00:20:ed:df:a7:28 0800 362: 
192.168.116.200.67  192.168.116.102.68: [udp sum ok] xid:0x8e0c275e 
Y:192.168.116.102 S:192.168.116.200 vend-rfc1048 OO:0 DHCP:OFFER 
SID:192.168.116.200 LT:259200 SM:255.255.255.0 DG:192.168.116.200 
NS:192.168.116.200 DN:uwe.uniten.edu.my BR:192.168.116.255 RN:129600 
RB:226800 WNS:172.16.3.247 [tos 0x10] (ttl 16, id 0, len 348)
  : 4510 015c   1011 3f02 c0a8 74c8  E..\..?.@(tH
  0010: c0a8 7466 0043 0044 0148 82d5 0201 0600  @(tf.C.D.H.U
  0020: 8e0c 275e     c0a8 7466  ..'^@(tf
  0030: c0a8 74c8   0020 eddf a728   @(tH. m_'(..
  0040:          
  0050:          
  0060:          
  0070:          
  0080:          
  0090:          
  00a0:          
  00b0:          
  00c0:          
  00d0:          
  00e0:          
  00f0:          
  0100:     6382 5363 3401 0035  c.Sc4..5
  0110: 0102 3604 c0a8 74c8 3304 0003 f480 0104  ..6.@(tH3...t...
  0120:  ff00 0304 c0a8 74c8 0604 c0a8 74c8  ...@(tH..@(tH
  0130: 0f11 7577 652e 756e 6974 656e 2e65 6475  ..uwe.uniten.edu
  0140: 2e6d 791c 04c0 a874 ff3a 0400 01fa 403b  .my..@(t:...z@;
  0150: 0400 0375 f02c 04ac 1003 f7ff...up,.,..w

12:10:23.217543 00:20:ed:df:a7:28 ff:ff:ff:ff:ff:ff 0800 342: 0.0.0.0.68  
255.255.255.255.67: [udp sum ok] xid:0x8e0c275e secs:5 vend-rfc1048 
DHCP:DISCOVER MSZ:1472 LT:4294967295 VC:83.85.78.87.46.105.56.54.112.99 
PR:SM+DG+NS+HN+DN+BR+VO (DF) (ttl 255, id 43390, len 328)
  : 4500 0148 a97e 4000 ff11 d126    E..H)[EMAIL PROTECTED]
  0010:   0044 0043 0134 0ce0 0101 0600  .D.C.4.`
  0020: 8e0c 275e 0005       ..'^
  0030:     0020 eddf a728   . m_'(..
  0040:          
  0050:          
  0060:          
  0070:          
  0080:          
  0090:          
  00a0:          
  00b0:          
  00c0:          
  00d0:          
  00e0:          
  00f0:          
  0100:     6382 5363 3501 0139  c.Sc5..9
  0110: 

Re: Slow file access on Compact Flash [SOLVED]

2008-10-20 Thread Uwe Dippel

Uwe Dippel wrote:

4.2 runs out of the box, but with very slow access of files. The CF is 
reasonably fast, though, with ~6MB at 'dd'. But once it has to access 
files for r/w, it gets very slow.



Any hint welcome,


I got a really great hint. Let me start with the results:

Without softdep,
tar -C /tmp -xzphf etc43.tgz
takes exactly 3 min 16 sec.

With softdep on, it takes exactly 2 seconds.

Thanks so much! Now CF is as fast as hard disk. (I use a 133x Kingston)
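
To reproduce the effect without tar, here is a rough timing sketch that compares creating many small files with writing one file of the same total size. The assumption behind it is that, without soft updates, each file creation costs synchronous metadata writes, which would explain why a single large copy is fast and an archive of small files is not. The function names and default sizes are arbitrary.

```python
import os
import time


def time_small_files(directory, count=200, size=1024):
    """Seconds needed to create `count` files of `size` bytes each."""
    payload = b"x" * size
    start = time.perf_counter()
    for n in range(count):
        with open(os.path.join(directory, "f%04d" % n), "wb") as handle:
            handle.write(payload)
    return time.perf_counter() - start


def time_one_file(directory, total_size):
    """Seconds needed to write a single file of `total_size` bytes."""
    start = time.perf_counter()
    with open(os.path.join(directory, "big"), "wb") as handle:
        handle.write(b"x" * total_size)
    return time.perf_counter() - start
```

Running both against a scratch directory on the CF card, once with and once without softdep in the mount options, should show whether the small-file case is the one that changes.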

Uwe



Slow file access on Compact Flash

2008-10-19 Thread Uwe Dippel

[I read all postings in the archive AFAIK]

Just started with CF on embedded hardware advertised to run OpenBSD; 
ARInfoTek. It does run OpenBSD very well!
Now I want the embedded system to run off CF; the board has a CF socket 
to be wd0.
4.2 runs out of the box, but with very slow access of files. The CF is 
reasonably fast, though, with ~6MB at 'dd'. But once it has to access 
files for r/w, it gets very slow.
I found some postings that 4.3 would be better, but the install of 4.3 
here mainly -stalled- and took a good hour, from a local ftp-site.

locate.updatedb is incredibly fast, while some file extraction takes ages.
It looks like a large, single, file copies very fast, similar to 'dd'. 
But opening a file for r/w seems to take ages. Something like

tar -C /tmp -xzphf etc43.tgz
takes a minute, easily. And etc43.tgz is only 1.2MB.
Copying of this file is quick:
$ date && cp etc43.tgz demo && date
Mon Oct 20 11:29:15 SGT 2008
Mon Oct 20 11:29:16 SGT 2008
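
The same comparison can be made a little more exact by printing elapsed
seconds instead of reading two timestamps. This is only a sketch: the helper
name elapsed is made up here, and it assumes a date(1) that supports +%s.

```shell
# Sketch: run a command and print how many whole seconds it took.
# The command's own output is discarded so only the number is printed.
elapsed() {
    start=$(date +%s)
    "$@" >/dev/null 2>&1
    end=$(date +%s)
    echo $((end - start))
}

# Usage with the two operations compared in this mail:
#   elapsed cp etc43.tgz demo
#   elapsed tar -C /tmp -xzphf etc43.tgz
```

For finer resolution, the shell's time keyword does the same job.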

Any hint welcome,

Uwe



OpenBSD 4.3 (GENERIC) #698: Wed Mar 12 11:07:05 MDT 2008
[EMAIL PROTECTED]:/usr/src/sys/arch/i386/compile/GENERIC
cpu0: Geode(TM) Integrated Processor by AMD PCS (AuthenticAMD 586-class) 500 
MHz
cpu0: FPU,DE,PSE,TSC,MSR,CX8,SEP,PGE,CMOV,CFLUSH,MMX
real mem  = 527785984 (503MB)
avail mem = 502276096 (479MB)
mainbus0 at root
bios0 at mainbus0: AT/286+ BIOS, date 05/23/08, BIOS32 rev. 0 @ 0xfaf00
apm0 at bios0: Power Management spec V1.2 (slowidle)
apm0: AC on, battery charge unknown
pcibios0 at bios0: rev 2.1 @ 0xf/0xdb74
pcibios0: PCI IRQ Routing Table rev 1.0 @ 0xfdaf0/128 (6 entries)
pcibios0: PCI Exclusive IRQs: 5 7 10 11
pcibios0: no compatible PCI ICU found: ICU vendor 0x1022 product 0x2090
pcibios0: Warning, unable to fix up PCI interrupt routing
pcibios0: PCI bus #1 is the last bus
bios0: ROM list: 0xc/0x8000 0xef000/0x1000!
cpu0 at mainbus0
pci0 at mainbus0 bus 0: configuration mode 1 (no bios)
pchb0 at pci0 dev 1 function 0 AMD Geode LX rev 0x31
vga1 at pci0 dev 1 function 1 AMD Geode LX Video rev 0x00
wsdisplay0 at vga1 mux 1: console (80x25, vt100 emulation)
wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
glxsb0 at pci0 dev 1 function 2 AMD Geode LX Crypto rev 0x00: RNG AES
glxpcib0 at pci0 dev 15 function 0 AMD CS5536 ISA rev 0x03: rev 0, 32-bit 
3579545Hz timer, watchdog, gpio
gpio0 at glxpcib0: 32 pins
pciide0 at pci0 dev 15 function 2 AMD CS5536 IDE rev 0x01: DMA, channel 0 
wired to compatibility, channel 1 wired to compatibility
wd0 at pciide0 channel 0 drive 0: ULTIMATE CF CARD 4GB
wd0: 1-sector PIO, LBA, 3967MB, 8124480 sectors
wd0(pciide0:0:0): using PIO mode 4, DMA mode 2
pciide0: channel 1 ignored (disabled)
ohci0 at pci0 dev 15 function 4 AMD CS5536 USB rev 0x02: irq 7, version 1.0, 
legacy support
ehci0 at pci0 dev 15 function 5 AMD CS5536 USB rev 0x02: irq 7
usb0 at ehci0: USB revision 2.0
uhub0 at usb0 AMD EHCI root hub rev 2.00/1.00 addr 1
ppb0 at pci0 dev 18 function 0 Pericom PCI-PCI rev 0x00
pci1 at ppb0 bus 1
fxp0 at pci1 dev 12 function 0 Intel 82559ER rev 0x10, i82551: irq 5, address 
00:14:b7:00:26:6a
inphy0 at fxp0 phy 1: i82555 10/100 PHY, rev. 4
fxp1 at pci1 dev 13 function 0 Intel 82559ER rev 0x10, i82551: irq 11, 
address 00:14:b7:00:26:6b
inphy1 at fxp1 phy 1: i82555 10/100 PHY, rev. 4
fxp2 at pci1 dev 14 function 0 Intel 82559ER rev 0x10, i82551: irq 10, 
address 00:14:b7:00:26:6c
inphy2 at fxp2 phy 1: i82555 10/100 PHY, rev. 4
fxp3 at pci0 dev 19 function 0 Intel 82559ER rev 0x10, i82551: irq 11, 
address 00:14:b7:00:26:69
inphy3 at fxp3 phy 1: i82555 10/100 PHY, rev. 4
isa0 at glxpcib0
isadma0 at isa0
pckbc0 at isa0 port 0x60/5
pckbdprobe: reset response 0xfa
pcppi0 at isa0 port 0x61
midi0 at pcppi0: PC speaker
spkr0 at pcppi0
npx0 at isa0 port 0xf0/16: reported by CPUID; using exception 16
pccom0 at isa0 port 0x3f8/8 irq 4: ns16550a, 16 byte fifo
usb1 at ohci0: USB revision 1.0
uhub1 at usb1 AMD OHCI root hub rev 1.00/1.00 addr 1
biomask f3cf netmask ffef ttymask ffef
mtrr: K6-family MTRR support (2 registers)
uplcom0 at uhub1 port 1 Prolific Technology Inc. USB-Serial Controller rev 
1.10/3.00 addr 2
ucom0 at uplcom0
uhidev0 at uhub1 port 2 configuration 1 interface 0 NOVATEK USB Multimedia 
Keyboard rev 1.10/1.00 addr 3
uhidev0: iclass 3/1
ukbd0 at uhidev0: 8 modifier keys, 6 key codes, country code 33
wskbd0 at ukbd0 mux 1
wskbd0: connecting to wsdisplay0
uhidev1 at uhub1 port 2 configuration 1 interface 1 NOVATEK USB Multimedia 
Keyboard rev 1.10/1.00 addr 3
uhidev1: iclass 3/0, 3 report ids
uhid0 at uhidev1 reportid 2: input=1, output=0, feature=0
uhid1 at uhidev1 reportid 3: input=3, output=0, feature=0
softraid0 at root
root on wd0a swap on wd0b dump on wd0b




Re: apc Back-UPS ES 525

2008-07-20 Thread Uwe Dippel
On Wed, 16 Jul 2008 19:41:55 +0700, sonjaya wrote:


> i have small ups seri APC / Back-UPS ES 525 , how to joint and control
> with openbsd , i try using apc-upsd when test  not working.
> then i try nut but unknown driver.
> if any sucsess story can share to me  :)

Yes, but not with ports this time. Here I use apcupsd, from
http://www.apcupsd.org/
Check it out: they have a (slightly) outdated section on how to install
it on OpenBSD.

Uwe



Re: Postfix race condition at boot

2008-07-20 Thread Uwe Dippel
On Mon, 14 Jul 2008 12:47:40 -0500, Karl O. Pinc wrote:


> I've an OpenBSD box that's been running postfix for a few
> years, strictly as a send-only mta, and every night the
> box gets rebooted.  Every couple of months postfix does
> not come up on reboot.
>
> All that shows up in the logs is:
> <snip> postfix/postfix-script[3005]: fatal: Postfix integrity check
> failed!
>
> My suspicion is that syslogd has not yet finished
> making the log socket and the postfix check that
> happens at postfix start fails.
>
> (/etc/rc.conf.local has:
> syslogd_flags=-a /var/spool/postfix/dev/log
> )
>
> I can always log in and start postfix manually
> using the same sendmail command that the rc scripts
> use.
>
> Any suggestions as to how to confirm the problem
> and/or what to do about it?  Does anyone else have
> this problem?  Should I be talking to the postfix
> port maintainer?

Alright. I have exactly the same problem; I asked ports@ and got only an
off-list mail confirming this, plus one from a chap who has a similar
problem with another application.

I wonder why there was nothing on the list, though. I know all too well,
that the people here care for correctness, though the start sequence seems
faltering, or maybe unclear?

I do also confirm, that the problem appears only on my smallest and
oldest box: 1.7 GHz, 256 MB. 

Solution? Remove the sendmail-flags from rc.conf.local and put a 'postfix
start' at the end of rc.local. That should help.
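
A sketch of what the end of rc.local could look like. The plain
'postfix start' is the actual suggestion. The wait loop in front of it is an
extra, and only an assumption based on Karl's suspicion about the log socket.
The helper name wait_for_socket is made up, and the socket path is the one
from the quoted rc.conf.local.

```shell
# Sketch: wait up to N seconds for a UNIX socket to appear.
# Returns 0 as soon as the socket exists, 1 if it never shows up.
wait_for_socket() {
    path=$1
    tries=${2:-10}
    while [ "$tries" -gt 0 ]; do
        [ -S "$path" ] && return 0
        sleep 1
        tries=$((tries - 1))
    done
    return 1
}

# At the end of /etc/rc.local (adjust the path to postfix as needed):
#   wait_for_socket /var/spool/postfix/dev/log 10
#   postfix start
```

If the socket turns out not to be the culprit, the loop costs nothing
noticeable, since it returns immediately when the socket is already there.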

Uwe



Re: Postfix race condition at boot

2008-07-20 Thread Uwe Dippel
On Sun, 20 Jul 2008 20:19:05 +1000, Damien Miller wrote:


>> My suspicion is that syslogd has not yet finished
>> making the log socket and the postfix check that
>> happens at postfix start fails.
>
> That shouldn't happen, because syslogd delays its exit until after
> its log sockets have been established.

Damien, I am not so sure if it is syslog that fails. I have something else
failing before, please see my maillog in the ports@:

Jul 11 11:56:19 claude authdaemond: modules=authuserdb authpwd authpgsql authldap authmysql authpipe, daemons=5
Jul 11 11:56:19 claude authdaemond: Installing libauthuserdb
Jul 11 11:56:19 claude authdaemond: File not found
Jul 11 11:56:19 claude authdaemond: Installing libauthpwd
Jul 11 11:56:19 claude authdaemond: Installation complete: authpwd
Jul 11 11:56:19 claude authdaemond: Installing libauthpgsql
Jul 11 11:56:19 claude authdaemond: File not found
Jul 11 11:56:19 claude authdaemond: Installing libauthldap
Jul 11 11:56:19 claude authdaemond: File not found
Jul 11 11:56:19 claude authdaemond: Installing libauthmysql
Jul 11 11:56:19 claude authdaemond: File not found
Jul 11 11:56:19 claude authdaemond: Installing libauthpipe
Jul 11 11:56:19 claude authdaemond: Installation complete: authpipe
Jul 11 11:56:20 claude postfix/postfix-script[17841]: fatal: Postfix integrity check failed!
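
To see at a glance which authlib modules loaded and which did not, the
authdaemond lines can be paired up per module. This is only a sketch: the
helper name summarize_authlib is made up, and it assumes the log format shown
above. It does not show that a missing module is what makes postfix fail.

```shell
# Sketch: pair each "Installing <module>" line with the result that follows.
# Reads syslog text on stdin, prints one "<module>: ok|missing" line each.
summarize_authlib() {
    awk '
        /authdaemond: Installing /           { module = $NF; next }
        /authdaemond: File not found/        { print module ": missing"; next }
        /authdaemond: Installation complete/ { print module ": ok" }
    '
}

# Usage (log path is an assumption):
#   summarize_authlib < /var/log/maillog
```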

I am not aware that I use courier-authlib for that postfix, but who
knows what it checks?
In any case, postfix seems to wait for something that slower machines
cannot provide fast enough. If you have any idea how to debug this and
find out *what* it can't find, let me know,

Uwe



httpd-problem after upgrade 4.2 -> 4.3

2008-05-07 Thread Uwe Dippel
After the successful upgrade of the first machine, I have some trouble
with the second.
Chances are that the trouble is my fault, but I would still appreciate a clue:

Apache responds very slowly. Despite a load of 0.5,
lynx 127.0.0.1 (as root) takes more than 5-10 seconds until the static
-rwxr-x---  1 root  www  2236 Dec 12  2006 /var/www/htdocs/index.html
pops up. Any other task on the system is done instantaneously.
From other machines on the same network, it takes a similar time to
see that page.
But it is not a data-rate problem, because after the long wait, the
data itself comes down to the clients at close to 100Mb/sec.
Downloading a file of 60M, on the same subnet, takes about 20 seconds
to connect to the IP-address/subdir and 7 seconds for the transfer.

pf is disabled,
/etc/hosts is
::1 localhost.uniten.edu.my localhost
127.0.0.1 localhost.uniten.edu.my localhost
::1 metalab.uniten.edu.my metalab
172.16.0.2 metalab.uniten.edu.my metalab
Apache has been restarted, it stops and restarts 'graceful' within a
second or two
top says
Memory: Real: 76M/423M act/tot  Free: 1562M  Swap: 0K/2151M used/tot

I am stumped,

Uwe



Re: httpd-problem after upgrade 4.2 -> 4.3

2008-05-07 Thread Uwe Dippel
On Thu, 08 May 2008 09:41:23 +0800, Uwe Dippel wrote:

> Apache reacts very slow. Despite of a load 0.5,
> lynx 127.0.0.1 (as root) takes more than 5-10 seconds until the static
> -rwxr-x---  1 root  www  2236 Dec 12  2006 /var/www/htdocs/index.html
> props up. Any other task on the system is done instantaneous.
> From other machines, on the same network, it takes a similar time to
> see that page.

Sorry, guys, brown bag. I seem to be very noisy these days.
The reason for this is just a little DDoS!

Once out of sight, everything behaves fine.

My excuses again,

Uwe


