Re: unpleasant ps output and possible related problems.

2001-09-06 Thread David O'Brien

On Wed, Sep 05, 2001 at 11:21:06PM -0400, Mike Barcroft wrote:
> I think it can safely be said that you're rebooting too much.  The
> process can be simplified to:
> make world
> make kernel
> mergemaster
> reboot

For -current I would suggest a slight modification to this -- to make
sure everything compiles before installing any bits:

make buildworld
make buildkernel
make installkernel
make installworld
mergemaster
reboot

Reboot between the installkernel and installworld steps if you like.
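
The sequence above can be sketched as a small shell function that stops at the first failing step, so nothing is installed unless both builds succeed. This is only an illustration, not anything from the tree: the function name update_current and the MAKE and SRCDIR variables are made up for the example (SRCDIR defaults to /usr/src as mentioned elsewhere in this thread), and mergemaster and reboot are left as manual steps because mergemaster is interactive.

```sh
#!/bin/sh
# Hypothetical helper, not part of FreeBSD: run the -current update
# steps in order and stop at the first one that fails.
# MAKE and SRCDIR are assumptions made for this sketch; override
# MAKE=echo to print the targets without running them.
MAKE="${MAKE:-make}"
SRCDIR="${SRCDIR:-/usr/src}"

# The body runs in a subshell, so the cd does not leak to the caller.
# Both build steps come first; the install steps are only reached
# if everything compiled.
update_current() (
    cd "$SRCDIR" &&
    $MAKE buildworld &&
    $MAKE buildkernel &&
    $MAKE installkernel &&
    $MAKE installworld
)
```

Something like `update_current && mergemaster` followed by a manual reboot would then match the list above; setting MAKE=echo first shows the order of targets without touching the system.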

-- 
-- David  ([EMAIL PROTECTED])

To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-current" in the body of the message



Re: unpleasant ps output and possible related problems.

2001-09-05 Thread Mike Barcroft

Dave Cornejo <[EMAIL PROTECTED]> writes:
> you wrote:
> 
> > When you rebuild and install a new kernel, are you also doing a
> > `make buildworld` and a `make installworld` in /usr/src before you
> > reboot?
> 
> My usual method is to build a kernel, reboot, build world, reboot,
> build a kernel using the new world, reboot again, do a mergemaster,
> one final build world, reboot, then test.
> 
> If I'm bored I'll do it all over again after combing for stale binaries.
> 
> Fortunately, I have a very fast system. :-)

I think it can safely be said that you're rebooting too much.  The
process can be simplified to:
make world
make kernel
mergemaster
reboot

Best regards,
Mike Barcroft




Re: unpleasant ps output and possible related problems.

2001-09-05 Thread Dave Cornejo

you wrote:

> When you rebuild and install a new kernel, are you also doing a
> `make buildworld` and a `make installworld` in /usr/src before you
> reboot?

My usual method is to build a kernel, reboot, build world, reboot,
build a kernel using the new world, reboot again, do a mergemaster,
one final build world, reboot, then test.

If I'm bored I'll do it all over again after combing for stale binaries.

Fortunately, I have a very fast system. :-)

> Sometimes changes to userland are trivial, and you may not need to
> rebuild userland, but utmp corruption is indicative of changes that
> require userland be rebuilt and installed.
> 
> Ideally, you should buildworld/installworld *EVERY* time you build a
> -current kernel.
> 
> Of course, if you have already done this, feel free to issue me a
> boot to the head.

I expect that this is the first question that should be asked of
anyone who doesn't state explicitly that they follow a rigorous
process for assuring a good build.  No boot to the head necessary...

> You note that you are running innd, please don't tell me that you
> are using -current in a production environment...  -current is always
> subject to massive *FUNDAMENTAL* changes with only a moment's notice,
> and breakage without any notice at all...  Using -current in a
> production environment, unless seriously justified [such as -current
> being more stable than -stable], is a fine way to put yourself in a
> position to commit hara-kiri, and nobody wants that.

I would call it a non-production system - besides, how better to test
-current than by doing exactly what I would do with it in a real
production system?

I really don't think that this is an INN problem though - my best
guess at the moment is that either something is busted in csh/tcsh or
that something it relies upon is broken.  The outward symptoms I saw
that screwed up news or the boot sequence I think could be explained
by the scripts returning control to console rather than exiting the
shell, but that's a wild guess.

I have rolled most of my -current systems back to a source tree from
23:36 UT on Monday night which is the last time I built a fully
working system.  I don't have too much time to play with it but I
still have two very -current systems that exhibit the problem of the
ps corruption so I'll keep plugging and if I get some time this
weekend and they still are doing this, then I'll try and get more
info.

dave c

-- 
Dave Cornejo @ Dogwood Media, Fremont, California (also [EMAIL PROTECTED])
  "There aren't any monkeys chasing us..." - Xochi




Re: unpleasant ps output and possible related problems.

2001-09-05 Thread Jim Bryant

Dave Cornejo wrote:

> I apologize for not having any idea where to start on this.  I am not
> whining for someone to fix something, merely reporting an odd behavior
> that I have now seen on multiple machines in case it means something to
> anybody.
> 
> I am tracking current almost daily on three machines.  Starting
> yesterday I managed to get one box that refused to go into multiuser
> mode: it would run the rc script and then hang somewhere executing the
> scripts in /usr/local/etc/rc.d.  If I Ctrl-C'ed it, it would come up
> in the single-user mode shell.  No login prompt, just the shell.  I
> could however telnet into the thing, and most things seemed to work.
> 
> In this state it had hung without starting INN - so I su'ed and tried
> to start it.  INN starts, but I end up at a prompt with a uid of news!
> If I exit that, INN dies.
> 
> I do a ps -ax and I get some corrupt lines:
> 
>   471  p0  Is 0:00.07 -csh (csh)
>   473  p0  I  0:00.01 su -m
>   474  p0  S  0:00.04 \M-[\M-!\^D\b\M-X\M-!\^D\b (csh)
> 12673  p0  R+ 0:00.00 ps ax
>   466  v0  Is+0:00.01 /usr/libexec/getty Pc ttyv0
> 
> In troubleshooting this I went back to an older kernel and the problem
> persists.  Change back to an old world and it's gone.  Tried the
> new kernel with old world and it also seems to work fine.  So the
> problem seems to be somewhere in the libs or userland.
> 
> Now I went and looked at some other systems rebuilt yesterday evening
> and today and while they still work I see the same sort of corruption
> as above in the ps output - but no other apparent side effects.
> 
> The corrupted line shows up in many different places and users, and
> the exact contents vary, but there's always a "(csh)" at the end.


When you rebuild and install a new kernel, are you also doing a `make buildworld`
and a `make installworld` in /usr/src before you reboot?


Sometimes changes to userland are trivial, and you may not need to rebuild userland,
but utmp corruption is indicative of changes that require userland be rebuilt and
installed.

Ideally, you should buildworld/installworld *EVERY* time you build a -current
kernel.

Of course, if you have already done this, feel free to issue me a boot to the head.

You note that you are running innd, please don't tell me that you are using
-current in a production environment...  -current is always subject to massive
*FUNDAMENTAL* changes with only a moment's notice, and breakage without any
notice at all...  Using -current in a production environment, unless seriously
justified [such as -current being more stable than -stable], is a fine way to
put yourself in a position to commit hara-kiri, and nobody wants that.

jim
-- 
 ET has one helluva sense of humor!
He's always anal-probing right-wing schizos!

   POWER TO THE PEOPLE!







unpleasant ps output and possible related problems.

2001-09-05 Thread Dave Cornejo

I apologize for not having any idea where to start on this.  I am not
whining for someone to fix something, merely reporting an odd behavior
that I have now seen on multiple machines in case it means something to
anybody.

I am tracking current almost daily on three machines.  Starting
yesterday I managed to get one box that refused to go into multiuser
mode: it would run the rc script and then hang somewhere executing the
scripts in /usr/local/etc/rc.d.  If I Ctrl-C'ed it, it would come up
in the single-user mode shell.  No login prompt, just the shell.  I
could however telnet into the thing, and most things seemed to work.

In this state it had hung without starting INN - so I su'ed and tried
to start it.  INN starts, but I end up at a prompt with a uid of news!
If I exit that, INN dies.

I do a ps -ax and I get some corrupt lines:

  471  p0  Is 0:00.07 -csh (csh)
  473  p0  I  0:00.01 su -m
  474  p0  S  0:00.04 \M-[\M-!\^D\b\M-X\M-!\^D\b (csh)
12673  p0  R+ 0:00.00 ps ax
  466  v0  Is+0:00.01 /usr/libexec/getty Pc ttyv0

In troubleshooting this I went back to an older kernel and the problem
persists.  Change back to an old world and it's gone.  Tried the
new kernel with old world and it also seems to work fine.  So the
problem seems to be somewhere in the libs or userland.

Now I went and looked at some other systems rebuilt yesterday evening
and today and while they still work I see the same sort of corruption
as above in the ps output - but no other apparent side effects.

The corrupted line shows up in many different places and users, and
the exact contents vary, but there's always a "(csh)" at the end.

-- 
Dave Cornejo @ Dogwood Media, Fremont, California (also [EMAIL PROTECTED])
  "There aren't any monkeys chasing us..." - Xochi
