Hi!
Trying to kill the keyboard, [EMAIL PROTECTED] produced:
> > Don't use that for the X server. Any program going into a tight loop
> > will deprive the X server of any CPU time, so you won't be able to do
> > _anything_ (besides logging in over the network or a serial port) when
> > this happens, and if you don't have the necessary equipment, need the
> > reset button.
You could always write a (suid) watchdog program which runs
itself at a static priority slightly higher than your 'realtime'
program --- or, in that case, even under normal scheduling.
With realtime programs it would check that a companion program
slightly lower in priority reacts in time; with idle scheduling
it would test that e.g. X ran at all. If not, it could stop,
kill or reschedule the runaway program to SCHED_OTHER
(timesharing, the default). I don't know if such a program
exists; if not, I might write one some day. Then you could start
'realtime' programs without fear, as the machine would
_eventually_ react again.
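A minimal sketch of such a watchdog, assuming Linux and the
POSIX scheduling calls (the heartbeat-file mechanism and all
function names here are made up for illustration, not an
existing tool):

```python
import os
import time

def heartbeat_stale(last_beat, now, timeout):
    """True if the watched 'realtime' program missed its deadline."""
    return (now - last_beat) > timeout

def demote_to_timesharing(pid):
    """Reschedule a runaway process back to SCHED_OTHER (timesharing)."""
    try:
        os.sched_setscheduler(pid, os.SCHED_OTHER, os.sched_param(0))
        return True
    except PermissionError:
        # demoting *another* user's realtime process needs root,
        # which is why the watchdog would run suid
        return False

def watch(pid, beat_file, timeout=2.0):
    """Poll a heartbeat file the companion program touches regularly;
    demote the realtime program if the heartbeat goes stale."""
    while True:
        last = os.stat(beat_file).st_mtime
        if heartbeat_stale(last, time.time(), timeout):
            demote_to_timesharing(pid)
            return
        time.sleep(timeout / 2)
```

Since the watchdog itself runs at a higher static priority than
the watched program, a tight loop in the latter cannot starve it.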
> First, ftape totally agreed with my nice new-used Dual P133... after about a
> year, it stopped working properly... It kept burping, producing id_am
> errors, CRC errors, piles of overruns, and the like. It couldn't write to
Which backup program? Did you use the program 'buffer'?
[Tale of burned root HD]
> Well, I put the ftape drive on the crashed computer along with the emergency
> drive, booted up, and restored all my files.... interestingly enough...
> there were no id_am errors and the like... This was my first clue that
> something is going on here... it mysteriously worked... hrmmm...
> Oddly, it now works on fine on my system now... even with the old restored
> stuff...
> So I asked myself... *WHAT HAS CHANGED*
> The hardware is the same...
No, unless it's exactly the same HD, not just a similar
replacement. I know for a fact that some HDs with the very
same type designation are/were available in 2 sizes!
But that does not count, really, because ... it worked on the
old HD.
> The software is certainly the same...
> mmmm... change? YES one very IMPORTANT chang had occured...
> The file system is now DEFRAGMENTED.
Depends on your backup method. dd would not have that effect.
> I had installed all the libraries by hand compiling... and over the course
> of a year or 2 they certainly must have gotten fragmented, along with all
> the other files from all of my hand upgrades!
No. You see, it does not matter whether you install them by
hand or with some other program: files get written and deleted
either way, so they fragment just the same. Only a raw copy
with dd would have carried the old layout over.
> Possible solutions for the rest of you:
> 1: e2defrag is a good utility to try... two things to remember, tho
> a: the fs cnnot be live, defrag it from an emergency restore disk
> b: you will not be able to boot after it's defraged untill you lilo it...
There are people who (have to[1]) use loadlin. Loadlin won't
mind, since it (and the kernel) lives on a DOS partition.
Also note that e2defrag is an experimental program; it could
burn and destroy your partition.
[1] Try initialising an AWE32 which has only a port assigned via
jumpers, but gets the DMA, IRQ and so on via a DOS program.
No, isapnp turns up empty.
> 2: destroy and recover the entire primary system disk. Yes, it's quite a
> harsh way to defragment , but it may prove worthy of a last ditch effort.
It's also a good test of whether you can recover from disaster,
but a bad moment to find out that you can't.
3. Have a big enough partition (on the same or another HD), put
   another / (root filesystem) on it and boot from that. Once
   you can boot from the spare, you can delete the contents of
   the original root FS at will and copy them back. If the
   spare partition *is* large enough, you can repeat this for
   the other partitions.
4. Use a sensible partitioning scheme. Here, I have

   Mountpoint            Size   % full  % frag  Notes
   /                      30M     55%    1.9%   [2]
   swap                   98M     ---    ----   32 MB RAM
   /tmp                   68M     ---    3.2%   deleted on boot,[3]
   /var                  111M     64%    5.7%   [4]
   /usr                  1.1G     77%    3.5%   [5]
   /home                  99M     96%    9.9%   [6]
   /var/spool/wwwoffle    61M     80%   14.5%   [7]
   /var/spool/news        55M     86%    2.3%   [7][7a]
   /extraspace           301M     63%   10.9%   [8]
This mostly confines any fragmentation to its own partition. If
you use just / and swap, you are of course inviting
fragmentation everywhere.
[2] I could probably get away with 20 MB, or even with 15. But /
should be small, because a smaller / means a smaller risk of
data corruption.
[3] This is an extra partition, since the data on / is static,
but /tmp changes a lot. Less access to / (safer, more space),
and the fragmentation stays in /tmp.
[4] Again, this is an often-changing partition, so see [3].
[5] This partition should be mostly static, but there's /usr/src
(where, thanks to RPM and the kernel, quite a few changes
happen). Next time there will be a separate /usr/src
partition.
[6] Yes, I need to clean up my /home (I am the only user).
And/or move to a bigger /home. Both that and the heavy use
the partition gets add to the high fragmentation.
[7] They are separate, since they change often, and I would
not want them to overrun /var or to affect each other.
Additionally, for the news spool I had to change the inode
density ... the standard one inode per 4 K was way
too low :-)
[7a] Note that most of the files are shorter than one block
(1 K), so the fragmentation looks lower. I've seen more
there, though, and it would be worse if lumped together
with other types of data.
[8] Just some stuff I don't want to throw away right now and a
place to use if I need a few MBs.
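To put numbers on footnote [7]: a quick sketch of the mke2fs
bytes-per-inode arithmetic (the 55M spool size is from the table
above; mke2fs's -i option sets this density):

```python
def inode_count(partition_bytes, bytes_per_inode):
    # mke2fs -i N allocates roughly one inode per N bytes of space
    return partition_bytes // bytes_per_inode

spool = 55 * 1024 * 1024            # the 55M news spool from the table
print(inode_count(spool, 4096))     # default density: ~14000 inodes
print(inode_count(spool, 1024))     # one inode per 1 K block: ~56000
```

With articles averaging around 1 K, the spool could hold some
55000 files, so at the default density the inodes run out long
before the disk space does.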
> I don't know how it could possibly affect a kernel driver or module, but,
> it really seems to. I do notice the same behavior on another system with
> fragmented libraries... so there definately is a relationship.
Not really. Once the programs are loaded (and that includes the
relevant library parts) they should stay in memory, unless you
have so little RAM that you have to page heavily (remember,
executables are not committed to swap but are reloaded from the
disk ...)
> Perhaps it's
> taking to much time to load in a library with seeks and whatnot and is
> chopping up the timing to the tape deck?
Hmmm ... the kernel has no need to search sequentially through
the library to find the relevant part ... it issues a seek to a
position, then the VFS can just look in the inode, the indirect
and perhaps the double-indirect blocks to find the position of
the block(s) on the HD. The inodes should still be in the cache
and, if not, are close to each other in the same inode block.
Now, if you *are* very short on memory you probably could get
these effects, but you would hear the disk thrashing all the
time as it struggles to read the backup data and the swapped-out
data AND the paged-out executables AND the paged library data
from the disk. At least that is what I think happens.
Of course I may be terribly wrong. Hmmm ... did you run the
system in SMP mode when you started to see the problems? It
could be an SMP-related oddity as well ... seeing that SMP
support is experimental in 2.0.x.
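The lookup described above can be sketched like this (assuming
classic ext2 with 1 K blocks, 12 direct pointers in the inode
and 4-byte block pointers; the function name is mine):

```python
POINTERS = 1024 // 4   # a 1 K indirect block holds 256 4-byte pointers
DIRECT = 12            # block pointers stored directly in the inode

def lookup_chain(block_index):
    """Which on-disk structures the VFS reads to map a file-relative
    block number to a disk block -- no sequential scan involved."""
    if block_index < DIRECT:
        return ["inode"]
    block_index -= DIRECT
    if block_index < POINTERS:
        return ["inode", "indirect"]
    block_index -= POINTERS
    if block_index < POINTERS ** 2:
        return ["inode", "indirect", "double-indirect"]
    return ["inode", "indirect", "double-indirect", "triple-indirect"]
```

So even a seek deep into a large library costs at most three or
four block reads, most of them usually already in the cache.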
> yes, I know ext2fs is fragment resistant...
> but nothing is bulletproof...
> last time the system did get a frag check it showed 20% fragmentation...
only 20% ? :-) Oh ... you are of the 'there's root and there's
swap' religion? :-) No wonder then ... But does that not mean
that every *5th* file was not written contiguously, but had one
(or more) jump(s) in it? That should not really be bad (unless
you start filling up the disk).
> Oh... btw, since this defrag, it even works when there's heavy net traffic,
> serial i/o, sound blaster jamming away mod files, and I'm in X.... and the
> tape doesn't even shoeshine :-) (HP-Colorado T-300 @ 1000mbps, dual p133
> single cpu mode)
You could induce an artificial disk load with
e2fsck -n /dev/zo'e [9]
and a memory shortage by continuously running swapout. Then we
shall see whether we can induce the problem by memory shortage
and a slow disk (which should give much the same effect as
memory shortage and a fragmented disk).
[9] -n means read-only, answer no to everything (so it's safe).
zo'e loosely means 'whatever' in Lojban.
> I have got to try SMP mode next... we'll see what happens. I recently diched
> SMP because it was loosing serial IRQ... this defragmentation might even
> help that :-) Soon as I can, I'll let everyone know how it worked out.
Hmmm ... losing data on a serial modem sounds like you may want
to use setserial ... and if that is not the problem, it *is* SMP
increasing latency to an unacceptable degree. Modems,
especially high-speed ones, are quite sensitive to high latency
(since the FIFO on the I/O chip in your computer is only
16 bytes long ...). I imagine many floppy tape drives are
sensitive to delayed interrupts as well. In fact, the Iomega
MAX (Pro) ones seem to be extremely sensitive ...
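To put a number on that 16-byte FIFO (assuming a 16550-style
UART and 8N1 framing, i.e. 10 bits on the wire per byte):

```python
def fifo_fill_ms(baud, fifo_bytes=16, bits_per_byte=10):
    # time until a full FIFO overflows if no interrupt is serviced
    return fifo_bytes * bits_per_byte / baud * 1000

print(fifo_fill_ms(115200))   # ~1.4 ms of interrupt-latency budget
print(fifo_fill_ms(9600))     # ~16.7 ms -- much more forgiving
```

And since the receive interrupt normally fires at a trigger
level below 16 bytes, the real budget is tighter still.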
-Wolfgang
--
PGP 2 welcome: Mail me, subject "send PGP-key".
Unsolicited Bulk E-Mails: *You* pay for ads you never wanted.
How to dominate the Internet/WWW/etc? Destroy the protocols! See:
http://www.opensource.org/halloween.html