Linux-Hardware Digest #328, Volume #9             Mon, 1 Feb 99 09:14:35 EST

Contents:
  Re: Same Disk RAID and Mirroring (Jim Sawyer)
  HD UDMA ("Amir Man")
  spurious APIC interrupt, ayiee, should never happen. (Ramon Huerta)
  Re: Same Disk RAID and Mirroring ("Thomas Womack")
  L1 cache area only 56M (was: Linux becomes slow with more memory) (Gert Wollny)
  Re: QIC02 TAPE, SuSE 5.3.. how to install and read? Thank you fellows in advance! ("Blackey")
  Re: HD UDMA (Grant Leslie)
  Re: ARCHIVE Python 4320nt DAT drive (Michael Meissner)
  Number9 I128 configuration problem (howie)
  Re: Linux Redhat 5.2 with Compaq proliant 1500 (N.K. Lim)
  Linux for ARM7 ([EMAIL PROTECTED])
  Re: DMA (Benoit PLESSIS)
  Re: spurious APIC interrupt, ayiee, should never happen. (David Fox)

----------------------------------------------------------------------------

From: Jim Sawyer <[EMAIL PROTECTED]>
Crossposted-To: comp.arch,comp.arch.storage,alt.os.linux,comp.periphs
Subject: Re: Same Disk RAID and Mirroring
Date: Mon, 01 Feb 1999 02:05:44 -0800



Alberto Moreira wrote:

> Also sprach Jim Sawyer <[EMAIL PROTECTED]> :
>
> >
> >
> >Andy Glew wrote:
> >
> >> comp.arch readers may know that I am very interested
> >> in issues of reliability for personal computers.
> >>
> >> In this instance, how to provide the reliability of RAID
> >> and/or mirroring to computers that have only one disk
> >> spindle, since the vast majority of PCs have only one
> >> or two hard disks.
> >>
> >
> >Very difficult to obtain even just truly horrible performance with these
> >constraints.  Consider the following:

> ...snip...

> >    Bottom line: 2 older, cheap slow disks with a stupid mirroring controller
> >                        will beat the pants off any implementation of RAID on a
> >                        single spindle.
> >                        (Unless the RAID designers are truly gifted, and the
> >                        implementation is perfect.)
>
> There was a time, back in the late sixties and early to mid seventies,
> where Univac 494s were sold to run in real-time transaction processing
> shops and uptimes in excess of 99.5% were contractually committed to.

I don't recognize the numbering you used.  It's been too long.  494 sounds more like
their 360 lookalike, and I wouldn't expect such uptimes from that
architecture.  An 1100 running Exec8 would give you results like that.
Ditto the YUK whatevers, I forget the numbers (military and ATC machines).
Rock solid stuff.

> One of the schemes people used in those days was three disk drives,
> each mirroring each other, each on its own power supply - what's the
> point of having redundancy if a power supply failure kills all drives
> at once ? - one was live, the second was the hot standby, and the
> third was what was called the "cold" standby.

Sure, but you're talking golden age here, when machines were not supposed
to crash ;-)  And to be fair, this solution costs a bit more than $100, the approximate
limit implied in the original post.  Also, the three drives in your example are two
more than the original poster allowed.  The solutions with 2 or more drives
are well known.  He wants a solution where there's only one disk.

You're right though, in terms of single-point-of-failure analysis.
If you want availability, you must have full redundancy, and multi-port access.

Still, data redundancy is useful without full point-of-failure redundancy.

You lose power, all drives go down, the system crashes, the open files are lost,
as if the programs running at the time had never started.  No big.

Now instead, one disk stops actually writing to its platter.  No error interrupts
or anything, no indication of a problem, just the data isn't actually being recorded.
Of course, you're not doing byte-for-byte compare verifies - they take way too long.
So you can't tell.  This goes on for hours, or days.  Maybe a month goes by.  You run
the invoicing for this month and suddenly discover all the purchase and payment data
for the past month isn't there.  Or worse yet, you run payroll on Friday and all the
checks are wrong, as if last week never happened!

Wouldn't it be great if you could read the data even when the disk didn't actually write it?
If you're RAIDing your disks, you can.  Even without redundant power supplies.
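
A minimal sketch of that idea, assuming a stripe protected by XOR parity in the style of RAID 4/5. The post does not say which RAID level is meant, so the layout, the sample blocks and every function name below are illustrative assumptions. A parity check shows that a stripe is inconsistent, and once the stale disk is known its block can be rebuilt from the other blocks plus parity.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def make_parity(data_blocks):
    """Parity block for one stripe: the XOR of all data blocks."""
    return xor_blocks(data_blocks)

def stripe_is_consistent(data_blocks, parity):
    """True if the stored parity still matches the stored data."""
    return xor_blocks(data_blocks) == parity

def rebuild_block(surviving_blocks, parity):
    """Recover the one missing data block from the survivors plus parity."""
    return xor_blocks(list(surviving_blocks) + [parity])

# Hypothetical stripe: three data disks, one parity disk.
written = [b"PAYROLL1", b"INVOICE2", b"PAYMENT3"]
parity = make_parity(written)

# Disk 1 silently kept its old contents instead of the new write.
on_disk = [written[0], b"OLDDATA2", written[2]]
detected = not stripe_is_consistent(on_disk, parity)
recovered = rebuild_block([on_disk[0], on_disk[2]], parity)
```

The consistency check only says that something in the stripe is wrong. Deciding which disk holds the stale block needs extra information, such as a per-block checksum, which this sketch leaves out.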

> Moreover, these drives
> were multiple-accessed by more than one processor, so one could
> virtually walk to a disk unit or to a processor and hit the big red
> switch, and the system would still be on-line, and no transaction
> would be lost.

You're also correct about fault-tolerance.  I think one of those customers conducted an
acceptance test which involved machine gunning entire cabinets, and the system kept
running.  I believe the testing was done by contractors... at least, that's the way
I heard it.

As for the big-red-button test, absolutely true.  It was nearly impossible to stop a
4 way machine that way - the exec would note the power failure, recycle the box,
and have it back online before you could get to the switch on the other 3 cabinets.
Even with four people, one at each switch, you couldn't do it every time; it required
too much precision to get all four switches at once...

By the way, that's "system", as in the whole thing, including the running
applications, and every transaction on every terminal continuing unaffected.
No 'GPF' or sad-face.  No "reboot me - I forget where I was".

But it takes much more than redundant power supplies and multi-port disks.
Whole different universe.

-jim


------------------------------

From: "Amir Man" <[EMAIL PROTECTED]>
Subject: HD UDMA
Date: Mon, 1 Feb 1999 12:35:17 +0200

I have an IBM 6.4GB UDMA HD. Its UDMA ability is not recognised by kernel
2.0.36 (RedHat 5.2) or by the new 2.2.0 kernel I compiled.
Any ideas?

Amir



------------------------------

From: Ramon Huerta <[EMAIL PROTECTED]>
Subject: spurious APIC interrupt, ayiee, should never happen.
Date: Mon, 01 Feb 1999 11:53:08 +0100
Reply-To: [EMAIL PROTECTED]

Hi there,

I just installed kernel 2.2.0 on a dual pentium pro (motherboard:
super P6DNE) and it seems to work fine.

However it keeps saying: "spurious APIC interrupt, ayiee, should never
happen".

What does it mean? I'd like to know what is wrong

thanks, Ramon

my /proc/interrupts
           CPU0       CPU1
  0:     242849        648          XT-PIC  timer
  1:        593        763    IO-APIC-edge  keyboard
  2:          0          0          XT-PIC  cascade
  3:         51          2    IO-APIC-edge  serial
  4:       3663       3927    IO-APIC-edge  serial
  7:     339867     366441    IO-APIC-edge  eth0
  8:          1          0    IO-APIC-edge  rtc
  9:         43         42    IO-APIC-edge  aha152x
 13:          1          0          XT-PIC  fpu
 14:      44754      42697    IO-APIC-edge  ide0
 15:        914        489    IO-APIC-edge  ide1
NMI:          0
ERR:          0
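
A listing like the one above can be parsed mechanically to see which lines are still routed through the legacy XT-PIC and which go through the IO-APIC. The following is only a sketch: `parse_interrupts` and `SAMPLE` are names made up for illustration, the sample is an excerpt of the listing, and the column layout (per-CPU counts, then controller, then device) is assumed to match this 2.2.0 output. Other kernel versions format the file differently.

```python
def parse_interrupts(text):
    """Parse /proc/interrupts-style text into a list of per-IRQ records."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    cpus = lines[0].split()                 # header row, e.g. ['CPU0', 'CPU1']
    rows = []
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        fields = rest.split()
        # Skip the NMI:/ERR: summary lines and anything too short to parse.
        if not label.strip().isdigit() or len(fields) < len(cpus) + 2:
            continue
        counts = [int(n) for n in fields[:len(cpus)]]
        rows.append({
            "irq": int(label),
            "counts": dict(zip(cpus, counts)),
            "controller": fields[len(cpus)],
            "device": " ".join(fields[len(cpus) + 1:]),
        })
    return rows

# Excerpt of the listing from the post.
SAMPLE = """\
           CPU0       CPU1
  0:     242849        648          XT-PIC  timer
  1:        593        763    IO-APIC-edge  keyboard
  7:     339867     366441    IO-APIC-edge  eth0
 13:          1          0          XT-PIC  fpu
NMI:          0
ERR:          0
"""

rows = parse_interrupts(SAMPLE)
xt_pic = [r["device"] for r in rows if r["controller"] == "XT-PIC"]
eth0_total = sum(rows[2]["counts"].values())
```

On this excerpt, `xt_pic` lists the timer and fpu lines. The timer entry is consistent with the boot message about the 8254 timer not being connected to the IO-APIC, though the sketch itself diagnoses nothing.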

At boot time:
Linux version 2.2.0 (***********************) (gcc version 2.7.2.3) #5 SMP Mon Feb 1 10:36:04 CET 1999
Intel MultiProcessor Specification v1.4
    Virtual Wire compatibility mode.
OEM ID: INTEL    Product ID: 440FX        APIC at: 0xFEE00000
Processor #0 Pentium(tm) Pro APIC version 17
Processor #1 Pentium(tm) Pro APIC version 17
I/O APIC #2 Version 17 at 0xFEC00000.
Processors: 2
mapped APIC to ffffe000 (fee00000)
mapped IOAPIC to ffffd000 (fec00000)
Detected 199434775 Hz processor.
Console: colour VGA+ 80x25
Calibrating delay loop... 199.07 BogoMIPS
Memory: 127860k/131072k available (1064k kernel code, 420k reserved, 1660k data, 68k init)
Checking 386/387 coupling... OK, FPU using exception 16 error reporting.

Checking 'hlt' instruction... OK.
POSIX conformance testing by UNIFIX
mtrr: v1.26 (19981001) Richard Gooch ([EMAIL PROTECTED])
per-CPU timeslice cutoff: 50.10 usecs.
CPU0: Intel Pentium Pro stepping 07
calibrating APIC timer ...
..... CPU clock speed is 199.4399 MHz.
..... system bus clock speed is 66.4796 MHz.
Booting processor 1 eip 2000
Calibrating delay loop... 199.07 BogoMIPS
OK.
CPU1: Intel Pentium Pro stepping 07
Total of 2 processors activated (398.13 BogoMIPS).
enabling symmetric IO mode... ...done.
ENABLING IO-APIC IRQs
init IO_APIC IRQs
 IO-APIC pin 0, 11, 15, 16, 17, 18, 21, 22, 23 not connected.
..MP-BIOS bug: 8254 timer not connected to IO-APIC
...trying to set up timer as ExtINT... .. (found pin 0) ... works.
number of MP IRQ sources: 16.
number of IO-APIC registers: 24.
testing the IO APIC.......................
.... register #00: 02000000
.......    : physical APIC id: 02
.... register #01: 00170011
.......     : max redirection entries: 0017
.......     : IO APIC version: 0011
.... register #02: 00000000
.......     : arbitration: 00
.... IRQ redirection table:
 NR Log Phy Mask Trig IRR Pol Stat Dest Deli Vect:
 00 001 01  0    0    0   0   0    1    7    00
 01 000 00  0    0    0   0   0    1    1    59
 02 000 00  0    0    0   0   0    1    1    51
 03 000 00  0    0    0   0   0    1    1    61
 04 000 00  0    0    0   0   0    1    1    69
 05 000 00  0    0    0   0   0    1    1    71
 06 000 00  0    0    0   0   0    1    1    79
 07 000 00  0    0    0   0   0    1    1    81
 08 000 00  0    0    0   0   0    1    1    89
 09 000 00  0    0    0   0   0    1    1    91
 0a 000 00  0    0    0   0   0    1    1    99
 0b 000 00  1    0    0   0   0    0    0    00
 0c 000 00  0    0    0   0   0    1    1    A1
 0d 000 00  1    0    0   0   0    0    0    00
 0e 000 00  0    0    0   0   0    1    1    A9
 0f 000 00  1    0    0   0   0    0    0    00
 10 000 00  1    0    0   0   0    0    0    00
 11 000 00  1    0    0   0   0    0    0    00
 12 000 00  1    0    0   0   0    0    0    00
 13 0FF 0F  1    1    0   1   0    1    1    B1
 14 000 00  0    0    0   0   0    1    1    B9
 15 000 00  1    0    0   0   0    0    0    00
 16 000 00  1    0    0   0   0    0    0    00
 17 000 00  1    0    0   0   0    0    0    00
IRQ to pin mappings:
IRQ0 -> 2
IRQ1 -> 1
IRQ3 -> 3
IRQ4 -> 4
IRQ5 -> 5
IRQ6 -> 6
IRQ7 -> 7
IRQ8 -> 8
IRQ9 -> 9
IRQ10 -> 10
IRQ11 -> 19
IRQ12 -> 12
IRQ13 -> 13
IRQ14 -> 14
IRQ15 -> 20
.................................... done.
PCI: PCI BIOS revision 2.10 entry at 0xfdba1
PCI: Probing PCI hardware
PCI: 00:00 [8086/1237]: Passive release enable (00)

etc...
and then, keeps saying
spurious APIC interrupt, ayiee, should never happen.
spurious APIC interrupt, ayiee, should never happen.
spurious APIC interrupt, ayiee, should never happen.
spurious APIC interrupt, ayiee, should never happen.



------------------------------

From: "Thomas Womack" <[EMAIL PROTECTED]>
Crossposted-To: comp.arch,comp.arch.storage,alt.os.linux,comp.periphs
Subject: Re: Same Disk RAID and Mirroring
Date: Mon, 1 Feb 1999 12:03:08 -0000

Leslie Mikesell wrote in message <792upn$55q$[EMAIL PROTECTED]>...
>In article <78vef3$ds0$[EMAIL PROTECTED]>,
>Andy Glew <[EMAIL PROTECTED]> wrote:
>>
>>In this instance, how to provide the reliability of RAID
>>and/or mirroring to computers that have only one disk
>>spindle, since the vast majority of PCs have only one
>>or two hard disks.
>
>Hard disks aren't exotic hard-to-find devices anymore.

But it'll be a very, very long time before Dell would put in 5 4G drives
instead of 1 16G drive in a consumer machine (and I think Andy's
interest is consumer machines), simply because five drives are five
times as loud, and unavoidably have five sets of disc controllers and
five elaborate cast-aluminium cases and are more expensive (5x4G IDE is
5x£87 = £435, 1x16G is £199; if you need to go to SCSI to fit that many
devices, 5x4G is £825, one 18G is £595).

Two 8G EIDE drives cost as much as a 16G EIDE drive, but I know no
technique for getting more than 8G of reliable storage out of only a
pair of discs.
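
The arithmetic behind those figures, written out as a small sketch. The prices are the ones quoted above; the variable names and the `mirrored_capacity_gb` helper are illustrative, and the helper assumes a plain two-way mirror in which every drive holds the same data.

```python
# Prices in pounds as quoted in the post.
ide_4g_each = 87
ide_16g = 199
scsi_5x4g_total = 825
scsi_18g = 595

ide_5x4g_total = 5 * ide_4g_each                # 435
ide_premium = ide_5x4g_total - ide_16g          # extra cost of five small IDE drives
scsi_premium = scsi_5x4g_total - scsi_18g       # extra cost of five small SCSI drives

def mirrored_capacity_gb(drive_sizes_gb):
    """Usable space of a plain mirror: every drive holds the same data,
    so capacity is bounded by the smallest drive."""
    return min(drive_sizes_gb)

pair_usable = mirrored_capacity_gb([8, 8])      # two 8G EIDE drives
```

With these numbers the five-drive option costs over £200 more than the single large drive in both the IDE and the SCSI case, and a mirrored pair of 8G drives still offers only 8G of protected space.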

>Not in my experience.  I've seen many more complete
>disk failures than single-sector read errors on recent
>production disks.  Back in the old days you might have
>a small spot with errors show up before the whole thing
>dies.  These days the hardware has already tried to
>correct it and the disk is well on the way out before you
>see the first problem.

Is the solution, then, something like Compaq's SmartDrives, which give
an indication to the OS when they're starting to see lots of errors
requiring correction?

Someone on one of the groups I read has 'henceforth, all SCSI discs will
be required to send an email notification 24 hours before complete
hardware failure'; with SmartDrives, that becomes an OS feature rather
than a joke.

Tom



------------------------------

From: Gert Wollny <[EMAIL PROTECTED]>
Subject: L1 cache area only 56M (was: Linux becomes slow with more memory)
Date: Mon, 01 Feb 1999 12:06:09 +0000

Hi all,

now I will post what I found out:

A very helpful pointer was from Juergen:
> Are you sure that your memory is really cached ? Try to run the
> ctcm16n testprogram from the german publisher heise. It is under
> http://www.heise.de/ct/ftp/pcconfig.shtml with the name ctcm16n.zip. 
> You need to run the program from a standard DOS floppy and it will
> tell you what your memory is really doing.

Result:
L2 cachable area (write through mode) 64M
L1 cachable area 56M < Memory
So the problem was not the motherboard but the CPU (AM5x86 P75); now I will see
what AMD has to say about this.

Erik gave me a pointer to a solution:
> The solution was to incorporate the slram patch into my kernel.  This
> patch allows one to reserve a specified amount of upper RAM to use as
> a RAM swap disk and forces Linux to park itself in the cachable region
> of RAM.  Not as nice as having all of RAM covered by cache, but better
> than not using the new 32Mb module I had just bought.  The patch can
> be found at http://www.andrew.cmu.edu/~keryan/slram/ .

Thanx for your time,

Bye

Gert.

-- 
Remove NOSPAM to reply or mailto:[EMAIL PROTECTED]                     
Max-Planck-Institute of Cognitive NeuroScience     http://www.cns.mpg.de

http://gerti.home.pages.de

------------------------------

From: "Blackey" <[EMAIL PROTECTED]>
Crossposted-To: comp.os.linux.help,comp.os.linux.misc,comp.os.linux.questions,comp.os.linux.setup
Subject: Re: QIC02 TAPE, SuSE 5.3.. how to install and read? Thank you fellows in advance!
Date: Mon, 1 Feb 1999 12:21:40 +0100

Thank you very much for your help.
I saw the site but that's for floppy "oriented" tapes.

Thank you again,
Blackey

A E Lawrence <[EMAIL PROTECTED]> wrote in message
news:[EMAIL PROTECTED]...
>Blackey wrote:
>
>> I have an old-fashioned QIC02 tape drive (150MB extended to 250MB) which is
>> connected through a slot, like a modem etc., and then by a 25-pin connector
>> to the external drive. I think the manufacturer is ARCHIVE.
>>
>> According to the instructions I made a new kernel, but I don't know how to
>> init (mount) the tape and read data from it.
>
>Visit
> http://www.math1.rwth-aachen.de/~heine/ftape/
>and make sure that you have the latest version. There you will also find
>extensive documentation. If you still have problems, ask on the ftape
>list, mentioned in the documentation.
>
>ael
>--
>Dr A E Lawrence (from home)


------------------------------

From: Grant Leslie <[EMAIL PROTECTED]>
Subject: Re: HD UDMA
Date: Mon, 01 Feb 1999 08:32:16 -0400

What motherboard do you have in your computer?

Amir Man wrote:
> 
> I have an IBM 6.4GB UDMA HD. Its UDMA ability is not recognised by kernel
> 2.0.36 (RedHat 5.2) or by the new 2.2.0 kernel I compiled.
> Any ideas?
> 
> Amir

-- 
"It looks so lovely, and fragile. Imagine how many millions of people
 are living on it, and don't even realize how fragile it is."
  Alan B. Shepard, 1971, said with a tear in his eye, on the
            Apollo 14 mission looking back at earth from the moon

------------------------------

From: Michael Meissner <[EMAIL PROTECTED]>
Subject: Re: ARCHIVE Python 4320nt DAT drive
Date: 31 Jan 1999 15:22:12 -0500

Tim Camilleri <[EMAIL PROTECTED]> writes:

> Has anyone managed to get a Python 4320NT DAT drive working. I am using
> Red Hat 5.1. The drive is made by Conner Peripherals (who were bought by
> Seagate). A SCSI Inquiry returns ARCHIVE Python 25588-xxx. I would be
> grateful for any help you can offer.

My Python DAT drive has been working under Linux for the last 3 years (though
it was made by Maynard before Conner bought them and in turn was bought by
Seagate).  Note, with recent kernels, you need to pay attention to termination
and use active termination, instead of using what the device calls termination.
I also found it worked better setting it to ANSI SCSI revision 1 instead of 2.
Obviously, don't forget to compile in support for your scsi adapter and for
scsi tape support (and if it's a module, make sure the module is loaded).

-- 
Michael Meissner, Cygnus Solutions (Massachusetts office)
4th floor, 955 Massachusetts Avenue, Cambridge, MA 02139, USA
[EMAIL PROTECTED],    617-354-5416 (office),  617-354-7161 (fax)

------------------------------

From: howie <[EMAIL PROTECTED]>
Subject: Number9 I128 configuration problem
Date: Mon, 01 Feb 1999 12:52:07 +0000



Has anybody had any luck getting a Number9 Revolution 3d
card with 8mb of memory to run at 1280 x 1024 @ 32bpp? 
Although I run at this resolution under NT, I cannot seem to
get XFree86 to go better than 1024 x 768 @ 32bpp.  If
someone has done this, please post the XF86Config lines that
you used.  If someone knows that this is impossible, please
tell me so I stop banging my head against a wall.

Thanks,

========================howie
[EMAIL PROTECTED]

------------------------------

From: [EMAIL PROTECTED] (N.K. Lim)
Subject: Re: Linux Redhat 5.2 with Compaq proliant 1500
Date: Mon, 01 Feb 1999 12:56:04 GMT

I tried to install Redhat 5.2 on a Compaq Proliant 1500, but the Smart
SCSI card was not detected. I was not able to access the 2 hds that
were connected to this card. I tried autoprobing all the available
drivers during installation, but it did not help. The CDROM SCSI card
was ok and was detected as NCRxxxx something. Any help?

------------------------------

From: [EMAIL PROTECTED]
Subject: Linux for ARM7
Date: Mon, 01 Feb 1999 13:05:08 GMT

Hello Everybody

One big Q .. err small Q... is Linux available for the ARM7/TDMI
processor..

I'm not sure about what this TDMI means...

Thanx in Advance..

Need a quick reply !!!

Regards

Vasu

============= Posted via Deja News, The Discussion Network ============
http://www.dejanews.com/       Search, Read, Discuss, or Start Your Own    

------------------------------

From: Benoit PLESSIS <[EMAIL PROTECTED]>
Crossposted-To: linux.redhat.misc,alt.linux,alt.os.linux,comp.os.linux.misc
Subject: Re: DMA
Date: Mon, 01 Feb 1999 14:27:17 +0100

Chris Leahy wrote:
> 
> I have an ASUS P5A-B motherboard with the Ali 1541 AGP chip and Ali 1543
> super I/O controller chip.

 I've got the same chip as yours and it gives me the same messages.
I haven't found any solution except waiting for Linus to finish his work
on the chip (he has announced he'll work on support for this chipset
after finishing the 2.2.0 kernel; we are at 2.2.1, so ...)

Benoit PLESSIS
[EMAIL PROTECTED]

------------------------------

From: d s f o x @ c o g s c i . u c s d . e d u (David Fox)
Subject: Re: spurious APIC interrupt, ayiee, should never happen.
Date: 01 Feb 1999 05:17:20 -0800

Ramon Huerta <[EMAIL PROTECTED]> writes:

> I just installed kernel 2.2.0 on a dual pentium pro (motherboard:
> super P6DNE) and it seems to work fine.
> 
> However it keeps saying: "spurious APIC interrupt, ayiee, should never
> happen".
> 
> What does it mean? I'd like to know what is wrong

This patch is from the kernel mailing list:

--- linux/arch/i386/kernel/smp.c~       Thu Jan 21 11:28:40 1999
+++ linux/arch/i386/kernel/smp.c        Sat Jan 30 05:05:59 1999
@@ -725,9 +725,8 @@
         * Set the logical destination ID to 'all', just to be safe.
         * also, put the APIC into flat delivery mode.
         */
-       value = apic_read(APIC_LDR);
-       value &= ~APIC_LDR_MASK;
-       value |= SET_APIC_LOGICAL_ID(0xff);
+       value = (1 << hard_smp_processor_id());
+       apic_write(APIC_LDR, value);
        apic_write(APIC_LDR,value);
 
        value = apic_read(APIC_DFR);
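
The removed lines set the logical destination ID to 0xff ('all'), while the added line uses one bit per processor. A rough sketch of what that difference means under the APIC's flat logical destination model, in which a local APIC accepts a message when the message's destination mask shares a bit with its own logical ID. The function names are invented for illustration, and the sketch models only that matching rule, not the register layout or the rest of what the kernel does.

```python
def logical_id_for_cpu(cpu_id):
    """One bit per processor, mirroring `1 << hard_smp_processor_id()`."""
    if not 0 <= cpu_id < 8:
        raise ValueError("flat mode has only 8 logical ID bits")
    return 1 << cpu_id

def accepted_by(logical_ids, destination_mask):
    """Indices of the CPUs whose logical ID overlaps the destination mask."""
    return [cpu for cpu, lid in enumerate(logical_ids) if lid & destination_mask]

# Dual-processor box, as in the original report.
all_ones = [0xFF, 0xFF]                                  # old code: 'all' on every CPU
per_cpu = [logical_id_for_cpu(0), logical_id_for_cpu(1)] # patched: 0x01 and 0x02

both = accepted_by(all_ones, 0x01)       # a message aimed at bit 0
only_cpu0 = accepted_by(per_cpu, 0x01)   # the same message after the patch
```

With every logical ID set to 0xff, any logically addressed message matches both processors; with one bit each, a message aimed at bit 0 matches only CPU 0. Whether that is the whole story behind the "spurious APIC interrupt" message is not something this sketch can establish.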


-- 
David Fox           http://hci.ucsd.edu/dsf             xoF divaD
UCSD HCI Lab                                         baL ICH DSCU

------------------------------


** FOR YOUR REFERENCE **

The service address, to which questions about the list itself and requests
to be added to or deleted from it should be directed, is:

    Internet: [EMAIL PROTECTED]

You can send mail to the entire list (and comp.os.linux.hardware) via:

    Internet: [EMAIL PROTECTED]

Linux may be obtained via one of these FTP sites:
    ftp.funet.fi                                pub/Linux
    tsx-11.mit.edu                              pub/linux
    sunsite.unc.edu                             pub/Linux

End of Linux-Hardware Digest
******************************
