RE: Benchmarking.. how can I get more out of my box?
Under what circumstances are you "only" achieving 26MB/s - what file size? Was it random or sequential?

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Brian Pomerantz
Sent: 08 March 2000 05:07
To: [EMAIL PROTECTED]
Subject: Re: Benchmarking.. how can I get more out of my box?

Well, I know I'm not getting the performance I want out of the Mylex DAC1164P. I was only getting 26MB/s on write throughput with 2 RAID 5 chains (5+p). I'm certain that either the card is not optimized, the driver is not optimized, or both. One of the things that we have found here at LLNL is that there MUST be synchronization between the firmware on the controller, the driver, the host OS VFS layer, and the file system.
blocksize changed during write
After crashes I see a lot of these messages (RAID5):

www3 kernel: ll_rw_block: device 09:00: only 4096-char blocks implemented (1024)
www3 last message repeated 226 times
www3 kernel: md0: blocksize changed during write

What do they actually mean? The first one (only 4096-char blocks implemented) totally fills my dmesg output.

/Johan Ekenberg
quit mailinglist
How can I quit this mailing list?

Philipp Krause [EMAIL PROTECTED]
kernel not loading after application of the patch
Hello,

I'm rather new to the Linux world (only a year since I first put my hands in this) and I'm now assigned the task of maintaining a server. I'm having a problem with RAID (software RAID, that is). It didn't work with the previous versions, so I tried the new version of the raid patch (0145-19990824-2.2.11). After applying the patch and fixing the source files (ll_rw_blk.c) due to a .rej file, I compiled the new kernel. Everything goes fine, but when I boot that new image I get these messages:

Mounting local filesystems...
proc on /proc type proc (rw)
/dev/hda1 on /boot type ext2 (rw)
kernel panic: B_FREE inserted into queues.

The first 3 lines are normal; any idea what could cause the last one? I got that message the second time I tried to boot the image. The first time, the computer simply froze after "Loading image" without even decompressing it. *sigh* I have a bad weekend scheduled, it seems.

Stephan Pirson, Network Engineer or something
Saibot, Hesperian Immortal
* Like to try an online adventure game? Go to Hesperia:
* Telnet address: telnet://hesperia-mud.org:7000
* IP address   : telnet://209.83.132.83:7000
* Homepage URL : http://www.hesperia-mud.org
Re: Benchmarking.. how can I get more out of my box?
[ Tuesday, March 7, 2000 ] Matthew Clark wrote:
> Hey guys.. I just installed and ran iozone.. neat tool.. When the file size reaches 32Mb, I see a huge drop from around 129Mb/sec (obviously caching effects) right down to 10Mb/sec... then at 64Mb it drops to between 2.5 and 6.7 Mb/sec depending on record/block size...

Could you try bonnie (textuality.com/bonnie) or tiotest (mirror available at sublogic.com/tio that includes the mmap code as 0.25)? The second opinions they offer would be interesting to see.

> I have a Dual Intel PIII 500 system with 256Mb of main Memory... It has a Hardware RAID 5 system on 5 18 Gb Seagate Barracuda drives spread over 3 LVD SCSI channels on a Megaraid controller. I have the latest megaraid source (1.05) from ami.com.

What parameters did you use making the h/w array? (write-through vs write-back, stripe size, etc)

James
RE: Benchmarking.. how can I get more out of my box?
Hmm.. well you may think 26MB/sec is poor for writing.. I would be drooling at such vast speeds.. Would you mind telling me how you set up your RAID array (i.e. policies) and filesystem (inodes, block sizes, strides etc)? I'm seeing 2MB/sec on writes and only 16-17MB/sec on reads.. sequential or random!!!

Matthew Clark.

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Brian Pomerantz
Sent: 08 March 2000 17:28
To: Matthew Clark
Cc: Brian Pomerantz; [EMAIL PROTECTED]
Subject: Re: Benchmarking.. how can I get more out of my box?

It was sequential using a modified Bonnie benchmark with multiple processes running (though I was getting around the same performance using the raw device with a program I wrote). I was writing 10GB
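For anyone wanting to sanity-check their own sequential-write numbers without Bonnie, here is a rough sketch in Python. This is purely my illustration, not the modified Bonnie or the raw-device program mentioned above, and the sizes should be scaled up toward the 10GB range (well past RAM) for the result to mean anything:

```python
import os
import time

def sequential_write_mb_per_s(path, total_bytes, block_size=64 * 1024):
    """Write total_bytes sequentially in block_size chunks; return MB/s."""
    buf = b"\0" * block_size
    start = time.time()
    with open(path, "wb") as f:
        written = 0
        while written < total_bytes:
            f.write(buf)
            written += block_size
        f.flush()
        os.fsync(f.fileno())  # push the data past the page cache
    elapsed = time.time() - start or 1e-9  # guard against a zero timer
    return (written / (1024.0 * 1024.0)) / elapsed

# e.g. sequential_write_mb_per_s("/mnt/raid/testfile", 10 * 1024**3)
# (path and size are hypothetical; pick a file on the array under test)
```

Without the fsync you mostly measure the page cache, which is exactly the 129MB/s-then-cliff effect seen in the iozone numbers earlier in this thread.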
Re: Benchmarking.. how can I get more out of my box?
On Wed, Mar 08, 2000 at 09:14:02PM +0100, Holger Kiehl wrote:
> Why don't you try SW raid?

The Mylex controllers I have don't do SCSI; each presents a block device. I think I'm going to try these drives on my NCR controller just to get a baseline on what kind of write performance they are capable of. Maybe I'll try software RAID when I do that. What sort of CPU usage does software RAID use? Also, does it work on the Alpha platform? Last time I tried to get it working (granted, I didn't spend a lot of time on it), I was unable to get very far.

In the end, I don't think software RAID is an option for HPC. It is likely in a production system we will want to have a great deal of space on each I/O server with many RAID chains on each of them. I don't think I would see the best performance using software RAID. To add to the complexity, I'll be doing striping across nodes for our cluster file system. Probably what will happen is I'll use Fibre Channel with an external RAID "smart enclosure". This will allow for more storage and more performance per I/O node than what I could achieve by using host RAID adapters.

BAPper
Re: Benchmarking.. how can I get more out of my box?
On Wed, 08 Mar 2000, Brian Pomerantz wrote:
> On Wed, Mar 08, 2000 at 09:14:02PM +0100, Holger Kiehl wrote:
> > Why don't you try SW raid?
> The Mylex controllers I have don't do SCSI, it presents a block device. I think I'm going to try these drives on my NCR controller just to get a base-line on what kind of write performance they are capable of. Maybe I'll try software RAID when I do that.

A benchmark of the SW vs. HW results would be _very_ interesting to a lot of us (hint, hint :)

> What sort of CPU usage does software RAID use?

There's next to no CPU overhead in SW RAID levels -linear and -0. Their overhead is barely measurable, as no extra data is copied and no extra requests are issued (it's simply a re-mapping of an existing request).

Level 1 has some overhead in the write case, as the write request must be duplicated and sent to all participating devices. This is mainly a RAM bandwidth eater, but it will show up as extra CPU usage.

Levels 4 and 5 do parity calculation. A PII 350MHz is capable of parity calculation at 922 MB/s (number taken from the boot output). This means that on virtually any disk configuration one could think of, the XOR overhead imposed on the CPU would be something like less than 10% of all available cycles. In most cases more like 1-2%, and that's during continuous writing.

Another overhead in -4 and -5 (and probably by far the most significant) is that in order to execute a write request to the array, this request must be re-mapped into a number of read requests, a parity calculation, and then two write requests (one for the parity). Even though this sounds expensive in terms of CPU and maybe latency, I would be rather surprised if you could build a setup where the RAID-5 layer would eat more than 10% of your cycles on any decent PII. Now compare 10% of a PII to the price of the Mylex ;)

> Also, does it work on the Alpha platform? Last time I tried to get it working (granted I didn't spend a lot of time on it), I was unable to get very far.

Sorry, I don't know.

> In the end, I don't think software RAID is an option for HPC. It is likely in a production system we will want to have a great deal of space on each I/O server with many RAID chains on each of them. I don't think I would see the best performance using software RAID.

You don't _think_ you would see better performance? I'm pretty sure you will see better performance. But on the other hand, with a large number of disks, sometimes the hot-swap capability comes in handy, and sometimes it's just nice to have a red light flashing next to the disk that died. Hardware RAID certainly still has its niche :) - it's just usually not the performance one.

> To add to the complexity, I'll be doing striping across nodes for our cluster file system. Probably what will happen is I'll use Fibre Channel with an external RAID "smart enclosure". This will allow for more storage and more performance per I/O node than what I could achieve by using host RAID adapters.

Ok, please try both SW and HW setups when you get the chance. This is a situation that calls for real numbers. However, since we're speculating wildly anyway, here's my guess: software RAID will wipe the floor with any hardware RAID solution for the striping (RAID-0) setup, given the same or comparable PCI-something-disk busses. (And for good reason: hotswap on RAID-0 is not often an issue ;)

-- 
: [EMAIL PROTECTED] : And I see the elder races, : :.: putrid forms of man: : Jakob Østergaard : See him rise and claim the earth, : :OZ9ABN : his downfall is at hand. : :.:{Konkhra}...:
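To make the parity arithmetic above concrete, here is a toy sketch (my own illustration in Python, with byte strings standing in for disk chunks; it is not kernel code). It shows both why RAID-4/5 parity is cheap for the CPU (a plain XOR) and how a small write is serviced by read-modify-write:

```python
def xor_blocks(*blocks):
    """XOR equal-length byte blocks together -- the RAID-4/5 parity operation."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)

def rmw_parity(old_parity, old_data, new_data):
    """Read-modify-write parity update for a single-chunk write:
    new_parity = old_parity XOR old_data XOR new_data."""
    return xor_blocks(old_parity, old_data, new_data)

# A lost chunk is recovered by XOR-ing the surviving chunks with the parity.
```

The read-modify-write path is exactly why a small RAID-5 write becomes two reads plus two writes on the wire even though the XOR itself costs almost nothing.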
Re: kernel not loading after application of the patch
Does this server use LILO to boot? Did you update LILO after you finished? If these questions are stupid, please forgive!

Jon.

Saibot wrote:
> Hello, I'm rather new to the linux world (only a year since I first put my hands in this) and I'm now assigned the task to maintain a server. I'm right now having a problem with RAID (software raid that is). it didn't work with the previous versions so I tried with the new version of the raidpatch (0145-19990824-2.2.11) after applying the patch and fixing the source files (ll_rw_blk.c) due to a .rej file, I compiled the new kernel, everything goes fine, I boot on that new image and get these messages
> Mounting local filesystems...
> proc on /proc type proc (rw)
> /dev/hda1 on /boot type ext2 (rw)
> kernel panic: B_FREE inserted into queues.
> the first 3 lines are normal, any idea what could cause the last one? that message I had the second time I tried to boot the image. The first time the computer simply froze after "Loading image" without even decompressing it. *sigh* I have a bad weekend scheduled it seems.
Re: Benchmarking.. how can I get more out of my box?
On Wed, Mar 08, 2000 at 09:14:02PM +0100, Holger Kiehl wrote:
> > Why don't you try SW raid?
> In the end, I don't think software RAID is an option for HPC.

Consider that the poor little controller chip on the RAID card is vastly underpowered for what you are asking it to do: raw I/O at full speed plus handling all the RAID calculations. Compare that to the excess number-crunching capacity of your primary CPU + MMX processor (used for RAID parity calculations), and you might find software RAID to be a significant improvement if you can get the controller to just concentrate on delivering raw I/O to the DMA of your main machine.

Michael [EMAIL PROTECTED]
Re: Benchmarking.. how can I get more out of my box?
Brian Pomerantz wrote:
> On Wed, Mar 08, 2000 at 06:52:52PM -, Matthew Clark wrote:
> > Hmm.. well you may think 26Mb/Sec is poor for writing.. I would be drooling at such vast speeds.. Would you mind telling me how you set up your raid array (i.e. policies) and filesystem (inodes, block sizes, strides etc)...I'm seeing 2M/b per sec on writes and only 16-17mb/sec on reads.. sequential or random!!!
> I believe for the 26MB/s across two chains I set up the Mylex board to have 64KB stripe and 64KB segment size. For ext2, I modified it to use 8KB block size (which is the page size on Alpha; Ted says that won't work on Intel). So I called mke2fs like this:
>
> mke2fs -b 8192 -R stride=8 -i 16384 -s 1 /dev/rd/c0d0p1
>
> I also use write-back, which increased the performance a bit.

I've done a little testing with the Mylex 1164 to establish a baseline. These comments are based on bonnie sequential numbers. I am not using as many drives as Brian, so I am not controller limited.

RAID5: the writeback vs. writethru delta is about a 33% improvement for a 32MB cache. A segment size of 8K vs. 64K is about a 10% improvement.

The slight improvement that results from reducing the segment size in the Mylex controller indicates to me that there remains a lot of room for tuning in Linux for sequential transfers.

-- 
Dan Jones, Storage Engineer
VA Linux Systems
V:(408)542-5737 F:(408)745-9130
1382 Bordeaux Drive, Sunnyvale, CA 94089
[EMAIL PROTECTED]
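A note on the stride value in that mke2fs line: it is simply the RAID chunk (segment) size divided by the filesystem block size, so ext2 can lay its block bitmaps out to avoid hammering one disk. A tiny helper, offered as my own illustration rather than anything from mke2fs itself:

```python
def ext2_stride(chunk_size_bytes, block_size_bytes):
    """stride = RAID chunk (segment) size / filesystem block size."""
    if chunk_size_bytes % block_size_bytes:
        raise ValueError("chunk size should be a multiple of the block size")
    return chunk_size_bytes // block_size_bytes

# 64KB segments with 8KB ext2 blocks give the stride=8 used above:
# ext2_stride(64 * 1024, 8 * 1024) -> 8
```

With the 4KB blocks an Intel box would use instead, the same 64KB segment would call for stride=16.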
question on raid
I tried to set up RAID on my Linux box; however, it did not work. I am trying to set up linear mode to expand my drive. I did exactly what is said in the How-to doc. Then I ran "mkraid /dev/md0". It returns:

Destorying the contents of the /dev/md0 in 5 seconds..
Handling MD device /dev/md0
analyzing super-block
disk 0: /dev/hda6 .
disk 1: /dev/hdb1 .
/dev/md0 Invalid argument

What did I do wrong? I am running RedHat Linux 6.0 with kernel 2.2.5-15. I read another FAQ saying kernel 2.2.x supports linear mode without a patch, so I didn't patch the kernel. (Actually, I don't know how to patch the kernel.) Can you help?
Re: question on raid
Hi:

Benny HO [EMAIL PROTECTED] wrote:
> I tried to setup up raid on my linux. However it did not work. I am trying to setup a linear mode to expand my drive. I did exactly what is said in the How-to doc. Then I run "mkraid /dev/md0". It returns
> Destorying the contents of the /dev/md0 in 5 seconds..
> Handling MD device /dev/md0
> analyzing super-block
> disk 0: /dev/hda6 .
> disk 1: /dev/hdb1 .
> /dev/md0 Invalid argument
> What did I do wrong?

Did you do "insmod raid0" before mkraid?

> I am running RedHat Linux 6.0 with kernel 2.2.5-15. I read other faq saying kernel 2.2.x support linear without patch. So I didn't patch the kernel. (actually I don't know how to patch the kernel) Can you help?

=---=---=---=---=---=---=---=---=---=---=---=---=---=---=---=---=
Koichi Kawabata
ThirdWare Co., Ltd.
E-Mail: [EMAIL PROTECTED]
URL: http://www.3ware.co.jp/
=---=---=---=---=---=---=---=---=---=---=---=---=---=---=---=---=
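For what it's worth, an "Invalid argument" from mkraid on a linear array often comes down to the /etc/raidtab or the missing personality. A minimal linear-mode raidtab for the two partitions mentioned might look roughly like this (my guess at the layout, not taken from Benny's actual config):

```
raiddev /dev/md0
        raid-level            linear
        nr-raid-disks         2
        persistent-superblock 1
        device                /dev/hda6
        raid-disk             0
        device                /dev/hdb1
        raid-disk             1
```

Also note that if MD support is built as modules, the module for linear mode should be linear.o ("insmod linear") rather than raid0.o; checking /proc/mdstat for the list of supported personalities is a quick way to see what the running kernel actually has.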
Re: Benchmarking.. how can I get more out of my box?
On Wed, 08 Mar 2000, Brian Pomerantz wrote:
> On Thu, Mar 09, 2000 at 12:44:32AM +0100, Jakob Østergaard wrote:
> > You don't _think_ you would see better performance? I'm pretty sure you will see better performance. But on the other hand, with a large number of disks, sometimes the hot-swap capability comes in handy, and sometimes it's just nice to have a red light flashing next to the disk that died. Hardware RAID certainly still has its niche :) - it's just usually not the performance one.
> If there isn't hot-swap RAID 5 with auto rebuild, it will never happen.

I meant: SW usually has really good performance, not that HW is unsuitable for HPC. Anyway, I had the impression that you were looking for the highest possible performance from RAID-0 sets. I see from your reply that I misunderstood the proportions of your storage needs.

Sure, with external storage solutions it may still be by far the easiest to use a simple interconnect and let the external disk solution take care of the RAID logic. And given what one will have to pay for this anyway, going SW is probably not the way to cut costs in half. SW RAID is beautiful for a handful or three of disks, but when you're working with hundreds of disks the administrative costs of not-so-flexible-if-any hotswap are a killer.

I still maintain that it would be interesting to see software RAID-0 on systems of this size though, as hotswap usually doesn't matter so much on RAID-0 anyway. That will be a 2.4 task though, as the SW RAID in 2.2 is not able to handle this number of disks.

> It would be nice if a program such as ASCI could put the resources needed into Linux to actually get decent hot swap capability...

Let's see what happens.

Cheers,
-- 
: [EMAIL PROTECTED] : And I see the elder races, : :.: putrid forms of man: : Jakob Østergaard : See him rise and claim the earth, : :OZ9ABN : his downfall is at hand. : :.:{Konkhra}...: