Re: [CentOS] CVE-2015-0235 - glibc gethostbyname
At 15:09 -0800 28/1/15, David C. Miller wrote: Although I hate Oracle with a fury, one good thing is that they put all the updates they rebuild for their RHEL clone on a publicly viewable site. I'm guessing they pay Red Hat for extended support on the end-of-life RHEL4 to get access to the source rpms. I learned about this from another list member back when the Bash Shellshock exploit hit. http://public-yum.oracle.com/repo/EnterpriseLinux/EL4/latest/ Thanks David, I wasn't aware of that resource. Regards S. ___ CentOS mailing list CentOS@centos.org http://lists.centos.org/mailman/listinfo/centos
Re: [CentOS] CVE-2015-0235 - glibc gethostbyname
Hi, For reasons which are too tiresome to bore you all with, I have an obligation to look after a suite of legacy CentOS 4.x systems which cannot be migrated upwards. I note on https://access.redhat.com/articles/1332213 the following comment from a RHN person: We are currently working on and testing errata for RHEL 4, we will post an update for it as soon as it's ready. Thank you for your patience! Is there *any* prospect of updated glibc packages for CentOS 4.x being made available? Cheers S. ___ CentOS mailing list CentOS@centos.org http://lists.centos.org/mailman/listinfo/centos
[CentOS] Virtualising legacy CentOS 4.x servers
Dear all, I look after a number of CentOS 4.x servers running legacy applications that depend on ancient versions of various things (such as MySQL 3.x) and which can't be upgraded without non-trivial development effort. I've been considering virtualising them and as a test have been trialling with a company that uses Parallels Cloud Server 6. However, I've run into a roadblock in that the Parallels Tools installer in PCS6 requires a version of glibc higher than that which is available in CentOS 4.x (v2.5 required versus v2.3.4 installed). Without the guest OS tools installed it's impossible to migrate a VM from node to node or back it up without shutting the VM down first, which is less than useful. So I have two questions: 1) Does anyone know if there is a version of the PCS6 Tools built against glibc 2.3.4 available anywhere? 2) Is there an alternative virtualisation environment I should be looking at which fully supports CentOS 4.x as a guest OS? And if so, does anyone have recommendations for a hosting supplier that offers that environment (ideally UK-based). Many thanks Simon ___ CentOS mailing list CentOS@centos.org http://lists.centos.org/mailman/listinfo/centos
Re: [CentOS] Virtualising legacy CentOS 4.x servers
At 12:58 -0500 14/5/14, Les Mikesell wrote: If you are running physical machines now, you don't have that ability anyway... True, but that's a reason to try and migrate to a better environment which would allow it. Does it have to be hosted? You could run under KVM/Virtualbox/Vmware, etc. on your own hardware. Yes, it has to be hosted. Aiming to get away from having to own physical hardware with all that entails support-wise. S. ___ CentOS mailing list CentOS@centos.org http://lists.centos.org/mailman/listinfo/centos
Re: [CentOS] Inquiry:How to compare two files but not in line-by-line basis?
At 08:54 + 2/12/09, hadi motamedi wrote: Dear All Can you please do me favor and let me know how can I compare two files but not in line-by-line basis on my CentOS server ? I mean say row#1 in file1 has the same data as say row#5 in file2 , but the comm compares them in line-by-line basis that is not intended . It seems that the diff cannot do the job as well This'll show you which lines are common to both files, and for the ones that aren't which file they're in. perl -MData::Dumper -le 'while (<>) { chomp; push @{$s->{$_}}, $ARGV }; END { print Dumper($s) }' file1 file2 ... someone will be along shortly with a more elegant method. HTH S. ___ CentOS mailing list CentOS@centos.org http://lists.centos.org/mailman/listinfo/centos
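[Editorial note: the more elegant method alluded to above is usually sort plus comm. A minimal sketch, with throwaway sample files standing in for the poster's file1/file2:]

```shell
# Two sample files whose common lines sit on different row numbers.
printf 'alpha\nbeta\ngamma\n'  > file1
printf 'gamma\ndelta\nalpha\n' > file2

# comm needs sorted input; it then classifies each line as unique to
# file1 (column 1), unique to file2 (column 2), or common (column 3).
sort file1 > file1.sorted
sort file2 > file2.sorted
comm -12 file1.sorted file2.sorted   # lines in both: alpha, gamma
comm -23 file1.sorted file2.sorted   # lines only in file1: beta
```

With bash, the temporary files can be avoided via process substitution: comm -12 <(sort file1) <(sort file2).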
Re: [CentOS] How to clone CentOS server ?
At 12:43 +0200 26/8/09, przemol...@poczta.fm wrote: Hello, I'd like to clone existing CentOS server. Can anybody recommend any working solution to achieve that ? I've used the dd + netcat + live CD technique with success in the past eg: http://alma.ch/blogs/bahut/2005/02/wonders-of-dd-and-netcat-cloning-os.html Cheers Simon ___ CentOS mailing list CentOS@centos.org http://lists.centos.org/mailman/listinfo/centos
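[Editorial note: the technique in the linked article can be sketched as below, demonstrated here on file-backed images so it's safe to try; on real hardware the images would be /dev/sdX devices, ideally with the source machine booted from a live CD so its filesystems are quiescent.]

```shell
# Stand-in "disk": 16 x 64 KiB of random data.
dd if=/dev/urandom of=disk.img  bs=64k count=16 2>/dev/null
# Block-level copy, exactly as dd would clone a whole device.
dd if=disk.img    of=clone.img bs=64k          2>/dev/null
cmp disk.img clone.img && echo "clone is identical"

# Across the network, dd is simply piped through netcat
# (traditional netcat syntax; host name is a placeholder):
#   on the target:  nc -l -p 9000 | dd of=/dev/sdX bs=1M
#   on the source:  dd if=/dev/sdX bs=1M | nc TARGET_HOST 9000
```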
Re: [CentOS] CentOS, PHP, Basic GIS
At 23:44 -0800 22/12/08, Michael A. Peters wrote: Thanks for any suggestions. I may try to find a GIS for dummies type book, though I've generally not been fond of dummy books, I kind of feel like one when it comes to GIS. Hi Michael, If you get no satisfactory answers here, you might try talking to the Antiquist group (http://www.antiquist.org/ and http://groups.google.com/group/antiquist) who work extensively with GIS and open source tools. Regards Simon ___ CentOS mailing list CentOS@centos.org http://lists.centos.org/mailman/listinfo/centos
Re: [CentOS] ClamAV help needed
Every day I see in logwatch that my signatures are updated, and the database notified, but if I try to scan a file manually it tells me that my signatures are 55 days old. I think clamscan looks for the db files in a compiled-in default location of /usr/local/share/clamav and doesn't consult the clamd.conf or freshclam.conf files (after all, why would it?). I fixed it up by symlinking my configured DatabaseDirectory to where clamscan expected to find things. HTH Simon ___ CentOS mailing list CentOS@centos.org http://lists.centos.org/mailman/listinfo/centos
Re: [CentOS] ClamAV help needed
At 14:48 +0200 17/6/08, Ralph Angenendt wrote: It doesn't here: Is your copy installed from rpm/yum or compiled from source? Mine's the latter. S. ___ CentOS mailing list CentOS@centos.org http://lists.centos.org/mailman/listinfo/centos
Re: [CentOS] ClamAV help needed
At 16:43 +0200 17/6/08, Ralph Angenendt wrote: Is your copy installed from rpm/yum or compiled from source? Mine's the latter. rpmforge. Ah - looking more deeply, my source was configured without --with-dbdir=/var/lib/clamav which is why it defaulted to looking in /usr/local/share/clamav Now rebuilt with the --with-dbdir option, and everything's looking in the correct place. S. ___ CentOS mailing list CentOS@centos.org http://lists.centos.org/mailman/listinfo/centos
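[Editorial note: the symlink fix from this thread can be sketched as follows. The demo runs in a scratch directory because the real link, ln -s /var/lib/clamav /usr/local/share/clamav, needs root; the real paths are the ones discussed above.]

```shell
# Demonstrate the fix in a scratch directory.
scratch=$(mktemp -d)
updated="$scratch/var-lib-clamav"              # where freshclam writes
compiled_in="$scratch/usr-local-share-clamav"  # where clamscan looks
mkdir -p "$updated"
ln -sfn "$updated" "$compiled_in"
readlink "$compiled_in"   # resolves to the directory freshclam updates

# Alternatively, point clamscan at the live database per invocation:
#   clamscan -d /var/lib/clamav FILE
```

Rebuilding with --with-dbdir, as described above, is the cleaner permanent fix for source installs.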
RE: [CentOS] 3Ware 9550SX and latency/system responsiveness
At 13:49 -0400 2/10/07, Ross S. W. Walker wrote: Sounds like the issue is more of a CPU issue than a disk issue, so just upgrading the hardware and OS should make a big difference in itself, Yeah, that was the plan :-) Basically, we worked out what we needed to do (alleviate peak load CPU bottleneck by upgrading hardware), sought what we imagined would be suitable (dual faster CPU, hardware RAID 1, lots of RAM), and then ran into a brick wall with disk performance while testing - something that's never been an issue to date on the existing webservers which have a single IDE disk each. but I would profile the SQL queries to make sure they are not trying to bite off more than they need to. Fair point - we've done a lot of database tuning in the 5 years this app's been under development, so that's pretty well covered. With the existing hardware, (the back-end dbserver's a 1GB 1.6GHz P4 with mdadm RAID 1) the dbserver load barely reaches 1 even under peak traffic - we're not SQL- or IO-bound, we're CPU-bound on the front end. Well when you created the file system the write cache wasn't installed yet right? True, but there have been many wipes and installs since the BBUs have been available and the same long pauses when the inodes are created (much more noticeable with CentOS 4.5 than 5, but then the default nr_requests is 128 in 5 rather than 8192 in 4.5) that initially drew my attention are still apparent. And it may be that when you were creating the file system it was right after you created the RAID1 array and the controller may have been still sync'ing up the disks, which will slow things down tremendously. I noted that from the documentation at the outset and did an initial verify of the RAID 1 through the 3ware BIOS before doing the original install. 
A previous life as a technical author makes me a bit of a RTFM freak :-) I agree that it is the edge cases that can come back and bite you, just be sure you don't over-scope those edge cases for situations that will never arise. That's why I'm now building the machine as if there wasn't an issue, so I can hammer it with apachebench and see if I'm tilting at windmills or not. S. ___ CentOS mailing list CentOS@centos.org http://lists.centos.org/mailman/listinfo/centos
Re: [CentOS] 3Ware 9550SX and latency/system responsiveness
At 12:30 +0200 2/10/07, matthias platzer wrote: What I did to work around them was basically switching to XFS for everything except / (3ware say their cards are fast, but only on XFS) AND using very low nr_requests for every blockdev on the 3ware card. Hi Matthias, Thanks for this. In my CentOS 5 tests the nr_requests turned out by default to be 128, rather than the 8192 of CentOS 4.5. I'll have a go at reducing it still further. If you can, you could also try _not_ putting the system disks on the 3ware card, because additionally the 3ware driver/card gives writes priority. I've noticed that kicking off a simultaneous pair of dd reads and writes from/to the RAID 1 array indicates that very clearly - only with cfq as the elevator did reads get any kind of look-in. Sadly, I'm not able to separate the system disks off as there's no on-board SATA on the mboard nor any room for inboard disks; the original intention was to provide the resilience of hardware RAID 1 for the entire machine. People suggested the unresponsive system behaviour is because the cpu hanging in iowait for writing and then reading the system binaries won't happen till the writes are done, so the binaries should be on another io path. Yup, that certainly seems to be what's happening. Wish I had another io path... All this seem to be symptoms of a very complex issue consisting of kernel bugs/bad drivers/... and they seem to be worst on a AMD/3ware Combination. here is another link: http://bugzilla.kernel.org/show_bug.cgi?id=7372 Ouch - thanks for that link :-( Looks like I'm screwed big time. S. ___ CentOS mailing list CentOS@centos.org http://lists.centos.org/mailman/listinfo/centos
RE: [CentOS] 3Ware 9550SX and latency/system responsiveness
At 09:24 -0400 2/10/07, Ross S. W. Walker wrote: Actually the real-real fix was to use the 'deadline' or 'noop' scheduler with this card as the default 'cfq' scheduler was designed to work with a single drive and not a multiple drive RAID, so it acts as a governor on the amount of IO that a single process can send to the device and when you do multiple overlapping IOs performance decreases instead of increases. Ah - that wasn't actually a complete fix, Ross, but it did give a noticeable improvement in certain situations. I'm still chasing a real real 'general purpose' fix. S. ___ CentOS mailing list CentOS@centos.org http://lists.centos.org/mailman/listinfo/centos
RE: [CentOS] 3Ware 9550SX and latency/system responsiveness
What is the recurring performance problem you are seeing? Pretty much exactly the symptoms described in http://bugzilla.kernel.org/show_bug.cgi?id=7372 relating to read starvation under heavy write IO causing sluggish system response. I recently graphed the blocks in/blocks out from vmstat 1 for the same test using each of the four IO schedulers (see the PDF attached to the article below): http://community.novacaster.com/showarticle.pl?id=7492 The test was: dd if=/dev/sda of=/dev/null bs=1M count=4096 ; sleep 5; dd if=/dev/zero of=./4G bs=1M count=4096 Despite appearances, interactive responsiveness subjectively felt better using deadline than cfq - but this is obviously an atypical workload and so now I'm focusing on finishing building the machine completely so I can try profiling the more typical patterns of activity that it'll experience when in use. I find myself wondering whether the fact that the array looks like a single SCSI disk to the OS means that cfq is able to perform better in terms of interleaving reads and writes to the card but that some side effect of its work is causing the responsiveness issue at the same time. Pure speculation on my part - this is way outside my experience. I'm also looking into trying an Areca card instead (avoiding LSI because they're cited as having the same issue in the bugzilla mentioned above). S. ___ CentOS mailing list CentOS@centos.org http://lists.centos.org/mailman/listinfo/centos
RE: [CentOS] 3Ware 9550SX and latency/system responsiveness
At 12:41 -0400 2/10/07, Ross S. W. Walker wrote: If the performance issue is identical to the kernel bug mentioned in the posting then the only real fix that was mentioned was to switch to 32bit from 64bit or to down-rev your kernel, which on CentOS means to go down to 4.5 from 5.0. The irony is that I'm already running 32bit[*], and that the responsiveness problem's worse on 4.5. S. * we specifically went for the Opteron 250 so we could stay at 32-bit because some software components we need to use may not yet be 64bit clean. The intention was to migrate later to 64bit on the same hardware, once those wrinkles had been ironed out. ___ CentOS mailing list CentOS@centos.org http://lists.centos.org/mailman/listinfo/centos
RE: [CentOS] 3Ware 9550SX and latency/system responsiveness
At 13:03 -0400 2/10/07, Ross S. W. Walker wrote: Have you tried calculating the performance of your current drives on paper to see if it matches your reality? It may just be that your disks suck... They're performing to spec for 7200rpm SATA II drives - your help in determining which was the appropriate elevator to use showed that. What is the server going to be doing? What is the workload of your application? Originally, it was going to be hosting a number of VMWare installations each containing a separate self contained LAMP website (for ease of subsequent migration), but that's gone by the board in favour of dispensing with the VMWare aspect. Now the websites will be NameVhosts under a single Apache directly on the native OS. The app on each website is MySQL-backed and Perl CGI intensive. DB intended to be on a separate (identical) server. All running swimmingly at present on 4 year old single 1.6GHz P4s with single IDE disks, 512MB RAM and RH7.3 - except at peak times when they're a bit CPU bound. Loadave rarely above 1 or 2 most of the time. Which is why I'm now focused on getting the non-VMWare approach up and running so I can profile it, instead of getting hung up on benchmarking the empty hardware. I'd never have started if I'd not noticed a terrific slowdown halfway through creating the filesystem when doing an initial CentOS 4.3 install many many weeks ago. It may be that it will work fine for what you need it to do? Yeah - but it's the edge cases that bite you. Can't be doing with a production server where it's possible to accidentally step on an indeterminate trigger that sends responsiveness into a nosedive. S. ___ CentOS mailing list CentOS@centos.org http://lists.centos.org/mailman/listinfo/centos
RE: [CentOS] 3Ware 9550SX and latency/system responsiveness
At 09:14 -0400 26/9/07, Ross S. W. Walker wrote: Could you try the benchmarks with the 'deadline' scheduler? OK, these are all with RHEL5, driver 2.26.06.002-2.6.18, RAID 1:

elevator=deadline:

Sequential reads:
| 2007/09/26-16:19:30 | START | 3065 | v1.2.8 | /dev/sdb | Start args: -B 4k -h 1 -I BD -K 4 -p l -P T -T 30 -r (-N 488259583) (-c) (-p u)
| 2007/09/26-16:20:00 | STAT | 3065 | v1.2.8 | /dev/sdb | Total read throughput: 45353642.7B/s (43.25MB/s), IOPS 11072.7/s.

Sequential writes:
| 2007/09/26-16:20:00 | START | 3082 | v1.2.8 | /dev/sdb | Start args: -B 4k -h 1 -I BD -K 4 -p l -P T -T 30 -w (-N 488259583) (-c) (-p u)
| 2007/09/26-16:20:30 | STAT | 3082 | v1.2.8 | /dev/sdb | Total write throughput: 53781186.2B/s (51.29MB/s), IOPS 13130.2/s.

Random reads:
| 2007/09/26-16:20:30 | START | 3091 | v1.2.8 | /dev/sdb | Start args: -B 4k -h 1 -I BD -K 4 -p r -P T -T 30 -r (-N 488259583) (-c) (-D 100:0)
| 2007/09/26-16:21:00 | STAT | 3091 | v1.2.8 | /dev/sdb | Total read throughput: 545587.2B/s (0.52MB/s), IOPS 133.2/s.

Random writes:
| 2007/09/26-16:21:00 | START | 3098 | v1.2.8 | /dev/sdb | Start args: -B 4k -h 1 -I BD -K 4 -p r -P T -T 30 -w (-N 488259583) (-c) (-D 0:100)
| 2007/09/26-16:21:44 | STAT | 3098 | v1.2.8 | /dev/sdb | Total write throughput: 795852.8B/s (0.76MB/s), IOPS 194.3/s.

Here are the others for comparison.

elevator=noop:

Sequential reads:
| 2007/09/26-16:24:02 | START | 3167 | v1.2.8 | /dev/sdb | Start args: -B 4k -h 1 -I BD -K 4 -p l -P T -T 30 -r (-N 488259583) (-c) (-p u)
| 2007/09/26-16:24:32 | STAT | 3167 | v1.2.8 | /dev/sdb | Total read throughput: 45467374.9B/s (43.36MB/s), IOPS 11100.4/s.

Sequential writes:
| 2007/09/26-16:24:32 | START | 3176 | v1.2.8 | /dev/sdb | Start args: -B 4k -h 1 -I BD -K 4 -p l -P T -T 30 -w (-N 488259583) (-c) (-p u)
| 2007/09/26-16:25:02 | STAT | 3176 | v1.2.8 | /dev/sdb | Total write throughput: 53825672.5B/s (51.33MB/s), IOPS 13141.0/s.

Random reads:
| 2007/09/26-16:25:03 | START | 3193 | v1.2.8 | /dev/sdb | Start args: -B 4k -h 1 -I BD -K 4 -p r -P T -T 30 -r (-N 488259583) (-c) (-D 100:0)
| 2007/09/26-16:25:32 | STAT | 3193 | v1.2.8 | /dev/sdb | Total read throughput: 540954.5B/s (0.52MB/s), IOPS 132.1/s.

Random writes:
| 2007/09/26-16:25:32 | START | 3202 | v1.2.8 | /dev/sdb | Start args: -B 4k -h 1 -I BD -K 4 -p r -P T -T 30 -w (-N 488259583) (-c) (-D 0:100)
| 2007/09/26-16:26:16 | STAT | 3202 | v1.2.8 | /dev/sdb | Total write throughput: 795989.3B/s (0.76MB/s), IOPS 194.3/s.

elevator=anticipatory:

Sequential reads:
| 2007/09/26-16:37:04 | START | 3277 | v1.2.8 | /dev/sdb | Start args: -B 4k -h 1 -I BD -K 4 -p l -P T -T 30 -r (-N 488259583) (-c) (-p u)
| 2007/09/26-16:37:34 | STAT | 3277 | v1.2.8 | /dev/sdb | Total read throughput: 45414126.9B/s (43.31MB/s), IOPS 11087.4/s.

Sequential writes:
| 2007/09/26-16:37:35 | START | 3284 | v1.2.8 | /dev/sdb | Start args: -B 4k -h 1 -I BD -K 4 -p l -P T -T 30 -w (-N 488259583) (-c) (-p u)
| 2007/09/26-16:38:04 | STAT | 3284 | v1.2.8 | /dev/sdb | Total write throughput: 53895168.0B/s (51.40MB/s), IOPS 13158.0/s.

Random reads:
| 2007/09/26-16:38:04 | START | 3293 | v1.2.8 | /dev/sdb | Start args: -B 4k -h 1 -I BD -K 4 -p r -P T -T 30 -r (-N 488259583) (-c) (-D 100:0)
| 2007/09/26-16:38:34 | STAT | 3293 | v1.2.8 | /dev/sdb | Total read throughput: 467080.5B/s (0.45MB/s), IOPS 114.0/s.

Random writes:
| 2007/09/26-16:38:34 | START | 3300 | v1.2.8 | /dev/sdb | Start args: -B 4k -h 1 -I BD -K 4 -p r -P T -T 30 -w (-N 488259583) (-c) (-D 0:100)
| 2007/09/26-16:39:18 | STAT | 3300 | v1.2.8 | /dev/sdb | Total write throughput: 793122.1B/s (0.76MB/s), IOPS 193.6/s.

elevator=cfq (just to re-check):

Sequential reads:
| 2007/09/26-16:42:18 | START | 3353 | v1.2.8 | /dev/sdb | Start args: -B 4k -h 1 -I BD -K 4 -p l -P T -T 30 -r (-N 488259583) (-c) (-p u)
| 2007/09/26-16:42:48 | STAT | 3353 | v1.2.8 | /dev/sdb | Total read throughput: 2463470.9B/s (2.35MB/s), IOPS 601.4/s.

Sequential writes:
| 2007/09/26-16:42:48 | START | 3360 | v1.2.8 | /dev/sdb | Start args: -B 4k -h 1 -I BD -K 4 -p l -P T -T 30 -w (-N 488259583) (-c) (-p u)
| 2007/09/26-16:43:18 | STAT | 3360 | v1.2.8 | /dev/sdb | Total write throughput: 54572782.9B/s (52.04MB/s), IOPS 13323.4/s.

Random reads:
| 2007/09/26-16:43:19 | START | 3369 | v1.2.8 | /dev/sdb | Start args: -B 4k -h 1 -I BD -K 4 -p r -P T -T 30 -r (-N 488259583) (-c) (-D 100:0)
| 2007/09/26-16:43:48 | STAT | 3369 | v1.2.8 | /dev/sdb | Total read throughput: 267652.4B/s (0.26MB/s), IOPS 65.3/s.

Random writes:
| 2007/09/26-16:43:48 | START | 3376 | v1.2.8 | /dev/sdb | Start args: -B 4k -h 1 -I BD -K 4 -p r -P T -T 30 -w (-N 488259583) (-c) (-D 0:100)
| 2007/09/26-16:44:31 | STAT | 3376 | v1.2.8 | /dev/sdb | Total write throughput: 793122.1B/s (0.76MB/s), IOPS 193.6/s.

Certainly cfq is severely cramping the reads, it appears. S. ___ CentOS mailing list
RE: [CentOS] 3Ware 9550SX and latency/system responsiveness
At 12:01 -0400 26/9/07, Ross S. W. Walker wrote: CFQ is intended for single disk workstations and it's io limits are based on that, so it actually acts as an io govenor on RAID setups. Only use 'cfq' on single disk workstations. Use 'deadline' on RAID setups and servers. Many thanks Ross, that's one variable tied down at least :-) S. ___ CentOS mailing list CentOS@centos.org http://lists.centos.org/mailman/listinfo/centos
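[Editorial note: to make the scheduler choice survive reboots on RHEL/CentOS of this era, the usual approach is the elevator= kernel parameter. A sketch of a /boot/grub/grub.conf stanza; the kernel version and root device below are illustrative only:]

```
title CentOS (2.6.18-8.el5)
        root (hd0,0)
        kernel /vmlinuz-2.6.18-8.el5 ro root=/dev/VolGroup00/LogVol00 elevator=deadline
        initrd /initrd-2.6.18-8.el5.img
```

On CentOS 5's 2.6.18 kernel the elevator can also be switched per device at runtime with, for example, echo deadline > /sys/block/sdb/queue/scheduler; CentOS 4's 2.6.9 kernel predates runtime switching, so there the boot parameter is the way to do it.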
RE: [CentOS] 3Ware 9550SX and latency/system responsiveness
At 13:35 -0400 24/9/07, Ross S. W. Walker wrote: Ok, so here is the command I would use: Thanks - here are the results (tried CentOS 4.5 and RHEL5, with tests on sdb when configured as both RAID 0 and as RAID 1):

Sequential reads: disktest -B 4k -h 1 -I BD -K 4 -p l -P T -T 300 -r /dev/sdX

CentOS 4.5, RAID 0:
| 2007/09/25-14:26:58 | STAT | 13944 | v1.2.8 | /dev/sdb | Total read throughput: 50249728.0B/s (47.92MB/s), IOPS 12268.0/s.
| 2007/09/25-14:26:58 | END | 13944 | v1.2.8 | /dev/sdb | Test Done (Passed)

CentOS 4.5, RAID 1:
| 2007/09/25-14:20:06 | STAT | 13807 | v1.2.8 | /dev/sdb | Total read throughput: 44994150.4B/s (42.91MB/s), IOPS 10984.9/s.
| 2007/09/25-14:20:06 | END | 13807 | v1.2.8 | /dev/sdb | Test Done (Passed)

RHEL5, RAID 0:
| 2007/09/25-11:07:46 | STAT | 2835 | v1.2.8 | /dev/sdb | Total read throughput: 2405171.2B/s (2.29MB/s), IOPS 587.2/s.
| 2007/09/25-11:07:46 | END | 2835 | v1.2.8 | /dev/sdb | Test Done (Passed)

RHEL5, RAID 1:
| 2007/09/25-11:35:53 | STAT | 3022 | v1.2.8 | /dev/sdb | Total read throughput: 2461696.0B/s (2.35MB/s), IOPS 601.0/s.
| 2007/09/25-11:35:53 | END | 3022 | v1.2.8 | /dev/sdb | Test Done (Passed)

Sequential writes: disktest -B 4k -h 1 -I BD -K 4 -p l -P T -T 300 -w /dev/sdX

CentOS 4.5, RAID 0:
| 2007/09/25-14:28:19 | STAT | 13951 | v1.2.8 | /dev/sdb | Total write throughput: 66150946.1B/s (63.09MB/s), IOPS 16150.1/s.
| 2007/09/25-14:28:19 | END | 13951 | v1.2.8 | /dev/sdb | Test Done (Passed)

CentOS 4.5, RAID 1:
| 2007/09/25-14:21:52 | STAT | 13815 | v1.2.8 | /dev/sdb | Total write throughput: 53170039.5B/s (50.71MB/s), IOPS 12981.0/s.
| 2007/09/25-14:21:52 | END | 13815 | v1.2.8 | /dev/sdb | Test Done (Passed)

RHEL5, RAID 0:
| 2007/09/25-11:13:44 | STAT | 2850 | v1.2.8 | /dev/sdb | Total write throughput: 66031616.0B/s (62.97MB/s), IOPS 16121.0/s.
| 2007/09/25-11:13:44 | END | 2850 | v1.2.8 | /dev/sdb | Test Done (Passed)

RHEL5, RAID 1:
| 2007/09/25-11:36:36 | STAT | 3031 | v1.2.8 | /dev/sdb | Total write throughput: 56870229.3B/s (54.24MB/s), IOPS 13884.3/s.
| 2007/09/25-11:36:36 | END | 3031 | v1.2.8 | /dev/sdb | Test Done (Passed)

Random reads: disktest -B 4k -h 1 -I BD -K 4 -p r -P T -T 300 -r /dev/sdX

CentOS 4.5, RAID 0:
| 2007/09/25-14:28:59 | STAT | 13958 | v1.2.8 | /dev/sdb | Total read throughput: 504217.6B/s (0.48MB/s), IOPS 123.1/s.
| 2007/09/25-14:28:59 | END | 13958 | v1.2.8 | /dev/sdb | Test Done (Passed)

CentOS 4.5, RAID 1:
| 2007/09/25-14:23:14 | STAT | 13822 | v1.2.8 | /dev/sdb | Total read throughput: 549570.2B/s (0.52MB/s), IOPS 134.2/s.
| 2007/09/25-14:23:14 | END | 13822 | v1.2.8 | /dev/sdb | Test Done (Passed)

RHEL5, RAID 0:
| 2007/09/25-11:16:21 | STAT | 2875 | v1.2.8 | /dev/sdb | Total read throughput: 273612.8B/s (0.26MB/s), IOPS 66.8/s.
| 2007/09/25-11:16:21 | END | 2875 | v1.2.8 | /dev/sdb | Test Done (Passed)

RHEL5, RAID 1:
| 2007/09/25-11:39:20 | STAT | 3042 | v1.2.8 | /dev/sdb | Total read throughput: 546816.0B/s (0.52MB/s), IOPS 133.5/s.
| 2007/09/25-11:39:20 | END | 3042 | v1.2.8 | /dev/sdb | Test Done (Passed)

Random writes: disktest -B 4k -h 1 -I BD -K 4 -p r -P T -T 300 -w /dev/sdX

CentOS 4.5, RAID 0:
| 2007/09/25-14:29:34 | STAT | 13965 | v1.2.8 | /dev/sdb | Total write throughput: 1379532.8B/s (1.32MB/s), IOPS 336.8/s.
| 2007/09/25-14:29:34 | END | 13965 | v1.2.8 | /dev/sdb | Test Done (Passed)

CentOS 4.5, RAID 1:
| 2007/09/25-14:24:15 | STAT | 13829 | v1.2.8 | /dev/sdb | Total write throughput: 782199.5B/s (0.75MB/s), IOPS 191.0/s.
| 2007/09/25-14:24:15 | END | 13829 | v1.2.8 | /dev/sdb | Test Done (Passed)

RHEL5, RAID 0:
| 2007/09/25-11:19:21 | STAT | 2894 | v1.2.8 | /dev/sdb | Total write throughput: 1377894.4B/s (1.31MB/s), IOPS 336.4/s.
| 2007/09/25-11:19:21 | END | 2894 | v1.2.8 | /dev/sdb | Test Done (Passed)

RHEL5, RAID 1:
| 2007/09/25-11:40:08 | STAT | 3049 | v1.2.8 | /dev/sdb | Total write throughput: 798310.4B/s (0.76MB/s), IOPS 194.9/s.
| 2007/09/25-11:40:08 | END | 3049 | v1.2.8 | /dev/sdb | Test Done (Passed)

I'm not sure what to make of it, mind you. Cheers S. ___ CentOS mailing list CentOS@centos.org http://lists.centos.org/mailman/listinfo/centos
Re: [CentOS] 3Ware 9550SX and latency/system responsiveness
At 07:46 +0800 24/9/07, Feizhou wrote: ... plus an Out of Memory kill of sshd. Second time around (logged in on the console rather than over ssh), it's just the same except it's hald that happens to get clobbered instead. Are you saying that running in RAID0 mode with this card and motherboard combination, you get a memory leak? Who is the culprit? I don't know if it's caused by a memory leak or something else, I'm just describing what happens. I would be tempted to suspect the RAM itself if another identical machine didn't have exactly the same issue. what's left to try? Bug report... I've reported the issue to 3ware but they've not responded. I replicated the problem with RHEL AS 4 update 5 and contacted RedHat but they told me evaluation subscriptions aren't supported. I see there's a new firmware version out today (3ware codeset 9.4.1.3...) I guess I'll update it and push the whole thing back up the hill for another go. I hope that fixes things for you. Maybe I'm thinking about this all wrong - maybe this responsiveness issue won't even arise during normal operation, perhaps it's just a symptom of intensive benchmarking when all the resources of the machine are devoted to throwing data at the card/disks as fast as possible. I'm now way out of my depth, frankly. I'm going to try the latest firmware upgrade, followed by RHEL/CentOS 5, and finally see if I can replicate with a different card (Areca or LSI, perhaps). Thanks for all the feedback, at least I feel as if I've tried every conceivable obvious thing. S. ___ CentOS mailing list CentOS@centos.org http://lists.centos.org/mailman/listinfo/centos
RE: [CentOS] 3Ware 9550SX and latency/system responsiveness
At 10:04 -0400 24/9/07, Ross S. W. Walker wrote: How about trying your benchmarks with the 'disktest' utility from the LTP (Linux Test Project), Now fetched and installed - I'd be grateful for a suggestion as to an appropriate disktest command line for a 4GB RAM twin CPU box with 250GB RAID 1 array, because I think you had your tongue in your cheek when you said: it is also a lot easier to setup and use. S. ___ CentOS mailing list CentOS@centos.org http://lists.centos.org/mailman/listinfo/centos
Re: [CentOS] 3Ware 9550SX and latency/system responsiveness
At 17:34 +0800 14/9/07, Feizhou wrote: ...oh, do you have a BBU for your write cache on your 3ware board? Not installed, but the machine's on a UPS. Ugh. The 3ware code will not give OK then until the stuff has hit disk. Having now installed BBUs, it's made no difference to the underlying responsiveness problem I'm afraid. With ports 2 and 3 now configured as RAID 0, with ext3 filesystem and mounted on /mnt/raidtest, running this bonnie++ command: bonnie++ -m RA-256_NR-8192 -n 0 -u 0 -r 4096 -s 20480 -f -b -d /mnt/raidtest (RA- and NR- relate to kernel params for readahead and nr_requests respectively - the values above are CentOS post-installation defaults) ...causes load to climb: 16:36:12 up 13 min, 2 users, load average: 8.77, 4.78, 1.98 ... and uninterruptible processes:

ps ax | grep D
  PID TTY      STAT   TIME COMMAND
   59 ?        D      0:03 [kswapd0]
 2159 ?        D      0:01 [kjournald]
 2923 ?        Ds     0:00 syslogd -m 0
 4155 ?        D      0:00 [pdflush]
 4175 ?        D      0:00 [pdflush]
 4192 ?        D      0:00 [pdflush]
 4193 ?        D      0:00 [pdflush]
 4197 ?        D      0:00 [pdflush]
 4199 ?        D      0:00 [pdflush]
 4201 pts/1    R+     0:00 grep D

... plus an Out of Memory kill of sshd. Second time around (logged in on the console rather than over ssh), it's just the same except it's hald that happens to get clobbered instead. Now that the presence or otherwise of a BBU has been ruled out along with OS, 3ware recommended kernel param tweaks, RAID level, LVM, slot speed, different but identical-spec hardware (both machine and card), what's left to try? I see there's a new firmware version out today (3ware codeset 9.4.1.3 - driver's still at 2.26.05.007 but the fw's updated from 3.08.02.005 to 3.08.02.007), so I guess I'll update it and push the whole thing back up the hill for another go. 
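[Editorial note: "ps ax | grep D" as used above also matches any command line containing a capital D (including the grep itself). A tighter way to list uninterruptible processes, assuming a procps-style ps:]

```shell
# Print the ps header plus any process whose state field starts with D
# (uninterruptible sleep, typically a process stuck in disk I/O).
ps -eo pid,stat,comm | awk 'NR == 1 || $2 ~ /^D/'
```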
If there's anyone out there with a 9550SX and a two-disk RAID 1 or RAID 0 config on CentOS 4.5 who can give the above bonnie++ benchmark a go (params adjusted for their own installed RAM - I'm benchmarking using 5x my installed amount) and let me know if they also have the same responsiveness problem or not, I'd seriously appreciate it. S. ___ CentOS mailing list CentOS@centos.org http://lists.centos.org/mailman/listinfo/centos
Re: [CentOS] 3Ware 9550SX and latency/system responsiveness
Hmm, how are you creating your ext3 filesystem(s) that you test on? Try creating it with a large journal (maybe 256MB) and run it in full journal mode. The filesystem was created during the initial CentOS installation, and I've tried it with ext2 which made no difference. S. ___ CentOS mailing list CentOS@centos.org http://lists.centos.org/mailman/listinfo/centos
Re: [CentOS] 3Ware 9550SX and latency/system responsiveness
At 17:34 +0800 14/9/07, Feizhou wrote: ...oh, do you have a BBU for your write cache on your 3ware board? Not installed, but the machine's on a UPS. I see where you're going with larger journal idea and I'll give that a go. Cheers S. ___ CentOS mailing list CentOS@centos.org http://lists.centos.org/mailman/listinfo/centos
Re: [CentOS] 3Ware 9550SX and latency/system responsiveness
At 15:43 +0200 14/9/07, Sebastian Walter wrote: Simon Banton wrote: No, I haven't. This is 3ware hardware RAID-1 on two disks with a single LVM ext3 / partition - I'm afraid I don't know how to go about discovering the chunk size to plug into Ross's calcs. You can see the chunk size either in the raid's BIOS tool (Alt-3 at startup) or, if installed, in the 3dm CLI (defaults to 64k, I think). Hmmm, from what I can see in the tw_cli documentation, stripe size (and hence, presumably, chunk size) doesn't apply to RAID 1. (apologies if the formatting goes awry): Stripe consists of the logical unit stripe size to be used. The following table illustrates the supported and applicable stripes on unit types and controller models. Stripe size units are in K (kilobytes).

Model | Raid0 | Raid1 | Raid5 | Raid10 | JBOD | Spare | Raid50 | Single
------+-------+-------+-------+--------+------+-------+--------+-------
9K    | 16    | N/A   | 16    | 16     | N/A  | N/A   | 16     | N/A
      | 64    |       | 64    | 64     |      |       | 64     |
      | 256   |       | 256   | 256    |      |       | 256    |

I'm focused now on swapping the card for a fresh one to see if it makes any difference, as per Ross's suggestion. S. ___ CentOS mailing list CentOS@centos.org http://lists.centos.org/mailman/listinfo/centos
RE: [CentOS] 3Ware 9550SX and latency/system responsiveness
At 09:41 -0400 14/9/07, Ross S. W. Walker wrote: Try getting another identical 3ware card and swapping them. If it produces the same problem, then try putting that card in another box with a different motherboard to see if it works then. I've got three identical machines here - two as yet not unpacked - so I guess I'd better start unpacking another one. Getting hold of a comparable class machine with a different motherboard is going to be tricky though. It's going to be a busy weekend... S. ___ CentOS mailing list CentOS@centos.org http://lists.centos.org/mailman/listinfo/centos
Re: [CentOS] 3Ware 9550SX and latency/system responsiveness
At 23:07 +0800 14/9/07, Feizhou wrote: Well, I do not think it will help much with a larger journal...you want RAM speed, not single 250GB SATA disk speed. Right now, I'd be happy with being able to configure the 3Ware card as a plain old SATA II passthru interface and do software RAID1 with mdadm - but no, Export JBOD doesn't seem possible any more with the 9550 (unless the units have previously been JBODs on earlier cards), you've got to use their 'Single Disk' config which exhibits exactly the same problems. S. ___ CentOS mailing list CentOS@centos.org http://lists.centos.org/mailman/listinfo/centos
RE: [CentOS] 3Ware 9550SX and latency/system responsiveness
At 11:16 -0400 14/9/07, Ross S. W. Walker wrote: Yes, a write-back cache with a BBU will definitely help, also your config, The write-cache is enabled, but what I've not known up to now is that the absence of a BBU will impact IO performance in this way - which seems to be what you and Feizhou are saying. Is there any way to tell the card to forget about not having a BBU and behave as if it did? The main problem here is the latency when under IO load not the throughput (or lack of). I don't care if it can't achieve 300MB/s sustained write speeds, only that it shouldn't bring the machine to its knees in the process of getting 35MB/s. 4x Seagate ST3250820SV 250GB in a RAID 1 plus 2 hot spare config is kinda wasteful, why not create a 4 disk RAID10 and get a 5th drive for a hot-spare. Logistics meant that it was more important to be able to cope with a disk failure without needing to visit the hosting centre immediately afterwards (which we'd have to do if there was only one hot spare). Also think about getting 2 internal SATA drives for the OS and keep the RAID10 as purely for data, that should make things humm nicely and to be able to upgrade your data storage without messing with your OS/application installation. It wouldn't cost a lot either, 2 SATA drives + 1 SAS drive. The server box is a Supermicro AS2020A - there is no onboard SATA nor any space for internal disks - there are 6 bays on a hot swap backplane and they're all cabled to the 3ware controller. I've unpacked and fired up one of the other identical machines and moved the drives from the original to this one and booted straight off them. The only difference between hardware is that the firmware on the 3Ware card in this one has not been updated (it's 3.04.00.005 from codeset 9.3.0 as opposed to 3.08.02.005 from 9.4.1.2). 
# /opt/iozone/bin/iozone -s 20480m -r 64 -i 0 -i 1 -t 1

Original box:
  Initial write  34208.20703
  Rewrite        38133.20313
  Read           79596.36719
  Re-read        79669.22656

Newly unpacked box:
  Initial write  50230.10547
  Rewrite        46108.17969
  Read           78739.14844
  Re-read        79325.11719

... but the new one still shows the same IO blocking/responsiveness issue. S. ___ CentOS mailing list CentOS@centos.org http://lists.centos.org/mailman/listinfo/centos