Re: [CentOS] CVE-2015-0235 - glibc gethostbyname

2015-01-29 Thread Simon Banton

At 15:09 -0800 28/1/15, David C. Miller wrote:


Although I hate Oracle with a fury, one good thing is that they put 
all the updates they rebuild for their RHEL clone on a publicly 
viewable site. I'm guessing they pay Red Hat for extended support on 
end-of-life RHEL 4 to get access to the source RPMs. I learned about 
this from another list member back when the bash Shellshock exploit 
hit.


http://public-yum.oracle.com/repo/EnterpriseLinux/EL4/latest/


Thanks David, I wasn't aware of that resource.
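
For anyone else wanting to point an EL4 box at it, a minimal stanza 
dropped into /etc/yum.repos.d/ would presumably look something like 
the below - the repo name and the arch path are my guesses rather 
than anything Oracle documents here, so check the directory layout 
first:

[el4_latest]
name=Oracle EnterpriseLinux EL4 latest
baseurl=http://public-yum.oracle.com/repo/EnterpriseLinux/EL4/latest/$basearch/
gpgcheck=0
enabled=1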

Regards
S.


Re: [CentOS] CVE-2015-0235 - glibc gethostbyname

2015-01-28 Thread Simon Banton

Hi,

For reasons which are too tiresome to bore you all with, I have an 
obligation to look after a suite of legacy CentOS 4.x systems which 
cannot be migrated upwards.


I note on https://access.redhat.com/articles/1332213 the following 
comment from a RHN person:


We are currently working on and testing errata for RHEL 4, we will 
post an update for it as soon as it's ready. Thank you for your 
patience!


Is there *any* prospect of updated glibc packages for CentOS 4.x 
being made available?


Cheers
S.


[CentOS] Virtualising legacy CentOS 4.x servers

2014-05-14 Thread Simon Banton
Dear all,

I look after a number of CentOS 4.x servers running legacy 
applications that depend on ancient versions of various things (such 
as MySQL 3.x) and which can't be upgraded without non-trivial 
development effort.

I've been considering virtualising them and as a test have been 
trialling with a company that uses Parallels Cloud Server 6.

However, I've run into a roadblock in that the Parallels Tools 
installer in PCS6 requires a version of glibc higher than the one 
available in CentOS 4.x (v2.5 required versus v2.3.4 installed).

Without the guest OS tools installed it's impossible to migrate a VM 
from node to node or back it up without shutting the VM down first, 
which is less than useful.

So I have two questions:

1) Does anyone know if there is a version of the PCS6 Tools built 
against glibc 2.3.4 available anywhere?

2) Is there an alternative virtualisation environment I should be 
looking at which fully supports CentOS 4.x as a guest OS? And if so, 
does anyone have recommendations for a hosting supplier that offers 
that environment (ideally UK-based)?

Many thanks
Simon


Re: [CentOS] Virtualising legacy CentOS 4.x servers

2014-05-14 Thread Simon Banton
At 12:58 -0500 14/5/14, Les Mikesell wrote:

If you are running physical machines now, you don't have that 
ability anyway...

True, but that's a reason to try and migrate to a better environment 
which would allow it.

Does it have to be hosted?  You could run under KVM/Virtualbox/Vmware,
etc. on your own hardware.

Yes, it has to be hosted. Aiming to get away from having to own 
physical hardware with all that entails support-wise.

S.


Re: [CentOS] Inquiry:How to compare two files but not in line-by-line basis?

2009-12-02 Thread Simon Banton
At 08:54 + 2/12/09, hadi motamedi wrote:
Dear All
Can you please do me a favor and let me know how I can compare two 
files, but not on a line-by-line basis, on my CentOS server? I mean, 
say row #1 in file1 has the same data as row #5 in file2, but comm 
compares them on a line-by-line basis, which is not what's intended. 
It seems that diff cannot do the job either.

This'll show you which lines are common to both files, and for the 
ones that aren't which file they're in.

perl -MData::Dumper -le 'while(<>) {chomp; push @{$s->{$_}}, $ARGV}; END{ print Dumper($s) }' file1 file2

... someone will be along shortly with a more elegant method.
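
Or, if line order doesn't matter at all, sorting first and handing 
the results to comm would do it - off the top of my head, so treat 
as a sketch:

  comm -12 <(sort -u file1) <(sort -u file2)   # lines common to both
  comm -23 <(sort -u file1) <(sort -u file2)   # lines only in file1
  comm -13 <(sort -u file1) <(sort -u file2)   # lines only in file2

(The <( ) process substitution needs bash.)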

HTH

S.


Re: [CentOS] How to clone CentOS server ?

2009-08-26 Thread Simon Banton
At 12:43 +0200 26/8/09, przemol...@poczta.fm wrote:
Hello,

I'd like to clone existing CentOS server. Can anybody
recommend any working solution to achieve that ?

I've used the dd + netcat + live CD technique with success in the past eg:

http://alma.ch/blogs/bahut/2005/02/wonders-of-dd-and-netcat-cloning-os.html
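
From memory, the recipe boils down to something like this - device 
names, the target IP and the netcat listen syntax all vary, so treat 
it as a sketch rather than gospel:

  # on the target machine, booted from a live CD:
  nc -l -p 9000 | dd of=/dev/sda bs=64k

  # on the source machine:
  dd if=/dev/sda bs=64k | nc 192.168.0.10 9000

(192.168.0.10 being whatever address the target's live CD environment 
has picked up.)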

Cheers
Simon


Re: [CentOS] CentOS, PHP, Basic GIS

2008-12-23 Thread Simon Banton
At 23:44 -0800 22/12/08, Michael A. Peters wrote:
Thanks for any suggestions. I may try to find a GIS for dummies type
book, though I've generally not been fond of dummy books, I kind of feel
like one when it comes to GIS.

Hi Michael,

If you get no satisfactory answers here, you might try talking to the 
Antiquist group (http://www.antiquist.org/ and 
http://groups.google.com/group/antiquist) who work extensively with 
GIS and open source tools.

Regards
Simon


Re: [CentOS] ClamAV help needed

2008-06-17 Thread Simon Banton

Every day I see in logwatch that my signatures are updated, and the database
notified, but if I try to scan a file manually it tells me that my signatures
are 55 days old.


I think clamscan looks for the db files in a compiled-in default 
location of /usr/local/share/clamav and doesn't consult the 
clamd.conf or freshclam.conf files (after all, why would it?)


I fixed it up by symlinking my configured DatabaseDirectory to where 
clamscan expected to find things.
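
In my case that amounted to something like the following - assuming 
DatabaseDirectory is /var/lib/clamav and /usr/local/share/clamav 
doesn't already exist as a real directory:

  ln -s /var/lib/clamav /usr/local/share/clamav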


HTH

Simon


Re: [CentOS] ClamAV help needed

2008-06-17 Thread Simon Banton

At 14:48 +0200 17/6/08, Ralph Angenendt wrote:

It doesn't here:


Is your copy installed from rpm/yum or compiled from source? Mine's the latter.

S.


Re: [CentOS] ClamAV help needed

2008-06-17 Thread Simon Banton

At 16:43 +0200 17/6/08, Ralph Angenendt wrote:
  Is your copy installed from rpm/yum or compiled from source? 
Mine's the latter.


rpmforge.


Ah - looking more deeply, my source was configured without 
--with-dbdir=/var/lib/clamav which is why it defaulted to looking in 
/usr/local/share/clamav


Now rebuilt with the --with-dbdir option, and everything's looking in 
the correct place.
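
For anyone else rebuilding from source, the relevant bit was simply 
(paths per your own setup):

  ./configure --with-dbdir=/var/lib/clamav
  make
  make install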


S.


RE: [CentOS] 3Ware 9550SX and latency/system responsiveness

2007-10-03 Thread Simon Banton

At 13:49 -0400 2/10/07, Ross S. W. Walker wrote:

Sounds like the issue is more of a CPU issue than a disk issue, so
just upgrading the hardware and OS should make a big difference in
itself,


Yeah, that was the plan :-) Basically, we worked out what we needed 
to do (alleviate peak load CPU bottleneck by upgrading hardware), 
sought what we imagined would be suitable (dual faster CPU, hardware 
RAID 1, lots of RAM), and then ran into a brick wall with disk 
performance while testing - something that's never been an issue to 
date on the existing webservers which have a single IDE disk each.



but I would profile the SQL queries to make sure they are
not trying to bite off more than they need to.


Fair point - we've done a lot of database tuning in the 5 years this 
app's been under development, so that's pretty well covered. With the 
existing hardware (the back-end dbserver's a 1GB 1.6GHz P4 with 
mdadm RAID 1), the dbserver load barely reaches 1 even under peak 
traffic - we're not SQL- or IO-bound, we're CPU-bound on the front 
end.



Well when you created the file system the write cache wasn't installed
yet right?


True, but there have been many wipes and reinstalls since the BBUs 
became available, and the same long pauses when the inodes are 
created that initially drew my attention are still apparent (much 
more noticeable with CentOS 4.5 than 5, but then the default 
nr_requests is 128 in 5 rather than 8192 in 4.5).



And it may be that when you were creating the file system it was right
after you created the RAID1 array and the controller may have been
still sync'ing up the disks, which will slow things down tremendously.


I noted that from the documentation at the outset and did an initial 
verify of the RAID 1 through the 3ware BIOS before doing the original 
install. A previous life as a technical author makes me a bit of a 
RTFM freak :-)



I agree that it is the edge cases that can come back and bite you -
just be sure you don't over-scope those edge cases for situations
that will never arise.


That's why I'm now building the machine as if there wasn't an issue, 
so I can hammer it with apachebench and see if I'm tilting at 
windmills or not.
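
Nothing sophisticated planned on that front - just something along 
the lines of the following against a representative CGI URL (the URL 
and concurrency figures here are purely illustrative):

  ab -n 10000 -c 50 http://testbox.example.com/cgi-bin/somescript.pl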


S.


Re: [CentOS] 3Ware 9550SX and latency/system responsiveness

2007-10-02 Thread Simon Banton

At 12:30 +0200 2/10/07, matthias platzer wrote:


What I did to work around them was basically switching to XFS for 
everything except / (3ware say their cards are fast, but only on 
XFS) AND using very low nr_requests for every blockdev on the 3ware 
card.


Hi Matthias,

Thanks for this. In my CentOS 5 tests the nr_requests turned out by 
default to be 128, rather than the 8192 of CentOS 4.5. I'll have a go 
at reducing it still further.
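
For reference, I'm just reading and poking it via sysfs, roughly as 
below (device name assumed):

  cat /sys/block/sdb/queue/nr_requests
  echo 64 > /sys/block/sdb/queue/nr_requests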


If you can, you could also try _not_ putting the system disks on the 
3ware card, because additionally the 3ware driver/card gives writes 
priority.


I've noticed that kicking off a simultaneous pair of dd reads and 
writes from/to the RAID 1 array indicates that very clearly - only 
with cfq as the elevator did reads get any kind of look-in. Sadly, 
I'm not able to separate the system disks off as there's no on-board 
SATA on the motherboard nor any room for internal disks; the original 
intention was to provide the resilience of hardware RAID 1 for the 
entire machine.


People suggested the unresponsive system behaviour is because the 
CPU hangs in iowait for the writes, and then reading the system 
binaries won't happen till the writes are done, so the binaries 
should be on another IO path.


Yup, that certainly seems to be what's happening. Wish I had another io path...

All this seems to be a symptom of a very complex issue consisting of 
kernel bugs/bad drivers/... and it seems to be worst on an AMD/3ware 
combination.

here is another link:
http://bugzilla.kernel.org/show_bug.cgi?id=7372


Ouch - thanks for that link :-( Looks like I'm screwed big time.

S.


RE: [CentOS] 3Ware 9550SX and latency/system responsiveness

2007-10-02 Thread Simon Banton

At 09:24 -0400 2/10/07, Ross S. W. Walker wrote:

Actually the real-real fix was to use the 'deadline' or 'noop' scheduler
with this card, as the default 'cfq' scheduler was designed to work with
a single drive and not a multiple-drive RAID, so it acts as a governor on
the amount of IO that a single process can send to the device, and when
you do multiple overlapping IOs performance decreases instead of
increasing.


Ah - that wasn't actually a complete fix Ross, but it did give a 
noticeable improvement in certain situations. I'm still chasing a 
real real 'general purpose' fix.
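
For anyone else experimenting, switching schedulers on CentOS 5 is 
just a sysfs write, roughly as below (device name assumed), or 
elevator=deadline on the kernel command line to set it globally at 
boot:

  cat /sys/block/sdb/queue/scheduler
  echo deadline > /sys/block/sdb/queue/scheduler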


S.


RE: [CentOS] 3Ware 9550SX and latency/system responsiveness

2007-10-02 Thread Simon Banton

What is the recurring performance problem you are seeing?


Pretty much exactly the symptoms described in 
http://bugzilla.kernel.org/show_bug.cgi?id=7372 relating to read 
starvation under heavy write IO causing sluggish system response.


I recently graphed the blocks in/blocks out from vmstat 1 for the 
same test using each of the four IO schedulers (see the PDF attached 
to the article below):


http://community.novacaster.com/showarticle.pl?id=7492

The test was:

dd if=/dev/sda of=/dev/null bs=1M count=4096 ; sleep 5; dd 
if=/dev/zero of=./4G bs=1M count=4096 
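
The capture itself was nothing clever - roughly the following in 
another terminal for each scheduler run, then plotting the bi/bo 
columns (fields 9 and 10 in this vmstat's output, I believe):

  vmstat 1 > vmstat-cfq.log
  awk 'NR > 2 { print $9, $10 }' vmstat-cfq.log > bi-bo-cfq.dat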


Despite appearances, interactive responsiveness subjectively felt 
better using deadline than cfq - but this is obviously an atypical 
workload and so now I'm focusing on finishing building the machine 
completely so I can try profiling the more typical patterns of 
activity that it'll experience when in use.


I find myself wondering whether the fact that the array looks like a 
single SCSI disk to the OS means that cfq is able to perform better 
in terms of interleaving reads and writes to the card but that some 
side effect of its work is causing the responsiveness issue at the 
same time. Pure speculation on my part - this is way outside my 
experience.


I'm also looking into trying an Areca card instead (avoiding LSI 
because they're cited as having the same issue in the bugzilla 
mentioned above).


S.


RE: [CentOS] 3Ware 9550SX and latency/system responsiveness

2007-10-02 Thread Simon Banton

At 12:41 -0400 2/10/07, Ross S. W. Walker wrote:

If the performance issue is identical to the kernel bug mentioned
in the posting then the only real fix that was mentioned was to
switch to 32bit from 64bit or to down-rev your kernel, which on
CentOS means to go down to 4.5 from 5.0.


The irony is that I'm already running 32bit[*], and that the 
responsiveness problem's worse on 4.5.


S.

* we specifically went for the Opteron 250 so we could stay at 32-bit 
because some software components we need to use may not yet be 64bit 
clean. The intention was to migrate later to 64bit on the same 
hardware, once those wrinkles had been ironed out.



RE: [CentOS] 3Ware 9550SX and latency/system responsiveness

2007-10-02 Thread Simon Banton

At 13:03 -0400 2/10/07, Ross S. W. Walker wrote:

Have you tried calculating the performance of your current drives on
paper to see if it matches your reality? It may just be that your
disks suck...


They're performing to spec for 7200rpm SATA II drives - your help in 
determining which was the appropriate elevator to use showed that.
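
Back of the envelope: 7200 rpm is 120 revolutions/s, so ~8.3 ms per 
revolution and ~4.2 ms average rotational latency; add a typical 
~8-9 ms average seek and you get ~12-13 ms per random 4k IO, i.e. 
roughly 75-85 IOPS per spindle - which is in the right ballpark for 
the ~130-190 random IOPS the two-disk RAID 1 is showing.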



What is the server going to be doing? What is the workload of your
application?


Originally, it was going to be hosting a number of VMWare 
installations each containing a separate self contained LAMP website 
(for ease of subsequent migration), but that's gone by the board in 
favour of dispensing with the VMWare aspect. Now the websites will be 
NameVhosts under a single Apache directly on the native OS.
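
That is, the usual name-based setup, one stanza per site, along these 
lines (hostnames and paths purely illustrative):

  NameVirtualHost *:80

  <VirtualHost *:80>
    ServerName site1.example.com
    DocumentRoot /var/www/site1
    ScriptAlias /cgi-bin/ /var/www/site1/cgi-bin/
  </VirtualHost>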


The app on each website is MySQL-backed and Perl CGI intensive. DB 
intended to be on a separate (identical) server. All running 
swimmingly at present on 4 year old single 1.6GHz P4s with single IDE 
disks, 512MB RAM and RH7.3 - except at peak times when they're a bit 
CPU bound. Loadave rarely above 1 or 2 most of the time.


Which is why I'm now focused on getting the non-VMWare approach up 
and running so I can profile it, instead of getting hung up on 
benchmarking the empty hardware. I'd never have started if I'd not 
noticed a terrific slowdown halfway through creating the filesystem 
when doing an initial CentOS 4.3 install many many weeks ago.



It may be that it will work fine for what you need it
to do?


Yeah - but it's the edge cases that bite you. Can't be doing with a 
production server where it's possible to accidentally step on an 
indeterminate trigger that sends responsiveness into a nosedive.


S.


RE: [CentOS] 3Ware 9550SX and latency/system responsiveness

2007-09-26 Thread Simon Banton

At 09:14 -0400 26/9/07, Ross S. W. Walker wrote:

Could you try the benchmarks with the 'deadline' scheduler?


OK, these are all with RHEL5, driver 2.26.06.002-2.6.18, RAID 1:

elevator=deadline:
Sequential reads:
| 2007/09/26-16:19:30 | START | 3065 | v1.2.8 | /dev/sdb | Start 
args: -B 4k -h 1 -I BD -K 4 -p l -P T -T 30 -r (-N 488259583) (-c) 
(-p u)
| 2007/09/26-16:20:00 | STAT  | 3065 | v1.2.8 | /dev/sdb | Total read 
throughput: 45353642.7B/s (43.25MB/s), IOPS 11072.7/s.

Sequential writes:
| 2007/09/26-16:20:00 | START | 3082 | v1.2.8 | /dev/sdb | Start 
args: -B 4k -h 1 -I BD -K 4 -p l -P T -T 30 -w (-N 488259583) (-c) 
(-p u)
| 2007/09/26-16:20:30 | STAT  | 3082 | v1.2.8 | /dev/sdb | Total 
write throughput: 53781186.2B/s (51.29MB/s), IOPS 13130.2/s.

Random reads:
| 2007/09/26-16:20:30 | START | 3091 | v1.2.8 | /dev/sdb | Start 
args: -B 4k -h 1 -I BD -K 4 -p r -P T -T 30 -r (-N 488259583) (-c) 
(-D 100:0)
| 2007/09/26-16:21:00 | STAT  | 3091 | v1.2.8 | /dev/sdb | Total read 
throughput: 545587.2B/s (0.52MB/s), IOPS 133.2/s.

Random writes:
| 2007/09/26-16:21:00 | START | 3098 | v1.2.8 | /dev/sdb | Start 
args: -B 4k -h 1 -I BD -K 4 -p r -P T -T 30 -w (-N 488259583) (-c) 
(-D 0:100)
| 2007/09/26-16:21:44 | STAT  | 3098 | v1.2.8 | /dev/sdb | Total 
write throughput: 795852.8B/s (0.76MB/s), IOPS 194.3/s.


Here are the others for comparison.

elevator=noop:
Sequential reads:
| 2007/09/26-16:24:02 | START | 3167 | v1.2.8 | /dev/sdb | Start 
args: -B 4k -h 1 -I BD -K 4 -p l -P T -T 30 -r (-N 488259583) (-c) 
(-p u)
| 2007/09/26-16:24:32 | STAT  | 3167 | v1.2.8 | /dev/sdb | Total read 
throughput: 45467374.9B/s (43.36MB/s), IOPS 11100.4/s.

Sequential writes:
| 2007/09/26-16:24:32 | START | 3176 | v1.2.8 | /dev/sdb | Start 
args: -B 4k -h 1 -I BD -K 4 -p l -P T -T 30 -w (-N 488259583) (-c) 
(-p u)
| 2007/09/26-16:25:02 | STAT  | 3176 | v1.2.8 | /dev/sdb | Total 
write throughput: 53825672.5B/s (51.33MB/s), IOPS 13141.0/s.

Random reads:
| 2007/09/26-16:25:03 | START | 3193 | v1.2.8 | /dev/sdb | Start 
args: -B 4k -h 1 -I BD -K 4 -p r -P T -T 30 -r (-N 488259583) (-c) 
(-D 100:0)
| 2007/09/26-16:25:32 | STAT  | 3193 | v1.2.8 | /dev/sdb | Total read 
throughput: 540954.5B/s (0.52MB/s), IOPS 132.1/s.

Random writes:
| 2007/09/26-16:25:32 | START | 3202 | v1.2.8 | /dev/sdb | Start 
args: -B 4k -h 1 -I BD -K 4 -p r -P T -T 30 -w (-N 488259583) (-c) 
(-D 0:100)
| 2007/09/26-16:26:16 | STAT  | 3202 | v1.2.8 | /dev/sdb | Total 
write throughput: 795989.3B/s (0.76MB/s), IOPS 194.3/s.


elevator=anticipatory:
Sequential reads:
| 2007/09/26-16:37:04 | START | 3277 | v1.2.8 | /dev/sdb | Start 
args: -B 4k -h 1 -I BD -K 4 -p l -P T -T 30 -r (-N 488259583) (-c) 
(-p u)
| 2007/09/26-16:37:34 | STAT  | 3277 | v1.2.8 | /dev/sdb | Total read 
throughput: 45414126.9B/s (43.31MB/s), IOPS 11087.4/s.

Sequential writes:
| 2007/09/26-16:37:35 | START | 3284 | v1.2.8 | /dev/sdb | Start 
args: -B 4k -h 1 -I BD -K 4 -p l -P T -T 30 -w (-N 488259583) (-c) 
(-p u)
| 2007/09/26-16:38:04 | STAT  | 3284 | v1.2.8 | /dev/sdb | Total 
write throughput: 53895168.0B/s (51.40MB/s), IOPS 13158.0/s.

Random reads:
| 2007/09/26-16:38:04 | START | 3293 | v1.2.8 | /dev/sdb | Start 
args: -B 4k -h 1 -I BD -K 4 -p r -P T -T 30 -r (-N 488259583) (-c) 
(-D 100:0)
| 2007/09/26-16:38:34 | STAT  | 3293 | v1.2.8 | /dev/sdb | Total read 
throughput: 467080.5B/s (0.45MB/s), IOPS 114.0/s.

Random writes:
| 2007/09/26-16:38:34 | START | 3300 | v1.2.8 | /dev/sdb | Start 
args: -B 4k -h 1 -I BD -K 4 -p r -P T -T 30 -w (-N 488259583) (-c) 
(-D 0:100)
| 2007/09/26-16:39:18 | STAT  | 3300 | v1.2.8 | /dev/sdb | Total 
write throughput: 793122.1B/s (0.76MB/s), IOPS 193.6/s.


elevator=cfq (just to re-check):
Sequential reads:
| 2007/09/26-16:42:18 | START | 3353 | v1.2.8 | /dev/sdb | Start 
args: -B 4k -h 1 -I BD -K 4 -p l -P T -T 30 -r (-N 488259583) (-c) 
(-p u)
| 2007/09/26-16:42:48 | STAT  | 3353 | v1.2.8 | /dev/sdb | Total read 
throughput: 2463470.9B/s (2.35MB/s), IOPS 601.4/s.

Sequential writes:
| 2007/09/26-16:42:48 | START | 3360 | v1.2.8 | /dev/sdb | Start 
args: -B 4k -h 1 -I BD -K 4 -p l -P T -T 30 -w (-N 488259583) (-c) 
(-p u)
| 2007/09/26-16:43:18 | STAT  | 3360 | v1.2.8 | /dev/sdb | Total 
write throughput: 54572782.9B/s (52.04MB/s), IOPS 13323.4/s.

Random reads:
| 2007/09/26-16:43:19 | START | 3369 | v1.2.8 | /dev/sdb | Start 
args: -B 4k -h 1 -I BD -K 4 -p r -P T -T 30 -r (-N 488259583) (-c) 
(-D 100:0)
| 2007/09/26-16:43:48 | STAT  | 3369 | v1.2.8 | /dev/sdb | Total read 
throughput: 267652.4B/s (0.26MB/s), IOPS 65.3/s.

Random writes:
| 2007/09/26-16:43:48 | START | 3376 | v1.2.8 | /dev/sdb | Start 
args: -B 4k -h 1 -I BD -K 4 -p r -P T -T 30 -w (-N 488259583) (-c) 
(-D 0:100)
| 2007/09/26-16:44:31 | STAT  | 3376 | v1.2.8 | /dev/sdb | Total 
write throughput: 793122.1B/s (0.76MB/s), IOPS 193.6/s.


Certainly cfq is severely cramping the reads, it appears.

S.

RE: [CentOS] 3Ware 9550SX and latency/system responsiveness

2007-09-26 Thread Simon Banton

At 12:01 -0400 26/9/07, Ross S. W. Walker wrote:

CFQ is intended for single-disk workstations and its IO limits are
based on that, so it actually acts as an IO governor on RAID setups.

Only use 'cfq' on single-disk workstations.

Use 'deadline' on RAID setups and servers.


Many thanks Ross, that's one variable tied down at least :-)

S.


RE: [CentOS] 3Ware 9550SX and latency/system responsiveness

2007-09-25 Thread Simon Banton

At 13:35 -0400 24/9/07, Ross S. W. Walker wrote:

Ok, so here is the command I would use:


Thanks - here are the results (tried CentOS 4.5 and RHEL5, with tests 
on sdb when configured as both RAID 0 and as RAID 1):



Sequential reads:
disktest -B 4k -h 1 -I BD -K 4 -p l -P T -T 300 -r /dev/sdX


CentOS 4.5, RAID 0:
| 2007/09/25-14:26:58 | STAT  | 13944 | v1.2.8 | /dev/sdb | Total 
read throughput: 50249728.0B/s (47.92MB/s), IOPS 12268.0/s.

| 2007/09/25-14:26:58 | END   | 13944 | v1.2.8 | /dev/sdb | Test Done (Passed)

CentOS 4.5, RAID 1:
| 2007/09/25-14:20:06 | STAT  | 13807 | v1.2.8 | /dev/sdb | Total 
read throughput: 44994150.4B/s (42.91MB/s), IOPS 10984.9/s.

| 2007/09/25-14:20:06 | END   | 13807 | v1.2.8 | /dev/sdb | Test Done (Passed)

RHEL5, RAID 0:
| 2007/09/25-11:07:46 | STAT  | 2835 | v1.2.8 | /dev/sdb | Total read 
throughput: 2405171.2B/s (2.29MB/s), IOPS 587.2/s.

| 2007/09/25-11:07:46 | END   | 2835 | v1.2.8 | /dev/sdb | Test Done (Passed)

RHEL5, RAID 1:
| 2007/09/25-11:35:53 | STAT  | 3022 | v1.2.8 | /dev/sdb | Total read 
throughput: 2461696.0B/s (2.35MB/s), IOPS 601.0/s.

| 2007/09/25-11:35:53 | END   | 3022 | v1.2.8 | /dev/sdb | Test Done (Passed)


Sequential writes:
disktest -B 4k -h 1 -I BD -K 4 -p l -P T -T 300 -w /dev/sdX


CentOS 4.5, RAID 0:
| 2007/09/25-14:28:19 | STAT  | 13951 | v1.2.8 | /dev/sdb | Total 
write throughput: 66150946.1B/s (63.09MB/s), IOPS 16150.1/s.

| 2007/09/25-14:28:19 | END   | 13951 | v1.2.8 | /dev/sdb | Test Done (Passed)

CentOS 4.5, RAID 1:
| 2007/09/25-14:21:52 | STAT  | 13815 | v1.2.8 | /dev/sdb | Total 
write throughput: 53170039.5B/s (50.71MB/s), IOPS 12981.0/s.

| 2007/09/25-14:21:52 | END   | 13815 | v1.2.8 | /dev/sdb | Test Done (Passed)

RHEL5, RAID 0:
| 2007/09/25-11:13:44 | STAT  | 2850 | v1.2.8 | /dev/sdb | Total 
write throughput: 66031616.0B/s (62.97MB/s), IOPS 16121.0/s.

| 2007/09/25-11:13:44 | END   | 2850 | v1.2.8 | /dev/sdb | Test Done (Passed)

RHEL5, RAID 1:
| 2007/09/25-11:36:36 | STAT  | 3031 | v1.2.8 | /dev/sdb | Total 
write throughput: 56870229.3B/s (54.24MB/s), IOPS 13884.3/s.

| 2007/09/25-11:36:36 | END   | 3031 | v1.2.8 | /dev/sdb | Test Done (Passed)


Random reads:
disktest -B 4k -h 1 -I BD -K 4 -p r -P T -T 300 -r /dev/sdX


CentOS 4.5, RAID 0:
| 2007/09/25-14:28:59 | STAT  | 13958 | v1.2.8 | /dev/sdb | Total 
read throughput: 504217.6B/s (0.48MB/s), IOPS 123.1/s.

| 2007/09/25-14:28:59 | END   | 13958 | v1.2.8 | /dev/sdb | Test Done (Passed)

CentOS 4.5, RAID 1:
| 2007/09/25-14:23:14 | STAT  | 13822 | v1.2.8 | /dev/sdb | Total 
read throughput: 549570.2B/s (0.52MB/s), IOPS 134.2/s.

| 2007/09/25-14:23:14 | END   | 13822 | v1.2.8 | /dev/sdb | Test Done (Passed)

RHEL5, RAID 0:
| 2007/09/25-11:16:21 | STAT  | 2875 | v1.2.8 | /dev/sdb | Total read 
throughput: 273612.8B/s (0.26MB/s), IOPS 66.8/s.

| 2007/09/25-11:16:21 | END   | 2875 | v1.2.8 | /dev/sdb | Test Done (Passed)

RHEL5, RAID 1:
| 2007/09/25-11:39:20 | STAT  | 3042 | v1.2.8 | /dev/sdb | Total read 
throughput: 546816.0B/s (0.52MB/s), IOPS 133.5/s.

| 2007/09/25-11:39:20 | END   | 3042 | v1.2.8 | /dev/sdb | Test Done (Passed)


Random writes:
disktest -B 4k -h 1 -I BD -K 4 -p r -P T -T 300 -w /dev/sdX


CentOS 4.5, RAID 0:
| 2007/09/25-14:29:34 | STAT  | 13965 | v1.2.8 | /dev/sdb | Total 
write throughput: 1379532.8B/s (1.32MB/s), IOPS 336.8/s.

| 2007/09/25-14:29:34 | END   | 13965 | v1.2.8 | /dev/sdb | Test Done (Passed)

CentOS 4.5, RAID 1:
| 2007/09/25-14:24:15 | STAT  | 13829 | v1.2.8 | /dev/sdb | Total 
write throughput: 782199.5B/s (0.75MB/s), IOPS 191.0/s.

| 2007/09/25-14:24:15 | END   | 13829 | v1.2.8 | /dev/sdb | Test Done (Passed)

RHEL5, RAID 0:
| 2007/09/25-11:19:21 | STAT  | 2894 | v1.2.8 | /dev/sdb | Total 
write throughput: 1377894.4B/s (1.31MB/s), IOPS 336.4/s.

| 2007/09/25-11:19:21 | END   | 2894 | v1.2.8 | /dev/sdb | Test Done (Passed)

RHEL5 RAID 1:
| 2007/09/25-11:40:08 | STAT  | 3049 | v1.2.8 | /dev/sdb | Total 
write throughput: 798310.4B/s (0.76MB/s), IOPS 194.9/s.

| 2007/09/25-11:40:08 | END   | 3049 | v1.2.8 | /dev/sdb | Test Done (Passed)

I'm not sure what to make of it, mind you.

Cheers
S.


Re: [CentOS] 3Ware 9550SX and latency/system responsiveness

2007-09-24 Thread Simon Banton

At 07:46 +0800 24/9/07, Feizhou wrote:
... plus an Out of Memory kill of sshd. Second time around (logged 
in on the console rather than over ssh), it's just the same except 
it's hald that happens to get clobbered instead.


Are you saying that running in RAID0 mode with this card and 
motherboard combination, you get a memory leak? Who is the culprit?


I don't know if it's caused by a memory leak or something else, I'm 
just describing what happens. I would be tempted to suspect the RAM 
itself if another identical machine didn't have exactly the same 
issue.



what's left to try?


Bug report...


I've reported the issue to 3ware but they've not responded. I 
replicated the problem with RHEL AS 4 update 5 and contacted RedHat 
but they told me evaluation subscriptions aren't supported.


I see there's a new firmware version out today (3ware codeset 
9.4.1.3...) - I guess I'll update it and push the whole thing back up 
the hill for another go.


I hope that fixes things for you.


Maybe I'm thinking about this all wrong - maybe this responsiveness 
issue won't even arise during normal operation, perhaps it's just a 
symptom of intensive benchmarking when all the resources of the 
machine are devoted to throwing data at the card/disks as fast as 
possible. I'm now way out of my depth, frankly.


I'm going to try the latest firmware upgrade, followed by RHEL/CentOS 
5, and finally see if I can replicate with a different card (Areca or 
LSI, perhaps).


Thanks for all the feedback, at least I feel as if I've tried every 
conceivable obvious thing.


S.


RE: [CentOS] 3Ware 9550SX and latency/system responsiveness

2007-09-24 Thread Simon Banton

At 10:04 -0400 24/9/07, Ross S. W. Walker wrote:

How about trying your benchmarks with the 'disktest' utility from the
LTP (Linux Test Project),


Now fetched and installed - I'd be grateful for a suggestion as to an 
appropriate disktest command line for a 4GB RAM twin CPU box with 
250GB RAID 1 array, because I think you had your tongue in your cheek 
when you said:



it is also a
lot easier to setup and use.


S.


Re: [CentOS] 3Ware 9550SX and latency/system responsiveness

2007-09-21 Thread Simon Banton

At 17:34 +0800 14/9/07, Feizhou wrote:

...oh... do you have a BBU for your write cache on your 3ware board?


Not installed, but the machine's on a UPS.


Ugh. The 3ware code will not give OK then until the stuff has hit disk.


Having now installed BBUs, it's made no difference to the underlying 
responsiveness problem I'm afraid.


With ports 2 and 3 now configured as RAID 0, with ext3 filesystem and 
mounted on /mnt/raidtest, running this bonnie++ command:


bonnie++ -m RA-256_NR-8192 -n 0 -u 0 -r 4096 -s 20480 -f -b -d /mnt/raidtest

(RA- and NR- relate to kernel params for readahead and nr_requests 
respectively - the values above are CentOS post-installation defaults)


...causes load to climb:

16:36:12 up 13 min,  2 users,  load average: 8.77, 4.78, 1.98

... and uninterruptible processes:

 ps ax | grep D
  PID TTY      STAT   TIME COMMAND
   59 ?        D      0:03 [kswapd0]
 2159 ?        D      0:01 [kjournald]
 2923 ?        Ds     0:00 syslogd -m 0
 4155 ?        D      0:00 [pdflush]
 4175 ?        D      0:00 [pdflush]
 4192 ?        D      0:00 [pdflush]
 4193 ?        D      0:00 [pdflush]
 4197 ?        D      0:00 [pdflush]
 4199 ?        D      0:00 [pdflush]
 4201 pts/1    R+     0:00 grep D

... plus an Out of Memory kill of sshd. Second time around (logged in 
on the console rather than over ssh), it's just the same except it's 
hald that happens to get clobbered instead.


Now that the presence or otherwise of a BBU has been ruled out, along 
with the OS, 3ware-recommended kernel param tweaks, RAID level, LVM, 
slot speed, and different but identical-spec hardware (both machine 
and card), what's left to try?


I see there's a new firmware version out today (3ware codeset 9.4.1.3 
- driver's still at 2.26.05.007 but the fw's updated from 
3.08.02.005 to 3.08.02.007), so I guess I'll update it and push the 
whole thing back up the hill for another go.


If there's anyone out there with a 9550SX and a two-disk RAID 1 or 
RAID 0 config on CentOS 4.5 who can give the above bonnie++ benchmark 
a go (params adjusted for their own installed RAM - I'm benchmarking 
using 5x my installed amount) and let me know if they also have the 
same responsiveness problem or not, I'd seriously appreciate it.


S.


Re: [CentOS] 3Ware 9550SX and latency/system responsiveness

2007-09-14 Thread Simon Banton
Hmm, how are you creating your ext3 filesystem(s) that you test on? 
Try creating it with a large journal (maybe 256MB) and run it in 
full journal mode.


The filesystem was created during the initial CentOS installation, 
and I've tried it with ext2 which made no difference.
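
If I do re-make it by hand, I take it the large-journal variant would 
be along these lines (device name assumed, and I've not tried it yet):

  mke2fs -j -J size=256 /dev/sdb1
  mount -t ext3 -o data=journal /dev/sdb1 /mnt/test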


S.


Re: [CentOS] 3Ware 9550SX and latency/system responsiveness

2007-09-14 Thread Simon Banton

At 17:34 +0800 14/9/07, Feizhou wrote:

...oh... do you have a BBU for your write cache on your 3ware board?


Not installed, but the machine's on a UPS.

I see where you're going with the larger journal idea and I'll give that a go.

Cheers
S.


Re: [CentOS] 3Ware 9550SX and latency/system responsiveness

2007-09-14 Thread Simon Banton

At 15:43 +0200 14/9/07, Sebastian Walter wrote:

Simon Banton wrote:
 No, I haven't. This is 3ware hardware RAID-1 on two disks with a
 single LVM ext3 / partition - I'm afraid I don't know how to go about
 discovering the chunk size to plug into Ross's calcs.


You can see the chunk size either in the raid's BIOS tool (Alt-3 at
startup) or, if installed, in the 3dm CLI (defaults to 64k, I think).


Hmmm, from what I can see in the tw_cli documentation, stripe size 
(and hence, presumably, chunk size) doesn't apply to RAID 1.


(apologies if the formatting goes awry):

Stripe consists of the logical unit stripe size to be used. The 
following table illustrates the supported and applicable stripes on 
unit types and controller models. Stripe size units are in K (kilo 
bytes).


 Model | Raid0     | Raid1 | Raid5     | Raid10    | JBOD | Spare | Raid50    | Single
 ------+-----------+-------+-----------+-----------+------+-------+-----------+-------
 9K    | 16/64/256 | N/A   | 16/64/256 | 16/64/256 | N/A  | N/A   | 16/64/256 | N/A
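
For anyone wanting to check their own unit's settings, tw_cli will 
report them - something like the following, with controller/unit 
numbers adjusted to suit:

  tw_cli /c0/u0 show all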

I'm focused now on swapping the card for a fresh one to see if it 
makes any difference, as per Ross's suggestion.


S.


RE: [CentOS] 3Ware 9550SX and latency/system responsiveness

2007-09-14 Thread Simon Banton

At 09:41 -0400 14/9/07, Ross S. W. Walker wrote:

Try getting another identical 3ware card and swapping them. If it
produces the same problem, then try putting that card in another
box with a different motherboard to see if it works then.


I've got three identical machines here - two as yet not unpacked - so 
I guess I'd better start unpacking another one. Getting hold of a 
comparable class machine with a different motherboard is going to be 
tricky though.


It's going to be a busy weekend...

S.


Re: [CentOS] 3Ware 9550SX and latency/system responsiveness

2007-09-14 Thread Simon Banton

At 23:07 +0800 14/9/07, Feizhou wrote:
Well, I do not think it will help much with a larger journal...you 
want RAM speed, not single 250GB SATA disk speed.


Right now, I'd be happy with being able to configure the 3Ware card 
as a plain old SATA II passthru interface and do software RAID 1 with 
mdadm - but no, Export JBOD doesn't seem possible any more with the 
9550 (unless the units have previously been JBODs on earlier cards); 
you've got to use their 'Single Disk' config, which exhibits exactly 
the same problems.
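
To spell out what I'd have liked to do - with the two disks exported 
raw, it would just have been the standard software RAID 1 creation 
(device names illustrative):

  mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1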


S.


RE: [CentOS] 3Ware 9550SX and latency/system responsiveness

2007-09-14 Thread Simon Banton

At 11:16 -0400 14/9/07, Ross S. W. Walker wrote:

Yes, a write-back cache with a BBU will definitely help, also your config,


The write-cache is enabled, but what I've not known up to now is that 
the absence of a BBU will impact IO performance in this way - which 
seems to be what you and Feizhou are saying. Is there any way to tell 
the card to forget about not having a BBU and behave as if it did?
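
The only knob I can see in tw_cli is the unit write cache itself - 
something like 'tw_cli /c0/u0 set cache=on', controller/unit numbers 
assumed - but that's already on here, and presumably running it 
without a BBU is exactly the data-loss risk the BBU is meant to cover.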


The main problem here is the latency when under IO load not the 
throughput (or lack of). I don't care if it can't achieve 300MB/s 
sustained write speeds, only that it shouldn't bring the machine to 
its knees in the process of getting 35MB/s.



  4x Seagate ST3250820SV 250GB in a RAID 1 plus 2 hot spare config

is kinda wasteful, why not create a 4 disk RAID10 and get a 5th drive for
a hot-spare.


Logistics meant that it was more important to be able to cope with a 
disk failure without needing to visit the hosting centre immediately 
afterwards (which we'd have to do if there was only one hot spare).



Also think about getting 2 internal SATA drives for the OS and keep the
RAID10 as purely for data, that should make things humm nicely and to be
able to upgrade your data storage without messing with your OS/application
installation. It wouldn't cost a lot either, 2 SATA drives + 1 SAS drive.


The server box is a Supermicro AS2020A - there is no onboard SATA nor 
any space for internal disks - there are 6 bays on a hot swap 
backplane and they're all cabled to the 3ware controller.


I've unpacked and fired up one of the other identical machines and 
moved the drives from the original to this one and booted straight 
off them.


The only difference between hardware is that the firmware on the 
3Ware card in this one has not been updated (it's 3.04.00.005 from 
codeset 9.3.0 as opposed to 3.08.02.005 from 9.4.1.2).


# /opt/iozone/bin/iozone -s 20480m -r 64 -i 0 -i 1 -t 1

Original box:
  Initial write  34208.20703
  Rewrite        38133.20313
  Read           79596.36719
  Re-read        79669.22656

Newly unpacked box:
  Initial write  50230.10547
  Rewrite        46108.17969
  Read           78739.14844
  Re-read        79325.11719

... but the new one still shows the same IO blocking/responsiveness issue.

S.