SATA-performance part 2

2007-02-14 Thread Martin A. Fink
Dear all,

I have now installed oprofile as suggested, and very interesting things happened:

System: OpenSuSE 10.2 with AHCI on, disk: Solid State Disk (Flash Disk)
Test: Write blocks of 1MB. Do fsync() every 1GB. Measure time for each GB.
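
The test procedure above could be sketched roughly as follows. This is only a minimal illustration and not the program actually used for the measurements; the function name timed_write and its parameters are assumptions made for the example. Pointing it at a raw device node overwrites that device.

```python
import os
import time

MB = 1024 * 1024

def timed_write(path, total_mb, sync_every_mb, block_mb=1):
    """Write total_mb of zeros in block_mb chunks, call fsync() after every
    sync_every_mb, and return the throughput in MB/s of each completed interval.
    A trailing partial interval (total_mb not a multiple of sync_every_mb)
    is written but not reported."""
    block = b"\0" * (block_mb * MB)
    rates = []
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        written_mb = 0
        interval_start = time.monotonic()
        while written_mb < total_mb:
            # os.write may write fewer bytes than requested, so loop until done
            view = memoryview(block)
            while view:
                n = os.write(fd, view)
                view = view[n:]
            written_mb += block_mb
            if written_mb % sync_every_mb == 0:
                os.fsync(fd)  # flush to the device before stopping the clock
                elapsed = max(time.monotonic() - interval_start, 1e-9)
                rates.append(sync_every_mb / elapsed)
                interval_start = time.monotonic()
    finally:
        os.close(fd)
    return rates
```

With total_mb=25 * 1024 and sync_every_mb=1024 this would give one rate per GB, from which a mean and spread like the figures below can be computed.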

before installation of oprofile:
test                             OpenSuSE 10.2                FreeBSD 6.2
write to raw device, 25GB        26+/-1 MB/s at 4-10% CPU     48+/-0 MB/s at 1% CPU
write to ext3, 2GB               39+/-5 MB/s at 10-15% CPU

after installation of oprofile:
test                             OpenSuSE 10.2                FreeBSD 6.2
write to raw device, 25GB        48+/-0.5 MB/s at 4-10% CPU   49+/-0 MB/s at 1% CPU
write to ext3 (writeback), 2GB   40+/-5 MB/s at 10-15% CPU

After uninstalling oprofile and doing only soft reboots (no hardware power-off),
these values STAYED (Linux 48 MB/s)! Even for a brand-new installation of
OpenSuSE 10.2 on another partition!
After a hardware power-off everything was as before (26 MB/s).

So now the interesting questions for me are:

1. What is oprofile doing to my system? In particular, what has been changed
that survives a reboot?

2. Buffers: To all those who told me that Linux raw devices are totally
unbuffered and thus slower than devices with filesystems: are you sure?
If yes, where do you think this increase in speed (26 to 48) comes from?

3. Advice to use ext3 with the writeback option: I do not see an increase in
performance with that.

Thanks,

Martin
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: SATA-performance: Linux vs. FreeBSD

2007-02-13 Thread Martin A. Fink
On Tuesday, 13 February 2007 13:24 you wrote:
> Martin A. Fink wrote:
> 
> >> Also you have skipped the information how the images "arrive" on the
> >> system (PCI(e) card?), that may be important for an "end to end" view
> >> of the problem.
> > 
> > Images arrive via Gigabit Ethernet. GigE Vision standard. (PCIe x4)
> 
> Then the next question is: ChipSet/Used Protocol/JumboFrames/(NAPI)/... .
> 
> Have you already determined the load caused by this part?
> Depending on the GigE-Chipset, and Protocol/JumboFrames/(NAPI)/..., the
> involved overhead can be quite serious.
> 
> >> And what's also missing. What is "a long period of time".
> >> Calculating best-case with the SSD:
> >> 27GB divided by 30MB/s only gives a bit more than 15 Minutes.
> >> And worst case with 50MB/s is less than 10 Minutes.
> > 
> > Well. The testdrive has 27GB. The final drive will have 225 GB. And there
> > will be 3 cameras and thus 3 disks. This means we talk about 140 MB/s for
> > around 90 minutes.
> > For space applications with low power but high performance this is a long
> > time... ;-)
> 
> The MB/CPU/RAM will be the one specified in the first mail?
> My gut feeling says: Forget it.
> 
> The needed total bandwidth may be too high and at least the incoming part
> via GigE may have serious overhead.
> 150MB/s in via (at least 2) GigE, without Zero-Copy there is another
> 150MB/s memory to memory.
> Then there is the next 150MB/s memory to the discs, without Zero-Copy there
> is also another 150MB/s memory to memory.
> In total that's 300MB/s to 600MB/s without any processing.

I don't understand your calculation: from 3 GE ports come around 50 MB/s each.
These 150 MB/s altogether have to be copied to memory. From there they will be
copied to disk. So we are talking about 2x150 MB/s running through my system.
That is less than 2 PCIe lanes can handle... And there are more than 2 lanes
between north and south bridge.
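
The two estimates in this exchange can be put side by side with a few lines of arithmetic. This is only a sketch: the per-lane figure of about 250 MB/s per direction assumes first-generation PCI Express and is not stated in the thread, and the function name total_traffic is made up for the example.

```python
# Figures taken from the discussion above.
CAMERAS = 3
MB_PER_S_PER_CAMERA = 50          # approximate rate per GigE port

# Assumption: PCI Express 1.x, roughly 250 MB/s per lane and direction.
PCIE1_LANE_MB_PER_S = 250

def total_traffic(extra_copies=0):
    """MB/s moved through the system: one transfer in (network to memory),
    one transfer out (memory to disk), plus optional memory-to-memory copies."""
    stream = CAMERAS * MB_PER_S_PER_CAMERA
    return stream * (2 + extra_copies)

print(total_traffic(0))           # 300, with zero-copy on both sides
print(total_traffic(2))           # 600, with one extra copy on each side
print(2 * PCIE1_LANE_MB_PER_S)    # 500, two PCIe 1.x lanes
```

Under these assumptions the zero-copy case fits within two lanes, while the case with extra copies is bounded by memory bandwidth rather than by the PCIe link, since the additional copies are memory to memory.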
> 
> But on the other hand, hdparm -T says my system (Core2Duo E6700, FSB1066,
> 2GB DDR2-800 RAM, 32Bit) has a buffer-cache bandwidth around 4000MB/s.
> As you didn't say which FSB and Memory-Type you have I would guess that
> your system should reach between 2000MB/s and 3500MB/s of LINEAR(!) memory
> bandwidth.
> (Total usable Memory-Bandwidth is unfortunately also dependent on usage
> pattern. Large & linear is not as important as with a rotating HDD, but it
> factors in)
> 
> 
> 
> Btw. On the topic of filesystem and Linux performance:
> SGI did a "really big" test some time ago with a big iron having 24
> Itanium2-CPUs in 12 nodes, and 12*2 GB of RAM and having 256 discs using
> XFS (which is from SGI!).
> The pdf-file is here:
> http://oss.sgi.com/projects/xfs/papers/ols2006/ols-2006-paper.pdf
> 
> According to the paper the system had a theoretical peak IO-performance of
> 11.5 GB/s and practically peaked at 10.7GB/s reading and 8.9GB/s writing.
> IOW Linux and XFS CAN perform quite well, but the system has to have enough
> muscle for the job.
> And since the paper (and Kernel 2.6.5) the development of Linux hasn't
> stopped.
> 
> 
> 
> -- 
> Real Programmers consider "what you see is what you get" to be just as
> bad a concept in Text Editors as it is in women. No, the Real Programmer
> wants a "you asked for it, you got it" text editor -- complicated,
> cryptic, powerful, unforgiving, dangerous.
> 
> 

-- 
Dipl. Physiker
Martin Anton Fink
Max Planck Institute for extraterrestrial Physics
Giessenbachstrasse
85741 Garching
Germany
Tel. +49-(0)89-3-3645
Fax. +49-(0)89-3-3569


Re: SATA-performance: Linux vs. FreeBSD

2007-02-13 Thread Martin A. Fink
On Tuesday, 13 February 2007 12:25 you wrote:
> > Well they do. The Flash disk I have (SATA-I) is capable of 48 MB/s and
> > this value is reached over the whole disk size by windows as well as by
> > FreeBSD.
> > See my test results in the first thread.
> 
> Ok a flash disk should be more stable
> 
> > My Seagate Barracuda Harddisk drive (SATA-II) starts with 76 MB/s and 
> > decreases linearly to 35 MB/s due to the fact that it has to write to a 
> > rotating disk. But on a flash disk there is nothing rotating...
> 
> The hard disk one isn't guaranteed or stable but the flash especially if
> it is aimed at it ought to behave.
> 
> > So where is the difference between SATA-I and SATA-II ?
> 
> All physical side if they are on the same controller when you do the
> tests. Mostly latency,
> 
> > And why is FreeBSD able to write with constant rates (the complete 25 GB,
> > all with 48+/-0.1 MB/s) but Linux 2.6.18 not ?
> 
> Does the FreeBSD fsync sync to media ? Also what controller is being used
> here, and do you have EHCI USB support running ?

The FreeBSD manual page for fsync says it syncs to media.

I used the same controller: same computer, same hard disk, two partitions on
the system disk, one for Linux, one for FreeBSD.

EHCI:

ehci_hcd :00:1d.7: EHCI Host Controller
ehci_hcd :00:1d.7: USB 2.0 started, EHCI 1.00, driver 10 Dec 2004
usb usb1: Product: EHCI Host Controller

AHCI

ahci :00:1f.2: AHCI 0001.0100 32 slots 4 ports 3 Gbps 0xf impl SATA mode

> 
> > With a dedicated (rotating) SATA II device, using the first 70% of disk
> > space no problem -- tested ! With a SATA-I device only a problem with
> > Linux 2.6.18
> 
> I suspect the SATA-1 itself may not be the decider but something else -
> eg the hard disk using NCQ, which would cover up any latency related
> problems.
> 
> > Journaling of data: you are right, ext2 performs better than ext3.
> 
> And ext3 in writeback mode ought in theory (but practice is always
> harder ;)) be faster than ext2.
> 
> 



Re: SATA-performance: Linux vs. FreeBSD

2007-02-13 Thread Martin A. Fink
On Tuesday, 13 February 2007 11:16 you wrote:
> Martin A. Fink wrote:
> > On Tuesday, 13 February 2007 00:31 you wrote:
> >> Martin A. Fink wrote:
> >>> I have to store big amounts of data coming from 2 digital cameras to
> >>> disk. Thus I have to write blocks of around 1 MB at 30 to 50 frames per
> >>> second for a long period of time. So it is important for me that the
> >>> harddisk drive is reliable in the sense of "if it is capable of 50 MB/s
> >>> then it should operate at this speed. Constantly."
> >> The good old handful of suggestions:
> >>
> >> - Use a dedicated disc for the task.
> > 
> > I used a dedicated disk for this task. No one else besides the task is
> > writing to it!
> 
> OK.
> 
> >> - Use an empty disc so there is no fragmentation.
> > 
> > All tests were performed on empty disk!
> 
> OK.
> 
> >> - Buy a bigger disk, they have high bandwidths.
> > 
> > I have a flash disk from a manufacturer who grants me 48 MB/s. And
> > FreeBSD as well as Windows reach this value. Only Linux 2.6.18 is far away
> > from it (42 MB/s)
> 
> Even 48MB/s is quite low.
> I've reached up to 70MB/s with a single 500GB Seagate model and even my
> older HDDs all reach 60MB/s (at least on the outer cylinders)
> But i haven't tested any "sync/fsync" in between, only after.

Please read carefully! I am talking about a flash disk, not a normal hard disk.
There are no mechanical parts in flash disks, only flash memory. And therefore
48MB/s is excellent (compared to all other available disks).

> 
> >> - Buy a more "specialized" disc.
> > 
> > see above
> > 
> >>   for e.x.: Western Digital Raptor X(*) a 150GB, 10-KRPM S-ATA disc.
> >> - Buy several discs and use RAID 0
> >>   or alternate between discs when writing.
> > 
> > What I have to build is an application for the International Space Station 
> > ISS. I am limited with power and space. So If the disk is able to write 
> > constantly 48 MB/s then the Operating System should do this!
> 
> OK. That appears to be a serious constraint.
> Do HDDs cope well with zero gravity?

Yes and no. Yes: standard desktop HDDs are unproblematic. Laptop HDDs have 
g-force shock hardware that works on zero-g detection and thus Laptop HDDs 
can't be used in space. At least modern ones can't...

> At least the SSD won't have a problem with that. ;-)
> 
> > The problem is: FreeBSD is fast, but lacks of some special drivers. Linux
> > has all drivers but access to harddisk is unpredictable and thus
> > unreliable!
> > What can I do??
> 
> Personally i haven't had such bad write speeds in years. Taking USB
> connected and/or encrypted partitions aside.
> But on the other hand: I don't sync(fsync) until i have to.

If you don't have to - no problem. But if you use a filesystem you do an fsync
every time you close the file (and the file size is less than 1-2 GB).

> And personally i have good (and constant bandwidth) experience using XFS as
> a filesystem.
> (I have 41 HDDs with a total capacity of 10.5 TB, performance is quite
> important for me.)
> 
> Also you have skipped the information how the images "arrive" on the system
> (PCI(e) card?), that may be important for an "end to end" view of the
> problem.

Images arrive via Gigabit Ethernet. GigE Vision standard. (PCIe x4)
> 
> And what's also missing. What is "a long period of time".
> Calculating best-case with the SSD:
> 27GB divided by 30MB/s only gives a bit more than 15 Minutes.
> And worst case with 50MB/s is less than 10 Minutes.

Well. The testdrive has 27GB. The final drive will have 225 GB. And there will 
be 3 cameras and thus 3 disks. This means we talk about 140 MB/s for around 
90 minutes.
For space applications with low power but high performance this is a long 
time... ;-)
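
The recording times mentioned here can be checked with a short calculation. This is a sketch only; it assumes decimal units (1 GB = 1000 MB), and the helper name minutes_to_fill is invented for the illustration.

```python
def minutes_to_fill(capacity_gb, rate_mb_per_s):
    """Minutes needed to write capacity_gb at a sustained rate,
    assuming 1 GB = 1000 MB."""
    return capacity_gb * 1000 / rate_mb_per_s / 60

# Test drive: 27 GB at the two rates quoted above.
print(minutes_to_fill(27, 30))                  # 15.0
print(minutes_to_fill(27, 50))                  # 9.0

# Final setup: 3 disks of 225 GB each, 140 MB/s in total.
print(round(minutes_to_fill(3 * 225, 140), 1))  # 80.4
```

With these assumptions the final setup fills up after roughly 80 minutes, which is the same order of magnitude as the "around 90 minutes" quoted above.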
> 



Re: SATA-performance: Linux vs. FreeBSD

2007-02-13 Thread Martin A. Fink
On Monday, 12 February 2007 20:08 you wrote:
> On Mon, 12 Feb 2007 18:56:29 +0100
> "Martin A. Fink" <[EMAIL PROTECTED]> wrote:
> 
> > I have to store big amounts of data coming from 2 digital cameras to disk.
> > Thus I have to write blocks of around 1 MB at 30 to 50 frames per second
> > for a long period of time. So it is important for me that the harddisk
> > drive is reliable in the sense of "if it is capable of 50 MB/s then it
> > should operate at this speed. Constantly."
> 
> Hard disks don't do this. They support operations/second based upon
> physical and rotational latency constraints, vibration levels, mechanism,
> internal layout policy and the need to do housekeeping. 

Well they do. The Flash disk I have (SATA-I) is capable of 48 MB/s and this 
value is reached over the whole disk size by windows as well as by FreeBSD. 
See my test results in the first thread.
My Seagate Barracuda Harddisk drive (SATA-II) starts with 76 MB/s and 
decreases linearly to 35 MB/s due to the fact that it has to write to a 
rotating disk. But on a flash disk there is nothing rotating...

So where is the difference between SATA-I and SATA-II ?
And why is FreeBSD able to write with constant rates (the complete 25 GB, all 
with 48+/-0.1 MB/s) but Linux 2.6.18 not ?

> 
> If you have an ATA7 drive with suitable firmware sets you can talk to it
> directly via the SG_IO interface and use the streaming feature set which
> is quite different to filesystem type operations and lets you ask the
> drive to do this sort of stuff - if you can find any general PC firmware
> ones that support it anyway.
> 
> I'm not sure you'll get 50MB/sec sustained to work although you might
> with a good current drive used for nothing else, a linear stream of data
> (no seeking and file system overhead), and a non PCI controller (PCI
> Express, host chipset bus etc). 

With a dedicated (rotating) SATA II device, using the first 70% of disk space 
no problem -- tested ! With a SATA-I device only a problem with Linux 2.6.18
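
As a rough plausibility check of the 70% figure, the rotating disk can be modelled with the linear decline described above (76 MB/s at the start, 35 MB/s at the end). This is a sketch under that assumption only; real drives fall off in zones rather than along a perfect line, and the function name hdd_rate is made up.

```python
def hdd_rate(fraction, start=76.0, end=35.0):
    """Sequential write rate in MB/s at a given fraction (0..1) of the disk,
    assuming a strictly linear decline from start to end."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    return start + (end - start) * fraction

print(round(hdd_rate(0.7), 1))   # 47.3
```

Under this model the rate at the 70% mark is about 47 MB/s, so the first 70% of the disk is approximately the region in which the drive stays near the 48 MB/s target.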
> 
> If you are using a file system then the more you fsync the more I'd
> expect you to see stalling as you keep draining whats effectively an 8MB
> plus pipeline on a modern drive precisely because fsync does "hitting
> disk" guarantees. You also want to be sure you are not journalling data.

That is true. Thus I do the sync only after every 1GB of written data. That is
not too often in my eyes...
Journaling of data: you are right, ext2 performs better than ext3.


Martin
> 
> Alan
> 
> 
> 



Re: SATA-performance: Linux vs. FreeBSD

2007-02-13 Thread Martin A. Fink
On Tuesday, 13 February 2007 00:31 you wrote:
> Martin A. Fink wrote:
> > I have to store big amounts of data coming from 2 digital cameras to disk.
> > Thus I have to write blocks of around 1 MB at 30 to 50 frames per second
> > for a long period of time. So it is important for me that the harddisk
> > drive is reliable in the sense of "if it is capable of 50 MB/s then it
> > should operate at this speed. Constantly."
> 
> The good old handful of suggestions:
> 
> - Use a dedicated disc for the task.

I used a dedicated disk for this task. No one else besides the task is writing 
to it!

> - Use an empty disc so there is no fragmentation.

All tests were performed on empty disk!

> - Buy a bigger disk, they have high bandwidths.

I have a flash disk from a manufacturer who grants me 48 MB/s. And FreeBSD as 
well as Windows reach this value. Only Linux 2.6.18 is far away from it (42 
MB/s)

> - Buy a more "specialized" disc.

see above

>   for e.x.: Western Digital Raptor X(*) a 150GB, 10-KRPM S-ATA disc.
> - Buy several discs and use RAID 0
>   or alternate between discs when writing.

What I have to build is an application for the International Space Station 
ISS. I am limited with power and space. So If the disk is able to write 
constantly 48 MB/s then the Operating System should do this!

> - use XFS. AFAIK XFS has about the best "large file" and "high
> bandwidth" characteristics.
> - that with XFS you can preallocate the files doesn't seem relevant in
> this case. It's more for the case that you write several files
> simultaneously over a longer period of time.
> - Write to one large file and separate the individual files later.
> 
> if you are sure that you don't get a power-failure:
> - Disable Write-Barriers, especially on a logging-filesystem.
> - Enable write-caching.
> (hdparm doesn't appear to be able to do that with a SATA-disc, but
> blktool appears to be able to)
> The later has a good chance of corrupting your filesystem when you do
> get a power-failure!!!
> 
> 
> 
> *:
> I don't think you want something from the server-line,
> SCSI/FibreChannel/...?
> IIRC i read a something about the first 100MB/s disc with in the 15-KRPM
> league.

Power consumption! See above.
> 
> See you
> 
The problem is: FreeBSD is fast, but lacks of some special drivers. Linux has 
all drivers but access to harddisk is unpredictable and thus unreliable!
What can I do??



Re: SATA-performance: Linux vs. FreeBSD

2007-02-13 Thread Martin A. Fink
Am Dienstag, 13. Februar 2007 12:25 schrieben Sie:
  Well they do. The Flash disk I have (SATA-I) is capable of 48 MB/s and 
this 
  value is reached over the whole disk size by windows as well as by 
FreeBSD. 
  See my test results in the first thread.
 
 Ok a flash disk should be more stable
 
  My Seagate Barracuda Harddisk drive (SATA-II) starts with 76 MB/s and 
  decreases linearly to 35 MB/s due to the fact that it has to write to a 
  rotating disk. But on a flash disk there is nothing rotating...
 
 The hard disk one isn't guaranteed or stable but the flash especially if
 it is aimed at it ought to behave.
 
  So where is the difference between SATA-I and SATA-II ?
 
 All physical side if they are on the same controller when you do the
 tests. Mostly latency,
 
  And why is FreeBSD able to write with constant rates (the complete 25 GB, 
all 
  with 48+/-0.1 MB/s) but Linux 2.6.18 not ?
 
 Does the FreeBSD fsync sync to media ? Also what controller is being used
 here, and do you have EHCI USB support running ?
The FreeBSD manual page for fsync says it syncs to media.

I used the same controller: same computer, same hard disk. Two partitions on 
the system disk, one for Linux, one for FreeBSD.

EHCI:

ehci_hcd 0000:00:1d.7: EHCI Host Controller
ehci_hcd 0000:00:1d.7: USB 2.0 started, EHCI 1.00, driver 10 Dec 2004
usb usb1: Product: EHCI Host Controller

AHCI

ahci 0000:00:1f.2: AHCI 0001.0100 32 slots 4 ports 3 Gbps 0xf impl SATA mode

 
> > With a dedicated (rotating) SATA II device, using the first 70% of disk
> > space no problem -- tested ! With a SATA-I device only a problem with
> > Linux 2.6.18
>
> I suspect the SATA-1 itself may not be the decider but something else -
> eg the hard disk using NCQ, which would cover up any latency related
> problems.
>
> > Journaling of data: you are right, ext2 performs better than ext3.
>
> And ext3 in writeback mode ought in theory (but practice is always
> harder ;)) be faster than ext2.
 
 

-- 
Dipl. Physiker
Martin Anton Fink
Max Planck Institute for extraterrestrial Physics
Giessenbachstrasse
85741 Garching
Germany
Tel. +49-(0)89-3-3645
Fax. +49-(0)89-3-3569


Re: SATA-performance: Linux vs. FreeBSD

2007-02-13 Thread Martin A. Fink
On Tuesday, 13 February 2007 13:24, you wrote:
> Martin A. Fink wrote:
>
> > Also you have skipped the information how the images arrive on the
> > system (PCI(e) card?), that may be important for an end to end view of
> > the problem.
> >
> > Images arrive via Gigabit Ethernet. GigE Vision standard. (PCIe x4)
>
> Then the next question is: ChipSet/Used Protocol/JumboFrames/(NAPI)/... .
>
> Have you already determined the load caused by this part?
> Depending on the GigE-Chipset, and Protocol/JumboFrames/(NAPI)/..., the
> involved overhead can be quite serious.
>
> > And what's also missing. What is a long period of time.
> > Calculating best-case with the SSD:
> > 27GB divided by 30MB/s only gives a bit more than 15 Minutes.
> > And worst case with 50MB/s is less than 10 Minutes.
> >
> > Well. The testdrive has 27GB. The final drive will have 225 GB. And
> > there will be 3 cameras and thus 3 disks. This means we talk about
> > 140 MB/s for around 90 minutes.
> > For space applications with low power but high performance this is a
> > long time... ;-)
>
> The MB/CPU/RAM will be the one specified in the first mail?
> My gut feeling says: Forget it.
>
> The needed total bandwidth may be too high and at least the incoming part
> via GigE may have serious overhead.
> 150MB/s in via (at least 2) GigE, without Zero-Copy there is another
> 150MB/s memory to memory.
> Then there is the next 150MB/s memory to the discs, without Zero-Copy
> there is also another 150MB/s memory to memory.
> In total that's 300MB/s to 600MB/s without any processing.

I don't understand your calculation: each of the 3 GigE ports delivers around 
50 MB/s. These 150 MB/s altogether have to be copied to memory. From there 
they are copied to disk. So we are talking about 2 x 150 MB/s running through 
my system. That is less than 2 PCIe lanes can handle... And there are more 
than 2 lanes between north and south bridge.
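
The two estimates in this exchange differ only in how many extra memory-to-memory copies are assumed. A minimal sketch of that arithmetic follows; the function names are made up for illustration, and 1 GB is taken as 1000 MB, which is an assumption:

```c
#include <assert.h>

/* Total traffic in MB/s through the memory system for a stream of
 * rate_mb_s: one transfer in (network to memory) and one transfer out
 * (memory to disk), plus extra_copies memory-to-memory copies. */
double memory_traffic_mb_s(double rate_mb_s, int extra_copies)
{
    assert(rate_mb_s >= 0.0 && extra_copies >= 0);
    return rate_mb_s * (2 + extra_copies);
}

/* Minutes needed to fill total_gb (1 GB assumed to be 1000 MB)
 * at a sustained rate of rate_mb_s. */
double fill_minutes(double total_gb, double rate_mb_s)
{
    assert(rate_mb_s > 0.0);
    return total_gb * 1000.0 / rate_mb_s / 60.0;
}
```

With 150 MB/s, memory_traffic_mb_s(150, 0) gives the 2 x 150 MB/s figure and memory_traffic_mb_s(150, 2) the 600 MB/s upper bound from the quoted mail; fill_minutes(3 * 225, 150) comes out at 75 minutes.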
 
> But on the other hand, hdparm -T says my system (Core2Duo E6700, FSB1066,
> 2GB DDR2-800 RAM, 32Bit) has a buffer-cache bandwidth around 4000MB/s.
> As you didn't say which FSB and Memory-Type you have I would guess that
> your system should reach between 2000MB/s and 3500MB/s of LINEAR(!) memory
> bandwidth.
> (Total usable Memory-Bandwidth is unfortunately also dependent on usage
> pattern. Large linear is not as important as with a rotating HDD, but it
> factors in)
>
> Btw. On the topic of filesystem and Linux performance:
> SGI did a really big test some time ago with a big iron having 24
> Itanium2-CPUs in 12 nodes, and 12*2 GB of ram and having 256 discs using
> XFS (which is from SGI!).
> The pdf-file is here:
> http://oss.sgi.com/projects/xfs/papers/ols2006/ols-2006-paper.pdf
>
> According to the paper the system had a theoretical peak IO-performance of
> 11.5 GB/s and practically peaked at 10.7GB/s reading and 8.9GB/s writing.
> IOW Linux and XFS CAN perform quite well, but the system has to have enough
> muscle for the job.
> And since the paper (and Kernel 2.6.5) the development of Linux hasn't
> stopped.
>
> -- 
> Real Programmers consider "what you see is what you get" to be just as
> bad a concept in Text Editors as it is in women. No, the Real Programmer
> wants a "you asked for it, you got it" text editor -- complicated,
> cryptic, powerful, unforgiving, dangerous.
 
 

-- 
Dipl. Physiker
Martin Anton Fink
Max Planck Institute for extraterrestrial Physics
Giessenbachstrasse
85741 Garching
Germany
Tel. +49-(0)89-3-3645
Fax. +49-(0)89-3-3569


Re: SATA-performance: Linux vs. FreeBSD

2007-02-12 Thread Martin A. Fink
On Monday, 12 February 2007 19:41, you wrote:
> "Martin A. Fink" <[EMAIL PROTECTED]> writes:
> 
> Your mailer seems to be broken. It drops cc.
> > 
> > If you call fsync in BSD then you get what you expect. anything that is
> > still not on disk will be written. Afterwards fsync returns... So this
> > should be the same like with linux?!
> 
> Not necessarily.  The disk may buffer additionally. Handling that
> differs widely, but modern Linux forces flushes to platter if the hardware
> supports it.
> 
> > But the big question still is -- buffered or not -- where do the big 
> > variations within linux come from? I am not writing small blocks. I write 
> > huge amounts of data.
> 
> 1MB is nowhere near huge by modern standards. Many IO subsystems are
> only happy with multi MB requests. 
> 
> > So the buffer will always be full.
> 
> Hardly. Especially not if you do synchronous fsync inbetween.

Well no. I write 1 GB in blocks of 1 MB. After that I call fsync. Then I 
process the next Gigabyte...
> 
> > If I use a normal SATA-II disk, there are no differences between BSD and
> > Linux when writing to the raw device... So it can't be a buffer-problem
> > alone.
> 
> Yes that is something that needs to be investigated. That is why I suggested
> oprofile if your assertion of a more CPU overhead on Linux is true.
> 
> > I still don't understand the buffer argument. If one writes 25 GB in
> > blocks of 1 MB your buffer should be always full...
> 
> Your mental model of a IO subsystem seems to be quite off.
> Think what happens when you fsync and submit synchronously.

See above, how I do writing.
> 
> It's like sending something down a long pipe and waiting until it arrives
> at the bottom and you hear the echo of the impact. Then only then you send
> again.
> There will be always long periods when the pipe will be empty.
> 
> If you use large enough blocks these gaps will be quite small and
> might effectively become unimportant, but 1MB is nowhere near big enough 
> for that.

I tested this: When I write in blocks of 8kB or less the effect you describe 
happens. But above 100kB blocksize there is no more increase of speed.
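
The pipe picture and the block-size observation can be put into one simple model: if every request waits for the previous one to complete, each request costs a fixed latency plus its transfer time. The sketch below is only that model, not a measurement; the function name and the latency value used in the note are assumptions:

```c
#include <assert.h>

/* Effective throughput in MB/s when each request of block_mb megabytes
 * is submitted only after the previous one has completed.
 * latency_s is the assumed fixed cost per request in seconds,
 * bandwidth_mb_s the raw transfer rate of the device. */
double sync_throughput_mb_s(double block_mb, double latency_s,
                            double bandwidth_mb_s)
{
    assert(block_mb > 0.0 && latency_s >= 0.0 && bandwidth_mb_s > 0.0);
    double transfer_s = block_mb / bandwidth_mb_s;
    return block_mb / (latency_s + transfer_s);
}
```

With a raw rate of 48 MB/s and an assumed per-request latency of 0.1 ms, the model gives roughly 30 MB/s for 8 kB blocks but close to 46 MB/s for 100 kB and close to 48 MB/s for 1 MB, which would fit a curve that flattens above 100 kB. A larger assumed latency moves the knee to larger blocks.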

> 
> > Is there a buffered io device that I can use, but that does not use a 
> > filesystem?
> 
> /dev/sdX*. However it has some other issues that also don't make
> it ideal. File systems are usually best.

My experience with filesystems is: I write some data and the write function 
returns nearly immediately. So I write again. Sometimes it returns only 
after some 100-300 ms. I think this always happens when the buffer is 
full and Linux therefore starts to write to disk. After this has happened, it 
returns nearly immediately again, and after another while the same trouble 
happens again. But not at regular intervals...

I have to store big amounts of data coming from 2 digital cameras to disk. 
Thus I have to write blocks of around 1 MB at 30 to 50 frames per second for 
a long period of time. So it is important for me that the hard disk drive is 
reliable in the sense of "if it is capable of 50 MB/s then it should operate 
at this speed. Constantly."
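
One way to make the occasional 100-300 ms stalls visible is to time every single write() call and count the slow ones. The following is a minimal sketch under the assumption of a POSIX system with clock_gettime(); the struct, the function names and the threshold parameter are invented for illustration and are not taken from the test program:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

struct write_stats {
    long writes;    /* number of write() calls completed */
    long slow;      /* calls that took longer than the threshold */
    double max_ms;  /* longest single write() call in milliseconds */
};

static double now_ms(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec * 1000.0 + ts.tv_nsec / 1e6;
}

/* Write blocks blocks of block_size bytes to path and time every
 * write() call. Returns 0 on success, -1 on error or short write. */
int measure_write_latency(const char *path, long blocks, size_t block_size,
                          double slow_ms, struct write_stats *out)
{
    assert(blocks > 0 && block_size > 0 && out != NULL);
    char *block = malloc(block_size);
    if (!block)
        return -1;
    memset(block, 0, block_size);
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0666);
    if (fd < 0) {
        free(block);
        return -1;
    }
    out->writes = 0;
    out->slow = 0;
    out->max_ms = 0.0;
    int rc = 0;
    for (long bl = 0; bl < blocks; ++bl) {
        double t0 = now_ms();
        ssize_t n = write(fd, block, block_size);
        double dt = now_ms() - t0;
        if (n != (ssize_t)block_size) {  /* error or short write */
            rc = -1;
            break;
        }
        out->writes++;
        if (dt > slow_ms)
            out->slow++;
        if (dt > out->max_ms)
            out->max_ms = dt;
    }
    close(fd);
    free(block);
    return rc;
}
```

Called with a block size of 1 MB and a threshold of, say, 50 ms, the slow counter and max_ms would show how often and how badly a write stalls, independent of the average rate.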

> 
> -Andi
> 


Re: SATA-performance: Linux vs. FreeBSD

2007-02-12 Thread Martin A. Fink
System Details:

dmesg: (parts)

Bootdata ok (command line is root=/dev/sda7 vga=0x31a resume=/dev/sda5 
splash=silent)
Linux version 2.6.18.2-34-default ([EMAIL PROTECTED]) (gcc version 4.1.2 
20061115 (prerelease) (SUSE Linux)) #1 SMP Mon Nov 27 11:46:27 UTC 2006
...
Using ACPI (MADT) for SMP configuration information
...
Intel(R) Core(TM)2 CPU  6300  @ 1.86GHz stepping 06
Brought up 2 CPUs
...
ACPI: Processor [CPU1] (supports 8 throttling states)
ACPI: Processor [CPU2] (supports 8 throttling states)
...
ICH7: IDE controller at PCI slot 0000:00:1f.1
GSI 18 sharing vector 0xD9 and IRQ 18
ACPI: PCI Interrupt 0000:00:1f.1[A] -> GSI 22 (level, low) -> IRQ 217
ICH7: chipset revision 1
ICH7: not 100% native mode: will probe irqs later
ide0: BM-DMA at 0xffa0-0xffa7, BIOS settings: hda:DMA, hdb:pio
ide1: BM-DMA at 0xffa8-0xffaf, BIOS settings: hdc:pio, hdd:pio
Probing IDE interface ide0...
hda: HL-DT-STDVD-RAM GSA-H22N, ATAPI CD/DVD-ROM drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
Probing IDE interface ide1...
libata version 2.00 loaded.
ahci 0000:00:1f.2: version 2.0
GSI 19 sharing vector 0xE1 and IRQ 19
ACPI: PCI Interrupt 0000:00:1f.2[B] -> GSI 23 (level, low) -> IRQ 225
PCI: Setting latency timer of device 0000:00:1f.2 to 64
ahci 0000:00:1f.2: AHCI 0001.0100 32 slots 4 ports 3 Gbps 0xf impl SATA mode
ahci 0000:00:1f.2: flags: 64bit ncq led clo pio slum part 
ata1: SATA max UDMA/133 cmd 0xC2026D00 ctl 0x0 bmdma 0x0 irq 233
ata2: SATA max UDMA/133 cmd 0xC2026D80 ctl 0x0 bmdma 0x0 irq 233
ata3: SATA max UDMA/133 cmd 0xC2026E00 ctl 0x0 bmdma 0x0 irq 233
ata4: SATA max UDMA/133 cmd 0xC2026E80 ctl 0x0 bmdma 0x0 irq 233
scsi0 : ahci
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata1.00: ATA-7, max UDMA/133, 156301488 sectors: LBA48 NCQ (depth 31/32)
ata1.00: ata1: dev 0 multi count 16
ata1.00: configured for UDMA/133
scsi1 : ahci
ata2: SATA link down (SStatus 0 SControl 300)
scsi2 : ahci
ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata3.00: ATA-6, max UDMA/100, 57337056 sectors: LBA 
ata3.00: ata3: dev 0 multi count 1
ata3.00: applying bridge limits
ata3.00: configured for UDMA/100
scsi3 : ahci
ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata4.00: ATA-7, max UDMA/133, 488397168 sectors: LBA48 NCQ (depth 31/32)
ata4.00: ata4: dev 0 multi count 16
ata4.00: configured for UDMA/133
  Vendor: ATA   Model: ST380811AS   Rev: 3.AA
Losing some ticks... checking if CPU frequency changed.
  Type:   Direct-Access  ANSI SCSI revision: 05
SCSI device sda: 156301488 512-byte hdwr sectors (80026 MB)
sda: Write Protect is off
sda: Mode Sense: 00 3a 00 00
SCSI device sda: drive cache: write back
SCSI device sda: 156301488 512-byte hdwr sectors (80026 MB)
sda: Write Protect is off
sda: Mode Sense: 00 3a 00 00
SCSI device sda: drive cache: write back
 sda: sda1 sda2 sda3 sda4 < sda5 sda6 sda7 sda8 >
 sda2: <bsd: sda9 sda10 sda11 sda12 sda13 >
sd 0:0:0:0: Attached scsi disk sda
  Vendor: ATA   Model: Adtron A25FB-28G  Rev: BF22
  Type:   Direct-Access  ANSI SCSI revision: 05
SCSI device sdb: 57337056 512-byte hdwr sectors (29357 MB)
sdb: Write Protect is off
sdb: Mode Sense: 00 3a 00 00
SCSI device sdb: drive cache: write through
SCSI device sdb: 57337056 512-byte hdwr sectors (29357 MB)
sdb: Write Protect is off
sdb: Mode Sense: 00 3a 00 00
SCSI device sdb: drive cache: write through
 sdb: sdb1
sd 2:0:0:0: Attached scsi disk sdb
  Vendor: ATA   Model: ST3250820AS   Rev: 3.AA
  Type:   Direct-Access  ANSI SCSI revision: 05
SCSI device sdc: 488397168 512-byte hdwr sectors (250059 MB)
sdc: Write Protect is off
sdc: Mode Sense: 00 3a 00 00
SCSI device sdc: drive cache: write back
sd 0:0:0:0: Attached scsi generic sg0 type 0
sd 2:0:0:0: Attached scsi generic sg1 type 0
SCSI device sdc: 488397168 512-byte hdwr sectors (250059 MB)
sdc: Write Protect is off
sdc: Mode Sense: 00 3a 00 00
SCSI device sdc: drive cache: write back
 sdc:
sd 3:0:0:0: Attached scsi disk sdc
sd 3:0:0:0: Attached scsi generic sg2 type 0
...


strace output:

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 73.73   49.904049        1947     25627           write
 25.66   17.365062      694602        25           fsync
  0.62    0.416500       59500         7           close
  0.00    0.000000           0         4           read
  0.00    0.000000           0         7           open
  0.00    0.000000           0         5           fstat
  0.00    0.000000           0        16           mmap
  0.00    0.000000           0         7           mprotect
  0.00    0.000000           0         1           munmap
  0.00    0.000000           0         3           brk
  0.00    0.000000           0         1         1 access
  0.00    0.000000           0         1           execve
  0.00    0.000000           0         1           uname
  0.00    0.000000           0 

Re: SATA-performance: Linux vs. FreeBSD

2007-02-12 Thread Martin A. Fink
Some more info:

:~> strace -c -T -o trace.out dd if=/dev/zero of=test.txt bs=10MB count=200

200+0 records in
200+0 records out
2000000000 bytes (2.0 GB) copied, 52.8632 seconds, 37.8 MB/s

test.txt:

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 93.26    6.845265       33555       204           write
  6.41    0.470283       11757        40        18 open
  0.32    0.023687         116       205           read
  0.00    0.000149           9        16           mmap2
  0.00    0.000119          40         3           munmap
  0.00    0.000081           3        24           close
  0.00    0.000068           6        11           old_mmap
  0.00    0.000064           3        20           fstat64
  0.00    0.000040           4        10           rt_sigaction
  0.00    0.000036          12         3           madvise
  0.00    0.000014           7         2           clock_gettime
  0.00    0.000010           3         3           brk
  0.00    0.000008           8         1           _sysctl
  0.00    0.000007           7         1         1 access
  0.00    0.000006           6         1           mprotect
  0.00    0.000005           5         1           futex
  0.00    0.000004           4         1           uname
  0.00    0.000004           4         1           _llseek
  0.00    0.000003           3         1           rt_sigprocmask
  0.00    0.000003           3         1           getrlimit
  0.00    0.000003           3         1           set_thread_area
  0.00    0.000003           3         1           set_tid_address
------ ----------- ----------- --------- --------- ----------------
100.00    7.339862                   551        19 total

This means that the CPU is busy for only 7.3 of the 52.8 seconds. One can 
hear this, too: when I run programs whose run time matches the time strace 
reports, I have 100% CPU load and the CPU fan starts to blow heavily. In the 
case here, the fan does not do anything. It looks like the SATA driver simply 
blocks the CPU while doing whatever...
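
The ratio behind this statement can be checked directly. As far as I know, strace -c reports by default the CPU time spent in the kernel for each call rather than wall-clock time, so the comparison is CPU seconds against elapsed seconds. A small sketch, with function names that are assumptions for illustration; process_cpu_seconds() shows how a test program could obtain its own CPU time through getrusage():

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/resource.h>
#include <sys/time.h>

/* Fraction of the wall-clock time during which the CPU was busy. */
double cpu_fraction(double cpu_s, double wall_s)
{
    assert(cpu_s >= 0.0 && wall_s > 0.0);
    return cpu_s / wall_s;
}

/* User plus system CPU time consumed so far by the calling process,
 * in seconds, or -1.0 if getrusage() fails. */
double process_cpu_seconds(void)
{
    struct rusage ru;
    if (getrusage(RUSAGE_SELF, &ru) != 0)
        return -1.0;
    return (double)(ru.ru_utime.tv_sec + ru.ru_stime.tv_sec)
         + (ru.ru_utime.tv_usec + ru.ru_stime.tv_usec) / 1e6;
}
```

For the dd run above, cpu_fraction(7.339862, 52.8632) is a little under 0.14, so the process spent most of the run waiting rather than computing.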


Re: SATA-performance: Linux vs. FreeBSD

2007-02-12 Thread Martin A. Fink
On Monday, 12 February 2007 18:04, Andi Kleen wrote:
> "Martin A. Fink" <[EMAIL PROTECTED]> writes:
> > 
> > What I did:
> > I wrote blocks of 1 MB size to file. Each 1 GB I made a fsync and took
> > the time. For those tests with filesystems I wrote files of 1 GB size,
> > otherwise I just wrote to the raw device.
> 
> Newer Linux versions depending on the disk and the file system will tell
> the disk to flush the buffers to disk on fsync. FreeBSD might or might not
> do that, but if it doesn't it would explain the difference.

If you call fsync in BSD then you get what you expect. anything that is still 
not on disk will be written. Afterwards fsync returns... So this should be 
the same like with linux?!
> 
> > 
> > Results: -1-
> > 
> > Test              OpenSuSE (AHCI)             FreeBSD (AHCI)
> > ---------------------------------------------------------------------
> > SSD(vfat 25GB)    41+/-2 MB/s at 4-10%        15+/-0 MB/s at 2% CPU
> 
> vfat is certainly not a performance optimized file system.
That is just a minor test.
> 
> > SSD(raw  25GB)    26+/-1 MB/s at 4-10% CPU    48+/-0 MB/s at 1% CPU

The above line is what makes me wondering !!!

> > SSD(ext3 25GB)    39+/-5 MB/s at 10-15% CPU   34+/-0 MB/s at 14% CPU
> > SSD(ext2 25GB)    42+/-1 MB/s at 10-15% CPU   32+/-0 MB/s at 10% CPU
> 
> 
> You could use oprofile (http://oprofile.sourceforge.net) to find out
> where the CPU is being used.
> 
> 
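> > ---------------------------------------------------------------------
> > 
> > Test              OpenSuSE (AHCI off)         FreeBSD (AHCI off)
> > ---------------------------------------------------------------------
> > SSD(vfat 25GB)    22+/-4 MB/s at 6-19% CPU    --
> > SSD(raw  25GB)    33+/-4 MB/s at 7-14% CPU    41+/-0 MB/s at 1% CPU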
> 
> I remember vaguely (but I might be wrong here) the standard block
> character devices on FreeBSD are buffered, while raw is truly
> unbuffered on Linux. Naive programs (no optimized IO threads or aio) 
> on truly unbuffered devices tend to perform poorly because they
> don't do any write behind.

But the big question still is -- buffered or not -- where do the big 
variations within Linux come from? I am not writing small blocks. I write 
huge amounts of data. So the buffer will always be full. And: Linux is even 
slower than BSD if it can use a buffer. The maximum performance of Linux is 
42 MB/s (buffered) while the maximum performance of BSD is 48 MB/s (buffered 
or not -- I don't know).
If I use a normal SATA-II disk, there are no differences between BSD and Linux 
when writing to the raw device... So it can't be a buffer problem alone.
> 
> It might also useful if you post the libata related parts of your
> boot log.

> > 
> > Question 2:
> > Can anybody explain to me, why writing to a solid state disk (a kind of
> > memory that always has the same constant bandwidth) has such big
> > standard errors in writing rate using Linux (between 1 to 6 MB/s error)
> > while FreeBSD gives an almost constant writing rate (as one would expect
> > it for a SSD) ?
> 
> Could be buffered vs unbuffered. Unbuffered single threaded writes
> tend to be quite variable.
This does not answer the big variation when writing with ext3 of +/- 5 MB/s.

I still don't understand the buffer argument. If one writes 25 GB in blocks of 
1 MB your buffer should be always full...
> 
> > Question 3:
> > Why is writing to a raw device in Linux slower than using e.g. ext2 ?
> > And why is Linux writing rate much lower (-12.5 % for the best case)
> > compared to writing rate of FreeBSD?
> 
> It's really hard to make raw io perform well without complicated
> efforts because nobody will hide the IO latencies. That is why
> buffered IO is normally recommend

Is there a buffered io device that I can use, but that does not use a 
filesystem?

> 
> -Andi
> 

-- 
Dipl. Physiker
Martin Anton Fink
Max Planck Institute for extraterrestrial Physics
Giessenbachstrasse
85741 Garching
Germany
Tel. +49-(0)89-3-3645
Fax. +49-(0)89-3-3569


SATA-performance: Linux vs. FreeBSD

2007-02-12 Thread Martin A. Fink
Dear all,

I did some performance tests that made me really wonder:

My Hardware:
Asus P5LD2 board with Intel i945P chipset, ICH7R southbridge
CPU Intel Core 2 Duo E6300 at 1.86 GHz, 2 MB Cache
1 GB RAM
My Software:
OpenSuSE 10.2 with Linux kernel 2.6.18, x86-64 architecture
FreeBSD 6.2

Testdrives:
1. HDD: Seagate ST3250820AS RPM 7200.9, 8 MB Cache, 250 GB, SATA-II
   (Harddisk Drive)
2. SSD: Adtron AF25FB, 27GB, SATA Revision 1.0a (Solid State Disk)

What I did:
I wrote blocks of 1 MB size to file. Each 1 GB I made a fsync and took the 
time. For those tests with filesystems I wrote files of 1 GB size, otherwise 
I just wrote to the raw device.

Results: -1-

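Test              OpenSuSE (AHCI)             FreeBSD (AHCI)
---------------------------------------------------------------------
SSD(vfat 25GB)    41+/-2 MB/s at 4-10%        15+/-0 MB/s at 2% CPU
SSD(raw  25GB)    26+/-1 MB/s at 4-10% CPU    48+/-0 MB/s at 1% CPU
SSD(ext3 25GB)    39+/-5 MB/s at 10-15% CPU   34+/-0 MB/s at 14% CPU
SSD(ext2 25GB)    42+/-1 MB/s at 10-15% CPU   32+/-0 MB/s at 10% CPU
---------------------------------------------------------------------

Test              OpenSuSE (AHCI off)         FreeBSD (AHCI off)
---------------------------------------------------------------------
SSD(vfat 25GB)    22+/-4 MB/s at 6-19% CPU    --
SSD(raw  25GB)    33+/-4 MB/s at 7-14% CPU    41+/-0 MB/s at 1% CPU
SSD(ext2 25GB)    27+/-6 MB/s at 6-14% CPU    --
---------------------------------------------------------------------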

Question 1:
Can anybody explain to me, why writing to a SATA-I device with AHCI consumes 
so much CPU time using Linux, while it takes almost no CPU time on FreeBSD 
6.2 ? Especially comparing values of writing to the raw device?

Question 2:
Can anybody explain to me, why writing to a solid state disk (a kind of memory 
that always has the same constant bandwidth) has such big standard errors in 
writing rate using Linux (between 1 to 6 MB/s error) while FreeBSD gives an 
almost constant writing rate (as one would expect it for a SSD) ?

Question 3:
Why is writing to a raw device in Linux slower than using e.g. ext2 ? And why 
is Linux writing rate much lower (-12.5 % for the best case) compared to 
writing rate of FreeBSD?

Question 4:
When writing to the SATA-II HDD Linux is around 10% slower than FreeBSD when 
using ext3, but around as fast as FreeBSD when writing raw. Why?
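
For reference, the +/- figures in the tables and the -12.5 % in Question 3 are plain statistics over the per-gigabyte rates. The sketch below shows one way such numbers could be computed; it is an assumption about the method, not the code used for the tables, and the function names are invented. The standard deviation is the square root of the sample variance returned here:

```c
#include <assert.h>
#include <stddef.h>

/* Arithmetic mean of n per-gigabyte write rates in MB/s. */
double mean_rate(const double *rates, size_t n)
{
    assert(rates != NULL && n > 0);
    double sum = 0.0;
    for (size_t i = 0; i < n; ++i)
        sum += rates[i];
    return sum / (double)n;
}

/* Sample variance (divisor n - 1) of the rates; its square root is
 * the standard deviation quoted as the +/- value. */
double sample_variance(const double *rates, size_t n)
{
    assert(rates != NULL && n > 1);
    double m = mean_rate(rates, n);
    double acc = 0.0;
    for (size_t i = 0; i < n; ++i)
        acc += (rates[i] - m) * (rates[i] - m);
    return acc / (double)(n - 1);
}

/* Relative difference of value against reference, in percent. */
double relative_diff_percent(double value, double reference)
{
    assert(reference != 0.0);
    return (value - reference) / reference * 100.0;
}
```

Three measurements of 25, 26 and 27 MB/s give a mean of 26 and a variance of 1, i.e. 26+/-1 MB/s; relative_diff_percent(42, 48) reproduces the -12.5 % between the best Linux value and the FreeBSD raw value.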


How can I improve the speed of Linux?
Thanks for any advice,

Martin

PS: part of my testcode:

  int fd=open(fileName, O_WRONLY | O_CREAT | O_TRUNC, 0666);
  (void)gettimeofday(&start, 0);
  for (long bl=0; bl < blocksPerGigaByte; ++bl)
    write(fd, block, blockSize);
  fsync(fd);
  (void)gettimeofday(&ende, 0);
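
A self-contained variant of this fragment, for anyone who wants to repeat the measurement, could look like the sketch below. It is not the original test program: the function name, the error handling and the fill pattern are assumptions added for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/time.h>
#include <unistd.h>

/* Write blocks blocks of block_size bytes to path, call fsync() once at
 * the end and return the elapsed wall-clock seconds, or -1.0 on error. */
double timed_write(const char *path, long blocks, size_t block_size)
{
    assert(blocks > 0 && block_size > 0);
    char *block = malloc(block_size);
    if (!block)
        return -1.0;
    memset(block, 0xA5, block_size);

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0666);
    if (fd < 0) {
        free(block);
        return -1.0;
    }

    struct timeval start, end;
    int failed = 0;
    gettimeofday(&start, NULL);
    for (long bl = 0; bl < blocks && !failed; ++bl) {
        size_t done = 0;
        while (done < block_size) {  /* retry after short writes */
            ssize_t n = write(fd, block + done, block_size - done);
            if (n <= 0) {
                failed = 1;
                break;
            }
            done += (size_t)n;
        }
    }
    if (!failed && fsync(fd) != 0)
        failed = 1;
    gettimeofday(&end, NULL);

    close(fd);
    free(block);
    if (failed)
        return -1.0;
    return (double)(end.tv_sec - start.tv_sec)
         + (end.tv_usec - start.tv_usec) / 1e6;
}
```

Calling timed_write(path, 1024, 1024 * 1024) once per gigabyte and dividing 1024 MB by the returned seconds yields one per-gigabyte rate like those in the tables.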


Re: SATA-performance: Linux vs. FreeBSD

2007-02-12 Thread Martin A. Fink
On Monday, 12 February 2007 19:41, you wrote:
> Martin A. Fink [EMAIL PROTECTED] writes:
>
> Your mailer seems to be broken. It drops cc.
>
> > If you call fsync in BSD then you get what you expect. Anything that is
> > still not on disk will be written. Afterwards fsync returns... So this
> > should be the same as with Linux?!
>
> Not necessarily. The disk may buffer additionally. Handling that
> differs widely, but modern Linux forces flushes to platter if the hardware
> supports it.
>
> > But the big question still is -- buffered or not -- where do the big
> > variations within Linux come from? I am not writing small blocks. I write
> > huge amounts of data.
>
> 1MB is nowhere near huge by modern standards. Many IO subsystems are
> only happy with multi MB requests.
>
> > So the buffer will always be full.
>
> Hardly. Especially not if you do synchronous fsync in between.

Well no. I write 1 GB in blocks of 1 MB. After that I call fsync. Then I 
process the next Gigabyte...
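
The write pattern described here (fixed-size blocks, one fsync per chunk, time
measured per chunk) can be sketched roughly as follows. This is not the
original test program, which is not shown in the thread; the function names
`timed_write_test` and `throughput_mb_per_s` are made up for illustration, and
the example run is scaled down so that it finishes quickly on a temporary file
instead of a raw device.

```python
import os
import tempfile
import time


def timed_write_test(path, block_size, blocks_per_chunk, chunks):
    """Write `chunks` chunks of `blocks_per_chunk` blocks, fsync after each chunk.

    Returns the elapsed wall-clock seconds for each chunk (fsync included).
    """
    block = b"\0" * block_size
    elapsed = []
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        for _ in range(chunks):
            start = time.monotonic()
            for _ in range(blocks_per_chunk):
                # os.write may write fewer bytes than asked; loop until done.
                view = memoryview(block)
                while view:
                    written = os.write(fd, view)
                    view = view[written:]
            os.fsync(fd)
            elapsed.append(time.monotonic() - start)
    finally:
        os.close(fd)
    return elapsed


def throughput_mb_per_s(block_size, blocks_per_chunk, seconds):
    """Chunk size in MB (2**20 bytes) divided by the elapsed seconds."""
    return block_size * blocks_per_chunk / 2**20 / seconds


if __name__ == "__main__":
    # Scaled down (16 blocks of 64 KiB per chunk); the test described in the
    # message used 1 MB blocks and one fsync per GB.
    with tempfile.TemporaryDirectory() as tmp:
        times = timed_write_test(os.path.join(tmp, "out.bin"), 64 * 1024, 16, 3)
        for i, t in enumerate(times):
            print(f"chunk {i}: {t:.4f} s")
```

For the sizes used in the message one would pass `block_size=2**20` and
`blocks_per_chunk=1024`; pointing `path` at a device node instead of a file is
left out here because it overwrites the device.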
 
> > If I use a normal SATA-II disk, there are no differences between BSD and
> > Linux when writing to the raw device... So it can't be a buffer problem
> > alone.
>
> Yes that is something that needs to be investigated. That is why I suggested
> oprofile if your assertion of more CPU overhead on Linux is true.
>
> > I still don't understand the buffer argument. If one writes 25 GB in
> > blocks of 1 MB your buffer should be always full...
>
> Your mental model of an IO subsystem seems to be quite off.
> Think what happens when you fsync and submit synchronously.

See above for how I do the writing.
 
> It's like sending something down a long pipe and waiting until it arrives
> at the bottom and you hear the echo of the impact. Then, and only then, you
> send again. There will always be long periods when the pipe is empty.
>
> If you use large enough blocks these gaps will be quite small and
> might effectively become unimportant, but 1MB is nowhere near big enough
> for that.

I tested this: when I write in blocks of 8 kB or less, the effect you describe
happens. But above a block size of 100 kB there is no further increase in
speed.

 
> > Is there a buffered io device that I can use, but that does not use a
> > filesystem?
>
> /dev/sdX*. However it has some other issues that also don't make
> it ideal. File systems are usually best.

My experience with filesystems is this: I write some data and the write
function returns almost immediately. So I write again. Sometimes it returns
only after some 100-300 ms. I think this always happens when the buffer is
full and Linux starts to write to disk. After this has happened, it again
returns almost immediately, and after a while the same trouble happens
again. But not at regular intervals...
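
One way to make these occasional slow writes visible is to time every write
call and list the ones above a threshold. The following is only a sketch under
that assumption; `write_latencies` and `stalls` are hypothetical helper names,
and the 100 ms threshold is taken from the lower end of the range mentioned
above.

```python
import os
import tempfile
import time


def write_latencies(fd, block, count):
    """Return the wall-clock duration in seconds of each of `count` writes."""
    durations = []
    for _ in range(count):
        start = time.monotonic()
        os.write(fd, block)  # short writes are ignored in this sketch
        durations.append(time.monotonic() - start)
    return durations


def stalls(durations, threshold_s):
    """Return (index, duration) pairs for writes slower than `threshold_s`."""
    return [(i, d) for i, d in enumerate(durations) if d > threshold_s]


if __name__ == "__main__":
    block = b"\0" * 2**20  # 1 MB per write, as in the message
    with tempfile.TemporaryFile() as f:
        durations = write_latencies(f.fileno(), block, 64)
    for index, seconds in stalls(durations, 0.100):
        print(f"write {index} took {seconds * 1000:.0f} ms")
```

Whether any stalls show up depends on the amount of data written, free memory,
and the kernel's writeback settings; a short run like this one may well print
nothing.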

I have to store large amounts of data coming from 2 digital cameras to disk.
Thus I have to write blocks of around 1 MB at 30 to 50 frames per second for
a long period of time. So it is important for me that the hard disk drive is
reliable in the sense that if it is capable of 50 MB/s, then it should
operate at this speed. Constantly.
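
As a back-of-the-envelope sketch of what these numbers imply, assuming each
frame is one 1 MB block and that a write stall of 100-300 ms (as described
above) blocks the writer completely, the sustained rate and the number of
frames that would have to be held in memory during a stall can be estimated
like this. The function names are made up for illustration.

```python
def required_rate_mb_per_s(frame_mb, fps):
    """Sustained write rate needed to keep up with the incoming frames."""
    return frame_mb * fps


def frames_during_stall(stall_ms, fps):
    """Frames arriving while the writer is blocked, rounded up.

    Integer arithmetic avoids floating-point surprises in the rounding.
    """
    return -(-stall_ms * fps // 1000)


if __name__ == "__main__":
    for fps in (30, 50):
        print(f"{fps} fps: {required_rate_mb_per_s(1, fps)} MB/s sustained")
        for stall_ms in (100, 300):
            n = frames_during_stall(stall_ms, fps)
            print(f"  {stall_ms} ms stall: about {n} frames to buffer")
```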

 
> -Andi
 


SATA-performance with AHCI

2006-12-04 Thread Martin A. Fink
Dear all,

now I was able to do a performance test with an Intel ICH6R chipset.

Basic hardware data:
- Intel Pentium 4 Xeon at 3.2 GHz
- Intel ICH6R chipset, AHCI enabled
- Intel Hyperthreading On and Off
- 1 GB SDDR RAM
- SATA controller onboard (4x)
- SATA harddisks 250 GB

I used SuSE Linux 9.3 with Linux kernel 2.6.11.4-21.14-smp.

I tried to put data from memory to harddisk by writing blocks of 1 MB size 
with 2 GB overall filesize. And I got the following results:

SuSE 9.3 - 32-bit Installation - Hyperthreading Off:
 - CPU time 16% (sys, approx. 0% user) - Write speed 40-45 MB/s
SuSE 9.3 - 64-bit Installation - Hyperthreading Off:
 - CPU time 14% (sys, approx. 0% user) - Write speed 50-60 MB/s
SuSE 9.3 - 64-bit Installation - Hyperthreading ON:
 - CPU time 16% (sys, approx. 0% user) - Write speed 40-50 MB/s

Compared to ICH6R with AHCI OFF, the only difference I can see is that with
AHCI the system seems to react much faster to keyboard events, and screen
redraw seems to be as fast as normal. It looks as if CPU usage has not
decreased as dramatically as I would have expected.

Thus I did a small calculation:
Assuming that the processor hands workloads of (a) 1B, (b) 1kB, or (c) 64kB
to the DMA controller in AHCI mode in order to write 45 MB/s to disk, I
calculate for 10% CPU time usage of the 3.2 GHz Pentium:
(a) 10% * 3.2GHz / 45M calls = 7.1 CPU cycles per 1B call to DMA
(b) 10% * 3.2GHz / 45k calls = 7.1E+03 CPU cycles per 1kB call to DMA
(c) 10% * 3.2GHz / 720 calls = 4.4E+05 CPU cycles per 64kB call to DMA

For me (a) looks reasonable (some overhead per byte), but stupid - if 
implemented. Giving bigger packages like (b) and (c) looks better to me, but 
then I can't understand that huge overhead (1E3 to 1E5 cpu cycles per 
package) for one package.
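
The estimate above divides the CPU cycles spent per second by the number of
DMA calls per second. A minimal sketch of that arithmetic, using the inputs
stated in the message (3.2 GHz, 10% CPU time, and 45M, 45k, or 720 calls per
second), is given below; `cycles_per_call` is a name chosen for illustration.

```python
def cycles_per_call(cpu_hz, cpu_fraction, calls_per_s):
    """CPU cycles spent per call, given the share of CPU time used for I/O."""
    return cpu_hz * cpu_fraction / calls_per_s


if __name__ == "__main__":
    cpu_hz = 3.2e9       # 3.2 GHz clock, from the message
    cpu_fraction = 0.10  # 10% CPU time, from the message
    cases = (("1B", 45e6), ("1kB", 45e3), ("64kB", 720))
    for label, calls_per_s in cases:
        cycles = cycles_per_call(cpu_hz, cpu_fraction, calls_per_s)
        print(f"{label} per call: {cycles:.3g} cycles")
```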

Is this normal or do I still have something wrong in my system?

Thank you for your help,

Martin


Re: SATA Performance with Intel ICH6

2006-11-28 Thread Martin A. Fink
Dear Alan,

You wrote
> The PIIX interface needs CPU intervention each command, so in practice
> about every 64K or so, and the CPU gets stalled waiting for the disk
> during the setup of each I/O. The newer kernels support AHCI which does
> not have this overhead, but it is only present on the newest intel
> controllers.

Can you tell me the name of these newest controllers? Is it ICH7 or 8 ?
What kernel versions? dmesg only shows ACPI and u/e/o hci_* host controller.
(kernel version is 2.6.8-24.25-smp). How can I switch to AHCI ?

Thank you very much,

Martin

