Re: [squid-users] Hypothetically comparing SATA\SAS to NAS\SAN for squid.

2014-01-19 Thread Kinkie
On Sun, Jan 19, 2014 at 7:42 AM, Eliezer Croitoru elie...@ngtech.co.il wrote:
 While working here and there I have seen that ZFS is a very robust FS.
 I will not compare it to any others because there is no need for that.

 OK so ZFS, ext3, ext4 and others are FSes which sit on spinning disks or
 flash drives.
 The SATA and SAS interfaces are serial interfaces limited (by the standard)
 to 3 or 6 Gbps.
 So SATA maxes out at 6 Gbps while a NIC can have a bandwidth of 10 Gbps.
 Are there any real reasons not to use a 10 Gbps line?

At least two come to mind. First, behind that 10Gbps line there are
still spinning disks and/or SSDs; the transmission line is simply
going to add some latency. Maybe not much (say 10% overhead over
spinning disks due to processing, propagation, transmission, error
correction, etc.), but bandwidth is important and latency even more so.
Second, packet loss: SATA, SAS and FC guarantee 0% packet loss. If
there is any, it is immediately detected and the data is retransmitted.
On Ethernet you're not so sure; on IP-over-Ethernet, even less so. I
was told that a 1% packet loss is enough to completely kill
transmission performance in an FCoE environment, and that's the reason
why people who do FCoE use special (converged) adapters, which look
more like FC adapters than Ethernet adapters.
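
The 1% figure above is plausible: for loss-based TCP, steady-state throughput
falls off with the square root of the loss rate (the Mathis et al.
approximation). A minimal sketch; the MSS, RTT and loss rates are illustrative
assumptions, not measurements:

```python
import math

def mathis_throughput_bps(mss_bytes, rtt_s, loss_rate):
    # Mathis et al. approximation for steady-state TCP throughput:
    # throughput ~ (MSS / RTT) * (1 / sqrt(p))
    return (mss_bytes * 8 / rtt_s) / math.sqrt(loss_rate)

# 1460-byte MSS and 0.2 ms RTT, as on a local storage network (assumed)
clean = mathis_throughput_bps(1460, 0.0002, 0.0001)  # 0.01% loss
lossy = mathis_throughput_bps(1460, 0.0002, 0.01)    # 1% loss
print(f"0.01% loss: {clean/1e9:.2f} Gbps, 1% loss: {lossy/1e9:.2f} Gbps")
```

Going from 0.01% to 1% loss cuts the per-connection ceiling by 10x, which is
consistent with the converged-adapter approach of avoiding loss altogether.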

 For example, if I have 10 SAS or SATA disks, SSD or spinning, in a machine
 with 128GB of RAM, it is possible to allow some flow of data to be used and
 be faster than a single drive, even an SSD.

 A dual 10Gbps machine can potentially be faster in lots of aspects than a
 local disk.

 I do not have the answer, but a NAS might sometimes be the right choice as
 cache storage.

Benchmarks are welcome :)

 Indeed there are overheads for each and every TCP connection, and indeed
 there are many aspects that need to be tested and verified, but I still
 suspect that there are some assumptions that need to be verified to make
 sure that a SAN\NAS might be worth more than it is assumed to be.

The main advantage I can think of for using a NAS is that these
usually have huge RAM caches, which can help by keeping the directory
structure in RAM, thus making small-file retrieval faster than doing
multiple roundtrips to disk.
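
As a back-of-the-envelope illustration of that point: a small-file read that
needs the directory entry, the inode and the data may cost several disk
roundtrips when nothing is cached, versus one when the NAS holds the metadata
in RAM. The seek time and roundtrip counts below are assumptions for a
spinning disk, not measurements:

```python
def retrieval_latency_ms(avg_seek_ms, disk_roundtrips):
    # Each metadata or data access that misses the array's RAM cache
    # costs roughly one disk positioning operation.
    return avg_seek_ms * disk_roundtrips

cold = retrieval_latency_ms(8, 3)  # directory entry + inode + data on disk
warm = retrieval_latency_ms(8, 1)  # directory structure held in NAS RAM
print(f"cold: {cold} ms, warm: {warm} ms")
```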

-- 
/kinkie


Re: [squid-users] Hypothetically comparing SATA\SAS to NAS\SAN for squid.

2014-01-19 Thread Robert Collins
On 19 January 2014 22:15, Kinkie gkin...@gmail.com wrote:
 On Sun, Jan 19, 2014 at 7:42 AM, Eliezer Croitoru elie...@ngtech.co.il 
 wrote:

 The main advantage I can think of for using a NAS is that these
 usually have huge RAM caches, which can help by keeping the directory
 structure in RAM thus making small file retrieval faster than doing
 multiple roundtrips to disk.

Seagate Kinetic drives might be interesting too.

-Rob

-- 
Robert Collins rbtcoll...@hp.com
Distinguished Technologist
HP Converged Cloud


Re: [squid-users] Hypothetically comparing SATA\SAS to NAS\SAN for squid.

2014-01-19 Thread Marcus Kool



On 01/19/2014 04:42 AM, Eliezer Croitoru wrote:

While working here and there I have seen that ZFS is a very robust FS.
I will not compare it to any others because there is no need for that.

OK so ZFS, ext3, ext4 and others are FSes which sit on spinning disks or flash
drives.
The SATA and SAS interfaces are serial interfaces limited (by the standard) to
3 or 6 Gbps.
So SATA maxes out at 6 Gbps while a NIC can have a bandwidth of 10 Gbps.
Are there any real reasons not to use a 10 Gbps line?

For example, if I have 10 SAS or SATA disks, SSD or spinning, in a machine with
128GB of RAM, it is possible to allow some flow of data to be used and be
faster than a single drive, even an SSD.

A dual 10Gbps machine can potentially be faster in lots of aspects than a
local disk.

I do not have the answer, but a NAS might sometimes be the right choice as
cache storage.

Indeed there are overheads for each and every TCP connection, and indeed there
are many aspects that need to be tested and verified, but I still suspect that
there are some assumptions that need to be verified to make sure that a
SAN\NAS might be worth more than it is assumed to be.

Eliezer


The raw transfer speed of a disk is only interesting when an application does
very large sequential I/Os, and Squid does not do that.
Squid writes a lot to disk and reads relatively little, and since the average
object size is often around 13 KB, this is also the average I/O size.
A better performance parameter for disks is I/Os per second (IOPS).
Average latency is also an interesting parameter, but IOPS is usually the
more important one.

The following numbers indicate the speed of disk systems for random 16K I/O:
individual disk: 75-200 IOPS
individual SSD: 1,000-60,000 IOPS
internal RAID disk array with 12 disks and battery-backed cache: 600-2,000 IOPS
high-end SAN or NAS with RAID: 600-20,000+ IOPS
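
As a back-of-the-envelope exercise, these IOPS figures translate into a
ceiling on Squid's request rate. The I/Os-per-object and disk-hit fraction
below are assumptions, not measured values:

```python
def max_requests_per_sec(iops, ios_per_object=2.0, disk_io_fraction=0.5):
    # iops             -- random-I/O capability of the storage
    # ios_per_object   -- assumed I/Os per cached object (data + metadata)
    # disk_io_fraction -- assumed fraction of requests that touch disk at all
    #                     (the rest are served from the memory cache)
    return iops / (ios_per_object * disk_io_fraction)

for name, iops in [("single disk", 150), ("12-disk RAID", 1500), ("SSD", 20000)]:
    print(f"{name}: ~{max_requests_per_sec(iops):.0f} req/s")
```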

Other parameters like Squid's memory cache size are also important, since the
memory cache size determines how many cached objects are held in memory and
hence the percentage of reads vs writes on the disk system.
Also, the choice between the aufs and rock store types produces different
write patterns on the disk system and hence different performance
characteristics.
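
As a sketch, those knobs correspond to squid.conf directives like the
following (the paths and sizes are illustrative, not recommendations):

```
# larger memory cache -> more hits served from RAM, fewer disk reads
cache_mem 4096 MB

# aufs: one file per object
cache_dir aufs /cache/aufs 100000 16 256

# rock: one database-style file, a different on-disk write pattern
#cache_dir rock /cache/rock 100000
```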

For NAS and SAN, performance varies because multiple hosts use the storage
array and configuration parameters matter a lot. On a disk array one can also
make a virtual disk based on 60 or more physical disks, so IOPS can be very
high. And some disk arrays also support SSDs, which results in higher IOPS.

With disk arrays the rule of thumb is that the more money you spend, the
faster they are.

For those Squid caches with very high requirements for disk system performance
there are various implementation options:
- split the cache into multiple caches
- get an (expensive) NAS/SAN
- do not cache on disk, which may be faster if the memory cache is large and
the internet pipe is big enough

Marcus