On 14 Jan 2011, at 23:12, Joe Landman wrote:

> If most of your file access times are dominated by latency (e.g. small,
> seeky loads), and you are going over a gigabit connection, yeah, your
> performance is going to crater on any cluster file system.
>
> Local latency to traverse the storage stack is on the order of 10's of
> microseconds. Physical latency of the disk medium is on the order of 10's
> of microseconds for RAMdisk, 100's of microseconds for flash/SSD, and
> 1000's of microseconds (i.e. milliseconds) for spinning rust.
>
> Now take 1 million small file writes, say 1024 bytes each. These million
> writes have to traverse the storage stack in the kernel to get to disk.
>
> Now add in a network latency event on the order of 1000's of microseconds
> for the remote storage stack and network stack to respond.
>
> I haven't measured it yet in a methodical manner, but I wouldn't be
> surprised to see IOP rates within a factor of 2 of bare metal for a
> sufficiently fast network such as InfiniBand, and within a factor of 4 or 5
> for a slow network like Gigabit.
>
> Our own experience has been that you are generally IOP constrained because
> of the stack you have to traverse. If you add more latency into this stack,
> you have more to traverse and therefore more to wait on, which has a
> magnification effect on times for small I/O ops that are seeky (stat, small
> writes, random ops).
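To put rough numbers on that: if every one of those million 1 KiB writes pays
the full per-op latency, the totals look something like the following. This is
a back-of-envelope sketch only; the latency figures are just the orders of
magnitude quoted above, not measurements.

# Back-of-envelope only: total wall time for 1,000,000 synchronous
# 1 KiB writes if every write pays the full per-op latency.
# The latency figures are just the orders of magnitude quoted above,
# not measurements.
ops = 1_000_000
per_op_latency = {
    "local stack, RAMdisk":           20e-6,   # tens of microseconds
    "local stack, flash/SSD":         200e-6,  # hundreds of microseconds
    "local stack, spinning rust":     2e-3,    # milliseconds
    "spinning rust + GbE round trip": 3e-3,    # add ~1 ms of network latency
}
for name, lat in per_op_latency.items():
    print(f"{name:32s} ~{ops * lat / 60:6.1f} minutes")

Call it roughly 20 seconds on a fast local stack versus the better part of an
hour once each op also eats a gigabit round trip; that's the magnification
effect in a nutshell.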
Sure, and all of that applies equally to both NFS and Gluster, yet in Max's
example NFS was ~50x faster than Gluster for an identical small-file workload.
So what is Gluster doing over and above what NFS is doing that takes so long,
given that the network and disk factors are the same? I'd buy a factor of 2
for replication, but not 50.

In case you missed what I'm on about, these are the stats that Max posted:

> Here are the results per command:
>
> dd if=/dev/zero of=M/tmp bs=1M count=16384
>   69.2 MB/sec (Native)    69.2 MB/sec (FUSE)    52 MB/sec (NFS)
> dd if=/dev/zero of=M/tmp bs=1K count=163840000
>   88.1 MB/sec (Native)    1.1 MB/sec (FUSE)     52.4 MB/sec (NFS)
> time tar cf - M | pv > /dev/null
>   15.8 MB/sec (Native)    3.48 MB/sec (FUSE)    254 KB/sec (NFS)

In my case I'm running 30k IOPS SSDs over gigabit. At the moment my problem
(running 3.0.6) isn't performance but reliability: files are occasionally
reported as 'vanished' by front-end apps such as rsync, even though they are
present on both backing stores; there are no errors in the Gluster logs, and
self-heal doesn't fix it.

Marcus
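P.S. For anyone who wants to reproduce the small-file side of Max's numbers
without dd, something along these lines is what I have in mind; it isolates
per-file latency rather than streaming throughput. This is only a sketch: the
mount path, file count and file size are placeholders, not anything I've
standardised on.

#!/usr/bin/env python3
# Small-file latency probe (sketch): create N small files with an fsync
# after each one, so every file pays the full create + write + sync
# round trip instead of being absorbed by write-back caching.
import os, sys, time

target = sys.argv[1]              # e.g. /mnt/glusterfs/benchdir (placeholder)
n, size = 10000, 1024             # arbitrary: 10k files of 1 KiB each
payload = b"\0" * size

os.makedirs(target, exist_ok=True)
start = time.time()
for i in range(n):
    with open(os.path.join(target, f"f{i:06d}"), "wb") as f:
        f.write(payload)
        f.flush()
        os.fsync(f.fileno())
elapsed = time.time() - start
print(f"{n} x {size} B files in {elapsed:.1f}s -> {n / elapsed:.0f} files/sec")

Run the same script against the native brick, the FUSE mount and the NFS mount
and compare files/sec; the fsync is there so the client can't hide the per-file
round trips behind caching.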
