>My xfs inode size is 256 Bytes.
> When I increase the memory and change vm.vfs_cache_pressure to 1, is it
> possible to store the inode tree in memory?
> Maybe the random disk seeks are the problem.
>
> Here is an iostat snapshot:
>
> Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz 
> avgqu-sz   await r_await w_await  svctm  %util
>sde               0.00     0.00  143.80    2.60  1028.80   123.90    15.75     
>1.15    7.84    7.97    0.62   6.49  95.04
>sdh               0.00     0.00   94.80    3.80   681.60   156.70    17.00     
>1.27   12.51   13.00    0.21   9.89  97.52


Well, at least 2 disks are at pretty much 100% util; my guess would be that the
replicator / rsync (and maybe the auditor) is currently busy with those disks.
Setting vfs_cache_pressure to 1 does not guarantee anything yet, it just makes
the kernel less likely to evict the cached data.
The best check to see if it all fits would be setting the cache pressure to 0
and doing a find on all the disks.
If the OOM killer does not appear you're fine; if it does, though, you will have a
flaky server ;)
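A sketch of that check (assuming the usual /srv/node mount point for Swift disks; the sysctl needs root, so it is only shown in a comment here):

```shell
# Tell the kernel to never reclaim dentry/inode cache (needs root):
#   sysctl -w vm.vfs_cache_pressure=0
# Then walk every object disk so the whole inode tree is pulled into cache.
# /srv/node is the usual Swift mount point; adjust to your setup.
for disk in /srv/node/*; do
  find "$disk" > /dev/null 2>&1
done
# Afterwards watch `free` / `slabtop` and check dmesg for OOM-killer messages.
msg="inode walk finished"
echo "$msg"
```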

However, we can also do an estimate without potentially crashing a server:
the amount of memory you need to keep it all in memory would be the number of
used inodes on the disks * inode size.
df -i should give you the number of inodes in use.
I'm not sure if the xattr info now also fits inside those 256-byte inodes, so it
might be a bit more.
(With older kernels you needed 1K inodes, so I wonder where that data goes now.)

My guess based on the info you gave:
360 million files * replica 3 / 7 nodes = ~154 million inodes per server
256 Bytes * 154 million = ~40GB

Note that this is just for the files and does not include the inodes for the 
directories (which will also be a ridiculous amount).
So it probably does not fit in 32GB, but maybe just in 64GB.
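The arithmetic above as a quick shell check (the counts come from this thread; pull your real numbers from the IUsed column of `df -i` and the isize= value of `xfs_info`):

```shell
files=360000000      # total objects in the cluster (from the thread)
replicas=3
nodes=7
inode_size=256       # bytes; check `xfs_info <mountpoint>` for isize=
per_server=$((files * replicas / nodes))
mem_gb=$((per_server * inode_size / 1000000000))
echo "${per_server} inodes/server, ~${mem_gb} GB for the file inodes alone"
```

Remember this still excludes the directory inodes.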

So you should probably see quite a lot of cache misses currently in the XFS 
stats (xs_dir_lookup, xs_ig_missed).
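Those counters can be read straight from /proc (a sketch; the "dir" line starts with xs_dir_lookup and the "ig" inode-get line contains xs_ig_missed, per the kernel's XFS stats layout):

```shell
# /proc/fs/xfs/stat holds the cumulative XFS counters since boot.
statfile=/proc/fs/xfs/stat
if [ -r "$statfile" ]; then
  # Print the directory-lookup and inode-cache lines.
  grep -E '^(dir|ig) ' "$statfile"
  status="read XFS stats"
else
  status="no XFS stats on this kernel"
fi
echo "$status"
```

Sample the counters twice with a sleep in between to get a miss rate rather than a lifetime total.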

Cheers,
Robert van Leeuwen



-----Original Message-----
From: Robert van Leeuwen [mailto:[email protected]]
Sent: Tuesday, 10 February 2015 12:23
To: Klaus Schürmann; '[email protected]'
Subject: RE: [Openstack] [SWIFT] Bad replication performance after adding new 
drives

> I set the vfs_cache_pressure to 10 and moved container- and account-server to 
> SSD harddrives.
> The normal performance for object writes and reads are quite ok.

> But why does moving some partitions to only two new hard disks take so much time?
> Will it be faster if I add more memory?

My guess: probably the source disks/servers are slow.
When the inode tree is not in memory it will do a lot of random reads to the 
disks (for both the inode tree and the actual file).
An rsync of any directory will become slow on the source side (IIRC you can 
see this in the replicator log). You should be able to see in e.g. atop whether 
the source or destination disks are the limiting factor.

If the source is the issue it might help to increase the maximum number of 
simultaneous rsync processes, so you have more parallel slow processes ;) Note 
that this can have an impact on the general speed of the Swift cluster.
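For reference, the knobs involved would be something like this (module name and values are illustrative, tune them for your cluster):

```ini
# /etc/rsyncd.conf -- raise the per-module connection cap on the receiving side
[object]
max connections = 8

# /etc/swift/object-server.conf -- more replicator workers per node
[object-replicator]
concurrency = 4
```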
More memory will probably help a bit.

Cheers,
Robert

_______________________________________________
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to     : [email protected]
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
