Hello,
I'm still new to AFM, so here is a basic question on how recovery works for a SW cache.

We have an AFM SW cache in recovery mode. Recovery first ran policies on the cache cluster, but now I see a ‘tcpcachescan’ process on cache slowly scanning home via NFS. It is a single host and a single process, with no parallelism as far as I can see, but I may be wrong. This scan of home on a cache AFM gateway takes very long, while further updates on cache queue up.

Home has about 100M files. After 8 hours I see about 70M entries in the file /var/mmfs/afm/…/recovery/homelist, i.e. we get about 2500 lines/s. (We may have very many changes on cache due to some recursive ACL operations, but I'm not sure.) So I expect about 12 hours to pass building the file lists before recovery starts to update home.

I see some risk: during this time new changes pile up on cache. Memory may become an issue? Cache may fill up and we can't evict?

I wonder:

* Is this expected and normal behavior? What can we do about it?
* Will every reboot of a gateway node trigger a recovery of all AFM filesets and a full scan of home? This would make normal rolling updates very impractical. Or is there some better way?

Home is a GPFS cluster, so we could easily produce the needed file list on home with a policy scan in a few minutes.

Thank you, I welcome any clarification, advice or comments.

Kind regards,

Heiner

--
=======================
Heinrich Billich
ETH Zürich
Informatikdienste
Tel.: +41 44 632 72 56
[email protected]
========================
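The rate and the 12-hour estimate above follow from simple arithmetic. A minimal sketch in Python, using only the figures quoted in this mail and assuming the scan rate stays constant for the whole run (the function name estimate_scan is made up for illustration):

```python
# Back-of-envelope estimate of the homelist scan time.
# Inputs are the figures quoted in the mail; a constant scan rate is an assumption.

def estimate_scan(entries_done, hours_elapsed, total_files):
    """Return (lines per second, total hours, remaining hours) at a constant rate."""
    rate = entries_done / (hours_elapsed * 3600)   # observed lines per second
    total_hours = total_files / rate / 3600        # time to list all files at home
    return rate, total_hours, total_hours - hours_elapsed

# 70M entries after 8 hours, about 100M files at home
rate, total_h, remaining_h = estimate_scan(70_000_000, 8, 100_000_000)
print(f"rate: {rate:.0f} lines/s, total: {total_h:.1f} h, remaining: {remaining_h:.1f} h")
```

This comes out at a bit over 2400 lines/s and roughly 11.4 hours in total, which is consistent with the rounded figures of 2500 lines/s and 12 hours.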
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
