Hello,

I am still new to AFM, so here is a basic question on how recovery works for an
SW (single-writer) cache:

We have an AFM SW cache in recovery mode. Recovery first ran policies on the
cache cluster, but now I see a ‘tcpcachescan’ process on the cache slowly
scanning home via NFS. It is a single host and a single process, with no
parallelism as far as I can see, but I may be wrong. This scan of home from a
cache AFM gateway node takes very long, while further updates on the cache
queue up. Home has about 100M files. After 8 hours I see about 70M entries in
the file /var/mmfs/afm/…/recovery/homelist, i.e. we get about 2500 lines/s.
(We may have very many changes on the cache due to some recursive ACL
operations, but I’m not sure.)

So I expect about 12 hours to pass while the file lists are built, before
recovery starts to update home. I see some risks: during this time new changes
pile up on the cache. Could memory become an issue? Could the cache fill up
while we can’t evict?
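
A rough check of that estimate, as a minimal sketch: it assumes the scan rate
stays constant and that homelist ends up with roughly one line per file at
home. Neither assumption is confirmed, and the function name is made up for
illustration.

```python
def scan_estimate(entries_done, hours_elapsed, total_files):
    """Extrapolate scan duration from the progress observed so far.

    Assumes a constant scan rate and one homelist line per file at home;
    both are assumptions, not documented AFM behavior.
    """
    rate = entries_done / (hours_elapsed * 3600)          # lines per second
    total_hours = total_files / rate / 3600               # full scan
    remaining_hours = (total_files - entries_done) / rate / 3600
    return rate, total_hours, remaining_hours


# Figures from this post: 70M entries after 8 hours, about 100M files at home.
rate, total_h, remaining_h = scan_estimate(70_000_000, 8, 100_000_000)
print(f"{rate:.0f} lines/s, {total_h:.1f} h total, {remaining_h:.1f} h remaining")
# prints: 2431 lines/s, 11.4 h total, 3.4 h remaining
```

So under these assumptions the scan alone needs roughly 11 to 12 hours in
total, which matches the estimate above.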

I wonder:

  *   Is this expected and normal behavior? What can be done about it?
  *   Will every reboot of a gateway node trigger a recovery of all AFM
filesets and a full scan of home? This would make normal rolling updates very
impractical. Or is there some better way?

Home is a GPFS cluster, so we could easily produce the needed file list on
home with a policy scan in a few minutes.

Thank you. I welcome any clarification, advice, or comments.

Kind regards,

Heiner

--
=======================
Heinrich Billich
ETH Zürich
Informatikdienste
Tel.: +41 44 632 72 56
[email protected]
========================