Suppose I have a large archive in HDFS, say 4 GB containing 500 files. I
want to make it available to my containers as a YARN LocalResource. The
archive doesn't change very often (maybe once per month). Will YARN optimize
for this? Does the expanded per-node cache persist across application runs
(using something like the modification time to know whether re-expansion is
needed)?

If the archive is instead re-downloaded and re-expanded on each node every
time the app is launched, should I raise the archive's replication factor to
reduce cross-rack bandwidth?

Thanks
John
