Hi John,

If the resources are located in HDFS and you specify each resource by its HDFS URI, then the answer is yes. The node managers cache the resources and automatically refresh a cached copy when the modification time of the HDFS file changes.
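
To make the caching idea concrete, here is a toy model in plain Java. It is only a sketch of the behaviour described above, not YARN's actual code: the class `LocalCacheModel`, its methods, and the `/local/cache/...` paths are all invented for illustration, and the assumption is that a cached copy is identified by the resource URI together with its modification time.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model (hypothetical, not YARN code) of a per-node resource cache. */
public class LocalCacheModel {
    // Maps "uri@modificationTime" to the local path of the expanded copy.
    private final Map<String, String> cache = new HashMap<>();
    private int downloads = 0;

    /** Returns the local path for a resource, fetching it only on a cache miss. */
    public String localize(String hdfsUri, long modificationTime) {
        // Assumption: a changed modification time produces a new key,
        // so the old cached copy is simply not reused.
        String key = hdfsUri + "@" + modificationTime;
        return cache.computeIfAbsent(key, k -> {
            downloads++; // stands in for the real download and unpack step
            return "/local/cache/" + downloads;
        });
    }

    /** Number of times a resource had to be fetched from HDFS. */
    public int getDownloads() {
        return downloads;
    }

    public static void main(String[] args) {
        LocalCacheModel node = new LocalCacheModel();
        String uri = "hdfs:///apps/archive.zip"; // hypothetical path

        node.localize(uri, 100L); // first run: fetched
        node.localize(uri, 100L); // second run, same file: served from cache
        node.localize(uri, 200L); // archive was replaced: fetched again

        System.out.println("downloads = " + node.getDownloads());
    }
}
```

In this model an archive that changes once a month is fetched once per node per month, no matter how many times the application is launched in between.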

I recommend increasing the replication factor of the resources. If the resources are uploaded from the client machine, the MapReduce framework automatically sets their replication factor to 10.

On Fri, Jun 7, 2013 at 4:10 AM, John Lilley <[email protected]> wrote:

> Suppose that I have a large archive in HDFS, say, containing 500 files
> and 4GB. I want to make this available via YARN LocalResource. The
> archive doesn’t change very often (maybe once per month). Will YARN
> optimize for this? Does the expanded per-node cache persist across
> application runs (using something like modification time to know if
> re-expansion is needed)?
>
> If the archive is re-expanded on each node every time the app is launched,
> should I set the replication factor higher to reduce rack bandwidth?
>
> Thanks
> John

--
Regards,
Ted Xu
