On Dec 10, 2008, at 8:52 PM, Anthony Urso wrote:

I have been having problems with changes to DistributedCache files on
HDFS not being reflected on subsequently run jobs.  I can change the
filename to work around this, but I would prefer a way to invalidate
the cache when necessary.


Which version of Hadoop are you using?

The DistributedCache checks the modification time of the file and invalidates the cached copies when it changes, so if you change the file on HDFS it should automatically refresh for your subsequent jobs. If this isn't the behaviour you are seeing, please file a JIRA. We'd appreciate a test case too, if possible!
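
To illustrate the general idea of invalidating on modification time, here is a minimal sketch in plain Java. It is not Hadoop's DistributedCache code and does not touch HDFS: the class name MtimeCache and its methods are hypothetical, and it works on local files via java.nio only. It remembers the modification time seen when a file was cached and reloads the file whenever that timestamp differs.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; NOT the Hadoop implementation.
final class MtimeCache {
    // One cached copy plus the modification time observed when it was loaded.
    private static final class Entry {
        final FileTime mtime;
        final String contents;

        Entry(FileTime mtime, String contents) {
            this.mtime = mtime;
            this.contents = contents;
        }
    }

    private final Map<Path, Entry> entries = new HashMap<>();
    private int loads = 0;

    // Returns the cached contents, reloading if the source's mtime has changed.
    String get(Path source) throws IOException {
        FileTime current = Files.getLastModifiedTime(source);
        Entry cached = entries.get(source);
        if (cached == null || !cached.mtime.equals(current)) {
            // Cache miss or stale copy: read the file again and record its mtime.
            String contents =
                new String(Files.readAllBytes(source), StandardCharsets.UTF_8);
            cached = new Entry(current, contents);
            entries.put(source, cached);
            loads++;
        }
        return cached.contents;
    }

    // Number of times a file was actually (re)read from disk.
    int loadCount() {
        return loads;
    }
}
```

With a scheme like this, a change that leaves the modification time untouched would not be noticed, which is one thing worth checking when a stale copy shows up.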

thanks,
Arun
