From the book "Hadoop: The Definitive Guide", p. 242:
>>
When you launch a job, Hadoop copies the files specified by the -files and
-archives options to the jobtracker’s filesystem (normally HDFS). Then,
before a task is run, the tasktracker copies the files from the jobtracker’s
filesystem to a local disk—the cache—so the task can access the files.
>>

I wonder why Hadoop copies the files to the jobtracker's filesystem.
If they are already in HDFS, they should already be available to the tasks.
Is there a reason for this extra copy?
