Hi Prav,
Yes, you are correct that DistributedCache does not upload the file into
memory. Also, using the job configuration and using DistributedCache are two
different approaches. I am going by Hadoop: The Definitive Guide,
Chapter 8, Side Data Distribution (pages 288-295).
As you are saying that now
Hi Amit,
Side data distribution is a different concept altogether. It's when
you set custom (key, value) pairs on the Job object so that
you can use them in your mappers/reducers. It is good when you want to pass
some small information to your mappers/reducers like extra
Hi Prav,
You are correct, thanks for the explanation. As per the link below, I can see
that Job's methods internally call DistributedCache itself (
I noticed that in Hadoop 2.2.0
org.apache.hadoop.mapreduce.filecache.DistributedCache has been deprecated.
(http://hadoop.apache.org/docs/current/api/deprecated-list.html#class)
Is there a class that provides equivalent functionality? My application relies
heavily on DistributedCache.
I think you can use the Job class.
http://hadoop.apache.org/docs/stable2/api/org/apache/hadoop/mapreduce/Job.html
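A rough sketch of what that migration might look like, assuming the Hadoop 2.x client libraries are on the classpath. The class name `CacheViaJob`, the helper `configure`, the job name, and the paths passed to it are made up for illustration. Only the `Job` methods themselves come from the API page linked above.

```java
import java.io.IOException;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;

public class CacheViaJob {

    // Hypothetical helper: builds a Job and registers side files on it
    // instead of calling the deprecated DistributedCache static methods.
    public static Job configure(Configuration conf, String sideFile, String libArchive)
            throws IOException {
        Job job = Job.getInstance(conf, "cache-via-job");

        // Formerly DistributedCache.addCacheFile(uri, conf)
        job.addCacheFile(URI.create(sideFile));

        // Formerly DistributedCache.addArchiveToClassPath(path, conf)
        job.addArchiveToClassPath(new Path(libArchive));

        return job;
    }
}
```

Inside a mapper or reducer, `context.getCacheFiles()` should then return the URIs registered this way.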
Regards
Prav
On Wed, Jan 29, 2014 at 9:13 PM, Giordano, Michael
michael.giord...@vistronix.com wrote:
I noticed that in Hadoop 2.2.0
@Jay - I don't know how the Job class is replacing the DistributedCache class,
but I remember trying distributed cache functions like
void addArchiveToClassPath
http://hadoop.apache.org/docs/stable2/api/org/apache/hadoop/mapreduce/Job.html#addArchiveToClassPath%28org.apache.hadoop.fs.Path%29
@Jay - Plus, if you look at the DistributedCache class, these methods have been
added to the Job class. I am guessing they have kept the functionality the
same and just merged the DistributedCache class into the Job class itself, giving
developers more methods with fewer classes to worry about, thus simplifying
Gotcha, this makes sense.
On Wed, Jan 29, 2014 at 4:44 PM, praveenesh kumar praveen...@gmail.com wrote:
@Jay - Plus, if you look at the DistributedCache class, these methods have been
added to the Job class. I am guessing they have kept the functionality the
same and just merged the DistributedCache class into
...@gmail.com
*Sent:* Wednesday, January 29, 2014 4:41 PM
*To:* user@hadoop.apache.org
*Subject:* Re: DistributedCache deprecated
@Jay - I don't know how the Job class is replacing the DistributedCache
class, but I remember trying distributed cache functions like
void addArchiveToClassPath
Hi Mike, Prav,
Although I am new to Hadoop, I would like to add my 2 cents if that helps.
There are two ways to distribute shared data: one is the job
configuration and the other is DistributedCache.
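To make the first of these two ways concrete, here is a minimal sketch of passing a small value through the job configuration, assuming the Hadoop client libraries are available. The class `SideDataViaConf`, its two helpers, and the property name `example.side.threshold` are hypothetical and chosen only for illustration.

```java
import org.apache.hadoop.conf.Configuration;

public class SideDataViaConf {

    // Hypothetical property name, not a real Hadoop setting.
    public static final String KEY = "example.side.threshold";

    // Driver side: store a small value in the job configuration.
    public static Configuration withThreshold(Configuration conf, int threshold) {
        conf.setInt(KEY, threshold);
        return conf;
    }

    // Task side: read it back, e.g. from context.getConfiguration()
    // in a mapper's setup(); falls back to 0 when the key is unset.
    public static int readThreshold(Configuration conf) {
        return conf.getInt(KEY, 0);
    }
}
```

This route only suits small values. Larger files are what the cache methods are meant for.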
As the job configuration is read by the JT, TT and child JVMs, and each time the