Re: DistributedCache deprecated

2014-01-30 Thread Amit Mittal
Hi Prav, Yes, you are correct that DistributedCache does not load files into memory. Also, using the job configuration and using DistributedCache are two different approaches. I am going by Hadoop: The Definitive Guide, Chapter 8, Side Data Distribution (pages 288-295). As you are saying that now

Re: DistributedCache deprecated

2014-01-30 Thread praveenesh kumar
Hi Amit, Side data distribution is a different concept altogether. It's when you set custom (key, value) pairs through the Job object so that you can use them in your mappers/reducers. It is good when you want to pass some small information to your mappers/reducers like extra

Re: DistributedCache deprecated

2014-01-30 Thread Amit Mittal
Hi Prav, You are correct, thanks for the explanation. As per the link below, I can see that the Job method internally calls DistributedCache itself (

DistributedCache deprecated

2014-01-29 Thread Giordano, Michael
I noticed that in Hadoop 2.2.0 org.apache.hadoop.mapreduce.filecache.DistributedCache has been deprecated. (http://hadoop.apache.org/docs/current/api/deprecated-list.html#class) Is there a class that provides equivalent functionality? My application relies heavily on DistributedCache.

Re: DistributedCache deprecated

2014-01-29 Thread praveenesh kumar
I think you can use the Job class. http://hadoop.apache.org/docs/stable2/api/org/apache/hadoop/mapreduce/Job.html Regards, Prav

Re: DistributedCache deprecated

2014-01-29 Thread praveenesh kumar
@Jay - I don't know how the Job class is replacing the DistributedCache class, but I remember trying distributed cache functions like void addArchiveToClassPath (http://hadoop.apache.org/docs/stable2/api/org/apache/hadoop/mapreduce/Job.html#addArchiveToClassPath%28org.apache.hadoop.fs.Path%29)

Re: DistributedCache deprecated

2014-01-29 Thread praveenesh kumar
@Jay - Plus, if you look at the DistributedCache class, these methods have been added to the Job class. I am guessing they have kept the functionality the same and just merged the DistributedCache class into the Job class itself, giving developers more methods with fewer classes to worry about, thus simplifying

Re: DistributedCache deprecated

2014-01-29 Thread Jay Vyas
Gotcha, this makes sense.

Re: DistributedCache deprecated

2014-01-29 Thread praveenesh kumar

Re: DistributedCache deprecated

2014-01-29 Thread Amit Mittal
Hi Mike, Prav, Although I am new to Hadoop, I would like to add my 2 cents if that helps. There are two ways to distribute shared data: one is the job configuration and the other is DistributedCache. As the job configuration is read by the JT (JobTracker), the TT (TaskTracker) and the child JVMs, and each time the