Re: task jvm bootstrapping via distributed cache

2012-07-31 Thread Stan Rosenberg
I am guessing this is either a well-known problem or an edge case. In any case, would it be a bad idea to designate predetermined output paths? E.g., DistributedCache.addCacheFileInto(uri, conf, outputPath) would attempt to copy the cached file into the specified path resolving to a task's local f
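
To make the proposal concrete, here is a minimal sketch of what "copy the cached file into a predetermined path" could look like. It uses only java.nio.file and is not Hadoop code: addCacheFileInto is the method suggested in the message, not an existing DistributedCache call, and the class name CacheFilePlacement, the method copyInto, and the taskLocalDir parameter are hypothetical names chosen for illustration. The sketch assumes the cache file has already been localized on the node.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical illustration only; not part of the Hadoop API.
final class CacheFilePlacement {

    /**
     * Copies an already-localized cache file to outputPath, resolved
     * against the task's local directory, and returns the target path.
     * Paths that would land outside the task-local directory are rejected.
     */
    static Path copyInto(Path cachedFile, Path taskLocalDir, String outputPath)
            throws IOException {
        Path base = taskLocalDir.toAbsolutePath().normalize();
        Path target = base.resolve(outputPath).normalize();

        // Guard against absolute paths and "../" escapes.
        if (!target.startsWith(base)) {
            throw new IllegalArgumentException(
                    "outputPath escapes the task-local directory: " + outputPath);
        }

        // Create intermediate directories so the caller can pick any relative path.
        Path parent = target.getParent();
        if (parent != null) {
            Files.createDirectories(parent);
        }
        return Files.copy(cachedFile, target, StandardCopyOption.REPLACE_EXISTING);
    }
}
```

A call such as CacheFilePlacement.copyInto(cached, taskDir, "lib/agent.jar") would place the file at a location known before the task starts, which is the property the message asks for. Whether the real framework could offer this guarantee is the open question of the thread.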

Compare Hadoop and Pig Map\Reduce

2012-07-31 Thread Manoj Babu
Hi, It would be great if any of you could compare Pig and Hadoop MapReduce. When should we go for Hadoop, and when for Pig? I love to program in Java, but people, even my boss, argue that the same can be achieved in Pig with very few lines of code... I am a fresh developer for Hadoop. Could kindly prov

Re: Compare Hadoop and Pig Map\Reduce

2012-07-31 Thread Abhishek Shivkumar
Hi Manoj, Pig is basically a data-flow language used to perform simple high-level operations, such as summarizations and basic analysis, on data residing on HDFS. It uses a language called Pig Latin. It gives your HDFS a data-warehouse kind of perspective, and lets you do a data analysi

Re: Compare Hadoop and Pig Map\Reduce

2012-07-31 Thread Manoj Babu
Thanks Abhishek. Cheers! Manoj. On Tue, Jul 31, 2012 at 10:43 PM, Abhishek Shivkumar < abhisheksgum...@gmail.com> wrote: > Hi Manoj, > >Pig is basically a data-flow language used to perform high-level simple > operations such as summarizations and basic analysis on top of the data > residi