Thanks a lot, everyone. I will try all these suggestions and hopefully get a clearer picture.
Regards,
Praveenesh

On Tue, Apr 19, 2011 at 11:38 AM, Mehmet Tepedelenlioglu <mehmets...@gmail.com> wrote:

> As was suggested, create your own input and put it into HDFS. You can
> create it on your hard drive and copy it to HDFS with a simple command.
> Create a list of 1000 random "words", pick from the list randomly a few
> million times, and place the result into HDFS in one or several files
> whose sizes are 64 MB or more. That should do it.
>
> But jobs that are not CPU-intensive and whose data fits in RAM will
> finish faster on 1 machine than on 4. The benefit starts when you have
> more data than fits in RAM. MapReduce gives you a tool for gathering
> values by key and processing them in batches, where each set of values
> corresponding to a key can hopefully fit in RAM. Usually the point of
> these applications is not to make things faster, but to make them
> possible at all.
>
> On Apr 18, 2011, at 10:41 PM, praveenesh kumar wrote:
>
>> Thank you guys for clearing my glasses... now I can see the picture
>> clearly :-)
>> So how can I test my cluster? Can anyone suggest a scenario, or point
>> me to a dataset or a website where I can get a dataset of this size?
>>
>> Thanks,
>> Praveenesh
>>
>> On Tue, Apr 19, 2011 at 11:03 AM, Mehmet Tepedelenlioglu <mehmets...@gmail.com> wrote:
>>
>>> For such small input, the only way you would see speed gains is if
>>> your job were dominated by CPU time, not I/O. Since word count is
>>> mostly an I/O problem and your input is quite small, you are seeing
>>> similar run times. Three computers are better than one only if you
>>> need them.
>>>
>>> On Apr 18, 2011, at 10:06 PM, praveenesh kumar wrote:
>>>
>>>> The input was 3 plain text files: one of about 665 KB and the other
>>>> two of about 1.5 MB each.
>>>>
>>>> Thanks,
>>>> Praveenesh
>>>>
>>>> On Tue, Apr 19, 2011 at 10:27 AM, real great.. <greatness.hardn...@gmail.com> wrote:
>>>>
>>>>> What's your input size?
>>>>>
>>>>> On Tue, Apr 19, 2011 at 10:21 AM, praveenesh kumar <praveen...@gmail.com> wrote:
>>>>>
>>>>>> Hello everyone,
>>>>>>
>>>>>> I am new to Hadoop. I set up a Hadoop cluster of 4 Ubuntu systems
>>>>>> (Hadoop 0.20.2), and I am running the well-known word count
>>>>>> (Gutenberg) example to test how fast my Hadoop cluster works.
>>>>>>
>>>>>> But whenever I run the wordcount example, I do not see much
>>>>>> difference in processing time: on a single node the wordcount
>>>>>> takes the same time, and on the cluster of 4 systems it also
>>>>>> takes almost the same time.
>>>>>>
>>>>>> Am I doing anything wrong here? Can anyone explain why this is
>>>>>> happening, and how I can make maximum use of my cluster?
>>>>>>
>>>>>> Thanks,
>>>>>> Praveenesh
>>>>>
>>>>> --
>>>>> Regards,
>>>>> R.V.
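
For reference, here is a minimal sketch of the test-data generation Mehmet describes: build a vocabulary of 1000 random "words", sample from it repeatedly, and write files of at least 64 MB each. The word length, file names, and file count below are illustrative assumptions, not anything Hadoop requires.

    import random
    import string

    VOCAB_SIZE = 1000                 # a list of 1000 random "words"
    WORD_LEN = 8                      # arbitrary word length
    TARGET_BYTES = 64 * 1024 * 1024   # at least one default HDFS block
    NUM_FILES = 4                     # "a file or several files"

    # Build the vocabulary of random lowercase words.
    vocab = [''.join(random.choice(string.ascii_lowercase) for _ in range(WORD_LEN))
             for _ in range(VOCAB_SIZE)]

    # Sample from the vocabulary until each file reaches the target size.
    for i in range(NUM_FILES):
        written = 0
        with open('words_%d.txt' % i, 'w') as f:
            while written < TARGET_BYTES:
                line = ' '.join(random.choice(vocab) for _ in range(10)) + '\n'
                f.write(line)
                written += len(line)

    # Afterwards, copy the files into HDFS and run the bundled example
    # (the input/output paths here are assumptions):
    #   hadoop fs -put words_*.txt input/
    #   hadoop jar hadoop-0.20.2-examples.jar wordcount input output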
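
And a toy, single-process illustration of the gather-values-by-key behaviour Mehmet mentions. This only mimics the shape of a MapReduce job in plain Python; it is not the Hadoop API.

    from collections import defaultdict

    def map_phase(text):
        # A mapper emits one (key, value) pair per word.
        for word in text.split():
            yield (word, 1)

    def shuffle(pairs):
        # The framework's job between map and reduce:
        # gather all values under their key.
        groups = defaultdict(list)
        for key, value in pairs:
            groups[key].append(value)
        return groups

    def reduce_phase(groups):
        # Each reduce call sees one key and all of its values as a batch.
        return {key: sum(values) for key, values in groups.items()}

    print(reduce_phase(shuffle(map_phase("a rose is a rose is a rose"))))
    # {'a': 3, 'rose': 3, 'is': 2}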