Thanks a lot, everyone.
I will try all these things and hopefully get a clearer picture.

Regards,
Praveenesh

On Tue, Apr 19, 2011 at 11:38 AM, Mehmet Tepedelenlioglu <
mehmets...@gmail.com> wrote:

> As was suggested, create your own input and put it into HDFS. You can
> create it on your local disk and copy it to HDFS with a simple command.
> Create a list of 1000 random "words", pick from the list randomly a few
> million times, and place the result into HDFS in one file or several
> files whose sizes are 64 MB or more. That should do it. But jobs that
> are not CPU-intensive and whose data fits in RAM will finish faster on
> 1 machine than on 4. The benefit starts when you have more data than
> fits in RAM. MapReduce gives you a tool for gathering values by key and
> processing them in batches, where the set of values for each key can
> hopefully fit in RAM. Usually the point is not to make things faster,
> but to make them possible at all.
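
A minimal sketch of the kind of generator described above, in Python; the
vocabulary size, word length, line length, file name, and target size are
arbitrary assumptions, not from the thread:

  # Build a vocabulary of 1000 random "words" and sample from it until the
  # output file is roughly 200 MB, i.e. several 64 MB HDFS blocks.
  import random
  import string

  VOCAB_SIZE = 1000
  TARGET_BYTES = 200 * 1024 * 1024  # ~200 MB

  vocab = ["".join(random.choice(string.ascii_lowercase) for _ in range(8))
           for _ in range(VOCAB_SIZE)]

  written = 0
  with open("words.txt", "w") as out:
      while written < TARGET_BYTES:
          line = " ".join(random.choice(vocab) for _ in range(12)) + "\n"
          out.write(line)
          written += len(line)

The file can then be copied into HDFS with something like
hadoop fs -put words.txt input/ (the target path is just an example), and
the wordcount job pointed at that directory.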
>
>
> On Apr 18, 2011, at 10:41 PM, praveenesh kumar wrote:
>
> > Thank you guys for clearing my glasses.. now I can see the clear picture :-)
> > So how can I test my cluster? Can anyone suggest a scenario, or a data set
> > or a website where I can get a data set of this size?
> >
> > Thanks,
> > Praveenesh
> >
> > On Tue, Apr 19, 2011 at 11:03 AM, Mehmet Tepedelenlioglu <
> > mehmets...@gmail.com> wrote:
> >
> >> For such a small input, the only way you would see speed gains would be
> >> if your job was dominated by CPU time and not I/O. Since word count is
> >> mostly an I/O problem and your input size is quite small, you are seeing
> >> similar run times. 3 computers are better than 1 only if you need them.
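
A back-of-the-envelope check (not from the thread; assumes Hadoop 0.20.x
defaults): with three plain-text files of roughly 0.7 MB, 1.5 MB, and 1.5 MB,
each file is far below the default 64 MB HDFS block size, so the job gets only
three input splits and at most three map tasks, each finishing in seconds. At
that scale, JVM startup and task scheduling overhead dominate the run time
whether you use one node or four.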
> >>
> >> On Apr 18, 2011, at 10:06 PM, praveenesh kumar wrote:
> >>
> >>> The input was 3 plain text files:
> >>>
> >>> 1 file was around 665 KB and the other 2 files were around 1.5 MB each.
> >>>
> >>> Thanks,
> >>> Praveenesh
> >>>
> >>>
> >>>
> >>> On Tue, Apr 19, 2011 at 10:27 AM, real great.. <greatness.hardn...@gmail.com> wrote:
> >>>
> >>>> What's your input size?
> >>>>
> >>>> On Tue, Apr 19, 2011 at 10:21 AM, praveenesh kumar <praveen...@gmail.com> wrote:
> >>>>
> >>>>> Hello everyone,
> >>>>>
> >>>>> I am new to Hadoop.
> >>>>> I set up a Hadoop cluster of 4 Ubuntu systems (Hadoop 0.20.2),
> >>>>> and I am running the well-known word count (Gutenberg) example to test
> >>>>> how fast my Hadoop setup is working.
> >>>>>
> >>>>> But whenever I run the wordcount example, I am not seeing much
> >>>>> difference in processing time.
> >>>>> On a single node the wordcount takes roughly the same time as on the
> >>>>> cluster of 4 systems.
> >>>>>
> >>>>> Am I doing anything wrong here?
> >>>>> Can anyone explain why this is happening, and how I can make maximum
> >>>>> use of my cluster?
> >>>>>
> >>>>> Thanks.
> >>>>> Praveenesh
> >>>>>
> >>>>
> >>>>
> >>>>
> >>>> --
> >>>> Regards,
> >>>> R.V.
> >>>>
> >>
> >>
>
>
