Hi Anil,

Yes, I'm sure the cluster is running in distributed mode (I see 21 parallel
map tasks in the JobTracker and processes on each node). Max map tasks is
set to 7 per node.

I run my job with the same cluster configuration on two tables:
1. A table located on only 1 node (as shown on the HBase master page) - 10M
records
2. A table evenly distributed across 3 nodes (also verified on the HBase
master page) - 150M records.


Thanks.

On Sat, Mar 31, 2012 at 12:57 AM, anil gupta <anilg...@buffalo.edu> wrote:

> Hi Alexander,
>
> If you can provide more details about what you are doing, that would be
> helpful. Are you sure that your cluster is running in distributed mode? Did
> you run the job with 1 node in the cluster and then add 2 additional nodes
> to the same cluster?
>
> Thanks,
> Anil
>
> 2012/3/30 Alexander Goryunov <a.goryu...@gmail.com>
>
> > Hi Anil,
> >
> > Yes, the second table is distributed and the first is not, and I get 3x
> > better results for the non-distributed table.
> >
> > I use distributed Hadoop mode in all cases.
> >
> > Thanks.
> >
> >
> >
> > On Fri, Mar 30, 2012 at 3:26 AM, anil gupta <anilg...@buffalo.edu>
> wrote:
> >
> > > Hi Alexander,
> > >
> > > Is the data properly distributed over the cluster in distributed mode?
> > > If it is not, you won't get good results in distributed mode.
> > >
> > > Thanks,
> > > Anil Gupta
> > >
> > > On Thu, Mar 29, 2012 at 8:37 AM, Alexander Goryunov <
> > a.goryu...@gmail.com
> > > >wrote:
> > >
> > > > Hello,
> > > >
> > > > I'm running a 3-data-node cluster (8-core Xeon, 16 GB) + 1 node for
> > the
> > > > JobTracker and NameNode, with Hadoop and HBase, and I'm seeing
> > strange
> > > > performance results.
> > > >
> > > > The same map job runs at about 300,000 records per second for the
> > 1-node
> > > > table and 100,000 records per second for the table distributed
> > across 3
> > > > nodes.
> > > >
> > > > Scan caching is 1000, each row is about 0.2 KB, compression is off,
> > > > and setCacheBlocks is false.
> > > >
> > > > 7 map tasks run in parallel on each node (281 map tasks in total
> > for the
> > > > big table and 16 for the small one).
> > > >
> > > > The map job reads some sequential data and writes out a small part
> > > of
> > > > it. No reduce tasks are set for this job.
> > > >
> > > >
> > > > Both tables contain the same kind of data; the first has about 10M
> > > > records and the second about 150M records.
> > > >
> > > > Do you have any idea what could be the reason for this behavior?
> > > >
> > > > Thanks.
> > > >
> > >
> > >
> > >
> > > --
> > > Thanks & Regards,
> > > Anil Gupta
> > >
> >
>
>
>
> --
> Thanks & Regards,
> Anil Gupta
>