If you are using just two nodes, you should be aware of the resources that you
allocate to every process (JT, TT, MS, RS, etc.), particularly the memory that
you are using for region hosting, Java ops processes, sorting, map/reduce
tasks, etc.
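
A rough sketch of that bookkeeping, in case it helps. Every number below is a
made-up placeholder (not taken from your cluster or from the CDH defaults), and
the class and method names are purely illustrative. The idea is simply to add up
the maximum heap of every daemon on a node plus the heap of all task slots, and
check that the sum still fits in physical RAM with some room left for the OS:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class MemoryBudget {

    /** Sums the maximum heap sizes (in MB) of the daemons on one node. */
    public static int totalHeapMb(Map<String, Integer> heapsMb) {
        int total = 0;
        for (int mb : heapsMb.values()) {
            total += mb;
        }
        return total;
    }

    /**
     * Memory (in MB) left on the node when every daemon and every
     * map/reduce task slot is using its full heap. A negative result
     * means the node is oversubscribed.
     */
    public static int headroomMb(int physicalMb, int osReserveMb,
                                 Map<String, Integer> daemonHeapsMb,
                                 int taskSlots, int heapPerTaskMb) {
        return physicalMb - osReserveMb
                - totalHeapMb(daemonHeapsMb)
                - taskSlots * heapPerTaskMb;
    }

    public static void main(String[] args) {
        // Hypothetical heap sizes for a node that hosts everything.
        Map<String, Integer> daemons = new LinkedHashMap<>();
        daemons.put("DataNode", 1024);
        daemons.put("TaskTracker", 1024);
        daemons.put("RegionServer", 4096);
        daemons.put("HBase Master", 1024);

        // Hypothetical node: 16 GB RAM, 2 GB kept for the OS,
        // 6 task slots with 1 GB of heap each.
        int headroom = headroomMb(16384, 2048, daemons, 6, 1024);
        System.out.println("Headroom (MB): " + headroom);
        if (headroom < 0) {
            System.out.println("Oversubscribed: reduce slots or heap sizes.");
        }
    }
}
```

If the headroom comes out negative or close to zero, the node may start
swapping or the kernel may kill processes, which would be one possible way for
a region server to disappear without logging anything.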
--
Marcos Ortiz (@marcosluis2186)
http://about.me/marcosortiz
On Tuesday, May 20, 2014 05:18:08 PM Flavio Pompermaier wrote:
> Thanks for the explanation Marcos. For the moment we started this cluster
> with 2 nodes so I had to share almost everything.. :)
> Do I have to be careful with something? Do I have to increase some timeout
> or decrease the caching of the scan maybe?
>
> Best,
> Flavio
>
> On Tue, May 20, 2014 at 4:05 PM, Marcos Ortiz <[email protected]> wrote:
> > Based on your hbase-cmf-hbase1-MASTER.log, the problems come after the
> > region splitting process: when the SplitManager finishes its splitting
> > tasks, the regions on the myserver1 server are put offline, and the
> > Master throws the NotServingRegionException.
> >
> >
> >
> > Then the process continues with myserver2, after the same SplitManager
> > step finishes.
> >
> >
> >
> > ZooKeeper seems to work OK.
> >
> >
> >
> > Do you have the RegionServers sharing the same resources with the
> > TaskTrackers?
> >
> > --
> >
> > Marcos Ortiz <http://www.linkedin.com/in/mlortiz>
> > (@marcosluis2186<http://twitter.com/marcosluis2186> )
> >
> > http://about.me/marcosortiz
> >
> > On Tuesday, May 20, 2014 02:18:50 PM Flavio Pompermaier wrote:
> > > In the attached zip the config files generated by Cloudera. The core-site
> > > and the hdfs-site are slightly different if I download them from mapreduce
> > > or hbase service..and I don't know why..
> > >
> > >
> > >
> > > Attached also the logs of the HBase master, zookeeper (in the range of time
> > > where I experienced region server problems).
> > >
> > > Can you find something useful to solve the issue?
> > >
> > >
> > >
> > > When I set up the scanner I do:
> > >
> > > Scan scan = new Scan();
> > > scan.setCacheBlocks(false);
> > > scan.addColumn(family, qualifier);
> > > scan.setCaching(1000);
> > > scan.setMaxVersions(1);
> > >
> > >
> > >
> > > Best,
> > >
> > > Flavio
> > >
> > >
> > >
> > > On Tue, May 20, 2014 at 12:24 PM, Geovanie Marquez <[email protected]> wrote:
> > > >> It's really not going to be useful to guess without more log
> > > >> investigation. Check the master node logs to see when the first region
> > > >> server went down and correlate zookeeper and region server logs to the
> > > >> minute or two before it died.
> > > >>
> > > >>
> > > >>
> > > >> It could be garbage collection or high scan batches killing your servers
> > > >> occasionally.
> > > >>
> > > >> On May 20, 2014 3:17 AM, "Flavio Pompermaier" <[email protected]> wrote:
> > > >> > Hi to all,
> > > >> >
> > > >> >
> > > >> >
> > > >> > I'm using Cloudera CDH4 (4.5.0) with default parameters and HBase 0.94.6.
> > > >> > I'm experiencing bad behaviour in my mapreduce jobs, where region servers
> > > >> > keep crashing. I checked the logs and the region servers seem to die
> > > >> > without logging anything. This seems to happen the 2nd or 3rd time I
> > > >> > submit a job. Can someone help me figure out what's happening?
> > > >> >