Re: Region servers crashing during mapreduce

2014-05-20 Thread Marcos Ortiz
If you are using just two nodes, you should be aware of the resources that you allocate to every process (JT, TT, MS, RS, etc.), particularly the memory that you are using for region hosting, Java ops processes, sorting, map/reduce tasks, etc. -- Marcos Ortiz[1] (@marcosluis2186[2]) http://about

Re: Region servers crashing during mapreduce

2014-05-20 Thread Flavio Pompermaier
Thanks for the explanation Marcos. For the moment we started this cluster with 2 nodes so I had to share almost everything.. :) Do I have to be careful with something? Do I have to increase some timeout or decrease the caching of the scan maybe? Best, Flavio On Tue, May 20, 2014 at 4:05 PM, Marc

Re: Region servers crashing during mapreduce

2014-05-20 Thread Marcos Ortiz
Based on your hbase-cmf-hbase1-MASTER.log, the problems come after the region splitting process. In particular, when the SplitManager finishes its splitting tasks, the regions on the myserver1 server are put offline, and the Master throws the NotServingRegionException. Then the process continues

Re: Region servers crashing during mapreduce

2014-05-20 Thread Marcos Ortiz
On 20/05/14 03:16, Flavio Pompermaier wrote: Hi to all, I'm using Cloudera CDH4 (4.5.0) with default parameters and HBase 0.94.6. I'm experiencing a bad behaviour of my mapreduce jobs, where region servers keep crashing. I checked the logs and the region servers seem to die without logging any

Re: Region servers crashing during mapreduce

2014-05-20 Thread Geovanie Marquez
It's really not going to be useful to guess without more log investigation. Check the master node logs to see when the first region server went down and correlate zookeeper and region server logs to the minute or two before it died. It could be garbage collection or high scan batches killing your s
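A minimal sketch of the log correlation suggested above, assuming log4j-style lines that begin with a timestamp such as "2010-07-12 15:10:03,299". The function names and the two-minute default window are illustrative choices, not part of HBase or any of its tools.

```python
from datetime import datetime, timedelta

# Assumed timestamp prefix, e.g. "2010-07-12 15:10:03,299" (23 characters).
TS_FORMAT = "%Y-%m-%d %H:%M:%S,%f"
TS_LENGTH = 23


def parse_timestamp(line):
    """Return the leading timestamp of a log line, or None if it has none."""
    try:
        return datetime.strptime(line[:TS_LENGTH], TS_FORMAT)
    except ValueError:
        # Stack-trace lines and other continuation lines carry no timestamp.
        return None


def lines_before(lines, died_at, window=timedelta(minutes=2)):
    """Keep the lines stamped within `window` before `died_at` (inclusive)."""
    selected = []
    for line in lines:
        ts = parse_timestamp(line)
        if ts is not None and died_at - window <= ts <= died_at:
            selected.append(line)
    return selected
```

Running the same filter over the master, zookeeper and region server logs with the same `died_at` value gives three short excerpts that can be read side by side.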

Region servers crashing during mapreduce

2014-05-20 Thread Flavio Pompermaier
Hi to all, I'm using Cloudera CDH4 (4.5.0) with default parameters and HBase 0.94.6. I'm experiencing a bad behaviour of my mapreduce jobs, where region servers keep crashing. I checked the logs and the region servers seem to die without logging anything. This seems to happen at the 2nd or 3rd ti

Re: : Region Servers crashing following: "File does not exist", "Too many open files" exceptions

2013-02-11 Thread David Koch
Hello, Thank you for your replies. In the end we dropped the concerned tables and are in the process of re-importing data. Looking through the mailing list it seems like this issue [1] may be identical to what we are experiencing. TLDR: Region splits fail when there is a lack of disk space, leavin

Re: : Region Servers crashing following: "File does not exist", "Too many open files" exceptions

2013-02-11 Thread ramkrishna vasudevan
From the UI can you figure out how many store files are present? Also if you can check the logs it will tell you if the compactions were happening. I may be wrong without checking your cluster, just some inputs that we have faced sometime back. Regards Ram On Mon, Feb 11, 2013 at 8:54 PM, David

Re: : Region Servers crashing following: "File does not exist", "Too many open files" exceptions

2013-02-11 Thread David Koch
Hello, No, we did not change anything, so compactions should run automatically - I guess it's once a day - however, I don't know to what extent jobs running on the cluster have impeded compactions - if this is even a possibility. /David On Mon, Feb 11, 2013 at 4:58 AM, ramkrishna vasudevan <

Re: : Region Servers crashing following: "File does not exist", "Too many open files" exceptions

2013-02-10 Thread ramkrishna vasudevan
Hi David, Have you changed anything in the configuration related to compactions? If more store files are created and the compactions are not run frequently, we end up with this problem. At least there will be a consistent increase in the file handler count. Could you run compactions manua

Re: : Region Servers crashing following: "File does not exist", "Too many open files" exceptions

2013-02-10 Thread David Koch
Like I said, the maximum permissible number of filehandlers is set to 65535 for users hbase (the one who starts HBase), mapred and hdfs. The "too many files" warning occurs on the region servers but not on the HDFS namenode. /David On Sun, Feb 10, 2013 at 3:53 PM, shashwat shriparv < dwivedishash.
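A small sketch for checking the limit a process actually runs with, since the effective open-file limit can differ from the configured one. It assumes a Unix host (the Python `resource` module is Unix-only) and that it is run as the user that starts the service; the helper names and the 65535 threshold in the usage note are illustrative, with the number taken from the message above.

```python
import resource


def current_nofile_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def limit_is_sufficient(soft, required):
    """True if the soft limit is unlimited or at least `required`."""
    if soft == resource.RLIM_INFINITY:
        return True
    return soft >= required
```

For example, `limit_is_sufficient(current_nofile_limits()[0], 65535)` reports whether the calling process would be allowed the 65535 handles mentioned above. For a daemon that is already running on Linux, the limits in effect can also be read from `/proc/<pid>/limits`.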

Re: : Region Servers crashing following: "File does not exist", "Too many open files" exceptions

2013-02-10 Thread shashwat shriparv
On Sun, Feb 10, 2013 at 6:21 PM, David Koch wrote: > problems but could not find any. In the OS settings, increase the ulimit for the user under which you are starting the hadoop and hbase services. ∞ Shashwat Shriparv

Re: : Region Servers crashing following: "File does not exist", "Too many open files" exceptions

2013-02-10 Thread David Koch
ld be appreciated, > > Thanks, > > /David > > On Sun, Feb 10, 2013 at 2:24 AM, Dhaval Shah > wrote: > > > It seems like you need to increase the limit on the number of xceivers on > the hdfs config looking at your error messages. > > > -----

Re: : Region Servers crashing following: "File does not exist", "Too many open files" exceptions

2013-02-09 Thread Marcos Ortiz
--- On Sun 10 Feb, 2013 6:37 AM IST David Koch wrote: Hello, Lately, we have been having issues with Region Servers crashing in our cluster. This happens while running Map/Reduce jobs over HBase tables in particular but also spontaneously when the cluster is seemingl

Re: : Region Servers crashing following: "File does not exist", "Too many open files" exceptions

2013-02-09 Thread David Koch
at 2:24 AM, Dhaval Shah wrote: > > It seems like you need to increase the limit on the number of xceivers on > the hdfs config looking at your error messages. > > > -- > On Sun 10 Feb, 2013 6:37 AM IST David Koch wrote: > > >Hello, > >

Region Servers crashing following: "File does not exist", "Too many open files" exceptions

2013-02-09 Thread David Koch
Hello, Lately, we have been having issues with Region Servers crashing in our cluster. This happens while running Map/Reduce jobs over HBase tables in particular but also spontaneously when the cluster is seemingly idle. Restarting the Region Servers or even HBase entirely as well as HDFS

Re: Region Servers Crashing during Random Reads

2011-02-04 Thread Ryan Rawson
Under our load at su, the new gen would grow to max size and take 800+ ms. I would consider setting the ms goal to 20-40ms (what we get in prod now). At 1 GB ParNew I would expect large pauses. Plus in my previous tests the promotion was like 75% even with a huge ParNew. This is all based on my b

Re: Region Servers Crashing during Random Reads

2011-02-04 Thread Stack
On Fri, Feb 4, 2011 at 12:20 AM, Lars George wrote: > I saw the -XX:MaxGCPauseMillis option too and assumed it is not that > effective as it was never suggested so far. So it was simply not tried > yet and someone has to be the guinea pig? > Yeah, haven't had good experience with these upper-boun

Re: Region Servers Crashing during Random Reads

2011-02-04 Thread Lars George
increasing the RAM? >> >>> > > >> >> >>> > > >> I am adding some more info about the app. >> >>> > > >> >> >>> > > >>> We are storing web page data in HBase. >> >>> > > >

Re: Region Servers Crashing during Random Reads

2011-02-04 Thread Todd Lipcon
nt > plan > >>> to > >>> > > do > >>> > > >> scan's.. > >>> > > >>> We have LZOCompression Set on this column family. > >>> > > >>> We were noticing 1500 Reads, when reading the page conte

Re: Region Servers Crashing during Random Reads

2011-02-03 Thread Lars George
olumn family. >>> > > >>> We were noticing 1500 Reads, when reading the page content. >>> > > >>> We have a column family, which stores just metadata of the page >>> > "title" >>> > > >> etc... When reading this the perf

Re: Region Servers Crashing during Random Reads

2011-02-03 Thread Stack
performance is whopping 12000 TPS. >> > > >> >> > > >> We thought the issue could be because of N/w bandwidth used between >> > HBase >> > > >> and Clients. So we disabled LZO Compression on Column Family and >> > started >> >

Re: Region Servers Crashing during Random Reads

2011-02-03 Thread charan kumar
> doing the compression of the raw page on the client and decompress > it > > > when > > > >> reading (LZO). > > > >> > > > >>> With this my write performance jumped up from 2000 to 5000 at peak. > > > >>> With this approach, th

Re: Region Servers Crashing during Random Reads

2011-02-03 Thread Todd Lipcon
the raw page on the client and decompress it > > when > > >> reading (LZO). > > >> > > >>> With this my write performance jumped up from 2000 to 5000 at peak. > > >>> With this approach, the servers are crashing... Not sure why only > >

Re: Region Servers Crashing during Random Reads

2011-02-03 Thread charan kumar
With this my write performance jumped up from 2000 to 5000 at peak. > >>> With this approach, the servers are crashing... Not sure why only > >> after > >> turning off LZO... and doing the same from client. > >> > >> > >> > >>

Re: Region Servers Crashing during Random Reads

2011-02-03 Thread Charan K
Not sure why only >> after >> turning off LZO... and doing the same from client. >> >> >> >> On Thu, Feb 3, 2011 at 12:13 PM, Jonathan Gray wrote: >> >>> How much heap are you running on your RegionServers? >>> >>> 6GB of

Re: Region Servers Crashing during Random Reads

2011-02-03 Thread Todd Lipcon
M is on the low end. For high throughput applications, I > > would recommend at least 6-8GB of heap (so 8+ GB of RAM). > > > > > -Original Message- > > > From: charan kumar [mailto:charan.ku...@gmail.com] > > > Sent: Thursday, February 03, 2011 11

Re: Region Servers Crashing during Random Reads

2011-02-03 Thread charan kumar
-Original Message- > > From: charan kumar [mailto:charan.ku...@gmail.com] > > Sent: Thursday, February 03, 2011 11:47 AM > > To: user@hbase.apache.org > > Subject: Region Servers Crashing during Random Reads > > > > Hello, > > > > I am using hbase 0.9

RE: Region Servers Crashing during Random Reads

2011-02-03 Thread Jonathan Gray
bruary 03, 2011 11:47 AM > To: user@hbase.apache.org > Subject: Region Servers Crashing during Random Reads > > Hello, > > I am using hbase 0.90.0 with hadoop-append. h/w ( Dell 1950, 2 CPU, 6 GB > RAM) > > I had 9 Region Servers crash (out of 30) in a span of 30 m

Region Servers Crashing during Random Reads

2011-02-03 Thread charan kumar
Hello, I am using hbase 0.90.0 with hadoop-append. h/w ( Dell 1950, 2 CPU, 6 GB RAM) I had 9 Region Servers crash (out of 30) in a span of 30 minutes during heavy reads. It looks like a GC, ZooKeeper Connection Timeout thingy to me. I applied all the recommended configuration from the HBase wiki... An

Re: Region Servers Crashing - LeaseExpired

2011-01-03 Thread Jean-Daniel Cryans
First, never swap with Java. Disable it on your machines. Second, to get to the bottom of this issue you need to go to where the logs first start showing exceptions. In your case it seems we only see indirect symptoms of a forceful failover by the HBase master. Somewhere before that there should be a
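One way to confirm that a machine is not swapping is to compare SwapTotal and SwapFree. The sketch below assumes a Linux host, where those two fields appear in `/proc/meminfo` with values in kB; the function name is illustrative.

```python
def swap_in_use_kb(meminfo_text):
    """Return SwapTotal minus SwapFree, in kB, from /proc/meminfo-style text.

    Raises KeyError if either field is missing from the input.
    """
    values = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if key in ("SwapTotal", "SwapFree") and fields:
            values[key] = int(fields[0])
    return values["SwapTotal"] - values["SwapFree"]
```

On a live machine the input would come from `open("/proc/meminfo").read()`. A result of 0 means either that no swap is configured or that none of it is currently in use.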

Re: Region Servers Crashing - LeaseExpired

2011-01-01 Thread Martin Arnandze
Hi All, I was just wondering if anyone had a chance to look at this issue, or experienced something similar. I'm stuck with this and would appreciate any hint. Many thanks, Martin On Dec 27, 2010, at 11:27 AM, Martin Arnandze wrote: > Hi, > I'm using hbase 0.20.5 and hadoop 0.20.1. Some

Region Servers Crashing - LeaseExpired

2010-12-27 Thread Martin Arnandze
Hi, I'm using hbase 0.20.5 and hadoop 0.20.1. Some region servers are crashing, saying that a file cannot be found, and that a lease has expired (log detail below). I searched this mailing list for the exact problem but was not successful. These are the symptoms: - Typically I see high

Re: region servers crashing

2010-08-26 Thread Stack
>>> Thanks all - we'll look into GC tuning some more. >> >>> >> >>> On Wed, Jul 14, 2010 at 3:47 PM, Jonathan Gray >> wrote: >> >>>> >> >>>> This doesn't look like a clock skew issue. >> >>>

Re: region servers crashing

2010-08-26 Thread Dmitry Chechik
> >>>> > >>>> @Dmitry, while you should be running CMS, this is still a garbage > >>>> collector and is still vulnerable to GC pauses. There are additional > >>>> configuration parameters to tune even more. > >>>> > >>

Re: region servers crashing

2010-08-26 Thread Ryan Rawson
ill vulnerable to GC pauses.  There are additional >>>> configuration parameters to tune even more. >>>> >>>> How much heap are you running with on your RSs?  If you are hitting your >>>> servers with lots of load you should run with 4GB or more

Re: region servers crashing

2010-08-26 Thread Ryan Rawson
h 4GB or more. >>> >>> Also, having ZK on the same servers as RS/DN is going to create problems >>> if you're already hitting your IO limits. >>> >>> JG >>> >>> > -Original Message- >>> > From: Arun Ramakrishn

Re: region servers crashing

2010-07-14 Thread Dmitry Chechik
rvers as RS/DN is going to create problems if > you're already hitting your IO limits. > > JG > > > -Original Message- > > From: Arun Ramakrishnan [mailto:aramakrish...@languageweaver.com] > > Sent: Wednesday, July 14, 2010 3:33 PM > > To: user@hbase

RE: region servers crashing

2010-07-14 Thread Jonathan Gray
> Sent: Wednesday, July 14, 2010 3:33 PM > To: user@hbase.apache.org > Subject: RE: region servers crashing > > Had a problem that caused issues that looked like this. > > > 2010-07-12 15:10:03,299 WARN org.apache.hadoop.hbase.util.Sleeper: We > slept > > 86246m

RE: region servers crashing

2010-07-14 Thread Arun Ramakrishnan
ezones detected on all the machines were the same. -Original Message- From: jdcry...@gmail.com [mailto:jdcry...@gmail.com] On Behalf Of Jean-Daniel Cryans Sent: Wednesday, July 14, 2010 3:11 PM To: user@hbase.apache.org Subject: Re: region servers crashing Dmitry, Your log shows this:

Re: region servers crashing

2010-07-14 Thread Jean-Daniel Cryans
itoring, setting swappiness to 0 and giving more memory to HBase (if available). J-D On Wed, Jul 14, 2010 at 3:03 PM, Dmitry Chechik wrote: > Hi all, > We've been having issues for a few days with HBase region servers crashing > when under load from mapreduce jobs. > There are a f

region servers crashing

2010-07-14 Thread Dmitry Chechik
Hi all, We've been having issues for a few days with HBase region servers crashing when under load from mapreduce jobs. There are a few different errors in the region server logs - I've attached a sample log of 4 different region servers crashing within an hour of each other. So