>> Hi
>>
>> My system is quite simple:
>> - two (one quad core, one dual core) servers with 2GB mem and 150 GB
>> allocated to dfs.

What do you have for the replication level in your HDFS? The default is 3. If you have only two servers, that's odd.

>> - I use it to crawl multiple supports but mainly filesystems and
>> save the results onto hbase (not too many files < 100.000 but rows can get
>> easily to 30 MB each)
>>
>> I constantly getting NullPointerExceptions (on the client caused by
>> NotServingRegionExceptions on regionserver) when creating tables or
>> RegionOfflineExceptions when doing puts or sometimes just time outs.

Tell us more about these. Paste in the stack traces. Please enable DEBUG logging (see the FAQ for how).

>> When started with hbase I developed in 'local' mode, I then migrated
>> to a small dev 2 servers cluster (weaker than production is now) where I
>> tested the functionality, and it worked fine but, my bad, due to pressing
>> scheduling I didn't do any real load tests, so the system is now
>> continuously going under in production. I've only been able to do a full
>> crawl by resetting the cluster to one node and putting it in 'local' mode.
>>
>> My question is what can cause regions to be offline in
>> regionservers?

As Bryan said, it shouldn't be happening. (There was a case a while back where it could happen, but that was fixed in 0.1.0 -- perhaps there is another path that provokes this condition.)

>>
>> I ask so that I can investigate the matter further but having a
>> starting point.

Get all of your table's regions online -- run the little method in MetaUtils to online any offline regions if running 'enable table' in HQL doesn't do it for you -- and then enable DEBUG and let it run. If a region goes offline, send us the logs from the regionservers and the master.

One other thing to consider is filehandles. Are you running with the usual default of 1024? If so, things will fail in odd ways once you have upward of tens of regions.

St.Ack
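
A quick way to check the filehandle limit mentioned above is sketched below in Python. This is only an illustration, not anything that ships with HBase: the helper names `nofile_limits` and `is_default_limit` are made up for this example, and the 1024 threshold is simply the usual default referred to above. The check only reflects the process that runs it, so it would need to be run on each node as the user that starts the regionserver and datanode.

```python
import resource


def nofile_limits():
    """Return the (soft, hard) open-file limits for the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def is_default_limit(soft, threshold=1024):
    """True if the soft limit is at or below the usual 1024 default.

    An unlimited soft limit (RLIM_INFINITY) is never reported as default.
    """
    return soft != resource.RLIM_INFINITY and soft <= threshold


if __name__ == "__main__":
    soft, hard = nofile_limits()
    print(f"open files: soft={soft} hard={hard}")
    if is_default_limit(soft):
        print("Soft limit is 1024 or lower; consider raising it "
              "for the user that runs HBase and HDFS.")
```

How the limit gets raised depends on the operating system (on many Linux systems it is set per user in the limits configuration or with `ulimit -n` in the startup shell), so check that for your own distribution.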
