Dealing with data locality in the HBase Java API

2015-03-03 Thread Gokul Balakrishnan
Hello, I'm fairly new to HBase so would be grateful for any assistance. My project is as follows: use HBase as an underlying data store for an analytics cluster (powered by Apache Spark). In doing this, I'm wondering how I may set about leveraging the locality of the HBase data during processing

Re: BucketCache Configuration Problem

2015-03-03 Thread Stack
On Tue, Mar 3, 2015 at 6:26 PM, donhoff_h <165612...@qq.com> wrote: > Hi, Stack > > Yes, what I mean is that the working set will not fit in RAM and so we > consider using SSD. As to the access pattern, no, they are not accessed > randomly. Usually in a loan business, after the pictures are loaded

Re: Where is HBase failed servers list stored

2015-03-03 Thread Ted Yu
Please see HBASE-13067 Fix caching of stubs to allow IP address changes of restarted remote servers Cheers On Tue, Mar 3, 2015 at 8:26 PM, Sandeep L wrote: > Hi nkeywal, > While trying to get more details about this issue I got to know that > HMaster is trying to connect to wrong IP Address. >

RE: Where is HBase failed servers list stored

2015-03-03 Thread Sandeep L
Hi nkeywal, While trying to get more details about this issue I found that HMaster is trying to connect to the wrong IP address. Here is the exact issue: for an unavoidable reason we were forced to change the IP address of the regionserver, and then updated the new IP address in the /etc/hosts file across all HB

Re: Need Help: RegionTooBusyException: Above memstore limit

2015-03-03 Thread Jianshi Huang
The error disappeared after changing write buffer from 20MB to 2MB. Thanks for the help! Jianshi On Wed, Mar 4, 2015 at 12:12 AM, Jean-Marc Spaggiari < jean-m...@spaggiari.org> wrote: > It depends on how you manage your connection, your table and your puts. If > it works for you with reducing th

Re: BucketCache Configuration Problem

2015-03-03 Thread donhoff_h
Hi, Stack Yes, what I mean is that the working set will not fit in RAM and so we consider using SSD. As to the access pattern, no, they are not accessed randomly. Usually in a loan business, after the pictures are loaded, some users will first read these pictures to check the correctness of the

Different time ranges for different cfs when using TableInputFormat

2015-03-03 Thread Felipe Sodré Silva
When using TableInputFormat to make HBase data available to map/reduce jobs we can use the settings SCAN_TIMERANGE_START and SCAN_TIMERANGE_END to specify a time range during scan. Is it possible to somehow have different time ranges for different column families? This is my problem: I have table

Re: BucketCache Configuration Problem

2015-03-03 Thread Stack
On Tue, Mar 3, 2015 at 1:02 AM, donhoff_h <165612...@qq.com> wrote: > Hi, Stack > > Still thanks much for your quick reply. > > The reason that we don't shrink the heap and allocate the savings to the > offheap is that we want to cache datablocks as many as possible. The memory > size is limited.

Re: InvocationTargetException exception from org.apache.hadoop.hbase.client.HConnectionManager.createConnection

2015-03-03 Thread Chandrashekhar Kotekar
Hi JM, Thanks for the answer. My code was missing "hdfs-site.xml". I added this file as well while creating configuration object and error was gone but some other errors came which I solved by putting proper jar files on the class path. I had to use "hdfs-site.xml" in configuration because my Had

Re: Where is HBase failed servers list stored

2015-03-03 Thread Nicolas Liochon
It's in local memory. When HBase cannot connect to a server, it puts it into the "failedServerList" for 2 seconds. This is to avoid having all the threads going into a potentially long socket timeout. Are you sure that you can connect from the master to this machine/port? You can change the time i
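The behaviour described in this reply (an in-memory list that remembers a failed server for 2 seconds) can be sketched roughly as follows. This is only an illustration of the idea, not HBase's actual implementation: the class name `FailedServerSketch`, its methods, and the host name in the usage note are hypothetical, and timestamps are passed in explicitly so the example stays deterministic.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a short-lived "failed servers" list kept in local memory.
final class FailedServerSketch {
    // Maps a server address to the time (in ms) at which its entry expires.
    private final Map<String, Long> expiryByServer = new HashMap<>();
    private final long ttlMillis;

    FailedServerSketch(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    // Record a failed connection attempt observed at nowMillis.
    void markFailed(String server, long nowMillis) {
        expiryByServer.put(server, nowMillis + ttlMillis);
    }

    // True while the entry is still fresh; expired entries are dropped lazily.
    boolean isFailed(String server, long nowMillis) {
        Long expiry = expiryByServer.get(server);
        if (expiry == null) {
            return false;
        }
        if (nowMillis >= expiry) {
            expiryByServer.remove(server);
            return false;
        }
        return true;
    }
}
```

With a 2000 ms lifetime, a server marked as failed at time 0 would be skipped by callers until time 2000, after which a new connection attempt is allowed. This is how a short list like this keeps many threads from all waiting on the same long socket timeout.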

Re: InvocationTargetException exception from org.apache.hadoop.hbase.client.HConnectionManager.createConnection

2015-03-03 Thread Jean-Marc Spaggiari
Hi Chandrashekhar, Can you make sure your hbase-site.xml is on the classpath and remove the addResource line from your code? JM 2015-03-03 0:07 GMT-05:00 Chandrashekhar Kotekar : > My tomcat based REST API application is not able to process request due to > above mentioned error. I have tried

Re: Need Help: RegionTooBusyException: Above memstore limit

2015-03-03 Thread Jianshi Huang
Yes, looks like reducing the batch buffer size works (still validating). But why is setAutoFlush(false) harmful here? I just want maximum write speed. Jianshi On Tue, Mar 3, 2015 at 10:54 PM, Jean-Marc Spaggiari < jean-m...@spaggiari.org> wrote: > Let HBase manage the flushes for you. Remove ed

Re: Need Help: RegionTooBusyException: Above memstore limit

2015-03-03 Thread Jean-Marc Spaggiari
It depends on how you manage your connection, your table and your puts. If it works for you with reducing the batch buffer size, then just keep it the way it is... JM 2015-03-03 11:10 GMT-05:00 Jianshi Huang : > Yes, looks like reducing the batch buffer size works (still validating). > > But why

Force-assign the META region to a dedicated RegionServer

2015-03-03 Thread Margarita Savova
Hello, I have observed very high load on the box, which hosts the META region, so I wanted to try isolating the META region to run on its dedicated box and make sure the RegionServer there does not host any other regions besides META. I stopped all other services (MapReduce and DataNode) on this

Standalone == Dev Only?

2015-03-03 Thread Rose, Joseph
Folks, I’m new to HBase (but not new to these sorts of data stores.) I think HBase would be a good fit for a project I’m working on, except for one thing: the amount of data we’re talking about, here, is far smaller than what’s usually recommended for HBase. As I read the docs, though, it seems

Re: Need Help: RegionTooBusyException: Above memstore limit

2015-03-03 Thread Jean-Marc Spaggiari
Let HBase manage the flushes for you. Remove edgeTable.setAutoFlush(false) and maybe reduce your batch size. I don't think that increasing the memstore is the right way to go. Sounds more like a plaster on the issue than a good fix (for me). JM 2015-03-03 9:43 GMT-05:00 Ted Yu : > Default value f

Re: Need Help: RegionTooBusyException: Above memstore limit

2015-03-03 Thread Ted Yu
Default value for hbase.regionserver.global.memstore.size is 0.4, meaning the maximum size of all memstores in the region server before new updates are blocked and flushes are forced is 7352m, which is higher than 774m. You can increase the value for hbase.regionserver.global.memstore.size Please also
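The 7352m figure in this reply appears to come from multiplying the 18380m max heap reported elsewhere in the thread by the 0.4 default fraction. A minimal sketch of that arithmetic, with a hypothetical helper name (`globalLimitMb` is not an HBase API):

```java
// Hypothetical helper: derives the global memstore limit from heap size and fraction.
final class MemstoreLimit {
    // maxHeapMb: region server max heap in MB; fraction: hbase.regionserver.global.memstore.size
    static long globalLimitMb(long maxHeapMb, double fraction) {
        return Math.round(maxHeapMb * fraction);
    }

    public static void main(String[] args) {
        // Values taken from the thread: 18380m heap, default fraction 0.4.
        System.out.println(globalLimitMb(18380L, 0.4) + "m");
    }
}
```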

Where is HBase failed servers list stored

2015-03-03 Thread Sandeep L
Hi, While trying to run the HBase balancer I am getting the error message "This server is in the failed servers list". Due to this the cluster is not getting balanced. Even though the regionserver is up and running, hmaster is unable to connect to it. The odd thing here is hmaster is able to start regionserver a

Re: BucketCache Configuration Problem

2015-03-03 Thread donhoff_h
Hi, Stack Still thanks much for your quick reply. The reason that we don't shrink the heap and allocate the savings to the offheap is that we want to cache datablocks as many as possible. The memory size is limited. No matter how much we shrink it can not store so many datablocks. So we want t

Re: Timerange scan

2015-03-03 Thread Kristoffer Sjögren
What can I say? Awesome community! :-) On Mon, Mar 2, 2015 at 11:17 PM, Gary Helmling wrote: > Proving it to yourself is sometimes the hardest part! > > On Mon, Mar 2, 2015 at 2:11 PM Nick Dimiduk wrote: > > > Gary to the rescue! Does it still count as being right even if you cannot > > prove i

Problem: about hbase.regionserver.restart.on.zk.expire configuration

2015-03-03 Thread Yu, Bella
Hi Ted, We found that in HBase v0.98.5 the configuration "hbase.regionserver.restart.on.zk.expire" cannot be set, so could you please tell us whether there is another way to replace this configuration in v0.98.5? Thanks

Re: Need Help: RegionTooBusyException: Above memstore limit

2015-03-03 Thread Jianshi Huang
Hi JM, Thanks for the hints. Here's my settings for writer. edgeTable.setAutoFlush(false) edgeTable.setWriteBufferSize(20971520) The write buffer seems quite large as the region server is hosting 12 related regions I'm writing to. I'll test with smaller write buffer size. The size of each put i
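For reference, the write buffer value quoted in this message, 20971520 bytes, is exactly 20 MiB. The small sketch below shows the conversion. The helper name `mebibytesToBytes` is hypothetical, and it is an assumption that the 2MB value mentioned later in the thread also uses binary units.

```java
// Hypothetical helper for turning a buffer size in MiB into the byte count
// that a call such as edgeTable.setWriteBufferSize(...) expects.
final class WriteBufferSizes {
    static long mebibytesToBytes(long mebibytes) {
        return mebibytes * 1024L * 1024L;
    }

    public static void main(String[] args) {
        // 20 MiB matches the value in the message; 2 MiB assumes binary units.
        System.out.println(mebibytesToBytes(20));
        System.out.println(mebibytesToBytes(2));
    }
}
```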

Re: Need Help: RegionTooBusyException: Above memstore limit

2015-03-03 Thread Jianshi Huang
Hi Ted, Only one region server is problematic. hbase.regionserver.global.memstore.size is not set, the problematic region is using 774m for memstore. Max heap is 18380m for all region servers. Jianshi On Mon, Mar 2, 2015 at 10:59 PM, Ted Yu wrote: > What's the value for hbase.regionserver.g