Hi,

Considering this is called the *online* region merge, I would assume that the regions being merged never go offline and that both remain available for reading and writing at all times, even during the merge. But I don't see how writes would work while one region is being moved from one RS to another. So maybe this is not truly online, and writes are either rejected or buffered/blocked until the region is moved AND merged? Does anyone know for sure?
I see this in one of the comments:

  Q: If one (or both) of the regions were receiving non-trivial load prior to this action, would client(s) be affected ?
  A: Yes, region would be off services in a short time, it is equal with moving region, e.g balance a region

I also took a look at the patch:
https://issues.apache.org/jira/secure/attachment/12574965/hbase-7403-trunkv33.patch

and see:

+  /**
+   * The merging region A has been taken out of the server's online regions list.
+   */
+  OFFLINED_REGION_A,
...

If you search the patch for the word "offline", I think it's pretty clear that BOTH regions being merged do go offline at some point. I guess that could happen after the merge rather than before it, though. Maybe others know?

Thanks,
Otis
--
Monitoring * Alerting * Anomaly Detection * Centralized Log Management
Solr & Elasticsearch Support * http://sematext.com/

On Mon, Jan 19, 2015 at 4:17 AM, Vladimir Tretyakov <[email protected]> wrote:

> Hi, I have one question about 'online region merge'
> (https://issues.apache.org/jira/browse/HBASE-7403).
> As I understand it, the regions passed to the merge method will be
> unavailable for some time.
>
> That means:
> 1. Some data will be unavailable for some time.
> 2. If a client tries to write data to these regions, it will get
> exceptions.
>
> Are the above statements correct?
>
> Can somebody estimate how long 1 and 2 will hold? Seconds, minutes or
> hours? Is there any way to avoid 1 and 2?
>
> I am asking because we have a problem with the number of regions over
> time: our key contains a timestamp, so the region count grows constantly
> (through splitting), and this eventually causes performance problems.
> To avoid this effect we use 2 tables:
> 1. The first table we use for writing and reading data.
> 2. The second we use only for reading data.
>
> After some time we truncate the second table and rotate the tables (the
> first becomes the second and the second becomes the first).
> That allows us to control the region count, but the solution looks a bit
> ugly. I looked at 'online region merge', but we can't live with the
> restrictions I described in the first part of this question.
>
> Can somebody help with answers?
>
> Thx, Vladimir Tretyakov.
>
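
For anyone following the thread, the two-table rotation described in the quoted message can be sketched roughly as below. This is only an illustration under stated assumptions: the class name `RotatingTables` and its methods are hypothetical, plain Python dicts stand in for the two HBase tables, and none of this is HBase client API.

```python
class RotatingTables:
    """In-memory stand-in for the two-table rotation (hypothetical names)."""

    def __init__(self):
        # Dicts stand in for the two HBase tables; this is not the HBase API.
        self.active = {}    # "first" table: receives writes and serves reads
        self.standby = {}   # "second" table: read-only, holds the older data

    def put(self, key, value):
        # All writes go to the active table only.
        self.active[key] = value

    def get(self, key):
        # Reads check the active table first, then fall back to the standby.
        if key in self.active:
            return self.active[key]
        return self.standby.get(key)

    def rotate(self):
        # Truncate the read-only table, then swap roles: the emptied table
        # starts taking writes and the previously active one becomes read-only.
        self.standby.clear()
        self.active, self.standby = self.standby, self.active
```

Data written before a rotation stays readable for one more period and disappears at the rotation after that, which is what bounds the amount of data (and, in the real setup, the region count) without merging regions.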
