I just checked. No snapshots were taken and 'list_snapshots' also returns nothing.
Regards,
Shahab

On Fri, Nov 14, 2014 at 12:39 PM, Shahab Yunus <[email protected]> wrote:

> No. Not that I can recall but I can check.
>
> From a resolution perspective, is there any way we can resolve this? More
> importantly, is there any way we can automate the resolution, if we run into
> such issues in future? 'Cleaning the qualifier', that is.
>
> Regards,
> Shahab
>
> On Fri, Nov 14, 2014 at 12:12 PM, Ted Yu <[email protected]> wrote:
>
>> One possibility was that region 7373f75181c71eb5061a6673cee15931 was
>> involved in some hbase snapshot.
>>
>> Was the underlying table being snapshotted in recent past ?
>>
>> Cheers
>>
>> On Fri, Nov 14, 2014 at 9:05 AM, Shahab Yunus <[email protected]> wrote:
>>
>> > Thanks again.
>> >
>> > But I have been polling for a while and it still doesn't merge. I mean this
>> > particular region example that I sent you, I have been trying to merge it
>> > since yesterday. I ran the polling-based code all night and I had to kill it.
>> > Then in the morning, I tried manual merging through hbase shell and it
>> > still doesn't merge. Note that the current polling logic does not try to
>> > call merge again. It just checks the region size.
>> >
>> > So how to clean it then? Or actually make it merge? Plus is this something
>> > expected (a region keeping a reference)? How can we avoid it?
>> >
>> > Note that this is not limited to this table only. We are seeing this in
>> > other regions of other tables as well. Are we merging too fast?
>> >
>> > Regards,
>> > Shahab
>> >
>> > On Fri, Nov 14, 2014 at 11:58 AM, Ted Yu <[email protected]> wrote:
>> >
>> > > Polling as you described is fine.
>> > >
>> > > catalogJanitor.cleanMergeQualifier() is called by
>> > > DispatchMergingRegionHandler.
>> > >
>> > > If clean was successful, you would see the following:
>> > >
>> > >     LOG.debug("Deleting region " + regionA.getRegionNameAsString() + " and "
>> > >         + regionB.getRegionNameAsString()
>> > >         + " from fs because merged region no longer holds references");
>> > >
>> > > Assuming there was no log below in your master log:
>> > >
>> > >     LOG.error("Merged region " + region.getRegionNameAsString()
>> > >         + " has only one merge qualifier in META.");
>> > >
>> > > It would be the case that 7373f75181c71eb5061a6673cee15931 still had
>> > > reference file.
>> > >
>> > > Cheers
>> > >
>> > > On Fri, Nov 14, 2014 at 8:35 AM, Shahab Yunus <[email protected]> wrote:
>> > >
>> > > > Hi Ted.
>> > > >
>> > > > The log bit is below at the end of the email. This is the command to merge
>> > > > that I gave just now through hbase shell. forcible was false but it behaves
>> > > > similarly if forcible is true too. This is from master log. Indeed the
>> > > > region merging was skipped! What does this mean? Data seems to be intact
>> > > > for this table.
>> > > >
>> > > > Just to give you a background. This table was first merged by the automated
>> > > > java application. What we are doing is that we are merging tables
>> > > > programmatically. As the HBaseAdmin.mergeRegions call is async, we poll for
>> > > > the number of regions getting lowered after this merge call. The
>> > > > application hangs and continues polling forever as the previous merge
>> > > > didn't happen.
>> > > >
>> > > > In this poll loop, we do get the number of regions by a fresh call to
>> > > > HBaseAdmin.getTableRegions(tableName).size().
>> > > >
>> > > > What are these merge qualifiers and what are we doing wrong or should do?
>> > > >
>> > > > In the polling loop we can somehow retry merge again?
>> > > > But how can we know that we need to call merge again, as it works for
>> > > > some regions? Is the table meta corrupted for some reason by the above
>> > > > logic?
>> > > >
>> > > > Thanks a lot.
>> > > >
>> > > > ------------------------------------------------------------------------
>> > > >
>> > > > 2014-11-14 11:25:02,643 INFO org.apache.zookeeper.ZooKeeper: Session: 0x348c7017707236b closed
>> > > > 2014-11-14 11:25:02,643 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down
>> > > > 2014-11-14 11:25:02,645 INFO org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=ip-1010019.ec2.internal:2181,ip-1010017.ec2.internal:2181,ip-1010018.ec2.internal:2181 sessionTimeout=60000 watcher=catalogtracker-on-hconnection-0x47d865f2, quorum=ip-1010019.ec2.internal:2181,ip-1010017.ec2.internal:2181,ip-1010018.ec2.internal:2181, baseZNode=/hbase
>> > > > 2014-11-14 11:25:02,645 INFO org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Process identifier=catalogtracker-on-hconnection-0x47d865f2 connecting to ZooKeeper ensemble=ip-1010019.ec2.internal:2181,ip-1010017.ec2.internal:2181,ip-1010018.ec2.internal:2181
>> > > > 2014-11-14 11:25:02,645 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server ip-1010018.ec2.internal/1010019:2181.
>> > > > Will not attempt to authenticate using SASL (unknown error)
>> > > > 2014-11-14 11:25:02,646 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to ip-1010018.ec2.internal/1010019:2181, initiating session
>> > > > 2014-11-14 11:25:02,648 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server ip-1010018.ec2.internal/1010019:2181, sessionid = 0x348c7017707236c, negotiated timeout = 60000
>> > > > 2014-11-14 11:25:02,703 INFO org.apache.zookeeper.ZooKeeper: Session: 0x348c7017707236c closed
>> > > > 2014-11-14 11:25:02,703 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down
>> > > > 2014-11-14 11:25:30,713 INFO org.apache.hadoop.hbase.master.handler.DispatchMergingRegionHandler: Skip merging regions TABLE_NAME,,1415915112497.7373f75181c71eb5061a6673cee15931., TABLE_NAME,\x02\xFA\xF0\x80\x00\x00\x01I\xAA\xD5\x87\xA8\x19\x99\x99\x99\x99\x99\x99\x90,1415910559217.43f4d3685d113d3ae18eea9f189de096., because region 7373f75181c71eb5061a6673cee15931 has merge qualifier
>> > > > 2014-11-14 11:25:41,383 INFO org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=ip-1010019.ec2.internal:2181,ip-1010017.ec2.internal:2181,ip-1010018.ec2.internal:2181 sessionTimeout=60000 watcher=catalogtracker-on-hconnection-0x47d865f2, quorum=ip-1010019.ec2.internal:2181,ip-1010017.ec2.internal:2181,ip-1010018.ec2.internal:2181, baseZNode=/hbase
>> > > > 2014-11-14 11:25:41,384 INFO org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Process identifier=catalogtracker-on-hconnection-0x47d865f2 connecting to ZooKeeper
>> > > > ensemble=ip-1010019.ec2.internal:2181,ip-1010017.ec2.internal:2181,ip-1010018.ec2.internal:2181
>> > > > 2014-11-14 11:25:41,384 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server ip-1010018.ec2.internal/1010019:2181. Will not attempt to authenticate using SASL (unknown error)
>> > > > 2014-11-14 11:25:41,386 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to ip-1010018.ec2.internal/1010019:2181, initiating session
>> > > > 2014-11-14 11:25:41,389 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server ip-1010018.ec2.internal/1010019:2181, sessionid = 0x348c7017707236e, negotiated timeout = 60000
>> > > > 2014-11-14 11:25:41,397 INFO org.apache.zookeeper.ZooKeeper: Session: 0x348c7017707236e closed
>> > > > 2014-11-14 11:25:41,398 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down
>> > > >
>> > > > ------------------------------------------------------------------------
>> > > >
>> > > > Regards,
>> > > > Shahab
>> > > >
>> > > > On Fri, Nov 14, 2014 at 10:56 AM, Ted Yu <[email protected]> wrote:
>> > > >
>> > > > > Looking at DispatchMergingRegionHandler, it does some check before
>> > > > > initiating the merge.
>> > > > > e.g.:
>> > > > >
>> > > > >     LOG.info("Skip merging regions " + region_a.getRegionNameAsString()
>> > > > >         + ", " + region_b.getRegionNameAsString() + ", because region "
>> > > > >         + (regionAHasMergeQualifier ? region_a.getEncodedName()
>> > > > >             : region_b.getEncodedName()) + " has merge qualifier");
>> > > > >
>> > > > > Can you take a look at master log around the time merge request was
>> > > > > issued to see if you can get some clue ?
>> > > > >
>> > > > > Cheers
>> > > > >
>> > > > > On Fri, Nov 14, 2014 at 6:41 AM, Shahab Yunus <[email protected]> wrote:
>> > > > >
>> > > > > > The documentation of online merge tool (merge_region) states that if we
>> > > > > > forcibly merge regions (by setting the 3rd attribute as true) then it can
>> > > > > > create overlapping regions. If this happens, will it render the
>> > > > > > region or table unusable, or is it just a performance hit? I mean, how
>> > > > > > big of a deal is it?
>> > > > > >
>> > > > > > Actually, we are merging regions using the programmatic API for this and
>> > > > > > setting this flag ('forcible') as false. But for some tables (we haven't
>> > > > > > figured out a pattern yet, data is still accessible), the merge of regions
>> > > > > > does not happen at all. Afterwards we tried with this flag = true, and it
>> > > > > > still doesn't merge them.
>> > > > > >
>> > > > > > CDH 5.1.0
>> > > > > > (Hbase is 0.98.1-cdh5.1.0)
>> > > > > >
>> > > > > > Regards,
>> > > > > > Shahab
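
The polling loop described in this thread waits for the region count to drop after the asynchronous merge call, and it hangs forever when the master skips the merge. One possible way to avoid the hang is to bound the wait with a deadline, so the caller can then decide whether to check the master log, retry, or give up. The following is only a rough sketch of that idea, not code from the thread: the names MergePoller, awaitMerge and Outcome are hypothetical, the region-count supplier stands in for whatever call the application already uses to count regions, and the lambdas assume Java 8.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.IntSupplier;

/** Hypothetical helper (not part of HBase): bounded polling for an async merge. */
final class MergePoller {

    enum Outcome { MERGED, TIMED_OUT, INTERRUPTED }

    /**
     * Polls regionCount until it reports fewer regions than countBeforeMerge,
     * or until timeoutMillis has elapsed. It never re-issues the merge itself;
     * the caller decides what to do on TIMED_OUT.
     */
    static Outcome awaitMerge(IntSupplier regionCount, int countBeforeMerge,
                              long timeoutMillis, long pollIntervalMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (true) {
            if (regionCount.getAsInt() < countBeforeMerge) {
                return Outcome.MERGED;
            }
            long remainingNanos = deadline - System.nanoTime();
            if (remainingNanos <= 0) {
                return Outcome.TIMED_OUT;
            }
            // Sleep for the poll interval, but never past the deadline.
            long remainingMillis = Math.max(1, TimeUnit.NANOSECONDS.toMillis(remainingNanos));
            try {
                Thread.sleep(Math.min(pollIntervalMillis, remainingMillis));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return Outcome.INTERRUPTED;
            }
        }
    }

    public static void main(String[] args) {
        // Simulated table: the count drops from 10 to 9 on the third poll.
        AtomicInteger polls = new AtomicInteger();
        IntSupplier merging = () -> polls.incrementAndGet() >= 3 ? 9 : 10;
        System.out.println(awaitMerge(merging, 10, 1000, 10));

        // Simulated skipped merge: the count never changes, so the poll gives up.
        IntSupplier stuck = () -> 10;
        System.out.println(awaitMerge(stuck, 10, 50, 10));
    }
}
```

In the application from the thread, the supplier would wrap the region-count call it already makes, converting any checked exception as appropriate. A TIMED_OUT result would be the point to look in the master log for the "Skip merging regions ... has merge qualifier" message rather than to keep polling.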
