Will integrate patch once QA run finishes. Thanks

On Tue, Feb 3, 2015 at 5:40 PM, Bi,hongyu—mike <[email protected]> wrote:

> Hi Ted,
> sorry for the late response,
> I just filed a JIRA, https://issues.apache.org/jira/browse/HBASE-12957, for
> this issue.
> Thanks
>
> 2015-01-07 23:40 GMT+08:00 Ted Yu <[email protected]>:
>
> > In 0.98, HRegionServer is annotated with @InterfaceAudience.Private
> > In 1.0+, it is annotated
> > with @InterfaceAudience.LimitedPrivate(HBaseInterfaceAudience.TOOLS)
> >
> > FYI
> >
> > On Tue, Jan 6, 2015 at 7:55 PM, Bi,hongyu—mike <[email protected]> wrote:
> >
> > > Thanks Ted, I didn't notice that ;P
> > >
> > > 2015-01-07 11:47 GMT+08:00 Ted Yu <[email protected]>:
> > >
> > > > In master and branch-1 branches, there is no 'GetResponse get()' method
> > > > in HRegionServer anymore.
> > > >
> > > > FYI
> > > >
> > > > On Tue, Jan 6, 2015 at 7:26 PM, Bi,hongyu—mike <[email protected]> wrote:
> > > >
> > > > > Hi Ted,
> > > > >
> > > > > KeyOnlyFilter may improve the scan speed, but I don't think the scan
> > > > > would finish in less than leaseTimeout in such a case.
> > > > > From HRegionServer#get I see:
> > > > >
> > > > > HRegion region = getRegion(regionName);
> > > > >
> > > > > Here getRegion may throw NotServingRegionException, which is what
> > > > > isSuccessfulScan needs, and HRegionServer#get can return as soon as
> > > > > possible.
> > > > >
> > > > > 2015-01-07 11:00 GMT+08:00 Ted Yu <[email protected]>:
> > > > >
> > > > > > For isSuccessfulScan(), I see:
> > > > > >
> > > > > > scan.setBatch(1)
> > > > > > scan.setCaching(1)
> > > > > > scan.setFilter(FirstKeyOnlyFilter.new())
> > > > > >
> > > > > > How about adding a KeyOnlyFilter as well?
> > > > > >
> > > > > > On Tue, Jan 6, 2015 at 6:37 PM, Bi,hongyu—mike <[email protected]> wrote:
> > > > > >
> > > > > > > Thanks Ted,
> > > > > > > Finally I resolved the issue. The RC is: region_mover calls
> > > > > > > isSuccessfulScan to scan the startkey of the moved region, which is
> > > > > > > filled with lots of expired cells, so the scan seems to hang.
> > > > > > > I think isSuccessfulScan is just there to test whether the moved
> > > > > > > region is readable or not, so why not use get instead, which may
> > > > > > > avoid such a case?
> > > > > > >
> > > > > > > 2015-01-06 20:59 GMT+08:00 Ted Yu <[email protected]>:
> > > > > > >
> > > > > > > > Can you pastebin the region server log?
> > > > > > > >
> > > > > > > > When the scan is being performed, can you get a jstack and
> > > > > > > > pastebin it?
> > > > > > > >
> > > > > > > > 0.94.15 was an old release, any chance of an upgrade?
> > > > > > > >
> > > > > > > > Thanks
> > > > > > > >
> > > > > > > > > On Jan 6, 2015, at 2:34 AM, Bi,hongyu—mike <[email protected]> wrote:
> > > > > > > > >
> > > > > > > > > Sorry, forgot to attach the version: 0.94.15;
> > > > > > > > >
> > > > > > > > > and calling compact (as well as flushing the region many times)
> > > > > > > > > from the hbase shell didn't take effect, no compaction happened.
> > > > > > > > >
> > > > > > > > > 2015-01-06 18:26 GMT+08:00 Bi,hongyu—mike <[email protected]>:
> > > > > > > > >
> > > > > > > > >> scan debug log:
> > > > > > > > >> 15/01/06 18:20:56 DEBUG client.ClientScanner: Creating scanner over T starting at key 'Rowx'
> > > > > > > > >> 15/01/06 18:20:56 DEBUG client.ClientScanner: Advancing internal scanner to startKey at 'Rowx'
> > > > > > > > >> 15/01/06 18:20:56 DEBUG client.MetaScanner: Scanning .META. starting at row=XXXX for max=10 rows using org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@427b7b5d
> > > > > > > > >> 15/01/06 18:20:56 DEBUG client.HConnectionManager$HConnectionImplementation: Cached location for <THAT REGION> is RS_IP:60020
> > > > > > > > >> ......
> > > > > > > > >> 15/01/06 18:21:16 DEBUG zookeeper.ClientCnxn: Got ping response for sessionid: 0x3499df682b076cf after 0ms
> > > > > > > > >> 15/01/06 18:21:36 DEBUG zookeeper.ClientCnxn: Got ping response for sessionid: 0x3499df682b076cf after 0ms
> > > > > > > > >> 15/01/06 18:21:56 DEBUG zookeeper.ClientCnxn: Got ping response for sessionid: 0x3499df682b076cf after 0ms
> > > > > > > > >> 15/01/06 18:21:56 DEBUG zookeeper.ClientCnxn: Reading reply sessionid:0x3499df682b076cf, packet:: clientPath:null serverPath:null finished:false header:: 9,4 replyHeader:: 9,21519728740,-101 request:: '/hbase/table/T,F response::
> > > > > > > > >> 15/01/06 18:21:56 DEBUG client.HConnectionManager$HConnectionImplementation: Removed <THAT REGION> for tableName=T from cache because of Rowx
> > > > > > > > >> 15/01/06 18:21:56 DEBUG client.HConnectionManager$HConnectionImplementation: Cached location for <THAT REGION> is RS_IP:60020
> > > > > > > > >> 15/01/06 18:21:56 DEBUG client.ClientScanner: Advancing internal scanner to startKey at 'Rowx'
> > > > > > > > >>
> > > > > > > > >> 2015-01-06 18:09 GMT+08:00 Bi,hongyu—mike <[email protected]>:
> > > > > > > > >>
> > > > > > > > >>> write traffic is ok:
> > > > > > > > >>> 2015-01-06 17:46:01,127 WARN org.apache.hadoop.hbase.ipc.SecureServer: (responseTooSlow): {"processingtimems":68,"call":"multi(Region=Rx of 149 actions and first row key= Rowx), rpc version=1, client version=29, methodsFingerPrint=-1105746420","client":"IP:port}
> > > > > > > > >>>
> > > > > > > > >>> scan on that region is slow:
> > > > > > > > >>> 015-01-06 16:23:25,087 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer:
> > > > > > > > >>> org.apache.hadoop.hbase.ipc.CallerDisconnectedException: Aborting on region Rx, call next(8002464006782223710, 1, 0), rpc version=1, client version=29, methodsFingerPrint=-1771721648 from 10.201.202.31:31285 after 87821 ms, since caller disconnected
> > > > > > > > >>> at org.apache.hadoop.hbase.ipc.HBaseServer$Call.throwExceptionIfCallerDisconnected(HBaseServer.java:438);
> > > > > > > > >>>
> > > > > > > > >>> hbase hfile -r 'Rx' -p can produce the result
> > > > > > > > >>>
> > > > > > > > >>> 2015-01-06 18:03 GMT+08:00 Bi,hongyu—mike <[email protected]>:
> > > > > > > > >>>
> > > > > > > > >>>> Hi all,
> > > > > > > > >>>>
> > > > > > > > >>>> There's one region which can take write requests but cannot be
> > > > > > > > >>>> scanned. If I scan that region I get a scanner lease timeout
> > > > > > > > >>>> (60s by default), while I can scan other regions of the same
> > > > > > > > >>>> table and get the result in less than 10ms (our slow rpc
> > > > > > > > >>>> threshold is 10ms).
> > > > > > > > >>>>
> > > > > > > > >>>> hbck reports OK, and I used the "hbase hfile" tool to check
> > > > > > > > >>>> that region's storefile and the region, which all extract the
> > > > > > > > >>>> result;
> > > > > > > > >>>>
> > > > > > > > >>>> so I don't have any idea about it...
> > > > > > > > >>>> Any help will be appreciated, many thanks!
