There are two timeout properties: one on the region server side and one on the master side (the coordinator).

"hbase.snapshot.master.timeoutMillis"
"hbase.snapshot.region.timeout"

Increasing only the master-side timeout has no effect, because the region server side will still send a timeout to the master after its default 60 seconds.

Matteo

On Mon, Jul 21, 2014 at 2:56 PM, Brian Jeltema <[email protected]> wrote:

> There are 174 regions, not well balanced. One RegionServer has 69 regions.
> That RegionServer generates a series of log entries (modified and shown
> below), one for each region, at roughly 1 to 2 second intervals. The
> timeout period expires when it reaches region 36.
>
> 2014-07-21 07:49:44,503 regionserver.HRegion: Creating references for hfiles
> 2014-07-21 07:49:44,503 regionserver.HRegion: Adding snapshot references for [hdfs://xxx.digitalenvoy.net:8020/apps/hbase/data/data/default/hosts/31e2a098e9e311c4ddcfd3d8da28dfb6/p/3749b6df36c749508fe9c6f54ca425f2] hfiles
> 2014-07-21 07:49:44,503 regionserver.HRegion: Creating reference for file (1/1) : hdfs://xxx.digitalenvoy.net:8020/apps/hbase/data/data/default/hosts/31e2a098e9e311c4ddcfd3d8da28dfb6/p/3749b6df36c749508fe9c6f54ca425f2
> 2014-07-21 07:49:45,136 snapshot.FlushSnapshotSubprocedure: ... Flush Snapshotting region hosts,\x00\x03|\xBF!,1400600029600.31e2a098e9e311c4ddcfd3d8da28dfb6. completed.
> 2014-07-21 07:49:45,137 snapshot.FlushSnapshotSubprocedure: Closing region operation on hosts,\x00\x03|\xBF!,1400600029600.31e2a098e9e311c4ddcfd3d8da28dfb6.
> 2014-07-21 07:49:45,137 DEBUG [rs(xxx.digitalenvoy.net,60020,1405943192177)-snapshot-pool3-thread-1] snapshot.FlushSnapshotSubprocedure: Starting region operation on hosts,\x00\x8A\x90\xD6\x08,1400659179080.a74402fcbd9a96a7c92b250721095729.
> 2014-07-21 07:49:45,137 DEBUG [member: 'xxx.digitalenvoy.net,60020,1405943192177' subprocedure-pool1-thread-2] snapshot.RegionServerSnapshotManager: Completed 1/174 local region snapshots.
> 2014-07-21 07:49:45,137 snapshot.FlushSnapshotSubprocedure: Flush Snapshotting region hosts,\x00\x8A\x90\xD6\x08,1400659179080.a74402fcbd9a96a7c92b250721095729. started...
> 2014-07-21 07:49:45,137 regionserver.HRegion: Storing region-info for snapshot.
>
> On Jul 21, 2014, at 9:21 AM, Jean-Marc Spaggiari <[email protected]> wrote:
>
> > Can you also tell us more about your table? How many regions on how many
> > region servers?
> >
> >
> > 2014-07-21 8:23 GMT-04:00 Ted Yu <[email protected]>:
> >
> >> Normally such timeout is caused by one region server which is slow in
> >> completing its part of the snapshot procedure.
> >>
> >> Have you looked at region server logs?
> >> Feel free to pastebin relevant portion.
> >>
> >> Thanks
> >>
> >> On Jul 21, 2014, at 4:03 AM, Brian Jeltema <[email protected]> wrote:
> >>
> >>> I’m running HBase 0.98. I’m trying to snapshot a table, but it’s timing
> >>> out after 60 seconds.
> >>> I increased the value of hbase.snapshot.master.timeoutMillis and
> >>> restarted HBase,
> >>> but the timeout still happens after 60 seconds. Any suggestions?
> >>>
> >>> Brian

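For reference, a minimal sketch of the change suggested at the top of this thread. It assumes the two properties are set in hbase-site.xml on the master and on every region server, and that both values are in milliseconds. The 300000 (5 minutes) value is purely illustrative and does not come from the thread; it should be comfortably larger than the time the slowest region server needs (in the case reported above, 69 regions at roughly 1 to 2 seconds each).

```xml
<!-- hbase-site.xml (fragment): raise both snapshot timeouts together. -->
<!-- Values are assumed to be milliseconds; 300000 is an illustrative choice. -->
<property>
  <name>hbase.snapshot.master.timeoutMillis</name>
  <value>300000</value>
</property>
<property>
  <name>hbase.snapshot.region.timeout</name>
  <value>300000</value>
</property>
```

Raising only the first property reproduces the behaviour in the original question: the region server still gives up after its own default of 60 seconds. Restarting HBase after the change, as was done for the master-side setting, is presumably needed for the new values to be picked up.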