There are 174 regions, and they are not well balanced: one RegionServer hosts 69 of them. That RegionServer writes one series of log entries (modified and shown below) per region, at intervals of roughly 1 to 2 seconds. The timeout expires when it reaches region 36.
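
As a rough sanity check on those numbers, here is a back-of-envelope sketch in Python. It assumes the regions on that server are snapshotted strictly one after another, which is what the log timing suggests but is not confirmed; the variable names are only illustrative.

```python
# Back-of-envelope check using only the figures quoted in this message.
REGIONS_ON_SERVER = 69        # regions hosted by the busiest RegionServer
TIMEOUT_S = 60                # observed snapshot timeout, in seconds
REGIONS_DONE_AT_TIMEOUT = 36  # region reached when the timeout expired

# Implied average time per region (assumption: regions are handled sequentially).
per_region_s = TIMEOUT_S / REGIONS_DONE_AT_TIMEOUT

# Projected time for all 69 regions at that pace, plus the 1 s and 2 s bounds.
projected_s = REGIONS_ON_SERVER * per_region_s
best_case_s = REGIONS_ON_SERVER * 1
worst_case_s = REGIONS_ON_SERVER * 2

print(f"{per_region_s:.2f} s per region")
print(f"{projected_s:.0f} s projected (range {best_case_s}-{worst_case_s} s)")
```

Under that assumption, even the 1-second lower bound puts this one server past the 60-second limit.
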

2014-07-21 07:49:44,503 regionserver.HRegion: Creating references for hfiles
2014-07-21 07:49:44,503 regionserver.HRegion: Adding snapshot references for [hdfs://xxx.digitalenvoy.net:8020/apps/hbase/data/data/default/hosts/31e2a098e9e311c4ddcfd3d8da28dfb6/p/3749b6df36c749508fe9c6f54ca425f2] hfiles
2014-07-21 07:49:44,503 regionserver.HRegion: Creating reference for file (1/1) : hdfs://xxx.digitalenvoy.net:8020/apps/hbase/data/data/default/hosts/31e2a098e9e311c4ddcfd3d8da28dfb6/p/3749b6df36c749508fe9c6f54ca425f2
2014-07-21 07:49:45,136 snapshot.FlushSnapshotSubprocedure: ... Flush Snapshotting region hosts,\x00\x03|\xBF!,1400600029600.31e2a098e9e311c4ddcfd3d8da28dfb6. completed.
2014-07-21 07:49:45,137 snapshot.FlushSnapshotSubprocedure: Closing region operation on hosts,\x00\x03|\xBF!,1400600029600.31e2a098e9e311c4ddcfd3d8da28dfb6.
2014-07-21 07:49:45,137 DEBUG [rs(xxx.digitalenvoy.net,60020,1405943192177)-snapshot-pool3-thread-1] snapshot.FlushSnapshotSubprocedure: Starting region operation on hosts,\x00\x8A\x90\xD6\x08,1400659179080.a74402fcbd9a96a7c92b250721095729.
2014-07-21 07:49:45,137 DEBUG [member: 'xxx.digitalenvoy.net,60020,1405943192177' subprocedure-pool1-thread-2] snapshot.RegionServerSnapshotManager: Completed 1/174 local region snapshots.
2014-07-21 07:49:45,137 snapshot.FlushSnapshotSubprocedure: Flush Snapshotting region hosts,\x00\x8A\x90\xD6\x08,1400659179080.a74402fcbd9a96a7c92b250721095729. started...
2014-07-21 07:49:45,137 regionserver.HRegion: Storing region-info for snapshot.

On Jul 21, 2014, at 9:21 AM, Jean-Marc Spaggiari <[email protected]> wrote:

> Can you also tell us more about your table? How many regions on how many
> region servers?
>
>
> 2014-07-21 8:23 GMT-04:00 Ted Yu <[email protected]>:
>
>> Normally such timeout is caused by one region server which is slow in
>> completing its part of the snapshot procedure.
>>
>> Have you looked at region server logs ?
>> Feel free to pastebin relevant portion.
>>
>> Thanks
>>
>> On Jul 21, 2014, at 4:03 AM, Brian Jeltema <[email protected]> wrote:
>>
>>> I’m running HBase 0.98. I’m trying to snapshot a table, but it’s timing out after 60 seconds.
>>> I increased the value of hbase.snapshot.master.timeoutMillis and restarted HBase,
>>> but the timeout still happens after 60 seconds. Any suggestions?
>>>
>>> Brian
