The load balancer in 0.98 considers many factors when making balancing decisions.
Can you take a look at the master log and look for balancer-related lines?
That would give you some clue.

Cheers

On Jul 22, 2014, at 5:03 AM, Brian Jeltema <[email protected]> wrote:

> I ran the balancer from hbase shell, but don’t see any change. Is there a way
> to balance a specific table?
>
>> bq. One RegionServer has 69 regions
>>
>> Can you run load balancer so that your regions are better balanced ?
>>
>> Cheers
>>
>>
>> On Mon, Jul 21, 2014 at 6:56 AM, Brian Jeltema <[email protected]> wrote:
>>
>>> There are 174 regions, not well balanced. One RegionServer has 69 regions.
>>> That RegionServer generates a series of log entries (modified and shown
>>> below), one for each region, at roughly 1 to 2 second intervals. The
>>> timeout period expires when it reaches region 36.
>>>
>>> 2014-07-21 07:49:44,503 regionserver.HRegion: Creating references for hfiles
>>> 2014-07-21 07:49:44,503 regionserver.HRegion: Adding snapshot references for [hdfs://xxx.digitalenvoy.net:8020/apps/hbase/data/data/default/hosts/31e2a098e9e311c4ddcfd3d8da28dfb6/p/3749b6df36c749508fe9c6f54ca425f2] hfiles
>>> 2014-07-21 07:49:44,503 regionserver.HRegion: Creating reference for file (1/1) : hdfs://xxx.digitalenvoy.net:8020/apps/hbase/data/data/default/hosts/31e2a098e9e311c4ddcfd3d8da28dfb6/p/3749b6df36c749508fe9c6f54ca425f2
>>> 2014-07-21 07:49:45,136 snapshot.FlushSnapshotSubprocedure: ... Flush Snapshotting region hosts,\x00\x03|\xBF!,1400600029600.31e2a098e9e311c4ddcfd3d8da28dfb6. completed.
>>> 2014-07-21 07:49:45,137 snapshot.FlushSnapshotSubprocedure: Closing region operation on hosts,\x00\x03|\xBF!,1400600029600.31e2a098e9e311c4ddcfd3d8da28dfb6.
>>> 2014-07-21 07:49:45,137 DEBUG [rs(xxx.digitalenvoy.net,60020,1405943192177)-snapshot-pool3-thread-1] snapshot.FlushSnapshotSubprocedure: Starting region operation on hosts,\x00\x8A\x90\xD6\x08,1400659179080.a74402fcbd9a96a7c92b250721095729.
>>> 2014-07-21 07:49:45,137 DEBUG [member: 'xxx.digitalenvoy.net,60020,1405943192177' subprocedure-pool1-thread-2] snapshot.RegionServerSnapshotManager: Completed 1/174 local region snapshots.
>>> 2014-07-21 07:49:45,137 snapshot.FlushSnapshotSubprocedure: Flush Snapshotting region hosts,\x00\x8A\x90\xD6\x08,1400659179080.a74402fcbd9a96a7c92b250721095729. started...
>>> 2014-07-21 07:49:45,137 regionserver.HRegion: Storing region-info for snapshot.
>>>
>>> On Jul 21, 2014, at 9:21 AM, Jean-Marc Spaggiari <[email protected]> wrote:
>>>
>>>> Can you also tell us more about your table? How many regions on how many
>>>> region servers?
>>>>
>>>>
>>>> 2014-07-21 8:23 GMT-04:00 Ted Yu <[email protected]>:
>>>>
>>>>> Normally such timeout is caused by one region server which is slow in
>>>>> completing its part of the snapshot procedure.
>>>>>
>>>>> Have you looked at region server logs ?
>>>>> Feel free to pastebin relevant portion.
>>>>>
>>>>> Thanks
>>>>>
>>>>> On Jul 21, 2014, at 4:03 AM, Brian Jeltema <[email protected]> wrote:
>>>>>
>>>>>> I’m running HBase 0.98. I’m trying to snapshot a table, but it’s timing
>>>>>> out after 60 seconds.
>>>>>> I increased the value of hbase.snapshot.master.timeoutMillis and
>>>>>> restarted HBase, but the timeout still happens after 60 seconds.
>>>>>> Any suggestions?
>>>>>>
>>>>>> Brian
