[ 
https://issues.apache.org/jira/browse/HBASE-20770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-20770:
--------------------------
      Resolution: Fixed
    Hadoop Flags: Reviewed
          Status: Resolved  (was: Patch Available)

Pushed to branch-2.0+. Thanks for review [~reidchan]

> WAL cleaner logs way too much; gets clogged when lots of work to do
> -------------------------------------------------------------------
>
>                 Key: HBASE-20770
>                 URL: https://issues.apache.org/jira/browse/HBASE-20770
>             Project: HBase
>          Issue Type: Bug
>          Components: logging
>            Reporter: stack
>            Assignee: stack
>            Priority: Critical
>             Fix For: 2.0.2
>
>         Attachments: HBASE-20770.branch-2.0.001.patch
>
>
> Been here before (HBASE-7214 and HBASE-19652). Testing on large cluster, 
> Master log is in a continuous spew of logging output fililng disks. It is 
> stuck making no progress but hard to tell because it is logging minutiae 
> rather than a call on whats actually wrong.
> Log is full of this:
> {code}
> 2018-06-21 01:19:12,761 DEBUG 
> org.apache.hadoop.hbase.master.cleaner.HFileCleaner: Removing 
> hdfs://ns1/hbase/archive/data/default/IntegrationTestBigLinkedList/e98cdb817bb3af5fa26e2b885a0b2ec6/meta/bd49572de3914b66985fff5ea2ca7403
> 2018-06-21 01:19:12,761 DEBUG 
> org.apache.hadoop.hbase.master.cleaner.HFileCleaner: Removing 
> hdfs://ns1/hbase/archive/data/default/IntegrationTestBigLinkedList/e98cdb817bb3af5fa26e2b885a0b2ec6/meta/fad01294c6ca421db209e89b5b97d364
> 2018-06-21 01:19:12,823 WARN 
> org.apache.hadoop.hbase.master.cleaner.HFileCleaner: Wait more than 60000 ms 
> for deleting 
> hdfs://ns1/hbase/archive/data/default/IntegrationTestBigLinkedList/d3f759d0495257fc1d33ae780b634455/tiny/b72bac4036444dcf9265c7b5664fd403,
>  exit...
> 2018-06-21 01:19:12,823 DEBUG 
> org.apache.hadoop.hbase.master.cleaner.CleanerChore: Cleaning under 
> hdfs://ns1/hbase/archive/data/default/IntegrationTestBigLinkedList/665bfa38c86a28d641ce08f8fea0a7f9
> 2018-06-21 01:19:12,824 WARN 
> org.apache.hadoop.hbase.master.cleaner.HFileCleaner: Wait more than 60000 ms 
> for deleting 
> hdfs://ns1/hbase/archive/data/default/IntegrationTestBigLinkedList/2425053ad86823081b368e00bc471e56/tiny/6ea3cb1174434aecbc448e322e2a062c,
>  exit...
> 2018-06-21 01:19:12,824 DEBUG 
> org.apache.hadoop.hbase.master.cleaner.CleanerChore: Cleaning under 
> hdfs://ns1/hbase/archive/data/default/IntegrationTestBigLinkedList/e98cdb817bb3af5fa26e2b885a0b2ec6/big
> 2018-06-21 01:19:12,824 DEBUG 
> org.apache.hadoop.hbase.master.cleaner.CleanerChore: Cleaning under 
> hdfs://ns1/hbase/archive/data/default/IntegrationTestBigLinkedList/e98cdb817bb3af5fa26e2b885a0b2ec6/tiny
> 2018-06-21 01:19:12,827 DEBUG 
> org.apache.hadoop.hbase.master.cleaner.CleanerChore: Cleaning under 
> hdfs://ns1/hbase/archive/data/default/IntegrationTestBigLinkedList/665bfa38c86a28d641ce08f8fea0a7f9/meta
> 2018-06-21 01:19:12,844 DEBUG 
> org.apache.hadoop.hbase.master.cleaner.CleanerChore: Cleaning under 
> hdfs://ns1/hbase/archive/data/default/IntegrationTestBigLinkedList/17f85c98389104b19358f6751da577d0
> 2018-06-21 01:19:12,844 DEBUG 
> org.apache.hadoop.hbase.master.cleaner.CleanerChore: Cleaning under 
> hdfs://ns1/hbase/archive/data/default/IntegrationTestBigLinkedList/c98e276423813aaa74d848983c47d93c
> 2018-06-21 01:19:12,844 DEBUG 
> org.apache.hadoop.hbase.master.cleaner.HFileCleaner: Removing 
> hdfs://ns1/hbase/archive/data/default/IntegrationTestBigLinkedList/665bfa38c86a28d641ce08f8fea0a7f9/meta/90f21dec28d140cda48d37eeb44d37e8
> 2018-06-21 01:19:12,844 DEBUG 
> org.apache.hadoop.hbase.master.cleaner.HFileCleaner: Removing 
> hdfs://ns1/hbase/archive/data/default/IntegrationTestBigLinkedList/665bfa38c86a28d641ce08f8fea0a7f9/meta/8a4cf6410d5a4201963bc1415945f877
> 2018-06-21 01:19:12,848 DEBUG 
> org.apache.hadoop.hbase.master.cleaner.CleanerChore: Cleaning under 
> hdfs://ns1/hbase/archive/data/default/IntegrationTestBigLinkedList/c98e276423813aaa74d848983c47d93c/meta
> 2018-06-21 01:19:12,849 DEBUG 
> org.apache.hadoop.hbase.master.cleaner.CleanerChore: Cleaning under 
> hdfs://ns1/hbase/archive/data/default/IntegrationTestBigLinkedList/17f85c98389104b19358f6751da577d0/meta
> 2018-06-21 01:19:12,927 DEBUG 
> org.apache.hadoop.hbase.master.cleaner.HFileCleaner: Removing 
> hdfs://ns1/hbase/archive/data/default/IntegrationTestBigLinkedList/17f85c98389104b19358f6751da577d0/meta/6043fce5761e4479819b15405183f193
> 2018-06-21 01:19:12,927 DEBUG 
> org.apache.hadoop.hbase.master.cleaner.HFileCleaner: Removing 
> hdfs://ns1/hbase/archive/data/default/IntegrationTestBigLinkedList/c98e276423813aaa74d848983c47d93c/meta/69e6bf4650124859b2bc7ddf134be642
> 2018-06-21 01:19:13,011 DEBUG 
> org.apache.hadoop.hbase.master.cleaner.HFileCleaner: Removing 
> hdfs://ns1/hbase/archive/data/default/IntegrationTestBigLinkedList/17f85c98389104b19358f6751da577d0/meta/1a46700fbc434574a005c0b55879d5ed
> 2018-06-21 01:19:13,011 DEBUG 
> org.apache.hadoop.hbase.master.cleaner.HFileCleaner: Removing 
> hdfs://ns1/hbase/archive/data/default/IntegrationTestBigLinkedList/c98e276423813aaa74d848983c47d93c/meta/44beca2adfb544999dd82e9cf8151be1
> 2018-06-21 01:19:13,048 WARN 
> org.apache.hadoop.hbase.master.cleaner.HFileCleaner: Wait more than 60000 ms 
> for deleting 
> hdfs://ns1/hbase/archive/data/default/IntegrationTestBigLinkedList/1f60d5dcf666b1e47de3445c369b6411/tiny/29b934a9c8f74ec1a5806f3fabeee094,
>  exit...
> {code}
> It looks like we make no progress because we give up after 60 seconds (see 
> above). The dir in this case has ~17k WALs in it (Cluster has hundreds of 
> servers and has been running test for a while). When I do a listing, the 
> count of WALs is not changing.
> There is also this output below that does not help.
> {code}
> 2018-06-21 00:16:05,852 DEBUG 
> org.apache.hadoop.hbase.master.balancer.RegionLocationFinder: 
> HDFSBlocksDistribution not found in cache for region 
> IntegrationTestBigLinkedList,gW^6\xB5Z}\x00,1529565022477.4c27c2ef56a0aa3adae325a20d0a229b.
> 2018-06-21 00:16:05,855 DEBUG 
> org.apache.hadoop.hbase.master.balancer.RegionLocationFinder: 
> HDFSBlocksDistribution not found in cache for region 
> IntegrationTestBigLinkedList,gP\x10\xE2\xD1\xFD\xC2\x00,1529565022477.cee1017900682956e90e94be1e1c10da.
> 2018-06-21 00:16:05,858 DEBUG 
> org.apache.hadoop.hbase.master.balancer.RegionLocationFinder: 
> HDFSBlocksDistribution not found in cache for region 
> IntegrationTestBigLinkedList,gH\xC3\x8E\xEE\xA1\x07\x00,1529565022477.d4184dddf837bb3c19eb0cc0d87ccdf5.
> 2018-06-21 00:16:05,861 DEBUG 
> org.apache.hadoop.hbase.master.balancer.RegionLocationFinder: 
> HDFSBlocksDistribution not found in cache for region 
> IntegrationTestBigLinkedList,g6\x82=693\x80,1529565022477.87faebdd14107f09126efa169b543dd6.
> 2018-06-21 00:16:05,863 DEBUG 
> org.apache.hadoop.hbase.master.balancer.RegionLocationFinder: 
> HDFSBlocksDistribution not found in cache for region 
> IntegrationTestBigLinkedList,gS\xB7\x8C\xC3\xAC\x1F\x80,1529565022477.4d5a3efc33988180ad11c4eac8e64a4a.
> 2018-06-21 00:16:05,866 DEBUG 
> org.apache.hadoop.hbase.master.balancer.RegionLocationFinder: 
> HDFSBlocksDistribution not found in cache for region 
> IntegrationTestBigLinkedList,g^\xAB\x8A\x98\xB78\x00,1529565022477.6a6945672c19b937fc2ccde13d2aff5d.
> 2018-06-21 00:16:05,869 DEBUG 
> org.apache.hadoop.hbase.master.balancer.RegionLocationFinder: 
> HDFSBlocksDistribution not found in cache for region 
> IntegrationTestBigLinkedList,gmF2_p\xAE\x00,1529565022477.6607c6fef270c71c44caa1b8e69a63cb.
> 2018-06-21 00:16:05,872 DEBUG 
> org.apache.hadoop.hbase.master.balancer.RegionLocationFinder: 
> HDFSBlocksDistribution not found in cache for region 
> IntegrationTestBigLinkedList,gi\x9F\x88m\xC2P\x80,1529565022477.5ecbc236145f85d857ffea9ad3c1d89c.
> 2018-06-21 00:16:05,875 DEBUG 
> org.apache.hadoop.hbase.master.balancer.RegionLocationFinder: 
> HDFSBlocksDistribution not found in cache for region 
> IntegrationTestBigLinkedList,g[\x04\xE0\xA7\x08\xDA\x80,1529565022477.02a09a3ce49e05ea35a2ed51d306b37d.
> 2018-06-21 00:16:05,878 DEBUG 
> org.apache.hadoop.hbase.master.balancer.RegionLocationFinder: 
> HDFSBlocksDistribution not found in cache for region 
> IntegrationTestBigLinkedList,gp\xEC\xDCQ\x1F\x0B\x80,1529565022477.0ce748886ee50a79691c7569a17ea30a.
> 2018-06-21 00:16:05,881 DEBUG 
> org.apache.hadoop.hbase.master.balancer.RegionLocationFinder: 
> HDFSBlocksDistribution not found in cache for region 
> IntegrationTestBigLinkedList,g{\xE0\xDA&*$\x00,1529565022477.8d65e26b122c25f411c19555c9103b03.
> 2018-06-21 00:16:05,884 DEBUG 
> org.apache.hadoop.hbase.master.balancer.RegionLocationFinder: 
> HDFSBlocksDistribution not found in cache for region 
> IntegrationTestBigLinkedList,g\x86\xD4\xD7\xFB5<\x80,1529565022477.88c1deebe2e914d9f323a557d1517ae8.
> 2018-06-21 00:16:05,888 DEBUG 
> org.apache.hadoop.hbase.master.balancer.RegionLocationFinder: 
> HDFSBlocksDistribution not found in cache for region 
> IntegrationTestBigLinkedList,g\x95o\x7F\xC1\xEE\xB2\x80,1529565022477.1c3849e505edf7efd1391ae8da88ec09.
> 2018-06-21 00:16:05,891 DEBUG 
> org.apache.hadoop.hbase.master.balancer.RegionLocationFinder: 
> HDFSBlocksDistribution not found in cache for region 
> IntegrationTestBigLinkedList,g\x8E"+\xDE\x91\xF7\x80,1529565022477.bb4eb5e775869e6cc3bc95fbc661c487.
> 2018-06-21 00:16:05,894 DEBUG 
> org.apache.hadoop.hbase.master.balancer.RegionLocationFinder: 
> HDFSBlocksDistribution not found in cache for region 
> IntegrationTestBigLinkedList,ge\xF8\xDE|\x13\xF3\x00,1529565022477.76afec6687b4ee782177424af54e8767.
> 2018-06-21 00:16:05,897 DEBUG 
> org.apache.hadoop.hbase.master.balancer.RegionLocationFinder: 
> HDFSBlocksDistribution not found in cache for region 
> IntegrationTestBigLinkedList,gt\x93\x86B\xCDi\x00,1529565022477.0dd540d73e6dd115585de4e576b771ca.
> 2018-06-21 00:16:05,900 DEBUG 
> org.apache.hadoop.hbase.master.balancer.RegionLocationFinder: 
> HDFSBlocksDistribution not found in cache for region 
> IntegrationTestBigLinkedList,g\x99\x16)\xB3\x9D\x10\x00,1529565022477.0dd49eaefa19cf1935f41661aa4e1dde.
> 2018-06-21 00:16:05,903 DEBUG 
> org.apache.hadoop.hbase.master.balancer.RegionLocationFinder: 
> HDFSBlocksDistribution not found in cache for region 
> IntegrationTestBigLinkedList,gx:04{\xC6\x80,1529565022477.525b1aff5c2361101afce5f6c05a68b6.
> 2018-06-21 00:16:05,906 DEBUG 
> org.apache.hadoop.hbase.master.balancer.RegionLocationFinder: 
> HDFSBlocksDistribution not found in cache for region 
> IntegrationTestBigLinkedList,g\x9C\xBC\xD3\xA5Km\x80,1529565022477.9fbb338f7e006c02c3a84ca926749674.
> ...
> {code}
> And this....
> {code}
> 2018-06-21 00:16:39,617 DEBUG 
> org.apache.hadoop.hbase.security.NettyHBaseSaslRpcClientHandler: Will read 
> input token of size 104 for processing by initSASLContext
> 2018-06-21 00:16:39,617 DEBUG 
> org.apache.hadoop.hbase.security.NettyHBaseSaslRpcClientHandler: Will send 
> token of size 0 from initSASLContext.
> 2018-06-21 00:16:39,618 DEBUG 
> org.apache.hadoop.hbase.security.NettyHBaseSaslRpcClientHandler: Will read 
> input token of size 104 for processing by initSASLContext
> 2018-06-21 00:16:39,618 DEBUG 
> org.apache.hadoop.hbase.security.NettyHBaseSaslRpcClientHandler: Will read 
> input token of size 50 for processing by initSASLContext
> 2018-06-21 00:16:39,618 DEBUG 
> org.apache.hadoop.hbase.security.NettyHBaseSaslRpcClientHandler: Will send 
> token of size 0 from initSASLContext.
> 2018-06-21 00:16:39,618 DEBUG 
> org.apache.hadoop.hbase.security.NettyHBaseSaslRpcClientHandler: Will send 
> token of size 50 from initSASLContext.
> 2018-06-21 00:16:39,618 DEBUG 
> org.apache.hadoop.hbase.security.NettyHBaseSaslRpcClientHandler: Will read 
> input token of size 104 for processing by initSASLContext
> 2018-06-21 00:16:39,618 DEBUG 
> org.apache.hadoop.hbase.security.NettyHBaseSaslRpcClient: SASL client context 
> established. Negotiated QoP: auth
> 2018-06-21 00:16:39,618 DEBUG 
> org.apache.hadoop.hbase.security.NettyHBaseSaslRpcClientHandler: Will send 
> token of size 0 from initSASLContext.
> 2018-06-21 00:16:39,618 DEBUG 
> org.apache.hadoop.hbase.security.NettyHBaseSaslRpcClientHandler: Will read 
> input token of size 50 for processing by initSASLContext
> 2018-06-21 00:16:39,618 DEBUG 
> org.apache.hadoop.hbase.security.NettyHBaseSaslRpcClientHandler: Will send 
> token of size 50 from initSASLContext.
> 2018-06-21 00:16:39,619 DEBUG 
> org.apache.hadoop.hbase.security.NettyHBaseSaslRpcClient: SASL client context 
> established. Negotiated QoP: auth
> 2018-06-21 00:16:39,619 DEBUG 
> org.apache.hadoop.hbase.security.NettyHBaseSaslRpcClientHandler: Will read 
> input token of size 50 for processing by initSASLContext
> 2018-06-21 00:16:39,619 DEBUG 
> org.apache.hadoop.hbase.security.NettyHBaseSaslRpcClientHandler: Will send 
> token of size 50 from initSASLContext.
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to