Re: hbase.bucketcache.bucket.sizes had set multiple of 1024 but still got "Invalid HFile block magic"

2017-11-25 Thread Weizhan Zeng
Sorry, I made a calculation mistake. Just ignore it ...

2017-11-25 20:52 GMT+08:00 Weizhan Zeng <qgweiz...@gmail.com>:

> Hi , guys
>In https://issues.apache.org/jira/browse/HBASE-16993 , I found that
>
> hbase.bucketcache.bucket.sizes must be set to multiples of 1024, but when I
> set my bucket sizes I still got "java.io.IOException: Invalid HFile block
> magic: \x00\x00\x00\x00\x00\x00\x00\x00" [full configuration and stack
> trace trimmed; they appear verbatim in the original message below]
>
> Is there anything I missed ?
>


hbase.bucketcache.bucket.sizes had set multiple of 1024 but still got "Invalid HFile block magic"

2017-11-25 Thread Weizhan Zeng
Hi, guys,
   In https://issues.apache.org/jira/browse/HBASE-16993 I found that
hbase.bucketcache.bucket.sizes must be set to multiples of 1024. But when I set

  
<property>
  <name>hbase.bucketcache.bucket.sizes</name>
  <value>6144,9216,41984,50176,58368,66560,99328,132096,198211,263168,394240,525312,1049600,2099200</value>
</property>
  

And I still got this error:


2017-11-25 20:37:37,222 ERROR [B.defaultRpcServer.handler=20,queue=1,port=60020] bucket.BucketCache: Failed reading block d444ab4b244140c199f23a3870f59136_250591965 from bucket cache
java.io.IOException: Invalid HFile block magic: \x00\x00\x00\x00\x00\x00\x00\x00
at org.apache.hadoop.hbase.io.hfile.BlockType.parse(BlockType.java:155)
at org.apache.hadoop.hbase.io.hfile.BlockType.read(BlockType.java:167)
at org.apache.hadoop.hbase.io.hfile.HFileBlock.<init>(HFileBlock.java:275)
at org.apache.hadoop.hbase.io.hfile.HFileBlock$1.deserialize(HFileBlock.java:136)
at org.apache.hadoop.hbase.io.hfile.HFileBlock$1.deserialize(HFileBlock.java:123)
at org.apache.hadoop.hbase.io.hfile.bucket.BucketCache.getBlock(BucketCache.java:428)
at org.apache.hadoop.hbase.io.hfile.CombinedBlockCache.getBlock(CombinedBlockCache.java:85)
at org.apache.hadoop.hbase.io.hfile.HFileReaderV2.getCachedBlock(HFileReaderV2.java:278)
at org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:418)
at org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.loadDataBlockWithScanInfo(HFileBlockIndex.java:271)
at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.seekTo(HFileReaderV2.java:649)
at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.seekTo(HFileReaderV2.java:599)
at org.apache.hadoop.hbase.regionserver.StoreFileScanner.seekAtOrAfter(StoreFileScanner.java:268)
at org.apache.hadoop.hbase.regionserver.StoreFileScanner.seek(StoreFileScanner.java:173)
at org.apache.hadoop.hbase.regionserver.StoreScanner.seekScanners(StoreScanner.java:350)
at org.apache.hadoop.hbase.regionserver.StoreScanner.<init>(StoreScanner.java:199)
at org.apache.hadoop.hbase.regionserver.HStore.getScanner(HStore.java:2077)
at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.<init>(HRegion.java:5556)
at org.apache.hadoop.hbase.regionserver.HRegion.instantiateRegionScanner(HRegion.java:2574)
at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:2560)
at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:2541)
at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:6830)
at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:6809)
at org.apache.hadoop.hbase.regionserver.RSRpcServices.get(RSRpcServices.java:2049)
at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:33644)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2196)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:112)
at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133)
at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:108)
at java.lang.Thread.run(Thread.java:748)


Is there anything I missed ?
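For anyone who finds this thread later: the "calculation mistake" admitted in the follow-up appears to be in the size list itself. A quick check (a sketch, with the sizes copied from the configuration above) shows exactly one entry that is not a multiple of 1024, which is what HBASE-16993 rejects:

```python
# Bucket sizes copied from the hbase.bucketcache.bucket.sizes value above.
# HBASE-16993 requires each size to be a multiple of 1024.
sizes = [6144, 9216, 41984, 50176, 58368, 66560, 99328, 132096,
         198211, 263168, 394240, 525312, 1049600, 2099200]

bad = [s for s in sizes if s % 1024 != 0]
print(bad)  # -> [198211]
```

The nearest multiples of 1024 to the offending entry are 197632 (193 x 1024) and 198656 (194 x 1024).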

java.lang.IllegalStateException: Invalid currKeyLen 33554496

2017-11-23 Thread Weizhan Zeng

Hi, guys,

I use hbase-1.2.6, and I found a strange problem: the key is very short, but I
get "Invalid currKeyLen 33554496".

Has anyone met this before?


hbase(main):018:0> get 'test', '20#1960620#20171026'
COLUMN CELL

ERROR: java.io.IOException: java.lang.IllegalStateException: Invalid currKeyLen 33554496 or currValueLen 706. Block offset: 3773220580287066162, block length: 158499, position: 0 (without header).
at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.handleException(HRegion.java:5600)
at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.<init>(HRegion.java:5570)
at org.apache.hadoop.hbase.regionserver.HRegion.instantiateRegionScanner(HRegion.java:2574)
at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:2560)
at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:2541)
at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:6830)
at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:6809)
at org.apache.hadoop.hbase.regionserver.RSRpcServices.get(RSRpcServices.java:2049)
at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:33644)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2196)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:112)
at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133)
at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:108)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.IllegalStateException: Invalid currKeyLen 33554496 or currValueLen 706. Block offset: 3773220580287066162, block length: 158499, position: 0 (without header).
at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.checkKeyValueLen(HFileReaderV2.java:985)
at org.apache.hadoop.hbase.io.hfile.HFileReaderV3$ScannerV3.readKeyValueLen(HFileReaderV3.java:245)
at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.updateCurrBlock(HFileReaderV2.java:962)
at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.loadBlockAndSeekToKey(HFileReaderV2.java:933)
at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.seekTo(HFileReaderV2.java:655)
at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.seekTo(HFileReaderV2.java:599)
at org.apache.hadoop.hbase.regionserver.StoreFileScanner.seekAtOrAfter(StoreFileScanner.java:268)
at org.apache.hadoop.hbase.regionserver.StoreFileScanner.seek(StoreFileScanner.java:173)
at org.apache.hadoop.hbase.regionserver.StoreScanner.seekScanners(StoreScanner.java:350)
at org.apache.hadoop.hbase.regionserver.StoreScanner.<init>(StoreScanner.java:199)
at org.apache.hadoop.hbase.regionserver.HStore.getScanner(HStore.java:2077)
at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.<init>(HRegion.java:5556)
... 12 more
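A hedged observation, not a diagnosis: bogus length fields like this are often easier to reason about in hex. 33554496 is 0x02000040, whose low bytes alone encode 64, and the reported block offset (3773220580287066162) is far beyond any plausible file size. Together these suggest the scanner is reading its 4-byte length fields from a corrupted or misaligned buffer, not that the key is genuinely that long:

```python
# Inspect the byte pattern of the bogus key length.
# currKeyLen is read as a 4-byte big-endian int from the block buffer.
bogus_key_len = 33554496
print(hex(bogus_key_len))                # -> 0x2000040
print(bogus_key_len.to_bytes(4, "big"))  # -> b'\x02\x00\x00@'  (0x40 == 64)
```

This fits the HFile tool output further down, which shows the cell is intact on disk, so the garbage is likely introduced somewhere in the read path.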



hbase org.apache.hadoop.hbase.io.hfile.HFile -p -f /hbase/data/default/test/5e71e0ac15c82619de9602b713aa8cb9/f/b6695b40cdb14c189c7cc41fa2dd21e0 -w '20#1960620#20171026'
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/hadoop/hbase/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/hadoop/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
2017-11-24 14:31:25,756 INFO  [main] hfile.CacheConfig: Created cacheConfig: CacheConfig:disabled
K: 20#1960620#20171026/f:d/1511428582215/Put/vlen=68/seqid=80546 V: ["20#1960620#20171026#20171026#1", "20#1960620#20171026#20171026#2"]
Scanned kv count -> 1

Re: "java.net.SocketException: Too many open files" AbortRegionServer But not ShutDown

2017-01-03 Thread Weizhan Zeng
My HBase version is 1.1.6 and my Hadoop version is 2.6.1. I have the jstack
info; I can give it to you tomorrow after I arrive at my company.

I guess the reason for "Too many open files" is too many store files. I checked
my monitoring and found storeFileCount is 33K, while ulimit is 65535. The
reason there are so many store files seems to be that compaction is not
working.

But what confuses me is why the RS did not exit.
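The numbers in the message already make the descriptor pressure plausible. As a purely illustrative back-of-the-envelope (each open store file holds at least one file descriptor, and DataNode sockets, WALs, and loaded JARs add more on top):

```python
# Rough, illustrative descriptor budget for the region server described above.
store_file_count = 33_000  # storeFileCount from the monitoring mentioned above
ulimit_nofile = 65_535     # the process's open-files limit

# With at least one descriptor per open store file, the headroom left for
# everything else (sockets, WALs, JARs, RPC connections) is:
print(ulimit_nofile - store_file_count)  # -> 32535
```

Since store files often cost more than one descriptor each, hitting the limit with 33K store files is unsurprising; the HBase book's prerequisites section discusses sizing this limit.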
​

2017-01-03 23:05 GMT+08:00 Ted Yu <yuzhih...@gmail.com>:

> Switching to user@
>
> What's the version of hbase / hadoop you're using ?
>
> Before issuing "kill -9", did you capture a stack trace of the region server
> process?
>
> Have you read 'Limits on Number of Files and Processes' under
> http://hbase.apache.org/book.html#basic.prerequisites ?
>
> On Tue, Jan 3, 2017 at 6:56 AM, Weizhan Zeng <qgweiz...@gmail.com> wrote:
>
> > Hi guys:
> > I met an issue on one of my RS.
> > After the SocketException happened, it should have shut down, but after 8
> > hours I found it was still alive, and I used kill -9 to end the process.
> >
> > Here is my RegionServer log:
> >
> > At 01:58 AM, the SocketException happened:
> >
> > [2017-01-02T01:58:00.469+08:00] [INFO] hdfs.DFSClient : Exception in createBlockOutputStream java.net.SocketException: Too many open files
> > at sun.nio.ch.Net.socket0(Native Method)
> > at sun.nio.ch.Net.socket(Net.java:423)
> > at sun.nio.ch.Net.socket(Net.java:416)
> > at sun.nio.ch.SocketChannelImpl.<init>(SocketChannelImpl.java:104)
> >
> > And at 01:58 AM, the RegionServer aborted itself and began to close regions:
> >
> > [2017-01-02T01:58:00.632+08:00] [INFO] regionserver.HRegionServer : aborting server HBASE-VENUS-149106.hadoop.local,16020,1482236933819
> > [2017-01-02T01:58:00.632+08:00] [INFO] client.ConnectionManager$HConnectionImplementation : Closing zookeeper sessionid=0x456f9b55fda457b
> > [2017-01-02T01:58:00.632+08:00] [INFO] regionserver.HStore : Closed f
> >
> > [2017-01-02T01:59:18.067+08:00] [INFO] regionserver.HRegionServer$MovedRegionsCleaner : Chore: MovedRegionsCleaner for region HBASE-VENUS-149106.hadoop.local,16020,1482236933819 was stopped
> > [2017-01-02T01:59:18.225+08:00] [INFO] regionserver.Replication : Normal source for cluster 1: Total replicated edits: 39081044, currently replicating from: hdfs://venus/hbase/oldWALs/HBASE-VENUS-149106.hadoop.local%2C16020%2C1482236933819.default.1483293299516 at position: 0
> > [2017-01-02T01:59:18.225+08:00] [INFO] regionserver.Replication : Sink: age in ms of last applied edit: 0, total replicated edits: 160769427
> >
> > After one hour, it was still logging:
> >
> > [2017-01-02T02:04:18.225+08:00] [INFO] regionserver.Replication : Normal source for cluster 1: Total replicated edits: 39081044, currently replicating from: hdfs://venus/hbase/oldWALs/HBASE-VENUS-149106.hadoop.local%2C16020%2C1482236933819.default.1483293299516 at position: 0
> >
> > At 8 AM:
> >
> > [2017-01-02T08:09:18.225+08:00] [INFO] regionserver.Replication : Sink: age in ms of last applied edit: 0, total replicated edits: 160769427
> > [2017-01-02T08:14:18.225+08:00] [INFO] regionserver.Replication : Normal source for cluster 1: Total replicated edits: 39081044, currently replicating
> >
> > Can anyone give me some tips to figure this out? Thanks.
> >
>


Re: "java.net.SocketException: Too many open files" AbortRegionServer But not ShutDown

2017-01-03 Thread Weizhan Zeng
The machine:

LF-HBASE-VENUS-149106.hadoop.jd.local


jstack info: /data0/hbase-logs/46384.out

2017-01-03 23:44 GMT+08:00 Weizhan Zeng <qgweiz...@gmail.com>:

> My HBase version is 1.1.6 and my Hadoop version is 2.6.1. I have the jstack
> info; I can give it to you tomorrow after I arrive at my company.
>
> [earlier quoted messages trimmed; they are repeated verbatim in the previous
> message in this thread]