[ https://issues.apache.org/jira/browse/HBASE-25507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael Stack resolved HBASE-25507.
-----------------------------------
Fix Version/s: 2.4.2, 2.3.5, 3.0.0-alpha-1
Hadoop Flags: Reviewed
Resolution: Fixed
Merged to branch-2.3+. Thanks for the thorny fix [~Xiaolin Ha]. Nice one.
> Leak of ESTABLISHED sockets when compaction encountered "java.io.IOException:
> Invalid HFile block magic"
> --------------------------------------------------------------------------------------------------------
>
> Key: HBASE-25507
> URL: https://issues.apache.org/jira/browse/HBASE-25507
> Project: HBase
> Issue Type: Improvement
> Components: Compaction
> Affects Versions: 3.0.0-alpha-1, 2.4.1
> Reporter: Xiaolin Ha
> Assignee: Xiaolin Ha
> Priority: Major
> Fix For: 3.0.0-alpha-1, 2.3.5, 2.4.2
>
> Attachments: errorlogs.png,
> increasing-of-established-sockets-image.png, problem-region-move-logs.png
>
>
> Recently, we found socket leaks on our production cluster. The leaked sockets
> are in the ESTABLISHED state. From our analysis of monitoring metrics and logs,
> the leak occurs only on the RegionServer (RS) that hosts one particular region.
> RSs that do not host this region work normally.
> On the RS that hosts this region, we found exceptions like the following:
> {code:java}
> java.io.IOException: java.io.IOException: Could not seek StoreFileScanner[org.apache.hadoop.hbase.io.HalfStoreFileReader$1@5388be2f, cur=null] to key org.apache.hadoop.hbase.CellUtil$FirstOnRowDeleteFamilyCell@25aa56fd
> at org.apache.hadoop.hbase.regionserver.StoreScanner.parallelSeek(StoreScanner.java:1128)
> at org.apache.hadoop.hbase.regionserver.StoreScanner.seekScanners(StoreScanner.java:437)
> at org.apache.hadoop.hbase.regionserver.StoreScanner.<init>(StoreScanner.java:329)
> at org.apache.hadoop.hbase.regionserver.StoreScanner.<init>(StoreScanner.java:302)
> at org.apache.hadoop.hbase.regionserver.compactions.Compactor.createScanner(Compactor.java:806)
> at org.apache.hadoop.hbase.regionserver.compactions.StripeCompactor$StripeInternalScannerFactory.createScanner(StripeCompactor.java:82)
> at org.apache.hadoop.hbase.regionserver.compactions.Compactor.compact(Compactor.java:316)
> at org.apache.hadoop.hbase.regionserver.compactions.StripeCompactor.compact(StripeCompactor.java:120)
> at org.apache.hadoop.hbase.regionserver.compactions.StripeCompactionPolicy$SplitStripeCompactionRequest.execute(StripeCompactionPolicy.java:662)
> at org.apache.hadoop.hbase.regionserver.StripeStoreEngine$StripeCompaction.compact(StripeStoreEngine.java:114)
> at org.apache.hadoop.hbase.regionserver.HStore.compact(HStore.java:1461)
> at org.apache.hadoop.hbase.regionserver.HRegion.compact(HRegion.java:2121)
> at org.apache.hadoop.hbase.regionserver.CompactSplitThread$CompactionRunner.doCompaction(CompactSplitThread.java:519)
> at org.apache.hadoop.hbase.regionserver.CompactSplitThread$CompactionRunner.run(CompactSplitThread.java:555)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: Could not seek StoreFileScanner[org.apache.hadoop.hbase.io.HalfStoreFileReader$1@5388be2f, cur=null] to key org.apache.hadoop.hbase.CellUtil$FirstOnRowDeleteFamilyCell@25aa56fd
> at org.apache.hadoop.hbase.regionserver.StoreFileScanner.seek(StoreFileScanner.java:229)
> at org.apache.hadoop.hbase.regionserver.handler.ParallelSeekHandler.process(ParallelSeekHandler.java:56)
> at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:129)
> ... 3 more
> Caused by: java.io.IOException: Invalid HFile block magic: \x00\x00\x00\x00\x00\x00\x00\x00
> at org.apache.hadoop.hbase.io.hfile.BlockType.parse(BlockType.java:159)
> at org.apache.hadoop.hbase.io.hfile.BlockType.read(BlockType.java:171)
> at org.apache.hadoop.hbase.io.hfile.HFileBlock.createFromBuff(HFileBlock.java:333)
> at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderImpl.readBlockDataInternal(HFileBlock.java:1753)
> at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderImpl.readBlockData(HFileBlock.java:1552)
> at org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.readBlock(HFileReaderImpl.java:539)
> at org.apache.hadoop.hbase.io.hfile.HFileScannerImpl.readAndUpdateNewBlock(HFileScannerImpl.java:737)
> at org.apache.hadoop.hbase.io.hfile.HFileScannerImpl.seekTo(HFileScannerImpl.java:726)
> at org.apache.hadoop.hbase.io.HalfStoreFileReader$1.seekTo(HalfStoreFileReader.java:161)
> at org.apache.hadoop.hbase.regionserver.StoreFileScanner.seekAtOrAfter(StoreFileScanner.java:315)
> at org.apache.hadoop.hbase.regionserver.StoreFileScanner.seek(StoreFileScanner.java:211)
> ... 5 more
> {code}
> The number of ESTABLISHED sockets keeps increasing, as the attached images show:
> !increasing-of-established-sockets-image.png|width=833,height=390!
>
> !problem-region-move-logs.png|width=942,height=319!
> !errorlogs.png|width=722,height=411!
>
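For readers unfamiliar with this class of bug: the trace above shows a seek failing on a StoreFileScanner that had already been created when the compaction scanner was being built. One plausible reading is that resources opened before the failed seek were never released. The sketch below illustrates that general pattern only. It is not HBase code and not the merged patch; `ScannerLeakSketch`, `FakeScanner`, `openAndSeekLeaky`, and `openAndSeekSafe` are hypothetical names, and a counter stands in for open sockets.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Illustration only: all names here are hypothetical, none come from HBase.
class ScannerLeakSketch {

  /** Stand-in for a scanner that holds a network connection while open. */
  static final class FakeScanner implements Closeable {
    static int openCount = 0;      // number of scanners currently holding a "socket"
    private final boolean corrupt; // true simulates a file with a bad block
    private boolean closed = false;

    FakeScanner(boolean corrupt) {
      this.corrupt = corrupt;
      openCount++;                 // opening the scanner acquires the resource
    }

    void seek() throws IOException {
      if (corrupt) {
        throw new IOException("Invalid HFile block magic");
      }
    }

    @Override
    public void close() {
      if (!closed) {               // idempotent close releases the resource once
        closed = true;
        openCount--;
      }
    }
  }

  /** Leaky variant: if any seek() throws, the opened scanners are never closed. */
  static List<FakeScanner> openAndSeekLeaky(boolean[] corruptFlags) throws IOException {
    List<FakeScanner> scanners = new ArrayList<>();
    for (boolean corrupt : corruptFlags) {
      scanners.add(new FakeScanner(corrupt));
    }
    for (FakeScanner s : scanners) {
      s.seek();                    // an exception here abandons every open scanner
    }
    return scanners;
  }

  /** Safe variant: on failure, close everything opened so far, then rethrow. */
  static List<FakeScanner> openAndSeekSafe(boolean[] corruptFlags) throws IOException {
    List<FakeScanner> scanners = new ArrayList<>();
    boolean succeeded = false;
    try {
      for (boolean corrupt : corruptFlags) {
        scanners.add(new FakeScanner(corrupt));
      }
      for (FakeScanner s : scanners) {
        s.seek();
      }
      succeeded = true;
      return scanners;
    } finally {
      if (!succeeded) {
        for (FakeScanner s : scanners) {
          s.close();
        }
      }
    }
  }

  public static void main(String[] args) {
    boolean[] files = {false, true, false}; // the second "file" is corrupt

    try {
      openAndSeekLeaky(files);
    } catch (IOException e) {
      System.out.println("leaky variant, still open: " + FakeScanner.openCount);
    }

    FakeScanner.openCount = 0;              // reset the counter between runs
    try {
      openAndSeekSafe(files);
    } catch (IOException e) {
      System.out.println("safe variant, still open: " + FakeScanner.openCount);
    }
  }
}
```

If a compaction is retried repeatedly against the same corrupt file, a leaky path of this shape would add to the open count on every attempt, which would be consistent with the steadily growing socket count reported above.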
--
This message was sent by Atlassian Jira
(v8.3.4#803005)