[ 
https://issues.apache.org/jira/browse/HBASE-25507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17264993#comment-17264993
 ] 

Xiaolin Ha edited comment on HBASE-25507 at 1/14/21, 4:19 PM:
--------------------------------------------------------------

When the Compactor creates its InternalScanner, the new StoreScanner opens a 
Reader for every file selected for the compaction and also calls seek() on 
each of those files internally. If seek() on one or more files throws an 
exception like the one in the issue description, the InternalScanner is never 
created, so the close() in the finally block has no InternalScanner to close. 
Worse, even if seek() fails on only one file, none of the Readers newly 
created for the compaction are closed. As compactions run one after another, 
each failed attempt leaks more ESTABLISHED sockets.
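
The failure mode described above can be sketched in plain Java. This is only 
an illustration, not HBase code: SketchReader, SketchScanner, compactOnce and 
the cleanupOnFailure flag are hypothetical names, and the cleanup shown is 
one possible remedy, not necessarily the fix adopted for this issue.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class ScannerLeakSketch {

    /** Hypothetical stand-in for a store file Reader that holds a socket. */
    static class SketchReader implements Closeable {
        static int open = 0;                 // readers currently open
        private final boolean corrupt;

        SketchReader(boolean corrupt) {
            this.corrupt = corrupt;
            open++;
        }

        void seek() throws IOException {
            if (corrupt) {
                throw new IOException("Invalid HFile block magic");
            }
        }

        @Override
        public void close() {
            open--;
        }
    }

    /** Hypothetical stand-in for StoreScanner: opens readers and seeks in its constructor. */
    static class SketchScanner implements Closeable {
        private final List<SketchReader> readers = new ArrayList<>();

        SketchScanner(boolean[] corruptFlags, boolean cleanupOnFailure) throws IOException {
            try {
                for (boolean corrupt : corruptFlags) {
                    readers.add(new SketchReader(corrupt));
                }
                for (SketchReader reader : readers) {
                    reader.seek();
                }
            } catch (IOException e) {
                // Assumed remedy: release what was opened before rethrowing.
                if (cleanupOnFailure) {
                    close();
                }
                throw e;
            }
        }

        @Override
        public void close() {
            for (SketchReader reader : readers) {
                reader.close();
            }
            readers.clear();
        }
    }

    /** Runs one simulated compaction and returns how many readers it leaked. */
    static int compactOnce(boolean[] corruptFlags, boolean cleanupOnFailure) {
        int before = SketchReader.open;
        SketchScanner scanner = null;
        try {
            scanner = new SketchScanner(corruptFlags, cleanupOnFailure);
        } catch (IOException e) {
            // The compaction fails; scanner is still null at this point.
        } finally {
            if (scanner != null) {
                scanner.close();             // never reached when the constructor threw
            }
        }
        return SketchReader.open - before;
    }

    public static void main(String[] args) {
        boolean[] files = {false, true, false};   // the second file is corrupt
        System.out.println("leaked without cleanup: " + compactOnce(files, false));
        System.out.println("leaked with cleanup:    " + compactOnce(files, true));
    }
}
```

Without the cleanup, every simulated compaction over the same corrupt file 
leaves all of its readers open, which matches the steadily growing socket 
count reported in the issue.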


was (Author: xiaolin ha):
When the Compactor creates its InternalScanner, a new StoreScanner is created 
for all the files selected for the compaction, and it calls seek() on each of 
them internally. If seek() on one or more files throws an exception like the 
one in the issue description, the InternalScanner is never created, so the 
close() in the finally block does nothing, and even if seek() fails on only 
one file, none of the Readers newly created for the compaction are closed. As 
compactions run one after another, ESTABLISHED sockets leak.

> Leak of ESTABLISHED sockets when encountered "java.io.IOException: Invalid 
> HFile block magic"
> ---------------------------------------------------------------------------------------------
>
>                 Key: HBASE-25507
>                 URL: https://issues.apache.org/jira/browse/HBASE-25507
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Xiaolin Ha
>            Assignee: Xiaolin Ha
>            Priority: Major
>         Attachments: errorlogs.png, 
> increasing-of-established-sockets-image.png, problem-region-move-logs.png
>
>
> Recently, we found socket leaks on our production cluster. The leaked sockets 
> are in the ESTABLISHED state. From our analysis of the metrics monitor and 
> logs, this happens only on the RS that owns one particular region. RSes 
> without this region work normally. 
> On the RS that owns this region, we found exceptions such as the following:
> {code:java}
> java.io.IOException: java.io.IOException: Could not seek StoreFileScanner[org.apache.hadoop.hbase.io.HalfStoreFileReader$1@5388be2f, cur=null] to key org.apache.hadoop.hbase.CellUtil$FirstOnRowDeleteFamilyCell@25aa56fd
>         at org.apache.hadoop.hbase.regionserver.StoreScanner.parallelSeek(StoreScanner.java:1128)
>         at org.apache.hadoop.hbase.regionserver.StoreScanner.seekScanners(StoreScanner.java:437)
>         at org.apache.hadoop.hbase.regionserver.StoreScanner.<init>(StoreScanner.java:329)
>         at org.apache.hadoop.hbase.regionserver.StoreScanner.<init>(StoreScanner.java:302)
>         at org.apache.hadoop.hbase.regionserver.compactions.Compactor.createScanner(Compactor.java:806)
>         at org.apache.hadoop.hbase.regionserver.compactions.StripeCompactor$StripeInternalScannerFactory.createScanner(StripeCompactor.java:82)
>         at org.apache.hadoop.hbase.regionserver.compactions.Compactor.compact(Compactor.java:316)
>         at org.apache.hadoop.hbase.regionserver.compactions.StripeCompactor.compact(StripeCompactor.java:120)
>         at org.apache.hadoop.hbase.regionserver.compactions.StripeCompactionPolicy$SplitStripeCompactionRequest.execute(StripeCompactionPolicy.java:662)
>         at org.apache.hadoop.hbase.regionserver.StripeStoreEngine$StripeCompaction.compact(StripeStoreEngine.java:114)
>         at org.apache.hadoop.hbase.regionserver.HStore.compact(HStore.java:1461)
>         at org.apache.hadoop.hbase.regionserver.HRegion.compact(HRegion.java:2121)
>         at org.apache.hadoop.hbase.regionserver.CompactSplitThread$CompactionRunner.doCompaction(CompactSplitThread.java:519)
>         at org.apache.hadoop.hbase.regionserver.CompactSplitThread$CompactionRunner.run(CompactSplitThread.java:555)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: Could not seek StoreFileScanner[org.apache.hadoop.hbase.io.HalfStoreFileReader$1@5388be2f, cur=null] to key org.apache.hadoop.hbase.CellUtil$FirstOnRowDeleteFamilyCell@25aa56fd
>         at org.apache.hadoop.hbase.regionserver.StoreFileScanner.seek(StoreFileScanner.java:229)
>         at org.apache.hadoop.hbase.regionserver.handler.ParallelSeekHandler.process(ParallelSeekHandler.java:56)
>         at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:129)
>         ... 3 more
> Caused by: java.io.IOException: Invalid HFile block magic: \x00\x00\x00\x00\x00\x00\x00\x00
>         at org.apache.hadoop.hbase.io.hfile.BlockType.parse(BlockType.java:159)
>         at org.apache.hadoop.hbase.io.hfile.BlockType.read(BlockType.java:171)
>         at org.apache.hadoop.hbase.io.hfile.HFileBlock.createFromBuff(HFileBlock.java:333)
>         at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderImpl.readBlockDataInternal(HFileBlock.java:1753)
>         at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderImpl.readBlockData(HFileBlock.java:1552)
>         at org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.readBlock(HFileReaderImpl.java:539)
>         at org.apache.hadoop.hbase.io.hfile.HFileScannerImpl.readAndUpdateNewBlock(HFileScannerImpl.java:737)
>         at org.apache.hadoop.hbase.io.hfile.HFileScannerImpl.seekTo(HFileScannerImpl.java:726)
>         at org.apache.hadoop.hbase.io.HalfStoreFileReader$1.seekTo(HalfStoreFileReader.java:161)
>         at org.apache.hadoop.hbase.regionserver.StoreFileScanner.seekAtOrAfter(StoreFileScanner.java:315)
>         at org.apache.hadoop.hbase.regionserver.StoreFileScanner.seek(StoreFileScanner.java:211)
>         ... 5 more
> {code}
> The count of ESTABLISHED sockets keeps increasing; see the pictures:
> !increasing-of-established-sockets-image.png!
>  
> !problem-region-move-logs.png!
> !errorlogs.png!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
