[ 
https://issues.apache.org/jira/browse/PHOENIX-7367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17867365#comment-17867365
 ] 

Ujjawal Kumar commented on PHOENIX-7367:
----------------------------------------

A similar issue was also observed in HBASE-28743 when doing the same via 
HBase.

> Snapshot based mapreduce jobs fails after HBASE-28401
> -----------------------------------------------------
>
>                 Key: PHOENIX-7367
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-7367
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: Ujjawal Kumar
>            Priority: Major
>
> [HBASE-28401|https://issues.apache.org/jira/browse/HBASE-28401] introduced a 
> regression that causes HRegion#close to throw an NPE while trying to close the 
> memstore within the mapper.
> As a result, snapshot-based MR jobs have started failing in Phoenix. 
> The failure occurs because TableSnapshotResultIterator ends up trying to 
> release the read lock twice via HRegion#closeRegionOperation:
>  * TableSnapshotResultIterator's next method [calls ScanningResultIterator's 
> next 
> method|https://github.com/apache/phoenix/blob/1e96a2756eaf0a2201a50579789190e8c10747df/phoenix-core-server/src/main/java/org/apache/phoenix/iterate/TableSnapshotResultIterator.java#L180].
>  * ScanningResultIterator's [next tries to close the SnapshotScanner 
> early|https://github.com/apache/phoenix/blob/1e96a2756eaf0a2201a50579789190e8c10747df/phoenix-core-client/src/main/java/org/apache/phoenix/iterate/ScanningResultIterator.java#L225]
>  * [SnapshotScanner's close 
> method|https://github.com/apache/phoenix/blob/1e96a2756eaf0a2201a50579789190e8c10747df/phoenix-core-server/src/main/java/org/apache/phoenix/iterate/SnapshotScanner.java#L180-L187] 
> is invoked:
>  **  HRegion#closeRegionOperation releases the read lock and succeeds
>  **  HRegion#close throws an IOException due to the memstore issue (HBASE-28401)
>  **  SnapshotScanner catches the IOException but doesn't set its region field to 
> null
>  * TableSnapshotResultIterator's [finally block calls 
> ScanningResultIterator's close 
> method|https://github.com/apache/phoenix/blob/1e96a2756eaf0a2201a50579789190e8c10747df/phoenix-core-server/src/main/java/org/apache/phoenix/iterate/TableSnapshotResultIterator.java#L187-L190].
>  ** *ScanningResultIterator's close is called again*
>  ** *Since the region field wasn't null, HRegion#closeRegionOperation is called 
> again and throws IllegalMonitorStateException while trying to release the 
> read lock*
>  ** The IllegalMonitorStateException then causes the whole mapper to fail
> This doesn't cause a failure when doing snapshot reads via HBase (see 
> HBASE-28743, where the same NPE was observed but the mapper still passes), 
> because the closest equivalent code (the RecordReader within 
> TableSnapshotInputFormat) doesn't try to close the region [as part of its 
> nextKeyValue 
> method|https://github.com/apache/hbase/blob/master/hbase-mapreduce/src/main/java/org/apache/hadoop/hbase/mapreduce/TableSnapshotInputFormatImpl.java#L275-L280].
>   
> This is generally much safer [because record readers are always closed 
> explicitly (even if the mapper's run method 
> fails)|https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/MapTask.java#L466-L481].
> There are two improvements that can be made here: 
> 1. Disable MSLAB for the region created within the snapshot (by setting 
> hbase.hregion.memstore.mslab.enabled to false).
> 2. In TableSnapshotResultIterator, remove the SnapshotScanner's close 
> (via ScanningResultIterator) that is called within the next method. It would be 
> closed by the mapper at the end anyway.
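
The double-release sequence in the description can be reproduced in isolation. What follows is a minimal sketch, assuming hypothetical FakeRegion and FakeScanner classes that stand in for HRegion and SnapshotScanner (neither is the real HBase or Phoenix class; only java.util.concurrent's ReentrantReadWriteLock is real API). It shows that releasing a read lock that is no longer held raises IllegalMonitorStateException, and that clearing the region field in a finally block turns a repeated close into a no-op. That guard is an assumption used for illustration, not one of the two improvements proposed in the issue.

```java
import java.io.IOException;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class DoubleCloseSketch {

    /** Hypothetical stand-in for a region; NOT the real HRegion API. */
    static class FakeRegion {
        private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

        void startRegionOperation() {
            lock.readLock().lock();
        }

        // Throws IllegalMonitorStateException if this thread holds no read lock.
        void closeRegionOperation() {
            lock.readLock().unlock();
        }

        // Simulates the close failure described in the issue.
        void close() throws IOException {
            throw new IOException("simulated memstore close failure");
        }
    }

    /** Hypothetical stand-in for a snapshot scanner; NOT the real SnapshotScanner. */
    static class FakeScanner {
        private FakeRegion region;
        private final boolean clearOnFailure;

        FakeScanner(FakeRegion region, boolean clearOnFailure) {
            this.region = region;
            this.clearOnFailure = clearOnFailure;
            region.startRegionOperation(); // the read lock is taken once
        }

        void close() {
            if (region == null) {
                return; // already closed, nothing to release
            }
            try {
                region.closeRegionOperation(); // releases the read lock
                region.close();                // fails in this simulation
                region = null;                 // never reached when close() throws
            } catch (IOException e) {
                // Swallowed, mirroring the behaviour described in the issue.
            } finally {
                if (clearOnFailure) {
                    region = null; // assumed guard: makes close() idempotent
                }
            }
        }
    }

    /** Returns true if the second close() throws IllegalMonitorStateException. */
    static boolean secondCloseThrows(boolean clearOnFailure) {
        FakeScanner scanner = new FakeScanner(new FakeRegion(), clearOnFailure);
        scanner.close(); // early close, as from the iterator's next method
        try {
            scanner.close(); // second close, as from the finally block
            return false;
        } catch (IllegalMonitorStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("without guard, second close throws: " + secondCloseThrows(false));
        System.out.println("with guard, second close throws: " + secondCloseThrows(true));
    }
}
```

Running main prints true for the unguarded scanner and false for the guarded one. Removing the early close from the next method, as proposed in improvement 2, avoids the second call altogether; a guard of this kind would only make close safe to call more than once.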



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
