[
https://issues.apache.org/jira/browse/HBASE-8055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13598120#comment-13598120
]
Lars Hofhansl commented on HBASE-8055:
--------------------------------------
I spent some time looking through the code. I can't see where this goes wrong.
Checked the following:
* bulk load will open the reader in all code paths (had the open been missing,
the metadata would not have been loaded)
* in all circumstances the StoreFile's metadata is written. I had initially
suspected the ad-hoc splitting in bulk load, but that code copies the metadata
from the original file.
* the record writer in HFileOutputFormat writes the metadata (see the sketch
below)
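For reference, the contract I was checking on the write side looks roughly like
the sketch below. The class and the exact calls are mine (an assumption about
the shape of the 0.94 code, not a quote of HFileOutputFormat): the writer tracks
the timestamp range of every appended KeyValue and persists it under
StoreFile.TIMERANGE_KEY before closing, which is the entry the reader later
turns into timeRangeTracker.
{code}
// Assumed shape of the write path (illustration only, not HBase source):
// track the min/max timestamp of every appended KeyValue and persist the
// range under StoreFile.TIMERANGE_KEY.
import java.io.IOException;

import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.io.hfile.HFile;
import org.apache.hadoop.hbase.regionserver.StoreFile;
import org.apache.hadoop.hbase.regionserver.TimeRangeTracker;
import org.apache.hadoop.io.WritableUtils;

class TimeRangeWritingSketch {
  private final TimeRangeTracker trt = new TimeRangeTracker();

  void write(HFile.Writer w, KeyValue kv) throws IOException {
    trt.includeTimestamp(kv);  // keep the running min/max up to date
    w.append(kv);
  }

  void close(HFile.Writer w) throws IOException {
    // If this appendFileInfo call were skipped, the TIMERANGE entry would be
    // absent and the reader's timeRangeTracker would stay null.
    w.appendFileInfo(StoreFile.TIMERANGE_KEY, WritableUtils.toByteArray(trt));
    w.close();
  }
}
{code}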
Not sure where else to look. In any case, we should either remove all the other
null checks for timeRangeTracker or add the same null check to the remaining
methods that lack it.
Even then, I'd be worried about how this came about.
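As far as I can tell from the open path, the tracker only ends up null when the
TIMERANGE entry is absent from the file info. A minimal sketch of that logic,
assuming the 0.94 shape of StoreFile.open() (the helper class and method below
are mine, for illustration):
{code}
// Assumed shape of the metadata load in StoreFile.open(): the tracker is only
// instantiated when a TIMERANGE entry exists in the file info, so a file
// written without it leaves the reader's timeRangeTracker null, which is
// exactly the state the NPE implies.
import java.io.IOException;
import java.util.Map;

import org.apache.hadoop.hbase.regionserver.TimeRangeTracker;
import org.apache.hadoop.hbase.util.Writables;

class TimeRangeLoadingSketch {
  static TimeRangeTracker trackerFromFileInfo(Map<byte[], byte[]> fileInfo,
      byte[] timeRangeKey) throws IOException {
    byte[] b = fileInfo.get(timeRangeKey);
    if (b == null) {
      return null;  // metadata missing: the caller keeps timeRangeTracker null
    }
    TimeRangeTracker trt = new TimeRangeTracker();
    Writables.getWritable(b, trt);  // deserialize the persisted range
    return trt;
  }
}
{code}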
> Potentially missing null check in StoreFile.Reader.getMaxTimestamp()
> --------------------------------------------------------------------
>
> Key: HBASE-8055
> URL: https://issues.apache.org/jira/browse/HBASE-8055
> Project: HBase
> Issue Type: Bug
> Reporter: Lars Hofhansl
> Fix For: 0.95.0, 0.98.0, 0.94.7
>
>
> We just ran into a scenario where we got the following NPE:
> {code}
> 13/03/08 11:52:13 INFO regionserver.Store: Successfully loaded store file file:/tmp/hfile-import-00Dxx0000001lmJ-09Cxx00000000Jm/COLFAM/file09Cxx00000000Jm into store COLFAM (new location: file:/tmp/localhbase/data/SFDC.ENTITY_HISTORY_ARCHIVE/aeacee43aaf1748c6e60b9cc12bcac3d/COLFAM/120d683414e44478984b50ddd79b6826)
> 13/03/08 11:52:13 ERROR regionserver.HRegionServer: Failed openScanner
> java.lang.NullPointerException
>     at org.apache.hadoop.hbase.regionserver.StoreFile$Reader.getMaxTimestamp(StoreFile.java:1702)
>     at org.apache.hadoop.hbase.regionserver.StoreFileScanner.requestSeek(StoreFileScanner.java:301)
>     at org.apache.hadoop.hbase.regionserver.StoreScanner.<init>(StoreScanner.java:127)
>     at org.apache.hadoop.hbase.regionserver.Store.getScanner(Store.java:2070)
>     at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.<init>(HRegion.java:3383)
>     at org.apache.hadoop.hbase.regionserver.HRegion.instantiateRegionScanner(HRegion.java:1628)
>     at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:1620)
>     at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:1596)
>     at org.apache.hadoop.hbase.regionserver.HRegionServer.openScanner(HRegionServer.java:2342)
>     at sun.reflect.GeneratedMethodAccessor13.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:364)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1400)
> 13/03/08 11:52:14 ERROR regionserver.HRegionServer: Failed openScanner
> {code}
> It's not clear yet how we got into this situation (we are generating HFiles
> via HFileOutputFormat and bulk-loading them). It seems this can only happen
> when the HFile itself is corrupted.
> Looking at the code, though, I see this is the only place where we access
> StoreFile.reader.timeRangeTracker without a null check. So it appears we are
> expecting scenarios in which it can be null.
> A simple fix would be:
> {code}
> public long getMaxTimestamp() {
>   return timeRangeTracker == null ? Long.MAX_VALUE :
>       timeRangeTracker.maximumTimestamp;
> }
> {code}
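> For what it's worth, Long.MAX_VALUE looks like the right default here:
> callers such as StoreFileScanner.requestSeek use the max timestamp to decide
> whether a file can be short-circuited for a given seek point, so an unknown
> range has to be treated as "could contain arbitrarily new data". A
> hypothetical illustration (not the actual requestSeek code):
> {code}
> // Illustration only (hypothetical helper, not HBase source): with the
> // tracker missing, getMaxTimestamp() reports Long.MAX_VALUE, so this check
> // never concludes that the file is too old to matter; the file is always
> // consulted, which is slower than with a real max timestamp but never wrong.
> import org.apache.hadoop.hbase.regionserver.StoreFile;
>
> class MaxTimestampUsageSketch {
>   static boolean fileOlderThanSeekPoint(StoreFile.Reader r, long seekTs) {
>     return r.getMaxTimestamp() < seekTs;
>   }
> }
> {code}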