[ https://issues.apache.org/jira/browse/HBASE-14177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell updated HBASE-14177:
-----------------------------------
    Fix Version/s:     (was: 0.98.17)
                   0.98.18

> Full GC on client may lead to missing scan results
> --------------------------------------------------
>
>                 Key: HBASE-14177
>                 URL: https://issues.apache.org/jira/browse/HBASE-14177
>             Project: HBase
>          Issue Type: Bug
>          Components: Client
>    Affects Versions: 0.98.12, 0.98.13, 1.0.2
>            Reporter: James Estes
>            Priority: Critical
>              Labels: dataloss
>             Fix For: 1.0.4, 0.98.18
>
>
> After adding a large row, scanning that row back returns an empty result. 
> After a few attempts the scan succeeds (all attempts run over the same data 
> on an HBase instance receiving no other writes).
> Looking at the logs, this seems to happen when the client is under memory 
> pressure and several Full GCs occur. Messages then appear indicating that 
> region locations are being removed from the local client cache:
> 2015-07-31 12:50:24,647 [main] DEBUG 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation  
> - Removed 192.168.1.131:50981 as a location of 
> big_row_1438368609944,,1438368610048.880c849594807bdc7412f4f982337d6c. for 
> tableName=big_row_1438368609944 from cache
> Blaming the GC may sound fanciful, but if the test is run with -Xms4g -Xmx4g 
> then it always passes on the first scan attempt. Maybe the pause is enough to 
> remove something from the cache, or the client is using weak references 
> somewhere?
> More info:
> http://mail-archives.apache.org/mod_mbox/hbase-user/201507.mbox/%3CCAE8tVdnFf%3Dob569%3DfJkpw1ndVWOVTkihYj9eo6qt0FrzihYHgw%40mail.gmail.com%3E
> Test used to reproduce:
> https://github.com/housejester/hbase-debugging#fullgctest
> I tested and had failures in:
> 0.98.12 client/server
> 0.98.13 client and 0.98.12 server
> 0.98.13 client/server
> 1.1.0 client and 0.98.13 server
> 0.98.13 client and 1.1.0 server
> 0.98.12 client and 1.1.0 server
> I tested without failure in:
> 1.1.0 client/server
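
The report above says the scan comes back empty at first and succeeds after a few attempts. As a rough illustration of how that symptom could be measured, here is a minimal sketch that counts how many attempts it takes before a scan returns data. It is not the linked reproduction test and uses no HBase API: `ScanRetry`, `attemptsUntilNonEmpty`, and the `Supplier` standing in for a client scan are all hypothetical names introduced for this example.

```java
import java.util.Collection;
import java.util.function.Supplier;

/** Hypothetical helper; not part of HBase or of the linked test. */
public final class ScanRetry {
    private ScanRetry() {}

    /**
     * Calls the supplied scan up to maxAttempts times and returns the
     * 1-based attempt that first produced a non-empty result, or -1 if
     * every attempt came back empty (or null).
     */
    public static int attemptsUntilNonEmpty(
            Supplier<? extends Collection<?>> scan, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            Collection<?> rows = scan.get();
            if (rows != null && !rows.isEmpty()) {
                return attempt;
            }
        }
        return -1;
    }
}
```

In an actual reproduction the supplier would wrap a client scan over the large row. A return value greater than 1 would then correspond to the failure described here, where the first scan misses data that a later scan finds.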



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)