[jira] [Commented] (HBASE-6195) Increment data will lost when the memstore flushed

Xing Shi (JIRA) Sun, 10 Jun 2012 20:13:46 -0700

    [ https://issues.apache.org/jira/browse/HBASE-6195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292625#comment-13292625 ]


Xing Shi commented on HBASE-6195:
---------------------------------

Here is the data.
I delete the row first, then use 2000 threads to increment the same row, each
thread incrementing 1000 (--inc 1000). After all threads are done, I read the
incremented row's value. I run this 11 times:

for i in `seq 0 10`
do
    /home/shubao.sx/hadoop-0.20.2-cdh3u3/bin/hadoop \
        --config /home/shubao.sx/0.90-hadoop-config \
        jar /home/shubao.sx/inc-no-delete/inc.jar \
        com.taobao.hbase.MultiThreadsIncrement --threadNum 2000 --inc 1000 \
        >/home/shubao.sx/inc-no-delete/inc.$i.log
done

and the results:

inc.0.log : return 199838
inc.1.log : return 399729
inc.2.log : return 599579
inc.3.log : return 799441
inc.4.log : return 999305
inc.5.log : return 1199173
inc.6.log : return 1399037
inc.7.log : return 1598939
inc.8.log : return 1798804
inc.9.log : return 1998708
inc.10.log : return 2198637
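
Since the counter is not reset between runs, the per-run growth is easier to
see as differences between consecutive results. A minimal sketch, assuming the
row starts at 0 before the first run; the class name RunDeltas is hypothetical
and the values are copied from the logs above:

```java
public class RunDeltas {
    // Cumulative counter values copied from inc.0.log .. inc.10.log.
    static final long[] TOTALS = {
        199838, 399729, 599579, 799441, 999305, 1199173,
        1399037, 1598939, 1798804, 1998708, 2198637
    };

    // Amount added by each run: the first value, then consecutive differences.
    static long[] deltas(long[] totals) {
        long[] out = new long[totals.length];
        long prev = 0; // assumption: the row was deleted, so it starts at 0
        for (int i = 0; i < totals.length; i++) {
            out[i] = totals[i] - prev;
            prev = totals[i];
        }
        return out;
    }

    public static void main(String[] args) {
        long[] d = deltas(TOTALS);
        for (int i = 0; i < d.length; i++) {
            System.out.println("inc." + i + ".log : +" + d[i]);
        }
    }
}
```

Each run adds between 199838 and 199929, so identical runs do not add the same
amount.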

Because I set the following HLog parameters,
  <property>
    <name>hbase.regionserver.logroll.multiplier</name>
    <value>0.005</value>
  </property>
  <property>
    <name>hbase.regionserver.maxlogs</name>
    <value>3</value>
  </property>

memstore flushes occur often.
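
To give a rough idea of how small these settings make each log file, here is a
sketch of the roll-size arithmetic. It assumes the roll threshold is the HLog
block size times hbase.regionserver.logroll.multiplier, and that the block size
is 64 MB; neither value is stated above. The 0.95 multiplier is the commonly
documented default and is included only for comparison. The class name
LogRollSize is hypothetical.

```java
public class LogRollSize {
    // Roll threshold in bytes: block size scaled by the multiplier.
    static long rollSize(long blockSizeBytes, double multiplier) {
        return (long) (blockSizeBytes * multiplier);
    }

    public static void main(String[] args) {
        long blockSize = 64L * 1024 * 1024; // assumption: 64 MB block size

        // Multiplier from the configuration above.
        System.out.println("0.005 -> " + rollSize(blockSize, 0.005) + " bytes");
        // Assumed default multiplier, for comparison only.
        System.out.println("0.95  -> " + rollSize(blockSize, 0.95) + " bytes");
    }
}
```

Under these assumptions a log rolls after about 328 KB instead of about 60 MB,
and with hbase.regionserver.maxlogs set to 3 the limit on log files is reached
quickly, which forces flushes.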
                
> Increment data will lost when the memstore flushed
> --------------------------------------------------
>
>                 Key: HBASE-6195
>                 URL: https://issues.apache.org/jira/browse/HBASE-6195
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: Xing Shi
>
> There are currently two problems in increment():
> First:
> The timestamp (the variable now) in HRegion's increment() is generated
> before the row lock is acquired, so when multiple threads increment the
> same row, a thread that generated its timestamp earlier may get the lock
> later. Because increment stores only one version, the result is still
> correct up to this point.
> While the region is flushing, an increment reads the KV from the snapshot
> and from the memstore, takes whichever has the larger timestamp, and writes
> the result back to the memstore. If the snapshot's timestamp is larger than
> the memstore's, the increment reads the old data and increments that, which
> is wrong.
> Second:
> There is also a risk in increment. Because it writes to the memstore first
> and then to the HLog, if the HLog write fails, the client can still read the
> incremented value.
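
The two problems quoted above can be replayed with a small single-threaded
model. This is only a sketch of the described interleavings, not HBase code:
the class IncrementSketch, the KV holder, and the one-cell memstore and
snapshot are all hypothetical simplifications.

```java
import java.util.ArrayList;
import java.util.List;

public class IncrementSketch {
    // One cell version: a timestamp and the counter value.
    static final class KV {
        final long ts;
        final long value;
        KV(long ts, long value) { this.ts = ts; this.value = value; }
    }

    private KV memstore;                             // active memstore (one version)
    private KV snapshot;                             // snapshot being flushed
    private final List<KV> log = new ArrayList<>();  // stand-in for the HLog

    // Read as described in the issue: the KV with the larger timestamp wins.
    long read() {
        KV newest = memstore;
        if (snapshot != null && (newest == null || snapshot.ts > newest.ts)) {
            newest = snapshot;
        }
        return newest == null ? 0 : newest.value;
    }

    // A flush moves the memstore contents into the snapshot.
    void startFlush() {
        snapshot = memstore;
        memstore = null;
    }

    // ts was generated before the row lock was taken. The memstore is
    // written first, then the log, as the issue describes.
    void increment(long ts, boolean logFails) {
        KV kv = new KV(ts, read() + 1);
        memstore = kv;
        if (logFails) {
            throw new IllegalStateException("simulated HLog write failure");
        }
        log.add(kv);
    }

    // Problem 1: three increments are applied, but the final value is 2.
    static long staleRead() {
        IncrementSketch r = new IncrementSketch();
        long tsA = 1;             // A generates its timestamp first...
        long tsB = 2;             // ...then B does
        r.increment(tsB, false);  // B gets the lock first: memstore = (ts 2, value 1)
        r.startFlush();           // snapshot = (ts 2, value 1), memstore empty
        r.increment(tsA, false);  // A reads 1, writes memstore = (ts 1, value 2)
        r.increment(3, false);    // snapshot ts 2 > memstore ts 1, so the old 1 is read
        return r.read();          // memstore = (ts 3, value 2)
    }

    // Problem 2: the log write fails, yet the new value is readable.
    static long failedLogWrite() {
        IncrementSketch r = new IncrementSketch();
        try {
            r.increment(1, true);
        } catch (IllegalStateException expected) {
            // The client sees a failure here; r.log is still empty.
        }
        return r.read();
    }

    public static void main(String[] args) {
        System.out.println("after 3 increments: " + staleRead());
        System.out.println("after failed log write: " + failedLogWrite());
    }
}
```

In this model the first scenario ends at 2 after three increments, and the
second returns 1 even though nothing reached the log.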

