[
https://issues.apache.org/jira/browse/HBASE-12449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Enis Soztutar updated HBASE-12449:
----------------------------------
Status: Patch Available (was: Open)
> Use the max timestamp of current or old cell's timestamp in HRegion.append()
> ----------------------------------------------------------------------------
>
> Key: HBASE-12449
> URL: https://issues.apache.org/jira/browse/HBASE-12449
> Project: HBase
> Issue Type: Bug
> Reporter: Enis Soztutar
> Assignee: Enis Soztutar
> Fix For: 2.0.0, 0.98.9, 0.99.2
>
> Attachments: hbase-12449-0.98.patch, hbase-12449.patch
>
>
> We have observed an issue in SLES clusters where the system timestamp
> regularly goes back in time. This happens frequently enough to cause test
> failures when LTT is used with updater.
> Everytime an mutation is performed, the updater creates a string in the form
> "#column:mutation_type" and appends it to the column "mutate_info".
> It seems that when the test fails, it is always the case that the mutate_info
> information for that particular column reported is not there in the column
> mutate_info. However, according to the MultiThreadedUpdater source code, if a
> row gets updated, all the columns will be mutated. So if a row contains 15
> columns, all 15 should appear in mutate_info.
> When the test fails though, we get an exception like:
> {code}
> 2014-11-02 04:31:12,018 ERROR [HBaseReaderThread_7] util.MultiThreadedAction:
> Error checking data for key [b0485292cde20d8a76cca37410a9f115-23787], column
> family [test_cf], column [8], mutation [null]; value of length 818
> {code}
> For the same row, the mutate info DOES NOT contain columns 8 (and 9) while it
> should:
> {code}
> test_cf:mutate_info timestamp=1414902651388,
> value=#increment:1#0:0#1:0#10:3#11:0#12:3#13:0#14:0#15:0#16:2#2:3#3:0#4:2#5:3#6:0#7:0
>
> {code}
> Further debugging led to finding the root cause where It seems that on SUSE
> System.currentTimeMillis() can go back in time freely (especially when run in
> a virtualized env like EC2), and actually happens very frequently.
> This is from a debug log that was put in place:
> {code}
> 2014-11-04 01:16:05,025 INFO
> [B.DefaultRpcServer.handler=27,queue=0,port=60020] regionserver.MemStore:
> upserting:
> 193002e668758ea9762904da1a22337c-1268/test_cf:mutate_info/1415063765025/Put/mvcc=8239/#increment:1
> 2014-11-04 01:16:05,038 INFO
> [B.DefaultRpcServer.handler=19,queue=1,port=60020] regionserver.MemStore:
> upserting:
> 193002e668758ea9762904da1a22337c-1268/test_cf:mutate_info/1415063765038/Put/mvcc=8255/#increment:1#0:3
> 2014-11-04 01:16:05,047 INFO
> [B.DefaultRpcServer.handler=21,queue=0,port=60020] regionserver.MemStore:
> upserting:
> 193002e668758ea9762904da1a22337c-1268/test_cf:mutate_info/1415063765047/Put/mvcc=8265/#increment:1#0:3#1:3
> 2014-11-04 01:16:05,057 INFO
> [B.DefaultRpcServer.handler=27,queue=0,port=60020] regionserver.MemStore:
> upserting:
> 193002e668758ea9762904da1a22337c-1268/test_cf:mutate_info/1415063765056/Put/mvcc=8274/#increment:1#0:3#1:3#10:2
> 2014-11-04 01:16:05,061 INFO
> [B.DefaultRpcServer.handler=6,queue=0,port=60020] regionserver.MemStore:
> upserting:
> 193002e668758ea9762904da1a22337c-1268/test_cf:mutate_info/1415063765061/Put/mvcc=8278/#increment:1#0:3#1:3#10:2#11:0
> 2014-11-04 01:16:05,070 INFO
> [B.DefaultRpcServer.handler=20,queue=2,port=60020] regionserver.MemStore:
> upserting:
> 193002e668758ea9762904da1a22337c-1268/test_cf:mutate_info/1415063765070/Put/mvcc=8285/#increment:1#0:3#1:3#10:2#11:0#12:3
> 2014-11-04 01:16:05,076 INFO
> [B.DefaultRpcServer.handler=3,queue=0,port=60020] regionserver.MemStore:
> upserting:
> 193002e668758ea9762904da1a22337c-1268/test_cf:mutate_info/1415063765076/Put/mvcc=8289/#increment:1#0:3#1:3#10:2#11:0#12:3#13:0
> 2014-11-04 01:16:05,084 INFO
> [B.DefaultRpcServer.handler=2,queue=2,port=60020] regionserver.MemStore:
> upserting:
> 193002e668758ea9762904da1a22337c-1268/test_cf:mutate_info/1415063765084/Put/mvcc=8293/#increment:1#0:3#1:3#10:2#11:0#12:3#13:0#14:0
> 2014-11-04 01:16:05,090 INFO
> [B.DefaultRpcServer.handler=7,queue=1,port=60020] regionserver.MemStore:
> upserting:
> 193002e668758ea9762904da1a22337c-1268/test_cf:mutate_info/1415063765090/Put/mvcc=8297/#increment:1#0:3#1:3#10:2#11:0#12:3#13:0#14:0#15:0
> 2014-11-04 01:16:05,097 INFO
> [B.DefaultRpcServer.handler=0,queue=0,port=60020] regionserver.MemStore:
> upserting:
> 193002e668758ea9762904da1a22337c-1268/test_cf:mutate_info/1415063765097/Put/mvcc=8301/#increment:1#0:3#1:3#10:2#11:0#12:3#13:0#14:0#15:0#16:0
> 2014-11-04 01:16:05,100 INFO
> [B.DefaultRpcServer.handler=14,queue=2,port=60020] regionserver.MemStore:
> upserting:
> 193002e668758ea9762904da1a22337c-1268/test_cf:mutate_info/1415063765100/Put/mvcc=8303/#increment:1#0:3#1:3#10:2#11:0#12:3#13:0#14:0#15:0#16:0#17:0
> 2014-11-04 01:16:05,103 INFO
> [B.DefaultRpcServer.handler=11,queue=2,port=60020] regionserver.MemStore:
> upserting:
> 193002e668758ea9762904da1a22337c-1268/test_cf:mutate_info/1415063765103/Put/mvcc=8305/#increment:1#0:3#1:3#10:2#11:0#12:3#13:0#14:0#15:0#16:0#17:0#18:0
> 2014-11-04 01:16:05,110 INFO
> [B.DefaultRpcServer.handler=2,queue=2,port=60020] regionserver.MemStore:
> upserting:
> 193002e668758ea9762904da1a22337c-1268/test_cf:mutate_info/1415063765110/Put/mvcc=8309/#increment:1#0:3#1:3#10:2#11:0#12:3#13:0#14:0#15:0#16:0#17:0#18:0#2:2
> 2014-11-04 01:16:05,114 INFO
> [B.DefaultRpcServer.handler=20,queue=2,port=60020] regionserver.MemStore:
> upserting:
> 193002e668758ea9762904da1a22337c-1268/test_cf:mutate_info/1415063765114/Put/mvcc=8312/#increment:1#0:3#1:3#10:2#11:0#12:3#13:0#14:0#15:0#16:0#17:0#18:0#2:2#3:2
> 2014-11-04 01:16:05,119 INFO
> [B.DefaultRpcServer.handler=21,queue=0,port=60020] regionserver.MemStore:
> upserting:
> 193002e668758ea9762904da1a22337c-1268/test_cf:mutate_info/1415063765119/Put/mvcc=8316/#increment:1#0:3#1:3#10:2#11:0#12:3#13:0#14:0#15:0#16:0#17:0#18:0#2:2#3:2#4:0
> 2014-11-04 01:16:04,941 INFO
> [B.DefaultRpcServer.handler=22,queue=1,port=60020] regionserver.MemStore:
> upserting:
> 193002e668758ea9762904da1a22337c-1268/test_cf:mutate_info/1415063764941/Put/mvcc=8320/#increment:1#0:3#1:3#10:2#11:0#12:3#13:0#14:0#15:0#16:0#17:0#18:0#2:2#3:2#4:0#5:0
> 2014-11-04 01:16:04,945 INFO
> [B.DefaultRpcServer.handler=25,queue=1,port=60020] regionserver.MemStore:
> upserting:
> 193002e668758ea9762904da1a22337c-1268/test_cf:mutate_info/1415063764945/Put/mvcc=8324/#increment:1#0:3#1:3#10:2#11:0#12:3#13:0#14:0#15:0#16:0#17:0#18:0#2:2#3:2#4:0#6:0
> 2014-11-04 01:16:04,988 INFO
> [B.DefaultRpcServer.handler=19,queue=1,port=60020] regionserver.MemStore:
> upserting:
> 193002e668758ea9762904da1a22337c-1268/test_cf:mutate_info/1415063764988/Put/mvcc=8329/#increment:1#0:3#1:3#10:2#11:0#12:3#13:0#14:0#15:0#16:0#17:0#18:0#2:2#3:2#4:0#7:3
> 2014-11-04 01:16:04,998 INFO
> [B.DefaultRpcServer.handler=2,queue=2,port=60020] regionserver.MemStore:
> upserting:
> 193002e668758ea9762904da1a22337c-1268/test_cf:mutate_info/1415063764998/Put/mvcc=8337/#increment:1#0:3#1:3#10:2#11:0#12:3#13:0#14:0#15:0#16:0#17:0#18:0#2:2#3:2#4:0#8:2
> 2014-11-04 01:16:05,005 INFO
> [B.DefaultRpcServer.handler=17,queue=2,port=60020] regionserver.MemStore:
> upserting:
> 193002e668758ea9762904da1a22337c-1268/test_cf:mutate_info/1415063765005/Put/mvcc=8343/#increment:1#0:3#1:3#10:2#11:0#12:3#13:0#14:0#15:0#16:0#17:0#18:0#2:2#3:2#4:0#9:3
> {code}
> Notice that we keep appending entries like {{#13:0}} to the end. Up until
> {code}
> 2014-11-04 01:16:05,119 INFO
> [B.DefaultRpcServer.handler=21,queue=0,port=60020] regionserver.MemStore:
> upserting:
> 193002e668758ea9762904da1a22337c-1268/test_cf:mutate_info/1415063765119/Put/mvcc=8316/#increment:1#0:3#1:3#10:2#11:0#12:3#13:0#14:0#15:0#16:0#17:0#18:0#2:2#3:2#4:0
> {code}
> the timestamps are always increasing. Then the next line has a smaller
> timestamp. It still does the append though:
> {code}
> 2014-11-04 01:16:04,941 INFO
> [B.DefaultRpcServer.handler=22,queue=1,port=60020] regionserver.MemStore:
> upserting:
> 193002e668758ea9762904da1a22337c-1268/test_cf:mutate_info/1415063764941/Put/mvcc=8320/#increment:1#0:3#1:3#10:2#11:0#12:3#13:0#14:0#15:0#16:0#17:0#18:0#2:2#3:2#4:0#5:0
> {code}
> The timestamp went from 1415063765119 to 1415063764941 (a 178ms difference).
> Since append code always checks for latest value (latest value in hbase sort
> order), then appends to the latest value, it does not look like an HBase
> level issue. You can even notice that the log4j timestamp is going back.
> Apart from SUSE timestamp issues (and NTP setup etc), I think our append code
> should ensure that the timestamp of the new key value that is appended should
> not have a lesser timestamp than the old keyvalue. We can think about
> implementing an EnvEdgeManager to do monotonically increment the ts within
> the process, but that may be expensive.
> I think we can ensure to use the max timestamp in append anyway since that
> will ensure that even if system ts goes back, the new kv always sorts before
> the previous value.
>
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)