Yeah, verifyrep is a pretty basic tool and there's tons of room for improvement. For the moment I guess you can ignore the 8-byte cells that aren't printable strings. Feel free to hack around that MR job and maybe contribute back?
The use case for which I built it had loads of tables and the ones that had ICVs pretty much only had that, so it was easy to verify just a couple of tables to have a good idea of how it was doing.

J-D

On Thu, Jul 11, 2013 at 2:36 PM, Patrick Schless <[email protected]> wrote:
> Interesting (thanks for the info). I don't suppose there's an easy way to
> filter those incremented cells out, so the response from verifyRep is
> meaningful? :)
>
>
> On Thu, Jul 11, 2013 at 3:44 PM, Jean-Daniel Cryans <[email protected]> wrote:
>
>> Yeah increments won't work. I guess the warning isn't really visible
>> but one place you can see it is:
>>
>> $ ./bin/hadoop jar ../hbase/hbase.jar
>> An example program must be given as the first argument.
>> Valid program names are:
>>   CellCounter: Count cells in HBase table
>>   completebulkload: Complete a bulk data load.
>>   copytable: Export a table from local cluster to peer cluster
>>   export: Write table data to HDFS.
>>   import: Import data written by Export.
>>   importtsv: Import data in TSV format.
>>   rowcounter: Count rows in HBase table
>> vvvv
>>   verifyrep: Compare the data from tables in two different clusters.
>> WARNING: It doesn't work for incrementColumnValues'd cells since the
>> timestamp is changed after being appended to the log.
>> ^^^^
>>
>> The problem is that increments' timestamps are different in the WAL
>> and in the final KV that's stored in HBase.
>>
>> J-D
>>
>> On Thu, Jul 11, 2013 at 12:19 PM, Patrick Schless
>> <[email protected]> wrote:
>> > It's possible, but I'm not sure. This is a live system, and we do use
>> > increment, and it's a smaller portion of our writes into HBase. I can try
>> > to duplicate it, but I can't say how these specific cells got written.
>> >
>> > Would incremented cells not get replicated correctly?
>> >
>> >
>> > On Thu, Jul 11, 2013 at 12:53 PM, Jean-Daniel Cryans <[email protected]> wrote:
>> >
>> >> Are those incremented cells?
>> >>
>> >> J-D
>> >>
>> >> On Thu, Jul 11, 2013 at 10:23 AM, Patrick Schless
>> >> <[email protected]> wrote:
>> >> > I have had replication running for about a week now, and have had a lot of
>> >> > data flowing to our slave cluster over that time. Now, I'm running the
>> >> > verifyrep MR job over a 1-hour period a couple days ago (which should be
>> >> > fully replicated), and I'm seeing a small number of "BADROWS".
>> >> > Spot-checking a few of them, the issue seems to be that the rows are
>> >> > present, and have the same values, but a single cell in the row will be off
>> >> > by 1ms.
>> >> >
>> >> > For instance, the log reports this error:
>> >> > java.lang.Exception: This result was different:
>> >> > keyvalues={01e581745c6a43aba01adf105af4e4a92013071015/data:!\xDF\xE0\x01/1373470622986/Put/vlen=8,
>> >> > 01e581745c6a43aba01adf105af4e4a92013071015/data:&s\xC0\x01/1373470923084/Put/vlen=8,
>> >> > 01e581745c6a43aba01adf105af4e4a92013071015/data:+\x07\xA0\x01/1373471223717/Put/vlen=8,
>> >> > 01e581745c6a43aba01adf105af4e4a92013071015/data:/\x9B\x80\x01/1373471523316/Put/vlen=8,
>> >> > 01e581745c6a43aba01adf105af4e4a92013071015/data:4/`\x01/1373471822913/Put/vlen=8}
>> >> > compared to
>> >> > keyvalues={01e581745c6a43aba01adf105af4e4a92013071015/data:!\xDF\xE0\x01/1373470622986/Put/vlen=8,
>> >> > 01e581745c6a43aba01adf105af4e4a92013071015/data:&s\xC0\x01/1373470923084/Put/vlen=8,
>> >> > 01e581745c6a43aba01adf105af4e4a92013071015/data:+\x07\xA0\x01/1373471223716/Put/vlen=8,
>> >> > 01e581745c6a43aba01adf105af4e4a92013071015/data:/\x9B\x80\x01/1373471523316/Put/vlen=8,
>> >> > 01e581745c6a43aba01adf105af4e4a92013071015/data:4/`\x01/1373471822913/Put/vlen=8}
>> >> >
>> >> > Some diffing reduces the issue down to:
>> >> > 01e581745c6a43aba01adf105af4e4a92013071015/data:+\x07\xA0\x01/1373471223717/Put/vlen=8
>> >> > compared to
>> >> > 01e581745c6a43aba01adf105af4e4a92013071015/data:+\x07\xA0\x01/1373471223716/Put/vlen=8.
>> >> >
>> >> > I'm assuming that the value before "/Put" is the cell's timestamp, which
>> >> > means that the copies are off by 1ms.
>> >> >
>> >> > Any idea what could cause this? So far (the job is still running), the
>> >> > problem seems rare (about 0.05% of rows).
>> >> >
>> >> > Thanks,
>> >> > Patrick
>> >>
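
The "some diffing" step on a BADROWS log entry can be done mechanically. Below is a rough Python sketch, not part of verifyrep or HBase: it only post-processes the two `keyvalues={...}` strings from the exception message. The function names (`parse_keyvalues`, `timestamp_only_diffs`) are made up for illustration, and it assumes the `cell/timestamp/Type/vlen=N` layout seen in the log above, with cells separated by a comma plus whitespace. Since the log prints only the value length, this compares timestamps, types, and lengths, not the values themselves.

```python
import re


def parse_keyvalues(text):
    """Map each cell ("row/family:qualifier") to (timestamp, type, vlen)."""
    body = text[text.index("{") + 1 : text.rindex("}")]
    cells = {}
    # Assumption: no qualifier contains a comma followed by whitespace.
    for chunk in re.split(r",\s+", body.strip()):
        # Split from the right, because qualifiers may themselves contain "/".
        cell, ts, kv_type, vlen_field = chunk.rsplit("/", 3)
        cells[cell] = (int(ts), kv_type, int(vlen_field.split("=", 1)[1]))
    return cells


def timestamp_only_diffs(source_text, peer_text):
    """List cells present on both sides whose only logged difference is the timestamp."""
    source = parse_keyvalues(source_text)
    peer = parse_keyvalues(peer_text)
    diffs = []
    for cell, (ts, kv_type, vlen) in source.items():
        if cell not in peer:
            continue
        peer_ts, peer_type, peer_vlen = peer[cell]
        if ts != peer_ts and (kv_type, vlen) == (peer_type, peer_vlen):
            diffs.append((cell, ts, peer_ts, vlen))
    return diffs


if __name__ == "__main__":
    row = "01e581745c6a43aba01adf105af4e4a92013071015"
    src = ("keyvalues={" + row + r"/data:+\x07\xA0\x01/1373471223717/Put/vlen=8, "
           + row + r"/data:/\x9B\x80\x01/1373471523316/Put/vlen=8}")
    dst = ("keyvalues={" + row + r"/data:+\x07\xA0\x01/1373471223716/Put/vlen=8, "
           + row + r"/data:/\x9B\x80\x01/1373471523316/Put/vlen=8}")
    for cell, ts, peer_ts, vlen in timestamp_only_diffs(src, dst):
        print(cell, ts, peer_ts, "delta_ms =", ts - peer_ts, "vlen =", vlen)
```

A row whose only mismatches are timestamp-only differences on 8-byte cells is consistent with the increment behaviour described above, but that is a heuristic: an 8-byte value does not prove the cell was written by an increment.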
