Keep digging, Jon.
St.Ack
On Mon, Oct 10, 2011 at 3:55 PM, Jonathan Hsieh <[email protected]> wrote:

> Yup, I'm pretty convinced this is a consistency and isolation problem.
> Here's the output of my program (expected 59 rows) and some output from
> problem rows.
>
> The row has 10 column families ("f0", "f1", ... "f9") with two columns
> each. There are actually two 'row0000021332' rows -- the '48' and the '49'
> entries in the output below. Each came from a separate call to next().
> Notice how they are split between column families! Over many runs this
> happens repeatedly, with the "split" falling between different column
> families.
>
> scan items counted: 59 with 0 null ColumnFamilyMaps and 0 nullRows
> scan items counted: 59 with 0 null ColumnFamilyMaps and 0 nullRows
> scan items counted: 59 with 0 null ColumnFamilyMaps and 0 nullRows
> scan items counted: 60 with 0 null ColumnFamilyMaps and 0 nullRows
> Row row0000021332 had time stamps: [48:
> keyvalues={row0000021332/f0:data/1318200437341/Put/vlen=1000,
> row0000021332/f0:qual/1318200437341/Put/vlen=10,
> row0000021332/f1:data/1318200437341/Put/vlen=1000,
> row0000021332/f1:qual/1318200437341/Put/vlen=10,
> row0000021332/f2:data/1318200437341/Put/vlen=1000,
> row0000021332/f2:qual/1318200437341/Put/vlen=10,
> row0000021332/f3:data/1318200437341/Put/vlen=1000,
> row0000021332/f3:qual/1318200437341/Put/vlen=10,
> row0000021332/f4:data/1318200437341/Put/vlen=1000,
> row0000021332/f4:qual/1318200437341/Put/vlen=10,
> row0000021332/f5:data/1318200437341/Put/vlen=1000,
> row0000021332/f5:qual/1318200437341/Put/vlen=10}, 49:
> keyvalues={row0000021332/f6:data/1318200437341/Put/vlen=1000,
> row0000021332/f6:qual/1318200437341/Put/vlen=10,
> row0000021332/f7:data/1318200437341/Put/vlen=1000,
> row0000021332/f7:qual/1318200437341/Put/vlen=10,
> row0000021332/f8:data/1318200437341/Put/vlen=1000,
> row0000021332/f8:qual/1318200437341/Put/vlen=10,
> row0000021332/f9:data/1318200437341/Put/vlen=1000,
> row0000021332/f9:qual/1318200437341/Put/vlen=10}]
>
> Jon.
>
> On Mon, Oct 10, 2011 at 1:50 PM, Jonathan Hsieh <[email protected]> wrote:
>
>> I've modified scenario 2's filter to setFilterIfMissing(true), which I
>> had missed in scenario 1, and now get an undercount like in that
>> scenario. In another experiment, I found that getting the latest version
>> via Result.getColumnLatest(..) would return null.
>>
>> This seems like a consistency+isolation violation of the expected
>> row-ACID properties.
>>
>> ----
>> Scan s = new Scan();
>> ...
>> SingleColumnValueFilter filter = new SingleColumnValueFilter(
>>     Bytes.toBytes("f1"), Bytes.toBytes("qual"), CompareOp.EQUAL,
>>     value);
>> filter.setFilterIfMissing(true); // this is what drops events.
>> s.setFilter(filter);
>> ----
>>
>> Jon.
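For anyone else chasing this, here is a minimal standalone version of that
counting scan. This is a sketch only -- the table name ("testtable"), the
filter value ("expected"), and the duplicate-row check are placeholders and
simplifications, not taken from Jon's actual programs:

----
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp;
import org.apache.hadoop.hbase.filter.SingleColumnValueFilter;
import org.apache.hadoop.hbase.util.Bytes;

public class ScanConsistencyCheck {
  public static void main(String[] args) throws IOException {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "testtable"); // placeholder table name

    Scan s = new Scan();
    SingleColumnValueFilter filter = new SingleColumnValueFilter(
        Bytes.toBytes("f1"), Bytes.toBytes("qual"),
        CompareOp.EQUAL, Bytes.toBytes("expected")); // placeholder value
    filter.setFilterIfMissing(true); // the setting that makes rows disappear
    s.setFilter(filter);

    ResultScanner scanner = table.getScanner(s);
    int count = 0;
    byte[] prevRow = null;
    try {
      for (Result r : scanner) {
        count++;
        // Row-level ACID should mean each rowkey shows up in exactly one
        // Result; a consecutive repeat means one row was split across two
        // next() calls.
        if (prevRow != null && Bytes.equals(prevRow, r.getRow())) {
          System.out.println("Row " + Bytes.toString(r.getRow())
              + " returned twice: " + r);
        }
        prevRow = r.getRow();
      }
    } finally {
      scanner.close();
      table.close();
    }
    System.out.println("scan items counted: " + count);
  }
}
----

Run it in a loop while a second process overwrites the same rows: an
undercount (with setFilterIfMissing(true) set) or a repeated rowkey both
point at the same split-row problem described above.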
>> On Mon, Oct 10, 2011 at 12:52 PM, Jonathan Hsieh <[email protected]> wrote:
>>
>>> I've been working on a problem related to the ACID rules for scans as
>>> defined here: http://hbase.apache.org/acid-semantics.html.
>>>
>>> In both scenarios no new rowkeys are written -- all writes are
>>> "overwrites" of already existing rows. So let's say the table has 3 rows
>>> A1, B1, C1, and we overwrite them with values A2, B2, C2 respectively.
>>> I'd expect every scan to return the same number of elements (3), but
>>> with potentially any combination of A1|A2, B1|B2, C1|C2 (since there are
>>> no ACID guarantees across rows).
>>>
>>> Scenario 1: I have an MR job that does a filtered scan (I've confirmed
>>> the problem happens without the filter as well) over a table, takes each
>>> row, and writes it back to the same row in the same table. I then run
>>> the job 2-3x concurrently on the same table with the same filter. I
>>> believe every run should report the same number of elements read. It
>>> seems that in the case of a multiple-column-family table, this is not
>>> so -- in particular, sometimes the MR counters report *fewer* than the
>>> expected number of input records (but never more). This seems wrong.
>>> Agree?
>>>
>>> Scenario 2: When trying to duplicate the problem excluding the MR
>>> portions, I wrote two programs -- one that does a filtered scan and
>>> overwrites the existing rows, and one that just does the same filtered
>>> scan and counts the number of rows read. I've also used the
>>> TableRecordReaderImpl code that the MR TableInputFormat uses. In this
>>> case, sometimes the scan/counting job actually returns *too many*
>>> entries -- for some rowkeys it returns two records (but never fewer).
>>> This probably should not happen either. Agree?
>>>
>>> I've observed this on 0.90.1+patches and 0.90.3+patches. It was also
>>> claimed that this issue was not seen in a 0.89.x-based hbase. Thus far
>>> I've only been able to reproduce it on multiple-column-family tables.
>>>
>>> I believe scenario 1's problem is related to these. Concur?
>>> https://issues.apache.org/jira/browse/HBASE-2856
>>> https://issues.apache.org/jira/browse/HBASE-3498
>>>
>>> I think scenario 2 is probably related, but I'm not sure it is the same
>>> issue.
>>>
>>> Are there other related JIRAs?
>>>
>>> Any hints for where to hunt this down? (I'm starting to go into the
>>> scan and write paths on the RegionServer, but this is a fairly large
>>> area...)
>>>
>>> Thanks,
>>> Jon.
>>>
>>> --
>>> // Jonathan Hsieh (shay)
>>> // Software Engineer, Cloudera
>>> // [email protected]
>>
>> --
>> // Jonathan Hsieh (shay)
>> // Software Engineer, Cloudera
>> // [email protected]
>
> --
> // Jonathan Hsieh (shay)
> // Software Engineer, Cloudera
> // [email protected]
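And a matching sketch of the write side of scenario 2, under the same
assumptions as the counting sketch above (placeholder table name
"testtable", families f0..f9 with "data" and "qual" columns, payload value
"expected"; the filter on the writer's scan is omitted for brevity):

----
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class OverwriteExistingRows {
  private static final int NUM_FAMILIES = 10; // table has families f0..f9

  public static void main(String[] args) throws IOException {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "testtable"); // placeholder table name
    ResultScanner scanner = table.getScanner(new Scan());
    try {
      for (Result r : scanner) {
        // Rewrite every family of an existing row. No new rowkeys are ever
        // introduced, so a concurrent scan should always see the same
        // number of rows.
        Put p = new Put(r.getRow());
        for (int i = 0; i < NUM_FAMILIES; i++) {
          byte[] family = Bytes.toBytes("f" + i);
          p.add(family, Bytes.toBytes("data"), new byte[1000]); // 1000-byte payload
          // Keep "qual" matching the counting scan's filter value.
          p.add(family, Bytes.toBytes("qual"), Bytes.toBytes("expected"));
        }
        table.put(p); // a single-row put, supposed to be atomic across families
      }
    } finally {
      scanner.close();
      table.close();
    }
  }
}
----

Since a single-row Put is supposed to be atomic across column families, a
concurrent scanner should never see a row split between the old and new
versions, whatever mix of old and new rows it happens to return.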
