[ 
https://issues.apache.org/jira/browse/HBASE-4570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13127887#comment-13127887
 ] 

Jonathan Hsieh commented on HBASE-4570:
---------------------------------------

@Ted

I have a strange situation: with just the fixes (first two patches, no 
instrumentation) I still get a lot of failures in my test setup.  However, 
with the extra instrumentation the failures seem to go away (it runs for a 
long time without encountering problems).  Note that in my table setup I have 
10 cf's, each with 2 cols, so the instrumentation is written to always expect 
20 KVs.  I have two processes: one that does a filtered scan and twiddle, and 
another that just does a filtered scan and count.
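
For illustration only, the per-row expectation described above can be modelled roughly as follows. This is a standalone Python sketch, not the attached instrumentation: scan results are modelled as plain (row key, column list) tuples rather than HBase client objects, and the names `EXPECTED_KVS`, `find_split_rows`, and `columns` are hypothetical.

```python
from itertools import groupby

EXPECTED_KVS = 20  # 10 column families x 2 columns, per the table setup


def find_split_rows(results, expected=EXPECTED_KVS):
    """Return row keys whose KVs arrived spread over consecutive scan results.

    `results` is a list of (row_key, [column, ...]) tuples in scan order,
    a stand-in for what a client-side scanner would hand back.
    """
    split = []
    for row_key, group in groupby(results, key=lambda r: r[0]):
        chunks = [cols for _, cols in group]
        total = sum(len(c) for c in chunks)
        # A healthy row shows up once with all expected KVs; a split row
        # shows up as several short results that add up to the full row.
        if len(chunks) > 1 and total == expected:
            split.append(row_key)
    return split


def columns(families):
    """Build 'fN:data' / 'fN:qual' column names for the given family indexes."""
    return [f"f{i}:{q}" for i in families for q in ("data", "qual")]


scan = [
    ("row0000024460", columns(range(10))),
    ("row0000024461", columns(range(0, 5))),   # first half of the row
    ("row0000024461", columns(range(5, 10))),  # second half of the row
]
print(find_split_rows(scan))
```

A check of this shape only flags rows whose pieces are adjacent in scan order and sum to the full 20 KVs, which matches the symptom in the issue description; other corruption patterns would need a different test.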

I ran TestAcidGuarantees in a loop on the instrumented version.  It eventually 
failed :(

{code}
Tests in error:
  testScanAtomicity(org.apache.hadoop.hbase.TestAcidGuarantees): Deferred
  testMixedAtomicity(org.apache.hadoop.hbase.TestAcidGuarantees): org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@54697123 closed
{code}

With the instrumented version TestAcidGuarantees still fails. 
It took about 10 iterations before this happened.

{code}
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 127.479 sec
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 121.662 sec
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 117.508 sec
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 124.208 sec
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 121.513 sec
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 120.472 sec
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 117.869 sec
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 120.435 sec
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 118.946 sec
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0
Tests run: 3, Failures: 0, Errors: 2, Skipped: 0, Time elapsed: 85.81 sec <<< FAILURE!
Tests run: 3, Failures: 0, Errors: 2, Skipped: 0
{code}
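
When looping a test like this, the iteration at which it first breaks can be pulled out of the collected output with a small script. The following is a hedged Python sketch: it assumes each run prints one summary line containing "Time elapsed" (the totals lines without it are skipped), and the helper name `first_failing_iteration` and the abbreviated `sample` text are made up for illustration.

```python
import re

# Matches only the per-run summary lines, which carry a "Time elapsed" field.
SUMMARY = re.compile(
    r"Tests run: (\d+), Failures: (\d+), Errors: (\d+), Skipped: (\d+), Time elapsed"
)


def first_failing_iteration(log_text):
    """Return the 1-based index of the first run with failures or errors, or None."""
    iteration = 0
    for line in log_text.splitlines():
        m = SUMMARY.search(line)
        if m is None:
            continue  # totals line or unrelated output
        iteration += 1
        failures, errors = int(m.group(2)), int(m.group(3))
        if failures or errors:
            return iteration
    return None


# Abbreviated sample in the same format as the output above.
sample = """\
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 127.479 sec
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0
Tests run: 3, Failures: 0, Errors: 2, Skipped: 0, Time elapsed: 85.81 sec <<< FAILURE!
Tests run: 3, Failures: 0, Errors: 2, Skipped: 0
"""
print(first_failing_iteration(sample))  # 2
```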

                
> Scan ACID problem with concurrent puts.
> ---------------------------------------
>
>                 Key: HBASE-4570
>                 URL: https://issues.apache.org/jira/browse/HBASE-4570
>             Project: HBase
>          Issue Type: Bug
>          Components: client, regionserver
>    Affects Versions: 0.90.1, 0.90.3
>            Reporter: Jonathan Hsieh
>         Attachments: 4570-instrumentation.tgz, hbase-4570.tgz
>
>
> When scanning a table, rows that have multiple column families sometimes get 
> split into two rows if there are concurrent writes.  In this particular case 
> we are overwriting the contents of a Get directly back onto itself as a Put.
> For example, this is a ten cf row (with "f0", "f1", .. "f9" cfs).  It is 
> actually returned as two rows (#55 and #56). Interestingly, if the two were 
> merged we would have a single proper row.
> Row row0000024461 had time stamps: [55: 
> keyvalues={row0000024461/f0:data/1318200440867/Put/vlen=1000, 
> row0000024461/f0:qual/1318200440867/Put/vlen=10, 
> row0000024461/f1:data/1318200440867/Put/vlen=1000, 
> row0000024461/f1:qual/1318200440867/Put/vlen=10, 
> row0000024461/f2:data/1318200440867/Put/vlen=1000, 
> row0000024461/f2:qual/1318200440867/Put/vlen=10, 
> row0000024461/f3:data/1318200440867/Put/vlen=1000, 
> row0000024461/f3:qual/1318200440867/Put/vlen=10, 
> row0000024461/f4:data/1318200440867/Put/vlen=1000, 
> row0000024461/f4:qual/1318200440867/Put/vlen=10}, 
> 56: keyvalues={row0000024461/f5:data/1318200440867/Put/vlen=1000, 
> row0000024461/f5:qual/1318200440867/Put/vlen=10, 
> row0000024461/f6:data/1318200440867/Put/vlen=1000, 
> row0000024461/f6:qual/1318200440867/Put/vlen=10, 
> row0000024461/f7:data/1318200440867/Put/vlen=1000, 
> row0000024461/f7:qual/1318200440867/Put/vlen=10, 
> row0000024461/f8:data/1318200440867/Put/vlen=1000, 
> row0000024461/f8:qual/1318200440867/Put/vlen=10, 
> row0000024461/f9:data/1318200440867/Put/vlen=1000, 
> row0000024461/f9:qual/1318200440867/Put/vlen=10}]
> I've only tested this on 0.90.1+patches and 0.90.3+patches, but it is 
> consistent and reproducible.


        
