[jira] [Commented] (HBASE-4570) Scan ACID problem with concurrent puts.

2011-10-17 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13128867#comment-13128867
 ] 

Jonathan Hsieh commented on HBASE-4570:
---

@stack I've tested on trunk and a 0.90 branch, and the symptoms 
encountered with the testing programs are fixed.  It would be great to get this 
into 0.90, 0.92 and trunk.  Thanks!

 Scan ACID problem with concurrent puts.
 ---

 Key: HBASE-4570
 URL: https://issues.apache.org/jira/browse/HBASE-4570
 Project: HBase
  Issue Type: Bug
  Components: client, regionserver
Affects Versions: 0.90.1, 0.90.3
Reporter: Jonathan Hsieh
 Attachments: 4570-instrumentation.tgz, hbase-4570.tgz, hbase-4570.txt


 When scanning a table, rows that have multiple column families sometimes get 
 split into two rows if there are concurrent writes.  In this particular case 
 we are overwriting the contents of a Get directly back onto itself as a Put.
 For example, this is a ten cf row (cfs f0, f1, .. f9, two columns each).  It is 
 actually returned as two rows (#55 and #56). Interestingly, if the two were 
 merged we would have a single proper row.
 Row row024461 had time stamps: [55: 
 keyvalues={row024461/f0:data/1318200440867/Put/vlen=1000, 
 row024461/f0:qual/1318200440867/Put/vlen=10, 
 row024461/f1:data/1318200440867/Put/vlen=1000, 
 row024461/f1:qual/1318200440867/Put/vlen=10, 
 row024461/f2:data/1318200440867/Put/vlen=1000, 
 row024461/f2:qual/1318200440867/Put/vlen=10, 
 row024461/f3:data/1318200440867/Put/vlen=1000, 
 row024461/f3:qual/1318200440867/Put/vlen=10, 
 row024461/f4:data/1318200440867/Put/vlen=1000, 
 row024461/f4:qual/1318200440867/Put/vlen=10}, 
 56: keyvalues={row024461/f5:data/1318200440867/Put/vlen=1000, 
 row024461/f5:qual/1318200440867/Put/vlen=10, 
 row024461/f6:data/1318200440867/Put/vlen=1000, 
 row024461/f6:qual/1318200440867/Put/vlen=10, 
 row024461/f7:data/1318200440867/Put/vlen=1000, 
 row024461/f7:qual/1318200440867/Put/vlen=10, 
 row024461/f8:data/1318200440867/Put/vlen=1000, 
 row024461/f8:qual/1318200440867/Put/vlen=10, 
 row024461/f9:data/1318200440867/Put/vlen=1000, 
 row024461/f9:qual/1318200440867/Put/vlen=10}]
 I've only tested this on 0.90.1+patches and 0.90.3+patches, but it is 
 consistent and reproducible.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4570) Scan ACID problem with concurrent puts.

2011-10-17 Thread Todd Lipcon (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129026#comment-13129026
 ] 

Todd Lipcon commented on HBASE-4570:


Cool, I will commit this to 90, 92, and trunk momentarily.

 Scan ACID problem with concurrent puts.
 ---

 Key: HBASE-4570
 URL: https://issues.apache.org/jira/browse/HBASE-4570
 Project: HBase
  Issue Type: Bug
  Components: client, regionserver
Affects Versions: 0.90.1, 0.90.3
Reporter: Jonathan Hsieh
 Fix For: 0.90.5

 Attachments: 4570-instrumentation.tgz, hbase-4570.tgz, hbase-4570.txt






[jira] [Commented] (HBASE-4570) Scan ACID problem with concurrent puts.

2011-10-14 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13127711#comment-13127711
 ] 

Jonathan Hsieh commented on HBASE-4570:
---

Current experiments seem to indicate that Bytes.equals, when it uses the 
UNSAFE_COMPARER class, doesn't always tell the truth, and causes scan rows to 
get chopped up into two rows.  I've modified the code to use the PureJavaComparer 
and the described problem hasn't appeared yet (running for 30 mins or so).  

 Scan ACID problem with concurrent puts.
 ---

 Key: HBASE-4570
 URL: https://issues.apache.org/jira/browse/HBASE-4570
 Project: HBase
  Issue Type: Bug
  Components: client, regionserver
Affects Versions: 0.90.1, 0.90.3
Reporter: Jonathan Hsieh
 Attachments: hbase-4570.tgz






[jira] [Commented] (HBASE-4570) Scan ACID problem with concurrent puts.

2011-10-14 Thread Todd Lipcon (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13127725#comment-13127725
 ] 

Todd Lipcon commented on HBASE-4570:


woah, that's interesting... but I thought you could reproduce this on 0.90.4 
where the UNSAFE_COMPARER doesn't exist?

 Scan ACID problem with concurrent puts.
 ---

 Key: HBASE-4570
 URL: https://issues.apache.org/jira/browse/HBASE-4570
 Project: HBase
  Issue Type: Bug
  Components: client, regionserver
Affects Versions: 0.90.1, 0.90.3
Reporter: Jonathan Hsieh
 Attachments: hbase-4570.tgz






[jira] [Commented] (HBASE-4570) Scan ACID problem with concurrent puts.

2011-10-14 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13127774#comment-13127774
 ] 

Jonathan Hsieh commented on HBASE-4570:
---

The way this is set up, I can't tell if the problem will never happen, but I can 
detect if it ever does.

I'm still experimenting on trunk and will move to previous versions when I feel 
confident about this potential root cause.  I'm using a combo of HBASE-2856 on 
trunk and reverting to the java comparator -- it might be the combo of the two 
that is required. 


 Scan ACID problem with concurrent puts.
 ---

 Key: HBASE-4570
 URL: https://issues.apache.org/jira/browse/HBASE-4570
 Project: HBase
  Issue Type: Bug
  Components: client, regionserver
Affects Versions: 0.90.1, 0.90.3
Reporter: Jonathan Hsieh
 Attachments: hbase-4570.tgz






[jira] [Commented] (HBASE-4570) Scan ACID problem with concurrent puts.

2011-10-14 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13127780#comment-13127780
 ] 

Ted Yu commented on HBASE-4570:
---

@Jonathan:
Can you either post the combined patch or run TestAcidGuarantees in a loop?
Your findings may give us a clue for pushing 2856 forward.

Thanks

 Scan ACID problem with concurrent puts.
 ---

 Key: HBASE-4570
 URL: https://issues.apache.org/jira/browse/HBASE-4570
 Project: HBase
  Issue Type: Bug
  Components: client, regionserver
Affects Versions: 0.90.1, 0.90.3
Reporter: Jonathan Hsieh
 Attachments: hbase-4570.tgz






[jira] [Commented] (HBASE-4570) Scan ACID problem with concurrent puts.

2011-10-14 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13127887#comment-13127887
 ] 

Jonathan Hsieh commented on HBASE-4570:
---

@Ted

I have a strange situation where with just the fixes (first two patches, no 
instrumentation) I still get a lot of failures in my test setup.  However, 
with extra instrumentation the failures seem to go away (it runs a long time without 
encountering problems).  Note that in my table setup I have 10 cf's, each with 2 
cols, so the instrumentation is written to always expect 20 KVs.  I have two 
processes -- one that does a filtered scan and twiddle, and another that just 
does a filtered scan and count.

I ran TestAcidGuarantees in a loop on the instrumented version.  It eventually 
failed :(

{code}
Tests in error:
  testScanAtomicity(org.apache.hadoop.hbase.TestAcidGuarantees): Deferred
  testMixedAtomicity(org.apache.hadoop.hbase.TestAcidGuarantees): org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@54697123 closed
{code}

With the instrumented version TestAcidGuarantees still fails -- 
it took about 10 iterations before this happened.

{code}
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 127.479 sec
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 121.662 sec
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 117.508 sec
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 124.208 sec
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 121.513 sec
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 120.472 sec
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 117.869 sec
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 120.435 sec
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 118.946 sec
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0
Tests run: 3, Failures: 0, Errors: 2, Skipped: 0, Time elapsed: 85.81 sec  FAILURE!
Tests run: 3, Failures: 0, Errors: 2, Skipped: 0
{code}


 Scan ACID problem with concurrent puts.
 ---

 Key: HBASE-4570
 URL: https://issues.apache.org/jira/browse/HBASE-4570
 Project: HBase
  Issue Type: Bug
  Components: client, regionserver
Affects Versions: 0.90.1, 0.90.3
Reporter: Jonathan Hsieh
 Attachments: 4570-instrumentation.tgz, hbase-4570.tgz






[jira] [Commented] (HBASE-4570) Scan ACID problem with concurrent puts.

2011-10-14 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13127995#comment-13127995
 ] 

Ted Yu commented on HBASE-4570:
---

I reproduced the above test failure using patch for 4485 (including 2856) 
combined with 0002-Only-use-safe-java-comparator-don-t-use-sun.misc.Uns.patch

 Scan ACID problem with concurrent puts.
 ---

 Key: HBASE-4570
 URL: https://issues.apache.org/jira/browse/HBASE-4570
 Project: HBase
  Issue Type: Bug
  Components: client, regionserver
Affects Versions: 0.90.1, 0.90.3
Reporter: Jonathan Hsieh
 Attachments: 4570-instrumentation.tgz, hbase-4570.tgz






[jira] [Commented] (HBASE-4570) Scan ACID problem with concurrent puts.

2011-10-14 Thread Todd Lipcon (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13128008#comment-13128008
 ] 

Todd Lipcon commented on HBASE-4570:


Jon and I spent the afternoon with his test cases. We've found the issue - it's 
a nice one!

In KeyValue, we have the following code:
{code}
  public byte [] getRow() {
    if (rowCache == null) {
      int o = getRowOffset();
      short l = getRowLength();
      rowCache = new byte[l];
      System.arraycopy(getBuffer(), o, rowCache, 0, l);
    }
    return rowCache;
  }
{code}
which is called extensively by KeyValueHeaps throughout the scanner code. In 
the case of scanning MemStore, an individual KeyValue ends up as {{next}} in 
multiple MemStoreScanners. Then, if multiple threads call {{getRow}} at the 
same time, we see the following race:
- Thread 1 sees {{rowCache}} as null, and initializes {{rowCache = new 
byte[...]}}
- Thread 2 sees {{rowCache}} as non-null, and returns a byte array of all 0s
- Thread 1 initializes the row with {{arraycopy}}, and returns the right result

The byte array returned to Thread 2 is modified while it's working with it, so 
depending on the interleaving of events, it can cause an invalid heap, or 
invalid results, or a weird split row like Jon was seeing, etc.

The fix is pretty simple - we need to declare {{rowCache}} volatile, and 
initialize it in a temporary variable before overwriting the volatile 
reference. If this is too slow, we could use an AtomicFieldUpdater with 
{{lazySet}} to put the cost only on the write side, but I don't think it really 
matters.
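
A minimal sketch of the fix described above, under stated assumptions: CachedRowSketch is a hypothetical stand-in for KeyValue (its constructor and fields are invented for illustration), and the attached hbase-4570.txt patch may differ in detail. The array is filled through a local variable and only then stored into the volatile field, so another thread either sees null or sees a fully copied row.

```java
/** Hypothetical stand-in for KeyValue; only the row-cache logic is sketched. */
final class CachedRowSketch {
    private final byte[] buffer;   // backing bytes (assumed immutable after construction)
    private final int rowOffset;   // where the row starts inside buffer
    private final short rowLength; // number of row bytes

    // volatile: a reader that observes a non-null reference also observes
    // the bytes copied into the array before the reference was stored.
    private volatile byte[] rowCache;

    CachedRowSketch(byte[] buffer, int rowOffset, short rowLength) {
        this.buffer = buffer;
        this.rowOffset = rowOffset;
        this.rowLength = rowLength;
    }

    byte[] getRow() {
        byte[] local = rowCache;      // one volatile read
        if (local == null) {
            local = new byte[rowLength];
            System.arraycopy(buffer, rowOffset, local, 0, rowLength);
            rowCache = local;         // publish only after the copy is complete
        }
        return local;
    }
}
```

Two threads can still both take the null branch and each build an array, but both arrays hold the same bytes, so the duplicated work is harmless; no thread can get a partially filled array.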


 Scan ACID problem with concurrent puts.
 ---

 Key: HBASE-4570
 URL: https://issues.apache.org/jira/browse/HBASE-4570
 Project: HBase
  Issue Type: Bug
  Components: client, regionserver
Affects Versions: 0.90.1, 0.90.3
Reporter: Jonathan Hsieh
 Attachments: 4570-instrumentation.tgz, hbase-4570.tgz






[jira] [Commented] (HBASE-4570) Scan ACID problem with concurrent puts.

2011-10-14 Thread Todd Lipcon (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13128016#comment-13128016
 ] 

Todd Lipcon commented on HBASE-4570:


(btw, the unsafe comparator was probably just a red herring - it's faster, so 
the race is more likely, but I'm pretty sure the above is the true cause)

 Scan ACID problem with concurrent puts.
 ---

 Key: HBASE-4570
 URL: https://issues.apache.org/jira/browse/HBASE-4570
 Project: HBase
  Issue Type: Bug
  Components: client, regionserver
Affects Versions: 0.90.1, 0.90.3
Reporter: Jonathan Hsieh
 Attachments: 4570-instrumentation.tgz, hbase-4570.tgz, hbase-4570.txt






[jira] [Commented] (HBASE-4570) Scan ACID problem with concurrent puts.

2011-10-14 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13128031#comment-13128031
 ] 

Ted Yu commented on HBASE-4570:
---

Using patch for 4485 (including 2856, without variable length memstoreTS) 
combined with hbase-4570.txt, I still got:
{code}
Tests in error: 
  testScanAtomicity(org.apache.hadoop.hbase.TestAcidGuarantees): Deferred
  testMixedAtomicity(org.apache.hadoop.hbase.TestAcidGuarantees): Deferred
...
TestAcidGuarantees failed, iteration: 3
{code}
But this is some progress - previously TestAcidGuarantees failed every time.

 Scan ACID problem with concurrent puts.
 ---

 Key: HBASE-4570
 URL: https://issues.apache.org/jira/browse/HBASE-4570
 Project: HBase
  Issue Type: Bug
  Components: client, regionserver
Affects Versions: 0.90.1, 0.90.3
Reporter: Jonathan Hsieh
 Attachments: 4570-instrumentation.tgz, hbase-4570.tgz, hbase-4570.txt






[jira] [Commented] (HBASE-4570) Scan ACID problem with concurrent puts.

2011-10-14 Thread Jonathan Ellis (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13128033#comment-13128033
 ] 

Jonathan Ellis commented on HBASE-4570:
---

bq. we could use an AtomicFieldUpdater with lazySet to put the cost only on the 
write side

I thought that (a) ARFU requires that its target be volatile still, and (b) 
that the point of lazySet was to allow cheaper writes, with no effect on reads.
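
For reference, point (a) can be checked directly against the JDK: 
AtomicReferenceFieldUpdater.newUpdater is documented to throw 
IllegalArgumentException when the named field is not volatile. A minimal 
sketch of that check follows. The holder classes and the canCreateUpdater 
helper are hypothetical names made up for this illustration; none of this is 
HBase code.

```java
import java.util.concurrent.atomic.AtomicReferenceFieldUpdater;

// Hypothetical holder with a volatile field (accepted by newUpdater).
class VolatileHolder {
    volatile byte[] cached;
}

// Hypothetical holder whose field is deliberately NOT volatile.
class PlainHolder {
    byte[] cached;
}

class UpdaterVolatileCheck {
    /** Returns true if an updater can be created for the named byte[] field. */
    static <T> boolean canCreateUpdater(Class<T> holder, String field) {
        try {
            AtomicReferenceFieldUpdater.newUpdater(holder, byte[].class, field);
            return true;
        } catch (IllegalArgumentException e) {
            // The JDK rejects target fields that are not declared volatile.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("volatile field: "
            + canCreateUpdater(VolatileHolder.class, "cached"));
        System.out.println("plain field:    "
            + canCreateUpdater(PlainHolder.class, "cached"));
    }
}
```

So any read of the target field is still a volatile read; lazySet only makes 
the store cheaper.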





[jira] [Commented] (HBASE-4570) Scan ACID problem with concurrent puts.

2011-10-14 Thread Todd Lipcon (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13128065#comment-13128065
 ] 

Todd Lipcon commented on HBASE-4570:
---

bq. I thought that (a) ARFU requires that its target be volatile still, and (b) 
that the point of lazySet was to allow cheaper writes, with no effect on reads.

I don't think it requires a volatile target - it just treats the target as 
having part of the volatile semantics for the particular update in question. 
The trick here would be that we don't need an up-to-date read whenever we read 
the field in order for lazy initialization to work. If a second thread 
recomputes the same array copy, that's fine. We only need to make sure that the 
writes happen in the correct order (ie the reference to the byte array isn't 
published before the byte array itself has been copied)
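
The lazy-initialization idea described above could be sketched roughly as 
follows. This is only an illustration under stated assumptions: LazyRowCache, 
rowCache and getRow are hypothetical names, not the actual HBase code or the 
attached patch. One caveat that is certain: the JDK's 
AtomicReferenceFieldUpdater.newUpdater rejects non-volatile fields, so the 
field is declared volatile here. Reads therefore remain volatile reads, and 
lazySet only relaxes the store.

```java
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicReferenceFieldUpdater;

// Hypothetical class for illustration; names are not taken from HBase.
class LazyRowCache {
    private static final AtomicReferenceFieldUpdater<LazyRowCache, byte[]> ROW_UPDATER =
        AtomicReferenceFieldUpdater.newUpdater(LazyRowCache.class, byte[].class, "rowCache");

    private final byte[] buffer;
    private final int offset;
    private final int length;

    // Must be volatile, otherwise newUpdater throws IllegalArgumentException.
    private volatile byte[] rowCache;

    LazyRowCache(byte[] buffer, int offset, int length) {
        this.buffer = buffer;
        this.offset = offset;
        this.length = length;
    }

    /** Returns a copy of the row bytes, computing and caching it on first use. */
    byte[] getRow() {
        byte[] row = rowCache;
        if (row == null) {
            // Finish copying the bytes first...
            row = Arrays.copyOfRange(buffer, offset, offset + length);
            // ...then publish the reference. lazySet is an ordered store, so
            // the array contents are not reordered after the publication.
            ROW_UPDATER.lazySet(this, row);
        }
        return row;
    }
}
```

If two threads race, both may copy the bytes and either reference may end up 
cached. That is harmless here because the copies have identical contents.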





[jira] [Commented] (HBASE-4570) Scan ACID problem with concurrent puts.

2011-10-14 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13128073#comment-13128073
 ] 

stack commented on HBASE-4570:
--

@Jon and Todd -- Nice find.  I'm +1 on applying the patch as is.  If 'too slow', we 
can come back around later with AtomicFieldUpdater kung-fu.





[jira] [Commented] (HBASE-4570) Scan ACID problem with concurrent puts.

2011-10-13 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13126369#comment-13126369
 ] 

Jonathan Hsieh commented on HBASE-4570:
---

Ran the unit test version of this test and it did not fail as the separate 
programs did after 3-4 hours.



 Scan ACID problem with concurrent puts.
 ---

 Key: HBASE-4570
 URL: https://issues.apache.org/jira/browse/HBASE-4570
 Project: HBase
  Issue Type: Bug
  Components: client, regionserver
Affects Versions: 0.90.1, 0.90.3
Reporter: Jonathan Hsieh
 Attachments: hbase-4570.tgz






[jira] [Commented] (HBASE-4570) Scan ACID problem with concurrent puts.

2011-10-13 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13126644#comment-13126644
 ] 

stack commented on HBASE-4570:
--

Good on you Jon.  Keep digging (smile).





[jira] [Commented] (HBASE-4570) Scan ACID problem with concurrent puts.

2011-10-13 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13127162#comment-13127162
 ] 

Jonathan Hsieh commented on HBASE-4570:
---

I can still run these and see ACID failures on today's trunk at git hash 
b45dfec.  

I've also tried a build that applies HBASE-2856 v11 
(https://reviews.apache.org/r/2224/diff/#index_header); it still has the 
same problem.  








[jira] [Commented] (HBASE-4570) Scan ACID problem with concurrent puts.

2011-10-12 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13126001#comment-13126001
 ] 

Jonathan Hsieh commented on HBASE-4570:
---

rephrase: I have not been able to duplicate this in a unit test yet.  

This test scenario seems similar to TestAcidGuarantees (HBASE-2856), but it uses 
filters and is a little more focused on this particular symptom.
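
To make the symptom concrete, here is a rough sketch of the kind of check a 
scanning client could apply. It assumes the scan results have already been 
reduced to their row keys, in the order the scanner returned them. 
SplitRowDetector and findSplitRows are hypothetical names; this is not the 
code in hbase-4570.tgz.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; not part of HBase or of the attached test programs.
class SplitRowDetector {
    /**
     * Returns each row key that shows up in adjacent scan results.
     * A correct scan returns every row exactly once, so any hit means
     * a row was split across results.
     */
    static List<String> findSplitRows(List<String> rowKeysInScanOrder) {
        List<String> split = new ArrayList<>();
        String previous = null;
        for (String key : rowKeysInScanOrder) {
            boolean repeatsPrevious = key.equals(previous);
            boolean alreadyReported =
                !split.isEmpty() && split.get(split.size() - 1).equals(key);
            if (repeatsPrevious && !alreadyReported) {
                split.add(key);
            }
            previous = key;
        }
        return split;
    }

    public static void main(String[] args) {
        List<String> keys =
            Arrays.asList("row024460", "row024461", "row024461", "row024462");
        System.out.println(findSplitRows(keys));
    }
}
```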
