[jira] [Updated] (HBASE-4570) Scan ACID problem with concurrent puts.

2011-10-17 Thread Ted Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-4570:
--

Fix Version/s: 0.90.5

 Scan ACID problem with concurrent puts.
 ---

 Key: HBASE-4570
 URL: https://issues.apache.org/jira/browse/HBASE-4570
 Project: HBase
  Issue Type: Bug
  Components: client, regionserver
Affects Versions: 0.90.1, 0.90.3
Reporter: Jonathan Hsieh
 Fix For: 0.90.5

 Attachments: 4570-instrumentation.tgz, hbase-4570.tgz, hbase-4570.txt


 When scanning a table, rows that have multiple column families are sometimes 
 split into two rows if there are concurrent writes.  In this particular case 
 we are writing the contents of a Get directly back onto the same row as a Put.
 For example, this is a row with ten cfs (f0, f1, .. f9).  It is 
 actually returned as two rows (#55 and #56). Interestingly, if the two were 
 merged we would have a single proper row.
 Row row024461 had time stamps: [55: 
 keyvalues={row024461/f0:data/1318200440867/Put/vlen=1000, 
 row024461/f0:qual/1318200440867/Put/vlen=10, 
 row024461/f1:data/1318200440867/Put/vlen=1000, 
 row024461/f1:qual/1318200440867/Put/vlen=10, 
 row024461/f2:data/1318200440867/Put/vlen=1000, 
 row024461/f2:qual/1318200440867/Put/vlen=10, 
 row024461/f3:data/1318200440867/Put/vlen=1000, 
 row024461/f3:qual/1318200440867/Put/vlen=10, 
 row024461/f4:data/1318200440867/Put/vlen=1000, 
 row024461/f4:qual/1318200440867/Put/vlen=10}, 
 56: keyvalues={row024461/f5:data/1318200440867/Put/vlen=1000, 
 row024461/f5:qual/1318200440867/Put/vlen=10, 
 row024461/f6:data/1318200440867/Put/vlen=1000, 
 row024461/f6:qual/1318200440867/Put/vlen=10, 
 row024461/f7:data/1318200440867/Put/vlen=1000, 
 row024461/f7:qual/1318200440867/Put/vlen=10, 
 row024461/f8:data/1318200440867/Put/vlen=1000, 
 row024461/f8:qual/1318200440867/Put/vlen=10, 
 row024461/f9:data/1318200440867/Put/vlen=1000, 
 row024461/f9:qual/1318200440867/Put/vlen=10}]
 I've only tested this on 0.90.1+patches and 0.90.3+patches, but it is 
 consistent and reproducible.
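
 A minimal sketch of how a client-side check for this symptom might look. It is
 not taken from the attached programs. It assumes the scan results have already
 been reduced to (row key, cell list) pairs in scan order; the function names
 and the neighbouring rows row024460 and row024462 are hypothetical. Only
 row024461 and its f0-f4 / f5-f9 split come from the output above.

```python
from itertools import groupby


def find_split_rows(results):
    """Return {row_key: [cells, cells, ...]} for rows seen in more than one
    consecutive scan result. `results` is an iterable of (row_key, cells)
    pairs in scan order; a correct scan yields each row key exactly once."""
    split = {}
    for row_key, group in groupby(results, key=lambda r: r[0]):
        fragments = [cells for _, cells in group]
        if len(fragments) > 1:
            split[row_key] = fragments
    return split


def merge_fragments(fragments):
    """Concatenate the cell lists of one split row, preserving scan order."""
    merged = []
    for cells in fragments:
        merged.extend(cells)
    return merged


def cells_for(families):
    """Build the "family:qualifier" names used in the report (data and qual)."""
    return [f"f{i}:{q}" for i in families for q in ("data", "qual")]


# row024461 is split as in the report; the neighbouring rows are made up.
scan_results = [
    ("row024460", cells_for(range(10))),
    ("row024461", cells_for(range(0, 5))),   # result #55: f0..f4
    ("row024461", cells_for(range(5, 10))),  # result #56: f5..f9
    ("row024462", cells_for(range(10))),
]

for row_key, fragments in find_split_rows(scan_results).items():
    merged = merge_fragments(fragments)
    print(row_key, "returned as", len(fragments), "results;",
          len(merged), "cells after merging")
```

 Grouping only consecutive results is deliberate: a scan returns rows in key
 order, so the two halves of a split row arrive next to each other, as #55 and
 #56 do here.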

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-4570) Scan ACID problem with concurrent puts.

2011-10-14 Thread Jonathan Hsieh (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hsieh updated HBASE-4570:
--

Attachment: 4570-instrumentation.tgz

4570-instrumentation.tgz includes a few incremental patches -- the first 
applies v11 of HBASE-2856, the second comments out the use of sun.misc.Unsafe, 
and the others add instrumentation around the RS's internal scanner's next and 
row delimiting functions.  

 Scan ACID problem with concurrent puts.
 ---

 Key: HBASE-4570
 URL: https://issues.apache.org/jira/browse/HBASE-4570
 Project: HBase
  Issue Type: Bug
  Components: client, regionserver
Affects Versions: 0.90.1, 0.90.3
Reporter: Jonathan Hsieh
 Attachments: 4570-instrumentation.tgz, hbase-4570.tgz






[jira] [Updated] (HBASE-4570) Scan ACID problem with concurrent puts.

2011-10-14 Thread Todd Lipcon (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HBASE-4570:
---

Attachment: hbase-4570.txt

 Scan ACID problem with concurrent puts.
 ---

 Key: HBASE-4570
 URL: https://issues.apache.org/jira/browse/HBASE-4570
 Project: HBase
  Issue Type: Bug
  Components: client, regionserver
Affects Versions: 0.90.1, 0.90.3
Reporter: Jonathan Hsieh
 Attachments: 4570-instrumentation.tgz, hbase-4570.tgz, hbase-4570.txt






[jira] [Updated] (HBASE-4570) Scan ACID problem with concurrent puts.

2011-10-12 Thread Jonathan Hsieh (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hsieh updated HBASE-4570:
--

Attachment: hbase-4570.tgz

I've attached a file with some standalone programs that generate data, 
scan+count, and scan+twiddle.  It includes instructions on how to duplicate the 
problem.

I've tried duplicating the problem in a unit test but have not been able to 
reproduce it as reliably.

 Scan ACID problem with concurrent puts.
 ---

 Key: HBASE-4570
 URL: https://issues.apache.org/jira/browse/HBASE-4570
 Project: HBase
  Issue Type: Bug
  Components: client, regionserver
Affects Versions: 0.90.1, 0.90.3
Reporter: Jonathan Hsieh
 Attachments: hbase-4570.tgz

