date:20111006

2011-10-06 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-4536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13121762#comment-13121762
]

Lars Hofhansl commented on HBASE-4536:
--

That might get hard to understand. minVersions would have a slightly
different meaning depending on whether the extra flag is set. Without the flag,
minVersions acts like maxVersions for deleted rows; I'm not sure who would need
that.

Having just the flag for deleted rows would also make the code easier to follow.
The column tracker would no longer need to distinguish between normal rows,
delete markers, and deleted rows as it does in the current patch, but only
between rows (deleted or not) and delete markers.

Allow CF to retain deleted rows
---

Key: HBASE-4536
URL: https://issues.apache.org/jira/browse/HBASE-4536
Project: HBase
Issue Type: Sub-task
Components: regionserver
Affects Versions: 0.92.0
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
Fix For: 0.92.0, 0.94.0

Parent allows for a cluster to retain rows for a TTL or keep a minimum number
of versions.
However, if a client deletes a row, all versions older than the delete
tombstone will be removed at the next major compaction (and even at memstore
flush - see HBASE-4241).
There should be a way to retain those versions to guard against software error.
I see two options here:
1. Add a new flag to HColumnDescriptor. Something like RETAIN_DELETED.
2. Fold this into the parent change, i.e. keep the minimum number of versions
even past the delete marker.
#1 would allow for more flexibility. #2 comes somewhat naturally with parent
(from a user viewpoint).
Comments? Any other options?
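
The difference between the two options can be illustrated with a minimal sketch. This is not HBase code: the class RetentionSketch, the method keepAtCompaction, and all of its parameters are hypothetical names, and the sketch assumes that the option 1 flag would keep every deleted version, which the issue does not actually specify.

```java
// Hypothetical sketch, not HBase code: decides whether one cell version
// survives a major compaction under the two options discussed in the issue.
final class RetentionSketch {
    /**
     * @param coveredByDelete     true if a newer delete marker covers this version
     * @param versionIndex        0 for the newest version of the cell, 1 for the next, ...
     * @param minVersions         minimum number of versions to keep (parent issue)
     * @param retainDeleted       option 1: a RETAIN_DELETED-style flag (assumed semantics)
     * @param foldIntoMinVersions option 2: minVersions also applies past delete markers
     */
    static boolean keepAtCompaction(boolean coveredByDelete, int versionIndex,
                                    int minVersions, boolean retainDeleted,
                                    boolean foldIntoMinVersions) {
        if (!coveredByDelete) {
            return true; // TTL and max-version handling are out of scope here
        }
        if (retainDeleted) {
            return true; // option 1: deleted versions are kept regardless of count
        }
        // option 2: keep only the newest minVersions versions behind the marker
        return foldIntoMinVersions && versionIndex < minVersions;
    }
}
```

With neither option enabled, every version behind the delete marker is dropped, which is the behaviour the issue wants to guard against.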

[jira] [Updated] (HBASE-4528) The put operation can release the rowlock before sync-ing the Hlog

2011-10-06 Thread dhruba borthakur (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-4528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated HBASE-4528:


Attachment: appendNoSyncPut3.txt

1. The flush of memstore waits for current transactions to quiesce before 
committing the flushed files. This should address the problem pointed out by 
Kannan.

2. Hlog.syncer() does not throw an exception; instead it causes the
regionserver to exit if it is unable to sync to hdfs. The assumption here is
that if hbase is unable to write/sync to hdfs, then the simplest and correct
error recovery is to exit. (For example, if the memstore flush fails, the
regionserver exits.)



 The put operation can release the rowlock before sync-ing the Hlog
 --

 Key: HBASE-4528
 URL: https://issues.apache.org/jira/browse/HBASE-4528
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Reporter: dhruba borthakur
Assignee: dhruba borthakur
 Attachments: appendNoSyncPut1.txt, appendNoSyncPut2.txt, 
 appendNoSyncPut3.txt


 This allows for better throughput when there are hot rows. A single row 
 update improves from 100 puts/sec/server to 5000 puts/sec/server.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-4528) The put operation can release the rowlock before sync-ing the Hlog

[
https://issues.apache.org/jira/browse/HBASE-4528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13121789#comment-13121789
]

jirapos...@reviews.apache.org commented on HBASE-4528:
--

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2141/
---

(Updated 2011-10-06 08:08:49.288861)

Review request for hbase.

Changes
---

1. The flush of memstore waits for current transactions to quiesce before
committing the flushed files. This should address the problem pointed out by
Kannan.

2. Hlog.syncer() does not throw an exception; instead it causes the
regionserver to exit if it is unable to sync to hdfs. The assumption here is
that if hbase is unable to write/sync to hdfs, then the simplest and correct
error recovery is to exit. (For example, if the memstore flush fails, the
regionserver exits.)

Summary
---

This changes the multiPut operation so that the sync to the wal occurs outside
the rowlock.
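
A rough sketch of the locking pattern, under stated assumptions: this is not the HRegion or HLog code from the patch. The class AppendThenSyncSketch and its methods are hypothetical, an in-memory list stands in for the WAL buffer, and clearing it stands in for a sync to hdfs. It shows only the ordering (append under the row lock, sync after releasing it); it does not cover when the edit becomes visible to readers.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch, not the HRegion/HLog code from the patch.
final class AppendThenSyncSketch {
    private final ReentrantLock rowLock = new ReentrantLock();
    private final List<String> walBuffer = new ArrayList<>(); // appended, not yet durable
    private long appended = 0; // sequence number of the last appended edit
    private long synced = 0;   // sequence number of the last durable edit

    /** Appends under the row lock, then syncs after the lock is released. */
    void put(String edit) {
        long mySeq;
        rowLock.lock();
        try {
            mySeq = append(edit); // cheap: in-memory append only
        } finally {
            rowLock.unlock(); // other writers to the same row may proceed now
        }
        syncUpTo(mySeq); // expensive: done without holding the row lock
    }

    private synchronized long append(String edit) {
        walBuffer.add(edit);
        return ++appended;
    }

    /** One sync makes every edit appended so far durable. */
    private synchronized void syncUpTo(long seq) {
        if (synced >= seq) {
            return; // a concurrent caller's sync already covered this edit
        }
        walBuffer.clear(); // stand-in for flushing the buffer to hdfs
        synced = appended;
    }

    synchronized long syncedSeq() {
        return synced;
    }

    boolean rowLockHeldByCurrentThread() {
        return rowLock.isHeldByCurrentThread();
    }
}
```

Because writers to a hot row no longer wait for each other's syncs while holding the lock, several appended edits can be covered by a single sync, which is one plausible source of the throughput gain reported in the issue.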

This enhancement is done only to HRegion.mut(Put[]) because this is the only
method that gets invoked from an application. The HRegion.put(Put) is used only
by unit tests and should possibly be deprecated.

I have attached a unit test. I have not yet run all unit tests, but early
feedback on this patch will be very helpful.

This addresses bug HBASE-4528.
https://issues.apache.org/jira/browse/HBASE-4528

Diffs (updated)
-

/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1179529

/src/main/java/org/apache/hadoop/hbase/regionserver/ReadWriteConsistencyControl.java
1179529
/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1179529
/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFlusher.java 1179529
/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java 1179529
/src/test/java/org/apache/hadoop/hbase/regionserver/TestParallelPut.java
PRE-CREATION
/src/test/java/org/apache/hadoop/hbase/regionserver/TestStore.java 1179529

Diff: https://reviews.apache.org/r/2141/diff

Testing
---

Not yet run the full suite of unit tests.

Thanks,

Dhruba


[jira] [Updated] (HBASE-1744) Thrift server to match the new java api.


 [ 
https://issues.apache.org/jira/browse/HBASE-1744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Sell updated HBASE-1744:


Attachment: HBASE-1744.4.patch

Added patch with Bob Copeland's fixes and updated to trunk

 Thrift server to match the new java api.
 

 Key: HBASE-1744
 URL: https://issues.apache.org/jira/browse/HBASE-1744
 Project: HBase
  Issue Type: Improvement
  Components: thrift
Reporter: Tim Sell
Assignee: Lars Francke
Priority: Critical
 Fix For: 0.94.0

 Attachments: HBASE-1744.2.patch, HBASE-1744.3.patch, 
 HBASE-1744.4.patch, HBASE-1744.preview.1.patch, thriftexperiment.patch


 This mutateRows, etc. is a little confusing compared to the new, cleaner Java
 client.
 Thinking of ways to make a thrift client that is just as elegant. Something
 like:
 void put(1:Bytes table, 2:TPut put) throws (1:IOError io)
 with:
 struct TColumn {
   1:Bytes family,
   2:Bytes qualifier,
   3:i64 timestamp
 }
 struct TPut {
   1:Bytes row,
   2:map<TColumn, Bytes> values
 }
 This creates more verbose rpc than if the columns in TPut were just
 map<Bytes, map<Bytes, Bytes>>, but that is harder to fit timestamps into and
 still be intuitive from, say, Python.
 Presumably the goal of a thrift gateway is to be easy first.
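
To make the shape of the proposed structs concrete, here is a hypothetical Java mirror of them. It is not generated Thrift code: TPutSketch is an invented name, and Strings stand in for Bytes so that map keys compare by value. It shows how keying the map by TColumn lets two timestamps of the same column coexist in one put.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical Java mirror of the proposed Thrift structs (not generated code).
// Strings stand in for Bytes so that keys have value equality.
final class TPutSketch {
    record TColumn(String family, String qualifier, long timestamp) {}

    record TPut(String row, Map<TColumn, String> values) {}

    static TPut example() {
        Map<TColumn, String> values = new LinkedHashMap<>();
        values.put(new TColumn("cf", "name", 1L), "alice");
        // Same family and qualifier, newer timestamp: a distinct map key.
        values.put(new TColumn("cf", "name", 2L), "bob");
        return new TPut("row1", values);
    }
}
```

With the nested map<Bytes, map<Bytes, Bytes>> alternative there is no natural slot for the timestamp, which is the trade-off the description points out.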


[jira] [Updated] (HBASE-1744) Thrift server to match the new java api.

2011-10-06 Thread jirapos...@reviews.apache.org (Commented) (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-1744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Sell updated HBASE-1744:


Assignee: Tim Sell  (was: Lars Francke)
  Status: Patch Available  (was: Open)


[jira] [Commented] (HBASE-1744) Thrift server to match the new java api.

2011-10-06 Thread Tim Sell (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-1744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13121851#comment-13121851
 ] 

Tim Sell commented on HBASE-1744:
-

Thanks Bob, 
my patch still doesn't have tests, looking at that now.


[jira] [Commented] (HBASE-4528) The put operation can release the rowlock before sync-ing the Hlog


[ 
https://issues.apache.org/jira/browse/HBASE-4528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13121968#comment-13121968
 ] 

jirapos...@reviews.apache.org commented on HBASE-4528:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2141/#review2390
---



/src/main/java/org/apache/hadoop/hbase/regionserver/ReadWriteConsistencyControl.java
https://reviews.apache.org/r/2141/#comment5491

If advanceMemstore() returns true above, can we skip this call?


- Ted






[jira] [Commented] (HBASE-4482) Race Condition Concerning Eviction in SlabCache


[ 
https://issues.apache.org/jira/browse/HBASE-4482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13122034#comment-13122034
 ] 

Jonathan Gray commented on HBASE-4482:
--

+1 on keeping this in 0.92 regardless of stability and marking as experimental.

 Race Condition Concerning Eviction in SlabCache
 ---

 Key: HBASE-4482
 URL: https://issues.apache.org/jira/browse/HBASE-4482
 Project: HBase
  Issue Type: Sub-task
Reporter: Li Pi
Assignee: Li Pi
Priority: Blocker
 Fix For: 0.92.0

 Attachments: hbase-4482v1.txt, hbase-4482v2.txt, hbase-4482v4.2.txt





[jira] [Commented] (HBASE-1744) Thrift server to match the new java api.


[ 
https://issues.apache.org/jira/browse/HBASE-1744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13122048#comment-13122048
 ] 

Ted Yu commented on HBASE-1744:
---

I applied patch v4 and got the following:
http://pastebin.com/rmpXff7m


[jira] [Updated] (HBASE-1744) Thrift server to match the new java api.


 [ 
https://issues.apache.org/jira/browse/HBASE-1744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Sell updated HBASE-1744:


Status: Open  (was: Patch Available)

Weird. It works for me; it turns out I am compiling with thrift 0.6.1 and
hbase is using 0.7.

Cancelling the patch.

I'll stick a new one up in a bit.


[jira] [Updated] (HBASE-1744) Thrift server to match the new java api.

2011-10-06 Thread jirapos...@reviews.apache.org (Commented) (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-1744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Sell updated HBASE-1744:


Attachment: HBASE-1744.5.patch

Added a patch whose thrift2 generated files were generated with thrift 0.7.0.

It also has a few tests.



[jira] [Commented] (HBASE-4469) Avoid top row seek by looking up bloomfilter

[
https://issues.apache.org/jira/browse/HBASE-4469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13122075#comment-13122075
]

jirapos...@reviews.apache.org commented on HBASE-4469:
--

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2235/
---

Review request for hbase.

Summary
---

The problem is that when seeking for the row/col in the hfile, we go to the
top of the row in order to check for a row delete marker (delete family).
However, if the bloomfilter is enabled for the column family, then when a
delete family operation is done on a row, the row is already added to the
bloomfilter.
We can take advantage of this fact to avoid seeking to the top of the row.
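
A minimal sketch of the idea, with assumptions labeled: this is not the StoreFileScanner change from the diff. DeleteFamilySeekSketch and its methods are hypothetical, and a HashSet stands in for the Bloom filter. A real Bloom filter may report false positives, which would only cost an extra seek, but it never reports false negatives, so skipping the seek when the row is absent is safe.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: a HashSet stands in for the Bloom filter of one hfile.
final class DeleteFamilySeekSketch {
    private final Set<String> bloom = new HashSet<>();
    int topOfRowSeeks = 0;

    /** A delete-family operation adds the row itself to the filter. */
    void recordDeleteFamily(String row) {
        bloom.add(row);
    }

    /** Seeks to the top of the row only if a delete-family marker may exist. */
    void seekToColumn(String row, String column) {
        if (bloom.contains(row)) {
            topOfRowSeeks++; // the marker at the top of the row must be checked
        }
        // The direct seek to row/column would follow here (omitted).
    }
}
```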

Also, update the TestBlocksRead unit tests, since most of the block read counts
have dropped to lower numbers.

Evaluation:
In TestSeekingOptimization, it previously saved 31.6% of seek operations.
Now it saves about 41.82% of seek operations,
i.e. about 10 percentage points more.

==
Before this diff:
For bloom=ROWCOL, compr=GZ total seeks without optimization: 2506, with
optimization: 1714 (68.40%), savings: 31.60%

=
Apply this diff:
For bloom=ROWCOL, compr=GZ total seeks without optimization: 2506, with
optimization: 1458 (58.18%), savings: 41.82%
=
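
The savings percentages can be re-derived from the raw seek counts above. The helper below is only an illustration; SeekSavings and savingsPercent are names introduced here, not part of the test.

```java
// Illustrative helper (hypothetical name): recomputes the reported savings.
final class SeekSavings {
    /** Percentage of seeks avoided, given counts without and with the optimization. */
    static double savingsPercent(int withoutOpt, int withOpt) {
        return 100.0 * (withoutOpt - withOpt) / withoutOpt;
    }
}
```

For the figures quoted, savingsPercent(2506, 1714) is about 31.60 and savingsPercent(2506, 1458) is about 41.82, a difference of roughly 10.2 percentage points.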

Thanks to Mikhail and Kannan for their help and discussion.

This addresses bug HBASE-4469.
https://issues.apache.org/jira/browse/HBASE-4469

Diffs
-

src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 7b0b9e6
src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java
8dd8a68
src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java
abccea4

Diff: https://reviews.apache.org/r/2235/diff

Testing
---

Ran all the unit tests.
Two unit tests fail both with and without my change:
TestDistributedLogSplitting
TestHTablePool

Thanks,

Liyin

Avoid top row seek by looking up bloomfilter

Key: HBASE-4469
URL: https://issues.apache.org/jira/browse/HBASE-4469
Project: HBase
Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang

The problem is that when seeking for the row/col in the hfile, we go to the
top of the row in order to check for a row delete marker (delete family).
However, if the bloomfilter is enabled for the column family, then when a
delete family operation is done on a row, the row is already added to the
bloomfilter. We can take advantage of this fact to avoid seeking to the top
of the row.

[jira] [Commented] (HBASE-1744) Thrift server to match the new java api.


[ 
https://issues.apache.org/jira/browse/HBASE-1744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13122083#comment-13122083
 ] 

Ted Yu commented on HBASE-1744:
---

TestThriftHBaseServiceHandler passed for patch v5.

Can we change the wording for:
{code}
+  echo   thrift2  run the new HBase Thrift server
{code}
I believe there would be newer Thrift server down the road :-)

Also, experience using thrift2 would be helpful for other users.


[jira] [Commented] (HBASE-4469) Avoid top row seek by looking up bloomfilter

2011-10-06 Thread jirapos...@reviews.apache.org (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-4469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13122088#comment-13122088
 ] 

Ted Yu commented on HBASE-4469:
---

I don't see TestBlocksRead in the latest review.


[jira] [Commented] (HBASE-4540) OpenedRegionHandler is not enforcing atomicity of the operation it is performing


[ 
https://issues.apache.org/jira/browse/HBASE-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13122106#comment-13122106
 ] 

jirapos...@reviews.apache.org commented on HBASE-4540:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2251/
---

Review request for hbase, Ted Yu, Michael Stack, and Jonathan Gray.


Summary
---

Fix for handling HBASE-4539 and HBASE-4540.
Ran all the testcases.  Added one new testcase to verify OpenedRegionHandler 
scenarios.
Also addresses Ted's comments.


This addresses bug HBASE-4540.
https://issues.apache.org/jira/browse/HBASE-4540


Diffs
-

  
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java 1179238
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/handler/OpenedRegionHandler.java 1179238
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKAssign.java 1179238
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java 1179238
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestOpenedRegionHandler.java PRE-CREATION

Diff: https://reviews.apache.org/r/2251/diff


Testing
---

Yes


Thanks,

ramkrishna



 OpenedRegionHandler is not enforcing atomicity of the operation it is 
 performing
 

 Key: HBASE-4540
 URL: https://issues.apache.org/jira/browse/HBASE-4540
 Project: HBase
  Issue Type: Bug
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Attachments: HBASE-4540_1.patch


 - OpenedRegionHandler has not yet deleted the znode of the region R1 opened
 by RS1.
 - RS1 goes down.
 - ServerShutdownHandler assigns the region R1 to RS2.
 - The znode of R1 is moved to OFFLINE state by master or OPENING state by
 RS2 if RS2 has started opening the region.
 - Now the first OpenedRegionHandler tries to delete the znode, thinking it is
 in OPENED state, but fails.
 - Though it fails, it removes the node from RIT and adds RS1 as the owner of
 R1 in master's memory.
 - Now when RS2 completes opening the region, the master is not able to open
 the region as the region has already been deleted from RIT.
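
 The sequence above suggests that the in-memory update should happen only if the znode delete succeeds. The following is a minimal sketch of that rule and not the attached patch: OpenedHandlerSketch and its members are hypothetical, and plain in-memory maps stand in for ZooKeeper znodes and for the master's RIT bookkeeping.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch, not the patch: maps stand in for ZooKeeper znodes and
// for the master's regions-in-transition (RIT) and ownership bookkeeping.
final class OpenedHandlerSketch {
    final Map<String, String> znodeState = new HashMap<>(); // region -> znode state
    final Set<String> regionsInTransition = new HashSet<>();
    final Map<String, String> owner = new HashMap<>();      // region -> server

    /** Deletes the znode only if it is still in the expected state. */
    boolean deleteIfInState(String region, String expected) {
        return znodeState.remove(region, expected);
    }

    /** Updates master memory only when the znode delete actually succeeded. */
    boolean handleOpened(String region, String server) {
        if (!deleteIfInState(region, "OPENED")) {
            return false; // znode was reassigned; leave RIT and ownership untouched
        }
        regionsInTransition.remove(region);
        owner.put(region, server);
        return true;
    }
}
```

 In the scenario described, the stale handler for RS1 would find the znode in OPENING state, return without touching RIT, and leave the later OPENED transition from RS2 free to complete.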
 {code}
 Master
 ==
 2011-10-05 20:49:45,301 INFO 
 org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Finished 
 processing of shutdown of linux146,60020,1317827727647
 2011-10-05 20:49:54,177 DEBUG org.apache.hadoop.hbase.master.HMaster: Not 
 running balancer because 1 region(s) in transition: 
 {3e69d628a8bd8e9b7c5e7a2a6e03aad9=t1,,1317827883842.3e69d628a8bd8e9b7c5e7a2a6e03aad9.
  state=PENDING_OPEN, ts=1317827985272, server=linux76,60020,1317827746847}
 2011-10-05 20:49:57,720 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling 
 transition=M_ZK_REGION_OFFLINE, server=linux76,6,1317827742012, 
 region=3e69d628a8bd8e9b7c5e7a2a6e03aad9
 2011-10-05 20:50:14,501 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:6-0x132d3dc13090023 Deleting existing unassigned node for 
 3e69d628a8bd8e9b7c5e7a2a6e03aad9 that is in expected state RS_ZK_REGION_OPENED
 2011-10-05 20:50:14,505 WARN org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:6-0x132d3dc13090023 Attempting to delete unassigned node 
 3e69d628a8bd8e9b7c5e7a2a6e03aad9 in RS_ZK_REGION_OPENED state but node is in 
 RS_ZK_REGION_OPENING state
 After the region is opened in RS2
 =
 2011-10-05 20:50:48,066 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling 
 transition=RS_ZK_REGION_OPENING, server=linux76,60020,1317827746847, 
 region=3e69d628a8bd8e9b7c5e7a2a6e03aad9, which is more than 15 seconds late
 2011-10-05 20:50:48,290 WARN 
 org.apache.hadoop.hbase.master.AssignmentManager: Received OPENING for region 
 3e69d628a8bd8e9b7c5e7a2a6e03aad9 from server linux76,60020,1317827746847 but 
 region was in  the state null and not in expected PENDING_OPEN or OPENING 
 states
 2011-10-05 20:50:53,743 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling 
 transition=RS_ZK_REGION_OPENING, server=linux76,60020,1317827746847, 
 region=3e69d628a8bd8e9b7c5e7a2a6e03aad9
 2011-10-05 20:50:54,182 DEBUG org.apache.hadoop.hbase.master.CatalogJanitor: 
 Scanned 1 catalog row(s) and gc'd 0 unreferenced parent region(s)
 2011-10-05 20:50:54,397 WARN 
 org.apache.hadoop.hbase.master.AssignmentManager: Received OPENING for region 
 3e69d628a8bd8e9b7c5e7a2a6e03aad9 from server linux76,60020,1317827746847 but 
 region was in  the state null

[jira] [Commented] (HBASE-4540) OpenedRegionHandler is not enforcing atomicity of the operation it is performing

2011-10-06 Thread jirapos...@reviews.apache.org (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13122123#comment-13122123
 ] 

jirapos...@reviews.apache.org commented on HBASE-4540:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2251/#review2395
---



http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestOpenedRegionHandler.java
https://reviews.apache.org/r/2251/#comment5498

The two tests share a lot of the same code; some refactoring would be good.



http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestOpenedRegionHandler.java
https://reviews.apache.org/r/2251/#comment5497

You should be resetting the conf to what was created inside TEST_UTIL.


- Jean-Daniel





 OpenedRegionHandler is not enforcing atomicity of the operation it is 
 performing
 

 Key: HBASE-4540
 URL: https://issues.apache.org/jira/browse/HBASE-4540
 Project: HBase
  Issue Type: Bug
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Attachments: HBASE-4540_1.patch


 - OpenedRegionHandler has not yet deleted the znode of the region R1 opened 
 by RS1.
 - RS1 goes down.
 - ServerShutdownHandler assigns the region R1 to RS2.
 - The znode of R1 is moved to OFFLINE state by master or OPENING state by 
 RS2 if RS2 has started opening the region.
 - Now the first OpenedRegionHandler tries to delete the znode thinking it's 
 in OPENED state but fails.
 - Though it fails it removes the node from RIT and adds RS1 as the owner of 
 R1 in master's memory.
 - Now when RS2 completes opening the region the master is not able to open 
 the region because the region has already been deleted from RIT.
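
The sequence above can be sketched with a small in-memory model. This is only an illustration of the ordering problem, not the attached patch: the class, its maps, and the state strings are hypothetical stand-ins for the znode state, the regions-in-transition (RIT) map, and the master's assignment map. The point it shows is that the handler should touch the master's in-memory state only after the conditional delete of the znode has succeeded.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory model of the race; none of these names are HBase APIs.
class OpenedHandlerSketch {
    final Map<String, String> znodeState = new HashMap<>();          // region -> state stored in the znode
    final Map<String, String> regionsInTransition = new HashMap<>(); // region -> server currently opening it
    final Map<String, String> assignments = new HashMap<>();         // region -> owning server in master memory

    // Delete the znode only if it is still in the expected state.
    boolean deleteIfInState(String region, String expectedState) {
        return znodeState.remove(region, expectedState);
    }

    // Guarded handler: update in-memory state only after the delete succeeded.
    boolean handleOpened(String region, String server) {
        if (!deleteIfInState(region, "OPENED")) {
            // The znode was moved to another state in the meantime;
            // leave RIT and the assignment map untouched.
            return false;
        }
        regionsInTransition.remove(region);
        assignments.put(region, server);
        return true;
    }
}
```

With this guard, a stale handler for RS1 that finds the znode in OPENING state returns without removing R1 from RIT, so the later OPENED event from RS2 can still be processed.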
 {code}
 Master
 ==
 2011-10-05 20:49:45,301 INFO 
 org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Finished 
 processing of shutdown of linux146,60020,1317827727647
 2011-10-05 20:49:54,177 DEBUG org.apache.hadoop.hbase.master.HMaster: Not 
 running balancer because 1 region(s) in transition: 
 {3e69d628a8bd8e9b7c5e7a2a6e03aad9=t1,,1317827883842.3e69d628a8bd8e9b7c5e7a2a6e03aad9.
  state=PENDING_OPEN, ts=1317827985272, server=linux76,60020,1317827746847}
 2011-10-05 20:49:57,720 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling 
 transition=M_ZK_REGION_OFFLINE, server=linux76,6,1317827742012, 
 region=3e69d628a8bd8e9b7c5e7a2a6e03aad9
 2011-10-05 20:50:14,501 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:6-0x132d3dc13090023 Deleting existing unassigned node for 
 3e69d628a8bd8e9b7c5e7a2a6e03aad9 that is in expected state RS_ZK_REGION_OPENED
 2011-10-05 20:50:14,505 WARN org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:6-0x132d3dc13090023 Attempting to delete unassigned node 
 3e69d628a8bd8e9b7c5e7a2a6e03aad9 in RS_ZK_REGION_OPENED state but node is in 
 RS_ZK_REGION_OPENING state
 After the region is opened in RS2
 =
 2011-10-05 20:50:48,066 DEBUG

[jira] [Commented] (HBASE-4540) OpenedRegionHandler is not enforcing atomicity of the operation it is performing

2011-10-06 Thread jirapos...@reviews.apache.org (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13122147#comment-13122147
 ] 

jirapos...@reviews.apache.org commented on HBASE-4540:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2251/#review2399
---



http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/handler/OpenedRegionHandler.java
https://reviews.apache.org/r/2251/#comment5504

Can this debugLog be folded into the one above ?



http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/handler/OpenedRegionHandler.java
https://reviews.apache.org/r/2251/#comment5505

Remove this extra line.



http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/handler/OpenedRegionHandler.java
https://reviews.apache.org/r/2251/#comment5506

'for transition ZK node' seems redundant.


- Ted


On 2011-10-06 17:55:05, ramkrishna vasudevan wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/2251/
bq.  ---
bq.  
bq.  (Updated 2011-10-06 17:55:05)
bq.  
bq.  
bq.  Review request for hbase, Ted Yu, Michael Stack, and Jonathan Gray.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  Fix for handling HBASE-4539 and HBASE-4540.
bq.  Ran all the testcases.  Added one new testcase to verify 
OpenedRegionHandler scenarios.
bq.  Also addresses Ted's comments.
bq.  
bq.  
bq.  This addresses bug HBASE-4540.
bq.  https://issues.apache.org/jira/browse/HBASE-4540
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
 1179238 
bq.
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/handler/OpenedRegionHandler.java
 1179238 
bq.
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKAssign.java
 1179238 
bq.
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java
 1179238 
bq.
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestOpenedRegionHandler.java
 PRE-CREATION 
bq.  
bq.  Diff: https://reviews.apache.org/r/2251/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  Yes
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  ramkrishna
bq.  
bq.



 OpenedRegionHandler is not enforcing atomicity of the operation it is 
 performing
 

 Key: HBASE-4540
 URL: https://issues.apache.org/jira/browse/HBASE-4540
 Project: HBase
  Issue Type: Bug
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Attachments: HBASE-4540_1.patch


 - OpenedRegionHandler has not yet deleted the znode of the region R1 opened 
 by RS1.
 - RS1 goes down.
 - ServerShutdownHandler assigns the region R1 to RS2.
 - The znode of R1 is moved to OFFLINE state by master or OPENING state by 
 RS2 if RS2 has started opening the region.
 - Now the first OpenedRegionHandler tries to delete the znode thinking it's 
 in OPENED state but fails.
 - Though it fails it removes the node from RIT and adds RS1 as the owner of 
 R1 in master's memory.
 - Now when RS2 completes opening the region the master is not able to open 
 the region because the region has already been deleted from RIT.
 {code}
 Master
 ==
 2011-10-05 20:49:45,301 INFO 
 org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Finished 
 processing of shutdown of linux146,60020,1317827727647
 2011-10-05 20:49:54,177 DEBUG org.apache.hadoop.hbase.master.HMaster: Not 
 running balancer because 1 region(s) in transition: 
 {3e69d628a8bd8e9b7c5e7a2a6e03aad9=t1,,1317827883842.3e69d628a8bd8e9b7c5e7a2a6e03aad9.
  state=PENDING_OPEN, ts=1317827985272, server=linux76,60020,1317827746847}
 2011-10-05 20:49:57,720 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling 
 transition=M_ZK_REGION_OFFLINE, server=linux76,6,1317827742012, 
 region=3e69d628a8bd8e9b7c5e7a2a6e03aad9
 2011-10-05 20:50:14,501 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:6-0x132d3dc13090023 Deleting existing unassigned node for 
 3e69d628a8bd8e9b7c5e7a2a6e03aad9 that is in expected state RS_ZK_REGION_OPENED
 2011-10-05 20:50:14,505 WARN org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:6-0x132d3dc13090023 Attempting to delete unassigned node 
 3e69d628a8bd8e9b7c5e7a2a6e03aad9 in RS_ZK_REGION_OPENED state

[jira] [Commented] (HBASE-4070) [Coprocessors] Improve region server metrics to report loaded coprocessors to master

2011-10-06 Thread jirapos...@reviews.apache.org (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-4070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13122150#comment-13122150
 ] 

jirapos...@reviews.apache.org commented on HBASE-4070:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2029/#review2398
---


Looks good to me. Ship it after some minor fixes. 


src/main/java/org/apache/hadoop/hbase/ClusterStatus.java
https://reviews.apache.org/r/2029/#comment5502

masterCoprocessors is a string array. I don't think you can use equals() 
here. 

If the 2 arrays are sorted, you may use Arrays.equals(). 
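
To illustrate the point about array comparison: `equals()` on a Java array compares references, while `java.util.Arrays.equals()` compares elements in order. The sketch below is a minimal, hypothetical helper (not code from the patch under review) that sorts copies of both arrays before comparing, assuming the arrays contain no null elements.

```java
import java.util.Arrays;

// Hypothetical helper; the class and method names are illustrative only.
class CoprocessorListCompare {
    // Order-insensitive comparison: sort copies, then compare element by element.
    static boolean sameCoprocessors(String[] a, String[] b) {
        if (a == null || b == null) {
            return a == b; // equal only if both are null
        }
        String[] x = a.clone();
        String[] y = b.clone();
        Arrays.sort(x);
        Arrays.sort(y);
        return Arrays.equals(x, y);
    }
}
```

Sorting copies leaves the caller's arrays unchanged; if both arrays are already kept sorted, `Arrays.equals()` alone is enough.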



src/main/java/org/apache/hadoop/hbase/HServerLoad.java
https://reviews.apache.org/r/2029/#comment5507

This is cool. 

Arrays.toString() can return null. Do you want to check?



src/main/java/org/apache/hadoop/hbase/HServerLoad.java
https://reviews.apache.org/r/2029/#comment5509

As above: Arrays.toString() can be null.
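
As a side note on the two comments above: per the JDK documentation, `java.util.Arrays.toString(Object[])` returns the string "null" for a null array rather than a null reference, so the check, if wanted, would be on the array itself. A minimal sketch, with a hypothetical helper name that is not part of HServerLoad:

```java
import java.util.Arrays;

// Hypothetical helper; not part of HServerLoad.
class CoprocessorListFormat {
    // Render the coprocessor list, treating a null array as an empty list.
    static String describe(String[] coprocessors) {
        return coprocessors == null ? "[]" : Arrays.toString(coprocessors);
    }
}
```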



src/test/java/org/apache/hadoop/hbase/coprocessor/TestClassLoading.java
https://reviews.apache.org/r/2029/#comment5508

can you finish the TODO?


- Mingjie


On 2011-10-05 21:45:30, Eugene Koontz wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/2029/
bq.  ---
bq.  
bq.  (Updated 2011-10-05 21:45:30)
bq.  
bq.  
bq.  Review request for hbase and Mingjie Lai.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  Proposed fix for HBASE-4070. 
bq.  
bq.  
bq.  This addresses bug HBASE-4070.
bq.  https://issues.apache.org/jira/browse/HBASE-4070
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.src/main/jamon/org/apache/hbase/tmpl/master/MasterStatusTmpl.jamon 
abeb850 
bq.src/main/jamon/org/apache/hbase/tmpl/regionserver/RSStatusTmpl.jamon 
be6fceb 
bq.src/main/java/org/apache/hadoop/hbase/ClusterStatus.java 01bc1dd 
bq.src/main/java/org/apache/hadoop/hbase/HServerLoad.java 0c680e4 
bq.src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java a55a4b1 
bq.src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java 
dbae4fd 
bq.src/main/java/org/apache/hadoop/hbase/master/HMaster.java f80d232 
bq.src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 
3840279 
bq.src/test/java/org/apache/hadoop/hbase/coprocessor/TestClassLoading.java 
eda5a9b 
bq.  
bq.  Diff: https://reviews.apache.org/r/2029/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  Two new tests : testRegionServerCoprocessorReported() and 
testMasterServerCoprocessorsReported() included in a new source file 
src/test/java/o.a.h.h/coprocessor/TestCoprocessorReporting.java.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Eugene
bq.  
bq.



 [Coprocessors] Improve region server metrics to report loaded coprocessors to 
 master
 

 Key: HBASE-4070
 URL: https://issues.apache.org/jira/browse/HBASE-4070
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.90.3
Reporter: Mingjie Lai
Assignee: Eugene Koontz
 Attachments: HBASE-4070.patch, HBASE-4070.patch, HBASE-4070.patch, 
 master-web-ui.jpg, rs-status-web-ui.jpg


 HBASE-3512 is about listing loaded cp classes at shell. To make it more 
 generic, we need a way to report this piece of information from region to 
 master (or just at region server level). So later on, we can display the 
 loaded class names at shell as well as web console. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-4540) OpenedRegionHandler is not enforcing atomicity of the operation it is performing

2011-10-06 Thread jirapos...@reviews.apache.org (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13122156#comment-13122156
 ] 

jirapos...@reviews.apache.org commented on HBASE-4540:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2251/#review2400
---



http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKAssign.java
https://reviews.apache.org/r/2251/#comment5510

This new method is similar to deleteNode() above.
Maybe we should retrofit the existing deleteNode() by adding 
expectedVersion ?
We can designate some negative constant to signify that version check 
should be skipped.
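
A minimal sketch of that suggestion, using an in-memory map in place of ZooKeeper so it can run on its own. The class, method, and constant names are hypothetical and do not reflect the ZKAssign API; the only borrowed convention is that ZooKeeper's own `delete(path, version)` treats -1 as "match any version".

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for a versioned znode store; not the ZKAssign API.
class VersionedNodes {
    static final int ANY_VERSION = -1; // negative sentinel: skip the version check

    private final Map<String, Integer> versions = new HashMap<>();

    void create(String path) {
        versions.put(path, 0);
    }

    // Every data update bumps the node version; returns false if the node is absent.
    boolean update(String path) {
        Integer v = versions.get(path);
        if (v == null) {
            return false;
        }
        versions.put(path, v + 1);
        return true;
    }

    // Single delete method: a non-negative expectedVersion must match the current version.
    boolean deleteNode(String path, int expectedVersion) {
        Integer current = versions.get(path);
        if (current == null) {
            return false;
        }
        if (expectedVersion >= 0 && current != expectedVersion) {
            return false; // someone changed the node since the caller read it
        }
        versions.remove(path);
        return true;
    }
}
```

Callers that do not care about the version pass `ANY_VERSION`, so one method covers both the existing behavior and the version-checked delete.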


- Ted


On 2011-10-06 17:55:05, ramkrishna vasudevan wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/2251/
bq.  ---
bq.  
bq.  (Updated 2011-10-06 17:55:05)
bq.  
bq.  
bq.  Review request for hbase, Ted Yu, Michael Stack, and Jonathan Gray.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  Fix for handling HBASE-4539 and HBASE-4540.
bq.  Ran all the testcases.  Added one new testcase to verify 
OpenedRegionHandler scenarios.
bq.  Also addresses Ted's comments.
bq.  
bq.  
bq.  This addresses bug HBASE-4540.
bq.  https://issues.apache.org/jira/browse/HBASE-4540
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
 1179238 
bq.
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/handler/OpenedRegionHandler.java
 1179238 
bq.
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKAssign.java
 1179238 
bq.
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java
 1179238 
bq.
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestOpenedRegionHandler.java
 PRE-CREATION 
bq.  
bq.  Diff: https://reviews.apache.org/r/2251/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  Yes
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  ramkrishna
bq.  
bq.



 OpenedRegionHandler is not enforcing atomicity of the operation it is 
 performing
 

 Key: HBASE-4540
 URL: https://issues.apache.org/jira/browse/HBASE-4540
 Project: HBase
  Issue Type: Bug
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Attachments: HBASE-4540_1.patch


 - OpenedRegionHandler has not yet deleted the znode of the region R1 opened 
 by RS1.
 - RS1 goes down.
 - ServerShutdownHandler assigns the region R1 to RS2.
 - The znode of R1 is moved to OFFLINE state by master or OPENING state by 
 RS2 if RS2 has started opening the region.
 - Now the first OpenedRegionHandler tries to delete the znode thinking it's 
 in OPENED state but fails.
 - Though it fails it removes the node from RIT and adds RS1 as the owner of 
 R1 in master's memory.
 - Now when RS2 completes opening the region the master is not able to open 
 the region because the region has already been deleted from RIT.
 {code}
 Master
 ==
 2011-10-05 20:49:45,301 INFO 
 org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Finished 
 processing of shutdown of linux146,60020,1317827727647
 2011-10-05 20:49:54,177 DEBUG org.apache.hadoop.hbase.master.HMaster: Not 
 running balancer because 1 region(s) in transition: 
 {3e69d628a8bd8e9b7c5e7a2a6e03aad9=t1,,1317827883842.3e69d628a8bd8e9b7c5e7a2a6e03aad9.
  state=PENDING_OPEN, ts=1317827985272, server=linux76,60020,1317827746847}
 2011-10-05 20:49:57,720 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling 
 transition=M_ZK_REGION_OFFLINE, server=linux76,6,1317827742012, 
 region=3e69d628a8bd8e9b7c5e7a2a6e03aad9
 2011-10-05 20:50:14,501 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:6-0x132d3dc13090023 Deleting existing unassigned node for 
 3e69d628a8bd8e9b7c5e7a2a6e03aad9 that is in expected state RS_ZK_REGION_OPENED
 2011-10-05 20:50:14,505 WARN org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:6-0x132d3dc13090023 Attempting to delete unassigned node 
 3e69d628a8bd8e9b7c5e7a2a6e03aad9 in RS_ZK_REGION_OPENED state but node is in 
 RS_ZK_REGION_OPENING state
 After the region is opened in RS2
 =
 2011-10-05 20:50:48,066 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling 
 transition=RS_ZK_REGION_OPENING,

[jira] [Commented] (HBASE-4469) Avoid top row seek by looking up bloomfilter

2011-10-06 Thread Liyin Tang (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-4469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13122161#comment-13122161
 ] 

Liyin Tang commented on HBASE-4469:
---

Yes, I didn't change the TestBlocksRead unit test, which passes 
successfully. 


 Avoid top row seek by looking up bloomfilter
 

 Key: HBASE-4469
 URL: https://issues.apache.org/jira/browse/HBASE-4469
 Project: HBase
  Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang

 The problem is that when seeking for the row/col in the hfile, we will go to 
 the top of the row in order to check for a row delete marker (delete family). 
 However, if the bloomfilter is enabled for the column family, then if a 
 delete family operation is done on a row, the row has already been added to 
 the bloomfilter. We can take advantage of this fact to avoid seeking to the 
 top of the row.
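
The idea can be sketched with a toy Bloom filter. This is an illustration under simplifying assumptions, not the code in the patch: the filter below is keyed by row only, uses two ad hoc hash functions, and all class and method names are hypothetical. It shows the decision being described: a negative answer from the filter is definite, so the seek to the top of the row can be skipped; a positive answer may be a false positive, so the seek is still performed.

```java
import java.util.BitSet;

// Toy Bloom filter keyed by row; HBase's real bloom filter classes are not used here.
class RowBloomSketch {
    private static final int BITS = 1 << 16;
    private final BitSet bits = new BitSet(BITS);

    private static int h1(String key) {
        return Math.floorMod(key.hashCode(), BITS);
    }

    private static int h2(String key) {
        return Math.floorMod(key.hashCode() * 31 + key.length(), BITS);
    }

    // Called when a delete family marker is written for this row.
    void add(String row) {
        bits.set(h1(row));
        bits.set(h2(row));
    }

    // False means "definitely absent"; true means "possibly present".
    boolean mightContain(String row) {
        return bits.get(h1(row)) && bits.get(h2(row));
    }

    // Seek to the top of the row only when the filter cannot rule the row out.
    boolean needsTopOfRowSeek(String row) {
        return mightContain(row);
    }
}
```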


[jira] [Updated] (HBASE-4282) Potential data loss in retries of WAL close introduced in HBASE-4222

2011-10-06 Thread Gary Helmling (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-4282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Helmling updated HBASE-4282:
-

Affects Version/s: 0.90.5
   0.94.0
   0.92.0
   Status: Patch Available  (was: Open)

@Stack (or anyone else), can you take a look at the updated patch for trunk -- 
HBASE-4282_trunk_3.patch?

Since HBASE-4487 was only applied to trunk, the previous version should still 
be applicable for 0.90/0.92.

 Potential data loss in retries of WAL close introduced in HBASE-4222
 

 Key: HBASE-4282
 URL: https://issues.apache.org/jira/browse/HBASE-4282
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0, 0.94.0, 0.90.5
Reporter: Gary Helmling
Assignee: Gary Helmling
Priority: Blocker
 Fix For: 0.92.0, 0.90.5

 Attachments: HBASE-4282_0.90_2.patch, HBASE-4282_trunk_2.patch, 
 HBASE-4282_trunk_3.patch, HBASE-4282_trunk_prelim.patch


 The ability to ride over WAL close errors on log rolling added in HBASE-4222 
 could lead to missing HLog entries if:
 * A table has DEFERRED_LOG_FLUSH=true
 * There are unflushed WALEdit entries for that table in the current 
 SequenceFile writer buffer
 Since the writes were already acknowledged to the client, just ignoring the 
 close error to allow for another log roll doesn't seem like the right thing 
 to do here.
 We could easily flag this state and only ride over the close error if there 
 aren't unflushed entries.  This would bring the above condition back to the 
 previous behavior of aborting the region server.  However, aborting the 
 region server in this state is still guaranteeing data loss.  Is there 
 anything we can do better in this case?  
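
The "flag this state" option could look roughly like the sketch below. It is a hypothetical illustration only: `Writer` is a local stand-in interface and `closeForRoll` is an invented name, not part of HLog. The method reports whether the log roll may continue after a failed close, which is allowed only when no unflushed entries would be lost.

```java
import java.io.IOException;

// Hypothetical sketch; Writer is a local stand-in, not an HBase interface.
class WalCloseSketch {
    interface Writer {
        void close() throws IOException;
    }

    // Returns true if the log roll may continue, false if the caller should abort.
    static boolean closeForRoll(Writer writer, int unflushedEntries) {
        try {
            writer.close();
            return true;
        } catch (IOException e) {
            // Ride over the close error only when nothing acknowledged can be lost.
            return unflushedEntries == 0;
        }
    }
}
```

This only restores the earlier abort behavior for the deferred-flush case; it does not answer the open question of how to avoid losing the buffered edits.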


[jira] [Created] (HBASE-4546) Upgrade to ZooKeeper 3.3.2 or 3.3.3

2011-10-06 Thread Jonathan Gray (Created) (JIRA)

Upgrade to ZooKeeper 3.3.2 or 3.3.3
---

 Key: HBASE-4546
 URL: https://issues.apache.org/jira/browse/HBASE-4546
 Project: HBase
  Issue Type: Improvement
  Components: zookeeper
Reporter: Jonathan Gray
Assignee: Jonathan Gray
 Fix For: 0.92.0


HBase is still depending on 3.3.1.  There are many critical bug fixes in 3.3.2 and 
two more critical fixes in 3.3.3.

We recently tripped on ZOOKEEPER-822 which was fixed in 3.3.2.


[jira] [Resolved] (HBASE-4546) Upgrade to ZooKeeper 3.3.2 or 3.3.3

2011-10-06 Thread Jonathan Gray (Resolved) (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-4546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Gray resolved HBASE-4546.
--

   Resolution: Not A Problem
Fix Version/s: (was: 0.92.0)

Never mind, I'm looking at a stale pom.  We are already on 3.3.3 in 92 and trunk.

 Upgrade to ZooKeeper 3.3.2 or 3.3.3
 ---

 Key: HBASE-4546
 URL: https://issues.apache.org/jira/browse/HBASE-4546
 Project: HBase
  Issue Type: Improvement
  Components: zookeeper
Reporter: Jonathan Gray
Assignee: Jonathan Gray

 HBase is still depending on 3.3.1.  There are many critical bug fixes in 3.3.2 
 and two more critical fixes in 3.3.3.
 We recently tripped on ZOOKEEPER-822 which was fixed in 3.3.2.


[jira] [Commented] (HBASE-4402) Retaining locality after restart broken

2011-10-06 Thread stack (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-4402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13122207#comment-13122207
 ] 

stack commented on HBASE-4402:
--

Let me apply the updated patch.  The below failures seem unrelated:

{code}

Failed tests:   
testOnlineChangeTableSchema(org.apache.hadoop.hbase.client.TestAdmin)
  testForceSplit(org.apache.hadoop.hbase.client.TestAdmin): expected:2 but 
was:1
  testForceSplitMultiFamily(org.apache.hadoop.hbase.client.TestAdmin): 
expected:2 but was:1
{code}

I saw them in a clean 0.92 run.  I'm working on fixing these elsewhere.

 Retaining locality after restart broken
 ---

 Key: HBASE-4402
 URL: https://issues.apache.org/jira/browse/HBASE-4402
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.92.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Blocker
 Fix For: 0.92.0

 Attachments: hbase-4402.txt, hbase-4402.txt


 In DefaultLoadBalancer, we implement the retain assignment function like so:
 {code}
   if (sn != null && servers.contains(sn)) {
     assignments.get(sn).add(region.getKey());
   }
 {code}
 but this will never work since after a cluster restart, all servers have a 
 new ServerName with a new startcode.
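
One way to see the problem, and a possible direction for a fix, is to compare servers by host and port only. The sketch below is an assumption for illustration, not the committed patch: it treats server names as plain "host,port,startcode" strings (the form seen in the master logs) and uses hypothetical helper names.

```java
import java.util.List;

// Hypothetical sketch; server names are modeled as "host,port,startcode" strings.
class RetainAssignmentSketch {
    // "host,port,startcode" -> "host,port"
    static String hostAndPort(String serverName) {
        return serverName.substring(0, serverName.lastIndexOf(','));
    }

    // Find a live server on the same host and port as the old one, or null if none.
    static String findLiveServer(String oldServerName, List<String> liveServers) {
        String wanted = hostAndPort(oldServerName);
        for (String live : liveServers) {
            if (hostAndPort(live).equals(wanted)) {
                return live;
            }
        }
        return null;
    }
}
```

An exact `contains()` on the full name fails after a restart because the startcode differs, whereas the host-and-port match still finds the restarted server.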


[jira] [Updated] (HBASE-4402) Retaining locality after restart broken

2011-10-06 Thread Nicolas Spiegelberg (Commented) (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-4402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-4402:
-

  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

Applied to 0.92 branch and trunk.  Thanks Todd.

 Retaining locality after restart broken
 ---

 Key: HBASE-4402
 URL: https://issues.apache.org/jira/browse/HBASE-4402
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.92.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Blocker
 Fix For: 0.92.0

 Attachments: 4402-v3.txt, hbase-4402.txt, hbase-4402.txt


 In DefaultLoadBalancer, we implement the retain assignment function like so:
 {code}
   if (sn != null && servers.contains(sn)) {
     assignments.get(sn).add(region.getKey());
   }
 {code}
 but this will never work since after a cluster restart, all servers have a 
 new ServerName with a new startcode.


[jira] [Commented] (HBASE-4469) Avoid top row seek by looking up bloomfilter


[ 
https://issues.apache.org/jira/browse/HBASE-4469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13122232#comment-13122232
 ] 

Nicolas Spiegelberg commented on HBASE-4469:


+1. lgtm

 Avoid top row seek by looking up bloomfilter
 

 Key: HBASE-4469
 URL: https://issues.apache.org/jira/browse/HBASE-4469
 Project: HBase
  Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang

 The problem is that when seeking for the row/col in the hfile, we will go to 
 the top of the row in order to check for a row delete marker (delete family). 
 However, if the bloomfilter is enabled for the column family, then if a 
 delete family operation is done on a row, the row has already been added to 
 the bloomfilter. We can take advantage of this fact to avoid seeking to the 
 top of the row.


[jira] [Commented] (HBASE-4282) Potential data loss in retries of WAL close introduced in HBASE-4222

2011-10-06 Thread jirapos...@reviews.apache.org (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-4282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13122264#comment-13122264
 ] 

Ted Yu commented on HBASE-4282:
---

+1 on patch v3.
There seems to be some missing javadoc for TestLogRollAbort.
Please remove the following line in TestLogRollAbort:
{code}
/ configuration for testLogRollOnDatanodeDeath /
{code}

 Potential data loss in retries of WAL close introduced in HBASE-4222
 

 Key: HBASE-4282
 URL: https://issues.apache.org/jira/browse/HBASE-4282
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0, 0.94.0, 0.90.5
Reporter: Gary Helmling
Assignee: Gary Helmling
Priority: Blocker
 Fix For: 0.92.0, 0.90.5

 Attachments: HBASE-4282_0.90_2.patch, HBASE-4282_trunk_2.patch, 
 HBASE-4282_trunk_3.patch, HBASE-4282_trunk_prelim.patch


 The ability to ride over WAL close errors on log rolling added in HBASE-4222 
 could lead to missing HLog entries if:
 * A table has DEFERRED_LOG_FLUSH=true
 * There are unflushed WALEdit entries for that table in the current 
 SequenceFile writer buffer
 Since the writes were already acknowledged to the client, just ignoring the 
 close error to allow for another log roll doesn't seem like the right thing 
 to do here.
 We could easily flag this state and only ride over the close error if there 
 aren't unflushed entries.  This would bring the above condition back to the 
 previous behavior of aborting the region server.  However, aborting the 
 region server in this state is still guaranteeing data loss.  Is there 
 anything we can do better in this case?  


[jira] [Commented] (HBASE-4282) Potential data loss in retries of WAL close introduced in HBASE-4222

2011-10-06 Thread Gary Helmling (Commented) (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-4282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13122285#comment-13122285
]

Gary Helmling commented on HBASE-4282:
--

Yes, bad copy-n-paste job on my part. I will clean up.

Thanks for the review, Ted.

Potential data loss in retries of WAL close introduced in HBASE-4222

Key: HBASE-4282
URL: https://issues.apache.org/jira/browse/HBASE-4282
Project: HBase
Issue Type: Bug
Affects Versions: 0.92.0, 0.94.0, 0.90.5
Reporter: Gary Helmling
Assignee: Gary Helmling
Priority: Blocker
Fix For: 0.92.0, 0.90.5

Attachments: HBASE-4282_0.90_2.patch, HBASE-4282_trunk_2.patch,
HBASE-4282_trunk_3.patch, HBASE-4282_trunk_prelim.patch

The ability to ride over WAL close errors on log rolling added in HBASE-4222
could lead to missing HLog entries if:
* A table has DEFERRED_LOG_FLUSH=true
* There are unflushed WALEdit entries for that table in the current
SequenceFile writer buffer
Since the writes were already acknowledged to the client, just ignoring the
close error to allow for another log roll doesn't seem like the right thing
to do here.
We could easily flag this state and only ride over the close error if there
aren't unflushed entries. This would bring the above condition back to the
previous behavior of aborting the region server. However, aborting the
region server in this state is still guaranteeing data loss. Is there
anything we can do better in this case?

[jira] [Commented] (HBASE-4469) Avoid top row seek by looking up bloomfilter


[ 
https://issues.apache.org/jira/browse/HBASE-4469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13122289#comment-13122289
 ] 

jirapos...@reviews.apache.org commented on HBASE-4469:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2235/#review2417
---


+1. 

Nice optimization Liyin. Changes look good.  [This is running nicely on our 
internal branch.]

- Kannan


On 2011-10-06 17:17:23, Liyin wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/2235/
bq.  ---
bq.  
bq.  (Updated 2011-10-06 17:17:23)
bq.  
bq.  
bq.  Review request for hbase.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  The problem is that when seeking for the row/col in the hfile, we will go 
to the top of the row in order to check for a row delete marker (delete family).
bq.  However, if the bloomfilter is enabled for the column family, then if a 
delete family operation is done on a row, the row has already been added to 
the bloomfilter.
bq.  We can take advantage of this fact to avoid seeking to the top of the row.
bq.  
bq.  Also updated the TestBlocksRead unit tests, since most of the block read 
counts have dropped.
bq.  
bq.  Evaluation:
bq.  In TestSeekingOptimization, it previously saved 31.6% of seek operations.
bq.  Now it saves about 41.82% of seek operations,
bq.  i.e. about 10 percentage points more.
bq.  
bq.  ==
bq.  Before this diff:
bq.  For bloom=ROWCOL, compr=GZ total seeks without optimization: 2506, with 
optimization: 1714 (68.40%), savings: 31.60%
bq.  
bq.  =
bq.  Apply this diff:
bq.  For bloom=ROWCOL, compr=GZ total seeks without optimization: 2506, with 
optimization: 1458 (58.18%), savings: 41.82%
bq.  =
bq.  
bq.  Thanks to Mikhail and Kannan for their help and discussion.
bq.  
bq.  
bq.  This addresses bug HBASE-4469.
bq.  https://issues.apache.org/jira/browse/HBASE-4469
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 
7b0b9e6 
bq.src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 
8dd8a68 
bq.src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java 
abccea4 
bq.  
bq.  Diff: https://reviews.apache.org/r/2235/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  Run all the unit tests.
bq.  Two unit tests fail both with and without my change:
bq.  TestDistributedLogSplitting
bq.  TestHTablePool
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Liyin
bq.  
bq.



 Avoid top row seek by looking up bloomfilter
 

 Key: HBASE-4469
 URL: https://issues.apache.org/jira/browse/HBASE-4469
 Project: HBase
  Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang

 The problem is that when seeking for the row/col in the hfile, we go to the
 top of the row in order to check for a row delete marker (delete family).
 However, if the bloomfilter is enabled for the column family, then when a
 delete family operation is done on a row, the row has already been added to the
 bloomfilter. We can take advantage of this fact to avoid seeking to the top
 of the row.
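
The idea in the description above can be sketched roughly as follows. This is a minimal illustration, not the code from the patch: the class and method names are hypothetical, an exact Set stands in for the real bloomfilter, and it assumes a delete family marker is recorded under the row with an empty qualifier.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: an exact Set stands in for the per-file bloomfilter.
final class DeleteFamilyBloomSketch {
    private final Set<String> bloomKeys = new HashSet<>();

    // Every put records its row/column key, as a ROWCOL bloomfilter would.
    void addPut(String row, String qualifier) {
        bloomKeys.add(key(row, qualifier));
    }

    // Assumption: a delete family marker is recorded under the row
    // with an empty qualifier.
    void addDeleteFamily(String row) {
        bloomKeys.add(key(row, ""));
    }

    // True only if the file may hold a delete family marker for this row,
    // in which case the scanner still has to seek to the top of the row.
    boolean mustSeekToTopOfRow(String row) {
        return bloomKeys.contains(key(row, ""));
    }

    private static String key(String row, String qualifier) {
        // NUL separator keeps row and qualifier apart in the combined key.
        return row + '\0' + qualifier;
    }

    public static void main(String[] args) {
        DeleteFamilyBloomSketch file = new DeleteFamilyBloomSketch();
        file.addPut("row1", "colA");
        file.addDeleteFamily("row2");
        System.out.println("row1 needs top-of-row seek: " + file.mustSeekToTopOfRow("row1"));
        System.out.println("row2 needs top-of-row seek: " + file.mustSeekToTopOfRow("row2"));
    }
}
```

A real bloomfilter can return false positives, which would only cost an unnecessary seek; it never returns false negatives, so skipping the seek on a miss stays correct.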

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HBASE-4536) Allow CF to retain deleted rows

2011-10-06 Thread Lars Hofhansl (Updated) (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-4536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Lars Hofhansl updated HBASE-4536:
-

Fix Version/s: (was: 0.92.0)

Turns out this is a bit more complicated than I thought. There are three types
of deletes:
# version deletes - effective for a specific version of a specific column
# column deletes - effective for all versions of a specific column
# family deletes - effective for all versions of all columns of a family

The first two sort before the puts they affect, based on their respective
timestamps, but after newer puts.
Family deletes always sort before all versions of all columns.
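
To make the ordering concrete, here is a simplified model of it. It is not HBase's actual KeyValue comparator: the Cell class, the type ranks, and the convention that a family delete carries an empty qualifier are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

final class DeleteSortSketch {
    // Illustrative ranks only: a higher rank sorts first when
    // qualifier and timestamp are equal.
    enum Type {
        PUT(0), VERSION_DELETE(1), COLUMN_DELETE(2), FAMILY_DELETE(3);

        final int rank;

        Type(int rank) {
            this.rank = rank;
        }
    }

    static final class Cell {
        final String qualifier; // empty for family deletes in this model
        final long ts;
        final Type type;

        Cell(String qualifier, long ts, Type type) {
            this.qualifier = qualifier;
            this.ts = ts;
            this.type = type;
        }

        @Override
        public String toString() {
            return type + "(" + qualifier + "@" + ts + ")";
        }
    }

    // Qualifier ascending, then newest timestamp first, then deletes before puts.
    static int compare(Cell a, Cell b) {
        int c = a.qualifier.compareTo(b.qualifier);
        if (c != 0) {
            return c;
        }
        c = Long.compare(b.ts, a.ts);
        if (c != 0) {
            return c;
        }
        return Integer.compare(b.type.rank, a.type.rank);
    }

    static List<Cell> sorted(List<Cell> cells) {
        List<Cell> out = new ArrayList<>(cells);
        out.sort(DeleteSortSketch::compare);
        return out;
    }

    public static void main(String[] args) {
        List<Cell> cells = new ArrayList<>();
        cells.add(new Cell("a", 9, Type.PUT));
        cells.add(new Cell("a", 3, Type.PUT));
        cells.add(new Cell("a", 5, Type.COLUMN_DELETE));
        cells.add(new Cell("", 1, Type.FAMILY_DELETE));
        System.out.println(sorted(cells));
    }
}
```

In this model the column delete at timestamp 5 lands after the newer put and before the older one, while the family delete sorts first even though its timestamp is the oldest.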

The problem is deciding when the delete rows (the marker rows) themselves can
be removed during a major compaction.

For #1 and #2 I can just do version counting, and newer puts will eventually
push out the delete markers from the store.
With #3 this will never happen as they always sort before all puts of the same
family, regardless of any timestamp set on them.
Here it is necessary to scan all puts for that family and then decide whether
the delete needs to be included based on whether the delete had any effect on
any of the puts in the same family.

Because of this, I am moving this out of 0.92, as the changes will be bigger.
Put it back if you think otherwise.

I still think that time travel is an important feature of HBase, and that it is
incomplete if it cannot include deleted rows.

Allow CF to retain deleted rows
---

[jira] [Created] (HBASE-4547) TestAdmin failing in 0.92 because .tableinfo not found

2011-10-06 Thread stack (Created) (JIRA)

TestAdmin failing in 0.92 because .tableinfo not found
--

 Key: HBASE-4547
 URL: https://issues.apache.org/jira/browse/HBASE-4547
 Project: HBase
  Issue Type: Bug
Reporter: stack


I've been running tests before commit and found that the following tests fail 
sporadically, but fairly frequently:
{code}
Failed tests:
  testOnlineChangeTableSchema(org.apache.hadoop.hbase.client.TestAdmin)
  testForceSplit(org.apache.hadoop.hbase.client.TestAdmin): expected:<2> but was:<1>
  testForceSplitMultiFamily(org.apache.hadoop.hbase.client.TestAdmin): expected:<2> but was:<1>
{code}
Looking, it seems like we fail to find .tableinfo in the tests that modify 
table schema while table is online.

The update of a table schema just does an overwrite.  In the tests we sometimes 
fail to find the newly written file or we get EOFE reading it.


[jira] [Updated] (HBASE-4547) TestAdmin failing in 0.92 because .tableinfo not found


 [ 
https://issues.apache.org/jira/browse/HBASE-4547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-4547:
-

Attachment: 4547.txt

This patch, which does a create in a tmp dir followed by a delete and rename, 
seems to fix the failing TestAdmin in repeated runs.
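
The three steps named above (create in a tmp dir, delete, rename) can be sketched as below. This is only an illustration on the local file system using java.nio; it is not the attached patch, which presumably goes through the Hadoop FileSystem API, and the class, method, and directory names are hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class TableInfoWriteSketch {
    // Readers never see a half-written file at 'target', because the content
    // is complete before the rename. There is still a brief window between
    // the delete and the rename in which 'target' does not exist.
    static void writeViaTmp(Path tmpDir, Path target, byte[] content) throws IOException {
        Files.createDirectories(tmpDir);
        Path tmp = Files.createTempFile(tmpDir, "tableinfo", ".tmp");
        Files.write(tmp, content);    // 1. write the full content in the tmp dir
        Files.deleteIfExists(target); // 2. delete the old file, if any
        Files.move(tmp, target);      // 3. rename the finished file into place
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("tableinfo-sketch");
        Path target = dir.resolve(".tableinfo");
        writeViaTmp(dir.resolve("tmp"), target, "schema v1".getBytes(StandardCharsets.UTF_8));
        writeViaTmp(dir.resolve("tmp"), target, "schema v2".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(Files.readAllBytes(target), StandardCharsets.UTF_8));
    }
}
```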

 TestAdmin failing in 0.92 because .tableinfo not found
 --

 Key: HBASE-4547
 URL: https://issues.apache.org/jira/browse/HBASE-4547
 Project: HBase
  Issue Type: Bug
Reporter: stack
 Attachments: 4547.txt



[jira] [Commented] (HBASE-4536) Allow CF to retain deleted rows

[
https://issues.apache.org/jira/browse/HBASE-4536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13122311#comment-13122311
]

Jonathan Gray commented on HBASE-4536:
--

Lars, I agree that this is an important feature. Also agree that we should
take time and do it right and not push for 0.92.

Could we just support some kind of raw scanner along with a TTKAKV config
(Time To Keep All Key Values)?

Allow CF to retain deleted rows
---

[jira] [Commented] (HBASE-4536) Allow CF to retain deleted rows

[
https://issues.apache.org/jira/browse/HBASE-4536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13122322#comment-13122322
]

Lars Hofhansl commented on HBASE-4536:
--

A simple option would be to allow the KEEP_DELETED flag to be set only when TTL
is also set.
Then we'd do simple version counting for #1 and #2 type deletes and rely on TTL
to expire #3.
(That means you could have more #3 delete markers than max versions, which
would also be the case with TTKAKV. That might be acceptable).

Allow CF to retain deleted rows
---

[jira] [Commented] (HBASE-4282) Potential data loss in retries of WAL close introduced in HBASE-4222

2011-10-06 Thread Gary Helmling (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-4282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13122326#comment-13122326
 ] 

Gary Helmling commented on HBASE-4282:
--

bq. On v3, the txids are pretty useless at least out in logs? No harm logging 
them I suppose but there is nothing I can infer given a txid? Is that so?

Yes, txids are not so useful.  I can drop them from the logs.  I left them in 
as the analog of the previous version's deferred seqNum, which are moderately 
more useful.

{code}
-if (unflushedEntries.get() <= syncedTillHere) {
-  Thread.sleep(this.optionalFlushInterval);
-}
+Thread.sleep(this.optionalFlushInterval);
{code}

This is reverting what I think is a dangerous change introduced by HBASE-4487.  
If the sync fails, then the if condition will be false, making the LogSyncer 
thread go into a hard loop until the sync succeeds.  This is going to interfere 
with attempting to perform the log roll, so I think it at least needs to be 
throttled.  The simplest change seemed to be restoring previous behavior.  I 
can move this into a separate issue, if you think broader discussion would be 
good.
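
As a rough illustration of why the unconditional sleep matters, the sketch below models a sync loop that always waits between attempts, so a failing sync cannot turn into a hard loop. It is not the HLog LogSyncer code: the class name, the BooleanSupplier stand-in for sync(), and the maxAttempts bound are all assumptions added to keep the example small and terminating.

```java
import java.util.function.BooleanSupplier;

final class ThrottledSyncLoopSketch {
    // Sleeps before every sync attempt, so a failing sync cannot spin the thread.
    // Returns the number of sync attempts made.
    static int syncUntilSuccess(BooleanSupplier sync, long intervalMillis, int maxAttempts) {
        int attempts = 0;
        while (attempts < maxAttempts) {
            try {
                Thread.sleep(intervalMillis); // unconditional, unlike the removed "if"
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
            attempts++;
            if (sync.getAsBoolean()) {
                break; // sync succeeded
            }
        }
        return attempts;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Hypothetical sync that fails twice and then succeeds.
        int attempts = syncUntilSuccess(() -> ++calls[0] >= 3, 10L, 10);
        System.out.println("sync attempts: " + attempts);
    }
}
```

With the conditional sleep, each failed attempt would skip the wait and retry immediately; here every retry is spaced by the interval.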

{code}
+TEST_UTIL.cleanupTestDir();
+TEST_UTIL.shutdownMiniCluster();
{code}

cleanupTestDir() actually deletes the test directory in HDFS, so the cluster 
would need to be running for it.  But shutdownMiniCluster() does its own 
cleanup of the local FS dirs for testing, so I don't think we need the 
additional cleanupTestDir() at all.

{code}
+assertTrue("Need HDFS-826 for this test", log.canGetCurReplicas());
{code}

Sure, I'll add that in.

 Potential data loss in retries of WAL close introduced in HBASE-4222
 

 Key: HBASE-4282
 URL: https://issues.apache.org/jira/browse/HBASE-4282
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0, 0.94.0, 0.90.5
Reporter: Gary Helmling
Assignee: Gary Helmling
Priority: Blocker
 Fix For: 0.92.0, 0.90.5

 Attachments: HBASE-4282_0.90_2.patch, HBASE-4282_trunk_2.patch, 
 HBASE-4282_trunk_3.patch, HBASE-4282_trunk_prelim.patch


 The ability to ride over WAL close errors on log rolling added in HBASE-4222 
 could lead to missing HLog entries if:
 * A table has DEFERRED_LOG_FLUSH=true
 * There are unflushed WALEdit entries for that table in the current 
 SequenceFile writer buffer
 Since the writes were already acknowledged to the client, just ignoring the 
 close error to allow for another log roll doesn't seem like the right thing 
 to do here.
 We could easily flag this state and only ride over the close error if there 
 aren't unflushed entries.  This would bring the above condition back to the 
 previous behavior of aborting the region server.  However, aborting the 
 region server in this state is still guaranteeing data loss.  Is there 
 anything we can do better in this case?  


[jira] [Updated] (HBASE-4482) Race Condition Concerning Eviction in SlabCache


 [ 
https://issues.apache.org/jira/browse/HBASE-4482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-4482:
-

Attachment: hbase-4482v4.2.txt

Here is what I applied, copied from RB.

 Race Condition Concerning Eviction in SlabCache
 ---

 Key: HBASE-4482
 URL: https://issues.apache.org/jira/browse/HBASE-4482
 Project: HBase
  Issue Type: Sub-task
Reporter: Li Pi
Assignee: Li Pi
Priority: Blocker
 Fix For: 0.92.0

 Attachments: hbase-4482v1.txt, hbase-4482v2.txt, hbase-4482v4.2.txt, 
 hbase-4482v4.2.txt





[jira] [Updated] (HBASE-4482) Race Condition Concerning Eviction in SlabCache


 [ 
https://issues.apache.org/jira/browse/HBASE-4482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-4482:
-

  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

Committed to 0.92 branch and trunk because of Ted and Jon +1s.  Thanks for the 
patch Li Pi.

 Race Condition Concerning Eviction in SlabCache
 ---

 Key: HBASE-4482
 URL: https://issues.apache.org/jira/browse/HBASE-4482
 Project: HBase
  Issue Type: Sub-task
Reporter: Li Pi
Assignee: Li Pi
Priority: Blocker
 Fix For: 0.92.0

 Attachments: hbase-4482v1.txt, hbase-4482v2.txt, hbase-4482v4.2.txt, 
 hbase-4482v4.2.txt





[jira] [Resolved] (HBASE-4430) Disable TestSlabCache and TestSingleSizedCache temporarily to see if these are cause of build box failure though all tests pass

2011-10-06 Thread stack (Resolved) (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-4430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack resolved HBASE-4430.
--

Resolution: Fixed

Marking resolved by hbase-4482; that patch reenabled these tests.

 Disable TestSlabCache and TestSingleSizedCache temporarily to see if these 
 are cause of build box failure though all tests pass
 ---

 Key: HBASE-4430
 URL: https://issues.apache.org/jira/browse/HBASE-4430
 Project: HBase
  Issue Type: Task
  Components: test
Reporter: stack
Assignee: Li Pi
Priority: Blocker
 Fix For: 0.92.0

 Attachments: TestSlabCache.trace





[jira] [Commented] (HBASE-4282) Potential data loss in retries of WAL close introduced in HBASE-4222

2011-10-06 Thread stack (Commented) (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-4282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13122337#comment-13122337
]

stack commented on HBASE-4282:
--

I'm good on commit as is.

I'd say open new issue to fix the 'dangerous change' in trunk.

And you don't need to add in the check for hdfs-826. You already have it in
there.

Good stuff G.

Potential data loss in retries of WAL close introduced in HBASE-4222

Attachments: HBASE-4282_0.90_2.patch, HBASE-4282_trunk_2.patch,
HBASE-4282_trunk_3.patch, HBASE-4282_trunk_prelim.patch

[jira] [Updated] (HBASE-1621) merge tool should work on online cluster, but disabled table


 [ 
https://issues.apache.org/jira/browse/HBASE-1621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-1621:
-

Priority: Major  (was: Blocker)

Undoing this as a blocker now that we have a merge script that has been run a 
few times in production; having such a script takes the heat off the need for 
this, but we still need it.  Marking major.

 merge tool should work on online cluster, but disabled table
 

 Key: HBASE-1621
 URL: https://issues.apache.org/jira/browse/HBASE-1621
 Project: HBase
  Issue Type: Bug
Reporter: ryan rawson
Assignee: stack
 Fix For: 0.92.0

 Attachments: 1621-trunk.txt, HBASE-1621-v2.patch, HBASE-1621.patch, 
 hbase-onlinemerge.patch, online_merge.rb


 Taking down the entire cluster to merge 2 regions is a pain; I don't see why 
 the table or regions specifically couldn't be taken offline, merged, and then 
 brought back up.
 This might need a new API to the regionservers so they can take direction 
 from not just the master.


[jira] [Updated] (HBASE-4547) TestAdmin failing in 0.92 because .tableinfo not found


 [ 
https://issues.apache.org/jira/browse/HBASE-4547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-4547:
-

 Priority: Critical  (was: Major)
Fix Version/s: 0.92.0
 Assignee: stack

Bringing into 0.92 and marking critical.

 TestAdmin failing in 0.92 because .tableinfo not found
 --

 Key: HBASE-4547
 URL: https://issues.apache.org/jira/browse/HBASE-4547
 Project: HBase
  Issue Type: Bug
Reporter: stack
Assignee: stack
Priority: Critical
 Fix For: 0.92.0

 Attachments: 4547.txt



[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.

2011-10-06 Thread Jonathan Hsieh (Commented) (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13122344#comment-13122344
]

Jonathan Hsieh commented on HBASE-4377:
---

In the 0.90 branch, after deleting meta and restarting, the number of tables
present is 0.
In trunk and the 0.92 branch, after deleting meta and restarting, the number of
tables present is 1.

This actually does make sense because HBASE-451 changed the behavior of HMaster.
In 0.90 (pre-HBASE-451), HConnectionManager.listTables() loads table info
on the client side via a meta scan. Post HBASE-451, table data from
HConnectionManager.listTables() comes from the file system, is cached by
the HMaster, and ignores the meta table.

[hbck] Offline rebuild .META. from fs data only.

Key: HBASE-4377
URL: https://issues.apache.org/jira/browse/HBASE-4377
Project: HBase
Issue Type: New Feature
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
Attachments:
0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.patch,
hbase-4377-trunk.v2.patch

In a worst case situation, it may be helpful to have an offline .META.
rebuilder that just looks at the file system's .regioninfos and rebuilds meta
from scratch. Users could move bad regions out until there is a clean
rebuild.
It would likely fill in region split holes. Follow-on work could give
options to merge or select regions that overlap, or do online rebuilds.

[jira] [Resolved] (HBASE-4547) TestAdmin failing in 0.92 because .tableinfo not found

2011-10-06 Thread stack (Resolved) (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-4547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack resolved HBASE-4547.
--

Resolution: Fixed

Ran this a bunch of times and couldn't get TestAdmin to fail.  Applied to 0.92 
branch and trunk.

 TestAdmin failing in 0.92 because .tableinfo not found
 --

 Key: HBASE-4547
 URL: https://issues.apache.org/jira/browse/HBASE-4547
 Project: HBase
  Issue Type: Bug
Reporter: stack
Assignee: stack
Priority: Critical
 Fix For: 0.92.0

 Attachments: 4547.txt



[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.

2011-10-06 Thread Todd Lipcon (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13122354#comment-13122354
 ] 

Todd Lipcon commented on HBASE-4377:


bq. Post HBASE-451, table data from HConnectionManager.listTables() comes from 
the file system and is cached by the HMaster, and ignores the meta table

This seems like a bug - clients should never have to have direct access to 
HDFS! I filed HBASE-4548

 [hbck] Offline rebuild .META. from fs data only.
 

 Key: HBASE-4377
 URL: https://issues.apache.org/jira/browse/HBASE-4377
 Project: HBase
  Issue Type: New Feature
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
 Attachments: 
 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.patch, 
 hbase-4377-trunk.v2.patch



[jira] [Created] (HBASE-4548) Client should not look on HDFS to list tables

2011-10-06 Thread Todd Lipcon (Created) (JIRA)

Client should not look on HDFS to list tables
-

 Key: HBASE-4548
 URL: https://issues.apache.org/jira/browse/HBASE-4548
 Project: HBase
  Issue Type: Bug
  Components: client
Affects Versions: 0.92.0
Reporter: Todd Lipcon
Priority: Critical
 Fix For: 0.92.0


In HBASE-4377, Jon noticed that HConnectionManager.listTable now looks on HDFS 
for the table list. This seems incorrect, since the client may not have access 
to the hbase directory on HDFS (eg in a secure cluster). At the least, it 
should RPC to the master to find a table list, and have the master do the list 
on HDFS.


[jira] [Created] (HBASE-4549) Add thrift API to read version and build date of HBase

2011-10-06 Thread Song Liu (Created) (JIRA)

Add thrift API to read version and build date of HBase 
---

 Key: HBASE-4549
 URL: https://issues.apache.org/jira/browse/HBASE-4549
 Project: HBase
  Issue Type: Improvement
  Components: thrift
Reporter: Song Liu
Priority: Minor


Adding an API to get the HBase server version and build date will help 
clients communicate with different versions of the server accordingly.

The VersionInfo class can be reused to provide the required information.



[jira] [Commented] (HBASE-4402) Retaining locality after restart broken

2011-10-06 Thread Hudson (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-4402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13122364#comment-13122364
 ] 

Hudson commented on HBASE-4402:
---

Integrated in HBase-0.92 #48 (See 
[https://builds.apache.org/job/HBase-0.92/48/])
HBASE-4402 Retaining locality after restart broken

stack : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/DefaultLoadBalancer.java
* 
/hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/master/TestDefaultLoadBalancer.java


 Retaining locality after restart broken
 ---

 Key: HBASE-4402
 URL: https://issues.apache.org/jira/browse/HBASE-4402
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.92.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Blocker
 Fix For: 0.92.0

 Attachments: 4402-v3.txt, hbase-4402.txt, hbase-4402.txt


 In DefaultLoadBalancer, we implement the retain assignment function like so:
 {code}
   if (sn != null && servers.contains(sn)) {
     assignments.get(sn).add(region.getKey());
 {code}
 but this will never work since after a cluster restart, all servers have a 
 new ServerName with a new startcode.
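
One way to avoid the mismatch described above is to compare servers by host and port only, ignoring the startcode. The sketch below shows that idea under stated assumptions: the Server class and the retain() helper are hypothetical stand-ins, and this is not necessarily how the committed patch does it.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class RetainAssignmentSketch {
    // Hypothetical stand-in for a server identity: host, port, and a
    // startcode that changes on every restart.
    static final class Server {
        final String host;
        final int port;
        final long startcode;

        Server(String host, int port, long startcode) {
            this.host = host;
            this.port = port;
            this.startcode = startcode;
        }

        String hostAndPort() {
            return host + ":" + port;
        }
    }

    // Maps each region back to the live server running on the same host and
    // port as its old server. Regions whose old host is gone are left out
    // here; a real balancer would have to place them elsewhere.
    static Map<Server, List<String>> retain(Map<String, Server> oldAssignments,
                                            List<Server> liveServers) {
        Map<String, Server> liveByHostAndPort = new HashMap<>();
        Map<Server, List<String>> assignments = new HashMap<>();
        for (Server live : liveServers) {
            liveByHostAndPort.put(live.hostAndPort(), live);
            assignments.put(live, new ArrayList<>());
        }
        for (Map.Entry<String, Server> entry : oldAssignments.entrySet()) {
            Server old = entry.getValue();
            Server live = (old == null) ? null : liveByHostAndPort.get(old.hostAndPort());
            if (live != null) {
                assignments.get(live).add(entry.getKey());
            }
        }
        return assignments;
    }

    public static void main(String[] args) {
        Server beforeRestart = new Server("host1", 60020, 100L);
        Server afterRestart = new Server("host1", 60020, 200L);
        Map<String, Server> old = new HashMap<>();
        old.put("regionA", beforeRestart);
        List<Server> live = new ArrayList<>();
        live.add(afterRestart);
        System.out.println(retain(old, live).get(afterRestart));
    }
}
```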


[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.

2011-10-06 Thread Jonathan Hsieh (Commented) (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13122367#comment-13122367
]

Jonathan Hsieh commented on HBASE-4377:
---

@Todd,

I think there is some confusion. Clients do not directly access hdfs. Let me
add more detail.

In trunk post HBASE-451, the HMaster (not the client) reads and caches data
from the file system. The client's HConnectionManager then makes an RPC to the
HMaster, which just ships the cached HTableDescriptor (HTD) data.

HMaster on initialization reads the file system for HTD data.
Client calls listTables() -> HMaster (serves cached data from the file system).

Pre-HBASE-451, the client's HConnectionManager does a meta scan and builds
HTableDescriptors.

Client calls listTables(), which is actually a meta scan that builds HTDs.

[hbck] Offline rebuild .META. from fs data only.

[jira] [Commented] (HBASE-4548) Client should not look on HDFS to list tables

2011-10-06 Thread Jonathan Hsieh (Commented) (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-4548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13122369#comment-13122369
]

Jonathan Hsieh commented on HBASE-4548:
---

@Todd, (also posted in HBASE-4377).

I think there is some confusion. Clients do not directly access hdfs. Let me
add more detail.

In trunk post HBASE-451, the HMaster (not the client) reads and caches data
from the file system. The client's HConnectionManager then makes an RPC to the
HMaster, which just ships the cached HTableDescriptor (HTD) data.

HMaster on initialization reads the file system for HTD data.
Client calls listTables() -> HMaster (serves cached data from the file system).

Pre-HBASE-451, the client's HConnectionManager does a meta scan and builds
HTableDescriptors.

Client calls listTables(), which is actually a meta scan that builds HTDs.

Client should not look on HDFS to list tables
-

Key: HBASE-4548
URL: https://issues.apache.org/jira/browse/HBASE-4548
Project: HBase
Issue Type: Bug
Components: client
Affects Versions: 0.92.0
Reporter: Todd Lipcon
Priority: Critical
Fix For: 0.92.0


[jira] [Resolved] (HBASE-4548) Client should not look on HDFS to list tables

2011-10-06 Thread Jonathan Hsieh (Resolved) (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-4548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hsieh resolved HBASE-4548.
---

Resolution: Not A Problem

 Client should not look on HDFS to list tables
 -

 Key: HBASE-4548
 URL: https://issues.apache.org/jira/browse/HBASE-4548
 Project: HBase
  Issue Type: Bug
  Components: client
Affects Versions: 0.92.0
Reporter: Todd Lipcon
Priority: Critical
 Fix For: 0.92.0



[jira] [Commented] (HBASE-4547) TestAdmin failing in 0.92 because .tableinfo not found


[ 
https://issues.apache.org/jira/browse/HBASE-4547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13122370#comment-13122370
 ] 

Jonathan Gray commented on HBASE-4547:
--

Post-commit +1.

Stack, should we open another JIRA to deal with your TODO?

 TestAdmin failing in 0.92 because .tableinfo not found
 --

 Key: HBASE-4547
 URL: https://issues.apache.org/jira/browse/HBASE-4547
 Project: HBase
  Issue Type: Bug
Reporter: stack
Assignee: stack
Priority: Critical
 Fix For: 0.92.0

 Attachments: 4547.txt



[jira] [Commented] (HBASE-4549) Add thrift API to read version and build date of HBase

2011-10-06 Thread jirapos...@reviews.apache.org (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-4549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13122373#comment-13122373
 ] 

Jonathan Gray commented on HBASE-4549:
--

+1

 Add thrift API to read version and build date of HBase 
 ---

 Key: HBASE-4549
 URL: https://issues.apache.org/jira/browse/HBASE-4549
 Project: HBase
  Issue Type: Improvement
  Components: thrift
Reporter: Song Liu
Priority: Minor
   Original Estimate: 2h
  Remaining Estimate: 2h


[jira] [Commented] (HBASE-4548) Client should not look on HDFS to list tables

2011-10-06 Thread Jonathan Hsieh (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-4548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13122372#comment-13122372
 ] 

Jonathan Hsieh commented on HBASE-4548:
---

closed out as not a problem.

 Client should not look on HDFS to list tables
 -

 Key: HBASE-4548
 URL: https://issues.apache.org/jira/browse/HBASE-4548
 Project: HBase
  Issue Type: Bug
  Components: client
Affects Versions: 0.92.0
Reporter: Todd Lipcon
Priority: Critical
 Fix For: 0.92.0


 In HBASE-4377, Jon noticed that HConnectionManager.listTable now looks on 
 HDFS for the table list. This seems incorrect, since the client may not have 
 access to the hbase directory on HDFS (e.g., in a secure cluster). At the least, 
 it should RPC to the master to find a table list, and have the master do the 
 list on HDFS.


[jira] [Commented] (HBASE-4528) The put operation can release the rowlock before sync-ing the Hlog

[
https://issues.apache.org/jira/browse/HBASE-4528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13122417#comment-13122417
]

jirapos...@reviews.apache.org commented on HBASE-4528:
--

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2141/#review2397
---

Overall I see this patch as trading off service resiliency in favor of
performance.

With the current ordering of operations (WAL append and sync prior to memstore
insert), we ensure that an error during sync is seen by the client and memstore
consistency is maintained. Importantly (at least for my goals), this also
allows us to do some reasoning about when it's necessary to abort the region
server or when we can take additional actions to try to ride over a transient
error. As long as there were no deferred flush edits, we could reason that any
error on sync was propagated back to the client as a failure and we did not
need to abort yet. This is the direction I've been trying to move with
HBASE-4222/4282 and a partial form of it was already in place prior to that.

I understand why we want to reorder these operations and move the sync outside
of the acquired row locks. From this standpoint, since an error on sync leaves
the memstore polluted, aborting immediately is the right thing to do. But I
don't think it's a desirable behavior. I think it will lead to more complaints
from users about observed instability of the system.

The use-case that motivated HBASE-4222 was performing a rolling restart of all
DataNodes in a cluster, with a running, but completely quiescent HBase cluster.
In this case, with no data durability at stake, we really should be able to
recover. But instead what will happen is a catastrophic failure of
RegionServers as each server tries to roll its HLog. The patch in its current
state would regress to this behavior, triggering RS aborts even more quickly
than prior to HBASE-4222 (no HLog close would be attempted).

I would really like to find a way to keep the performance optimization of
moving the HLog sync outside of the row locks, while still being able to
guarantee memstore consistency in the case of failure, so that we can still
reason about whether or not a RS abort is really necessary.

Speaking naively, is it at all feasible for the RWCC.WriteEntry to track
the KeyValue instances it has applied to the memstore? These references
could then be used to attempt a memstore rollback on failure. Are there any
other ways that we can maintain memstore consistency here without giving up
and aborting?
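
To make the rollback idea concrete, here is a minimal sketch. It uses
simplified stand-ins (MemStore, WriteEntry, Wal) that are assumptions for
illustration and not the real HBase classes; cells are plain strings, and
locking and read-point handling are not modeled.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ConcurrentSkipListSet;

// Simplified stand-ins for illustration; none of these are real HBase types.
final class RollbackSketch {
    // Stand-in for a memstore: a sorted set of "row/timestamp" strings.
    static final class MemStore {
        final ConcurrentSkipListSet<String> cells = new ConcurrentSkipListSet<>();
    }

    // Stand-in for RWCC.WriteEntry, extended to remember what it inserted.
    static final class WriteEntry {
        final List<String> applied = new ArrayList<>();
    }

    // Stand-in for the WAL sync, which may fail with a transient error.
    interface Wal {
        void sync() throws IOException;
    }

    static void put(MemStore memstore, Wal wal, List<String> edits) throws IOException {
        WriteEntry w = new WriteEntry();
        // Step 1 (under the row lock in the real code): apply the edits and
        // track only those that were actually added.
        for (String kv : edits) {
            if (memstore.cells.add(kv)) {
                w.applied.add(kv);
            }
        }
        // Step 2 (after the row lock would be released): sync the log.
        try {
            wal.sync();
        } catch (IOException e) {
            // Undo the tracked edits so the memstore matches what is durable,
            // then report the failure to the caller instead of aborting.
            for (String kv : w.applied) {
                memstore.cells.remove(kv);
            }
            throw e;
        }
    }

    public static void main(String[] args) {
        MemStore m = new MemStore();
        try {
            put(m, () -> { throw new IOException("simulated sync failure"); },
                Arrays.asList("row1/ts1", "row2/ts1"));
        } catch (IOException e) {
            System.out.println("sync failed, memstore size = " + m.cells.size());
        }
    }
}
```

In the real code the edits would presumably have to stay invisible to readers
until the sync succeeds, otherwise a reader could observe data that is later
rolled back.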

/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
https://reviews.apache.org/r/2141/#comment5501

Personally, I think this is a step in the wrong direction. I would like to
see us be _more_ resilient in the face of transient HDFS errors, as long as we
have sufficient information to reason that we have not compromised correctness.

- Gary

On 2011-10-06 08:08:49, Dhruba Borthakur wrote:
bq.
bq. ---
bq. This is an automatically generated e-mail. To reply, visit:
bq. https://reviews.apache.org/r/2141/
bq. ---
bq.
bq. (Updated 2011-10-06 08:08:49)
bq.
bq.
bq. Review request for hbase.
bq.
bq.
bq. Summary
bq. ---
bq.
bq. This changes the multiPut operation so that the sync to the WAL occurs
outside the rowlock.
bq.
bq. This enhancement is done only to HRegion.put(Put[]) because this is the
only method that gets invoked from an application. HRegion.put(Put) is used
only by unit tests and should possibly be deprecated.
bq.
bq. I have attached a unit test. I have not yet run all unit tests, but early
feedback on this patch will be very helpful.
bq.
bq.
bq. This addresses bug HBASE-4528.
bq. https://issues.apache.org/jira/browse/HBASE-4528
bq.
bq.
bq. Diffs
bq. -
bq.
bq.   /src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1179529
bq.   /src/main/java/org/apache/hadoop/hbase/regionserver/ReadWriteConsistencyControl.java 1179529
bq.   /src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1179529
bq.   /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFlusher.java 1179529
bq.   /src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java 1179529
bq.   /src/test/java/org/apache/hadoop/hbase/regionserver/TestParallelPut.java PRE-CREATION
bq.   /src/test/java/org/apache/hadoop/hbase/regionserver/TestStore.java 1179529
bq.
bq. Diff: https://reviews.apache.org/r/2141/diff
bq.
bq.
bq. Testing
bq. ---
bq.
bq. I have not yet run the full suite of unit tests.
bq.
bq.
bq. Thanks,
bq.
bq. Dhruba
bq.

[jira] [Commented] (HBASE-4528) The put operation can release the rowlock before sync-ing the Hlog


[ 
https://issues.apache.org/jira/browse/HBASE-4528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13122433#comment-13122433
 ] 

Ted Yu commented on HBASE-4528:
---

I was thinking about the possibility of memstore rollback as well.
Here are the operations in applyFamilyMapToMemstore() whose effects need to be 
rolled back:
{code}
for (KeyValue kv: edits) {
  kv.setMemstoreTS(w.getWriteNumber());
  size += store.add(kv);
}
{code}

 The put operation can release the rowlock before sync-ing the Hlog
 --

 Key: HBASE-4528
 URL: https://issues.apache.org/jira/browse/HBASE-4528
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Reporter: dhruba borthakur
Assignee: dhruba borthakur
 Attachments: appendNoSyncPut1.txt, appendNoSyncPut2.txt, 
 appendNoSyncPut3.txt


 This allows for better throughput when there are hot rows. A single row 
 update improves from 100 puts/sec/server to 5000 puts/sec/server.


[jira] [Commented] (HBASE-4528) The put operation can release the rowlock before sync-ing the Hlog


[ 
https://issues.apache.org/jira/browse/HBASE-4528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13122436#comment-13122436
 ] 

Jonathan Gray commented on HBASE-4528:
--

Dhruba and I just talked about this.  I also like the MemStore rollback.  It 
should not be that difficult, just removing the List<KV> that we added.


[jira] [Commented] (HBASE-4482) Race Condition Concerning Eviction in SlabCache

2011-10-06 Thread Hudson (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-4482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13122447#comment-13122447
 ] 

Hudson commented on HBASE-4482:
---

Integrated in HBase-0.92 #49 (See 
[https://builds.apache.org/job/HBase-0.92/49/])
HBASE-4482 Race Condition Concerning Eviction in SlabCache

stack : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/io/hfile/slab/SingleSizeCache.java
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/io/hfile/slab/SlabCache.java
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/io/hfile/slab/SlabItemEvictionWatcher.java
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
* 
/hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java
* 
/hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/io/hfile/slab/TestSingleSizeCache.java
* 
/hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/io/hfile/slab/TestSlabCache.java


 Race Condition Concerning Eviction in SlabCache
 ---

 Key: HBASE-4482
 URL: https://issues.apache.org/jira/browse/HBASE-4482
 Project: HBase
  Issue Type: Sub-task
Reporter: Li Pi
Assignee: Li Pi
Priority: Blocker
 Fix For: 0.92.0

 Attachments: hbase-4482v1.txt, hbase-4482v2.txt, hbase-4482v4.2.txt, 
 hbase-4482v4.2.txt





[jira] [Commented] (HBASE-4547) TestAdmin failing in 0.92 because .tableinfo not found

2011-10-06 Thread Hudson (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-4547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13122446#comment-13122446
 ] 

Hudson commented on HBASE-4547:
---

Integrated in HBase-0.92 #49 (See 
[https://builds.apache.org/job/HBase-0.92/49/])
HBASE-4547 TestAdmin failing in 0.92 because .tableinfo not found

stack : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/util/FSUtils.java


 TestAdmin failing in 0.92 because .tableinfo not found
 --

 Key: HBASE-4547
 URL: https://issues.apache.org/jira/browse/HBASE-4547
 Project: HBase
  Issue Type: Bug
Reporter: stack
Assignee: stack
Priority: Critical
 Fix For: 0.92.0

 Attachments: 4547.txt


 I've been running tests before commit and found that the following failures 
 are sporadic but happen fairly frequently:
 {code}
 Failed tests:   
 testOnlineChangeTableSchema(org.apache.hadoop.hbase.client.TestAdmin)
   testForceSplit(org.apache.hadoop.hbase.client.TestAdmin): expected:<2> but 
 was:<1>
   testForceSplitMultiFamily(org.apache.hadoop.hbase.client.TestAdmin): 
 expected:<2> but was:<1>
 {code}
 On inspection, it seems we fail to find .tableinfo in the tests that modify 
 the table schema while the table is online.
 The update of a table schema just does an overwrite.  In the tests we 
 sometimes fail to find the newly written file or we get an EOFException 
 reading it.


[jira] [Updated] (HBASE-4480) Testing script to simplfy local testing

2011-10-06 Thread Scott Kuehn (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-4480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Scott Kuehn updated HBASE-4480:
---

Attachment: runtest2.sh

 Testing script to simplfy local testing
 ---

 Key: HBASE-4480
 URL: https://issues.apache.org/jira/browse/HBASE-4480
 Project: HBase
  Issue Type: Improvement
Reporter: Jesse Yates
Priority: Minor
  Labels: test
 Attachments: runtest.sh, runtest2.sh


 As mentioned by http://search-hadoop.com/m/r2Ab624ES3e and 
 http://search-hadoop.com/m/cZjDH1ykGIA it would be nice if we could have a 
 script that would handle more of the finer points of running/checking our 
 test suite.
 This script should:
 (1) Allow people to determine which tests are hanging/taking a long time to 
 run
 (2) Allow rerunning of particular tests to make sure it wasn't an artifact of 
 running the whole suite that caused the failure
 (3) Allow people to specify to run just unit tests or also integration tests 
 (essentially wrapping calls to 'maven test' and 'maven verify').
 This script should just be a convenience script - running tests directly from 
 maven should not be impacted.


[jira] [Commented] (HBASE-4480) Testing script to simplfy local testing

2011-10-06 Thread Scott Kuehn (Commented) (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-4480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13122449#comment-13122449
]

Scott Kuehn commented on HBASE-4480:

@Jesse, @Ted - The script has been extended with the following features:
print slow/hanging tests, read test names from a file, and select unit or
unit+integration tests.

usage:
{code}
usage: ./runtest2.sh [options] [test-name...]

Run a set of tests. Individual tests may be specified on the command
line or in a file specified by -f=FILE, containing one test per line.
Runs all tests by default.

options:
   -h          Show this message
   -f=FILE     Run the tests listed in the FILE
   -u          Only run unit tests. Default is to run
               unit and integration tests
   -n=N        Run each test N times. Default = 1.
   -s=N        Print N slowest tests
   -H          Print which tests are hanging (if any)
{code}


[jira] [Commented] (HBASE-4480) Testing script to simplfy local testing

2011-10-06 Thread Jesse Yates (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-4480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13122475#comment-13122475
 ] 

Jesse Yates commented on HBASE-4480:


@Scott - awesome, thanks! I'm gonna go play with it tonight.



[jira] [Commented] (HBASE-4488) Store could miss rows during flush


[ 
https://issues.apache.org/jira/browse/HBASE-4488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13122482#comment-13122482
 ] 

Lars Hofhansl commented on HBASE-4488:
--

I see that Store.compactStore does the same thing. The same reasoning applies 
there: currently we are lucky that StoreScanner.next() never returns false 
when more rows are waiting.

There's even a comment about a do/while loop, but then it's just a while loop.

{code}
// since scanner.next() can return 'false' but still be delivering data,
// we have to use a do/while loop.
ArrayList<KeyValue> kvs = new ArrayList<KeyValue>();
// Limit to hbase.hstore.compaction.kv.max (default 10) to avoid OOME
while (scanner.next(kvs,this.compactionKVMax)) {
{code}

Looking at the history of the file, it has been like this forever. This is a 
bug waiting to happen.
Should we fix it with another patch on this issue, or in a separate JIRA?
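
To illustrate the failure mode, here is a minimal, self-contained sketch. It
assumes a hypothetical scanner (BatchScanner, not an HBase class) whose next()
fills the list with one batch and returns whether more batches remain after
it, so it returns false together with its last batch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

final class ScanLoopSketch {
    // Hypothetical stand-in for a scanner: next() appends one batch to 'out'
    // and returns true only if further batches remain AFTER this one.
    static final class BatchScanner {
        private final Iterator<List<String>> batches;

        BatchScanner(List<List<String>> data) {
            this.batches = data.iterator();
        }

        boolean next(List<String> out) {
            if (batches.hasNext()) {
                out.addAll(batches.next());
            }
            return batches.hasNext();
        }
    }

    // Pattern under discussion: the loop body is skipped for the final batch,
    // because next() has already returned false when that batch is delivered.
    static List<String> drainWithWhile(BatchScanner s) {
        List<String> written = new ArrayList<>();
        List<String> kvs = new ArrayList<>();
        while (s.next(kvs)) {
            written.addAll(kvs);
            kvs.clear();
        }
        return written;
    }

    // do/while variant: every batch is processed before the result is tested.
    static List<String> drainWithDoWhile(BatchScanner s) {
        List<String> written = new ArrayList<>();
        List<String> kvs = new ArrayList<>();
        boolean hasMore;
        do {
            hasMore = s.next(kvs);
            written.addAll(kvs);
            kvs.clear();
        } while (hasMore);
        return written;
    }

    public static void main(String[] args) {
        List<List<String>> data =
            Arrays.asList(Arrays.asList("a", "b"), Arrays.asList("c"));
        System.out.println(drainWithWhile(new BatchScanner(data)));
        System.out.println(drainWithDoWhile(new BatchScanner(data)));
    }
}
```

With two batches, the while version drops the last one and the do/while
version writes both.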

 Store could miss rows during flush
 --

 Key: HBASE-4488
 URL: https://issues.apache.org/jira/browse/HBASE-4488
 Project: HBase
  Issue Type: Sub-task
  Components: regionserver
Affects Versions: 0.92.0, 0.94.0
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.92.0

 Attachments: 4488.txt


 While looking at HBASE-4344 I found that my change HBASE-4241 contains a 
 critical mistake:
 The while(scanner.next(kvs)) loop is incorrect and might miss the last edits.


[jira] [Commented] (HBASE-4333) Client does not check for holes in .META.


[ 
https://issues.apache.org/jira/browse/HBASE-4333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13122496#comment-13122496
 ] 

Lars Hofhansl commented on HBASE-4333:
--

I would prefer to have a log message on the server, rather than silently (from 
the viewpoint of the server logs) ignoring holes on the client.

With HBASE-4334 in place I propose closing this.


 Client does not check for holes in .META.
 -

 Key: HBASE-4333
 URL: https://issues.apache.org/jira/browse/HBASE-4333
 Project: HBase
  Issue Type: Bug
  Components: client
Affects Versions: 0.90.4
Reporter: Joe Pallas

 If there is a temporary hole in .META., the client may get the wrong region 
 from HConnection.locateRegion.  
 HConnectionManager.HConnectionImplementation.locateRegionInMeta should check 
 the end key of the region found with getClosestRowBefore, just as it checks 
 the offline status, when it looks at the region info.
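
 The proposed end-key check could look roughly like the sketch below. The class
 and method names are hypothetical, and it assumes the usual convention that an
 empty end key marks the last region of a table.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not the HBase client code.
final class MetaHoleCheck {
    private MetaHoleCheck() {}

    // Unsigned lexicographic comparison of two byte arrays.
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        return a.length - b.length;
    }

    // True if 'row' falls inside [startKey, endKey). An empty endKey is
    // assumed to mean "last region", i.e. no upper bound.
    static boolean contains(byte[] startKey, byte[] endKey, byte[] row) {
        boolean atOrAfterStart = compare(row, startKey) >= 0;
        boolean beforeEnd = endKey.length == 0 || compare(row, endKey) < 0;
        return atOrAfterStart && beforeEnd;
    }

    private static byte[] b(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Suppose .META. holds region [a, c) and temporarily lacks [c, f).
        // A closest-row-before lookup for "d" would return region [a, c),
        // which the end-key check rejects.
        System.out.println(contains(b("a"), b("c"), b("d")));
        System.out.println(contains(b("a"), b("c"), b("b")));
    }
}
```

 When contains() returns false, the client would treat the lookup as a miss and
 retry instead of using the wrong region.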


[jira] [Commented] (HBASE-4462) Properly treating SocketTimeoutException